In defence of the offence on URLs

August 2019

The Chromium browser recently started displaying addresses in the omnibox in accordance with the Chromium URL display guidelines. The gist of it is to simplify display of URLs by omitting the scheme parts and trivial subdomains like www. This was met with a mixed response ranging from strong support to outrage and outcry.
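To make the idea concrete, here is a minimal sketch of that kind of simplification. It is not Chromium's actual implementation: the function name `display_url`, the `TRIVIAL_SUBDOMAINS` list, and the decision to only touch http and https addresses are all assumptions made for illustration.

```python
from urllib.parse import urlsplit

# Assumption: "www." is the only subdomain treated as trivial here.
TRIVIAL_SUBDOMAINS = ("www.",)


def display_url(url: str) -> str:
    """Return a simplified, display-only form of a web address."""
    parts = urlsplit(url)
    # Leave anything that is not a plain web address untouched.
    if parts.scheme not in ("http", "https"):
        return url

    host = parts.hostname or ""
    for prefix in TRIVIAL_SUBDOMAINS:
        # Only strip the prefix if a registrable-looking name remains,
        # so that a host such as "www.com" is left alone.
        if host.startswith(prefix) and host.count(".") >= 2:
            host = host[len(prefix):]
            break

    if parts.port is not None:
        host = f"{host}:{parts.port}"

    rest = parts.path
    if parts.query:
        rest += "?" + parts.query
    if parts.fragment:
        rest += "#" + parts.fragment
    # A bare "/" path carries no information for display purposes.
    if rest == "/":
        rest = ""
    return host + rest


print(display_url("https://www.example.com/page"))
```

The result is only meant for display. The full URL is still what gets stored, copied, and fetched.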

This is a list of arguments in support of this simplification, in addition to the usual security and usability ones. They apply not only to web browser address bars, but also elsewhere on the web.

Context is usually enough

Context is enough for us to know example.com/page is a webpage. We don't need "https" before it to know that, for the same reasons we don't need a URL scheme for Twitter usernames. @handles are usually enough.

URLs are awkward in right-to-left scripts

Internationalised domain names, as complicated as they are, finally have widespread support. This means we can have web resource addresses composed entirely of non-ASCII characters. That's great news for the billions of people whose languages use non-Latin scripts, but the string "https://" presents a problem for right-to-left scripts such as Arabic and Hebrew.

Right-to-left languages are usually rendered using the Unicode bidirectional algorithm. These rules determine the direction in which embedded "runs" of text are rendered as well as the direction of the paragraph they are in. While there are explicit zero-width characters that set this direction, it's almost always implicitly derived from the direction of the characters in the run. In addition to this, different implementations of these rules have many nuanced differences and subtleties.

As a result, the string "http://" in front of an otherwise Arabic URL is not only inconvenient to type, requiring a keyboard layout change, but it can also affect the direction of the paragraph it's in if the paragraph starts with it, and it is almost always very frustrating to select and edit.
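The effect on paragraph direction can be sketched with Python's standard `unicodedata` module. This is a deliberately simplified version of the "first strong character" rule and not a full implementation of the bidirectional algorithm: it ignores explicit directional characters and isolates. The function name `first_strong_direction` and the sample domain are made up for illustration.

```python
import unicodedata


def first_strong_direction(text: str):
    """Guess a paragraph's base direction from its first strong character.

    Simplified: explicit embedding and isolate characters are not handled.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":            # strong left-to-right (e.g. Latin)
            return "ltr"
        if bidi_class in ("R", "AL"):    # strong right-to-left (Hebrew, Arabic)
            return "rtl"
    return None  # no strong character found


arabic_address = "مثال.إختبار"
print(first_strong_direction(arabic_address))               # rtl
print(first_strong_direction("http://" + arabic_address))   # ltr
```

The Latin letters in the scheme are the first strong characters in the run, so they win, even though everything after them is Arabic.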

The scheme part is redundant in browser address boxes

Gone are the days of gopher and ftp. And with browsers only speaking h2 over TLS, and h3 requiring it outright, http URLs' days are numbered. As for file URLs, file paths can be trivially disambiguated from web resources and browsers have always done this. If it begins with ./ or / it's local.
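The rule of thumb above fits in a few lines. This is only a sketch of the heuristic as stated here, not what any particular browser does, and `classify_address` is a hypothetical name. Real address boxes handle more cases, such as Windows drive letters and search queries.

```python
def classify_address(text: str) -> str:
    """Classify address-box input using the './' or '/' prefix rule."""
    if text.startswith(("./", "/")):
        return "local"
    return "web"


for entry in ("/etc/hosts", "./notes.txt", "example.com/page"):
    print(entry, "->", classify_address(entry))
```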

Browsers still support arbitrary schemes in an a tag's href attribute, like tel and mailto. But these are never exposed to the user. In this case it's just an encoding for some information that a machine interprets. If it were another attribute, say "proto", on the same tag, it would make no difference to how any of these work.
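A small sketch of that point: to a machine, the scheme in an href is just a tag that can be split off from the payload, and it could equally well have arrived separately. The helper name `split_href` and the sample values are made up for illustration.

```python
from urllib.parse import urlsplit


def split_href(href: str):
    """Split an href into (scheme, payload), as a handler might."""
    parts = urlsplit(href)
    return parts.scheme, parts.path


print(split_href("tel:+15550100"))
print(split_href("mailto:someone@example.com"))
```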

The scheme part is ambiguous anyway

What does the scheme part actually mean? It's not very clear cut what goes in there.

Is it the protocols used to fetch a resource? QUIC, h2, and Tor Onion services all use vastly different protocols, at various layers and levels of encapsulation, yet they can all share the https scheme.

Is it the semantics that the stack of protocols in use adds up to? http and https share their semantics but evidently have different URL schemes.

If it's clear enough how to retrieve a resource given its address without the scheme, why hold on to an arbitrary piece of string?

Email me or reach out on Twitter for corrections or suggestions.