Google Pushes “Text Fragment Links” with New Chrome Extension

Google has been working on an extension to the URL standard called “Text Fragments.” The new link style lets you link not just to a page but to specific text on that page, which the browser automatically scrolls to and highlights once the page loads. It is like an anchor link, but with highlighting, and anyone can create one.

The feature has actually been supported in Chrome since version 80, which reached the stable channel in February. Now a new extension from Google makes this new kind of link easy to create, and the links will work for anyone else using Chrome on desktop OSes and iOS. Google has proposed the concept to the W3C and hopes other browsers will adopt it, but even if they don’t, the links are backward compatible.

The URL syntax looks rather unusual. The magic is in the string “#:~:text=” appended to the URL, followed by whatever text you wish to match. So a complete link looks like this:

https://en.wikipedia.org/wiki/Cat#:~:text=Most breeds of cat have a noted fondness for sitting in high places

If you copy and paste it into Chrome, the browser will open the Wikipedia page for cats, scroll to the first text that matches “Most breeds of cat have a noted fondness for sitting in high places,” and highlight it. If nothing matches the text, the page will still load. Backward compatibility works because browsers already support the number sign (#) as the start of a URL fragment, which is normally used for anchor links.

If you paste the link into a browser that does not support text fragments, the page will still load, and the browser will simply ignore everything after the number sign as a bad anchor link. So far, so good.
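The fallback behavior follows from how URLs are split in general: everything after the first number sign is the fragment, and the rest still identifies the page. A minimal Python sketch of that split, using the standard library's urllib.parse (the helper name split_page_and_fragment is made up for illustration, and this is not how any browser implements it):

```python
from urllib.parse import urldefrag


def split_page_and_fragment(url: str) -> tuple[str, str]:
    """Return (page URL without fragment, fragment without the leading '#')."""
    result = urldefrag(url)
    return result.url, result.fragment


link = (
    "https://en.wikipedia.org/wiki/Cat"
    "#:~:text=Most breeds of cat have a noted fondness for sitting in high places"
)

page, fragment = split_page_and_fragment(link)
print(page)      # the part that is actually requested from the server
print(fragment)  # the part a non-supporting browser treats as an unknown anchor
```

The page part is a perfectly ordinary address, which is why a browser that does not recognize the “:~:text=” directive can still load it.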

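One issue is that this means you can have spaces within a URL. Anything that automatically detects URLs in plain text, such as a messaging app or a text box, will typically cut the link off at the first space.

The links also work when each space is percent-encoded as %20. URL parsers then have a shot at linkifying the link correctly, but it looks like a mess: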

https://en.wikipedia.org/wiki/Cat#:~:text=Most%20breeds%20of%20cat%20have%20a%20noted%20fondness%20for%20sitting%20in%20high%20places.
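To illustrate why spaces matter to automatic link detection, here is a rough sketch in Python. The pattern below is a deliberately simple, hypothetical linkifier (real ones vary and are usually more elaborate); it assumes a URL is “http://” or “https://” followed by any run of non-whitespace characters.

```python
import re

# Hypothetical, simplified linkifier: a URL ends at the first whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def find_links(message: str) -> list[str]:
    """Return every substring of message that the simple pattern sees as a URL."""
    return URL_PATTERN.findall(message)


with_spaces = "Look: https://en.wikipedia.org/wiki/Cat#:~:text=Most breeds of cat"
with_percent = "Look: https://en.wikipedia.org/wiki/Cat#:~:text=Most%20breeds%20of%20cat"

print(find_links(with_spaces))   # the link is cut off after "Most"
print(find_links(with_percent))  # the whole link survives
```

With literal spaces the detected link stops after the first word of the highlighted text; with %20 in place of each space the full link is picked up.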

Spaces aren’t the only characters that can cause issues. RFC 3986 defines several “reserved” characters that have a special meaning in a URL, so they should not appear in one as literal text. Web page authoring tools tend to handle these characters automatically, but now that you’re inserting arbitrary phrases into a URL to highlight, you’re more likely to run into one of them: ! * ' ( ) ; : @ & = + $ , / ? # [ ].

They all need to be percent-encoded for the URL to work, and Google’s extension will take care of that for you.
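If you want to build such a link by hand rather than with the extension, the encoding step can be sketched in a few lines of Python. This is only an illustration and not what Google’s extension does internally; the function name build_text_fragment_url is made up, and the sketch assumes the page URL has no fragment of its own.

```python
from urllib.parse import quote


def build_text_fragment_url(page_url: str, text: str) -> str:
    """Append a '#:~:text=' directive to page_url, percent-encoding the text.

    Assumes page_url does not already contain a '#' fragment.
    """
    # safe="" tells quote() to encode every reserved character, including "/".
    # Letters, digits, and the characters _ . - ~ are left as they are.
    encoded = quote(text, safe="")
    return f"{page_url}#:~:text={encoded}"


print(build_text_fragment_url(
    "https://en.wikipedia.org/wiki/Cat",
    "Most breeds of cat have a noted fondness for sitting in high places",
))
```

Spaces come out as %20, and reserved characters such as & or ? in the quoted phrase are turned into %26 and %3F instead of being read as part of the URL’s structure.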