
Link rot is the process by which links on a website gradually become irrelevant or broken over time, because the websites they point to disappear, change their content, or redirect to new locations.

The phrase also describes what happens when web pages are not updated: they become out of date, contain stale and useless information, and clutter search engine results. This is most common on personal home pages and is prevalent on free web hosts such as GeoCities, where there is no financial incentive to fix link rot.

Discovering


Detecting link rot for a given URL can be difficult to automate. If a request for a URL returns an HTTP 200 (OK) response, the URL may be considered accessible, but the content of the page may have changed and may no longer be relevant. Some web servers also return a "soft 404": a page served with a 200 (OK) response, instead of a 404 (Not Found), whose content indicates that the URL is no longer accessible. Bar-Yossef et al. (2004) developed a heuristic for automatically discovering soft 404s.
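The difference between a hard 404 and a soft 404 can be illustrated with a short sketch. This is not the heuristic published by Bar-Yossef et al.; it is a simplified illustration of one common idea, which is to request a random address that almost certainly does not exist and compare the answer with the page under test. The names classify, random_sibling and the fetch callback are assumptions made for this example, and the exact-match comparison is deliberately crude (a real checker would need a similarity measure).

```python
import random
import string
from typing import Callable, Tuple
from urllib.parse import urlsplit, urlunsplit

# Hypothetical fetcher: takes a URL, returns (status_code, body_text).
Fetcher = Callable[[str], Tuple[int, str]]


def random_sibling(url: str, length: int = 24) -> str:
    """Build a URL in the same directory that almost certainly does not exist."""
    parts = urlsplit(url)
    directory = parts.path.rsplit("/", 1)[0]
    junk = "".join(random.choices(string.ascii_lowercase, k=length))
    return urlunsplit((parts.scheme, parts.netloc, f"{directory}/{junk}", "", ""))


def classify(url: str, fetch: Fetcher) -> str:
    """Return 'dead', 'soft-404', 'alive' or 'unknown' for a URL."""
    status, body = fetch(url)
    if status in (404, 410):
        return "dead"          # the server says the page is gone
    if status != 200:
        return "unknown"       # redirects, server errors, etc. need other handling
    # The server answered 200. Ask for a page that should not exist.
    probe_status, probe_body = fetch(random_sibling(url))
    if probe_status == 200 and probe_body == body:
        # A nonsense URL gets the same 200 page, so it is probably an error page.
        return "soft-404"
    return "alive"
```

Passing the fetcher in as a parameter keeps the sketch independent of any particular HTTP library. Note that "alive" only means the server returned a distinct page; it does not show that the content is still relevant.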

Combating


Webmasters

A number of basic rules can help webmasters to reduce link rot, including:
  • Do not keep a hyperlink collection unless you are willing to look after it.
  • Design your hyperlinks so that they are easy to maintain, for example by keeping them in a central hyperlink collection.
  • Do not link to sub-pages ("deep linking") unless you are confident that they will remain stable.
  • Use hyperlink checker software or a content management system (CMS) with built-in link checking.
  • Use permalinks.
  • Put an e-mail address or other contact information on the same page as the links, with a specific invitation to report problems ("Found a bad link? Contact links@example.com and we'll fix it.").
  • When changing domains, help others fix their link pages by announcing the move well ahead of the migration, and use HTTP status codes to communicate that a page has moved (e.g. "301 Moved Permanently").
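As a rough illustration of what hyperlink checker software does, the following sketch collects the absolute links from a page and sorts them by HTTP status, treating a 301 response as a signal that the link should be updated to its new location. It is a minimal sketch under stated assumptions, not a description of any particular tool: LinkCollector, check_page and the fetch callback (which is assumed to return the status code and the Location header without following redirects) are names invented for this example.

```python
from html.parser import HTMLParser
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical fetcher: URL -> (status_code, Location header or None).
# It is assumed NOT to follow redirects, so a 301 stays visible.
Fetcher = Callable[[str], Tuple[int, Optional[str]]]


class LinkCollector(HTMLParser):
    """Collect absolute http(s) links from <a href="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value and value.startswith(("http://", "https://")):
                self.links.append(value)


def check_page(html: str, fetch: Fetcher) -> Dict[str, str]:
    """Map every external link on the page to a short verdict."""
    collector = LinkCollector()
    collector.feed(html)
    report: Dict[str, str] = {}
    for link in collector.links:
        status, location = fetch(link)
        if status == 301 and location:
            report[link] = f"moved to {location}"   # update the link
        elif status in (404, 410):
            report[link] = "broken"                  # remove or replace the link
        elif 200 <= status < 300:
            report[link] = "ok"
        else:
            report[link] = f"check manually ({status})"
    return report
```

Keeping all links in one central collection, as suggested above, means a check like this only has to be run against a single page.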

Authors citing URLs

A number of studies have shown how widespread link rot is in academic literature (see below). Authors of scholarly publications should avoid citing "unstable" Internet references. There are several approaches authors may take to avoid introducing link rot into their work:

Tools

There are a number of tools that can be used to combat link rot by archiving web resources:

  • Archive-It, a "web archive on demand" service created by the Internet Archive, allows users to prospectively initiate the archiving process.
  • hanzo:web is a personal web archiving service created by Hanzo Archives. It can archive a single web resource, a cluster of web resources, or an entire website, as a one-off collection, a scheduled or repeated collection, an RSS/Atom feed collection, or an on-demand collection via Hanzo's open API.
  • Spurl.net is a free on-line bookmarking service and search engine that allows users to save important web resources.

Modern management


On Wikipedia and other wiki-based websites, only external links still present a maintenance problem. Wikipedia uses a clear color system for internal links, so the user can see whether a link is live before clicking on it.
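The reason internal links are easier to manage is that the wiki itself knows which pages exist, so it can mark a link as live or missing when the page is rendered. The sketch below illustrates the idea only; it is not Wikipedia's actual code, and the function name, the CSS class names "live" and "missing", and the "/wiki/" path prefix are all assumptions made for this example.

```python
from html import escape
from typing import Set
from urllib.parse import quote


def render_internal_link(title: str, existing_pages: Set[str]) -> str:
    """Render an internal link whose CSS class shows whether the target exists."""
    css_class = "live" if title in existing_pages else "missing"
    # Spaces become underscores and other unsafe characters are percent-encoded.
    href = "/wiki/" + quote(title.replace(" ", "_"))
    return f'<a class="{css_class}" href="{href}">{escape(title)}</a>'
```

A stylesheet can then give the two classes different colors. External links cannot be handled this way, because the wiki has no authoritative list of which outside pages still exist.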

In academic citations


A number of studies have been performed showing the prevalence of link rot in academic literature:



This article is licensed under the GNU Free Documentation License. It uses material from the "Link rot".
