Books go out of print, although you can usually find them in some library or used book store if you are desperate enough. Soon more publishers will be offering print on demand for rare and out-of-print books, which is great. And when books enter the public domain you can sometimes find them on Project Gutenberg. Music CDs go out of print, as do DVDs. That’s even harder to deal with, although there is a big second-hand market online you can explore.
But what to do when a URL goes dead? If the page died recently you might be able to find it in the cached search results for Google or Yahoo!, but I’ve found that after a while those caches are updated to show the error message instead. Well, there is the Internet Archive, which has the Wayback Machine, but that is not a reliable fix either. It sometimes works, but many sites block search engine robots from crawling their pages. Other sites are simply difficult for the Wayback Machine to crawl, so you can find the front page, but then the links are all dead.
This is a serious problem for scholars and teachers. A study done in 2002 found that links for some courses in biochemistry decayed at the same rate as radioactive isotopes:
The links in the three courses had a half-life of 55 months: Half of the links would be expected to have died in 55 months, half of the remaining links would be expected to have died in another 55 months, and so forth.
I don’t know if things have improved any.
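To put that half-life in concrete terms, here is a minimal Python sketch of the decay arithmetic. The only number taken from the study is the 55-month half-life; the function name and the sample time spans are my own, chosen just for illustration.

```python
def surviving_fraction(months: float, half_life_months: float = 55.0) -> float:
    """Fraction of links expected to still work after `months`,
    assuming simple exponential decay with the given half-life."""
    return 0.5 ** (months / half_life_months)


# Sample time spans picked for illustration, not taken from the study.
for months in (12, 55, 110, 165):
    print(f"{months:>3} months: {surviving_fraction(months):.1%} of links expected to survive")
```

By this arithmetic roughly 86% of links would survive the first year, half would survive 55 months, and a quarter would survive 110 months.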
I am sad to admit that I am personally a source of link rot. (Maybe there is some kind of cream I should be using?) Having recently moved my entire web site over to TextDrive, I ran into numerous problems leading to link rot.
The first involved the various wiki packages I was using. My main wiki uses MediaWiki as the backend, and in moving to the new server I did a fresh install and updated everything to comply with certain file naming standards I had foolishly ignored the first time I installed the software. As a result, all the old links are now dead! I could probably go through and redirect all the old links, one by one, to their new locations, but I took the easy way out and created an error page that appears to anyone who tries to use an old link, telling them how to fix it.
But that was not the only source of link rot. There were two software packages running old portions of my site which I decided I wouldn’t move to the new site because it was too much work. Instead I removed them. Now they are gone. I thought I could solve the problem by linking to the Internet Archive version of those sites, but I found that this didn’t work since the archive had not properly stored the whole site. So now, those pages are simply lost in the ether.
Also, all my URLs changed a few years back when I moved my blog from MovableType to WordPress. At the time somebody helped me write a redirect script to point all the URLs at the new site, but with the move that is lost as well, and I lack the time and skills to recreate the script.
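For what it is worth, the core of such a redirect script is just a lookup table from old paths to new ones. The following is only a sketch of that idea, not the script I lost: the two paths in the table are made up for illustration, and a real table would have to be generated from the blog’s own export.

```python
from typing import Optional, Tuple

# Hypothetical old-to-new path pairs, invented for this example.
REDIRECTS = {
    "/archives/000123.html": "/2005/03/example-post/",
}


def resolve(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a requested path.

    A known old path gets a permanent redirect (301) to its new home;
    anything else falls through to a 404 with no location.
    """
    target = REDIRECTS.get(path)
    if target is not None:
        return 301, target
    return 404, None
```

A permanent redirect (status 301) is the usual choice here, since it tells browsers and search engines that the page has moved for good.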
I did some searches, and found that there aren’t that many links to my older stuff. It is only in the last couple of years that my site began attracting much attention. So I’m not really that worried about all those dead URLs, although it is frustrating that some Google searches for my older stuff don’t work. Conversely, I have no idea how many of the sites I linked to over the years are still around. I’m not sure I want to find out.
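If I did want to find out, a small script could do the checking. Here is a rough Python sketch using only the standard library. The function name and the `opener` parameter are my own inventions (the latter exists only so the function can be tried without touching the network), and some servers refuse HEAD requests, so a real checker would probably need to fall back to GET.

```python
import urllib.error
import urllib.request


def check_link(url, timeout=10.0, opener=urllib.request.urlopen):
    """Return the HTTP status code for `url`, or None if the server
    could not be reached at all (DNS failure, refused connection, timeout)."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 410.
        return err.code
    except (urllib.error.URLError, OSError):
        # No usable answer from the server.
        return None
```

A result of 404 or 410 means the page is gone, while None usually means the whole site has disappeared.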
It is a good argument for wikis, since anyone who finds a dead link can update it. Anyone can self-publish on the web, which is great, but it also means that everyone is personally responsible for preventing link rot on their own site. As I just discovered, that isn’t so easy.