My web warehouse has been emptied.

In my first job in ed-tech, a 14-year stint as an instructional technologist at the Maricopa Community Colleges, I spawned a vast, messy array of web sites from 1993 until I left in 2006, hosted at – that URL is retired and should redirect to the center’s current site at… but the old link is stuck in the spin cycle.

My stuff was a hodgepodge of HTML, PHP, old Perl scripts, and creaky wikis (some really insecure ones that wrote to open text files), and it was completely decommissioned from their web site sometime after I left. I cannot fault anyone, especially since I could always count on the Internet Archive’s Wayback Machine to find my old web sites. I counted on it.

Until today. I was writing a comment about archiving and meant to show how well the Wayback Machine had preserved my digital past, and what I got was… bupkis.

Got no wayback, peabody

It seems odd, since my other old servers, ones where I ran really old wikis and blog platforms, still show up in the Wayback Machine.

I can only guess that some DNS configuration change at Maricopa has left the old URL that pointed to my stuff unresolvable, and thus the Wayback Machine gets hung testing the URL? I am wildly guessing, as I do not know how this works. If it really irked me, I could try to find someone I know who works at Maricopa to see if they can fix the redirect.
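My guess above is easy to check: a hostname that has dropped out of DNS fails resolution before any web request even starts. A minimal sketch in Python (the dead hostname below is a made-up placeholder, not the actual retired Maricopa address):

```python
import socket

def resolves(hostname: str) -> bool:
    """Return True if the hostname still has a DNS record we can look up."""
    try:
        socket.gethostbyname(hostname)
        return True
    except socket.gaierror:
        # getaddrinfo error: DNS has no answer for this name
        return False

# "old-center.example.edu" is a stand-in for the retired hostname
for host in ("archive.org", "old-center.example.edu"):
    print(host, "resolves" if resolves(host) else "does not resolve")
```

If the old hostname fails this lookup while the working ones pass, that would be consistent with a dropped DNS entry rather than a missing archive.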

But if the Web Archive hinges on a remote DNS setting, what does that say?

The lesson is, again, how weak the links are in the web fabric chain when they rely on other entities to manage your stuff– or, as I barked recently, Digital Durability? My Money is on the Individual.

I have all my web content from those years on a hard drive, and have begun re-archiving it myself, on my own domain —… but that does not solve the problem of old links that no longer work, now that the next recourse, the Wayback Machine, is an empty warehouse.

Because some IT person changed a setting on a server.

It’s castles of internet sand we think we live in.

UPDATE Jun 23, 2016

I contacted the Internet Archive via email about this issue, and appreciate their quick response:

Hi Alan,

Thank you for contacting the Internet Archive.

You are right, according to the current Wayback Machine policy, if robots.txt cannot be reached, we no longer allow access to this domain through our website.

And thus confirms my assertion about the fragility. Now I have to see if I can contact anyone at the Maricopa IT department…
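The Archive’s reply suggests the block hinges on whether robots.txt can be fetched at all, not on what the file says. A rough sketch of that distinction, assuming plain HTTP fetch behavior (this is my illustration, not the Wayback Machine’s actual crawler code):

```python
import urllib.request
import urllib.error

def robots_status(domain: str) -> str:
    """Report what an archiver would see when fetching a site's robots.txt."""
    url = f"http://{domain}/robots.txt"
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return f"reachable (HTTP {resp.status}): policy in the file applies"
    except urllib.error.HTTPError as e:
        # Server answered but has no robots.txt (e.g. 404): crawling allowed
        return f"answered with HTTP {e.code}: no explicit exclusion"
    except urllib.error.URLError as e:
        # DNS failure, timeout, etc.: robots.txt cannot be reached at all
        return f"unreachable: {e.reason}"
```

Under the policy quoted above, it is the third case, the unreachable one, that closes the warehouse doors: no file denying access exists, but none can be found granting it either.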

Top / Featured Image: Searched Google Images (set for results licensed for reuse) on “empty warehouse” — more options than I could dream of. I like to use flickr ones, as I have this nifty tool for generating attributions, so I used the flickr photo by nickton shared under a Creative Commons (BY) license.

If this kind of stuff has value, please support me by tossing a one time PayPal kibble or monthly on Patreon
An early 90s builder of the web and blogging, Alan Levine barks about web storytelling (#ds106 #4life), photography, bending WordPress, and serendipity in the infinite internet river. He thinks it's weird to write about himself in the third person.


    1. Yes, I understand that.

      But there is no explicit robots.txt file at that URL (there was one there for the last 8 years): because the domain never resolves, the file is never found. So the Wayback Machine assumes exclusion?

      My point is more that IT staff who know nothing about an 8-year-old DNS entry decide it’s not needed, and the entire archive of 14 years of web work is gone. That is fragile (and also easily fixed).
