How Good Archiving Prevents Dark Ages

More data has been created in the last two years than in all of human history before them, but it could all be lost in an instant.

Earlier this year, when Amazon’s servers blacked out for about four hours, everyone from streaming giants like Netflix and Spotify down to the average Internet Joe temporarily lost their livelihoods in the cloud.

USA Today reported that “Amazon wasn’t able to update its own service health dashboard for the first two hours of the outage because the dashboard itself was hosted on AWS.”

Modern archiving has turned out to be a natural enemy of copyright law, and an increasing reliance on electronic data storage could leave our recent history in a delicate position.

Our relatively young, 5,000-year recorded history has seen its most significant advances in archival science in the last few centuries, but those advances could easily be undone.

Adam Farquhar, head of the British Library’s digital preservation efforts, says, “If we’re not careful, we will know more about the beginning of the 20th century than the beginning of the 21st century.”

Most people see our history as a permanent incline on the line graph of civilizational evolution, but we have faced setbacks before. Ancient scientific advancements and literature, some of which we think were on par with our current scientific understanding (but can only imagine), were lost with the Library of Alexandria.

In a testament to the elusiveness of factual historical record keeping, the destruction of the library could have been caused by Julius Caesar’s civil war in 48 BCE or by the Muslim conquest of Egypt in 642 CE, with nearly 700 years and two other possible explanations in between.

As the fall of the Roman Empire gave way to the Dark Ages, Christendom destroyed pagan temples and libraries because their ideologies conflicted. This is not unlike archivists being denied copies of copyrighted materials because archiving them could hurt company profits.

How We Could Lose Recent History

If we suddenly lost all of the data and knowledge we’ve collected since 1990, how would we recover it? How would we have lost it in the first place?

The most obvious concern with digital archiving is hardware. As new systems are developed, it becomes increasingly difficult to maintain read access across all hardware systems. This is an easy fix, however, as many modern archives update their systems every five years to mitigate decay and/or obsolescence. That’s not expensive, either, as hard drives are relatively cheap and durable.

Changes in software file types also hamper effective archiving, as digital information can usually only be accessed by the program where it originated. Communication is a two-way street, and the message is meaningless if we have no way to read it or to retrieve it.

Any software older than a decade usually requires hardware emulation for access. These problems can usually be fixed, too, albeit via increasingly complicated means.
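To make the hardware side of this concrete, here is a minimal sketch of a fixity check, one common way to confirm that nothing decayed while files moved from old drives to new ones. It is an illustration under our own assumptions, not any particular archive's tooling: the function names are made up for this example, and SHA-256 is simply one reasonable choice of checksum.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large archive files never sit fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_damage(old_manifest: dict[str, str], new_root: Path) -> dict[str, list[str]]:
    """Compare a migrated copy against the manifest taken before migration."""
    new_manifest = build_manifest(new_root)
    # Files listed before the migration that no longer exist in the copy.
    missing = [name for name in old_manifest if name not in new_manifest]
    # Files that exist in both places but whose contents no longer match.
    changed = [
        name
        for name, checksum in old_manifest.items()
        if name in new_manifest and new_manifest[name] != checksum
    ]
    return {"missing": sorted(missing), "changed": sorted(changed)}
```

In practice you would call build_manifest on the old drive before the move, keep the result somewhere safe, and run find_damage against the new drive afterward. Two empty lists mean every file arrived intact. Note that this only catches decay and bad copies; it does nothing for the file-format problem, since a perfectly preserved file is still useless if no program can open it.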

The trickiest side of contemporary archiving is capturing online digital information.

Government and institutional archivists can only harvest information in the public domain; that is, information that can be accessed freely.

This means that anything that requires a password or specific search terms is off limits to these archivists.

For example, the Library of Congress must request permission before archiving a website. Circumventing this process has been made illegal under the Digital Millennium Copyright Act.

Streaming content is even more difficult to capture.

As we may be watching our Open Internet come to a close, the job becomes harder for these archivists.

If we don’t find a more concrete and accessible way to both store and retrieve data, all of the progress we’ve made in the last century runs the risk of being lost very easily.

Information Rebels

A private digital preservation effort known as the Internet Archive (IA) goes outside the realm of the DMCA.

Created by Brewster Kahle in 1996, IA boasts access to books, video, software and over 160 billion websites. The Library of Congress, by comparison, can only archive public domain and government-sponsored sites.

To safeguard our history, we will have to practice good, unbiased archiving.

Unfortunately, it seems that one element of protecting our digital history is changing regulation to be more congruent with the state of technology.

More importantly, however, even if information archiving is deregulated, the information is still stored on computer servers. As the 2017 Amazon outage showed, cloud servers are not foolproof.

But traditional archiving can’t keep pace with recording all of this data in hard copy. And even if it could, is paper really sufficient?

Thinking back to the Library of Alexandria, the obvious answer seems to be “no”.

What are our other options?

Etching in stone?

Laminating every page in a potentially costly flame-retardant plastic?

The limitations here are obvious. Maybe we will stick to electronic storage. But how do we ensure this data stays protected?

We’ve previously covered the potential for diamond data storage. We’ve even used that information to evaluate whether diamonds’ data storage capacity explains their true value and the practice of wearing them on our persons. Let’s take a look at a few other data storage alternatives.

Data Centers Evolve

The practice of housing massive data archives in nuke-resistant bunkers has been a thing for decades. Yet digital storage facilities in subterranean complexes are a recent adaptation of traditional data storage institutions.

Iron Mountain has turned a section of a Western Pennsylvania limestone mine into an energy-efficient data storage center referred to as Room 48.

Iron Mountain boasts that the mine’s natural cooling properties save it $1.7 million USD annually.

The data center houses everything from private data to the company’s own new data security systems, such as CloudRecovery. This system provides offsite backup and extra security for its enterprise customers.

But again, here, we see a greater respect for financial information and information that is privately owned. Unless a benevolent philanthropist wants to pay Iron Mountain to securely store our recent global histories, where will this information survive?

How do we settle on a modern Rosetta Stone?

Knowledge Lost Long Ago

In 1901, off the coast of the Greek island of Antikythera, an analogue computer was recovered from an ancient shipwreck. Archaeologists assert that the device was used to predict astronomical positions for use with the calendar, as well as to track the cycles of the Olympic Games.

This complex device is made of at least 30 meshing bronze gears and seems to have been able to fit in one’s hand. Scientists have dated the object back to somewhere between 205 and 100 BCE.

Nature news editor Jo Marchant mentions in a November 2006 feature on the Antikythera mechanism that after the knowledge of this technology was lost some time in the ancient world, artifacts of comparable complexity and purpose did not appear again until astronomical clocks were developed in 14th-century Europe.

Say a world war breaks out or the Yellowstone caldera erupts. Humans surface from Fallout-style vaults 100 years from now. There was no time to safeguard sensitive technologies. Will these people recognize the D-Wave 2X quantum supercomputer when they dig it out of the abandoned NASA research center?

Whatever the case, we have evidence of cultures making the mistake of letting technological developments disappear due to inadequate archiving of information. If we want to avoid these kinds of mistakes, what must be done?

Sure, we could throw everything we know in Iron Mountain’s Room 48.

Yet some argue that changing our nationalist attitude with regard to archiving information is necessary to improve the chance of that information surviving throughout the ages.

Changing Archival Science

Anne Gilliland, professor of Information Studies at UCLA, asserts in her book, Conceptualizing 21st Century Archives, that, historically, archiving practices have mainly been used as a tool for political or economic control.

Gilliland told Joanie Harmon of the UCLA Graduate School of Education and Information Studies in an interview:

“Legal and political systems and cultural practices are all influential in how and why records are created and preserved in different traditions, and archivists have traditionally been educated to work in their own national contexts or according to one specific worldview.”

“The isolation that occurred during the First and Second World Wars limited the international and inter-institutional exchange that had begun to flourish in prior decades.”

“It’s very important that people understand the way that records almost always have two sides: a liberating or empowering side, and a controlling or disempowering side and archival practice needs to account for and anticipate both.”

“What archives have to do now is to think of the fact that they don’t have just an institutional audience, a local community audience, or a national audience; they have a global audience and a global responsibility, and the materials that they’re working with are globally created and shared and stewarded.”

A Global Audience

As Gilliland argues, archivists hoping to free archival science from being a tool for political or economic control will have to embrace the idea that their audience is all of humanity.

Even if that becomes the case, how will we be able to decide what is important enough to archive? Who should control that decision?

This is why a grassroots, organic contribution system like the Internet might serve as the best model for information collection and categorization. Yet, as we mentioned before, as Net Neutrality is compromised, the Internet also seems to be turning into an economic and political tool.

As it stands, it’s best to keep a journal and write down everything you experience.

Do you keep a physical journal? How should we decide what is archived and what’s not?