Web archiving climate change: A call for community documentation strategy

October 31st, 2017

The following is a guest post by Laura Alagna, Digital Preservation Librarian at Northwestern University.

When I got involved in web archiving several years ago, I quickly learned that it poses new questions with each opportunity. I find that “We have this amazing tool that can archive the internet!” can either capture the imagination or induce indecision. Take, for instance, the subject of climate change. Addressing one of humanity’s biggest challenges will be impossible without vast information resources, yet much climate science data is vulnerable to removal, deletion, or obscurity. Acting at this scale inspires us to transcend the normal institutional or departmental boundaries and to collaborate with new partners and stakeholders. To rise to the scope and importance of this mission, a few Archive-It partners and peers have begun to discuss potential strategies and roles, and you can join us.

This problem came to my attention through the efforts of DataRefuge, one of several projects working to back up and preserve government-funded data through web archiving and other workflows. Northwestern University hosted a DataRefuge event in April 2017, which brought together scholars, students, and information professionals to try to protect climate data for future use. However, spontaneous events alone are not a long-term solution to the problem of preserving government-funded data, including web-based resources. Web archiving climate change will require a great deal more discussion, collaboration, and strategy among web archivists and their stakeholders.


Earth Day 2017 DataRefuge event (#DataRescueChicago) at Northwestern University's Knight Lab (photo by Joe Germuska, NU Knight Lab)


Through participating in the DataRefuge project, I learned that Ben Goldman, Kalin Librarian for Technological Innovations at Penn State University, was participating in similar events and coming to the same conclusion I had: web archiving climate change is an essential but difficult problem, and we needed to strategize how to approach it together. We brainstormed over many too-long emails throughout the spring and summer: could we start something collaborative between our universities? What were other web archivists already doing? What was the proper scope for what we were trying to accomplish, and how would we know when we’d reached it? Ben and I talked a great deal, but the problem seemed so insurmountable that we weren’t sure where to begin.

It was suggested that we introduce web archiving climate change as a topic for discussion at the annual Archive-It Partner Meeting in Portland. Ben and I happily agreed. We wanted to use the opportunity to brainstorm ideas for web archiving climate change with the wider web archiving community, and hoped that a firm deadline would prompt us to actually figure something out (at least, that was my strategy!). To make the most of our time, we created a handout with jumping-off points for discussion, a list of existing web archives related to climate change, and our live collaborative notes. The examples demonstrated the huge variety of possible angles from which to approach web archiving climate change, from climate science (New York Climate Change Science Web Archive, Cornell University) to energy development (Pennsylvania Shale Energy Web Archive, Penn State) to natural disasters (Alberta Floods June 2013 Web Archive, University of Alberta).

A January 2017 capture from the Climate.gov Web Archive


The discussion was fruitful and thought-provoking! Participants discussed users, context, scoping, and other ideas related to web archiving climate change. All agreed that climate change web archives already have a large number of potential users, from researchers to journalists to activists, but we also wondered how to anticipate and accommodate new types of users in the future. The group represented a wide variety of organizations from all over the country and, as such, brainstormed a broad range of potential collecting themes related to climate change. Some participants came from California, for instance, so drought and water levels were topics that piqued a lot of interest. The group also discussed the need to archive data from local and state governments elsewhere around the country, rather than focus exclusively on the federal level.

Finally, the group brainstormed possible next steps, focusing on a few questions: how can we put the ideas discussed today into practice? And how can we promote linkages and longer-term collaboration in our efforts? We all agreed that one of our key challenges is lack of time, since most of us have many duties beyond web archiving. To overcome this, we discussed collaborative web archiving projects among several organizations or under the auspices of a consortium, so that no one organization or web archivist need be solely responsible for establishing and managing a climate change web archive. The effort could in turn represent a useful opportunity to test emerging collaborative tools among web archivists, such as the IMLS-funded Cobweb platform currently being developed among partners at the California Digital Library, UCLA Library, and Harvard Library. Furthermore, we discussed decentralizing responsibilities like developing controlled vocabularies across web archive collections related to climate change, so that users can more easily connect, navigate, and use these resources.

Our discussion on web archiving climate change may not have solved the enormous challenge of preserving web-published climate data right away, but the group took some very valuable steps towards defining the core questions, scope, and strategies that can make it possible. We very much want to continue this important work with an even wider community. If you are interested in learning more about web archiving climate change or joining a group that would continue to work on this vital issue, contact us!