Piecing together the debate: Archiving the Supreme Court DOMA/Proposition 8 deliberations collaboratively

March 28th, 2013

This week the Archive-It team is coordinating a collection of archived web pages related to the Supreme Court hearings for the Defense of Marriage Act (DOMA) and California’s Proposition 8.

We had success with a similar approach when we collaboratively gathered URLs for a Hurricane Sandy collection in partnership with the Virginia Tech: Crisis, Tragedy, and Recovery Network. Building on that, we reached out to the Archive-It community via social media to spread the word about a collaborative Google Doc where anyone interested could nominate seed URLs to be archived. The response was instantaneous and energetic: at one point over 150 people were working on the document, and more than 100 URLs were added within the first 24 hours.

Human Rights Campaign's Tumblr blog


The URLs nominated through the document spanned the entire political spectrum. They included local, national, and international news stories; blog posts; tweets; organizational websites both for and against the repeal of DOMA and Prop 8; and the Supreme Court's official website, which includes transcripts and audio files of the deliberations.

The document to nominate URLs is available here, and the Archive-It team will continue to capture URLs, including running QA crawls to ensure that content has been captured comprehensively. You can see the archived content as the collection progresses on Archive-It.org: http://archive-it.org/collections/3611.

The Supreme Court's official website, which included audio files and text transcripts of the deliberations.


Archiving digital content for “spontaneous events” as they unfold can be a complicated task. Several of our partners, including colleagues at Virginia Tech, are developing ways for us to more easily and effectively locate and curate reliable Twitter and other social media feeds. Current features of Archive-It already make the process less time-consuming, including the ability to create a collection, designate URLs to be captured, and start a crawl in less than 5 minutes.

In addition, by using various “seed types” such as News/RSS or One Page Only, we are able to quickly target relevant material instead of archiving an entire website or, if needed, limit the crawl to a specific directory. Because URL nomination during a spontaneous event is quick and collaborative, duplicate URLs are typically nominated. Before running a crawl, Archive-It checks for redundancy to see whether a URL has already been added to the collection.
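To illustrate the kind of redundancy check described above, here is a minimal sketch in Python. It is not Archive-It's actual implementation: the function names `normalize` and `dedupe_seeds`, and the normalization rules themselves (lowercasing scheme and host, dropping fragments and default ports, treating an empty path as `/`), are assumptions chosen for the example.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparable form (hypothetical rules, not Archive-It's)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Keep an explicit port only when it is not the scheme's default.
    default_port = {"http": 80, "https": 443}.get(scheme)
    if parts.port and parts.port != default_port:
        host = f"{host}:{parts.port}"
    # Treat "http://example.org" and "http://example.org/" as the same seed.
    path = parts.path or "/"
    # Drop the fragment: it is never sent to the server.
    return urlunsplit((scheme, host, path, parts.query, ""))


def dedupe_seeds(nominated, existing=()):
    """Return nominated URLs that are not already in the collection,
    keeping the first spelling of each and preserving nomination order."""
    seen = {normalize(u) for u in existing}
    fresh = []
    for url in nominated:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            fresh.append(url)
    return fresh


if __name__ == "__main__":
    nominated = [
        "http://www.supremecourt.gov/",
        "HTTP://www.SupremeCourt.gov",
        "http://www.supremecourt.gov/#top",
        "http://archive-it.org/collections/3611",
    ]
    already_in_collection = ["http://archive-it.org/collections/3611"]
    for seed in dedupe_seeds(nominated, already_in_collection):
        print(seed)
```

A real crawler would likely apply stricter or looser rules (for example, deciding whether `www.` variants or query-string order matter), so the normalization step is the part to adapt.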

Of course, the human effort involved in a project like this can be intensive. In the first few days, it is difficult to assess the nominated seeds quickly enough to determine their relevance. After initial crawls have been run, the next steps will be to perform additional quality assurance tasks and add seed-level metadata (using the Dublin Core Element Set available within the web application). Post-crawl reports and other quality assurance features within the web application will be used to analyze the results of the crawls and improve the functionality and appearance of the archived websites.

Organizational websites across the political spectrum were archived.


When a major cultural and political event like this takes place, sources of information and user-generated content seem boundless and available at our fingertips, whether by searching Google or opening our social media feeds. We may even be overwhelmed or “burned out” by the volume of material online. But we have learned that content on the web is ephemeral: as time passes, freely accessible information and content become less available. Links break, content moves, and web technology changes. Archiving these websites this week ensures that they will be available, searchable, browsable, and in context for generations to come.

You can browse and search the collection as it grows at: http://archive-it.org/collections/3611


We encourage the Archive-It community to continue nominating additional URLs to be archived at the URL below. We are particularly interested in blog posts and fringe voices in the debate, as well as first-person accounts of protests and marches outside the Supreme Court.