Feature Spotlight: Sharing Seeds

November 30th, 2022

by Tanya Ulmer, Web Archivist for Archive-It

Sharing seeds across collections is now widely available in all Archive-It partner accounts!

Moving seeds between collections has long been on the wish list of some Archive-It partners. Collections evolve over time, and partners wanted more flexibility in shifting content already collected in one collection into another without incurring further data charges. First formally requested in the Help Center’s Features Request Forum in January 2014, it was among the most upvoted Feature Requests so far with 34 total votes.

The 2022 solution was renamed ‘Sharing Seeds’ by Derek Enos, Internet Archive Software Engineer. Derek and the Archive-It engineering team explored solutions that would fulfill partners’ needs for more flexibility to manage web archive holdings across collections, without costly or risky backend data storage changes. The solution they found involved mapping seeds and collections on the backend, offering an ‘elegant infrastructural solution’ to make seeds appear as if they had been ‘moved.’ The seed management, crawling, and WARCs all stayed put in the seeds’ original ‘source’ collections in the partner web application, but their content could now be discovered in a second ‘target’ collection. To an end-user on archive-it.org, the publicly available content appears the same whether accessed in the ‘source’ collection or ‘target’ collection.
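For readers who think in code, the idea of mapping seeds to collections rather than moving data can be pictured with a small sketch. This is purely illustrative and is not Archive-It's actual implementation: the class names, method names, collection numbers, and URL below are all hypothetical. It only shows the general pattern described above, in which a seed stays owned by its source collection while a separate mapping makes it discoverable in a target collection.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Seed:
    url: str
    source_collection: int  # hypothetical: the collection that keeps crawls and WARCs


@dataclass
class SeedShareIndex:
    """Hypothetical index of seeds and the extra collections they are shared into."""

    seeds: dict[str, Seed] = field(default_factory=dict)
    shared_into: dict[str, set[int]] = field(default_factory=dict)

    def add_seed(self, url: str, source_collection: int) -> None:
        self.seeds[url] = Seed(url, source_collection)

    def share(self, url: str, target_collection: int) -> None:
        # Sharing only records a mapping; nothing is copied or moved.
        seed = self.seeds[url]
        if target_collection == seed.source_collection:
            raise ValueError("seed already lives in this collection")
        self.shared_into.setdefault(url, set()).add(target_collection)

    def seeds_visible_in(self, collection: int) -> list[str]:
        # A seed is visible in its source collection and in every target.
        return sorted(
            url
            for url, seed in self.seeds.items()
            if seed.source_collection == collection
            or collection in self.shared_into.get(url, set())
        )

    def storage_collection(self, url: str) -> int:
        # Management and stored data stay with the source collection.
        return self.seeds[url].source_collection


# Made-up example: collection 1 is the source, collection 2 is the target.
index = SeedShareIndex()
index.add_seed("https://example.org/", source_collection=1)
index.share("https://example.org/", target_collection=2)
```

In this sketch the seed shows up when either collection is listed, while `storage_collection` still reports collection 1, which mirrors the point that no data has to be duplicated or relocated for the seed to appear ‘moved.’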

In its first six months (as of November 14, 2022), over 1,600 seeds have been shared across 142 collections in 56 accounts, with a strong takeoff starting in mid-August and continuing until October, when it plateaued slightly.

Line graph showing an increase in the number of seeds shared between July and November, 2022.

Number of Shared Seeds across Archive-It collections between July and November, 2022.

What has sharing these seeds’ content helped the earliest users achieve? And has it met users’ expectations so far? As a strong early user, Lynn University’s Lea Iadarola put it this way:

I have been looking forward to having the ability to move or share seeds across collections for some time, so when the Sharing Seeds feature became available, I immediately started experimenting. It has been helpful to share existing seeds into more curated collections, such as Athletics or COVID, without duplicating seeds or crawls.

Screenshot with many people doing a heart sign with their hands over their chests.

Screenshot of the Common Thread seed that Lynn University has shared into its COVID collection.

Echoing such enthusiasm, Amy Welten from the Netherlands Institute for Sound & Vision (NISV) says:

So far, our experience working with the Sharing Seeds feature has been better than expected. From the first use of it until now, the Sharing Seeds function is clear, easy to use, and very convenient for organizing seeds in multiple collections. For us, archivists of the Netherlands Institute for Sound & Vision, that was the particular reason why we started using the Sharing Seeds feature: we wanted to archive certain websites that seemed to belong in multiple collections we had already made. Instead of doing the more time consuming thing and making a whole new collection for those specific seeds, we tried using the Sharing Seeds feature, and it’s been great so far. It allows us to archive numerous websites – which fit in multiple collections and also have to be easy to find back in multiple collections – as correctly as possible.

User Interface of the Partner Web Application showing seeds that NISV shared

Screenshot of NISV’s Partner web application with list of shared seeds.

Finally, one of the earliest requesters of the feature, Alex Thurman at Columbia University Libraries, had this to say about how Sharing Seeds has helped redefine some of their collections so far:

At Columbia University Libraries, we’ve been building thematic web archive collections of external content since 2008. Over the years the thematic focus of one of our largest collections, originally called the Avery Library Historic Preservation and Urban Planning collection (Archive-It collection #1757), evolved to the point where in 2020 we decided to split it into two distinct collections. Most of the collection’s seeds remained in collection #1757, renamed as the New York City Places and Spaces collection. But the original collection had included a large subset of about 75 seeds of the official websites of individual public and residential buildings designed by Frank Lloyd Wright; these sites related to the original topic of historic preservation but were not connected to New York City, so they didn’t fit the newer “New York City Places and Spaces” theme–so we then created a new Frank Lloyd Wright collection (#13204) and began capturing those same seeds separately there. All the earlier (2010-2020) capture data of the Frank Lloyd Wright seeds, however, remained stuck in #1757. Happily this year the Sharing Seeds functionality made it possible for us to share all the earlier capture data from the Frank Lloyd Wright seeds residing in #1757 into the new Frank Lloyd Wright collection (#13204), while no longer displaying them in the public seed list for New York City Places and Spaces. So now our standalone Frank Lloyd Wright collection includes captures for some seeds going back to 2010, even though the collection itself was created in 2020. Very helpful to be able to repurpose or reframe archived web content in new groupings!

Screenshot with a modern house designed by Frank Lloyd Wright.

The David and Gladys Wright House seed that has been shared into Columbia University Libraries’ new Frank Lloyd Wright Collection.

Do you have similar collections that could use some redefining? Some content you’d like to provide access to through multiple collections? Get started Sharing Seeds in your collections today!