What do you want to know about Archive-It?
I'd like to survey you, our loyal FGI readers. I'm co-presenting with Molly Bragg at next week's Depository Library Council conference about digital collections using Archive-It (see title and abstract below). I've got an outline, but I'd really like to know what questions YOU have about Archive-It and digital collections. What do YOU want to know about Archive-It? So please, please, please leave a comment here so that my presentation will be even more amazing :-)
Title of Presentation:
Gone Today, Here Tomorrow: Archiving and Preserving Born Digital Government Documents
Abstract:
Stanford University Library has been a federal depository library since 1895. In 2007, the library began collecting born digital documents using Archive-It, the web archiving service from Internet Archive (www.archive-it.org). In this presentation James Jacobs will discuss his group's objectives and procedures for selecting and archiving digital content and share examples of the unique content preserved. Molly Bragg will present an overview of web archiving projects and tools used and developed by Internet Archive. These tools are used by libraries around the world to preserve government documents and other born digital content.

Questions about Archive-It
1. Does Archive-It also archive digitized historical government documents or is that only via Internet Archive's (archive.org) text collections?
2. I know Internet Archive is working with UNT on the "End of Term Presidential Harvest 2008" project, so that means this project is using the capturing tool from Archive-It, correct?
3. What are the competitors? Or is this pretty much the best web archiving project there is?
4. Is it partnered with UNT's Cyber Cemetery?
I think I just need a really good overview of how it works and how I can subscribe and/or use the tool myself to archive government websites and digital government information.
I want libraries in Louisiana to get involved in this too. Louisiana government information needs to be archived and I think Archive-It is a great way to do it. There is a partnership with OCLC preserving digital Louisiana state info/websites though I'm not sure about the specifics or the extent.
Archive-It General Information
Hi Blakeley,
Thanks for expressing your interest in Archive-It. My talk with James will give some overview information about the service, but really the best way to get the full picture is to come to one of the Web Archiving and Archive-It informational webinars we hold twice a month. The next one is coming up on Oct 28 at 11am PT. Contact me if you want to sign up.
I will mention the End of Term harvest briefly in my presentation; Cathy Hartman and Mark Phillips from UNT are also doing a presentation on the project Tuesday at the DLC meetings. I don't know whether the project is part of their Cyber Cemetery. Also, IA is not using Archive-It for the harvesting; instead, one of our engineers is running the crawl directly.
Let's follow up at the conference and/or over email on specifics for Louisiana. My email is mbragg at archive.org
thanks again,
Molly
Thanks, Molly
I would love to attend one of your webinars! I'll email you. And it will be great to meet with you at the conference.
Thanks for answering some of my questions.
See you soon,
Rebecca Blakeley
steps for involvement
I think it would be great if the session gave us some steps/resources/action items we could use for small-scale projects, or ways that we could find out about contributing to projects already in place for a particular area of interest.
Also, I am personally curious about whether and how you (= IA) can track research done with the Internet Archive. Maybe that question's a bit off-topic for this session, though I can see it helping to define potential Archive-It projects.
Questions about Archive-It
Does Archive-It include provisions for migration or emulation to preserve data as formats and software change? How does it ensure that the object in the archive remains unchanged over time? As the official state repository for state e-docs, we have permanent public access as a goal. How does Archive-It provide that?
long-term preservation of Archive-It data
Hi Julie. Good questions. The Archive-It FAQ talks a bit about preservation and data formats. Quite a few state libraries and archives are Archive-It partners, as are many state and federal agencies (including the DOE E-print Network).
Your questions are difficult to answer definitively because digital preservation is in its infancy, and nobody yet knows for sure what the perfect system for long-term preservation looks like. A lot of research is under way at various organizations (like the Library of Congress and the Digital Library Federation) and among numerous librarians and academics.
Be that as it may, the Internet Archive is at the forefront of digital preservation, with one of the largest and longest-standing digital repositories. They use the ARC file format, a non-proprietary storage format and an ISO work item, along with open-source harvesting, indexing, and access software. The Archive stores 2 copies of every file and has built-in redundancy and mirroring of its data. Additionally, all harvested data can be copied and stored locally, for example, in an institutional repository. I don't know much about their hardware management, but I am sure that they've built hardware upgrades into their infrastructure, in addition to assurances for long-term data preservation.
I hope that Molly can chime in some more because I'm sure that I've missed some things. The long and short of it is that the Archive is concerned about long-term storage and preservation. The large number of libraries, archives and government agencies currently using the service attests to the fact that the Archive is a trusted organization and an active player in the field of digital preservation.