[UPDATE 9/23/11: It's come to our attention that scribd, the site that hosts the document below, does not make it easy for users to download. In some instances it appears as if the user has to subscribe to scribd before they can download. So I've attached a copy of the document below for your free downloading pleasure. JRJ]
In early April, Michael Keller, Stanford University Librarian and my boss, had a phone conversation with Beth Simone Noveck, US Deputy Chief Technology Officer for Open Government, who leads President Obama's Open Government Initiative. Noveck requested a short report outlining how the digital FDLP would work.
Below is that report outlining a distributed ecosystem, or publications.gov, that "would incorporate collaborative cataloging/metadata creation, as well as shared or Peer-to-Peer (P2P) technical infrastructure in which data and technological redundancy and collective and proactive action reign." As many of you already know, some of the pieces for a digital FDLP ecosystem are already in place. However, as our recent post, "The State of FDsys and the Future of the FDLP", showed, some of those critical pieces are on shaky ground to say the least.
The report was forwarded to Bob Tapella and Mike Wash at GPO as well as Aneesh Chopra, Chief Technology Officer (CTO), Vivek Kundra, Chief Information Officer (CIO), and US Archivist David Ferriero.
FDLP issues are now front and center to the movers and shakers in the Obama administration. But we'll need more libraries and librarians willing to step up and pitch in to make the digital FDLP ecosystem a reality.
Digital FDLP Ecosystem
A couple of recent events have caused me to reanalyze and clarify my thoughts about cloud computing: first there was the GPO PURL server crash, and today there's the story about massive data loss from T-Mobile and Microsoft/Danger for anyone using a Sidekick:
"Regrettably, based on Microsoft/Danger's latest recovery assessment of their systems, we must now inform you that personal information stored on your device—such as contacts, calendar entries, to-do lists or photos—that is no longer on your Sidekick almost certainly has been lost as a result of a server failure at Microsoft/Danger."
Cloud computing is basically the outsourcing of Web services (storage, email and other application layers, computational cycles, etc.) to a third party. Although I am guilty of using the cloud metaphor to describe the digital FDLP, it's clear from the concept map below that I don't mean we should outsource FDLP Web services to third parties. I hope it's clear that I'm describing a collaborative and distributed system of digital content, collaborative cataloging/metadata creation, as well as shared technical infrastructure in which data and technological redundancy and collective and proactive action reign. This is the exact opposite of the "cloud."
So what would that metaphor be? I was thinking of the birch or banyan tree, but it's more like the symbiosis or mutual aid exhibited by certain ants and trees. It's a peer-to-peer network with a conscience. Let's call it the FDLP ecosystem.
Below is a glossary of terms that are commonly used when talking about the Depository Library Program, government information in libraries, and digital technologies. It's important to have clearly defined terms when discussing issues of access to and preservation of digital government information and so we've created an ongoing glossary of key terms. Please contact us at admin AT freegovinfo DOT info with questions or to suggest other terms for the glossary.
API - An Application Programming Interface (API) is a definition of how computer programs can interact with a particular dataset, database, web site, or other cache of information. Programmers can use an API to design ways of dynamically interacting with the target data. An API increases the flexibility of information provision because, rather than limiting users to a single interface provided by the information provider or publisher, the API allows others to program different interfaces customized for different communities or purposes. Mashups often use open APIs to combine data from different sources (e.g., census and crime and maps) to re-use and re-purpose what would otherwise be separate stove-pipes of information and static documents.
Backup - A copy of data intended to be used to restore the original after loss or corruption of the original.
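Cloud computing - The outsourcing of Web services (storage, email and other application layers, computational cycles, etc.) to a third party.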
Dark Archive - "An archive that cannot be accessed by any users.... The purpose of a dark archive is to function as a repository for information that can be used as a failsafe during disaster recovery." http://www.webopedia.com/TERM/D/dark_archive.html
Digital Deposit - U.S. government publications in digital format deposited in FDLP libraries. In other words, GPO sends (i.e., deposits) authentic digital files to depository libraries. A Digital Depository has a digital collection that it selects, acquires, organizes, and maintains. Note that a library may also maintain links to digital publications that are housed, organized, and maintained by someone else. This may be useful, but it is not "digital deposit" because nothing is deposited.
FDDLP, Federal Digital Depository Library Program - The FDLP with digital deposit.
Infrastructure, Financial - The existing and continuing economic support for a digital library.
Infrastructure, social/community/mission - The non-technical and non-financial foundations of a digital library. These include the library's mission and purpose, its designated community of users, and its role in society.
Infrastructure, Technological - The hardware, software, networking, and data-center facilities of a digital library.
Mirror Site - "On the Internet, a mirror site is an exact copy of another Internet site. Mirror sites are most commonly used to provide multiple sources of the same information." http://en.wikipedia.org/wiki/Mirror_%28computing%29
OAI - The Open Archives Initiative. (http://www.openarchives.org/) Not to be confused with OAIS. "The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content." One standard OAI promotes is the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH), which is a "mechanism for repository interoperability." It specifies a way for repositories to expose structured metadata about their contents. Others can then use the Protocol to harvest that metadata or the documents it describes.
OAIS - Open Archival Information System (OAIS) is a document that describes the essential functions and components of a digital archive. It is widely used (Library of Congress, National Archives, RLG, Harvard, UK Data Archive, British Library, etc.) in the design and evaluation of digital archives. GPO used it in designing FDsys. OAIS goes beyond "bit preservation." An OAIS-compliant archive accepts the responsibility to ensure that information will be usable by a designated community.
PURL - Persistent Uniform Resource Locator, or purl. See "Purls vs handles" for a larger discussion.
XML - Extensible Markup Language (XML) is a specification for creating custom markup languages intended for use on the Web. Like HTML, it allows text to be tagged or "marked up." Unlike HTML, which focuses on appearance, XML allows content to be tagged to denote meaning. XML is generally human-readable and therefore more easily preserved than proprietary, binary formats. It is also designed so that it can be easily parsed by computers and so is often called "machine-actionable" because it makes it possible to more easily re-use and re-purpose content. An example of XML being used for government publications is the XML version of the Federal Register. This has already prompted new uses of the FR at FedThread.org. FedThread ("a new way of interacting with the Federal Register") uses the government-provided XML to create functionality that the government does not provide. This includes collaborative annotation, advanced search, customized feeds, and more.
Virtual Depository - This is a commonly used, but misleading term. Virtual depository is NOT digital deposit (See FGI post, "Toward a definition of 'virtual depository'"). GPO has used the term to mean replacing the actual deposit of paper documents in a depository library's collections with links in the library's online catalog to digital documents housed on GPO servers. Virtual depository is then the antithesis of digital deposit.
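To make the API, OAI, and XML entries above a bit more concrete, here is a minimal sketch in Python of what harvesting metadata over OAI-PMH might look like. The `verb=ListRecords` and `metadataPrefix=oai_dc` parameters come from the OAI-PMH protocol itself. Everything else is an assumption made for illustration: the endpoint `repository.example.gov` is hypothetical, the sample response is hand-written and trimmed, and the function names `list_records_url` and `extract_titles` are our own. No network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical endpoint, for illustration only; not a real repository.
BASE_URL = "https://repository.example.gov/oai"

# XML namespaces used by OAI-PMH responses and Dublin Core metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# Hand-written sample response (an assumption), trimmed to the parts parsed below.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Government Report</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs for each record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


if __name__ == "__main__":
    print(list_records_url(BASE_URL))
    for identifier, title in extract_titles(SAMPLE_RESPONSE):
        print(identifier, title)
```

Because the metadata is exposed as tagged XML through a published protocol, any library could write a harvester along these lines and build its own catalog or interface on top of it, rather than depending on a single provider's web site.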