LOCKSS
LOCKSS Group on Facebook
Submitted by dcornwall on Sun, 2008-01-20 19:16.
If you're a member of Facebook and a fan of Lots of Copies Keep Stuff Safe (LOCKSS), then you should join the LOCKSS group on Facebook. If you're not on Facebook, then check out the main LOCKSS web site at http://www.lockss.org or read some of our coverage about LOCKSS.
So far the Facebook group has 31 members, including some members of the LOCKSS team. Check it out if you can, if only to get your friends intrigued by LOCKSS.
Canadian study: P2P users buy more music
Submitted by jrjacobs on Tue, 2007-11-20 11:53.
An economic study funded by the Canadian government has concluded that heavy peer-to-peer (P2P) users buy more music, not less as had been posited by entertainment industry organizations like the MPAA and RIAA. Michael Geist, Canada Research Chair of Internet and E-commerce Law at the University of Ottawa, has more background on his blog.
And why, you say, should FGI care about a Canadian study about file-sharing technology like Napster? Because this technology, a fundamentally different 'Net architecture -- and one that looks and acts like a library consortium! -- is the architecture currently used in LOCKSS and could be widely employed to much positive effect by libraries to build and share digital collections, that's why :-)
However, P2P has been under attack from entertainment industry organizations paranoid about copyright infringement. The attack has been so fierce that some states have begun looking into legislation against P2P (On September 16, 2004, Governor Schwarzenegger signed executive order S-16-04 charging the CA state CIO with the development of a statewide policy on P2P technology. See my P2P backgrounder for more). Legislation against P2P and the persistent equating of P2P with "piracy" have a deleterious effect on libraries and other cultural institutions trying to build systems of better digital access and preservation for the public.
- When assessing the P2P downloading population, there was "a strong positive relationship between P2P file sharing and CD purchasing. That is, among Canadians actually engaged in it, P2P file sharing increases CD purchases." The study estimates that one additional P2P download per month increases music purchasing by 0.44 CDs per year.
- When viewed in the aggregate (i.e., the entire Canadian population), there is no direct relationship between P2P file sharing and CD purchases in Canada. According to the study authors, "the analysis of the entire Canadian population does not uncover either a positive or negative relationship between the number of files downloaded from P2P networks and CDs purchased. That is, we find no direct evidence to suggest that the net effect of P2P file sharing on CD purchasing is either positive or negative for Canada as a whole."
LOCKSS setup, made visual in 5 min
Submitted by dcornwall on Mon, 2007-10-15 16:43.
Many thanks to Karen over at Free Range Librarian for pointing out this YouTube video of LOCKSS installation:
In this video, Angela Slaughter of Indiana University walks us through the steps of setting up a LOCKSS cache. Please watch this five-minute video and see how easy it could be for your institution. Then hop on over to http://www.lockss.org, learn more and consider what LOCKSS could do for you!
Karen G. Schneider on LOCKSS
Submitted by jajacobs on Thu, 2007-08-16 06:21.
Karen explains LOCKSS software clearly and succinctly -- its technology, costs, benefits, and purpose. She compares LOCKSS and Portico, too.
- Lots of Librarians Can Keep Stuff Safe, By Karen G. Schneider, Library Journal (8/15/2007)
[Libraries] have never before owned so little of the content they manage. LOCKSS offers one solution... [T]he use of LOCKSS for preserving local born-digital content -- with a free download, plus one morning's worth of time -- is certainly worth a spin around the block.
Lunchtime Listen: LOCKSS in Six Minutes, with Cat
Submitted by dcornwall on Sat, 2007-06-23 10:12.
Free Range Librarian Karen G. Schneider sits at a kitchen table with her cat and a basket of eggs and gives a great, simple explanation of how LOCKSS (Lots of Copies Keep Stuff Safe) works, why your library needs to be a part of it, and why it won't be hard. All in six minutes. Watch and learn.
If you pick this as a lunchtime listen, you'll still have 54 minutes of your lunch hour to take care of other errands or read why privacy, access and preservation of government information need a geographically distributed depository library system of the future to be a reality.
FGI Podcast #2 - FGI Roundtable, LOCKSS, Profile America, Scary Guy
Submitted by FGIcaster on Sun, 2007-06-17 07:46.
Show notes for FGI podcast #2
Note: We still have things to learn about podcasting. This time it is audio leveling. The sound does go up and down a bit but hopefully won't be too distracting. If you have tips about equalizing volume levels in Audacity, please send them our way through one of the comment options listed below.
Today's 40-minute show had the following segments:
- FGI Roundtable - Jim A Jacobs, James R. Jacobs, Shinjoung Yeo and Daniel Cornwall discuss what led them to work with Free Government Information and what we see as the big issues facing the government information community.
- Elizabeth Cowell discusses the GPO LOCKSS pilot project at the Spring 2007 Depository Library Conference.
- Podcast Sampler - we take a quick listen to Profile America, a podcast from the US Census Bureau intended to be played by broadcasters. Check out other government podcasts.
- Outro music - Scary Guy by Maria Daines used by permission. According to the story behind the song, this song was inspired by the global Scary Guy project.
David Rosenthal says "Do it for Preservation!"
Submitted by dcornwall on Tue, 2007-06-12 12:59.
David Rosenthal is a member of Stanford's LOCKSS development team who maintains a blog about his professional work. It is well worth reading and deserves a place in everyone's list of RSS feeds.
In a June 10, 2007 posting on reasons to preserve e-journals, David explains that multiple, independently hosted government publications are a good thing because they are TAMPER EVIDENT:
The goal of the FDLP was to provide citizens with ready access to their government's information. But, even though this wasn't the FDLP's primary purpose, it provided a remarkably effective preservation system. It created a large number of copies of the material to be preserved, the more important the material, the more copies. These copies were on low-cost, durable, write-once, tamper-evident media. They were stored in a large number of independently administered repositories, some in different jurisdictions. They are indexed in such a way that it is easy to find some of the copies, but hard to be sure that you have found them all.
Preserved in this way, the information was protected from most of the threats to which stored information is subject. The FDLP's massive degree of replication protected against media decay, fire, flood, earthquake, and so on. The independent administration of the repositories protected against human error, incompetence and many types of process failures. But, perhaps most important, the system made the record tamper evident.
Winston Smith in "1984" was "a clerk for the Ministry of Truth, where his job is to rewrite historical documents so that they match the current party line". George Orwell wasn't a prophet. Throughout history, governments of all stripes have found the need to employ Winston Smiths and the US government is no exception. Government documents are routinely recalled from the FDLP, and some are re-issued after alteration.
An illustration is Volume XXVI of Foreign Relations of the United States, the official history of the US State Department. It covers Indonesia, Malaysia, Singapore and the Philippines between 1964 and 1968. It was completed in 1997 and underwent a 4-year review process. Shortly after publication in 2001, the fact that it included official admissions of US complicity in the murder of at least 100,000 Indonesian "communists" by Suharto's forces became an embarrassment, and the CIA attempted to prevent distribution. This effort became public, and was thwarted when the incriminating material was leaked to the National Security Archive and others.
The important property of the FDLP is that in order to suppress or edit the record of government documents, the administration of the day has to write letters, or send US Marshals, to a large number of libraries around the country. It is hard to do this without attracting attention, as happened with Volume XXVI. Attracting attention to the fact that you are attempting to suppress or re-write history is self-defeating. This deters most attempts to do it, and raises the bar of desperation needed to try. It also ensures that, without really extraordinary precautions, even if an attempt succeeds it will not do so without trace. That is what tamper-evident means. It is almost impossible to make the record tamper-proof against the government in power, but the paper FDLP was a very good implementation of a tamper-evident record.
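To make "tamper evident" a little more concrete, here is a minimal sketch of why many independently held copies make alteration noticeable. It is only an illustration: the repository names and sample text are made up, and comparing each copy against the most common SHA-256 digest is a simplification of our own, not the actual LOCKSS polling protocol and not anything taken from David's post.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one repository's copy."""
    return hashlib.sha256(content).hexdigest()


def find_outliers(copies: Dict[str, bytes]) -> List[str]:
    """Return the repositories whose copy differs from the most common digest."""
    digests = {name: digest(data) for name, data in copies.items()}
    # The digest held by the largest number of repositories is treated as the reference.
    majority, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority)


# Hypothetical holdings: four libraries, one of which holds an altered copy.
copies = {
    "library_a": b"Volume XXVI, original text",
    "library_b": b"Volume XXVI, original text",
    "library_c": b"Volume XXVI, altered text",
    "library_d": b"Volume XXVI, original text",
}

print(find_outliers(copies))  # ['library_c']
```

With a single copy on a single server there is nothing to compare against, so the same check has nothing to report.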
You'll notice that David refers to the depository program in the past tense. He does so because, like GPO itself, he sees the Future Digital System (FDSys) as an inevitable total replacement:
It should have become evident by now that I am using the past tense when describing the FDLP. The program is ending and being replaced by FDSys. This is in effect a single huge web server run by the GPO on which all government documents will be published. The argument is that through the Web citizens have much better and more immediate access to government information than through an FDLP library. That's true, but FDSys is also Winston Smith's dream machine, providing a point-and-click interface to instant history suppression and re-writing.
David thinks this is a bad thing and GPO assures us it is a good thing, but both assume this is where we are going.
But it doesn't have to be this way. We in the FDLP are definitely "Not Dead Yet!" We have a vital role to play in continuing to preserve the tangible materials entrusted into our care. Further, hundreds of new tangible titles are being shipped each month by GPO to the 1200 plus federal depository libraries.
And while the depository community hasn't exactly leaped up and embraced its responsibility to preserve federal electronic publications, individual libraries like the University of North Texas and the New Mexico State Library have. Together with others who have held views on preservation similar to David's for years, these libraries will help build the depository system of the future.
Or we can sit back and let Winston Smith control our government information. If you are a government information specialist, it's up to you.
P2P Knowledge low in academic librarians?
Submitted by dcornwall on Thu, 2007-06-07 12:43.
As I mentioned in my posting on social psychology for librarians, people tend to follow the "central" route of attitude change only if these three conditions are present:
- Relevance to audience;
- Audience has knowledge in the domain;
- Audience has sense of personal responsibility.
I suggested that items 2 and 3 are weak among documents librarians who hear messages about the importance of building local but Internet-accessible digital collections of government documents like the UNT CRS Reports Collection.
A new article:
Hendrix, Dean.
Peer-to-Peer (P2P) Knowledge, Use, and Attitudes of Academic Librarians
portal: Libraries and the Academy - Volume 7, Number 2, April 2007, pp. 191-212
Link to Abstract
seems to show that lack of knowledge is part of the problem. This article documents a survey of 162 academic librarians and finds in part:
Overall, academic librarians demonstrated low knowledge levels (mean quiz score = 49 percent), rarely used P2P applications, and exhibited indifferent attitudes (total neutral responses = 42 percent) toward these burgeoning information technologies.
Considering that LOCKSS is a P2P technology, maybe it shouldn't be surprising that the mostly academic documents depository community doesn't quite grasp the power of the P2P approach.
But we don't have to stay unaware of such technologies. Here are a few things you can do to become aware of what's available and what it can do:
- Read James R's P2P Backgrounder
- Check out our Digital Libraries Technologies Page
- Read LOCKSS for Librarians
Early CLOCKSS Lessons
Submitted by dcornwall on Mon, 2007-05-14 10:15.
Reprinted with permission from the LOCKSS Alliance mailing list:
-------------------
Dear Colleagues,
The CLOCKSS (Controlled LOCKSS) Board would like to take this opportunity to apprise you of our progress, to share early lessons, and to encourage you to participate in the process of building this shared resource.
The CLOCKSS participants (major academic publishers, research libraries, and the Stanford University team) are building a community-governed, stable, digital archive for published scholarly content. CLOCKSS access is unbundled from fees: after a "trigger" event (when a publisher is no longer able to provide electronic access to some or all of its archived material), content will be freely available to all. Many libraries have moved away from building and preserving collections, and there is increasing interest in community stewardship and preservation of, and guaranteed long-term access to, scholarly publications.
Since its inception early in 2006, the CLOCKSS members made significant strides towards the effective management of archived materials, and learned some important lessons. We are also extremely proud to have been awarded the ALA ALCTS 2007 Outstanding Collaboration Citation, which will be formally presented at ALA's annual meeting in Washington in June.
To find out more about our early lessons and progress, go to www.clockss.org and click on the link "CLOCKSS Lessons."
As always, we welcome comments and suggestions. Please let us hear from you.
Sincerely,
Vicky Reich
vreich@stanford.edu
--------------------
I took Vicky's advice and checked out some of the CLOCKSS lessons. While I think you should read the entire five-page document, here are some good quotes that I think are worthwhile to documents librarians. Just think of "federal government" whenever you see the word "publisher":
The most important, and first, lesson learned by CLOCKSS participants was that commercial, university press, and society publishers; and librarians can collaborate effectively and thrive by working as equals to build a community-governed archive. The CLOCKSS Board meets formally twice each month by phone and twice a year in person. The Board establishes policies and implements procedures for a wide range of social, business, content, and technical issues.
---------
The archived content is a valuable asset, into which scholars, librarians, and publishers have made considerable long-term investments; it must be protected from a wide variety of possible disruptions whether deliberate or accidental. The CLOCKSS archive network is made up of widely distributed host libraries spanning geographic, political and legal boundaries, and this global network, under the stewardship of those who've invested so heavily in it, will protect these important assets for future generations of scholars.
------------
In February 2007, the CLOCKSS team first successfully demonstrated the process that would follow a trigger event (retrieving preserved presentation content from the network of CLOCKSS boxes, transferring it to a publishing platform, and making it available to readers).
-------------
Over the long term, the CLOCKSS Board intends to raise a capital fund to pay for most (if not all) of the archive's ongoing expenses. Digital preservation requires continuous processes; when active preservation ceases, materials are lost. By building a capital fund and becoming self-sustaining, CLOCKSS will ensure that the preservation processes continue over time, regardless of the availability of outside sources of revenue (a circumstance with which libraries are well-familiar – witness the recent rescission of Library of Congress NDIIPP funding to help finance other American government priorities).
----------------
No one agency can or should preserve government information all on its own. There is another way.
More LOCKSS slides from Spring 2007 Depository Library Council
Submitted by jrjacobs on Tue, 2007-05-01 13:54.
Patricia Kenly from Georgia Tech has given us permission to post her PowerPoint presentation describing Georgia Tech's LOCKSS program (PPT). I also added a link to it on our Spring '07 DLC page. While I wasn't at DLC, what I gleaned from Patricia's slides is that LOCKSS is relatively easy to administer, the hardware is cheap, LOCKSS has quickly become the library's primary strategy for digital preservation, and Georgia Tech is willing to share information with anyone. Thanks, Patricia, for the slides!
Roughly 2/3 of DLC LOCKSS Session Available
Submitted by dcornwall on Tue, 2007-04-24 21:23.
We've updated our Spring 2007 DLC conference page with audio and a rough transcript from the LOCKSS panel. We regret that our reporter missed the first 20 minutes. The session began at 1330 and our audio and transcript start at 1351.
We figure LOCKSS is so important that missing some of it was better than not posting. If you were at the panel, please fill us in about the first 20 minutes.
GPO LOCKSS report: Why LOCKSS vs. FDsys?
Submitted by jajacobs on Thu, 2007-04-12 06:26.
I do not understand why the GPO report on LOCKSS (GPO LOCKSS Pilot: Final Analysis, Government Printing Office, April 12, 2007) seems to take an "either LOCKSS or FDsys" approach.*
It is my understanding that FDsys uses the Open Archival Information System (OAIS) reference model and that LOCKSS is 100 percent OAIS compatible (OAIS Formal statement of Conformance to ISO 14721:2003).
To me, this means that the bulk of the work that GPO did to get content ready for LOCKSS would be work that would also prepare content for FDsys. If that is true, the two systems are not incompatible, but complementary. GPO should not need to choose one or the other; it could choose both.
The report notes in its Final Recommendations that "GPO's emerging enterprise architecture requires that new applications be compatible with FDsys or face the risk of near-term obsolescence" but it does not explain how LOCKSS does not meet this criterion.
In both the report and the Clarification, GPO says that it would require 9 FTEs to harvest and re-publish content to GPO servers in a LOCKSS-friendly format and only 1 FTE to write LOCKSS plugins. This is the "much larger use of staff time" that evidently is the primary reason GPO is using to decide to "devote its resources to the development of FDsys..." and devote no resources to using or enabling use of LOCKSS.
This leads me to more questions about the GPO LOCKSS report. Perhaps these can be addressed at DLC?
- In what way does LOCKSS fail to meet the criterion "that new applications be compatible with FDsys"?
- In what way would content processed for LOCKSS not be usable by FDsys?
- What will the cost be to process into FDsys the 592 serial titles identified in the GPO LOCKSS report?
Finally, the report consistently refers to LOCKSS as a "distribution" system and ignores LOCKSS as a preservation system. The report only mentions the benefits of LOCKSS once ("IP authentication for over 1260 depositories would be cumbersome, and may not be cost effective in relation to the benefit received." p.6) but does not specify what GPO sees as the benefits of LOCKSS. I would suggest that a better evaluation of LOCKSS by GPO would have included the benefits of LOCKSS, particularly as a redundant (fiscally, physically, and technologically) preservation system, and would have included a perspective of LOCKSS as complementary to FDsys, not a competitor to it.
* The report says that GPO tested the LOCKSS technology "as a potential precursor to GPO's Future Digital System (FDsys)" not as a complement to FDsys. It speaks of a choice between "duplicating effort" or requiring all libraries to use LOCKSS. It sets up scenarios such as "If a title such as this were to be distributed through LOCKSS only..." and "If LOCKSS were to become the only distribution method for e-journals distributed to the FDLP..."
GPO, LOCKSS, IP Authentication, and the future of FDLP -- more clarification needed
Submitted by jajacobs on Wed, 2007-04-11 17:34.
If you have not had a chance to read the message from Joseph P. Paskoski (Clarification on GPO LOCKSS report), I encourage you to do so. It does indeed help clarify GPO's intentions, which, I believe, seriously endanger long-term, free, public access to government information.
For those confused by the recent thread about GPO, LOCKSS, and IP authentication, allow me to try to summarize what we now know:
GPO is not "advocating" use of IP authentication for LOCKSS.
On the other hand, GPO is considering "an exclusive service for depository libraries" and is recommending exploring "other user authentication options" to implement such a system.
Further, GPO is only willing to "consider" (not guarantee) making content available without user authentication. This is evidently true of FDsys as well as any use of LOCKSS.
To me, this means that GPO is, indeed, planning a two-tier system of digital distribution: one exclusively for depository libraries (and, presumably, free) and, by implication, a second system presumably for the general public and based on cost recovery.
For this to work, GPO would have to do two things. First, it would have to restrict what FDLP libraries can do with the content they receive, either through technological locks or limitations, or licensing restrictions (including restrictions on re-distribution). Second, if GPO offered any content to the general public for free, it would have to offer similarly technologically dumbed-down, less-than-fully-functional, non-reusable content -- much the way Amazon offers one-page-at-a-time viewing of books as a teaser to get you to purchase the entire book. (For more on this see Why does GPO want to use IP Authentication?)
This model seems clear: distribution to depository libraries for free, but with limitations on use, location, and so forth; and distribution to the general public for a fee.
This sounds like an implementation of what GPO's strategic vision promised: a commitment to "free and ready public access to" government information "in partnership with Federal Depository libraries" while maintaining a separate, fee-based channel to meet its commitment to "distribute, on a cost recovery basis, copies of printed and electronic documents and other government information products to the general public," [emphasis added] (A strategic vision for the 21st century).
Why is this a threat to long-term, free, public access to government information? Imagine what such a system would look like to your users: They could use the net to get what they need, but they may have to pay or use a dumbed-down version. Or they could go offline, go to their library and use a "free" version, which would also have DRM or licensing restrictions, or both.
This is a far cry from DLC's underlying assumption that "much of the access to federal information resources is available 24/7 on the Internet" ("Knowledge Will Forever Govern": A Vision Statement For Federal Depository Libraries In The 21st Century).
What this sounds like to me is a revival of the GPO bookstore concept for the digital age with the (fee-based) bookstore as the primary means of access to most government information and the go-to-the-library-building FDLP as the "free" path. This puts libraries and free access as second-tier, non-networked alternatives for users. It would mean that libraries would be unable to participate in the open and free flow and re-use of government information (Web 2.0, Semantic Web, etc.). It is a vision of government information closer to Jack Valenti's vision of movie distribution than to Jefferson or Madison's visions of government information.
Imagine what this would mean to FDLP libraries and their ability to preserve access to information. Would systems like LOCKSS even be permitted? Or would locked-down-with-DRM or technologically-dumbed-down free versions made available to libraries be technically (or by license) un-preservable?
I may be misinterpreting GPO's statements and I hope I am. I would welcome hearing further clarifications from GPO including that it does not intend to use DRM and that it does not intend to restrict what FDLP libraries or others can do with free content. I would welcome hearing from GPO that it does not intend to provide dumbed-down or technologically locked or functionally-disabled content for free while providing fully-functional content for a fee. I invite GPO to commit itself to open, free, reusable, preservable, distributable, unencumbered, fully-functional government information. I urge DLC to insist on digital distribution so that FDLP libraries can be fully functional online partners in the organization and preservation of government information and not by-standers who hope GPO will get funding to do so.
GPO could also show its good faith by continuing to study LOCKSS as one (not the exclusive) method of preservation and not just a method for distribution. The project could be expanded to include more than just e-journals. GPO could evaluate automated harvesting using tools that automatically create new directory structures and actively seek ways to help other depository libraries participate (e.g., reviewing automated harvesting).
Perhaps there can be some discussion of these issues at DLC.
Until we get further clarification, we'll all be left wondering.
GPO LOCKSS Report: Why does GPO want to use IP Authentication?
Submitted by jajacobs on Mon, 2007-04-09 10:20.
GPO's report (GPO LOCKSS Pilot: Final Analysis, Government Printing Office, April 12, 2007), which analyzes the LOCKSS technology and announces GPO's findings and "future recommendations" on using LOCKSS, refers repeatedly to use of IP Authentication*. The document does not, however, discuss the need for IP authentication either in the pilot or in possible future implementations of LOCKSS for government information.
Since LOCKSS does not require IP Authentication, this raises interesting questions.
While it is reasonable to assume that GPO wanted to limit access to the documents it was using for the LOCKSS pilot project to those who were participating in the project, it is not clear why GPO would consider IP Authentication necessary for a live implementation of LOCKSS. But the report clearly states as an "Outstanding Issue":
IP authentication for over 1260 depositories would be cumbersome, and may not be cost effective in relation to the benefit received. (page 6)
And, later, the report stresses the costs of IP Authentication:
LOCKSS technology in itself appears to be relatively cost efficient as a distribution mechanism. Costs appear to be a bigger issue in relation to staff time required to ... administer IP authentication. (page 11)
Since the report does not explain why it would want to use IP Authentication for LOCKSS, we can only speculate why it includes it as a cost. Here are my speculations. I pose them as questions and would welcome answers from GPO.
- Is GPO planning to set up a special distribution system for depository libraries only?
This would make sense if GPO wants to use this special distribution channel to meet its commitment to "free and ready public access to" government information "in partnership with Federal Depository libraries" while maintaining a separate, fee-based channel to meet its commitment to "distribute, on a cost recovery basis, copies of printed and electronic documents and other government information products to the general public," [emphasis added] (A strategic vision for the 21st century).
For this to work, GPO would have to restrict what FDLP libraries can do with the content they receive, either through technological locks or limitations, or licensing restrictions. We have already seen a precursor to the use of licensing restrictions with the Library of Congress Subject Headings (See: GPO details onerous restrictions on digital materials).
This seems to me the most likely reason for the inclusion of IP authentication in the report because it fits in well with the contradictory missions noted above of providing information for free and for a fee and with GPO's previous experience with this very contradiction. (Years ago, when GPO tried to charge for GPO Access, it tried to limit free use of it to those physically inside depository libraries. When that failed because libraries made the same content available on the net, GPO was forced to go to a model of making "it free to the general public." But, as Bruce James said, "This cannot continue." [See Summary, 2003 Fall Meeting Depository Library Council.])
The new model seems clear: "Free" to depository libraries, but with limitations on use, location, and so forth; and "Fee" to the general public.
- Is GPO planning a separate, FDLP-only distribution channel as a way of providing "authentication" of content?
This would fit in well with GPO's consistently stated intention of being a "single authoritative resource" for digital Federal documents (A Strategic Vision).
Information distributed through such a limited access channel could come with a special cachet of "being deposited" and FDLP libraries and no one else would be able to claim a special authenticity to such distribution. I do not think that it would be either necessary or wise to limit "authenticity" in this way, but, perhaps someone at GPO is thinking along those lines?
- Did those who wrote the report fail to consult policy makers within GPO and make a faulty assumption that GPO wants IP Authentication?
This would indicate that the report is either incomplete or badly done.
- Did those who wrote the report fail to understand the technology they were describing and think IP Authentication was necessary to implement LOCKSS?
This would also indicate that the report, and perhaps the entire evaluation process, was flawed badly.
- Is IP Authentication just a red-herring intended to confuse the issue, raise the theoretical costs of implementation, and provide evidence for GPO's conclusion that LOCKSS won't work for GPO?
This would indicate that GPO, which in its own words only took on the pilot project after receiving "requests from research institutions, universities, depository libraries, and other Federal Government agencies to investigate using LOCKSS", never considered LOCKSS as a viable alternative. Indeed the report makes this fairly clear when it says that it "agreed" to the pilot project to test the LOCKSS technology "...as a potential precursor to GPO's Future Digital System (FDsys)." [emphasis added]
None of these speculations are encouraging. They lead me to conclusions that do not augur well for free public access to public information. Again, I would welcome a response from GPO.
* In a library environment, "IP Authentication" normally refers to a process that allows access to licensed content. For example, the library subscribes to a journal collection or database and pays the vendor fees that allow certain computers (e.g. all those on a campus) to have access to that content. The library sends the addresses of those computers ("IP addresses") to the publisher. The publisher maintains a service that allows any request from one of those machines to get content. For more information, see Offering remote access to restricted resources by Marshall Breeding, Information Today, Volume 18 Number 18 (May 2001) p52-53.
Depository libraries can use IP Authentication to allow two and only two machines to have access to STAT-USA. (See "STAT-USA Offers Depositories IP Authentication Access" in Administrative Notes Newsletter of the Federal Depository Library Program Vol. 27, no. 03-04 GP 3.16/3-2:27/03-04 March 15/April 15, 2006.)
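For readers unfamiliar with the mechanism, the check a publisher performs can be sketched in a few lines of Python. This is only an illustration of the general idea described above, not a description of how GPO, STAT-USA, or any vendor actually implements it; the address ranges and the function name `is_authorized` are hypothetical.

```python
import ipaddress

# Hypothetical address ranges a library might register with a vendor.
# These are reserved documentation ranges, not real library addresses.
REGISTERED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),      # e.g. every machine on a campus
    ipaddress.ip_network("198.51.100.17/32"),  # e.g. one designated workstation
]


def is_authorized(client_ip: str) -> bool:
    """Return True if the requesting address falls inside a registered range."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is refused.
        return False
    return any(address in network for network in REGISTERED_NETWORKS)
```

The point of the sketch is that the publisher has to maintain the list of registered addresses for every subscribing library, which is where any scalability concern would come from.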
GPO LOCKSS Report: Mistakes and Irrelevancies
Submitted by dcornwall on Sun, 2007-04-08 20:50.
As I mentioned a few days ago, the final analysis report of the GPO LOCKSS Pilot Project is now available on GPO Access at
http://www.access.gpo.gov/su_docs/fdlp/lockss/index.html.
The comments that follow are based on analysis by Jim A. Jacobs and Daniel Cornwall.
The most disappointing thing about the report isn't so much that GPO rejects LOCKSS as a distribution mechanism, but its reasons for doing so. The 19-page report reads less like an evaluation of software than like a document defending a previously made decision on the basis of mere assertions, many of which have nothing to do with LOCKSS itself.
We believe the biggest problems with this report fall into two categories: 1) statements about LOCKSS that appear to be based on a lack of understanding of how LOCKSS works, and 2) negative statements that have little connection to LOCKSS. One problem in the second category is so big that Jim will address it in a separate posting: the issue of IP authentication. GPO repeatedly cites the non-scalability of IP authentication as a major stumbling block to using LOCKSS as a distribution channel. The report never explains why IP authentication is so important to GPO, and that will be the subject of Jim's posting.
So, what else is wrong with this report? Let's start with some statements about LOCKSS that, to me as a participant in LOCKSS as both publisher and library, simply don't make sense.
Page 12 - LOCKSS may have format migration issues similar to CD-ROMs and other tangible electronic media in depository libraries.
I'm not positive I really know what this means, but if they're talking about LOCKSS hardware, that's no more of an issue than having computers in libraries at all. LOCKSS caches are just regular computers as the report itself says. Because content is duplicated on other LOCKSS caches, migrating hardware is as simple as buying a new computer, installing LOCKSS, picking your content and sitting back as content is reloaded from other caches. If GPO is referring to file format migration, LOCKSS researchers have been working on that issue since 2005 and have come up with a promising approach.
Page 12 - Explore options for making content available from a single site that would allow LOCKSS libraries and non-LOCKSS libraries to access content from the same source. This would eliminate duplication of effort required to make content available to both groups of libraries.
If you check out any of the nonrestricted materials available through LOCKSS, whether it is Alaska State Documents or BioMed Central Journals, all content available through LOCKSS is already available to non-LOCKSS users from the same interface. No duplication of effort is needed. Simply have the content available in some kind of human-intuitive archive units, and people and LOCKSS caches alike are happy. Unless GPO has plans for restricting public domain government information, this just isn't an issue.
Page 12-13 - However, formatting the content to enable LOCKSS use could require more clicks than the current model for non-LOCKSS users to access the content.
All LOCKSS needs is issues organized by year or some other suitable Archival Unit. It seems to me that is how most journals are laid out. No more clicks would be involved. GPO could have bolstered its case by providing an example, but I just can't think of one.
Page 11 - If LOCKSS were to become the only distribution method for e-journals distributed to the FDLP, all depositories would have to join the LOCKSS Alliance.
No. Only libraries wishing to build local digital collections and pursue preservation through multiple copies would need to join. Otherwise the model of access instead of custody will continue to operate. Making content available through LOCKSS is a matter of providing a publisher manifest page and writing a LOCKSS plugin. LOCKSS doesn't restrict your content any more than you already do. No library would be forced into LOCKSS unless GPO chose to make it a requirement.
Finally, there is the unspoken issue that indicates a lack of understanding about LOCKSS. GPO consistently refers to LOCKSS as a system of distribution, but never mentions its preservation value. LOCKSS has proved itself since 1999 in the area of commercial journals. I would think that any evenhanded evaluation of LOCKSS would need to examine its capability as a preservation system, a job it has been doing for eight years longer than FDSys has.
The second major problem area is that this purported evaluation of the LOCKSS technology is sprinkled with negative comments that have little to do with LOCKSS itself. In this category I'd put:
- Some libraries may want to “weed” publications from their caches in order to regain disk space at times. (page 11)
- It is unclear what percentage of FDLP libraries want to utilize LOCKSS for e-journal content. (page 11)
- It is not clear whether libraries that do want LOCKSS want it as an exclusive service to depositories, or whether they simply want to enable libraries to archive content locally. (page 11)
- Many agencies will complain if they believe GPO is taking business away from their Web sites by republishing content on GPO Web sites. (page 11)
- The depository library community has not been surveyed to determine whether there is wide enough support for LOCKSS to use it as the only e-journal delivery mechanism to justify requiring all libraries to receive e-journal content through LOCKSS. (page 12)
What all of the above statements have in common is that GPO doesn't know the answers to the implied questions, hasn't asked despite being formally interested in LOCKSS for more than three years, and is raising questions that have no bearing on what LOCKSS can do. They really sound like excuses rather than reasons.
Aside from these two major sets of problems, I believe that most of the problems that GPO attributes to LOCKSS apply equally to the Future Digital System (FDSys). Think about it. If GPO serves agency content through FDSys, won't that take business away from agency web sites? Or when GPO states on page nine, "technical sustainability and longevity of the LOCKSS platform as a long-term archiving solution needs to be assessed", how is this different from the Future Digital System, except that LOCKSS has been around since 1999 and the FDSys is still in development?
Since I like to note when I agree with an opponent's point, I'd like to say that GPO does have a point about the time-intensive nature of manually harvesting journals. Their collection times are similar to what I have experienced in collecting Alaska State Journals and annual reports. That's part of the reason that Alaska's LOCKSS program focuses on monographs distributed on our monthly shipping lists. This allows us to distribute hundreds of titles in a single plugin in a way that adds at least minimal metadata to the LOCKSS system.
Rather than abandoning LOCKSS because manually harvesting journals is too hard, GPO or some enterprising library or group of libraries should explore the automated download and posting of materials found in GPO's New Electronic Titles on a monthly basis. Even an experiment with a single item number or small agency might be useful.
Inasmuch as the Future Digital System could produce an output like the New Electronic Titles page, all GPO might have to do would be to include a LOCKSS permission statement on its "new titles" results page and let the LOCKSS community write the plugins. GPO could have its centralized system, and interested libraries, some depository and some not, could have assurance that publications would be preserved outside of a chronically underfunded federal agency.
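To give a sense of how little might be involved, here is a rough Python sketch of generating a "new titles" page, grouped by year, that carries a permission statement. It is an assumption-laden illustration, not GPO's or the LOCKSS team's method: the function names, sample titles, and URLs are hypothetical, and the permission wording shown is the statement as I understand it and should be checked against current LOCKSS documentation before use.

```python
from collections import defaultdict
from html import escape

# Assumed wording of the LOCKSS permission statement; verify before relying on it.
PERMISSION = (
    "LOCKSS system has permission to collect, preserve, and serve this Archival Unit"
)


def group_by_year(titles):
    """Group (year, title, url) tuples into one list of (title, url) per year."""
    units = defaultdict(list)
    for year, title, url in titles:
        units[year].append((title, url))
    return dict(units)


def manifest_page(year, entries):
    """Build a simple HTML page linking every title for one year."""
    links = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in entries
    )
    return (
        f"<html><head><title>New titles, {year}</title></head><body>\n"
        f"<ul>\n{links}\n</ul>\n"
        f"<p>{PERMISSION}</p>\n"
        "</body></html>"
    )
```

The same page serves human readers and LOCKSS caches alike, which is the point made earlier about not needing a duplicate interface. A plugin would still have to be written to tell LOCKSS how to crawl from this page.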