gpo

GPO's draft regional libraries report and FGI comments

A few weeks ago, the Government Printing Office released its draft report, Regional Depository Libraries in the 21st Century: A Time for Change?, and asked for comments until June 30. I'm not sure how many comments they received, but I wanted to publish the comments we submitted. Lynne Bradley, Director of the American Library Association Washington Office, did submit comments that were endorsed by the Association of College & Research Libraries (ACRL), the Association for Library Collections & Technical Services (ALCTS), and the Government Documents Roundtable (GODORT). GODORT republished Ms. Bradley's letter on their wiki.

While we are in general agreement with ALA's letter calling for increased flexibility of Title 44 (*not* wholesale changes in the title) and increased appropriations for GPO initiatives and "regional depository libraries to help offset the costs of storing and preserving government property," our comments deal with the more philosophical issues embedded in the draft report. Please let us know what you think.

I. Delete from the report all uses of the adjective "legacy" when referring to collections. The use of the word "legacy" as an adjective comes from computer science and is used to indicate things that are "outdated" and "undesirable." When the report uses the phrase "legacy collections" it implies that it is referring to unwanted and outdated collections. (The report uses "legacy" as an adjective in only one other context: in its reference to sections 1911 and 1912 of Title 44 USC as "Legacy Sections" -- apparently in order to define these sections as out of date and undesirable.) Thus, the use of the phrase "legacy collections" is either inaccurate and misleading, or imprecise.

In its place GPO should use phrases that accurately describe the collections it wishes to discuss. For example, in place of "legacy collections" the report could use phrases such as "collections without adequate bibliographic records" or "collections of print materials" or "collections without digital equivalents" or other phrases that accurately describe the collections GPO is referring to.

If GPO does wish to refer to unwanted out of date materials it should describe them that way explicitly rather than use the term "legacy."

II. The report should more explicitly and accurately address the difference between roles and responsibilities that are legally mandated and those that have been assumed without a legal mandate.

Specifically, we object to the following sentences of the report (Section V.B. pages 16-17) that gloss over these differences. (These sentences refer to Public Law 103-40, The Government Printing Office Electronic Information Access Enhancement Act of 1993.)

The implementation of the GPO Access Act ushered GPO into the online age and accelerated the paradigm shift in the FDLP that changed GPO’s relationship with depository libraries. Regional depositories have the responsibility for permanent public access in the tangible publication environment. In the online information environment GPO has assumed primary responsibility for ensuring content and permanent public access. [emphasis added]

We suggest the following wording instead:

While the GPO Access Act specifically required GPO to "provide a system of online access" and to "operate an electronic storage facility for Federal electronic information," it did not specify any change in the roles of the depository libraries. It added new roles for GPO, but did not reduce, alter, or delete the roles of depository libraries.

Since 1993, Congress has consistently provided funds to GPO for the "distribution" of government publications to designated depository libraries. This wording was carefully chosen. In 2000 the House attempted to substitute the wording "on-line access" for "distribution," but that language was rejected.

Nevertheless, GPO has chosen to implement this law in a way that is shifting the relationship between GPO and depository libraries. GPO has chosen to assume responsibility for permanent public access to digital materials and has chosen not to offer digital deposit as an option to FDLP libraries.

This has resulted in a paradigm shift in access, preservation, and service within the FDLP. Instead of relying on FDLP libraries and their different locations, funding, and technological infrastructures, GPO has chosen to implement policies a) that do not "distribute" digital objects to FDLP libraries, b) that make it difficult for FDLP libraries to build local digital collections, and c) that create a preservation system that depends on a single centralized collection with a single funding source.

While these choices seemed appropriate 15 years ago, much has changed over the years. Many libraries are developing institutional repositories and other digital collections. In a survey in August of 2005, 85% of responding FDLP libraries expressed "high" or "very high" interest in being able to "pull" content from GPO and 65% were equally interested in GPO "pushing" digital content to FDLP libraries. In the current survey of Regionals, 52% expressed a willingness to receive digital files on deposit. Commercial and open source software for managing digital collections is now widely available. As we look at new models and roles for FDLP libraries, we need to consider true digital deposit as a viable and important option. We need to look beyond the now-old model of relying solely on GPO having primary responsibility for ensuring content and permanent public access.

EPA Tagging Results - Ready and Promising

Our report on our experiment in using del.icio.us to tag EPA documents originally harvested by GPO is now completed and available for your review and comment at http://freegovinfo.info/node/1825.

For more information about this project, including a list of tags assigned to documents by project participants, please see http://freegovinfo.info/epatagging.

Our thanks to the project participants!

EPA Tagging Results and Future Directions

Back in January we asked people to use del.icio.us to tag a sample of 32 documents taken from the 100 EPA documents posted by the Government Printing Office (GPO) to http://www.gpoaccess.gov/harvesting/index.html.
We asked people to tag documents from 1/18/2008 through 4/18/2008. A spreadsheet of the results is available at http://spreadsheets.google.com/pub?key=pybymZBlZ80PVat2ggty2GA.
This brief article informally discusses some of our results, offers some lessons learned, and offers suggestions for future projects. Finally, a short list of articles on other research relating to tagging is presented.

1) Findings

  • Number of tagged documents - 31
  • Average number of people tagging a given document - 2.5
  • Highest number of taggers for a document - 8, for the document "Environmental Results Under EPA Assistance Agreements"
  • Average number of deduplicated tags per document - 11.25
  • Number of documents with descriptions - 31, with a majority of documents having more than one human generated description.
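
For anyone who wants to run similar numbers on their own tagging data, here is a minimal sketch of how figures like these could be computed. The function name, the data layout (document title mapped to each tagger's list of tags), and the sample data are all hypothetical, and ignoring capitalization when deduplicating tags is an assumption, not necessarily how the figures above were produced.

```python
def summarize(docs):
    """Summarize tagging activity.

    docs maps a document title to {tagger name: [tags]} (hypothetical layout).
    """
    tagger_counts = [len(by_user) for by_user in docs.values()]
    # Deduplicate tags within each document, ignoring case (an assumption).
    dedup_counts = [
        len({tag.lower() for tags in by_user.values() for tag in tags})
        for by_user in docs.values()
    ]
    return {
        "documents": len(docs),
        "avg_taggers": sum(tagger_counts) / len(docs),
        "max_taggers": max(tagger_counts),
        "avg_dedup_tags": sum(dedup_counts) / len(docs),
        "single_tagger_docs": sum(1 for n in tagger_counts if n == 1),
    }


# Made-up sample data, not taken from the project spreadsheet.
sample = {
    "Doc A": {"user1": ["air", "energy"], "user2": ["Energy", "homes"]},
    "Doc B": {"user1": ["water"]},
}

for name, value in summarize(sample).items():
    print(f"{name}: {value}")
```

The "single_tagger_docs" count is worth watching in any project like this one, since a document tagged by only one person has no second opinion to correct a mistaken description.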

2) Some Promising Results

While we would have liked to have seen more participation (see below under "study limitations"), these initial results are somewhat positive. There is some interest in tagging. Tagged documents tended to receive meaningful descriptions beyond what a brief bibliographic record would provide. For example, for the document "Air Sealing: Building Envelope Improvements", we have the following descriptions from five users:

* Mount Desert Spring Water was able to win a bid to provide bottled water and water coolers to the University of Maine. Mount Desert Spring Water was successful because the water coolers it provided were energy efficient and the lowest cost to the Universi - samchap

* Describes the benefits of proper air sealing for homes. EPA awards the EnergyStar when legal minimum standards are exceeded. - mkvs

* Conserving energy in your house by having it sealed correctly - bookswoman

* "Air sealing the building envelope is one of the most critical features of an energy efficient home." "25-40% of energy" "ENERGY STAR qualified homes, constructed to exceed [building] codes with air sealing, can offer a better quality product." - keyvowel

* This Energy Star news release describes ways homeowners can reduce home heating and cooling costs by implementing air sealing techniques. - tadamich

Without question, the first description is problematic, but the other four descriptions are in agreement about what this document is about AND provide more relevant information than a brief bibliographic record.

For the most part, the tags we got were also meaningful and descriptive. Staying with the document "Air Sealing", we have the following tags:

Air, air-sealing, airsealing, building-insulation, efficient, energy, energy-efficiency, Energy-Star-Branding, energyconservation, energystar, epa, EPA-advertising, globalwarming, greenhousegases, home-building, home-building-techniques, home-construction, home-improvement, homes, hvac, indoor, leakage, money-saving, quality, sealing, ventilation

Contrast that with a brief bibliographic record that simply has title, agency, and URL. How would people know that this document is part of the EnergyStar initiative, or that it was related to home building or energy efficiency? In this instance, and in a number of other project documents, tagging clearly added value.

3) Limitations of current study

Our promising results were limited by three factors, the most important of which was the lack of participation. We estimate that about ten people participated in our tagging project. The available research on tagging is pretty firm in stating that good social tagging requires many users. Some say 100 or so is good, others suggest higher numbers. Our numbers are clearly too low. There are also too many instances (12) when a document was tagged by a single user. This could greatly bias how a document gets tagged. Consider if the only description of "Air Sealing" had been the mistaken one about water coolers. That would have been worse than useless. But even in this instance, a user pulling up this document while searching for water coolers could have provided a more accurate description.

The low number of taggers also made it difficult to see how much tag agreement existed among the various taggers.

Another problem was self-inflicted. We forgot to instruct people on tag construction. These were our original instructions:

1) Visit http://www.archive.org/search.php?query=epapilotproject and go to a document on the list. Open the pdf file in a separate browser window.
2) In del.icio.us, tag the page for the Internet Archive record (i.e. not the PDF file) after examining the PDF file.
3) In the del.icio.us "notes" field, write a one or two sentence description of what the document is about.
4) In the tags field, please use epapilotproject, for:freegovinfo and then any tags that you feel describe this document.

del.icio.us uses a space-separated tag system. In other words, a space begins a new tag. So tagging something as "air quality" results in the two tags of "air" and "quality" and not the more helpful tag of "air quality." This resulted in some of the tagging becoming meaningless. If we had asked people to put dots or dashes in multiple word tags, we would have gotten more meaningful tags. We still got some useful tags because some of our taggers were used to the del.icio.us system, but we shouldn't have assumed that everyone tagging would know how to construct multiword tags in del.icio.us. On the other hand, this problem might have been less noticeable if we had more taggers per document.
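
To make the problem concrete, here is a small sketch of the splitting behavior described above. It only models a space-delimited tag field as we have described it; it is not del.icio.us code, and the function names `split_tags` and `join_multiword` are made up for illustration.

```python
def split_tags(tag_field):
    """Split a tag field on whitespace, as a space-delimited tagging system would."""
    return tag_field.split()


def join_multiword(phrase, marker="-"):
    """Turn a multi-word phrase into a single tag by joining its words with a marker."""
    return marker.join(phrase.split())


# Typed naively, the phrase breaks into two unrelated tags.
print(split_tags("epapilotproject air quality"))

# Joined with a dash first, the phrase survives as one tag.
print(split_tags("epapilotproject " + join_multiword("air quality")))
```

The first call yields three tags, with "air" and "quality" separated; the second keeps "air-quality" together. One sentence in our instructions asking for dashes would have had the same effect.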

Our final problem is one we think could be avoided in future projects: people tagging different files with the same document title. We asked people to bookmark the Internet Archive page for a given document, which has a link to the PDF file. We specifically asked people NOT to tag the PDF file because del.icio.us doesn't populate the title field of bookmarked PDFs. But one person in our project consistently bookmarked a document's PDF file instead of the Internet Archive page, and this separated that person's tagging from everyone else's and made it more difficult to compile tagging info for every document.

4) What next? Some suggestions

Our findings indicate that tagging does have potential to add value to web harvested documents that do not receive full cataloging, but for this benefit to be fully realized, there must be more taggers. When we realized we didn't have the number of taggers we wanted, we headed for the literature and found some articles listed below under "References Consulted." They offer some interesting guidance for other document tagging efforts.

While all of the papers below talked about user motivation, I think Tim Spalding said it best in a post titled "When tags work and when they don't: Amazon and LibraryThing":

"Something is going on here—something with broad implications for tagging, classification and "Web 2.0" commerce. There are a couple of lessons, but the most important is this: Tagging works well when people tag "their" stuff, but it fails when they're asked to do it to "someone else's" stuff. You can't get your customers to organize your products, unless you give them a very good incentive. We all make our beds, but nobody volunteers to fluff pillows at the local Sheraton."

The EPA documents are sort of like fluffing pillows at the local Sheraton, to me at least. My primary interest isn't environmental documents and EPA documents are not a major component of my library's depository collection. In addition, our particular sample was unintentionally heavy on flyers, applications, and brochures. It could be that another agency's documents, say NASA or DoD, might get more attention.

There's another angle too. In my anecdotal experience, librarians don't see web stuff as theirs, so they don't spend much processing time on it. Or if they are concerned about web documents, perhaps their administration is not. So how could we make them owners and think of web harvested materials as "their stuff" so they'll make their "documents beds"? A few suggestions follow:

1) For the EPA documents, GPO could partner with libraries that do have a strong environmental collection. Perhaps candidate libraries could be determined through item selection analysis.

2) GPO might wish to consider doing a depository survey to see which agencies depositories would most like to see web-harvested. The survey could include a question asking libraries if they would tag if the desired content was harvested.

There wouldn't have to be a commitment to tag every document, but to tag some of the documents.

While GPO should continue with web harvesting no matter what, we wouldn't blame them for not moving forward with a documents tagging initiative if the depository community failed to register interest in such a project.

3) If GPO re-harvests EPA or moves on to another agency, it should consider setting up RSS feeds for newly harvested documents. Subject specialists from inside and outside the library community could take part in tagging. Again, GPO would need to start with some broadly popular agencies to have a chance of recruiting a significant number of taggers.

4) If GPO or another organization does a large scale tagging project, significant thought should go into tagging conventions. Not the vocabulary itself -- research seems to show that once an item reaches 100 tags or so, the proportion of tags stays constant. That is to say that agreed upon terms appear to predominate over idiosyncratic or spam tags (See Golder and Huberman below for details). What needs to be spelled out is how multi-word tags should be constructed -- is it air-quality, air.quality, or air_quality? They all mean the same thing, but del.icio.us and other tagging services interpret them differently. A consistent new word marker, or a choice of tagging site that supports spaces inside tags, would make any tagging project go more smoothly.
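
Where a convention was not agreed on in advance, some of the damage can be repaired after the fact. The following is a rough sketch, under our own assumptions, of folding the dash, dot, and underscore variants of a tag into one canonical form. The function names are hypothetical, and choosing the dash as the canonical marker is arbitrary.

```python
import re
from collections import defaultdict


def canonical(tag):
    """Lowercase a tag and replace runs of '-', '.', '_' or spaces with a single dash."""
    return re.sub(r"[-._\s]+", "-", tag.strip().lower()).strip("-")


def group_variants(tags):
    """Group raw tags under their canonical form so variants can be counted together."""
    groups = defaultdict(list)
    for tag in tags:
        groups[canonical(tag)].append(tag)
    return {key: sorted(variants) for key, variants in groups.items()}


raw = ["air-quality", "air.quality", "air_quality", "epa"]
for key, variants in group_variants(raw).items():
    print(key, variants)
```

Note what this cannot do: a tag typed with no marker at all, such as "airsealing", will not be matched with "air-sealing", because nothing in the tag says where one word ends and the next begins. That is the argument for settling the convention before tagging starts.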

These are our thoughts. What are yours? Look at our spreadsheet. Check out the item pages on del.icio.us and read the articles below. Then let us know what you think about the future of social tagging for government documents.

References Consulted

- "HT06, Tagging Paper, Taxonomy, Flickr, Academic Article, ToRead" by Cameron Marlow, Mor Naaman, danah boyd, Marc Davis http://www.danah.org/papers/Hypertext2006.pdf

- "The Structure of Collaborative Tagging Systems" by Scott A. Golder and Bernardo A. Huberman
http://www.hpl.hp.com/research/idl/papers/tags/
http://www.hpl.hp.com/research/idl/papers/tags/tags.pdf

- "Can Social Bookmarking Improve Web Search?" by Paul Heymann, Georgia Koutrika, and Hector Garcia-Molina
http://heymann.stanford.edu/improvewebsearch.html
http://dbpubs.stanford.edu/pub/showDoc.Fulltext?lang=en&doc=2008-2&format=pdf&compression=&name=2008-2.pdf

- "When tags work and when they don't: Amazon and LibraryThing"
Thingology Blog, posted by Tim Spalding Tuesday, February 20, 2007
http://www.librarything.com/thingology/2007/02/when-tags-works-and-when-...

Spring 2008 DLC Materials Now Available

I wanted to write about this a few days ago, but have only found the time to do so now.

The Government Printing Office (GPO) started releasing materials from the Spring 2008 Depository Library Conference even before the DLC meeting closed on April 2, 2008. You can find their materials at http://www.fdlp.gov/repository/dlc/spring08/index.html

GPO deserves credit for being prompt with the initial release of DLC materials. It is a refreshing change from a few years ago when people who couldn't make meetings had to wait many weeks for materials to be made available. So thanks GPO!

I hope to go through most of this material in the near future in more detail, but here are some items that seem like they are of special interest:

  • Improved Access to EPA Information: Before and After with Web 2.0 by Brand Niemann (in 3 parts)
  • Web 2.0 Power Point Presentation by Cindy Etkin of the U.S. Government Printing Office
  • Back to the drawing board in Virtual and Real Worlds
  • Web Harvesting Update for the Depository Library Council
  • Web Scraping Government Information

Go forth and check out! Let us know what you think of what's been released.

Spring 2008 DLC Conference March 31 - April 2 Resources Page

Official materials including handouts and some prepared speeches can be found at http://www.fdlp.gov/repository/dlc/spring08/index.html.

Please feel free to send us your notes or recordings of conference sessions.

"Making a Passport" Video

The State Department's DIPNOTE blog posted a video of Under Secretary for Management Patrick F. Kennedy and State Department Spokesman Sean McCormack discussing the process for creating a passport and its new security features. For the video's transcript, go to http://video.state.gov.

More GPO Scrutiny from The Washington Times & GPO Responds

The Washington Times published the concluding article on their GPO Passport story on Friday:

"When the government finally built a backup passport center to be used in case Washington became debilitated, it picked a location directly in the path of potential future disaster, the hurricane-prone Mississippi Gulf Coast, which was ravaged by Katrina just a few years ago.

...The Times examined the state of America's new e-passport program, disclosing in stories this week that the GPO outsourced production of key components for the passport to overseas facilities and has charged the State Department substantially more than it actually costs to make each passport.

Secretary of State Condoleezza Rice said yesterday her department is investigating the pricing issues, and two congressional committees also launched investigations into the security issues raised by having the crown jewels of America's border-security system produced overseas".

More details on the State Department's investigation can be found in this article also published by The Washington Times. Also, some members of Congress are investigating the passport security issue, according to an article from The Washington Times, which has also published an editorial addressing these GPO passport issues.

GPO responded to the second story in a press release dated March 27th and also responded to the third story on the secure production facility via a press release dated March 28th:

"During the investigatory process, GPO did several site analyses for different locations. GPO and the State Department determined Stennis Space Center met all the requirements and to be the most secure and cost effective location".

GPO/GODORT Conf Call Minutes Posted

Bill Sleeman, chair of the ALA Government Documents Roundtable (GODORT), recently posted the minutes of the GPO / GODORT Steering Conference Call of March 12, 2008. These conference calls take place from time to time and often have news of value. The minutes can be found (you may have to scroll) at http://wikis.ala.org/godort/index.php/GODORT_Chair and cover the following topics, among others:

  • Request for Information for Mass Digitization Opportunities
  • Status of EPA Web Harvesting
  • Status of the Federal Digital System (FDSys)
  • Addition of pre-1976 cataloging to the Catalog of Government Publications - in progress.
  • Continued distance ed through OPAL
  • Current stats on the newish Government Information Online reference service.

If I were you, I'd look over the entire set of minutes as it was all interesting. I'd like to highlight two issues, both of which cry out for the documents community to do more to support GPO in some of its efforts:

EPA Web Harvest Project

Here are the notes on this subject (full names available from minutes page):

LH & RHM: Status of EPA harvesting project: GPO worked through 300 of the documents to gather information on what it will take for GPO to provide access to harvested materials (process, workflow and staffing implications). So far: the back end automation of meta-data extraction is not ready; parameters for metadata that accompanies the files needs improvement to automate de-duping; and the rules, methods and mechanisms for harvesting need to be refined (approximately 28% of material was not in scope). Basically, it is still taking more staff time to make these available than GPO can afford. BS asked about the FGI taxonomy experiment and if GPO would be investigating the results of that effort. GPO may incorporate that information into the project as the project moves forward.

GPO's results (automated harvesting that pulled in a lot of out-of-scope material, and metadata that was difficult to extract automatically) are about what I expected based on my own experience and my reading of the literature. Whether or not GPO builds on our modest taxonomy experiment (Thanks Bill!), I think that a GPO - community/citizen collaboration will be needed to begin getting a handle on web-based agency documents. They could start simply by publishing their spidering logs and seeing what happens. Or perhaps they can obtain some of the $2 Billion/week currently being spent elsewhere. If GPO chooses to take the mass collaboration route, I hope the documents community is in the forefront of helping them.

If you're interested in taking part in our tagging experiment, please see http://freegovinfo.info/epatagging. We will be running the project through April 18, 2008. To see what has been tagged so far, please visit http://del.icio.us/tag/epapilotproject.

OPAL Training

Here are the notes on this subject:

LC: OPAL, GPO continues to use OPAL for online training and demos. At present, technical capabilities limit presentations to slide shows, such as PowerPoint presentations. Interactive web functions will be added in the future. January call for participation in creation of tutorials netted one submission; hoping to generate interest at DLC.

The FDLP has over 1200 libraries and GPO got ONE SUBMISSION? A majority of FDLP libraries are teaching oriented academic libraries and GPO got ONE SUBMISSION?

Hello! I know I'm not the only one who has insisted that GPO provide training between conferences for those of us who don't get out much. The documents community has a great reservoir of government information expertise. We should be actively aiding GPO in their efforts to spread that expertise.

I admit that GPO's one submission wasn't from my library. I have a pretty new docs staff that's still getting up to speed. But that can't be the case everywhere. If only 10% of FDLP libraries could step up with a program, that would still be 120 programs -- twice a week for a whole year.

Just so I can at least pretend to put my money (or staff time) where my mouth is, I will spend some time next month looking at our library's government information strengths, our customer needs and patron interests. And then sometime during the summer I or someone else from our library will submit a program. If you run a depository, will you commit to doing the same? Not only does GPO need our help, so do our colleagues.

FGI thanks the GODORT and GPO personnel who participated, Jill Vassilakos-Long for taking the minutes and Bill for posting them to the ALA GODORT Wiki.

Washington Times Scrutinizes GPO

This article was published in The Washington Times yesterday: "Outsourced passport work scrutinized". Several concerns were cited, including security and profits:

"Documents and interviews with Bush administration officials said the GPO made about $100 million in profits on the production of electronic passports since 2006 and their sale to the State Department far beyond the costs.

The profits are raising questions among congressional investigators about whether the GPO is complying with laws that limit its business activities to recovering printing costs on a break-even basis".

GPO replied with their side of the story via a lengthy press release.

"The current agreed upon price between GPO and our customer (the State Department) for the production of the e-passport is $14.80 per book. That includes: materials, labor, overhead, required inventory, the secure production facility and future investments. GPO does not have any role in setting the price to the public for a passport, the State Department determines that price.

...GPO is unlike most other Federal agencies in that all GPO activities are financed through a business-like revolving fund. The revolving fund functions as GPO’s checking account with the U.S. Treasury. The fund is used to pay all of GPO’s costs and the fund is reimbursed by our agency customers when they pay GPO invoices".

Today, the Washington Times published the second part of this three-part story: "GPO profits go to bonuses and trips".

"When the government's main printing agency booked $100 million in unexpected profit it went on a spending spree: large bonuses to top managers, trips to Paris and Las Vegas, and an official photo of the boss that cost $10,000.

The bonuses, some nearly as high as $13,000, and travel are raising questions among congressional investigators and Government Printing Office officials about whether the agency is misusing its newfound wealth and whether it received the proper authority for some of the larger compensation payments from the Office of Budget and Management".

I'll be honest...my head is spinning. I'm trying to find out more information about this and understand some of these issues. What do you think about all this? I'll be sure to post a link to the third part of this story and any responses from GPO that are released. Stay tuned...

Communications Inside GPO

GPO will be presenting information about its internal communications at the conference Strategic Internal Communications In Government - West: Engaging Employees To Drive Change in June in San Diego, CA.

Jeffrey S. Brooke, ABC, Director of Employee Communications, and Terri C. Ehrenfeld, Employee Communications Specialist, will be presenting. Here is the abstract of their presentation:

Ghosts, New Years’ Resolutions and Other Agency Information: How To Create Simple Employee Polls That Gather And Distribute Important Information

The U.S. Government Printing Office (GPO) started polling employees on October 31, 2007 with this question: Do you believe in ghosts? From this humorous beginning, a weekly employee poll located on the home page of GPO’s intranet site caught on like wildfire. GPO’s Employee Communications Office uses their weekly polls to gather business-related information, to solicit suggestions, to entertain – and to provide education and resources.

In this presentation you will learn how to:

  • Use simple, commercially-available software to launch your own employee polls
  • Promote the poll site to employees and encourage participation
  • Create a mix of questions that gathers and distributes valuable information – and uses educational entertainment to keep employees coming back for more

This session will provide a live demonstration to show how easy it is to create your own employee poll.

Ric Davis Shares about FDLP at ALA Midwinter 2008

UPDATE 1/16/2008 - An alert reader who attended Ric Davis' speech wrote me to say that the speech was his prepared text and that there was more information about shared regionals than his speech text indicated.

If anyone else has observations, please make a comment. We welcome feedback and discussion.

------------------------

In a refreshing change from the mid 2000s, the Government Printing Office has already posted a speech given by Acting Superintendent of Documents Ric Davis at the 2008 Midwinter meeting of the American Library Association.

The speech can be found at http://www.fdlp.gov/file-repository/gpo-attended-events/lscm-director-speech-ala-midwinter/view.html and covers a variety of topics including GPO's Budget, the new FDLP desktop, web harvesting plans, current FDSys status and new marketing plans that sound like they will be developed in conjunction with depository libraries, at least in part. The whole 13 page, double-spaced speech is worth reading and I hope that at least some of you will have comments on it.

FGI tips its hat to Ric Davis for posting his ALA MW speech only a few days after it was given. It beats the months we've sometimes had to wait in the past.

GPO and DLC: Thanks for Sharing

Recently, GPO released a sample of EPA documents that had been harvested from the EPA's website by software agents. These documents were gathered as a result of GPO's web harvesting project and a sample can now be found at: http://www.gpoaccess.gov/harvesting/index.html.

According to the EPA Web Harvesting page, documents are being made available in two ways:

The first method involves the creating of brief bibliographic records for the monographs and the CONSER standard record format for the serials. The majority of publications included in this sample will be made accessible through this method. Users may conduct a keyword search in the Catalog of U.S. Government Publications (CGP) for the phrase “EPA pilot project” to review these cataloging records. The second method of access being tested involves posting a portion of the publications from the sample to GPO Access using a browse table. At the request of the Depository Library Council, LSCM is also trying to determine if there is a mechanism that enables public access to Web harvested content while these publications are in the queue for brief bibliographic records. LSCM has posted a small portion of the sample to GPO Access using a browse table.

If you want to see the full methodology and the documents available through the browse table, then please visit http://www.gpoaccess.gov/harvesting/index.html.

Regular readers of FGI know that we have problems with brief bibliographic records with no subject access, but we definitely appreciate GPO's efforts at item description and posting the browse table.

We are also very thankful to the Depository Library Council for requesting that the public be able to access harvested content while the publications are waiting for their brief records. We hope that GPO can find a way to accommodate their request, as it will open up many possibilities for getting EPA publications in a timely yet searchable manner.

So thanks for sharing!

 

Three Cheers for GPO: Tangible Copies of US Budget

I was very happy to hear that the Government Printing Office will be producing paper copies of the annual US Budget despite a White House announcement to go electronic only. Here is the GPO message sent out to FDLP-L today:

From: Announcements from the Federal Depository Library Program [mailto:GPO-FDLP-L@LISTSERV.ACCESS.GPO.GOV] On Behalf Of FDLP Listserv
Sent: Thursday, January 10, 2008 12:03 PM
To: GPO-FDLP-L@LISTSERV.ACCESS.GPO.GOV
Subject: Tangible Copies of the Budget of the United States Government

On January 9, 2008, Office of Management and Budget Director Jim Nussle announced that the Budget of the United States Government would be released in a web-only format for Fiscal Year 2009 on February 4th, 2008. Mr. Nussle cited the cost savings of such a move as the reason for the discontinuance of paper copies of the Budget.

GPO wishes to assure the members of the Federal Depository Library Program that we are committed to keeping the various Budget publications in printed format. To this end, OMB has agreed to provide GPO with files of the Budget documents that will be put to press for the purpose of dissemination to the public through the FDLP and Publication & Information Sales Program. We intend to ship these tangible copies of the Budget in conjunction with the February 4th internet release.

The class numbers, titles, and item numbers involved in this announcement by OMB are:

PREX 2.8: Budget of the United States Government; 0853
PREX 2.8/1: Budget of the United States Government; 0853-C
PREX 2.8/5: Analytical Perspectives; 0855-B
PREX 2.8/7: Budget Revisions; 0853
PREX 2.8/8: Historical Tables, Budget of the United States Government; 0853

If you have questions, please use the GPO online help service at:
<http://www.gpoaccess.gov/help>.


Why is this good news? For several reasons:
 
Preservation - At this point in time, tangible formats are the only absolutely proven way to ensure something will be readable 100 years from now. LOCKSS and other technologies may eventually change this, but certainty in digital preservation still belongs to the future. And folks in 2109 will want to know how our government spent its money in 2009.
Access - While electronic versions of documents like this are terrific for searching for a specific piece of information, they can be cumbersome to use. And for the 80,000,000+ Americans without internet access, a tangible format is the only access they'll have to the President's spending plans.
Privacy - With the United States called a pervasive surveillance society by Privacy International and other groups, the best way to avoid gov't and commercial scrutiny of your scrutiny of the US budget is by using the paper version. I still plan to use both print and electronic, but it's nice to have the choice in my hands.
 

I haven't asked GPO their reasons for continuing to print the US Budget in light of its migration to the Internet, but I am very glad they will be preserving this year's documents for future generations. Three Cheers!

 

GPO's FY 2008 Budget Request via CRS

Thanks to our friends at the Open House Project and Open CRS, we can bring you the Congressional Research Service summary of GPO's FY 2008 budget request contained in the report:

 

RL34031
Legislative Branch: FY2008 Appropriations
June 05, 2007

Government Printing Office (GPO). The agency’s FY2008 request of $181.98 million represents a 49% increase over the $122.1 million made available for FY2007. GPO’s budget authority is contained in three accounts: (1) congressional printing and binding, (2) Office of Superintendent of Documents (salaries and expenses), and (3) the revolving fund. FY2008 requests for these accounts are: congressional printing and binding — $109.5 million; Office of Superintendent of Documents (salaries and expenses) — $45.6 million; and revolving fund — $26.8 million.

The congressional printing and binding account pays for expenses of printing and binding required for congressional use, and for statutorily authorized printing, binding, and distribution of government publications for specified recipients at no charge. Included within these publications are the  Congressional Record; Congressional Directory; Senate and House Journals; memorial addresses of Members; nominations; U.S. Code and supplements; serial sets; publications printed without a document or report number, for example, laws and treaties; envelopes provided to Members of Congress for the mailing of documents; House and Senate business and committee calendars; bills, resolutions, and amendments; committee reports and prints; hearings; and other documents.

The Office of Superintendent of Documents account funds the mailing of government documents for Members of Congress and federal agencies, as statutorily authorized; the compilation of catalogs and indexes of government publications; and the cataloging, indexing, and distribution of government publications to the Federal Depository and International Exchange libraries, and to other individuals and entities, as authorized by law.

GPO requested $26.8 million for its revolving fund to support the agency’s acquisition of information technology infrastructure and security enhancements, workforce retraining and restructuring efforts, and facilities maintenance and repairs. This is an increase of $25.8 million over the $1 million provided in FY2007. Of the requested amount, $10.5 million was proposed for the completion of the development of GPO’s Future Digital System, while $9.4 million would cover the replacement of a 30-year-old automated composition system.20

Highlights of House and Senate Hearings on FY2008 Budget of the GPO.

Acting Public Printer William H. Turri, in his written testimony, discussed recent efforts to transform GPO’s operations for the digital age.21 GPO’s production of U.S. passports to meet new standards and increased demand has also been of interest to appropriators.

20 Testimony of William H. Turri, Acting Public Printer of the United States, U.S. Congress, Senate Committee on Appropriations, Subcommittee on the Legislative Branch, Legislative Branch Appropriations for 2008, hearings, 110th Cong., 1st sess., March 16, 2007 (not yet published).

 

We at FGI would welcome any comments that GPO staff would want to post here, along with any other interested party's comments.

Google Books/Fed Docs: Google Books Statistics--The Bigger Picture

Now that I had some statistics, it dawned on me that I had no idea whether or not this was a lot of documents. So I was off to the FDLP desktop and the Catalog of U.S. Government Publications.

I looked around the desktop to see if GPO listed any statistics. On the "about" page for the CGP, GPO says merely that there are more than 500,000 records in the database. So I gave some thought to how I might get a better figure, and off I went to OCLC and the GPO database in FirstSearch. On the database info page, OCLC lists 507,000+ as the number of records and notes that the database had its last monthly update on August 8, 2007.

So I went back to the CGP and its advanced search page. Searching for GPO in the publisher field is not terribly effective. Of course, in this database everything is a government document, so that is not a problem.

But how to get a real number out of the database? I tried using the most common of words, "a" and "the", but to no great effect. "A" brings up 359,875 records and "the" brings up 411,493. Neither result comes close enough to the supposed 507,000.

I had another realization that the CGP now includes records for electronic titles--titles that would not be fodder for the Google Book Project. Using the New Electronic Titles page is not really an option to count them as it only goes back to April of 2005 and since early 2006 the monthly lists are not numbered (leaving me to do a lot of counting).

So back to the advanced searching page in the CGP.  Happily here you can search for terms in the URL/PURL. I proceeded to search for every record that listed .gov, .mil, .us, .org, and .com. I came up with a total of 64,504 records.  So approximately 13% of the records in the CGP are electronic titles or are titles with an electronic counterpart.

Unfortunately I had another realization that these figures really only represent documents published from 1976 on. This is a really big problem in that most of the documents I found in Google Books dated to before 1923.  My only hope to get good numbers was to askGPO.  So late on August 8th I shot off a query to GPO asking for statistics on the number of documents GPO has distributed both before and after 1976.

Surprisingly enough, GPO called me first thing the next morning. askGPO is notoriously slow in providing answers to queries so I was very surprised!  I spoke with Nancy Faget at GPO and she was very pleasant though not exactly forthcoming with numbers.  It struck me that I got the quick call back as GPO viewed my query as the first step in getting out of the program.  As far as I know my director has no real intentions of doing that, but I don't think I convinced her on that point.  But aside from that she told me that GPO really didn't know how many documents went to depositories since the beginning.  Alas!!

I honestly don't know if it would be fair to take the view that probably as much was distributed from 1813 to 1976 as was published after 1976. But if you did, that would lead one to believe that over one million documents have been produced.

So the bigger picture suggests that the 167,878 titles in Google Books are only about 17% of all the documents that could be digitized. At a guess...
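
The back-of-the-envelope math above can be checked with a minimal sketch in Python. The record counts are the ones quoted in this post; the variable names are illustrative, and the doubling step encodes the guess that as much was distributed before 1976 as after, which is an assumption and not a GPO statistic.

```python
# Figures quoted in this post.
cgp_records = 507_000          # OCLC's record count for the GPO database (1976 on)
electronic_records = 64_504    # CGP records with a match in the URL/PURL field
google_books_titles = 167_878  # titles counted in Google Books

# Assumption (a guess, not a GPO figure): as much was distributed
# from 1813 to 1976 as has been cataloged since 1976.
estimated_total = cgp_records * 2

# Shares expressed as percentages.
electronic_share = electronic_records / cgp_records * 100
google_share = google_books_titles / estimated_total * 100

print(f"Estimated total documents: {estimated_total:,}")
print(f"Electronic share of CGP records: {electronic_share:.0f}%")
print(f"Google Books share of estimated total: {google_share:.0f}%")
```

Changing the doubling factor shows how sensitive the 17% figure is to the pre-1976 guess.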

So I put a call out to everyone in GovDoc Land.  If you are a full depository and have been one since 1813 and have kept really good records, could you please send me the statistics?  Thank you very kindly in advance!

 

 
