fdsys
Demonstration videos of GPO's FDsys database
Submitted by jrjacobs on Fri, 2008-11-28 15:18.
Check out the search demonstrations of GPO's FDsys (nee Future Digital System). GPO's Federal Digital System (FDsys) will "manage federal govt documents, allow them to be uploaded, accessed via the internet, included in the depository library program (italics added!), and preserved for the future." The video images are a bit fuzzy, but you can see that the basic utility of FDsys from an end-user's perspective is getting close to full functionality. I'm most interested in APIs and other tools and services for exporting large chunks of data and associated metadata for reuse, digital deposit into library repositories/LOCKSS caches etc and generally being able to expand on access, preservation and long-term sustainability. Hopefully, future video demonstrations will elaborate on those possibilities.
- part 1: simple search
- part 2: advanced search
- part 3: citation search
- part 4: boolean search
- part 5 is mentioned in part 4, but there's no video available as of 11/28/08 from GPO's youtube page.
Questions and comments should be emailed to pmo AT gpo DOT gov. Feel free to leave comments here as well.
Planned Launch of GPO's Federal Digital System (Fdsys)
Submitted by rdavis on Sun, 2008-08-17 11:55.
The first public release of GPO’s Federal Digital System (FDsys) http://www.gpo.gov/projects/fdsys.htm will launch later this year. Staff in the Superintendent of Documents Library Services and Content Management unit at GPO have been working with GPO's Program Management Office, which is responsible for developing this system. We want to ensure that the requirements for the Federal Depository Library Program will be met.
In order to ensure that we are accurately communicating the requirements and capabilities of the system, we are requesting your feedback. Based on your knowledge of FDsys, what are your expectations for the first public release? We want to make sure we continue to get this information into the hands of the development team and keep lines of communication open. Additional information should also be shared with GPO's Program Management Office at pmo@gpo.gov.
GPO/GODORT Conf Call Minutes Posted
Submitted by dcornwall on Sat, 2008-03-29 11:20.
Bill Sleeman, chair of the ALA Government Documents Roundtable (GODORT), recently posted the minutes to the GPO / GODORT Steering Conference Call of March 12, 2008. These conference calls take place from time to time and often have news of value. The minutes can be found (may have to scroll) at http://wikis.ala.org/godort/index.php/GODORT_Chair and covered the following topics, among others:
- Request for Information for Mass Digitization Opportunities
- Status of EPA Web Harvesting
- Status of the Federal Digital System (FDSys)
- Addition of pre-1976 cataloging to the Catalog of Government Publications - in progress.
- Continued distance ed through OPAL
- Current stats on the newish Government Information Online reference service.
If I were you, I'd look over the entire set of minutes as it was all interesting. I'd like to highlight two issues, both of which cry out for the documents community to do more to support GPO in some of its efforts:
EPA Web Harvest Project
Here are the notes on this subject (full names available from minutes page):
LH & RHM: Status of EPA harvesting project: GPO worked through 300 of the documents to gather information on what it will take for GPO to provide access to harvested materials (process, workflow and staffing implications). So far: the back end automation of meta-data extraction is not ready; parameters for metadata that accompanies the files needs improvement to automate de-duping; and the rules, methods and mechanisms for harvesting need to be refined (approximately 28% of material was not in scope). Basically, it is still taking more staff time to make these available than GPO can afford. BS asked about the FGI taxonomy experiment and if GPO would be investigating the results of that effort. GPO may incorporate that information into the project as the project moves forward.
GPO's findings from automated harvesting (a lot of out-of-scope material and difficulty automating metadata extraction) are about what I expected based on my own experience and my reading of the literature. Whether or not GPO builds on our modest taxonomy experiment (Thanks Bill!), I think that a GPO - community/citizen collaboration will be needed to begin getting a handle on web-based agency documents. They could start simply by publishing their spidering logs and seeing what happens. Or perhaps they can obtain some of the $2 Billion/week currently being spent elsewhere. If GPO chooses to take the mass collaboration route, I hope the documents community is in the forefront of helping them.
If you're interested in taking part in our tagging experiment, please see http://freegovinfo.info/epatagging. We will be running the project through April 18, 2008. To see what has been tagged so far, please visit http://del.icio.us/tag/epapilotproject.
OPAL Training
Here are the notes on this subject:
LC: OPAL, GPO continues to use OPAL for online training and demos. At present, technical capabilities limit presentations to slide shows, such as PowerPoint presentations. Interactive web functions will be added in the future. January call for participation in creation of tutorials netted one submission; hoping to generate interest at DLC.
The FDLP has over 1200 libraries and GPO got ONE SUBMISSION? A majority of FDLP libraries are teaching oriented academic libraries and GPO got ONE SUBMISSION?
Hello! I know I'm not the only one who has insisted that GPO provide training between conferences for those of us who don't get out much. The documents community has a great reservoir of government information expertise. We should be actively aiding GPO in their efforts to spread that expertise.
I admit that GPO's one submission wasn't from my library. I have a pretty new docs staff that's still getting up to speed. But that can't be the case everywhere. If only 10% of FDLP libraries could step up with a program, that would still be 120 programs -- twice a week for a whole year.
Just so I can at least pretend to put my money (or staff time) where my mouth is, I will spend some time next month looking at our library's gov info information strengths, our customer needs and patron interests. And then sometime during the summer I or someone else from our library will submit a program. If you run a depository, will you commit to doing the same? Not only does GPO need our help, so do our colleagues.
FGI thanks the GODORT and GPO personnel who participated, Jill Vassilakos-Long for taking the minutes and Bill for posting them to the ALA GODORT Wiki.
A First Look at FDSys
Submitted by dcornwall on Thu, 2008-02-28 15:15.
The Government Printing Office released two videos demonstrating the "Proof of Concept" release of the Federal Digital System (FDSys):
Search - 3:22
Submission - 9:21
The videos require Windows Media Player to view, and one of our volunteers was unable to view them using the Mac version of Windows Media Player. But I did get a look.
Everything below is subject to GPO's own caveat that this "proof of concept" version will change a lot between now and the first public release in late 2008.
I was impressed with the search functionality. The search box is simple with few options, but navigation boxes on the right hand side of the screen allow one to quickly zero in on likely documents. Search results may also be sorted by title, relevance, type of resource and date issued. Individual records have links to the content, preservation metadata and more.
The only problem I had with the search was the fact that it defaults to "OR". GPO itself now understands that this is a problem based on beta user input, and the video narrator promised it would be fixed in the public release. I have to admit that I don't understand how anyone in the age of Google would have made "OR" the default to begin with. But what's important is that the problem is being fixed.
The Submission section of FDSys will not be available to the public, but it is interesting to watch the demo to see what is involved with document creation. Currently the ability to add metadata does not include keywords, which is something I hope will be included in the official release. Given GPO's move towards brief records under some circumstances, it would be very helpful to have agency generated keywords in FDSys.
There is some information in the agency submission side of FDSys that I hope carries over to either the public side or at least to FDLP librarians. FDSys requires an agency contact for every document. It would be helpful for librarians to have a contact for documents in case something goes out of print or a user has in-depth questions about the content of a given document.
Overall, this looks like a promising start. No mention of the alert features and push delivery slated for future releases, but hopefully those will be demoed soon.
Delivery by FTP and RSS coming in FDSys
Submitted by dcornwall on Sat, 2007-12-22 13:23.
The Government Printing Office released a new technical document on the Federal Digital System (FDSys) called FDsys System Release and Capabilities v5.0, December 2007.
In paging through this document, I was pleased to learn that when the latest release (1C) is deployed, FDSys will support delivery of documents in formats including PDF by FTP and RSS. The sections making note of these new features are brief:
4.10.5 Delivery by RSS
FDsys will allow users to sign up to receive DIPs via RSS.
4.10.8 Delivery by FTP
Release 1C.4 builds upon 1C.2 by allowing users to request delivery based on user defined criteria.
Depending on how these features are implemented, it could be exciting for federal documents stakeholders including depository libraries.
As previously announced, I believe, people will also be able to get notifications of new content without having to get full file downloads:
4.10.22 User Notifications
Release 1C.4 will build upon Release 1C.2 by providing additional email and RSS notifications. Users will be able to sign up to receive notifications for system events, business events, and job processing events. This includes receiving a notification when new content is added to a collection or when there is a match to a user defined string.
According to the release schedule, some of these features will appear by November 2008 and others won't be ready till 2009. But it looks like GPO is on the right track and bringing out features long sought by many in the documents community. We at FGI are cautiously optimistic.
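To make the "match to a user defined string" idea concrete, here is a minimal sketch of how a library might filter an RSS 2.0 feed on its own end. Everything in it is an assumption for illustration: the sample feed, the item titles, the example.org links and the function name matching_items are hypothetical, and nothing here reflects how GPO will actually implement FDSys notifications.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 feed; titles and links are invented for illustration.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Hypothetical collection feed</title>
<item><title>Federal Register, Vol. 73, No. 1</title><link>http://example.org/fr-73-1</link></item>
<item><title>Congressional Record, daily edition</title><link>http://example.org/cr-1</link></item>
</channel></rss>"""


def matching_items(feed_xml, needle):
    """Return (title, link) pairs for feed items whose title contains needle.

    The match is a simple case-insensitive substring test.
    """
    root = ET.fromstring(feed_xml)
    needle = needle.lower()
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        if needle in title.lower():
            hits.append((title, item.findtext("link", default="")))
    return hits


if __name__ == "__main__":
    for title, link in matching_items(SAMPLE_FEED, "federal register"):
        print(title, link)
```

A depository could run something like this against whatever feeds GPO eventually publishes and only fetch the full files for items that match its selection profile.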
FDSys Receives 1.4TB worth of Statutes at Large
Submitted by dcornwall on Tue, 2007-07-31 12:38.
According to the Future Digital System blog, FDSys is ingesting a full run of Statutes at Large from the Library of Congress.
The scanned files take up 1.4TB worth of storage space and "The next step is for GPO to assess the content and determine whether the content complies with GPO specifications and create access derivatives (including OCR text) of the content."
People who are considering LOCKSS boxes to store federal content shouldn't blanch at the 1.4TB figure for Statutes at Large. Generally speaking, scanned files (which are images) are much larger than born digital content. For example, GPO deposited a year's worth of 10 Federal born-digital e-journals during their LOCKSS pilot. These 10 "journal-years" worth of content took up 900MB or roughly 0.9 GB. At that rate, we could have harvested these 10 journals for over 250 years before filling up our 250GB hard drive. Of course, we'd need to upgrade our hard drives well before that.
Having said that, it will be interesting to see what sort of uses GPO can put this material to.
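For anyone who wants to redo this back-of-the-envelope estimate with their own numbers, here is a small sketch. The function names are hypothetical, and it assumes decimal units (1 GB = 1000 MB) and a constant yearly intake, which real collections won't have.

```python
import math


def years_until_full(drive_gb, yearly_mb, mb_per_gb=1000):
    """Years of harvesting a drive can hold at a constant yearly intake."""
    return drive_gb * mb_per_gb / yearly_mb


def drives_needed(collection_gb, drive_gb):
    """Whole drives needed to hold a collection of the given size."""
    return math.ceil(collection_gb / drive_gb)


if __name__ == "__main__":
    # LOCKSS pilot figures from the post: 900MB per year on a 250GB drive.
    print(f"{years_until_full(250, 900):.0f} years of the 10 pilot journals")
    # Statutes at Large scans: 1.4TB (1400GB) spread over 250GB drives.
    print(f"{drives_needed(1400, 250)} drives for the Statutes at Large scans")
```

With the pilot figures this comes to roughly 278 years, consistent with "over 250 years" above, while the scanned Statutes at Large alone would need six drives of that size.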
No Fee Access under FDSYS: Past Performance No Guide
Submitted by dcornwall on Wed, 2007-07-25 18:03.
In the latest issue of the Federal Depository Library Program's Administrative Notes, there is this item:
Library Services and Content Management Update
Remarks by Richard G. Davis
Acting Superintendent of Documents
Director, Library Services and Content Management
At the Federal Documents Task Force Meeting
ALA Annual Conference
Saturday, June 23, 2007
Among other issues, Mr. Davis talks about the no-fee access to government information:
GPO Access – No Fee Access
I also want to emphasize that GPO’s commitment to provide the public with no-fee access to Government information through the FDLP, including GPO Access, remains the same. GPO will not allow access to content available through GPO Access to be restricted, diminished, or based on user fees for the FDLP. The public will be able to continue to print and download this information through the FDLP without restriction into the future.
While we at FGI greatly respect the 150+ year tradition of no-fee access and while we believe that Mr. Davis is sincere in his commitment, he cannot bind Congress or future GPO leadership. Free access (leaving aside NTIS and other cost recovery bits) is Congressional and GPO policy today, but only future leadership can determine policy in the future. Whatever access scheme is available in 2095, Ric Davis won't have put it into place.
While GPO cannot guarantee permanent no-fee public access by simple decree, it stands at a crossroads where it can either facilitate no-fee access or make it possible for future leadership to institute a pay per view model. If GPO were to work with the library community and deposit non-DRM digital publications based on selection profile, those publications would remain available for access and preservation no matter what future GPO policy was. A future Congress and GPO wouldn't be able to successfully order over 1,000 libraries to destroy their files and sign up for a pay per view system.
Where GPO is currently going, however, is a centralized model where users have access to a central repository of files. Today that repository is free. It isn't subject to Digital Rights Management. But that's today. Because GPO has custody of the electronic publications, our free access depends on the goodwill of GPO and Congress' unwillingness to change USC Title 44. But given enough time that can change, especially in the secrecy and privatization-hungry environment Americans find themselves in today.
That is why what we need is a decentralized, distributed system of depositing electronic files to local libraries willing to host them. Libraries need to step up to the plate, and a Government Printing Office wishing to facilitate permanent no-fee public access needs to help with training, storage estimates and simple selection mechanisms like those used to select LOCKSS journals.
FDSys web site is getting a make over
Submitted by dcornwall on Mon, 2007-07-09 12:45.
The FDSys blog is reporting that the main FDSys web site will be redone. According to the brief post, "Updates will include Frequently Asked Questions, system capabilities, reference material, and outreach."
If you have questions about the new web site, the FDSys blog post invites you to leave comments on the blog or send questions to pmo@gpo.gov.
Based on the new FDLP Desktop design, I'm hoping the FDSys web site redesign will bring good things and be more user friendly. There's still the issue of FDSys itself promoting an information monoculture, but that's been covered before and will be covered again.
FGI appreciates all of the Government Printing Office's efforts to provide more communication and interaction with the community of federal information users.
Nevada Library Assn presentation: Access
Submitted by sjyeo on Tue, 2005-11-01 20:31.
Below is the text of my part of the FGI presentation at the Nevada Library Assn. Annual Conference on October 21, 2005. ShinJoung Yeo's part of the panel was regarding access to government information in the digital age. Please feel free to give us feedback.
I am only talking about the issue of access to government information here. But it is important to think about the issue of access within a larger context since it is tightly intertwined with long term preservation, local control, privacy etc. We have to remember that information has a cycle -- creation, collection, distribution, access and preservation. They are not mutually exclusive. With that said:
1. Why access?
We are living in a society where economic forces are at the front of many social and political decisions. However, I believe there are certain things in our society that need to be free, or have to be free from a purely economic motivation, such as water, air, education, health care etc. I hope you all agree with me that government information falls into this category. If you haven't thought about government information in this way, I hope our talk today will convince you of its inherent importance.
Imagine that you aren't able to access information about local environmental conditions (water and air quality…), or about current legislation pending in Congress, or find out about government research into cancer cures for your loved one. When you think about government information in this way, access to government information is an inherent right of citizens.
2. Current Conditions
From the beginning, the U.S. government has recognized the importance of government information. Title 44 was written to codify its importance in our legal system.
Under Title 44, GPO has historically had primary responsibility for the printing, distribution, and sale of government publications. Thus, government publications passed through GPO and GPO distributed the publications to depository libraries. The geographically dispersed system of the FDLP libraries was then responsible for providing free, local access to government information.
However, the recent development of the Internet and its associated technologies has brought a shift from paper to purely digital information, bypassing the depository library system.
Judy Russell, the Superintendent of Documents (who oversees the FDLP), estimated that only 14 percent of federal government documents are deposited in the FDLP libraries. The other 86% is available only through the Internet and only from government-controlled Web servers. Ms Russell has stated that by 2007 fully 95% of all government information will be digital-only.
Because much of government information is now being produced digitally and not in paper, GPO is doing much less printing and distributing fewer print materials to depository libraries. In addition, it is becoming increasingly routine for government agencies to produce their own documents digitally and make them available directly to the public through the Internet. So without visiting a physical library building, people are now able to access government information anywhere there is an Internet connection. Sounds great, right?
However, behind all this seemingly quick and easy access, there are far-reaching consequences of bypassing FDLP libraries. I am not saying this because I am a librarian. Rather, I am talking as a citizen here as well. If we don't take this issue seriously and critically now, then we might completely lose access to government information.
3. Who controls access to information?
In the print world, the government collected documents, created bibliographic information, and printed and distributed the documents to FDLP libraries. After the government information was distributed to the FDLP libraries, the role of government ended. However, in a digital world, it is up to government agencies and the GPO what information is accessible and how information is accessed. So basically, the responsibility for access has shifted from libraries to the government. That does not mean that this responsibility cannot or will not shift back to libraries. I hope to persuade you that it MUST return to libraries.
In response to the shift from print to digital, GPO is proposing the creation of a centralized digital content management system (called their Future Digital System or "FDSys") to provide access to all government information. GPO's proposal implies that GPO will be responsible for the collection, description, access and preservation and will also bear the full cost of these responsibilities. In other words, libraries will relinquish their traditional responsibilities -- collecting, organizing, providing free access along with services -- so they will be merely service points for helping patrons with search engines.
In this scenario that I've just painted -- which is closer to reality than you might think -- what could affect, change, limit access to government information? I would like to talk about 4 areas that are related to access in the event that libraries no longer have collections: Economics, Technologies, Politics of government information, and the Digital Divide.
Economics
Let's say that GPO's funding is fine now, but there is no guarantee that the government will fund GPO at the level needed to continue to provide no fee access to digital government information. We are already seeing GPO needing to fight for funding and under constant pressure from the Office of Management and Budget (OMB). In the midst of a budget crisis, can we assume that GPO's funding will remain a government priority? In this hypothetical situation, where GPO is the sole information provider, if GPO's budget line fails then there will be no access at all to government information.
Another possibility is that GPO or some government agencies might want to sell their information to private corporations for profit, or create a fee-based system based on cost-recovery, or even privatize popular or marketable documents or serials.
GPO's strategic plan in November of last year states that they will provide free access AND distribute information on a cost recovery basis. Actually, GPO tried to do this with GPO Access about 10 years ago and failed due in part to the fact that FDLP libraries had the same information available for free.
So, it would seem obvious that in order to make a profit or at least to recover their costs, GPO or agencies will need to create information that is somehow limited or less-than-fully-functional in order to be able to charge for fully-functional information. We're not saying that GPO WILL do this, but it seems to us that GPO's contradictory mission statement of free access and cost recovery will lead to reduction or limitations on free and fully-functional access.
Technologies
Technologies that GPO and other agencies implement could easily facilitate this fee based system -- restrict user access, and/or render digital documents unusable or barely usable. For instance, Digital Rights Management tools, which are designed to authenticate users to prevent piracy or copying copyrighted materials, can easily restrict access by identifying users based on whether or not they have paid a fee or subscription to access information in the FDsys.
Even the Depository Library Council's Vision paper has recognized the problem of DRM and has stated that, "GPO should work with agencies to ensure that the standard for web-publishing is fully-enabled digital files."
Content management systems like FDsys can be set up to reduce the functionality of information products -- prohibiting downloading, printing or transferring of text to other programs.
This is not simply my paranoia. Systems like I describe are already in place. For instance, at the National Academies Press Website, users may view a document one page at a time for free, but a fee is charged for downloading or printing the document.
Another issue that I'd like to bring up is the limiting of access based on software. GPO and other government agencies have decided in many instances to use only specific software for whatever reason. The decision to adopt a specific piece of software can restrict or limit users' ability to access needed information.
For instance, FEMA's online application for assistance after Hurricane Katrina could only be used with the Internet Explorer browser. Another example is GPO's own annual report. In order to view the fully functional annual report, one would need to download Vizio, a document viewing program, and register with Vizio to use the software.
As you can see, it is not that difficult to manipulate technologies to limit or restrict information according to economic, political and social motivation.
So do we really want to rely on the government to provide no fee and fully functional government information? Or do we want to rely on libraries that have diverse technologies and backgrounds and are committed to free public access to government information?
Politics of government information
As you know, the issue of access to government information is highly political. Depending on the political climate at any given time, what is able to be accessed might change regardless of public interest or the public's right to know.
In the current FDLP system, after the government deposits their documents, libraries have control over their collections. GPO can recall documents, but, in this system, it is difficult and cumbersome for the government to remove or alter or restrict those deposited documents located in FDLP libraries.
However, in the digital realm, documents can be removed or altered or restricted. We have already seen this on many occasions.
For instance, in May of this year, the Overseas Base Closing Commission report was released on the commission's web site, but the report was pulled off of the site a few days later by order of the Secretary of Defense who didn't like some of the information in the report.
In March, 2002, the EPA announced that it would no longer allow direct access to its Envirofacts databases. EPA stated that "As part of our continuing efforts to respond to Homeland Security issues.
I could go on and on and I bet you have a story like this as well. But you get the picture.
In the digital realm, information can be more easily and strictly controlled to the detriment of those who need access to the information: citizens, students, researchers, mothers, etc.
So the question is: do we really want to put our trust in the government and expect them to provide us with free and fully functional government information? Or trust our 1300 libraries who have been fighting and advocating for people's access to information for 150 years?
Digital Divide
Another issue that has barely been on the radar in the docs community is the issue of the digital divide. Daniel raised this issue recently on Govdoc-l in a response to the DLC Vision Statement, but we have not heard others discuss this.
This issue has largely been ignored because we all think of the Internet as ubiquitous. We frequently forget that those with lower incomes or in rural areas do not enjoy the privilege of internet technologies, or do not have access to high speed internet connections necessary to download and view large PDFs, audio or video files.
According to the recent Pew report on the Digital Divide in the United States:
68% of adults use the Internet, 32% do not.
73% of adults live in a household with an Internet connection and 27% do not.
22% of adults have never used the Internet and do not have access in their homes.
38% of adults living with disabilities have access to the Internet.
22% of adults over 70 have Internet access whereas 53% of adults between 60 and 69 have access.
11% of Internet non-users say that getting access is too difficult, frustrating or expensive.
We have to find ways to make sure that EVERYONE has access to government information -- not just those who are privileged -- and need to take into account those on the other side of the digital divide when making our decisions about future systems of government information or shifting our primary role to merely being a service center.
4. Solutions
So what are the possible solutions for the problems that I've just described above? Go back to paper? Print out every digital government document? That is highly unlikely.
I think the white elephant in the living room that has been in front of our eyes this whole time is revitalizing FDLP instead of relinquishing its responsibilities to the government.
I often hear that the roles of FDLP are changing because of digital technologies. Remember, only the formats have changed. The primary roles of FDLP libraries (and libraries in general!) -- collecting, organizing, distributing, providing free access to government information -- have not changed and I don't think they should change. The only change that libraries need to make is accepting digital documents instead of paper documents. GPO needs to deposit digital documents instead of paper documents. It's simply a matter of a format change. Libraries dealt with microcards, microfiche, CDROMs, DVDs…; they can and will deal with digital as well.
How does digital deposit solve these problems? Let me count the ways!
We believe that digital deposit and the creation of a digital FDLP system will solve the economic problem by dispersing and sharing the cost of digital access among many libraries instead of one chronically cash-strapped agency. Libraries will guarantee no fee access to information like they have been doing for the last 150 years.
Digital deposit will allow libraries to save digital documents on their local servers, have local control over those collections, and share them with other libraries in a collaborative manner. By doing this, we, libraries, can assure the provision of no fee and fully-functional access and better and more expanded services to government information.
By having local digital collections we will be able to use and reuse information and won't need to worry about possible restrictions, information alteration or removal. Additionally, fugitive documents will be greatly reduced.
Digital deposit can also go a long way toward alleviating the digital divide by, for instance, facilitating print on demand in libraries and/or allowing local libraries to create and maintain their own digital collections that could be used without having to have an expensive T-1 line, or burned to CDs or other media for off-line use.
Some might argue that not every library can afford to create its own digital repository, has the technological know-how, or has a need for or the staff to process local collections. These same arguments were brought up not so long ago in regards to other formats, and at the advent of the Internet age. We dealt with those changes and we can deal with the current set.
5. Conclusion
I'd like to end with a story. In 1999, following World Bank advice and as a condition for the country's development loans, the Bolivian government granted a 40 year privatization lease to a subsidiary of the Bechtel Corporation, giving the company control over the water on which Bolivian citizens needed to survive. Immediately, the company doubled and then tripled the water rates for some of South America's poorest families.
How do you think the people of a small town in Bolivia -- a large majority of whom are poor peasants -- responded to this? How could these peasants even imagine challenging Bechtel, one of the biggest multinational corporations in the world? The people believed that access to water was a sacred right, not a commodity to be bought and sold, so they fought with their lives for access to water. Eventually, because of the fervor of the protests, Bechtel's contract was rescinded and Bolivian citizens took back their water rights.
There are only a few countries in the world where the right to access to government information is given to citizens. I believe this is your privilege and it's your sacred right. I hope that the library community will fight to assure citizens' access to freely available, fully functional digital government information.