jajacobs's blog

The Words They Used

The Words They Used, by MATTHEW ERICSON, New York Times, September 4, 2008. "The words that speakers used at the two political conventions show the themes that the parties have highlighted."

This is a bubble graph of number of times words were used per 25,000 words spoken and a list of which speakers used which words. Ericson has done a good job of looking at phrases as well as individual words, of combining similar words and phrases, and of noting phrases that have very little or no use by one or both parties. Another good example of how, when we have access to the "raw data" (as opposed to transaction-based, search-and-retrieve, one-page-at-a-time access), the data can be used, re-used, and analyzed.
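For readers who want to try this kind of normalization on their own transcripts, here is a minimal sketch of a "uses per 25,000 words" calculation. It is not Ericson's method, which the article does not describe in detail. The function name `rate_per_25000` and the sample text are hypothetical, and the sketch counts single words only, without the phrase matching or merging of similar words that the Times graphic does.

```python
import re
from collections import Counter


def rate_per_25000(text, terms, base=25_000):
    """Return how often each term appears per `base` words of `text`."""
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {term: 0.0 for term in terms}
    counts = Counter(words)
    # Scale each raw count to a common base so that speeches of
    # different lengths can be compared.
    return {term: counts[term.lower()] * base / total for term in terms}


# Hypothetical ten-word "transcript" for illustration only.
sample = "change " * 3 + "tax " * 1 + "the " * 6
rates = rate_per_25000(sample, ["change", "tax"])
```

Scaling to a fixed base is what makes the two conventions comparable even though the total number of words spoken at each one differs.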

Two new federal government blogs

Two new blogs appeared on the USA.gov Blogs from the U.S. Government page recently:

  • Arctic Chronicles, by Jessica Robertson, Public Affairs Specialist for the U.S. Geological Survey. She will be documenting her journey to the Arctic as she accompanies scientists on an expedition to map the seafloor.
  • The Energy Savers Blog, which aims to provide "a place for consumers to learn about and discuss energy efficiency and renewable technologies at home, on the road, and in the workplace."

While the Energy Savers Blog is apparently provided by the U.S. Department of Energy's Office of Energy Efficiency and Renewable Energy (EERE), it is hosted on a .com website. That creates a variety of problems for long-term access and preservation. (See more examples of government information on .com sites here.)

Both blogs have RSS feeds.

Coverage of Political Conventions

Looking for online coverage of the political conventions? Here is a good roundup of options!

I was surprised to read that C-SPAN network is using live-streaming Qik cams! TechCrunch has lauded them for it!

http://qik.com/groups/421/

Google and the Search for Federal Government Information

A couple of weeks ago, Bonnie Klein, of the Defense Technical Information Center, submitted a comment here with a link to an article she wrote about the effectiveness of using Google and other commercial search tools to find government information. I recommend it highly:

In it, Klein notes that "Google and other search engines are commercial enterprises, not public utilities." She addresses in particular the fact that government information gets no priority in ranking of search results: "Business operations and revenue-generating advertising partnerships, not altruism, factor into page ranking."

The article examines legal, technical, commercial, and copyright issues, and includes many useful citations.

For example, she quotes Donna Bogatin from ZDNet, who observes that "By requiring that Web pages have inbound links from third-party Web sites, the PageRank based algorithm may result in automatic exclusion of the most relevant pages for a given query simply because no other Websites have linked to them." (Google Search Page Rank Excludes Relevant Websites, by Donna Bogatin. ZDNet, January 26, 2007).

This is a good reminder of how government web sites that make it difficult to link to documents ("Documents that exist within databases on GPO Access cannot be bookmarked") automatically lower their PageRank.
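To make the inbound-link point concrete, here is a minimal sketch of the simplified, textbook form of PageRank computed by repeated iteration. It is an illustration under stated assumptions, not Google's production ranking. The tiny link graph and its page names (`agency_home`, `database_doc`, and so on) are hypothetical, and the damping factor of 0.85 is the value commonly used in textbook descriptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank over a dict mapping each page to its outbound links.

    Every link target must also appear as a key in `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outbound links spreads its score over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: nothing links to "database_doc", for example because
# the document lives in a database and cannot be bookmarked.
links = {
    "agency_home": ["press_release", "report"],
    "press_release": ["agency_home"],
    "report": ["agency_home"],
    "database_doc": ["agency_home"],
}
ranks = pagerank(links)
lowest = min(ranks, key=ranks.get)
```

In this toy graph the page that cannot be linked to never rises above the baseline score, however relevant its content may be.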

Thanks, and a tip of the hat to Bonnie for this useful article!

See also: Hiding in Plain Sight: Why Important Government Information Cannot Be Found Through Commercial Search Engines, Center for Democracy and Technology.

Excellent site for Labor-Related Information

In celebration of Labor Day, I invite you to visit one of my favorite sites for labor-related information:

  • IWS Documented News WEEKLY BULLETIN, from the Institute for Workplace Studies School of Industrial & Labor Relations Cornell University, Stuart Basefsky, Director, IWS News Bureau.

Visit it regularly, or use its News feed (Atom format), and, of course, it has a search feature.

The Bulletin includes citations and links to Laws, Statistical Reports, Academic Research, Government Reports and press releases from International, Federal, State, and Local governments, and even "Idiosyncratic But Relevant Facts." Basefsky says the intent of the Bulletin is "to keep researchers, companies, workers, and governments aware of the latest information related to ILR disciplines as it becomes available for the purposes of research, understanding and debate" and that, "The service is unique in that it provides the original source documentation, via links, behind the news and research of the day."

It is a truly rich and useful resource.

Happy Labor Day!

New SSL policy in Firefox hurting tens of thousands of sites

"SSL" (Secure Sockets Layer) is a standard for establishing an encrypted link between a web server and a browser to ensure that all data passed between the web server and the browser remains private.

The "geeks at Pingdom" describe a problem with the way Firefox version 3 handles "SSL certificates" (which the casual user does not even see under normal conditions):

If you visit a website with either an expired or a self-signed SSL certificate, Firefox 3 will not show that page at all. Instead it will display an error message, similar to any other browser error (for example a “page not found” 404 message).

...[T]his is not something that only affects smaller websites. For example, the SSL certificate for the official US Army website [https://www.us.army.mil/] is declared invalid by Firefox 3.
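Firefox is not the only client that refuses such certificates by default. As a rough sketch, the same kind of check can be reproduced with Python's standard `ssl` module, whose default context also requires a valid, trusted certificate. The function names `make_strict_context` and `probe_certificate` are hypothetical helpers written for this illustration, and the exact error text returned will depend on the server and the local certificate store.

```python
import socket
import ssl


def make_strict_context() -> ssl.SSLContext:
    """Build a context that verifies both the certificate chain and the hostname."""
    return ssl.create_default_context()


def probe_certificate(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake with `host` and report whether its certificate is accepted."""
    context = make_strict_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return "certificate accepted"
    except ssl.SSLCertVerificationError as err:
        # Raised for expired, self-signed, or otherwise untrusted certificates.
        return f"certificate rejected: {err.verify_message}"
    except OSError as err:
        # Covers DNS failures, refused connections, timeouts, and other TLS errors.
        return f"connection failed: {err}"
```

Calling `probe_certificate` with a host name returns a one-line description of the outcome, which is enough to survey a list of sites for this problem.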

See also:
What is SSL? (ssl.com)
SSL (Webopedia)
SSL (Wikipedia)

FGI at SAA: Citizens in the Dark? Government Information in the Digital Age.

I have posted my "speaker notes" for the presentation I gave at the meeting of the Acquisitions and Appraisal Section of the Society of American Archivists Convention in San Francisco.

My thanks go to SAA and the section for inviting me and making me feel at home.

Survey of the Current Legal Landscape of Federal Right-to-Know Laws

Following up on Daniel's post this week (Using FOIA in book writing), here is a rather comprehensive review of the state of information access: FOIA and beyond. The article is from a symposium on "Harnessing The Power Of Information For The Next Generation Of Environmental Law."

In practice ... this net of government-information statutes provides what is at best a piecemeal and not entirely satisfactory pathway to needed environmental information and is at worst the illusion of a right of access where none exists.

Why the Federal Register Is the Most Important Publication in America Right Now

It is not often you see a headline that is so documents-specific as this:

Since my first library job in a law library, I have been intrigued by the dryest of dry documents, the Federal Register, where, every working day, announcements, draft regulations, and invitations for public comment appear.

The headline above was added by Alternet when it was re-posted from the original posting in The Progressive on August 18, 2008, but it was drawn from the original text in which Rothschild says, "Today, the most important publication in America is the Federal Register." Yes, both publications have strong editorial positions and the article is an opinion piece. But these contexts make the headline and the comment no less true. As Rothschild says, if you "look at proposed regulatory changes at the Department of Labor, the Department of Health and Human Services, the Department of the Interior, and the Justice Department you get a sense of" the vast, last-minute changes that the current administration is trying to instantiate. "Unable to accomplish his goals legislatively, Bush is trying to get them done by fiat."

Regulations and regulatory law are the implementation of legislated law and make all the difference in how laws are enforced and how activities of all citizens are, well..., regulated. Whether you agree or disagree with Rothschild's comments or with what the Bush administration is doing, this is a textbook-worthy case study of how huge changes in our way of life can be implemented by the dryest of dry government documents.

See also: More Lame Ducks: shortened reviews for regulations.

Text Visualization Tools

What would it be like if we had true open access to large quantities of government text? We would be able to do much more than retrieve a page of the Congressional Record and read it. Researchers would be able to analyze the text and create new, innovative ways of discovering, browsing, searching, and reading text-based information.

Clifford Lynch has written eloquently about this in the realm of scholarly literature (Clifford A. Lynch, "Open Computation: Beyond Human-Reader-Centric Views of Scholarly Literatures," Open Access: Key Strategic, Technical and Economic Aspects, Neil Jacobs Ed., Oxford: Chandos Publishing, 2006, pp. 185-193.).

I was reminded of these issues this morning when looking at Visualization Strategies: Text & Documents on Tim Showers Web Design Blog (August 20th, 2008). Tim lists more than a dozen examples of techniques and tools. One of my favorites is the visualization of the 2008 Democratic primary debates offered by the New York Times. You can hear the debate, search for keywords and see where they appear, browse a transcript, and more.

Shouldn't we have free, open access to large bodies of all government texts (not just search-and-retrieve access to bits-and-pieces) so that we can easily create corpora that can be indexed, browsed, and analyzed?

Thanks and a tip of the hat to Tim Dennis!

Political Fundraising? It's Party Time!

The Sunlight Foundation has launched a new web site, Party Time!, which aims to document the political party circuit -- not "political parties" as in "GOP" and "Democratic," but parties as in champagne, food, golf... and money: "the social whirl surrounding politicians in their quests to raise cash to run their campaigns."

There is a searchable database that lets you track parties thrown at the 2008 Democratic and Republican National Conventions as well as fundraising activities by all lawmakers running for Congress that happen all year round going back to 2006.

SEC To Replace EDGAR With 'IDEA'

SEC To Replace EDGAR With 'IDEA', by K.C. Jones, InformationWeek, Aug. 19, 2008.

The Securities and Exchange Commission (SEC) intends to supplement its aging EDGAR system and eventually replace it with a new one called Interactive Data Electronic Applications (Idea). It hopes to make Internet searches about publicly held companies and mutual funds simpler and more comprehensive and make data easier to download to spreadsheets, enter in databases, and compare.

The SEC said the move would allow it to transition from collecting forms and documents to making the information freely available to investors. The new system should also provide current information in a format that is easy to access, collate, sift through, and compile into new reports.

Most SEC filings have used the Edgar format, which has limited investors and others who want to examine information about public companies to viewing one form at a time.

Presidential Signing Statements

The Subcommittee on Oversight and Investigations of the House Armed Services Committee issued a report on Presidential signing statements: Findings of the Subcommittee on Oversight and Investigations in Support of the Full Committee re: Presidential Signing Statements (PDF, 4 pages). It is also available here from the Federation of American Scientists Project on Government Secrecy, which describes the report here: White House Signing Statements “Unsubstantiated,” Report Says, by Steven Aftergood, Secrecy News, August 20, 2008.

The Subcommittee held hearings on signing statements ("Testimony on the impact of the Presidential signing statement on the Department of Defense’s implementation of the Fiscal Year 2008 National Defense Authorization Act") on March 11, 2008. Prepared statements and audio transcripts are available on the Committee's hearing information page. (Nothing is available from GPO Access yet, apparently.)

Also see: Essential Reading About Signing Statements (which includes links to audio files of the hearings from hascaudio.house.gov, but which I had no success in loading) and News About Signing Statements maintained by Joyce A. Green.

Just browsing around this important topic and trying to find a single, reliable link to all the information from the government is a good demonstration of how far we have to go to get good access to government information.

More Lame Ducks: shortened reviews for regulations

OMB Watch is reporting on the Bush Administration Pushing Last-Minute Rollbacks (08/19/2008). "The Bush administration is trying to finalize several new rules, covering a range of policy issues, before a new administration takes over and despite its own policy directive."

The article reports that one proposed change in regulations missed the administration's own deadline for new rules by two and a half months. It also says that the Office of Information and Regulatory Affairs (OIRA), the White House office in charge of approving, changing, or rejecting new administration policy, spent only three days reviewing one proposed rule when the average time has been 71 days.

Government information specialists should be aware of this since the public comment period for some changes in rules has been reduced from the standard 60 days to 30 days.

Proposals included changes to the Endangered Species Act; changes to how estimates for on-the-job risks are calculated; and changes that would make it easier for state and local police to collect intelligence about Americans.

A leaked draft rule that could reduce women's access to birth control by classifying oral contraception as a form of abortion has not been submitted to OIRA yet.

Discovering the Library With Google Earth

Discovering the Library With Google Earth, by Michaela Brenner and Peter Klein, Information Technology & Libraries, Volume 27, Number 2 June 2008 (re-posted at redOrbit).

Libraries need to provide attractive and exciting discovery tools to draw patrons to the valuable resources in their catalogs. The authors conducted a pilot project to explore the free version of Google Earth as such a discovery tool for Portland State Library's digital collection of urban planning documents. They created eye-catching placemarks with links to parts of this collection, as well as to other pertinent materials like books, images, and historical background information. The detailed how-to-do part of this article is preceded by a discussion about discovery of library materials and followed by possible applications of this Google Earth project. In Calhoun's report to the Library of Congress, it becomes clear that staff time and resources will need to move from cataloging traditional formats, like books, to cataloging unique primary sources, and then providing access to these sources from many different angles. "Organize, digitize, expose unique special collections" (Calhoun 2006).
