government data

ALA Annual GODORT Update Meeting: "Need Data – but don’t know where to go?"

More Gov Info Presentations @ ALA Annual

If you are going to the ALA Annual 2009 Conference in Chicago next week, please come to the "ALA Unconference" where I will be leading a broad discussion on Friday, July 10th from 11:10-12:00 on the library's role in current & emerging trends of civic engagement, transparency, preservation and access to Government information. The supporting materials and presentation will be linked in the Unconference wiki.

Also, please come to the LITA BIGWIG Social Software Showcase to discuss and learn about Government Information Mashups! I will be presenting on this topic and would love to have you help out and/or join in on the conversation! The presentation will be posted on their website but the face to face portion of the BIGWIG Showcase presentations will take place Monday, July 13th from 10:30am - 12:30pm in the McCormick Convention Center West, Room W-184.

Lunchtime listen: Tim Berners-Lee on government data

I just read Tim Berners-Lee's notes on putting government data online. I must say, when TBL describes it, it sounds like a piece of cake :-) The key seems to be the use of linked data. It's a snap; let's do it! RAW DATA NOW!!


Footnote: Do's and Don'ts

* Do pick URIs which are likely to be persistent
* Do put RDF metadata giving the license.
* Do use the RDF and SPARQL standards
* Make sure your human readable pages are accessible.

* Do NOT hide data files inside zip files unless they are also available directly.
* Do NOT put data up in proprietary formats.
* Do NOT wait until you have a complete schema or ontology to publish data.
* Do NOT seek to replace existing data systems.
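
As a rough illustration of the "RDF metadata giving the license" item, a dataset's license can be stated in a single RDF triple. The sketch below builds one in N-Triples form using only the Python standard library. The dataset URI is hypothetical, and the choice of the Dublin Core `license` property and a Creative Commons public-domain URI is my own assumption for illustration, not something the notes prescribe.

```python
# Sketch: emit one RDF statement (N-Triples syntax) giving a dataset's license.
# DATASET_URI is a made-up example; the property and license URIs are
# illustrative choices, not requirements from the notes quoted above.

DATASET_URI = "http://example.gov/data/corn-production"  # hypothetical
LICENSE_PROPERTY = "http://purl.org/dc/terms/license"  # Dublin Core Terms
LICENSE_URI = "http://creativecommons.org/publicdomain/zero/1.0/"  # assumed


def license_triple(dataset_uri: str, license_uri: str) -> str:
    """Return one N-Triples line: <subject> <predicate> <object> ."""
    return f"<{dataset_uri}> <{LICENSE_PROPERTY}> <{license_uri}> ."


if __name__ == "__main__":
    print(license_triple(DATASET_URI, LICENSE_URI))
```

Publishing a line like this next to the raw file, at a URI that is likely to persist, would cover the first two Do's without waiting for a complete schema.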

Open Up Government Data Wiki

Over at the Wired magazine "How-to wiki," there is a page about making government data more easily available.

It looks like it is specifically oriented toward statistical information, numeric data, and survey results ("The numbers — about how much corn we grow, what the universe looks like from Hubble, how much coal we have, and how well drugs work — are the results from the grand experiment of this country"), but it already includes documents (like technical reports).

The purpose of the wiki?

We've established this wiki to help focus attention on valuable data resources that need to be made more accessible or usable. Do you know of a legacy dataset in danger of being lost? How about a set of Excel (or — shudder — Lotus 1-2-3) spreadsheets that would work better in another format? Data locked up in PDF's?

This is your place to report where government data is locked up by design, neglect or misapplication of technology. We want you to point out the government data that you need or would like to have. Get involved!

Based on what you contribute here, we'll follow up with government agencies to see what their plans are for that data — and track the results of the emerging era of Data.gov.

With your help, we can combine the best of new social media and old-school journalism to get more of the data we've already paid for in our hands.

From the look of it, I'd say that the folks who designed this page are not familiar with the many existing sources of government data, but that's just a guess. Nevertheless, I think this is worth monitoring and I hope that librarians contribute to it. (It's easy! "Just jump in and edit the wiki. Add links to data that's out of date or in danger of being forgotten or that comes stored in a less-than-ideal format. Help define how Data.gov gets built by making sure that the data you need is included." And... "If you're not comfortable with the MediaWiki formatting language, feel free to get in touch with Wired.com staff writer, Alexis Madrigal, either by e-mail alexis.madrigal[at]gmail.com or on Twitter: @alexismadrigal.")

They note:

We're not writing a policy paper here. We're trying to highlight datasets and sources of knowledge that the new Administration — and its open-data friendly CIO — could make more widely available and accessible with small, concrete actions.

Already on the list: Economic Research Service, ClinicalTrials.gov, creating a data catalog of every agency's data streams, "View Data Release From the User's Point of View, not the Agency's", and more.

It also lists "Models for Government Data Release, Transparency" such as ESDIS (Earth Science Data and Information System Project at Goddard).

bulk data downloads

Recommended reading: O'Reilly Radar - Bulk Data Downloads: A Breakthrough in Government Transparency by Tim O'Reilly (March 4, 2009):

"What would it mean if all the bulk data from the Library of Congress, Congressional Research Service, Government Printing Office, and "the appropriate entities of the House of Representatives" were made available?"

Here's an excerpt from the appropriations bill that's the focus of the post - "*Public Access to Legislative Data* - There is support for enhancing public access to legislative documents, bill status, summary information, and other legislative data through more direct methods such as bulk data downloads and other means of no-charge digital access to legislative databases..."

O'Reilly was especially struck by the possibilities embedded in that final passage - "bulk data downloads and other means of no-charge digital access to legislative databases" and the specific reference to agencies ...

Read the full Radar post here!

Bulk data and Legislative Information 2.0

Tim O'Reilly writes that bulk data of Congressional information may soon be a reality. Congressman Mike Honda (D-CA 15th District) pointed out an explanatory statement the subcommittee was working on which would require the agencies that serve the U.S. Congress to distribute their data in bulk. No doubt this was due in no small part to the wonderful folks at the Sunlight Foundation and many other govt transparency activists. The explanatory statement has made it into Division G - Legislative Branch Appropriations Act, 2009, the section that Honda's subcommittee is working on, but it is not in the current text of the bill. You can track the changes to the bill with the GovTrack.us bill tracker (I hope Tim is reading this because he lamented the lack of change control :-) ).

The money quote in the explanatory statement for the H.R.1105 omnibus appropriations bill is this paragraph:

*Public Access to Legislative Data* - There is support for enhancing public access to legislative documents, bill status, summary information, and other legislative data through more direct methods such as bulk data downloads and other means of no-charge digital access to legislative databases. The Library of Congress, Congressional Research Service, and Government Printing Office and the appropriate entities of the House of Representatives are directed to prepare a report on the feasibility of providing advanced search capabilities. This report is to be provided to the Committees on Appropriations of the House and Senate within 120 days of the release of Legislative Information System 2.0.

Numbers Aren't Enough: Providing Context

In my first post, I wrote about making information useful for ordinary people. It's been a pleasure and an honor to guest blog here for the past month, and as the month of October is nearly gone, I figure it seems fitting to come back to this subject as my reign as "Blogger of the Month" comes to an end.

Large numbers in particular are difficult to comprehend, and the world of government information is full of them: earmarks range from hundreds of thousands to tens of millions of dollars, Barack Obama's fundraising totals have eclipsed six hundred million dollars, and the $700 billion bailout package had pundits scrambling to describe things that cost $700 billion. The difficulty of explaining just how big some of these numbers are was taken to an absurd extreme when CNN presented a calculation of how many McDonald's apple pies could be purchased for each US citizen with such a sum.

One of the most useful ways of putting information in context that I've seen, for government information or anything else, is the sparklines at watchdog.net:

These graphics show each lawmaker's statistics in context, as well as the general shape of the distribution across Congress as a whole. A congressperson's request for $147 million in earmarks may sound like a lot, but seeing that it puts them outside of the top 100 provides useful and much-needed context. The shape of the line also shows whether there is a smooth trend or a sharp jump, with a small handful of lawmakers raising or spending drastically more than others.
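
To make the idea concrete, here is a minimal sketch of how a single figure might be placed in context: it ranks one value within a distribution and draws a crude text sparkline of the whole set. This is not how watchdog.net builds its graphics. The function names are my own, the dollar figures are invented, and values are assumed to be non-negative.

```python
# Sketch: show one value in the context of a whole distribution.
# All dollar figures below are invented for illustration.

BARS = "▁▂▃▄▅▆▇█"  # eight block heights, lowest to highest


def sparkline(values):
    """Render non-negative values as block characters scaled to the maximum."""
    top = max(values)
    if top == 0:
        return BARS[0] * len(values)
    last = len(BARS) - 1
    return "".join(BARS[min(last, int(v / top * len(BARS)))] for v in values)


def rank_of(value, values):
    """1-based rank of value, counting how many entries are strictly larger."""
    return 1 + sum(1 for v in values if v > value)


if __name__ == "__main__":
    earmarks_millions = [900, 450, 300, 147, 120, 60, 30, 10]  # hypothetical
    ordered = sorted(earmarks_millions, reverse=True)
    print(sparkline(ordered))
    print(f"$147 million ranks {rank_of(147, ordered)} of {len(ordered)}")
```

The rank answers "is this a lot compared to everyone else?", while the line shows whether the top of the distribution is a gentle slope or a cliff.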

Hopefully more and more presentations of government information will follow the lead of the terrific watchdog.net and attempt to surround information with relative context so that government information isn't simply available, but understandable.
