APIs
NYTimes updates its API to Congressional data
Submitted by jajacobs on Tue, 2010-02-23 11:16.
The New York Times announced today the release of version 3 of its "Congress API."
- Introducing Version 3 of the Congress API, By DEREK WILLIS, New York Times Open Blog (February 23, 2010).
The Times gets raw data directly from the U.S. House and Senate Web sites and Thomas, the Library of Congress public web site with legislative information. It parses and stores the data on its own servers and provides an API (Application Programming Interface) to the data so that programmers can query the data, get results, and easily provide the data to users in interesting and unique ways.
This is an excellent example of treating government information as "data" rather than as "documents." Rather than having a PDF file that lists all members of Congress (a document-centric way to deal with information), a database of all members of Congress with an API front-end to the database (which treats information as data) lets developers build software that allows users to get a list for a state or district. When combined with other information such as voting records, bill sponsorship, party affiliation, and so forth, users can get the information they need assembled in response to a specific information request. To the user, the end result looks like a "document," but the document is built dynamically from the data.
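To make the distinction concrete, here is a minimal Python sketch of a "document" assembled on request from member records. The records, the field names, and the `members_report` function are all invented for illustration; they are not the Times' data or its API.

```python
# Hypothetical member records; names and fields are invented for illustration.
MEMBERS = [
    {"name": "Pat Example", "state": "OR", "district": 1, "party": "D"},
    {"name": "Lee Sample", "state": "OR", "district": 2, "party": "R"},
    {"name": "Kim Placeholder", "state": "NY", "district": 14, "party": "D"},
]


def members_report(members, state):
    """Build a plain-text listing of one state's members from raw records."""
    selected = sorted(
        (m for m in members if m["state"] == state),
        key=lambda m: m["district"],
    )
    lines = [f"Members for {state} ({len(selected)} found)"]
    for m in selected:
        lines.append(f"  District {m['district']}: {m['name']} ({m['party']})")
    return "\n".join(lines)


print(members_report(MEMBERS, "OR"))
```

The same records could just as easily feed a map, a table, or a feed; the "document" is only one view of the data.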
Developers at the NY Times and elsewhere are using this to create interesting web sites and applications. See, for example, Your Government - The Oregonian, and Congress Speaks, and the Times' own Represent, which combines Federal and State information to allow users to find elected representatives in New York City.
NYT enhances its Congress API
Submitted by jajacobs on Wed, 2009-09-02 09:52.
The New York Times has a nifty interface that programmers can use to access information about Congress, the Congress API. Recently, they have added improvements including bill cosponsorships, a new members response, and member voting record comparisons. Read about it here:
- Congress Returns, As Does an Improved Congress API, By Derek Willis, New York Times Open Blog, (September 2, 2009).
Also see: NY Times Announces the Congress API.
More Gov Info Presentations @ ALA Annual
Submitted by blakeley on Wed, 2009-07-01 08:01.
If you are going to the ALA Annual 2009 Conference in Chicago next week, please come to the "ALA Unconference" where I will be leading a broad discussion on Friday, July 10th from 11:10-12:00 on the library's role in current & emerging trends of civic engagement, transparency, preservation and access to Government information. The supporting materials and presentation will be linked in the Unconference wiki.
Also, please come to the LITA BIGWIG Social Software Showcase to discuss and learn about Government Information Mashups! I will be presenting on this topic and would love to have you help out and/or join in on the conversation! The presentation will be posted on their website but the face to face portion of the BIGWIG Showcase presentations will take place Monday, July 13th from 10:30am - 12:30pm in the McCormick Convention Center West, Room W-184.
Response to Public Printer
Submitted by jajacobs on Thu, 2009-04-16 20:13.
We at FGI would like to thank Robert C. Tapella, the Public Printer of the United States, for his response to our comments on his letter to President Obama regarding open government.
Mr. Tapella's response has some information that should be very encouraging and heartening to the depository library community. It also leaves some issues troublingly unaddressed.
Bulk Data Access to Legislative Information
First, it is wonderful to know that GPO is working with the Library of Congress, Congressional Research Service, the Law Library of Congress, and the Senate and House on the issue of access to bulk legislative data!
That news is important and significant. It is also very encouraging because it marks a new direction for dissemination of government information. Taken to its logical conclusion, this would mean that we will have a new route to obtaining government information. No longer will we be limited to information presented as web pages through government-built interfaces. No longer will we have to hope that web scraping will find all the information we want to gather or preserve. Raw information -- once locked in the dark web of government databases -- will be, potentially, available for libraries and others to download and repurpose.
Unfortunately, we can't look for this right away. Congress has only asked for a report, not action. The report itself is due "within 120 days of the release of Legislative Information System 2.0." Presumably that is a reference to a new version of the LIS that is currently only available within the legislative branch. I have not seen an announcement of a date for the release of a new version of the LIS, so it is not clear even when we can expect the report.
Nevertheless, it is certainly good to hear directly from Mr. Tapella that the task force working on this report will develop "a position on access to bulk data" and even intends to "work on making bulk data accessible."
It is somewhat ironic that this long, drawn-out process itself demonstrates the need for bulk data access. Although there have been calls for bulk data access for years, it literally took a legislative directive to get GPO and LOC and CRS to take the tentative steps they are taking now: to "develop a position" and "work on" the problem. Such passivity and long delays are, perhaps, inherent in a large, bureaucratic system, but they are crippling when it comes to keeping up with technological changes. This demonstrates why it is essential for the government to provide easy, free, reliable access to the raw information of government: doing so will enable others -- who can more quickly adopt new technologies -- to provide better access to that information faster than the government can.
What about Non-Legislative Data?
It is also unfortunate that the task force is only looking at bulk delivery of legislative information. Will it take another legislative directive to get GPO to "develop a position" on bulk access to other data? See Bulk Data Downloads: A Breakthrough in Government Transparency (by Tim O'Reilly, O'Reilly Radar, Mar 4, 2009) for a short list of other data for which we need bulk access.
Will GPO Support Collections in FDLP Libraries or Just Backups?
Mr. Tapella's statement does not indicate that GPO has yet grasped the difference between 'backups' and digital deposit. GPO's focus is apparently still on making sure that its own collection is functional rather than facilitating digital collections in FDLP libraries. The "geographically dispersed content repository" described by Mr. Tapella is only "our backup" designed to ensure GPO's "continuity of operations" if GPO's own data repository becomes inoperable. This is a good and necessary feature but it is only a backup for GPO and has nothing to do with digital deposit.
Although Mr. Tapella points out that FDsys supports "repositories that can accept data much like libraries today accept tangible publications distributed from GPO," it seems clear that this generic design is intended as providing "backups" and would require "enhancements" to include bulk data access. This is a GPO-centric way of thinking. This is still a long way from GPO having a "position" on digital deposit and even further from "working on" making it possible.
Until GPO understands that it needs to support digital deposit so that FDLP libraries can build their own digital collections with their own functionality, FDLP libraries will not be partners in preservation and access; they will be, at best, little more than a backup for GPO.
APIs are not Digital Deposit
Mr. Tapella repeats the advantages of APIs, but fails to address the need for digital deposit. Providing APIs is not the same thing as providing digital deposit. As we said in our original comment, APIs are not magic. Each is a design for access and the product of choices made by the designer. Each has its own constraints built in. But don't take our word for it; read what developers say about the constraints of using existing government APIs:
- Extracting Government Spending Data via Talend and Ruby into CouchDB, by Rohit Amarnath, Full360 (04/11/2009).
- Improve databases, By Joshua Tauberer, The Hill (06/12/07).
We love APIs! We think they are great! We want more! We are so very glad that GPO will support them at last! But, please, Mr. Tapella, understand that APIs and a web site are only two of the three parts of a complete access system. Bulk data access is essential and we'd like to hear that GPO is planning for it now.
OAIS is not Digital Deposit
We are so very happy that FDsys is based on OAIS. It is something we have long advocated. But, again, Mr. Tapella, please understand that telling us about your preservation system and your intentions to preserve information does not reassure us that everything will be preserved and freely available to everyone forever. As we pointed out in our original comments, regardless of your intentions and the quality of your system, GPO may not always have the funding, resources, or mandate to provide free, permanent, public access to all government information and we therefore cannot rely on it alone to do so. And no single digital archive or repository can ever be as secure and safe as multiple archives. We need digital deposit to guarantee preservation and free access.
The GPO-centric approach to preservation and access is like a medieval town that stores all of its grain in one barn. When lightning strikes, the whole town goes hungry. In this day and age of $200 terabyte hard drives, peer-to-peer networks, and successful preservation systems like LOCKSS, it concerns us greatly that you still don't understand the need to have many collaborators working together to ensure long-term, free, public access.
Good News?
There are a couple of sentences in Mr. Tapella's reply that make me optimistic that GPO is on a path to change and does understand this need for collaborators. He says:
We need help from you and others in the community to help define future enhancements to access and data distribution. We see APIs as a one of the methods to provide advanced access tools, and realize that this is just one part of the ultimate solution.
To me, this says two important things: First, "data distribution" is on the GPO agenda, at least nominally; second, APIs are just one part of a bigger, ultimate, solution. This gives me hope for more. I hope I'm not reading too much into this.
See also:
- Bulk data and Legislative Information 2.0.
- Congress’ legislative information systems: THOMAS and the LIS by Jeffrey C. Griffith, Government Information Quarterly 18.1 (2001): 43-60. Apr 16, 2009
- Congressional Research Service Products: Taxpayers Should Have Easy Access, Project on Government Oversight, February 10, 2003.
- Comparison of Legislative Resources on GPO Access and Selected Government and Non-Government Web Sites
- Remixes: Creative uses of free government information
- OpenHouse Project Op-Ed on Databases
Library Application Program Interfaces
Submitted by jajacobs on Sat, 2009-03-21 11:26.
For reasons of serendipity, this seems to have been API week at FGI. We just keep posting stories about APIs. So, here is one more. It's not new, but still a good one:
- Library Application Program Interfaces, By Roy Tennant, TechEssence, July 17th, 2008.
Application Program Interfaces (APIs) are structured methods for one software application to communicate with another. APIs allow programs to interoperate and share data and services in a standard way. Here is a list of library-related APIs that library developers may find useful.
The People’s Data
Submitted by jajacobs on Sat, 2009-03-21 10:48.
The People’s Data, by Christopher Werth, NEWSWEEK, From the magazine issue dated Mar 9, 2009.
"Government should make data openly available and then let outside talent reimagine how it can be used online."
See also: Realizing Transparency Through Federal Government APIs, by Andres Ferrate, ProgrammableWeb, March 4th, 2009.
St Louis Fed goes gaga for APIs
Submitted by jrjacobs on Thu, 2009-03-19 18:18.
And speaking of APIs, I just noticed a post on govdoc-l that the St Louis Fed is now providing APIs for FRED (Federal Reserve Economic Data) and ALFRED (ArchivaL Federal Reserve Economic Data). Here's more information on their API. I hope our programming friends will check out FRED and ALFRED as there's a TON of data there, some going back to 1927!
From the St Louis Fed programmer:
"The FRED API accommodates any programming language that can parse XML and communicate with our servers using HTTP. The FRED API is based on the REST web service architecture. REST leverages familiar web technologies. Like a website, the FRED API uses HTTP to receive requests and send responses. Also like a website, the FRED API uses URLs to specify requests. This web service differs from a normal website by sending XML instead of HTML. HTML is a visual medium that's not always strictly formatted and flexible enough for arbitrary data structures. XML allows custom tags and relationships among tags."
This is *exactly* how govt agencies should be building their Web/data services. Thanks, St Louis Fed!!
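For programmers curious what "URLs in, XML out" looks like in practice, here is a minimal Python sketch of the pattern the quote describes. The base URL, the parameter names, and the XML element and attribute names below are assumptions made for illustration, and the values are made up; consult the FRED API documentation for the real request and response formats.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Hypothetical endpoint and parameter names, not the real FRED ones.
BASE_URL = "https://api.example.org/series/observations"


def build_request_url(series_id, api_key):
    """Encode the request as a URL, as a REST-style service expects."""
    return BASE_URL + "?" + urlencode({"series_id": series_id, "api_key": api_key})


# Illustrative XML only: element and attribute names are assumptions,
# and the values are invented.
SAMPLE_XML = """<observations>
  <observation date="2008-01-01" value="1.5"/>
  <observation date="2008-02-01" value="2.5"/>
</observations>"""


def parse_observations(xml_text):
    """Turn an XML response into a list of (date, value) pairs."""
    root = ET.fromstring(xml_text)
    return [(o.get("date"), float(o.get("value"))) for o in root.iter("observation")]


print(build_request_url("EXAMPLE", "YOUR_KEY"))
print(parse_observations(SAMPLE_XML))
```

In a real program the XML text would come from an HTTP GET on the built URL rather than from a string constant.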
Mmmmm... APIs!
Submitted by jajacobs on Thu, 2009-03-19 14:06.
Although we often mention the desirability of having APIs for government information, it is one of those things that we feel you can't say too often. So, it is nice to see the announcement by the excellent UK newspaper The Guardian that it is launching a robust API for its content and data.
Equally nice is the write up about this event by Felix Salmon, who outlines the benefits of APIs.
- Blogonomics: APIs, by Felix Salmon, Portfolio.com, Mar 11 2009.
Enjoy!
HackingCongress.org
Submitted by jajacobs on Wed, 2009-03-18 13:46.
Josh Tauberer has set up a new site, HackingCongress.org, meant to be "the intersection of civics & technology." Josh invites you to create an account so you can participate and edit any content on the site. "It's basically a new wiki," he says.
HackingCongress is a new hub for projects at the intersection of civics & technology, fostering civic engagement and education, advancing government transparency, and supporting communication with government. ("Hacking" has a dual meaning in the computer world and in this case it is positive slang for creative programming.) The focus of this site is on projects related to the U.S. Congress and state-level legislatures.
The goal is to be a hub, or at least a links page, for the developer community surrounding the intersection of civics and technology especially (but not exclusively) as it relates to the U.S. Congress.
Create an account and start editing pages. Make sure your project is listed with a description you like, and add any other relevant projects, data sources, and APIs to the appropriate pages.
There is already a lot there: the beginnings of a list of the databases and APIs that are available for government transparency data; links to ongoing projects broken down by type; and an aggregator of blogs in the open government tech community, planet.hackingcongress.org.
...and more!
iGov
Submitted by blakeley on Tue, 2009-01-20 22:38.
The Atlantic published an article entitled "iGov: How Geeks are Opening Up Government on the Web" by Douglas McGray that discusses API Documentation and examines the possibilities when government agencies allow access to their raw data in an open, standard file format. The article uses the BART system as an example:
Turns out, it didn’t. In 2007, Google engineers asked public-transit agencies across the country to submit their arrival and departure data in a simple, standard, open format—a text file, basically, with a bunch of numbers separated by commas—so Google Maps could generate bus and subway directions. A handful of agencies, including BART, decided to go a step further and publish that raw data online. Once they did that, any programmer could grab the data and write a trip planner, for any platform.
“It’s not 1995,” BART’s Web-site manager, Timothy Moore, explained. “A single Web site is not the endgame anymore. People are planning trips on Google, they’re using their iPhones. Because we opened up our schedule, we are in those places.”
“We can’t envision every beneficial use for our data,” Moore told me. “We don’t have the time, we don’t have the resources, and frankly, we don’t have the vision. I’m sure there are people out there who have better ideas than we do. That’s why we’ve opened it up.”
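The "text file ... with a bunch of numbers separated by commas" that the article describes is about as easy as data gets for a program to consume. Here is a minimal Python sketch of a tiny trip-planner lookup over such a file. The schedule rows and the column names are made up for illustration; this is not BART's actual feed or its real format.

```python
import csv
import io

# Made-up schedule data; the column names are illustrative assumptions.
SCHEDULE_CSV = """stop,route,departure
Station A,Red,08:05
Station A,Blue,08:12
Station A,Red,08:20
Station B,Yellow,08:07
"""


def next_departures(csv_text, stop, after):
    """Return (departure, route) pairs at `stop` at or after time `after`."""
    rows = csv.DictReader(io.StringIO(csv_text))
    # Zero-padded HH:MM strings sort in the same order as the times do.
    return sorted(
        (row["departure"], row["route"])
        for row in rows
        if row["stop"] == stop and row["departure"] >= after
    )


print(next_departures(SCHEDULE_CSV, "Station A", "08:10"))
```

Because the format is open and simple, the same few lines would work on a phone, a web page, or a kiosk, which is the point Moore makes above.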
Lunchtime Listen: Greg Elin on One Click Disclosure
Submitted by jajacobs on Sat, 2009-01-17 08:14.
Here is a short, informative interview from NPR. You can listen online or download. It is only about 6 minutes long -- more of a coffee-break listen!
One Click Disclosure, On the Media, (January 16, 2009).
"Government spending data has long been publicly available but it's never been easier to find and interpret. That's thanks to USAspending.gov, a site created by the Federal Funding Accountability and Transparency Act of 2006 which was sponsored by Tom Coburn and Barack Obama. The Sunlight Foundation's Greg Elin explains what makes the site so revolutionary. "
NY Times Announces the Congress API
Submitted by jajacobs on Fri, 2009-01-09 06:23.
Announcing the Congress API, By Andrei Scheinkman and Derek Willis, New York Times, January 8, 2009.
The initial release exposes four types of data: a list of members for a given Congress and chamber, details of a specific roll-call vote, biographical and role information about a specific member of Congress, and a member’s most recent positions on roll-call votes.
The four work together, so you can start by retrieving a list of members, find the one(s) you’re interested in and then fetch additional details through other calls. We built this service to work with other publicly available data sources, so you can identify members of Congress with a seven-character code from the Biographical Directory of the United States Congress. For individual member responses, we included the numeric ID assigned by GovTrack, a free and open-source service that monitors legislative activity.
Our data comes directly from the U.S. House and Senate Web sites, and is updated throughout the day while Congress is in session....
You have to register for an api-key to use the system, but it is free (for now). Check it out here!
(Note that this is an API and returns XML so that you can build live data applications. You agree not to "archive any of the API content for access by users at any future date after you have finished using the service...." It is for building interactive applications.)
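The "start with a list, then fetch details" workflow described in the announcement can be sketched in a few lines of Python. Everything here is a stand-in: the functions return canned data instead of calling the Times' service, and the names, IDs, and fields are invented for illustration.

```python
# Canned stand-ins for API responses; every name, ID, and field is invented.
MEMBER_LIST = [
    {"id": "X000001", "name": "Pat Example"},
    {"id": "X000002", "name": "Lee Sample"},
]
MEMBER_DETAILS = {
    "X000001": {"state": "OR", "party": "D"},
    "X000002": {"state": "NY", "party": "R"},
}


def list_members():
    """Stand-in for a 'list of members' call."""
    return MEMBER_LIST


def get_member(member_id):
    """Stand-in for a 'member details' call, keyed by the shared ID."""
    return MEMBER_DETAILS[member_id]


def find_details(name):
    """Find a member in the list by name, then fetch details by ID."""
    for member in list_members():
        if member["name"] == name:
            return {**member, **get_member(member["id"])}
    return None


print(find_details("Lee Sample"))
```

The shared identifier is what lets separate responses, and separate data sources, be stitched together.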
APIs in 15 Minutes
Submitted by jturk on Fri, 2008-10-24 10:33.
There is a lot of talk about making data accessible via APIs, but there is also a lot of confusion about what this means, how to do it, and why it is beneficial when the average citizen cannot make heads or tails of an API.
API stands for "Application Programming Interface" but typically what we are discussing when we talk about APIs around data is a way to access data in a machine readable format. A machine readable format is something that is more or less understandable by a computer program, so that it may be used to present data in new and interesting ways.
The house.gov website has a listing of all representatives by state, but a computer program has no way of knowing how to understand this listing. A more useful listing might look like an Excel (or CSV) file that listed each congressperson's name in the first column, state in the second, and so on.
This is the fundamental advantage of an API: it makes data available in a way that a computer program can understand, so that more complicated things can be done by such a program (e.g., draw a map with states colored according to their representatives' party affiliations).
A side effect of this computer readable format is that it is possible to ask more useful and specific questions of the data. When you go to the above house.gov site it is possible to get a listing of all Representatives, but it is impossible to say "show me all representatives that are Democrats from North Carolina" or "show me all representatives named John." With an API this kind of query is typically very simple. For example, in the Sunlight Labs API this could be done by going to a URL like http://services.sunlightfoundation.com/api/legislators.get?state=NC&part....
It is the availability of these APIs that has allowed all sorts of interesting sites, known as "mashups," that combine data from multiple sources. One of the earliest and most popular examples was a site called HousingMaps that combines Craigslist housing data with Google maps.
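As a rough illustration of why a machine readable listing makes such questions easy, here is a minimal Python sketch that runs both example queries against a small CSV listing. The rows, the column names, and the `query` and `named` helpers are invented for this example; they are not the Sunlight Labs API.

```python
import csv
import io

# Invented rows in the layout described above: name, state, and so on.
LISTING = """name,state,party
John Example,NC,D
Mary Sample,NC,R
John Placeholder,OH,R
Ann Specimen,NC,D
"""


def query(csv_text, **criteria):
    """Return rows whose columns equal every given criterion."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [r for r in rows if all(r[k] == v for k, v in criteria.items())]


def named(csv_text, first_name):
    """Return rows whose name begins with the given first name."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return [r for r in rows if r["name"].split()[0] == first_name]


# "Show me all representatives that are Democrats from North Carolina."
print([r["name"] for r in query(LISTING, state="NC", party="D")])
# "Show me all representatives named John."
print([r["name"] for r in named(LISTING, "John")])
```

An API does essentially this filtering on the server, with the criteria passed in the URL.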
A handful of APIs exist to help make government data more accessible, through which it is now possible to make mashups using government data.
A rich sampling of them includes:
- Sunlight Labs API
- Capitol Words API
- FollowTheMoney API
- GovTrack.us API
- MapLight.org API
- NYTimes Campaign Finance API
- OpenSecrets API
- Project Vote Smart API
- Watchdog.net API
All of these can be used to pull the information available from these sites and do new and interesting things with it and even combine it with data from other sites to provide a more in-depth view than any single site or dataset can hope to offer.
NYTimes Announces Campaign Finance API
Submitted by jturk on Tue, 2008-10-14 18:24.
The New York Times has just announced an API that makes available the data they have gleaned from the Federal Election Commission's electronic filings for the presidential candidates.
"The initial version of the Campaign Finance API offers overall figures for presidential candidates, as well as state-by-state and ZIP code totals for specific candidates. In addition, the API supports a contributor name search using any of the following parameters: first name, last name and ZIP code."
This allows people with the appropriate technical skills to build mashups and other web services that take a look at donations by individual or by area with relative ease. In essence it is now possible for web developers to create views on this valuable data that previously would have involved digging through millions of FEC electronic filings.
It should also be possible for researchers with moderate technical knowledge to analyze the individual contributions going to candidates to perform statistical and other analysis on what makes for a very interesting dataset.
The New York Times providing this service is certainly a positive step toward helping people make use of what is one of the richest (pun not intended) datasets the federal government has to offer.
FEC data available as a widget and API!
Submitted by jajacobs on Fri, 2007-08-24 10:44.
No, the FEC isn't doing this; MAPLight.org is. But, the FEC is providing the data in an open format with detailed documentation which makes this all possible (see Files by Election Cycle at the FEC site).
MAPLight.org is providing access to Federal Election Commission (FEC) data through an API (application programming interface) that makes it easy for any Web developer to build their own site or software program that displays or shares up-to-date campaign contributions from the FEC (www.maplight.org/widgets/apis) and through widgets (www.maplight.org/widgets) that allow anyone to track presidential fundraising on their own blogs, social media sites, and personal Web sites.
Both services, the widgets and the API, are free and open source, so anyone can use or modify them as they see fit.
Here is an example of a widget (but you can customize for your own site, of course!).
The MAPLight.org presidential widget is the first of several more widgets that the organization will release. By September 15, MAPLight.org will release a widget for U.S. Congress, showing total campaign contributions for each candidate for Congress. By September 30, MAPLight.org will release its "Money and Votes" widget, revealing correlations between campaign contributions and votes for any bill in U.S. Congress. To be notified when MAPLight.org releases these widgets, visit www.maplight.org/participate/signup. MAPLight.org is a nonprofit, nonpartisan organization based in Berkeley, California. Its search engine at MAPLight.org illuminates the connection between money and politics (MAP) via an unprecedented database of campaign contributions and legislative outcomes.