Open Government: NARA Releases Six Datasets

January 29, 2010 by Matthew Schaefer, posted in Databases, Open Government, Research

As part of the Open Government initiative, NARA recently released six datasets available for the first time as raw data in XML format. The datasets are:

three editions (2007, 2008, and 2009) of the Code of Federal Regulations (CFR)
archival descriptions from the Archival Research Catalog (ARC)
organization descriptions from the Archival Research Catalog (ARC)

To learn more, see the press release.

6 thoughts on “Open Government: NARA Releases Six Datasets”

Kate Theimer says:

January 29, 2010 at 5:03 pm

Clever people have already started working with these data sets, as discussed in a post on Mark Matienzo’s blog, “thesecretmirror” (http://thesecretmirror.com/description/nara-and-data-dot-gov):

“…Obviously, transferring this much data is difficult, and I was quite shocked when I discovered that NARA didn’t bother to compress this data in the first place when I first decided to get my grubby paws on it. Not to be outdone, I corresponded with a few people over Twitter who were just as interested in the data, specifically Simon Spero at the UNC School of Information and Library Science, and Richard Urban, at UIUC’s Graduate School of Library and Information Science. The three of us made a concerted effort to grab the data from NARA’s web server and make a compressed version available.

After 6 hours or so of transferring the files and compressing them, Simon has posted the compressed dataset on ibiblio.org, as part of his Fred2.0 dataset project. Download the whole thing, decompress it, and start crunching – there’s so much you can do with it! Convert the series descriptions to EAD! Convert the organizational descriptions and histories to EAC! Throw Mitchell Whitelaw’s series browser on top of it! The future’s in your hands, people, and now the data is too.”

Jill says:

January 29, 2010 at 5:44 pm

We are excited to hear that Mark Matienzo, Simon Spero and Richard Urban are eager to work with the data NARA has made available. NARA IT staff kept running into technical issues while working to compress the data, so we decided to go ahead and post it uncompressed rather than hold the data back. It’s great to see that Mark, Simon and Richard took the initiative to compress and share it.

We look forward to hearing more about what mashups and visualizations people are able to create based on the data. We hope they have fun working on it!

– Jill (Admin)

Simon Spero says:

February 15, 2010 at 10:24 am

Hi Jill –
I’ve run into some issues with the ARC data sets; so far I’ve fixed some character encoding issues that led to a number of records not being valid XML.

There are also some Oracle-generated errors that I haven’t fixed yet, but there’s more detail, along with tentatively fixed tarballs, on the Fred 2.0 blog.

See some of the posts under the archival-research-catalog tag.
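
A check for the kind of problem described above, records that fail to parse as XML, might look like the following minimal sketch. It assumes one XML record per `.xml` file in a single directory; that layout and the function name `find_invalid_records` are hypothetical and are not taken from the ARC data sets or the Fred 2.0 tarballs.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def find_invalid_records(directory):
    """Return (file name, parser message) pairs for files that are not well-formed XML.

    Assumption: each record is stored as its own .xml file in `directory`.
    """
    invalid = []
    for path in sorted(Path(directory).glob("*.xml")):
        try:
            # The parser rejects bytes that are invalid in the declared
            # encoding, which is how encoding problems surface here.
            ET.parse(str(path))
        except ET.ParseError as exc:
            invalid.append((path.name, str(exc)))
    return invalid
```

Calling `find_invalid_records("records/")` on a directory of exported files would list the files the parser rejects, which gives a starting point for deciding which records need their encoding repaired.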

1. Jill says:
  
  February 17, 2010 at 6:13 pm
  
  Simon,
  
  Thanks for bringing these issues to our attention. I will pass them along to the tech staff who help us export and post the data. Please stay tuned.
  
  – Jill (Admin)
  
Dieter Maurer says:

October 2, 2010 at 6:01 pm

I have been looking for my military records from 1976-1988 for the last 20 years. Every time I request them, they can’t seem to find certain medical records or my records from the Enewetak Atoll cleanup project. Can anyone help me with this? I have the beginning stages of cancer.

1. John says:
  
  October 6, 2010 at 1:12 pm
  
  Hi Dieter,
  
  When you have previously tried to obtain your military records, have you been contacting the National Personnel Records Center in St. Louis, Missouri? If not, you should definitely give them a try; they typically hold personnel service records, including medical information, for retired service members of all military branches. You can request your personnel file by filling out Standard Form 180 (available as a PDF download on NARA’s web site at http://www.archives.gov) or by submitting an online request through eVetrecs (an electronic service available to veterans and their next of kin on NARA’s site at http://www.archives.gov/veterans/evetrecs/).
  
  – John
  
