Thursday, June 26, 2008

Blog post: U.S. copyright renewal records available for download

This is wonderful news from Google for people researching U.S. copyrights:
How do you find out whether a book was renewed? You have to check the U.S. Copyright Office records. Records from 1978 onward are online (see http://www.copyright.gov/records) but not downloadable in bulk. The Copyright Office hasn't digitized their earlier records, but Carnegie Mellon scanned them as part of their Universal Library Project, and the tireless folks at Project Gutenberg and the Distributed Proofreaders painstakingly corrected the OCR.

Thanks to the efforts of Google software engineer Jarkko Hietaniemi, we've gathered the records from both sources, massaged them a bit for easier parsing, and combined them into a single XML file available for download here.
Commenting on this announcement, Siva wrote:
This is great news for historians, journalists, researchers, publishers, and librarians. It's also great for the Open Content Alliance and other book digitization projects.

Of course, this does not help much with books published and copyrighted outside of the United States. But that's always a complication.

However, I wonder if Google itself is going to use these records to change the format of many of the scanned books published between 1923 and 1963. Currently, these are only available in "snippet" form. Will Google Book Search change significantly now that this file is available?

That last paragraph is very interesting. Could this allow Google and others to display more, now that they can easily check these records? I hope the answer is "yes."

