Wednesday, November 21, 2007

Yale, Microsoft & Kirtas...and a short rant

Yale University has signed an agreement with Microsoft for the company to digitize 100,000 out-of-copyright books over the next year. University Librarian Alice Prochaska said in an interview that the books Microsoft "scans will be available only on Microsoft’s search engine, the University will receive digital files of all the books that are put online, and the entire digital collection will be linked through the Yale Library Web site and Orbis catalog listings."

The article states that there will be (or is) a non-disclosure agreement, so the financial details will not be made public. Generally, however, Microsoft and Google subsidize the cost of the digitization, either in its entirety or in part.

And who is actually doing the digitization? Kirtas, the creator/manufacturer of a high-speed automated book scanner. Kirtas has an "in-house service bureau that employs more than 75 image technicians and operates three shifts- has mastered a proprietary digitization process that guarantees an overall error rate lower than one per 10,000 pages, ensuring quality mass digitization that will meet the highest standards and endure the test of time."

[rant] I continue to find these (Google, Microsoft, OCA) projects fascinating to watch for a variety of reasons. However, I also find it sad to think of the non-book content that should be digitized but is not. There are many cultural heritage organizations that need to begin to digitize, but that can't find funding to get started. Yes, they should collaborate, but do they have what other collaborators would want? They have content, but not money and maybe not manpower. I also know of libraries in the U.S. that have not yet automated their catalogues. I know that digitization is different from retrospective conversion, but...well...I guess retrospective conversions aren't sexy at this point. Okay...I'll get off my soapbox. [/rant]


i agree said...

I completely agree. A justified and necessary rant. Keep it up.

Hans J. Albertsson, hobby audio archivist said...

Another point, which is just as serious, is that they're putting the results into what, to my eye, look like proprietary formats. One then needs proprietary software to read the books, at least for complete and easy access.

Public resources should be kept public.