...mentioned that Google was NOT going for archival quality (indeed COULD not) in its scans and was OK with skipped pages, missing content, and less-than-perfect OCR -- he mentioned that the OCR process AVERAGED one word error per page of every book scanned! The key point that I took away from this is that the Google book project IS NOT an alternative to library/archive/archival/preservation scans.
When we digitize materials, we want to digitize them only once. Therefore, we want the digital asset that we create to be the best that it can be. I agree with Jim Jacobs of diglet that the libraries involved with Google should not be pleased with the quality that Google is turning out. (And neither should we.) If those institutions want archival-quality scans of their books -- especially the older, fragile works -- they will need to digitize them again. If they want to preserve the full contents of the books, they will also need to digitize them again, since we know that Google's efforts are not up to par.
Most disturbing was reading that Google is okay with the effort it is putting out. Let's hope that another book digitization project will show Google how it should be done.