Monday, January 31, 2005

Digital Assets and Mass Transit

Traveling recently on Amtrak and the Long Island Railroad, I noticed how other people occupied their time. Some did the traditional things: talking, sleeping, eating and reading. Others used "new" technologies to keep themselves amused. People talked on cell phones, listened to music CDs, watched movies on small DVD players, and worked on their laptop computers. This got me thinking about the digital assets we're placing on the Internet and how we access them.

Since we don't always have access to the Internet (don't have the right equipment or don't have an access point), these digital assets are often unavailable to us. If I'm in Bellport, NY and don't have Internet access, I may be unaware of a wealth of digitized information located there. And I won't know that it would be worth my while to not only see the materials online, but also to venture perhaps a few miles to see the actual items in person.

Well...good marketing could overcome that, as would Internet kiosks or "digitization project" kiosks. Do cultural heritage organizations (i.e., museums, libraries, historical societies) market themselves adequately to people traveling through their area? Even if I don't visit in person, will I know that one exists and that I could visit it online? Looking in the tourist information publication in my hotel room, I noted that no libraries were listed, although some museums were.

The idea of having Internet kiosks is an interesting one. Some shopping malls did this several years ago. The kiosks were generally used by kids and were heavily abused. But what if kiosks were placed in transit stations (perhaps with some technical limitations placed on what they can do), hotels, or convention centers? And what if those kiosks highlighted the materials available in person and online at local cultural heritage organizations? The impact could be interesting and might lead people to think not only about what's available there, but also what's available back home (if the person is a traveler) or at another train stop (if the person is a commuter).

If you have a digitization project, how are you marketing it? If you've done something creative, used a kiosk, or reached out to tourists, please write a comment and let me know.

Public Knowledge and Public Domain

When we think of digitization projects, we tend to think of digitizing those items that exist in the public domain. Going through copyright clearance on items not in the public domain can be time consuming and costly. The advocacy group called Public Knowledge is working to ensure that governments understand the impact of the public domain on creativity and innovation.

The web site notes that:
Primarily due to new legislation responding to new technologies, the information
commons has fallen out of balance. Nothing new will really enter the public
domain for the next 20 years. This will have a major impact of people's need to
communicate, to share ideas, to pass down traditional knowledge, to participate
in popular culture. Whether a writer or scientist, issues surrounding
Intellectual Property and the Public Domain affect us all. Journalists who are
being forced to sign over the rights to their writings for the next 95 years to
their employers will be allied with the genetic researchers who can't study
breast cancer genes because of patents.
The web site tracks issues and contains very good background materials. Given who is involved in the group, this is a reputable organization with clout. Definitely a group worth keeping your eyes on (and perhaps helping out).

Friday, January 28, 2005

The Copyright Office seeks to examine the issues raised by “orphan works”

The U.S. Copyright Office "seeks to examine the issues raised by 'orphan works,' i.e., copyrighted works whose owners are difficult or even impossible to locate. Concerns have been raised that the uncertainty surrounding ownership of such works might needlessly discourage subsequent creators and users from incorporating such works in new creative efforts or making such works available to the public. This notice requests written comments from all interested parties. Specifically, the Office is seeking comments on whether there are compelling concerns raised by orphan works that merit a legislative, regulatory or other solution, and what type of solution could effectively address these concerns without conflicting with the legitimate interests of authors and right holders."

Written comments are due by March 25, 2005. Background on this subject and additional information is available in the Federal Register and a

Tuesday, January 25, 2005

Rochester Public Library Embarks Upon Major Local History Digitizing Project

The Rochester Regional Library Council reported the following in its January/February 2005 newsletter. (Reprinted with permission)

Substantial gifts to Rundel Library Foundation from the Gleason Foundation, the estate of Josephine Tait, and other individuals will allow Rochester Public Library (RPL) to digitize over 1.5 million pages of its Local History Division’s bound material. The purchase of a Kirtas Technologies APT BookScan 1200™ system will not only make this effort possible, but will also earn RPL the distinction of being the first public library location for this revolutionary book digitizing technology. The acquisition of this system, which will be delivered in February, 2005, is part of a three-year, $500,000 digitization project recently launched by the Library.

“We see it as a win-win situation,” explained Carol Nersinger, Director of Rochester Public Library. “Our purchase allows us to support a local company and expand our leadership role in the use of digital technology to preserve and make accessible these important resources for both historians and genealogists around the world.”

“We are very excited to have a local partner in the Rochester Public Library System. Their vision around digitization of critical library resources is a strong statement that our technology is attractive and cost-effective for libraries of all sizes,” says Lotfi Belkhir, CEO of Kirtas Technologies.

Monday, January 24, 2005

A book scanner for under $250

Having mentioned two scanners last week that will do 1,200 pages per hour and are quite expensive, I feel that I should mention something in a lower price range.

The Alestron Opticbook 3600 is a manual scanner that can do single pages as well as books, and costs under $250. The bundled software allows for OCR and PDF creation, as well as image editing. For an organization that needs a flatbed scanner and would like the possibility of scanning books, this piece of equipment would be worth checking out.

Friday, January 21, 2005

Bookmobiles for the Digital Age: Conversation with Brewster Kahle

In Brewster Kahle's December 2004 speech at the Library of Congress, he talked about printing digitized books for $1.00 each from a bookmobile. The idea was to give away books (or sell them very cheaply) rather than loaning them. In this 2003 article from The Book & The Computer, we get more details on this remarkable -- and low-cost -- bookmobile that he has constructed. This bookmobile is already being used in several areas of the world.

Librarians will likely not like some of the things Kahle says in this interview. Perhaps this is a "call" for libraries to think more creatively about what they do (i.e., disseminate information) and how they do it.

How to Digitize Eight Million Books: A Conversation with Michael Keller

This article from 2003 details the Stanford University digitization program, where they are using a Swiss scanner that can do up to 1200 book pages per hour. The article notes that a 200-page book can be digitized in 20 minutes.

Stanford is now working with Google. In this more recent interview, Keller notes that "The Google plan allows us to accelerate our digitization schemes by orders of magnitude." The interview notes that the books will be digitized at Google's Mountain View, CA headquarters, which means that the books will be unavailable for some period of time.
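
As a back-of-the-envelope check on the figures above, here is a minimal Python sketch. It uses only the numbers quoted in this post (up to 1200 pages per hour, a 200-page book in 20 minutes, and the eight million books in the article's title). The single scanner running around the clock is purely my own hypothetical assumption, not a description of Stanford's or Google's actual setup.

```python
# Rough scanning-time arithmetic using figures quoted in the post.
PEAK_PAGES_PER_HOUR = 1200       # "up to 1200 book pages per hour"
BOOK_PAGES = 200                 # "a 200-page book"
QUOTED_MINUTES_PER_BOOK = 20     # "can be digitized in 20 minutes"
TOTAL_BOOKS = 8_000_000          # from the article title


def minutes_at_rate(pages, pages_per_hour):
    """Minutes needed to scan `pages` at a constant hourly rate."""
    return pages * 60 / pages_per_hour


def effective_pages_per_hour(pages, minutes):
    """Hourly rate implied by scanning `pages` in `minutes`."""
    return pages * 60 / minutes


def single_scanner_years(total_books, minutes_per_book):
    """Years for ONE scanner running nonstop (hypothetical: 24 hours a day, 365 days a year)."""
    hours = total_books * minutes_per_book / 60
    return hours / (24 * 365)


peak_minutes = minutes_at_rate(BOOK_PAGES, PEAK_PAGES_PER_HOUR)
effective_rate = effective_pages_per_hour(BOOK_PAGES, QUOTED_MINUTES_PER_BOOK)
years = single_scanner_years(TOTAL_BOOKS, QUOTED_MINUTES_PER_BOOK)

print(f"Minutes per book at peak rate: {peak_minutes:.0f}")
print(f"Effective pages per hour at 20 minutes per book: {effective_rate:.0f}")
print(f"Years for one nonstop scanner: {years:.0f}")
```

At the peak rate a 200-page book would take 10 minutes, so the quoted 20 minutes works out to an effective 600 pages per hour, presumably leaving room for handling the book. Under the hypothetical single-scanner assumption, eight million books would take roughly 304 years, which hints at why many machines (or a partner working at a much larger scale) would be needed.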

Grant Writing Workshop For Preservation And Digitization Projects

The following was posted to a digitization discussion list.


Grant Writing Workshop For Preservation And Digitization Projects Sponsored by the Ohio Historical Records Advisory Board (OHRAB)

April 13, 2005
Wright State University, Dayton, Ohio

Join us for this Society of Ohio Archivists pre-conference workshop presented by Tom Clareson, Manager of Education & Planning, Digital Collection & Preservation Services Division of OCLC. This one-day workshop focuses on preparing for and writing grants for digitization and/or preservation projects. Due to sponsorship from the Ohio Historical Records Advisory Board (OHRAB), this workshop is being offered at a reduced cost ($35) to all participants.


Since most digitization and preservation grants are funded through the National Endowment for the Humanities (NEH), the Institute of Museum and Library Services (IMLS), and the National Historical Publications and Records Commission (NHPRC), the session is based upon those particular granting entities' requirements. Developing proposals for state, local and foundation funding sources will also be addressed.

The workshop includes details on matching your institution's project with the appropriate funding agency and preparing to write a proposal. The majority of the day is spent covering the elements of a grant proposal and practicing writing those elements. Depending on the participants' needs, this workshop can be customized to focus on preservation, digitization, or both; and to focus on a particular granting entity. Learn how to properly choose and prepare a grant proposal including:

* Evaluating appropriate funding sources
* Project preparation
* Hands-on practice in preparing a proposal


Time: 9:30-3:30 (lunch is on your own)
Place: Wright State University Archives, Dayton, Ohio
Cost: $35
Registration: Register online at: or contact Angela O'Neal for a registration form.

Deadline: Registrations must be received by March 13, 2005. Late registrations will be an additional $10 and are subject to availability.

Contact: Angela O'Neal, Society of Ohio Archivists Program Co-Chair
Ohio Historical Society
1982 Velma Ave. Columbus, OH 43211
Phone: (614) 297-2576
Fax: (614) 297-2546

Scanning 1200 pages per hour

As we think about Google's announcement, we all wonder what technology they might be using. Is it manual or automatic? Are they using an off-shore operation, which might be less expensive? Is the cost of $10.00 per book really feasible?

We don't know exactly what Google and their library partners are doing, but we can look to see what technologies are available and perhaps speculate.

One vendor, Kirtas Technologies in Victor, NY, has a highly automated book scanner that will do 1200 pages per hour with costs as low as $.03 per page depending on "several parameters such as number of shifts, level of maintenance contract, labor cost and operator efficiency." Their scanner will handle fragile books without damaging them as well as thin pages. The operator needs very little training, so anyone who can change a toner cartridge (as they say) can work this scanner. The scanner also does post processing automatically.

The cost? Well, they don't tell you that up front. Looking at the video of the technology, etc., we can imagine that this is not an inexpensive machine to own. However, one option is to have Kirtas do the scanning for you (which we can hope is reasonably priced).

This technology has caught people's eyes, including those of Brewster Kahle, who is now on their Board of Directors.
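
To put the $10.00 per book question next to the per-page figure, here is a rough Python sketch. The $.03 per page and $10.00 per book numbers come from this post; the book lengths are hypothetical, and the calculation assumes the best-case per-page cost covers everything, which the vendor's own list of parameters (shifts, maintenance, labor, operator efficiency) suggests it may not.

```python
# Rough per-book cost arithmetic using figures quoted in the post.
COST_PER_PAGE_LOW = 0.03       # "costs as low as $.03 per page" (best case)
TARGET_COST_PER_BOOK = 10.00   # the $10.00 per book figure questioned above


def cost_per_book(pages, cost_per_page):
    """Scanning cost for one book at a flat per-page cost."""
    return pages * cost_per_page


def break_even_pages(target_cost, cost_per_page):
    """Book length at which per-page scanning cost reaches the target."""
    return target_cost / cost_per_page


# Hypothetical book lengths, chosen only for illustration.
for pages in (200, 300, 500):
    cost = cost_per_book(pages, COST_PER_PAGE_LOW)
    print(f"{pages} pages: ${cost:.2f}")

limit = break_even_pages(TARGET_COST_PER_BOOK, COST_PER_PAGE_LOW)
print(f"Break-even length at $10.00 per book: about {limit:.0f} pages")
```

Under these assumptions a 300-page book comes to about $9.00, and the break-even is roughly 333 pages. So $10.00 per book looks plausible for shorter books at the best-case rate, and less so once books get longer or the real per-page cost is higher.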

Thursday, January 20, 2005

Why the future doesn't need us

While talking to someone last week about digitization, we got off-topic and began talking about intelligent machines (e.g., scanners that were more intelligent, OCR software that really understood the text, etc). This article, "Why the future doesn't need us", was mentioned. The author, Bill Joy, was the cofounder and Chief Scientist of Sun Microsystems, and cochair of the presidential commission on the future of IT research. Here he talks about theories related to the development of intelligent machines (think of the Borg from Star Trek or the Cylons in Battlestar Galactica). This is a long article, but worth a look, even if you just skim the first couple of pages.

School for Scanning: Building Good Digital Collections

The Northeast Document Conservation Center (NEDCC) is holding its School for Scanning on June 1 - 3, 2005 at the Boston Park Plaza Hotel in Boston, MA. The registration fee is $410 per person. More information is available on NEDCC's web site.

Wednesday, January 19, 2005

Lesson learned: Is digitization preservation?

I teach a graduate class in Creating, Managing and Preserving Digital Assets at Syracuse University and last year I found that some students did consider digitization a form of preservation. Typically I hear people argue that if an item is used less due to digitization, then it is being preserved. Of course, using an item less does help to preserve it, but an archivist would want the item to go through some conservation efforts, etc., in order to ensure that it is preserved. However, these students were not thinking of that, but instead were looking at the preservation of the content through digitization and not of the item itself. Preserving the content is indeed important. If we are relying on digitization to do that, then those digital files must be properly preserved for the future. These students didn't consider that the item itself might "tell" a story. Researchers might, for example, be interested in the words written in a book as well as the book's binding, paper, ink used, etc. Those things cannot be captured by a digital file.

From working with those students a year ago I have learned to talk more in depth about why preserving the original item is important. I do acknowledge the role of preserving the content, but hope that we don't lose sight of the value of the original item.

Learning how others think about preservation was a very valuable lesson.

Tuesday, January 18, 2005

Call for Proposals: Digits Fugit! Preserving Knowledge into the Future

The following is being circulated on the Internet.


Call for Proposals
Preserving Knowledge into the Future
33rd Annual Museum Computer Network Conference
Location: Boston, Massachusetts
Meeting Dates: November 3 - 5, 2005
Proposal Deadline: February 12, 2005
Proposal Forms:

Dear Colleagues:

We've all spent a great deal of time and effort organizing vast quantities
of information in a variety of digital formats. We have now realized the
time has come to focus on preserving the valuable results of that tremendous
effort. In response, the theme for the Museum Computer Network's 2005
conference will be Digits Fugit! Preserving Knowledge into the Future. Come
to Boston, where the city's historic downtown will provide a perfect setting
to remind us of the value of preserving a proud past while keeping our eye
firmly on a glorious future.

As an organization, MCN has always championed back-office, mission-critical,
"heavy lifting" aspects of museum technologies. We provided advocacy and
professional development programs to help our membership accomplish basic
networking and database efforts that transformed the record keeping part of
museum work. For years now, we have turned our attention to refining
cataloguing standards, image digitizing technologies, developing smarter
systems for inter-operability, sustainability, and electronic publication.

The good news is that collectively, we have become quite skilled at creating
and delivering digital resources! The not-so-good news is that we now need
to spend equally prodigious efforts at preserving the fruits of our labor.
And as we know, this is not lacking in complexity! Fortunately, we have
kindred spirits and friends in this effort and we will be making every
effort to bring as many of them as we can to our conference. We will join
forces with the Northeast Document Conservation Center (NEDCC) who will
produce a new two-day curriculum on Digital Preservation at the Omni Parker
House, just before our conference. Speakers from Harvard, MIT, and the W3C
will be on hand to lead workshops and speak on panels, and we will organize
facilities tours of some of the more well-known digital production and
preservation sites in town.

Even though the theme will focus on digital longevity issues, MCN
conferences are always open for presentations on all of the other
technologies and concerns in our field. Accordingly, the Program Committee
is seeking presentations based on current and planned activities or research
that focus on one of the following broad areas of interest:

Preservation Metadata Standards
Preservation Planning
Preservation Policy Development
Research & Evaluation
Storage Technologies
Standards & Interoperability
Digital Rights Management

Multimedia & Streaming Technologies
Collection Information Management
Collaboration & Data Aggregation
Point of Sale & eCommerce
Electronic Publication
Imaging Technologies
Intellectual Property Rights
Management Issues
Membership & Fund Raising

Proposals will be accepted in one of the following three formats: panels,
workshops or roundtables. Each session must have a chairperson responsible
for finding other speakers and coordinating logistics. All chairs must be
affiliated with an institutional member or be an MCN individual member.
Topics for the sessions should fall into at least one of the categories
listed on page 1 of the proposal form. All proposals for sessions and/or
workshops must be submitted on the Call for Proposals form. To obtain
additional copies or get more specific information, visit the MCN web site, or contact Susan Rawlyk at
the MCN office by email or phone (403) 288-9394.


Sam Quigley

Christine Bostick and Cathryn Goodwin
2005 Conference Co-Chairs

Tuesday, January 11, 2005

Online Copyright Courses

The following notice has been posted to several discussion lists.

----- will be offering four courses this Spring:
  • U.S. Copyright Law
  • Canadian and International Copyright Law
  • Digital Licensing Online
  • Managing Copyright Issues
Further information and registration is at or e-mail

Monday, January 10, 2005

Digital Preservation - Barbarians at the Gate

Sometimes it is worthwhile to read an older article -- even slightly older -- to see if (and how) things have changed. Or one might read and find that nothing has changed. In October 2003, FreePint published an article entitled "Digital Preservation - Barbarians at the Gate." The article notes a prediction of 35 billion emails per day by 2005. It also notes that 93% of all information produced each year is digital. But are we closer to solving the problem of preserving digital information no matter where it is produced? Are we still thinking short-term, rather than what information will be available about our current times in 10,000 years? We all know the answers to those questions.

BTW the entire FreePint index -- for all articles appearing in the FreePint Newsletter, 1997-2004 -- can be viewed here.

MDLP Digitization Training Manual

I keep searching the Internet for new information on digitization, especially "how to" items. Today I found a link to the Mississippi Digital Library Program Digitization Training Manual.

Several libraries in Mississippi are cooperating on an IMLS funded project to "make civil
rights materials from a variety of archival repositories available to a wide audience by
means of computer access." Information will be captured using scanning, optical character
recognition (OCR), and transcription.

The 40-page manual is undated, but seems recent given that the introduction refers to the current grant. It is worth perusing to see if you can learn from what they are doing.

Friday, January 07, 2005

JPEG 2000 vs. TIFF

There has been a discussion this week on the IMAGELIB discussion list about the use of JPEG 2000 for creating high quality/archival images rather than TIFF. It has been noted that JPEG 2000 was not meant to replace TIFF, but rather it is an upgrade to the JPEG format that we all are familiar with.

The discussion has centered on the following points:
  • JPEG 2000 is a young format, which means that we are still gaining experience with it.
  • The number of tools available for JPEG 2000 is limited, but growing.
  • JPEG 2000 supports both visually lossless and true lossless compression.
Given that JPEG 2000 supports lossless compression, it should have a bright future in digitization projects. Therefore, it's worth noting some resources on it:

Thursday, January 06, 2005

Digital Imaging & Metadata Workshops [Ames, IA - February 2-3, 2005]

The following is circulating on several discussion lists.


The following workshops will be offered at Rm 32, Parks Library, Iowa State University, Ames, IA 50011

- All workshops are $70 for CDP/BCR Members; $95 for nonmembers.
- $5 Discount for early registration, 15 days in advance.
- Please note that BCR training vouchers cannot be used for these workshops.

Introduction to Digital Imaging:
February 2, 2005, 9:30 a.m.-4 p.m., $70 for BCR/CDP members/$95 for nonmembers

Introduction to Dublin Core Metadata:
February 3, 2005, 9:30 a.m.-4 p.m., $70 for BCR/CDP members/$95 for nonmembers

Register for all classes at

Richard Urban, CDP Operations Coordinator will be the instructor for Introduction to Digital Imaging and Introduction to Dublin Core Metadata. His bio is available at For a campus map showing the library's location, see: For lodging, parking and other visitor information, see:

What is CDP?

The Collaborative Digitization Program (CDP) is a nonprofit organization that enables access to cultural, historical and scientific heritage collections in the West. The CDP provides assistance to the cultural heritage community through best practice guidelines, workshops and by encouraging collaborative partnerships among museums, libraries, archives and historical societies.

What is BCR?

The Bibliographical Center for Research (BCR) is a nonprofit, multistate library cooperative that has served the library community since its founding in 1935, providing cost-effective library and information services. Today BCR serves 1,064 voting-member libraries in 41 states and Canada. Agreements with the state library agencies in Alaska, Colorado, Idaho, Iowa, Kansas, Montana, Nevada, Oregon, Utah, Washington and Wyoming allow any library located in those states to use BCR services as a member institution.

Wednesday, January 05, 2005

Thoughts about Google's library plan

Michael Gorman, the dean of library services at California State University, Fresno, and president-elect of the American Library Association, wrote an article for on the Google digitization project. Two quotes stand out to me:

Are you going to print the book, and end up with 500 unbound sheets?

I am all in favor of digitizing books that concentrate on delivering
information, such as dictionaries, encyclopedias and gazetteers, as opposed to knowledge.

One of the things that has stood in the way of e-books is that people prefer to curl up with a real book. They like the feel of a book, its portability, its low-tech-ness. They like to be able to take notes, highlight, bend corners, and share it with others. They like to read in the bathroom, in bed, on a plane (when electronic devices must be turned off) and outdoors. Books are flexible. How do you gain that same flexibility with an electronic version of a book that has been digitized? Do you end up printing the entire thing (and killing trees)?

Undoubtedly, we read differently online than on paper. I think we skim more, although I don't think it means we learn less. But would access to so many electronic texts change -- negatively -- how we learn? Would people become less accustomed to hardcopy texts? Perhaps even less trustful of "real" books? I am not sure of my own thoughts, but I do think we need to consider the long-term effect on how we gather, manipulate and value information.

Gorman advocates digitizing reference books, which are consulted but not read. I have to admit that I use many reference works online and find them very handy. I do wish I had access to more (for free). Perhaps this is not as glamorous as digitizing fiction and non-fiction books, but it may be more useful. Time will tell.

What I do know is that Google's efforts should thrust digitization into the spotlight for everyone and open up many conversations. Long-term, the technologies and techniques used by Google may help much smaller institutions as they embark on their own digitization projects. For now, I'm sure, those smaller institutions are just in awe.

Monday, January 03, 2005

Google’s Library Project: Questions, Questions, Questions

This is a very interesting Q&A written by Barbara Quint for Information Today. Well worth reading in order to understand other people's views of this project.

Iraqi and Tsunami blogs

In the left-hand column of Digitization 101, I have added links to blogs about Iraq (from the Iraqi point of view) and the tsunami (including stories from survivors). If you haven't visited the web page in a while, you may also be surprised to see some of the other blogs that have been added. Take a look...explore...learn....