Wednesday, January 31, 2007

Thinking about federated search again (still!)

I've been submerged in federated searching "articles" for a few hours today and compiling information (and thoughts). There is much that has been written on federated search, but very little of it seems to talk about using federated search software to search across digitization programs. We might think "oh, of course it can do that" but I know that one vendor seemed a bit iffy on whether theirs would work with CONTENTdm and some of the other digital asset management systems.

Below are the beginnings of a bibliography I'm compiling on federated search. These are pieces that I think are relevant to what I'm working on and the list will likely get longer as my research continues.

I want to point out that the Library of Congress Portal Applications Issues Group has created a list of federated search portal products and vendors. Unfortunately, the list is two years old, so it is a snapshot in time. Still, it is a good starting place since there are likely few lists like it.

When you talk to people about federated search, a few names always rise to the surface, yet -- looking at the LOC list -- there are many products that have a federated search component to them, including products built into integrated library systems (ILS). So when you think about implementing a federated search option for searching across digitization programs, be sure to talk to the companies that are supplying those systems to you and your collaborators. They may have the functionality that you need or be willing to work with you to develop those capabilities.

By the way, I find it interesting to see what companies say about their federated search capabilities. Some like Polaris Library System describe functionality that I would describe as being federated search, but they don't use those words. We might shy away from the term "federated search" because it seems too technical or unfamiliar, yet those of us who are researching federated search products look for those words (and then the other details to back it up). So please don't shy away from the "federated search" phrase!

Short Bibliography:

An off-topic thought: When we think of companies that provide integrated library systems, we often think immediately of behemoth companies like SirsiDynix, yet there are non-behemoths like Polaris we should be paying attention to. Cast a broad net and look to see who will provide the functionality and support that you need. Don't make assumptions!

Guest editorial in FreePint

FreePint's February 1 issue is on the paperless office and I did the guest editorial. There is also an article by Katrina Hughes entitled "Why I Prefer Hardcopy" and an article by Ulla de Stricker entitled "Paperless Myth: Rumours of Paper's Demise Have Been Greatly Exaggerated." Within Hughes' article is a short list of web resources on digitization. It was very difficult to produce a short list because of all of the aspects that must be considered when embarking on a digitization program. Thankfully, there is also a link to a longer list that K.M. Dames and I created last June.

Tuesday, January 30, 2007

Wyoming Newspaper Project

Today I received an e-mail announcement from PTFS about a case study they are making available on the Wyoming Newspaper Project. PTFS is providing digitization services for the project. The project will be using the SirsiDynix Digital Library to store the images. Digitization should be underway according to the web site. According to the information sheet, this is a three-year project which will be completed in 2008.

The case study is a one-page document, which highlights the services that PTFS is providing. Even though the case study is brief (and vendor focused), some might find the information useful.

Monday, January 29, 2007

Return on time invested (ROTI)

Ken Haycock, director of the School of Library & Information Science at San Jose State University, was one of the speakers last week at the SLA Leadership Summit in Reno, NV. His lively and informative presentation was entitled "Leadership and You: Tackling the Dragon." One of the many things he touched on was "return on time invested." A quick search of Google does not provide a good definition or example for us. Most of the examples have to do with meetings, training or product sales. (Good example here.)

As I heard Haycock talk, I realized that return on time invested (ROTI) is something we deal with in our projects, but likely not in an organized manner. Could we implement procedures in our programs to measure ROTI and then eliminate activities that have a poor return on the time invested in them? For example, are there meetings or procedures that have a low return on investment?

Now I must admit that some tasks take a long time and may seem to have a poor return (ROTI), especially when "our bosses" look at them, yet they can be essential. For example, one project spent approx. 30 minutes per item researching the item and creating the metadata. That was a huge investment of time, which in the short-term might have a poor ROTI, yet in the long-term might have a high return on the investment.

Perhaps one use of ROTI is thinking about alternate ways of getting things done. If I invest the time of someone who is knowledgeable at doing the task, how does the ROTI compare to having someone who is not knowledgeable doing the task? How much time would each invest in the task? Which would give the better return on investment?
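To make that comparison concrete, here is a minimal sketch. It assumes ROTI can be expressed as a simple ratio of value delivered to hours invested, which is my own simplification and not a definition Haycock gave. The item count and the 75 minutes per item for the less knowledgeable person are hypothetical; only the 30 minutes per item echoes the project mentioned above.

```python
def roti(value_delivered: float, hours_invested: float) -> float:
    """Return on time invested, assumed here to be value per hour spent."""
    if hours_invested <= 0:
        raise ValueError("hours_invested must be positive")
    return value_delivered / hours_invested


# Hypothetical project: 100 items need research and metadata.
ITEMS = 100

# Assumption: a knowledgeable person needs 30 minutes per item,
# while someone new to the task needs 75 minutes per item.
knowledgeable_hours = ITEMS * 30 / 60
novice_hours = ITEMS * 75 / 60

# "Value" is counted simply as the number of finished items.
knowledgeable_roti = roti(ITEMS, knowledgeable_hours)
novice_roti = roti(ITEMS, novice_hours)

print(f"Knowledgeable: {knowledgeable_roti:.1f} items per hour")
print(f"Novice:        {novice_roti:.1f} items per hour")
```

Counting finished items as the "value" ignores quality and long-term usefulness, which is exactly where a slow but careful task might earn back its time. A real measure would probably need to weight those in some way.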

I will have to think about ROTI more. For now, I thought it a worthwhile concept to mention.

Friday, January 26, 2007

About Digitising five centuries of UK life: Massive £12m boost for digitisation of national scholarly resources

Quoting the press release:
JISC today announced the successful bids in a further £12m investment in the digitisation of major resources of national importance. Following the enormous interest in last year's call for proposals and the high quality of the many bids received, the extra investment has been made by HEFCE to support the wider availability of national, scholarly resources.
Full press release available here.

Tuesday, January 23, 2007

Report of the East of England Digital Preservation Regional Pilot Project

Quoting the foreword (bolding added):
The study reported on here has its origins in the Lord Chancellor’s 2002 request that regional solutions for digital preservation should be investigated. Here, in the East of England, we have a head start, since the University of Essex at Colchester has over thirty years experience in this area and we knew – through Professor Kevin Schürer, Director of the UK Data Archive (UKDA) – that they were keen to help us pick up the Lord Chancellor’s challenge. So, in 2004, we three partners called together representatives of all the county councils in the region to discuss future options and to identify who else might be able to come on board to test the water. In the event, it was agreed that Hertfordshire County Council and Bedfordshire County Council were best placed for a pilot investigation, which began in earnest in the winter of 2004/5.

Throughout 2005, the UKDA processed a range of obsolete or near obsolete electronic media selected from Hertfordshire’s council records, to try to determine how much time and money it would take to process these records into a form in which they would remain readable and accessible for the foreseeable future. This exercise proved far more challenging than originally expected, but has, we believe, resulted in some robust cost estimates upon which we have been able to build a credible comparison, to which local authorities and other large organizations can refer to when deciding whether to ‘go it alone’ or to buy in their digital preservation from a third party.

So we present here a first stab at what it costs either to set up one’s own repository, or to pay some other expert organization, such as the UKDA, to do it for you. We have been forced during this project to grapple with (and we hope resolve) many extraordinarily difficult issues, not least in finding a common language through which local government and higher education can understand each other in this ever-changing area, and we believe we have evolved the basis for a workable process model which will guide the next stage of our work.
The 55-page report is available here.

Privacy & Publicity

In October, I posted two articles (here & here) that I had written for WNYLRC Watch on copyright. The article below was just published in the Jan./Feb. 2007 issue of WNYLRC Watch and complements the other two articles.

The September/October and November/December issues of WNYLRC Watch contained articles about copyright and copyright clearance with regard to a digitization program. This article is focused on two remaining concerns that are frequently overlooked: privacy and publicity. While copyright gives the creator the right to reproduce the work, prepare derivatives, distribute copies, perform the work publicly, and display the work publicly, it is the rights of privacy and publicity that force us to stop and consider our moral responsibilities.

Both privacy and publicity are protected by state, common or statutory law. In a broader sense, the roots of the right of privacy are found in the U.S. Constitution, which has been interpreted to give us the “right to be left alone.” There are four types of invasion of privacy: intrusion, appropriation of name or likeness, unreasonable publicity and false light. In understanding whether you are violating someone's right to privacy, you might ask questions such as: Does the matter cast a false light on the person? Are you intruding on an aspect of the person that they wanted to be truly private? How would you feel if this information was disclosed about you? Generally the right of privacy ends with the death of the individual; however, you may decide – based on the content of the materials you are digitizing – to extend the right of privacy beyond death in deference to the family.

The right of publicity “prevents the unauthorized commercial use of an individual's name, likeness, or other recognizable aspects of one's persona. It gives an individual the exclusive right to license the use of their identity for commercial promotion.” (Right Of Publicity: An Overview, Cornell Univ.) The right of publicity can extend beyond a person's death and be controlled by his/her estate. We encounter the right of publicity generally with famous individuals, both dead and alive.

With both the rights of publicity and privacy, common sense will be your best guide in understanding if you are in violation. Likely you have already been educated by news stories about litigation involving these rights. Remember those stories as you make your decisions. If you would like a second opinion, contact your organization's legal counsel for advice.

In conclusion, when deciding if materials can be digitized, you must consider rights related to copyright, privacy and publicity. Keep in mind that doing copyright clearance, which can be resource-consuming, is only one of the hurdles. By clearing all three, you will be assured of your right to digitize the materials.


  1. WNYLRC Watch, Sept/Oct 2006, contained a brief overview of copyright law
  2. WNYLRC Watch, Nov/Dec 2006, contained an article on the copyright clearance process
  3. Right Of Privacy: An Overview
  4. Right Of Publicity: An Overview
  5. Right of Publicity

Univ. of Notre Dame: Institutional Digital Repository Phase I Final Report

Quoting the web site:
This is an abridged version of a final report describing the activities surrounding Phase I of a one-year pilot project called the University of Notre Dame Institutional Digital Repository. After outlining the goals and methods of the project, the report enumerates ways the project could be continued. The seventy-some people who participated in the project are now looking to administrators across the University to become familiar with the contents of the report and set its future course.
What problems is Notre Dame trying to solve?

The three-fold purpose of the Institutional Digital Repository (IDR) is closely aligned with the goals of the University. The IDR's three goals are:

  1. to make it easier for students to learn
  2. to make it easier for instructors to teach
  3. to supplement the scholarly communications process

With these goals in mind the IDR is defined as a set of digital objects combined with sets of services applied against those objects - think "digital library".

The need to create institutional repositories was the subject of a series of hallway conversations at the iPRES conference. Given those conversations and the goals set out by Notre Dame, I suspect that this report -- both abridged and full versions available online -- will be well read.

Archivists' Toolkit

If you have not seen an announcement elsewhere, the University of California has released the Archivists' Toolkit. As the web site says:

The University of California, San Diego, New York University, and the Five Colleges, Inc. are pleased to announce the release of the Archivists’ Toolkit™ Version 1.0. The Archivists’ Toolkit™ is being offered under an Educational Community License (ECL).

The AT is the first open source archival data management system to provide integrated support for accessioning, description, donor tracking, name and subject authority work, and location management for archival materials. The effort to build this application has benefited tremendously from the interested guidance of the archival community and was made possible through the generous funding from The Andrew W. Mellon Foundation.

Organizing a collection is an important step when thinking about digitization. In addition to organizing, this toolkit will create EAD finding aids.

Monday, January 22, 2007

What is being preserved?

Already in the graduate class I'm teaching, questions have been raised about digital preservation and this is only week #2! People are concerned about "what" they are preserving when they digitize. Are they preserving the content? Or are they preserving the original item?

For some (likely not librarians), digitization is akin to microfilming. The content of an item is preserved in a new format. Now that it is in this new format, the old format (original) can be destroyed. In this case, the idea that you are preserving the content is clear.

You might wonder when anyone in his right mind would digitize and destroy the original. Consider the checks/cheques you write. Many checks are digitized when deposited, leaving only the digital version as proof of the transaction.

For archivists and curators, the original item has value outside of the content it contains. They -- and many researchers -- are interested in the physical material, its composition, how it has aged, as well as any wear-n-tear. All those things mean something and are deemed important. So archivists and curators will be adamant about keeping the original item, conserving it, and preserving it.

Librarians will understand that digitization will have preserved the content, but may be unsure what digitization means for the original. Has digitization helped to preserve the original? (The correct answer is "no.") Has digitization made the original less valuable? (No.) And the ultimate question for some may be: What do I do with the original now -- keep it in circulation, put it in storage or what? It is a question that I suspect is answered differently even for items within one digitization program.

I have told the class that digital preservation will be discussed more -- much more -- during the semester, but that there are many other things to learn first -- project management, material selection, copyright....! We cannot think seriously about digital preservation until we have thought through the steps that come before it.

Friday, January 19, 2007

Book Review: Digital Preservation

Spellbound blog has reviewed Marilyn Deegan and Simon Tanner’s book Digital Preservation. Jeanne Kramer-Smyth of Spellbound wrote:
They don’t oversimplify things - but take the time to explain things well. They are honest about those questions that aren’t answered yet… and they point to as many resources, standards and examples as they can.
A nice endorsement!

January is off to a fast start...

Actually, January is more than half over, yet it feels like it just began. There is likely much I should say about what I've been doing, but the things that come to mind are these...

Do you know...? -- I bet you get these questions too. "What hardware should we be looking at?" "What questions should I be asking about this prospective project?" "How do I talk about...?" They come in e-mail, by phone and sometimes over lunch. One of the joys of this industry is that we like to talk, to compare notes, and discuss successes and failures. Opinions are valued.

Last year, I did several facilitated discussions in Buffalo on digitization. A few years ago, I did a series of discussions in Rome (NY) and Syracuse. It can be quite interesting to get a group together to talk about a specific digitization topic. It is a great way of exchanging information and learning from each other. It can be hard, though, to get people to commit to these types of meetings. Yet from the "do you know" questions I get, I'm sure that getting groups of people together just to talk about digitization is very worthwhile, because there is a back-and-forth in these conversations that may not happen in a workshop.

Wherever you are, you might consider doing facilitated discussions around digitization. Consider some pre-reading so that everyone comes with new knowledge and ready to talk. Think about inviting those in your organization that are on the fringe of your project (if you are doing a project), yet should know more about what's going on. Invite the curious. The more that understand, the better.

By the way, facilitated discussions are also wonderful for hearing how people are thinking about what they are learning. You might find that people aren't correctly applying what they have learned. In the series I did in Rome, I found that two people had not heard/learned what I had told them about deed of gift forms. It became a very lively interchange as I tried to get them to realize that what they were using at their institution was not a good form!

IST 677 -- The graduate class I teach began on Jan. 16. This year the class includes at least one student who is situated outside of the U.S. (in New Zealand). The class will be blogging beginning January 29. Although the new version of WebCT (which is being used to teach this online class) allows internal blogs, I've decided to keep the blog in BlogSpot so that others can read it. [The blog posts from last year's class are there, so there is already rich content available.]

I should note that I have more than 30 students in this class and even more who wanted to register, but could not because the class is full. Not all of them are Syracuse University students; some are taking the class through the WISE program. At any rate, to me that means that more library & information schools should be offering such a class.

Conferences -- This morning I made reservations for the SLA Annual Conference in Denver in June. Thinking ahead for this conference used to mean making reservations in March, but now...!

Next week I'm at the SLA Leadership Summit in Reno, then in April at Computers in Libraries, and at SLA in June. Although not quite obvious from my conference schedule, it seems like there isn't a "conference season" anymore. There are conferences all the time, perhaps due to overlapping foci and the ability to travel anywhere at any time.

Although many conferences do have some sessions specifically on digitization, I am heartened to see those conferences that are only on digitization. I will be even more pleased when there are more of these conferences -- perhaps regional events. I think we have enough local and regional projects underway so that regional conferences might be do-able. In New York State, I think the Digitization Expo in Buffalo as well as events held in NYC (by METRO) show that regional conferences can be successful.

Who am I? -- When I was a corporate employee, the "who" could be answered by saying what department or division I was in...or what project I was working on. "Who" -- back then -- was an easy question to answer.

Although I'm entering my ninth year as a consultant, I still find "who" a difficult question to answer. How I answer "who" depends on the other person's needs or perspective. The answer might be quick, or long and involved. The answer, though, always must include something about what I've done lately, since we are all only as good as our last thought, project or success.

Last week, I found that in order to answer "who," I needed to mention and document every detail about me and digitization. I've placed that document on my web site, so now the next time "who" requires that level of detail, I already have it!

How do you answer "who are you?" If you are like me, you are thankful when the person asking knows something about digitization (or whatever your focus). When the person has no frame of reference, then answering "who" becomes harder. We then have to tell stories, give analogies, and maybe draw mental pictures. As I reminded a friend last night, how we define who we are is often different than the definition that others would give of us. And so we need to be aware of -- and perhaps incorporate, if they are glowing -- those other definitions of who we are.

mmm...11 days left in January...time is indeed flying!

Thursday, January 18, 2007

Press Release: ARL Publishes Managing Digitization Activities, SPEC Kit 294

Thanks to Charles Bailey Jr. for pointing this out. Here is the press release on this SPEC Kit.

ARL Publishes Managing Digitization Activities, SPEC Kit 294

Increasingly, academic and research libraries are becoming involved in reformatting materials from their collections to create digital content and are providing access to that content through metadata. As the management of digital projects and initiatives is a relatively new endeavor for most libraries, there is a significant impact on libraries’ budgets, organizational structures, and staffing.

This SPEC survey was designed to identify the purposes of ARL member libraries’ digitization efforts, the organizational structures these libraries use to manage digital initiatives, whether and how staff have been reassigned to support digitization activities, where funding to sustain digital activities originated and how that funding is allocated, how priorities are determined, whether libraries are outsourcing any digitization work, and how the success of libraries’ digital activities has been assessed. The focus of the survey was on the digitization of existing library materials, rather than the creation of born-digital objects.

This survey was distributed to the 123 ARL member libraries in February 2006. Sixty-eight libraries (55%) responded to the survey, of which all but two (97%) reported having engaged in digitization activities. Only one respondent reported having begun digitization activities prior to 1992; five other pioneers followed in 1992. From 1994 through 1998 there was a steady increase in the number of libraries beginning digital initiatives; 30 joined the pioneers at the rate of three to six a year. There was a spike of activity at the turn of the millennium that reached a high in 2000, when nine libraries began digital projects. Subsequently, new start-ups have slowed, with only an additional one to five libraries beginning digitization activities each year.

The primary factor that influenced the start up of digitization activities was the availability of grant funding (39 responses or 59%). Other factors that influenced the commencement of these activities were the addition of new staff with related skills (50%), staff receiving training (44%), the decision to use digitization as a preservation option (42%), and the availability of gift monies (29%). An additional factor that motivated many survey respondents was the need to improve access to library resources. Others commented that participating in digitization activities was a strategic goal of the library.

In addition to being one of the instigating factors in many libraries’ decision to begin digitizing library materials, improving access to the library’s collection was cited by all of the respondents as an ongoing purpose behind these efforts. Other purposes that were highly ranked by respondents are support for research (85%), preservation (71%), and support for classroom teaching (70%). For a smaller number (24 or 36%), the purpose of their efforts is to support distance learning. Several respondents reported that promoting the library and its collections was also a reason to participate in digitization activity.

Only four libraries reported that their digitization activities are solely ongoing functions; the great majority (60 or 91%) reported that their digitization efforts are a combination of ongoing library functions and discrete, finite projects.

This SPEC Kit includes documentation from respondents in the form of organization charts, mission statements, job descriptions, policies and procedures, and selection criteria.

The table of contents and executive summary from this SPEC Kit are available online at

SPEC Kit 294, Managing Digitization Activities
Rebecca L. Mugridge • September 2006 • ISBN 1-59407-710-X • 162 pp. • $45 ($35 ARL members)

Shipping and Handling
US: UPS Ground, $10/publication
Canada: UPS Ground, $15/per publication
International and Rush Orders: Call (301) 362-8196 or e-mail for quote.

Payment by check, money order, MasterCard, or Visa is accepted. Make check or money order payable in US funds to the Association of Research Libraries, Federal ID #52-0784198-N.

Order from:

Designed to examine current research library practices and policies and serve as resource guides for libraries as they face ever-changing management problems, each SPEC Kit contains a summary analysis, survey questions with tallies, pertinent documentation from participating libraries, and a reading list and Web site references for further information on the topic.

2007 SPEC Kit subscription (ISSN 0160-3582): $215 ARL member/$285 nonmember, six issues per year, shipping included (additional postage may apply outside North America).

Hurst-Wahl & Dames: Article & podcast

K.M. Dames and I wrote an article for the spring Library Journal netConnect supplement. Our article is entitled "Digitizing 101." Since Kevin and I did several workshops together on digitization, we were asked to write an article-version of our workshop. That was a tall order given the word count limit, so the article aims to "detail how to bring collections online, from copyright issues to outsourcing scanning," with a heavy emphasis on material selection and copyright.

Kevin and I were also fortunate to participate in a podcast for Library Journal that discusses the article and other digitization-related topics. Also on the podcast was Lotfi Belkhir, CEO of Kirtas. The podcast was moderated by Jay Datema, the Technology Editor for Library Journal. One cool thing Jay did was to create a time index for the podcast on the web site and provide links to the topics we discussed at those times. For example, when the iPRES conference was mentioned at 45:33, there is a link to the iPRES conference page.

Tuesday, January 16, 2007

Event: Articulating value in the digital world

I don't normally post information on one-day conferences, but this one seems quite interesting and useful. If anyone goes and blogs this, I hope you'll let me know.

The announcement was found on the Digital Preservation discussion list.

Articulating value in the digital world
A conference on the espida Approach

Producing a realistic assessment of the benefits of IT or Information projects is tough. This conference will be of value to both the managers of resources (decision makers, funders, etc.) who are seeking to understand what they might get for money expended and those that prepare business cases for projects who want to convince those with the money that what they propose is worthwhile. It will explain the background to the problem, ways that it has been addressed in the past, the approach developed in the espida Project and the perspectives of funders, decision makers and others on the problem and this approach.


The rapid pace of change in Higher and Further Education means that decision-makers and funders are frequently required to evaluate project proposals that have serious implications for their institutions. There are never enough resources available to fund more than a small fraction of the proposals and decision makers are keenly aware that the size of the resource pool is fixed and that every pound spent on infrastructure and administration is a pound not spent on `primary production': learning, teaching and research.

Costs of projects are relatively straight-forward to define, but benefits that are not expressed in financial terms can be very difficult to communicate and measure. These intangible benefits are frequently a major feature of business cases and are often expressed in vague prosaic language.

This conference, held by the espida Project will offer an approach that can help construct and communicate intangible benefits in such a way that informed and transparent decisions can be made for the benefit of the organisation. Speakers will present a view of the economic background to the issue of understanding intangibles, the espida Approach itself, and examples of how the Approach can be used successfully in different types of organisation.


How are business cases for resources made within your organisation? Are hours spent carefully crafting purple prose to convince senior management about the merits of your work? Do management find it hard to understand the benefits of the proposal?

The espida Approach was initially developed as an aid to securing resources for actions to preserve digital materials, helping to define the value of such work in a language that senior management can understand. In addition to the digital preservation community however, the Approach has high relevance for areas that measure their outcomes, not with financial indicators but rather more intangible results. These include records management, knowledge management and IT. In general, any business case that must convey outcomes that are not purely financial may benefit from applying the espida Approach.

The Approach helps users:

a) Figure out what the benefits of their proposal really are,
b) Express these benefits in a way that communicates them effectively,
c) Identify outcomes in a systematic fashion.

This conference, held by the espida Project, will offer an Approach that can help communicate intangible benefits in such a way that proposers can increase the chances of their proposal being understood and resourced.

Speakers include:

Helen Shenton (British Library)

Professor Sir Laurie Hunter (University of Glasgow)
Setting the scene

Dr. James Currall & Peter McKinney (University of Glasgow)
The espida Approach

Alice Colban (JISC)
The Approach in the context of Funding Bodies

Dugald Mackie (Vice-Principal, University of Manchester)
The Approach in the context of HE decision-making

Julie Carpenter (Director, Education for Change)
The Approach in the context of consultancy in Heritage

Conference details:

Monday 12th February
British Library Conference Centre

Cost: Free

To register for the conference please visit the website (

Contact Joan Keenan for further information (

Event: The Challenge: Long-term Preservation. Strategies and Practice of European Partnerships

As posted on the Digital-Preservation discussion list:

To coincide with the German Presidency of the Council of the European Union the conference "The Challenge: Long-term Preservation. Strategies and Practice of European Partnerships" will take place on April the 20th and 21st 2007. On behalf of the Federal Government Commissioner for Culture and the Media as well as of the German National Library I would like to cordially invite you to this conference!

The conference aims to present and discuss the current technical and organisational status of measures being taken for the long-term preservation of digital media in various countries in Europe.

For further information visit the conference homepage at

I am looking forward to seeing you in Frankfurt!

Best wishes,

Dr. Elisabeth Niggemann

Monday, January 15, 2007

MLK Jr.: Can digitization & the Internet help perpetuate the dream?

An MSNBC article tells us that not every student in the U.S. understands who Martin Luther King Jr. was. Unfortunately, what is being taught in our schools has changed.
In many schools across the country, teachers say social studies has taken a back seat under the federal No Child Left Behind law, which stresses math and reading. Squeezing history into the curriculum can be difficult, educators say, and taking time out of a scheduled lesson to use a federal holiday -- even King's -- as a teaching moment can be tough.
If it is not being taught well in schools, can the Internet help? As I write this, I'm also Googling to see what is retrieved when I type in "Martin Luther King".
  • The first six hits are either from the Nobel Prize web site, the King Center, or Stanford University and are all pro-King sites.
  • The seventh hit is really an anti-King site hosted by Stormfront White Nationalist Community. This site works to discredit both Dr. King and the work that he did.
  • The eighth hit is from Wikipedia. We might suspect that the Wikipedia article contains some incorrect information, but "what" is not immediately obvious. What I find interesting, though, is reading through the discussion page, where you can see what people disagree on in the article.
  • Knowing that most people look only at the first ten hits returned by a search engine, I'll note that the ninth and tenth hits are also good.
So in the first ten hits in Google, there is one that would lead you astray. But also in the first ten hits, there are no digitization projects that are making King's writing available to the public. Stanford Univ., which is spearheading the King Papers Project, has created a web site of information for teachers that contains some documents, but not a lot.

Can digitization and the Internet help perpetuate the story of Martin Luther King, Jr.? Obviously, the Internet is helping since there are articles, biographies and other information available about King on it. There are also dissenting voices, which can be important to understand. However, there is not the digitized content available to help bring his story alive. That content is still under copyright and has not yet been freed so that it can be easily digitized and disseminated. The King family has been enforcing the copyright on Dr. King's works, yet placing some of his works in the public domain could do more to spread the message that he believed in.

Maybe in a year, on January 15, I'll do a Google search and find that a legitimate digitization project of Dr. King's works is available and highly ranked. Until then, it will be up to us as individuals to retell the story and keep the images -- even if it is our verbal "images" -- of those events alive.

Related post: Martin Luther King Jr.: Digitized and available illegally


Friday, January 12, 2007

Digitization 101 is now using the new version of Blogger/BlogSpot

Okay...this likely means nothing to you, but already moving to the new version has been helpful for me. I found two comments that had not (supposedly) been moderated! One was a comment on the Microsoft digitization deal.

As I get used to this new version, I think you'll see some useful changes in the blog, like the addition of "labels" (categories) to the posts. [In fact, I think I'll just try that feature with this post!]

Footnote launches and announces partnership with National Archives

From the press release:
Archivist of the United States Allen Weinstein and Footnote, Inc. CEO Russell Wilding today announced an agreement to digitize selected records from the vast holdings of the National Archives. The 4.5 million pages that have been digitized so far are now available at

This non-exclusive agreement, beginning with the sizeable collection of materials currently on microfilm, will enable researchers and the general public to access millions of newly-digitized images of the National Archives historic records on a subscription basis from the Footnote web site. By February 6, the digitized materials will also be available at no charge in National Archives research rooms in Washington, D.C. and regional facilities across the country. After an interval of five years, all images digitized through this agreement will be available at no charge through the National Archives web site.

Footnote also has a blog, which you can view here. In the blog, they mention records from Pennsylvania (1700-1800s) that they are digitizing.

Footnote allows users to interact with the materials -- to annotate, to download and to upload! Quoting their Terms & Conditions:
Members and All-Access Members have permission to post historical documents to the Website, to create Member Pages, to annotate historical documents and to post comments on Member Pages and on historical documents. They may also create a special profile that can be seen by other Users. All-Access Members have access to some collections of documents that are not available to Members.
Basic membership is free. All-Access Membership costs $9.99/mo. or $99.99/year. All-Access members have access to all of the content. And, indeed, on the homepage, you can see that some content is free, while some isn't. (I have no idea what makes some content not free.)

Haven't heard of Footnote? Neither had I until I saw a blog post that mentioned it, then the press release. It is definitely worth exploring.


Thursday, January 11, 2007

Searching Google Book Search and I found....

I did a search in Google Book Search on the word "digitization." One of the books that was retrieved was Technical Guidelines for Digitizing Archival Materials for Electronic Access written by Steven T Puglia, Jeffrey Reed, and Erin Rhodes. The full-text of this book is available because it was published with a Creative Commons Attribution 2.0 license, even though it is copyrighted.

Now I think having the full-text available is very cool. I could see recommending that someone read the book at the Google web site IF the resolution was better. True, someone who is patient -- and doesn't mind fuzzy type -- could read the book through Google. More likely, a person would either use it electronically to look up something specific in the book or order a hardcopy version.

So true to its mission, Google has allowed me to find a useful book and linked me to sources for obtaining a readable full-text. The user in me, though, wishes I could read it in Google. Would I be willing to pay something to see a clearer version of the book? Only if I could also download a copy to my hard drive. Maybe that is an option they are already considering?


The positives of massive book digitization

In the blog post Cows and the Colossus, Stuart Weibel talks about a presentation given by Mike Keller of Stanford University entitled "Mass Digitization in Google Book Search: Effects on Scholarship." Keller believes that Google Book Search (GBS) will revolutionize access to books more than anything else we have done to date.

In constructing his argument, Keller provided these facts (quoting Weibel):

  • Digitization of the card catalog resulted in a 50 % increase in book usage
  • Google indexing is the #1 driver of article usage in High Wire – by a large margin (10 to 1 beyond the next highest, if I understood him correctly)
  • Metadata searching (what Keller describes as subtle searching), in combination with novel methods of taxonomic search and citation cross-linking, dramatically improves discovery and navigation within large result sets.
So if just digitizing the card catalog improved access, imagine what digitizing entire books will do (as the argument goes).

Keller makes several excellent points, which you'll find if you read the post. But the one that stood out to me was (quoting Weibel):
Stepping away from the somewhat daunting implications for libraries, Keller suggested that the most important thing about GBS is that it has occasioned a great debate about the importance of copyright in the intellectual life of the nation (and the world).
The debate is occurring because there is a conflict -- real or perceived -- between adherence to the law and providing access. There is also a debate because users can see that access is being limited because of copyright. Users are being directly affected, which forces them to learn about this debate and have an opinion about it. Those who follow copyright, like myself, know of legal cases that have caused industries to wake up to copyright (e.g., Tasini and American Geophysical Union v. Texaco, Inc.), but this action is causing users to wake up. And that is a good thing.


Wednesday, January 10, 2007

Article: Historic passenger lists go online

Quoting the article:
People looking to track ancestors who emigrated from British ports will from Wednesday be able to search online passenger lists of the ships that carried them to new lands.

Released by Britain's National Archives, the passenger manifests give an insight into all long-distance trips made by 30 million travelers from the country's ports between 1890 and 1960, including that of the Titanic which sank in 1912.

The digitized records are available on a commercial web site. Registration is free, but there is a charge to view the full records.


New blog: Polymath Paradise

Bennett Lovett-Graff, Publisher of Content Solutions at National Archives Publishing Company, has started a blog called Polymath Paradise. He describes this as a place to provide "good background analysis of any number of topics that impact our business." There are posts on:
Although Lovett-Graff works for NAPC, this blog doesn't talk about NAPC but about topics that impact many projects (all of our projects).

Since he's new to blogging, if you like what you read at Polymath Paradise, I hope you'll leave a comment and encourage Bennett to keep going.


Tuesday, January 09, 2007

Event: Web Archiving Training Session

This is from the Digital-Preservation discussion list.

Web Archiving Training Session

The European Web Archive is organizing a two-day training session on Web Archiving in Paris on the 1st and 2nd of March 2007. The training will cover all aspects of Web Archiving for librarians, archivists as well as technicians in charge of web archiving. Special attention will be given to providing the necessary background on Internet technologies in general and Web publishing in particular to understand the media and requirements for its preservation.

The training will present a complete overview of web archiving methodologies with, for each of them, contextual background and assumptions as well as preferred use.

The training is not a technical training for set-up and use nor for the development of web archiving tools. However, it will comprise a critical review of tools with a demonstration where possible.

Topics covered:

  • The web as a disruptive media
  • Actors of web publishing
  • New role for memory institution
  • Current initiatives overview in Web archiving
  • Legal issues
  • Internet Infrastructure
  • Identification and naming on the Web
  • Web format and standards
  • Selection policies
  • Crawling strategies, issues, and good practices
  • Storage of web material, ARC and WARC format
  • Access and rendering
  • Searching web archives
  • Review and demonstration of existing tools
  • Metadata for Web Archiving
  • Preservation


The session will take place in Paris between the 1st and 2nd of March 2007.


Registration fee for the 2 days (including lunches, coffee breaks and one dinner):

  • 750 standard rate (680 for registration made before the 1st of February)
  • 580 for non-profit and governmental organizations (480 for registration made before the 1st of February)
  • 250 for students

To register please go to:

If you have any further queries or require additional information, please contact the organizers directly at:

info AT

Free Webinar: Avoiding the Pitfalls of Newspaper Digitization

I received an e-mail notice of this event, which I'm pasting below. I'm sure the webinar will have a "sales spin" to it, but I also think that it could be useful for those who are thinking about digitizing newspapers since it may surface ideas, etc., that you have not considered (or reinforce/confirm things for you).

Apex Publishing is pleased to invite
you to attend our free webinar

Avoiding the Pitfalls of Newspaper Digitization

January 31, 2007 — 11 am EST, 8 am PST, 4 pm London

Creating a database from archival newspapers can be a daunting proposition. If you are thinking about initiating a newspaper digitization project, register now to attend Apex Publishing's FREE online webinar, Avoiding the Pitfalls of Newspaper Digitization.

This webinar will show you how to avoid schedule delays, extra expenses and frustrated stakeholders, and set you on the path to creating a digital collection for users around the globe.

Apex Publishing has digitized millions of pages of newspapers – more than any other organization in the world. At this webinar, we'll share with you what our extensive experience has taught us, including:


· What you must consider before you preserve and digitize your collection
· Specific strategies for saving time without sacrificing quality
· How to make an informed decision about what methods and technologies are appropriate for your particular collection
· QA tools you can use, right from your desktop, to monitor your project and your vendor
· Much more...

Presented by Tom O'Brien, Vice President, Business Development for Apex Publishing and an authority on newspaper digitization, this 60-minute webinar is an interactive program broadcast LIVE to your desktop or conference room, via your telephone and your Internet browser.

The hour you spend with us on January 31 could save you time, energy and money!

Register online now!
Space is limited.

Upon registration, you will receive an email with complete participation instructions.



Since 1988, Apex Publishing has focused on developing a suite of unrivaled solutions that enable our clients to maximize the value of their content. Publishers of books and journals, libraries, scholarly and professional societies, universities, commercial and nonprofit aggregators and information providers partner with Apex for: Archival Digitization Services; Editorial Services; Prepress, Pre-media and Composition Services; and Content Capture and Conversion Services.

Apex CoVantage
198 Van Buren Street
200 Presidents Plaza
Herndon, VA 20170-5338

Library Related Conferences

Marian Dworaczek has a web page where she lists library-related conferences being held around the world. Included are ones that focus on digitization. It is a long list and undoubtedly you'll find something there that you didn't know about.

I don't know how Dworaczek finds all of these conferences, but I suspect that if you know of one that she doesn't, she would welcome hearing from you.

Monday, January 08, 2007

Digitization RFPs from city & county governments

It must be "that time of year." Requests for proposals (RFPs) from city and county governments are now circulating to bid on digitization work that they have (old documents, taxation records, etc.). I'm sure those who see these RFPs have the same questions each year:
  • Have these organizations done any preliminary reviews of the companies that they are sending these RFPs to OR are they sending them out blind, not knowing if the companies who are receiving the RFPs actually do what they want? (I know the answer to this. Blind, unfortunately.)
  • If they are using a list someone provided to them, have they checked other sources to see if there is a good local contractor that they have missed?
  • Who keeps the list these organizations are using and how can we ensure that it is up-to-date? (Wouldn't it be great if the organizations told you in their letter how they found you?)
  • How many companies are they sending these RFPs to? If they have sent only three, and two of those companies actually don't do what they want, that leaves them with one source. (Or maybe that's exactly what they want!)
  • Why are the turn-around times so tight (three weeks)? Okay, this seems like a normal "government" thing on RFPs. Organizations that bid on a lot of government work must have someone who focuses just on these RFPs to ensure the deadlines are met.
If I could change one thing about the RFP process, it would be to have organizations spend a little time investigating the companies that they are going to send the RFPs to. Knowing that they are sending the RFPs to companies that do what they want would (1) save paper and postage, and (2) give them better/more bids to compare.

Sunday, January 07, 2007

Viewing historical markers -- Virginia & North Carolina

Commenters have pointed out the digitized information for historical markers in North Carolina and "soon to be digitized" information for Virginia. The web site for the North Carolina Highway Historical Marker Program is nice and provides good information on each marker, including a photo and a short essay.

As for the historical markers in Virginia, Tom Scheinfeldt wrote in his comment:
In fact, we at the Center for History and New Media are working to launch a project called Virginia History Here, which will use new mobile communications technologies—especially cell phones—to improve access to Virginia's historical roadside markers. In addition to providing searchable web access to the markers, Virginia History Here will allow travelers on congested roads to access the full text of historical markers from within their cars via cellphone. Moreover, in order to help extend the historical and educational value of cast iron markers, Virginia History Here will also provide these travelers and web visitors with links and directions to nearby and related markers, contextual essays, and related primary source materials such as photos and short audio and video clips. New mobile technologies allow us to make all of these resources available to travelers when they are most interested and engaged: when they are at the marker sites themselves.

We are still waiting to hear about funding, but if all goes well, we should launch a prototype by 2008.
The Virginia project sounds very interesting! I hope it gets funded.

Thursday, January 04, 2007

Comments on mass digitization

My blog post entitled "Will mass digitization projects need to be re-done?" has attracted several comments, including a lengthy comment from Stephen Paul Davis, Director, Libraries Digital Program at Columbia University. Davis has taken exception to Joe Esposito's criticisms of the current mass digitization programs. The comments on this post are definitely worth reading.


Viewing historical markers -- Ohio!

Angela O'Neal, Digital Projects Manager for the Ohio Historical Society, left a comment on yesterday's post:
Did you see our website of Ohio Historical Markers? It is similar to Pennsylvania's site, but also allows users to submit photographs and GPS coordinates of historical markers.
  • More information on each marker.
  • Link to MapQuest to show where the markers are located.
  • Information on how people can participate in collecting and submitting information on specific markers. And information can be submitted online!
  • Ability to "save" markers in your account for viewing later.
  • Nice print facility.
Now I'm wondering who else has put historical markers online. Any other U.S. states? If yes, what information are you capturing? Are you getting your citizens involved? And is this just a U.S. phenomenon?


Wednesday, January 03, 2007

Viewing historical markers

In college, I would ride the Greyhound bus back to school after vacations. Winding our way from Harrisburg, PA to Elmira, NY, we would pass many historical markers. As we zipped along at 55 mph (or even slower), they could be hard to read.

Pennsylvania has created a database of its historical markers and made that database available online. The site is searchable by text, category, county or title. The information given on each marker includes marker name, county, date dedicated, marker type, location, category and text.

The word "digitization" means to convert to digital form. Indeed, these historical markers have been converted to digital form. They have been digitized. The text of each marker, and its history, has been preserved. And since each marker preserves a piece of history, this database also helps to preserve history.

Unfortunately, you do not see pictures of the markers so you cannot see where they are and what they are marking. You also cannot view a map of the markers. And you cannot do complex searches. Yet even without those things, this is a good resource and I'm glad the Pennsylvania Historical and Museum Commission has done it. Hopefully they will add more functionality to it.

What functionality could be added? Well, let's look at what information is available. For example, there was a Civil War training camp in Harrisburg called Camp Curtin. You can read the marker here. Without a photo, you cannot tell that this marker is by a park in front of a statue of Governor Curtin (info). And even this site doesn't tell you that this is the smallest park in the commonwealth of Pennsylvania (info). So if you just look at the marker database, you are not getting the entire story. Just the addition of a few links makes the history on the markers come alive.


Tuesday, January 02, 2007

Blog post: Canada's Public Domain Day

Leave it to Eli Edwards (Confessions of a Mad Librarian) and to note what works have passed into the public domain on Jan. 1.
