Friday, August 31, 2007

Blog Day 2007: Blogs worth knowing about

In keeping with the tradition of Blog Day (August 31), here are several blogs that you may find worth knowing about, including a few that have been mentioned previously in Digitization 101:

Archivalia (German) with some in English -- This blog focuses on topics of interest to archives in Europe and other parts of the world. If you do not read German, you can use Babelfish to give a decent translation of the text. Although already familiar with Archivalia, I want to thank Klaus Graf, a Digitization 101 reader, for recommending it.

Collaborative Manuscript Transcription -- Written by Ben Brumfield, a software developer and Digitization 101 reader, this blog is focused on his family history project. As he says, this is a place for him to organize his "thoughts about software for collaborative manuscript transcription and annotation."

Figoblog (French) -- This is written by Manue, who is a librarian interested in "bibliotheque numerique." The blog touches on topics such as digitization, metadata, information retrieval and much more. If you do not read French, you can use Babelfish to give a decent translation of the text.

Investigations of a Dog -- In talking about this blog, Gavin Robinson wrote:
This blog is for works in progress; reflections on things I've been reading; extracts from and criticism of my PhD thesis; and random thoughts on history, historiography, culture, literature, information technology, and, of course, other people's blogs.
Ben Brumfield, who recommended this blog, said that Investigations of a Dog contains well-thought-out posts on digitization and pointed especially to this one.

O'Reilly Radar -- This is a team blog with Tim O'Reilly as the chief blogger. It covers topics such as emerging technology, Web 2.0, open source and more. Because the blog covers technology broadly, there is always something of interest in it.

Papa's Diary -- This blog is actually the 1924 diary of Matt Unger's grandfather (Harry Scheurman), transcribed and annotated. The calendar on the left side of the blog is from 1924 and each blog post is the diary entry for that specific day. He includes the digitized pages of the diary. This is a very interesting use of a blog and interesting reading. Thanks to Ben Brumfield (again) for recommending this blog.

Sivacracy -- This is a team blog headed by Siva Vaidhyanathan, who is a faculty member at the University of Virginia. Now a project of the Institute for the Future of the Book, this blog focuses on intellectual property, free culture, globalization and more. Siva and his team tackle serious topics, but have fun too, especially since they make good use of media in the blog.

Last year I highlighted five blogs, which were (with updated info):
  • Confessions of a Mad Librarian -- When Eli Edwards talks about copyright and intellectual property issues, many people listen. Her posts are well-written and thought-provoking, even though they are infrequent. Hopefully she will write more frequently once she is through with law school.
  • HangingTogether -- "HangingTogether is a place where some of the staff at RLG Programs, part of the OCLC Programs and Research division, a partnership of libraries, archives, and museums, can talk about the intersections we see happening between these three different types of institutions." Roy Tennant has joined OCLC and is now one of the HangingTogether bloggers. For me, that has made HangingTogether a blog I want in my RSS reader.
  • Lifehacker -- The team at Lifehacker write a lot on software, web sites and tips to help us be more productive, so I look at the headlines and read only those that are truly of interest...and there is something of interest every week. I also find that I forward posts to other people. Interesting, relevant, helpful...what else can you ask for?
  • Restoration Tips & Notes / Media Formats & Resources -- For those interested in audio tape restoration, repair and mastering, adding Richard Hess' blog to your blog reader may be quite useful. Infrequent posts. Excellent content.
  • Presentation Zen -- This blog still delivers good content. Want to do better presentations? Read/skim this blog. It's that simple.
Follow whatever links pique your interest and enjoy the rest of your Blog Day!

Thanks to those who suggested blogs. I appreciated every suggestion, although I did not use all of them.


Thursday, August 30, 2007

Interview with Brewster Kahle

Published on Aug. 15, this interview with Brewster Kahle contains some great quotes -- all classic Brewster Kahle. For example:

Are you surprised to see libraries signing up with Google under restrictive terms?

I'm not surprised that a corporation wants to be the only place someone can get information, and I was not terribly surprised that some libraries went forward with this before they understood how they could do it on their own and how much it would cost to do it for themselves, not only to do the digitization but also to create services around these collections. I was surprised to see more libraries jumping on the Google bandwagon after demonstrating how libraries can do this and after actually doing it with the Open Content Alliance.

And in talking about how the Open Content Alliance can compete with Google, Kahle said:
Revolutions aren't started by majorities.
He does provide some cost information on having materials digitized by the OCA:
At an OCA regional scanning center, we'll scan your materials for 10¢ a page. Audio recordings we can do for about $10 a disc, and videos about $15 per hour. And we'll do all of the hosting for free; you can do the interfaces.
Definitely an article worth reading.


Wednesday, August 29, 2007


According to the dictionary, redaction is "The act or process of editing or revising a piece of writing; preparation for publication." However, in the vernacular, often if something is redacted, it is eliminated. For example, you might have a sensitive document and want to redact -- edit, revise, eliminate, erase -- some of the information in it, such as personal identification information (e.g., Social Security numbers).

A company named Extract Systems has developed a process that will redact -- cover up -- information on digital documents. These documents may have been born digital or digitized. The software uses various rules to locate information that the document owner deems to be sensitive and then blacks it out. The "black bar" that covers the sensitive information is part of the resultant image, so no one can remove it and see what is underneath.
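To make the idea of rule-based redaction concrete, here is a minimal sketch in Python. It is not the vendor's method: it works on plain text rather than page images, and the rule names, patterns and sample sentence are all made up for illustration. What it does share with the description above is the key property that the covered characters are overwritten in the output, not just hidden.

```python
import re

# Hypothetical rules: each label maps to a pattern for one kind of sensitive data.
RULES = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # e.g. 123-45-6789
    "phone": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),  # e.g. 315-555-0100
}

def redact(text, rules=RULES, bar="\u2588"):
    """Overwrite every match of every rule with block characters of equal length."""
    for pattern in rules.values():
        text = pattern.sub(lambda m: bar * len(m.group()), text)
    return text

# Made-up sample data for the demonstration.
sample = "Employee 123-45-6789 can be reached at 315-555-0100."
redacted = redact(sample)
print(redacted)
```

Because the matched characters are replaced in the returned string, nothing in the output can be "uncovered" later, which is the text equivalent of burning the black bar into the image.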

I got to see this software in action a few weeks ago and I found it to be impressive. I can see applications for it in the legal and medical areas as well as with employment records and other documents. From my corporate experience, I know that eliminating sensitive information from all types of documents can be a real pain. This is a cool tool that could make it easy.


Tuesday, August 28, 2007

Content-Aware Image Sizing

My first impression is w-w-w-o-o-o-o-o-o-o-o-w-w-w! dsphotographic summarized the video saying:
It demonstrates a software application that resizes images in such a way that the content of the image is preserved intelligently. The video describes it better than I can, but the basic idea is that if you want to stretch out an image, it will keep the key elements of the photo in sensible places while filling in less important areas. The same goes for shrinking an image - it will eliminate the less important features of an image and leave the main subject areas intact and in the same relative location as they previously appeared in the image.
Dr. Ariel Shamir and Shai Avidan wrote about this technology in ACM Transactions on Graphics, Volume 26, Number 3, SIGGRAPH 2007.
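For readers curious about how shrinking "the less important" parts of an image might work, here is a rough Python sketch of one well-known approach, usually called seam carving: find a connected top-to-bottom path of low-contrast pixels and delete it, which narrows the image by one column. Treat it as an illustration under assumptions, not as the software in the video: the image is a tiny grid of grayscale numbers, the "importance" measure is a simple neighbour difference, and all function names are invented here.

```python
def energy(img):
    """Per-pixel importance: absolute difference to the right and lower neighbours."""
    h, w = len(img), len(img[0])
    e = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = img[y][min(x + 1, w - 1)]
            down = img[min(y + 1, h - 1)][x]
            e[y][x] = abs(img[y][x] - right) + abs(img[y][x] - down)
    return e

def find_vertical_seam(img):
    """Return one column index per row, tracing the lowest-energy connected path."""
    e = energy(img)
    h, w = len(e), len(e[0])
    cost = [row[:] for row in e]
    # Dynamic programming: the cheapest way to reach each pixel from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam

def remove_vertical_seam(img):
    """Delete one pixel per row along the seam, making the image one column narrower."""
    seam = find_vertical_seam(img)
    return [row[:x] + row[x + 1:] for row, x in zip(img, seam)]

# Toy "photo": a bright vertical subject (200) on a flat background (10).
photo = [
    [10, 10, 10, 200, 10],
    [10, 10, 10, 200, 10],
    [10, 10, 10, 200, 10],
]
narrower = remove_vertical_seam(photo)
print(narrower)
```

Repeating the removal shrinks the image further, and in this toy case the bright subject survives while flat background is dropped first, which is the behavior the summary describes.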

Watch the video then think about how this could be used to deliver images to your users. Then think about how it could be used to alter reality (as if we couldn't do so already). Then say "wow"!


Monday, August 27, 2007

Brainstorming: Getting that second & third opinion

On Saturday afternoon I met with two other consultants for a brainstorming session. We each have different businesses, yet we felt that we could benefit from coming together and brainstorming ideas for each other. We set very basic ground rules for our two-hour session and each of us spent some time preparing upfront. We spent time looking at each other's web sites as well as time doing some solo brainstorming (sometimes called mindstorming). I'll spare you the details and tell you why you should consider doing something similar.

First, we often think that no one will understand what we're doing except for those who are doing the same thing that we are. Wrong. In fact, talking about your projects, work flow issues, or whatever with someone who is in a different industry might yield ideas that you would not have come up with because those ideas weren't part of your paradigm.

Second, a person from a different industry will not make the same assumptions as you and may ask very different questions. When you're not hearing the same old questions, your mind may come up with new information or discover new solutions.

Third, someone from a different industry or area of focus may take the discussion in a direction that you had not anticipated, because that person sees what you do (or could do) differently. Imagine, for example, talking about work flow with someone who has been trained in streamlining manufacturing processes.

Here are the ground rules we used:
  • Food should not be a distraction. (Yes, we had munchies. A benefit is that if you have food in your mouth, it can be more difficult to interrupt, thus giving others a chance to talk!)
  • Start on time and end on time. Actually we started on time and then ran late! Once we got going, it was hard to stop...and the extra time was well spent.
  • No idea is discounted or dismissed. All ideas are valid. Critical!
  • Don't go off on tangents. Keep each other focused.
  • Keep an open mind. Critical!
  • Keep your own notes. (We did this instead of creating one master list of ideas and it worked well. Each person wrote down what s/he needed.)
We also looked at brainstorming tips and used a number of them.

At the end of 2.5 hours, we each left energized and with actionable ideas. We saw things fresh -- with new eyes as it were. All useful benefits.

Need new ideas? Why not give this a try?


Saturday, August 25, 2007

Event: Stewardship of Digital Assets: Sustaining Digital Collections

See the announcement below from IMAGELIB.

A two-day workshop on sustaining digital collections

The Northeast Document Conservation Center (NEDCC), in conjunction with PALINET, SOLINET, Amigos Library Services, and OCLC Western Service Center, is pleased to announce Stewardship of Digital Assets: Sustaining Digital Collections. This lecture and exercise-based workshop is designed to provide participants with tools and methods to assess their institution's digital preservation needs, and develop plans and policies to sustain their digital collections.

STEWARDSHIP OF DIGITAL ASSETS will be taught by a faculty of nationally known digital experts: Liz Bishoff, Tom Clareson, Robin Dale, and Katherine Skinner.

This hands-on workshop will be presented at four locations across the country from Fall 2007 - Fall 2008. SPACE IS LIMITED TO 35 PEOPLE FOR EACH WORKSHOP.



PALINET headquarters, Sponsored by PALINET
Wednesday, November 14 - Thursday, November 15, 2007

REGISTER ONLINE for the Philadelphia workshop at:



MORROW, GA, Georgia Archives, Sponsored by SOLINET
Wednesday, February 6 - Thursday, February 7, 2008
Registration opens: December 10, 2007

Amigos Library Services headquarters, Sponsored by Amigos Library Services
Wednesday, May 14 - Thursday, May 15, 2008
Registration opens: February 25, 2008

Washington State Historical Society, Sponsored by OCLC Western Service Center
Thursday, September 18 - Friday, September 19, 2008
Registration opens: June 30, 2008

**To receive an announcement for workshops at these other locations, please email Julie Martin Carlson,


$275.00 for members of co-sponsoring organizations in workshop locales
$325.00 for non-members

FOR COMPLETE INFORMATION about Stewardship of Digital Assets workshops, go to:

Contact Donna Harnish,
Phone: 215-382-7031, ext.1228

Contact Tom Clareson,
Phone: 205-382-7031, ext. 1270

These workshops are made possible by a grant from the National Endowment for the Humanities.


Thursday, August 23, 2007

Clue me in

Last year, everyone in my household watched Feasting on Asphalt with Alton Brown on the Food Network. Although we learned about the show before it aired, we had not heard that he had been soliciting input from his viewers. Even though we watch his normal show, we had missed that. This year, we totally missed the first episode of Feasting on Asphalt 2 because nothing had popped up to tell us about it. Thankfully, two friends clued us in.

We live in a world now where we expect to be clued in about things that we want to know about. We can subscribe to feeds that will deliver the latest news to our email, cell phone, Twitter account, Internet start-page, or wherever. We can get updates delivered on specific TV shows or sporting events. Bands, comedians and other entertainers will even email us their schedules so we'll be sure to catch them if they come to the area.

When something doesn't clue us in (like Feasting on Asphalt) then we get miffed. If we only had known...!

How are you cluing in your users about the things that are occurring with your digitization program? Are you posting information on the web site or are you being more proactive, perhaps getting people to sign up for email updates? Are you blogging or even micro-blogging about what is going on? Or are you keeping your users in the dark? If you are treating your users like mushrooms (keeping them in the dark and feeding them manure), then don't be alarmed if they get a bit miffed when you surprise them with a change that they should have known about all along.

Go ahead...find ways to clue your users in. Informed users can lead to happy users...and happy users can be very supportive when we need them to be.


Wednesday, August 22, 2007

Events: EU Collaborative Digital Preservation Events

From the Digital-Preservation discussion list.

EU Collaborative Digital Preservation Events
4 September-8 September, 2007
Instituto dos Arquivos Nacionais / Torre do Tombo Lisbon, Portugal

To foster an environment of collaboration and cooperation between multiple stakeholders, the European Commission, DPE, PLANETS, CASPAR, and the e-SciDR teams are delighted to announce that they have been working together to deliver a series of consecutive digital curation and preservation events. The events will all be held at the Instituto dos Arquivos Nacionais / Torre do Tombo in Lisbon, Portugal. The events will take place the week beginning 4 September 2007 and run through to 8 September 2007.


Towards a European e-Infrastructure for e-Science digital repositories (e-SciDR) Workshop
4 September 2007
The e-SciDR study has been commissioned by the European Commission to formulate policy recommendations with regard to digital repositories and a European e-infrastructure to support them. This workshop will review the study's draft findings and recommendations. Anybody interested in attending the workshop should contact

DPE, PLANETS, and CASPAR Second Annual Conference:
Progress towards Addressing Digital Preservation Challenges
5-6 September 2007
DPE, PLANETS, and CASPAR are delighted to announce their second annual joint conference to introduce participants to a range of tools and resources and to explain how these fit into the evolving context of the international curation and preservation landscape.

Disclosure and Preservation
Fostering European Culture in the Digital Landscape
7-8 September 2007
This international seminar has been organised in the framework of the Portuguese EU Presidency 2007. Aimed at the promotion and development of the online availability of European digital content, this seminar will provide an up-to-date overview and discussion of the matters most relevant to the subject - the need for harmonization, cooperation, and coordination among different stakeholders, member states and their institutions.


Managing Electronic Records and Assets: a Bibliography

The message below was posted to the Metadata discussion list and seemed worth posting here.

This looks like a great resource! Sections of the bibliography include:
  1. Digital Collections
  2. Digital Audio and Video
  3. Digital Images
  4. Digital Texts
  5. Electronic Records
  6. Electronic “Manuscript Collections”
  7. Institutional Repositories
  8. Copyright
  9. Information Literacy
  10. Databases and Programming
This bibliography would be a worthwhile addition to every existing resource list or bibliography on digitization.

As a result of the Society of American Archivists' Council's strategic planning efforts in technology, a Technology Best Practices Taskforce was formed in September 2006 to work with appropriate SAA and ARMA groups to identify competences and standards and to collect, review and clarify best practices relating to all areas of archival practice that are affected by electronic records and digital asset issues.

It is hoped that the outcome will be the development, acceptance and implementation by archivists, records managers and IT professionals of widely accepted standards for archival functions (e.g. accessioning, appraisal, arrangement and description, preservation and access) for born-digital and digitized archival assets utilizing readily available tools.

Through surveys of appropriate sources, the Task Force has produced a document that will serve as a beginning called "Managing Electronic Records and Assets: a Bibliography" and found at:

This email requests the circulation and review of "Managing Electronic Records and Assets: a Bibliography" and the submission of recommendations and comments to

Help us to make "Managing Electronic Records and Assets" a compilation of proven sites, handbooks, articles, case studies and other sources to support best practices research. Elements to consider during your review include:
  • What sources have you found most helpful in your work with electronic records and assets? Annotations indicating why the source was helpful are appreciated.
  • Have the solutions provided by the source had an impact on archival practice?
  • Does the source have good documentation of technical systems and applications?
  • Was the solution proposed by the source a success and what are the characteristics of the success?
  • What can the source teach us?
This call for review and comment will enhance professional practice through the on-going identification and evaluation of electronic records and digital asset management solutions. The Task Force welcomes comments at


Tuesday, August 21, 2007

Event: Metadata for You & Me: A Workshop on Shareable Metadata

This announcement was found on an email discussion list...

Metadata for You & Me workshops address the needs of library, museum and cultural heritage professionals in the creation, development and use of interoperable or shareable descriptive metadata. The content of workshops is based on the Best Practices for Shareable Metadata, an initiative of the Digital Library Federation and the National Science Digital Library, that provides guidance for creating metadata that can be easily understood, processed and used outside of its local environment.

Registration is now open for the following dates:
Sept. 5 - Oct. 10
A 5 Week Online Course

September 20 or 21
CDP@BCR - Denver, CO
$130/person (includes lunch)

October 5
Emory University - Atlanta, GA
$130/person (includes lunch)


Monday, August 20, 2007

Central Florida Memory

Here is another digitization program that is using YouTube to promote itself. Central Florida Memory began in 2002 and continues to grow. It is a place "where visitors can discover what Central Florida was like before theme parks and the space program. Diaries and letters describe the region and how people survived day-by-day in this extreme and rugged environment. Maps, photographs, and postcards illustrate how the region looked in the early years and how it changed over time. Voters’ registration and funeral records and city directories provide demographic information that makes the picture of the Central Florida settler come into focus." Lee Dotson reported that creating the video took more time than she anticipated, but -- with YouTube's popularity -- it is a tool you cannot ignore.


News from the European Library and beyond

Last week, the Russian State Library joined the European Library. According to the new blog TELL Fleur, which chronicles happenings at the European Library:
As a result, users of the European Library will be able to search and retrieve material from the national library of Russia-Moscow. The European Library is a service of the Conference of European National Librarians (CENL).
TELL Fleur points out that the European Library is not the same as the European digital library that is now under development. However, Ms. Fleur Stigter writes that "the European Library is the organisational ground for the European digital library." It is hard to tell how these two projects might entwine. There are some European digital library treasures on the European Library web site, but it is unclear if these are truly part of "the" European digital library. (Fleur, perhaps you can leave a comment and educate us?)

Following the links concerning the European digital library, I found information on the i2010 initiative to create "a European Information Society for growth and employment." What a spectacular project that will cross geographic and cultural boundaries in order to help Europe collect, manage and preserve its knowledge.

Unfortunately, we don't receive a lot of European library news on a day-to-day basis. Therefore, I'm glad to see this new blog that will help to keep us up-to-date.


Sunday, August 19, 2007

Event: Museum Computer Network (MCN) Annual Conference, Nov. 7 – 10, 2007.

I've seen this posted on several discussion lists...but there is no harm in helping them spread the word even further.

Save the Date!

Join the Museum Computer Network in Chicago for our Annual Conference, November 7 – 10, 2007.

Building Content, Building Community: 40 Years of Museum Information and Technology


Visit the MCN web site for registration information, preliminary program and hotel & travel information.

MCN is a nonprofit organization with members representing a wide range of information professionals from hundreds of museums and cultural heritage institutions in the United States and around the world.

Over the past 40 years MCN has become known for the outstanding conferences we produce, bringing together professionals from museums all over—large and small, art, history, science; as well as colleagues from libraries and archives. For years our conferences have explored the latest technology while helping our contemporaries implement best practices in museum technology.

Conferences always feature exemplary speakers and sessions; this year in Chicago we’re kicking off the conference by bringing the whole group together for a day of meetings and plenary talks on the topic of leadership in our museum information profession. The first day of the conference will help set the scene for the stimulating presentations and discussions in the three days to follow.

As the museum field has grown and expanded in the past half-century, we have become more and more specialized, learning primarily from those whose experiences most closely mirror our own – art with art, science with science, etc. This conference will reaffirm and revitalize our broader connections across these types of institutions.

MCN will revive the cross-disciplinary vision of 40 years ago and explore its promise in the present day, encouraging participation from institutions across the museum spectrum. How are we the same? How are we different? How have we dealt, successfully and unsuccessfully, with the overlapping efforts in documentation, professionalism, and use of technology across the museum community? What are we doing right, and what do we need to do better?

Today, MCN members are creating everything from databases to podcasts; handling digital imaging and preservation; and working with information management, IT, and Web 2.0. You name it, our members are doing it, talking about it, and helping others achieve results.

We hope to see you in Chicago!


Friday, August 17, 2007

The new Fedora Commons

Fedora has had a rash of announcements in the last week. They announced:
  • The Fedora Commons
  • A new web site
  • Additional staff
  • A new version of the software
  • The receipt of a four-year, $4.9 million grant
The "Fedora Commons is a non-profit organization providing sustainable technologies to create, manage, publish, share and preserve digital content as a basis for intellectual, organizational, scientific and cultural heritage by bringing two communities together."

All of this means that Fedora will be supported and will continue to grow, and that's a good thing.


Thursday, August 16, 2007

Presentation: Digital Preservation for the Nation: An Introduction to the Electronic Records Archives (ERA)

Rita Cacas, ERA Communications Officer at the National Archives and Records Administration, did a presentation on this at the Special Libraries Association Annual Conference in June. Her presentation is available online. In addition, her session was recorded and the audio is also available (1 hour, 27 min.).

The volume of information that they are concerned with is tremendous. For example, in the Department of Defense, annually there are 54 million images from electronic official military personnel files. So ensuring that the digital materials are preserved for the short and long-term is critically important.


Looking ahead to Blog Day


August 31 is Blog Day -- a day where bloggers are asked to celebrate by recommending five blogs. As the site says:
In one long moment on August 31st, bloggers from all over the world will post recommendations of 5 new Blogs, preferably Blogs that are different from their own culture, point of view and attitude. On this day, blog readers will find themselves leaping around and discovering new, unknown Blogs, celebrating the discovery of new people and new bloggers.
I am already planning my Blog Day post and would like to have more than five blogs to recommend. Therefore, if you know of a new blog broadly related to digitization, please let me know, so I can check it out for inclusion. Thanks!

Here is my post from last year.


Wednesday, August 15, 2007

Event: 7th Annual Optical Storage Symposium, Sept. 18 - 19, 2007

Nothing in this symposium's description says digitization, yet everything they will discuss will impact digitization programs in the future. Sessions include:
  • Market Perspectives of the Data Storage Industry
  • Photo Archiving - Where Have All Your Digital Photos Gone?!
  • High-Capacity Optical Storage: Will blue laser or holographic storage be the solution?
  • New Technology Roundtable
The topics within those sessions include blue laser markets, emerging holographic technology, alternate magnetic data storage products, and consumer photo archiving. Registration deadline is Aug. 27.


Google Book Search Tips: A University of Michigan University Library Handout

©ollectanea wrote a nice blog post about this five-page handout noting that her eyes glazed over after a while. (And if that happens to a librarian, what will happen when a user reads it?) Looking at the positive aspects of the handout, Georgia Harper wrote:

...the document is really helpful as it shows in detail what features the book search provides, how to use it to best advantage, and if you're at UMich, how to double-check your results against Michigan's catalog, Mirlyn. I want to say right now that I think this is a really good thing. I've heard so many people say things that indicate that there's a lot of misunderstanding about what Google Book Search does and how it works. So clearly, this is needed and kudos to UMich for doing it...

Although this handout was created specifically for UMich, it would be useful to others who are using Google Book Search, which seems to need a lot of explanation for something that seems so simple.

Also posted in the SLA IT Division blog.


Tuesday, August 14, 2007

Quote from Carwyn Jones on the effect of digitization

In talking about digitization at the JISC Digitisation Conference 2007, Carwyn Jones, Minister for Education, Culture and the Welsh Language, said:

we are witnessing a democratisation of research

Well said!


Article: Inheritance and loss? A brief survey of Google Books

The Google Books Project has drawn a great deal of attention, offering the prospect of the library of the future and rendering many other library and digitizing projects apparently superfluous. To grasp the value of Google’s endeavor, we need, among other things, to assess its quality. On such a vast and undocumented project, the task is challenging. In this essay, I attempt an initial assessment in two steps. First, I argue that most quality assurance on the Web is provided either through innovation or through “inheritance.” In the latter case, Web sites rely heavily on institutional authority and quality assurance techniques that antedate the Web, assuming that they will carry across unproblematically into the digital world. I suggest that quality assurance in Google’s Book Search and Google Books Library Project primarily comes through inheritance, drawing on the reputation of the libraries, and before them the publishers involved. Then I chose one book to sample Google’s project, Laurence Sterne’s Tristram Shandy. This book proved a difficult challenge for Project Gutenberg, but more surprisingly, it evidently challenged Google’s approach, suggesting that quality is not automatically inherited. In conclusion, I suggest that a strain of romanticism may limit Google’s ability to deal with that very awkward object, the book.
The findings outlined in the entire article are interesting and some are not a surprise. As he wraps things up, the author (Paul Duguid) states what we wish wasn't true:
The Google Books Project is no doubt an important, in many ways invaluable, project. It is also, on the brief evidence given here, a highly problematic one.
There are now 27 libraries that are part of this project. It would be interesting to hear from them either how they are working with Google to improve the quality of what Google is doing or why they feel this quality is acceptable. Perhaps they are looking past these problems and seeing something grander than what we see.


Monday, August 13, 2007

Article: Cornell University Library becomes newest partner in Google Book Search Library Project

In this region, Cornell remains at the forefront with regard to digitization. Last week, Cornell issued a press release with major digitization news: it is joining the Google Book Search project. The press release said:

“In its quest to be the world’s land-grant university, Cornell strives to serve the scholarly and research needs of those beyond the campus. This project advances Cornell’s ability to provide global access to our library resources and to build human capacity across the globe,” said Cornell President David J. Skorton.

Google will digitize up to 500,000 works from Cornell University Library and make them available online using Google Book Search. As a result, materials from the library’s exceptional collections will be easily accessible to students, scholars and people worldwide, supporting the library’s long-standing commitment to make its collections broadly available.

Google will digitize both public domain and copyrighted materials at Cornell. Those materials will be selected in order to complement the other work that Google is doing.

500,000 works is a small part of Cornell's collections, which include "close to 8 million volumes in print and more than 60,000 journals, 300,000 e-books and 39,000 e-journals." Even so, Janet A. McCue, director of Mann Library, said “Having Google index our collections is like having a massive concordance to the information in our books.”

How many libraries is Google now working with? 27.

In 2006, Cornell announced that it would participate in Microsoft's Windows Live Book Search. Does this now mean that Cornell is working with Google and Microsoft at the same time? Undoubtedly Cornell is working on other digitization programs too. It would be interesting to hear how all of this work is (or is not) impacting them. Perhaps due to its size, the impact is minimal.

Thursday, August 09, 2007

The "M" word

There are some words that cause excitement, fear or dread, and that cannot be spoken. For example, we often say w-a-l-k in front of our dogs, because we don't want to get them excited about the possibility of taking a walk. The "M" word is one that causes people's eyes to glaze over and it inhibits them from understanding why it is important.

No...not marriage, but metadata.

For those unfamiliar with metadata, the word is meaningless and the definition -- data about data -- makes matters worse. If we correlate it to cataloguing or indexing, it sounds like tedious, complex work that is best done by someone else. ("No, I'm not doing that!")

So how do we get people to care about metadata, especially those people who may need to help create it? Don't use the "M" word! Instead talk about how people will want to search the materials. What terms will people want to use? What will their expectations be? Get that all out on the table, then talk about what will be needed to ensure that information is there and ready to be searched. (BTW This might lead you to talk about the software that will be used, and that's good.) For example, if people will expect to search on items that appear in a photo, how will that information be made available for them to search on? And if the answer is "they should just call and ask...", then counter with the question of "and what happens when that person is no longer working here?" You might suggest that you attach to the materials the wisdom that the staff has collected so that it is there for the next generation to use.

This is a conversation that may take more than one session to complete, since you're trying to lead the person (or persons) to understand the "M" word without actually using the "M" word! Be patient. Finally, when you have agreement on what the users' expectations and needs will be, and you agree on what information will need to be there in order to satisfy those needs and expectations, then ensure you obtain agreement on who will be responsible for providing and/or inputting that information into the system. Perhaps now you might use the word "metadata" or perhaps you've found that the word is not important now; maybe no one needs to know that they've agreed to do something they dreaded.

Wednesday, August 08, 2007

Ensuring a strong structure

Eric Clapton once sang of a weak foundation made of clay. Yesterday I mentioned that agreeing upfront on the specifics concerning your program will help your program have a firm foundation. A weak clay foundation would sink or crumble under the weight of your program, rather than providing the proper support.

Have you checked your foundation yet? Like a well-built house, is your program sitting on a solid foundation?

If yes, is the rest of the program's structure intact and strong? Like the walls and roof of a house, will your program withstand the equivalents of wind, rain, sleet and snow?

And finally, is there an exit strategy for those working on the program? Can they grow and move on to other positions? If the program fails, can you place them elsewhere or provide assistance to them as they look for another job? Do you have succession plans in place (formally or informally)? (We might equate this to the doors, windows and emergency exit on your house.)

An exit strategy? Surely I must be kidding! No, not really. Even if it is not formal, you need to think about this stuff. There is always a chance that a digitization program will lose funding or run into other difficulties (e.g., bad partners). When that happens, what will you do? Can you save what's been completed? And can you help your colleagues who may feel and be rejected?

And if the program succeeds, can your team move on to other things or will they be locked into maintaining this program? Will their only means of escape be quitting or can they be promoted (or move to another project)?

Something to think about over your afternoon tea...

Tuesday, August 07, 2007

It's like getting married

I used this analogy last week in a meeting and -- although people chuckled -- they understood it.

When we develop a digitization program, we often must collaborate with others either internally or externally. If we collaborate with people internally, we may believe that we have already laid down the necessary foundation, but perhaps we have not. We will likely be more aware of building a firm foundation with those we partner with externally, but we may shy away from negotiating everything, since we don't want to offend or appear picky.

It is like getting married. When you meet a prospective life partner, you date and get to know that person. Perhaps you discover that you use words differently, and so you have to talk about what you mean (the meaning behind the words). As the relationship gets serious, you may begin to talk about the future and making this a formal partnership (a marriage). That's when you'll begin to negotiate about who will do what or at least talk about expectations. And if you do decide to spend the rest of your lives together, you might decide to formalize your understanding of how this partnership will work in a prenuptial agreement. After the wedding, if something goes wrong, you can point to the prenup -- or other paperwork (like a list of who is responsible for what) -- to remind each other of the agreement that was made.

So it is with building a cooperative digitization program. Even if you already know the prospective partners, you need to ensure that you're speaking the same language. You'll need to formalize who is going to do what. You may think it is obvious, but formally agreeing "on the obvious" and writing it down will provide documentation that could be critical later on. Given the length of many programs (elapsed time), there may be people involved at the end, who were not involved in the program at the beginning. In that situation, the documentation becomes even more important because it communicates across time what the founders intended.

Like every marriage, there may be some renegotiation. That's fine. But you cannot renegotiate what you did not negotiate to begin with. In other words, you can't change the rules midstream if you don't know what the rules were in the beginning. Therefore, creating the upfront documentation -- the partnership agreement -- will be helpful when you decide that the agreement needs to change. And the agreement will be helpful if the partnership needs to be dissolved.

If you don't know what the rules are for your program -- if there is no paperwork, no documented expectations, etc. -- it is not too late. Talk to your partners about documenting your agreement for the future. Talk about the need for future workers on the project to understand the agreements, rules, etc. that you have been using. Take the time to do it; you won't be sorry.

Monday, August 06, 2007

My August and fall schedule

The relative quiet of my mid-summer comes to an end soon. You can view my fall workshop and speaking schedule on the left side of this blog. I am conducting workshops this summer and fall in digitization, social networking tools, and Second Life. (Yes, Second Life for me has become my "second life.") Digitization-related workshops include:
  • Sept. 5 & 12: Planning Digitization Projects (done in two parts), Jamestown, NY
  • Sept. 18: Developing Digitization Projects, Fairport, NY
  • Oct. 19: Developing Digitization Projects, Tallahassee, FL
  • Nov. 7: Copyright Basics, Jamestown, NY
I'm also doing a presentation at Internet Librarian entitled "Federated Searching Feedback: Walking the Talk?" on Oct. 29 (Monterey, CA).

For more information on these events, please contact me or the sponsoring organization.

Preserving The Sound Of America

On last night's CBS Evening News, there was a story about preserving sound recordings through a unique effort of the recorded sound division at the Library of Congress. Each year, this division adds 25 American sound recordings to its National Recording Registry. Since its inception in 2000, the Registry has grown to 225 unique sound recordings. Suggestions for recordings to be added come from the general public and then are narrowed down by committee, with 25 finally selected that meet the criteria. The criteria include that the recording must be at least 10 years old and:
The recordings must be historically significant, signal a major change and meet the library's strict standards.
Recordings already on the Registry include:
  • "Stars and Stripes Forever" Military Band. Berliner Gramophone disc recording. (1897)
  • Booker T. Washington's 1895 Atlanta Exposition Speech. (1906 recreation)
  • "Casey at the Bat." DeWolf Hopper, reciting. (1906)
  • "Who's on First." Abbott and Costello's first radio broadcast version. (1938)
  • "War of the Worlds." Orson Welles and the Mercury Theater. (1938)
  • “Blue Suede Shoes.” Carl Perkins. (1955)
  • Graceland. Paul Simon. (1986)
While the web site lists all of the recordings in the Registry, none are available to listen to online (at least not through the Registry). Being added to the Registry is a call for these recordings to be preserved and for others like them to be preserved. Let's hope that they also become more accessible.

Friday, August 03, 2007

Write down your assumptions and decisions

Do your project -- and your future project staff members -- a favor. Write down the decisions you have made as well as the assumptions that your digitization project is using. Why?
  • If you document your decisions, then you won't have to make the same decisions again.
  • When people ask what you decided about "X", you will know AND be able to point to the documentation.
  • Having it in writing generally means you've come to an agreement on whatever it is.
  • Not everyone operates with the same assumptions, so documenting them allows everyone to see, understand and operate under the same assumptions.
  • New people will be able to come up to speed more quickly.
  • You'll be more efficient and effective.
Don't have time to write this stuff down? Just do a little bit each day...or as decisions are made...or as you find yourself explaining it one more time to someone...or...

Thursday, August 02, 2007

Teasing out local collections

Over the last six years, I've helped several clients conduct surveys of their local libraries in an effort to understand what topics/themes/materials exist in local collections. For example, here in Upstate New York, we know that the creation of the Erie Canal changed and helped to develop this area, but who has materials in their collections about the Erie Canal? Of course, there are the obvious suspects, like the Erie Canal Museum in Syracuse, but might a town library have some important materials that only they know about? And what if you knew about them? Could you then propose a digitization program that would digitize materials from various institutions related to the same topic (Erie Canal) and then create a virtual exhibit?

I am starting to work with a client on a survey that will be conducted this fall. Since we want the institutions to really think about their answers, we will give them the survey in hardcopy, rather than using a service on the Internet. With the survey in hardcopy, maybe people will go and look at their finding aids -- or even at the collection itself -- in order to give more accurate answers. However, I know that some people will not be that diligent and that they will complete the survey from memory and tell us only what they remember. At least that will give us some information.

For me, the most interesting part of constructing the survey is collecting "themes" (topics, etc.) that might exist in the region. The listed themes help to tease people's memories and we provide extra space for those things that they know and which are not listed. This is where I get to learn more truly local history. For example, do you know where Lucille Ball was born and buried? Do you know why the Pan American Exposition in 1901 was a big deal? And where was the gentleman who invented the Pullman car from? How many wars (with real weapons) have been fought between people residing in what is now the U.S. and people residing in what is now Canada? ( link for that one, but it is more times than you think since the British and French occupied North America.)

I am hoping that this latest survey will have a side benefit. People have offered up a number of topics and people's names to be included in the survey. I suspect, however, that even though person "X" did something phenomenal and is well known in the community, the local library or historical society hasn't collected that person's stuff. So years from now, there will not be a significant amount of materials on/from that person for researchers or the curious to review. Maybe seeing the list in the survey will prompt the institutions to collect more. Wouldn't that be a wonderful benefit?!

By the way, besides listing the themes, there are boxes to check what type of materials the institution has on that theme and a place to tell us how much. (We're trying to make the survey as easy as possible, so that people will complete it.) We've also provided a place for people to list information on other collections in the region that we may not know about. Perhaps a fantastic private collection -- known by only a few -- will be listed?! Okay..well...I can hope.

Finally, as more collaborative programs are developed, information on what exists in specific regions will become more important. It will be known upfront what exists and where, which may help with creating proposals, grant writing and contacting potential partners. Anything that makes those tasks easier is appreciated.

Do you know what exists in your region? If no one knows what exists in your region, consider doing some information gathering -- even a survey. Don't want to do it yourself? Consider asking your local college (history department, perhaps) for help.

Wednesday, August 01, 2007

Article: Uncovering the past, bit by byte

A very interesting digitization project is occurring in Bethlehem, PA where "rare historic documents -- are now being fed into a computer at Lehigh University to digitally reconstruct the early 20th century neighborhoods that eventually would become incorporated as the city of Bethlehem."

How are they doing the reconstruction?
The researchers are feeding the information from old census records, city directories and Steel payrolls into the Geographic Information System. The GIS allows that information to be layered on top of old maps.
According to the article, this is a $15,000 project being funded through a Pennsylvania Commonwealth Libraries Library and Technology Act grant. In another article, I see that this money is being spent to work only on the early 1900s and that they will apply for funding to do more.
The $15,000 project, paid for with a Pennsylvania Commonwealth Libraries Library and Technology Act grant, continues through May. But [Julia Maserjian, the Lehigh Univ.'s digital library coordinator] said she has already applied for more grants to do a similar project for 1920. That would allow researchers to look at the effects of a population surge in the city after thousands of people were brought in to work on World War I contracts.
I would assume that this is all part of the "Beyond Steel: A Digital Archive of Lehigh Valley Industry and Culture," which is currently listed as a work in progress.

Since this is being grant funded, I'm sure that there are in-kind contributions being made above the $15,000 amount. I hope they will publish something on what they did, how they did it, and what it cost. There is still not a lot of cost data floating "out there", yet it always helps to know how people spent their funds and what they were able to achieve with the money.
