Friday, September 30, 2005

Putting Content Online: A Practical Guide for Libraries

Mark at digitizationblog is writing a book entitled Putting Content Online: A Practical Guide for Libraries. It will be published about a year from now. His blog posting talks about his intent to make it practical and not just about digitized materials. Sounds like a book that many libraries and cultural heritage organizations will be able to use, especially those smaller ones that have not really thought about this yet.

Mark, good luck! We look forward to the formal announcement saying that it's available.

Thursday, September 29, 2005

NEDCC Study reveals need for planning to sustain digital collections

The press release below is a status report on the NEDCC's efforts to understand the preservation needs of institutions with regard to digital assets. The major conclusion is no surprise: smaller institutions will need help from others in understanding their preservation needs.

The Northeast Document Conservation Center (NEDCC) is conducting a study of the preservation needs of digital assets in museums and other cultural institutions, supported by a National Museum Leadership Grant from the Institute of Museum and Library Services (IMLS). Cooperating partners for this project are the Museum Computer Network (MCN), Heritage Preservation (HP), the American Institute for Conservation (AIC), and the Center for Research Libraries (CRL).

THE GOAL IS to develop a methodology for surveying the preservation needs of an institution's digital holdings.

AS AN INITIAL STEP, the project advisory committee developed a prototype e-mail questionnaire to gather data on the status of digital collections from a sampling of cultural heritage institutions. The data suggested that institutions need to be more concerned about the fragility of digital assets:
  • 92% of the institutions responding noted that they are creating digital materials.
  • Only 29% of the responding institutions reported that they have written policies to address the management of presentation of digital assets.
NEDCC CONVENED A COLLOQUIUM of experts in Boston on July 11-12, 2005 to examine data from the questionnaire responses and preliminary information. The event began with presentations of existing survey models, especially the Conservation Assessment Program and the Heritage Health Index. Participants engaged in moderated discussion on digital preservation needs and possible solutions.

  • Steve Chapman, Harvard University
  • Tom Clareson, OCLC
  • Paul Conway, Duke University
  • Steve Dalton, Boston College
  • Kevin Glick, Yale University
  • David Green, Knowledge Culture
  • Ken Hamma, J. Paul Getty Trust
  • Peter Hirtle, Cornell University
  • Kristen Laise, Heritage Preservation
  • Paul Messier, Conservator
  • Sam Quigley, Harvard University, MCN
  • Bernard Reilly, CRL
  • Rebecca Hatcher, NEDCC
  • Ann Russell, NEDCC
THE MAJOR CONCLUSION was that small and medium-sized institutions will need help from specialists in surveying the preservation needs of their growing digital collections. The group took significant first steps toward developing practical planning tools for assessing the preservation needs of digital collections and envisioning a national survey program.

THE NEXT STEPS of the project include refining a new assessment instrument and testing it through expert-conducted site visits to gather information on an institution's digital assets. Information will be released as it is available.

For more information on NEDCC's programs and services
Visit or call 978/470-1010

Standard Disclaimer for External Links

I found this page today on the Library of Congress web site. It is a disclaimer concerning external links -- no implied endorsement, no control over content, etc. Nicely done. Lots of web sites could use a disclaimer like that.

SU's Bird Library Copyright web site

If you're like me, no matter how much you know about copyright, you often look for documentation from others to see how they have interpreted the law. This semester, Syracuse University's Bird Library has launched a site dedicated to copyright. The site says:
Since responsibility for copyright compliance rests with you, the user, these guidelines provide information and educational tools to assist you in making informed decisions regarding appropriate use of copyrighted materials.
The site includes the Common Academic Uses of Copyrighted Information, which are:

Under each subsection are suggestions, lists of resources, or pointers to reputable content on other web sites.

Congrats to SU for posting this information. Hopefully other institutions that have not yet done so will follow suit.

Wednesday, September 28, 2005

The five C's

At the SOHO business show yesterday, I attended a presentation entitled "How To Turn Your Website Into A Dynamic Search Engine Magnet." Unfortunately, the presentation was really a vendor touting his services, complete with a live testimonial. So my notes from it are sparse, but they do include information on the five C's that I thought was worth sharing.

According to the CEO of Cazbah Total Internet Marketing Solution, the five C's are key to being successful on the Internet. If true, that means that all web sites -- not just business sites -- should pay attention to them. The five C's are:
  • Content -- We all know that a web site needs content. Keep in mind that visitors to your site will make decisions about the content in the first 3 - 8 seconds at the site. Often that decision is made based on how fast the site loads or what is on the homepage. Make sure your site works well (optimized/efficient code) and that when visitors do see the homepage, they will understand what the content is.

  • Community -- Web sites are built for specific communities of people. Do you know what that community is? Can you define that community and interact on some level with it? When members of that community get to your site, can they tell that the site is for them? Of course, we might also talk about how you market to that community as well as how you keep them loyal...

  • Communication -- Can your users communicate with you and you with them? Don't just think of e-mail, but also think of instant messaging, newsletters, blogs, etc. Remember that a community of people will want to communicate with you and likely with each other.

  • Commerce -- We don't all use our web sites for ecommerce, but we might want people to "do" something (use a database, view photos, read articles). A web site that is focused on ecommerce will make sure that the ability to find, select, and buy is easy. Do the same for your site. If you want users to search or browse, for example, make sure that it is obvious and easy.

  • Customer service & support -- We hope that our web sites and online products are easy to use and that no one will need help, but that's not true. Users will want to be able to contact you with questions, especially if they are having problems. And they will want to contact you when they are having that problem, not when it is convenient for you. So what customer service and support mechanisms do you have in place? Are they obvious and easy to use?
The five C's were a good take-away from the presentation. As the saying goes, if you get one thing out of a presentation, then it was worthwhile. With that as a standard, I'm glad I sat through it.

No libraries...

Yesterday I attended the SOHO business show here in Syracuse. (SOHO = Small Office, Home Office) All of the exhibitors were businesses/organizations who felt their services were pertinent to small businesses. There were no libraries -- digital or brick -- among the exhibitors, nor any business information products. Thus there were no clues to the vast amount of information available that businesses can tap into.

As information professionals, we can't expect our customers (or patrons) to automatically think of us and our resources; we must continually market ourselves. How cool it would have been if the local libraries had had a booth at the SOHO show. They could have demonstrated business-related databases, talked about the services they have to help the business community, and even registered people for library cards. It was an opportunity missed.

If there is a business show in your area, would it be worth your while to exhibit at it? Could you collaborate with other institutions to share the cost and create a dynamite booth space with pertinent information? Could you collaborate with a vendor? Would doing so raise your organization's profile and possibly expand your user-base?

Tuesday, September 27, 2005

Teaching about blogging

Last week I did my first public workshop on blogging (How to Create a Blog for Your Business). I was pleased with the number of participants as well as hearing from those who wanted to come, but couldn't due to their schedules. I was intrigued by the blogs people wanted to start, including those for their businesses, on a forthcoming book, and to support college courses. Only about half of the people were already reading blogs and then only a few blogs (with one person reading/monitoring eight blogs). I believe no one was using a blog reader, so teaching about blog/RSS readers became an addendum to the workshop via e-mail two days later.

What lessons did I learn from the experience? Among them are:
  • As the surveys have shown, a lot of people don't know what blogs are, so we still need to make people aware of them.
    • If you have a blog for your organization, remember to continually market it.
    • This also means continuing to make people aware of RSS and RSS readers.
      • RSS is very versatile. There is a list of things you can do with RSS that really shows off the technology's capabilities.
  • People need to know about blog readers. Period. You can monitor more blogs and be impacted by more information/opinions if you use a blog reader.
  • People are willing to adopt new modes of reaching out and communicating if they understand what the benefits are.
    • With your users and colleagues, make sure that they know how to use the technologies that you present AND tell them how using the technologies will benefit them. Give very specific examples.
  • Although anyone can blog, not everyone can modify a blog's template. So for example, a person may not be able to insert the HTML for a Creative Commons license in the template.
    • I joked that they should turn to their IT support person, web site developer, family member or the kid next door for help. The "kid next door" as the IT support person is both real and folklore. There is a TV ad here showing a police car stopping in front of a group of teenagers. One teen approaches the police car and the police officer holds up his laptop, which has frozen. The teenager tells him how to fix it, then walks away.
  • It is important to emphasize simplicity. One blog I showed was a library blog for internal staff (although it has an external following) because it was simple and effective. The template had not been modified at all. The emphasis on keeping it simple helped the participants know that blogging is doable.
I learned more than that, but I'll keep it simple for now! It was a wonderful experience, which I'm looking forward to repeating.

Monday, September 26, 2005

Event: Workshop on the Long-term Curation and Preservation of Medical Databases

DCC and ERPANET Workshop on the Long-term Curation and Preservation of Medical Databases 13th-14th October, 2005 Calouste Gulbenkian Foundation, Lisbon, Portugal


The Digital Curation Centre is pleased to announce that it will be delivering a two-day workshop on the long-term curation and preservation of medical databases. This workshop will be held at the Calouste Gulbenkian Foundation in Lisbon, Portugal from 13th - 14th October 2005.
This event is co-sponsored by ERPANET, the Calouste Gulbenkian Foundation and the Digital Curation Centre.

Enabling access to and re-use of digital information is an integral part of medical research and health-service provision and is essential for enabling accountability. This workshop will explore the ways that practitioners can support accessibility and reusability of medical databases over time. The workshop will provide an opportunity for information professionals, medical practitioners, commercial developers, students, and researchers to network with colleagues, learn about current developments, discuss their own experiences, and identify risks and challenges that are likely to emerge. Presentations and panel discussions will offer participants cross-disciplinary perspectives on the long-term preservation of medical databases.


The workshop will be delivered over three broad sessions: medical database creation, curation and reuse. Each session will be chaired by a practitioner in the field. The chair will begin the session by placing the topic into the context of digital curation and provide references to international efforts in the area. Following this introduction, each session will highlight specific tools, practical approaches and emerging standards in the form of presentations. Each session will conclude with an open question period which will be moderated by the session chair.

Key themes include:
  • Tools and standards to help create, curate and reuse medical databases;
  • Description techniques that may help to ensure the long-term usability of digital medical information within databases;
  • Overview of curation and preservation activities being undertaken internationally
Confirmed speakers for this event include:
  • Peter Singleton, Cambridge Health Informatics
  • Peter Kerr, NCRI Cancer Informatics Initiative
  • Warren Hilder, Medical Research Council
  • Livia Iacovino, Australia’s HealthConnect Project, Monash University
  • Robert Terry, Wellcome Trust
To see the full programme, please see

Programme Committee:
  • Niklaus Butikofer, Co-Director ERPANET
  • Alan Doyle, Science Programme Manager, Wellcome Trust
  • Mariella Guerico, Università degli Studi di Urbino Carlo Bo
  • Hans Hofman, Nationaal Archief van Nederland
  • Seamus Ross, Associate Director DCC Services and Director, HATII, University of Glasgow
  • Allan Sudlow, Programme Manager, Research Management Group, Medical Research Council

The venue for this event will be the Calouste Gulbenkian Foundation, Lisbon, Portugal. For more information on the venue please see

Travel Instructions

See detailed options for travelling to the Calouste Gulbenkian Foundation at


Registration fees are £75 for DCC Associates Network members and £125 for non-members. These fees include all workshop materials and handouts, lunch on the 14th and refreshments. Membership of the DCC Associates Network is free and more information on becoming a member can be found at

Event: Workshop in Geographical Information Systems, Oct. 13 - 14

Two-day Workshop in Geographical Information Systems in universities, libraries, museums and archives sponsored by the Centre for Data Digitisation and Analysis, Queen’s University and the Polis Center, Indiana University Purdue University Indianapolis

Belfast, October 13 - 14 2005

Geographic Information Systems (GIS) are fast becoming the preferred technology for managing and presenting information in university humanities and social science departments, in museums, in libraries and in archives. Its capacity to integrate and map data by geographical location and chronology makes it an ideal tool for associating disparate research datasets, museum objects and collections or archival records. Implementing a geographic information system (GIS) means making a number of difficult decisions and each of those decisions will affect the success of a project. This workshop will provide guidance on the correct judgments to make and includes assessing the need for GIS, developing an implementation strategy, assessing costs, reviewing training needs and evaluating the benefits and returns. The workshop will focus on introducing a variety of data resources that can be integrated into a GIS as well as procedures for translating existing data into a GIS environment. There will be discussion of specific issues associated with developing, managing, visualising, and analysing temporal and spatial data of interest to the archives, library and research community.

Hands-on experience of one of the key GIS software packages, ArcGIS 9, will be provided with introductory sessions on GIS terms and concepts; an Overview of ArcGIS and basic navigation techniques; ArcGIS symbology tools, ArcGIS Query Tools; and creating Data in ArcGIS. On completion attendees will have a working knowledge of the software.

Costs: Two day course: £250

Cheques, made payable to Queen’s University, should be sent to Elaine Yeates, Centre for Data Digitisation and Analysis, School of Geography, Queen’s University, Belfast, BT7 1NN, UK including your full contact details with e-mail address. Confirmation of bookings, and further information on the workshop will be sent by return.

Location: Queen’s University main campus

Dates: October 13th and 14th 10am-4pm. A lunch will be provided between 1pm and 2pm


Dr Paul S. Ell, Centre for Data Digitisation and Analysis, School of Geography, Archaeology and Palaeoecology, E-mail

Dr Ian N. Gregory, Centre for Data Digitisation and Analysis, School of Geography, Archaeology and Palaeoecology, Room, E-mail

Kevin Mickey, Director, Professional Education and Outreach, The Polis Center, Indiana University Purdue University Indianapolis, 1200 Waterway Boulevard, Indianapolis, Indiana 46202, E-mail:

Friday, September 23, 2005

Article: Art Librarians Pay Widely Diverging Prices to Convert 35 mm Images of Artwork to Digital Formats

Of course, the press release gives no cost information, although it does talk about the number of images scanned and usage. Cost information must be in the full report.

In talking about the study, the press release says:
The new study from Primary Research Group is based on thorough interviews with leading art and image libraries, including those from Cornell University, Ohio State University, ARTstor, the National Archives & Records Administration, the Smithsonian, McGill University, the National Gallery of Canada, the University of North Carolina, the Illinois Institute of Technology and the Union Catalog Project for Art Image Metadata.

Art librarians are converting their 35 MM image libraries on a selective basis, as they re-shoot images, acquire new ones from commercial providers, enter into consortium sharing arrangements, and take other measures to digitize their collections.

The librarians interviewed discuss their digitization efforts commenting on the impact of the mega-library and emerging resource ARTstor, consortium activities, costs and benefits of in-house and outsourced image conversion, metadata development, copyright and licensing issues and other topics in art and image digital librarianship.
Read the full press release for details and a link for purchasing the full report.

Article: Google Inc. hiring for Ann Arbor operation

I'm posting this because I found it interesting. (And it's a small peek into what Google is doing.)

Although Google has been scanning on-site at the University of Michigan, it has no room to expand, so...

Google Inc. appears set to launch an expanded operation in Ann Arbor for its library digitization project for the University of Michigan.

The company seeks to hire technicians that will install and maintain new servers equipment capable of processing and storing network information in a server room at an undisclosed site in the city.

The article also says:
At U-M, Google is working on non-copyrighted material at the Buhr Shelving Facility until the copyright issues are resolved.

Thursday, September 22, 2005

The lawsuit against Google

I'm sure you've heard about it already, but I want to point you towards some commentary.
Steve Abram (Stephen's Lighthouse) has a good summary with commentary of a study done of the Google project by two researchers at OCLC. One tidbit of information is that the study noted that more than 80% of the materials in the Google 5 collections are still in copyright.

Wednesday, September 21, 2005

Event: Preserving the Digital Heritage, Nov. 4 - 5, 2005

The Netherlands National Commission for UNESCO and the Koninklijke Bibliotheek, National Library of the Netherlands, organise an international conference 'Preserving the Digital Heritage: Principles and Policies', which will be held in The Hague, Friday 4 November and Saturday 5 November 2005.

The conference is focussing on issues of interest for higher management in libraries, archives and museums, and other policy makers in the field of information and culture. The programme and registration form can be downloaded from:

The special characteristics of digital objects ask for new preservation policies and agreements. Amongst the many problems that librarians, archivists and other information professionals face, two issues seem to be especially important: the selection of material and the division of tasks and responsibilities between institutions for preservation purposes. These issues are mentioned in the 'Charter on the Preservation of the Digital Heritage', adopted at the 32nd session of the General Conference of UNESCO in November 2003.

At the conference a range of state of the art examples will be presented from a variety of materials, heritage institutions and cooperation schemes. The aim is to further clarify the above mentioned issues and to formulate recommendations on actions to be taken in the period to come.

The Organising Committee
Gerard van Trier, Chair
Koninklijke Bibliotheek, National Library of the Netherlands P.O. Box 90407
2509 LK The Hague
T: +31 70 31 40 463 / 495
F: +31 70 31 40 651

Tuesday, September 20, 2005

E-Book: Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web

An online version of this book, Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web, by Daniel J. Cohen and Roy Rosenzweig, is available for free on the Internet (HTML version). A print version will also be available for purchase through several booksellers when it is released later this fall. The chapters of the book include:
  • Introduction
  • Exploring the History Web
  • Becoming Digital
  • Designing for the History Web
  • Building an Audience
  • Collecting History Online
  • Owning the Past?
  • Preserving Digital History
  • Final Thoughts
  • Appendix
Skimming through the pages, I can see that this is a very readable book that covers all the areas a project/program needs to consider. Now if there were only a downloadable version.

462 postings later

I began this blog on August 30, 2004 -- more than a year ago. Now it's 462 postings later. When I look back, what stands out to me about digitization over the last year are four things:
  1. The technology keeps getting better. This is one field where the technology does not stand still. Equipment can become outdated quickly, so organizations need to be mindful of keeping equipment -- hardware, software, servers, etc. -- up-to-date.
    • By the way, the two pieces of hardware that I saw this year -- and that stand out in my mind -- are the Kirtas book scanner and the S-T Imaging microfilm reader/scanner for use by library patrons.
  2. The impact of the Google project. No matter what happens long-term with Google Print, this project is having an impact.
    • More people know about digitization and what it is.
    • The media is writing about digitization, copyright and Fair Use.
    • The idea of building massive digital collections no longer seems unrealistic.
    • Publishers see that people want information in digital form (as well as in hardcopy).
    • The technology for large-scale projects is proving itself. (Of course, those outside of the Google project are not supposed to know what technology is being used, but history has proven that rumors often are based on core pieces of fact.)
  3. The influence of Brewster Kahle. Brewster Kahle hit me like a "ton of bricks" when he did his presentation at the Library of Congress in December 2004 (part of the LOC's Series on Digital Future). I watched that presentation three times and each time heard/saw something new. Then I began looking at how Kahle was connected to the things that were happening. He is both out in front leading the charge and behind the scenes ensuring that things happen. He is very much worth keeping on your radar.
  4. The things that will cause an organization not to begin a digitization project have not changed. The stoppers remain money, staff, time, equipment, training, shifting priorities... Although money can fix the other problems, getting someone to fund a project to convert materials into electronic form isn't always an easy sell. The funder needs to understand the benefits and some just don't get it.
What also stands out to me is that there is always more to write about with regard to digitizing materials and it is never boring!

Thursday, September 15, 2005

How do I keep up?

A student who interviewed me this morning for one of her assignments asked me this. We spent two hours together talking about digitizing materials in libraries (and other cultural heritage organizations), corporations and government entities. At the end of the two hours, she asked how I kept up with all of the new information, technologies, etc. related to creating, managing and preserving digital assets.

The answer...
  • I follow around 70 blogs, some of which are related to digitization, digital libraries and copyright. I told her that we all follow and find different pieces of information, so I learn by reading about what has piqued the interest of others. I pointed her towards my Bloglines blogroll to see what blogs and RSS feeds I'm tracking.
  • I belong to a host of discussion lists which I skim for important tidbits. (BTW with all the stuff that crosses my virtual desk each day, the ability to skim well is very important.)
  • I talk to people who are doing different things than me (vendors, project managers, etc.). Just hearing the different points of view is helpful.
  • I research specific areas -- digging deeper into them -- when I'm working on projects or answering a question that someone has asked me.
What I've learned over the years, and in various situations, is that having all the details at hand isn't always necessary. I need to know an awful lot, but sometimes I just need to know where to find the specifics. For example, what are the various TIFF formats and how are they different? I don't know off the top of my head, but I know where to find the information.

So once I find something important, how do I keep track of it?

Keeping track of information in my blog has been helpful and I find myself searching the contents on occasion to find something that I know I've written about. (You can search the blog by going to the blog's homepage and using the search feature at the top of the page.) Some details/resources/ideas have been written into planning documents and so I find myself going back and re-looking at those. Some are noted on a page on my web site that is no longer maintained (Resources for Learning More About Creating and Managing Digital Images).

Keeping up isn't always easy. Sometimes it seems like so much information and so little time. Perhaps it is the challenge that keeps me going.

Metasearch / Federated search resources

The Metasearch Infrastructure Project at the California Digital Library, University of California has a list of resources on its web site including "A Checklist of Considerations for Selecting Metasearch Software." Definitely worth bookmarking if you will be looking at federated search/metasearch systems.

Thanks to The Ten Thousand Year Blog for pointing this out.

Metasearch and metadata

Thom Hickey in his blog Outgoing has a posting entitled "Metasearch and metadata" that talks about the role of controlled vocabularies. Hickey is a chief scientist at OCLC and quite interested in metadata. His opinions on controlled vocabulary are worth reading and include the importance of speed. Hickey has promised to follow up later with more on their experiments in avoiding/enhancing fielded searching.

Wednesday, September 14, 2005

What people are teaching in digitization/digital library classes

I've been asked to post the information that people have sent to me.

The Rogue Librarian, who teaches at the Pratt Institute in NYC, posted her syllabus in her blog in August, looking for feedback. So you can go there and see her syllabus for "Projects in Digital Archives." She's using the book The Social Life of Information which is partially available for free online as well as other readings.

The final project is quite interesting:
Each student will participate in a real-world digital preservation project. This is a group project. The scope of this project will be ambitious; we will be using NYPL's LOCKSS cache to preserve a small number of blogs. Students will be creating a collection development policy, working with publishers to obtain permission to preserve their blogs, coordinating with members of the LOCKSS team at Stanford, and creating a policy for the ongoing preservation and maintenance of this collection. Students will leave this course ready to participate in the digital library and digital preservation community at a sophisticated level.
Thanks to the person who e-mailed me about this syllabus. And since she had already posted it publicly, I'm glad to be able to point others to it.

There's another class being given this fall entitled "Developing Digital Collections." I'm not sure if the instructor would want the syllabus announced to the world, so I'll not do that. Looking at the web site, two things stand out to me about this class:
  1. The class covers planning, executing, and managing projects. I like the inclusion of the word "planning." That really gets it out in the open that one must plan a digitization project.
  2. The instructor has built an extensive web site for the class. This class is taught on campus, but using the Internet in this way allows for a continuous flow of information. Undoubtedly, the instructor put a lot of work into the site.
My syllabus from the spring 2005 semester is available here (MS Word format). The class is taught online using WebCT and is called "Creating, Managing and Preserving Digital Assets." Since it is taught online, I've had students in this class from across the U.S. as well as from other countries.

The first and fourth assignments are connected, and give the students a chance to think about a specific collection. The third assignment is a vendor interview assignment and teaches the students not only what digitization vendors do, but also how to find them. (Interestingly, some vendors refuse to talk to the students even though many of the students are working in libraries AND will be working on digitization projects in the future. In other words, they could be the vendor's customers.)

The second assignment deals with copyright. If you look at the syllabus, you'll be able to imagine all of the various answers I get. I end up writing a memo to the students telling them what I would do and why, etc., and giving it to them with their grades. BTW there's much to think about with the assignment, but sometimes the students make it harder than it is.

If other people have suggestions, etc., please e-mail them to me or post a comment. The sharing of these ideas related to teaching about digital assets is very important.

What should be taught? (Keep the ideas coming)

Two days ago, I asked for input on restructuring a class on creating, managing and preserving digital assets. The e-mails have been wonderful, including pointers to syllabi from others teaching on the same topic. Please keep the e-mails and comments coming.

The two questions were:
  • If you were TAKING a 15-week course on digitization and you knew nothing in the beginning, what would you want to learn? (Yes, I know that's a broad question...)
  • If you were HIRING someone who had taken a semester-long course in digitization, what would you expect the person to know?

Event: ASIS&T 2005 Annual Meeting, Oct. 29 - Nov. 2

The ASIS&T Annual Meeting includes some topics related to digitization and digital libraries. For example:
  • Greenstone in Practice: Implementations of an Open Source Digital Library System
  • Implementing and Evaluating DLs
  • Building Digital Library Collections with Open Source Software
Go to the conference web site for details and to view the full program.

Tuesday, September 13, 2005

Employment prospects

The information field (library science and its related avenues) is supposedly hot. We keep saying that we need more librarians (information professionals), but graduates have a hard time finding jobs. Meredith talks about her job search and gives some statistics in her blog. What she found, looking at ALA Placement Center Statistics, is that there are more job applicants than jobs. Now this is limited data, but I've heard the same thing from my students. Finding a job with appropriate duties, with a good salary, and in the geographic area that you want to be in can be very difficult (if not impossible). Staying in your geographic region might mean working for a corporation instead of a library, being a consultant instead of a full-time employee, or working virtually.

Although people with skills related to digitization can take their careers in many directions, there is still a need to be flexible. There are library-related projects/programs as well as projects with other cultural heritage organizations, corporations and government agencies. One might work for an institution or for a vendor, and have a very fulfilling career. There are roles for consultants and teachers. The sky may not be the limit, but there are many possibilities. Is there competition? Yes, and more coming each day. But there are also more projects being launched, more products needing development teams, and more students looking for teachers (professors) with these skills.

There are possibilities...and that's all we can hope for.

Monday, September 12, 2005

Event: 5th International Web Archiving Workshop

This workshop is September 22 - 23, 2005 in Vienna, Austria. It is being held in conjunction with the 8th European Conference on Research and Advanced Technologies for Digital Libraries. Visit the web site for full details.

What should be taught? (I need your ideas)

I'm asking for your help. I'll be teaching a graduate class in creating, managing and preserving digital assets during the spring 2006 semester. I wrote the original syllabus for this at Syracuse University and have taught the class twice since then. When I teach the class in the spring, I want to have a revamped syllabus.

I really like to have the students focus on real-world problems and so I have had them look at collections that might be digitized and interview vendors, as well as learn about all of the phases of a digitization project. There has been lots of reading and online discussions (the course is taught online via WebCT). Eyes are opened when students realize that it's not all technology. Eyes open even wider when they realize ALL of the things that must be considered.

So I have two questions:
  • If you were TAKING a 15-week course on digitization and you knew nothing in the beginning, what would you want to learn? (Yes, I know that's a broad question...)
  • If you were HIRING someone who had taken a semester-long course in digitization, what would you expect the person to know?
Please leave a comment to this post to give me your thoughts. What you say will help to mold the revamped class AND mold those who will become your colleagues.

And...thanks in advance for your input!

Event: Digital Preservation Training Programme

From the Digital Preservation discussion list.

Booking is now open for the pilot Digital Preservation Training Programme.
This is a one week residential course held at the University of Warwick.

DPTP is a project funded by JISC under its Digital Preservation and Asset Management programme, or JISC 4/04 as it is more commonly known. The project is led by ULCC, working with its partners the Digital Preservation Coalition, Cornell University and the British Library.

The project's aim is to develop a modular training programme with content aimed at multiple levels of attendee. It builds on the excellent foundations of Cornell's Digital Preservation Management Workshop, funded by the National Endowment for the Humanities.

The pilot is being developed by the project partners along with King's College London Digital Consultancy Services and ADS/AHDS Archaeology.

Subjects are likely to include:

* Planning and strategy
* Five organisational stages of digital preservation
* OAIS: initiatives and tools
* Living with obsolescence
* Mass storage
* Preservation approaches
* Metadata: overview and case study
* Web archiving
* Access
* Costs and risk management
* Legal considerations
* Outsourcing
* Certification
* Where do you go from here?

Places are strictly limited and will be assigned once all applications have been received. Due to the nature of DPTP's funding, priority will be given to those from the UK HE and FE communities. Booking will close at 5pm on Friday September 16th.

You will be advised whether your booking was successful by email after September 16th.

To book, please go to

If you have any queries regarding the course, please email

Can you outsource an entire digitization project?

Outsourcing is seen as a way of getting a project done without impacting the institution's staff. Often the creation of the digital files is outsourced, but creating the metadata, setting up the digital asset management software and creating the web site can also be outsourced.

However, there are three things that cannot be outsourced.
  1. Someone internal to the organization will need to manage the project. This means talking to the vendors, ensuring that all is on schedule, looking over shoulders to see that things are done appropriately, and giving feedback. The vendors will assume that this person is available to answer questions and help them keep things on track.
  2. Staff will be needed to select the materials to be digitized and to pull them from the collection. Could you hire someone to do this for you? Yes. Would you be happy with the results? Likely not.
  3. "Designing" the content on the site (or system). Let's think in terms of a web site that contains digitized materials and text. What should it look like? How is it going to be used? What will it contain? Again, this could be outsourced to some extent, but would you want to trust this to someone who isn't associated with your institution?
In order to allocate someone (or a team) to do these tasks, an institution needs to look at its staff's workload. An institution might find that it can shift the workload in order to create time for people to do these tasks. If that is not the case, then the institution will need to hire staff so that someone can be freed up to do these tasks. If the project is being grant funded, request money for a project manager who then becomes responsible for the three tasks above.

If you find that the only way to do the project is to outsource everything and leave all decisions in the hands of contractors, ask yourself one question -- will you be happy with the result?

NARA and electronic archives

Someone from Tessella e-mailed me to point out that they were part of a big contract to create an electronic records archive system for the National Archives and Records Administration. I had seen the announcement and didn't mention it (prior to this) for one reason: this work isn't going to have an immediate impact on us. The initial system will be rolled out in 2007 and it should be fully operational by 2011. Now if knowledge, protocols, software, etc., "fall" out of this effort for the rest of us to pick up on, I'll be thrilled. (Hopefully the government will see the wisdom in sharing what it is doing or at least what is being learned.)

Now Harris Corp. competed with Lockheed for this project, so perhaps it might be Harris that will use what it learned to help organizations that don't have millions to spend. Who knows. As the saying goes, I'm not holding my breath.

BTW the news article states that "NARA estimates that [government] agencies use more than 4,800 different types of formats." Wow! That is both a nightmare and an opportunity for those managing the information.

Friday, September 09, 2005

Streetprint Engine 3.0

Here is another open source software solution "for showcasing, teaching, and archiving popular print and countless other kinds of collections and artifacts." Streetprint Engine was developed at the University of Alberta to solve a specific problem. Now the software and documentation are available to anyone. The web site links to some collections that use it and contains a forum where people discuss the software.

digitizationblog did a review of Streetprint Engine back in August and said that the software was easy to use, but had some rough spots. If you're looking for software to "power" a digital collection and you're agreeable to using open source software, be sure to give this one a look.

Article: Cultural Suicide via Digitalization (tells us what we don't know)

As digitizationblog wrote, this article is predictable. However, it is a "story" that must be retold over and over again. Although digitization increases access, paper is a better medium for long-term storage. Paper doesn't require special equipment to read. Even microforms can be read with simple (non-electronic) magnifiers.

If we want to use CDs and servers, etc., to store precious materials (especially those where there is no hardcopy equivalent), then we need to solve the problems that cause those media to fail and lose information.

BTW there is a wonderful episode of Stargate SG-1 in which the SG-1 team comes across a planet where everyone is connected into the computer network. They trust the network and don't realize that it keeps rewriting history. However, a search of the old paper documents shows the truth. Here the paper had lasted and had not been corrupted (modified).

Thursday, September 08, 2005

Report: Kids & Teens: Blurring the Line between Online and Offline

eMarketer has published a report entitled "Kids & Teens: Blurring the Line between Online and Offline." This article gives an overview of the entire report. (The report is available for purchase.) The article notes:
The way today's teens use the Internet and digital devices such as mobile phones provides a view into the way tomorrow's adults will interact with media and technology.
As librarians and information specialists, we need to pay attention to what this means to us now and in the future. We need to ensure that we're building services that today's teens will want to use in the years to come.

The thud factor & digital libraries

Paul Myers once wrote an article on the thud factor. The thud factor can be described as the weight of a book when dropped on a table. A heavy book -- which we'll assume has a lot of good content -- goes "thud." Myers goes on to equate the thud factor with value (price). [Now we know that something that is lightweight or small can be extremely valuable, but set that aside for the moment and stay with me on this.] He believes that thud can be created through the use of add-ons, for example, a book that also contains a CD. People see the add-ons -- judge the weight/thud -- and see real value.

And what does this have to do with digitization or a digital library?

When people see or use your project, can they judge its thud factor? Does it "feel" like a voluminous hardcopy book or seem more like a lightweight paperback? Does it have add-ons -- explanatory text, timelines, lesson plans, glossaries, etc. -- that contribute to its weight and value? If your project doesn't have a good thud factor, can you work on creating it?

You may be surprised at how easily you can increase your thud factor. In fact, you might do it by calling better attention to the additional pieces (add-ons) that already exist, but are not in plain sight. I found recently that I was able to create thud by breaking something (a workbook) into pieces, which also made it easier to use and more logical. It just took a little thinking and a little editing.

Wednesday, September 07, 2005

"Saving" collections by digitizing them (disaster recovery)

Digitizing materials does not automatically preserve them, although they may be handled less. It does, however, preserve the information that the materials contained. For example, if you digitize a book, you have the information it contained in an electronic format. If you then lose the book, you can still refer to the electronic version.

We know that Katrina and her wake affected many collections. Some are likely totally destroyed while others -- we hope -- can be restored to some semblance of their former glory. If the materials were digitized, then there is hope that those images might be retrieved and serve as important surrogates for items that have been lost or damaged. Unfortunately, often those digitized materials are on servers in the same area as the original items. Katrina and other disasters demonstrate that there is a need to keep the digital versions (or one copy of the digital versions) in another region. By keeping the two in different geographic regions, one version has a higher chance of being spared by a disaster.

Now is the time to step back -- pause -- and think about your digitized materials, their storage and quality.
  • Have you stored a version off-site?
  • Is a version someplace that is fire-proof and flood-proof?
  • If your servers fail, can you bring a version of your collection online from another site?
  • If you lose the original items, are your digitized materials of high enough quality to serve as good surrogates?
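For those who keep an off-site version, one way to check that it still matches the local one is to compare checksums of the files in both places. What follows is only a minimal sketch in Python, not a description of any particular repository's practice: the function names are my own, the two folder paths in the usage note are hypothetical, and it assumes both copies can be reached as ordinary folders from one machine.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_copies(local_root, offsite_root):
    """Return (missing, changed): files absent from the off-site copy,
    and files present in both copies whose contents differ."""
    local = checksum_manifest(local_root)
    offsite = checksum_manifest(offsite_root)
    missing = sorted(set(local) - set(offsite))
    changed = sorted(
        name for name in local.keys() & offsite.keys()
        if local[name] != offsite[name]
    )
    return missing, changed
```

A call such as `compare_copies("/collections/master", "/mnt/offsite/master")` (both paths made up for illustration) returns two lists: files that never reached the off-site copy, and files that exist in both places but no longer match. Empty lists suggest the two copies agree; they say nothing about whether the scans are of high enough quality to serve as surrogates.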

Tuesday, September 06, 2005

Blogs and Information Community Respond to Hurricane Katrina

Information Today has published an article on its web site listing sites/sources within the information community for staying up-to-date on what's going on with Katrina, including the responses/actions of SLA and ALA. Information Today has promised to add to the list as it learns more.

Monday, September 05, 2005

Gas Prices and Libraries

Library Web Chic has a good posting on gas prices and libraries in response to mine of last week. I seem to have struck a chord...

Article: Automatic Indexing

Marjorie Hlava has an article in the August 2005 issue of the Special Libraries Association's Information Outlook entitled "Automatic Indexing: Comparing Rule-based and Statistics-based Indexing Systems." Although this may not be pertinent to a digital imaging project, your colleagues may think that you know about automatic indexing just "because." This article compares these two types of indexing systems AND gives some cost information (something that can be very useful and not always readily available).

If you don't receive Information Outlook, you might find the article through one of the online databases. Also many libraries do receive Information Outlook for their librarians, so you might check to see if the issue is available there (perhaps not in the catalogue but in their lounge).

Sunday, September 04, 2005

Opinion: Katrina - The world is watching

In 1968, demonstrators outside the Democratic National Convention yelled, “The whole world’s watching, the whole world’s watching…”[1] As in other points in our history, the entire world is focused on the United States and how it is treating its own citizens. We have boasted in the past about how quickly we (the U.S.) have responded to disasters around the world, but last week we failed to get help quickly enough to our own people – on our own soil – in the aftermath of Katrina.

Didn’t the government know that something like this could happen in New Orleans? Yes. They had done simulations and everything that occurred was known ahead of time. Don’t we have resources to respond to such disasters? Yes, although some had already been diverted to other governmental priorities. The result? Many more people died than should have – poor people who did not have the means to evacuate, the infirm, those who stayed to protect their property, and the stubborn. It is the poor, though, that have suffered the most and likely make up a high proportion of those who died.

Like others, I want to thank our news media for asking the questions that we are all asking and keeping a light on the problems that made this disaster worse. I also want to thank Lourdes Muñoz Santamaria, a member of Spain’s parliament, who was stranded with her family at the New Orleans Convention Center. Undoubtedly her phone calls back to Spain, and then Spain’s phone calls to the U.S. government, helped to get things moving. (“Two thumbs up” for bringing international pressure on the U.S.)

Like many others, I’ll contribute to the relief efforts. I know, though, that relief for these people will not be something that will happen for a short period of time, but that it will take years for them to get their lives back in order. (Having lived through hurricane/tropical storm Agnes, I know firsthand of this. The sadness and memories of the event remain with you forever. And this one is so much worse…)

There is a New Orleans tradition of the jazz funeral[2]. Musicians march along with the casket to the cemetery and play music to comfort the mourners. Let’s take time to play some jazz, mourn what has happened in the Gulf Coast, and find comfort in the music. And let us reflect on the changes that need to occur in our government and society to ensure that this never happens again.

[1] You can hear a bit of this on the Chicago Transit Authority album released in 1969 in “Prologue”.
[2] You can listen to some New Orleans jazz funeral music on this page.

Friday, September 02, 2005

The impact of gas prices on your library, part 2

catablog picked up my posting and added to it (and tells a funny story about her mother's library). I left a comment there that included the following:
Now I can afford to pay for gas, but what about those who can't? So perhaps libraries need to look at how to provide services more cost effectively to their patrons whom this will affect the most? And maybe it isn't online services, but going back to bookmobiles (or high-tech bookmobiles with Internet access) or something else.

Since this problem (gas $) isn't going away overnight, we should look at the options and adopt strategies that will truly benefit our users.

Checklist for the Certification of Trusted Digital Repositories

This was posted on the DIGITAL-PRESERVATION discussion list.

Checklist for the Certification of Trusted Digital Repositories Completed and Released

This announcement is also being made today, August 31, 2005, on our Web site and to a variety of other lists.

RLG has just released a draft report for the certification of digital repositories. The draft, titled "An Audit Checklist for the Certification of Trusted Digital Repositories," is available at It is the product of a task force working on a joint project between RLG and the National Archives and Records Administration (NARA).

The goal of the RLG-NARA Digital Repository Certification project has been to identify the criteria repositories must meet for reliably storing, migrating, and providing access to digital collections. The "Audit Checklist" identifies procedures for certifying digital repositories. Leveraging the RLG-NARA checklist, the Center for Research Libraries (CRL) Audit and Certification of Digital Archives project will test audit the Koninklijke Bibliotheek (National Library of the Netherlands), which maintains the digital archive for Elsevier Science Direct Journals, the Inter-University Consortium for Political and Social Research (ICPSR), and Portico, an archive for electronic journals incubated within Ithaka Harbors, Inc. Stanford's LOCKSS system will also participate in this effort.

Robin Dale, manager of both projects, says: "We look forward to receiving comments on the draft and to hearing the response from the community." Comments on the draft are due before mid-January 2006 to (+1-650-691-2238).

For more about the RLG-NARA task force, see

Thursday, September 01, 2005

Digital libraries are...

"Digital libraries are more about integration than systems development"

That quote comes from a presentation done in 2000 on the Harvard University Digital Library Initiative. Most of the presentation is on developing the technical infrastructure. If your work involves building a digital library, you may want to look at this presentation. I found some useful tidbits/thoughts.

It means? Definitions from the Digital Preservation Coalition

The Digital Preservation Coalition has a page of useful definitions of words associated with materials that are born digital. This is a good page to bookmark for later use, so you don't have to recreate a definition that already exists.

Katrina's impact on libraries in the Gulf Coast region

Two other bloggers have captured pointers to information on the libraries affected by Katrina. Follow the links to read their posts and to find additional resources.
There is also a Katrina blog at

The impact of gas prices on your digital library

Energy prices were already rising due to increased demand from China and other areas that are using more oil. Katrina has made matters worse. One impact that will occur, of course, is that people will drive less. Driving will be what we do when we really need something. We will no longer drive just to drive.

With this shift in gas prices and people changing their driving patterns, I think that digital libraries will be used more as well as the online "ask a librarian" services (and maybe even telephone reference services). Why drive to look something up when you can do it from home (or work)?

Now here are the key questions:
  • Do your users know how to use your services online?
  • Do they know how much they can access without physically coming to the library? (Names of resources, types of content, subject coverage...)
  • Have you reminded them recently so it really stands out to them?
If you can e-mail your patrons, do it and remind them of your online services. Tell them how they can use your services without driving to see you.

Post notes in your library's blog, newsletter, homepage, etc.

Try to get your local media to publish an article (or more) about what's available from the library without driving to the library.

If you deal with a specific user group -- corporate employees, college students, whatever -- use the right communication methods to reach them.

Oh, you think they know already? Perhaps...but it never hurts to remind people of our services. And in this instance, we can talk about saving gas (as well as saving travel time) by accessing resources online.