Jennifer Evert spoke about using people to help create a robust -- yet precious -- taxonomy for use in indexing and retrieval of content at LexisNexis. One thing that stood out to me is that the software tools also help the human indexers apply the terms correctly. She spoke of "editorial drift," which occurs when indexers do not apply the terms consistently. Although we don't use that phrase, editorial drift is something digitization projects need to be aware of when creating metadata. Terms must be applied correctly and consistently.
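One rough way to spot editorial drift is to have two indexers tag the same item and compare the terms they chose. The sketch below is my own illustration, not anything described in the talk: the function name `term_overlap` and the sample terms are hypothetical, and the measure used (Jaccard overlap) is simply a common way to compare two sets.

```python
def term_overlap(terms_a, terms_b):
    """Jaccard overlap between two indexers' term sets (1.0 means identical)."""
    a, b = set(terms_a), set(terms_b)
    if not a and not b:
        # Two empty term lists agree trivially.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical terms applied to the same article by two indexers.
indexer_one = ["Taxonomy", "Indexing"]
indexer_two = ["Taxonomy", "Metadata"]
score = term_overlap(indexer_one, indexer_two)
```

A score that keeps falling over time for the same pair of indexers, or for the same kind of content, would be a hint that the terms are no longer being applied consistently.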
Marjorie Hlava (Margie) is the President, Chairman, and founder of Access Innovations, Inc. The MediaSleuth web site says that the company "is a division of NICEM (National Information Center for Educational Media) and was developed in conjunction with Access Innovations in response to market conditions and requests from both sides of the educational and training media community." Part of what Margie talked about was using machine-aided indexers (MAI). Quoting from her slides (in the collected presentations book):
- M.A.I. suggests the correct terms from the taxonomy as descriptors
- M.A.I. rulebase recognizes term equivalents
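To make those two points concrete, here is a minimal sketch of how a rule base might map term equivalents to preferred taxonomy terms and suggest descriptors for a piece of text. This is not how the Access Innovations product works internally (the slides don't say); the `RULEBASE` contents and the `suggest_descriptors` function are hypothetical, made up for illustration.

```python
import re

# Hypothetical rule base: each preferred taxonomy term maps to the
# equivalent words or phrases that should trigger it.
RULEBASE = {
    "Digitization": ["digitization", "digitisation", "scanning"],
    "Metadata": ["metadata", "cataloging data"],
    "Optical character recognition": ["ocr", "optical character recognition"],
}


def suggest_descriptors(text, rulebase=RULEBASE):
    """Return the preferred terms whose equivalents appear in the text."""
    lowered = text.lower()
    suggestions = []
    for preferred, equivalents in rulebase.items():
        for phrase in equivalents:
            # Match whole words or phrases only, so "ocr" does not
            # match inside a word like "mediocre".
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                suggestions.append(preferred)
                break  # one match is enough for this preferred term
    return suggestions


suggested = suggest_descriptors("The vendor offers scanning and OCR services.")
# suggested == ["Digitization", "Optical character recognition"]
```

The indexer would still review the suggestions; the point of the rule base is that "scanning" and "digitisation" both lead to the same descriptor, whoever is doing the indexing.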
When most libraries think of creating metadata, they think of doing it manually. As our need to create metadata increases, we need to look at tools that will help us do it faster and smarter...and help us guard against editorial drift. Tools like those developed by LexisNexis and Access Innovations might be things that we would use.
In talking with Margie after CIL, I learned that Access Innovations does scanning and OCR as a way of helping their clients load content into databases. This is not their main focus, but it reminded me of how many companies have gotten involved in digitization. In this case, it is a way of helping their clients and maintaining their client base.
Technorati tag: CIL2006, taxonomy