Tuesday, June 25, 2013

Partnerships & Preservation: Go "Behind the Screens" with Gale (video)

Gale has created a series of videos about its digitization efforts. This particular video provides a nice peek at historic documents, document repairs being done, and actual digitization.  While this video below (5 min. 30 sec.) may not be of interest to you, there may be someone in your organization who would benefit from watching this, as a way of giving them a quick overview of what digitization is and its benefits.

Wednesday, June 12, 2013

#SLA2013 - Dennie Heye, Why come to SLA?

Dennie Heye, a member of SLA Europe Chapter, was named an SLA Fellow this year.  He's an active member, who has been able to attend the conference, although not yearly.  I asked him about why attending the conference is important and this is his answer.

Monday, June 10, 2013

#SLA2013 - The Digital Preservation Network, James Hilton

He is the chief evangelist for the DPN, which is indeed a large network of institutions and people.
"Everything that ever mattered in the world was more complicated" than was stipulated.

Problem: the scholarship that is being produced today is at serious risk of being lost forever to future generations. True for traditional and emerging scholarship.  Less than 50-50 chance that our intellectual children will have access to our work.  With regard to data, this is also true.  The volume of data is increasing and so are the preservation requirements.
  • The LHC (CERN's super collider) produces 3 million DVDs' worth of data every six months.
  • Much of the coming data tsunami will be discarded.
  • Some data can NEVER be thrown away. Some data is a glimpse in time.
Only universities - in partnership with others -  are positioned to solve the data problem for the long haul. Why?  They've been around for a long time, hence, likely will continue to be around for a long time.  Yes, you can do partnerships with for-profit organizations, but they want to make money, rather than lose money on a scholarly endeavor.

There are currently lots of digital collections with a smattering of aggregation... e.g., HathiTrust. Most of the emphasis is on current access, with little more than a promissory nod to preservation.  All are susceptible to multiple single points of failure.  Single points of failure come in multiple forms - tech, physical, political, failure of organizational will, etc.  We need to consider not decade-long access/preservation, but century-long access/preservation. [The fourth sentence in this paragraph used the word "value" instead of "failure", due to an iPad that likes to autocorrect.  The word should indeed be "failure".  Thank you, Andrew, for pointing out the flaw.]

We've seen this picture before in networking and it's why we built Internet2.  Internet2 is our high-speed backbone. Internet2 allows us to continue to compete in science.  Internet2 was built to scale and evolve.

  • Lots of one-off solutions
  • Emerging aggregations
  • Multiple single points of failure
  • Many layers to the problems
  • Huge cost advantages  accrue to scaled solutions
  • Commercial solutions come with their own problems
  • Waiting only makes the problem harder and more expensive to solve
DPN is a lever to force the debate on, and the adoption of, standards.  We have momentum to put behind the standards debate.

DPN - eliminate single points of failure by building in replication diversity starting at archive layer.
  • Light archives are better at preservation than dark archives.  However, some things cannot be made "light" (accessible).
  • DPN is a dark archive.
Create a sustainable framework at scale that evolves and adapts to new preservation challenges and allows movement up the preservation stack.

Start with well-understood objects and leverage current efforts.  Evolve and adapt to new forms of scholarship and data, adjusting replicating node architecture if/as needed.
DPN is an ecosystem, not a software project.  Designed to evolve to address new forms of scholarship, changing formats and the evolution of software and tech platforms. DPN is a federation, not a monopoly. The DPN federation will:
  • Audit and verify to ensure succession
  • Provide grant-based/contract funding to the replicating nodes in a manner that ensures functional independence.
  • Provide a legal framework for holding succession rights
  • Provide a structure for aligning and leveraging  preservation activities/investment.
Currently 56 AAU-like institutions.
Can we afford DPN? Membership is $20000/year/institution.
  • At $15 million for the first year at full capacity, that is 0.0005% of the research expenditures of R1 institutions. (I hope that is an accurate paraphrase.)
  • Grant funding
  • Charge for some services
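
The affordability figures above can be worked through with a little arithmetic. This is only a sketch using the numbers in these notes; it assumes that all 56 institutions pay the $20,000 fee and that the $15 million is an annual cost. The function names are made up for illustration. It does not verify the quoted percentage; it only shows what that percentage would imply.

```python
from fractions import Fraction

# Figures taken from the session notes.
MEMBERSHIP_FEE_USD = 20_000          # per institution, per year
MEMBER_INSTITUTIONS = 56             # assumed to all be paying members
FIRST_YEAR_COST_USD = 15_000_000     # first year at full capacity
QUOTED_SHARE_PERCENT = Fraction("0.0005")  # share of R1 research spending, as quoted


def annual_membership_revenue(fee_usd, members):
    """Revenue if every member pays the full fee."""
    return fee_usd * members


def implied_research_base(cost_usd, share_percent):
    """Total research spending that the quoted percentage would imply."""
    return cost_usd / (share_percent / 100)


revenue = annual_membership_revenue(MEMBERSHIP_FEE_USD, MEMBER_INSTITUTIONS)
shortfall = FIRST_YEAR_COST_USD - revenue
base = implied_research_base(FIRST_YEAR_COST_USD, QUOTED_SHARE_PERCENT)

print(f"Membership revenue: ${revenue:,}")
print(f"Left to cover from grants and services: ${shortfall:,}")
print(f"Research base implied by the quoted share: ${int(base):,}")
```

Under these assumptions, membership fees alone would cover only a small part of the first-year cost, which is presumably why grant funding and charging for services are also on the list.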
The emerging digital stack
  • Access oriented Repositories
  • Preservation oriented repositories
  • DPN backbone
  • Code
Activities related to the stack - Internet, CLOCKSS, Portico, Meta

DPN benefits:
  • Preserve scholarship
  • Ensure continued access
  • Evolve to include new forms of scholarship and data as they emerge
  • Rationalize our collective investment in preservation efforts and leverage diverse funding sources
  • Create a framework against which the academy can retool publication workflows for a digital world
  • Provide a way of planning campus-based cyber infrastructure so that it efficiently feeds preservation efforts
Interesting thought - these days, peer review happens after the fact with "likes", etc.

{Syracuse Univ. is a charter member!}
Inaugural board will be in place fall 2013. To include 3 librarians.
Bylaws and elections 2014.
Have a technology working group.

Where are we today?
DPN members connect to a DPN node.  Deposits are replicated on three other nodes. What continuous auditing needs to occur in order to ensure integrity?
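
The talk did not describe how DPN audits its replicas, so the following is only a sketch of one common approach to the integrity question: record a checksum at deposit time and periodically recompute it on every node that holds a copy. The node names and function names are hypothetical, and SHA-256 is my assumption, not something the speaker specified.

```python
import hashlib
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of the content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_replicas(expected_digest: str, replicas: Dict[str, bytes]) -> Dict[str, bool]:
    """Recompute each replica's digest and compare it with the one recorded at deposit."""
    return {
        node: sha256_hex(content) == expected_digest
        for node, content in replicas.items()
    }


def nodes_needing_repair(report: Dict[str, bool]) -> List[str]:
    """List the nodes whose copy no longer matches the recorded digest."""
    return sorted(node for node, ok in report.items() if not ok)


# Hypothetical deposit: one ingest node plus three replicating nodes.
original = b"scholarly object, version 1"
recorded_digest = sha256_hex(original)

copies = {
    "ingest-node": original,
    "replica-1": original,
    "replica-2": b"scholarly object, version 1 (corrupted)",
    "replica-3": original,
}

report = audit_replicas(recorded_digest, copies)
print(nodes_needing_repair(report))
```

A damaged copy can then be restored from any node that still passes the check, which is the practical payoff of keeping several independent replicas.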

Have a succession rights working group.  {Peter Hirtle is a member of that group.}
Also a business model working group.  What is the cost to preserve for 50 years?

  • The perfect
  • The study
  • Inertia 

#SLA2013 - Sam Wiggins - Why come to SLA?

Sam Wiggins (@LibWig) attended the SLA Annual Conference in Philadelphia two years ago as one of the winners of the Europe Early Career Conference Award. He is back this year for this conference. Why come to SLA? Here is Sam's answer (1 min. 15 sec.).

P.S. - Congrats to Sam for getting a mention in the San Diego Union-Tribune on National Doughnut Day!

#SLA2013 - Maya Kucij - Why attend SLA?

Maya is the president-elect for SLA's Education Division and the division's conference planner for 2014 in Vancouver, BC. Why should someone attend the conference?  Here's her answer (1 min.)

#SLA2013 - Big Data. Big Challenges.

This is an SLA Spotlight session.

Amy Affelt - Compass Lexecon
Know 'em when you see 'em: Big opportunities in big data 

Cool big data applications
* Healthcare
*** Microsoft readmission manager - surfaced some red flags that cause readmission
*** Stanford drug pairings
***** analyze Internet searches for indication of drug interactions
*** Gojo Industries - sensors in hospitals 

* Transportation
*** Street Bump - sends information on possible potholes so that they can be fixed.
*** ODOT - analyze info on cars that are going below the speed limit.  Do those areas need some improvement to make traffic flow better?
*** Xerox ExpressLanes - congestion pricing 

* The Magical World of Disney
*** entertainment - creating a magical experience for guests. The wristbands become the key to everything.  They are selling convenience.  You don't need your purse or credit cards.  All Disney characters can receive info on kids so they can be addressed by name.  Allows Disney to analyze how people move through the park.

* BigML
* Google Fusion Tables

What's in it for me?
* Search to find how your industry is using big data.
* What vexing issues is your organization interested in?  Can you help them address those issues?  Can big data help?
* Embed into IT and big data teams to provide point-of-need research.
* Understand patterns vs. predictions / coincidence vs. causation

Britt Mueller - Qualcomm
How has big data been applied at Qualcomm?
* Too large to parse using traditional tools
* Opportunity to analyze, visualize, cluster and mine for increased understanding and object discovery

Two major opportunities:
* Research 
*** Applicable to new research spaces
*** Create large data sets from multiple sources
*** Use analysis tools to create views into the large data sets
*** Produce new "starting places" for traditional expert research
*** Find what we don't know

* Analyzing usage metrics and user behaviors
*** How our population uses the information and tools we provide
*** Combine demographic information, search behaviors/activity, metrics
*** Serve this information back to the user population

* Excel
* Databases - MS Access, Informatica, Oracle
* Intelligence software - Qlikview, Tableau
* Custom search discovery - open source tools, Solr, Lucene

* Joining disparate data 
*** Normalizing and mapping data to maximize analysis is hard
*** Information professionals need to think creatively about pulling together disparate data to enhance discovery
*** Joining different types of content and data increases analysis opportunity and effectiveness of discovery
*** Large, mapped data sets become a deliverable in and of themselves
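
To make the "normalizing and mapping" point concrete, here is a minimal sketch of joining two small data sets whose shared key is formatted differently in each source. The field names and sample records are invented for illustration; nothing here reflects Qualcomm's actual tools or data.

```python
from typing import Dict, List


def normalize_key(raw: str) -> str:
    """Lowercase the value and collapse whitespace so variants of one name match."""
    return " ".join(raw.strip().lower().split())


def join_on_normalized_key(
    left: List[Dict[str, str]], right: List[Dict[str, str]], key: str
) -> List[Dict[str, str]]:
    """Inner-join two lists of records on a key after normalizing it in both."""
    index = {normalize_key(row[key]): row for row in right}
    joined = []
    for row in left:
        normalized = normalize_key(row[key])
        match = index.get(normalized)
        if match is not None:
            merged = {**match, **row}
            merged[key] = normalized  # keep one consistent form of the key
            joined.append(merged)
    return joined


# Hypothetical sources: search logs and a staff directory.
search_logs = [
    {"user": "  Jane DOE ", "searches": "42"},
    {"user": "Sam Lee", "searches": "7"},
]
directory = [
    {"user": "jane doe", "department": "Engineering"},
    {"user": "Pat  Kim", "department": "Legal"},
]

for record in join_on_normalized_key(search_logs, directory, "user"):
    print(record)
```

Even in this toy case, the join only works because someone decided what "the same user" means across sources. That mapping decision is where the effort goes.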

* Content provider outputs 
*** This is new for information professionals, but also for information vendors.
***** Challenge in getting vendors to allow data to be pulled out of their system for analysis.
*** Vendors lack consistency, tech support, or licensing models that support creating outputs for further analysis.
***** Content vendors only provide ~60% of search fields as output

If a field is important enough to be searched, it should be important enough to provide as an output.

Wilfred Li - UC San Diego
Research Cyberinfrastructure (RCI) Program, http://rci.ucsd.edu/
Elements of UCSD integrated research cyber infrastructure program
* Data center colocation
* Networking
* Data curation
* Centralized storage
* Research computing
* Technical expertise

Where is the data coming from? Many different places, including audio/video equipment and sensors.
How do people store/backup their data?  Every type of device and service, including Google Drive, Dropbox, USB drives, etc.
* People are using hardware that isn't secure or is difficult to recover from.  Generally the data has no metadata.

How long do you need to store the data? Most say 5+ years, permanently, or for the duration of the project (majority).

Do you need metadata annotation capabilities? 23% said yes.

Risks and challenges:
* Campus may cease funding
* Constantly increasing storage demands
* Bait and switch with increased cost later
* Poor backup plan
* No dedicated support staff 

Top 10 requirements for campus Cyberinfrastructure 
* Better CI with minimal direct cost
* Network attached storage
* Data replication backup
* Dropbox- or Google Drive-like service
* 10G network connection
* Minimal cost beyond hardware cost
* Shared technical expertise
* Distributed multi site replication
* Desktop backup
* Compliant and secure storage for sensitive data
* Tiered storage plans

RCI NAS Data Service

David Minor - UC San Diego
Preservation and curation of University research data: the complexity of big data
* Data curation
* Appraisal
* Accession
* Arrangement
* Description
* Storage
* Preservation
* Access

Two-year pilot process with selected researchers since September 2011
Targeted domains that represented the campus
Required explicit researcher participation

Pilot goals include:
Learn how researchers, information technologists, and librarians work together with data

* The Brain Observatory
* NSF OpenTopography Facility
* Levantine Archaeology Laboratory
* Scripps Institution of Oceanography Geological Collections
* The Laboratory of Computational Astrophysics

Complexity at scale
* Issue: moving from files to "objects"
*** Semantic significance
*** Meaning within context
*** Meaning outside of context
* Issue: representing complex data 
*** Rethink data representation processes
*** Broadening metadata processes to accommodate new data types

Interesting DAMS infrastructure 
Complex research collections will be mixed in with regular digital collections
Cross collection discoverability is key.

Content resides at UCSD.  Metadata is searchable through the Online Archive of California.

Researchers want their content findable, but don't always recognize that they need metadata.
Curation after the fact is expensive.  It needs to be done upfront.
There is no standard definition of a dataset.
Researchers want tools and best practices to help them manage their data.
Need to create scalable systems.

Sunday, June 09, 2013

#SLA2013 - James Manasco - Librarianship is a Higher Calling

James Manasco told me this at the SLA Leadership Summit and I asked him to repeat it, so I could record it.  So here are his thoughts on why librarianship is a noble profession (2 min.).

#SLA2013 - SLA Loyalty Project

Overview - James King
Six chapters have been involved in this since 2011.  Began after James Kane's speech at the 2010 Leadership Summit.
 Background can be found at:
A version of Kane's handout is at:

Loyalty Components:
* Trust
* Belonging
* Purpose

SLA Loyalty web site, http://loyalty.sla.org/

Community Engagement - SoCal Chapter
* How do we broaden our appeal beyond librarians?
* Can the community see themselves in an event that you host for them?
* Contact info and handouts at http://loyalty.sla.org/?page_id=485
* Kane wanted them to build relationships with other organizations.  They found it tough to do.  
* How can they re-energize this project and keep it manageable?

Leadership - Minnesota Chapter
* How do you continue to develop leaders while protecting from burnout?
* Why do leaders drop "out" after their stint?
* Did interviews with past-presidents and a survey with their advisory board.
* Result - announce past presidents, give corsages, host a leadership lunch
* Are restructuring their Board committees.  What is working? Do people want to be advisory board members? Is the three committee structure working?
* Can they provide additional training for leaders?

Member Relations - Rocky Mountain Chapter
* For a better member experience, how do we get to know our members better?
* First started with exploring a relationship management tool to assist with information gathering.
* Used pecha kucha style talks at chapter meetings to introduce the Board.
* Personal outreach to members - with intros and open-ended questions.
* Webinars, master classes, mini conference, happy hours
* They share membership info among board members, which helps to spot errors.
* Each month they send emails to new members, expired members, and expiring members.  
* They have moved from software solutions to glad-handing...building one-on-one relationships.

Remote Member Engagement - Florida-Caribbean Chapter
* How do you encourage engagement when members are spread so far?
* Do have members in Georgia, Mexico and Europe.
* Thought that they had people coordinating regional events, but those people did not know what to do.  Re-divided the chapter into smaller regions (16), but only have three people who are willing to host two events per year at their convenience.
* Can you help chapter members realize where other SLA members are located in the chapter?
* How do you get the culture of a chapter to change, so that it has meaningful engagement with members?

Sponsor Relations - Maryland and DC chapters
* How do we foster vibrant and two-way relationships with current and potential sponsors?
* If DC was doing everything right, why were they losing members?
* Change mindset from 'fundraising' to 'relationship'.
* Created a sponsorship toolkit, http://loyalty.sla.org/?page_id=493
* Have used the toolkit to train their new vendor relationship person.

Maryland laid out clearly what sponsors receive for their dollars.  They seem to have made it easier for sponsors to get involved.  Liked the idea of having organizations sponsor registration fees for student members or underemployed members.  They also provide ways for members to host events and build points towards a year's free membership.

Coming next week - Loyalty Field Guide which will be available on the SLA web site.

New York Chapter member mentions that they also have members/potential members in Europe.
Need to explore virtual relationship building.
Don't think that you're cold calling members.  You do have something in common with them...you are both members of SLA.