The Longest Now


How will YOU use 12M bibliographic records?
Thursday April 26th 2012, 7:57 am
Filed under: %a la mod,citation needed

Harvard Libraries recently released bibliodata from their collections – 12 million works in all – under a CC0 license, which lets other sites and researchers reuse that data in any way they like.

This is the biggest release of bibliographic data of its kind, four times the size of a similar release by the British Library in late 2010. (Without an explicit release under a free license, such collections of metadata are covered by ‘database rights’.)

How would you reuse these records in your own work and dreams?  Some quick ideas:

  • Wikipedia or Wikisource could create 12 million stubs from those records
  • Open Library will improve and update its own metadata collection, which was built from scraped subsets of such data
  • We can write scripts that autogenerate “lists of works” for authors, and author lists or categories for works
  • We can automatically find mismatches between our person-data and title-data and those in MARC
  • We can publicly clean up mistakes in the MARC catalog and suggest updates
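
To make the script ideas above a little more concrete, here is a rough sketch in Python. It assumes the MARC records have already been parsed into plain dictionaries; the field names (`author`, `title`, `year`), the sample records, and the `local_years` lookup are all made up for illustration and are not taken from the Harvard release. It groups titles into a "list of works" per author, then flags records whose year disagrees with our own data.

```python
from collections import defaultdict

# Hypothetical, already-parsed records; real MARC data needs a parser first.
records = [
    {"author": "Melville, Herman", "title": "Moby-Dick", "year": 1851},
    {"author": "Melville, Herman", "title": "Typee", "year": 1846},
    {"author": "Austen, Jane", "title": "Emma", "year": 1815},
]


def works_by_author(records):
    """Group titles under each author, ordered by year and then title."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["author"]].append((rec["year"], rec["title"]))
    return {
        author: [title for _, title in sorted(works)]
        for author, works in grouped.items()
    }


def find_mismatches(records, local_years):
    """Return (title, catalog_year, local_year) where the two sources disagree."""
    return [
        (rec["title"], rec["year"], local_years[rec["title"]])
        for rec in records
        if rec["title"] in local_years and local_years[rec["title"]] != rec["year"]
    ]


lists = works_by_author(records)
# local_years stands in for our own title-data (also hypothetical).
mismatches = find_mismatches(records, {"Emma": 1816, "Typee": 1846})
```

A mismatch only says that the two sources disagree, not which one is right, so the output is better treated as a review queue than as a list of automatic corrections.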
