Tuesday September 19, 2006
17:30 "Beyond Digital Incunabula: Modeling the Next Generation of Digital Libraries". Gregory Crane
This was an interesting presentation about the many ways in which digitized books could be put to work: their full text can be analyzed and linked in many ways. A digital book should become much more than a static PDF online; it should participate actively in a network of information.
http://dlib.anu.edu.au/dlib/march06/crane/03crane.html
http://ase.tufts.edu/faculty-guide/faculty.asp?id=gcrane
Separation of Content and Presentation
* extract chunks via XML
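A minimal sketch of what "extract chunks via XML" could look like in practice. It is not from the talk: the `<book>`/`<chunk>` markup, the sample text, and the function names are all assumptions made up for illustration. The point is only that once the content is held as structured chunks, the same chunks can be rendered in more than one presentation.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical markup: a "book" whose content is split into addressable chunks.
SAMPLE = (
    '<book id="demo">'
    '<chunk id="c1">First passage.</chunk>'
    '<chunk id="c2">Second passage, 5 &lt; 6.</chunk>'
    '</book>'
)

def extract_chunks(xml_text):
    """Return (chunk id, text) pairs; this is the content with no presentation."""
    root = ET.fromstring(xml_text)
    return [(c.get("id"), (c.text or "").strip()) for c in root.iter("chunk")]

def as_plain(chunks):
    """One presentation: plain text, chunks separated by blank lines."""
    return "\n\n".join(text for _, text in chunks)

def as_html(chunks):
    """Another presentation of the same content: one HTML paragraph per chunk."""
    return "\n".join(
        f'<p id="{escape(cid)}">{escape(text)}</p>' for cid, text in chunks
    )

if __name__ == "__main__":
    chunks = extract_chunks(SAMPLE)
    print(as_plain(chunks))
    print(as_html(chunks))
```

Both renderers read the same list of chunks, so adding a third presentation would not require touching the extraction step.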
Recombinant Data
* Disassemble documents into pieces
* Recombine them on the fly
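The two steps above can be sketched roughly as follows. This is an illustration under assumptions, not the speaker's system: the two tiny "books", the `topic` attribute, and the helper names are hypothetical. Documents are disassembled into chunks, and a new view is assembled on request from pieces of several books.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two small books with topic-tagged chunks.
BOOK_A = (
    '<book id="a">'
    '<chunk id="a1" topic="athens">Athens was a city-state.</chunk>'
    '<chunk id="a2" topic="sparta">Sparta trained soldiers.</chunk>'
    '</book>'
)
BOOK_B = (
    '<book id="b">'
    '<chunk id="b1" topic="athens">Athens had a large fleet.</chunk>'
    '<chunk id="b2" topic="persia">Persia was an empire.</chunk>'
    '</book>'
)

def disassemble(xml_text):
    """Break one document into independent pieces that remember their source."""
    root = ET.fromstring(xml_text)
    return [
        {
            "book": root.get("id"),
            "id": c.get("id"),
            "topic": c.get("topic"),
            "text": (c.text or "").strip(),
        }
        for c in root.iter("chunk")
    ]

def recombine(chunks, topic):
    """Build a new view on the fly: every piece about one topic, from any book."""
    return [c for c in chunks if c["topic"] == topic]

def render(chunks):
    """Show each piece with a pointer back to the book it came from."""
    return "\n".join(f'[{c["book"]}/{c["id"]}] {c["text"]}' for c in chunks)

if __name__ == "__main__":
    pool = disassemble(BOOK_A) + disassemble(BOOK_B)
    print(render(recombine(pool, "athens")))
```

Keeping the source book and chunk id on every piece is what lets a recombined view link back to the original documents.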
Dynamic Data
Books Talking to Each Other
Hybrid entity
Human/Machine/Services
Automatic Processes
e.g. Named Entity Analysis - figure out context of references to "Washington"
Lexical Analysis
* analyzing the mapping between a text in one language and its translations
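To make the "Washington" example under Named Entity Analysis concrete, here is a toy sketch of context-based disambiguation. It is a deliberately simple keyword-overlap heuristic, not the method described in the talk; the candidate senses and cue words are invented for illustration, and a real system would use far richer evidence.

```python
import re

# Hypothetical candidate senses for "Washington", each with a few cue words.
SENSES = {
    "George Washington (person)": {"president", "general", "army", "born"},
    "Washington, D.C. (city)": {"capital", "congress", "city", "district"},
    "Washington (state)": {"seattle", "olympia", "pacific", "state"},
}

def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose cue words overlap most with the sentence.

    Returns None when no cue word appears, i.e. the context gives no evidence.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    scores = {sense: len(words & cues) for sense, cues in senses.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

if __name__ == "__main__":
    for s in (
        "General Washington led the army across the Delaware.",
        "Congress met in Washington, the capital.",
        "He travelled to Washington.",
    ):
        print(s, "->", disambiguate(s))
```

The last sentence has no cue words, so the sketch declines to guess rather than defaulting to one sense.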
New User Interactions
* Readers talk, books listen
* Personalization
Million Book Libraries
* Google Books
* Open Content Alliance
* i2010 - in planning
Compared to curated collections
* 10 times bigger
* 10 times more noise
* etc.
Technologies and Domains
* Three core technologies
- page image to text
- text to data
- one language to another
Million Books Workshop (to be announced)
Boston USA
May 22-24, 2007