2:15
NSDL - National Science Digital Library
Building a Knowledge Base for Science, Math and Engineering Education
Carl Lagoze
Cornell Information Science
Presentation: ppt (5.6M); pdf (6.1M)
this is a big project about improving educational tools for Science, Technology, Engineering and Mathematics (STEM)
evolution of digital libraries
... federation - metasearch
lots of questions remain
but we are moving beyond this
...
We thought the work was getting stuff online: but that's (mostly) done.
The digital library thinking was around a warehouse model...
but a library is not a warehouse.
"The real goal is to re-establish the library as a knowledge environment where people organize around
information, contribute new information, and learn from each other."
Although library information flow was books - to catalog cards - to drawers,
there was a second flow: people in the library discussing.
In digital libraries we automated the part where we captured the info,
but we lost the discussion.
Can we capture and enable that discussion, the social network, within the new repositories?
"creating an integration mechanism for specialized audiences"
Creating a Collaborative Knowledge Network
"a web that sits above the web"
The web was not intended as TV - we work together to create knowledge.
So: what other things are doing this well?
- Amazon
Items are more complicated than just being individual unique "stuff".
Items may be polymorphic. Items may be created by the action of a dynamic service
(e.g. different colours of the same model of fridge: is each one a separate item? No, it's
one object with a colour service you can apply.)
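A minimal sketch of the fridge example, assuming an item is a single object whose variants are produced by named services rather than stored as separate records. The `Item` class, the `colour_service` function and the model name are hypothetical, made up for illustration; none of this comes from the talk or from any real catalogue API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Item:
    """One catalogue object; variants come from services, not extra records."""
    model: str
    services: Dict[str, Callable[["Item", str], str]] = field(default_factory=dict)

    def apply(self, service_name: str, argument: str) -> str:
        # Look up the named service and let it produce a view of this item.
        return self.services[service_name](self, argument)


def colour_service(item: Item, colour: str) -> str:
    # Hypothetical service: describes the same item in the requested colour.
    return f"{item.model} in {colour}"


# One fridge object, two colour views generated on demand.
fridge = Item(model="Fridge X100", services={"colour": colour_service})
print(fridge.apply("colour", "white"))
print(fridge.apply("colour", "steel"))
```

The point of the design is that the catalogue holds one record per model, and each colour is a view computed when asked for.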
Concept of Information Network Overlay
About the NSDL
Phase 1: Metadata-Centric Approach
- massive metadata quality issues
there are broader problems
- access alone does not equate to educational value
Phase 2:
We want to capture CONTEXT.
Components of a new approach
* Representation
[...]
for a resource
- who used it?
- how was it used?
- how was it described and rated?
- how did THEY classify it
- how does it relate to standards
- how has it been aggregated
- what has it been used with
they want to use the information network overlay to represent it
using Fedora as the basis for NSDL Data Repository (NDR)
- Web Services association for info reuse/refactoring (e.g. "summarize for grade 11 level" service)
- Versioning ("I want last week's version")
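The context questions above (who used a resource, what it was used with, how it was rated) could be represented as typed edges in a graph layered over the resources. This is only a sketch of that idea under my own assumptions: the `Overlay` class, the relation names and the identifiers are hypothetical, and it does not use or imitate the Fedora API.

```python
from collections import defaultdict


class Overlay:
    """Typed, directed edges layered over resources that live elsewhere."""

    def __init__(self):
        self._edges = defaultdict(list)  # source -> [(relation, target)]

    def relate(self, source, relation, target):
        # Record that `source` stands in `relation` to `target`.
        self._edges[source].append((relation, target))

    def context_of(self, resource):
        # Collect every edge pointing at the resource, grouped by relation.
        context = defaultdict(list)
        for source, edges in self._edges.items():
            for relation, target in edges:
                if target == resource:
                    context[relation].append(source)
        return dict(context)


overlay = Overlay()
overlay.relate("teacher:alice", "used", "resource:tides-applet")
overlay.relate("lesson:oceans-101", "aggregates", "resource:tides-applet")
overlay.relate("annotation:42", "rates", "resource:tides-applet")
overlay.relate("teacher:bob", "used", "resource:tides-applet")

context = overlay.context_of("resource:tides-applet")
print(context)
```

The resource itself is never modified; everything known about its use sits in the overlay and can be queried per relation.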
used to do metadata ingest (even for web pages)
now: Focused Crawling and Selection
http://ivia.ucr.edu/
- expert seeded crawls
- expert-guided crawls
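To make the idea of an expert-seeded, focused crawl concrete, here is a toy best-first crawl over an in-memory link graph. It is a sketch under stated assumptions, not the iVia algorithm: the link graph, page text, topic words and threshold are all invented, and relevance is just the share of a page's words that appear in the topic list.

```python
import heapq

# Toy link graph and page text standing in for the real web (all hypothetical).
LINKS = {"seed": ["a", "b"], "a": ["c"], "b": ["d"], "c": [], "d": []}
TEXT = {
    "seed": "physics lessons",
    "a": "physics lab for students",
    "b": "celebrity gossip",
    "c": "physics simulation lessons",
    "d": "more gossip",
}
TOPIC = {"physics", "lessons", "lab", "simulation"}


def relevance(page):
    # Fraction of the page's words that are on-topic.
    words = TEXT[page].split()
    return sum(word in TOPIC for word in words) / len(words)


def focused_crawl(seeds, threshold=0.5, limit=10):
    # Max-heap by relevance (scores are negated because heapq is a min-heap).
    frontier = [(-relevance(seed), seed) for seed in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    selected = []
    while frontier and len(selected) < limit:
        negative_score, page = heapq.heappop(frontier)
        if -negative_score < threshold:
            continue  # off-topic: neither keep it nor follow its links
        selected.append(page)
        for linked in LINKS[page]:
            if linked not in seen:
                seen.add(linked)
                heapq.heappush(frontier, (-relevance(linked), linked))
    return selected


crawled = focused_crawl(["seed"])
print(crawled)
```

Because off-topic pages are dropped without following their links, page "d" is never fetched at all, which is where a focused crawl saves effort compared with crawling everything.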
Description (Phase 1): manual Dublin Core
Description (Phase 2): use machines
Augmentation (Phase 1): Ask NSDL (not integrated with the rest of the NSDL)
Augmentation (Phase 2): NSDL Expert Voices - blog system
research area: how to build up automated annotations based on the blogs
Instructional Architect: tool to build e.g. lessons using NSDL
Q (he asked himself): are people going to contribute to this?
A: I don't know, but we have to try.
Q: combine resources from library and ... NASA, NOAA, Geological Survey...
do you need to use just Fedora?
A: you can access the Fedora services
Q: quality control - metadata? annotations?
A: one way is to vet every resource - but this defeats the purpose of the crawler
Also, if you have a ranking system, the good stuff will bubble to the top.
"the Wikipedia approach... statistically the good stuff will peek through"
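One way to picture "the good stuff will bubble to the top" is a rating scheme that discounts resources with very few votes. The sketch below uses a Bayesian average, which is my choice of technique for illustration and not something the speaker described; the prior values, resource names and ratings are made up.

```python
def bayesian_average(ratings, prior_mean=3.0, prior_weight=5):
    # Blend the observed ratings with a neutral prior so that a single
    # enthusiastic vote cannot outrank a consistently well-rated resource.
    return (prior_mean * prior_weight + sum(ratings)) / (prior_weight + len(ratings))


# Hypothetical ratings on a 1-5 scale.
resources = {
    "well-reviewed": [5, 5, 4, 5, 4, 5, 5, 4],
    "one-lucky-vote": [5],
    "poor": [2, 1, 2, 2],
}

ranked = sorted(
    resources,
    key=lambda name: bayesian_average(resources[name]),
    reverse=True,
)
print(ranked)
```

With more votes, a resource's score moves away from the prior and toward its true average, which is the statistical sense in which good material surfaces over time.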