In the BIGWIG library social software showcase, Casey Bisson writes about his proposal to further enhance Scriblio. Scriblio is a vision of a modern library web presence that is about finding books, not being a book inventory.
With its partners, the Internet Archive will build an open, structured website that includes tools to provide public and institutional access to library resources around the world. OpenLibrary.org will offer a system of integrated tools that can be used individually or together to meet a library or patron’s needs, that is free to the user and the library. The overall goal of the project is to shorten the distance between initial query and document.
Wired's otherwise admirable piece on Google Maps/Earth
has this oddly misleading sidebar:
Organizing vacation pics would be so much easier if you could remember exactly where you took each one. The Ricoh 500SE can help: This 8-megapixel digicam comes with a built-in GPS receiver that notes longitude and latitude in the file every time you fire up the shutter. (Programs like Google Maps can decode them.) Not ready to drop $1,100? Try a lower-tech workaround: Follow your photos with snapshots of the readout from a cheap GPS unit and type in the coordinates later as tags on Flickr. As GPS becomes more of a must-have feature, you'll see this kind of kung fu embedded in all your gadgets. Imagine checking your computer to see exactly where you left your glasses.
Now I know sidebars can never fit in all you want, but giving a $1000 camera and a manual process as the only two options seems to leave out a more practical middle ground, the one I use: a $100 GPS logger plus some software. See my geocoding photos page for more info.
Maybe I should take Polaroids and stick them on a map with pushpins and string, a la Heroes.
Actually, come to think of it, that would be kind of cool.
EXPIRED - Writing photo location in pen on the back of the print
TIRED - Typing coordinates into Flickr
WIRED - automatic geocoding with inexpensive GPS loggers and timestamp matching
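For the curious, here is a minimal sketch of what "timestamp matching" means, assuming the logger track has already been loaded as a time-sorted list of (time, latitude, longitude) tuples and that the camera clock agrees with GPS time. The function name, the five-minute cutoff and the sample coordinates are all made up for illustration; real geocoding software does more (time zone offsets, interpolation, writing EXIF).

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def match_photo_to_track(photo_time, track, max_gap=timedelta(minutes=5)):
    """Return the (lat, lon) of the track point closest in time to photo_time.

    track is a list of (datetime, lat, lon) tuples sorted by time.
    Returns None when the nearest point is further away than max_gap
    (hypothetical cutoff) or when the track is empty.
    """
    times = [t for t, _, _ in track]
    i = bisect_left(times, photo_time)
    # Candidates: the point just before and the point just after the photo.
    candidates = track[max(i - 1, 0):i + 1]
    if not candidates:
        return None
    nearest = min(candidates, key=lambda p: abs(p[0] - photo_time))
    if abs(nearest[0] - photo_time) > max_gap:
        return None
    return nearest[1], nearest[2]


# Illustrative track, not real logger data.
track = [
    (datetime(2007, 6, 1, 12, 0, 0), 51.5300, -0.1550),
    (datetime(2007, 6, 1, 12, 1, 0), 51.5310, -0.1540),
    (datetime(2007, 6, 1, 12, 2, 0), 51.5320, -0.1530),
]
print(match_photo_to_track(datetime(2007, 6, 1, 12, 1, 20), track))
```

A photo taken 20 seconds after the 12:01 fix gets that fix's coordinates; a photo taken long after the track ends gets nothing rather than a wrong location.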
Topic Pages are an innovative way for scientists to communicate in an informal and flexible way. Each Topic Page will provide researchers with summaries of a specific topic written by an authority in the particular subject area, with direct links to relevant scholarly papers, abstracts and citations, supplemented with relevant websites and other online resources from Scirus. In the initial phase, authors for Topic Pages are invited through an editorial process facilitated through Elsevier publishing staff. As more pages are developed, additional authoring options will be considered.
At the official Topic Page launch later this year, the functionality of the Topic Pages will allow scientists and researchers to alter the content and provide feedback, allowing each topic to be shaped by the suggestions made by the research community.
This is presumably the social network that was indicated in Joris van Rossum's Science-specific Search - Scirus and beyond talk at IATUL 2007.
An interesting post on the Nature blog Nautilus
Points to an essay and also to the Nature Scintilla group Open Science.
I'm still not sure what makes sense in this space - thoughtful blog postings and pre-prints are certainly useful. Very raw information such as we might get with "open notebook science" I'm not so convinced about.
PicasaWeb, the Google web photo gallery, has now got improved metadata features, including supporting geocoded photos (although it is not clear from their blog posting, it will read the EXIF-GPS for the exact location, if available).
See e.g. a photo I took in Regent's Park
There is also a one-click link to view the photos in Google Earth.
If you want more info about this topic, I have a page on geocoding photos.
A couple notes:
$5 for every 100k, roaming outside Rogers network
100k is about one cellphone photo uploaded from my K790.
That's only oh, $84.50 for 1,690k to pick an example.
Wow, what a wondrous age of mobile data we live in.
Here's your incredibly expensive phone, oh by the way, you can't afford to use any of its network features. Welcome to the 21st century in Canada and for Canadians traveling the world, cellphones in hand.
So the good news is that it's possible to Flickr photos in Tallinn directly from a Rogers cellphone,
the bad news is that uploading 2 photos while roaming costs as much as buying an entire old-fashioned 24 exposure roll of film.
This is absurd.
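If you want to check the arithmetic, here is a quick sketch. It assumes the charge scales linearly with the data used (I don't know whether the carrier actually rounds up to whole 100k blocks), and it uses the rough figure of 100k per photo from above.

```python
RATE_DOLLARS_PER_100K = 5.00  # roaming rate quoted above


def roaming_cost(kilobytes):
    """Cost in dollars, assuming the charge scales linearly with data used."""
    return kilobytes / 100 * RATE_DOLLARS_PER_100K


# The 1,690k example, then two cellphone photos at about 100k each.
print(f"${roaming_cost(1690):.2f}")
print(f"${roaming_cost(2 * 100):.2f}")
```

The first line reproduces the $84.50 figure; the second is the two-photo upload, which comes to $10.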
April 05, 2007 more on Canadian cellphone data charges
Or in other words: "Free the Humans" (available as stylish tshirt).
I think it may be difficult coming from a non-CS background to conceptualize how fundamentally important it is to make library data machine-readable, to enable easy interoperability and extendability.
That's why I loved this posting in ALA Techsource, which explains it all much better than I could:
Applications will be accepted until 23:59 Eastern Standard Time
Ottawa - Ontario
This is a 3-year term position from the date of reporting.
A Bachelor's degree in computer science, computer or software engineering, or other related field.
Formal training in Enterprise Architecture is an asset.
This is what my day was like
This is real data from my GPS logger, it's not just a path that I've drawn.
Here's the KMZ
I haven't got the altitude data out, but I'll work on it. Unfortunately, the logger was not set to record altitude (it's a configuration option).
More info to come in this posting in a few days, once I'm say, not at the end of 16 hours of travel or whatever.
UPDATE 2007-06-25: I used the DG-100 GPS logger (SIRFstar III). I turned it on when boarding the planes, I had window seats. I just had it on my leg (which was not so clever, as it fell off onto the floor twice). In the seat pocket would probably also work. That's it, it acquired signal normally and maintained it without any problems. The above track has some gap-fill artificial data over Newfoundland as the battery ran low and I had to recharge the DG-100 from my laptop for an hour or so. While recharging the Air Canada staff asked what it was and when I told them it was a GPS, they checked and verified that it was ok.
In case it's not clear, it's not live data captured directly within Google Earth. That should in theory be possible, although without a net connection you would need to pre-visit places to get them in Google Earth cache I think. The above track was stored on the logger and then converted for Google Earth using GPS Visualizer.
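To give an idea of what the conversion step produces, here is a rough sketch that turns a list of track points into a minimal KML file with one line. This is not how GPS Visualizer works internally; the function name and the sample coordinates (rough airport positions, not my logged data) are just for illustration.

```python
from xml.sax.saxutils import escape


def track_to_kml(points, name="GPS track"):
    """Build a minimal KML document containing one LineString.

    points is a list of (lat, lon) tuples. KML expects lon,lat,alt order;
    altitude is written as 0 because this sketch assumes none was logged.
    """
    coords = " ".join(f"{lon},{lat},0" for lat, lon in points)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "  <Document>\n"
        f"    <Placemark><name>{escape(name)}</name>\n"
        "      <LineString><tessellate>1</tessellate>\n"
        f"        <coordinates>{coords}</coordinates>\n"
        "      </LineString>\n"
        "    </Placemark>\n"
        "  </Document>\n"
        "</kml>\n"
    )


# Illustrative points only: roughly Paris, London, Ottawa.
sample = [(49.01, 2.55), (51.47, -0.45), (45.32, -75.67)]
print(track_to_kml(sample, name="Paris - London - Ottawa"))
```

Saving that output as a .kml file should be enough for Google Earth to draw the path; a KMZ is the same thing zipped.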
The DG-100 is not a Bluetooth logger; it's self-contained (for Bluetooth, check the airline policies, and you almost certainly can't turn it on until the seatbelt light is off). I'm pretty sure SAS permits Bluetooth; I don't know about other airlines.
There are two separate flights in the track: first BA from Paris to London, and then changed planes to AirCan from London to Ottawa.
Lee Dirks - Director, Scholarly Communication - Microsoft
"Open access, data-driven science & the impact on research communication"
* basic research ACTIVITY unchanged
but output options dramatically changed
- scholarly journals
- discipline repositories
Current Issues vs. Anticipated Trends
* OA to scientific content, specifically data, will become the norm
* international cross-discipline research facilitated by interoperable standards
* "evolved" methods of peer review will be adopted
* preservation of data will become a requirement
* services develop around scientific content and prevail over pure publishing
- data analytics, publishing workflow tools, long term storage/access
EDUCAUSE "Horizon Report" 2007 - for higher education IT in USA
* key trends
- academic review and rewards are increasingly out of sync with new forms of scholarship
- the notions of collective intelligence and mass amateurization are pushing the boundaries of scholarship
* critical challenges
- assessment of new forms of work
- issues of IP and copyright continue to affect how scholarly work is done
- example: useful chem
- recording experiments that fail
Wikis for Sharing Lab Protocols
- example: OpenWetWare
- example: Connotea
- 1400+ repositories worldwide
Influence of IRs
The Promise of Data Sharing
PLoS article - Sharing Detailed Research Data Is Associated with Increased Citation Rate
"this is going to radically change science"
- data integration and interop
- provenance & quality
- exporting/publishing in agreed formats
"an aspect of competitive differentiation"
Publications as Live Documents
MS will have some results on this later this year
* helps with reproducibility if you can get to the raw data, simulations etc.
Trend: The Rise of Mass Collaboration
* Novartis released all its raw data on genetics of type 2 diabetes
[missed the end of the presentation]
This is very cool stuff, if they ever manage to make it all work.
Christine Chichester - Knewco Inc.
"Community peer review in Wiki environment"
goal: distill down every unique scientific concept to a unique identifier (the "knowlet")
Many challenges in current biomedical research
* volume of data
* distributed systems and databases
* incompatible data formats
[Brazilian Portuguese sp etc]
*** ambiguity of terminology
* inability to share knowledge
"Too much to read" indicates major trends
* from reading to consulting
* from reading to meta analysis
* from texts to facts
... to central and community annotation
difficult homonym disambiguation issue: use context
- first order semantic enrichment
a knowlet is a triple
- wiki annotation
- concept profile match
- sequence similarity
build an association matrix for large data sources
- disambiguation of author names
[Dr. somebody has algorithms]
1 million disambiguated authors
- from MEDLINE
1 million for genes, drugs and ?proteins
Assignment of protein function and discovery of new nucleolar proteins based on automatic analysis of MEDLINE
Martijn Schuemie, Christine Chichester, Frederique Lisacek, Yohann Coute, Peter-Jan Roes, Jean-Charles Sanchez, Barend Mons
Special issue Systems Biology in Proteomics, 2008 (in press)
put discovered hypotheses into WikiProf and then if approved into e.g. Swiss-Prot
- GO gene ontology
- NLM UMLS
runs on OmegaWiki which uses MediaWiki
[knowledge space knowlet thing]
Wikiproteins Peer Review: ??? automated selector/requestor for peer review of annotations ???
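[my own sketch, not from the talk: one simple way to "build an association matrix" is to count how often two concept identifiers turn up in the same document. The concept IDs and documents below are invented, and I don't know what weighting Knewco actually uses.]

```python
from collections import Counter
from itertools import combinations


def association_matrix(documents):
    """Count how often each pair of concept IDs occurs in the same document.

    documents is a list of lists of concept IDs. The result is a sparse
    matrix stored as a Counter keyed by (concept_a, concept_b), with the
    pair always in sorted order.
    """
    pair_counts = Counter()
    for concepts in documents:
        # set() so a concept repeated within one document counts once.
        for a, b in combinations(sorted(set(concepts)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


# Hypothetical concept IDs standing in for disambiguated terms.
docs = [
    ["geneA", "diseaseX"],
    ["geneA", "diseaseX", "drugY"],
    ["drugY", "geneB"],
]
matrix = association_matrix(docs)
print(matrix[("diseaseX", "geneA")])
```

Raw co-occurrence counts are only a starting point; they say two concepts are mentioned together, not how they are related.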
He presented a model by which future assessment could be more automated.
Tim Brody - University of Southampton
"Institutions, repositories and research assessment"
Intro to UK RAE
* RAE 2008
- submission deadline November 2007
- for 2009 funding onwards
* subject-specific research outputs
* for most researchers: 4 self-selected published papers per research staff member
* "measures of esteem": editorships, awards, conferences
Submitting to RAE
* Scanned PDF or DOI
special deal with publishers to permit scanning to PDF and sending
if they don't have a paper copy,
they can order doc online from BL, but don't have rights to submit that PDF,
so they print it and scan it again
[a completely mad example of publisher rights insanity]
panel members read papers
e.g. 1000 papers per panel member
beyond 2008... mostly metrics based
* Open (Access) Research Metrics?
1. Researchers self-deposit or publish in OA journals
2. Metrics services harvest full text, citation links, and aggregate downloads
3. Funding agencies extract and generate reports
[Tim Brody's page]
* if the data are *OPEN ACCESS* anyone can experiment
* page rank
* downloads/cites comparison
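[my own sketch, not from the talk: the simplest form of a downloads/cites comparison is a correlation coefficient between the two counts per paper. The numbers below are invented, and a real study would need to handle time lags and skewed distributions.]

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented per-paper counts, for illustration only.
downloads = [120, 340, 90, 410, 200]
citations = [4, 11, 2, 15, 6]
print(round(pearson(downloads, citations), 3))
```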
Experiments with Google Scholar
* experiment undertaken to provide some metrics for the ECS department's measures of esteem submission
* query Google Scholar
[again unique identifiers are important]
Stefan Hornbostel - DFG, Institute for Research and Quality Assurance (IFQ)
"From ad hoc evaluation to monitoring systems"
Types of Activities
* Funding Monitor
- database with web frontend
- including public information
reports generated from database
also plan to use it for internal project management
- store final report documents
- link to repositories
- generate a scientists directory
* ProFile online survey
- database of new PhDs
- career development
Jerry Sheehan - National Library of Medicine
"Research Evaluation: Evolving policies and practices for assessing impact"
The changing policy context for research evaluation
* Growing recognition of links between science, innovation, economic growth, health etc.
* Increased emphasis on evaluation of institutions and their research output
OECD Main Science and Technology Indicators
Changes in Governance of Public Research
* from funding basic research... to governing the science system
- increased priority setting
- increasing role of business and social groups
- new types of funding schemes
- targeting of collaborative activities
- new missions for research organisations
- contributions to industry and society
- knowledge translation
Increased emphasis on all levels of evaluation
- ex-ante, ex-post and in-process
- incorporation of evaluation results into policy making
Types of S&T Indicators are changing
- outputs and quality measures
- process indicators
- outcomes and impacts
Measuring Agency/Government Funding Impact
- does it change the research that is done, e.g. more challenging research
Evaluation at NIH
- Office of Portfolio Analysis and Strategic Objectives (OPASI)
* Office of Extramural Research
- Database to track grants from start to finish
Institutional and disciplinary archives a key element
* NLM's PMC and NIH Public Access Policy
* Knowledge infrastructure
This is some very interesting work and a huge project that should greatly enrich our understanding of the usage of scientific information.
Johan Bollen - Los Alamos National Laboratory
"Scholarly impact: from ranking to assessment"
Scholarly evaluation matters
- qualitative and quantitative indicators
many features in scholarly status space
various opportunities to extract metrics in the scholarly life-cycle
- usage data
- review data
- citation data
usage data is available before citation data
From ranking to assessment
we're in mode ranking 0.6
- single data source
- single criterion
... to assessment 3.0 [yecch]
- situate item in value landscape
- multiple sources of scholarly information
question: which dimensions to choose?
1) MESUR project
- survey wide range of possible indicators
2) Peer review
- study peer review process
Marko Rodriguez and others
Can we improve on citation data and the impact factor?
- perhaps usage data applies to a larger subset of the scholarly community,
capturing more scholarly objects and activities beyond journal articles
usage: COUNTER, IRS [?], SUSHI, CiteBase
MESUR: Metrics from Scholarly Usage of Resources
1 ontology to model the scholarly process
2 beg for usage data
4 create semantic network
2/5th through the project
data: 700 million usage events and 1 billion citations
10-15 billion triples
COUNTER logs, item-level data, SFX, etc.
link resolver data very good
[paper in JCDL 2006 about link resolver data gathering architecture]
they are using Franz's AllegroGraph triplestore
Network usage: usage graphs
"we should stop counting: we should look at relationships"
journal network - 50,000 journals
Example: Flow of information
many large organizations and sites are participating
U Texas case study...
[my comment: but isn't there an undergrad effect based on the articles they are assigned?]
principal component plot
[paper at JCDL 2007]
many issues and challenges
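[my own sketch, not from the talk: one way to read "stop counting, look at relationships" is to turn usage logs into a graph, adding weight to an edge each time two journals are used back to back in the same session. The session data and journal names are invented, and treating consecutive requests as a relationship is my assumption about how such usage graphs could be built.]

```python
from collections import Counter


def usage_graph(sessions):
    """Build a weighted, directed journal-to-journal graph from usage sessions.

    sessions is a list of lists, each the ordered journals one user visited.
    An edge (a, b) gains weight 1 whenever b is used immediately after a.
    """
    edges = Counter()
    for session in sessions:
        for a, b in zip(session, session[1:]):
            if a != b:  # ignore repeated use of the same journal
                edges[(a, b)] += 1
    return edges


# Hypothetical sessions.
sessions = [
    ["Journal A", "Journal B", "Journal C"],
    ["Journal A", "Journal B"],
    ["Journal C", "Journal C", "Journal A"],
]
graph = usage_graph(sessions)
print(graph.most_common(1))
```

Once the data is in this shape, network measures can be computed on it instead of simple hit counts.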
Denis Jérome - CNRS, Académie des Sciences
"Evaluation based on scientific publications: experiences in physics"
* public funding is needed for basic research
* evaluation is needed
* can one use publications to evaluate research
[chart showing 64% of (european) physics letters published in US]
* Europe is the first (largest) contributor to physics publications
* yet EU is a minor actor for scientific publications
A Mandatory Plurality
* an overwhelming concentration is dangerous
* need a variety of editorial policies
The need for evaluation
* peer review [of grants, and scientists]
* but also bibliometrics
* e.g. Impact Factor
IF is an indicator for publishers
*** misuse of IF for individual evaluation ***
Nature: 25% of articles receive 90% of citations
Nature & Science only have small number of physics papers therefore:
ban IF for evaluation
IF is about journal popularity, not about the actual citations
Need indications about quality
Physicists publish mostly in small number of journals listed in Web of Science
NASA ADS adsdoc.harvard.edu
Hirsch Index: H
Leo Egghe, 2006 "G" index
analysis: G seems to be more reliable than H
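[my own sketch, not from the talk: both indices are easy to compute from a list of per-paper citation counts. H is the largest h such that h papers each have at least h citations; G is the largest g such that the top g papers together have at least g squared citations. In this version G is capped at the number of papers, which not every definition does. The citation counts are invented.]

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the top g papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    g = 0
    total = 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


steady = [10, 8, 5, 4, 3]     # invented: citations spread evenly
one_hit = [25, 1, 1, 1, 1]    # invented: one highly cited paper
print(h_index(steady), g_index(steady))    # 4 5
print(h_index(one_hit), g_index(one_hit))  # 1 5
```

The second example shows the difference: H ignores how highly cited the top paper is, while G gives credit for it.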
Grain of Salt
* clean scientist names [need unique scientist numbers]
* team work
* negative citations
* quality of citations
must be handled by scientists
Ulrich presented a very interesting open review model for publications; unfortunately, his talk was a bit rushed due to factors outside his control. Definitely an approach worth investigating further.
Ulrich Pöschl - Max Planck Society
"Interactive open access publishing and collaborative peer review for improved scientific communication and quality assurance"
* many motivations to do open access
- improve scientific quality assurance
with OA you can do collaborative peer review
problems with scientific publications
speed vs quality
- but then neglect thorough review
Two-stage OA publication with collaborative peer review
* they [the journals] are financially viable
* they have good impact factor
Bernard F. Schutz - Albert Einstein Institute - Future styles? of assessment
- OA to high quality scientific publications
- documentation of scientific discussion (e.g. publish referee comments)
- demonstration of transparency and rationalism
- prescribe OA to publicly funded research
- transfer funds for subscription to OA
- foster OA publishing and collaborative peer review
- mere access is not enough (need to get all layers, data etc.)
- evaluate individual papers
- refine statistical parameters for citation, downloads, usage, interactive commenting and rating
Bruno Granier - University of Western Brittany
"Impact of research assessment on scientific publication in earth sciences"
- Misuse of IF [impact factor]
citation: "The number that's devouring science"
- The Goal
* the only common goal is how visible you are...
because visibility is the qualitative factor used to assess your work
he started an OA journal - Notebooks in Geology
As an author
- (particularly in industry) you may want paper published ASAP
- you may reiterate your message in other publications
- in academe you want impact factor
ways to increase visibility
- bogus signatures / invitation
- author names appear in alternate positions in similar papers
- selective or inexact quotations
- cutting and pasting
- lift information
Evaluators should use weighted averages for multi-author papers, 1st authorship worth much more
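[my own sketch, not from the talk: one possible weighting is harmonic credit, where the author in position i gets a share proportional to 1/i. The speaker did not say which scheme he had in mind, so the choice of harmonic weights is purely my assumption.]

```python
def author_weights(n_authors):
    """Harmonic authorship credit: position i gets (1/i) / sum of all 1/j.

    The weights add up to 1 and the first author always gets the most.
    """
    raw = [1 / position for position in range(1, n_authors + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_score(paper_score, author_position, n_authors):
    """Share of a paper's score credited to the author at author_position (1-based)."""
    return paper_score * author_weights(n_authors)[author_position - 1]


print([round(w, 3) for w in author_weights(3)])
```

With three authors the first gets a bit over half the credit and the third a bit under a fifth.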
Question: How to detect frauds?
Answer: You need a good reviewer
As a reviewer
- the reviewer remains the sine qua non of the evaluation process
As an editor
- blacklist repeat offender authors
- use computer programs to detect plagiarism
- often citations are incorrect or not relevant
As a publisher
- OA gives happy google effect
Shall I use new tools/facilities (counters) to discriminate the kind of papers that
get the larger readership
- a huge part of the scientific information is not given any consideration,
since IF covers only well-established journals
The use and misuse of metrics is responsible for the death of many lab, museum etc. publications in
France and elsewhere.
- bibliometrics or not, the only goal remains to increase your visibility
- the Google effect is at our doors
The Stockholm-Tallinn cruise line Tallink has free wireless Internet; I don't know if it's on all the ships, but it is on the Romantika. Not available in the cabins AFAIK, but works fine in the 6th and 7th floor lounge areas next to the outer deck. Must be satellite, as it works while the ship is sailing. A bit slow but certainly usable (I'm using it right now).
The good news is that the hotel has free wireless (albeit you do have to go down to the front desk and request a new access code every 6 hours). The bad news is that on the second floor, with my PowerBook G4, it is very flaky. Sometimes no connection, sometimes no DHCP, sometimes works great all evening.
I am currently camped out in a third floor hallway as it is in the "no DHCP" mode.
Front desk unfortunately doesn't seem to be able to do anything, like reboot the nodes - they just tell me to use the Windows computer that's wired up in the tiny business closet.
The wireless is from The Cloud Networks.