Posts categorized "Web/Tech"

April 28, 2008

availability, discovery, and delivery - redux

  • Availability does not equal accessibility: researchers’ top concern about scholarly communication is that they cannot access all the content they wish to access
  • Researchers tend to use tried-and-tested discovery tools, or those which their library specifically trains them to use. Google and other web search engines remain the most-used search tools for work-related information. The main problem with discovery is coming up against an access barrier
  • Researchers do not always know how to seek out a freely-available copy of an article that they want and which they have discovered behind a toll barrier

Key concerns within the scholarly communication process: report to the JISC Scholarly Communication Group, March 2008 [Word document]

via Lorcan Dempsey

I find it interesting that the focus of concerns is around delivery, not discovery (perhaps this is how the questions were framed).

I think the academic library faces two challenges:

  1. Ensure that your researchers can always get easily from their chosen discovery environment to
    1. your licensed resources
    2. free copies, if licensed versions aren't available
    3. purchase options, if free copies aren't available
  2. Ensure that as many of your resources as possible can actually be discovered
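
The delivery chain in the first challenge can be sketched as a simple fallback. This is only an illustration of the preference order, not a real link resolver: the `Copy` type, the `kind` labels, and the sample URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Preference order from the list above: licensed, then free, then purchase.
PREFERENCE = ("licensed", "free", "purchase")


@dataclass
class Copy:
    kind: str  # one of PREFERENCE (hypothetical labels)
    url: str


def best_copy(copies: list[Copy]) -> Optional[Copy]:
    """Return the most preferred available copy, or None for a dead end."""
    for kind in PREFERENCE:
        for copy in copies:
            if copy.kind == kind:
                return copy
    return None


found = [
    Copy("purchase", "https://publisher.example/buy/123"),
    Copy("free", "https://repository.example/eprint/123"),
]
print(best_copy(found))
```

A `None` result is exactly the dead end worth hunting for when watching researchers search.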

I'm not convinced that we're doing a particularly good job of addressing these fundamental challenges even after years of working on proxies, federated search, link resolvers, and "live in your environment" plugins and external website settings.

It seems to me that librarians were so focused on trying to control the discovery experience, trying to make people discover resources "properly", following established librarian search protocols, that the simple challenges above were not addressed.

I think we need to spend a lot of time with researchers letting them search however they want, and seeing whether they dead-end either by not being able to get to a resource at all, or by landing at a paywall for a resource we license or can get to for free.  Fix that first.

And I'm not entirely convinced we have all the tools we need to fix that problem right now.

Then once you have addressed consistent delivery, work on improving discovery.  I think discovery is much harder to fix.  And to some extent, there should be vendor pushback.  I don't care how rich or comprehensive a licensed resource is, if my users can never discover it, then the message to the vendor should be "enable easy ways for my user to discover your resources within their preferred discovery environments, or next year we're not licensing your content".

Previously:
Lorcan and I had a bit of a back-and-forth about discovery and delivery in 2006.

August 08, 2006  Discovery and disclosure - Lorcan Dempsey
August 09, 2006  online library role in discovering and delivering - Science Library Pad

April 26, 2008

Facebook Mini-Feed wants to be your Life-Feed

Facebook launched some "lifestream" integration features on April 15, 2008

The option to import stories from other sites can be found via the small "Import" link at the top of your Mini-Feed. Only a few sites—Flickr, Yelp, Picasa, and del.icio.us [and now Digg]—are available for importing at the moment

Facebook Blog - A new way to share with friends

Sidebar: Facebook also seems to have silently taken away the feature where you could X stuff out of your News Feed, which was supposed to teach it to only show friends items that interested you, over time.

April 25, 2008

Yahoo - the SearchMonkey cometh

Well, as long as it isn't a flying monkey, ok.

Enter the Yahoo Open Strategy (YOS). ...

There’s a massive, latent social network within Yahoo, and we’re going to bring it to the surface. We’re making Yahoo more social, but we’re not building yet another social network. We already have an incredible social network… we just need to unlock it.

...

A first taste of our strategy is SearchMonkey, which will let developers mash up helpful data with our search engine results.  ... launch party May 15 [2008]

http://ycorpblog.com/2008/04/24/developer-welcome-mat/
(Exclamations removed because I refuse to put exclamations all over the place, including inside of acronyms.)

SearchMonkey is what I previously described as "semantically-enriched search results".

[SearchMonkey]

General story of Yahoo Open Strategy very widely reported, e.g. Globe - Associated Press - Yahoo plans social makeover.  SearchMonkey bit via Search Engine Watch blog.

Globe on Lifestreaming

Today's Globe and Mail has a good article on lifestreaming; it is a reasonable combination of skepticism and information.

For every bon mot you leave [online], a breathless news release documents the leaving of the bon mot. Assemble all those press releases together, and you've got a lifestream.

...

A lifestream is just a [Facebook] mini-feed writ large, covering not just the cloistered confines of one site, but stretching across the entire Web, pulling in data from every site that is willing to share it.

Globe and Mail - Billy blogged! Sarah e-mailed! Tell everyone! - April 24, 2008

There is also accompanying audio (presented in a video window, I guess so they can show you an ad first).  I like when he describes his worry that full lifestreams will be mostly full of "random detritus" of your web wanderings.

FriendFeed is kind of the canonical application in this space, but to some extent, anything that can aggregate RSS feeds can do a basic job of serving as a lifestream.
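
To make the "anything that can aggregate RSS feeds" point concrete, here is a rough sketch that merges items from several RSS 2.0 documents into one newest-first stream. The feed contents are invented for illustration, and a real aggregator would fetch the feeds and cope with missing or malformed dates.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Two invented RSS 2.0 feeds standing in for real services.
PHOTOS = """<rss version="2.0"><channel><title>photos</title>
<item><title>Uploaded a photo</title><pubDate>Fri, 25 Apr 2008 10:00:00 +0000</pubDate></item>
</channel></rss>"""

BOOKMARKS = """<rss version="2.0"><channel><title>bookmarks</title>
<item><title>Saved a bookmark</title><pubDate>Sat, 26 Apr 2008 09:30:00 +0000</pubDate></item>
<item><title>Saved another bookmark</title><pubDate>Thu, 24 Apr 2008 18:15:00 +0000</pubDate></item>
</channel></rss>"""


def lifestream(feeds: dict[str, str]) -> list[tuple]:
    """Merge the items of several RSS documents, newest first."""
    items = []
    for source, xml_text in feeds.items():
        root = ET.fromstring(xml_text)
        for item in root.iter("item"):
            when = parsedate_to_datetime(item.findtext("pubDate"))
            items.append((when, source, item.findtext("title")))
    return sorted(items, key=lambda entry: entry[0], reverse=True)


for when, source, title in lifestream({"photos": PHOTOS, "bookmarks": BOOKMARKS}):
    print(when.date(), source, title)
```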

April 18, 2008

the social shotgun: blasting your updates everywhere

In the library world we would probably call this something like "federated updating".
With the proliferation of different targets, particularly different social presence sites, people are trying to do one-to-many updates.

We see an example of some infrastructure for this in Yahoo's FireEagle, for collecting and redistributing location updates.

There was also a nice example at the CRIG Repository Challenge, called FileBlast, which allows you to upload a paper (an article) and then send automatic notices with the link to the paper to multiple sources, such as Twitter and your blog.  It is built on the FeedForward infrastructure, and you can read more about it (as well as find a link to the code) in the FeedForward blog.

Today, via the Outsell Headlines feed, I find news that TypePad has a new Facebook app called Blog It, which can send a single update to many sources of your choosing, including various blogging platforms (TypePad, Blogger, LJ, Vox, WordPress) and various "statusy" services, including Tumblr, Twitter, and Facebook status itself.

[blogit]

UPDATE: The article Outsell pointed me to, Six Apart Gives Facebook Bloggers Blastability, uses the term "lifestreaming".  I wonder if this is where we're headed, from the blog to the lifeblast.  ENDUPDATE

I think there's always this tension between decentralisation and centralisation, and these kinds of notification federations may be one way that we manage this.

In particular, "article blast" to multiple repositories using SWORD may be a very compelling solution to a lot of ingest and content recruitment issues.  (And feel free to contact me if you're interested in SWORD ingest across platforms.)  Julie Allinson did a great job of introducing SWORD in her Open Repositories 2008 presentation.

April 08, 2008

Vlickr - Flickr adds video

The rumours are true and “soon” is now. We’re thrilled to introduce video on Flickr. If you’re a pro member, you can now share videos up to 90 glorious seconds in your photostream.

Flickr Blog - Video on Flickr

also see http://www.flickr.com/help/video/

April 07, 2008

Unlimited Librarians

Unlimited is a relatively new (4th issue) Canadian business magazine. If you scroll down to the second story in their Now See Hear section, you'll find this well-written piece about the enduring value of librarians.

Librarians – all 1,300 of us in Alberta, 12,000 in Canada, and 158,000 in the U.S., not to mention the rest of the world – do three things: we buy information, we help you find information, and we loan you information. In my experience, we try to do these things using every useful technological tool at our disposal. We buy e-books, instant-message you through a search, check blog feeds to keep an eye on innovations at other libraries. New technologies have not changed the fundamental function of libraries; they constantly change how libraries approach their core mission.

Open Source - Librarians Embrace the Google Era - by Leah Vanderjagt - Unlimited Magazine

(Obscure sidebar: When transcribing this I first wrote it as "Librarians Emerge in the Internet Era" for some reason.)

April 01, 2008

NISO intros new site. Old links: kablooie.

You remember that NISO event I attended all of um, 4 days ago?

http://www.niso.org/news/events_workshops/discovery08/

Kablooie.
Gone with no redirect.
If you dig around the site, you can make your way to

http://www.niso.org/news/events/niso/past/discovery08/

which is empty.

Why does this always happen on a site redesign?  Does no one care about preservation?  Persisting URLs?

I don't know if Archive managed to harvest the pages; nothing is showing at the moment.

http://web.archive.org/web/*/http://www.niso.org/news/events_workshops/discovery08/

I guess the best that's possible right now is to look at the Google cache, just to prove I didn't imagine the entire event

http://www.google.com/search?q=cache:www.niso.org/news/events_workshops/discovery08/

I wonder, should Archive provide a link to try Google cache, when it doesn't get a page match in its own DB?  How long does Google's cache of lost pages last anyway?
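
The two lookups are easy to generate from a dead link, which is roughly what such a fallback would have to do. This sketch just reproduces the URL patterns used in the links above as they worked in 2008; the function name is made up, and neither pattern is guaranteed to keep working.

```python
def archive_lookups(dead_url: str) -> dict[str, str]:
    """Build the Wayback listing URL and the Google cache URL for a dead link."""
    # The Google cache link above drops the scheme; the Wayback link keeps it.
    without_scheme = dead_url.split("://", 1)[-1]
    return {
        "wayback": "http://web.archive.org/web/*/" + dead_url,
        "google_cache": "http://www.google.com/search?q=cache:" + without_scheme,
    }


lookups = archive_lookups("http://www.niso.org/news/events_workshops/discovery08/")
for name, url in lookups.items():
    print(name, url)
```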

meta: test of ShareThis

Just a test to see what's going on with the ShareThis widget.

http://sharethis.com/typepad

Hmm... it takes quite a while, it finally pulls in the ShareThis widget after the entire page loads, jamming it after the Trackback link for the post if you're displaying the full blog, or after the Permalink if you're displaying an individual post.

Microsoft Summit on Repository Interop - notes

April 1, 2008 - I had read the posting by Savas (probably via Lorcan), so it was great to have an opportunity to hear about Microsoft's thinking directly from them.  The most dramatic announcement was that Microsoft Research will be developing entirely on the Linux platform.

UPDATE: Lee Dirks said I almost gave him a heart attack with my little April Fools' prank, and the day is wearing on, so it's time to update and move my text up from the bottom...

Thanks go to Lee Dirks and David Flanders for making my first full day in Southampton a very interesting one.  The Linux platform bit was my contribution to April Fools.  MS Research Tech Computing are in fact of course entirely dedicated to Microsoft platforms.  ENDUPDATE

For further discussion of the MS Repository Platform efforts, they have created a group

http://community.research.microsoft.com/forums/90.aspx

I'm sure it has happened before, but it was the first time I had seen the leads/directors of Fedora (Sandy Payette), DSpace (Michele Kimpton) and Eprints (Les Carr) brought together.

There was a lot about SWORD and also some on OAI-ORE.

Notes on Microsoft Summit on Repository Interoperability event

Lee Dirks
External Research, Technical Computing
- Putting computing into science
- Putting science into computing

Science + computation are not the entire equation
* Microsoft must improve its offerings throughout the scholarly communication lifecycle

Approach: Conduct prototyping projects and proofs-of-concept to evolve Microsoft's scholarly communication offerings

Five factors Microsoft considers key
* Interop is paramount
* Optimize for data-driven research & science
* Data preservation (and provenance) should be baseline
* Community protocols & conventions
* Social networking & semantic knowledge discovery

when possible IP shared at
http://www.codeplex.com/

Project Execution Models
* internal FTE
* external devel (vendor)
* external devel (institutional partner)
* mixed models

projects 1-2 years

Examples:
* GenePattern for Word 2008
- integrate data and images from GenePattern workflows into research papers
- will move into production in April/May 2008

* Math in Word 2007

* Chemistry Drawing for Office 15
- Peter Murray-Rust et al.
- Chemistry Markup Language (CML)
- proof-of-concept plugin ... but two versions of Office from now, Chemistry will be built-in (we hope)

* PLANETS
- EU project
- preservation of Office documents based on Office OpenXML (OOXML)

===

Savas
"Supporting researchers worldwide"

working towards an "eResearch Platform", a grouping of Microsoft tools that can support research

Flow: Author->Publish->Archive->Discover

Author
* Semantic Annotations for Word
(current target: protein databank)

* NLM DTD plug in - will support SWORD
- export a Word document in NLM DTD -> .nlmx

* Research Ribbon concept - tools relevant to researchers in Office

* can search arXiv from within Word using OpenSearch

Publish
* Conference Management Tool (also SWORD endpoint)
* eJournal - manage peer review (also SWORD endpoint)

Archive
* Research Output Repository (also SWORD endpoint and will support OAI-ORE)
* arXiv (also SWORD support)

? Repository interop/federation

Q: Shibboleth / OpenID support?
A: haven't started looking at it yet

===

Santosh
Microsoft's Research Output Repository Platform

Platform for storing scholarly works and metadata
- papers, videos, presentations, lectures, references...
- enables the development of new functionality and services on top of the platform
- relationships between stored entities

* SQL Server 2005 or 2008, Entity Framework, .NET 3.5

* the repository software (but not the servers) will be available to the community for free

Platform Overview
- variety of resource types (publications, tech reports etc.)
- resource tagging
- relationship between resources (triple-based)
- set of well-known predicates (IsVersionOf, Contains, etc.)
- new resource types and predicates through extensibility
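
To picture what "triple-based" relationships with well-known predicates might look like, here is a toy in-memory sketch. It is not Microsoft's API: the class, method names, and resource identifiers are invented, and only the predicate names IsVersionOf and Contains come from the talk.

```python
class TripleStore:
    """Toy store of (subject, predicate, object) relationships."""

    def __init__(self):
        self._triples = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def objects(self, subject: str, predicate: str) -> list[str]:
        """Everything the subject points at through the given predicate."""
        return sorted(o for s, p, o in self._triples if s == subject and p == predicate)

    def subjects(self, predicate: str, obj: str) -> list[str]:
        """Everything that points at the object through the given predicate."""
        return sorted(s for s, p, o in self._triples if p == predicate and o == obj)


store = TripleStore()
store.add("paper-v2", "IsVersionOf", "paper-v1")
store.add("proceedings-2008", "Contains", "paper-v2")
store.add("proceedings-2008", "Contains", "slides-7")

print(store.objects("proceedings-2008", "Contains"))
print(store.subjects("IsVersionOf", "paper-v1"))
```

Because a predicate is just a string here, adding a new one needs no schema change, which is one way to read the extensibility bullet.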

Platform
* Core API
* Framework API
* OAI-PMH, Syndication, BibTeX, Search
- UI Web Controls

"A semantic computing platform"
- hybrid between relational database and a triple store

community.research.microsoft.com/forums/90.aspx

===

Stuart Lewis
Update on SWORD Protocol & Future Directions

http://www.ukoln.ac.uk/repositories/digirep/index/SWORD

- Simple Web Service Offering Repository Deposit

JISC/CETIS end of 2005
- identified lack of standard deposit API as #1 issue

2006: Creation of Repository Deposit working group

November 2006
- JISC call for funding, bid submitted for SWORD
- Julie Allinson
- lightweight and agile project

Workpackage 1: Evaluate existing standards
- WebDAV
- JSR
- OKI OSID
- ECL
- SRW Update
- SPI Google Data API
- ATOM Publishing Protocol (APP)

-> page on wiki examining them all

Workpackage 2: Tech Dev
- DSpace
- Fedora
- Eprints
- intraLibrary
* Java client library
- command line, desktop app, web interface

Workpackage 3: User testing and feedback
- arXiv
- SOURCE
- SPECTRa
- White Rose Research Online
- FeedForward

How does SWORD work?
* Two stages
- Discover
GET a Service Document
- Deposit
POST an item to the URI of the collection

GET
- X-On-Behalf-Of
- get a URI

POST
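
The first stage can be sketched as parsing a service document to find the collection URIs that accept deposits. The sample document below is invented, and the namespace is the one from the final Atom Publishing Protocol (RFC 5023); a given SWORD server may well use a different draft namespace or add SWORD-specific elements.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
}

# Invented service document for a hypothetical repository.
SAMPLE_SERVICE_DOCUMENT = """<?xml version="1.0"?>
<app:service xmlns:app="http://www.w3.org/2007/app"
             xmlns:atom="http://www.w3.org/2005/Atom">
  <app:workspace>
    <atom:title>Example Repository</atom:title>
    <app:collection href="http://repository.example/sword-app/deposit/articles">
      <atom:title>Articles</atom:title>
      <app:accept>application/zip</app:accept>
    </app:collection>
  </app:workspace>
</app:service>"""


def deposit_collections(service_xml: str) -> list[tuple]:
    """Return (title, collection URI) pairs found in a service document."""
    root = ET.fromstring(service_xml)
    return [
        (collection.findtext("atom:title", namespaces=NS), collection.get("href"))
        for collection in root.iterfind(".//app:collection", NS)
    ]


print(deposit_collections(SAMPLE_SERVICE_DOCUMENT))
```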

SWORD extensions to APP
* SWORD level
- 0
  - basic
- 1
  - full implementation

- X-On-Behalf-Of
- X-Verbose
- X-No-Op
- X-Format-Namespace

Discovery SWORD interfaces
* Recommend /sword-app
* Recommend /sword-app/servicedocument
* Recommend <link rel="sword" href="/sword-app/servicedocument" />
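
A client could use the third recommendation for autodiscovery by scanning a repository page for the link element. A minimal sketch, assuming the page is ordinary HTML; the class and function names and the repository host are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SwordLinkFinder(HTMLParser):
    """Collect href values of <link rel="sword"> elements."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and attributes.get("rel") == "sword" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_service_document(page_url: str, html_text: str):
    """Return the absolute service document URL, or None if none is advertised."""
    finder = SwordLinkFinder()
    finder.feed(html_text)
    return urljoin(page_url, finder.hrefs[0]) if finder.hrefs else None


page = '<html><head><link rel="sword" href="/sword-app/servicedocument" /></head></html>'
print(find_service_document("http://repository.example/home", page))
```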

Authentication
- Required: HTTP BASIC

What?
- any package supported by the repository
- DSpace/Eprints: ZIP files with a METS manifest in SWAP format, with files
- Fedora: image files / METS documents (pull in referenced data streams)
- OAI-ORE resource maps

SWORD 2
- follow-on project
? more APP
? UPDATE / DELETE
? more clients
? client libraries
? provide support to users

Q: What is relationship with APP?
A: none

Comment: Sandy - We need a basic protocol that supports read and write.
Comment: Michele - We need to get into workflow - Zotero, EndNote etc.

Q: OAI-ORE and SWORD together?

===

Experience implementing SWORD at arXiv.org
Simeon Warner
Thorsten Schwander

1. Background
2. SWORD implementation choices
3. Ideas for SWORD evolution

automating from Microsoft Conference Toolkit

CS unusual in that conference publications very important
- use arXiv to host open access proceedings

work internally at arXiv to present conference proceedings as a whole

http://arxiv.org/help/api

Authority
1. author
2. the conference organizer
3. the CMT system (will use the organizer's authority)

returning errors
- all additional errors returned HTTP 400 Bad Request
- return an Atom document for each error code

3. Ideas for SWORD evolution

* Primary goal should be to reduce pairwise customization

- improved self description
  - self-describe size limits for uploads
  - improved error reporting
  sword:errorcode with namespace (and with description)

Integration with complex workflows
- asynchronous notification

===

DSpace
Michele Kimpton

Interop

* Business
- need a defined business case / use case, because there is a small developer community

community will rally around common protocols

* operational
- policy transfer-control
  - embargo, authentication, dark archive...
- metadata loss
- identifier compatibility and acceptance

* technical
- numerous content packages
- representation incompatibilities
- interpretation of standards

Community Efforts

* OAI-PMH, OAI-ORE, SWORD, METS, IMS, SWAP
* federation across DSpace repositories
* working with key apps
* integration with "content creation" tools to ensure materials are deposited

===

issues: strong standardization of library *DATA*
        weak standardization of repository data

===

Les Carr
Eprints

drawing funny diagrams

user level interop

===

Sandy Payette
Fedora Commons and Interop

2007 Content Model Architecture (CMA)
- Registry of "content model" types for digital objects

Now: Simplicity

2008: Atom Syndication Format, OAI-ORE, simple common web APIs with wide appeal
and adopt other standards where possible

high-end interop (web services apis)
backend interop (Akubra) - various underlying storage - transactional stores, Sun HoneyComb, Internet Archive PetaBox

* Topaz - application level objects and semantic interoperability

light-weight ways to let apps define object types

info objects mapped into triples and persisted in Mulgara triplestore

* Fedora Middleware Projects
- Simple JMS layer with e.g. Gsearch, OAI, Ingest on top

What do users really want interoperability to achieve?

Q (me): heavyweight APIs vs lightweight?
A: light for integration with web apps, heavy inside enterprise

===

Issues
- federation & interop
  - support for delete, update
  - document formats
- content creation opportunities
- content flow -> ingest

discussion of harvesting for search, Google Scholar

how are people providing federated search
- OAI-PMH
- one-off federated integration
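
For the OAI-PMH route, harvesting starts with a ListRecords request and continues with resumption tokens. Here is a sketch that only builds the request URLs; the base URL is a hypothetical repository endpoint, and fetching and parsing the responses is left out.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # A resumption token is an exclusive argument: send it on its own.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


print(list_records_url("http://repository.example/oai"))
print(list_records_url("http://repository.example/oai", resumption_token="abc123"))
```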

Andy said something like "there's fundamental tension between simple and complex".
You can find Andy's liveblogging of the event through his Twitter stream

http://twitter.com/andypowe11

March 31, 2008

microformats links

I had a question at the NISO event about how to track developments in microformats, so this set of links is perhaps timely:

Microformats University: 100+ Articles and Resources

via business|bytes|genes|molecules - Around the web - March 29, 2008

March 29, 2008

The Two Laws of Robotic Librarians

In my NISO presentation I proposed a couple new library laws.
For some background, here's some info from Wikipedia

Ranganathan's Five Laws of Library Science (1931)

  1. Books are for use.
  2. Every reader his [or her] book.
  3. Every book its reader.
  4. Save the time of the reader.
  5. The library is a growing organism.

In 2004, librarian Alireza Noruzi recommended applying Ranganathan's laws to the web in his paper, "Application of Ranganathan's Laws to the Web":

  1. Web resources are for use.
  2. Every user has his or her web resource.
  3. Every web resource its user.
  4. Save the time of the user.
  5. The Web is a growing organism.

I propose Two Laws of Library Science... for Machines

  • Every web resource its machine reader.
  • Save the time of the machine.

By this I mean, our web resources need to be not just readable by humans (the presentation layer), they need to be readable by the machines, who have a hard time understanding presentation and natural language.  This may mean that the machine does some screen-scraping tricks, but that's fragile and time-consuming for the machine.  While you may not think saving the machine's time is an issue, there are two points: firstly, as the content on the web grows, we want it to be parsed by machines as quickly as possible, so that we get immediate discovery of new information; secondly, code running locally on a laptop or in particular a mobile phone/PDA may have limited compute and memory resources, and you may want that code to be able to alter web pages with additional discovery information fast enough so that there is no delay noticed by a user.

Now I have no Semantic Web illusions that people are going to nobly go back and mark up all their content with semantic information. That vision is a fantasy that lingers with us from the SGML days, and it's never going to happen.

Ralph LeVan took me to task, saying developers are not going to do extra work, the work is only done if there is a business case, and that developers are tasked with presentation GUIs for users, not with enriching web pages in invisible ways.

Well, yes and no.  People will do new things and extra work when they have a compelling motivation.  There may be many different motivations.  The system to register posts and get markup from ResearchBlogging is rather elaborate, but people do it because they want to be discovered.  Even a slight advantage in discovery can be a huge motivator to people.  That's why I think the Yahoo Open semantic initiative will bring a huge push for microformats.  And of course, it will never be people manually adding microformats in a big way anyway.  It will be our creation applications and tools that automatically insert microformats as appropriate.  Programmatically grabbing a DOI and inserting a visible citation is not a huge amount of effort... extending this to embed the citation as COinS is a minuscule additional step.
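
To show how small that additional step is, here is a sketch of generating a COinS span from citation data a tool would already have in hand. The DOI and citation values are placeholders, the function name is made up, and the handful of OpenURL keys is only a plausible minimum for a journal article.

```python
from html import escape
from urllib.parse import quote, urlencode


def coins_span(doi, article_title, journal_title, year):
    """Return an empty COinS span carrying an OpenURL ContextObject."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),
        ("rft_id", "info:doi/" + doi),
        ("rft.genre", "article"),
        ("rft.atitle", article_title),
        ("rft.jtitle", journal_title),
        ("rft.date", str(year)),
    ]
    # URL-encode the key/value pairs, then HTML-escape the ampersands so the
    # string is safe inside the title attribute.
    title = escape(urlencode(fields, quote_via=quote), quote=True)
    return f'<span class="Z3988" title="{title}"></span>'


print(coins_span("10.1000/example", "A Title", "Journal of Examples", 2008))
```

The span is invisible to human readers, but a browser plugin or crawler that knows the `Z3988` class can pick the citation up without any screen-scraping.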

And of course there will be people running very sophisticated algorithms on big networks of computers with loads of storage, to data mine out useful semantics, in particular about "science objects" like formulas, genes, chemicals etc. and then insert the proper microformats and identifiers for much simpler applications and machines to read.

Here's what Tim Berners-Lee had to say on the topic (from transcription of podcast done with Paul Miller)

Paul Miller: ... Another area that will require a huge amount of effort moving forward is around data for the Semantic Web. We're going to need an awful lot of it. Where are we going to get it from?

Tim Berners-Lee: There's an awful lot of data out there. And I think, one of the huge misunderstandings about the Semantic Web is, "oh, the Semantic Web is going to involve us all going to our HTML pages and marking them up to put semantics in them." Now, there's an important thread there, but to my mind, it's actually a very minor part of it. Because I'm not going to hold my breath while other people put semantics in by hand.

I'm not going to wait for other people to do it, and I don't want to do it either, to sort of add the semantics to HTML pages. So, where is the data going to come from? It's already there. It's in databases. So, most of this data is in databases. Often the data is already available through some kind of a Web interface.

March 28, 2008

where in the world are users generating?

In my NISO presentation I made a rather imperfect metaphorical point, which is that there is too much darkness (in the sense of "too little access to information") on this map.  1 billion of us have incredible (and some would claim overwhelming, or "too much") information access, and 5 billion of us do not.

[NASA Earth at Night]

Night illumination is of course a pretty weak proxy for "level of development".  One would hope actually that a truly developed society would not be wasting energy letting light leak up into the sky, and would instead have directed lighting with minimal sky glow.  But it gets the general point across.

I have argued in the past (in reference to Flickr and "peer production"), that you could take the "Earth at Night" image and map it pretty closely to user-created content, on the theory that the main driving force behind user content is leisure time, and it's in the industrial, light-blaring nations that we have the most of this Internet time spent.

Dinah Sanders had an excellent counterpoint in her presentation, which was (something like) "where you don't see light, there are people using mobile phones".  Which is very valid, the idea of developmental leapfrogging, where countries and individuals are able to take advantage of new, less energy-intensive, advanced technologies to leap right into the next generation of Internet use.

However, I present as my counter-counterpoint, the Meebo Map, showing use of their IM tool (and with lots of caveats about dependencies on IP mapping tools, different tool use depending on language and location, and population densities)

[meebo-map-20080328.jpg]

http://blog.meebo.com/map

March 27, 2008

Economist on social networking walled gardens

The Economist has a piece on social networking. The main points are that technology tends to move back and forth between closed and open periods, and that it's not clear how to make any money off of social network sites - in fact the harder you try to "monetize", the more likely you may be to drive people to other sites.

But should users really have to visit a specific website to do this sort of thing? “We will look back to 2008 and think it archaic and quaint that we had to go to a destination like Facebook or LinkedIn to be social,” says Charlene Li at Forrester Research, a consultancy. Future social networks, she thinks, “will be like air. They will be anywhere and everywhere we need and want them to be.” No more logging on to Facebook just to see the “news feed” of updates from your friends; instead it will come straight to your e-mail inbox, RSS reader or instant messenger. No need to upload photos to Facebook to show them to friends, since those with privacy permissions in your electronic address book can automatically get them.

The problem with today's social networks is that they are often closed to the outside web. The big networks have decided to be “open” toward independent programmers, to encourage them to write fun new software for them. But they are reluctant to become equally open towards their users, because the networks' lofty valuations depend on maximising their page views—so they maintain a tight grip on their users' information, to ensure that they keep coming back. As a result, avid internet users often maintain separate accounts on several social networks, instant-messaging services, photo-sharing and blogging sites, and usually cannot even send simple messages from one to the other. They must invite the same friends to each service separately. It is a drag.

Historically, online media tend to start this way. The early services, such as CompuServe, Prodigy or AOL, began as “walled gardens” before they opened up to become websites.

Economist - Everywhere and nowhere - March 19, 2008

March 24, 2008

TeacherTube studycast: Rock meets Lichen

One of my friends is a middle school teacher who tries to find ways to usefully integrate technology with his teaching.  This is an animation his students made, "Rock meets Lichen"

http://www.teachertube.com/view_video.php?viewkey=b818fffa3e8045eb95a6

You can find all of their studycasts on his wiki at

http://studycasts.wikispaces.com/Studycasts

March 20, 2008

Facebook adds cliques: yay!

I've been wanting for a long time to be able to share information in Facebook in a very granular way.  They have finally enabled it, but the settings are not as centralized as one might wish.

Some settings are under the master Privacy control panel

http://www.facebook.com/privacy.php

whereas others are more easily found on the individual pages for particular capabilities.  For example, photo privacy, to set who can view which albums, is at

http://www.facebook.com/privacy/?view=photos

And, very confusingly, there are also some application settings on the main privacy page

http://www.facebook.com/privacy/?view=platform

The very granular "let some friends/friends lists see an application in my profile, but not others" is in the Applications edit screen, under Edit Settings for each application.  Also, unfortunately, when you add a new application I didn't see any way to set the privacy before it is added to your profile, only after it is added.  So there is a brief window when it is outside of privacy control, using the default settings.

http://www.facebook.com/editapps.php

You can set access for friends either per individual, or per friend list, and you can add multiple friends lists to the allowed group ("allow only - default deny").  You can also exclude, or as I prefer to think of it, outcast specific friends lists ("allow all but - default allow").  It's like firewall rules for friends.

You can also if you want, allow friends of friends, and control access by network (networks are things like the Ottawa network, the Your Company Name Here network etc.)

This means that I can finally start adding some apps like Dopplr, which provides detailed travel info I might not necessarily want to share with the world.

[facebook-cliques-dopplr.jpg]

In case you're wondering, yes this is a real rule in my account, it says "allow all friends to see my Dopplr travel status, except those in friends list 'random people'". [UPDATE: err to clarify, 'random people' is a friends list that contains people that I don't know very well.]
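
The firewall analogy can be made concrete with a small sketch of how the two rule modes might be evaluated. This is my own model of the behaviour, not anything Facebook documents: the rule format and the friend names are invented.

```python
def can_see(viewer_lists, rule):
    """Decide whether a friend in the given friend lists passes a rule."""
    matched = bool(set(viewer_lists) & set(rule["lists"]))
    if rule["mode"] == "allow_only":      # default deny
        return matched
    if rule["mode"] == "allow_all_but":   # default allow
        return not matched
    raise ValueError(f"unknown mode: {rule['mode']}")


# The Dopplr rule described above: everyone except the 'random people' list.
dopplr_rule = {"mode": "allow_all_but", "lists": {"random people"}}

# Hypothetical friends and the lists they belong to.
friends = {"alice": {"work"}, "bob": {"random people"}, "carol": set()}

for name, lists in friends.items():
    print(name, can_see(lists, dopplr_rule))
```

Looping over friends like this is a crude version of a "see this as user X" check.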

See Facebook Blog - More Privacy Options for more info.

UPDATE: As with setting complex firewall rules, getting the rules set up for all applications is quite time consuming.  Things that would help:

  1. "Rule sets" - Allow creation of rules that can be applied to all applications, or to a subset of apps, and anywhere that privacy rules are used, e.g. "everyone in the lists 'work' and 'work friends' can see this, no one else".
  2. Testing - there is no way to test the rules that I can see - it would be very helpful to have a "see this profile as it appears to user X" option.

Previously:
January 18, 2008  social networlds colliding
November 08, 2007  posting TypePad entries to Facebook
August 14, 2007  FaceBook pulls open scientists into the dark web

Open Repositories 2008

Through an unexpected series of events I find myself going to Open Repositories 2008

http://or08.ecs.soton.ac.uk/

The lineup looks great including a keynote from Peter Murray-Rust, and two (!) sessions on Scientific Repositories.

There is also a Repository Challenge for developers with a £2,500 prize, which is like a million US dollars now (finally, Canadians get to make US dollar jokes).  Kudos to David Flanders for leading this "let's just build stuff and see what works" approach.

I will be blogging under tag/category or08, and twittering under hashtag #or08

I made an Upcoming event, mainly because then if you add the machine tag

upcoming:event=455039

to your Flickr photos, it will automatically put in a nice "Taken at Open Repositories 2008" logo.
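
Machine tags follow a namespace:predicate=value shape, which is what makes them easy for other software to act on. A rough sketch of pulling one apart; the regular expression is my own simplification, not Flickr's official grammar.

```python
import re

# Simplified pattern: namespace and predicate are alphanumeric words,
# and the value is everything after the equals sign.
MACHINE_TAG = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9_]*):(?P<predicate>[a-z][a-z0-9_]*)=(?P<value>.+)$",
    re.IGNORECASE,
)


def parse_machine_tag(tag: str):
    """Return the parts of a machine tag as a dict, or None for a plain tag."""
    match = MACHINE_TAG.match(tag)
    return match.groupdict() if match else None


print(parse_machine_tag("upcoming:event=455039"))
print(parse_machine_tag("southampton"))
```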

NLA announces Library Labs

The [National Library of Australia] has recently opened this "Library Labs" wiki space:

https://wiki.nla.gov.au/display/LABS/Home

The aim of this space is to let our colleagues know what we are doing, to invite comments, questions and feedback and to provide a space for discussion and collaboration.

We have started to redevelop our digital library services using a service-oriented architecture and open source software solutions where these are functional and robust.  We are also aiming to take a common ("single business") approach to collection management, discovery and delivery.

We are interested in forming a community of Australian business analysts and developers who are working on similar problems and who are interested in  interoperable, standards-based solutions. We are also interested in working with colleagues at an international level to provide prototypes and testbeds for new and emerging standards.

via Warwick Cathro
Assistant Director-General, Innovation
National Library of Australia

March 18, 2008

Google Book Search API

The Google Book Search Book Viewability API enables developers to:

                   
  • Link to Books in Google Book Search using ISBNs, LCCNs, and OCLC numbers
  • Know whether Google Book Search has a specific title and what the viewability of that title is
  • Generate links to a thumbnail of the cover of a book
  • Generate links to an informational page about a book
  • Generate links to a preview of a book

http://code.google.com/apis/books/
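As a rough idea of what "link to books using ISBNs, LCCNs, and OCLC numbers" looks like in practice, here is a small Python sketch that only builds a request URL. The endpoint and the bibkeys, jscmd, and callback parameter names are written from my recollection of the API documentation, so treat them as assumptions and check the documentation at the URL above; bibkey and viewability_url are my own hypothetical helper names, and the sketch does not make any network call.

```python
from urllib.parse import urlencode

# Assumed endpoint and parameter names; verify against the API documentation.
BASE_URL = "http://books.google.com/books"
KNOWN_PREFIXES = ("ISBN", "LCCN", "OCLC")


def bibkey(kind, number):
    """Build an identifier key such as 'ISBN:0451526538'."""
    kind = kind.upper()
    if kind not in KNOWN_PREFIXES:
        raise ValueError(f"unsupported identifier type: {kind}")
    return f"{kind}:{number}"


def viewability_url(bibkeys, callback="handleBooks"):
    """Return a URL asking about the viewability of one or more books.

    The response is expected to be JavaScript that calls `callback`,
    so this is meant for a script tag rather than a plain JSON fetch.
    """
    query = urlencode({
        "bibkeys": ",".join(bibkeys),
        "jscmd": "viewapi",
        "callback": callback,
    })
    return f"{BASE_URL}?{query}"
```

Several identifiers can go into one request, e.g. viewability_url([bibkey("isbn", "0451526538"), bibkey("oclc", "12345")]), where both numbers are just placeholders.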

via LibraryThing blog - Google Books in LibraryThing - March 13, 2008

We need APIs everywhere for everything.

March 17, 2008

Semantically-enriched search results coming from Yahoo

In an upcoming talk I will be continuing a theme I started at Allen Press, calling for more semantic enrichment of scientific information online (I am, of course, only one of many making such calls).

It is therefore timely to see Yahoo offering an open platform for harvesting and returning semantically-enhanced search.

There was a pre-announcement on TechCrunch, followed by the official word on the Yahoo Search Blog

In the coming weeks, we'll be releasing more detailed specifications that will describe our support of semantic web standards. Initially, we plan to support a number of microformats, including hCard, hCalendar, hReview, hAtom, and XFN. Yahoo! Search will work with the web community to evolve the vocabulary framework for embedding structured data. For starters, we plan to support vocabulary components from Dublin Core, Creative Commons, FOAF, GeoRSS, MediaRSS, and others based on feedback. And, we will support RDFa and eRDF markup to embed these into existing HTML pages. Finally, we are announcing support for the OpenSearch specification, with extensions for structured queries to deep web data sources.

Yahoo Search Blog - The Yahoo! Search Open Ecosystem - March 13, 2008

You can sign up for more information at

http://tools.search.yahoo.com/newsearch/open.html

So what would an appropriate set of semantic information be for a scientific article?  What would your ideal search display include?  # of citations?  Impact Factor?  Chemical and gene sequences?  Price?  (Sometimes information wants to be expensive...)  How much can we fit into a couple of lines that will help to select one article over another in results?
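As a starting point for the simplest part of that question, here is a sketch of embedding Dublin Core elements in an article's HTML head, using the common convention of meta tags named DC.element plus a schema link. It is my own illustration, not anything Yahoo has specified: dublin_core_meta is a hypothetical helper and the article details in the usage note are invented. Note that Dublin Core has no obvious home for citation counts, Impact Factor, or price, which is rather the point of the question.

```python
from html import escape

# Namespace for the original fifteen Dublin Core elements.
DC_SCHEMA = "http://purl.org/dc/elements/1.1/"


def dublin_core_meta(fields):
    """Render a dict of Dublin Core elements as HTML head tags.

    Keys are element names such as 'title' or 'creator'; values are
    escaped so that quotes and ampersands cannot break the markup.
    """
    lines = [f'<link rel="schema.DC" href="{DC_SCHEMA}">']
    for element, value in fields.items():
        content = escape(str(value), quote=True)
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)
```

For example, dublin_core_meta({"title": "An invented article", "creator": "A. Author", "date": "2008-03-17"}) produces one link line followed by three meta lines that a harvester could read.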

UPDATE: And Yahoo is just one player in this space, as Paul Miller indicates in his posting Looking for a dominant Semantic Web search engine.

via Twitter mostly

March 06, 2008

context and location awareness

There is a lot of buzz about the next generation of technology providing better information and services by being aware of the location and the context in which the device is being used.

Olli-Pekka Kallasvuo, president and CEO of Nokia, came to the Mobile World Congress... to declare that Nokia will "reshape the Internet."

Nokia believes it, not Google, can deliver operator-independent, cross-platform phones through new software and services. How does Nokia presume that it can reshape an Internet so firmly established already? Nokia's answer lies in Maps 2.0, which the company claimed enables a "context-aware Internet" that combines multimedia features, Internet and Assisted-GPS, "We can bring more relevant and powerful context" to users browsing on the Internet, claimed Kallasvuo.

Niklas Savander, Nokia's executive vice president of services and software, added: "By adding context--such as time, place and people--to the Internet, the Web will become something very different from the one you have today."

EETimes - Nokia, not Google, sees itself reshaping the Internet - February 11, 2008

Some of those in the thick of battle are resigned to having a lot of company. “If there weren’t competitors, there wouldn’t be a market,” said Dan Harple, founder and chief executive of GyPSii, a mobile social network based in Amsterdam that is a contender. “Maybe there are 30 or more now — in three years, there will be 5 that matter.”

The prize, as these start-ups see it, is the 3.3 billion cellphone subscribers, a number that far surpasses the total of Internet users. The advantage over computer-based communities, they believe, is the ability to know where a cellphone is, thanks to global positioning satellites and related technologies.

...

Most mobile social networks seek to capitalize on location information. The SpaceMe service from GyPSii, for instance, will show users where friends and other members are in real time.

New York Times - Social Networking Moves to the Cellphone - March 6, 2008

Well established as the business mobile device of choice, the BlackBerry may soon become a much more social smartphone, says the co-CEO of creator Research In Motion Ltd.

Jim Balsillie says RIM wants the BlackBerry positioned to tap into the growing trend of Internet social networking sites such as FaceBook.com that allow consumers to share information about their lives, and access multimedia content, particularly music, on their mobile devices.

"Architecturally, music and the social networking are going to merge," Balsillie said ahead of a Thursday speech to the Canadian Music Week festival in Toronto.

CP - RIM looks to make social networking part of BlackBerry's strategy - March 6, 2008

Although the above is about mobile, Google is already "location aware", to the extent that each country version of Google ranks results in a different order, presumably based on language and click tracking, amongst other things.  So e.g. Google Canada will list hits in a different order (for the same search) than Google France.

It gets pretty complex to try to make meaningful context decisions though.  If it knows you're in a coffeeshop, should it return higher ranking results for "java" as it relates to coffee?  What if you do all your computer programming in coffeeshops?

This applies beyond mobile devices, to context awareness for any app being used on any platform anywhere, whether at work, at home, or on the go.

Of course, there is an extent to which the computer either implicitly or explicitly knowing more about the context and location of your activities is very privacy intrusive (e.g. hypothetical location-aware shopping application "I see you're passing a drug store on the way to your girlfriend's apartment, perhaps you should purchase some prophylactics?")

To rephrase something I wrote in my Twitter, I find my online and mobile walled gardens either have too many walls, or no walls whatsoever.  I would like to have a lot more control over the barriers and translucency of those barriers.  If my friends want to know my exact location down to the metre, that's fine, but as my circle of acquaintances expands outward, I want the precision of my location to decrease, so that maybe people I know less well are shown what city I'm in, and people I don't know at all only get to see that I am currently somewhere in the vicinity of planet earth.
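The graduated precision I have in mind could be sketched like this. It is a toy illustration, not the behaviour of any existing service: the circle names, the location_for function, and the choice of five decimal places (roughly a metre of latitude) for friends are all my own assumptions.

```python
def location_for(circle, lat, lon, city):
    """Return the location description a viewer in `circle` is allowed to see.

    Precision drops as the social circle widens: coordinates for friends,
    a city name for acquaintances, and nothing useful for everyone else.
    """
    if circle == "friends":
        # Five decimal places of a degree is roughly metre-level precision.
        return f"{lat:.5f}, {lon:.5f}"
    if circle == "acquaintances":
        return city
    # Unknown or unrecognised circles fall through to the least precise answer.
    return "somewhere in the vicinity of planet Earth"
```

The design choice that matters is the default: anyone not explicitly placed in a closer circle gets the vaguest answer, so a mistake in the lists errs on the side of privacy.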

In a way, this is old news anyway.  The next thing that was supposed to follow on from the e-commerce bubble in 2000 was "m-commerce".  The cellphone companies conceived this as the m-commerce "value chain", by which they meant, extracting value FROM you, FOR them, all the way along the chain.  So they wanted not only to charge Amazon for placement on their wireless portal, they thought they should get a cut of everything you bought from Amazon.

I thought this was ridiculous when they were talking about it in 2000...

"E-Commerce value chain has many more steps and players than standard"

Notes on Wireless Internet for E-Commerce seminar - April 28, 2000

and when Tod Maffin talked about "CRM M-Commerce" in 2001, my general feeling was that the day a coupon pops up on my phone screen when I pass a store is the day I throw my cellphone into the river.

Accordingly, given people's widely varying expectations of privacy and "value", we are going to need much more granular and much more interoperable tools in order to achieve workable context awareness (including location).

Yahoo's Fire Eagle is an infrastructure piece, an architecture for sharing location information between applications.

fire-eagle

Plus which, this is all very nice in theory, but given that in Canada our mobile providers have near-total control over their networks, and have data plans that are expensive and/or limited, mostly incomprehensible, and don't cover roaming outside Canada anyway, I think it will be a while before most Canadians are willing to use any sort of advanced mobile applications.

I actually think the carriers are setting themselves up for a fall, because Canada is concentrated in a few cities with lots of WiFi, so as soon as more phones have WiFi, people will use that to the exclusion of wireless data, and may even try to do a lot more VoIP over WiFi as well.

Related:
February 15, 2008  CNet reviews Nokia 6210 Navigator GPS phone [including Maps 2.0]
February 05, 2008   Nokia Location Tagger: in-phone photo geocoding

March 03, 2008

Chapters invites you to their content acquisition community

Chapters-Indigo is the major bookstore chain in Canada.  What do they provide for booklovers online?

Chapters Indigo Community, "Acceptable Use Policy"

http://www.chapters.indigo.ca/Legal-Statement/legal-art.html

Emphasis mine.

The User acknowledges that any content, e-mails, postings, offers, software, videos, photos, text, graphics, music, sounds, questions, creative suggestions, messages, feedback, ideas, recipes, notes, drawings, articles, stories or other information, data, materials and opinions (including, without limitation any postings on community forums) ("Submissions") that he or she may provide, e-mail, post, upload or otherwise transmit to the Website shall be deemed and shall remain the property of Indigo, including all copyright, without reservation, and User waives in favour of Indigo any and all moral rights in such Submissions. Except as provided in the Privacy Policy, none of the Submissions shall be subject to any obligation of confidence on Indigo’s part, and We shall not be liable for any use or disclosure of any Submissions. Without limitation of the foregoing, the User acknowledges and agrees that all or any portion of the Submissions may be used, edited, reproduced, published, translated, sublicensed, copied and distributed and/or incorporated into other works in any form, media, or technology now known or hereafter developed for the full term of any copyright that may exist in such Submissions, without compensation of any kind to the User. When You post Submissions to the Website, You authorize and direct Indigo to make such copies thereof as We deem necessary in order to facilitate the posting and storage of the Submissions on the Website. 
By posting Submissions to any part of the Website, You automatically grant, and You represent and warrant that You have the right to grant, to Indigo an irrevocable, perpetual, non-exclusive, transferable, fully paid, worldwide license (with the right to sublicense) to use, copy, publicly perform, publicly display, reformat, translate, excerpt (in whole or in part) and distribute such Submissions for any purpose on or in connection with the Website or the promotion thereof, to prepare derivative works of, or incorporate into other works, such Submissions, and to grant and authorize sublicenses of the foregoing. You agree to defend, indemnify and hold Indigo, its subsidiaries and affiliates, and each of their directors, officers, agents, contractors, partners and employees, harmless from and against any loss, liability, threatened or actual claim, demand, damages, costs and expenses, including reasonable legal fees, arising out of or in connection with any of Your Submissions.

Err thanks, but my idea of community doesn't involve transferring my ideas to a private corporation for reproduction in formats yet to be invented, until the end of time.

Way to miss the point of social networking, Chapter-Sin-digo.

They say they have 80,000 members.  I don't see how that is even possible, considering that even the older, well-established, and wildly popular LibraryThing has only 368,000 members.

Science Policies and Science Portals - registration open

Registration is now open for

IFLA 2008 Satellite meeting
Science Policies and Science Portals

Canada, Montreal, Polytechnique Montreal - Friday August 8th 2008

I also made an Upcoming.org event.

I've proposed a tag: ifla2008science

Researchers - Create Change Canada

http://www.createchangecanada.ca/ is a Canadian adaptation of the US http://www.createchange.org/

It provides information about the new modes of communicating scholarly information in the digital environment.

February 25, 2008

Adobe adds AIR to cloud

AIR is intended to help software developers create applications that exist in part on a user’s PC or smartphone and in part on servers reachable through the Internet.

To computer users, the applications will look like any others on their device, represented by an icon. The AIR applications can mimic the functions of a Web browser but do not require a Web browser to run.

...

“There is a big cloud movement that is building an infrastructure that speaks directly to this kind of software and experience,” said Sean M. Maloney, Intel’s executive vice president.

New York Times - Adobe Blurs Line Between PC and Web - February 25, 2008

AIR has graduated from Adobe Labs, and is now available for free download at

http://www.adobe.com/products/air/

I have to say, I'm not really clear how this is any different from, or better than, Java.
I guess the argument is that it uses web standardsy stuff, so it's easier to program than Java.

Plus, it's all well and good to say cloud this and cloud that, but I don't see any indication that AIR provides you with a storage cloud from Adobe that you can use.

I found a posting by a GWT developer that briefly introduces the major competing technologies in this space

Adobe AIR/Apollo vs Ajax vs Gears vs Flash vs Silverlight vs JavaFX vs GWT - June 11, 2007

I also wonder about the implications for searching.  Our search engines mostly eat text or things that can be converted easily to plain text (Word documents, PDFs, PowerPoints).  You can do a whole fancy site in Flash and to a search engine it will barely exist.  If we move from building text-based web sites to interaction-based web apps, how will we ever be able to find anything again?
