I think the Long Tail concepts have a lot of relevance for those engaged in providing Internet services. I want to look in particular at the provision of academic content.
Let me step back for a moment to the book itself. It examines the Long Tail from a variety of angles. In an earlier draft of my book review, I had broken these down as availability, discoverability, and tapping into creativity.
All three are essential to the Long Tail effect. It can be useful to use these to frame what business you are in. Are you about all three, or just one in particular?
For example, PhotoBucket is in the availability business. You get a bucket of storage, you dump your photos in. It is mostly not in the discoverability business. That's up to the users, as they post the photos in various places on the net. I would also consider Amazon S3 and Open Access repositories to be mainly in the availability business.
Google, of course, is a classic example of a discoverability business. And I think it's really in understanding the differences between availability and discoverability that we can learn a lot about our businesses.
Libraries are mainly about availability, as far as I'm concerned. I think one of the big conflicts has been that some libraries thought they were in the discoverability business, which is why they perceive Google to be a competitor or a threat. One of the big areas of confusion, I think, is that physical availability is about providing the container. If I can find the book in its one-and-only-one possible shelf location, then I can provide you with the service. In the online world, availability is about providing the content. This is also a business that libraries thought they were in, but again I would argue, they really weren't.
The other thing is that availability is not as "glamorous" as discovery, particularly since the Internet experience is discovery-centric, often starting with a search query. I think what happened, as was discussed at the Info Grid conference, is that there was a big digital library push - taking offline content and making it available online. This has been a big success, but then what we found out is that in the online world, simple availability isn't enough. Social networking and other informal discovery methods had not made it into the online digital libraries, since that wasn't their focus.
Availability is, however, important work, as long as we understand that it is just one part of the three essential elements. I was at the Acadia University Herbarium this week. In a way, I felt like I had stepped back in time - while their facility is very modern, what it holds are shelves upon shelves of plants, pressed flat, glued onto paper, and labelled. All very Victorian explorer age to me. They are in the process of "dematerializing" this collection, not literally, but in the sense of scanning it in for availability online. Out of the 200,000 specimens, so far two summer students have gotten 1000 online this summer, so you can see that there is lots of work left in the availability business. (Also see Technology brings new life to Acadia University's herbarium, press release, November 25, 2004.)
That being said, I think basic availability is well understood - digitize your backfiles, digitize books, digitize plant sheets... it's fairly straightforward. The challenge is really around discoverability in particular. I think that many of us in the academic content business thought we were discoverability experts, but, err, discovered that this is not really the case. We need to find ways to partner with companies that do have expertise, as well as expanding into areas of true (not "the way we think people should work") discoverability. As I indicated in my book review, I think this is the major area for exploration, and the one in which the research sector can benefit most from commercial developments and corporate expertise. Rather than fighting Google, we should be seeing the benefits of partnership.
I have certainly heard internal reference to CISTI as being a Long Tail business, due to our deep holdings of (paper, undigitized) academic content. The business side is certainly interested in finding ways to drive document delivery demand "down the Long Tail". It would certainly be interesting to see some analysis on the extent to which docdel and ILL (the Long TaILL?) follow the power-law curve. Are there "hits" in the world of docdel and ILL, or are we mostly serving from the tail already? I'm more interested in making sure that no knowledge is "missed" - in the torrent of information we receive from the net, are important articles not reaching their potential audience (whether it be individuals concerned about a particular medical condition, or researchers who could benefit from an additional piece of information)?
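As a rough idea of what a first pass at that analysis might look like, here is a minimal Python sketch. It assumes a request log with one entry per fulfilled request; the `head_share` function, the 20% cutoff, and the sample data are all hypothetical, and this is only a quick "are there hits?" check, not a proper power-law fit.

```python
from collections import Counter


def head_share(request_log, head_fraction=0.2):
    """Return the fraction of all requests that went to the most-requested titles.

    request_log   -- iterable of title identifiers, one per request (assumed format)
    head_fraction -- share of distinct titles treated as the "head" (assumed cutoff)
    """
    # Count requests per title, then order titles from most to least requested.
    counts = sorted(Counter(request_log).values(), reverse=True)
    if not counts:
        return 0.0
    # Always keep at least one title in the head.
    head_size = max(1, round(len(counts) * head_fraction))
    return sum(counts[:head_size]) / sum(counts)


# Hypothetical log: two popular titles and eight titles requested once each.
sample_log = ["A"] * 8 + ["B"] * 4 + ["C", "D", "E", "F", "G", "H", "I", "J"]
print(head_share(sample_log))
```

A value close to 1 would suggest a hit-driven service, while a value close to the head fraction itself would suggest that requests are already spread fairly evenly along the tail.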
I certainly think that with full-text, you open up many opportunities. In a book about psychology, there might just happen to be a paragraph with a side story of a Viennese cafe you are researching. Normal systems of classification would never enable you to discover this, but full-text search across books (and articles) will.
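To make the contrast concrete, here is a small sketch in Python. The two "books", their subject headings, and the `build_index` and `search` helpers are made up for illustration; a real full-text system would be far more sophisticated, but the basic idea of indexing every word rather than just the assigned subjects is the same.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercase word to the set of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in texts.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = re.findall(r"[a-z]+", query.lower())
    if not words:
        return set()
    # .get avoids adding empty entries to the index for unknown words.
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)


# Hypothetical collection: full text plus the subject heading a cataloguer assigned.
texts = {
    "psychology-book": "The patient recalled an afternoon at a Viennese cafe near the clinic.",
    "botany-book": "Pressed specimens are glued onto paper and labelled.",
}
subjects = {"psychology-book": {"psychology"}, "botany-book": {"botany"}}

full_text_index = build_index(texts)
print(search(full_text_index, "Viennese cafe"))            # found via full text
print([b for b, s in subjects.items() if "cafe" in s])     # not found via subjects
```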
Tapping into creativity is a whole other dimension that I won't really cover, although there are some overlaps with discoverability: e.g. if you let scientists tag articles as in Connotea, is that discoverability or creativity? Anyway, another business you can be in is providing the tools to support creativity. And isn't creativity at the core of research... hmm...
Beyond Academic Content - the Internet of Stuff
I think that as more and more of our offline world goes online, we will better understand these challenges. Many of the big business opportunities covered in Anderson's book are, to my mind, the result of moving inventories online. We have had, for decades, very efficient systems for getting stuff into organizations (whether it be a library, or an individual home), but perhaps by design, very inefficient systems for getting stuff out. There is a huge funnel that exists to pour stuff into your house, but if you wanted to get it back out of your house, you had pretty limited options: the trash was and is a very popular option, followed by the hassle of classified ads and garage sales. eBay gets a lot of attention because of the auction model, but actually what it tapped into was the Internet of Stuff, the World Inventory. (A lot of sales on eBay are immediate, at the "Buy It Now" price.)
This is mainly, I think, a North American problem, as we are very fond of accumulating huge amounts of stuff. I am hoping that we will start to build much more efficient systems that allow us to "search globally, buy locally". It is a bit unfortunate, I think, that it is easier to discover and buy stuff from across the entire continent (say in my case, to buy stuff from California and have it shipped to Ottawa) than it is to quickly locate items in stores a block or two away from my house. Some companies have made strides in this direction - e.g. TheSourceCC provides the ability to check local store inventories, but things are still in pretty primitive stages I think.
Perhaps one day, we will have the Inventory of All - the location of every item available, anywhere in the world, with attached price or conditions. This has lots of scary implications, but are LibraryThing and related applications telling us this is the direction that people want to go?
Previously: my review of The Long Tail