For some reason, our lists of library conferences are, how shall we put it nicely, geocoding-challenged? Semantically unrich?
The easiest fix would be for the list creators (or others) to run the lists through a simple geocoder and produce GeoRSS.
In the meantime, here's a demonstration hack to show how easily it can be done.
"Library Related Conferences" list from
http://homepage.usask.ca/~mad204/CONF.HTM
Let's chop it up with an online tool.
RSSxl - Convert an HTML Web Page to RSS
Item start/end: <tr> and </tr>
Description start/end: <td> and </a></td>
Link to extract: 2
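For anyone who would rather script this step, here is a rough Python sketch of what those RSSxl settings appear to do. It is not RSSxl's actual implementation: the function name extract_items is made up for illustration, and it assumes "Link to extract: 2" means "use the second href found in each row". It also matches the start and end markers literally, so a row written as <tr bgcolor="..."> would be missed.

```python
import re


def extract_items(html, item_start="<tr>", item_end="</tr>",
                  desc_start="<td>", desc_end="</a></td>", link_index=2):
    """Cut an HTML page into feed-like items using literal start/end markers.

    Hypothetical helper: mimics the RSSxl settings listed above, assuming
    link_index=2 means "the second href inside the item".
    """
    item_re = re.compile(re.escape(item_start) + r"(.*?)" + re.escape(item_end),
                         re.S | re.I)
    desc_re = re.compile(re.escape(desc_start) + r"(.*?)" + re.escape(desc_end),
                         re.S | re.I)
    href_re = re.compile(r'href\s*=\s*["\']([^"\']+)["\']', re.I)

    items = []
    for chunk in item_re.findall(html):
        desc = desc_re.search(chunk)
        links = href_re.findall(chunk)
        # Skip rows that have no description or too few links.
        if desc is None or len(links) < link_index:
            continue
        # Drop any remaining tags and collapse whitespace in the description.
        text = re.sub(r"<[^>]+>", "", desc.group(1))
        text = " ".join(text.split())
        items.append({"description": text, "link": links[link_index - 1]})
    return items
```

Each returned dictionary would then become one RSS item, with the description as the item text and the link as the item URL.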
Ok, we've cooked some RSS.
http://scilib.typepad.com/science_library_pad/geotest/wotz-conf.xml
Pass it to GeoNames RSS-to-GeoRSS encoder.
It doesn't like an embedded control character - com.sun.syndication.io.ParsingFeedException: Invalid XML: Error on line 46: An invalid XML character (Unicode: 0x13) was found in the element content of the document.
Strip out everything except legal XML characters using BBEdit.
http://scilib.typepad.com/science_library_pad/geotest/wotz-conf-zap.xml
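If BBEdit isn't to hand, the same cleanup can be sketched in a few lines of Python. This is only an illustration, not what was actually run: the function name strip_illegal_xml_chars is invented, and the character ranges are the ones the XML 1.0 specification allows (tab, line feed, carriage return, and the ranges from 0x20 upward, minus surrogates and 0xFFFE/0xFFFF).

```python
import re

# Anything outside the XML 1.0 "Char" production is illegal in a document.
_ILLEGAL_XML_CHARS = re.compile(
    r"[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_illegal_xml_chars(text):
    """Remove characters that an XML 1.0 parser would reject (e.g. 0x13)."""
    return _ILLEGAL_XML_CHARS.sub("", text)
```

Run over the text of the feed before handing it to the encoder, this removes stray control characters such as the 0x13 in the error above while leaving accented and other legitimate non-ASCII characters alone.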
GeoRSSify
Result: GeoRSS test file
http://scilib.typepad.com/science_library_pad/geotest/wotz-conf-zap-geo.xml
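To give a feel for what "GeoRSSify" means, here is a minimal Python sketch that adds a GeoRSS-Simple georss:point element to each RSS item whose text mentions a known place. It is not how the GeoNames encoder works: the tiny GAZETTEER dictionary (with approximate coordinates) is a hypothetical stand-in for a real geocoder, and georssify is a name invented for this example.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)

# Hypothetical stand-in for a real geocoder: place name -> approximate (lat, lon).
GAZETTEER = {
    "Saskatoon": (52.13, -106.67),
    "Vancouver": (49.25, -123.12),
}


def georssify(rss_xml, gazetteer=GAZETTEER):
    """Add <georss:point>lat lon</georss:point> to items that mention a known place."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        parts = (item.findtext("title"), item.findtext("description"))
        text = " ".join(p for p in parts if p)
        for place, (lat, lon) in gazetteer.items():
            if place in text:
                point = ET.SubElement(item, f"{{{GEORSS_NS}}}point")
                point.text = f"{lat} {lon}"
                break  # one point per item is enough for this sketch
    return ET.tostring(root, encoding="unicode")
```

Items with no recognizable place are left untouched, so they simply won't appear on the map.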
Visualize!
It's that easy. Yes, the placemarks on the map are clickable, and the hyperlinked dates will take you to the particular conference URL.
You can also feed GeoRSS directly into Google Maps.
I've been thinking a lot about how you could cleanly parse these lists and accurately geocode them, but this was the simplest demo I could come up with, without any programming involved.
Interesting approach...
I immediately saw it as a Yahoo Pipes problem, but the HTML page is too long to load into a pipe...
...so I tried a dapp, set up for a map, but that doesn't seem to geocode, though it does give a clunky RSS feed out (it would be better to go into Dapper again and create a dapp that gives a more usefully structured RSS feed out, like one that captures the [more] link)
http://www.dapper.net/dapp-howto-use.php?dappName=Libraryconfs
I then passed the dapp RSS feed to a pipe that does the geocoding (it could also tidy up the date a bit, as well as cleaning out the null/non-geocoded entries, to produce quite a rich feed along the lines of this MyMaps Geoblogging pipe: http://blogs.open.ac.uk/Maths/ajh59/012936.html):
http://pipes.yahoo.com/pipes/pipe.info?_id=nj5BpsbX3BGrK_qyyZ1_DQ
There are too many items for this feed to work properly, e.g. when using the KML to populate a Google map, but it could probably be tweaked easily enough to give items n to m.
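Outside Pipes, the "items n to m" idea might look something like the Python sketch below. It is only an illustration under assumptions: slice_feed is a made-up name, and the feed is assumed to be plain RSS 2.0 with item elements directly inside channel.

```python
import xml.etree.ElementTree as ET


def slice_feed(rss_xml, n, m):
    """Keep only items n..m (1-based, inclusive) in each channel of an RSS feed."""
    root = ET.fromstring(rss_xml)
    # Materialize the channel list first so removals don't disturb iteration.
    for channel in list(root.iter("channel")):
        for index, item in enumerate(channel.findall("item"), start=1):
            if not (n <= index <= m):
                channel.remove(item)
    return ET.tostring(root, encoding="unicode")
```

Calling slice_feed(feed_text, 1, 50), then slice_feed(feed_text, 51, 100), and so on would give a series of smaller feeds that a map is more likely to cope with.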
Posted by: Tony Hirst | February 10, 2008 at 06:05 AM