Friday, December 05, 2008

This is pretty exciting news from Amazon: they're launching a service to freely host public data sets. Part of building was about that problem, but Amazon can do a better job of storage than a smaller organization can. Problems still to be solved:
  • finding the right dataset -- somebody needs to do data-focused searches on terms like "lyme disease" or "lung cancer mortality", and this probably needs a bit of ontology work
  • automatically generated visualizations appropriate to well-known kinds of data
  • Wikipedia-style annotation, comments, highlighting: people mining and analyzing the dataset not only in private but also in public, and benefiting from each other's work
Add this to the long list of data-oriented Web efforts in the last two years (Palantir, Swivel, Flu Trends, Flowing Data, to which I owe a hat tip, and more), and it's the hot thing that calendaring was a couple of years ago when Eventful and Meetup were launching. Financial services were the first I saw to do sophisticated dynamic online visualizations of data, followed by Google Analytics, and it's only starting from there.

