Digging Olympic Data at Londinium MMXII

This is a guest post by Makoto Inoue, one of the organisers of this weekend’s Londinium MMXII hackathon.

The Olympics! Only a few days to go until seemingly every news camera on the planet is pointed at the East End of London, for a month of sporting coverage. But for data diggers everywhere, this is also a gigantic opportunity to analyse and visualise whole swathes of sporting data, as well as create new devices and apps to amplify, manage and make sense of the data in interesting ways.

Remapping past Olympic results onto the London 2012 schedule to predict the medal ranking leaderboard

I’m organising the Londinium MMXII Hackathon, which takes place the day after the opening of the Olympics so that participants can do cool hacks using real-time data. But while you can use Twitter and Facebook to gather social buzz, or TfL, Google Maps and Foursquare to do geo mashups, it turns out the one dataset we’re missing is real-time game results. I spent a long time trying to find out whether there are any publicly available data APIs, but in the end it looked like we were out of luck!

Out of luck, that is, until we found out about ScraperWiki. Rather than waiting for the data to come to us, ScraperWiki lets us go and grab the freshest data ourselves. After all, there will be tons of news sites publishing the Olympic schedule, and many (such as the BBC) are structured well enough to scrape reliably. Since the BBC publishes the schedule (and, from the look of it, the results) of each event, including, most importantly, the exact time of each event, we can easily set up periodic scheduled jobs to scrape the latest data as it is announced. Perfect!
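To give a feel for what such a scraper involves, here is a minimal Python sketch that turns an HTML schedule table into a list of records. It is only an illustration: the markup in `SAMPLE_HTML`, its rows, and the names `ScheduleParser` and `parse_schedule` are made up for this example, and the real BBC pages will almost certainly be structured differently. It uses only the standard library and an inline sample, so it does not fetch anything over the network.

```python
from html.parser import HTMLParser

# Invented sample markup and placeholder rows. This is NOT the real BBC page
# structure; adapt the parser to whatever the page you scrape looks like.
SAMPLE_HTML = """
<table class="schedule">
  <tr><th>Time</th><th>Sport</th><th>Event</th></tr>
  <tr><td>2012-07-28 10:00</td><td>Cycling</td><td>Sample road race</td></tr>
  <tr><td>2012-07-28 19:30</td><td>Swimming</td><td>Sample medley final</td></tr>
</table>
"""

FIELDS = ("time", "sport", "event")


class ScheduleParser(HTMLParser):
    """Collects the text of each <td> cell, one dict per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []         # finished rows, as dicts keyed by FIELDS
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty and are skipped.
            if self._row:
                cells = [cell.strip() for cell in self._row]
                self.rows.append(dict(zip(FIELDS, cells)))
            self._row = None

    def handle_data(self, data):
        # Text may arrive in several chunks, so append rather than assign.
        if self._in_cell:
            self._row[-1] += data


def parse_schedule(html):
    """Return a list of {'time', 'sport', 'event'} dicts from a schedule table."""
    parser = ScheduleParser()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    for row in parse_schedule(SAMPLE_HTML):
        print(row["time"], row["sport"], row["event"])
```

In a scheduled job, you would swap the inline sample for the freshly downloaded page and save each record keyed on something stable (for example, time plus event name), so that repeated runs update existing rows instead of duplicating them.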

I’ve already written one scraper while working on an Olympic medal rivalry article, so feel free to copy it as your own starting point. Setting an hourly cron job on ScraperWiki is normally a premium service, but the guys at ScraperWiki are so keen to see what data the Londinium MMXII hackers can come up with that they’re allowing all participants free access to set an hourly cron for the duration of the hackathon (thanks, ScraperWiki!). So let’s join the hackathon and hack together!
