Some ScraperWikiLovin’ at MozFest

This weekend saw ideas made reality, collaborations fostered and the future web bloom. The Mozilla Festival was all about making the web, and making it happen in two days! Here at ScraperWiki we like doing that with data, so as well as contributing to the Data Journalism Handbook, we held a quick-fire ScraperWiki round.

And when I say quick, I mean ~1hr! With a couple of geeks on hand, some eager journalist types, laptops and our ever-articulate CEO, Francis Irving, we set to work, well, talking about data. The fact is there are many pre-scraping steps to consider:

  1. What is the general area you are interested in?
  2. Can you find other people, especially geeks, with that interest?
  3. When you have done so, find out where the data relating to your field of interest lives
  4. Once you’ve got a list of interesting data, look at its structure (non-programmatically) in order to decide on a hypothesis to test
  5. Then recruit your geek (who should be involved in all of the above steps) to start deconstructing the data, i.e. seeing what can be scraped
  6. At this point you all need to work together to decide the schema of the scraper datastore, i.e. the headings and their attributes
  7. Iterate until your data can test your hypothesis, or alter your hypothesis (it could be that you can mash the scraped data with another dataset)
  8. Get working on testing your hypothesis. The outcome could be a query, a visualization or an application
  9. Go back to your data and iterate again so that the structure fits your outcome
  10. Pat yourselves on the back, have a beer and keep in touch for your next project
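
To make steps 6 to 8 a little more concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. It is not the scraper we built in the session: the headings in SCHEMA, the table name and the sample rows are all made up for illustration, and the "hypothesis" is simply a total per council.

```python
import sqlite3

# Hypothetical schema agreed by the team (step 6): headings and their types.
SCHEMA = {"council": "TEXT", "year": "INTEGER", "spend_gbp": "REAL"}


def create_store(conn, table="records"):
    """Create the datastore table from the agreed headings."""
    columns = ", ".join(f"{name} {sqltype}" for name, sqltype in SCHEMA.items())
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({columns})")


def save_rows(conn, rows, table="records"):
    """Store scraped rows (dicts keyed by heading) in the datastore."""
    names = list(SCHEMA)
    placeholders = ", ".join("?" for _ in names)
    conn.executemany(
        f"INSERT INTO {table} ({', '.join(names)}) VALUES ({placeholders})",
        [tuple(row[name] for name in names) for row in rows],
    )


def total_spend_by_council(conn, table="records"):
    """A query as the outcome (step 8): total spend per council."""
    cur = conn.execute(
        f"SELECT council, SUM(spend_gbp) FROM {table} "
        "GROUP BY council ORDER BY council"
    )
    return dict(cur.fetchall())


# Made-up sample rows standing in for whatever the scraper collects.
conn = sqlite3.connect(":memory:")
create_store(conn)
save_rows(conn, [
    {"council": "Anytown", "year": 2010, "spend_gbp": 100.0},
    {"council": "Anytown", "year": 2011, "spend_gbp": 150.0},
    {"council": "Otherville", "year": 2011, "spend_gbp": 80.0},
])
totals = total_spend_by_council(conn)
```

If the query cannot answer your question, that is the cue for steps 7 and 9: change the headings in SCHEMA, re-scrape, and try again.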

This may seem a bit much, but this is how you make, iterate, and mediate for the web. The Mozilla Festival proved that this is achievable and enjoyable. In that vein, we got a scraper built in 1hr! So a big cheer to Alex Poderoso for winning the coveted ScraperWiki mug.

To catch up on the MozFest fun, here is the first draft of the Data Journalism Handbook. The festival premiered an amazing HTML5 documentary called The One Millionth Tower. You can catch up with all the rest, including teaching kids to code with Hackasaurus and hacking video with Popcorn.js (and an octocopter!), and loads more at the Mozilla Festival website.


One Response to Some ScraperWikiLovin’ at MozFest

  1. collabdocs says:

    Hi,
    Sorry not to get to the Scraper Wiki session at the Mozilla Festival. I’m working with Popcorn and interested in using data sources in documentary. Might this be something you’d be interested in having a chat about? Mandy
    http://collabdocs.wordpress.com/
    mandy.rose@uwe.ac.uk
