We’re sure that Greater Manchester Police had us in mind when they set about tweeting 24 hours of calls on the eve of our Hacks and Hackers Hack Day Manchester. (Photos courtesy of Michael Brunton-Spall).
It proved a fantastic data set to work with at our hack day on Friday 15th October at Vision+Media in Salford, and sparked four different ‘splinter’ groups. Michael Brunton-Spall, a developer from the Guardian Open Platform (one of the event sponsors), set about making the tweets usable and created a JSON GMP24 dataset [link].
Meanwhile, for the ‘Genetically Modified Policing’ project, Louise Bolotin from Inside the M60, Lee Swettenham from MEN Media, programmer David Kendal and Megan Knight from the University of Central Lancashire scraped the tweets and analysed peak tweeting times, the categories of calls and the number of followers of the feeds throughout the day.
Obviously, they would love to work with a dump of the police calls database, but in the meantime, this would do, said Megan, who presented the team’s work.
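For readers curious what a peak-time analysis of this kind might look like, here is a minimal sketch in Python. It is not the team's code: the `created_at` field name, the timestamp format and the sample records are all assumptions made up for illustration, and the real GMP24 dataset may be structured differently.

```python
from collections import Counter
from datetime import datetime

# Hypothetical timestamp format; adjust to whatever the real dataset uses.
TIMESTAMP_FORMAT = "%Y-%m-%dT%H:%M:%S"


def tweets_per_hour(tweets):
    """Count tweets by hour of day (0-23), using an assumed 'created_at' field."""
    counts = Counter()
    for tweet in tweets:
        posted = datetime.strptime(tweet["created_at"], TIMESTAMP_FORMAT)
        counts[posted.hour] += 1
    return counts


# Invented sample records, purely for illustration (not real police calls).
sample_tweets = [
    {"created_at": "2010-10-14T05:10:00", "text": "example call A"},
    {"created_at": "2010-10-14T22:15:00", "text": "example call B"},
    {"created_at": "2010-10-14T22:40:00", "text": "example call C"},
    {"created_at": "2010-10-15T01:05:00", "text": "example call D"},
]

hourly = tweets_per_hour(sample_tweets)
# most_common(1) returns a one-item list holding the busiest (hour, count) pair.
peak_hour, peak_count = hourly.most_common(1)[0]
print(f"Busiest hour: {peak_hour}:00 with {peak_count} tweets")
```

The same counting approach would work for call categories: swap the hour for whatever category label each tweet carries.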
David Kendal also produced his own project mapping 999 calls in the area. He took the tweet data and ran it through the Yahoo Placemaker tool to extract locations, then plotted them on a Google map to see which areas got calls over certain periods of time.
Yuwei Lin and Enrico Zini took the stage and First Prize for the final police project, a GMP tweet database, and showed a very neat search tool that allowed analysis of certain aspects of the police data (3257 items).
For example, we could look at the number of incidents that involved ‘sex’, or ‘youths and drinking’, whether the incidents involved males or females (“men are troublesome than women!”), and at a tag cloud for certain locations. We could see a list of keywords and place names. The tool used the JSON dataset created by Michael Brunton-Spall [dataset link] and added keyword sets to it. The source code has been released here, along with a handy explanation.
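The idea of adding keyword sets to a JSON dataset can be sketched roughly as follows. This is an illustration only, not the released source code: the `text` field, the keyword sets and the sample items are invented for the example.

```python
import json
import re
from collections import Counter

# Hypothetical keyword sets; the winning project's actual sets are not shown here.
KEYWORD_SETS = {
    "youths_and_drinking": {"youths", "drinking", "drunk"},
    "vehicles": {"car", "vehicle", "vehicles"},
}


def tag_item(text, keyword_sets):
    """Return the sorted names of every keyword set that shares a word with the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(name for name, keywords in keyword_sets.items() if words & keywords)


def count_tags(items, keyword_sets):
    """Count how many items fall under each keyword set."""
    counts = Counter()
    for item in items:
        counts.update(tag_item(item["text"], keyword_sets))
    return counts


# Invented sample data standing in for the real JSON dataset.
SAMPLE_JSON = """[
  {"text": "Report of youths drinking in park"},
  {"text": "Abandoned vehicle on main road"},
  {"text": "Noise complaint"}
]"""

items = json.loads(SAMPLE_JSON)
tag_counts = count_tags(items, KEYWORD_SETS)
for name, count in sorted(tag_counts.items()):
    print(name, count)
```

Matching on whole words rather than substrings avoids false hits (for example, ‘car’ inside ‘scared’), at the cost of having to list plurals explicitly.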
Second prize went to ‘Preston’s Summer of Spend’, built by UCLan student Daniel Bentley and Scraperwiki’s Julian Todd. They took spending data from Preston City Council, converting PDFs to machine-readable formats.
Once the data was in a CSV file, they were able to create interactives and identify interesting aspects of the data. It might be worth, for example, looking into why quite so much went to one individual who, Google told us, was a “legal representative of a controversial city development”. A further step might be to request the same information from other local councils and compare the spending levels.
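Once spending data is in CSV form, spotting unusually large recipients is a short script away. The sketch below is a guess at the general approach, not the team's code: the `supplier` and `amount` column names and the sample rows are invented, and the council's real files will have their own layout.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def total_by_supplier(csv_file):
    """Sum the assumed 'amount' column for each value of the assumed 'supplier' column."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for row in csv.DictReader(csv_file):
        totals[row["supplier"].strip()] += Decimal(row["amount"])
    return dict(totals)


# Invented rows for illustration; not real council spending.
SAMPLE_CSV = """supplier,amount
Example Supplies Ltd,1200.50
Example Legal Services,30000.00
Example Supplies Ltd,799.50
"""

totals = total_by_supplier(io.StringIO(SAMPLE_CSV))
# Largest recipients first, so outliers appear at the top of the list.
ranked = sorted(totals.items(), key=lambda pair: pair[1], reverse=True)
for supplier, amount in ranked:
    print(f"{supplier}: {amount}")
```

Using `Decimal` rather than floats keeps the pence exact when many payments are added together. To read a real file, pass `open(path, newline="")` in place of the `io.StringIO` object.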
Third prize and the Scraperwiki mug for best scraper went to the ‘Quarternote’ project built by developers Kane, Robin, Zen, Becky, Andrew and Andrew. This web application, which got many of the audience very interested, provided local music and band information for venue owners, promoters and event organisers.
By scraping MySpace, you could easily find bands gigging in your area for your event. Simply put, you could put together a gig list in three clicks. While something like LastFM would have been an easier hack, the team targeted MySpace as a source to which more local bands were contributing. (Photo from video by @josephstash)
Tom Mortimer-Jones of Scraperwiki, freelance writer Ruth Rosselson, InsidetheM60’s Nigel Barlow, Journal Local developer Philip John and freelance Mark Bentley decided to hack data showing ‘Manchester Rich and Poor’. They made a comparison by ward in Manchester, showing different factors, e.g. population density, unemployment rate, incapacity benefit and severe disablement allowance, and education.
Lastly, the Judgmental group of Francis, Chris and James (with some help from Kane and Philip) decided to do some work with legal data [disclaimer: I was also part of this one!]. Thanks to a friendly unknown donor, one of our team had been given a CD full of United Kingdom case judgment data. At the moment this is only available via BAILII, and the team wanted to make something more usable and searchable (BAILII’s data cannot be scraped or indexed by Google). So judgmental.org.uk was created.
It is still a work in progress, but could eventually provide a very useful tool for journalists. Although the data is not updated past a certain point, journalists would be able to analyse the information for different factors: which judges made which judgments? What is the level of activity in different courts? Which times of year are busier? It could be scrutinised to determine different aspects of the cases. [21/10/10: Updated screen grab of project, still in development]
Judge Andy Dickinson from the University of Central Lancashire has since blogged his thoughts about the day overall:
Given the increasing amount of raw data that organisations are pumping out, journalists will find themselves vital in making sure that they stay accountable. But I said in an earlier post that good journalists don’t need to know how to do everything, they just need to know who to ask.
With thanks to our judges (Andy, along with developer Tim Dobson and Julian Tait from Open Data Cities), our host Vision+Media and our lovely sponsors Inside the M60, Guardian Open Platform, the Digital Editors Network, Vision+Media (supported by the European Regional Development Fund and the Northwest Regional Development Agency), Journal Local and MEN Media.
A special thanks to Louise & Nigel at InsidetheM60 and Jacqui at Vision+Media for the organisational help.
Links to posts about Hacks and Hackers Hack Day Manchester:
Any more you have spotted? Any names I’ve missed off? Videos will be added to this post soon. If you have technical detail, or screen shots, or presentations to add please email judith [at] scraperwiki.com.
Our youngest hacker yet, with Aidan: