Author Archives: Ian Hopkinson

About Ian Hopkinson

I've worked as a scientist for the last 20 years, at various universities, a large health and personal care company and now @ScraperWiki, a software startup. I am @SmallCasserole on Twitter.

It’s good to share…

As you may have gathered, I’m on a journey: I’ve worked as a physicist and a data scientist for 20 years, and now I’ve fallen amongst software engineers. There are obvious similarities in what we do: we write code to do …

Posted in developer

Book review: The Tableau 8.0 Training Manual – From clutter to clarity by Larry Keller

My unstoppable reading continues; this time I’ve polished off The Tableau 8.0 Training Manual: From Clutter to Clarity by Larry Keller. This post is part review of the book and part review of Tableau. Tableau is a data visualisation application …

Posted in Data Science

Making a ScraperWiki view with R

In a recent post I showed how to use the ScraperWiki Twitter Search Tool to capture tweets for analysis. I demonstrated this using a search on the #InspiringWomen hashtag, using Tableau to generate a visualisation. Here I’m going to show …

Posted in Data Science

#InspiringWomen – catching twitter with ScraperWiki

Those of you on Twitter may have caught the recent #InspiringWomen hashtag, a response to the online abuse and threats received by many women in the public eye. On Sunday 4th August people tweeted about women who …

Posted in Data Science

Adventures in Tableau – loading files

Tableau is a widely used visualisation tool, particularly in the business intelligence area. It grew out of the Polaris project at Stanford University, subtitled “interactive database visualisation”. This is worth bearing in mind since it is the context in which Tableau deals with …

Posted in Data Science

pdftables – a Python library for getting tables out of PDF files

One of the top searches bringing people to the ScraperWiki blog is “how do I scrape PDFs?” The answer is typically “with difficulty”, but things are getting better all the time. PDF is a page description format; it has no …

Posted in Scrapers | 3 Comments

Book Review: Clean Code by Robert C. Martin

Following my revelations regarding sharing code with other people, I thought I’d read more about the craft of writing code, in the form of Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin. Despite the appearance of …

Posted in developer