Twitter Scraper Python Library

I wanted to save the tweets from Transparency Camp. This prompted me to turn Anna‘s basic Twitter scraper into a library. Here’s how you use it.

Import it. (It only works on ScraperWiki, unfortunately.)

from scraperwiki import swimport
search = swimport('twitter_search').search 

Then search for terms.

search(['picnic #tcamp12', 'from:TCampDC', '@TCampDC', '#tcamp12', '#viphack']) 

A separate search will be run on each of these phrases. That’s it.

A more complete search

Searching for #tcamp12 and #viphack didn’t get me all of the tweets because I waited like a week to do this. In order to get a more complete list of the tweets, I looked at the tweets returned from that first search; I searched for tweets referencing the users who had tweeted those tweets.

from scraperwiki.sqlite import save, select
from time import sleep

# Search by user to get some more
users = [row['from_user'] + ' tcamp12' for row in \
select('distinct from_user from swdata where from_user where user > "%s"' \
% get_var('previous_from_user', ''))]

for user in users:
    search([user], num_pages = 2)
    save_var('previous_from_user', user)

By default, the search function retrieves 15 pages of results, which is the maximum. In order to save some time, I limited this second phase of searching to two pages, or 200 results; I doubted that there would be more than 200 relevant results mentioning a particular user.

The full script also counts how many tweets were made by each user.


Remember, this is a library, so you can easily reuse it in your own scripts, like Max Richman did.

This entry was posted in events, Scrapers. Bookmark the permalink.

3 Responses to Twitter Scraper Python Library

  1. Pingback: MediaShift Idea Lab . A Look Back at News Hack Day SF | PBS

  2. paulbradshaw says:

    Is there any way to grab tweets by a particular user?

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s