Mark Chapman has made us two new Ruby tutorials.
Advanced Scraping: Pages Behind Forms shows you how to get data that is buried behind search boxes and drop-down query lists. It uses the Mechanize library, which provides a class that pretends to be a web browser, so it can work with cookies and has a familiar interface.
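To give a flavour of what this looks like, here is a minimal sketch of filling in a form with Mechanize. It is not taken from the tutorial itself: the helper names `new_browser` and `scrape_search_results`, the `table tr` selector, and the URL and field name in the usage comment are all hypothetical, and it assumes the search form is the first form on the page.

```ruby
# Assumes the mechanize gem is installed (gem install mechanize).

# Build a Mechanize agent; the gem is loaded on first use.
def new_browser
  require 'mechanize'
  Mechanize.new
end

# Fill in the first form on a page, submit it, and return the text of
# each matching row on the results page.
# fields maps form field names to the values to enter (hypothetical names).
def scrape_search_results(url, fields, agent: new_browser, row_selector: 'table tr')
  page = agent.get(url)                              # fetch the page that holds the form
  form = page.forms.first                            # assumption: the search form comes first
  fields.each { |name, value| form[name] = value }   # set search boxes and drop-downs by name
  results = agent.submit(form)                       # the agent sends any cookies it has stored
  results.search(row_selector).map { |row| row.text.strip }
end

# Example call (placeholder URL and field name):
#   scrape_search_results('http://example.com/search', 'q' => 'ruby')
```

Passing the agent in as an argument means the same browser object, and therefore the same cookies, can be reused across several requests.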
Advanced Scraping: PDFs shows you how to extract information from Adobe Portable Document Format (PDF) files. It uses the Ruby library PDF::Reader, which handles the text extraction phase; working out how to parse the extracted text is a later skill.
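As a rough illustration of the text extraction step, the sketch below pulls the text out of each page with PDF::Reader. It uses the library's page-based interface, which may differ from the approach the tutorial takes; the helper names `open_pdf` and `page_texts` and the file name in the usage comment are hypothetical.

```ruby
# Assumes the pdf-reader gem is installed (gem install pdf-reader).

# Open a PDF from a file path or an IO object; the gem is loaded on first use.
def open_pdf(source)
  require 'pdf-reader'
  PDF::Reader.new(source)
end

# Return the extracted text of every page, one string per page.
def page_texts(reader)
  reader.pages.map(&:text)
end

# Example call (placeholder file name):
#   page_texts(open_pdf('report.pdf')).each_with_index do |text, i|
#     puts "--- page #{i + 1} ---", text
#   end
```

What comes back is plain text with no table or column structure, which is why parsing it is treated as a separate skill.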
You can find all the Ruby tutorials (and links to Python and PHP ones) on one page.