Guest post by Dan Armstrong
My last three jobs have involved telling stories with survey data. Since every story requires a conflict, I look for conflicts. For instance, I’ll go to CFOs and ask them if their marketing people are doing a good job. Then I’ll ask the marketers if they think they’re doing a good job. Here’s a surprise: They don’t agree. And therein lies a story.
One way I’ve been able to uncover stories is to compare what people tell me with other data that they don’t tell me. That’s where Scraperwiki comes in. There’s a lot of company information available on sites ranging from EDGAR to Wikipedia to Yahoo Finance. It’s all public. You can buy it from people who aggregate and repackage it. Or you can scrape it. I choose the latter.
In my current job at ITSMA—the IT Services Marketing Association—we did a brand awareness survey of buyers of IT services. We asked them: “Which companies come to mind when you think of technology consulting and services?” The executives named 49 companies more than once. We tracked the percentage of mentions that each company received, which ranged from 52% (for IBM) to only one mention (114 companies). (Nobody mentioned Scraperwiki, but I’m confident that they will next year.)
What drives brand awareness? It’s partly how much the company spends to promote itself. But it’s mostly size, usually measured as number of employees, market capitalization or annual revenue. So I collected the ticker symbols of the companies mentioned by our survey respondents and wrote a scraper (with the help of the good people on the Scraperwiki Google Group) to collect data on company size. I chose to go to Yahoo Finance because the page layout there is easy to parse, and the URLs are based on ticker symbols. The most recent version of the scraper is at https://scraperwiki.com/scrapers/yahoo_finance_company_info/.
I found that the company size variable most closely correlated with brand awareness was the number of professionals available to take on large-scale projects. Clearly there’s a power law at work: a handful of companies with very high brand awareness and a “long tail” of firms that are barely on the radar.
As the chart (created using Tableau) shows, this pattern mirrors another dynamic: the fact that there are a few very big companies and many small ones. Brand perceptions reflect the underlying reality of company size.
My next project will be to scrape the ages of CEOs to update this age distribution from a few years back: http://www.analyzethis.net/2009/11/30/the-93-year-old-ceo/ (The oldest CEO, Walter Zable, died last year at the age of 97.) The age data is all there on Yahoo Finance. It’s going to take some regular expressions to get at it. But I’m confident there will be a story in the results.