Analytics and Twitter: intro

I'm a developer on a university project trying to figure out what the future of transportation looks like. To do this, we looked into trends in modern technology and culture around transportation, and after hours of deliberation we realized that our "findings" had essentially been pulled out of thin air from our own opinions. What we really wanted was data on who was thinking what about transportation. Unfortunately, we didn't have a wide enough reach to run surveys. However, having worked with the big data department before, I asked around and found out how I might mine social media data for certain search terms and see what turned up.

For the record, this blog is just a rough record of what I'm doing. I don't really care if it's readable at this point. Maybe I'll clean it up later for real use, but we'll just have to see how it goes. At the moment I'm flying by the seat of my pants, and I'd like to document this for anyone else who ends up following the same path.

At any rate, I began my journey by searching for Twitter analysis tools, and frankly, what I found was unusable for my needs. I wanted to see what people were saying around certain keywords over time, but none of the tools I found could do this. I also discovered, to my dismay, that Twitter does not keep its data publicly searchable for more than a few weeks. Tweets are removed from search results after a while, even though they may stay in the database. Companies profit from this by selling that data or analyzing it in real time. The best service I found for my needs was tweetarchivist.com, but even this was not quite what I wanted, and it required payment after a certain amount of time.

And so, as I often do in these situations, I turned to my old friend Python, the Swiss Army chainsaw of programming languages. I wanted to gather my own data, analyze it, and see what I could find. That's what I'm going to do in these posts.
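
To make the goal concrete, here is a rough sketch of "what people were saying around certain keywords over time", assuming the tweets have already been collected somehow and are sitting in memory as (timestamp, text) pairs. The function name keyword_counts_by_day and the sample tweets are made up for illustration; nothing here talks to Twitter, and it is not the approach the later posts necessarily take.

```python
from collections import Counter
from datetime import datetime


def keyword_counts_by_day(tweets, keyword):
    """Count how many tweets mention `keyword` on each calendar day.

    `tweets` is an iterable of (datetime, text) pairs. Matching is a
    plain case-insensitive substring check, so "bus" also matches "buses".
    """
    counts = Counter()
    needle = keyword.lower()
    for created_at, text in tweets:
        if needle in text.lower():
            counts[created_at.date().isoformat()] += 1
    # Sort by date so the result reads as a timeline.
    return dict(sorted(counts.items()))


# Hypothetical sample data, invented purely for this example.
sample_tweets = [
    (datetime(2014, 3, 1, 9, 30), "Stuck on the bus again"),
    (datetime(2014, 3, 1, 18, 5), "The BUS was actually on time today"),
    (datetime(2014, 3, 2, 8, 15), "Biking to class beats driving"),
    (datetime(2014, 3, 2, 12, 0), "Missed my bus, walking instead"),
]

bus_counts = keyword_counts_by_day(sample_tweets, "bus")
print(bus_counts)
```

A substring match is about the crudest filter possible, but it is enough to turn a pile of tweets into a per-day count that can be plotted or compared across keywords.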