Please Note: This project is not currently active. The content on this site is provided for reference and is not actively maintained.

Cultivating the Crowd for Social Media Analysis

April 22, 2011

In a recent post on Crowdsourcing.org, Panos Ipeirotis writes that Amazon Mechanical Turk today is a “market for lemons,” referencing economist George Akerlof’s concept of quality uncertainty. For those who aren’t familiar with Mechanical Turk, it’s a distributed workforce platform that allows one to crowdsource small tasks. For a relatively low cost, those requesting work can have their tasks quickly accomplished by a large pool of anonymous workers.

This post resonates with us at Dialogue Earth, where we are leveraging a crowdsourced workforce to help us analyze social media dialogue. Our Pulse tool relies on crowdsourced workers to determine the sentiment of Twitter tweets on topics like the U.S. mood about weather.

Pulse, by Dialogue Earth

Ipeirotis explains that one key issue is that those requesting work from the crowd often pay everyone as if they are low-quality workers, assuming that extra quality assurance techniques will be required—such as having five workers perform each task. With no opportunity to earn a higher wage, workers have little incentive to do the work well, and the cycle of low expectations and low quality continues.

The ideas suggested in the piece for improving the quality of crowdsourced work include allowing workers to get endorsements, allowing feedback on the performance of workers, qualification tests, and publishing the reputation history of workers.

We at Dialogue Earth couldn’t agree more. It is imperative that “the crowd” accurately characterizes the sentiment of the tweets we present to them. That is why we have partnered with CrowdFlower to maximize quality across distributed workforce channels, including Mechanical Turk.

Currently, we require that each tweet is judged by five independent workers. We only add to our data set those tweets for which three or more coders make the same judgment (a confidence score of 60%). This quality control tactic is effective, but costly and time-consuming. As we strive to continually improve quality, we are focusing on two main quality control tactics, which are explained in detail in this post.
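As a rough sketch of how that agreement rule works in practice (the function and data names below are illustrative, not our actual Pulse code), the filtering amounts to a simple majority vote over each tweet’s five judgments:

from collections import Counter

def filter_by_agreement(judgments_per_tweet, min_agreement=3, judgments_required=5):
    """Keep only tweets where at least `min_agreement` of the workers'
    sentiment labels match (e.g. 3 of 5, a 60% confidence score)."""
    accepted = {}
    for tweet_id, labels in judgments_per_tweet.items():
        if len(labels) < judgments_required:
            continue  # wait until all five independent judgments are in
        label, count = Counter(labels).most_common(1)[0]
        if count >= min_agreement:
            accepted[tweet_id] = label  # majority sentiment becomes the record
    return accepted

# Example: three of five workers agree this tweet is "positive"
judgments = {"tweet_123": ["positive", "positive", "neutral", "positive", "negative"]}
print(filter_by_agreement(judgments))  # {'tweet_123': 'positive'}

The cost in this scheme is plain to see: every accepted label requires paying for five judgments, which is why we keep looking for ways to raise per-worker quality.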

First, we create and continually refine the instructions we provide workers at the beginning of each CrowdFlower job. These instructions include tips for handling tricky issues like sarcasm and context in the online chatter, as well as specific tweet examples for each potential area of confusion.

Second, we leverage CrowdFlower’s system of “gold” units, which are a small percentage of work units for which we already know the answer. Each time a worker gets a gold unit wrong, we are able to explain the correct answer to them. If a worker gets too many of these gold units incorrect, we are able to remove them from a job.
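To illustrate the bookkeeping behind gold units (this is not CrowdFlower’s API, just a hypothetical sketch with an assumed error threshold), the decision to keep or remove a worker can be thought of as tracking their accuracy on the known-answer units:

def evaluate_worker(worker_answers, gold_answers, max_error_rate=0.3):
    """Compare a worker's answers on gold units to the known labels and
    decide whether to keep them on the job. The threshold is illustrative."""
    gold_seen = [uid for uid in worker_answers if uid in gold_answers]
    if not gold_seen:
        return True  # no gold units answered yet; nothing to judge
    misses = sum(1 for uid in gold_seen if worker_answers[uid] != gold_answers[uid])
    error_rate = misses / len(gold_seen)
    return error_rate <= max_error_rate  # False -> remove worker from the job

gold = {"gold_7": "negative"}
answers = {"tweet_1": "positive", "gold_7": "positive"}
print(evaluate_worker(answers, gold))  # False: missed the only gold unit seen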

Ultimately, as worker quality improves, we and the distributed workforce both benefit—workers make money faster, and our tasks are completed more quickly and with better accuracy.
