Please Note: This project is not currently active. The content on this site is provided for reference and is not actively maintained.

Training the Cloud with the Crowd: Training A Google Prediction API Model Using CrowdFlower’s Workforce

by February 29, 2012

 


Can a machine be taught to determine the sentiment of a Twitter message about the weather?  Starting from more than 1 million crowdsourced human judgments, the goal was to use this data to train a predictive model and then let that machine learning system make the judgments.  Below are the highlights from the research and development of a machine learning model in the cloud that predicts the sentiment of text regarding the weather.  The major technologies used in this research were the Google Prediction API, CrowdFlower, Twitter, and Google Maps.

The only person who can really determine the true sentiment of a tweet is the person who wrote it.  When crowd workers judge tweet sentiment, all 5 humans make the same judgment only 44% of the time.  CrowdFlower’s crowdsourcing processes are great for managing the art and science of sentiment analysis.  You can scale up the number of crowd workers per record to increase accuracy, though of course at a scaled-up cost.

The results of this study show that when all 5 crowd workers agree on the sentiment of a tweet, the predictive model makes the same judgment 90% of the time.  Across all tweets, CrowdFlower and the predictive model return the same judgment 71% of the time.  CrowdFlower and Google Prediction supplement rather than substitute for each other.  As shown in this study, CrowdFlower can successfully be used to build a domain- or niche-specific data set to train a Google Prediction model.  I see real power in integrating machine learning into crowdsourcing systems like CrowdFlower.  CrowdFlower users could have the option of automatically training a predictive model as the crowd workers make their judgments.  CrowdFlower could continually monitor the model’s trending accuracy and then progressively include machine workers in the worker pool.  Once the model hit a target accuracy, the majority of the data stream could be routed to predictive judgments while a small percentage continued to go to the crowd to refresh current topics and continually validate accuracy.  MTurk HITs may cost only pennies, but Google Prediction ‘hits’ cost even less.
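The hybrid routing imagined above can be sketched in a few lines. This is a hypothetical illustration, not anything CrowdFlower or Google ships; the threshold, validation fraction, and function names are all assumptions.

```python
import random

def route_record(trailing_accuracy, threshold=0.85, crowd_fraction=0.1,
                 rng=random.random):
    """Decide whether the next record goes to the crowd or the model.

    Illustrative values: `threshold` is the accuracy the model must hit
    before machine judgments are trusted; `crowd_fraction` is the slice
    still sent to humans to refresh topics and validate accuracy.
    """
    if trailing_accuracy < threshold:
        return "crowd"      # model not yet trusted: humans judge everything
    if rng() < crowd_fraction:
        return "crowd"      # ongoing validation slice keeps the gold data fresh
    return "machine"        # cheap predictive judgment for the bulk of the stream
```

In a live system, `trailing_accuracy` would be recomputed continually from the records the crowd still judges, closing the monitoring loop described above.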

(more…)


Tracking the Mood About Gas Prices on Twitter: A Case Study

by January 25, 2012

As another test of our strategy for teasing out public opinion from social media, we explored measuring mood about gas prices on Twitter. This post summarizes the findings from this case study. Incidentally, we are set up to measure mood from Twitter on an ongoing basis, although we would need to find a partner to help defray the ongoing costs of crowdsourcing the sentiment judgments. (See this post to read more about our decision to examine the discussion about gas prices on Twitter.)

The sentiment we mapped was culled from tweets gathered from four weeks’ worth of data starting on May 22nd, 2011. This time period was chosen to coincide with Memorial Day, a holiday during which many Americans travel by car. Our team was curious to see whether there would be an uptick in either the volume of tweets about gas prices during this period or a noticeable change in sentiment about these prices. (more…)


Capturing Mood About Daily Weather From Twitter Posts

by September 29, 2011

After considerable preparation, we’ve just launched a version of our interactive tool, Pulse. Using Pulse, users can explore feelings about the weather as expressed on Twitter.

We began the process by choosing a topic that would yield a substantial volume of discussion on Twitter as well as be of general interest. Once we settled on weather, we wrote a survey designed to gauge Twitter users’ sentiments about the topic. With the help of workers from the “crowd” accessed through CrowdFlower, we had tens of thousands of relevant tweets coded as to the expressed emotion about the weather. These results were then used to create an “instance” of the Pulse tool, which manifests as a map of the United States that at a glance reveals Twitter users’ sentiments about the weather in their region on a given day. (You can read more about the coding process here and our choice of weather as a topic here.)

For our launch of Pulse for weather, we chose to feature tweets published over a month beginning in late April, 2011, a period in which many extreme weather events occurred—the devastating tornado in Joplin, MO; widespread drought throughout the South; and flooding of the Mississippi River, among others. The image below is from May 25, three days following the Joplin tornado (jump to the interactive map here).

[Image: may-25-pulse]

We gathered tweets from all 50 states as well as for about 50 metro areas. Here you can see a zoomed-in view of several states centered on Missouri.

[Image: zoom-may-25-pulse]

The interactive map tells part of the story, namely a state’s or city’s overall sentiment about the weather, while the content under the “Analysis” and “Events” tabs reveals some of the “why” behind this sentiment: what were some of the most notable weather events occurring on a given day? [Note: our "events" feature has a bug in it and is currently turned off. In the future, icons will show up on the map to highlight out-of-the-ordinary weather events, like outbreaks of tornadoes, persistent flooding or drought, etc.] To what extent did the weather deviate from normal conditions? Why were tweets from, say, the South, uniformly negative during a certain time? What was happening when we saw a single positive state amidst a region that was otherwise negative?

We hope that weather is just the beginning. We envision using the Pulse tool to visualize nationwide sentiments about more complex, nuanced topics in the future—a sample of emotions about gas prices is just around the corner, and you can see our preliminary work on opinions about global warming. For now, you can explore the Pulse tool here, and let us know what you think!


Sentiment Analysis Milestone: More Than One Million Human Judgments

by June 27, 2011

We have developed a process, dubbed Pulse, to extract nuanced sentiment from social media, like Twitter. We recognized early on that tools weren’t available to adequately answer specific questions, such as: “What’s the mood about today’s weather?” or “What portion of Twitter authors who discuss global warming believe that it is occurring?” or “Did Apple or Google have a more favorable buzz during this year’s South-by-Southwest Interactive?” Specifically, we concluded that it was necessary to get humans involved in the process—especially for Twitter posts, or tweets, which are often cryptic and have meaning that might be missed by a computer algorithm.

So, we turned to crowdsourcing.

However, successfully leveraging the power of the crowd for our sentiment analyses required cultivating the crowd, which we have achieved by working with partner CrowdFlower. In short, CrowdFlower offers an approach where we can access various work channels (we have relied mostly on Amazon’s Mechanical Turk), yet do so by layering on a quality control filter. Specifically, we intersperse within jobs what CrowdFlower terms “gold” units—in our case, tweets for which we already know the sentiment.  Workers build trustworthiness scores by getting the gold units correct. If they miss a gold unit, they get some feedback from us that has been tailored to that unit, such as “This person is happy that their garden is getting rain, so this should be marked as a positive emotion about the weather.”
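A minimal sketch of that gold-unit idea (this is not CrowdFlower’s actual scoring, just the gist): a worker’s trust is the fraction of interspersed gold tweets they judged correctly.

```python
def trust_score(judgments, gold):
    """judgments: {tweet_id: worker's label}; gold: {tweet_id: known label}.

    Returns the worker's accuracy on gold units, or None if the worker
    has not yet been shown any gold units.
    """
    seen_gold = [tid for tid in judgments if tid in gold]
    if not seen_gold:
        return None
    correct = sum(judgments[tid] == gold[tid] for tid in seen_gold)
    return correct / len(seen_gold)
```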

We have been running a lot of jobs through CrowdFlower, but only recently did I step back and add up the tweets processed. For more than 200,000 individual tweets, we have received more than 1,000,000 trusted human judgments from the CrowdFlower workforce! I know our research team, which had to do a bunch of judgments early on as we worked out a viable strategy, is grateful that we could get help from the crowd.


Teasing Out Opinions About Global Warming From Twitter

by June 24, 2011

A couple of months ago, we posted results from a quick sampling of mood about global warming in the Twittersphere that was featured in Momentum, the publication of the University of Minnesota’s Institute on the Environment. Along with our work on weather mood and mood about gas prices, we are on the verge of releasing a more in-depth analysis of sentiment about global warming. Here, we explain the method behind our sentiment analysis related to global warming, building off an earlier post that presented some of the details of our methodology on studying global warming chatter. (more…)


Don Shelby: Climate Experts to Ratchet Up Language

by June 2, 2011

In his most recent article for MinnPost, Don Shelby describes a meeting he had with three climate science thought leaders about communicating their scientific findings to the public more effectively.  He describes a current message shift that many of them will undertake when fielding questions from reporters about global warming.

The example Shelby gives is when a reporter asks a climate scientist whether current weather phenomena are due to global warming.  Rather than the typical “no single event can be attributed to global warming,” the response will shift to “no single event can be attributed to global warming, but we told you this was going to happen.”  The former statement has the potential to reinforce complacency among those who are skeptical of global warming or its predicted impacts.  Those supporting this new strategy, as Shelby explains, are hoping that adding this twist will reduce or eliminate some of this fodder for complacency.

(more…)


Relevancy and Context are “Critical” with Sentiment Analysis

by May 24, 2011

Whenever I come across a piece that highlights how tricky sentiment analysis truly is, I tend to be encouraged more often than dissuaded to keep trying to figure it out.

Sentiment analysis is tough—not tough as in strict, like a teacher is tough, or as in resilient, like a marathoner is tough.  More like hard, the way an AP calculus test is hard.  Not hard like a block of concrete is hard.  Hard, as in difficult.  Eh, never mind.

A colleague of mine just sent me a piece from the Miller-McCune site discussing a flawed mood study about September 11 pager text messages.

Researchers from Johannes Gutenberg University in Germany had concluded that there was an escalating level of “anger” words communicated to pagers as time passed on September 11 (here’s the study).  I’ve included the original data graph in this post. (more…)


Philips Looks To Brighten The Market For LEDs

by May 18, 2011


EnduraLED A21

Philips is looking to change the game for LED lights, which have traditionally offered long-term savings at a high initial cost (as much as $50 and up).  As far as brightness goes, however, LED bulbs have not yet been able to live up to their incandescent cousins, emitting at most light equivalent to that of a 60-watt incandescent.

Philips recently announced, as shared in this NYT post, that later this year it will market a new LED lamp, the EnduraLED A21, that will retail for about $40 and emit light equivalent to that of a 75-watt incandescent.

As new technologies such as these light bulbs reach the retail market, it is important for people to know the information surrounding them, such as the initial cost of a new kind of technology and the potential for savings in both money and energy use—just the kind of information that Dialogue Earth aims to deliver.
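As a back-of-the-envelope illustration of that cost-versus-savings information: the $40 price and 75-watt equivalence come from the announcement above, while the LED’s actual power draw, the daily usage, and the electricity price below are assumptions made for the sake of the sketch.

```python
def payback_years(bulb_cost, led_watts, replaced_watts,
                  hours_per_day=3.0, price_per_kwh=0.12):
    """Years of use before the energy savings repay the bulb's price."""
    kwh_saved_per_year = (replaced_watts - led_watts) * hours_per_day * 365 / 1000
    return bulb_cost / (kwh_saved_per_year * price_per_kwh)

# A $40 LED drawing an assumed 17 W in place of a 75 W incandescent,
# at an assumed $0.12/kWh and 3 hours/day, pays for itself in roughly
# five years.
years = payback_years(40, 17, 75)
```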

I’d also like to note that there are programs that offer incentives to subsidize the cost of changing over to more efficient lighting, such as the Commercial Lighting Program, offered by Xcel Energy through a joint effort with the Minnesota Center for Energy and Environment.

It is conceivable that one day our Pulse tool could be used to view public sentiment on important topics like this one: whether people prefer traditional incandescent light bulbs or like the idea of switching to LED lights, and why.


Hope for Human Sentiment Analysis Coding

by May 13, 2011

I just read an interesting blog post on Social Times discussing the advantages of machine-based sentiment analysis. In the piece, author Dr. Taras Zagibalov challenges the critics of “automatic” sentiment analysis, who claim that humans can better determine than computers the sentiment of social media text. He asserts that, with the proper tuning of a system’s classifier—creating specific classifiers for each domain (subject matter) and keeping them current—a machine-based sentiment analysis system can outperform human accuracy.

The discussion of human vs. machine sentiment is core to our work at Dialogue Earth, where we are developing Pulse—a social media analytics tool to help tease out nuances in the social media dialogue about key societal topics. (more…)


“Momentum” for Dialogue Earth

by May 11, 2011

We are thrilled that Dialogue Earth is featured in the most recent issue of Momentum magazine, an award-winning publication from the University of Minnesota’s Institute on the Environment.

While we work to optimize key aspects of our business—from the incentives we provide crowdsourced video creators, to the quality of the underlying data for Pulse, our social media analytics tool—we’re also rapidly ramping up our efforts to engage and broaden our base of supporters and collaborators.

Indeed, this Momentum feature piece comes at a great time for us.  There’s a ton going on.

Our Pulse tool is just about ready for prime time.  In a matter of weeks, we’ll have a version of Pulse that will provide daily information on the Twittersphere’s mood about the weather.  On the heels of that, we’ll be looking at Twitter chatter related to gas prices.

(more…)


Have We at Dialogue Earth Broken Free of Randy Olson’s “Nerd Loop”?

by May 9, 2011

Prior to reading Andy Revkin’s post Climate, Communication and the ‘Nerd Loop’ just now, I was unaware of Randy Olson’s newly coined term the “Nerd Loop.” It is a term that he recently gave to in-the-box strategies for communicating science to general audiences (read about it on his blog, The Benshi).

Olson argues passionately that there needs to be more risk taking in the science communication realm. I equate this to needing more out-of-the-box approaches, some of which will fail and some of which will help members of the public to understand a bit more about important issues like global warming, energy, food, water, land use, and so on. There won’t be a single approach that will work in all cases. Nor do I expect that there will be massive uptake of new information. It’ll be a slow, gradual process.

For me, I think the key for out-of-the-box approaches to work is that there needs to be an underlying genuine quality. Is there an effort to change people’s minds, or just to inform? If the goal is ultimately to change people’s minds, I deeply believe that even the most out-of-the-box efforts to raise literacy on a number of key issues connected to the environment will face barriers.

That’s why I’m committed to a non-advocacy approach with Dialogue Earth. We’re advocates for good information being present in societal dialogue and decision making. Period.

I believe that our strategy based in understanding the public dialogue, building credibility by drawing in a wide spectrum of experts, and ultimately delivering highly-engaging, crowd-based multimedia products holds lots of promise.

Ultimately, we can convince ourselves that we’ve stepped outside of the box, but our opinion amounts to very little. What do you think?


Oil Companies’ Profits to Increase Greatly This Year; People’s Energy-Related Questions to Follow Suit.

by May 5, 2011

The rapid increase in oil prices should equate to the oil industry having its best year since 2008, as reported by Chris Kahn for AP (via ABC). Exxon Mobil Corp., Chevron Corp. and ConocoPhillips are expected to report a combined $18.2 billion in first-quarter earnings — a 40% increase from last year and just shy of the $20.2 billion that they earned in the first three months of 2008.

An increase in consumption, the constriction of supply (e.g., Libya’s reserve access is currently limited), and also a weaker US dollar are all speculated to contribute to an increase in oil prices.

While some stand to benefit from the rise in oil prices (shareholders), businesses and consumers will feel the hurt as gasoline prices inflate. Increases in gas prices tend to have ripple effects, increasing the prices of transportation and any good or service that is reliant on transportation — bread, toiletries, DVD players, airplane tickets, etc.

The broad societal effect of an increase in oil prices is precisely what makes this issue of interest to Dialogue Earth.  It will undoubtedly amplify expressed sentiment related to energy across social media platforms such as Twitter. (more…)


Just Around the Corner: A Longer-Running Pilot On Weather Emotions

by April 27, 2011

This week the weather in the U.S. has been pretty unusual. We set a record for rainfall here in the Twin Cities, which is really a footnote to the week compared to the violent extreme weather in the Southeast and beyond. While understanding how people are feeling about the weather day-to-day won’t change the weather, we see it as a great starting point for developing our Pulse system for tracking public opinion on issues discussed in the social media.

As a follow-on to our first weather pilot, we are gearing up to monitor mood about the daily weather across the U.S. for weeks at a time. In fact, we are just completing a run of about 8000 Twitter tweets through our “crowd-based sentiment engine” using the CrowdFlower platform. Once we have double-checked the results, we are set up now to collect tweets continuously, automatically send them over to CrowdFlower for sentiment judgments, have the judgments returned to our database automatically, and then publish the data on our interactive Pulse display. We expect to be analyzing several thousand tweets through CrowdFlower on a daily basis in order to create a detailed map of weather mood for the U.S. (see more here about our data sampling strategy). Look for more on this in the coming days. The image below is a sneak peek at our interactive platform, which our team has overhauled in recent weeks. It should prove to be a much-improved user experience!
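The automated loop described above can be summarized in a short Python sketch; the four callables are placeholders standing in for our collection system, the CrowdFlower job interface, and the Pulse display, not real client libraries.

```python
def daily_pulse_run(collect_tweets, submit_job, fetch_judgments, publish):
    """One cycle of the collect -> judge -> publish pipeline."""
    tweets = collect_tweets()              # continuous tweet collection
    job_id = submit_job(tweets)            # batch sent to the crowd for judgments
    judgments = fetch_judgments(job_id)    # judgments returned to our database
    publish(judgments)                     # refresh the interactive Pulse display
    return len(judgments)
```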

Pulse social media analytics tool

Cultivating the Crowd for Social Media Analysis

by April 22, 2011

In a recent post on Crowdsourcing.org, Panos Ipeirotis writes that Amazon Mechanical Turk today is a “market for lemons,” referencing economist George Akerlof’s concept of quality uncertainty. For those who aren’t familiar with Mechanical Turk, it’s a distributed workforce platform that allows one to crowdsource small tasks. For a relatively low cost, those requesting work can get their tasks quickly accomplished by a large pool of anonymous workers.

This post resonates with us at Dialogue Earth, where we are leveraging a crowdsourced workforce to help us analyze social media dialogue. Our Pulse tool relies on crowdsourced workers to determine the sentiment of Twitter tweets on topics like the U.S. mood about weather.

Pulse, by Dialogue Earth

(more…)


Grabbing A Random Sample from the Twitter River

by April 13, 2011

As we prepare to report on weather mood and emotions about gas prices on an ongoing basis, we are faced with an issue of scale. For example, we have been collecting weather-related tweets continuously for the past two weeks, using the keyword list described here, and we now have about 600,000 that have sufficient location information to be of interest to us. There are undoubtedly a bunch of duplicates in this batch, but still, it represents a huge volume of tweets from which we need to extract a sense of the authors’ emotions, and it would simply be too costly to consider sending all of them to our distributed workforce via CrowdFlower. (more…)
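The post leaves the sampling mechanism to a later discussion; one standard way to draw a uniform random sample of fixed size from a stream far too large to judge in full is reservoir sampling, sketched here as an illustration.

```python
import random

def reservoir_sample(stream, k, rng=None):
    """Uniform random sample of k items from a stream of unknown length."""
    rng = rng or random.Random()
    sample = []
    for i, item in enumerate(stream):
        if i < k:
            sample.append(item)        # fill the reservoir first
        else:
            j = rng.randint(0, i)      # item i survives with probability k/(i+1)
            if j < k:
                sample[j] = item
    return sample
```

Each tweet is seen once and either kept or discarded, so the sample never exceeds `k` items no matter how large the stream grows.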


Reading Twitter Users’ Sentiments On Gas Prices

by April 5, 2011

A couple of months ago, our team decided to dive into the discussion of gas prices on Twitter. We figured it would be a fertile topic, and we weren’t disappointed. As the chart below illustrates, and as most Americans are well aware by now, gas prices saw a sharp increase in February 2011 after incremental increases over the past year. This in turn has prompted reactions in tweets ranging from resignation to ire to downright despair.

Similar to our exploration of weather tweets—see our discussion here—the topic of gas prices enjoys a kind of universality. Gas price fluctuations impact most of us weekly, even daily—and the reactions on Twitter bear this out. And as with our project on weather, emotional reactions to the topic have been the focus of this exploration. (more…)


A Quick Video Tour of Pulse, How We Extract Sentiment from Social Media

by March 31, 2011

Earlier today I gave a talk to the Social Computing Group at the University of Minnesota. The talk featured our approach for teasing out sentiment and public opinion from social media, with a focus on data from our recent weather mood and global warming pilots. As a bit of an experiment, I ran through the talk again this evening and recorded a video. We would love to get any and all feedback!


Global Warming Chatter: A Hot Topic on Twitter?

by March 28, 2011

Some months ago, our research team developed a strategy for inferring opinions about global warming from Twitter for our Pulse platform. We were lucky to be asked last week if we could present such data for the next issue of Momentum, the award-winning publication of the University of Minnesota’s Institute on the Environment. Of course, like all of us on a deadline, they needed it “yesterday.”

Not to be deterred, we rapidly spun up our collection system to grab those Twitter tweets that included the keywords global warming, climate change, and #climate. For a six-day period ending on 23 March, we collected about 7600 tweets that had some geo-location information associated with them. Based on our recent experience focused on weather mood (described in this post), and because we had already generated a good number of quality control units (as described here), we posted a major job on the CrowdFlower platform within a day of the request from the Momentum team. Here’s a snapshot of the results:

(more…)
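For the record, the keyword rule described above amounts to a simple case-insensitive match; the function below is an illustrative sketch, not our actual collection code (which pulled from Twitter’s APIs).

```python
KEYWORDS = ("global warming", "climate change", "#climate")

def matches(tweet_text):
    """True if the tweet mentions any of the three tracked keywords."""
    text = tweet_text.lower()
    return any(keyword in text for keyword in KEYWORDS)
```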


A Sentimental Look at SXSW

by March 21, 2011

 

South by Southwest (SXSW) is the annual opportunity for startups across art and technology to prove themselves or, more often than not, generate buzz in the attempt. Whether you’re checking out the latest surf rock 3-piece or organizing drinks via group text, SXSW generates chatter – and a lot of it.

We couldn’t resist the lure of participation, albeit from across the country, so we decided to turn our developing Pulse technology, for analyzing social sentiment, on the interactive portion of the event. We focused on something a little different, though…

(more…)


Digging into the South-by-Southwest (SXSW) Twitter Traffic on Apple & Google

by March 16, 2011

Given that we couldn’t be at South by Southwest (SXSW) this year, we thought it would be interesting to apply our developing Pulse technology to the Twitter chatter connected with the event. Pulse represents our approach to sifting out interesting information from social media dialogue. Our first major application has been in the area of weather mood, a pilot study of which is chronicled here.

The quick overview is that we leverage the power of the crowd using CrowdFlower’s platform to extract a high-quality, nuanced understanding of sentiment from Twitter tweets. Prior to going to the crowd, we develop a strategy to create a survey that we can give to crowd-based workers so that they can make reliable judgments about author sentiment. We then collect a bunch of relevant tweets, do some pre-processing to limit the size of the sentiment coding job sent to the crowd, do some preliminary rounds of coding to ensure quality control, and then run a coding job on a large number of tweets. (more…)
