Building 943, Eagle Room
Mountain View, California 94043

Human annotation is crucial for many machine learning tasks but can be
expensive and time-consuming. We explore the use of Amazon’s
Mechanical Turk web service, a significantly cheaper and faster method
for collecting annotations from a broad base of paid non-expert
contributors over the Web. We investigate five tasks in the field of
natural language processing: affect recognition, word similarity,
recognizing textual entailment, event temporal ordering, and word
sense disambiguation. For all five, we show high agreement between
Mechanical Turk non-expert annotations and existing gold standard
labels provided by expert labelers. For the task of affect
recognition, we also show that using non-expert labels for training
machine learning algorithms can be as effective as using gold standard
annotations from experts. We propose a technique for bias correction
that significantly improves annotation quality on two tasks. We
conclude that many large labeling tasks can be effectively designed
and carried out with this method at a fraction of the usual expense.
A summary of this work may
be found online at:


Rion Snow is a PhD Candidate in Computer Science at Stanford
University, advised by Professors Andrew Ng and Dan Jurafsky. Rion
works at the intersection of machine learning and natural language
processing, with a focus on computational semantics. He leads the
Stanford Wordnet Project, which aims at learning large-scale semantic
networks automatically from natural text. His work on automatically
inferring semantic taxonomies received the Best Paper Award at the
2006 conference of the Association for Computational Linguistics.

Added by marstein on November 6, 2008
