Do Intelligent Robots Need Emotion?

What's your opinion?

Word Sense Disambiguation

In natural language processing, word sense disambiguation (WSD) is the problem of determining which "sense" (meaning) of a word is activated by the use of the word in a particular context, a process which appears to be largely unconscious in people. WSD is a natural classification problem: Given a word and its possible senses, as defined by a dictionary, classify an occurrence of the word in context into one or more of its sense classes. The features of the context (such as neighboring words) provide the evidence for classification.
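To make the classification framing concrete, here is a minimal sketch of a per-word classifier that uses neighboring words as features. It assumes a Naive Bayes model with add-one smoothing, which is only one of many possible classifiers. The sense labels, the training sentences, and the names `context_features` and `NaiveBayesWSD` are all invented for illustration and do not come from the article.

```python
import math
from collections import Counter, defaultdict

def context_features(tokens, target, window=4):
    """Return the words within `window` positions of the first `target` token."""
    i = tokens.index(target)
    return [t for t in tokens[max(0, i - window): i + window + 1] if t != target]

class NaiveBayesWSD:
    """Toy Naive Bayes sense classifier for a single ambiguous word."""

    def fit(self, examples):
        # examples: iterable of (feature_list, sense_label) pairs
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, sense in examples:
            self.sense_counts[sense] += 1
            for w in feats:
                self.word_counts[sense][w] += 1
                self.vocab.add(w)
        return self

    def predict(self, feats):
        total = sum(self.sense_counts.values())
        best, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            # log prior plus add-one smoothed log likelihood of each context word
            score = math.log(n / total)
            denom = sum(self.word_counts[sense].values()) + len(self.vocab)
            for w in feats:
                score += math.log((self.word_counts[sense][w] + 1) / denom)
            if score > best_score:
                best, best_score = sense, score
        return best

# Invented toy training data: (sentence, hypothetical sense label)
train = [
    ("she signed the letter with a pen and ink", "writing_implement"),
    ("the pen ran out of ink on the paper", "writing_implement"),
    ("the farmer drove the sheep into the pen", "enclosure"),
    ("the pigs escaped from the pen on the farm", "enclosure"),
]
clf = NaiveBayesWSD().fit(
    [(context_features(s.split(), "pen"), sense) for s, sense in train]
)

test_tokens = "the sheep stayed in the pen all night".split()
prediction = clf.predict(context_features(test_tokens, "pen"))
print(prediction)
```

A real system would train on a manually sense-annotated corpus and would typically use richer features than a bag of nearby words.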

A famous example is to determine the sense of pen in the following passage (Bar-Hillel 1960):

Little John was looking for his toy box. Finally he found it. The box was in the pen. John was very happy.

WordNet lists five senses for the word pen:

pen: a writing implement with a point from which ink flows.
pen: an enclosure for confining livestock.
playpen, pen: a portable enclosure in which babies may be left to play.
penitentiary, pen: a correctional institution for those convicted of major crimes.
pen: female swan.

Research has progressed steadily to the point where WSD systems achieve consistent levels of accuracy on a variety of word types and ambiguities. A rich variety of techniques have been researched, from dictionary-based methods that use the knowledge encoded in lexical resources, to supervised machine learning methods in which a classifier is trained for each distinct word on a corpus of manually sense-annotated examples, to completely unsupervised methods that cluster occurrences of words, thereby inducing word senses. Among these, supervised learning approaches have been the most successful algorithms to date.
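As a rough illustration of the dictionary-based family, the sketch below scores each sense by how many content words its gloss shares with the context, in the spirit of a simplified Lesk algorithm. The glosses are the five quoted above. The sense labels, the small stopword list, and the function names are assumptions made for this example, not part of WordNet or the article.

```python
import re

# Glosses copied from the five senses listed above; the keys are hypothetical labels.
PEN_GLOSSES = {
    "writing_implement": "a writing implement with a point from which ink flows",
    "livestock_enclosure": "an enclosure for confining livestock",
    "playpen": "a portable enclosure in which babies may be left to play",
    "penitentiary": "a correctional institution for those convicted of major crimes",
    "female_swan": "female swan",
}

# A tiny ad hoc stopword list, chosen only for this example.
STOPWORDS = {
    "a", "an", "the", "with", "from", "which", "for", "in", "be", "to",
    "those", "of", "may", "was", "his", "he", "it", "very",
}

def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(context, glosses):
    """Return the sense whose gloss overlaps most with the context, or None if no overlap."""
    ctx = content_words(context)
    overlaps = {s: len(ctx & content_words(g)) for s, g in glosses.items()}
    best = max(overlaps, key=overlaps.get)
    return best if overlaps[best] > 0 else None

print(simplified_lesk("The farmer herded the livestock into the pen", PEN_GLOSSES))
print(simplified_lesk("The babies play safely in the pen", PEN_GLOSSES))

bar_hillel = ("Little John was looking for his toy box. Finally he found it. "
              "The box was in the pen. John was very happy.")
print(simplified_lesk(bar_hillel, PEN_GLOSSES))
```

With these glosses and this stopword list, the Bar-Hillel passage shares no content words with any sense, so the sketch returns None for it. Plain gloss overlap gives no signal for that passage.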

Current accuracy is difficult to state without a host of caveats. On English, accuracy at the coarse-grained (homograph) level is routinely above 90%, with some methods on particular homographs achieving over 96%. On finer-grained sense distinctions, top accuracies from 59.1% to 69.0% have been reported in recent evaluation exercises (SemEval-2007, Senseval-2), where the baseline accuracy of the simplest possible algorithm of always choosing the most frequent sense was 51.4% and 57%, respectively.
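The most-frequent-sense baseline mentioned above can be sketched in a few lines: pick the sense seen most often in training and predict it for every test occurrence. The label counts below are invented toy data for a single word and are unrelated to the SemEval-2007 or Senseval-2 figures. The function names are likewise hypothetical.

```python
from collections import Counter

def most_frequent_sense(train_labels):
    """Return the sense label that occurs most often in the training data."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predictions, gold):
    """Fraction of predictions that match the gold labels."""
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)

# Invented toy annotations for one ambiguous word
train_labels = ["writing_implement"] * 6 + ["enclosure"] * 3 + ["penitentiary"]
gold = ["writing_implement", "enclosure", "writing_implement",
        "writing_implement", "penitentiary"]

mfs = most_frequent_sense(train_labels)
baseline_accuracy = accuracy([mfs] * len(gold), gold)
print(mfs, baseline_accuracy)
```

In an actual evaluation the most frequent sense is determined separately for each word, and a system's accuracy is reported against that baseline on the same test set.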

http://www.scholarpedia.org/article/Word_sense_disambiguation