WordNet English lexicon access

Bob Ingria ingria at world.std.com
Sun Jul 25 21:19:28 UTC 1999


At 05:50 PM 7/18/99 -0700, you wrote:
>
>	WordNet is indeed neat.  Thank you for the pointer to it!  Here is
>a simple fileIn that lets you pull in information about English words from
>the WordNet site while inside Squeak.   It gives programmatic access to the
>WordNet lexicon at http://www.cogsci.princeton.edu/cgi-bin/webwn/

It's been a long time since I looked at WordNet seriously.  But the last time I did look, I noticed that it suffered from the 'mapping problem'.  This is a term coined by Roy Byrd, an NL/lexicon researcher at IBM/Watson, to describe a situation that arises when you try to merge entries from multiple lexical sources.  You will typically discover that, for all but the simplest head words, different dictionaries list different numbers of (sub)entries (holding part of speech constant, of course).  This raises the question of how these different senses map to one another.  Adding to the fun is the fact that the senses in different dictionaries often don't divide up the semantic space in quite the same way, so that, say, two senses in one will cover the same ground as three in another.
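To make the mapping problem concrete, here is a toy sketch in Python.  Everything in it is an assumption made for illustration: the two "dictionaries", their glosses, the word-overlap measure, and the 0.15 threshold are invented, and none of it is WordNet data or Byrd's method.  It only shows how two senses in one source can end up covering the same ground as three in another.

```python
# Toy illustration of the mapping problem. Both "dictionaries" and their
# glosses are invented for this example; they are not WordNet data.

def tokens(gloss):
    """Return the lowercased set of words in a gloss."""
    return set(gloss.lower().split())

def jaccard(a, b):
    """Word-set overlap between 0.0 and 1.0 (a crude stand-in for sense similarity)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def map_senses(senses_a, senses_b, threshold=0.15):
    """Map each sense index in A to every sense index in B it overlaps with."""
    mapping = {}
    for i, gloss_a in enumerate(senses_a):
        mapping[i] = [
            j for j, gloss_b in enumerate(senses_b)
            if jaccard(tokens(gloss_a), tokens(gloss_b)) >= threshold
        ]
    return mapping

# Hypothetical entries for the noun "bank" in two sources.
DICT_A = [
    "institution that accepts deposits and lends money",
    "sloping land beside a river or lake",
]
DICT_B = [
    "financial institution that lends money",
    "building where deposits of money are kept",
    "land along the edge of a river",
]

mapping = map_senses(DICT_A, DICT_B)
print(mapping)
```

The first sense in the invented source A lands on two senses in source B, so there is no one-to-one alignment to be had, and a merged entry has to decide whether to keep two senses, three, or something in between.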

WordNet was assembled from multiple lexical sources and, occasionally, it shows.  I've seen entries where a given head word has multiple senses, two of which are indistinguishable from each other.  Such examples seem to be the residue of a less-than-optimal solution to the mapping problem.  So be prepared for such entries.

Human users of WordNet, of course, can adapt to such idiosyncrasies.  They're harder for program clients to spot, though, which was the context in which I had been looking at WordNet.
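One way a program client might at least flag suspicious entries is sketched below in Python.  This is a rough heuristic under stated assumptions, not a solution: the function name near_duplicate_senses, the word-overlap measure, the 0.8 threshold, and the sample glosses are all hypothetical, and real duplicate senses need not share wording at all.

```python
from itertools import combinations

def gloss_words(gloss):
    """Return the lowercased set of words in a gloss."""
    return set(gloss.lower().split())

def similarity(gloss_a, gloss_b):
    """Word-set overlap between two glosses, from 0.0 to 1.0."""
    words_a, words_b = gloss_words(gloss_a), gloss_words(gloss_b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)

def near_duplicate_senses(glosses, threshold=0.8):
    """Return index pairs of senses whose glosses are nearly the same wording."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(glosses), 2)
        if similarity(a, b) >= threshold
    ]

# Invented glosses for a single head word; not taken from WordNet.
ENTRY = [
    "a state of extreme confusion and disorder",
    "a state of extreme disorder and confusion",
    "a formless void",
]

suspects = near_duplicate_senses(ENTRY)
print(suspects)
```

A client could treat the flagged pairs as one sense, or simply log them for a human to review.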

Caveat lector.


-30-
Bob Ingria
As always, at a slight angle to the universe




