A WordNet Browser in Squeak (Mach I alpha)

Andrew C. Greenberg werdna at gate.net
Tue Aug 17 13:14:48 UTC 1999


Attached is a rough, first cut changeset for primitives to read
Princeton's WordNet files, together with higher level classes and an
explorer-style browser for manipulating the same.  The present
version works only with Macintosh (and probably Unix) versions of the
Princeton databases.  Mach II will handle Wintel as well.  To install:

	(0) Install Squeak Version 2.5 (or earlier versions with
			Bob Arning's hierarchical browser code)

	(1) download the WordNet database.  A copy can be obtained from:

		http://www.cogsci.princeton.edu/~wn/obtain/

	(2) drag the folder "Database" into the directory with your
Squeak application

	(3) load the changeset.  If you have a problem loading WNet initially,
	be sure to execute:

		WNet release.  WNet initialize

	before proceeding.  To startUp the WordNet browser, execute:

		aSearchString exploreInWnet

  From inside the browser, you can drill through WordNet pointers;
selecting a list item displays a report of the object in the lower
pane.  You can spawn new explorers or generate workspace reports from
a menu item inside the browser.  The browser works best in Morphic
(very flaky in MVC, but hey, it's just a first cut).

MANY KUDOS to Bob Arning for his outstanding hierarchical browser
framework, which made it possible for me to build in no time and
with some ease a browser when I had ABSOLUTELY NO IDEA what I was
doing!  An image of the browser follows:



As noted, the present cut is pretty rough and poorly documented, but
I thought it was more important to get it "out there" than to hang
onto the code at this time.  Getting ready for trial in a pending
matter, I probably won't have time to finish that polishing for at
least a month, and I thought it might be of interest to some
squeakers.  Mach I is somewhat funky and inefficient (I'm going to
build in some database caches for the next version).  Very rough
documentation follows:

To get initial handles on objects, you can use the WNet class side message

	WNet reportAt: 'dog'

which will generate a workspace reporting on 'dog' for all database
parts of speech (this will be a list of WNWord instances), or you can
query individual parts of speech with

	WNet N reportAt: 'dog'

which will generate a workspace reporting on noun senses for 'dog'
(this will be a WNWord).  The first sense of the WNWord can be
drilled by executing the following doIt in the workspace generated by
the immediately preceding doIt:

	this reportAt: 1

which will generate a workspace with the WNSense for the first sense
of 'dog' ('this' will be a WNSense).  You can drill to get a list of
hypernyms with:

	this allHypernyms reportWorkspace

which will generate a workspace with a WNList of Hypernyms.  You get
the idea.  Sketchy documentation (from the comment for WNet) follows:


WordNet BASICS:

Installation
-------------

Copy the WordNet "Database" folder into the default directory (no
proxies), fileIn the sources, and execute the following doIt:

	WNet initialize

If everything is in place, WNet will be ready to go.  WNet
automatically reopens the files at startup.  If something gets munged
or a new update is installed, execute the following doIts:

	WNet close.
	WNet initialize


Searching and Drilling:
----------------------------
WNet at: 'dog'				"WNList of WNWords for 'dog'"
WNet reportAt: 'dog'			"Workspace with list of all parts of speech for 'dog'"

WNet N at: 'dog'			"Noun WNWord for 'dog'"
WNet N reportAt: 'dog'			"Workspace with list of all noun senses for 'dog'"

WNet N at: 'dog' senseAt: 1		"Noun Sense Number 1 for 'dog'"
WNet N reportAt: 'dog' senseAt: 1	"Workspace with Noun sense 1 for 'dog'"
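
These messages compose.  As a rough sketch, assuming the changeset is
loaded, the database files are open, and 'dog' has at least one noun
sense, the following doIt chains only messages listed in this comment
(the temporary name 'sense' is just for illustration):

```smalltalk
| sense |
sense := WNet N at: 'dog' senseAt: 1.	"noun sense number 1 for 'dog'"
sense allHypernyms reportWorkspace	"workspace with the hypernym closure of that sense"
```

Nothing here goes beyond the individual expressions documented above
and under "Analyzing"; it only shows them strung together in one doIt.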

Reporting:
------------
anyWNObject reportWorkspace		"Open a workspace describing this object, in which 'this' is a reference to anyWNObject"

Analyzing:
-------------
senseOrSynset pointers			"pointers from sense or synset"
pointer isHypernym			"true iff pointer is a hypernym pointer"
senseOrSynset hypernyms			"WNList of immediate hypernyms of the object"
senseOrSynset allHypernyms		"WNList of closure on hypernyms"
senseOrSynset closureOn: aBlock		"WNList of closure on relation defined by boolean expression aBlock"

	e.g.: 	sense closureOn: [:each | each isHypernym]
	N.B.:	Closure presently makes no effort to avoid recursions, so
		will not work with all pointer relationships.

senseOrSynset allHypernyms reportWorkspace	"Workspace with all hypernyms of sense"
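
The N.B. on closure can be illustrated without the WordNet classes.
The doIt below is only a sketch of one common way to make a closure
safe on relations that loop back on themselves: remember every node
already seen.  It is not the code in the changeset.  The Dictionary
'graph' and its symbols are made-up stand-ins for senses and their
pointers, and it uses ':=' for assignment, which Squeak accepts
alongside '_'.

```smalltalk
"Hypothetical stand-in for WordNet pointers: each key maps to the
 nodes it points at.  The chain a -> b -> c -> a is a deliberate cycle."
| graph closure visited pending node |
graph := Dictionary new.
graph at: #a put: #(#b).
graph at: #b put: #(#c).
graph at: #c put: #(#a #d).
graph at: #d put: #().

closure := OrderedCollection new.
visited := Set with: #a.		"the start node is never revisited"
pending := OrderedCollection with: #a.
[pending isEmpty] whileFalse: [
	node := pending removeFirst.
	(graph at: node) do: [:each |
		(visited includes: each) ifFalse: [
			visited add: each.
			closure add: each.
			pending addLast: each]]].
closure		"holds b, c and d, in that order"
```

Without the 'visited' check the loop would follow a -> b -> c -> a
forever, which is the failure the N.B. warns about.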

"Raw" access to database:
-----------------------------
WNIndexStream is a StandardFileStream that views a WordNet index file
as a stream of WNIndexStreamRecord objects.  Likewise with
WNDataStream, but for data files with WNDataStreamRecord objects.

WNIndexStream can be (why?) queried sequentially, using

	s positionAtFirstRecord.
	[s atEnd] whileFalse: [ . . . s next . . .].

or more usefully, queried by binary searching for a string, using
	s positionForWord: aString.
	record _ s next
or more directly:
	record _ s wordAt: aString
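
For readers unfamiliar with the idea, here is a minimal sketch of a
binary search, run over an in-memory sorted Array rather than a file.
It is an illustration only and is not how WNIndexStream is written:
the Array 'words' is made-up data standing in for the sorted index
records, and all entries are assumed to be lower case.

```smalltalk
"Hypothetical data: a sorted Array of lower-case words standing in
 for the records of an index file."
| words target low high mid probe found |
words := #('cat' 'dog' 'emu' 'fox' 'owl').
target := 'fox'.
low := 1.
high := words size.
found := 0.		"0 means not found (so far)"
[found = 0 and: [low <= high]] whileTrue: [
	mid := (low + high) // 2.
	probe := words at: mid.
	probe = target
		ifTrue: [found := mid]
		ifFalse: [
			probe < target
				ifTrue: [low := mid + 1]		"target lies in the upper half"
				ifFalse: [high := mid - 1]]].	"target lies in the lower half"
found		"4, the position of 'fox'"
```

Each probe halves the remaining range, which is why searching by
string is so much cheaper than reading the index sequentially.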

WNDataStream can be queried sequentially as above, but can also be queried:
	s position: aSynsetIndex.
	record _ s next

WNPOSDictionary provides higher-level access using the WN object
types, representing a part of speech (comprising the index and data
files), and can be queried using
	pos wordAt: aString "(or just use at:)"
	pos synsetAt: anInteger
	pos glossAt: anIndex

WNet provides access to all WNPOSDictionaries for the WordNet
database, and a WNList of query results for all parts of speech can
be obtained by
	WNet wordAt: aString
or searches for individual parts of speech can be made with
	WNet N wordAt: aString
	WNet nounAt: aString
Attachment converted: Anon:WordNetDemo.17Aug850am.cs (TEXT/R*ch) (0001238E)
Attachment converted: Anon:test5.gif (GIFf/ogle) (0001238F)




