[ENH] Fulltext: Skiplist refactor

Scott A Crosby crosby at qwes.math.cmu.edu
Mon Feb 18 07:17:21 UTC 2002


This is a ** severe ** refactoring of Skiplists. They are *required* for
Full Text indexing.

If you're using the old skip lists, this changeset will probably do bad
things. You have been warned.

I add comments and make the skiplists act dictionarylike, supporting
key-value and the 'next' operations. They can also act set-like with the
'add' operation. Thus, these skiplists are probably more reusable.

They may be faster than OrderedCollections, so try them out if you don't
like the performance of large OrderedCollections.

This skiplist implementation can handle random inserts into a collection
containing tens of thousands of elements at a rate of .5-2k/sec.

Someone who knows the squeak collections better should try to figure out
whether it should be resubclassed under OrderedCollection or something
else.


Scott



-------------- next part --------------
'From Squeak3.2alpha of 11 October 2001 [latest update: #4646] on 25 January 2002 at 12:57:47 pm'!
Collection subclass: #SkipList
	instanceVariableNames: 'sortBlock pointers numElements level splice '
	classVariableNames: 'Rand '
	poolDictionaries: ''
	category: 'Collections-SkipLists'!

!SkipList commentStamp: 'sac 1/23/2002 19:41' prior: 0!
A skiplist is a sorted data structure that allows one to search for any element in o(log n) time.

It also allows one to enumerate forward to the next element. Basically, its a tree-like algorithm, except it doesn't use trees.

The implementation here is similar to a Dictionary, in that it indexes (a subclass of) Associations. Thus, you can do    foo at: key put: value   You can also search for a key, if the key does not exist, it will report the first key greater than the search, or nil.
!

Association subclass: #SkipListNode
	instanceVariableNames: 'pointers object '
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Collections-SkipLists'!

!SkipListNode commentStamp: 'sac 1/23/2002 19:39' prior: 0!
This is a skiplist node. It can be treated as an association that allows a 'next' operator to go to the next node in the skiplist.

!


!SkipList methodsFor: 'accessing' stamp: 'sac 1/22/2002 19:11'!
at: element 
	^self at: element ifAbsent: []! !

!SkipList methodsFor: 'accessing' stamp: 'sac 1/22/2002 18:18'!
at: element ifAbsent: aBlock
	"Get the key if it exists, or if it doesn't exist, get the key just after it."
	| node forward |
	node _ self.
	level to: 1 by: -1 do: [:i |
			[forward _ node forward: i.
			self is: forward before: element] whileTrue: [node _ forward]].
	node _ node next.
	(self is: node theNodeFor: element) ifFalse: [^aBlock value].
	^node value
! !

!SkipList methodsFor: 'accessing' stamp: 'sac 1/22/2002 17:19'!
first
	^pointers first.
! !

!SkipList methodsFor: 'accessing' stamp: 'sac 1/22/2002 18:33'!
search: element 
	"Get the key if it exists, or if it doesn't exist, get the key just after it. If no key after it, return nil."
	| node forward |
	node _ self.
	level to: 1 by: -1 do: [:i |
			[forward _ node forward: i.
			self is: forward before: element] whileTrue: [node _ forward]].
	node _ node next.
	^node! !

!SkipList methodsFor: 'adding' stamp: 'sac 1/22/2002 18:22'!
add: element 
	"Add an association or key on to the skiplist"
	^self add: element ifPresent: [].
	! !

!SkipList methodsFor: 'adding' stamp: 'sac 1/25/2002 12:26'!
add: element ifPresent: aBlock
	^self at: element put: element ifPresent: aBlock.
! !

!SkipList methodsFor: 'adding' stamp: 'sac 1/22/2002 18:22'!
at: key put: value 
	"Add an association or key on to the skiplist"
	^self at: key put: value ifPresent: [].
	! !

!SkipList methodsFor: 'adding' stamp: 'sac 1/22/2002 18:23'!
at: key put: value ifPresent: aBlock
	| node lvl s |
	node _ self search: key updating: splice.
	node ifNotNil: [^ aBlock value].
	lvl _ self randomLevel.
	node _ SkipListNode key: key value: value level: lvl.
	level + 1 to: lvl do: [:i | splice at: i put: self].
	1 to: lvl do: [:i |
				s _ splice at: i.
				node atForward: i put: (s forward: i).
				s atForward: i put: node].
	numElements _ numElements + 1.
	splice atAllPut: nil.
	^ node
! !

!SkipList methodsFor: 'removing' stamp: 'sac 1/22/2002 16:58'!
remove: key ifAbsent: aBlock
	| node i s |
	"Remove and return th association containing key."
	node _ self search: key updating: splice.
	node ifNil: [^ aBlock value].
	i _ 1.
	[s _ splice at: i.
	i <= level and: [(s forward: i) == node]]
				whileTrue:
					[s atForward: i put: (node forward: i).
					i _ i + 1].
	numElements _ numElements - 1.
	splice atAllPut: nil.
	^ node.
! !

!SkipList methodsFor: 'testing' stamp: 'sac 1/22/2002 17:49'!
includes: key
	^ (self search: key updating: nil) notNil! !

!SkipList methodsFor: 'enumerating' stamp: 'sac 1/23/2002 20:24'!
associationsDo: aBlock
	self nodesDo: [:node | aBlock value: node]! !

!SkipList methodsFor: 'enumerating' stamp: 'sac 1/23/2002 20:21'!
do: aBlock
	self nodesDo: [:node | aBlock value: node value]! !

!SkipList methodsFor: 'enumerating' stamp: 'sac 1/25/2002 00:18'!
keysAndValuesDo: aBlock
	^self associationsDo:[:assoc|
		aBlock value: assoc key value: assoc value].! !

!SkipList methodsFor: 'enumerating' stamp: 'sac 1/23/2002 20:22'!
keysDo: aBlock
	self nodesDo: [:node | aBlock value: node key]! !

!SkipList methodsFor: 'private' stamp: 'sac 1/22/2002 17:50'!
is: node before: element 
	| key |
	node ifNil: [^ false].
	key _ node key.
	^ sortBlock
		ifNil: [key < element]
		ifNotNil: [(self is: key equalTo: element) ifTrue: [^ false].
			sortBlock value: key value: element]! !

!SkipList methodsFor: 'private' stamp: 'sac 1/22/2002 18:04'!
is: node theNodeFor: key 
	node ifNil: [^ false].
	node == self ifTrue: [^ false].
	
	^ self is: node key equalTo: key! !

!SkipList methodsFor: 'private' stamp: 'sac 1/22/2002 18:01'!
search: element updating: array
	| node forward |
	node _ self.
	level to: 1 by: -1 do: [:i |
			[forward _ node forward: i.
			self is: forward before: element] whileTrue: [node _ forward].
			"At this point: node < element <= forward"
			array ifNotNil: [array at: i put: node]].
	node _ node next.
	^ (self is: node theNodeFor: element) ifTrue: [node]! !


!SkipListNode methodsFor: 'accessing' stamp: 'sac 1/22/2002 17:22'!
printOn: aStream
	| first |
	aStream
		nextPut: $[.
	super printOn: aStream.
	aStream
		nextPutAll: ']-->('.
	first _ true.
	pointers do: [:node |
		first ifTrue: [first _ false] ifFalse: [aStream space].
		node ifNil: [aStream nextPutAll: '*'] 
                ifNotNil: [node printOn: aStream]].
	aStream nextPut: $)
! !


!SkipListNode class methodsFor: 'instance creation' stamp: 'sac 1/22/2002 17:53'!
key: key value: value level: maxLevel 
	^ (super key: key value: value) initialize: maxLevel! !

!SkipListNode class methodsFor: 'instance creation' stamp: 'sac 1/22/2002 17:57'!
on: element level: maxLevel 
	^ self key: element value: element level: maxLevel
! !

SkipListNode removeSelector: #object!
SkipListNode removeSelector: #object:!
SkipList removeSelector: #associationDo:!


More information about the Squeak-dev mailing list