[FIX] SortedCollectionFix-sr v2

Stephan Rudlof squeak-dev at lists.squeakfoundation.org
Sat Sep 28 00:32:00 UTC 2002


This is a multi-part message in MIME format.
--------------080907000203090502030503
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

"
Change Set:  SortedCollectionFix-sr v2
Date:        28 September 2002
Author:      Stephan Rudlof

This is an evolutionary fix/extension of SortedCollection:
- >>copyFrom:to: should use the sortBlock of the receiver instead of the
default nil sortBlock;
- >>addLast: should be forbidden as >>addFirst is already;
- >>addLastWithoutSorting has been introduced as 'private' method for
constructing a copy by the newly introduced >>copyFrom:to:.
- clients fixed, which have send >>addLast to SortedCollections before:
  - IndexFile>>readFrom:messageFile: has to be tested (see method comment),
ideally by a Celeste maintainer,
  - Dictionary>>keysSortedSafely has been fixed (it could break the class
invariant of SortedCollection, if its internal implementation would change)
without a relevant performance loss;
- class>>withAll:sortBlock: is a new 'instance creation' method.

Rationale:
I'm expecting an OrderedCollection, if I'm sending >>addLast:. ANSI v1.9
supports this view: it says, that >>addLast: behaves only to the protocol
<OrderedCollection> and the protocol of <SortedCollection> does *not*
conform to this protocol.
>>addLast should always work for OrderedCollections, but it would only work
in *rare* cases for SortedCollections. So I'm against making it work for
these rare cases for SortedCollections. BTW: I'm not against introducing
something like >>addAllProbablyPresorted:, but I don't do it here.

Note: there are other changeset with SortedCollection fixes (dealing with
e.g. >>addLast:), but they haven't reached Squeak3.2 yet.

History:
v2
: - no client has to call >>addLastWithoutSorting anymore, it is just a
private method used by its own class now (in >>copyFrom:to:),
	  - proposal of Richard added, cs comment changed and extended
v1
: first version

There is an interesting proposal of Richard A. O'Keefe, which hasn't been
implemented yet (I have changed doublequotes to double doublequotes):

---------------------------------------------------------------
I think it's time to talk about lazy data structures.

We could add elements fast to SortedCollections like this.

1.  The internal representation is an array partitioned thus:
	+----+------------+--------+----+
	|1   |a          z|       x|   n|
	+----+------------+--------+----+

    elements 1..a-1 : all nil
    elements a..z   : sorted according to sortBlock
    elements z+1..x : added elements not sorted yet
    elements x+1..n : all nil

    Where I have written ""a"" and ""x"" you should understand the existing
    instance variables of OrderedCollection (inherited by SortedCollection)
    being used instead; the ""z"" instance variable is new.

2.  add: anItem
	^super addLast: anItem
    addFirst: anItem
	self shouldNotImplement
    addLast: anItem
	self shouldNotImplement
    addAll: aCollection
	^super addAllLast: aCollection
    addAllFirst: aCollection
	self shouldNotImplement
    addAllLast: aCollection
        self shouldNotImplement

    No, I'm not being inconsistent here.  I'm saying ""let someone build
    a sorted collection in linear time if they can provide the elements
    in sorted order.""  With this kind of implementation, #add: _is_
    adequate for the job.

3.  A strictly private method would exist:

    pvtEnsureAllSorted		   ""rather like current addAll:""
	x = z ifTrue: [^array].
	self pvtSortAddedElements. ""sorts z+1..x""
	self pvtMerge.             ""merges a..z with z+1..x""
	z := x.
	^array

    Ideally pvtSortAddedElements would use a method which is linear when
    the elements are sorted already; Dijkstra's smooth-sort is
    just one of many algorithms which are good at that, bottom-up
    natural merge is easier to understand and works very well in practice.

    Using bottom-up natural merge, if you add M elements to a Sorted
    Collection having N elements and then look at the result,
    the adds would cost O(M) time,
    pvtEnsureAllSorted would cost O(M) for pvtSortAddedElements
    and O(M+N) for pvtMerge, for a total O(N+M).
    Building an entire collection from N=0 would thus be O(M).

4.  Other methods would call pvtEnsureAllSorted, e.g.

    at: index
	self assert: [index between: 1 and: x-a+1].
	^self pvtEnsureAllSorted at: index-a+1

    do: aBlock
	self pvtEnsureAllSorted.
	^super do: aBlock

5.  Setting the sortblock would simply enlarge the unsorted area:

    sortBlock: aBlock
        sortBlock := aBlock ifNil: [nil] ifNotNil: [aBlock fixTemps].
        z := a-1.


This is the clean way to do it.  The class invariant is weakened from
""the elements are always sorted"" to ""the elements are always sorted
whenever anyone else looks; when noone else is looking they are in a
convenient configuration for sorting.""  From the outside, it looks as
though the elements are always sorted.
---------------------------------------------------------------

This may be an improvement to the current status.
"
-- 
Stephan Rudlof (sr at evolgo.de)
   "Genius doesn't work on an assembly line basis.
    You can't simply say, 'Today I will be brilliant.'"
    -- Kirk, "The Ultimate Computer", stardate 4731.3

--------------080907000203090502030503
Content-Type: application/x-gunzip;
 name="SortedCollectionFix-sr.2.cs.gz"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
 filename="SortedCollectionFix-sr.2.cs.gz"

H4sIAAAAAAAAAKVZ227bSBJ91gP/oax5kDyWGduJsxPGMeBcjPXuxAniLPJgKEFLbEmMKbaG
TcqSkY/fU9VNiro4E2CMyYBsdlfX9dRFncvcTOnmr1Kru6fhCZkRHR/Tf8p0SSdHRyd0m6pC
24LKWYyHiH579uL0eZ9MRid/0I2eFXo60Lnbqwo6iU6eR6cvSE07e0H7zURlY41tRdRq3Zi8
0PEbk6Z6WCQmu0wWhzan+Unwlim3Wq0tisFFWUxMzt9uCj0DNfpUxqkZBcHnSWIJ/2FJz01a
MkWVL2mULJ7oRaEziwWWZvPaKDik8/OhmS0vIXlUmIjsxJRpTKXVVEw0WZx4nZrhHR/nhVwP
dTIHT0lmC63iaj3WI1WmBWVJujr0UuirOP5T2aKmPdA0MvkgiWOdkbJux2WSQ7MsRJqD7HLt
6JcEopcFs59kY5rg0EDjcJIVuYnLoY6ZTmeWJ3Oor0NTjf0x30JDAz7zcigHFbGsNFgKy5m+
h2UbNNZVEYKDYZrorLCsSR336H6SDCe4fg7F6CxeMUiF2VIu8wgOdBQQHdJVFuvFZZLq83OW
Ty6ZamvVWPNqJEKBCpTDPgZuulbrSpKhmU7ByH6PklirNBUZFL3RKe+lqYIU+Kfznlz2Nhl6
Hzg/v9NL61i7USONk7X2RCjqJgXIi13A2J1oZpgqC4/K5ipPVFbsch1wMqKk4F2FzjOVUjKd
pZq5VPyd7oXmULx+n+6dBcFzDp7nTHSmc6hnqrKhptRY+1L0jYvPz3n3RZpGtSNF4hlsMeqw
48mhIfjlqyp7h0HwSRYU9BlcdaakFzPtLZ/RhzzW+ZYIvI1tyZtWrhrSxfXNFc2Pwxdky9kM
fMA6HGbzRN+Dm4KsWtoe1hDoDRcfaPYOC0yApmFO1uYsN4UZmpTOtlg4B2Px+ibo+mxT2ecU
GxD9PTPF7+zRrDdHHRxVJ8Ng5Y4+0lR6Dy5hivxOgmHrfkgwgFUgjjOXsC3bk4x+z1WucZ+y
uJuPb3l4iCXRoBorNgsc8Y4VKfT8nRAOWMKkfk7p9ecvkdCCkDW9KjiZqDVsZn5KkzvtlA4n
+ZibgRqky4+5tkI1cjJdQWdZp8D/mZ0J5IZ/XBvG7YLfiDky/Oid1OpC3HSLOQkUS12OPL5e
NulwHDYMv+8uBbml4APfDPccThBhq4Sy1AWY+HdiC5Mvo2B+0ooQrZnxOFNhwBAR/ij4qWw5
Bar0WCpY/3tpOao89FV4AfiOGSI4QM195gM6M/cI92wD5/Z7QYtBA340M1aJB34CzqkcDgSM
Bu4NbYVAXlex+K3kFmwI5scQZCQIjtTA2YaTEivZZSXBCCAV87/rmouQPnT+q/VIrzDWsgod
ylfAgmuhQepeOQiuWIlNOUj1XyVsK+pz72vL+1EQHP6zv+CKwy1DZBQdXJNMtUSgSu9IDRja
UvWwJBQGilzCKSExrP1Fe3iFLkk7QRAEjyYNcW4ObJw9DomgxxXG5noGqjXGOu2qPFdLmin4
B69qBpTSRkHrgPk+aApxsPZwELR+HBPRD0X138MP/7Dgh+zHr1AJeH8t2nEYqsNjioi9GOXA
+lcVhg94j8gFK6nh0OSCvtBGjffrZx4OQHPBFNkdV+uMFJ4M/GL9zILPZA0u5PMXcUrvP/d5
UsCBqd1W7bY4dLu9wNPSlHUZBPfOOd04kNaLRJxYaNV5SLIkXM2yS29BLEccbk0KF5GbBt8X
WgPNGpCo9XXVS7mw3X4AQ1s3seGRC+EhJ/AQaAViZlcoFoPWV6QrQFqdkPw6X1JVWavNVqcj
LyqQ8aqKs2r3GoW/28z5mtRKsCYr+OZpNb6vzlVcNQ///W27KfLfY2dlw7Xp1WnGqT3JuEiE
ZQWCOVGQKwvUkj+326kuJPsgtoDySRo7LirfG65sDXBFitAqdwCRjFxGGCJKgXtzFG/Oj7yX
ej+qCBn2nRAGJ8Z8l92RT6XGXq+uevSbWP1bYr95lQDpOAH4lEvfzQDu8RSiXDAeoSBEZt/I
Ei7ni1NHTjmzefEus0Au6Nd5agu5ASrIlWRKAadhmeesKm/0djtoLegVPUDcz3mJ/Hr7VSCp
H3ozgioTu+DofedFD0HUSlnlopupVJvf63zMRmj8tdtTXvT4IRl4de6Bole0wG3uXifKlS+U
d13uJecmR9XKkLyT2MqA9xPtPKppMCkaKuTyjQqq7e93ULFCWrBTg5LikHeAlByXBM2eAyOi
2EXZno4N4GAytf5Opjk2YEEVUk+ilDBFYaaH5UwoZAq5BOAvGmAOtbIJjAHAbOAT/+Oiy3IK
RhGngXsJu52Caw65+GFa/7Ps0jX9ddpSEDP6cap6vxIaFykPXEKkAW4AUiZ43dCQw0oEgjF3
TiTuGi2aw16tT1xQGWFooJ8P3ff7EjK9R9xwazP7+S7TumgAD9h1cF3ve+/k4zcFeZC2seH6
4P1+KAdec1T7PgFUEi5WG/UfjwWuXx15Jji7cp/GfECtz+CpHyQ6nCfVgnHm2ZakJ6Wjs4YC
gCXcGHrfR5Gmc6zdyiLuKO5RAEV0zBJFtDhUB8ccVV+rSNnUUk2Qd7o7YjT1yiXV1iPHwhqr
G5uDU8iFaYUUbdUoYCCjACefZUhaQl2pYr/kLWVWxQZCw0NKo4dTq9y+9oXD132DA14nyCO3
yNh9fjGFe/efUYl/1tOZ7Yc1FQl+FBxQaT0IcT2shjHR/7iSEGWbL6Y2m1vsv0d9rrlwYksH
7fZWyPtGykkHgAbJv9sVMIBonpQg6Dn+dWq1BIV9KeCCDFQvM+zgi1f1UqgheFWAzIRmQpoD
7vuSMaLV+SQ82bqWQDIGV/MiN0pRizwj/YHcBq8KuIMYT+jnHIf/uER2BphC5wgPJZU7sp7c
VzXDVfIAaBUlatz2Huy2txpYVEF0iVEXdVAdYkYCGTt8YDrDEsZkL56c/PFEpmxHx9Gzp5iu
bQ45glb7IrP3rPztdg6K5GFJ5dfVRAvwzWRCJJT2rc374iyjMgNkd6qBBHUvpNgG3Gs1fQNC
GK5FYRh2GoMmV5gzaM5NEgctEdvPKjgLKA4rfpDrwpC6alQIq7Ycj7lNcvsushinLH1SarDP
bH3dkiVotX7GWmEROpGlH64k4vveGl7CEy+iiFwUH0uABhb6/X2QW4UrXlq3Ecpu3krtG1cR
5378pCV7npnB957UF9m4R1k5PedGEa2IdnaAdG2m0+0uoJMb2eew7HZZL8i9rRYb/Ja3XZcy
9Fxtcwt9v6+uMhZ0Rsu+X7tUiCNZdPH96hUt3aNsaJ5CFZQVnhdQaL73q82b5DKFcu6sIimv
/X5/j8R769neuvNyeXDITYZ+zHsxIX4G760HgqRuxHq0NhlsvBB88ypDm4dJxAOuW4pdR1Xs
j+HHGeaHi8K7wcqZgaR+sLi3F7Ru6kmsBKh0JtvDxC73zS3fLeV6KjtVNaxglKGOmSFtJw9u
ENdbVVEx9+a5KS2mCwhgNC9JIcgwTwyP0X91zojcdOUOYrYkY1crjbfUbjuGeq4ARo4wGDkl
KceDyatQ5wrDXfAOg6VEwzfcjJYEZ/Mq1025bjHMdKP0aUZd1+QVE0z3V/BBzaBqDIQggkyu
fzqk5XBHc05TO756yxUJgBGtOF4bSPmNPqYADG4IG6s8K4DmVyvQ4JbcONvdwsWMJ5vHR0dH
Ls4aifs2mqKhn54ACPAwV2npdElnrwirqwUuTm4rJ1aoM2KOKDhFqn1ECao4sb7Re5Wkb18z
psRXju0/4XdroRDyAaeAb6shOouxdLHa2ooev/5YDFWfmYfIaVguWVcuV1KOzRnjo3Ag27Z1
KZ1Y98JaM0wcyMOc1WnRjT+/XwHGlurXk14c78h4px4znkanjBk7O+7GC8ITAHHOvyJwM8wv
T57QUzath0KxRPNELKmBh5acG9bHCLzaF/mFXi4q6AdNqHyclJS2rCShIgVsk29Ryj/VyVOn
E8ctPPnD4DvoPDpH+JU7eUi649Ln/tJTB971KJW35YW4KPEvaRiNupdmMSK/P23+lCY/Iviq
xDZmhAzrmJUmfiLSJF9T90BRcA1eNCQBWtQMnDVZWzXpzjDM0bvprJB2fYvMNz9Mwe8o7OoC
ETXdwybdAwz+dhBooIijVL2HGxI16DrvcT3QDtF2zsUj7+l1B7S/Ux7nuOKBm99+xSXqXxgf
zeenKz/cZLDhle1M69il3fVhvFR5G7FXn3sMPVwaXWN0x49k6yz/y7F8/Dw6+gMs1wlM7bad
778arpytxpI82UBqLocT58r8q0M94mo1RybNYa8ri1Ed5XIXe9gokQGov01K3q5ThjjeJrzt
01rB6o+95NUd8PgSyxhu5OwmoskG2rvi5kbzVtbfb5sJ+eLTXvB/WS5MuR8gAAA=
--------------080907000203090502030503--




More information about the Squeak-dev mailing list