Re: [squeak-dev] Changing an object's class-balanced tre.

2 May 2011


      On Sun, 1 May 2011, Frank Shearar wrote:
...
On 2011/05/01 18:57, Levente Uzonyi wrote:
...
On Sun, 1 May 2011, Frank Shearar wrote:
...
Ah, that might be exactly the hammer I'm looking for. Thanks!
I don't want to add the balancing protocol to ZTree because then I'm
allowing the mutation of a ZTree, which means that clients of ZTree
can't assume that it's immutable.
Well. To be picky, ZTree _does_ have the balancing protocol, in the
sense that it responds to #balance. The implementation of #balance,
however, makes a MutableZTree copy, balances that, and returns an
immutable version of the newly balanced tree. ZTree just doesn't
permit the necessary mutating like setting a subtree and the like.
("Necessary" because the DSW algorithm makes extensive use of mutation.)
You shouldn't be that strict IMHO. You can mutate the state of the nodes
as long as it has no side effects. So if you copy all nodes, then you
can freely mutate the state of them before giving any of them to the user.
Well, in the context of a zipper zipping over a tree you do need to be that 
strict, surely? The copying that I'm doing here is "special case" - when 
simply zipping over a tree it's easy enough to not make copies of things. 
This particular experiment's a bit of a "big bang" mutation.
I'm not familiar with zippers, but the only case when you can save copying 
is when the subtree is already balanced and it fits the new balanced data 
structure without any modifications.
...
Certainly creating a whole new tree, mutating it and letting it "set"/turn 
immutable's a whole lot cheaper than folding the tree into a list and then 
unfolding it recursively. (*)
I guess with DSW you create two copies from each node, while with DSW2 you 
only create one and change the class of it. You may find other primitive 
based methods, like #copyFrom: useful if you're not using it already. 
Something like
ZTree >> #mutableCopy
^MutableZTree basicNew copyFrom: self
may be faster than directly changing the instance variables.
Btw, if timeToAdd1000 means what I think it does, then you have plenty of 
room for improvement. Adding 100000 integers to 
LLRBConcurrentDictionaryTree (which is a persistent, concurrent 
(lock-free), binary tree based dictionary) takes 250-350ms (depending 
on the order of them) using Cog.
Levente
...
...
If you want to create efficient immutable data structures in smalltalk
then you should use different techniques than with modern functional
programming languages. For example smalltalk VMs won't detect that you
created a copy of something that won't be used anymore, therefore you
should avoid creating such copies.
Indeed. And not mutating lets me avoid copying! But I'll freely admit that 
I'm new to all this, and I'm more than happy to hear what things work or 
don't work.
...
If you don't want to expose state mutating methods to the user, then you
can use separate tree and node objects, like I did with LLRBT, which is
a persistent balanced binary tree implementation. An old version is
available at http://leves.web.elte.hu/squeak/LLRBT-ul.7.mcz . It has
LLRBTree which is the class for the tree object and there are two
classes for nodes, LLRBTreeNode and LLRBExtremalTreeNode.
Users only have direct access to tree objects in this case.
I'll certainly take a look at LLRBT; thanks!
frank
(*) Here's the data. The times are in milliseconds. "recursive" flattens the 
tree into a list and then rebuilds the tree recursively, DSW mutates a copy, 
and DSW mutates a copy with the adoptInstance: optimisation.
#numNodes, timeToAdd1000, recursive, DSW, DSW2
1000, 1012, 84, 53, 37
2000, 1853, 155, 108, 66
3000, 2242, 253, 142, 108
4000, 2985, 401, 217, 136
5000, 3612, 495, 417, 234
6000, 4200, 728, 323, 220
7000, 4900, 760, 337, 357
8000, 5596, 893, 390, 301
9000, 6530, 1115, 656, 352
10000, 7038, 1298, 593, 344
11000, 7867, 1538, 594, 583
12000, 8707, 1666, 632, 414
13000, 9347, 2169, 704, 607
14000, 10251, 2031, 706, 651
15000, 11134, 2459, 968, 554
16000, 11636, 2826, 794, 623
17000, 12410, 3077, 970, 717
18000, 12876, 3497, 1138, 690
19000, 13740, 3865, 1222, 865
20000, 14078, 4168, 1207, 934
frank