On 2011/05/01 18:57, Levente Uzonyi wrote:
On Sun, 1 May 2011, Frank Shearar wrote:
Ah, that might be exactly the hammer I'm looking for. Thanks!
I don't want to add the balancing protocol to ZTree because then I'm allowing the mutation of a ZTree, which means that clients of ZTree can't assume that it's immutable.
Well. To be picky, ZTree _does_ have the balancing protocol, in the sense that it responds to #balance. The implementation of #balance, however, makes a MutableZTree copy, balances that, and returns an immutable version of the newly balanced tree. ZTree just doesn't permit the necessary mutating like setting a subtree and the like. ("Necessary" because the DSW algorithm makes extensive use of mutation.)
You shouldn't be that strict IMHO. You can mutate the state of the nodes as long as it has no side effects. So if you copy all nodes, then you can freely mutate the state of them before giving any of them to the user.
Well, in the context of a zipper zipping over a tree you do need to be that strict, surely? The copying that I'm doing here is "special case" - when simply zipping over a tree it's easy enough to not make copies of things. This particular experiment's a bit of a "big bang" mutation.
Certainly creating a whole new tree, mutating it and letting it "set"/turn immutable's a whole lot cheaper than folding the tree into a list and then unfolding it recursively. (*)
If you want to create efficient immutable data structures in smalltalk then you should use different techniques than with modern functional programming languages. For example smalltalk VMs won't detect that you created a copy of something that won't be used anymore, therefore you should avoid creating such copies.
Indeed. And not mutating lets me avoid copying! But I'll freely admit that I'm new to all this, and I'm more than happy to hear what things work or don't work.
If you don't want to expose state mutating methods to the user, then you can use separate tree and node objects, like I did with LLRBT, which is a persistent balanced binary tree implementation. An old version is available at http://leves.web.elte.hu/squeak/LLRBT-ul.7.mcz . It has LLRBTree which is the class for the tree object and there are two classes for nodes, LLRBTreeNode and LLRBExtremalTreeNode. Users only have direct access to tree objects in this case.
I'll certainly take a look at LLRBT; thanks!
frank
(*) Here's the data. The times are in milliseconds. "recursive" flattens the tree into a list and then rebuilds the tree recursively, DSW mutates a copy, and DSW mutates a copy with the adoptInstance: optimisation.
#numNodes, timeToAdd1000, recursive, DSW, DSW2 1000, 1012, 84, 53, 37 2000, 1853, 155, 108, 66 3000, 2242, 253, 142, 108 4000, 2985, 401, 217, 136 5000, 3612, 495, 417, 234 6000, 4200, 728, 323, 220 7000, 4900, 760, 337, 357 8000, 5596, 893, 390, 301 9000, 6530, 1115, 656, 352 10000, 7038, 1298, 593, 344 11000, 7867, 1538, 594, 583 12000, 8707, 1666, 632, 414 13000, 9347, 2169, 704, 607 14000, 10251, 2031, 706, 651 15000, 11134, 2459, 968, 554 16000, 11636, 2826, 794, 623 17000, 12410, 3077, 970, 717 18000, 12876, 3497, 1138, 690 19000, 13740, 3865, 1222, 865 20000, 14078, 4168, 1207, 934
frank