set and stream

Boris Gaertner Boris.Gaertner at gmx.net
Mon Aug 1 17:11:45 UTC 2005


 "Houssam Fakih" <fakih at ensm-douai.fr> wrote:


>Thank you for your response. What I need is a method that does the same
>computation as addAll: with better performance. By using streams I improve
>the performance by 70%. My question was just to know if the developers on
>the list use another pattern (other than using streams) to improve that.

addAll: is not that bad. It is certainly better than concatenation with the
#,  method. The problem with repeated concatenation is that 
increasingly large collections are copied again and again. This
is avoided both with #addAll: and with a stream (for collection that
a stream can write into).

Please compare these two examples:

   "  repeated concatenation "
 |sets union |
 sets := OrderedCollection new.
 1 to: 100 do: [:idx | sets add: (Set with: idx)].
 Time millisecondsToRun:
  [ 1000 timesRepeat: 
   [union := sets first.
     2 to: 100 do: [:idx | union := union, (sets at: idx)].
     1 to: 100 do: [:idx | union := union, (sets at: idx)].
  ].]. 


   " use of #addAll: "
 |sets union |
 sets := OrderedCollection new.
 1 to: 100 do: [:idx | sets add: (Set with: idx)].
 Time millisecondsToRun:
  [ 1000 timesRepeat: 
   [union := Set new:100.
     1 to: 100 do: [:idx | union addAll: (sets at: idx) ].
     1 to: 100 do: [:idx | union addAll: (sets at: idx)].
  ].].  



Additional notes:

1. When you can guess the size of the union set, you should use
that guess to create an empty set of a suitable capacity. With a set
of suitable capacity you avoid some grow operations.

2. Keep in mind that insertion into a set is costly when most
set elements a mapped onto only a few hash values. This is
sometimes a really serious problem.


Greetings
Boris








More information about the Squeak-dev mailing list