<div dir="auto">Hi Chris,<div dir="auto"><br></div><div dir="auto">I still don't see a compelling reason to change #new.</div><div dir="auto"><br></div><div dir="auto">In one paragraph you write about using new: "proactively" and in the next you write this is not about performance.</div><div dir="auto"><br></div><div dir="auto">My opinion on the API: if I start out with no specific knowledge about the expected size of my collection or I don't think that is relevant, I use new. That is 99% of the cases.</div><div dir="auto"><br></div><div dir="auto"><span style="font-family:sans-serif">I expect that the standard library gives me a default that has proven to work well in most cases. By "work well" I predominantly think of time-efficiency, not space-efficiency.</span></div><div dir="auto"><span style="font-family:sans-serif"><br></span></div><div dir="auto"><span style="font-family:sans-serif">We can look at other such libraries, OpenJDK also uses 10 for ArrayList: </span><font face="sans-serif"><a href="http://hg.openjdk.java.net/jdk/jdk/file/9211f6e20448/src/java.base/share/classes/java/util/ArrayList.java#l118">http://hg.openjdk.java.net/jdk/jdk/file/9211f6e20448/src/java.base/share/classes/java/util/ArrayList.java#l118</a></font></div><div dir="auto"><br></div><div dir="auto">I only use new: when I know that it gives me a better result, after I noticed that I need a better result. "Proactive" from the start implies premature optimisation for me.</div><div dir="auto"><br></div><div dir="auto">Also I don't expect that initial allocation with new: will always be faster. I expect that I can save reallocations later on. That is where the time is supposed to be saved. If new is faster than new: 1, I am fine with that. Allocating an Array of 1 or not allocating a Collection at all would be even better. I expect a benefit from new: mostly with large numbers.</div><div dir="auto"><br></div><div dir="auto">If I needed new: with a small size because I have space problems, then I wouldn't use hashed collections in the first place. Don't make new possibly worse because it happens to suit your case, which is possibly special.</div><div dir="auto"><br></div><div dir="auto">We have not seen<span style="font-family:sans-serif"> in this thread</span> any credible, real data on what "most cases" and "works well" really is, and I have none to give. There is probably some study or paper about it out there, I don't know. Not having the numbers seems to imply to you that the choice is arbitrary, or even random as you called it, but I beg to differ. If we make it worse in the average but unrevealed case, it is still a regression. A symptom would be if people after the change felt they have to use new: more often to achieve acceptable results. Chances are you won't even be able to measure such cognitive overhead for the users and that's really bad!</div><div dir="auto"><br></div><div dir="auto">Patching new for a specific application is obviously only an option if the application is deployed alone (Sqeuak only as runtime) as opposed to installing the application into other images. So for your GraphQL engine it is not an option, sure. But one can always add another constructor as an extension method if one has special requirements.</div><div dir="auto"><br></div><div dir="auto">Kind regards,</div><div dir="auto">Jakob</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">Chris Muller <<a href="mailto:ma.chris.m@gmail.com" rel="noreferrer noreferrer" target="_blank">ma.chris.m@gmail.com</a>> schrieb am Mi., 29. Jan. 2020, 22:59:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div dir="ltr">Hi Nicolas,</div><div dir="ltr"><br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>Thanks for the interesting discussion.  I'd like you to know, I'm not gung-ho about this change, but do think we should seriously consider it for the benefit of Squeak.  I think the benefit is real, but deceptive.</div><div><br></div><div>It seems there are two dimensions to the decision:</div><div><br></div><div>  - legacy / compatibility</div><div>  - API design / user-expectations</div><div><br></div><div>I do respect your point about legacy, that writing #new has always meant you get something that can hold up to 10 elements before needing to grow, instead of only 3.</div><div>It "<u>sounds"</u> reasonable, but...</div><div><br></div><div>Here are some <u>certainties</u>:</div><div><br></div><div>    - Allocating a 3 element array is quicker than allocating 10 element one.<br></div></div></div></blockquote><div>

<div><br></div><div>Hmm not sure about that. Not for single or few objects.<br></div><div>Allocating many larger short lived objects will increase the rate of scavenging statistically, but this will be measurable only is allocating massively this kind of objects IMO.<br></div><div><br></div>


</div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div><div>    - 3 element Array's take up less memory than 10 element ones</div><div>    - consuming more RAM can lead to slowdowns due to paging or GC overhead <br></div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>    - In the worst possible case (e.g., doing it over and over and nothing else), adding 9 elements to an (OrderedCollection new: 3) is 72% the speed of adding 9 elements to an (OrderedCollection new: 10). see [1]</div><div><br></div></div></div></blockquote><div><br></div><div>Typical Smalltalk objects are short lived.</div><div>That's why we get a generational garbage collection.</div><div>So typical usage is unlikely to generate paging.</div><div>Case of paging can only occur if many of these objects are longer lived (and tenured), which again is not typical.</div><div>For specific usage, there might be specific optimizations like growing a bit the Eden.<br></div><br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div></div><div>Here are some <u>uncertainties</u>:</div><div><br></div><div>     - there may be some code somewhere creating many thousands of OrderedCollections  (if it were only a few, it wouldn't be noticed)</div><div>     - the many thousands are all created in a very short amount of time (if it were spread out over time, it wouldn't be noticed)</div><div></div><div>     - it then stores 7-9 elements in most of the OrderedCollections</div><div>     - in spite of all of the above, the author still wrote #new instead of #new:</div><div><br></div></div></div></blockquote><div><br></div><div>All your analysis is exclusively focused on 

a specific un-typical usage of the library...</div><div>You are then trying to bend the general purpose library tothis specific case.</div><div>IMO this should be handled with a specific optimization.</div></div></div></blockquote><div><br></div><div>My proposal to reduce the the defaultSize was never motivated by performance.  All of the above is simply a _rebuttal_ to Jakob's and Levente's concerns about performance being adversely affected in some code somewhere.  You and I are in agreement about the above -- it shows those concerns are inflated.</div><div><br></div><div></div><div>No, MY motivation for this proposal has always been for a better API design that can capture _something_, instead of, as you said, "trying to bend the general purpose library" to something we think is "reasonable" with arbitrary numbers like 10 for OrderedCollection.  I contend there is little to no basis for that number, but there CAN be a basis for choosing a small number; space efficiency and API clarity.  You mentioned "typical usage of the library" above, which I think it's impossible to define, but would love to hear your thoughts if there's something more tangibly obtainable than space.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div><br></div><div>I did propose using something like a SmallDictionary that is WAY more optimized for small size than hashed collection. </div></div></div></blockquote><div><br></div><div>Which I ignored because it missed the point of the proposal.  See above.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

Collections-ul.871, just like the former version, creates dictionaries <br>

matching those expectations, but Dictionaries returned by the new version <br>

use less memory.<br>

So, what's the problem?<br></blockquote><div><br></div><div>It slowed down (Dictionary new: 1) relative to trunk, </div><div><b>by a comparable margin</b> as adding 9 elements to an (OrderedCollection new: 3) relative to an (OrderedCollection new: 10 </div><div>(see [1])</div><div></div></div></div></blockquote><div><br></div><div>IMO this is typical biased usage of percentages... </div></div></div></blockquote><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_quote"><div>Saving 30% of a short duration or 30% of a long duration is not at all the same thing!</div><div>The former case is premature optimization presumably unless used in tight loops.<br></div></div></div></blockquote><div><br></div><div>Right, again this was simply addressing Jakob and Levente's opposition to the proposal.  They had concerns about performance, these benchmarks were meant to allay them for the reason you stated.</div><div><br></div><div>As I said, all of <u>my</u> motivations are more outwardly focused; the API presented by Squeak, and user expectations thereof.  I agree with you that these concerns on performance are "premature", and shouldn't stop us from seriously considering this.</div><div><br></div><div>Best,</div><div>  Chris</div><div> </div><div><br></div></div></div>

<br>

</blockquote></div>