string sharing (possible bug?)

glenn krasner at objectshare.com
Wed Dec 9 19:04:40 UTC 1998


The problem is independent of duplication. Say you have a method that says

foo
	| b |
	b := 'abc'.
	Transcript space; show: b.
	b at: 2 put: $d

What prints the second time you send #foo? Because the literal (copied or
not) is stored with the compiled method and not the activation, the second
time around b is 'adc' even though it's right after the assignment of 'abc'.
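
One way to sidestep this in your own code is to copy the literal before
modifying it. The following is only a minimal sketch, assuming a dialect
(such as Squeak) where sending #copy to a string literal answers a fresh,
mutable string; the selector #fooCopying is a made-up name for illustration.

```smalltalk
fooCopying
	"Hypothetical variant of #foo. The literal 'abc' still lives in the
	 compiled method, but each activation works on its own copy, so the
	 at:put: below never touches the stored literal."
	| b |
	b := 'abc' copy.
	Transcript space; show: b.
	b at: 2 put: $d
```

With this variant the Transcript should show abc on every send, because
only the per-activation copy is changed.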

Immutability, copy-on-write, and copy-literal-on-use are all typically
fairly expensive. Since these are rare occurrences, many Smalltalks
(including Squeak and VisualWorks) have been unwilling to pay the cost and
therefore live with these rare anomalies. For VisualWorks, we got rid of
the sharing but not the mutability.

VisualAge has immutability, good for them (the at:put: message above would
be an error). I don't know any Smalltalk that makes a fresh copy on use,
but there may be some.

We discussed this in the development of the ANSI Standard, of course, and
as you can see we provided you no help--we used the "undefined" wimp-out
twice, for both the behavior of state-modifying messages and for the
uniqueness of literals.

glenn

At 01:42 PM 12/9/98 -0500, Doug Way wrote:
>
>On Wed, 9 Dec 1998, Adam P. Jenkins wrote:
>>
>> Also, is there some deep Smalltalk principle here that is the explanation
>> for this policy of sharing literals in a method, or is it just to save
>> memory?
>
>I'd like to hear more about this as well.  It seems to me that having
>duplicate strings/constant arrays in a method would be a rare enough
>occurrence that you wouldn't really save that much memory anyway.  (Out of
>the blue, I'd guess 1-3% of them would be duplicates...?)  The only
>potentially common case I can think of is when you're initializing several
>strings (or arrays) to be empty, but "initializing" implies that you'd
>want to be able to modify them independently at some point in the future. 
>
>> Every other high-level language that I've used -- TCL, Java (is java
>> high-level?)
>
>How about medium-level? :-) 
>
>> Perl, Python, Basic, O'Caml, ML, Prolog, Lisp, Matlab --
>> all create a new object in response to a string or array literal. 
>> (Actually ML and Python use immutable strings, so this whole problem
>> isn't an issue for those two, at least for strings.) Note that these
>> languages are free to not actually create a new object each time; they
>> could use a copy-on-write scheme to avoid unnecessary copying yet still
>> give the semantics of unique objects.
>
>This sounds like a much better solution to me than making literals
>read-only.  I don't know all of the issues behind implementing a
>copy-on-write scheme in Smalltalk, though... would there be a big
>performance hit? 
> 
>- Doug Way
>  dway at mat.net
>  dway at transom.com
>  http://www.transom.com