[ENH] CharacterTimes-rsb

Brian Rice water at tunes.org
Wed Jun 23 04:55:55 UTC 2004


Ugh. This is far too long to deal with. I am simply going to omit a 
lot; where I do, please presume that I either agree or am too annoyed 
by the rhetorical volume to respond.

On Jun 22, 2004, at 8:25 PM, Richard A. O'Keefe wrote:

> I made the pleasant discovery that manipulating strings as Prolog lists
> of characters using definite clause grammars (a rough equivalent to
> using streams, in this context) was considerably more efficient than
> using Interlisp-D's native strings (about 5 times faster, in fact, on
> the benchmarks I was using) as well as being considerably easier to do.
>
> Java, of course, has Strings.  But for any serious String construction,
> Sun warn you over and over and over NOT to use String concatenation but
> to use StringBuffers.  And what is a Java StringBuffer but
> (WriteStream on: (String new: someDefault))?

('' writer) in Slate. Or (foo writer) for any foo collection, including 
trees, tries, bags, whatever.
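
To make the StringBuffer-as-WriteStream point concrete, here is a rough 
sketch in Python rather than Smalltalk or Slate. io.StringIO is standing 
in for the write stream, and the two helper names are made up for the 
illustration; nothing here is Slate's actual writer protocol.

```python
import io

def build_with_stream(parts):
    """Push each piece into a write stream and read the result back once."""
    stream = io.StringIO()
    for part in parts:
        stream.write(part)
    return stream.getvalue()

def build_with_concat(parts):
    """Build the same string by repeated concatenation, creating a new string each step."""
    result = ""
    for part in parts:
        result = result + part
    return result

pieces = ["<", "item", ">"] * 3
# Both approaches produce the same text; only the construction strategy differs.
assert build_with_stream(pieces) == build_with_concat(pieces)
```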

> 	Perhaps Squeak is interested in this for its own sake, but I see
> 	no reason to continue in this path.  #space: is nondescript,
>
> #space: is to #space as #tab: is to #tab and #crtab: is to #crtab.
> The name could be better; maybe #nextPutSpaces: would have been clearer.
> 	presumes that streams have knowledge of characters,
> which they do, fairly pervasively.
> 	and presumes that they care about printing of any kind.
> which they do, fairly pervasively.

Shouldn't be there. Shouldn't be there. Shouldn't be there. The entire 
idiom needs to die, or just not make it into the system I work on. To 
put the reason colloquially, nouns should not be hardcoded in any 
verbs.

> I don't see any problem with there being *some* kinds of Stream that
> know about characters and are "for" printing, so I presume that the
> complaint is that *every* kind of Stream (even, say, a WriteStream on
> a FloatArray) is burdened with these things.

PrintStream; or Printer; or Formatter; there are many suitable 
factorings.

> I would love to see a redesign of the Stream classes, probably based
> on Traits, with a proper factoring between reading, writing,
> positioning, and text.

That's what I'm doing in Slate.

> 	What Slate does for timesRepeated: is totally element-type-compatible
> 	and abstract, creating a Repetition sequence object, with slots for
> 	element and size.
>
> "Preview" can't find "Repetition" in progman.pdf.  Is there a more
> recent version of the Slate programmer's manual that describes this?
> (And is it available in A4 format, pretty please?)

It's not in the manual because it's half a page of code at the end of 
src/sequence.slate in the Slate distribution. A manual entry would need 
only the class comment. I may change that, but I'm more interested in 
documenting the more important things.
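
For readers without the Slate sources to hand, the idea of a repetition 
object with slots for element and size can be sketched roughly as 
follows. This is Python, not the code in src/sequence.slate, and the 
method set is only my assumption of what a minimal read-only version 
would need.

```python
from collections.abc import Sequence

class Repetition(Sequence):
    """Read-only sequence that behaves like `size` copies of `element` without storing them."""

    def __init__(self, element, size):
        if size < 0:
            raise ValueError("size must be non-negative")
        self.element = element
        self.size = size

    def __len__(self):
        return self.size

    def __getitem__(self, index):
        if isinstance(index, slice):
            # A slice of a repetition is just a shorter repetition.
            return Repetition(self.element, len(range(*index.indices(self.size))))
        if not -self.size <= index < self.size:
            raise IndexError("Repetition index out of range")
        return self.element

# Works for any element type, and can be materialized when a real collection is needed.
spaces = "".join(Repetition(" ", 4))
nils = list(Repetition(None, 3))
```

The object is element-type-agnostic: the same class serves characters, 
nil-like values, or numbers, and storage stays constant regardless of 
size.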

<<Explanations for how this is done in Squeak>>
> 	I modified my suggestion since there'd be an uproar about a new class
> 	just for this apparently distasteful idiom.
> 	
> Not really.  You'd just have had me pointing out that RunArray already
> does that job.

I don't care. I work on Slate, not Squeak, and those are not idioms I 
would have picked for a "clean re-design". We have trivially-easy lazy 
concatenation, #;;. #; is concatenation and nextPutAll:'s shortcut, so 
it all composes rather simply in the program text. (And no, we don't 
have cascaded message-sends. I'm not going to debate that, though.)

> And that's problem 1 with #repeatedTimes:.
> Looking for methods containing 'repeat' I found a bunch of things
> that weren't there, nor is this a complete list of what _was_ there,
> but it's reasonably thorough.
>
...
>
> So we have three meanings for "repeat" in Squeak at the moment:
> (1) Repeat a block or some other kind of action.
> (2) Concatenate multiple copies of a sound to make another sound.
> (3) A meaning peculiar to RunArray (does anyone other than the 'ar'
>     who added those methods use them?)
>
> Only the two methods in (3) refer to putting multiple copies of an
> element into a sequence, and those two methods do not mention the
> element.
>
> Overwhelmingly, 'repeat' in Squeak means to repeat a block or other
> kind of action.  The name #repeatedTimes: sits uncomfortably with Squeak.

*sigh* Okay, I withdraw the suggestion then. This does surely remind me 
of why I'm building Slate and not hacking Squeak any more. (And the 
initials 'ar' surely don't let me forget.)

> Cultural pressure from other languages could perhaps override this.
> But the languages I've used that had a REPEAT or RPT or RPT$ function
> gave it the same kind of input (a string) as result (a string).
>
> This runs us headlong into the second problem with #repeatedTimes:.
> If we look at existing uses of #new:withAll: we find a variety of
> collections being constructed.  Most are Arrays or Strings, but many
> are not.
>
> What good is an idiom for a construction (making a collection of a
> particular size filled with a particular element) which fails to cover
> so many instances of that construction?
>
> Both the name and the coverage problem end up suggesting that something
> like
>
>     SequenceableCollection>>copied: nTimes
>         ^self species streamContents: [:s |
>             nTimes timesRepeat: [s nextPutAll: self]]
>
> might work.  This *is* like a (shallow) copy, so the name fits.
> It *is* a fairly short name.  It means that
>
>     instead of					you write
>     nil repeatedTimes: 27			#(nil) copied: 27
>     x isNil repeatedTimes: 10			{x isNil} copied: 10
>     ($ ) repeatedTimes: 80			' ' copied: 80
>     ($x repeatedTimes: 80) asArray		{$x} copied: 80
>     (255 repeatedTimes: 256) asByteArray	#[255] copied: 256
>     ?no can do?					'<>' copied: 16
>
> Funny, an approach which is more general *and* shorter *and* doesn't
> confuse people by looking like #timesRepeat:.

Fine, that's a more general idiom. I presume you mean copy/copied: in 
analogy to new/new: in which case I would transform it to 
copy/copySize: since Slate dropped new: very early on for newSize: 
(yes, I know it's a capacity in general). I can do this because I 
control Slate. I don't want to hear about Squeak or Smalltalk-80.

You don't specify the behavior of copySize: for non-unity-sized 
collections, and it's ill-definable for non-Sequences, unless you 
decide on a different definition using capacity. It's even possible 
that there is an inconsistency in making a copy protocol that feeds 
values where they weren't before (Sequences being Mappings when you 
have Traits). This is another matter.
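
To pin down what the quoted #copied: would do on a collection with more 
than one element, here is a rough Python analogue of that definition. 
The function name mirrors the quoted selector, and the choice to repeat 
the whole sequence end to end is read off the quoted code, not a 
settled specification. It only covers ordered sequences that support 
slicing and +; unordered collections are exactly the ill-defined case.

```python
def copied(seq, n_times):
    """Return n_times end-to-end copies of seq, in the same sequence type as seq."""
    if n_times < 0:
        raise ValueError("n_times must be non-negative")
    result = seq[:0]  # empty sequence of the same type as seq
    for _ in range(n_times):
        result = result + seq
    return result

# Single-element receivers reproduce the "fill with one element" idiom.
blank_line = copied(" ", 80)
all_ones = copied(bytes([255]), 256)

# Multi-element receivers repeat the whole sequence, so size is len(seq) * n_times.
brackets = copied("<>", 16)
pairs = copied([1, 2], 3)
```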

> With specific reference to characters and strings, is there any reason
> to prefer #copied: to #timesRepeat:?
>
> Yes.  It's called Unicode.

Fine. Sold.
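
One reading of the Unicode argument, sketched in Python: what a user 
sees as one character can take more than one code point, so an idiom 
that repeats a single character object cannot express it, while 
repeating a string can. The variable names are only for illustration.

```python
import unicodedata

# 'e' followed by COMBINING ACUTE ACCENT: one visible character, two code points.
e_acute = "e\u0301"
repeated = e_acute * 3

# The repeated string holds six code points but displays as three characters.
code_points = len(repeated)

# Normalizing to NFC composes each pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", repeated)
```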

Caveats:
(1) Treating arrays of codes as Characters is not attractive.
(2) I have no reason or inclination to support the Squeak legacy 
idioms listed as meanings (2) and (3) of "repeat", so meaning (1) alone 
is not enough to prevent me from using #repeatedTimes:, but thanks for 
pointing this out.

By the way, as a person who records and aggregates each and every word 
you write on this mailing list about library and language design, 
specifications, and so forth, I very much appreciate your opinion and do 
not much mind the abrasive tone and the haranguing. I do not think that 
this tolerance is shared by others, though, probably because they have 
no motivation to aggregate the information. 
Single-sentence suggestions do not need (even after two stages) 10 
pages of rhetoric to make the counter-point. We've probably shouted 
down any possibility of code being written; I hope not. Whoever wants 
this, write the darned thing and it'll get approved!

--
Brian T. Rice
LOGOS Research and Development
http://tunes.org/~water/
