[Pharo-project] [squeak-dev] Re: Optimizing RunArray

David T. Lewis lewis at mail.msen.com
Wed Aug 3 00:08:50 UTC 2011


On Wed, Aug 03, 2011 at 01:01:11AM +0200, Nicolas Cellier wrote:
> One thing that super upset me is that TextEditor is still using the
> old selectors #readKeyboard #keyboard #keyboardPeek instead of using
> events.

I would like to see this fixed, but I would also like to see it done
right whenever it does get fixed. That means make it work properly in
MVC as well as in Morphic (and SimpleMorphic). It will be more work
to do it that way, but IMO it is better to take the time to do it
right if it is going to be done at all.

One of the many things that is nice about Cuis is the use of events
in TextEditor. When comparing SimpleMorphic (extracted from Cuis)
to Morphic, a large portion of the differences were due to this alone.
But we should use the Squeak Project structure along with ToolBuilder
to make sure that all types of project work in reasonable ways.

I think that MVC is important not just because it is small and
efficient, but also because it provides a real test of portability
across different styles of UI interaction. This gives a real way to
test if some concept or feature is truly generic (and thus should
be supported in a general way in projects and tool builder), or if
it is a feature that is specific to one type of UI. In other words,
if something works in a Morphic project and in an MVC project, then
there is a fairly good chance that it is generic enough that it
will work in a SeasideProject or an AidaProject or a TabletUIProject
or whatever.

So please: Whoever steps up to do the work of changing TextEditor
to use events, please also make it work in MVC while you are at it.
Or ask for a collaborator to work with you to make sure it gets done.

$0.02,

Dave


> Look #handleInteraction:fromEvent:
> It still has this hackish code to transform event driven Morph into
> st-80-polling-ParagraphEditor-friendly-Sensor-like
> 	self editor sensor: (KeyboardBuffer new startingEvent: evt).
> IMO,
> 	self handleInteraction: [ editor readKeyboard ] fromEvent: evt.
> should be replaced with
> 	self handleInteraction: [ editor keyStroke: evt ].
> and the hack should be delegated to the TextMorphEditor (ParagraphEditor)
> keyStrocke: anEvent
> 	"Pass the event to a polling friendly sensor like so that old st-80
> code shall still work"
> 	self sensor: (KeyboardBuffer new startingEvent: evt).
> 	self readKeyboard.
> Then TextEditor>>keyStrocke: should dispatchKeyboardEvent: anEvent
> with: typeAhead and stop querying a pseudo sensor.
> Of course, that means changing a lot of methods in TextEditor (any
> method with 'sensor' in source).
> But if TextEditor is just a place holder for copying the st-80
> ParagraphEditor code and the early Morphic hacks, I just don't see the
> point of making such a copy ;)
> 
> Nicolas
> 
> 2011/8/2 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> > I have tried again with huge text (my 45 MByte change log).
> > Main contributor seems to be Text composition.
> > A marginal gain (2 to 3% when resizing the window) is to remove the
> > indirection in TextStopConditions, and replace instance variable stops
> > with direct integer slots (variableSubclass)
> >
> > No guaranty the fileIn of attachment will work, modifying Text
> > processing is dangerous...
> >
> > Nicolas
> >
> > 2011/8/2 St?phane Ducasse <stephane.ducasse at inria.fr>:
> >> Yes I was wondering about the same (for athens too :))
> >>
> >> Stef
> >>
> >> On Aug 2, 2011, at 2:16 PM, Igor Stasenko wrote:
> >>
> >>> Does it impacts the text rendering/processing speed?
> >>>
> >>>
> >>> On 2 August 2011 12:11, Nicolas Cellier
> >>> <nicolas.cellier.aka.nice at gmail.com> wrote:
> >>>> To complete myself, the fast #collect: already exists and is named
> >>>> #mapValues: except that it modifies the RunArray in place and also
> >>>> won't coalesce...
> >>>>
> >>>> I also gain a huge factor for #collect:as: be defining this method:
> >>>>
> >>>> RunArray>>fillFrom: aCollection with: aBlock
> >>>> ? ? ? ?"Evaluate aBlock with each of aCollections's elements as the argument.
> >>>> ? ? ? ?Collect the resulting values into self. Answer self."
> >>>> ? ? ? ?| newRuns newValues lastLength lastValue |
> >>>> ? ? ? ?newRuns := (Array new: aCollection size) writeStream.
> >>>> ? ? ? ?newValues := (Array new: aCollection size) writeStream.
> >>>> ? ? ? ?lastLength := 0.
> >>>> ? ? ? ?lastValue := Object new.
> >>>> ? ? ? ?aCollection do: [:each |
> >>>> ? ? ? ? ? ? ? ?| value |
> >>>> ? ? ? ? ? ? ? ?value := aBlock value: each.
> >>>> ? ? ? ? ? ? ? ?lastValue = value
> >>>> ? ? ? ? ? ? ? ? ? ? ? ?ifTrue: [lastLength := lastLength + 1]
> >>>> ? ? ? ? ? ? ? ? ? ? ? ?ifFalse:
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?[lastLength > 0
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?ifTrue:
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?[newRuns nextPut: lastLength.
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?newValues nextPut: lastValue].
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?lastLength := 1.
> >>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?lastValue := value]].
> >>>> ? ? ? ?lastLength > 0
> >>>> ? ? ? ? ? ? ? ?ifTrue:
> >>>> ? ? ? ? ? ? ? ? ? ? ? ?[newRuns nextPut: lastLength.
> >>>> ? ? ? ? ? ? ? ? ? ? ? ?newValues nextPut: lastValue].
> >>>> ? ? ? ?self setRuns: newRuns contents setValues: newValues contents
> >>>>
> >>>> [ (Array new: 1000) collect: [:e | 4 atRandom] as: RunArray] bench.
> >>>> BEFORE: '25.1 per second.'
> >>>> AFTER: ?'1,080 per second.'
> >>>>
> >>>> It's worth a few lines of code.
> >>>>
> >>>> Nicolas
> >>>>
> >>>> 2011/8/2 Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com>:
> >>>>> I played a bit with RunArray, and found some un-optimized features.
> >>>>> First, I don't know why RunArray is an ArrayedCollection. It cannot
> >>>>> #add: but it can #addFirst: and #addLast:.
> >>>>> It cannot #add:withOccurrences: but it can #addLast:times:. Why
> >>>>> inventing new selectors for old behaviours ?
> >>>>> These operations will cost a realloc it the last value is different,
> >>>>> so the underlying runs/values could better be an OrderedCollection if
> >>>>> these operations are used often.
> >>>>> A RunArray cannot remove at all.
> >>>>> Very weird collection species, I don't like the implementation too much.
> >>>>>
> >>>>> Then, #do: loops could be far faster. They rely on ArrayedCollection
> >>>>> which inlines do: loops with #to:do: and #at:
> >>>>> But #at: is not that fast. Scanning the runs and counting elements
> >>>>> would result in a n^2 cost.
> >>>>> Fortunately there is a cache lastIndex,lastRun,lastOffset to keep a cost n.
> >>>>> Nonetheless, all the tests cost, and the loop is suboptimal.
> >>>>> Let use see:
> >>>>>
> >>>>> version 1:
> >>>>> RunArray>>fastDo: aBlock
> >>>>> ? ? ? ?runs with: values do: [:r :v |
> >>>>> ? ? ? ? ? ? ? ?r timesRepeat: [aBlock value: v]].
> >>>>>
> >>>>> | tmp |
> >>>>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
> >>>>> {
> >>>>> [ tmp do: [:e |]] bench.
> >>>>> [ tmp fastDo: [:e |]] bench.
> >>>>> }
> >>>>> #('3,220 per second.' '6,290 per second.')
> >>>>>
> >>>>> But timesRepeat: is slow, it is unoptimized by the compiler and costs
> >>>>> a message send.
> >>>>> I think we should implement BlockClosure>>repeat: and optimize that
> >>>>> call in Compiler.
> >>>>> But let's not do it, and rather inline by ourself:
> >>>>>
> >>>>> version 2:
> >>>>> ? ? ? ?runs with: values do: [:r :v |
> >>>>> ? ? ? ? ? ? ? ?1 to: r do: [:i | aBlock value: v]].
> >>>>>
> >>>>> | tmp |
> >>>>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
> >>>>> {
> >>>>> [ tmp do: [:e |]] bench.
> >>>>> [ tmp do2: [:e |]] bench.
> >>>>> }
> >>>>> #('3,070 per second.' '25,500 per second.')
> >>>>>
> >>>>> We can even inline the with:do: loop itself:
> >>>>> version 3:
> >>>>> ? ? ? ?1 to: runs size do: [:i |
> >>>>> ? ? ? ? ? ? ? ?| r v |
> >>>>> ? ? ? ? ? ? ? ?v := values at: i.
> >>>>> ? ? ? ? ? ? ? ?r := runs at: i.
> >>>>> ? ? ? ? ? ? ? ?[( r := r - 1) >= 0]
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?whileTrue: [aBlock value: v]].
> >>>>>
> >>>>> | tmp |
> >>>>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
> >>>>> {
> >>>>> [ tmp do: [:e |]] bench.
> >>>>> [ tmp do2: [:e |]] bench.
> >>>>> }
> >>>>> #('3,370 per second.' '32,200 per second.')
> >>>>>
> >>>>> Now the operation I wanted to use was reverseDo: so I implemented:
> >>>>> RunArray>>fastReverseDo: aBlock
> >>>>> ? ? ? ?| i |
> >>>>> ? ? ? ?i := runs size.
> >>>>> ? ? ? ?[i > 0]
> >>>>> ? ? ? ? ? ? ? ?whileTrue:
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?[ | r v |
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?v := values at: i.
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?r := runs at: i.
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?i := i - 1.
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ?[( r := r - 1) >= 0]
> >>>>> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?whileTrue: [aBlock value: v]].
> >>>>> | tmp |
> >>>>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]) as: RunArray.
> >>>>> {
> >>>>> [ tmp reverseDo: [:e |]] bench.
> >>>>> [ tmp reverseDo2: [:e |]] bench.
> >>>>> }
> >>>>> ?#('83.9 per second.' '32,600 per second.')
> >>>>>
> >>>>> Ouch! The cache is missing a lot of indices, and our loop turns into a n^2 cost.
> >>>>> I know, premature optimization bla bla bla, but a factor x400 is worth
> >>>>> some inlining no?
> >>>>>
> >>>>> I guess these features are never used.
> >>>>> By now RunArray is kind of private utility for Text implementation.
> >>>>> But it could / should be generic.
> >>>>>
> >>>>> I also have proposals for count: / select: / collect:. etc...
> >>>>> It would be to evaluate the block only once per group of values.
> >>>>> For example
> >>>>> RunArray>>collect: aBlock
> >>>>> ? ? ? ?"Beware, the block will be evaluated only once per group of values."
> >>>>> ? ? ? ?^(self class runs: (runs collect: aBlock) contents values: values
> >>>>> copy) coalesce
> >>>>> But that's controversial, it would make the RunArray behave
> >>>>> differently if the block has side effects...
> >>>>>
> >>>>> | i tmp tmp2 tmp3 |
> >>>>> tmp := ((Array new: 1000) collect: [:e | 4 atRandom]).
> >>>>> i := 0.
> >>>>> tmp2 := tmp collect: [:e | i := i + 1].
> >>>>> i := 0.
> >>>>> tmp3 := (tmp as: RunArray) collect: [:e | i := i + 1].
> >>>>> tmp2 = tmp3 asArray
> >>>>> ==> false
> >>>>>
> >>>>> Nicolas
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> Best regards,
> >>> Igor Stasenko AKA sig.
> >>>
> >>
> >>
> >>
> >



More information about the Squeak-dev mailing list