[squeak-dev] [Please Review] Refactoring for #literalsDo: etc.

Marcel Taeumel marcel.taeumel at hpi.de
Thu Jul 4 09:56:29 UTC 2019

Hi all! :-)

I think I managed to (somewhat) complete the refactoring of #literalsDo:, #allLiterals, and #hasLiteral:. My goals were as follows:

1. Reduce duplicate enumeration code in the form of "1 to: self numLiterals -1 do:".
2. Fix regressions that occurred in SistaV1 because blocks are no longer inlined at the bytecode level (as they were in V3).
3. Understand and clarify the meaning of "literal" in the programming interface.
4. Always enumerate literals thoroughly for #allLiteralsDo: (new) and #hasLiteral:.
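
As a rough workspace sketch of goal 1 (not taken from the changeset), the two styles look like this. The example method is the one suggested further down in this thread. The two collections are not claimed to be equal, because the enumeration boundaries are exactly what is under review:

```smalltalk
"Workspace sketch only; the loop bounds are the ones quoted in goal 1."
| method viaLoop viaEnumerator |
method := Morph >> #fullDrawOn:.

"Old style: index arithmetic repeated at every call site."
viaLoop := OrderedCollection new.
1 to: method numLiterals - 1 do: [:index |
	viaLoop add: (method literalAt: index)].

"New style: the boundaries are decided once, inside #literalsDo:."
viaEnumerator := OrderedCollection new.
method literalsDo: [:each | viaEnumerator add: each].

"Inspect both collections to compare what each style yields."
{ viaLoop. viaEnumerator }
```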

Please find attached a changeset to try out. The interface I would like you to review is:

- all implementors of #literalsDo: and #literals and #literalAt: (-> does not descend)
- all implementors of #codeLiteralsDo: and #codeLiterals (-> descends into compiled code objects)
- all implementors of #allLiteralsDo: and #allLiterals (-> descends into compiled code objects and arrays and bindings and pragmas ... to make enumeration "thorough")
- all implementors of #hasLiteral: and #hasLiteralSuchThat: and #literalEqual:
- all implementors of #messagesDo: and #messages and #sendsMessage:
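
To get a feel for the three enumeration levels, a workspace sketch along these lines may help. It assumes the attached changeset is loaded (so that #codeLiteralsDo: and #allLiteralsDo: exist) and makes no claim about what each enumerator passes to the block; it only counts the elements:

```smalltalk
"Workspace sketch; assumes the attached changeset is loaded."
| method shallow code thorough |
method := Morph >> #fullDrawOn:.

"Level 1: does not descend."
shallow := OrderedCollection new.
method literalsDo: [:each | shallow add: each].

"Level 2: descends into compiled code objects."
code := OrderedCollection new.
method codeLiteralsDo: [:each | code add: each].

"Level 3: thorough, also descends into arrays, bindings, pragmas ..."
thorough := OrderedCollection new.
method allLiteralsDo: [:each | thorough add: each].

"Print or inspect this array to compare the three sizes."
{ shallow size. code size. thorough size }
```

Exploring the three collections themselves, rather than their sizes, shows which objects each level adds.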

Please also take a look at the (rewritten) tests:

LiteralRefLocatorTest >> #testFindLiteralsInBytecode
LiteralRefLocatorTest >> #testThoroughFindLiteralsInBytecode
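
A minimal way to run these from a workspace, assuming the changeset (and with it the rewritten test class) is loaded, is to use the standard SUnit class-side entry points and print the results:

```smalltalk
"Workspace sketch; each expression answers a TestResult to print or inspect."
{
	LiteralRefLocatorTest run: #testFindLiteralsInBytecode.
	LiteralRefLocatorTest run: #testThoroughFindLiteralsInBytecode.
	"Or run every test in the class at once:"
	LiteralRefLocatorTest suite run
}
```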

The following benchmarks look promising:

[SystemNavigation default allCallsOn: #drawOn:] bench

BEFORE: '41.3 per second. 24.2 milliseconds per run.'
AFTER: '37.7 per second. 26.5 milliseconds per run.'

[SystemNavigation default allSelectorsAndMethodsDo: [:b :s :m | m allLiterals]] bench

BEFORE: '33.1 per second. 30.2 milliseconds per run.'
AFTER: '32 per second. 31.2 milliseconds per run.'

[SystemNavigation default allCallsOnClass: OrderedCollection] bench

BEFORE: '34.9 per second. 28.7 milliseconds per run.'
AFTER: '4.04 per second. 248 milliseconds per run.'

[(PasteUpMorph >> #tryInvokeHalo:) allLiterals] bench

BEFORE: '1,320,000 per second. 759 nanoseconds per run.'
AFTER: '951,000 per second. 1.05 microseconds per run.'

[(PasteUpMorph >> #tryInvokeHalo:) hasLiteral: #second] bench

BEFORE: '3,710,000 per second. 270 nanoseconds per run.'
AFTER: '1,170,000 per second. 855 nanoseconds per run.'

[(Object >> #yourself) allLiterals] bench

BEFORE: '3,570,000 per second. 280 nanoseconds per run.'
AFTER: '3,660,000 per second. 273 nanoseconds per run.'

Here are the things I could not manage to fix or implement yet:

1. Enumerating special selectors via #allLiteralsDo:. This does work for #hasLiteral: and #messagesDo:, though, because they use InstructionStream scanning.
2. Enumerating special literals via #allLiteralsDo:. This does work for #hasLiteral:, though, because it uses InstructionStream scanning.

It would be nice to have a version of BytecodeEncoder class >> #scanBlockOrNilForLiteral: that would be able to just check for any special selector or literal.
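
The gap between testing and enumerating can be probed with a workspace sketch like the following. It assumes the attached changeset is loaded and reuses the method and selector from the benchmarks above:

```smalltalk
"Workspace sketch; assumes the attached changeset is loaded."
| method selector |
method := PasteUpMorph >> #tryInvokeHalo:.
selector := #second.

"First element: test via #hasLiteral: (uses bytecode scanning).
 Second element: test via plain enumeration of #allLiterals."
{
	method hasLiteral: selector.
	method allLiterals includes: selector
}
```

For an ordinary selector the two answers would be expected to agree. For a special selector such as #>=, in a method that sends it, the limitation described above suggests that the first could be true while the second is false.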


On 27.06.2019 09:31:56, Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Eliot,

thanks for the reply. :-) 

Do you think we need a way to descend into CompiledCode (i.e., SistaV1) *but* not into Array literals and Pragmas? At the time of writing, my changeset offers #allLiteralsDo: to go all the way down and #literalsDo: to just enumerate the stuff between header and first bytecode.

On 27.06.2019 00:42:03, Eliot Miranda <eliot.miranda at gmail.com> wrote:
Hi Marcel,

On Mon, Jun 24, 2019 at 5:32 AM Marcel Taeumel <marcel.taeumel at hpi.de> wrote:

Hi all,

next round. Please find attached a changeset:

thanks.  Sorry to be late to this party.

- senders browsing is only slightly slower but *thorough* all the time

I think that's safe.  Thorough might be retained as a way of not descending into Array literals and Pragmas.  But if you're happy that we always traverse these (and avoid being caught by recursion in circular structures), then fine.

- #hasLiteral: now also works for #>= and other special literals, even though not enumerable
- #literalsDo: enumerates the *raw* literals (for copy etc.) in that byte array (i.e. after header, before first byte code)
- #allLiteralsDo: enumerates the *real* literals, that is, it is thorough but skips special objects such as outer scope, method selector, and class binding -- (for #hasLiteral:, tools etc.)

I just thought about writing some tests for it but those would have to account for V3 vs. SistaV1. Hmmm... maybe just tests for #hasLiteral: and #allLiteralsDo:. Not that low-level #literalsDo:.

Alas yes.  But there are some prototype tests you can use.  See LiteralRefLocatorTest.  These tests compile two versions of each tested method, in the primary and secondary bytecode sets.  So if you start there you'll find that testing both V3PlusClosures and SistaV1 is taken care of.


On 24.06.2019 11:27:31, Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi Nicolas,

thanks. :-) That's correct. It's an extension to high-level test and enumeration. I just discovered that I have to integrate (or update) CompiledCode >> #refersTo:bytecodeScanner:thorough: to respect those special selectors.

On 24.06.2019 10:56:21, Nicolas Cellier <nicolas.cellier.aka.nice at gmail.com> wrote:
Hi Marcel,
beware, there are specialSelectors and inlined selectors too (do: ifTrue: etc...), so it might happen that low level literals enumeration does not fit higher level...

On Mon, Jun 24, 2019 at 10:48, Marcel Taeumel <marcel.taeumel at hpi.de> wrote:

Hi, there.

Please find attached a new version of this refactoring.

I discovered more recent code for scanning literals (that seems to be at the bytecode level, not object level):

BytecodeEncoder class >> #scanBlockOrNilForLiteral:
EncoderForSistaV1 class >> #scanBlockOrNilForLiteral:
EncoderForV3 class >> #scanBlockOrNilForLiteral:

However, I could not see a way to enumerate literals this way. Did I miss something? Makes me wonder about the entire use of CompiledCode >> #literalsDo:. We could replace #hasLiteral: with an implementation similar to Behavior >> #whichSelectorsreferTo:thorough: using #scanBlockOrNilForLiteral:.

Thoughts? Eliot? :-) What is the conceptual relationship between "has literal" and "literals do"?

On 28.05.2019 14:49:59, Marcel Taeumel <marcel.taeumel at hpi.de> wrote:
Hi, there.

Please find attached a change set that (tries to) clean up everything related to enumerating or testing literals in CompiledCode, CompiledMethod, and CompiledBlock.

I have three important questions:

- Is #hasLiteralThorough still needed, given that we now always enumerate and test in a "thorough" way?
- Are the enumeration boundaries in CompiledCode, CompiledMethod, CompiledBlock in #literalsDo: correct?
- What counts as a literal for which #hasLiteral: etc. should answer true? Just symbols, or also bindings (symbol-to-class) and the classes themselves?

Try exploring this result: "(Morph >> #fullDrawOn:) literals" or similar.




best, Eliot
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: literals-do.15.cs
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20190704/9d4c40d9/attachment-0001.ksh>
