<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr">On Tue, Jan 8, 2019 at 2:39 PM Eliot Miranda <<a href="mailto:eliot.miranda@gmail.com">eliot.miranda@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi Nicolas,</div><br><div class="gmail_quote"><div dir="ltr">On Tue, Jan 8, 2019 at 2:30 PM Nicolas Cellier <<a href="mailto:nicolas.cellier.aka.nice@gmail.com" target="_blank">nicolas.cellier.aka.nice@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"> <div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div dir="ltr">On Tue, Jan 8, 2019 at 23:07, Nicolas Cellier <<a href="mailto:nicolas.cellier.aka.nice@gmail.com" target="_blank">nicolas.cellier.aka.nice@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><br></div><div class="gmail_quote"><div dir="ltr">On Tue, Jan 8, 
2019 at 22:51, Nicolas Cellier <<a href="mailto:nicolas.cellier.aka.nice@gmail.com" target="_blank">nicolas.cellier.aka.nice@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div>Hi all,</div><div>particularly Clement and Eliot,</div><div><br></div><div>One of the most annoying limits of the bytecode is the number of arguments (<16 in V3). It is not much of a problem for pure Smalltalk, but it certainly is for FFI (FORTRAN 77 lacks structures, so existing code bases often have functions with many arguments).</div><div>For scientific Smalltalk, some of those old FORTRAN libraries are still around nowadays (LAPACK is an example).</div><div><br></div><div>I patched the old Squeak compiler in Smallapack to work around this limitation (it was easy enough to pass a single Array, and invoke FFI with many args).</div><div>In the modern Pharo flavour, this is more involved with the new OpalCompiler (it does not seem to be designed for extensibility, as it seems necessary to patch many pieces/subclasses for a single feature change...).</div><div><br></div><div>But we now have Sista V1 bytecodes, which removed a lot of limitations (# inst vars, # literals, max jump offset ...). 
Alas, I don't see a modified limit for the number of arguments (source: <a href="https://hal.inria.fr/hal-01088801/document" target="_blank">https://hal.inria.fr/hal-01088801/document</a> a bytecode set for adaptive optimization): only 4 bits are still reserved for it in the compiled method header documented in the link above.</div><div>Though, there is an adjacent unused bit now...</div><div>In Squeak/Pharo, EncoderForSistaV1>>genSend:numArgs: suggests that the limit is 31 (sic)<br></div><div><br>    (nArgs < 0 or: [nArgs > 31]) ifTrue:<br>        [^self outOfRangeError: 'numArgs' index: nArgs range: 0 to: 31 "!!"].</div><div><br></div><div>or at least 2047 if we believe the comment below:</div><div><br>    "234        11101010    i i i i i j j j    Send Literal Selector #iiiii (+ Extend A * 32) with jjj (+ Extend B * 8) Arguments"<br></div><div><br></div><div><a href="https://github.com/pharo-project/pharo/blob/50992c3e5fed790b7e660954aee983f4681da658/src/Kernel-BytecodeEncoders/EncoderForSistaV1.class.st" target="_blank">https://github.com/pharo-project/pharo/blob/50992c3e5fed790b7e660954aee983f4681da658/src/Kernel-BytecodeEncoders/EncoderForSistaV1.class.st</a></div><div><br></div><div>Pharo also limits numArgs to 15, whatever the encoding, in CompiledMethod>><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">newBytes:</span><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">trailerBytes:</span><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">nArgs:</span><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">nTemps:</span><span 
class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">nStack:</span><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">nLits:</span><span class="gmail-m_2651482508952953415gmail-m_-1299208501136377445gmail-m_1991889944658992902gmail-m_3386891624357783899gmail-pl-c1">primitive:</span><br></div><div><a href="https://github.com/pharo-project/pharo/blob/50992c3e5fed790b7e660954aee983f4681da658/src/Kernel/CompiledMethod.class.st" target="_blank">https://github.com/pharo-project/pharo/blob/50992c3e5fed790b7e660954aee983f4681da658/src/Kernel/CompiledMethod.class.st</a></div><div><br></div><div>But Squeak does not limit nArgs at all in</div><div>EncoderForSistaV1>>computeMethodHeaderForNumArgs:numTemps:numLits:primitive:<br></div><div><br></div><div>So my questions:</div><div>- is that doc up-to-date?</div><div>- if so, couldn't we expand the limit to 31 args by using the unused bit?</div><div><br></div><div>Note: there is another unused bit in V3 (not adjacent), and the double extended (send) bytecode has room for 31 args in V3 too, since only the first 3 bits of the second byte encode the type of operation...<br></div></div></div></div></div></div></div></div></div></blockquote><div><br></div><div>Confirmed in VMMaker: max num args is still 15 in the CompiledMethod header</div><div><br></div><div>StackInterpreter>>argumentCountOfMethodHeader: header<br>    &lt;api&gt;<br>    ^header >> MethodHeaderArgCountShift bitAnd: 16rF</div><div><br></div><div>I see that this decoding is shared by all the bytecode set alternatives...<br></div></div></div></div></blockquote><div><br></div><div>And the other unused bit (flag, position 30, 0-based) is not always unused... 
See maybeFlagMethodAsInterpreted:<div><br></div><div>         realHeader := realHeader bitOr: (objectMemory integerObjectOf: 1 << MethodHeaderFlagBitPosition).</div><div><br></div></div><div>Why do we flag the interpreted methods?</div></div></div></div></div></div></blockquote><div><br></div><div>This is for me, for profiling. Because Cog is a hybrid interpreter+JIT, it can use the policy of avoiding JITting a method until it is found in the method lookup cache, or (to make sure that doits run at full speed) evaluated by one of the withArgs:executeMethod: primitives. This means that the JIT doesn't waste time JITting huge methods that are only evaluated once on system startup. JITting is like a very slow interpretation, so it doesn't make sense to JIT a method that is simply a long sequence of statements if it is only used once; it will be much quicker and use much less space to simply interpret it once, instead of JITting it and then evaluating it once. Further, the VM won't JIT anything with more than a certain number of literals anyway. This is a startup parameter and defaults to 60 literals. Any method with more than 60 literals won't be JITted.</div></div></div></div></blockquote><div><br></div><div>Note also that the interpreter counts backward branches, and if it finds it is in a method that is looping a lot (default 20 iterations IIRC), it will JIT compile the method and switch to the JIT version of it on the 20th iteration. This avoids the VM getting stuck interpreting a method that loops forever.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div>But I wanted to be sure that this policy was not affecting any methods that are performance critical. 
So I added the facility to flag interpreted methods so that I can see which executed methods the VM chooses not to JIT.<br></div><div><br></div><div>Back in 2009/2010 I did indeed use the facility to check, and found that my intuitions were correct and that the policy is a good one (although our including the selectors for inlined messages such as ifTrue:ifFalse: et al might mean we want to increase the limit on the number of literals for JITting a little, 65 perhaps?).</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div class="gmail_quote"><div>I had the impression that the offset was off by 1</div><div><br>    MethodHeaderFlagBitPosition := 28 + tagBits.</div><div><br></div><div>but we are tricky, since integerObjectOf: will perform the missing 1-bit shift...<br></div>This way Newspeak, which uses both bits 29-30, has the right offset. Ouch.<br></div><div class="gmail_quote">I now see no room left in that header...</div></div></div></div></div>
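<div><br></div><div>To make the off-by-one point above concrete, here is a small sketch of the arithmetic in Python. It assumes tagBits is 1 and that integerObjectOf: shifts its argument left by one bit and sets the SmallInteger tag bit; both are assumptions made for illustration and are not taken from VMMaker. The names below are hypothetical stand-ins, not the VM's own code.</div>

```python
# Hypothetical stand-ins for the VM constants; the values are assumptions.
TAG_BITS = 1                                      # assumed SmallInteger tag width
METHOD_HEADER_FLAG_BIT_POSITION = 28 + TAG_BITS   # same formula as in the mail


def integer_object_of(value: int) -> int:
    """Assumed tagging scheme: shift left by one bit and set the tag bit."""
    return (value << 1) | 1


def flag_header(real_header: int) -> int:
    """Mirrors: realHeader bitOr: (integerObjectOf: 1 << position)."""
    return real_header | integer_object_of(1 << METHOD_HEADER_FLAG_BIT_POSITION)


header = integer_object_of(0)      # a tagged header with no fields set
flagged = flag_header(header)

# Position (0-based) of the single bit that differs between the two headers.
flag_bit = (flagged ^ header).bit_length() - 1
print(flag_bit)
```

<div>Under those assumptions the constant names bit 29, and the extra shift done by the tagging step moves the flag to bit 30 of the tagged header word, which is the position quoted earlier in the thread.</div>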

</blockquote></div><div><br></div><div dir="ltr" class="gmail-m_2651482508952953415gmail_signature"><div dir="ltr"><div><span style="font-size:small;border-collapse:separate"><div>_,,,^..^,,,_<br></div><div>best, Eliot</div></span></div></div></div></div></div>

</blockquote></div><br clear="all"><div><br></div>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><span style="font-size:small;border-collapse:separate"><div>_,,,^..^,,,_<br></div><div>best, Eliot</div></span></div></div></div></div>