Jeremy,<div><br></div><div>    Smalltalk-80 (and Squeak) opcodes are for a spaghetti stack machine where each activation is a separate object.  These activation objects are called contexts, and chain together through the sender field.  Each context has a fixed-size stack (in Squeak there are small and large contexts, maximum size 52 stack slots).  Each activation holds onto a compiled method, which is a vector of literal objects and a vector of bytecodes.  In Squeak and Smalltalk-80 these two vectors are encoded in a single flat object, half references to other objects (literals), half bytes (opcodes).  Since both contexts and compiled methods are objects, the system implements its compiler and meta-level interpreter in Smalltalk itself; these in turn require a real machine (the virtual machine) to execute.  If you run a Squeak or Pharo system you will be able to browse the classes that implement the compiler and the meta-level interpreter.  In particular:</div>

<div><br></div><div>The classes EncoderForV3 &amp; EncoderForV3PlusClosures implement the back-end of the compiler, generating concrete opcodes for abstract bytecodes such as pushReceiver: send:numArgs: etc.</div><div>Instances of class CompiledMethod are generated by the compiler (see MethodNode&gt;generate:using:) using an instance of EncoderForV3PlusClosures.</div>
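<div>To give a feel for what "generating concrete opcodes for abstract bytecodes" means, here is a rough sketch in Python rather than Smalltalk. The function names are hypothetical and are not the encoder's actual selectors. Only the short one-byte forms are covered, and the byte ranges are the ones that agree with the printOn: disassembly further down in this message; please check them against the encoder classes before relying on them.</div>
<pre>
```python
# Sketch only: hypothetical function names, short one-byte forms only.
# Larger indices need the extended (multi-byte) bytecodes, not shown here.

def gen_push_temp(index):
    """Abstract 'pushTemp: index' -> one byte in 0x10..0x1F."""
    if not 0 <= index < 16:
        raise ValueError("temp index needs an extended bytecode")
    return bytes([0x10 + index])


def gen_push_literal_constant(index):
    """Abstract 'pushConstant: literal' -> one byte in 0x20..0x3F."""
    if not 0 <= index < 32:
        raise ValueError("literal index needs an extended bytecode")
    return bytes([0x20 + index])


def gen_send(literal_index, num_args):
    """Abstract send of a literal selector with 0..2 args -> 0xD0..0xFF."""
    if not (0 <= literal_index < 16 and 0 <= num_args <= 2):
        raise ValueError("send needs an extended bytecode")
    return bytes([0xD0 + 16 * num_args + literal_index])
```
</pre>
<div>The point is that the abstract operation and its operands are packed into the opcode byte itself whenever the operands are small enough.</div>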

<div><br></div><div>The class InstructionClient defines all the abstract opcodes for the current V3 plus closures instruction set.</div><div>The class InstructionStream decodes/interprets CompiledMethod instances, dispatching sends of the messages understood by InstructionClient to itself.  InstructionStream has several subclasses which respond to the sends of the opcodes in different ways.</div>

<div><br></div><div>Most importantly, ContextPart and its subclass MethodContext implement the InstructionClient API by simulating execution.  Hence ContextPart and MethodContext provide a specification in Smalltalk of the semantics of the bytecodes.  EncoderForV3 &amp; EncoderForV3PlusClosures serve as a convenient reference for opcode encodings, and are well-commented.</div>
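<div>If the context chain described at the top of this message is hard to picture, the following Python sketch may help. It is an illustration only: the class and field names are my own shorthand, not the layout of the real objects, and the 52-slot limit is just the figure quoted above.</div>
<pre>
```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CompiledMethod:
    literals: list      # references to other objects
    bytecodes: bytes    # the opcodes


@dataclass
class Context:
    method: CompiledMethod
    receiver: object
    sender: Optional["Context"] = None   # link to the calling activation
    pc: int = 0
    stack: list = field(default_factory=list)

    MAX_SLOTS = 52  # largest context size quoted above (assumed limit)

    def push(self, value):
        """Push onto this activation's own fixed-size stack."""
        if len(self.stack) >= self.MAX_SLOTS:
            raise OverflowError("context stack is fixed size")
        self.stack.append(value)


def sender_chain(context):
    """Walk the sender links from the active context outwards."""
    chain = []
    while context is not None:
        chain.append(context)
        context = context.sender
    return chain
```
</pre>
<div>Each activation owns its stack, and a return simply resumes the sender; there is no single contiguous machine stack in this model.</div>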

<div><br></div><div>By the way, InstructionClient&#39;s subclass InstructionPrinter responds to the API by disassembling a compiled method; hence aCompiledMethod symbolic prints opcodes, e.g.</div><div>(Object &gt;&gt; #printOn:) symbolic evaluates to the string</div>

<div><div>&#39;37 &lt;70&gt; self</div><div>38 &lt;C7&gt; send: class</div><div>39 &lt;D0&gt; send: name</div><div>40 &lt;69&gt; popIntoTemp: 1</div><div>41 &lt;10&gt; pushTemp: 0</div><div>42 &lt;88&gt; dup</div><div>43 &lt;11&gt; pushTemp: 1</div>

<div>44 &lt;D5&gt; send: first</div><div>45 &lt;D4&gt; send: isVowel</div><div>46 &lt;99&gt; jumpFalse: 49</div><div>47 &lt;23&gt; pushConstant: &#39;&#39;an &#39;&#39;</div><div>48 &lt;90&gt; jumpTo: 50</div><div>49 &lt;22&gt; pushConstant: &#39;&#39;a &#39;&#39;</div>

<div>50 &lt;E1&gt; send: nextPutAll:</div><div>51 &lt;87&gt; pop</div><div>52 &lt;11&gt; pushTemp: 1</div><div>53 &lt;E1&gt; send: nextPutAll:</div><div>54 &lt;87&gt; pop</div><div>55 &lt;78&gt; returnSelf</div><div>&#39;</div>
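<div>Since you are thinking in terms of a jump table indexed by opcode, here is a rough Python sketch that decodes exactly the bytes in the listing above through a 256-entry table. It is not how InstructionStream is written. The literal frame is inferred from the listing, and only the opcode ranges that occur in this one method are filled in, so treat the table as a starting point, not a complete opcode map.</div>
<pre>
```python
# Literal frame inferred from the listing above (index -> literal);
# an assumption for illustration, not read out of an image.
LITERALS = ["name", "nextPutAll:", "'a '", "'an '", "isVowel", "first"]

# The 19 bytes at pcs 37..55, copied from the hex column of the listing.
CODE = bytes.fromhex("70C7D0691088" "11D5D4992390" "22E18711E18778")


def build_table():
    """Return a 256-entry table mapping each opcode byte to a handler."""
    table = [None] * 256

    def fill(first, last, handler):
        for op in range(first, last + 1):
            table[op] = handler

    fill(0x10, 0x1F, lambda op, pc: f"pushTemp: {op - 0x10}")
    fill(0x20, 0x3F, lambda op, pc: f"pushConstant: {LITERALS[op - 0x20]}")
    fill(0x68, 0x6F, lambda op, pc: f"popIntoTemp: {op - 0x68}")
    fill(0x70, 0x70, lambda op, pc: "self")
    fill(0x78, 0x78, lambda op, pc: "returnSelf")
    fill(0x87, 0x87, lambda op, pc: "pop")
    fill(0x88, 0x88, lambda op, pc: "dup")
    # Short jumps: displacement 1..8, relative to the following bytecode.
    fill(0x90, 0x97, lambda op, pc: f"jumpTo: {pc + 1 + (op - 0x90) + 1}")
    fill(0x98, 0x9F, lambda op, pc: f"jumpFalse: {pc + 1 + (op - 0x98) + 1}")
    # 0xC7 is the special-selector send of #class seen in the listing.
    fill(0xC7, 0xC7, lambda op, pc: "send: class")
    # Literal-selector sends: the low 4 bits index the literal frame.
    fill(0xD0, 0xFF, lambda op, pc: f"send: {LITERALS[op & 0x0F]}")
    return table


def disassemble(code, first_pc):
    """Decode one-byte opcodes only; every opcode in CODE is one byte."""
    table = build_table()
    lines = []
    for offset, op in enumerate(code):
        pc = first_pc + offset
        handler = table[op]
        if handler is None:
            raise NotImplementedError(f"opcode {op:02X} not covered by this sketch")
        lines.append(f"{pc} <{op:02X}> {handler(op, pc)}")
    return lines
```
</pre>
<div>Calling disassemble(CODE, 37) gives the same lines as the listing, apart from the doubled quotes around the string literals. Multi-byte (extended) opcodes would need the handler to consume following bytes, which this sketch does not attempt.</div>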

</div><div><br></div><div><br></div><div>InstructionStream&#39;s subclass Decompiler, in turn, implements the API by reconstructing a compiler parse tree for the compiled method, so e.g.</div><div><div>(Object &gt;&gt; #printOn:) decompile prints as</div>

<div><div>printOn: t1 </div><div><span class="Apple-tab-span" style="white-space:pre">        </span>| t2 |</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>t2 := self class name.</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>t1</div>

<div><span class="Apple-tab-span" style="white-space:pre">                </span>nextPutAll: (t2 first isVowel</div><div><span class="Apple-tab-span" style="white-space:pre">                                </span>ifTrue: [&#39;an &#39;]</div><div><span class="Apple-tab-span" style="white-space:pre">                                </span>ifFalse: [&#39;a &#39;]);</div>

<div><span class="Apple-tab-span" style="white-space:pre">                </span> nextPutAll: t2</div></div><div>whereas the source code for the same method ((Object &gt;&gt; #printOn:) getSourceFromFile) evaluates to a Text for</div><div>

<div>&#39;printOn: aStream</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>&quot;Append to the argument, aStream, a sequence of characters that  </div><div><span class="Apple-tab-span" style="white-space:pre">        </span>identifies the receiver.&quot;</div>

<div><br></div><div><span class="Apple-tab-span" style="white-space:pre">        </span>| title |</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>title := self class name.</div><div><span class="Apple-tab-span" style="white-space:pre">        </span>aStream</div>

<div><span class="Apple-tab-span" style="white-space:pre">                </span>nextPutAll: (title first isVowel ifTrue: [&#39;&#39;an &#39;&#39;] ifFalse: [&#39;&#39;a &#39;&#39;]);</div><div><span class="Apple-tab-span" style="white-space:pre">                </span>nextPutAll: title&#39;</div>

</div></div><div><br></div><div>So if you want to find a current, comprehensible specification of the Squeak/Pharo opcode set, I recommend browsing EncoderForV3, EncoderForV3PlusClosures, InstructionClient, InstructionStream, ContextPart and MethodContext.  Further, I recommend exploring existing CompiledMethod instances using doits such as</div>

<div><br></div><div>    SystemNavigation new browseAllSelect: [:m| m scanFor: 137]</div><div><br></div><div>HTH</div><div>Eliot</div><div><br><div class="gmail_quote">On Mon, Apr 16, 2012 at 10:03 AM, Jeremy Kajikawa <span dir="ltr">&lt;<a href="mailto:jeremy.kajikawa@gmail.com">jeremy.kajikawa@gmail.com</a>&gt;</span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

Colin: thanks... something like that... just trying to work out the<br>

octet numbers and formatting for what data goes where.<br>

<br>

as I trying to encode this at assembler level where each opcode value<br>

has a specific routine that is called from a opCodeVector JumpTable<br>

<br>

Each Entry in the JumpTable is directly executed by the processor with<br>

a second JumpTable encoded similarly for basic microcode Read/Write<br>

functions to deal with various standard DataTypes in fixed formats<br>

<br>

this is to plug into the generic Interpreter engine I already have.<br>

<br>

the first test of this was to Emulate an Intel 80486 on a Motorola<br>

68040 processor with the Host running at 25MHz.<br>

<br>

I managed to get an average speed rating of between 16MHz to 20MHz<br>

performance even with &quot;real world&quot; code being run through<br>

<br>

I am currently re-implimenting this engine on top of a PPC host and<br>

would like to expand its modularity to additional languages and<br>

targets.<br>

<br>

If at all possible I would like to make the equivalent &quot;machine level&quot;<br>

interpretation of the opcode numbers possible even if there is inline<br>

data and addresses present as well.<br>

<br>

With having no prior experience with Smalltalk any usage of terms I<br>

know in a different will won&#39;t make any sense initially and trying to<br>

get to grips with Smalltalk by using the Environment ... I already<br>

tried this unsuccessfully.<br>

<br>

I&#39;m more interested in the number codes that each operation is<br>

represented by and making routines to match within set ranges,  and<br>

where one operation is multiple codes chained,  being able to have a<br>

listing starting with 0x00 is opcode &quot;somename&quot; and has N octets of<br>

immediate values following it formatted as ?:? bitstrings.<br>

<br>

If that makes any sense?<br>

<br>

as for stack or message information,  I&#39;m willing to work out what is<br>

needed to make those happen if they are needed as bytecode level<br>

information.<br>

<br>

On Tue, Apr 17, 2012 at 4:20 AM, Colin Putney &lt;<a href="mailto:colin@wiresong.com">colin@wiresong.com</a>&gt; wrote:<br>

&gt;<br>

&gt;<br>

&gt; On 2012-04-16, at 8:14 AM, Jeremy Kajikawa wrote:<br>

&gt;<br>

&gt; I am somewhat dogmatically minded about technical details,  so I am<br>

&gt; unlikely to wade through buckets of documentation about Smalltalk as a<br>

&gt; language and how to use it if it is not answering the question about<br>

&gt; what I am looking up.<br>

&gt;<br>

&gt;<br>

&gt; I&#39;m confused. You want to implement a Smalltalk interpreter, but you&#39;re not interested in the details of the language? Perhaps you should tell us what your overall goal is. That way we can provide more useful information.<br>


&gt;<br>

&gt; As for documentation of the bytecode set, you may find the Blue Book useful. It&#39;s the canonical description of how Smalltalk works, including the interpreter. Squeak is a descendant of this implementation. The section on the interpreter is here:<br>


&gt;<br>

&gt; <a href="http://www.mirandabanda.org/bluebook/bluebook_chapter28.html#StackBytecodes28" target="_blank">http://www.mirandabanda.org/bluebook/bluebook_chapter28.html#StackBytecodes28</a><br>

&gt;<br>

&gt; Hope this helps,<br>

&gt;<br>

&gt; Colin<br>

&gt;<br>

&gt; PS. Since this has nothing to do with Ubuntu, I&#39;ve changed the subject to something more appropriate<br>

&gt;<br>

</blockquote></div><br><br clear="all"><div><br></div>-- <br>best,<div>Eliot</div><br>

</div>