[Vm-dev] Bytecode set (was Ubuntu unit issues)

Eliot Miranda eliot.miranda at gmail.com
Tue Apr 17 16:47:05 UTC 2012


Hi Jeremy,

On Mon, Apr 16, 2012 at 11:22 PM, Jeremy Kajikawa <jeremy.kajikawa at gmail.com> wrote:

>
> Further details then...
>
> On Tue, Apr 17, 2012 at 7:57 AM, Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
> >
> > Hi Jeremy,
> >
> >    I'll try one more time :)
> >
> > On Mon, Apr 16, 2012 at 11:17 AM, Jeremy Kajikawa <
> jeremy.kajikawa at gmail.com> wrote:
> >>
> >>
> >> It would be nice to actually see actual formatted data without the
> >> language specific semantics,
> >>
> >> since I am writing the VM itself
> >>  (there is no other VM on the target so any requirement to run an
> >> existing SqueakVM ...)
> >>
> >> nice to know that tidbit of detail was dutifully ignored...
> >
> >
> > I didn't ignore it.  I wasn't assuming you would use the existing system
> for other than reading code and exploring the system.  I understand you
> want to do a clean-room VM for the target (and good luck, that's a fun
> project, one I cut my teeth on many years ago now).  You're sending email,
> therefore you probably have access to a system that can run Squeak and
> Pharo and can hence run the system and explore it so as to educate yourself
> on what's involved in a Smalltalk VM implementation.  IMO the best source
> of specification and documentation on the bytecode set is in the image in
> the classes I mentioned.
> >
>
> Yes I am clean-room building... and not dedicated to a single target.
>
> >> Glad to see that the opcode values are presented... but you skipped
> >> the formatting of the objects themselves,  or is this left as an
> >> exercise for the reader?
> >
> >
> > I was trying to direct your attention to specification of the opcodes.
>  I can point you to implementation of the image format, and hence the
> object format, but not to specification. But I think it is important you
> understand the bytecode and VM semantics if you don't want to waste time on
> implementation details.  Only by understanding the semantics will you have
> any idea of how complex the implementation is.  Send semantics and context
> semantics are much much more complex than conventional processor opcode
> semantics.
> >
>
> I'm working from Classical Semantics and need to know the differences,
>  specifically so I can deal with them.  the other targets I am dealing
> with are Classical Semantics based... so Smalltalk has to be "fit in"
> with the rest without majorly redesigning the system just to deal with
> it (the system I am working on has proven its speed and stability with
> the existing design details for everyday reliable use)
>
> >> I am NOT a classically trained programmer in that I am self-taught for
> >> the most part...
> >>  and I am trying to phrase my question the best I know how.
> >>
> >> use of Pharo or Squeak VirtualMachines **IS*NOT** available, no ifs, buts,
> >> maybes or otherwise.
> >
> >
> > So what systems are you using to send email?
>
> Same system, using the Gmail web interface.
>
> >>
> >> so without my actually implementing the VM itself to make the language
> >> available it simply won't happen.
> >>
> >> so... is there a defined list where the format of the instructions and
> >> any attached description where strings of one or more octets are
> >> modified or moved?
> >
> >
> > To my knowledge the only up-to-date info is in the image and in the
> VMMaker package which is difficult to read other than using the running
> system.
> >
> Would that not break "Clean room" building the VM system?
>
> >>
> >> or am I trapped in the need to learn smalltalk itself to understand
> >> the semantics of any answers given on this list?
> >>
> >> Use of smalltalk to actually produce smalltalk is not an option at this
> point.
> >
> >
> > Why not?  You have no linux, Mac or Windows systems available to you?
>
> Available, yes, capable?  that I question.
> As they have enough trouble running the OS they are installed with for
> defaults as far as I am concerned.
>
> >>
> >>
> >> The target host is neither posix, win32 nor bare-metal.
> >
> >
> > But you don't have to run the system on the target host to explore it do
> you?
> >
> No I don't... I need documentation on the format and specific handling
> of the opcodes in a Classical sense.
>
> >>
> >> the first programming language I learned was C based on K&R book
> materials...
> >>  however... I have only ever used the Host OS routines and never
> >> relied on the std c library
> >
> >
> > I find this hard to believe.  You've never used printf?
> >
> not once, ever,  I have always had other options and been able to step
> through iterating over each code modification.
>
> I have tried to deal with the Smalltalk system within OpenCobalt,
> however I found this to have its own maze of semantics, and I find the
> documentation no help in answering any of the questions I have.
>

As Colin recommended, I suggest you start from
http://www.mirandabanda.org/bluebook/.  Then you'll have an idea about
Smalltalk semantics.  Right now you're assuming it is comparable to
conventional register-based processors and it is not; it operates at a
much higher level of abstraction.


>
> I will just have to find a reasonably up to date Linux kernel build
> and LinuxFromScratch for Dual-Booting Linux in addition to Amiga OS
> 4.x
>
> I was hoping to sort out registers, opcodes and data formatting so
> that I can properly map the operations on a classical single processor
> system.
>
> as for the FPGA,  I do have the option of pushing material from being
> processed by the PPC onboard to the FPGA once an applicable program is
> written for it.
>
> I'll keep these Emails and try to work out what the actual operational
> changes are
> Currently Smalltalk appears as a black box and I am hoping to at least
> work out getting a basic flat memory model working with classical
> opcodes to Emulate other processors
>
> >>
> >> Thank you at least for trying to answer the request,
> >
> >
> > you're welcome.
> >
> >>
> >>
> >> ジェレミー
> >>
> >> Beware of Assumptions,  the Hee then Haw before kicking your head in...
> >>
> >> On Tue, Apr 17, 2012 at 5:32 AM, Eliot Miranda <eliot.miranda at gmail.com>
> wrote:
> >> >
> >> > Jeremy,
> >> >
> >> >     Smalltalk-80 (and Squeak) opcodes are for a spaghetti stack
> machine where each activation is a separate object.  These activation
objects are called contexts, and chain together through the sender field.
>  Each context has a fixed size stack (in Squeak there are small and large
> contexts, maximum size 52 stack slots).  Each activation holds onto a
> compiled method which is a vector of literal objects and a vector of
> bytecodes.  In Squeak and Smalltalk-80 these two vectors are encoded in a
> single flat object, half references to other objects (literals) half bytes
> (opcodes).  Since both contexts and compiled methods are objects the system
> implements its compiler and meta-level interpreter in Smalltalk itself,
> which require a real machine (the virtual machine) to execute.  If you run
> a Squeak or Pharo system you will be able to browse the classes that
> implement the compiler and the meta-level interpreter.  In particular:
> >> >
> >> > The classes EncoderForV3 & EncoderForV3PlusClosures implement the
> back-end of the compiler, generating concrete opcodes for abstract
> bytecodes such as pushReceiver: send:numArgs: etc.
> >> > Instances of class CompiledMethod are generated by the compiler (see
MethodNode>>generate:using:) using an instance of EncoderForV3PlusClosures.
> >> >
> >> > The class InstructionClient defines all the abstract opcodes for the
> current V3 plus closures instruction set.
> >> > The class InstructionStream decodes/interprets CompiledMethod
> instances, dispatching sends of the messages understood by
> InstructionClient to itself.  InstructionStream has several subclasses
which respond to the sends of the opcodes in different ways.
> >> >
> >> > Most importantly ContextPart and its subclass MethodContext implement
> the InstructionClient api by simulating execution.  Hence ContextPart and
> MethodContext provide a specification in Smalltalk of the semantics of the
> bytecodes.  EncoderForV3 & EncoderForV3PlusClosures serve as a convenient
> reference for opcode encodings, and are well-commented.
> >> >
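
For readers building a VM outside Smalltalk, the context and compiled-method objects described in this message can be pictured with a rough Python sketch. The class and field names here are illustrative assumptions, not the Squeak object layout; only the sender link, the fixed-size stack and the 52-slot limit come from the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Maximum stack size quoted in the message; treated here as a hard limit.
LARGE_CONTEXT_SLOTS = 52


@dataclass
class CompiledMethod:
    """Literals plus bytecodes of one method (hypothetical layout)."""
    literals: List[object]
    bytecodes: bytes


@dataclass
class Context:
    """One activation, held as an ordinary heap object."""
    method: CompiledMethod
    receiver: object
    sender: Optional["Context"] = None  # link to the calling activation
    pc: int = 0                         # index of the next bytecode
    stack: List[object] = field(default_factory=list)

    def push(self, value: object) -> None:
        if len(self.stack) >= LARGE_CONTEXT_SLOTS:
            raise OverflowError("context stack is fixed-size")
        self.stack.append(value)

    def pop(self) -> object:
        return self.stack.pop()


def call_depth(context: Optional[Context]) -> int:
    """Follow the sender chain, as a backtrace printer would."""
    depth = 0
    while context is not None:
        depth += 1
        context = context.sender
    return depth
```

Because each activation is a separate object that only points at its sender, several contexts may share one sender, which is what makes the structure a spaghetti stack and not a single contiguous one.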
> >> > By the way InstructionClient's subclass InstructionPrinter responds
> to the api by disassembling a compiled method, hence aCompiledMethod
> symbolic prints opcodes, e.g.
> >> > (Object >> #printOn:) symbolic evaluates to the string
> >> > '37 <70> self
> >> > 38 <C7> send: class
> >> > 39 <D0> send: name
> >> > 40 <69> popIntoTemp: 1
> >> > 41 <10> pushTemp: 0
> >> > 42 <88> dup
> >> > 43 <11> pushTemp: 1
> >> > 44 <D5> send: first
> >> > 45 <D4> send: isVowel
> >> > 46 <99> jumpFalse: 49
> >> > 47 <23> pushConstant: ''an ''
> >> > 48 <90> jumpTo: 50
> >> > 49 <22> pushConstant: ''a ''
> >> > 50 <E1> send: nextPutAll:
> >> > 51 <87> pop
> >> > 52 <11> pushTemp: 1
> >> > 53 <E1> send: nextPutAll:
> >> > 54 <87> pop
> >> > 55 <78> returnSelf
> >> > '
> >> >
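
The listing above can be reproduced outside the image with a small table-driven decoder. The following Python sketch covers only the single-byte opcodes that appear in that listing; the range boundaries are my reading of the Blue Book / V3 encoding and should be checked against EncoderForV3 before being relied on. Selector and constant names live in the method's literals, not in the bytes, so the sketch prints literal indices instead.

```python
def decode(bytecodes: bytes, start_pc: int) -> list:
    """Return (pc, mnemonic) pairs for the one-byte opcodes in the listing."""
    listing = []
    for offset, op in enumerate(bytecodes):  # every opcode handled is 1 byte
        pc = start_pc + offset
        after = pc + 1  # short jumps are relative to the following bytecode
        if 0x10 <= op <= 0x1F:
            text = f"pushTemp: {op & 0x0F}"
        elif 0x20 <= op <= 0x3F:
            text = f"pushConstant: literal {op & 0x1F}"
        elif 0x68 <= op <= 0x6F:
            text = f"popIntoTemp: {op & 0x07}"
        elif op == 0x70:
            text = "self"
        elif op == 0x78:
            text = "returnSelf"
        elif op == 0x87:
            text = "pop"
        elif op == 0x88:
            text = "dup"
        elif 0x90 <= op <= 0x97:
            text = f"jumpTo: {after + (op & 0x07) + 1}"
        elif 0x98 <= op <= 0x9F:
            text = f"jumpFalse: {after + (op & 0x07) + 1}"
        elif op == 0xC7:
            text = "send: class"  # special selector, no literal needed
        elif 0xD0 <= op <= 0xFF:
            num_args = (op >> 4) - 0xD  # D_ = 0 args, E_ = 1, F_ = 2
            text = f"send: literal {op & 0x0F} ({num_args} args)"
        else:
            raise ValueError(f"opcode {op:02X} not covered by this sketch")
        listing.append((pc, text))
    return listing


# The opcode bytes of Object>>printOn: as shown in the listing (pc 37 to 55).
PRINT_ON = bytes.fromhex("70 C7 D0 69 10 88 11 D5 D4 99 23 90 22 E1 87 11 E1 87 78")

if __name__ == "__main__":
    for pc, text in decode(PRINT_ON, 37):
        print(pc, text)
```

The two jump targets it computes (49 and 50) match the listing, which is a useful sanity check on the offset arithmetic. Multi-byte opcodes such as the extended sends and long jumps are deliberately left out.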
> >> >
> >> > and InstructionStream's subclass Decompiler implements the api by
> reconstructing a compiler parse tree for the compiled method, so e.g.
> >> > (Object >> #printOn:) decompile prints as
> >> > printOn: t1
> >> > | t2 |
> >> > t2 := self class name.
> >> > t1
> >> > nextPutAll: (t2 first isVowel
> >> > ifTrue: ['an ']
> >> > ifFalse: ['a ']);
> >> >  nextPutAll: t2
> >> > whereas the source code for the same method ((Object >> #printOn:)
> getSourceFromFile) evaluates to a Text for
> >> > 'printOn: aStream
> >> > "Append to the argument, aStream, a sequence of characters that
> >> > identifies the receiver."
> >> >
> >> > | title |
> >> > title := self class name.
> >> > aStream
> >> > nextPutAll: (title first isVowel ifTrue: [''an ''] ifFalse: [''a '']);
> >> > nextPutAll: title'
> >> >
> >> > So if you want to find a current, comprehensible specification of the
> Squeak/Pharo opcode set I recommend
browsing EncoderForV3, EncoderForV3PlusClosures, InstructionClient, InstructionStream, ContextPart and
MethodContext.  Further, I recommend exploring existing CompiledMethod
> instances using doits such as
> >> >
> >> >     SystemNavigation new browseAllSelect: [:m| m scanFor: 137]
> >> >
> >> > HTH
> >> > Eliot
> >>
> >> > On Mon, Apr 16, 2012 at 10:03 AM, Jeremy Kajikawa <
> jeremy.kajikawa at gmail.com> wrote:
> >> >>
> >> >>
> >> >> Colin: thanks... something like that... just trying to work out the
> >> >> octet numbers and formatting for what data goes where.
> >> >>
> >> >> as I am trying to encode this at assembler level where each opcode value
> >> >> has a specific routine that is called from an opCodeVector JumpTable
> >> >>
> >> >> Each Entry in the JumpTable is directly executed by the processor
> with
> >> >> a second JumpTable encoded similarly for basic microcode Read/Write
> >> >> functions to deal with various standard DataTypes in fixed formats
> >> >>
> >> >> this is to plug into the generic Interpreter engine I already have.
> >> >>
> >> >> the first test of this was to Emulate an Intel 80486 on a Motorola
> >> >> 68040 processor with the Host running at 25MHz.
> >> >>
> >> >> I managed to get an average speed rating of between 16MHz and 20MHz
> >> >> performance even with "real world" code being run through
> >> >>
> >> >> I am currently re-implementing this engine on top of a PPC host and
> >> >> would like to expand its modularity to additional languages and
> >> >> targets.
> >> >>
> >> >> If at all possible I would like to make the equivalent "machine
> level"
> >> >> interpretation of the opcode numbers possible even if there is inline
> >> >> data and addresses present as well.
> >> >>
> >> >> With having no prior experience with Smalltalk any usage of terms I
> >> >> know in a different sense won't make any sense initially and trying to
> >> >> get to grips with Smalltalk by using the Environment ... I already
> >> >> tried this unsuccessfully.
> >> >>
> >> >> I'm more interested in the number codes that each operation is
> >> >> represented by and making routines to match within set ranges,  and
> >> >> where one operation is multiple codes chained,  being able to have a
> >> >> listing starting with 0x00 is opcode "somename" and has N octets of
> >> >> immediate values following it formatted as ?:? bitstrings.
> >> >>
> >> >> If that makes any sense?
> >> >>
> >> >> as for stack or message information,  I'm willing to work out what is
> >> >> needed to make those happen if they are needed as bytecode level
> >> >> information.
> >> >>
> >> >> On Tue, Apr 17, 2012 at 4:20 AM, Colin Putney <colin at wiresong.com>
> wrote:
> >> >> >
> >> >> >
> >> >> > On 2012-04-16, at 8:14 AM, Jeremy Kajikawa wrote:
> >> >> >
> >> >> > I am somewhat dogmatically minded about technical details,  so I am
> >> >> > unlikely to wade through buckets of documentation about Smalltalk
> as a
> >> >> > language and how to use it if it is not answering the question
> about
> >> >> > what I am looking up.
> >> >> >
> >> >> >
> >> >> > I'm confused. You want to implement a Smalltalk interpreter, but
> you're not interested in the details of the language? Perhaps you should
> tell us what your overall goal is. That way we can provide more useful
> information.
> >> >> >
> >> >> > As for documentation of the bytecode set, you may find the Blue
> Book useful. It's the canonical description of how Smalltalk works,
> including the interpreter. Squeak is a descendant of this implementation.
> The section on the interpreter is here:
> >> >> >
> >> >> >
> http://www.mirandabanda.org/bluebook/bluebook_chapter28.html#StackBytecodes28
> >> >> >
> >> >> > Hope this helps,
> >> >> >
> >> >> > Colin
> >> >> >
> >> >> > PS. Since this has nothing to do with Ubuntu, I've changed the
> subject to something more appropriate
> >> >> >
> >> >
> >> >
> >> >
> >> >
> >> > --
> >> > best,
> >> > Eliot
> >> >
> >> >
> >
> >
> >
> >
> > --
> > best,
> > Eliot
> >
> >
>



-- 
best,
Eliot