[Vm-dev] VMBIGENDIAN question (was: A proposal to split VMMaker into subpackages)

David T. Lewis lewis at mail.msen.com
Sat Mar 23 00:02:13 UTC 2013


Hi John,

I don't know if it will do any good but since we were talking about it in
this thread anyway, why not try generating the code with MemoryAccess? It
will not fix anything, but it might get you a little closer to spotting
the problem, so it's worth a try.

It looks like you are using a slightly older version of VMMaker, so use
MemoryAccess-dtl.4 rather than the update that I posted about an hour
ago. Evaluate "MemoryAccess enable" then regenerate the code and see what
happens.
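
A minimal sketch of the missing-prototype problem described in the quoted
message below. It is an illustration only, not the VM's actual code: the
sqInt typedef and the storeAndReload helper are assumptions, and the
accessor bodies are simplified stand-ins. The point is that both accessors
are declared before their first use, so the compiler sees the real
parameter and return types instead of falling back on an implicit
declaration.

```c
#include <string.h>

typedef long sqInt;   /* assumption: stand-in for the VM's sqInt */

/* Prototypes visible before the first call. Without them, a C89 compiler
   assumes the callee returns int and applies the default argument
   promotions, which may not match the real definition. */
static sqInt longAtPointer(char *ptr);
static sqInt longAtPointerput(char *ptr, sqInt val);

/* Hypothetical caller: stores a word, then reads it back. */
static sqInt storeAndReload(char *slot, sqInt value) {
    longAtPointerput(slot, value);
    return longAtPointer(slot);
}

/* Simplified accessor: read one word from memory. */
static sqInt longAtPointer(char *ptr) {
    sqInt v;
    memcpy(&v, ptr, sizeof v);   /* memcpy avoids alignment assumptions */
    return v;
}

/* Simplified accessor: write one word to memory and return it. */
static sqInt longAtPointerput(char *ptr, sqInt val) {
    memcpy(ptr, &val, sizeof val);
    return val;
}
```

Building with a warning such as -Wimplicit-function-declaration (accepted
by both gcc and clang) should flag any accessor that is still called
without a prototype in scope.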

Dave


On Fri, Mar 22, 2013 at 06:46:22PM -0400, John McIntosh wrote:
>  
> Likely this relates to a problem with LLVM and it playing more by the rules.
> 
> (a) lack of prototypes makes LLVM generate bad code.
> 
> Fix that then you can compile and run VM with -O0
> And run a bit with -O2 -Os
> 
> >  Earlier note sent to a few folks, but let's widen the audience.
> 
> Ok, I'm driven to compile an old VM for iOS with LLVM. As you know,
> this is broken; still, I'm millions of bytecodes in now...
> 
> What I found is the lack of prototypes would cause LLVM on anything
> but -O0 to throw an ARM exception at the first longAtPointer
> Once I added prototypes, well, we run millions of bytecodes. Right
> up to the point we die attempting to print the friggen Time on the
> Transcript.
> As it's one in the morning I'll throw it out for clues and maybe
> someone wants to fight with it, give clues, throw beer or heckle from
> the sidelines...
> 
> I know that the  commonAt:  is given an invalid or zero index value,
> so the digitValue: throws a failure of at: not working... Er then in
> my image we run off to print the crash report, with time stamp....
> repeat until you run out of 20MB of memory.
> 
> PS other fun notes: because the endianness and mmap memory offset are the
> same between my Mac, my iOS device, and the saved image, on startup I
> don't have to swizzle the bytes or alter the offset, so the VM startup
> code isn't given a chance to stomp on anything. On an iPhone 3G this cut
> many seconds off the VM startup.
> 
> Add handy fprintf on failure to commonAt:
> sqInt commonAt(sqInt stringy) {
> register struct foo * foo = &fum;
>     sqInt atIx;
>     sqInt rcvr;
>     sqInt result;
>     sqInt index;
>     sqInt sp;
>     sqInt sp1;
> 
> 
>         /* Sets successFlag */
> 
>         index = positive32BitValueOf(longAt(foo->stackPointer));
>         rcvr = longAt(foo->stackPointer - (1 * BytesPerWord));
>         if (!(foo->successFlag && (!((rcvr & 1))))) {
>                 /* begin primitiveFail */
>                 foo->successFlag = 0;
> >>>>>        fprintf(stderr,"\n failure for %u given %i",rcvr,index);
> //Happy puppy when -O0   not happy with -O1 (or others)
>                 return null;
>         }
> 
> 
> 
> 
> so given
> 
> sqInt primitiveSecondsClock(void) {
> register struct foo * foo = &fum;
>     sqInt oop;
>     sqInt sp;
> 
>         /* begin pop:thenPush: */
>         oop = positive32BitIntegerFor(ioSeconds());
>         longAtput(sp = foo->stackPointer - ((1 - 1) * BytesPerWord), oop);
>         foo->stackPointer = sp;
> }
> 
> *seems to work as the positive32BitIntegerFor feeds back something I
> push back to positive32BitValueOf and print all the little pieces...
> 
> sqInt ->D2F68C2F<- Bytes 2F 8C F6 D2   integer value 3539373103  value
> from ioSeconds going into positive 3539373103 push as 3539373103<>
> 
> On Fri, Mar 22, 2013 at 5:25 PM, Nicolas Cellier <
> nicolas.cellier.aka.nice at gmail.com> wrote:
> 
> >
> > 2013/3/22 Bert Freudenberg <bert at freudenbergs.de>:
> > >
> > > On 2013-03-22, at 05:43, David T. Lewis <lewis at mail.msen.com> wrote:
> > >
> > >> "It ain't what you don't know that gets you into trouble. It's what you know for sure that just ain't so."
> > >> -- Mark Twain
> > >>
> > >> An interpreter VM compiled with the normal C macros in sqMemoryAccess.h (for "performance"):
> > >>
> > >> 0 tinyBenchmarks. '417277913 bytecodes/sec; 14395420 sends/sec'
> > >> 0 tinyBenchmarks. '414239482 bytecodes/sec; 14646769 sends/sec'
> > >> 0 tinyBenchmarks. '417277913 bytecodes/sec; 14406658 sends/sec'
> > >>
> > >> The same interpreter VM with C macros replaced by Smalltalk slang (class MemoryAccess):
> > >>
> > >> 0 tinyBenchmarks. '455111111 bytecodes/sec; 14217973 sends/sec'
> > >> 0 tinyBenchmarks. '451897616 bytecodes/sec; 14485815 sends/sec'
> > >> 0 tinyBenchmarks. '453900709 bytecodes/sec; 14497194 sends/sec'
> > >>
> > >> Dave
> > >
> > > That is ... unexpected :)
> > >
> >
> > Well, that's almost the same code in both MemoryAccess and
> > sqMemoryAccess.h, right?
> > Maybe a bit different if you define USE_INLINE_MEMORY_ACCESSORS:
> >
> >   static inline sqInt byteAt(sqInt oop)          { return byteAtPointer(pointerForOop(oop)); }
> >   static inline sqInt byteAtPointer(char *ptr)   { return (sqInt)(*((unsigned char *)ptr)); }
> >   static inline char *pointerForOop(usqInt oop)  { return sqMemoryBase + oop; }
> >
> > Though I do not quite see why it would not inline such a simple piece,
> > gcc has a license to not honour the inline request.
> > On the other side, MemoryAccess will always inline as we asked the code
> > generator to (self inline: true).
> > It would be worth verifying whether any of the static functions is
> > generated in the executable (with nm -a or something).
> >
> > But I also see other subtle differences like this:
> >
> > intAtPointer: ptr put: val
> >         self inline: true.
> >         self var: #ptr type: 'char *'.
> >         self var: #val type: 'unsigned int'.
> >         ^ self cCoerce:
> >                         ((self cCoerce: ptr to: 'unsigned int *')
> >                                 at: 0
> >                                 put: val)
> >                 to: 'sqInt'
> >
> > while the header says
> >   static inline sqInt intAtPointerput(char *ptr, int val)  { return (sqInt)(*((unsigned int *)ptr)= (int)val); }
> >
> > OK, you might think that casting int->unsigned int is no-op on
> > 2-complement machines.
> > But it's a distraction, we must omit the intermediate (*(unsigned int
> > *)) and just consider that the return value is assigned with the
> > parameter val.
> > So the header just copies an int->int, but MemoryAccess uses the
> > opposite cast, unsigned int->int.
> > It's also a no-op except that:
> > - the cast can overflow, which would be UB.
> > - gcc has a licence to presume you don't rely on UB and thus can
> > further consider the returned int is always >= 0
> > That assertion cannot be done in the case of sqMemoryAccess.h
> >
> > So all I see here has nothing to do with premature optimization.
> > It has to do with a lack of understanding of the modern C standards, and
> > the absolutely casual attitude we take with signed and unsigned
> > types.
> >
> > Nicolas
> >
> > > But it again shows the importance of the 3rd rule of Optimization.
> > > http://c2.com/cgi/wiki?RulesOfOptimization
> > >
> > > - Bert -
> > >
> >
> 
> 
> 
> -- 
> ===========================================================================
> John M. McIntosh <johnmci at smalltalkconsulting.com>
> Corporate Smalltalk Consulting Ltd. Twitter: squeaker68882
> ===========================================================================
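
A small sketch of the signed/unsigned parameter difference Nicolas points
out in the quoted text above. These two functions are simplified stand-ins,
not the actual sqMemoryAccess.h header or the generated MemoryAccess code;
the function names are hypothetical, and sqInt is assumed here to be a
64-bit type purely so that the difference in the returned value is visible.
With a 32-bit sqInt the out-of-range unsigned-to-signed conversion is, as
far as I read C99 6.3.1.3, implementation-defined, and on the usual
two's-complement targets it simply wraps.

```c
#include <string.h>

typedef long long sqInt;   /* assumption: wider than int, for illustration */

/* Stand-in for a header-style accessor: the value arrives as a signed int. */
static sqInt intAtPointerputSigned(char *ptr, int val) {
    memcpy(ptr, &val, sizeof val);   /* store the 32-bit pattern */
    return (sqInt)val;               /* sign-extends negative values */
}

/* Stand-in for a MemoryAccess-style accessor: the value arrives unsigned. */
static sqInt intAtPointerputUnsigned(char *ptr, unsigned int val) {
    memcpy(ptr, &val, sizeof val);   /* same bytes end up in memory */
    return (sqInt)val;               /* zero-extends, never negative here */
}
```

Both variants write identical bytes; only the value handed back to the
caller differs, which is the kind of detail an optimizer is free to build
further assumptions on.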


