Performance profiling results...

John M McIntosh johnmci at smalltalkconsulting.com
Sat Sep 22 23:43:33 UTC 2001


>Ok.. How about having a SIGALARM handler that increments a counter 100
>times a second? THe thing checks the counter before&after the primitive to
>see if it changed by more than 3 steps (30ms).
>
>Trust me, getting rid of this would eliminate 40,000 system calls A
>SECOND, and 4% of the runtime.

Scott, you know I looked at this on the Macintosh a year or so back. 
I found the timer callbacks took longer. However, David Simmons 
pointed out another clock in QuickTime that we might use. One of the 
things you need to watch here, I think, is that perhaps rather than 
incrementing a counter, you get the clock. I'm not sure you have a 
guarantee that over N days your millisecond clock would end up 
matching the OS clock if you do the increment (mind, it does wrap, but 
how long does that take?). Then again, does it matter? For instance, 
timer pops under Mac OS 9 are deferred, say 30ms or more, to service a 
pending page fault, which messes up the use of this logic, but maybe 
Windows is smarter. I bet Unix is too.
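
For reference, the counter idea quoted above might look roughly like 
this on a POSIX system. This is only a sketch under assumptions: the 
names (tick_count, start_tick_timer, current_tick, ran_too_long) are 
made up for illustration and are not from the VM sources, and it does 
nothing about the drift or deferred-timer concerns mentioned above.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <sys/time.h>

/* Hypothetical names throughout; none of this is taken from the VM. */
static volatile sig_atomic_t tick_count = 0;   /* bumped every 10 ms */

/* SIGALRM handler: only touches a sig_atomic_t, so it is signal safe.
   The add is done in unsigned arithmetic so that wrap-around is not
   signed overflow. */
static void on_alarm(int signo)
{
    (void)signo;
    tick_count = (sig_atomic_t)((unsigned)tick_count + 1u);
}

/* Install the handler and start a repeating 10 ms (100 Hz) timer.
   Returns 0 on success, -1 on failure. */
int start_tick_timer(void)
{
    struct sigaction sa;
    struct itimerval tv;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;          /* restart interrupted syscalls */
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = 10000;    /* 10 ms period */
    tv.it_value = tv.it_interval;      /* first expiry after 10 ms */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Cheap read of the counter: a memory load, no system call. */
int current_tick(void)
{
    return (int)tick_count;
}

/* True if more than 3 ticks (about 30 ms) passed between two reads.
   Unsigned subtraction keeps the answer right across a counter wrap. */
int ran_too_long(int before, int after)
{
    return ((unsigned)after - (unsigned)before) > 3u;
}
```

The caller would read current_tick() before and after the primitive 
and pass both values to ran_too_long(), so the hot path makes no 
system call at all. The cost moves into 100 signal deliveries a second.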


Now, a few years back I altered Slang to generate a VM that uses a 
structure for all the globals. This gives back a few % on the PPC 
because of the way addressing off register 2 is done for globals. 
Basically, a single register can be used to reference the base of the 
structure, and we gain back multiple registers that would otherwise be 
used to store references to globals. I also hand-sorted the variables 
to match expected usage for cache line optimization; not that you 
could tell a difference, but maybe on 2GHz machines cache misses 
become very expensive.

So I'm wondering if you think there is a benefit in doing this on 
the Intel side. You'd need to look at how a global gets referenced 
versus a variable in a structure, and how that alters the generated 
assembly and the performance.
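
To make the globals-in-a-structure idea concrete, here is a minimal 
sketch in C. The field names and the tiny stack are invented for 
illustration; the real interpreter has many more globals, and which 
fields count as "hot" is an assumption here, not a measurement.

```c
#include <stddef.h>

/* Illustrative only: one structure holding what would otherwise be
   separate global variables. Fields assumed to be used together are
   declared first so they are likely to share a cache line. */
struct vm_globals {
    long *stack_pointer;                 /* assumed hot */
    unsigned char *instruction_pointer;  /* assumed hot */
    long primitive_fail_flag;            /* assumed hot */
    long statistics_counter;             /* assumed cold */
    long stack[64];                      /* toy stack for the example */
};

/* A single object: the compiler can keep its base address in one
   register and reach every field as base + constant offset. */
static struct vm_globals vm;

void vm_reset(void)
{
    vm.stack_pointer = vm.stack;
    vm.instruction_pointer = NULL;
    vm.primitive_fail_flag = 0;
    vm.statistics_counter = 0;
}

/* Push without bounds checking; the caller stays within 64 entries. */
void vm_push(long value)
{
    *vm.stack_pointer++ = value;
    vm.statistics_counter++;
}

long vm_pop(void)
{
    return *--vm.stack_pointer;
}

size_t vm_depth(void)
{
    return (size_t)(vm.stack_pointer - vm.stack);
}
```

Compiling this and a separate-globals variant to assembly (for 
example with the compiler's -S option) and comparing the two is one 
way to see how the addressing differs on a given CPU.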


At 12:40 PM -0700 9/22/01, Andreas Raab wrote:
>
>>  Also, about 90% of the invocations of fetchClassOfNonInt
>>  are from loadFloatOrIntFrom. And 99.5% of the invocations of
>>  floatValueOf are from fetchClassOfNonInt. About 100 million function
>>  calls total.
>
>Yeah, would be nice to get rid of 'em but the inliner currently prevents
>inlining methods that have explicit return types - and #loadFloatOrIntFrom:
>is a #returnTypeC: 'double'.

Mmm, take a look at 
http://www.smalltalkconsulting.com/papers/tipsAndThoughts/source/ForceInline-JMM.cs

and consider the comments and the code with regard to what you are 
doing. It may be a fit.
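
As a rough picture of what inlining a helper with a 'double' return 
type buys in the generated C, consider the sketch below. It is not the 
change set above and not actual VM code: the object layout is a 
simplified assumption (low bit set means a tagged small integer, 
otherwise a pointer to a boxed double), and all the names are 
hypothetical.

```c
#include <stdint.h>

typedef intptr_t oop;                      /* simplified object pointer */
struct boxed_float { double value; };      /* assumed float layout */

/* Build a tagged small integer: value * 2 + 1. */
static inline oop make_small_int(intptr_t v)
{
    return (oop)(v * 2 + 1);
}

static inline int is_small_int(oop o)
{
    return (o & 1) != 0;
}

/* Helper with an explicit double return type. */
static inline double float_value_of(oop o)
{
    return ((const struct boxed_float *)o)->value;
}

/* The hot helper: 'static inline' lets the C compiler expand it at
   each call site, so the arithmetic below can run without a call. */
static inline double load_float_or_int(oop o)
{
    if (is_small_int(o))
        return (double)((o - 1) / 2);      /* untag; exact, o - 1 is even */
    return float_value_of(o);
}

/* Example caller: two loads and an add, with no out-of-line calls
   required once the helpers are inlined. */
double add_numbers(oop a, oop b)
{
    return load_float_or_int(a) + load_float_or_int(b);
}
```

Note that 'inline' in C is only a hint to the compiler; whether the 
calls really disappear has to be checked in the generated assembly.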

-- 
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com> 1-800-477-2659
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================



