Performance hit

Thu Jul 30 19:57:30 UTC 1998

While testing some code that involved Processes waiting on timers, I was puzzled by poor performance. Digging deeper, I finally narrowed it down to this example:

PWatch class>>runTests3:

runTests3: howMany

	| block alone done notAlone |

	alone _ notAlone _ 0.
	block _ [
		Time millisecondsToRun: [100 timesRepeat: [howMany factorial]]
	].
	3 timesRepeat: [alone _ alone + block value].
	done _ false.
	[(Delay forSeconds: 30) wait. done _ true] forkAt: 6.
	3 timesRepeat: [notAlone _ notAlone + block value].
	[done] whileFalse.
	Transcript show: 
		howMany asString,
		' alone = ',(alone // 3) asString,
		'  	notAlone = ',(notAlone // 3) asString; cr.	

Then, when I evaluate:

5 to: 30 by: 5 do: [ :howMany | PWatch runTests3: howMany]

I get:

5 alone = 3  	notAlone = 3
10 alone = 4  	notAlone = 5
15 alone = 54  	notAlone = 292
20 alone = 161  	notAlone = 967
25 alone = 248  	notAlone = 1661
30 alone = 379  	notAlone = 2635

When we go from 10 factorial to 15 factorial (where the result crosses from SmallInteger to LargeInteger), performance drops by a factor of roughly 5 to 7 (292 vs. 54 ms at 15, 2635 vs. 379 ms at 30) *IF* another process is waiting on a timer to expire.

Looking at the interpreter source, I see that after every primitive dispatch there is:

	if (successFlag && ((nextWakeupTick != 0) && (((ioMSecs()) & 536870911) >= nextWakeupTick))) {
		interruptCheckCounter = 1000;
		checkForInterrupts();
	}

Comments in the source (in the Mac version, anyway) say that two routines are available for reading a millisecond clock value, and that the more expensive one is used for greater accuracy. Perhaps this is a spot where the cheaper one could be used, with a suitable allowance for its lower accuracy.
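As a rough sketch of that idea (not the actual interpreter code), the check could consult a cheap, coarse clock first and fall back to the accurate one only when the deadline might be near. Everything here is an assumption made for illustration: the names cheapMSecs, accurateMSecs, wakeupDue and runUntilWakeup are hypothetical, the 16 ms granularity is invented, and both clocks are simulated so the example runs anywhere. Clock wraparound is ignored.

```c
#include <assert.h>

/* Everything below is hypothetical: simulated clocks, not the VM's routines. */

#define CHEAP_TICK_MS 16   /* assumed granularity of the cheap clock */

static long simulatedNow = 0;    /* stand-in for real elapsed milliseconds */
static long accurateReads = 0;   /* counts reads of the expensive clock */

/* Cheap clock: assumed coarse, always 0..CHEAP_TICK_MS-1 ms behind. */
static long cheapMSecs(void) {
    return (simulatedNow / CHEAP_TICK_MS) * CHEAP_TICK_MS;
}

/* Accurate clock: assumed expensive, so each read is counted. */
static long accurateMSecs(void) {
    accurateReads++;
    return simulatedNow;
}

/* Returns 1 once the pending timer is due, 0 otherwise.
   The expensive clock is consulted only when the cheap clock,
   padded by its worst-case lag, says the deadline may have passed.
   Clock wraparound is ignored to keep the sketch short. */
static int wakeupDue(long nextWakeupTick) {
    if (nextWakeupTick == 0) return 0;                 /* no timer pending */
    if (cheapMSecs() + CHEAP_TICK_MS < nextWakeupTick) return 0;
    return accurateMSecs() >= nextWakeupTick;
}

/* Simulates one check per millisecond until the timer fires and
   returns the simulated time at which it fired. */
static long runUntilWakeup(long nextWakeupTick) {
    assert(nextWakeupTick > 0);   /* with no timer pending we would spin forever */
    simulatedNow = 0;
    accurateReads = 0;
    while (!wakeupDue(nextWakeupTick)) simulatedNow++;
    return simulatedNow;
}
```

With these simulated clocks, runUntilWakeup(30000) still fires exactly at 30000, but the expensive clock is read only 17 times instead of on every one of the 30001 checks. Whether the real cheap routine is cheap enough, and how much slop it needs, would have to be measured on each platform.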

Any thoughts?

Cheers,
Bob