hard-drive read-performance

Wed Nov 24 17:00:49 UTC 2010

When reading any object off the hard drive (represented as the
'byteArray' of a single MaObjectBuffer), Magma always reads 280 bytes.
 Since the #physicalSize is in the object header, it is then able to
check the contents of the buffer to determine the size of the whole
object and, if necessary, read more bytes in order to get the whole
object.  See MaObjectFiler>>#read:bytesInto:and:startingAt:filePosition:
for this behavior.
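
As a rough illustration of that read-then-extend pattern, here is a
workspace sketch.  It is not Magma's actual code: the header layout
(total size stored big-endian in the first 4 bytes) and the in-memory
ReadStream standing in for the file are assumptions made purely for
the example.

-----------
"Hypothetical sketch only; the 4-byte size header is an invented layout."
| trackSize object stream buffer physicalSize |
trackSize := 280.
"Fake a 1000-byte 'object' whose header records its own size."
object := ByteArray new: 1000.
object unsignedLongAt: 1 put: object size bigEndian: true.
stream := ReadStream on: object.
"First access: always read trackSize bytes."
buffer := stream next: trackSize.
physicalSize := buffer unsignedLongAt: 1 bigEndian: true.
"Second access only when the object did not fit in the first read."
physicalSize > trackSize ifTrue:
	[ buffer := buffer , (stream next: physicalSize - trackSize) ].
buffer size "the whole object is now in the buffer"
------------

An object no larger than trackSize never reaches the second read,
which is the single-access case the fixed first read is meant to
favor.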

280 bytes is enough for about 40 pointer references, allowing most
objects to be read in just one disk access.  I refer to it as the
#trackSize, to remind me that it is supposed to be how many bytes I
think the HD can read in one operation without overrunning its own
internal buffers and becoming inefficient.  I was curious whether this
number is still optimal in 2010, so I ran the following script:

-----------
| stats random |
stats := OrderedCollection new.
random := Random new.
(FileDirectory on: '/home/cmm/test3/cube.001.magma') fileNamed:
'objects.2.dat' do:
	[ : stream | | ba fileSize |
	ba := ByteArray new: 10000.
	fileSize := stream size.
	"Bench reads of n bytes, each from a random file position."
	100 to: 10000 by: 100 do:
		[ : n |
		stream position: 0.
		Transcript cr; show: (stats add: n->([stream
				maRead: n "bytes"
				bytesFromPosition: 1
				of: ba
				atFilePosition: (random nextInt: fileSize) ] bench)) ]].
stats
------------

Note that "objects.2.dat" is a real Magma file, 1.8GB in size.  The
goal of the script is to bench how fast Squeak can read object buffers
off the hard drive when we obviously won't get many (if any) HD cache
hits.

I have a cheap Western Digital Caviar HD, which produced the following output:

100->'119 per second.'
200->'98.5 per second.'
300->'106 per second.'
400->'106 per second.'
500->'101 per second.'
600->'102 per second.'
700->'99.9 per second.'
800->'103 per second.'
900->'104 per second.'
1000->'99 per second.'
1100->'97.9 per second.'
1200->'104 per second.'
1300->'111 per second.'
1400->'99.8 per second.'
1500->'107 per second.'
1600->'108 per second.'
1700->'95.6 per second.'
1800->'103 per second.'
1900->'108 per second.'
2000->'102 per second.'
2100->'103 per second.'
2200->'107 per second.'
...
3000->'98.7 per second.'
4000->'102 per second.'
5000->'106 per second.'
6000->'104 per second.'
7000->'101 per second.'
8000->'102 per second.'
9000->'102 per second.'
10000->'107 per second.'

Out of curiosity, I also modified the script to read very small buffers
from the HD.  Here are the results:

4->'137 per second.'
12->'146 per second.'
20->'154 per second.'
28->'143 per second.'

(The HD busy light was solid ON during the test.)

At first I was puzzled, because Magma has demonstrated much faster
objects-per-second read rates than these, even including
materialization.  What gives?

It's the HD buffering.  Most of the time, objects are "clustered"
closely together, so that reading one object brings the "next" object
to be read into the HD's buffer.  Here's the same script, except
reading mostly "sequentially" through the file instead of from a
random location:

-----------
| stats random nextPos |
stats := OrderedCollection new.
random := Random new.
nextPos := 100.
(FileDirectory on: '/home/cmm/test3/cube.001.magma') fileNamed:
'objects.2.dat' do:
	[ : stream | | ba fileSize |
	ba := ByteArray new: 10000.
	fileSize := stream size.
	#(4 12 20 28 100 200 300 400 500) do:
		[ : n |
		stream position: 0.
		"Advance n+10 bytes per read instead of seeking randomly."
		Transcript cr; show: (stats add: n->([stream
				maRead: n "bytes"
				bytesFromPosition: 1
				of: ba
				atFilePosition: ("random nextInt: fileSize" (nextPos := nextPos+n+10)) ] bench)) ]].
stats
------------

Now look at the results:

"Reading sequentially rather than at a random position."
4->'1,160,000 per second.'
12->'1,210,000 per second.'
20->'1,100,000 per second.'
28->'973,000 per second.'
...
100->'1,030,000 per second.'
200->'321,000 per second.'
300->'215,000 per second.'
400->'160,000 per second.'
500->'227,000 per second.'

Conclusions:

  - Hard-disk seek is definitely a bottleneck with Magma, or any
Squeak application that requires random access to a file.
  - When objects are clustered closely together, read performance can
be dramatically better.
  - Drives with fast seek times, such as newer solid-state drives, might
perform dramatically better.
  - I should consider reducing the trackSize from 280 bytes to ~100
bytes (or make it customizable), because the sequential rate drops
quickly beyond 100 bytes, and even when a second read is required, two
small reads could still be faster than one larger initial read.
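
The last point can be checked with back-of-the-envelope arithmetic on
the sequential figures above.  This is only a sketch: 280 bytes was not
measured, so the 300-byte rate stands in for it, and it assumes the
follow-up read is served from the drive's buffer just like the
sequential reads in the test.

-----------
| perSecondAt100 perSecondAt300 twoSmallReads oneLargeRead |
perSecondAt100 := 1030000.	"sequential rate measured at 100 bytes"
perSecondAt300 := 215000.	"sequential rate measured at 300 bytes"
"Convert rates to elapsed microseconds."
twoSmallReads := 2 / perSecondAt100 * 1000000.0.
oneLargeRead := 1 / perSecondAt300 * 1000000.0.
{ twoSmallReads. oneLargeRead }
------------

Under those assumptions, two 100-byte reads cost roughly 1.9
microseconds against roughly 4.7 microseconds for a single 300-byte
read.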

 - Chris