More File Performance Q.?
Bob Arning
arning at charm.net
Thu May 16 11:17:01 UTC 2002
On 16 May 2002 00:20:22 -0500 Jimmie Houchin <jhouchin at texoma.net> wrote:
>This program opens the file and reads each line to determine if any line
>beginning with 'from' is actually a header or in the body. If in the
>body I insert a space at the beginning of the line.
>
>Due to the requirement of reading each line to operate on it I had to
>change from StandardFileStream to CrLfFileStream.
>
>Is there anything I am doing wrong in my code which is causing problems?
Let's see what MessageTally has to say...
======================================================
- 1158 tallies, 19288 msec.
**Tree**
33.9% {6539ms} CrLfFileStream(FileStream)>>contentsOfEntireFile
|33.9% {6539ms} CrLfFileStream>>next:
| 19.7% {3800ms} String>>withSqueakLineEndings
| |11.6% {2237ms} primitives
| |4.4% {849ms} String(SequenceableCollection)>>copyFrom:to:
| |3.7% {714ms} String>>indexOfAnyOf:startingAt:ifAbsent:
| 14.2% {2739ms} CrLfFileStream(StandardFileStream)>>next:
| 10.4% {2006ms} CrLfFileStream(PositionableStream)>>nextInto:
| |10.4% {2006ms} CrLfFileStream(StandardFileStream)>>next:into:startingAt:
| 3.7% {714ms} primitives
28.2% {5439ms} String>>beginsWith:
9.9% {1910ms} StandardFileStream>>nextPutAll:
8.9% {1717ms} String>>asLowercase
|4.7% {907ms} String>>translateToLowercase
| |4.7% {907ms} String>>translateWith:
| | 4.1% {791ms} String>>translateFrom:to:table:
|4.1% {791ms} String(Object)>>copy
| 4.1% {791ms} String(SequenceableCollection)>>shallowCopy
| 3.5% {675ms} String(SequenceableCollection)>>copyFrom:to:
7.3% {1408ms} ReadStream(PositionableStream)>>nextLine
|6.5% {1254ms} ReadStream>>upTo:
| 3.6% {694ms} String(SequenceableCollection)>>copyFrom:to:
| 2.8% {540ms} String>>indexOf:startingAt:ifAbsent:
7.1% {1369ms} StandardFileStream(WriteStream)>>cr
7.1% {1369ms} StandardFileStream>>nextPut:
**Leaves**
28.2% {5439ms} String>>beginsWith:
11.6% {2237ms} String>>withSqueakLineEndings
11.5% {2218ms} String(SequenceableCollection)>>copyFrom:to:
10.4% {2006ms} CrLfFileStream(StandardFileStream)>>next:into:startingAt:
9.9% {1910ms} StandardFileStream>>nextPutAll:
7.1% {1369ms} StandardFileStream>>nextPut:
4.1% {791ms} String>>translateFrom:to:table:
3.7% {714ms} CrLfFileStream(StandardFileStream)>>next:
3.7% {714ms} String>>indexOfAnyOf:startingAt:ifAbsent:
2.8% {540ms} String>>indexOf:startingAt:ifAbsent:
**Memory**
old -24,468 bytes
young -156,388 bytes
used -180,856 bytes
free +156,388 bytes
**GCs**
full 3 totalling 1,892ms (10.0% uptime), avg 631.0ms
incr 396 totalling 389ms (2.0% uptime), avg 1.0ms
tenures 0
root table 0 overflows
======================================================
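For anyone wanting to reproduce a report like the one above: a tally of this shape can be obtained by wrapping the code in `MessageTally spyOn:`. The snippet below is only a sketch. The loop body is a stand-in workload, not Jimmie's actual program, and the `count` variable is purely illustrative.

```smalltalk
"Profile a block with MessageTally; a report window opens when it finishes.
 The loop is a placeholder workload (an assumption), not the original program."
| count |
count := 0.
MessageTally spyOn: [
    1 to: 100000 do: [:i |
        ('From somebody' asLowercase beginsWith: 'from')
            ifTrue: [count := count + 1]]].
```

Replacing the loop with the real file-processing code gives the Tree, Leaves, Memory and GCs sections for that code.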
First observation: we are losing about 10% of the run time to full garbage collections (plus another 2% to incremental ones). Andreas has some code that reduces this, but I'm not sure whether it has been released yet.
Second: grouping the numbers to show the major parts:
-- reading - 41.2%
33.9% {6539ms} CrLfFileStream(FileStream)>>contentsOfEntireFile
7.3% {1408ms} ReadStream(PositionableStream)>>nextLine
-- finding 'from' - 37.1%
8.9% {1717ms} String>>asLowercase
28.2% {5439ms} String>>beginsWith:
-- writing - 17%
9.9% {1910ms} StandardFileStream>>nextPutAll:
7.1% {1369ms} StandardFileStream(WriteStream)>>cr
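The group percentages above can be cross-checked against the millisecond figures and the 19288 msec total of the first tally. This is just a worked check of the arithmetic; the variable names are made up for the example.

```smalltalk
"Recompute the three group shares from the millisecond figures.
 Results are in tenths of a percent, using integer arithmetic only."
| total reading finding writing |
total := 19288.
reading := (6539 + 1408) * 1000 // total.   "contentsOfEntireFile + nextLine"
finding := (1717 + 5439) * 1000 // total.   "asLowercase + beginsWith:"
writing := (1910 + 1369) * 1000 // total.   "nextPutAll: + cr"
"reading = 412 (41.2%), finding = 371 (37.1%), writing = 170 (17.0%)"
```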
Looking at the finding part, converting to lowercase seems a bit much, so a smarter #beginsWith: is in order....
======================================================
beginsWith2: prefix
    "Answer whether the receiver begins with the given prefix string.
    The comparison is NOT case-sensitive."

    self size < prefix size ifTrue: [^ false].
    self first asLowercase == prefix first asLowercase ifFalse: [^ false].
    ^ (self findSubstring: prefix in: self startingAt: 1
        matchTable: CaseInsensitiveOrder) = 1
======================================================
This saves about 4 seconds (19288 msec down to 15265 msec):
======================================================
- 918 tallies, 15265 msec.
**Tree**
41.8% {6381ms} CrLfFileStream(FileStream)>>contentsOfEntireFile
21.0% {3206ms} ReadStream(PositionableStream)>>nextLine
14.9% {2274ms} StandardFileStream>>nextPutAll:
9.4% {1435ms} StandardFileStream(WriteStream)>>cr
5.4% {824ms} String>>beginsWith2:
2.7% {412ms} primitives
2.3% {351ms} StandardFileStream>>flush
======================================================
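To illustrate what #beginsWith2: answers, here are a few workspace expressions. They assume the method has been compiled into String exactly as listed earlier; the sample strings are invented for the example.

```smalltalk
"Assumes String>>beginsWith2: from this message is installed.
 The variable names and sample lines are illustrative only."
| header quoted short midword |
header := 'From jhouchin at texoma.net' beginsWith2: 'from'.   "true: case is ignored"
quoted := '>From the body' beginsWith2: 'from'.                "false: first character differs"
short := 'fro' beginsWith2: 'from'.                            "false: receiver is shorter than the prefix"
midword := 'Friday, from Bob' beginsWith2: 'from'.             "false: match is found, but not at index 1"
```

The first-character test rejects most lines cheaply. Only lines that start with 'f' or 'F' reach the #findSubstring:in:startingAt:matchTable: search.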
Next, if you can avoid CrLfFileStream (for example, by handling your particular line-ending requirements in #nextLine yourself), you can save a bit more:
======================================================
- 791 tallies, 13122 msec.
**Tree**
20.7% {2716ms} StandardFileStream(FileStream)>>contentsOfEntireFile
25.2% {3307ms} ReadStream(PositionableStream)>>nextLine
25.8% {3385ms} StandardFileStream>>nextPutAll:
8.0% {1050ms} StandardFileStream>>flush
7.8% {1024ms} StandardFileStream(WriteStream)>>cr
7.5% {984ms} String>>beginsWith2:
2.1% {276ms} String>>findString:
======================================================
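One possible way to handle line endings without CrLfFileStream is sketched below. It is an assumption about what "handling your particular requirements" could look like, not the code that was measured above: it splits on LF and strips a trailing CR, so it copes with LF and CR LF files but not with bare-CR files. The sample text stands in for the contents read from a StandardFileStream.

```smalltalk
"Sketch: read lines from an in-memory ReadStream, accepting LF or CR LF.
 The sample text is illustrative; real input would be the file contents."
| text in lines line |
text := 'From a' , (String with: Character cr with: Character lf) ,
    'body' , (String with: Character lf) , 'end'.
in := ReadStream on: text.
lines := OrderedCollection new.
[in atEnd] whileFalse: [
    line := in upTo: Character lf.
    "Drop the CR left over from a CR LF pair."
    (line size > 0 and: [line last == Character cr])
        ifTrue: [line := line copyFrom: 1 to: line size - 1].
    lines add: line].
"lines now holds 'From a', 'body' and 'end'"
```

Whether this is actually faster than CrLfFileStream for a given file would need to be confirmed with MessageTally.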
Further improvements may be a bit harder to find (there... that should get someone going).
Cheers,
Bob