[squeak-dev] OT: compressing log files
Eliot Miranda
eliot.miranda at gmail.com
Tue Feb 9 21:34:19 UTC 2010
Hi All,
I've just needed to make sense of a very long log file generated by
strace. The log file is full of entries like:
--- SIGALRM (Alarm clock) @ 0 (0) ---
gettimeofday({1265744804, 491238}, NULL) = 0
sigreturn() = ? (mask now [])
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
and my workspace script reduces these to e.g.
--- SIGALRM (Alarm clock) @ 0 (0) ---
gettimeofday({1265744797, 316183}, NULL) = 0
sigreturn() = ? (mask now [])
NEXT 2 LINES REPEAT 715 TIMES
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
--- SIGALRM (Alarm clock) @ 0 (0) ---
gettimeofday({1265744797, 317189}, NULL) = 0
sigreturn() = ? (mask now [])
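A minimal sketch of this kind of single-level reduction, written in Python rather than Smalltalk purely for illustration. The names `compress_runs` and `max_period` are assumptions made for this sketch and do not come from the workspace script; it also reads the whole log into memory, which a real tool for very long strace logs would probably avoid.

```python
def compress_runs(lines, max_period=50):
    """Collapse immediately repeated blocks of lines into a header plus one copy."""
    out = []
    i = 0
    n = len(lines)
    while i < n:
        best = None  # (period, count) covering the most lines from position i
        for p in range(1, min(max_period, (n - i) // 2) + 1):
            block = lines[i:i + p]
            count = 1
            # Count how many consecutive copies of the block follow.
            while lines[i + count * p:i + (count + 1) * p] == block:
                count += 1
            if count > 1 and (best is None or count * p > best[0] * best[1]):
                best = (p, count)
        if best is None:
            out.append(lines[i])
            i += 1
        else:
            p, count = best
            out.append(f"NEXT {p} LINES REPEAT {count} TIMES")
            out.extend(lines[i:i + p])
            i += p * count
    return out
```

When two periods cover the same number of lines, the shorter one wins, so a run of `a b a b a b a b` is reported as 2 lines repeated 4 times rather than 4 lines repeated twice.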
My question: has anyone looked at this issue in any depth and perhaps come
up with something less crude than the script below, possibly even recursive?
I.e. the compressed log shown earlier would ideally be reduced further, to something like:
NEXT 7 LINES REPEAT 123456 TIMES
--- SIGALRM (Alarm clock) @ 0 (0) ---
gettimeofday({1265744797, 316183}, NULL) = 0
sigreturn() = ? (mask now [])
NEXT 2 LINES REPEAT BETWEEN 500 AND 800 TIMES
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
ioctl(8, 0x80045530, 0xbfd4fe70) = 0
ioctl(8, 0xc1205531, 0xbfd4fb80) = 0
--- SIGALRM (Alarm clock) @ 0 (0) ---
gettimeofday({1265744797, 317189}, NULL) = 0
sigreturn() = ? (mask now [])
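One way the recursive reduction could work, sketched in Python for illustration only. Everything here is an assumption rather than an existing tool: the names `Repeat`, `key`, `merge`, `fold_period`, `compress` and `render` are hypothetical, and so is the choice to treat lines as equal once anything inside `{...}` is masked (which is what lets the differing gettimeofday timestamps match). Blocks that differ only in their repeat counts are also treated as equal, and the counts are merged into a range.

```python
import re
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class Repeat:
    body: tuple  # items (str or Repeat) making up one copy of the block
    lo: int      # fewest repetitions seen for this block
    hi: int      # most repetitions seen for this block


def key(item):
    """Comparison key: masks {...} arguments and ignores repeat counts."""
    if isinstance(item, str):
        return re.sub(r"\{[^}]*\}", "{...}", item)  # assumed normalization
    return ("R", tuple(key(x) for x in item.body))


def merge(a, b):
    """Combine two items with equal keys, widening nested count ranges."""
    if isinstance(a, str):
        return a  # keep the first copy's text as the representative
    body = tuple(merge(x, y) for x, y in zip(a.body, b.body))
    return Repeat(body, min(a.lo, b.lo), max(a.hi, b.hi))


def fold_period(items, p):
    """Collapse immediately repeated blocks of exactly p items."""
    keys = [key(x) for x in items]
    out, i = [], 0
    while i < len(items):
        count = 1
        while (i + (count + 1) * p <= len(items)
               and keys[i + count * p:i + (count + 1) * p] == keys[i:i + p]):
            count += 1
        if count == 1:
            out.append(items[i])
            i += 1
            continue
        copies = [items[i + c * p:i + (c + 1) * p] for c in range(count)]
        body = reduce(lambda u, v: [merge(x, y) for x, y in zip(u, v)], copies)
        out.append(Repeat(tuple(body), count, count))
        i += p * count
    return out


def compress(lines, max_period=50):
    """Fold shortest periods first and restart after every successful fold."""
    items = list(lines)
    changed = True
    while changed:
        changed = False
        for p in range(1, max_period + 1):
            folded = fold_period(items, p)
            if len(folded) < len(items):
                items, changed = folded, True
                break
    return items


def render(items):
    """Turn the nested structure back into NEXT ... LINES REPEAT ... text."""
    out = []
    for it in items:
        if isinstance(it, str):
            out.append(it)
            continue
        body = render(it.body)
        times = str(it.lo) if it.lo == it.hi else f"BETWEEN {it.lo} AND {it.hi}"
        out.append(f"NEXT {len(body)} LINES REPEAT {times} TIMES")
        out.extend(body)
    return out
```

Folding the shortest periods first means the ioctl pairs collapse before the enclosing SIGALRM group is considered, so the outer group can match even when the inner counts differ. The line count in each header includes any nested header lines, as in the example above.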
Here's my quick hack that I ran in vw7.7nc:
| f o lines maxrun repeats range |
f := '../Cog/squeak.strace.log' asFilename readStream.
o := 'compressed.log' asFilename writeStream.
lines := OrderedCollection new.    "sliding window of pending input lines"
maxrun := 50.                      "flush threshold for the window"
repeats := 0.
range := nil.                      "indices in lines of the block seen repeating"
[[f atEnd] whileFalse:
    ["Window full: emit either the detected repeat or the oldest line."
     lines size > maxrun ifTrue:
        [repeats > 0
            ifTrue:
                [1 to: range first - 1 do:
                    [:i| o nextPutAll: (lines at: i); cr].
                 o nextPutAll: 'NEXT '; print: range size; nextPutAll: ' LINES REPEAT ';
                   print: repeats + 1; nextPutAll: ' TIMES'; cr.
                 range do:
                    [:i| o nextPutAll: (lines at: i); cr].
                 lines removeFirst: range last.
                 repeats := 0]
            ifFalse:
                [o nextPutAll: lines removeFirst; cr; flush].
         range := nil].
     lines addLast: (f upTo: Character cr).
     "Look for a block starting at i that is immediately followed by a copy of itself."
     [:exit|
      1 to: lines size do:
        [:i| | line repeat |
         line := lines at: i.
         repeat := lines nextIndexOf: line from: i + 1 to: lines size.
         (repeat ~~ nil
          and: [lines size >= (repeat - i * 2 + i)
          and: [(i to: repeat - 1) allSatisfy: [:j| (lines at: j) = (lines at: j - i + repeat)]]]) ifTrue:
            [repeats := repeats + 1.
             range isNil
                ifTrue: [range := i to: repeat - 1]
                ifFalse:
                    [range = (i to: repeat - 1) ifTrue:
                        ["Same block again: drop the duplicate copy from the window."
                         range do: [:ignore| lines removeAtIndex: repeat].
                         exit value]]]]] valueWithExit]]
    ensure: [f close. o close].
repeats
Forgive the cross post. I expect deep expertise in each newsgroup posted
to.
best
Eliot