[squeak-dev] OT: compressing log files

David T. Lewis lewis at mail.msen.com
Tue Feb 9 23:44:12 UTC 2010


A bit of a strain on the old garbage collector, but a Bag is good
for that kind of analysis:

  f := FileStream fileNamed: 'strace.txt'.
  lines := Bag new.
  [[f atEnd] whileFalse: [lines add: (f upTo: Character lf)]]
      ensure: [f close].
  lines sortedCounts inspect
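
A variant of the same idea, as a minimal sketch only: it assumes a Squeak
image (FileStream class>>readOnlyFileNamed:, and Bag>>sortedCounts answering
count -> line associations in decreasing order of count) and prints the most
frequent lines to the Transcript instead of opening an inspector. The cutoff
of ten lines is an arbitrary choice for illustration.

```smalltalk
| f lines counts |
"Open read-only; the file is only being analysed, not modified."
f := FileStream readOnlyFileNamed: 'strace.txt'.
lines := Bag new.
[[f atEnd] whileFalse: [lines add: (f upTo: Character lf)]]
    ensure: [f close].
"sortedCounts answers (count -> line) associations, highest count first."
counts := lines sortedCounts.
(counts first: (10 min: counts size)) do: [:each |
    Transcript show: each key printString; tab; show: each value; cr]
```

Note that a Bag discards line order, so this shows which lines dominate the
log but not where the repeated runs occur.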

Dave

On Tue, Feb 09, 2010 at 01:34:19PM -0800, Eliot Miranda wrote:
> Hi All,
> 
>     I've just needed to make sense of a very long log file generated by
> strace.  The log file is full of entries like:
> 
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744804, 491238}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> 
> and my workspace script reduces these to e.g.
> 
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> NEXT 2 LINES REPEAT 715 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> 
> 
> My question is: has anyone looked at this issue in any depth and perhaps come
> up with something less crude than the script below, and possibly even
> recursive?  Ideally the above would be reduced to something like:
> 
> NEXT 7 LINES REPEAT 123456 TIMES
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 316183}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> NEXT 2 LINES REPEAT BETWEEN 500 AND 800 TIMES
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> ioctl(8, 0x80045530, 0xbfd4fe70)        = 0
> ioctl(8, 0xc1205531, 0xbfd4fb80)        = 0
> --- SIGALRM (Alarm clock) @ 0 (0) ---
> gettimeofday({1265744797, 317189}, NULL) = 0
> sigreturn()                             = ? (mask now [])
> 
> 
> 
> Here's my quick hack that I ran in vw7.7nc:
> 
> | f o lines maxrun repeats range |
> f := '../Cog/squeak.strace.log' asFilename readStream.
> o := 'compressed.log' asFilename writeStream.
> lines := OrderedCollection new.
> maxrun := 50.
> repeats := 0.
> range := nil.
> [[f atEnd] whileFalse:
> [lines size > maxrun ifTrue:
> [repeats > 0
> ifTrue:
> [1 to: range first - 1 do:
> [:i| o nextPutAll: (lines at: i); cr].
> o nextPutAll: 'NEXT '; print: range size; nextPutAll: ' LINES REPEAT ';
> print: repeats + 1; nextPutAll: ' TIMES'; cr.
> range do:
> [:i| o nextPutAll: (lines at: i); cr].
> lines removeFirst: range last.
> repeats := 0]
> ifFalse:
> [o nextPutAll: lines removeFirst; cr; flush].
>  range := nil].
> lines addLast: (f upTo: Character cr).
> [:exit|
> 1 to: lines size do:
> [:i| | line repeat |
> line := lines at: i.
> repeat := lines nextIndexOf: line from: i + 1 to: lines size.
> (repeat ~~ nil
>  and: [lines size >= (repeat - i * 2 + i)
>  and: [(i to: repeat - 1) allSatisfy: [:j| (lines at: j) = (lines at: j - i + repeat)]]]) ifTrue:
> [repeats := repeats + 1.
>  range isNil
> ifTrue: [range := i to: repeat - 1]
> ifFalse:
> [range = (i to: repeat - 1) ifTrue:
> [range do: [:ignore| lines removeAtIndex: repeat].
>  exit value]]]]] valueWithExit]]
> ensure: [f close. o close].
> repeats
> 
> Forgive the cross post.  I expect deep expertise in each newsgroup posted
> to.
> 
> best
> Eliot
