Hi, some remember that from time to time, especially when the ZipPlugin is disabled, the DeflateStream fallback code is failing with a 'subscript is out of bounds: 65537'
I have a reproducible example with quite a large file
| zip file | zip := (FileStream fileNamed: 'snapshot.bin.gz') binary contentsOfEntireFile asString. file := zip unzipped. file zipped unzipped = file
which gives me:
ByteString(Object)>>error: ByteString(Object)>>errorSubscriptBounds: ByteString(Object)>>at: ByteString>>at: ByteString>>byteAt: GZipWriteStream(DeflateStream)>>updateHashAt: GZipWriteStream(DeflateStream)>>insertStringAt: GZipWriteStream(DeflateStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(ZipWriteStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(DeflateStream)>>deflateBlock GZipWriteStream(DeflateStream)>>next:putAll:startingAt: GZipWriteStream(DeflateStream)>>nextPutAll: ByteString(String)>>zipped
When the plugin is enabled, the example works. So I've been tracking the differences between DeflatePlugin and DeflateStream fallback code for some times without success.
What if I add a bound check in the DeflatePlugin? Then the plugin equally fails. I mean it reads past zipCollection bounds.
So there's something bad in the implementation. Why does the example seem to work is still mysterious to me.
I believe that the problem lies in DeflateStream>>deflateBlock
Especially the lines that compute lastIndex:
[blockPosition < position] whileTrue:[ (position + MaxMatch > writeLimit) ifTrue:[lastIndex := writeLimit - MaxMatch] ifFalse:[lastIndex := position].
Because the expectations below is: deflateBlock: lastIndex chainLength: chainLength goodMatch: goodMatch "Continue deflating the receiver's collection from blockPosition to lastIndex. Note that lastIndex must be at least MaxMatch away from the end of collection"
And because lastIndex and position are 0 based. While writeLimit is a size (65536).
I think that the code should rather be:
[blockPosition < position] whileTrue:[ (position + MaxMatch + 1 > writeLimit) ifTrue:[lastIndex := writeLimit - MaxMatch - 1] ifFalse:[lastIndex := position].
My failing example pass with above patch. I'm about to commit it. Before I do so, are there some extensive tests for zlib?
2017-03-31 21:13 GMT+02:00 Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>:
Hi, some remember that from time to time, especially when the ZipPlugin is disabled, the DeflateStream fallback code is failing with a 'subscript is out of bounds: 65537'
I have a reproducible example with quite a large file
| zip file | zip := (FileStream fileNamed: 'snapshot.bin.gz') binary
contentsOfEntireFile asString. file := zip unzipped. file zipped unzipped = file
which gives me:
ByteString(Object)>>error: ByteString(Object)>>errorSubscriptBounds: ByteString(Object)>>at: ByteString>>at: ByteString>>byteAt: GZipWriteStream(DeflateStream)>>updateHashAt: GZipWriteStream(DeflateStream)>>insertStringAt: GZipWriteStream(DeflateStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(ZipWriteStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(DeflateStream)>>deflateBlock GZipWriteStream(DeflateStream)>>next:putAll:startingAt: GZipWriteStream(DeflateStream)>>nextPutAll: ByteString(String)>>zipped
When the plugin is enabled, the example works. So I've been tracking the differences between DeflatePlugin and DeflateStream fallback code for some times without success.
What if I add a bound check in the DeflatePlugin? Then the plugin equally fails. I mean it reads past zipCollection bounds.
So there's something bad in the implementation. Why does the example seem to work is still mysterious to me.
Bah, it's a bit more complex than that, because we already pass lastIndex-1 to sub-level:
flushNeeded := self deflateBlock: lastIndex-1 chainLength: self hashChainLength goodMatch: self goodMatchLength.
It's more that we need MinMatch bytes to insert the string following the best match... (confirmed by comment near the end of C version http://git.savannah.gnu.org/cgit/gzip.git/tree/deflate.c) So in case of longest match (MaxMatch), we will access the zipCollection up to:
here + MaxMatch - 1 + MinMatch - 1 (0 based)
where here is up to lastIndex-1 (the lastIndex of deflateBlock above)
So the expectation is:
lastIndex - 1 + MaxMatch - 1 + MinMatch - 1 < writeLimit
Since MinMatch = 3, the expectation is correctly:
lastIndex + MaxMatch < writeLimit.
And that means that lastIndex can be up to writeLimit - MaxMatch - 1.
I'm confident that my analysis is correct, and will thus push the correction ASAP.
BTW, I've found the offending file back: it's while extracting snapshot.bin from the archive http://source.squeak.org/VMMaker/VMMaker.oscog-nice.794.mcz
2017-03-31 23:05 GMT+02:00 Nicolas Cellier < nicolas.cellier.aka.nice@gmail.com>:
I believe that the problem lies in DeflateStream>>deflateBlock
Especially the lines that compute lastIndex:
[blockPosition < position] whileTrue:[ (position + MaxMatch > writeLimit) ifTrue:[lastIndex := writeLimit - MaxMatch] ifFalse:[lastIndex := position].
Because the expectations below is: deflateBlock: lastIndex chainLength: chainLength goodMatch: goodMatch "Continue deflating the receiver's collection from blockPosition to lastIndex. Note that lastIndex must be at least MaxMatch away from the end of collection"
And because lastIndex and position are 0 based. While writeLimit is a size (65536).
I think that the code should rather be:
[blockPosition < position] whileTrue:[ (position + MaxMatch + 1 > writeLimit) ifTrue:[lastIndex := writeLimit - MaxMatch - 1] ifFalse:[lastIndex := position].
My failing example pass with above patch. I'm about to commit it. Before I do so, are there some extensive tests for zlib?
2017-03-31 21:13 GMT+02:00 Nicolas Cellier <nicolas.cellier.aka.nice@ gmail.com>:
Hi, some remember that from time to time, especially when the ZipPlugin is disabled, the DeflateStream fallback code is failing with a 'subscript is out of bounds: 65537'
I have a reproducible example with quite a large file
| zip file | zip := (FileStream fileNamed: 'snapshot.bin.gz') binary
contentsOfEntireFile asString. file := zip unzipped. file zipped unzipped = file
which gives me:
ByteString(Object)>>error: ByteString(Object)>>errorSubscriptBounds: ByteString(Object)>>at: ByteString>>at: ByteString>>byteAt: GZipWriteStream(DeflateStream)>>updateHashAt: GZipWriteStream(DeflateStream)>>insertStringAt: GZipWriteStream(DeflateStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(ZipWriteStream)>>deflateBlock:chainLength:goodMatch: GZipWriteStream(DeflateStream)>>deflateBlock GZipWriteStream(DeflateStream)>>next:putAll:startingAt: GZipWriteStream(DeflateStream)>>nextPutAll: ByteString(String)>>zipped
When the plugin is enabled, the example works. So I've been tracking the differences between DeflatePlugin and DeflateStream fallback code for some times without success.
What if I add a bound check in the DeflatePlugin? Then the plugin equally fails. I mean it reads past zipCollection bounds.
So there's something bad in the implementation. Why does the example seem to work is still mysterious to me.
squeak-dev@lists.squeakfoundation.org