[squeak-dev] The Trunk: Compression-ul.10.mcz

commits at source.squeak.org commits at source.squeak.org
Sat Dec 12 14:20:35 UTC 2009


Levente Uzonyi uploaded a new version of Compression to project The Trunk:
http://source.squeak.org/trunk/Compression-ul.10.mcz

==================== Summary ====================

Name: Compression-ul.10
Author: ul
Time: 12 December 2009, 2:18:43 am
UUID: 070e2365-df5f-514e-8d90-fa49c065cb01
Ancestors: Compression-edc.9

- replace sends of #ifNotNilDo: to #ifNotNil:, #ifNil:ifNotNilDo: to #ifNil:ifNotNil:, #ifNotNilDo:ifNil: to #ifNotNil:ifNil:

=============== Diff against Compression-edc.9 ===============

Item was changed:
  SharedPool subclass: #ZipConstants
  	instanceVariableNames: ''
+ 	classVariableNames: 'BaseDistance BaseLength BitLengthOrder DistanceCodes DynamicBlock EndBlock ExtraBitLengthBits ExtraDistanceBits ExtraLengthBits FixedBlock FixedDistanceTree FixedLiteralTree HashBits HashMask HashShift MatchLengthCodes MaxBitLengthBits MaxBitLengthCodes MaxBits MaxDistCodes MaxDistance MaxLengthCodes MaxLiteralCodes MaxMatch MinMatch NumLiterals Repeat11To138 Repeat3To10 Repeat3To6 StoredBlock WindowMask WindowSize'
- 	classVariableNames: 'ExtraLengthBits DynamicBlock Repeat11To138 ExtraDistanceBits MaxDistCodes FixedLiteralTree DistanceCodes Repeat3To10 Repeat3To6 HashBits ExtraBitLengthBits NumLiterals MaxLiteralCodes MaxBitLengthBits MaxMatch HashMask MinMatch EndBlock FixedDistanceTree StoredBlock BaseLength HashShift FixedBlock WindowSize MaxBitLengthCodes BaseDistance WindowMask MatchLengthCodes MaxBits MaxLengthCodes MaxDistance BitLengthOrder'
  	poolDictionaries: ''
  	category: 'Compression-Streams'!

Item was changed:
  ----- Method: GZipReadStream class>>viewContents: (in category 'fileIn/Out') -----
  viewContents: fullFileName
  	"Open the decompressed contents of the .gz file with the given name.  This method is only required for the registering-file-list of Squeak 3.3a and beyond, but does no harm in an earlier system"
  
+ 	(FileStream readOnlyFileNamed: fullFileName) ifNotNil:
- 	(FileStream readOnlyFileNamed: fullFileName) ifNotNilDo:
  		[:aStream | aStream viewGZipContents]!

Item was changed:
  InflateStream subclass: #FastInflateStream
  	instanceVariableNames: ''
+ 	classVariableNames: 'DistanceMap FixedDistTable FixedLitTable LiteralLengthMap'
- 	classVariableNames: 'DistanceMap FixedLitTable FixedDistTable LiteralLengthMap'
  	poolDictionaries: ''
  	category: 'Compression-Streams'!
  
  !FastInflateStream commentStamp: '<historical>' prior: 0!
  This class adds the following optimizations to the basic Inflate decompression:
  
  a) Bit reversed access
  If we want to fetch the bits efficiently then we have them in the wrong bit order (e.g., when we should fetch 2r100 we would get 2r001). But since the huffman tree lookup determines the efficiency of the decompression, reversing the bits before traversal is expensive. Therefore the entries in each table are stored in REVERSE BIT ORDER. This is achieved by a reverse increment of the current table index in the huffman table construction phase (see method increment:bits:). According to my measures this speeds up the implementation by about 30-40%.
  
  b) Inplace storage of code meanings and extra bits
  Rather than looking up the meaning for each code during decompression of blocks we store the appropriate values directly in the huffman tables, using a pre-defined mapping. Even though this does not make a big difference in speed, it cleans up the code and allows easier translation into primitive code (which is clearly one goal of this implementation).
  
  c) Precomputed huffman tables for fixed blocks
  So we don't have to compute the huffman tables from scratch. The precomputed tables are not in our superclass to avoid double storage (and my superclass is more intended for documentation anyways).!

Item was changed:
  ReadStream subclass: #InflateStream
  	instanceVariableNames: 'state bitBuf bitPos source sourcePos sourceLimit litTable distTable sourceStream crc'
+ 	classVariableNames: 'BlockProceedBit BlockTypes FixedDistCodes FixedLitCodes MaxBits StateNewBlock StateNoMoreData'
- 	classVariableNames: 'StateNoMoreData FixedDistCodes StateNewBlock BlockProceedBit BlockTypes FixedLitCodes MaxBits'
  	poolDictionaries: ''
  	category: 'Compression-Streams'!
  
  !InflateStream commentStamp: '<historical>' prior: 0!
  This class implements the Inflate decompression algorithm as defined by RFC1951 and used in PKZip, GZip and ZLib (and many, many more). It is a variant of the LZ77 compression algorithm described in
  
  [LZ77] Ziv J., Lempel A., "A Universal Algorithm for Sequential Data Compression", IEEE Transactions on Information Theory", Vol. 23, No. 3, pp. 337-343.
  
  [RFC1951] Deutsch. P, "DEFLATE Compressed Data Format Specification version 1.3"
  
  For more information see the above mentioned RFC 1951 which can for instance be found at
  
  	http://www.leo.org/pub/comp/doc/standards/rfc/index.html
  
  Huffman Tree Implementation Notes:
  ===========================================
  The huffman tree used for decoding literal, distance and length codes in the inflate algorithm has been encoded in a single Array. The tree is made up of subsequent tables storing all entries at the current bit depth. Each entry in the table (e.g., a 32bit Integer value) is either a leaf or a non-leaf node. Leaf nodes store the immediate value in its low 16 bits whereas non-leaf nodes store the offset of the subtable in its low 16bits. The high 8 bits of non-leaf nodes contain the number of additional bits needed for the sub table (the high 8 bits of leaf-nodes are always zero). The first entry in each table is always a non-leaf node indicating how many bits we need to fetch initially. We can thus travel down the tree as follows (written in sort-of-pseudocode the actual implementation can be seen in InflateStream>>decodeValueFrom:):
  
  	table _ initialTable.
  	bitsNeeded _ high 8 bits of (table at: 1).		"Determine initial bits"
  	table _ initialTable + (low 16 bits of (table at: 1)). "Determine start of first real table"
  	[bits _ fetch next bitsNeeded bits.			"Grab the bits"
  	value _ table at: bits.						"Lookup the value"
  	value has high 8 bit set] whileTrue:[		"Check if it's leaf"
  		table _ initialTable + (low 16 bits of value).	"No - compute new sub table start"
  		bitsNeeded _ high 8 bit of value].		"Compute additional number of bits needed"
  	^value
  !

Item was changed:
  SharedPool subclass: #ZipFileConstants
  	instanceVariableNames: ''
+ 	classVariableNames: 'CentralDirectoryFileHeaderSignature CompressionDeflated CompressionLevelDefault CompressionLevelNone CompressionStored DataDescriptorLength DefaultDirectoryPermissions DefaultFilePermissions DeflatingCompressionFast DeflatingCompressionMaximum DeflatingCompressionNormal DeflatingCompressionSuperFast DirectoryAttrib EndOfCentralDirectorySignature FaMsdos FaUnix FileAttrib IfaBinaryFile IfaTextFile LocalFileHeaderSignature'
- 	classVariableNames: 'CompressionStored FaUnix LocalFileHeaderSignature CompressionLevelNone IfaBinaryFile CompressionDeflated DeflatingCompressionNormal EndOfCentralDirectorySignature FaMsdos DefaultDirectoryPermissions DeflatingCompressionMaximum CompressionLevelDefault IfaTextFile DirectoryAttrib DefaultFilePermissions DeflatingCompressionSuperFast FileAttrib DataDescriptorLength CentralDirectoryFileHeaderSignature DeflatingCompressionFast'
  	poolDictionaries: ''
  	category: 'Compression-Archives'!

Item was changed:
  SharedPool subclass: #GZipConstants
  	instanceVariableNames: ''
+ 	classVariableNames: 'GZipAsciiFlag GZipCommentFlag GZipContinueFlag GZipDeflated GZipEncryptFlag GZipExtraField GZipMagic GZipNameFlag GZipReservedFlags'
- 	classVariableNames: 'GZipCommentFlag GZipContinueFlag GZipNameFlag GZipMagic GZipExtraField GZipDeflated GZipEncryptFlag GZipAsciiFlag GZipReservedFlags'
  	poolDictionaries: ''
  	category: 'Compression-Streams'!




More information about the Squeak-dev mailing list