Zip decompression in Squeak

Stefan Matthias Aust sma at kiel.netsurf.de
Sun Oct 4 09:24:15 UTC 1998


>I need to decompress zip files in Squeak.
>Has anyone implemented zip decompression?
>Ale.

I did this for VisualWorks.  It was for a commercial program, so
unfortunately I can't provide the source.  But I can describe how it was
done.  Extracting files from a zip file involves two tasks.  First, you
need to parse the zip file structure searching for the stored data.
Second, if the data is compressed, you need to decompress it.  As the zip
file format supports a number of different algorithms and as most of them
use computation-intensive bit processing, I used the freely available zlib
library.

The specification of zip files is freely available.  A zip file consists of
a number of local headers, each followed by its chunk of data, then a
central header and an end header structure.  The bad news is that you need
to start at the end of the zip file, scanning backwards for the end header
(which might be followed by a file comment of unknown length), from where
you can find the central header, which in turn leads you to the local
headers.  The header structure contains the compression method used.
Hopefully, this is either stored (0) or deflated (8).  In the first case,
you can copy the specified number of bytes.  Otherwise, you need to feed
the raw data into an inflate stream as follows:

First, you need to initialize the zlib library by calling

inflateInit2_(&zip_stream, -15, zlibVersion(), sizeof(zip_stream))

All other initialization functions you'll find in the zlib.h file are
macros.  Please note the magic number "-15", which is undocumented but
initializes the library in a pkzip-compatible mode that needs no header
information in front of the raw data.

You'll now fill in the size and address of the input and output buffers.
The first buffer contains the compressed raw data, the second buffer will
contain the uncompressed information.  You can extract the needed buffer
sizes from the zip entry header.  For very large files, you can use zlib
in stream mode, where the buffers hold only a small part of the file.  In
that case you need to call the following inflate function more than once
and with a different constant.  But in the simple mode, you just call

inflate(&zip_stream, 2)
inflateEnd(&zip_stream)

and your assigned output buffer will contain the uncompressed bytes.
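
To make the inflate step concrete, here is a minimal sketch in Python,
chosen only because its standard zlib module wraps the same library.  The
names inflate_raw and inflate_chunks are made up for this example.  Passing
-15 as the window-bits argument plays the same role as the -15 described
above: the data is treated as a raw deflate stream without any header.
The second function illustrates the stream mode, feeding the compressed
data in small pieces.

```python
import zlib


def inflate_raw(compressed, expected_size):
    """Inflate a raw deflate stream in one go (the 'simple mode')."""
    # -15 means: 32K window, no zlib header or trailer around the data.
    stream = zlib.decompressobj(-15)
    data = stream.decompress(compressed) + stream.flush()
    if len(data) != expected_size:
        raise ValueError("unexpected uncompressed size")
    return data


def inflate_chunks(chunks):
    """Inflate a raw deflate stream piece by piece (the 'stream mode')."""
    stream = zlib.decompressobj(-15)
    parts = []
    for chunk in chunks:
        # Each call consumes one input chunk and returns whatever
        # output is available so far.
        parts.append(stream.decompress(chunk))
    parts.append(stream.flush())
    return b"".join(parts)


# Build some raw deflate data to round-trip (again without header).
payload = b"hello zip " * 100
packer = zlib.compressobj(6, zlib.DEFLATED, -15)
compressed = packer.compress(payload) + packer.flush()

restored = inflate_raw(compressed, len(payload))
pieces = [compressed[i:i + 16] for i in range(0, len(compressed), 16)]
restored_streamed = inflate_chunks(pieces)
```

In a real zip reader, the expected size and the compressed bytes would
come from the zip entry header rather than from a round trip like this.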

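The first task, walking from the end header through the central header to
the local headers, can be sketched roughly as follows.  The sketch is in
Python, chosen only because its standard library has the struct and zlib
modules; read_entries is a hypothetical name, and the field layouts are my
reading of the published zip specification.  It ignores multi-disk
archives, encryption and the zip64 extensions, and it assumes that the
file comment does not itself contain the end header signature.

```python
import io
import struct
import zipfile
import zlib

END_SIG = b"PK\x05\x06"      # end header
CENTRAL_SIG = b"PK\x01\x02"  # central header entry
LOCAL_SIG = b"PK\x03\x04"    # local header


def read_entries(archive):
    """Answer a dict mapping entry names to their uncompressed bytes."""
    # Scan backwards from the end of the file for the end header.
    end_pos = archive.rfind(END_SIG)
    if end_pos < 0:
        raise ValueError("end header not found")
    end = struct.unpack_from("<4s4H2LH", archive, end_pos)
    count, central_pos = end[4], end[6]

    entries = {}
    pos = central_pos
    for _ in range(count):
        c = struct.unpack_from("<4s6H3L5H2L", archive, pos)
        if c[0] != CENTRAL_SIG:
            raise ValueError("bad central header")
        method, csize, usize = c[4], c[8], c[9]
        name_len, extra_len, comment_len = c[10], c[11], c[12]
        local_pos = c[16]
        name = archive[pos + 46:pos + 46 + name_len].decode("cp437")
        pos += 46 + name_len + extra_len + comment_len

        # The local header has its own name and extra field lengths.
        loc = struct.unpack_from("<4s5H3L2H", archive, local_pos)
        if loc[0] != LOCAL_SIG:
            raise ValueError("bad local header")
        start = local_pos + 30 + loc[9] + loc[10]
        raw = archive[start:start + csize]

        if method == 0:        # stored: just copy the bytes
            data = raw
        elif method == 8:      # deflated: raw stream, hence -15
            data = zlib.decompress(raw, -15)
        else:
            raise ValueError("unsupported method %d" % method)
        if len(data) != usize:
            raise ValueError("size mismatch for " + name)
        entries[name] = data
    return entries


# Build a small archive in memory, with a file comment, to try it out.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("hello.txt", "hello " * 50)
    zf.writestr("plain.bin", b"\x00\x01\x02",
                compress_type=zipfile.ZIP_STORED)
    zf.comment = b"archive comment"
entries = read_entries(buffer.getvalue())
```

The sizes are taken from the central header rather than the local one,
since some writers leave the local header's size fields empty.
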

If you just want to use the inflate/deflate algorithms and don't care about
zip file compatibility, you may want to look into zlib's utility functions
compress/uncompress.  They provide a very easy interface to deflating and
inflating chunks of memory.  For example (this time in Dolphin Smalltalk)

compress: source
  | dest destLen |
  dest := source species new: (source size * 1.001) asInteger + 13.
  destLen := DWORD fromInteger: dest size.
  ^(self compress: dest destLen: destLen 
      source: source sourceLen: source size) = 0
      ifTrue: [dest copyFrom: 1 to: destLen asInteger]

and

compress: dest destLen: destLen source: source sourceLen: sourceLen
  <stdcall: sword compress lpvoid DWORD* lpvoid dword>
  ^self invalidCall

to answer a compressed byte array created from a passed byte array.
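
For comparison, here is the same idea as a small sketch in Python, whose
standard zlib module exposes equivalents of compress/uncompress as
zlib.compress and zlib.decompress.  The variable names are made up for the
example.  It also shows why this route is not zip compatible: as far as I
know, the utility functions wrap the deflated data in a 2-byte header and
a 4-byte checksum, which a zip entry does not have.

```python
import zlib

source = b"some bytes worth compressing, " * 20

# One call each way; the library picks the buffer sizes for us.
packed = zlib.compress(source)
unpacked = zlib.decompress(packed)

# Stripping the 2-byte header and the 4-byte trailing checksum leaves
# the raw deflate data, which must then be inflated in header-less
# mode (-15), the same form in which a zip entry stores it.
raw_part = packed[2:-4]
unpacked_raw = zlib.decompress(raw_part, -15)
```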


bye
--
Stefan Matthias Aust  //  Are you ready to discover the twilight zone?




