Zip archive performance
Cees De Groot
cdegroot at gmail.com
Wed Oct 5 15:03:46 UTC 2005
Hi,
When using the functions in System-Archives to create a zip archive
from a large number of files (~1500 in our test case), performance
suffers badly (it takes around 220 seconds on my box).
The problem is that Archive>>addTree:match: only passes the filename
to new archive members (creating them with #newFromFile:). Tracing
the calls, this ends up at ZipNewFileMember>>from:, which calls
the innocent-looking #directoryEntry on StandardFileStream. However,
this method triggers another scan of the directory at the OS level,
followed by a linear search over the results. This means that if you
have a directory with a hundred files, all hundred files are stat()'ed
(or the Win32 equivalent) for every single file...
I worked around this by passing the directory entry along from
Archive>>addTree:match: (adding #newFromFile:entry:, etcetera). This
improves performance in our test case by a factor of 10...
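
To see the difference in isolation, here is a minimal workspace sketch.
It is not the actual patch. It assumes that #directoryEntry behaves like
a by-name lookup such as FileDirectory>>entryAt:, which rescans the
directory on each call, and it uses FileDirectory default as a stand-in
for whichever directory you want to measure.

```smalltalk
| dir perFile singleScan |
"Assumption: any directory with many files will do; 'default' is a stand-in."
dir := FileDirectory default.

"Slow pattern: look up each file's entry by name, so the directory
 is read again and searched linearly for every file."
perFile := Time millisecondsToRun: [
	dir fileNames do: [:name | (dir entryAt: name) fileSize]].

"Fast pattern: read the directory once and use the entries directly."
singleScan := Time millisecondsToRun: [
	dir entries do: [:entry | entry fileSize]].

Transcript
	show: 'per-file lookup: ', perFile printString, ' ms'; cr;
	show: 'single scan: ', singleScan printString, ' ms'; cr.
```

The first loop does work proportional to the square of the number of
files, the second only to the number of files, so the gap should widen
as the directory grows.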
I'm not sure whether this is the correct fix, just wanted to report
that there's room for optimization here :)
Code used:
TimeProfileBrowser onBlock: [| zip |
    zip := ZipArchive new.
    zip addTree: self default dataDirectory match: [:each | true].
    zip writeToFileNamed: 'c:\temp\dgv.zip']