MultiStrings and ZipArchive

Yoshiki Ohshima yoshiki at squeakland.org
Sun Apr 17 01:40:18 UTC 2005


  Colin, 

> I'm investigating a Monticello bug report, and have discovered that 
> it's actually a problem with ZipArchive. The problem arises if you add 
> a MultiString as a member of a Zip archive, and then try to read it 
> back. ZipArchive builds a String out of the bytes in the archive, with 
> the result that it includes a lot of null characters, and any non-ascii 
> characters get mangled.

  Just to make sure, it was about the content of the member, rather
than the file name of the member, right?

  As for the content of the file, I thought I had addressed the problem,
but I haven't done anything with Monticello yet...
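
  For illustration, the symptom in the quoted report can be reproduced
outside Squeak. The following is only a sketch in Python, not ZipArchive
code. It assumes that a wide string is stored as one big-endian 32-bit
word per character (an assumption about the in-memory layout, not
something stated in this thread) and that the reader then treats each
byte as one character.

```python
# Sketch only: mimics reading the raw bytes of a wide string back as a
# plain byte string. The 32-bit big-endian layout is an assumption.
text = "h\u00e9llo \u65e5\u672c"          # "héllo 日本", 8 characters

# Write side: every character becomes four bytes.
raw = text.encode("utf-32-be")

# Faulty read side: every byte becomes one character.
as_byte_string = raw.decode("latin-1")
null_count = as_byte_string.count("\x00")

# Correct read side: decode with the same layout used for writing.
round_trip = raw.decode("utf-32-be")

print(len(text), len(as_byte_string), null_count)
print(round_trip == text)
```

  The byte-wise read yields a string four times as long as the original,
mostly null characters, while decoding with the layout used on the write
side restores the text.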

  The file name is a bit trickier.  There are tons of Zip files around
whose member names contain 8-bit (not 7-bit) characters, and their
interpretation depends on the local encoding.  The Japanese version of
Windows XP creates .zip files whose member names are encoded in
Shift-JIS, and my guess is that European versions of Windows use
Latin-1.
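
  To make the ambiguity concrete, here is a small Python sketch (not
Squeak code; the member name is a made-up example). The same bytes give
the intended name only when the reader guesses the writer's encoding,
and nothing in the bytes themselves says which encoding that was.

```python
# Sketch only: one byte sequence, two plausible interpretations.
name = "\u65e5\u672c\u8a9e.txt"            # hypothetical member name "日本語.txt"

# What a Shift-JIS based system would store as the member name.
raw = name.encode("shift_jis")

# A reader that assumes Shift-JIS recovers the name.
as_sjis = raw.decode("shift_jis")

# A reader that assumes Latin-1 also "succeeds", but with the wrong text,
# because every byte value is a valid Latin-1 character.
as_latin1 = raw.decode("latin-1")

print(as_sjis == name)     # True
print(as_latin1 == name)   # False
```

  The Latin-1 reader raises no error, which is why the wrong guess tends
to go unnoticed until someone looks at the mangled name.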

  The half-baked solution was to use the platform-default encoder to
create the byte representation of wide strings, but I have been
wanting to add an extra attribute to the zip archive to record the
member name encoding.  Ned told me how to do it, but I haven't gotten
around to doing it...

-- Yoshiki


