[Seaside] File upload - encoding issue

Philippe Marschall philippe.marschall at gmail.com
Sun Oct 19 16:42:14 UTC 2014


On Fri, Oct 17, 2014 at 11:25 AM, Sven Van Caekenberghe <sven at stfx.eu> wrote:
> Hi Philippe, Dave,
>
> I made a couple of changes to Zinc to handle the problem (which basically is: mime parts such as uploaded files embedded in multipart/form-data do not have a charset parameter on their mime types, hence the encoding is not known with absolute certainty) and I think I fixed it (for Zn itself, the default encoding now is UTF-8). I added a specific test (ZnServerTests>>#testFormTest3Unspecified) for this case. Additionally, the filename is now also assumed to be UTF-8 encoded (like a file path).
>
> For the Zn Seaside adaptor, the story was a bit different. The adaptor uses a special Zn option to read everything binary, as Seaside wants to do its own conversions.

Not really. Seaside wants a WARequest object (or a subtype). The
adapters in the Seaside repository all do the conversion but that's
because these servers don't support conversion. That is out of
necessity, not by contract. Seaside should work totally fine if you
came up with a WARequest object that is built from an already parsed
object. The same goes for WAUrl and WAFile. You don't have to use the
class side parse methods. If you already have parsed objects it is
totally fine for an adapter to build WAUrl instances with #new and
#addAllToPath: and friends.

> That option did not extend to mime parts in multipart/form-data. This is now added and the adaptor now works, without altering ZnZincServerAdaptor>>#convertMultipartFileField:
>
> IMHO though, WAUploadFunctionTest is wrong. Basically, the use of ISO-8859-1 is questionable and should be replaced with UTF-8 for current browsers (in the methods #renderDownloadLinksOn: and #renderFileContentsOn:). Then those tests pass for uploaded text files that have non-ascii contents.

#renderDownloadLinksOn: could probably be fixed if we always use #rawContents.

#renderFileContentsOn: is trickier because we need to know what the
on-disk encoding of the file was. That could have been the operating
system default encoding (UTF-8 on Mac OS and modern Linux, maybe UTF-16
on Windows) or something else. We could look for a UTF-16 BOM and, if
it's missing, default to UTF-8.
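
To make the BOM idea concrete, here is a minimal sketch, written in
Python rather than Smalltalk so it stays language-neutral. It is not
Seaside or Zinc code; the function name decode_uploaded_text is
hypothetical, and replacing undecodable bytes instead of raising an
error is my assumption, not something we have agreed on.

```python
import codecs


def decode_uploaded_text(raw: bytes) -> str:
    """Decode uploaded bytes: UTF-16 if a UTF-16 BOM is present, else UTF-8.

    Hypothetical helper for illustration only.
    """
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the BOM to pick the byte order
        # and removes it from the decoded text.
        return raw.decode("utf-16")
    # No UTF-16 BOM: assume UTF-8. Undecodable bytes become U+FFFD
    # (an assumption; a stricter version could raise instead).
    return raw.decode("utf-8", errors="replace")
```

One known limitation of this sketch: a UTF-32 little-endian BOM starts
with the same two bytes as the UTF-16 little-endian BOM, so such a file
would be misdetected as UTF-16.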

Cheers
Philippe

