[Seaside] Re: GRInvalidUtf8Error: Invalid UTF-8 input

Philippe Marschall philippe.marschall at gmail.com
Sun Jun 26 16:38:30 UTC 2016


On Wed, Jun 22, 2016 at 1:33 PM, Hilaire <hilaire at drgeo.eu> wrote:
>
>
> On 22/06/2016 09:40, Philippe Marschall wrote:
>> Ok, it's likely in the server adapter before Seaside actually kicks in
>> then. Can you set a breakpoint in GRPharoUtf8Codec >> #invalidUtf8?
>>
>> My suspect would be ZnZincServerAdaptor >> #convertMultipart:
>>
>> If you can send us the string it's trying to convert, that would be helpful.
>
>
> The string argument of GRPharoUtf8Codec>>decode:
>
> is
>
> aString ->'Identités certifiées.pdf'
>
> printed like this in the debugger. Since Pharo does not use UTF-8
> internally, it is suspicious to see a UTF-8 string printed correctly in
> Pharo, right?

You are seeing a UTF-8 string that has already been decoded to
Pharo/Unicode, which is why it displays correctly. Seaside (or the
adaptor) then tries to decode it a second time, and that fails.
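
A minimal workspace sketch of that double decode, assuming the Zinc
extensions String>>utf8Encoded and ByteArray>>utf8Decoded are loaded
(the variable names are only illustrative):

```smalltalk
| original wireBytes decodedOnce |
original := 'Identités certifiées.pdf'.

"What arrives on the wire: the file name encoded as UTF-8 bytes."
wireBytes := original utf8Encoded.

"First decode: UTF-8 bytes -> Pharo string. This one is legitimate."
decodedOnce := wireBytes utf8Decoded.
Transcript show: decodedOnce; cr.

"Second decode: the already-decoded characters are treated as bytes again.
 16rE9 followed by 16r73 is not a valid UTF-8 sequence, so this should
 signal an error instead of answering a string."
[ decodedOnce asByteArray utf8Decoded ]
	on: Error
	do: [ :error |
		Transcript show: 'Second decode failed: ', error description; cr ]
```

The second step has the same shape as the failure in the adaptor: valid
UTF-8 goes in once, and the result is fed to a UTF-8 decoder again.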

> Does it look like Latin-1?
>
> aString asByteArray do: [ :each | Transcript show: each hex; space ]
>
> 16r49 16r64 16r65 16r6E 16r74 16r69 16r74 16rE9 16r73 16r20 16r63 16r65
> 16r72 16r74 16r69 16r66 16r69 16rE9 16r65 16r73 16r2E 16r70 16r64 16r66
>
> So indeed, GRPharoUtf8Codec>>decode: receives a string that has already
> been decoded from UTF-8 (the bytes above are Latin-1), so it obviously fails.

Correct.

> Looking back up the stack as you suggested, the decoding has already
> taken place in:
>
> ZnMimePart>>fileName
> "Pathnames are often silently encoded using UTF-8,
> this is a no-op for ASCII, but will fail on Latin-1 and others"
> ^ (self detectContentDispositionValue: 'filename')
>         ifNotNil: [ :encodedFileName | encodedFileName asByteArray utf8Decoded ]
>
>
> The timestamp of this method is 10/10/2014, by Sven.
>
> The second place where the decode takes place is (Zinc-Seaside package):
>
> ZnZincServerAdaptor>>convertMultipartFileField: part
> | file |
> (file := WAFile new)
>         fileName: (self codec decode: part fileName);
>         contentType: part contentType printString;
>         contents: part contents asByteArray.
> ^ file
>
> The timestamp is 11/14/2014, by Johan, which is when the decode was added.
>
>
> These two methods use two different decoding methods (duplication?), one
> from the Grease package, the other from the Zinc package.
>
> In my opinion the Zinc-Seaside package should not try to decode at all.
> Alternatively it could use the Zinc decode method (utf8Decoded), but that
> will also raise an error on an already decoded string.

What Zinc-Seaside hands to Seaside must be the UTF-8 input, already
decoded into Pharo's internal encoding. How that is achieved is up to
the Zinc-Seaside package. Just to be sure, are you working with an
up-to-date Zinc version?
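
For illustration only, one way the adaptor could avoid the second decode
is sketched below as a variant of ZnZincServerAdaptor>>convertMultipartFileField:.
This is hypothetical and not the shipped code; it assumes that
ZnMimePart>>fileName in the loaded Zinc version already answers a decoded
Pharo string, as the version quoted above does:

```smalltalk
convertMultipartFileField: part
	"Hypothetical variant: pass the file name through unchanged.
	 Assumes part fileName has already been decoded from UTF-8 by Zinc,
	 so running it through the codec again would be a double decode."
	| file |
	(file := WAFile new)
		fileName: part fileName;
		contentType: part contentType printString;
		contents: part contents asByteArray.
	^ file
```

With an older Zinc whose fileName still answers the raw, undecoded
value, this variant would be wrong and the decode would have to stay,
hence the question about the version.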

Cheers
Philippe

