[Seaside] How do we get a string with a specific encoding

Philippe Marschall philippe.marschall at gmail.com
Thu Aug 2 14:41:27 UTC 2007


2007/8/2, Norbert Hartl <norbert at hartl.name>:
> On Thu, 2007-08-02 at 15:49 +0200, stephane ducasse wrote:
> > >>>> That depends on the image version and WAKom variant you are using.
> > >>>> If you use a Squeak 3.9 image and the WAKomEncoded39 server then
> > >>>> everything is converted to utf-8 before it is transferred
> > >>>> to the client.
> > >>>
> > >>> It also depends where you get the content from. A form field? A file
> > >>> upload? A file on disk?
> > >>> Really more information would help.
> > >>
> > >> In fact it comes from two sources.
> > >>         - one from a file (that normally I read with a converter to
> > >> be sure)
> > >
> > > What encoding does the file have?
> >
> > utf8 but I could change that.
> > It seems that latin1 would be better.
> >
> > >>         - from input fields.
> > >
> > > Are you running an encoded server adapter or WAKom?
> >
> > Just WAKom but I should move to use the WAKomEncoded39 version
> >
> > >> Now I was also wondering how I can simply specify a string with an
> > >> encoding in Squeak
> > >> in the image since I have no clue about that.
> >
> > Do you know the answer to that last question?
>
> You'd better not. I know, Philippe will disagree :)

Sorry, I didn't get the question. Could you please rephrase and be
very exact about:
- where your strings come from
- what encoding these strings have
- what encoding these strings should have in the end

Just "I have an encoding problem" is not really helpful at all,
especially when asking for answers before supplying any information.

> My opinion
> is that you should convert every string you get from outside
> the image to the internal format squeak uses. That is a controlled
> setting and you are able to use size and other methods on those
> strings without worry.

Unfortunately, even if you have WideStrings you cannot rely on all
the String methods working, since some are broken for WideStrings.
Additionally, you won't be able to inspect your non-latin Squeak
WideStrings because they have no language tag and there is no way of
adding one.

> To get your file to the web with the
> correct encoding you read in the file and convert it from utf-8 to
> the internal squeak encoding. Then you should use WAKomEncoded39
> that will do the conversion to utf-8 when the string is about
> to leave the image.

That is of course only true if you get your Strings into the image via
Seaside and not via FileStream. And of course WAKomEncoded39 does not
work in Squeak 3.9 if you have the special version of Kom distributed
with the SeasideInstaller.
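
As a language-neutral sketch of the approach Norbert describes (decode
at the boundary, work on the internal representation, encode when the
string leaves), here is a small Python illustration. It is not Squeak or
Seaside code: the function names are made up for the example, and
Python's str merely stands in for the image's internal string format.

```python
def read_text(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes coming from outside (file, socket) into the internal string type."""
    return raw.decode(encoding)


def write_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode an internal string just before it leaves the program."""
    return text.encode(encoding)


raw = "Stéphane".encode("utf-8")  # bytes as they would sit in a utf-8 file
text = read_text(raw)

# Byte count and character count differ for non-ASCII text, which is
# why size-like methods are only dependable after decoding.
byte_count = len(raw)
char_count = len(text)

# The same internal string can be written out in another encoding.
latin1_bytes = write_text(text, "latin-1")
```

The point of the sketch is only that the encoding is chosen at the two
edges, and everything in between deals with characters rather than bytes.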

> From input fields it is nearly the same except for
> the fact that WAKomEncoded39 doesn't do conversion for fields if
> the request is multipart (that mostly means you uploaded something
> like an image along with the text fields).

This is fixed in 2.8. The only place where WAKomEncoded39 does no
conversion is for file uploads, because we have no way of telling what
encoding the file has or whether it is even a text file.
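
To illustrate that rule in a neutral way, here is a hypothetical Python
sketch (again not Seaside code; decode_request and its parameters are
invented for the example). Text fields are decoded with the request's
encoding, while uploads are passed through as raw bytes because their
encoding, if they are text at all, is unknown.

```python
from typing import Dict, Tuple


def decode_request(
    text_fields: Dict[str, bytes],
    uploads: Dict[str, bytes],
    encoding: str = "utf-8",
) -> Tuple[Dict[str, str], Dict[str, bytes]]:
    """Decode text fields to strings; leave uploaded files untouched."""
    decoded = {name: value.decode(encoding) for name, value in text_fields.items()}
    # Uploads are copied as-is: they may be images or text in any encoding.
    return decoded, dict(uploads)


fields, files = decode_request(
    {"title": "Grüße".encode("utf-8")},
    {"photo": b"\x89PNG\r\n\x1a\n"},
)
```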

Cheers
Philippe

> If you have trouble with this
> you can ask me and I can provide you with a fix. Or you can search
> the archives for the subject:
>
> - [Seaside] 3.9 and encoding in multipart fields
>
> Norbert