[Seaside] Accented characters

Philippe Marschall philippe.marschall at gmail.com
Thu Jul 27 17:27:09 UTC 2006


2006/7/27, Koji Yokokawa <koubo2006 at yengawa.jpn.org>:
> Hi,
>
> On Thu, 27 Jul 2006 16:58:34 +0200
> "Philippe Marschall" <philippe.marschall at gmail.com> wrote:
>
> > 2006/7/27, Koji Yokokawa <koubo2006 at yengawa.jpn.org>:
> > > I struggled to use Japanese on Seaside recently.
> > >
> > > The problem is not only about accented characters (Unicode). The root
> > > cause is that Seaside lacks a fundamental facility: 'charset'.
> > > Charset is very important, especially in Asia. In reality, many Asian
> > > sites use various local charsets rather than Unicode.
> > >
> > > Umezawa-san and I made a patch for internationalization of
> > > Seaside2/Squeak. This patch fixes the problems caused by charset
> > > encodings (including Unicode) on Seaside2.6b1. It makes WAKom (NOT
> > > WAKomEncoded) handle charset, XHTML lang, encoded filenames and encoded URLs.
> >
> > I think this should be done by WAKomEncoded instead of WAKom. WAKom is
> > supposed to do no conversion at all and thus effectively deals with
> > byte arrays rather than strings.
> >
> > As I said before, for some people it's perfectly OK to have raw UTF-8
> > (or whatever encoding) strings in the image. Others even want it that
> > way.
>
> I don't think so.
> The encoding depends on the application (the session, to be exact), not
> on the server. Therefore I added the 'charset' value as a property of an
> application. As a result, the changes are scattered across the system.
> (Check the changed methods with Monticello's 'Merge' button in your
> Seaside image.)

I think we are talking about different things. What I meant is the
following. Suppose you have an application that uses UTF-8 (or
whatever encoding) both externally and in the backend for the
database. The application never needs to query the size of strings in
number of characters and never directly indexes into the strings.

You now have two options. Either convert the strings that come into the
image (from database or web) to WideStrings, only to convert them back
to the original encoding when they go out of the image (to database or
web), or do no conversion at all. Sometimes the latter really is a valid
option.
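
To illustrate the two options, here is a minimal Python sketch. It is not
Seaside code: the function names are made up for this example, and UTF-8 is
assumed as the external encoding. It only shows that both paths hand back the
same bytes, and that the difference matters solely when you count or index
characters.

```python
# Bytes as they might arrive from the web or the database (UTF-8 assumed).
incoming = "日本語".encode("utf-8")


def with_conversion(data: bytes, charset: str = "utf-8") -> bytes:
    """Option 1: decode to text on the way in, encode again on the way out."""
    text = data.decode(charset)
    return text.encode(charset)


def without_conversion(data: bytes) -> bytes:
    """Option 2: treat the payload as opaque bytes and hand it through."""
    return data


# Both options deliver identical bytes to the outside world.
same_output = with_conversion(incoming) == without_conversion(incoming)

# The representations only disagree once you ask for a size or an index.
char_count = len(incoming.decode("utf-8"))  # number of characters
byte_count = len(incoming)                  # number of bytes
```

As long as the application never asks for `char_count` or indexes into the
string, the second option skips two conversions and loses nothing.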

> >
> > > http://squeaksource.blueplane.jp/Seaside2I18N/
> > > (This is a project on the SqueakSource in *Japanese*, however you can
> > > load it into your image without any changes.)
> >
> > The problem is that this is not at all portable. It will only work on
> > Squeak 3.9 with Kom. No other Squeak, no other Smalltalk, no other
> > HTTP server.
>
> You're right.
> I don't have any knowledge of porting Seaside to other environments. Is
> there someone who can teach me the rules or idioms for making code
> portable in Seaside?

Michel was our expert here but it looks like Boris has taken over. So
they are probably better qualified. Some rules I learned:

1. Don't send #asString; send #displayString instead (exception: WAUrl).
2. Move platform-specific stuff to SeasidePlatformSupport.

Now in your special case, I suggest the following contract:
We don't do any conversion (character encoding or decoding) in
Seaside. We do it in the server adapters. This should make porting
easier since they are platform specific anyway. This way we get rid
of all the TextConverters in Seaside (I don't think they are anywhere
near portable). In cases where we absolutely have to convert inside
Seaside (probably WAUrl), we move that code to SeasidePlatformSupport.
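
As a rough sketch of that contract, written in Python rather than Smalltalk:
the class and function names below are hypothetical and do not correspond to
actual Seaside or Kom code. The application code contains no charset logic,
and only the adapter that opts in does any decoding or encoding.

```python
from typing import Callable


def render_text(body: str) -> str:
    """Hypothetical framework code working on decoded text; no charset logic."""
    return "<p>" + body + "</p>"


def render_bytes(body: bytes) -> bytes:
    """The same hypothetical code working on raw, undecoded bytes."""
    return b"<p>" + body + b"</p>"


class PassThroughAdapter:
    """Plays the role proposed for WAKom: no conversion at all."""

    def handle(self, raw: bytes, app: Callable[[bytes], bytes]) -> bytes:
        return app(raw)


class EncodedAdapter:
    """Plays the role proposed for WAKomEncoded: converts at the boundary."""

    def __init__(self, charset: str) -> None:
        self.charset = charset

    def handle(self, raw: bytes, app: Callable[[str], str]) -> bytes:
        text = raw.decode(self.charset)        # decode on the way in
        return app(text).encode(self.charset)  # encode on the way out


# A Shift_JIS request body, as a Japanese site might send it.
request = "日本語".encode("shift_jis")
encoded_reply = EncodedAdapter("shift_jis").handle(request, render_text)
raw_reply = PassThroughAdapter().handle(request, render_bytes)
```

Swapping the server or the platform then means swapping the adapter, while
the framework code stays untouched.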

Move the Kom specific stuff to Kom. We probably have to do a Kom for 3.9.

Let's keep WAKom free of any decoding or encoding and do that in
WAKomEncoded instead.

And ask Michel, Boris and Avi what they think about it.

Philippe

