[Seaside] Re: character encoding / was: (Postgres / Glorp / Kom)

Ramiro Diaz Trepat ramirodt at gmail.com
Tue May 15 03:35:28 UTC 2007


Just for the record, I have been trying Norbert's Glorp hack for UTF8,
and in conjunction with WAKomEncoded39 it works really well.  Everything
passes through without you having to do anything special in your object
model.

That was a really cool hack, Norbert.
I don't know who is maintaining Glorp in Squeak, but we could probably
add the functionality to make it work more elegantly with whatever the
database encoding is.
Upon connection, we could set a default TextConverter in
SqueakDatabaseAccessor based on the database encoding, which we can
easily read (at least from Postgres) with the following statement:

select pg_encoding_to_char(encoding) from pg_database where datname='XXXXX';

Then, if Squeak has the appropriate TextConverter, as it does for
UTF8, we can automatically set it up, and the following line:

SqueakDatabaseAccessor>>basicExecuteSQLString: aString
     ....
     result := connection execute: (aString convertToEncoding: #utf8).
     ....

could be something like

     result := connection execute: (self encode: aString).

or something like it.
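
A minimal sketch of that idea, not tested against the real Glorp code.
The selector useDatabaseEncoding: and the instance variable encodingName
are hypothetical additions to SqueakDatabaseAccessor. The caller is
assumed to have already run the pg_encoding_to_char query above and to
pass in the resulting name. It also assumes that TextConverter
newForEncoding: answers nil when the image has no converter for that
name, which is worth checking in your image.

```smalltalk
"Hypothetical additions; encodingName is an assumed new instance variable."

SqueakDatabaseAccessor>>useDatabaseEncoding: pgEncodingName
    "pgEncodingName is what pg_encoding_to_char returned, e.g. 'UTF8'.
     Remember the name only if this image has a matching TextConverter
     (assumption: newForEncoding: answers nil for unknown names)."
    | name |
    name := pgEncodingName asLowercase.
    encodingName := (TextConverter newForEncoding: name) isNil
        ifTrue: [nil]
        ifFalse: [name]

SqueakDatabaseAccessor>>encode: aString
    "Convert outgoing SQL to the database encoding.
     Pass the string through unchanged when no converter was found."
    ^encodingName isNil
        ifTrue: [aString]
        ifFalse: [aString convertToEncoding: encodingName]
```

Storing only the name keeps the change close to the existing
convertToEncoding: call; caching the converter instance itself would be
an alternative.
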
I suppose the other way around, converting strings from the database,
would be a bit harder to implement.  At the very least, more codes would
have to be added in
PGConnection class>>buildDefaultFieldConverters
(I really don't have a clue where those codes come from) :)
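
For the reading direction, the conversion itself could be as small as
the sketch below. It assumes a hypothetical instance variable
encodingName in SqueakDatabaseAccessor that holds the lowercase encoding
name (for example 'utf8'), or nil when no suitable TextConverter exists.
The selector decode: is also hypothetical. How to hook this into
PGConnection class>>buildDefaultFieldConverters is not shown, since that
is the part I am unsure about.

```smalltalk
SqueakDatabaseAccessor>>decode: aString
    "Convert a string read from the database into Squeak's internal form.
     encodingName is an assumed instance variable (see the note above);
     answer the string unchanged when it is nil."
    ^encodingName isNil
        ifTrue: [aString]
        ifFalse: [aString convertFromEncoding: encodingName]
```
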

Thanks again, Norbert.


r.


On 5/14/07, Ramiro Diaz Trepat <ramirodt at gmail.com> wrote:
> Summarizing
>
> Using Squeak 3.9, for example:
>
> 1.
> - Start Seaside with WAKom.
> - Go to the SushiStore, search for 'Ñandú' (a kind of Argentinean ostrich).
> - The method WAStoreFillCart>>search: receives a properly formed ByteString
> that reads 'Ñandú'
> - Seaside then displays the corrupt String: No items match '?amd?'
>
> 2.
> - Start Seaside with WAKomEncoded39.
> - Go to the SushiStore, search for 'Ñandú'.
> - The method WAStoreFillCart>>search: receives a properly formed ByteString
> (not a WideString or a UTF8 formatted ByteString) that reads 'Ñandú'
> - Seaside then displays the correct String: No items match Ñandú
>
> Although example 2 displays the string properly, methods like #search:
> never seem to receive a UTF8 or WideString instance.  What you get with
> either WAKom or WAKomEncoded39 are always indistinguishable instances of
> ByteString.
> WAKomEncoded39 converts strings to and from UTF8 when sending and
> receiving, but you don't get to "see" these UTF8 Strings.  When they get
> to you, have they always been converted to Squeak's default encoding (I
> don't yet know what that is)?
>
>
>
>
> Summarizing some of the answers I got:
>
> Philippe
> Basically informs us that the handling of UTF8 strings in KomHttpServer /
> Squeak 3.9 got really broken, and that sadly a fix does not seem to be on
> the way anytime soon.  But he also says that everything works in 3.8.
> In spite of this statement, I got no concrete answers from anyone using
> Seaside in production (and using special characters) about which platform
> they are using.  In particular, I didn't hear from the rest of the Seaside
> core developers either "We are all using Squeak 3.8" or "I have the fix for
> KomHttpServer for 3.9 but I will not share it" :)
>
> Norbert
> Being in a situation very similar to mine, that is, having to use a Postgres
> DB encoded in UTF8, he was also unable to make it work out of the box
> (confirming Philippe's statements) and coded a very smart workaround, which
> he kindly shared with us.
>
> Sebastián
> Everything works for him using WAKomEncoded39.  But probably in the same way
> as in the SushiStore example above with WAKomEncoded39, that is, without
> receiving UTF8 or WideStrings.
>
>

