Postgres / Glorp

Mon May 14 16:40:15 UTC 2007

On Mon, 2007-05-14 at 11:43 -0300, Ramiro Diaz Trepat wrote:
> Sorry Philippe, I am not sure what you mean.
> 
> When you say "a custom Kom" I suppose you mean extensions like
> Norbert's.  But then, when you say "or get WideStrings".  What do you
> mean? from where?
> 
> What happened between 3.8 and 3.9?
> Why does Kom not work with utf8 in 3.9?
> There are plenty of Spanish, German, Swedish and French speakers in
> this list, what do you guys use in production?
> I suppose that Norbert's solution of adding hooks to translate strings
> every time fixes the problem, but it also introduces a major
> performance penalty, because every string that comes and goes has to
> get translated (every time, unless you cache the translations you
> make), and I guess these translations are not that cheap.
> 
I wouldn't expect a big performance penalty. The strings usually
aren't huge, so conversion doesn't cost much. And it is the only
way to do it. Every component should be free to choose the encoding
it needs or is configured to use. Take Java as an example. The
internal storage is UTF-16 (originally UCS-2). It is very close to
UTF-8 ;) but not the same. So there is a conversion every time you
need the string in a different encoding. Keeping a big registry of
converted strings will cost more than it saves once the lookup
tables get large.
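
To make the point about converting at each boundary concrete, here is a
minimal sketch in Python (not Squeak or Kom code). The helper name
transcode and the sample text are made up for illustration. It only
shows that the same text has a different byte representation per
encoding, and that moving between them is one decode plus one encode.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes from the source encoding and re-encode them in the target."""
    return data.decode(source).encode(target)


# Hypothetical sample text with a few non-ASCII characters.
text = "Señor Müller på café"

# What a web layer might receive, assuming it really is UTF-8.
utf8_bytes = text.encode("utf-8")

# The same text as a component using UTF-16 or Latin-1 would store it.
utf16_bytes = transcode(utf8_bytes, "utf-8", "utf-16-le")
latin1_bytes = transcode(utf8_bytes, "utf-8", "latin-1")

# Converting back yields the original bytes, so nothing is lost here.
round_trip = transcode(utf16_bytes, "utf-16-le", "utf-8")
```

Each call walks the string once, so for short strings the cost stays
small compared with maintaining a large table of already converted
strings.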

My problem is having to make assumptions about the encoding. I assume
I get UTF-8 from the web (as long as I'm sending UTF-8 this is
feasible, but not safe). I also assume the client encoding to the
database is UTF-8, which in my case is always true, but it can differ.
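
One way to make such an assumption visible instead of silent is
sketched below in Python. The function names and the UTF-8 default are
assumptions for illustration, not part of Kom or Glorp. The sketch uses
the charset declared in a Content-Type header when there is one, falls
back to the assumed default otherwise, and decodes strictly so that a
wrong assumption raises an error rather than producing garbage.

```python
from email.message import Message


def charset_from_content_type(header_value: str, default: str = "utf-8") -> str:
    """Return the charset declared in a Content-Type header, or the default."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or default


def decode_body(body: bytes, content_type: str) -> str:
    """Decode a request body strictly, using the declared or assumed charset."""
    charset = charset_from_content_type(content_type)
    # errors="strict" raises UnicodeDecodeError if the bytes do not match.
    return body.decode(charset, errors="strict")


# Hypothetical request body sent as Latin-1.
latin1_body = "café".encode("latin-1")
declared = decode_body(latin1_body, "text/plain; charset=ISO-8859-1")

try:
    # No charset declared, so UTF-8 is assumed, which is wrong for these bytes.
    decode_body(latin1_body, "text/plain")
    assumption_held = True
except UnicodeDecodeError:
    assumption_held = False
```

This does not replace real negotiation between the components. It only
turns a silent mismatch into a failure that shows up immediately.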

As long as there is no negotiation about the encoding, everything is
a hack and will fail sooner or later. And encoding problems are the
worst kind of problem computers have to offer :)

Norbert