[Seaside] WAUrl class>>#decodePercent:

Mon Sep 2 08:59:42 UTC 2013

Hi Philippe,

On 31.08.13 13:48, Philippe Marschall wrote:
> On Fri, Aug 30, 2013 at 9:27 PM, jtuchel at objektfabrik.de
> <jtuchel at objektfabrik.de> wrote:
>> Okay, this message costs me some courage now ;-)
>>
>> Philippe, of course Seaside is decoding. Otherwise I would never have gotten
>> an Exception from decodePercent: in the first place. So I am expecting the
>> right thing from Seaside and am getting it.
>>
>> The real problem in my case is that I am using ISO-8859-15 in my
>> application. So the result of getting the text field's contents using val()
>> is an ISO-8859-15 encoded String.
> AFAIK that should be UTF-16 for JavaScript.
Well, it seems to be exactly what the HTML page's charset setting says,
which in my case is ISO-8859-15. At least the Strings coming in to my
callback carry umlauts in exactly the encoding I need.

>> encodeURI and encodeURIComponent not only escape characters, but also
>> convert special characters into UTF-8. In my case these were German umlauts.
>> So I fell hostage to a side effect of encodeURI and the fact that VA ST
>> doesn't yet support Unicode and somehow didn't understand this.
>>
>> So what I did was not wrong per se, I just ignored the whole UTF-8 thing. I
>> should have started my search in that area, because it is not the first time
>> Ajax and its UTF-8 nature bit me.
>> For Pharo/Squeak/Gemstone users, this UTF-8 stuff is a non-issue, and
>> therefore readers of my posts had absolutely no chance to see the forest
>> between all the trees.
> Just for completeness' sake you could try to fake it, accept UTF-8 and
> translate it to ISO-8859-15
Yes, I thought about this possibility, but I decided against it, not
only for the reasons you mention but also for performance reasons. In
an autocompleter that reacts to every single keystroke, builds up a
hierarchical representation of business objects retrieved using Glorp,
and renders them in nested <ul> tags, every encoding/decoding step
makes the thing slower, and this is a very central place in the app and
part of its strength.
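
For readers who want to see what the "accept UTF-8 and translate it to
ISO-8859-15" route would involve, here is a minimal sketch in Python. It
is not Seaside or VA Smalltalk code; the function name
utf8_percent_to_latin9 is made up for illustration, and quote() with
safe="" only approximates what encodeURIComponent does in the browser.

```python
from urllib.parse import quote, unquote_to_bytes


def utf8_percent_to_latin9(escaped: str) -> bytes:
    """Hypothetical helper: percent-decode, read as UTF-8, re-encode as ISO-8859-15."""
    raw = unquote_to_bytes(escaped)      # percent-decoding yields raw bytes
    text = raw.decode("utf-8")           # the escapes carry UTF-8, not Latin-9
    return text.encode("iso-8859-15")    # raises UnicodeEncodeError if it does not fit


# Roughly what the browser sends after encodeURIComponent("Müller")
escaped = quote("Müller", safe="")
print(escaped)                                           # M%C3%BCller
print(utf8_percent_to_latin9(escaped))                   # b'M\xfcller'

# What an ISO-8859-15 escape of the same string would look like
print(quote("Müller", safe="", encoding="iso-8859-15"))  # M%FCller
```

The sketch shows the two extra steps (UTF-8 decode, Latin-9 encode) that
would run on every keystroke in the autocompleter.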

> but then the question is what you do with
> everything of Unicode that doesn't fit into ISO-8859-15. Also we have
> the option of running UTF-8 but not decoding it. This makes it possible
> to run UTF-8 on non-Unicode-capable systems but you have to be very
> careful especially with the backend.
I invested quite some effort in making the whole application, from
database to web page, UTF-8-free, because VAST doesn't support it very
well, other than converting back and forth whenever a String enters or
leaves VAST. Special fun is involved with DB field lengths etc. So as
strange as it may sound, Unicode is not necessarily your friend if at
least one part of the chain doesn't support it.
Luckily, the application is very tightly coupled to legal regulations in 
Germany, so I can quite safely decide to ignore the rest of the world 
that needs characters outside of the ISO-8859-15 character set.
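
To illustrate the DB field length point, a small Python sketch (again
not VAST code; byte_lengths is a made-up name) compares how many bytes
the same String needs in the two encodings. A column sized in bytes for
ISO-8859-15 data can be too short for the UTF-8 form of the same text.

```python
def byte_lengths(text: str) -> tuple[int, int]:
    """Return (ISO-8859-15 byte count, UTF-8 byte count) for the same text."""
    return len(text.encode("iso-8859-15")), len(text.encode("utf-8"))


# 7 characters: one byte each in ISO-8859-15, but ö and ß take two bytes
# and € takes three bytes in UTF-8.
print(byte_lengths("Größe €"))  # (7, 11)
```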

>
> You may want to do special testing with €ŠšŽžŒœŸ which are part of
> ISO-8859-15 but not ISO-8859-1. Also you may want to test what happens
> when somebody enters non-ISO-8859-15 input [1].
Okay, you are right. I need to test what happens if somebody enters
non-ISO-8859-15 input. But I would expect this to be prevented by the
web browser if I explicitly set the metadata of my HTML pages and
especially form tags...
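
A rough Python sketch of such a test follows; the helper name fits is an
assumption for illustration. It checks the eight characters Philippe
lists against both charsets and then tries one character outside
ISO-8859-15. The xmlcharrefreplace line mimics the numeric character
reference substitution that, as far as I know, browsers tend to apply to
unencodable form input instead of blocking it, so that behaviour should
be verified in the real browsers rather than assumed.

```python
LATIN9_ONLY = "€ŠšŽžŒœŸ"  # in ISO-8859-15 but not in ISO-8859-1


def fits(text: str, codec: str) -> bool:
    """True if every character of text can be encoded in the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


for ch in LATIN9_ONLY:
    # Each character should fit ISO-8859-15 and fail ISO-8859-1.
    print(ch, fits(ch, "iso-8859-15"), fits(ch, "iso-8859-1"))

# Greek capital omega is outside ISO-8859-15.
print(fits("Ω", "iso-8859-15"))                                  # False
print("Ω".encode("iso-8859-15", errors="xmlcharrefreplace"))     # b'&#937;'
```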

Joachim

-- 
----------------------------------------------------------------------- 
Objektfabrik Joachim Tuchel          mailto:jtuchel at objektfabrik.de 
Fliederweg 1                         http://www.objektfabrik.de
D-71640 Ludwigsburg 		     http://joachimtuchel.wordpress.com
Telefon: +49 7141 56 10 86 0         Fax: +49 7141 56 10 86 1