[Seaside] WAUrlEncoder

Philippe Marschall philippe.marschall at gmail.com
Sat Feb 7 15:12:56 UTC 2009


2009/2/6 Karsten <karsten at heeg.de>:
> Hi,
>
> today I had a problem with non-ASCII characters in the URL. The reason for
> this problem seemed to be that URLs are encoded in a different way than they
> are decoded. Encoding is done with WAUrlEncoder and decoding is done with
> URLEncoder (at least on VisualWorks). The URLEncoder uses UTF8 to
> encode/decode URLs, but the WAUrlEncoder stores characters bytewise with %.
> So if you encode 'ü' (252 as Integer) with WAUrlEncoder you get '%FC', while
> the URLEncoder produces '%C3%BC'.

That is how things are in Seaside 2.8, at least for WAUrlEncoder. I
can't say anything about URLEncoder.
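
To make the two results concrete, here is a minimal sketch in Python
rather than Smalltalk. It only uses the standard urllib.parse.quote to
mimic the two behaviours; it is not how WAUrlEncoder or URLEncoder are
implemented.

```python
from urllib.parse import quote

# Bytewise percent-encoding: each character below 256 becomes one byte
# (this is what ISO-8859-1 does), which mimics the '%FC' result.
bytewise = quote("ü", encoding="latin-1")

# UTF-8 percent-encoding: "ü" (code point 252) becomes the two bytes C3 BC.
utf8 = quote("ü", encoding="utf-8")

print(bytewise)  # %FC
print(utf8)      # %C3%BC
```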

> If the URLEncoder is also used for encoding everything works fine and even
> the browser shows the characters properly in the address bar. I'm not sure
> if that's just a problem of the VW port, but still I don't understand why
> the WAUrlEncoder doesn't encode with UTF8, even though that's recommended in
> the RFC (at least that's what Wikipedia said ;-) ).

First, URLEncoder is a VW class which means we can't use it.
Second, specs and Wikipedia entries don't matter. The only thing that
matters is what browsers and people do. Browsers are quite good at
ignoring specs, mostly because authors are way better at ignoring
specs.

So why don't we encode to utf-8? Mostly because a lot of people use
utf-8 as the internal encoding for their images. This means that data
that is already utf-8 must not be encoded to utf-8 again. While not
re-encoding is fine for those people, it screws the people who want to
use the native encoding internally and utf-8 externally. Additionally,
encoding to utf-8 or decoding from utf-8 is wrong if you want to use
ISO-8859-1 externally.
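
The double-encoding problem can be sketched like this, again in Python
purely as an illustration. The "internal" string below is my assumption
of what a utf-8-internal image holds: one character per utf-8 byte.

```python
from urllib.parse import quote

# Simulate an image that keeps utf-8 internally: every utf-8 byte of "ü"
# is held as its own character (code points 0xC3 and 0xBC).
internal = "ü".encode("utf-8").decode("latin-1")

# Bytewise percent-encoding leaves the existing utf-8 bytes untouched.
once = quote(internal, encoding="latin-1")

# Encoding to utf-8 again encodes each of the two characters separately.
twice = quote(internal, encoding="utf-8")

print(once)   # %C3%BC
print(twice)  # %C3%83%C2%BC
```

The second result is what a browser would receive if an encoder always
converted to utf-8 on top of data that is already utf-8.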

In Seaside 2.8 you can only indirectly specify your desired internal
encoding through the choice (or configuration) of your server adapter.
However, in WAUrlEncoder we don't have access to this information. The
fix in Seaside 2.9 is that we have access to whether we need to do
encoding and to what ;-)
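
As a rough sketch of that idea (not the Seaside 2.9 code): an encoder
that is told whether to convert and to which codec. The function
url_encode and its external_codec parameter are hypothetical names I
made up for this illustration, and Python stands in for Smalltalk.

```python
from typing import Optional
from urllib.parse import quote


def url_encode(text: str, external_codec: Optional[str]) -> str:
    """Hypothetical helper: percent-encode text for use in a URL.

    external_codec=None means "no conversion needed": the characters are
    treated as bytes already and written out one byte per character.
    Otherwise the text is converted to external_codec first.
    """
    if external_codec is None:
        # Raises UnicodeEncodeError if a character does not fit in a byte.
        return quote(text, encoding="latin-1")
    return quote(text, encoding=external_codec)


print(url_encode("ü", None))          # %FC
print(url_encode("ü", "utf-8"))       # %C3%BC
print(url_encode("ü", "iso-8859-1"))  # %FC
```

The point is only that the caller, not the encoder, decides on the
conversion, which is the information WAUrlEncoder lacks in 2.8.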

Honestly, I think the combination of WAUrlEncoder not encoding to
utf-8 and URLEncoder decoding from utf-8 is broken. But that is a VW
port issue.
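
Why the combination breaks can be shown with the same Python stand-ins
(urllib.parse, not the actual VW classes): a lone FC byte is not valid
utf-8, so a utf-8 decoder cannot recover the original character.

```python
from urllib.parse import quote, unquote

# Encode bytewise, as described for WAUrlEncoder in Seaside 2.8.
encoded = quote("ü", encoding="latin-1")  # %FC

# Decoding as utf-8 fails on the lone 0xFC byte; Python's unquote uses
# errors='replace' by default and substitutes U+FFFD.
mismatched = unquote(encoded, encoding="utf-8")

# Decoding with the codec that matches the encoder round-trips cleanly.
matched = unquote(encoded, encoding="latin-1")

print(mismatched == "\ufffd")  # True
print(matched == "ü")          # True
```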

Cheers
Philippe

