[Seaside-dev] Re: encoded stream

Sun Jan 25 15:46:59 UTC 2009

2009/1/24, Julian Fitzell <jfitzell at gmail.com>:
> On Sat, Jan 24, 2009 at 6:18 PM, Philippe Marschall
> <philippe.marschall at gmail.com> wrote:
>> These two are: you want conversion from an external encoding
>> to Smalltalk, or you want to use your external encoding as your
>> internal one. So your internal encoding can be either Smalltalk or
>> your external one.
>
> The problem with what you suggest is that much of the time you don't
> *know* the external encoding, so saying you want to use the *same*
> encoding internally as externally makes no sense.

But adding a configuration option doesn't add knowledge. You can't
transform something unknown into something known. We don't know
anything about it. We can set things up so that, with reasonable
likelihood, it will be what we expect. But if it isn't, most of the
time we won't even find out that it is something different. And if it
is something different, we have almost no chance of finding out what
it really is.
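
To illustrate why a wrong assumption usually goes unnoticed, here is a
small sketch in Python (not Seaside code; the sample string and the
variable names are made up for the example). It assumes a client that
sends UTF-8 without declaring it and a server that assumes ISO-8859-1.

```python
# Bytes from a hypothetical client that used UTF-8 without declaring it.
raw = "Z\u00fcrich".encode("utf-8")  # "Zürich"

# A server that assumes ISO-8859-1 decodes them without any error,
# because every byte value maps to some ISO-8859-1 character.
assumed = raw.decode("iso-8859-1")

# The result is silently wrong rather than an exception.
silently_wrong = assumed != "Z\u00fcrich"

# Only the opposite mistake has a chance of being noticed:
# ISO-8859-1 bytes are often not valid UTF-8.
latin = "Z\u00fcrich".encode("iso-8859-1")
try:
    latin.decode("utf-8")
    detected = False
except UnicodeDecodeError:
    detected = True

print(silently_wrong, detected)
```

Even in the case where an error is raised, it only says the bytes are
not UTF-8. It does not say what they actually are.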

> I don't see why we can support the conversions:
> a)    (probably-unknown, assume UTF-8 default)   ->   Smalltalk
> b)    (probably-unknown)                         ->   definitely-unknown

That's what people have been doing for years, all big Seaside
applications do this, and many Seaside consultants recommend this as
best practice. You just have to make sure you stay within ASCII.
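
As a rough sketch of why "stay within ASCII" is the catch, the Python
snippet below models "no conversion" as one character per byte. The
passthrough function is a hypothetical stand-in for that behaviour, not
anything from Seaside.

```python
def passthrough(raw: bytes) -> str:
    # Hypothetical stand-in for "no conversion": one character per byte.
    return raw.decode("latin-1")

ascii_in = "hello".encode("utf-8")
accented_in = "h\u00e9llo".encode("utf-8")  # "héllo", 6 bytes in UTF-8

# ASCII input: character count matches what the user typed.
ascii_len = len(passthrough(ascii_in))

# Non-ASCII input: the string is one "character" longer than expected.
accented_len = len(passthrough(accented_in))

# Writing the bytes back out unchanged still round-trips.
round_trip_ok = passthrough(accented_in).encode("latin-1") == accented_in

# But any string operation can split a multi-byte character.
truncated = passthrough(accented_in)[:2].encode("latin-1")
try:
    truncated.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    still_valid = False

print(ascii_len, accented_len, round_trip_ok, still_valid)
```

So pass-through works as long as nothing looks inside the string, or as
long as every character is a single byte.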

> But we can't support:
> c)    (probably-unknown, assume UTF-8 default)   ->   UTF-8
> d)    (probably-unknown, assume UTF-8 default)   ->   UTF-16

That doesn't make sense. If you use UTF-16 internally (and I can't
imagine anybody wanting that), then you'd almost certainly want to use
it externally as well. I see no point in supporting configurations
that don't make sense just to be super generic when, in the end,
YAGNI.

> It seems to me that (c) is what most legacy applications would want.

Many applications probably want the following as well:
some one byte encoding -> some one byte encoding

> (b) and (c) have the same likelihood of working (ie. not erroring)
> since they don't attempt to convert requests with no specified
> encoding. But (c) has a higher chance of producing UTF-8 because it
> can convert requests that *do* specify an encoding.

But requests in general don't specify one. Seriously, if a strange
request comes in, you probably won't even notice. And if you do, you
won't find out what the encoding really is. You just can't in practice
because the information isn't there.
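
For what it's worth, a minimal sketch of "use the declared encoding if
there is one, otherwise fall back to a default" could look like this in
Python. The function names and the UTF-8 default are assumptions for the
example; the only library call used is get_content_charset from the
standard email package.

```python
from email.message import Message

def declared_charset(content_type):
    # Returns the charset parameter of a Content-Type value, or None.
    if content_type is None:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()

def decode_body(raw, content_type, default="utf-8"):
    # Falls back to an assumed default when nothing is declared.
    charset = declared_charset(content_type) or default
    return raw.decode(charset)

# Typical form post: no charset parameter at all.
print(declared_charset("application/x-www-form-urlencoded"))

# The rare request that does declare one.
print(declared_charset("text/plain; charset=ISO-8859-1"))
```

With no declaration, decode_body is only as right as the default
happens to be, which is the whole problem.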

Cheers
Philippe