[Q] WAListener and WAFileLibrary problem

chunsj at embian.com
Sat Jan 5 13:13:46 UTC 2008


I do not know the Han unification part. In fact Hanja, the Chinese characters, are not
included when I say Korean; I mean only Hangul, the Korean alphabet. Hangul does have a
dedicated region in Unicode.
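
As a minimal sketch of what a "dedicated region" means here: Hangul occupies its own
Unicode blocks, so a code point can be tested against them directly. The helper name
`is_hangul` is made up for illustration, the ranges come from the Unicode block charts
rather than from any Squeak code, and the list is not exhaustive (the extended Jamo
blocks are left out).

```python
# Hypothetical helper: the ranges below are Unicode blocks, not Squeak API.
HANGUL_RANGES = (
    (0x1100, 0x11FF),  # Hangul Jamo
    (0x3130, 0x318F),  # Hangul Compatibility Jamo
    (0xAC00, 0xD7A3),  # Hangul Syllables (precomposed)
)


def is_hangul(ch: str) -> bool:
    """Return True if the single character ch lies in one of the ranges above."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in HANGUL_RANGES)


print(is_hangul("한"))  # Hangul syllable U+D55C
print(is_hangul("漢"))  # Han ideograph U+6F22, outside every Hangul block
```

A check like this identifies the script of a character, not the language of the text
around it.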

----- Original Message -----
   From: Philippe Marschall <philippe.marschall at gmail.com>
   To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
   Sent: 08-01-05 20:38:40
   Subject: Re: Re: [Q] WAListener and WAFileLibrary problem

  2008/1/5, chunsj at embian.com <chunsj at embian.com>:
> Ah, I've changed/added support for UnicodeEnvironment so that a UTF-8
> encoded byte array can be converted to/from Squeak's internal encoding.
> With this, I can read UTF-8 encoded text (which can include Korean or
> other languages encoded as UTF-8) from Squeak tools such as the
> file list.
>
> A language tag is not required because Unicode already has regions for
> Korean, Japanese, Chinese and any other language it supports.
> So we can determine from the byte sequence which language region
> it falls in.

Uhm no. Unicode does Han-Unification. So for some byte sequences there
is no way of telling whether they're Chinese, Japanese or Korean.

Cheers
Philippe
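
A small sketch of the Han unification point (Python is used only for illustration;
nothing here is Squeak or Seaside code): the UTF-8 bytes E6 BC A2 decode to the single
code point U+6F22, which is shared by Chinese, Japanese and Korean text. The character
database names it only as a unified ideograph, so the bytes alone carry no language.

```python
import unicodedata

# UTF-8 bytes for U+6F22; the same bytes are used whatever language the text is in.
raw = b"\xe6\xbc\xa2"
ch = raw.decode("utf-8")

print(hex(ord(ch)))            # code point of the decoded character
print(unicodedata.name(ch))    # Unicode name: a unified ideograph, no language attached
print(unicodedata.name("한"))  # a Hangul syllable, by contrast, is script-specific
```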

> Anyway, I'm currently looking for a way to determine the content type of a WAResponse,
> so that UTF8Stream is not used if it's not text/html.
>
> Thank you.
>
> ----- Original Message -----
>    From: Philippe Marschall <philippe.marschall at gmail.com>
>    To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
>    Sent: 08-01-05 15:14:48
>    Subject: Re: [Q] WAListener and WAFileLibrary problem
>
>   2008/1/5, chunsj at embian.com <chunsj at embian.com>:
> > I've found the main reason for the image corruption: WAListenerEncoded
> > uses UTF8Stream *unconditionally*; as you said, it does not decide based on the MIME
> > type.
> >
> > But I cannot understand why Korean as UTF8 should not work.
>
> Because WAListenerEncoded gives you Strings in Squeak encoding
> but also expects Strings from you to be in Squeak encoding. If you
> pass to it Strings that are already in UTF8 they get converted twice
> to UTF8.
>
> > I have customized my image
> > so that it supports Korean and others (Japanese and Chinese, but with no fonts for those two).
> > A WideString for Korean can be flawlessly converted to/from a UTF8 encoded byte string.
>
> No, not at all. UTF8 has no concept of language tags.
>
> Cheers
> Philippe
>
> > Is this
> > the work to be done by WAListenerEncoded?
> >
> > Thank you for your help. Now I'm trying to find the content type of the WAResponse before using
> > UTF8Stream.
> >
> > ----- Original Message -----
> >    From: Philippe Marschall <philippe.marschall at gmail.com>
> >    To: The general-purpose Squeak developers list <squeak-dev at lists.squeakfoundation.org>
> >    Sent: 08-01-04 20:40:57
> >    Subject: Re: [Q] WAListener and WAFileLibrary problem
> >
> >   2008/1/3, chunsj at embian.com <chunsj at embian.com>:
> > > Hi,
> > >
> > > I've managed to find and modify WAListenerEncoded so that it can process
> > > multibyte languages - I've only tested it with Korean as UTF-8. During testing
> > > I found the following problem.
> >
> > Korean as UTF-8 should not work on WAListenerEncoded. If it does then
> > it's a bug in WAListenerEncoded. The reason for this is that Korean as
> > UTF-8 violates the contract between the server adapter and you. The
> > *Encoded* adapters give you Strings in Squeak encoding (well not quite
> > in the case of CJK because that is not possible since Unicode does not
> > have the concept of language tags) but in turn expect Strings in
> > Squeak encoding. In the case of Korean this means WideStrings. UTF-8
> > Strings are ByteStrings and should therefore not work.
> >
> > > When I use WAListener/WAListenerEncoded I cannot get image files registered in a
> > > FileLibrary correctly. I can get a CSS file or script file correctly, but I cannot get
> > > image files.
> >
> > I don't think WAListenerEncoded can ever work for binary files. The
> > problem is that due to its streaming nature WAListenerEncoded,
> > unlike WAKomEncoded, can never look at the response. This means it
> > can never decide whether it should do encoding (based on the mimetype),
> > so it always does it. In the case of binary content this is clearly
> > wrong. Your best option (as always) is to serve static files (images,
> > CSS, javascript) with Apache or something similar.
> >
> > > It seems that when I use WAListener, the server sends the image file with a size
> > > of 16135 bytes, but the original file size is 10819 bytes, and this might be the source
> > > of the problem. I cannot open the wrong-sized file even after truncating it
> > > to the original size.
> >
> > WAListener should not do any encoding at all so images should work.
> > But then again we don't know what code you changed so we can't really
> > help you. It would help if you send us the image so we can test.
> >
> > Cheers
> > Philippe
> >

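The double conversion and the size mismatch discussed in this thread (16135 bytes sent
for a 10819-byte file) can be reproduced outside Squeak. The sketch below assumes, for
illustration only, that an unconditional UTF-8 output stream treats each incoming byte
as a character code below 256 and re-encodes it. Under that assumption every byte of
0x80 or above grows to two bytes, so binary data grows and text that is already UTF-8
is encoded twice. The function name `utf8_encode_bytewise` is hypothetical.

```python
def utf8_encode_bytewise(data: bytes) -> bytes:
    """Treat each byte as a code point 0-255 and UTF-8 encode it (assumed behaviour)."""
    return data.decode("latin-1").encode("utf-8")


# Already-encoded Korean text gets encoded a second time.
once = "한".encode("utf-8")         # 3 bytes, all >= 0x80
twice = utf8_encode_bytewise(once)  # each of those bytes becomes two bytes
print(len(once), len(twice))

# Binary data: bytes below 0x80 pass through, the rest become two bytes each.
binary = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])
mangled = utf8_encode_bytewise(binary)
high = sum(b >= 0x80 for b in binary)
print(len(mangled) == len(binary) + high)
```

If this is what happened, the reported sizes would correspond to 16135 - 10819 = 5316
bytes with the high bit set in the original image; that is an inference from the numbers
in the thread, not something the thread confirms. Truncating the result back to 10819
bytes cannot repair it, because the extra bytes are interleaved throughout the file.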