[Seaside-dev] WACodecTest>>testCodecUtf8ShortestForm

Michael Lucas-Smith mlucas-smith at cincom.com
Mon Jun 29 17:55:46 UTC 2009


Philippe Marschall wrote:
> 2009/6/22 Michael Lucas-Smith <mlucas-smith at cincom.com>:
>   
>> Hi All,
>>
>> This test has a non-shortest form for the characters 'abc'.
>>     
>
> It should be 'ABC'.
>   
Yep, that's what the code tests for.
>   
>> The specification says
>> that you should reject -illegal- sequences, but non-shortest are okay.
>> You're not meant to -generate- non-shortest, so perhaps the test should be
>> flipped to make sure the non-shortest form is not produced when encoding 'ABC'.
>> However, this seems a little redundant as I can't imagine any Smalltalker
>> would go out of his/her way to make a UTF8 encoder that produces the
>> non-shortest of the letters ABC.
>>     
>
> If you wanted to attack a system, e.g. bypass certain word filters, it
> might well be the case [1]
>
>  [1] http://blogs.sun.com/xuemingshen/entry/the_big_overhaul_of_java
>   
But this attack is based on the idea that you would attempt to filter 
certain words -before- you've decoded the UTF8.
That's insane. Period. I acknowledge the idea that it'd be nice to 
protect our users from themselves.. hah. The post also mixes up illegal 
sequences with non-shortest form - which the spec goes to pains to 
differentiate in its verbiage.
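
To make the scenario concrete, here is a minimal Python sketch (Python only for illustration; none of this is Seaside or Opentalk code). The names `lenient_decode`, `BLOCKED`, `blocked_in_raw_bytes` and `blocked_after_decoding` are hypothetical. The decoder is an assumption about what a lenient decoder could look like: it accepts non-shortest forms, rejects malformed lead and continuation bytes, and does not check surrogates. Python's built-in strict decoder, by contrast, rejects these bytes outright.

```python
def lenient_decode(data: bytes) -> str:
    """Hypothetical decoder: accepts non-shortest forms, rejects malformed bytes."""
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:
            cp, n = lead, 1
        elif 0xC0 <= lead <= 0xDF:
            cp, n = lead & 0x1F, 2
        elif 0xE0 <= lead <= 0xEF:
            cp, n = lead & 0x0F, 3
        elif 0xF0 <= lead <= 0xF7:
            cp, n = lead & 0x07, 4
        else:
            raise ValueError(f"illegal lead byte 0x{lead:02X} at offset {i}")
        tail = data[i + 1:i + n]
        if len(tail) != n - 1 or any((t & 0xC0) != 0x80 for t in tail):
            raise ValueError(f"illegal continuation at offset {i}")
        for t in tail:
            cp = (cp << 6) | (t & 0x3F)
        out.append(chr(cp))
        i += n
    return "".join(out)


# 'ABC' with every letter written as a two-byte, non-shortest sequence.
OVERLONG_ABC = bytes([0xC1, 0x81, 0xC1, 0x82, 0xC1, 0x83])

BLOCKED = "ABC"  # hypothetical filtered word


def blocked_in_raw_bytes(data: bytes) -> bool:
    # Filtering -before- decoding: compares undecoded bytes.
    return BLOCKED.encode("ascii") in data


def blocked_after_decoding(data: bytes) -> bool:
    # Filtering -after- decoding: compares characters.
    return BLOCKED in lenient_decode(data)


if __name__ == "__main__":
    print(blocked_in_raw_bytes(OVERLONG_ABC))    # False: the byte-level filter is bypassed
    print(blocked_after_decoding(OVERLONG_ABC))  # True: the decoded text is caught
```

The byte-level filter only sees `C1 81 C1 82 C1 83`, so the word slips through; the same filter applied to the decoded string sees 'ABC'.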

Maybe Java has decided that users don't want to decode UTF8 and 
therefore it's a security risk, but I don't think that's necessarily the 
right thing for us to do in Smalltalk.

You won't get this kind of attack using Opentalk-HTTP ...unless you're 
using Seaside with a WANullCodec. It's therefore possible to get this 
attack with Seaside, but only if you're using WANullCodec - which, from 
what I gather, is what everybody is using. However, the intent is also 
to move off WANullCodec ...so crippling an otherwise correct 
UTF8 decoder to satisfy WANullCodec would be the wrong thing to do.

I'm all for rejecting the illegal sequences, but the spec is pretty 
specific about non-shortest forms being parsable... and since when did 
we start looking to Java for "the right thing to do" ? ;)
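
The distinction between illegal and non-shortest sequences can be sketched in Python as follows. The function `classify` is hypothetical, written only to illustrate the two categories. It assumes "illegal" means a bad lead byte, a wrong length or a bad continuation byte, and it ignores surrogates and the upper code point limit.

```python
def classify(seq: bytes) -> str:
    """Label one encoded sequence as 'shortest', 'non-shortest' or 'illegal'."""
    if not seq:
        return "illegal"
    lead = seq[0]
    if lead < 0x80:
        expected, cp = 1, lead
    elif 0xC0 <= lead <= 0xDF:
        expected, cp = 2, lead & 0x1F
    elif 0xE0 <= lead <= 0xEF:
        expected, cp = 3, lead & 0x0F
    elif 0xF0 <= lead <= 0xF7:
        expected, cp = 4, lead & 0x07
    else:
        return "illegal"  # stray continuation byte or invalid lead byte
    if len(seq) != expected:
        return "illegal"  # truncated or over-long input for this lead byte
    for b in seq[1:]:
        if (b & 0xC0) != 0x80:
            return "illegal"  # not a 10xxxxxx continuation byte
        cp = (cp << 6) | (b & 0x3F)
    # Smallest code point that actually needs this many bytes.
    minimum = (0x0, 0x80, 0x800, 0x10000)[expected - 1]
    return "shortest" if cp >= minimum else "non-shortest"


if __name__ == "__main__":
    print(classify(b"A"))                  # shortest
    print(classify(bytes([0xC1, 0x81])))   # non-shortest ('A' in two bytes)
    print(classify(bytes([0xC1, 0x41])))   # illegal (bad continuation byte)
```

A decoder following the policy argued for here would raise on "illegal" and decode both of the other categories, while an encoder would only ever emit "shortest".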

Michael

