[squeak-dev] WebClient, Json and CouchDB

Igor Stasenko siguctua at gmail.com
Thu May 13 01:20:08 UTC 2010


2010/5/13 Levente Uzonyi <leves at elte.hu>:
>>
>> Yes, your version of the method is nicer
>> escapeForCharacter: c
>>
>>        | index |
>>        ^ (index := c asciiValue + 1) <= escapeArray size
>>                ifTrue: [ ^ escapeArray at: index ]
>>
>>
>>                "THIS IS WROOONG!!! unicode is not 16bit wide!"
>>                ifFalse: [ ^ '\u', (((c asciiValue bitAnd: 16rFFFF)
>>                        printStringBase: 16) padded: #left to: 4 with: $0) ]
>>
>> However your comment leads me to the non-urgent question: How would we
>> deal with a code point > 65535?
>
> No one has to deal with those, since all characters that must be escaped fit
> into 16 bits (you can find the escaping rule in RFC 4627 if you're
> interested). So this implementation is wrong, because it tries to escape
> every character whose asciiValue is greater than 127 and will fail for values
> greater than 65535. This escaping is totally unnecessary; it just gives a
> (not so) nice slowdown.
>
> From RFC 4627:
> "
>   ... All Unicode characters may be placed within the
>   quotation marks except for the characters that must be escaped:
>   quotation mark, reverse solidus, and the control characters (U+0000
>   through U+001F).
>
>   Any character may be escaped. ...
> "
>
> So the best thing to do is: escape only $\, $" and the characters from 0 to 31.
>
So, how about just this:

escapeForCharacter: c
	"Answer the escapeArray entry when c falls inside the table; otherwise answer c itself, unescaped."
	| index |
	^ (index := c asciiValue + 1) <= escapeArray size
		ifTrue: [ escapeArray at: index ]
		ifFalse: [ c ]
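
To make the RFC rule quoted above concrete, here is a minimal workspace
sketch that does not depend on escapeArray at all. The block name escapeFor
is made up for illustration and is not part of the Json package;
padded:to:with: is the same String helper the older method already relies
on. It escapes only $", $\ and the control characters 0 to 31, and answers
every other character untouched, whatever its code point:

	| escapeFor |
	"escapeFor is a hypothetical name, used only for this sketch."
	escapeFor := [:c | | code |
		code := c asInteger.
		(c = $" or: [c = $\])
			ifTrue: [String with: $\ with: c]	"backslash followed by the character"
			ifFalse: [code < 32
				ifTrue: ['\u' , ((code printStringBase: 16)
					padded: #left to: 4 with: $0)]	"control character: \u plus 4 hex digits"
				ifFalse: [c]]].	"everything else passes through as is"

	escapeFor value: $".		"answers a two-character String: backslash, quote"
	escapeFor value: Character lf.	"answers a \u escape String"
	escapeFor value: $a.		"answers the Character $a itself"

Like the method above, the block answers either a String (the escape) or the
Character itself, so the caller has to cope with both.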



>
> Levente
>


-- 
Best regards,
Igor Stasenko AKA sig.


