[squeak-dev] Strange Char code 160
Tobias Pape
Das.Linux at gmx.de
Wed Sep 12 07:27:41 UTC 2018
Hi Ron
> On 11.09.2018, at 21:24, Ron Teitelbaum <ron at usmedrec.com> wrote:
>
> Hi Tobias,
>
> Sorry, User error! I surrounded it with single quotes and it started working again! I'd pasted it and hit inspect to see what it was. Silly mistake.
>
> Ok so now this is what actually blew up on me.
>
> (160 asCharacter asString, 'abc') withBlanksTrimmed  "answers ' abc'"
>
> Shouldn't nbsp be considered a blank character?
Sounds reasonable.
Currently, separators (aka spaces) are defined as follows:
    Character class>>separators
        "Answer a collection of the standard ASCII separator characters."
        ^ #(32 "space"
            13 "cr"
            9 "tab"
            10 "line feed"
            12 "form feed")
            collect: [:v | Character value: v] as: String
I.e., confined to ASCII. Maybe we should consider using the Unicode Zs category instead…
That being said, our Unicode stuff is a bit broken… Lemme see.
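In the meantime, trimming can treat nbsp as a blank by hand. A minimal workspace sketch, assuming only Character>>isSeparator, String>>copyFrom:to: and String class>>with:; the temporaries (nbsp, isBlank, first, last) are made up for this illustration and are not a proposed patch to withBlanksTrimmed:

```smalltalk
"Sketch only: treat nbsp (code 160) as a blank in addition to the ASCII separators.
 The temporaries below are local to this snippet, not existing Squeak API."
| nbsp isBlank input first last |
nbsp := Character value: 160.
isBlank := [:c | c isSeparator or: [c = nbsp]].
input := (String with: nbsp), 'abc'.
first := 1.
last := input size.
"Skip blanks from the left, then from the right."
[first <= last and: [isBlank value: (input at: first)]]
	whileTrue: [first := first + 1].
[last >= first and: [isBlank value: (input at: last)]]
	whileTrue: [last := last - 1].
"If the string is all blanks, first = last + 1 and the copy is empty."
input copyFrom: first to: last
```

Building the input with Character value: 160 rather than pasting the character keeps the snippet itself free of anything the Scanner might reject.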
Best
-Tobias
>
> All the best,
>
> Ron
>
> On Tue, Sep 11, 2018 at 2:13 PM Tobias Pape <Das.Linux at gmx.de> wrote:
> Hi Ron
>
>
>> On 11.09.2018, at 17:38, Ron Teitelbaum <ron at usmedrec.com> wrote:
>>
>> Hi All,
>>
>> I ran into this problem. Has anyone seen this before?
>>
>> I was importing a file that contained some sort of char code 160. http://www.adamkoch.com/2009/07/25/white-space-and-character-160/ says this is a non-breaking space.
>>
>> I pasted the character into Squeak. When trying to inspect it in Squeak, I get an illegal character error.
>>
>
> Can you get me the char before the 160?
>
> I see that the ahead char is 30, which makes things very strange.
>
> First, this does not look like Latin-1, where 160/0xa0 would be nbsp, because the ahead char is 30/0x1e, a control char that is not defined in Latin-1.
> It is defined in ASCII (record separator, RS), but there 160/0xa0 is not defined.
> Windows CP 1252 would have both, but I am a bit unsure as to whether you'd actually find a NBSP+RS combo just like that in smalltalk data…
>
> That leaves, e.g., MacRoman (I _think_ stuff used to be coded in MacRoman in Squeak in the 90s), and
> there 30/0x1e is still RS (strange) but 160/0xA0 is Dagger (†), which is indeed illegal.
>
> So could you give us a bit of content around the char?
>
> Best regards
> -Tobias
>
>
> PS: EBCDIC would make no sense at all…
> PPS: Interestingly, 160/0xA0 is actually defined as #xBinary…
>
>
>
>
>> Illegal character (char code 160 16r16rA0) ->
>>
>> The 16r16r seems to be an error in the method and not a real number
>>
>> xIllegal
>> "An illegal character was encountered"
>> self halt.
>> self notify: 'Illegal character (char code ' , hereChar charCode , ' 16r' , hereChar charCode hex , ')' at: mark
>>
>> But if I inspect Character nbsp I get a character 160 that seems to work fine.
>>
>> Here is the stack. I added a halt in xIllegal.
>> '11 September 2018 11:30:40.486 am
>>
>> VM: Win32 - Smalltalk
>> Image: Squeak4.1 [latest update: #9957]
>>
>> Parser(Object)>>halt
>> Receiver: a Parser
>> Arguments and temporary variables:
>>
>> Receiver''s instance variables:
>> source: a ReadWriteStream
>> mark: 22
>> hereChar: $
>> aheadChar: Character value: 30
>> token: nil
>> tokenType: #xIllegal
>> currentComment: nil
>> buffer: a WriteStream ''''
>> typeTable: #(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> here: nil
>> hereType: nil
>> hereMark: nil
>> hereEnd: nil
>> prevMark: nil
>> prevEnd: nil
>> encoder: {an EncoderForV3PlusClosures}
>> requestor: a SmalltalkEditor
>> parseNode: nil
>> failBlock: [closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> requestorOffset: 0
>> tempsMark: nil
>> doitFlag: nil
>> properties: nil
>> category: nil
>>
>> Parser(Scanner)>>xIllegal
>> Receiver: a Parser
>> Arguments and temporary variables:
>>
>> Receiver''s instance variables:
>> source: a ReadWriteStream
>> mark: 22
>> hereChar: $
>> aheadChar: Character value: 30
>> token: nil
>> tokenType: #xIllegal
>> currentComment: nil
>> buffer: a WriteStream ''''
>> typeTable: #(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> here: nil
>> hereType: nil
>> hereMark: nil
>> hereEnd: nil
>> prevMark: nil
>> prevEnd: nil
>> encoder: {an EncoderForV3PlusClosures}
>> requestor: a SmalltalkEditor
>> parseNode: nil
>> failBlock: [closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> requestorOffset: 0
>> tempsMark: nil
>> doitFlag: nil
>> properties: nil
>> category: nil
>>
>> Parser(Scanner)>>scanToken
>> Receiver: a Parser
>> Arguments and temporary variables:
>>
>> Receiver''s instance variables:
>> source: a ReadWriteStream
>> mark: 22
>> hereChar: $
>> aheadChar: Character value: 30
>> token: nil
>> tokenType: #xIllegal
>> currentComment: nil
>> buffer: a WriteStream ''''
>> typeTable: #(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> here: nil
>> hereType: nil
>> hereMark: nil
>> hereEnd: nil
>> prevMark: nil
>> prevEnd: nil
>> encoder: {an EncoderForV3PlusClosures}
>> requestor: a SmalltalkEditor
>> parseNode: nil
>> failBlock: [closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> requestorOffset: 0
>> tempsMark: nil
>> doitFlag: nil
>> properties: nil
>> category: nil
>>
>> Parser(Scanner)>>scan:
>> Receiver: a Parser
>> Arguments and temporary variables:
>> inputStream: a ReadWriteStream
>>
>> Receiver''s instance variables:
>> source: a ReadWriteStream
>> mark: 22
>> hereChar: $
>> aheadChar: Character value: 30
>> token: nil
>> tokenType: #xIllegal
>> currentComment: nil
>> buffer: a WriteStream ''''
>> typeTable: #(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> here: nil
>> hereType: nil
>> hereMark: nil
>> hereEnd: nil
>> prevMark: nil
>> prevEnd: nil
>> encoder: {an EncoderForV3PlusClosures}
>> requestor: a SmalltalkEditor
>> parseNode: nil
>> failBlock: [closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> requestorOffset: 0
>> tempsMark: nil
>> doitFlag: nil
>> properties: nil
>> category: nil
>>
>>
>> --- The full stack ---
>> Parser(Object)>>halt
>> Parser(Scanner)>>xIllegal
>> Parser(Scanner)>>scanToken
>> Parser(Scanner)>>scan:
>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> Parser>>init:notifying:failBlock:
>> Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> Compiler>>translate:noPattern:ifFail:
>> Compiler>>evaluate:in:to:notifying:ifFail:logged:
>> [] in SmalltalkEditor(TextEditor)>>evaluateSelection
>> BlockClosure>>on:do:
>> SmalltalkEditor(TextEditor)>>evaluateSelection
>> [] in PluggableTextMorphPlus(PluggableTextMorph)>>inspectIt
>> ...etc...
>>
>> And to top it off if I inspect hereChar on xIllegal in the debugger I get a char 160 that works fine!
>>
>> I'm not sure how to determine exactly what the difference between the two characters is. Any suggestions?
>>
>> Thanks!
>>
>> All the best,
>>
>> Ron Teitelbaum
>>
>
> <Bildschirmfoto 2018-09-11 um 20.13.03.PNG>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Bildschirmfoto 2018-09-11 um 20.13.03.PNG
Type: image/png
Size: 46225 bytes
Desc: not available
URL: <http://lists.squeakfoundation.org/pipermail/squeak-dev/attachments/20180912/f1b1863e/attachment-0001.png>