[squeak-dev] Strange Char code 160

Tobias Pape Das.Linux at gmx.de
Wed Sep 12 07:27:41 UTC 2018


Hi Ron

> On 11.09.2018, at 21:24, Ron Teitelbaum <ron at usmedrec.com> wrote:
> 
> Hi Tobias,
> 
> Sorry, User error!  I surrounded it with single quotes and it started working again!  I'd pasted it and hit inspect to see what it was.  Silly mistake.
> 
> Ok so now this is what actually blew up on me.
> 
> (160 asCharacter asString, 'abc') withBlanksTrimmed  ' abc'
> 
> Shouldn't nbsp be considered a blank character?

Sounds reasonable.
Currently, separators (aka spaces) are defined as follows:

Character class>>separators
	"Answer a collection of the standard ASCII separator characters."

	^ #(32 "space"
		13 "cr"
		9 "tab"
		10 "line feed"
		12 "form feed")
		collect: [:v | Character value: v] as: String

I.e., confined to ASCII. Maybe we should consider using the Unicode Zs category (space separators, which includes nbsp) instead…
That being said, our Unicode stuff is a bit broken… Let me see.
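
Until something like that lands, a workaround could look roughly like the sketch below. It is only an illustration, not a proposed change to the image: the temporaries (blanks, input, first, last) are made-up names, and it assumes Character class>>separators answers a String as in the definition above.

```smalltalk
| blanks input first last |
"Hypothetical workaround: treat nbsp (160) as a blank in addition to the ASCII separators."
blanks := Character separators , (String with: (Character value: 160)).
input := (String with: (Character value: 160)) , 'abc'.
"Index of the first and last character that is not a blank; findFirst: answers 0 if none matches."
first := input findFirst: [:c | (blanks includes: c) not].
last := input findLast: [:c | (blanks includes: c) not].
first = 0
	ifTrue: ['']
	ifFalse: [input copyFrom: first to: last]
```

Evaluated with "print it", this should drop the leading nbsp that withBlanksTrimmed currently keeps. It deliberately leaves the other Zs characters alone.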
Best
	-Tobias

> 
> All the best,
> 
> Ron
> 
> On Tue, Sep 11, 2018 at 2:13 PM Tobias Pape <Das.Linux at gmx.de> wrote:
> Hi Ron
> 
> 
>> On 11.09.2018, at 17:38, Ron Teitelbaum <ron at usmedrec.com> wrote:
>> 
>> Hi All,
>> 
>> I ran into this problem.  Has anyone seen this before?  
>> 
>> I was importing a file that contained some sort of char code 160.  http://www.adamkoch.com/2009/07/25/white-space-and-character-160/ says this is a non-breaking space.  
>> 
>> I pasted the character into Squeak. When I try to inspect it in Squeak, I get an "Illegal character" error.  
>> 
> 
> Can you get me the char before the 160?
> 
> I see that the ahead char is 30, which makes things very strange.
> 
> First, this does not look like Latin-1, where 160/0xa0 would be nbsp, because the ahead char is 30/0x1e, a control char, and not defined in Latin-1.
> It is in ascii (record separator RS), but there 160/0xa0 is not defined.
> Windows CP 1252 would have both, but I am a bit unsure as to whether you'd actually find a NBSP+RS combo just like that in smalltalk data…
> 
> That leaves, e.g., MacRoman (I _think_ stuff used to be coded in MacRoman in Squeak in the 90s), and
> there 30/0x1e is still RS (strange) but 160/0xA0 is Dagger (†), which is indeed illegal.
> 
> So could you give us a bit of the content around the char?
> 
> Best regards
> 	-Tobias
> 
> 
> PS: EBCDIC would make no sense at all…
> PPS: Interestingly, 160/0xA0 is actually defined as #xBinary…
> 
> 
> 
> 
>> Illegal character (char code 160 16r16rA0) ->
>> 
>> The 16r16r seems to be an error in the method and not a real number
>> 
>> xIllegal
>> 	"An illegal character was encountered"
>> 	self halt.
>> 	self notify: 'Illegal character (char code ' , hereChar charCode , ' 16r' , hereChar charCode hex , ')' at: mark
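
The doubled prefix is consistent with hex already answering a radix-prefixed string such as '16rA0' in this image, so the literal ' 16r' in front of it is redundant. A minimal sketch of building the message without the duplication follows; it is not the actual fix in the image, the temporary code is a made-up name, and it assumes Integer>>printStringBase: answers the digits without a radix prefix.

```smalltalk
| code |
"Hypothetical stand-in for hereChar charCode."
code := 160.
"printString gives the decimal digits; printStringBase: 16 gives the hex digits only,
so the '16r' prefix is written exactly once."
'Illegal character (char code ' , code printString , ' 16r' , (code printStringBase: 16) , ')'
```

The explicit printString also avoids relying on concatenating an Integer directly onto a String.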
>> 
>> But if I inspect Character nbsp I get a character 160 that seems to work fine.
>> 
>> Here is the stack.  I added a halt in xIllegal.
>> '11 September 2018 11:30:40.486 am
>> 
>> VM: Win32 - Smalltalk
>> Image: Squeak4.1 [latest update: #9957]
>> 
>> Parser(Object)>>halt
>> 	Receiver: a Parser
>> 	Arguments and temporary variables: 
>> 
>> 	Receiver''s instance variables: 
>> 		source: 	a ReadWriteStream
>> 		mark: 	22
>> 		hereChar: 	$ 
>> 		aheadChar: 	Character value: 30
>> 		token: 	nil
>> 		tokenType: 	#xIllegal
>> 		currentComment: 	nil
>> 		buffer: 	a WriteStream ''''
>> 		typeTable: 	#(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> 		here: 	nil
>> 		hereType: 	nil
>> 		hereMark: 	nil
>> 		hereEnd: 	nil
>> 		prevMark: 	nil
>> 		prevEnd: 	nil
>> 		encoder: 	{an EncoderForV3PlusClosures}
>> 		requestor: 	a SmalltalkEditor
>> 		parseNode: 	nil
>> 		failBlock: 	[closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> 		requestorOffset: 	0
>> 		tempsMark: 	nil
>> 		doitFlag: 	nil
>> 		properties: 	nil
>> 		category: 	nil
>> 
>> Parser(Scanner)>>xIllegal
>> 	Receiver: a Parser
>> 	Arguments and temporary variables: 
>> 
>> 	Receiver''s instance variables: 
>> 		source: 	a ReadWriteStream
>> 		mark: 	22
>> 		hereChar: 	$ 
>> 		aheadChar: 	Character value: 30
>> 		token: 	nil
>> 		tokenType: 	#xIllegal
>> 		currentComment: 	nil
>> 		buffer: 	a WriteStream ''''
>> 		typeTable: 	#(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> 		here: 	nil
>> 		hereType: 	nil
>> 		hereMark: 	nil
>> 		hereEnd: 	nil
>> 		prevMark: 	nil
>> 		prevEnd: 	nil
>> 		encoder: 	{an EncoderForV3PlusClosures}
>> 		requestor: 	a SmalltalkEditor
>> 		parseNode: 	nil
>> 		failBlock: 	[closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> 		requestorOffset: 	0
>> 		tempsMark: 	nil
>> 		doitFlag: 	nil
>> 		properties: 	nil
>> 		category: 	nil
>> 
>> Parser(Scanner)>>scanToken
>> 	Receiver: a Parser
>> 	Arguments and temporary variables: 
>> 
>> 	Receiver''s instance variables: 
>> 		source: 	a ReadWriteStream
>> 		mark: 	22
>> 		hereChar: 	$ 
>> 		aheadChar: 	Character value: 30
>> 		token: 	nil
>> 		tokenType: 	#xIllegal
>> 		currentComment: 	nil
>> 		buffer: 	a WriteStream ''''
>> 		typeTable: 	#(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> 		here: 	nil
>> 		hereType: 	nil
>> 		hereMark: 	nil
>> 		hereEnd: 	nil
>> 		prevMark: 	nil
>> 		prevEnd: 	nil
>> 		encoder: 	{an EncoderForV3PlusClosures}
>> 		requestor: 	a SmalltalkEditor
>> 		parseNode: 	nil
>> 		failBlock: 	[closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> 		requestorOffset: 	0
>> 		tempsMark: 	nil
>> 		doitFlag: 	nil
>> 		properties: 	nil
>> 		category: 	nil
>> 
>> Parser(Scanner)>>scan:
>> 	Receiver: a Parser
>> 	Arguments and temporary variables: 
>> 		inputStream: 	a ReadWriteStream
>> 
>> 	Receiver''s instance variables: 
>> 		source: 	a ReadWriteStream
>> 		mark: 	22
>> 		hereChar: 	$ 
>> 		aheadChar: 	Character value: 30
>> 		token: 	nil
>> 		tokenType: 	#xIllegal
>> 		currentComment: 	nil
>> 		buffer: 	a WriteStream ''''
>> 		typeTable: 	#(#xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xIllegal #xDelimiter #xDelimiter #xIllegal #xDelimiter #xDelimiter #xIllegal #xIllegal #xIllega...etc...
>> 		here: 	nil
>> 		hereType: 	nil
>> 		hereMark: 	nil
>> 		hereEnd: 	nil
>> 		prevMark: 	nil
>> 		prevEnd: 	nil
>> 		encoder: 	{an EncoderForV3PlusClosures}
>> 		requestor: 	a SmalltalkEditor
>> 		parseNode: 	nil
>> 		failBlock: 	[closure] in Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> 		requestorOffset: 	0
>> 		tempsMark: 	nil
>> 		doitFlag: 	nil
>> 		properties: 	nil
>> 		category: 	nil
>> 
>> 
>> --- The full stack ---
>> Parser(Object)>>halt
>> Parser(Scanner)>>xIllegal
>> Parser(Scanner)>>scanToken
>> Parser(Scanner)>>scan:
>>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>> Parser>>init:notifying:failBlock:
>> Parser>>parse:class:category:noPattern:context:notifying:ifFail:
>> Compiler>>translate:noPattern:ifFail:
>> Compiler>>evaluate:in:to:notifying:ifFail:logged:
>> [] in SmalltalkEditor(TextEditor)>>evaluateSelection
>> BlockClosure>>on:do:
>> SmalltalkEditor(TextEditor)>>evaluateSelection
>> [] in PluggableTextMorphPlus(PluggableTextMorph)>>inspectIt
>> ...etc...
>> 
>> And to top it off if I inspect hereChar on xIllegal in the debugger I get a char 160 that works fine!
>> 
>> I'm not sure how to determine exactly what the difference between the two characters is.  Any suggestions?
>> 
>> Thanks!
>> 
>> All the best,
>> 
>> Ron Teitelbaum 
>> 
> 
> <Bildschirmfoto 2018-09-11 um 20.13.03.PNG>
