[Pkg] The Trunk: Compiler-ul.203.mcz

commits at source.squeak.org commits at source.squeak.org
Fri Apr 1 23:06:16 UTC 2011


Levente Uzonyi uploaded a new version of Compiler to project The Trunk:
http://source.squeak.org/trunk/Compiler-ul.203.mcz

==================== Summary ====================

Name: Compiler-ul.203
Author: ul
Time: 2 April 2011, 12:50:39.599 am
UUID: 9817f32d-943f-4745-81b8-0ecaeec259d4
Ancestors: Compiler-ul.202

Scanner changes:
- introduced #character type for characters instead of #number (part 2)
- updated comment
- added a new class variable DoItCharacter. All checks for doIt characters (30) should use this. (part 1)
- moved TypeTable initialization to a separate method
- instance creation uses the common #initialize method instead of #initScanner
- removed #initScanner and the class side #new
- use #== instead of #= for symbol and character comparison
- use #repeat instead of true and #whileTrue for loops
- use #and: instead of #&

=============== Diff against Compiler-ul.202 ===============

Item was changed:
  Object subclass: #Scanner
  	instanceVariableNames: 'source mark hereChar aheadChar token tokenType currentComment buffer typeTable'
+ 	classVariableNames: 'AllowUnderscoreAssignments AllowUnderscoreSelectors DoItCharacter TypeTable'
- 	classVariableNames: 'AllowUnderscoreAssignments AllowUnderscoreSelectors TypeTable'
  	poolDictionaries: ''
  	category: 'Compiler-Kernel'!
  
+ !Scanner commentStamp: 'ul 3/27/2011 23:24' prior: 0!
+ I scan a string or text, picking out Smalltalk syntactic tokens. I look one character ahead. I put each token found into the instance variable, token, and its type (a Symbol) into the variable, tokenType. At the end of the input stream, I pretend to see an endless sequence of special characters called doIts.
+ 
+ Instance Variables
+ 	aheadChar:		<Character>
+ 	buffer:		<WriteStream>
+ 	currentComment:		<OrderedCollection>
+ 	hereChar:		<Character>
+ 	mark:		<Integer>
+ 	source:		<ReadStream>
+ 	token:		<Symbol|String|NumberCharacter|Boolean|nil>
+ 	tokenType:		<Symbol>
+ 	typeTable:		<Array>
+ 
+ aheadChar
+ 	- the next character in the input stream
+ 
+ buffer
+ 	- a reusable WriteStream on a String which is used for building strings. Shouldn't be used from multiple methods without resetting.
+ 
+ currentComment
+ 	- an OrderedCollection of strings which contain all comments between the current token and the previous token or the beginning of the source.
+ 
+ hereChar
+ 	- the current character
+ 
+ mark
+ 	- the position of the current token in the source stream
+ 
+ source
+ 	- xxxxx
+ 
+ token
+ 	- the current token
+ 
+ tokenType
+ 	- the type of the current token. The possible token types are: #binary, #character, #colon, #doIt, #keyword, #leftArrow, #leftBrace, #leftBracket, #leftParenthesis, #literal, #period, #rightBrace, #rightBracket, #rightParenthesis, #semicolon, #string, #upArrow, #verticalBar, #word, #xBinary, #xColon, #xDelimiter, #xDigit, #xDollar, #xDoubleQuote, #xLetter, #xLitQuote, #xSingleQuote, #xUnderscore
+ 
+ typeTable
+ 	- an array that maps each an evaluable tokenType to each character with asciiValue between 0 and 255!
- !Scanner commentStamp: '<historical>' prior: 0!
- I scan a string or text, picking out Smalltalk syntactic tokens. I look one character ahead. I put each token found into the instance variable, token, and its type (a Symbol) into the variable, tokenType. At the end of the input stream, I pretend to see an endless sequence of special characters called doits.!

Item was changed:
  ----- Method: Scanner class>>initialize (in category 'initialization') -----
  initialize
+ 	
+ 	self initializeTypeTable.
+ 	DoItCharacter := Character value: 30!
- 	| newTable |
- 	newTable := Array new: 256 withAll: #xBinary. "default"
- 	newTable atAll: #(9 10 12 13 32 ) put: #xDelimiter. "tab lf ff cr space"
- 	newTable atAll: ($0 asciiValue to: $9 asciiValue) put: #xDigit.
- 
- 	1 to: 255
- 		do: [:index |
- 			(Character value: index) isLetter
- 				ifTrue: [newTable at: index put: #xLetter]].
- 
- 	newTable at: 30 put: #doIt.
- 	newTable at: $" asciiValue put: #xDoubleQuote.
- 	newTable at: $# asciiValue put: #xLitQuote.
- 	newTable at: $$ asciiValue put: #xDollar.
- 	newTable at: $' asciiValue put: #xSingleQuote.
- 	newTable at: $: asciiValue put: #xColon.
- 	newTable at: $( asciiValue put: #leftParenthesis.
- 	newTable at: $) asciiValue put: #rightParenthesis.
- 	newTable at: $. asciiValue put: #period.
- 	newTable at: $; asciiValue put: #semicolon.
- 	newTable at: $[ asciiValue put: #leftBracket.
- 	newTable at: $] asciiValue put: #rightBracket.
- 	newTable at: ${ asciiValue put: #leftBrace.
- 	newTable at: $} asciiValue put: #rightBrace.
- 	newTable at: $^ asciiValue put: #upArrow.
- 	newTable at: $_ asciiValue put: #xUnderscore.
- 	newTable at: $| asciiValue put: #verticalBar.
- 	TypeTable := newTable "bon voyage!!"
- 
- 	"Scanner initialize"!

Item was added:
+ ----- Method: Scanner class>>initializeTypeTable (in category 'initialization') -----
+ initializeTypeTable
+ 	"self initializeTypeTable"
+ 
+ 	| newTable |
+ 	newTable := Array new: 256 withAll: #xBinary. "default"
+ 	newTable atAll: #(9 10 12 13 32 ) put: #xDelimiter. "tab lf ff cr space"
+ 	newTable atAll: ($0 asciiValue to: $9 asciiValue) put: #xDigit.
+ 
+ 	1 to: 255
+ 		do: [:index |
+ 			(Character value: index) isLetter
+ 				ifTrue: [newTable at: index put: #xLetter]].
+ 
+ 	newTable at: 30 put: #doIt.
+ 	newTable at: $" asciiValue put: #xDoubleQuote.
+ 	newTable at: $# asciiValue put: #xLitQuote.
+ 	newTable at: $$ asciiValue put: #xDollar.
+ 	newTable at: $' asciiValue put: #xSingleQuote.
+ 	newTable at: $: asciiValue put: #xColon.
+ 	newTable at: $( asciiValue put: #leftParenthesis.
+ 	newTable at: $) asciiValue put: #rightParenthesis.
+ 	newTable at: $. asciiValue put: #period.
+ 	newTable at: $; asciiValue put: #semicolon.
+ 	newTable at: $[ asciiValue put: #leftBracket.
+ 	newTable at: $] asciiValue put: #rightBracket.
+ 	newTable at: ${ asciiValue put: #leftBrace.
+ 	newTable at: $} asciiValue put: #rightBrace.
+ 	newTable at: $^ asciiValue put: #upArrow.
+ 	newTable at: $_ asciiValue put: #xUnderscore.
+ 	newTable at: $| asciiValue put: #verticalBar.
+ 	TypeTable := newTable "bon voyage!!"!

Item was removed:
- ----- Method: Scanner class>>new (in category 'instance creation') -----
- new
- 
- 	^self basicNew initScanner!

Item was removed:
- ----- Method: Scanner>>initScanner (in category 'initialize-release') -----
- initScanner
- 
- 	buffer := WriteStream on: (String new: 40).
- 	typeTable := TypeTable!

Item was added:
+ ----- Method: Scanner>>initialize (in category 'initialize-release') -----
+ initialize
+ 
+ 	super initialize.
+ 	buffer := WriteStream on: (String new: 40).
+ 	typeTable := TypeTable!

Item was changed:
  ----- Method: Scanner>>scanAllTokenPositionsInto: (in category 'expression types') -----
  scanAllTokenPositionsInto: aBlock
  	"Evaluate aBlock with the start and end positions of all separate non-white-space tokens, including comments."
  
  	| lastMark |
  	lastMark := 1.
  	[currentComment ifNotNil:
  		[currentComment do:
  			[:cmnt| | idx |
  			 idx := source originalContents indexOfSubCollection: cmnt startingAt: lastMark.
  			 (idx > 0 and: [idx < mark]) ifTrue:
  				[aBlock value: idx - 1 value: (lastMark := idx + cmnt size)]].
  		 currentComment := nil].
  	mark ifNotNil:
  		[(token == #- 
  		  and: [(self typeTableAt: hereChar) == #xDigit]) ifTrue:
  			[| savedMark |
  			 savedMark := mark.
  			 self scanToken.
  			 token := token negated.
  			 mark := savedMark].
  		"Compensate for the fact that the parser uses two character lookahead.  Normally we must
  		  remove the extra two characters.  But this mustn't happen for the last token at the end of stream."
  		 aBlock
  			value: mark
  			value: (source atEnd
  					ifTrue: [tokenType := #doIt. "to cause an immediate ^self" source position]
  					ifFalse: [source position - 2])].
  	 (tokenType == #rightParenthesis
  	  or: [tokenType == #doIt]) ifTrue:
  		[^self].
  	tokenType == #leftParenthesis
  		ifTrue: 
  			[self scanToken; scanAllTokenPositionsInto: aBlock]
  		ifFalse: 
+ 			[(tokenType == #word or: [tokenType == #keyword or: [tokenType == #colon]])
- 			[(tokenType == #word or: [tokenType == #keyword or: [tokenType = #colon]])
  				ifTrue: 
  					[self scanLitWord.
  					 token == #true ifTrue: [token := true].
  					 token == #false ifTrue: [token := false].
  					 token == #nil ifTrue: [token := nil]]
  				ifFalse:
  					[(token == #- 
  					  and: [(self typeTableAt: hereChar) == #xDigit])
  						ifTrue: 
  							[self scanToken.
  							 token := token negated]]].
+ 		self scanToken ] repeat!
- 		self scanToken.
- 	true] whileTrue!

Item was changed:
  ----- Method: Scanner>>scanFieldNames: (in category 'public access') -----
  scanFieldNames: stringOrArray
  	"Answer an Array of Strings that are the identifiers in the input string, 
  	stringOrArray. If passed an Array, just answer with that Array, i.e., 
  	assume it has already been scanned."
  
  	| strm |
  	(stringOrArray isMemberOf: Array)
  		ifTrue: [^stringOrArray].
  	self scan: (ReadStream on: stringOrArray asString).
  	strm := WriteStream on: (Array new: 10).
+ 	[tokenType == #doIt]
- 	[tokenType = #doIt]
  		whileFalse: 
+ 			[tokenType == #word ifTrue: [strm nextPut: token].
- 			[tokenType = #word ifTrue: [strm nextPut: token].
  			self scanToken].
  	^strm contents
  
  	"Scanner new scanFieldNames: 'abc  def ghi' ('abc' 'def' 'ghi' )"!

Item was changed:
  ----- Method: Scanner>>scanMessageParts: (in category 'public access') -----
  scanMessageParts: sourceString
  	"Return an array of the form (comment keyword comment arg comment keyword comment arg comment) for the message pattern of this method.  Courtesy of Ted Kaehler, June 1999"
  
  	| coll nonKeywords |
  	coll := OrderedCollection new.
  	self scan: (ReadStream on: sourceString asString).
  	nonKeywords := 0.
+ 	[tokenType == #doIt] whileFalse:
- 	[tokenType = #doIt] whileFalse:
  		[(currentComment == nil or: [currentComment isEmpty])
  			ifTrue: [coll addLast: nil]
  			ifFalse: [coll addLast: currentComment removeFirst.
  				[currentComment isEmpty] whileFalse:
  					[coll at: coll size put: (coll last, ' ', currentComment removeFirst)]].
+ 		(token numArgs < 1 or: [token = #| and: [ coll size > 1 ] ])
- 		(token numArgs < 1 or: [(token = #|) & (coll size > 1)])
  			ifTrue: [(nonKeywords := nonKeywords + 1) > 1 ifTrue: [^ coll]]
  						"done with header"
  			ifFalse: [nonKeywords := 0].
  		coll addLast: token.
  		self scanToken].
  	(currentComment == nil or: [currentComment isEmpty])
  		ifTrue: [coll addLast: nil]
  		ifFalse: [coll addLast: currentComment removeFirst.
  			[currentComment isEmpty] whileFalse: [
  				coll at: coll size put: (coll last, ' ', currentComment removeFirst)]].
  	^ coll!

Item was changed:
  ----- Method: Scanner>>scanStringStruct (in category 'expression types') -----
  scanStringStruct
  
  	| s |
  	s := WriteStream on: (Array new: 16).
+ 	[tokenType == #rightParenthesis or: [tokenType == #doIt]]
- 	[tokenType = #rightParenthesis or: [tokenType = #doIt]]
  		whileFalse: 
+ 			[tokenType == #leftParenthesis
- 			[tokenType = #leftParenthesis
  				ifTrue: 
  					[self scanToken; scanStringStruct]
  				ifFalse: 
+ 					[tokenType == #word ifFalse:
- 					[tokenType = #word ifFalse:
  						[^self error: 'only words and parens allowed']].
  			s nextPut: token.
  			self scanToken].
  	token := s contents!

Item was changed:
  ----- Method: Scanner>>typedScanTokens: (in category 'public access') -----
  typedScanTokens: textOrString 
  	"Answer an Array that has been tokenized with literals mapped to literals,
  	 special characters mapped to symbols and variable names and keywords
  	 to strings. This methiod accepts _ (underscore) as an assignment token
  	 irrespective of whether the system prefers := as the assignment token."
  	| s |
  	self initScannerForTokenization.
  	self scan: (ReadStream on: textOrString asString).
  	s := WriteStream on: (Array new: 16).
+ 	[tokenType == #doIt] whileFalse:
- 	[tokenType = #doIt] whileFalse:
  		[(token == #- 
+ 		  and: [(self typeTableAt: hereChar) == #xDigit]) ifTrue: 
- 		  and: [(self typeTableAt: hereChar) = #xDigit]) ifTrue: 
  			[self scanToken.
  			 token := token negated].
  		s nextPut: token.
  		self scanToken].
  	^s contents
  
  	"Scanner new typedScanTokens: (Scanner sourceCodeAt: #typedScanTokens:)"!

Item was changed:
  ----- Method: Scanner>>xColon (in category 'multi-character scans') -----
  xColon
  	"Allow := for assignment"
  	
+ 	aheadChar == $= ifTrue:
- 	aheadChar = $= ifTrue:
  		[self step.
  		tokenType := #leftArrow.
  		self step.
  		^ token := #':='].
  	"Otherwise, just do what normal scan of colon would do"
  	tokenType := #colon.
  	^ token := self step asSymbol!

Item was changed:
  ----- Method: Scanner>>xDollar (in category 'multi-character scans') -----
  xDollar
  	"Form a Character literal."
  
  	self step. "pass over $"
  	token := self step.
+ 	tokenType := #character!
- 	tokenType := #number "really should be Char, but rest of compiler doesn't know"!



More information about the Packages mailing list