[BUG] and suggested replacement for String>>findBetweenSubStrs: (corrected, with change set attached)

David T. Lewis lewis at mail.msen.com
Tue Jan 9 04:19:41 UTC 2001


Reposted to the list per request from the author. For convenience, I have
attached a change set containing the corrected methods.

Dave

----- Forwarded message from Jan Skibinski <jans at numeric-quest.com> -----

Date: Mon, 8 Jan 2001 13:54:51 -0500 (EST)
From: Jan Skibinski <jans at numeric-quest.com>
To: "David T. Lewis" <lewis at mail.msen.com>
Subject: Re: [BUG] and suggested replacement for String>>findBetweenSubStrs:
In-Reply-To: <20010108173636.B738 at conch.msen.com>


	David,
	Thanks for posting it. But, oops, my replacement was missing one
	condition in the second method. Sorry for that. Here is the
	corrected version.
	Jan
 	------------------------------------------------------	
 	separateByStrings: strings 
 		startingAt: start 
 		storingIn: aCollection
 
 	"Private recursive method.
 	Separate the receiver into tokens, which are delimited by a list
 	of delimiter 'strings'. Start collecting from a start position
 	within the receiver. Add resulting tokens to aCollection."
 	
 	| i ks matches position|
 	ks := strings collect:[:each| 
 		self findString: each startingAt: start].
 	matches := ks select: [:each| each > 0].
 	(matches notEmpty)
 	ifTrue:[
 		position := matches min.
 		i := ks indexOf: position.
>>>		(position > start) "Do not collect empty strings!!!" 
>>>		ifTrue:[
 			aCollection addLast: (self copyFrom: start 
 				to: (position -1))
>>>		].
  		^self separateByStrings: strings
  			startingAt: position + (strings at: i) size
  			storingIn: aCollection
  	]
  	ifFalse:[
  		aCollection addLast: (self copyFrom: start to: self size).
  		^aCollection
  	]

----- End forwarded message -----
-------------- next part --------------
'From Squeak2.9alpha of 13 June 2000 [latest update: #3125] on 8 January 2001 at 11:12:26 pm'!
"Change Set:		separatedByStrings
Date:			8 January 2001
Author:			Jan Skibinski <jans at numeric-quest.com>

Proposed replacement for String>>findBetweenSubStrs:
Transcribed from email to a Squeak change set by dtl.

	------------------------------------------------------------
	The problematic method is: String >> findBetweenSubStrs:
	It works most of the time, and I use it quite often for all
	sorts of parsing. However, there are cases where:
	a. The result is wrong
	b. Squeak gets into an infinite loop on my Linux box. I should
	   point out that this method is dependent on some primitives
	   and the bug might be system dependent. The version I use
	   is (2.8-4) and it has OSProcess installed as the only
	   extra pluggable module. However, since I was not able to
	   reproduce the crash, I will only deal here with the case a.
	
	Examples:
	'xxx :: a' findBetweenSubStrs: #('::')
		==> OrderedCollection('xxx ' ' a')
		This is OK.
	'xxx :: a' findBetweenSubStrs: #('::' '=>')
		Still OK.
	'xxx :: a' findBetweenSubStrs: #('::' ' => ')
		==> OrderedCollection('xxx ' ':a')
		Wrong!!
 
	The methods called from 'findBetweenSubStrs:' are quite lengthy
	and have quite elaborate logic. They also try to be overly
	generic, allowing both strings and characters as delimiters.
	 
	Instead of tracking down the source of errors I wrote
	a distinctly different implementation (two methods)
	that is short and easy to understand. It works, but I have not
	evaluated whether it is efficient enough. Judge for yourself whether
	it is a worthy replacement for the buggy 'findBetweenSubStrs:'."!


!String methodsFor: 'accessing' stamp: 'dtl 1/8/2001 23:02'!
separatedByStrings: strings
	"A collection of tokens obtained by breaking the receiver into substrings
	delimited by a given list of 'strings'"

	^self separateByStrings: strings
		startingAt: 1
		storingIn: OrderedCollection new! !

!String methodsFor: 'private' stamp: 'dtl 1/8/2001 23:03'!
separateByStrings: strings startingAt: start storingIn: aCollection
	"Private recursive method. Separate the receiver into tokens, which are
	delimited by a list of delimiter 'strings'. Start collecting from a start
	position within the receiver. Add resulting tokens to aCollection."

 	| i ks matches position|
 	ks := strings collect:[:each| 
 		self findString: each startingAt: start].
 	matches := ks select: [:each| each > 0].
 	(matches notEmpty)
 	ifTrue:[
 		position := matches min.
 		i := ks indexOf: position.
		(position > start) "Do not collect empty strings!!!!!!" 
		ifTrue:[
 			aCollection addLast: (self copyFrom: start 
 				to: (position -1))
		].
  		^self separateByStrings: strings
  			startingAt: position + (strings at: i) size
  			storingIn: aCollection
  	]
  	ifFalse:[
  		aCollection addLast: (self copyFrom: start to: self size).
  		^aCollection
  	]

! !
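
For anyone who wants to try the idea in a workspace without filing in the
change set, here is a minimal sketch of the same earliest-match scan written
as a loop inside a block. It is not part of Jan's change set. The names
split, tokens, delimiter and k are made up for illustration, and the only
String messages it relies on are findString:startingAt:, copyFrom:to: and
size, which the methods above also use. Like the recursive version, it skips
empty tokens between delimiters but always adds the remainder after the last
delimiter, and it assumes no delimiter is the empty string.

```smalltalk
"Workspace sketch (an assumption, not part of the change set).
Iterative form of the earliest-match scan; all names are hypothetical."
| split tokens start position delimiter k |
split := [:aString :delimiters |
	tokens := OrderedCollection new.
	start := 1.
	["Find the earliest delimiter match at or after start.
	  On a tie, the delimiter listed first wins."
	 position := 0.
	 delimiters do: [:each |
		k := aString findString: each startingAt: start.
		(k > 0 and: [position = 0 or: [k < position]])
			ifTrue: [position := k. delimiter := each]].
	 position > 0]
		whileTrue: [
			"Skip empty tokens, as the corrected method does."
			position > start
				ifTrue: [tokens addLast: (aString copyFrom: start to: position - 1)].
			start := position + delimiter size].
	"As in the ifFalse: branch above, the remainder is added unconditionally."
	tokens addLast: (aString copyFrom: start to: aString size).
	tokens].

split value: 'xxx :: a' value: #('::' ' => ').
	"==> OrderedCollection('xxx ' ' a')"
```

The loop trades the recursion for two temporaries (start and delimiter), so
the depth of the call stack no longer grows with the number of tokens. The
result for the last expression was traced by hand against the code.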

