[Newbies] Re: Splitting Excel csv into lines

Zulq Alam me at zulq.net
Sat Dec 27 17:04:26 UTC 2008


Hi Stephan,

It's a bit hard to understand, especially without tests. I would 
probably split it into a few methods so that each method tells the 
reader a manageable chunk of what is going on. I've sketched a few other 
things as well to give you some ideas:

splitIntoLines: aString
   | in |
   in := aString readStream.
   ^ Array streamContents: [:out |
     [in atEnd] whileFalse: [
       out nextPut: (self readLine: in)
     ]
   ]


readLine: aStream
   inQuote := false. "instance variable "
   " Use a stream to build the string instead of #, "
   ^ String streamContents: [:out |
     " Give clear indication of what an end of line is; also stop at
     the end of the input, so a missing final CR can't loop forever "
     [aStream atEnd or: [aStream next = Character cr and: [inQuote not]]]
       whileFalse: [self readNext: aStream onto: out].
     " Use stream methods where you can"
     aStream peekFor: Character lf]
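
On the stream comment above: temp := temp, char asString copies the 
whole string for every character, while a stream appends to a growing 
buffer. A workspace snippet (just an illustration, not taken from your 
code) shows that both build the same string:

   | viaComma viaStream |
   viaComma := ''.
   " #, answers a new, longer copy each time round "
   'abc' do: [:c | viaComma := viaComma, c asString].
   " the stream collects the characters and answers one string at the end "
   viaStream := String streamContents: [:out |
     'abc' do: [:c | out nextPut: c]].
   viaComma = viaStream " true "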


readNext: inStream onto: outStream
   | char |
   char := inStream last. " the loop calls next for us "
   " This used to be ifTrue: [ ifTrue: [ with no ifFalse: "
   (char = $" and: [inQuote]) ifTrue:[
     " Common tasks given to helper methods "
     char := self lookFor: Character tab in: inStream.
     " A CR here ends the quoted cell; leave it for readLine: to see "
     inStream peek = Character cr
       ifTrue: [inQuote := false. ^ self].
   ].
   char = (Character tab) ifTrue:[
     inQuote ifTrue: [inQuote := false]
       ifFalse: [
         self lookFor: $" in: inStream
           ifFound: [inQuote := true].
       ]
   ].
   " BTW, not sure this was right - don't you want CRs
   that aren't part of EOLs? "
   char = (Character cr) ifFalse: [outStream nextPut: char]


lookFor: aCharacter in: aStream
   ^ self lookFor: aCharacter in: aStream ifFound: []


lookFor: aCharacter in: aStream ifFound: aBlock
   aStream peek = aCharacter ifTrue: [
     aBlock value.
     ^ aStream next.
   ].
   ^ aStream last
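

As for tests, something along these lines would pin the behaviour down. 
It is only a sketch: ExcelLineSplitter is a made-up name for whatever 
class holds splitIntoLines:, I'm assuming the two methods below go in a 
TestCase subclass, and the expected results are my guess at what you want:

testPlainLines
   | tab cr lines |
   tab := String with: Character tab.
   cr := String with: Character cr.
   " ExcelLineSplitter is a hypothetical class name "
   lines := ExcelLineSplitter new
     splitIntoLines: 'a', tab, 'b', cr, 'c', tab, 'd', cr.
   self assert: lines size = 2.
   self assert: lines first = ('a', tab, 'b').
   self assert: lines last = ('c', tab, 'd')


testSoftEnterStaysInOneLine
   | tab cr lines |
   tab := String with: Character tab.
   cr := String with: Character cr.
   " A quoted cell with a soft enter (CR) inside must not start a new line "
   lines := ExcelLineSplitter new
     splitIntoLines: 'x', tab, '"one', cr, 'two"', tab, 'y', cr.
   self assert: lines size = 1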


Hope this helps,

Zulq.

stephan at stack.nl wrote:
> String lines doesn't handle separating Excel copy-and-paste very well,
> because soft enters in a cell get split into separate lines.
> 
> I've now done:
> 
> splitIntoLines: aString
>     "Return a collection with the string-lines of the receiver."
> 
>     | input char temp inQuote|
>     input := aString readStream.
>     ^ Array streamContents: [ :output |
>         temp := ''.
>         inQuote := false.
>         [ input atEnd ] whileFalse: [
>             char := input next.
>             char = $" ifTrue: [
>                 inQuote ifTrue:[
>                     input peek = (Character tab) ifTrue: [
>                         char := input next.].
>                     input peek = (Character cr) ifTrue: [
>                         char := input next.
>                         inQuote := false]]].
>             char = (Character tab) ifTrue:[
>                 inQuote ifTrue: [ inQuote := false]
>                 ifFalse: [
>                     input peek= $" ifTrue: [
>                         input next.
>                         inQuote:=true]]].
>             char = (Character cr)
>                 ifFalse: [temp := temp, char asString]
>                 ifTrue: [
>                     inQuote ifFalse: [
>                         output nextPut: temp.
>                         temp:=''.
>                         input peek = Character lf ifTrue: [input next]]]]]
> 
> I would be interested in (speed & elegance & mistakes) improvements to it.
> 
> Stephan


