[Newbies] Decomposing Binary Data by CR/LF
Lou at Keystone-Software.com
Mon Jul 24 20:41:49 UTC 2017
Windows files normally use CR/LF as line termination. Linux files normally use LF. Look at
#subStrings: and friends. You may want to change all CR/LF to LF, then all remaining CR to LF, and
then split the file at LFs. You could also look into the various stream classes.
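A rough workspace sketch of the normalize-then-split idea, assuming the file contents are already in a String (the sample text and variable names here are made up for illustration):

```smalltalk
| cr lf crlf raw normalized lines |
cr := String with: Character cr.
lf := String with: Character lf.
crlf := String with: Character cr with: Character lf.

"Hypothetical sample data mixing CR/LF, bare CR, and bare LF terminators."
raw := 'first', crlf, 'second', cr, 'third', lf, 'fourth'.

"Step 1: CR/LF -> LF.  Step 2: any remaining bare CR -> LF."
normalized := (raw copyReplaceAll: crlf with: lf)
	copyReplaceAll: cr with: lf.

"Step 3: split at LF. #subStrings: skips empty fields, so blank lines are dropped."
lines := normalized subStrings: lf.
```

Keep in mind that this treats a bare CR or a bare LF as a line end too, which may or may not be what you want for your data.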
There are lots of ways to do this and if you are just learning, it doesn't hurt to try a few of
them.
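As one of those other ways, here is a minimal stream-based sketch that splits strictly on the CR/LF pair and leaves any bare CR or LF inside a chunk alone. It uses #upToAll: on a ReadStream; the sample text and variable names are only illustrative:

```smalltalk
| crlf stream chunks |
crlf := String with: Character cr with: Character lf.

"Hypothetical sample data; in practice this would be the file contents."
stream := ReadStream on: 'obj one', crlf, 'obj two', crlf, 'obj three'.

chunks := OrderedCollection new.
"#upToAll: answers everything up to (not including) the next CR/LF and
 skips past it; if no CR/LF remains it answers the rest of the stream."
[stream atEnd] whileFalse: [chunks add: (stream upToAll: crlf)].
```

Unlike #subStrings:, this keeps empty chunks when two CR/LF pairs are adjacent.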
On Tue, 25 Jul 2017 06:00:25 +1200, John-Reed Maffeo <jrmaffeo at gmail.com> wrote:
>Is there an existing method that will tokenize/chunk(?) data from a file
>using CR/LF? The use case is to decompose a file into PDF objects defined
>as strings terminated by CR/LF. (if there is an existing
>framework/project available, I have not found it, just dead ends :-(
>I have been exploring in #String and #ByteString and this is all I have
>found that is close to what I need.
>"Finds first occurrence of #String"
>self findString: ( Character cr asString, Character lf asString).
>"Breaks at either token value"
>self findTokens: ( Character cr asString, Character lf asString)
>I have tried poking around in #MultiByteFileStream, but keep running into
>If there is no existing method, any suggestions how to write a new one? My
>naive approach is to scan for CR and then peek for LF keeping track of my
>pointers and using them to identify the CR/LF delimited substrings; or
>iterate through contents using #findString:
>latest update: #16549
>Current Change Set: PDFPlayground
>Image format 68021 (64 bit)
>Operating System Details
>Operating System: Windows 7 Professional (Build 7601 Service Pack 1)
>Registered Owner: T530
>SP major version: 1
>SP minor version: 0
>Suite mask: 100
>Product type: 1
Keystone Software Corp.