[Newbies] Decomposing Binary Data by CR/LF

Mon Jul 24 18:00:25 UTC 2017

Is there an existing method that will tokenize (chunk?) data from a file
using CR/LF? The use case is to decompose a file into PDF objects, defined
as strings terminated by CR/LF. (If there is an existing
framework/project available, I have not found it, just dead ends :-( )

I have been exploring in #String and #ByteString and this is all I have
found that is close to what I need.

"Finds the first occurrence of the string"
self findString: (Character cr asString, Character lf asString).
"Breaks at either token value"
self findTokens: (Character cr asString, Character lf asString).

I have tried poking around in #MultiByteFileStream, but I keep running into
errors.

If there is no existing method, any suggestions on how to write a new one? My
naive approach is to scan for CR and then peek for LF keeping track of my
pointers and using them to identify the CR/LF delimited substrings; or
iterate through contents using #findString:
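
To make the first idea concrete, here is a minimal sketch of the scan-for-CR-then-peek-for-LF approach. It assumes the file contents have already been read into a String (with a ByteArray the comparisons would need the byte values 13 and 10 instead of Characters). The sample data and the variable names (data, in, piece, chunks) are made up for illustration, not taken from any existing framework.

```smalltalk
| crlf data in piece chunks ch |
crlf := String with: Character cr with: Character lf.
"Made-up sample standing in for the real file contents"
data := '1 0 obj', crlf, '<< /Type /Catalog >>', crlf, 'endobj', crlf.
in := ReadStream on: data.
chunks := OrderedCollection new.
piece := WriteStream on: String new.
[in atEnd] whileFalse: [
	ch := in next.
	"A CR only counts as a delimiter when an LF follows it immediately"
	(ch = Character cr and: [in peek = Character lf])
		ifTrue: [
			in next. "consume the LF"
			chunks add: piece contents.
			piece := WriteStream on: String new]
		ifFalse: [piece nextPut: ch]].
"Keep any trailing text that was not terminated by CR/LF"
piece contents isEmpty ifFalse: [chunks add: piece contents].
```

Unlike #findTokens:, which breaks at either character, this leaves a lone CR or a lone LF inside the chunk, which matters if the data between delimiters is binary.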

TIA, jrm

-----
Image
-----
C:\Smalltalk\Squeak5.1-16549-64bit-201608180858-Windows\Squeak5.1-16549-64bit-201608180858-Windows\Squeak5.1-16549-64bit.1.image
Squeak5.1
latest update: #16549
Current Change Set: PDFPlayground
Image format 68021 (64 bit)

Operating System Details
------------------------
Operating System: Windows 7 Professional (Build 7601 Service Pack 1)
Registered Owner: T530
Registered Company:
SP major version: 1
SP minor version: 0
Suite mask: 100
Product type: 1