Is there an existing method that will tokenize/chunk(?) data from a file
using CR/LF? The use case is to decompose a file into PDF objects, defined
as strings terminated by CR/LF. (If there is an existing framework/project
available, I have not found it, just dead ends :-(
I have been exploring #String and #ByteString, and this is all I have
found that is close to what I need:
"Finds the first occurrence of the given string"
self findString: ( Character cr asString, Character lf asString).
"Breaks at either delimiter character, not at the CR/LF pair"
self findTokens: ( Character cr asString, Character lf asString)
I have also tried poking around in #MultiByteFileStream, but I keep running
into errors.
If there is no existing method, any suggestions on how to write a new one?
My naive approach is to scan for CR and then peek for LF, keeping track of
my pointers and using them to identify the CR/LF-delimited substrings; or
to iterate through the contents using #findString:.
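To make the naive approach concrete, here is a workspace sketch of it; the selectors are standard String/SequenceableCollection protocol, but the variable names and sample input are just illustrative:

```smalltalk
"Scan for the next CR/LF pair, collect each delimited substring,
and keep anything left over after the last delimiter."
| crlf input parts start index |
crlf := String with: Character cr with: Character lf.
input := 'obj one', crlf, 'obj two', crlf, 'trailing'.
parts := OrderedCollection new.
start := 1.
[(index := input indexOfSubCollection: crlf startingAt: start) > 0]
    whileTrue: [
        parts add: (input copyFrom: start to: index - 1).
        start := index + crlf size].
start <= input size
    ifTrue: [parts add: (input copyFrom: start to: input size)].
parts  "an OrderedCollection('obj one' 'obj two' 'trailing')"
```

(A stream-based variant using #upToAll: on a ReadStream might do the same thing more cleanly, if that selector does what its method comment suggests, but I have not verified it against #MultiByteFileStream.)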
TIA, jrm
-----
Image
-----
C:\Smalltalk\Squeak5.1-16549-64bit-201608180858-Windows\Squeak5.1-16549-64bit-201608180858-Windows\Squeak5.1-16549-64bit.1.image
Squeak5.1
latest update: #16549
Current Change Set: PDFPlayground
Image format 68021 (64 bit)
Operating System Details
------------------------
Operating System: Windows 7 Professional (Build 7601 Service Pack 1)
Registered Owner: T530
Registered Company:
SP major version: 1
SP minor version: 0
Suite mask: 100
Product type: 1