Hi David

Thank you so much for your time.

I will give it a go next week.

FWIW, the data flows from a foo.bz2 file -> Squeak (via your tools) -> Monty's XMLParser-Stax (aka SAX XML processing) -> Levante's Postgres database.


I am hoping that Squeak will be able to work directly on bz2 files (and other types, via your tools) and so leverage the existing underlying Linux/Mac/Dos tools to get the data out of these kinds of files.

Unfortunately, I know a bit less than diddly-squat about streams, asynchronous I/O, and processes, so I cannot code anything worthwhile there myself.

Again,

Thank you for your time and effort.

cordially,

t





---- On Thu, 24 Aug 2023 19:47:54 -0400 David T. Lewis <lewis@mail.msen.com> wrote ---

On Fri, Aug 11, 2023 at 10:48:19AM -0400, gettimothy via Squeak-dev wrote:
> Hi Folks.
>
> Continuing with my previous failed efforts to get the output stream from a linux process into squeak, I studied the UnixProcess testPipeLine.
>
> I have modified it under my own test method, as shown below
>
> catFromFileToSqueak
>

Hi tty,

I have not tried to debug the code that you shared, but I recall that you
have been trying to read large input streams into your Squeak image
with OSProcess.

After reading your earlier posts, I took a look at using OSProcess/CommandShell
for reading large input streams, and I found that it did not work properly for
large input streams. I have made some updates since then to correct the errors.
The main changes are these:


Name: OSProcess-dtl.137
Author: dtl
Time: 31 July 2023, 4:57:28.437465 pm
UUID: b02b83df-4436-4105-a7dc-c9fe9ff35dd1
Ancestors: OSProcess-dtl.136

OSProcess 4.7.4
Update BufferedAsyncFileStream to enable a PipeableOSProcess (from
companion package CommandShell) to stream over a very large input stream.
For a buffered input stream on an OS pipe, ResettableBufferStream will
periodically trim its internal buffer to remove elements previously read.

Name: CommandShell-dtl.112
Author: dtl
Time: 31 July 2023, 4:56:27.621385 pm
UUID: a78a0dc1-02b2-417a-9537-a19b8b86a010
Ancestors: CommandShell-dtl.111

CommandShell 4.7.13
Let PipeableOSProcess #upToEnd and #upToEndOfFile perform reading in discrete
chunks to prevent blocking a buffered pipe reader on a large input stream.


If you update to the latest versions of OSProcess and CommandShell, you should
be able to evaluate e.g. "PipeableOSProcess command: 'cat reallyBigFile.txt' "
and read from output with methods like #next: and #upToEndOfFile.

Here are some cautions for reading large input streams:

1) It is very slow compared to processing data directly with Unix tools.

2) If you do not read all of the output, the external process may not exit
and clean up normally, possibly leaving some Smalltalk processes running.
You may need to use a Process Browser to manually terminate left over Smalltalk
processes. These run in background to retrieve data from the external OSProcess.

3) Don't try to read too much stuff into your image. If you are reading something
too large to fit into your Squeak image, read it in chunks (e.g. with #next:).
If you do an #upToEndOfFile on a huge input stream, it's not going to work.
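
As a rough, untested sketch of caution 3, reading in chunks might look something
like the following. The chunk size of 4096, the use of #atEndOfFile as the loop
test, and the short delay when no data is available yet are all assumptions
made for illustration, so adjust them to whatever works in your image.

```smalltalk
"Sketch only: count the characters of a large file without holding it all in the image."
| proc chunk total |
proc := PipeableOSProcess command: 'cat reallyBigFile.txt'.
total := 0.
[proc atEndOfFile] whileFalse: [
	chunk := proc next: 4096.	"chunk size is an arbitrary assumption"
	chunk isEmptyOrNil
		ifTrue: [(Delay forMilliseconds: 10) wait]	"no data yet; avoid busy-waiting"
		ifFalse: [total := total + chunk size]].	"replace with real per-chunk work"
total
```

The loop reads all of the output, which also lets the external process exit
normally as described in caution 2.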

HTH,

Dave


>
>
>
>
> | input pipe2 output dest child desc result |
> input := OSProcess readOnlyFileNamed: '/home/wm/install.txt'. "returns a MultiByteFileStream: '/etc/hosts'"
> pipe2 := OSPipe nonBlockingPipe.
> output := pipe2 writer.
> dest := pipe2 reader.
> desc := Array
>     with: input
>     with: output
>     with: nil.
> child := UnixProcess
>     forkJob: '/bin/cat'
>     arguments: nil
>     environment: nil
>     descriptors: desc.
> input close.
> output close.
> result := dest upToEnd.
> result inspect.
> self break.
> dest close.
> child sigterm.
> ^ result
>
>
>
>
>
> Here is the oddity...
>
>
>
>
>
>
>
> UnixProcess catFromFileToSqueak
>
>
>
>
> When I invoke the method by "inspecting it" via CTL-I, I get an empty string returned. If I just Do-It and hit the break, the result inspect is also an empty string.
>
> BUT! When I inspect dest, then in its workbox type self upToEnd (inspect it), I get the contents.
>
>
>
>
>
> The overarching purpose of my efforts is to be able to stream the Unix output of tar, bzcat, unzip, etc. directly into Squeak using the OSProcess tools, and from there directly into the monty (?) SAX/XML tools (excellent tools, btw, thank you for your work).
>
> ios := (StandardFileStream readOnlyFileNamed: (path, filename)). "<-- HERE I want to replace the StandardFileStream with something from OSProcess."
>
> ios inspect.
>
> [(WikiMediaSaxToPostrgresHandler on: ios) debug: true; pingevery: 100000; optimizeForLargeDocuments; parseDocument] forkAt: Processor userBackgroundPriority named: 'SaxToDB'
>
>
>
>
>
> cordially and thanks in advance
>