[squeak-dev] OSProcess streaming and early termination
Bert Freudenberg
bert at freudenbergs.de
Sat Dec 8 08:16:46 UTC 2012
On 08.12.2012, at 03:10, "David T. Lewis" <lewis at mail.msen.com> wrote:
> Hi Bert,
>
> On Fri, Dec 07, 2012 at 04:54:31PM +0100, Bert Freudenberg wrote:
>> Hi David, folks,
>>
>> I need to read output from an external command which is potentially too large to fit in memory. So I want to read from the pipe, and possibly have to terminate early.
>>
>> Here is what I have so far - "od" is an example only of course, but I need to be able to use arg arrays and a working dir and it illustrates the problem:
>>
>> | process1 process2 |
>> process1 := PipeableOSProcess new: '/usr/bin/od'
>>     arguments: {'-v'. '-t'. 'x1'. (Smalltalk imageName copyAfterLast: $/) asVmPathName}
>>     environment: nil descriptors: nil
>>     workingDir: Smalltalk imagePath asVmPathName
>>     errorPipelineStream: nil.
>> process2 := ExpressionEvaluator block: [:stdin | stdin next: 1000].
>> process2 pipeToInput: process1 pipeFromOutput.
>> process1 value.
>> process2 value.
>> process2 succeeded
>>     ifFalse: [process2 errorUpToEnd]
>>     ifTrue: [process2 output]
>>
>> This does get me the first 1000 bytes of the "od" output.
>>
>> However, this seems like more hoops than necessary to jump through - I have to set up the processes first, then pipe them, then execute them, only then can I access the output. Finding the right sequence required reading a lot of code and guessing. Is there a more convenient way? I tried "|" but it only wants a string argument, not an ExpressionEvaluator object.
>>
>> Secondly, even though the external process should be gone after reading 1000 chars, it appears that it is still running. Do I manually have to kill it? I tried #closePipes but that does an upToEnd which in this case is counterproductive because it churns through a Gigabyte of data.
>>
>
> I think this will work:
>
> cmd := 'od -v -t x1 ', (Smalltalk imageName copyAfterLast: $/) asVmPathName.
> pipeline := ProxyPipeline command: cmd.
> data := pipeline next: 1000.
> pipeline closePipes.
> data inspect
Okay, that looks a lot simpler. But with the string interface I have to worry about argument escaping. That's why I wanted to use the array interface, and avoid a shell. Constructing a sanitized string from user data is very hard, and is made unnecessary by using a non-interpreted interface.
Also, as I wrote, I need to be able to set the working directory (and potentially the environment). Your example only works by accident, because you launched Squeak from the image directory.
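To illustrate the escaping problem, here is a minimal sketch with a made-up file name (hypothetical, chosen only because it contains a space and a shell metacharacter). It just builds the two forms of the command and does not run anything; how ProxyPipeline's own parser would treat the string may differ from a POSIX shell:

```smalltalk
| name cmd args |
"Hypothetical file name, not taken from the example above."
name := 'my image; echo oops.image'.
"String interface: a POSIX shell would split this at the spaces and treat the semicolon as a command separator."
cmd := 'od -v -t x1 ', name.
"Array interface: the name stays a single argument and is never interpreted."
args := {'-v'. '-t'. 'x1'. name}.
```

With the array form no quoting is needed, however odd the name is.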
> Your example exposed a problem in BufferedAsyncFileReadStream which was
> reading all available data from a stream regardless of whether anybody was
> consuming the data, so eventually the system gets a low memory warning. I added
> a check to prevent this, so please update your OSProcess again from
> SqueakMap.
Ah, thanks. I thought that should have worked :)
> ProxyPipeline should have a better name. It made sense when I wrote it as a
> support class for CommandShell, but it turns out that nobody uses CommandShell
> and lots of people want to be able to evaluate a command line with some shell
> syntax support. So maybe it should be a CommandPipeline or a ShellCommandLine
> or something like that.
>
>
>> Thirdly, how do I find out about errors in the external process? E.g. if I misspell the command there is nothing in its stderr, it all seems to fail silently.
>>
>
> A PipeJunction has a pipeToInput, a pipeFromOutput, and an errorPipelineStream.
> A command pipeline works by connecting the pipeFromOutput (aka stdout) from one
> process proxy to the pipeToInput (aka stdin) of the next. The error output
> (aka stderr) of a proxy is accumulated in the shared errorPipelineStream.
> A ProxyPipeline behaves like a PipeJunction, with the stderr of all proxies
> accumulated in a shared errorPipelineStream.
>
> The stderr output of a command pipeline is in the errorPipelineStream, and is
> accessed with #errorUpToEnd or #errorUpToEndOfFile.
>
> Exit status of the external processes can be tested from the process proxies,
> and testing methods such as ProxyPipeline>>succeeded give overall status.
>
> Thus:
>
> cmd := 'foo -v -t x1 ', (Smalltalk imageName copyAfterLast: $/) asVmPathName.
> pipeline := ProxyPipeline command: cmd.
> pipeline succeeded ==> false
> pipeline first exitStatus ==> #fail
> pipeline errorUpToEndOfFile ==> 'sqsh: foo: command not found
> '
Makes sense. But what if I want to avoid a shell, how do I get a readable error?
- Bert -
> Dave
>
>> Or maybe I'm going about this in a completely wrong way? I could not find an example anywhere in OSProcess that would pipe command output into Smalltalk code.
>>
>> Help appreciated :)
>>
>> - Bert -
>>
>>
>