[squeak-dev] Advice on using multiple processes on large jobs
gettimothy at zoho.com
Mon Nov 8 14:40:22 UTC 2021
I will be parsing a subset of a large XML file using the monty SAXParser, creating objects from the subsets and storing them in-image.
That is about a two-hour job.
On those objects, I will be running a different kind of parse, using a PEG grammar and the Xtreams-parsing package, to turn the wikitext they contain into XHTML. The output (or the XMLDocument that contains it) will be stored on the object itself.
"Tabulating" each failure to PEG-parse an object correctly is the goal here. I am "automating" the process of capturing bugs in my PEG grammar.
Now, here is the question:
Should I run the two tasks sequentially, or in parallel using separate processes?
The SAX process will probably run faster than the PEG process over the long run, so if I run them in parallel, I will put the SAX process at a slightly lower priority than the PEG process.
I am also considering using Announcements to announce that the SAX process has output a new object and that the PEG process should start on it.
I find this idea attractive. Another option is separate images with AMQP communication between them, but that is a bit more work.
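
To make the parallel option concrete, here is a minimal, language-neutral sketch in Python of the hand-off I have in mind: one stage produces objects, the other consumes them through a bounded queue and tabulates every parse failure. Everything in it is an assumption for illustration only: `sax_stage`, `peg_parse`, `peg_stage`, `run_pipeline` and the "unbalanced brackets" rule are hypothetical stand-ins, not the real SAXParser or PEG grammar, and Python threads have no priorities, so the priority idea is not shown.

```python
import queue
import threading

DONE = object()  # sentinel: the producer has no more objects


def sax_stage(records, out_q):
    """Hypothetical stand-in for the SAX pass: emit one object per record."""
    for rec in records:
        out_q.put(rec)  # blocks while the queue is full
    out_q.put(DONE)


def peg_parse(text):
    """Hypothetical stand-in for the PEG grammar: reject unbalanced '[['."""
    if text.count("[[") != text.count("]]"):
        raise ValueError("unbalanced link brackets")
    return "<p>" + text + "</p>"


def peg_stage(in_q, results, failures):
    """Consume objects as they arrive and tabulate every parse failure."""
    while True:
        item = in_q.get()
        if item is DONE:
            break
        title, wikitext = item
        try:
            results[title] = peg_parse(wikitext)
        except ValueError as err:
            failures[title] = str(err)


def run_pipeline(records):
    # Bounded queue: the faster producer waits instead of piling up objects.
    q = queue.Queue(maxsize=100)
    results, failures = {}, {}
    producer = threading.Thread(target=sax_stage, args=(records, q))
    consumer = threading.Thread(target=peg_stage, args=(q, results, failures))
    producer.start()
    consumer.start()
    producer.join()
    consumer.join()
    return results, failures
```

Calling `run_pipeline` with a list of `(title, wikitext)` pairs returns the converted output and a table of failures keyed by title. In Squeak the same shape could presumably be built from a `SharedQueue` between two processes started with `forkAt:`, with the bounded queue doing much of the throttling that the lower SAX priority is meant to achieve.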