[squeak-dev] Can I create a 75 Gb Image? (about those post-build stats)

gettimothy gettimothy at zoho.com
Thu Oct 21 17:42:44 UTC 2021


Hi Levente.



On Squeak, here are two versions:



#timeProfile is your friend.

Transcript clear.

[(DocDemoSaxHandler on: '/bulkstorage/enwiki-20200501-pages-articles-multistream.xml' asFileReference) pingevery: 100000; optimizeForLargeDocuments; parseDocument] timeProfile.



Transcript clear.

[(DocDemoSaxHandler on: '/bulkstorage/enwiki-20200501-pages-articles-multistream.xml' asFileReference) pingevery: 100000; optimizeForLargeDocuments; parseDocument] forkAt: Processor userSchedulingPriority named: 'SAX'




The method #asFileReference does not exist on String in Squeak6.0alpha, so I added it to String in the fs-core-converting category. It is just a copy of String>>asReference from that same category:

asFileReference

"Return an FSReference on disk"

^ FileSystem disk referenceTo: self




On Pharo, same thing but without the asFileReference hack.


Transcript clear.

[(DocDemoSaxHandler on: '/bulkstorage/enwiki-20200501-pages-articles-multistream.xml' asFileReference) pingevery: 100000; optimizeForLargeDocuments; parseDocument] forkAt: Processor userSchedulingPriority named: 'SAX'





Transcript clear.

[(DocDemoSaxHandler on: '/bulkstorage/enwiki-20200501-pages-articles-multistream.xml' asFileReference) pingevery: 100000; optimizeForLargeDocuments; parseDocument] timeProfile.



Thanks for your help.

t


Also, those "ping" messages are some plumbing I added to my DocDemoSaxHandler class.





---- On Thu, 21 Oct 2021 13:13:44 -0400 Levente Uzonyi <leves at caesar.elte.hu> wrote ----


Hi Tim, 
 
On Thu, 21 Oct 2021, gettimothy wrote: 
 
> Thx Levente. 
> 
> 
> Should I attempt to fix this? How should it be approached? 
> 
> I have only a dim idea what "read buffering" is (file access is slow, so get a lot of data, then at a certain threshold asynchronously refill the buffer?). 
> 
> Is there an existing Stream that implements it? 
> 
> Should I take the guts of that and put it in FSReadStream?  
 
What is the snippet you execute to parse the documents? 
 
(I loaded Monty's XML parser, and checking the code makes me think 
that you create an FSReadStream, not Monty's code). 
 
 
Levente