<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"><html><head><meta content="text/html;charset=UTF-8" http-equiv="Content-Type"></head><body ><div style="font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 10pt;"><div>Thx Levente.<br></div><div><br></div><div><br></div><div>Should I attempt to fix this? How should it be approached? <br></div><div><br></div><div>I have only a dim idea of what "read buffering" is (file access is slow, so fetch a lot of data at once and, at a certain threshold, asynchronously refill the buffer?).<br></div><div><br></div><div>Is there an existing Stream that implements it?<br></div><div><br></div><div>Should I take the guts of that and put it in FSReadStream? <br></div><div><br></div><div><br></div><div>Thank you for your time.</div><div><br></div><div><br></div><div>Below are the relevant sections of the Squeak and Pharo runs:<br></div><div><br></div><div>Squeak (~1 million elements in ~1 hour):<br></div><div><br></div><div><blockquote style="border: 1px solid rgb(204, 204, 204); padding: 7px; background-color: rgb(245, 245, 245);"><div>                                                                                                                                                                              98.2% {2363012ms} XMLWellFormedParserTokenizer>>nextPCDataToken<br></div><div>                                                                                                                                                                                |67.6% {1626652ms} XMLNestedStreamReader>>peek<br></div><div><b>                                                                                                                                                                                |  |67.6% {1626281ms} FSReadStream>>next</b><br></div><div>                                                                                                                                                                                |  |  
66.4% {1597152ms} primitives<br></div><div>                                                                                                                                                                                |  |  1.2% {28325ms} UTF8TextConverter>>nextFromStream:<br></div><div>                                                                                                                                                                                |26.5% {637549ms} XMLNestedStreamReader>>next<br></div><div>                                                                                                                                                                                |3.6% {87248ms} XMLWellFormedParserTokenizer>>nextGeneralEntityOrCharacterReferenceOnCharacterStream<br></div><div>                                                                                                                                                                                |  3.0% {72138ms} XMLWellFormedParserTokenizer>>nextGeneralEntityReferenceOnCharacterStream<br></div><div>                                                                                                                                                                                |    2.5% {59508ms} XMLWellFormedParserTokenizer>>nextEntityName<br></div><div>                                                                                                                                                                                |      1.8% {42464ms} XMLNestedStreamReader>>peek<br></div><div>                                                                                                                                                                                |        1.8% {42307ms} FSReadStream>>next<br></div><div>                                                                                                                                                                              
  |          1.7% {41744ms} primitives<br></div><div>                                                                                                                                                                              1.1% {27156ms} XMLWellFormedParserTokenizer>>nextContentMarkupToken<br></div></blockquote>pharo (ping: three hundred thirty-two million elements.  Time: 0:02:04:23.88027)</div><div><br></div><div>:<br><br><br><br><blockquote style="border: 1px solid rgb(204, 204, 204); padding: 7px; background-color: rgb(245, 245, 245);"><div>         98.4% {7344009ms} XMLWellFormedParserTokenizer(XMLParserTokenizer)>>nextContentToken<br></div><div>            88.3% {6593766ms} XMLWellFormedParserTokenizer>>nextPCDataToken<br></div><div>              |33.6% {2507147ms} XMLNestedStreamReader>>peek<br></div><div><b>              |  |30.9% {2305352ms} ZnCharacterReadStream(ZnEncodedReadStream)>>next</b><br></div><div>              |  |  |27.8% {2072836ms} ZnCharacterReadStream>>nextElement<br></div><div>              |  |  |  |25.8% {1929523ms} ZnUTF8Encoder(ZnCharacterEncoder)>>nextFromStream:<br></div><div>              |  |  |  |  |22.9% {1706511ms} ZnUTF8Encoder>>nextCodePointFromStream:<br></div><div>              |  |  |  |  |  |21.6% {1610840ms} ZnBufferedReadStream>>next<br></div><div>              |  |  |  |  |  |  |21.6% {1610834ms} primitives<br></div><div>              |  |  |  |  |  |1.2% {86079ms} primitives<br></div><div>              |  |  |  |  |3.0% {223012ms} primitives<br></div><div>              |  |  |  |1.0% {77498ms} primitives<br></div><div>              |  |  |1.6% {122458ms} primitives<br></div><div>              |  |  |1.5% {110057ms} ZnBufferedReadStream>>atEnd<br></div><div>              |  |  |  1.5% {110055ms} primitives<br></div><div>              |  |2.2% {163569ms} ZnCharacterReadStream(ZnEncodedReadStream)>>atEnd<br></div><div>              |  |  1.2% {92568ms} primitives<br></div><div>              |16.2% {1207262ms} 
XMLNestedStreamReader>>next<br></div><div>              |12.7% {949179ms} WriteStream>>nextPut:<br></div><div>              |  |12.7% {949177ms} primitives<br></div><div>              |8.4% {627634ms} Character>>isXMLChar<br></div><div>              |6.6% {489916ms} XMLWellFormedParserTokenizer>>nextGeneralEntityOrCharacterReferenceOnCharacterStream<br></div><div>              |  |6.1% {452072ms} XMLWellFormedParserTokenizer>>nextGeneralEntityReferenceOnCharacterStream<br></div><div>              |  |  2.9% {216938ms} Dictionary>>at:ifPresent:<br></div><div>              |  |    |1.3% {96858ms} Dictionary(HashedCollection)>>findElementOrNil:<br></div><div>              |  |    |  |1.3% {95924ms} Dictionary>>scanFor:<br></div><div>              |  |    |1.2% {90639ms} BlockClosure>>cull:<br></div><div>              |  |  2.6% {190530ms} XMLWellFormedParserTokenizer>>nextEntityName<br></div><div>              |  |    1.1% {79370ms} XMLNestedStreamReader>>peek<br></div><div>              |6.2% {462153ms} WriteStream>>contents<br></div><div>              |  |5.9% {441631ms} WideString>>copyFrom:to:<br></div><div>              |  |  3.8% {285489ms} WideString(String)>>isOctetString<br></div><div>              |  |  1.9% {144904ms} WideString(String)>>asOctetString<br></div><div>              |  |    1.9% {139761ms} primitives<br></div><div>              |4.0% {301249ms} primitives<br></div><div>            9.3% {695725ms} XMLWellFormedParserTokenizer>>nextContentMarkupToken<br></div><div>              8.3% {621663ms} XMLWellFormedParserTokenizer>>nextTag<br></div><div>                2.6% {197358ms} XMLWellFormedParserTokenizer>>nextEndTag<br></div><div>                  |1.2% {91522ms} XMLNestedStreamReader>>next<br></div><div>                2.2% {160759ms} XMLWellFormedParserTokenizer>>nextElementName<br></div></blockquote></div><div><br></div><div class="zmail_extra_hr" style="border-top: 1px solid rgb(204, 204, 204); height: 0px; margin-top: 10px; margin-bottom: 
10px; line-height: 0px;"><br></div><div class="zmail_extra" data-zbluepencil-ignore="true"><br><div id="Zm-_Id_-Sgn1">---- On Thu, 21 Oct 2021 04:19:11 -0400 <b>Levente Uzonyi &lt;leves@caesar.elte.hu&gt;</b> wrote ----<br></div><br><blockquote style="margin: 0px;"><div>Hi Tim, <br> <br>On Wed, 20 Oct 2021, gettimothy via Squeak-dev wrote: <br> <br>> First, thanks to all for the advice. <br>> <br>> I parsed 1 Million elements, if you need more, let me know. <br>> It takes about 11 hours to parse 20 million elements (out of 300million+). <br> <br>Sounds really slow. <br> <br>> Levente: Regarding #timeProfile is my friend. <br>> <br>> I am not sure how to read this, but it may be that "peek" is the hog here. <br> <br>It looks as if FSReadStream does not implement read buffering, so <br>just reading the file without doing anything with its content takes ages. <br> <br> <br>Levente <br></div></blockquote></div><div><br></div></div><br></body></html>