<div dir="ltr"><div>When you say it would &quot;enable partial parsing of the ancestry info&quot; I didn&#39;t quite understand how you achieve that.  Is that what the new Scanner does?</div><div><br></div><div>If you want to put it in the Inbox I&#39;m sure it&#39;ll make more sense and we can evaluate both approaches.</div>

<div><br></div><div>I like what you seem to be saying about trying to trim it on _load_ so we can always just have a &quot;right-sized&quot; image.  It might be a pipe-dream for either implementation though, which is why, for now, I have it as part of the flush-all-caches operation.  So the sizes are &quot;large and fast&quot; or &quot;as small as possible,&quot; which fit two use-cases, development and deployment, respectively.</div>

<div><br></div><div>The only way to have something in-between those two is to flush-caches in your dev image and keep developing.  After flushing the ancestry of all of them at once, only the projects that are worked on enough to invoke their history will have it put back in the image.  So that&#39;s one way to &quot;right-size&quot; between the big and small.</div>

<div><br></div><div>Let me tell you the last step I&#39;d planned for my Proxy implementation.  The only ancestry access in MCAncestry is pretty much allAncestorsDo: type stuff that ends up traversing the whole tree.  We could instead have a Preference of some kind (pragma-based, of course), which defines the size of what should be considered &quot;recentHistory&quot;.  Like, something between 10 and 100.</div>

<div><br></div><div>MCWorkingCopy&gt;&gt;#stubAncestry would be updated to stub everything older than the preference setting.<br></div><div><br></div><div>Finally, all the operations which today are using allAncestorsDo: would change to use &quot;recentAncestorsDo:&quot; so that the Proxy would never be hit.  The preference could be adjusted to balance between development and deployment interests.</div>
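<div><br></div><div>To make that concrete, here is a rough sketch of the idea in Python rather than Smalltalk, just so it is easy to run.  Everything in it is a stand-in made up for illustration: VersionInfo and InfoProxy only model MCVersionInfo and MCInfoProxy, recent_ancestors_do and stub_ancestry mirror the proposed recentAncestorsDo: and the updated stubAncestry, and RECENT_HISTORY plays the role of the preference (set to 3 here to keep the example short).  It is not Monticello code.</div>
<pre>
```python
from dataclasses import dataclass, field

# Assumed stand-in for the preference; the real setting might be 10 to 100.
RECENT_HISTORY = 3


class ProxyHit(Exception):
    """Raised where the real proxy would re-retrieve from the repository."""


class InfoProxy:
    """Stand-in for MCInfoProxy: knows its name, nothing else."""

    def __init__(self, name):
        self.name = name

    def __getattr__(self, attribute):
        # Only reached for attributes other than `name`.
        raise ProxyHit(f"{self.name}.{attribute}: would re-retrieve")


@dataclass
class VersionInfo:
    """Stand-in for MCVersionInfo: a name plus its direct ancestors."""

    name: str
    ancestors: list = field(default_factory=list)

    def all_ancestors_do(self, visit):
        """Visit every ancestor, however old (models allAncestorsDo:)."""
        for ancestor in self.ancestors:
            visit(ancestor)
            ancestor.all_ancestors_do(visit)

    def recent_ancestors_do(self, visit, depth=RECENT_HISTORY):
        """Visit at most `depth` generations (hypothetical recentAncestorsDo:)."""
        if depth == 0:
            return
        for ancestor in self.ancestors:
            visit(ancestor)
            ancestor.recent_ancestors_do(visit, depth - 1)


def stub_ancestry(info, depth=RECENT_HISTORY):
    """Replace everything older than `depth` generations with proxies."""
    if isinstance(info, InfoProxy):
        return  # already stubbed, nothing older to trim
    if depth == 0:
        info.ancestors = [InfoProxy(a.name) for a in info.ancestors]
        return
    for ancestor in info.ancestors:
        stub_ancestry(ancestor, depth - 1)


def build_chain(newest, oldest):
    """Build a linear history and return the newest VersionInfo."""
    info = None
    for number in range(oldest, newest + 1):
        parents = [] if info is None else [info]
        info = VersionInfo(f"Pkg.{number}", parents)
    return info
```
</pre>
<div>With a linear history Pkg.20 through Pkg.27 stubbed at a depth of 3, the recent traversal only touches Pkg.26, Pkg.25 and Pkg.24 and never reaches the proxy, while the full traversal runs into the proxy, which is where the real thing would have to go back to the repository.</div>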

<div><br></div><div>Would this more complex &quot;sizing&quot; capability stop me from just doing a flush-all-caches before deployment, or make me care during development?  Probably not.  So that&#39;s why I wonder whether attempting this is useful...<br>

</div><div><br></div><div><br></div>On Wed, Aug 14, 2013 at 3:33 PM, Levente Uzonyi <span dir="ltr">&lt;<a href="mailto:leves@elte.hu" target="_blank">leves@elte.hu</a>&gt;</span> wrote:<br><div class="gmail_extra"><div class="gmail_quote">

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">I had a different idea to solve this issue:<br>

<br>

Unroll the ancestry tree to a list. Create a modified MCScanner, which can read a list of versions. A list item would look like the current tree nodes, but the ancestry and stepChildren lists would just be references to the actual ancestors/stepChildren in the list.<br>


This would enable partial parsing of the ancestry info. A reference could<br>

contain the position of the referenced list item in the version list, so we wouldn&#39;t have to parse the intermediate elements.<br>

<br>

For backwards compatibility this new version list would be stored in a separate file in the .mcz files. This way old versions of MC could still load the package, but newer versions with the new scanner could read them much faster.<br>


<br>

<br>

Levente<br>

<br>

On Wed, 14 Aug 2013, Chris Muller wrote:<br>

<br>

<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">

A little ditty to move toward sustainable ancestry.  After selecting<br>

&quot;flush cached versions&quot; from the menu, the ancestry-tree will now be<br>

like this:<br>

<br>

aMCVersionInfo.27<br>

    &#39;ancestry&#39; = anArray<br>

         1 = aMCVersionInfo.26<br>

              &#39;ancestry&#39; = anArray<br>

                   1 = aMCVersionInfo.25<br>

                        &#39;ancestry&#39; = anArray<br>

                             1 = aMCInfoProxy(trimmed  &#39;info&#39;,<br>

&#39;repository&#39; to re-retrieve it)<br>

<br>

Truncating the ancestry hierarchies this way recovers about 2.5MB of image size.<br>

<br>

Special notes:<br>

<br>

- It keeps the most-recent 10 and snips off the ancestry starting<br>

10-versions ago to replace it with a MCInfoProxy.  Most any operation<br>

that uses ancestry will cause the original full MCVersionInfo tree to<br>

need to be re-retrieved.<br>

<br>

- This assumes the Info of 10 versions ago exists in the same<br>

repository as the current version.  In practice, it normally will.<br>

<br>

- When a new version is saved after recovering the Info tree from<br>

ANOTHER FILE (e.g., the one 10 versions ago), the result is an<br>

ancestry tree built from multiple files.  But it&#39;s the same tree, so<br>

this should be no problem.<br>

<br>

<br>

</blockquote>

</blockquote></div><br></div></div>