[squeak-dev] Performance regression

Jakob Reschke forums.jakob at resfarm.de
Mon Jun 29 06:24:14 UTC 2020


On Mon, 29 June 2020 at 01:50, Levente Uzonyi
<leves at caesar.elte.hu> wrote:
>
> Hi Jakob,
>
> On Sun, 28 Jun 2020, Jakob Reschke wrote:
> >
> > Still, may I suggest to optimize includesVersionNamed: for
> > MCDirectoryRepository? It currently reads all the directory entries
> > just to check afterwards whether one of them fits the version name.
> > Instead it could look up files for the sought version name directly:
> > there is a method allFileNamesForVersionNamed:.
>
> That sounds like a good idea, but it's not easy to do that. Why?
> You can't know for sure what file name to search for.
> E.g.: when you're looking for Collections-mt.896, your package cache may
> only contain Collections-mt.896(nice.895).mcd, but not
> Collections-mt.896.mcz.
> Figuring out the right file name without reading the contents of the
> directory is not easy.
> If the image transformed all .mcd to .mcz upon download, that wouldn't be
> a problem.
>

I thought the repository would already do the name transformations,
and that MCFileBasedRepository>>#allFileNamesForVersionNamed: would
use them, but it also fetches all the file names and then checks which
of them match the MCVersionName. Moreover,
MCFileBasedRepository>>#basicStoreVersion: does not dictate the
file name scheme, but asks the MCVersion for its fileName. That sounds
odd to me: why should a version know in which format it is stored?
Should that not be the business of the repository and the
readers/writers it uses? (It may still dispatch through the version to
distinguish diffy versions.)
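
To illustrate the matching step described above, here is a minimal
sketch in Python rather than Smalltalk. It is not the Monticello code:
the function names are hypothetical, and the file name pattern is
assumed from the single example in this thread
(Collections-mt.896(nice.895).mcd); the real MCVersionName rules may
differ.

```python
import re

# Hypothetical helper: reduce a file name to the version name it stands for.
# Assumes names look like "Package-author.N.mcz" or
# "Package-author.N(base.M).mcd"; this pattern is an assumption.
def version_name_of(file_name):
    stem = re.sub(r"\.(mcz|mcd)$", "", file_name)  # drop the extension
    return re.sub(r"\(.*\)$", "", stem)            # drop the diffy base, if any

def all_file_names_for_version_named(file_names, version_name):
    # Filter a complete listing: every name has to be inspected, because a
    # diffy file name cannot be derived from the version name alone.
    return [f for f in file_names if version_name_of(f) == version_name]
```

With a listing that contains only Collections-mt.896(nice.895).mcd,
a lookup for Collections-mt.896 still finds that file, which a direct
probe for Collections-mt.896.mcz would miss.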

Anyway...

> One case can still be improved though: when the file names are not cached,
> look up the .mcz file and if it's present, return true. See
> Monticello-ul.726 in the Inbox with that patch.

Sounds reasonable. I guess I should still fix my tests, but the patch
should cut the slowdown in the cases where a candidate version number
is already taken (trying new version names 1; 1, 2; 1, 2, 3; ...;
1, ..., n - 1; 1, ..., n).
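
The fast path can be sketched roughly as follows, again in Python and
not as the actual patch in Monticello-ul.726. The function name, the
list_names parameter and the ".mcd" fallback test are assumptions made
for illustration, and the sketch ignores the file name cache entirely.

```python
import os

def includes_version_named(directory, version_name, list_names=os.listdir):
    # Fast path: a plain .mcz file answers the question without a listing.
    if os.path.exists(os.path.join(directory, version_name + ".mcz")):
        return True
    # Slow path: the version may exist only as a diffy .mcd file whose name
    # cannot be guessed, so fall back to scanning the directory.
    prefix = version_name + "("
    return any(f.startswith(prefix) and f.endswith(".mcd")
               for f in list_names(directory))
```

Only a lookup that hits an existing .mcz file skips the directory
listing; a name that is not taken yet still falls through to the scan.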

>
> If possible, try to add a few #cacheAllFileNamesDuring: sends to the right
> places. That should improve performance in all cases. Remember to
> #flushAllFilenames when a new version is written while the argument of
> #cacheAllFileNamesDuring: is being evaluated.

I hope I'll be able to try that during the week.
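
For reference, the cache-and-flush pattern suggested above might look
like this. It is a sketch under assumptions, written in Python: the
class and its methods are hypothetical stand-ins that only mirror the
idea behind #cacheAllFileNamesDuring: and #flushAllFilenames, not
their real implementations.

```python
import os
from contextlib import contextmanager

class DirectoryRepository:
    """Hypothetical stand-in for a directory-backed repository."""

    def __init__(self, directory):
        self.directory = directory
        self._cached_names = None  # None: nothing cached, or cache flushed
        self._caching = False

    def all_file_names(self):
        if not self._caching:
            return os.listdir(self.directory)
        if self._cached_names is None:
            self._cached_names = os.listdir(self.directory)
        return self._cached_names

    @contextmanager
    def cache_all_file_names(self):
        # Listings are reused while the block runs; nested uses share one cache.
        outer = self._caching
        self._caching = True
        try:
            yield self
        finally:
            self._caching = outer
            if not outer:
                self._cached_names = None

    def flush_all_file_names(self):
        self._cached_names = None

    def store(self, file_name, data=b""):
        with open(os.path.join(self.directory, file_name), "wb") as f:
            f.write(data)
        self.flush_all_file_names()  # a write invalidates the cached listing
```

Without the flush in store, a lookup inside the caching block would
keep answering from the stale listing and miss the version that was
just written.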

