[Vm-dev] Extending primitiveDirectoryEntry

Alistair Grant akgrant0710 at gmail.com
Sun Apr 23 16:56:29 UTC 2017


On Fri, Apr 21, 2017 at 11:36:52AM -0700, Eliot Miranda wrote:
> On Fri, Apr 21, 2017 at 10:37 AM, tim Rowledge <tim at rowledge.org> wrote:
>     > On 21-04-2017, at 10:17 AM, Alistair Grant <akgrant0710 at gmail.com> wrote:
>     >
>     >
>     > Am I missing anything?
> 
>     I think so; I urge you to consider working with Dave Lewis to see if it
>     might make sense to improve his DirectoryPlugin.
> 
> +1.

Sure.  I don't know anything about the DirectoryPlugin, and it looks 
like it isn't part of Pharo, but I'm happy to help.  It would be 
particularly good if David can help with the Windows side of things.

David?


> Further, primitive invocation is slow.  Try and provide a bulk primitive that
> answers multiple attributes, especially if the attributes are obtained from a
> single system call (as is the case with stat).

My current thoughts, which may change after David's input and any other 
feedback...

In Pharo the majority of calls are just to check file existence.  The 
remaining calls typically use only one element of all the information 
returned (in the base Pharo image, I can't speak for applications, of 
course).  That was why I proposed extending the primitive to indicate 
which piece of information to return.
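
To make the selector idea concrete, here is a rough sketch in Python 
(not Slang or Smalltalk).  The Attribute values and the file_attribute 
name are made up for illustration; they are not the encoding or 
interface of any existing primitive.

```python
import os
import stat
from enum import Enum


class Attribute(Enum):
    """Hypothetical selector values; not the actual primitive's encoding."""
    EXISTS = 0
    IS_DIRECTORY = 1
    SIZE = 2
    MODIFIED = 3


def file_attribute(path, which):
    """Return one attribute of path, using a single stat() call."""
    try:
        st = os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        # A missing entry is an answer for EXISTS, and "no value" otherwise.
        return False if which is Attribute.EXISTS else None
    if which is Attribute.EXISTS:
        return True
    if which is Attribute.IS_DIRECTORY:
        return stat.S_ISDIR(st.st_mode)
    if which is Attribute.SIZE:
        return st.st_size
    if which is Attribute.MODIFIED:
        return st.st_mtime
    raise ValueError(f"unknown attribute selector: {which!r}")
```

The existence check and the single-attribute lookups share one code 
path, so the common case costs one call and allocates nothing extra.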

Following on from Eliot's comments about returning multiple attributes: 
The current methods in Pharo test / return one piece of information at a 
time.  I don't think we can make assumptions about how long the 
attribute information is valid, so we can't cache the information.  That 
means we would create an attribute object that is returned to the caller 
and leave it up to them to manage how long to consider it valid.
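
A minimal sketch of such an attribute object, again in Python and with 
hypothetical names (FileAttributes and file_attributes are not Pharo 
classes or primitives).  It assumes all fields come from one stat() 
call and records when the snapshot was taken, so the caller can decide 
how long to trust it.

```python
import os
import stat
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class FileAttributes:
    """Hypothetical immutable snapshot of one stat() result."""
    path: str
    exists: bool
    is_directory: bool
    size: int
    modified: float
    fetched_at: float  # time the snapshot was taken; staleness is the caller's call


def file_attributes(path):
    """Gather several attributes with a single stat() call."""
    now = time.time()
    try:
        st = os.stat(path)
    except (FileNotFoundError, NotADirectoryError):
        # Missing entry: answer a snapshot that only says "does not exist".
        return FileAttributes(path, False, False, 0, 0.0, now)
    return FileAttributes(
        path=path,
        exists=True,
        is_directory=stat.S_ISDIR(st.st_mode),
        size=st.st_size,
        modified=st.st_mtime,
        fetched_at=now,
    )
```

Nothing here caches anything: each call goes back to the file system, 
and the snapshot is only as fresh as its fetched_at time.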


> Further, try and come up with cross-platform abstractions so that a single
> primitive can be used across platforms. One of the things that would require a
> lot of thought is harmonising Unix symbolic links, Mac OS X Aliases and Windows
> Aliases.

Hopefully the file attribute object I mentioned above could manage the 
cross-platform interpretation, leaving the primitives to just return the 
basic information.


>     http://www.squeaksource.com/DirectoryPlugin.html
>     http://wiki.squeak.org/squeak/2274
> 
>     More generally the file stuff is quite a convoluted mess. Any concerted
>     effort to clean it up, improve performance and error handling and even
>     (gasp!) document where it does well or poorly, would be welcomed.

While I was profiling this I noticed that 
DiskStore>>defaultWorkingDirectory is the most called method by a factor 
of about 3, as it is called every time a filename is resolved.

#defaultWorkingDirectory is relatively expensive as it calls a primitive
to get the image directory, converts the resulting ByteString to a byte 
array, calls ZnCharacterEncoder to decode it, converts the string to a 
path, and finally gets the parent.

As an example of how often it can be called: Using Iceberg to clone an
empty repository, add a small package and synchronise the repository
called #defaultWorkingDirectory around 500 times (+/- 50).  It is called so many
times because FileSystem>>resolve: leads to FileSystem>>resolvePath:
which resolves the supplied path against the working directory, and
#resolve: is called from many places.

Caching the result in memory using lazy initialisation gives a 
performance improvement of over 1,000 times in #defaultWorkingDirectory 
(assuming I tested it correctly; I'll supply more details if we look 
into this).  The tricky part is managing the cache during image Save As 
and quit / restart.
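
The shape of the cache is roughly as follows.  This is a Python sketch 
with made-up names, not the Pharo change itself; it assumes the 
expensive lookup can be passed in as a function and that something 
(e.g. a startup or Save As hook) calls reset when the location may have 
changed.

```python
import os


class WorkingDirectoryCache:
    """Hypothetical lazy cache; the names are illustrative, not Pharo API."""

    def __init__(self, compute=os.getcwd):
        self._compute = compute  # the expensive lookup
        self._value = None       # None means "not computed yet"
        self.lookups = 0         # how often the expensive lookup actually ran

    def get(self):
        """Answer the cached value, computing it on first use only."""
        if self._value is None:
            self._value = self._compute()
            self.lookups += 1
        return self._value

    def reset(self):
        """Forget the value; intended for startup and Save As hooks."""
        self._value = None
```

With this shape, 500 calls to get cost one real lookup; the hard part 
remains making sure reset is called at every point where the image 
location can change.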

Cheers,
Alistair



