[squeak-dev] Faster directory enumeration?

Bernhard Pieber bernhard at pieber.com
Mon Oct 17 18:35:32 UTC 2016


Hi Dave,

Thanks for the answer. I guess I would need to build the latest version of the plugin myself, right? (I am on macOS Sierra.)

I could load DirectoryPlugin. However, VMConstruction-Plugins-DirectoryPlugin requires InterpreterPlugin to be available.

Bernhard

> On 17.10.2016 at 19:56, David T. Lewis <lewis at mail.msen.com> wrote:
> 
> It is probably far too bit-rotted to be of any use now, but here is what I
> came up with 15 years ago to improve this:
> 
>  http://wiki.squeak.org/squeak/2274
> 
> I did briefly look at this again a couple of years ago, and put the
> updates on SqueakSource. But I think I found that the directory primitives
> are nowhere near as big a win now as they were 15 years ago. Nevertheless
> it may still be of some interest.
> 
> Dave
> 
>> Dear Squeakers,
>> 
>> I want to count files with a certain extension in a folder recursively.
>> Here is the code I use:
>> 
>> | dir count runtime |
>> count := 0.
>> dir := FileDirectory on:
>> '/Users/bernhard/Library/Mail/V4/D77E3582-7EBE-4B5A-BFE0-E30BF6AE995F/Smalltalk.mbox/Squeak.mbox'.
>> runtime := Time millisecondsToRun: [
>> 	dir directoryTreeDo: [:each |
>> 		(each last name endsWith: '.emlx') ifTrue: [count := count + 1]]].
>> {count. runtime}. "=> #(289747 66109)"
>> 
>> As you can see, it finds 289,747 files and takes about 66 seconds. Is
>> there any faster way to do this given the current VM primitives?
>> 
>> The reason I ask is that the equivalent Python code takes between 1.5 and
>> 6 seconds. :-/
>> 
>> #!/usr/local/bin/python3
>> import os
>> import time
>> 
>> path = '/Users/bernhard/Library/Mail/V4/D77E3582-7EBE-4B5A-BFE0-E30BF6AE995F/Smalltalk.mbox/Squeak.mbox'
>> 
>> print(path)
>> 
>> start = time.time()
>> emlx = 0
>> for dirpath, dirnames, filenames in os.walk(path):
>>    for filename in filenames:
>>        if filename.endswith('.emlx'):
>>            emlx += 1
>> 
>> runtime = time.time() - start
>> 
>> print(emlx, runtime)
>> 
>> It seems to have to do with an optimized os.scandir() function, described
>> here: https://www.python.org/dev/peps/pep-0471/
>> 
>> Cheers,
>> Bernhard
>> 
>> 
>> 
> 
> 
> 
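For reference, the quoted message attributes Python's speed to os.scandir(). Below is a minimal sketch of the same count written directly against os.scandir() rather than os.walk(). The function name count_files_with_suffix is hypothetical, and using scandir as a context manager assumes Python 3.6 or later. Unlike the os.walk() version, it counts any non-directory entry whose name matches, so results could differ slightly if symlinks are present.

```python
import os


def count_files_with_suffix(root, suffix):
    """Count non-directory entries under root whose names end with suffix.

    Hypothetical helper for illustration; not part of the original thread.
    """
    count = 0
    stack = [root]  # directories still to visit (iterative, no recursion)
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    # is_dir() can usually answer from data returned by the
                    # directory listing itself, avoiding a stat call per file.
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.name.endswith(suffix):
                        count += 1
        except OSError:
            # Skip directories that are missing or unreadable.
            continue
    return count
```

With the path from the script above, this would be called as count_files_with_suffix(path, '.emlx').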


