[Vm-dev] Git & MS Windows path length restriction

btc at openinworld.com btc at openinworld.com
Sun Mar 9 15:43:32 UTC 2014


I started looking into Pharo Case 13030 "Many tests failing in 
MetacelloValidation Job on Jenkins"
and even before getting into it, I've hit a stumbling block on Windows 7 
with its pathName length restriction of 259 characters.

I managed to isolate the problem as follows...
    MetacelloPlatform current
        downloadFile: 'https://github.com/dalehenrich/metacello-work/zipball/master'
        to: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip extractAllTo: 'C:\tmp\unzippedByPharo' asFileReference.
which produces the error "FileDoesNotExist: File @ 
C:\tmp\unzippedByPharo\dalehenrich-metacello-work-96e07b1\repository\Metacello-TestsMCB.package\MetacelloScriptingStandardTestHarness.class\instance\validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st"

where the #size of that filename is 354 characters. 

Trying to drill into github-dalehenrichmetacelloworkmaster.zip from 
Windows Explorer produces the error "The Compressed (zipped) file is invalid."
However, using 7-Zip [1] I can extract the archive so that _all_ files appear 
in the hierarchy. So it seems 259 is not a hard limit; rather, that limit 
is imposed by the Windows Shell, since NTFS can have a path length 
of ~32K [2].  Operation of 7-Zip was verified by:
 * From Pharo:     zip members size --> 6007
 * From cmd.exe:   dir /b /s > dir.txt
                   Open dir.txt in Notepad++ --> 6007 lines (after the dir.txt line is removed)
                   Also Windows Explorer <Properties> reports (5277 files + 730 folders) = 6007

[1] www.7-zip.org
[2] http://stackoverflow.com/questions/265769/maximum-filename-length-in-ntfs-windows-xp-and-windows-vista
  
Now, presuming it's reasonable for the working directory to take 30 
characters, that leaves 229 of the 259 for the member name. Using...
    histogram := (zip members collect: [ :member | | mySize |
        mySize := member fileName size.
        mySize >= 230 ifTrue: [
            Transcript crShow: mySize printString , '-->' , member fileName printString ].
        mySize ]) asBag.
produces...
306-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-TestsMCB.package/MetacelloScriptingStandardTestHarness.class/instance/validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st'
252-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/createBaseline.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
257-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyBaselineVersionIn.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
259-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyVersion.section.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
262-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/addSection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
265-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
278-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.sectionIndex.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
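
To put numbers on that, here is a small Playground sketch that works out 
by how much each of the sizes above would overflow, given the 259 limit 
and the 30-character working directory I assumed. The variable names are 
just mine for illustration, and it ignores the extra character for the 
path separator.

```smalltalk
| limit workingDirSize sizes overflows tooLongAnyway |
limit := 259.            "Windows path limit quoted earlier in this post"
workingDirSize := 30.    "assumed working directory length"
sizes := #(306 252 257 259 262 265 278).   "member name sizes reported above"

"How many characters each full path would exceed the limit by"
overflows := sizes collect: [ :each | each + workingDirSize - limit ].
sizes with: overflows do: [ :memberSize :over |
    Transcript crShow: memberSize printString , ' exceeds by ' , over printString ].

"Sizes that would not fit even with an empty working directory"
tooLongAnyway := sizes select: [ :each | each > limit ].
```

Four of the seven names are over the limit on their own, so shortening 
the working directory would not be enough.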
                       
So...
* Are there many users of Smalltalk/git on Windows?
* What is the design plan for dealing with long path names?
I guess the change to using the Unicode functions, to get the 32K 
pathName length, would need to happen in the VM.
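
For reference, the wide-character (Unicode) Win32 file functions accept 
paths of roughly 32K characters when an absolute path carries the \\?\ 
prefix. The snippet below only illustrates what such a prefixed path 
looks like from the image side. It is not VM code, the path is made up, 
and whether the VM's file primitives would pass such a path through 
unchanged is an assumption I have not checked.

```smalltalk
| path extended |
path := 'C:\tmp\unzippedByPharo\some\deeply\nested\file.st'.   "made-up absolute path"

"Prepend the extended-length prefix unless it is already there.
 Backslash is not an escape character in Smalltalk strings."
extended := (path beginsWith: '\\?\')
    ifTrue: [ path ]
    ifFalse: [ '\\?\' , path ].
Transcript crShow: extended.
```

Note that with this prefix Windows does no path normalisation, so the 
path has to be absolute and use backslashes only.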
  
I wonder if, instead of Smalltalk-git working with individual files in 
the file system, it might work directly from a zip file.  It seems 
git can be configured to accept zip files and unpack them before pushing 
into its repository [3].  Some benefits of this:
* Avoid this problem on Windows
* Accessing one stream when loading from a git repository might be 
faster than accessing many individual files
* Git sees one file per method, but users see one file per package.
* Git-zip-files might act more like familiar mcz files - they could be 
copied around the same way - or even provide some convergence if an mcz 
held a file per method rather than a single source.st file.

[3] http://tante.cc/2010/06/23/managing-zip-based-file-formats-in-git/
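
As a rough sketch of what "working directly from the zip" could look 
like, the following reads one method's source straight out of the archive 
downloaded earlier, without extracting anything. It assumes the zip is 
still at the path used above and that ZipArchiveMember understands 
#contents in your image; I am not suggesting this is how Smalltalk-git 
would actually be structured.

```smalltalk
| zip member source |
"Assumes the archive downloaded earlier in this post is still at this path"
zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.

"Pick the member with the longest name, the kind that breaks extraction"
member := zip members detectMax: [ :each | each fileName size ].

"Read its contents straight from the archive; nothing is written to disk"
source := member contents.
Transcript crShow: member fileName size printString , ' chars in name, ' ,
    source size printString , ' in contents'.
```

Since the member name never becomes a file system path, its length 
should not matter to Windows here.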

Interested in your thoughts.
cheers -ben

