Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree** -------------------------------- Process: (40) 56225: nil -------------------------------- 99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
That is a really eye-opening measurement. I occasionally do "TimeProfileBrowser onBlock: [MCMcmUpdater updateFromServer]" to try to see where the time is going, but I have a hard time making sense of the resulting profile.
Your measurement suggests that 1) #allFileNames is horribly inefficient (at least on a unix file directory) and 2) we are doing it far too often.
There is a cacheAllFileNamesDuring: mechanism that is presumably trying to minimize this but the cache is repeatedly discarded and refreshed, with a new repository scan each time.
Dave
On 2024-02-04 14:48, Nicolas Cellier wrote:
Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree**
Process: (40) 56225: nil
99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
Hi Dave, yes, in case of a bloated package-cache such as mine, some caching may mitigate problem 1). For problem 2), the source of those directory scans are these long chains of .mcd They invariably appear during the update process (maybe related to either a high frequency of .mcm, or high frequency of image updating) So another angle of attack would be to question the production of .mcd vs .mcz strategy...
Nicolas
Le lun. 5 févr. 2024 à 16:22, lewis@mail.msen.com a écrit :
That is a really eye-opening measurement. I occasionally do "TimeProfileBrowser onBlock: [MCMcmUpdater updateFromServer]" to try to see where the time is going, but I have a hard time making sense of the resulting profile.
Your measurement suggests that 1) #allFileNames is horribly inefficient (at least on a unix file directory) and 2) we are doing it far too often.
There is a cacheAllFileNamesDuring: mechanism that is presumably trying to minimize this but the cache is repeatedly discarded and refreshed, with a new repository scan each time.
Dave
On 2024-02-04 14:48, Nicolas Cellier wrote:
Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree**
Process: (40) 56225: nil
99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
Oh yes I see it now. Thank you, I am slow to see the obvious. To be honest I had never paid any attention to those .mcd files in my package-cache directories.
Dave
On 2024-02-05 22:31, Nicolas Cellier wrote:
Hi Dave, yes, in case of a bloated package-cache such as mine, some caching may mitigate problem 1). For problem 2), the source of those directory scans are these long chains of .mcd They invariably appear during the update process (maybe related to either a high frequency of .mcm, or high frequency of image updating) So another angle of attack would be to question the production of .mcd vs .mcz strategy...
Nicolas
Le lun. 5 févr. 2024 à 16:22, lewis@mail.msen.com a écrit :
That is a really eye-opening measurement. I occasionally do "TimeProfileBrowser onBlock: [MCMcmUpdater updateFromServer]" to try to see where the time is going, but I have a hard time making sense of the resulting profile.
Your measurement suggests that 1) #allFileNames is horribly inefficient (at least on a unix file directory) and 2) we are doing it far too often.
There is a cacheAllFileNamesDuring: mechanism that is presumably trying to minimize this but the cache is repeatedly discarded and refreshed, with a new repository scan each time.
Dave
On 2024-02-04 14:48, Nicolas Cellier wrote:
Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree**
Process: (40) 56225: nil
99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
Hi Nicolas,
How many files are in your package cache? (MCCacheRepository default allFileNames size)
Levente
On 2024. 02. 04. 15:48, Nicolas Cellier wrote:
Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree**
Process: (40) 56225: nil
99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
Hi Levente, as I said, it's quite bloated, I currently have about 2000 .mcz files. But I have already removed about 3000 .mcd. Without the .mcd, the upgrade process runs smoothly.
Nicolas
Le lun. 5 févr. 2024 à 23:04, leves leves@caesar.elte.hu a écrit :
Hi Nicolas,
How many files are in your package cache? (MCCacheRepository default allFileNames size)
Levente
On 2024. 02. 04. 15:48, Nicolas Cellier wrote:
Hi all, I have noticed that upgrading an image to be on par with trunk was becoming incredibly slow. Notably, there were very long pauses with information dialog telling 'snapshotting...'. So I have interrupted the process and watched thru debugger... That was due to long chains of .mcd: each element is involving a new directory scan of the bloated package-cache...
For example, a single MCWorkingCopy>>findSnapshotWithVersionInfo: System-ct.1444 leads to catastrophic MessageTally spyOn: [snapshot := MCPatcher apply: patch to: self baseSnapshot].
- 143610 tallies, 188827 msec.
**Tree**
Process: (40) 56225: nil
99.2% {187281ms} [] MorphicProject>>spawnNewProcess 99.2% {187281ms} PasteUpMorph>>doOneCycle ...snip... **Leaves** 80.6% {152119ms} UnixFileDirectory(FileDirectory)>>directoryContentsFor:do: 5.2% {9861ms} DirectoryEntry class>>fromArray:directory: 1.5% {2849ms} WriteStream>>nextPut:
Each MCCacheRepository(MCRepository)>>versionWithInfo: costs about 3 to 4 seconds! It's clear that I should better clean some cache, but I wonder about the overall .mcd/.mcz strategy. Network latency is clearly not dominating my own costs!
Nicolas
squeak-dev@lists.squeakfoundation.org