On UUID's and MC file names

John M McIntosh johnmci at smalltalkconsulting.com
Fri Jun 22 05:00:05 UTC 2007


Ok, the UUID logic (for which all the source code exists, I might add)
clearly shows what happens.

It generates a version 4 (random) UUID IF the UUID primitive is not
implemented on the host platform.  That path uses
Squeak's Random number generator, with hopefully a different
start value each time it's used (cross fingers).
I recall some early MC users on Unix were burned when the original
logic restarted with the same start seed after an image crash.
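
As a rough illustration of that fallback path, here is a minimal workspace sketch of building a version 4 UUID from Random. It is not the actual UUIDGenerator source; the variable names are made up for the example, and the byte positions follow the RFC 4122 layout, which the image's own code may or may not match exactly.

```smalltalk
| rng bytes |
"Assumption: Random new picks a different seed each time it is created."
rng := Random new.
bytes := ByteArray new: 16.
"nextInt: answers an integer in 1..256, so subtract 1 to get a byte value."
1 to: 16 do: [:i | bytes at: i put: (rng nextInt: 256) - 1].
"Version 4: the high nibble of the 7th byte is set to 4."
bytes at: 7 put: (((bytes at: 7) bitAnd: 16r0F) bitOr: 16r40).
"RFC 4122 variant: the top two bits of the 9th byte are set to 10."
bytes at: 9 put: (((bytes at: 9) bitAnd: 16r3F) bitOr: 16r80).
bytes
```

If the generator restarts from the same seed, the 122 random bits repeat too, which is exactly the failure described above.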

Typically the host operating system supplied a version 1 UUID, which is
built from the machine's MAC address and a timestamp.

I recall what happened was that someone decoded the UUIDs found in MS Word
documents that were claimed to be from an anonymous party, but
turned out to be from a political party in Washington DC. That was met
with outrage that you could easily trace back to the MAC address of
the computers involved in the creation of any Microsoft document, so
Microsoft moved to a hashed value, and others like OS X followed. I can
not speak for Unix.  {well actually it invokes MakeUUID(location);
but I have no idea what that does}

The intent of the MD5 or SHA-1 hash is to hide the original value. I
believe they are one-way, so the result might as well be random bits as
far as finding a better way to print the UUID goes.

However, if you are lucky, some older operating system might still be
using version 1 UUIDs.
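
To tell which kind you are looking at, the version is the first hex digit of the third group in the printed form. A small workspace sketch, using a made-up sample UUID string and assuming the usual 8-4-4-4-12 dashed layout:

```smalltalk
| uuidString version |
"Hypothetical sample value, not taken from any real package."
uuidString := '550e8400-e29b-41d4-a716-446655440000'.
"Character 15 is the first digit of the third group: the version."
version := (uuidString at: 15) digitValue.
version
```

A 1 there means the timestamp and MAC address are recoverable; a 4 means the rest is random bits.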


Oh, lastly I'll note the decoder UUID>>asUUID: SUCKS (well, I
wrote it).

asUUID: aString
	| stream token byte |
	stream _ ReadStream on: (aString copyReplaceAll: '-' with: '') asUppercase.
	1 to: stream size/2 do: [:i |
		token _ stream next: 2.
		byte _ Integer readFrom: (ReadStream on: token ) base: 16.
		self at: i put: byte].
	^self


In profiling Sophie document reading, this nasty method leapt to the foreground.

However, I wrote a new one for Sophie, which made the few percent of time
dedicated to converting a string UUID to a UUID object go away.
It's a lot less readable, though:

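SophieID>>asUUID: aString
	"Decode a UUID string into the receiver, skipping dashes (ASCII 45).
	Assumes lowercase hex digits that always come in pairs."
	| n i poke l total r |

	n := aString size.
	i := 1.
	poke := 1.	"index of the next byte to fill in the receiver"
	[ i < n] whileTrue:
		[l := (aString at: i) asInteger.
		l = 45 ifFalse:
			["high nibble: $a..$f (97..102) -> 10..15, $0..$9 (48..57) -> 0..9"
			total := l > 96
					ifTrue: [10 + (l-97)]
					ifFalse: [l-48].
			i := i + 1.
			r := (aString at: i) asInteger.
			"low nibble: shift the high nibble up by 16 and add"
			total := r > 96
				ifTrue: [total * 16 +  (10 + (r-97))]
				ifFalse: [total * 16 +  (r-48)].
			self at: poke put: total.
			poke := poke + 1.
			i := i + 1]
			ifTrue: [i := i + 1]].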
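
The arithmetic is easier to follow on a single pair of digits. The workspace sketch below applies the same character-code trick to the string 'a7'; the block name hexValue is made up for the example, and like the method it assumes lowercase hex digits.

```smalltalk
| hexValue pair byte |
"Map one hex character to 0..15 using its character code:
 $a is 97, so subtracting 87 gives 10; $0 is 48, so subtracting 48 gives 0."
hexValue := [:ch |
	ch asInteger > 96
		ifTrue: [ch asInteger - 87]
		ifFalse: [ch asInteger - 48]].
pair := 'a7'.
"High nibble times 16 plus low nibble: 10 * 16 + 7."
byte := ((hexValue value: (pair at: 1)) * 16) + (hexValue value: (pair at: 2)).
byte
```

This avoids building a ReadStream and calling Integer readFrom:base: for every byte, which is presumably where the old decoder spent its time.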


On Jun 21, 2007, at 5:34 PM, Bert Freudenberg wrote:

> Jerome,
>
> we should take this discussion to squeak-dev. Reply-To set.
>
> On Jun 22, 2007, at 1:06 , Jerome Peace wrote:
>
>> Hi Bert,
>>
>> Thanks for the interesting response.
>>
>> ***
>>> [V3dot10] Re: RV: Do in a workspace and say if could
>> build
>>>
>>>
>>> Bert Freudenberg bert at freudenbergs.de
>>> Thu Jun 21 00:11:25 UTC 2007
>>>
>>> On Jun 21, 2007, at 1:51 , Jerome Peace wrote:
>>>
>>>> First a better way to print out a uuid. Since it's
>>>> based on time I should be able to take an encoded
>> UUID
>>>> and print it out asHumanIntelligableText.
>>>
>>> http://en.wikipedia.org/wiki/UUID
>>>
>>>> Secondly it would seem that a time based version
>>>> number would be a little less dangerous than a
>>>> sequential version. So a package would be named
>>>> something like:
>>>> PackageName-subPackage-initials.yymmddnn.mcz
>>>> with yymmddnn is a number based on time with a
>>>> sufficient resolution to solve most problems.
>>>> The details may be modified to meet other design
>>>> criteria (e.g. spaceCompression).
>>>>
>>>> The first should be easy to do.
>>>
>>> Reversing a cryptographic hash function? Have fun.
>>
>> Hmm. I don't have to reverse a hash function I just
>> have to "know what it means".
>> That can be done by extra info saved with the hash as
>> part of the name.
>> Enough info to provide a human intelligible clue.
>> UUID hashes mean that on such and such a day at such
>> and such a time from such and such a place a something
>> was saved and given a uuid in such and such a format.
>>
>> If the purpose of the saving is not to keep secret
>> what was saved you can place both the open text and
>> the hash together and if needs be keep a dictionary to
>> reverse the cryptographic hash.
>>
>> Partial progress counts. I just want to look at
>> something that doesn't mystify me.
>> Remember the context is to make something a beginner
>> and an amateur can learn.
>
> That information is stored in the VersionInfo entry next to the  
> UUID. It's easily accessible. Whereas the UUID might be generated  
> using the UUIDPlugin and you have no idea how to reverse that. It's  
> not sensible to even attempt that.
>
>>>> I wonder what it would take to train MC to work
>> with the second.
>>>
>>> That's trivial. Since MC does not place meaning on
>> the version name
>>> you can just pre-populate the version name input
>> field of the version
>>> save dialog with whatever suits you.
>>
>> Huh? Wow.
>>
>> Does this mean I could rename the file and MC would
>> still recognize it for what it is?
>> Oh, you said version name. So you mean that the
>> packagename portion is still significant but I can
>> play around with the version names and MC will pay no
>> attention.
>
> No. The package name is stored *inside* the MCZ.
>
>> So a mischief maker could rename things so that
>> Package-puck.30.mcz  was the ancestor of
>> Package-puck.29.mcz instead of the expected other way
>> around?
>>
>> On the other hand Package-puck.3.mcz duplicated and
>> renamed to egakcaP-puck.3.mcz would not be recognized
>> by MC as the same?
>>
>>
>>>
>>> Actually, maybe having readable version file names is
>> a problem in
>>> itself. It gives the illusion that these have any
>> meaning to MC.
>>> Other systems like git avoid the problem by just
>> using UUIDs as
>>> filenames.
>>
>> And how would you know when mischief had happened
>> then?
>
> MC is not designed to prevent mischief, though the UUIDs prevent  
> accidental mistakes. For actual security, one could for example use  
> the hash of the entire package contents as identifier, making it  
> unforgeable.
>
> - Bert -
>

--
===========================================================================
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================
