Image Unique Identifier

Ramon Leon ramon.leon at allresnet.com
Tue Aug 22 22:09:40 UTC 2006


> Hi Ramon,
> 
> The thing to keep in mind about a UUID is that it is supposed 
> to be unique.
> By hashing the UUID you will distribute the number more 
> widely depending on the UUID implementation.  The beginning 
> bytes of the UUID are supposed to be widely spread so I doubt 
> that this makes much difference.  
> 
> For example
> 
> (1 to: 10) collect: [:i | UUID new]  I get: 
> an UUID('48d374b0-b196-6a40-a15e-79a02a8dde89')
> an UUID('48eacb8f-4564-d84f-a456-2856c5226b97')
> an UUID('de67f9a4-4a42-cd44-bf79-8046366e6c9d')
> an UUID('59edd53d-eb69-164b-b7f6-5e63d6284d12')
> an UUID('fae32b90-49c8-ca46-a583-129e5a0fdc02')
> an UUID('cc760f65-1732-3647-b632-9087256a85f1')
> an UUID('6f23d070-a505-a240-addf-a749b02d6243')
> an UUID('a620fe16-4701-3d4a-9aee-0b549057d855')
> an UUID('154d299f-e2fe-3b42-bd3d-0fc4c3362db8')
> an UUID('0cc33014-4f35-6f4e-b9de-4a005a3ceb00'))
> 
> Notice the first group of bytes are already well dispersed so 
> there is no benefit to the MD5.

Ah, yeah, we were hashing it in the hope that it would lessen the impact of
trimming it.  I wasn't aware it was already dispersed.

> 
> Truncating to 16 characters will remove some of the uniqueness 
> of the UUID.  Since part of the goal of a UUID is to disperse 
> the values, the likelihood of truncated UUIDs colliding goes up 
> as the number of objects increases, but 16 characters of 25 
> (which is what you would get with base 36) is close to the 
> uniqueness you get with UUID.
> 

Any idea how close?
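
For a rough upper bound, assuming each of the 16 kept characters is a
uniformly distributed base-36 digit (an assumption, and a slightly generous
one, since the leading digit of a 128-bit value printed in base 36 cannot
take all 36 values), the bit count can be estimated in a workspace:

```smalltalk
"Rough estimate only: treats every kept character as a uniform base-36 digit."
| bitsPerChar keptBits |
bitsPerChar := 36 log: 2.        "log base 2 of 36, about 5.17 bits per character"
keptBits := 16 * bitsPerChar.    "about 82.7 bits, against 128 bits in the full UUID"
keptBits
```

Under that assumption the truncated string keeps at most about 83 of the 128
bits, so by the usual birthday estimate collisions would start to become
likely somewhere around 2^41 identifiers (roughly 10^12).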

> But notice the following: 
> 
> UUID new asInteger printStringBase: 36 returns a string 
> that is 25 characters long, for example: 
> 'BGI8YR7NBJJTUOWIBQJU7E5TA', and if you take the first 16 you 
> get only 16 bytes (one for each character).  But if you store 
> the UUID asInteger guess what?  The integer is only 16 bytes! 
> So the right thing to do if you have a budget of 16 bytes is 
> to store the UUID as an integer instead of as a string!

It's not about storage; I can store whatever I want.  It's about a short
string representation for human use.
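
For reference, the short form under discussion could be produced along these
lines.  This is only a sketch: shortLength is a name made up for the example,
and it assumes UUID new asInteger behaves as in the quoted message.

```smalltalk
"Sketch: short base-36 label from a fresh UUID. shortLength is illustrative."
| full shortLength short |
shortLength := 16.
full := UUID new asInteger printStringBase: 36.   "usually 25 characters, occasionally fewer"
"Guard with min: because printStringBase: does not pad with leading zeros."
short := full copyFrom: 1 to: (shortLength min: full size).
short
```

Because the printed string is not zero-padded, taking the leading characters
of strings of different lengths does not select the same digit positions each
time; padding to 25 characters first would make the truncation consistent.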



