Image Unique Identifier

Ron Teitelbaum Ron at USMedRec.com
Tue Aug 22 21:45:20 UTC 2006


Hi Ramon,

The thing to keep in mind about a UUID is that it is supposed to be unique.
Hashing the UUID may spread the values more widely, depending on the UUID
implementation, but the leading bytes of a UUID are already supposed to be
widely spread, so I doubt that hashing makes much difference.

For example, evaluating

(1 to: 10) collect: [:i | UUID new]

I get:
an UUID('48d374b0-b196-6a40-a15e-79a02a8dde89') 
an UUID('48eacb8f-4564-d84f-a456-2856c5226b97') 
an UUID('de67f9a4-4a42-cd44-bf79-8046366e6c9d') 
an UUID('59edd53d-eb69-164b-b7f6-5e63d6284d12') 
an UUID('fae32b90-49c8-ca46-a583-129e5a0fdc02') 
an UUID('cc760f65-1732-3647-b632-9087256a85f1') 
an UUID('6f23d070-a505-a240-addf-a749b02d6243') 
an UUID('a620fe16-4701-3d4a-9aee-0b549057d855') 
an UUID('154d299f-e2fe-3b42-bd3d-0fc4c3362db8') 
an UUID('0cc33014-4f35-6f4e-b9de-4a005a3ceb00')

Notice that the first group of bytes is already well dispersed, so the MD5
adds no benefit.

Truncating to size 16 will remove some of the uniqueness of the UUID.  Since
part of the goal of a UUID is to disperse the values, the likelihood of two
truncated UUIDs colliding goes up as the number of objects increases.  Still,
16 characters of the 25 you get in base 36 can hold at most about 83 of the
UUID's 128 bits, which for modest numbers of objects is close to the
uniqueness you get with the full UUID.
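
To put rough numbers on the truncation, here is a small sketch in Python
(not Squeak, just for the arithmetic).  It assumes the identifiers are
uniformly random, which slightly overstates a real UUID, and it treats 16
base-36 characters as carrying their full capacity, which is an upper bound.
The function names are made up for illustration.

```python
import math


def bits_in_base36_chars(n_chars: int) -> float:
    """Upper bound on the information, in bits, held by n_chars base-36 characters."""
    return n_chars * math.log2(36)


def collision_probability(n_items: int, bits: float) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_items * (n_items - 1) / 2.0 ** (bits + 1)
    return -math.expm1(-exponent)  # expm1 keeps precision for tiny probabilities


if __name__ == "__main__":
    truncated_bits = bits_in_base36_chars(16)
    print(f"16 base-36 characters hold at most {truncated_bits:.1f} bits")
    for n in (10**6, 10**9, 10**12):
        p_trunc = collision_probability(n, truncated_bits)
        p_full = collision_probability(n, 128)
        print(f"{n:>14} ids: truncated {p_trunc:.3g}, full 128 bits {p_full:.3g}")
```

Under these assumptions the truncated form only starts to matter once the
number of identifiers gets very large.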

But notice the following: 

UUID new asInteger printStringBase: 36 returns a string that is 25
characters long, for example 'BGI8YR7NBJJTUOWIBQJU7E5TA', and if you take the
first 16 characters you still spend 16 bytes (one for each character).  But
if you store the UUID asInteger, guess what?  The whole integer is only 16
bytes!  So the right thing to do if you have a budget of 16 bytes is to store
the UUID as an integer instead of as a string!
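
The size comparison can be checked with a short sketch, again in Python
rather than Squeak.  The helpers to_base36, pack and unpack are hypothetical
names for illustration, and the sample value is the first UUID printed
above.

```python
import uuid

DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base36(n: int) -> str:
    """Render a non-negative integer in base 36, most significant digit first."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS[r])
    return "".join(reversed(out))


def pack(u: uuid.UUID) -> bytes:
    """Store the UUID as its 128-bit integer value in exactly 16 bytes."""
    return u.int.to_bytes(16, "big")


def unpack(raw: bytes) -> uuid.UUID:
    """Recover the full UUID from the 16 stored bytes."""
    return uuid.UUID(int=int.from_bytes(raw, "big"))


if __name__ == "__main__":
    u = uuid.UUID("48d374b0-b196-6a40-a15e-79a02a8dde89")
    text = to_base36(u.int)
    print(len(text), "characters as a base-36 string")
    print(len(pack(u)), "bytes as an integer")
    print(unpack(pack(u)) == u)  # nothing is lost in the 16-byte form
```

The 16-byte integer round-trips to the original UUID, whereas a base-36
string cut to 16 characters cannot be expanded back.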

Hope that helps!

Ron Teitelbaum
President / Principal Software Engineer
US Medical Record Specialists
Ron at USMedRec.com 
Squeak Cryptography Team Leader


> -----Original Message-----
> From: squeak-dev-bounces at lists.squeakfoundation.org
> [mailto:squeak-dev-bounces at lists.squeakfoundation.org]
> On Behalf Of Ramon Leon
> Sent: Tuesday, August 22, 2006 5:22 PM
> To: 'The general-purpose Squeak developers list'
> Subject: RE: Image Unique Identifier
> 
> Speaking of UUID's, I have a need for something like a UUID, but shorter,
> more human readable.  I'd be fine with a UUID, but the biz folks don't
> like em.
> 
> A buddy and I came up with the following...
> 
> (String streamContents: [:string |
>     (MD5 hashMessage: UUID new)
>         do: [:each | string nextPutAll: (each printStringBase: 36)]])
> truncateTo: 16
> 
> It does an MD5 hash on a GUID, then takes the base36 string of each
> resulting number, and chops the string to a length of 16.  I'm just
> guessing on the 16, might try it longer or shorter, hopefully shorter.
> Was wondering if anyone had any comments about this, or an alternate
> approach?
> 
> 




