Re: [Vm-dev] Mac and Unix file/directory/clipboard interface?

3 Jun 2007


Oh, how interesting. I had no idea that there is UTF-8 and UTF-8. So 
much for my proposal, I guess ;-)
Cheers,
   - Andreas
John M McIntosh wrote:
...
OK, the Mac Carbon VM, and I believe the Unix OS X VM as well, let you 
specify what encoding the file/directory/drag-and-drop information is in.
By default the OS X Carbon VM uses MacRoman, because of issues with the 
file list dialog and its assumptions about how file/directory names 
should be translated in various versions of Squeak.
For Sophie we use UTF-8; I think Plopp uses UTF-8 too; Scratch, I 
believe, uses MacRoman.
I'll note from http://en.wikipedia.org/wiki/UTF-8:
  "The Mac OS X Operating System uses canonically decomposed Unicode, 
  encoded using UTF-8 for file names in the filesystem."
So saying "it's UTF-8" is, well, not quite the whole picture: the same 
name can be precomposed or decomposed and still be valid UTF-8.
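To make the "UTF-8 and UTF-8" point concrete, here is a minimal sketch in Python (not the VM's C code) that encodes one file name three ways. The name "café" is only an illustration, and Python's unicodedata module stands in for whatever normalization the platform performs.

```python
import unicodedata

# "café" written with a single precomposed character, U+00E9
name = "caf\u00e9"

# The same name in the three encodings discussed in this thread
mac_roman = name.encode("mac_roman")
utf8_precomposed = unicodedata.normalize("NFC", name).encode("utf-8")
utf8_decomposed = unicodedata.normalize("NFD", name).encode("utf-8")

print(mac_roman)         # b'caf\x8e'
print(utf8_precomposed)  # b'caf\xc3\xa9'
print(utf8_decomposed)   # b'cafe\xcc\x81'
```

Both of the last two byte strings are valid UTF-8 for the same visible name, but a byte-wise comparison treats them as different.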
In early May I applied some fixes to the Mac Carbon VM to address issues 
with pre-composed versus canonically decomposed Unicode UTF-8 translation, 
based on suggestions from Tetsuya Hayashi and further testing.
...
        sqMacUnixFileInterface.c
        Tetsuya HAYASHI, tetha@st.rim.or.jp, tetha@mac.com

        I've found the latest mac vm (or recent version) fails to 
        normalize UTF file name. It seems to be the function 
        convertChars() of sqMacUnixFileInterface.c, which normalizes only 
        decompose when converting squeak string to unix, but I think it 
        needs pre-combined when unix string to squeak, and I noticed 
        normalization form should be canonical (exactly should be 
        kCFStringNormalizationFormC) for pre-combined.
I cannot say if this is also an issue with the unix VM.
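The two conversion directions described in that note can be sketched roughly as follows. This is a Python analogue, not the actual convertChars() code: the function names squeak_to_unix and unix_to_squeak are hypothetical, and standard NFD/NFC from unicodedata are assumed to be close enough to what the Mac file system and kCFStringNormalizationFormC do for the purpose of illustration.

```python
import unicodedata

def squeak_to_unix(name: str) -> bytes:
    """Image -> file system: decompose (NFD), then encode as UTF-8."""
    return unicodedata.normalize("NFD", name).encode("utf-8")

def unix_to_squeak(raw: bytes) -> str:
    """File system -> image: decode UTF-8, then precompose (NFC)."""
    return unicodedata.normalize("NFC", raw.decode("utf-8"))

# Hypothetical file name held in the image in precomposed form
original = "r\u00e9sum\u00e9.txt"

on_disk = squeak_to_unix(original)
round_tripped = unix_to_squeak(on_disk)

# With the inbound NFC step the name survives the round trip
print(round_tripped == original)            # True
# Without it, the decomposed name no longer compares equal
print(on_disk.decode("utf-8") == original)  # False
```

The second comparison is the failure mode the note describes: if only the outbound direction normalizes, names read back from the file system no longer match the strings the image started with.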
As for the clipboard, the old primitives assume MacRoman. The extended 
OS X clipboard plugin lets you pass any format you wish, selected by 
MIME type. Should that be text, UTF-8, UTF-32, UTF-16 or RTF? Mmm, no, 
perhaps TIFF/PNG or JPEG.
On Jun 2, 2007, at 9:34 PM, Andreas Raab wrote:
...
Hi Folks -
Since I just went through all of this, can someone explain to me what 
string encoding the Unix and Mac VMs use for interfacing the file, 
directory and clipboard functions? If these are all UTF-8 based (which 
I suspect) then should we just define that *all* strings passed to the 
VM are to be interpreted as UTF-8 and any VM or function that doesn't 
deal with UTF-8 correctly is considered broken and needs fixing? It 
strikes me as a nice, elegant solution to solve this problem once and 
forever.
Comments, anyone?
Cheers,

Andreas

--
John M. McIntosh johnmci@smalltalkconsulting.com
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================
