[Vm-dev] Unicode clipboard

Chris Petsos chrispetsos at sch.gr
Wed May 30 11:31:18 UTC 2007


On Wed, 2007-05-30 at 11:19 +0300, Diomidis Spinellis wrote:
> Chris Petsos wrote:
> >> Michael Rueger wrote:
> >>> Chris Petsos wrote:
> >>>> Any quick ideas on how we can handle unicode text from and to the
> >>>> system clipboard with Squeak?
> >>> There has been some work done in Sophie, currently being integrated with
> >>> the OLPC image.
> >> I'm working only for X11 (linux) with the OLPC.
> >> If you want try on Mac or Win32 soon, see System-Clipboard-Extended
> >> category in Sophie.
> >>
> >> - Takashi
> > 
> > From what i saw System-Clipboard-Extended package uses UTF Converters for
> > the internal representation of the data.
> > The thing is that we are trying to create a VM where the internal
> > representation of the characters will be Unicode.
> > This means that the VM we use is sending unicode charcodes to the image, we
> > use unicode fonts etc...
> > So, a UTF interpreted string will not display properly in our image. Unless,
> > we use interpreters for our Unicode chars...
> > I think we will have to patch the VM again so that the clipboard related
> > methods send again unicode streams to the image.
> > Don't know which solution of the two is more desirable...
> > 
> > The related methods that are called when putting to or getting something
> > from the clipboard are
> >     int clipboardSize(void)
> >     int clipboardWriteFromAt(int count, int byteArrayIndex, int startIndex)
> >     int clipboardReadIntoAt(int count, int byteArrayIndex, int startIndex)
> > 
> > in
> >     sqWin32Window.c
> > 
> > Any help on that Diomidis?
> 
> Sorry for taking so long to reply.  The change needed in sqWin32Window.c 
> is to replace the five instances of CF_TEXT with CF_UNICODETEXT. 
> However, this solves only the Windows part of the problem.  For this to 
> work, the characters we copy/paste must be of type wchar_t.  In the VM 
>   (unsigned char *)byteArrayIndex + startIndex appears to point to byte 
> characters.  How are Unicode characters represented there?
> 
> Diomidis Spinellis - http://www.spinellis.gr

OK, I am halfway there. The trick is that the image converts the
Unicode characters to UTF-8 before sending them to the VM, so the byte
data reaches the VM in UTF-8 representation. This data is then passed to
	MultiByteToWideChar(CP_UTF8, 0, src,
	                    GlobalSize(h) + 1, out,
	                    GlobalSize(h2));

Finally, the converted data are sent to the system clipboard with
	SetClipboardData(CF_UNICODETEXT, h2);
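
As a rough illustration of the UTF-8 to UTF-16 step described above, here
is a portable sketch of roughly what MultiByteToWideChar does when called
with CP_UTF8. This is not the code in the patched VM: the function name
utf8_to_utf16 and its signature are made up for this example, and the
sketch does not reject overlong encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the VM): decode `len` bytes of UTF-8
 * from `src` into UTF-16 code units in `dst`, which has room for `cap`
 * units. Returns the number of units written, or -1 on malformed input
 * or insufficient space. No terminating zero is appended. */
static long utf8_to_utf16(const unsigned char *src, size_t len,
                          uint16_t *dst, size_t cap)
{
    size_t i = 0, n = 0;

    assert(src != NULL && dst != NULL);
    while (i < len) {
        uint32_t cp;
        size_t extra;
        unsigned char b = src[i++];

        /* The lead byte tells us how many continuation bytes follow. */
        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return -1;              /* stray continuation or invalid lead */

        if (extra > len - i) return -1;          /* truncated sequence */
        while (extra-- > 0) {
            unsigned char c = src[i++];
            if ((c & 0xC0) != 0x80) return -1;   /* not a continuation byte */
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) return -1;

        if (cp < 0x10000) {
            /* Basic Multilingual Plane: one code unit. */
            if (n + 1 > cap) return -1;
            dst[n++] = (uint16_t)cp;
        } else {
            /* Supplementary plane: encode as a surrogate pair. */
            if (n + 2 > cap) return -1;
            cp -= 0x10000;
            dst[n++] = (uint16_t)(0xD800 + (cp >> 10));
            dst[n++] = (uint16_t)(0xDC00 + (cp & 0x3FF));
        }
    }
    return (long)n;
}
```

For example, the four UTF-8 bytes CE B1 CE B2 (Greek alpha, beta) decode to
the two code units 0x03B1 0x03B2. Note that the output capacity here is
counted in 16-bit units, not bytes. MultiByteToWideChar likewise takes its
output size in wide characters, so a byte count such as the one returned by
GlobalSize would normally be divided by sizeof(WCHAR) first.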

You are right about CF_UNICODETEXT, Diomidis. What I have in hand is a
very premature solution: just yesterday I managed to copy something from
eToys and paste it into MS Word correctly. But I know it's only a matter
of time.
I'll post a complete solution as soon as I finish it.
By the way, Takashi, thanks for your interest. I'll send it as soon as
I can so that we can start testing.
Again, thanks to everyone.

Christos.


