[Vm-dev] Unicode clipboard
John M McIntosh
johnmci at smalltalkconsulting.com
Tue May 22 14:56:11 UTC 2007
You're welcome to look in the mac os tree under
plugins/ClipboardExtended to see how we extended the clipboard
logic for Sophie.
Higher up, the extended clipboard class uses mimetype information to
indicate the data type; at the lower level it's up to the plugin to
determine what to do, for example in
    ioReadClipboardData: clipboard format: format
where clipboard is a 32-bit value (an address) and format is a string.
Likely the method that is not clear is
    ioGetClipboardFormat: clipboard formatNumber: formatNumber
On the Macintosh you can have an item on the clipboard in many
formats, such as a string in utf8, utf16, ascii, or macroman.
ioGetClipboardFormat:formatNumber: returns each format type
based on the index number formatNumber.
We converted those results back to mimetypes and used them to decide
the best format for reading the clipboard.
Each platform has helper methods to convert the platform format data
to a mimetype. For example, on Windows we had:
    at: 49510 put: 'text/rtf' asMIMEType;
    at: 1 put: 'text/plain' asMIMEType; "CF_TEXT"
    at: 2 put: 'image/bmp' asMIMEType; "CF_BITMAP"
    at: 12 put: 'audio/wave' asMIMEType; "CF_WAVE"
    at: 13 put: 'text/unicode' asMIMEType; "CF_UNICODETEXT"
    at: 16 put: 'CF_LOCALE'; "CF_LOCALE"
I will note that for Windows we used FFI to make the required calls
and did not build a plugin.
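
A minimal sketch in C of the same kind of lookup, for readers who want
to do the mapping on the VM side. The names mime_for_format and
kFormatTable are hypothetical and are not taken from Sophie. Only the
fixed, predefined format numbers from the table above are included;
numbers in the range 0xC000-0xFFFF (such as 49510) are registered
formats whose values Windows assigns at run time, so the sketch does
not hard-code one for RTF.

```c
#include <stddef.h>

/* One row of the lookup: a platform format number and its MIME type. */
struct format_mime {
    unsigned int format;
    const char *mime;
};

/* Predefined Windows clipboard formats, with the same numbers and
   MIME strings as the Smalltalk table above. */
static const struct format_mime kFormatTable[] = {
    {  1, "text/plain"   },  /* CF_TEXT */
    {  2, "image/bmp"    },  /* CF_BITMAP */
    { 12, "audio/wave"   },  /* CF_WAVE */
    { 13, "text/unicode" }   /* CF_UNICODETEXT */
};

/* Return the MIME type for a format number, or NULL when the number
   is not in the table (hypothetical helper for this sketch). */
const char *mime_for_format(unsigned int format)
{
    size_t i;
    for (i = 0; i < sizeof kFormatTable / sizeof kFormatTable[0]; i++) {
        if (kFormatTable[i].format == format) {
            return kFormatTable[i].mime;
        }
    }
    return NULL;
}
```

A registered format such as RTF would have to be looked up by name at
run time and added to the table then.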
So, for example, for textual data we would process mime types of
rtf, utf8, unicode, or plain.
Later you use
    ioReadClipboardData: clipboard format: format
to actually return the data object.
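
The selection step (enumerate the available formats, convert them to
mimetypes, pick the best one, then read it) can be sketched in C as
follows. This is only an illustration: pick_best_format, the preference
order, and the exact MIME strings are assumptions for the sketch, not
the Sophie implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical preference order for textual data, richest first.
   Both the order and the spelling of the strings are assumptions. */
static const char *const kPreferred[] = {
    "text/rtf", "text/unicode", "text/utf8", "text/plain"
};

/* Given the MIME types obtained by asking the clipboard for each
   format index in turn, return the most preferred one that is
   present, or NULL if none of them is a textual type we handle. */
const char *pick_best_format(const char *const *available, size_t count)
{
    size_t p, i;
    for (p = 0; p < sizeof kPreferred / sizeof kPreferred[0]; p++) {
        for (i = 0; i < count; i++) {
            if (strcmp(available[i], kPreferred[p]) == 0) {
                return kPreferred[p];
            }
        }
    }
    return NULL;
}
```

The returned type would then be converted back to the platform format
and passed to the read call.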
I'll note that when reading unicode on the mac it came across as UTF16
with no byte order mark, so our read method that returned WideString
data did:
    | bytes |
    "utf16 plain text has no bom"
    bytes := self readClipboardData: 'public.utf16-plain-text'.
    ^bytes ifNil: [bytes] ifNotNil:
        [bytes asString convertFromWithConverter: (UTF16TextConverter new
            useLittleEndian: (SmalltalkImage current endianness = #little))]
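
For anyone handling this in C instead, here is a rough sketch of the
same idea: with no byte order mark the byte order has to be supplied
from outside, and the host's endianness is the natural guess. The
function names are hypothetical, and replacing unpaired surrogates with
U+FFFD is a choice made for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Report whether the machine running this code is little-endian. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Read one 16-bit code unit from two bytes in the given byte order. */
static uint16_t read_unit(const unsigned char *p, int little_endian)
{
    return little_endian ? (uint16_t)(p[0] | (p[1] << 8))
                         : (uint16_t)((p[0] << 8) | p[1]);
}

/* Decode BOM-less UTF-16 into code points. Writes at most out_cap
   values and returns how many were written. Unpaired surrogates
   become U+FFFD; a trailing odd byte is ignored. */
size_t utf16_decode(const unsigned char *bytes, size_t nbytes,
                    int little_endian, uint32_t *out, size_t out_cap)
{
    size_t i = 0, n = 0;
    while (i + 1 < nbytes && n < out_cap) {
        uint32_t cp = read_unit(bytes + i, little_endian);
        i += 2;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: it needs a low surrogate right after it. */
            uint32_t lo = (i + 1 < nbytes)
                              ? read_unit(bytes + i, little_endian) : 0;
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = 0xFFFD;
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            cp = 0xFFFD; /* low surrogate with no high surrogate before it */
        }
        out[n++] = cp;
    }
    return n;
}
```

A caller would pass host_is_little_endian() as the little_endian
argument to mirror what the Smalltalk method does.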
On writing we did the following and supplied a byte order mark.
    | ba |
    ba := aString convertToWithConverter: (UTF16TextConverter new
    self addClipboardData: ba dataFormat: 'public.utf16-plain-text'
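
As an illustration of what "supplied a byte order mark" means at the
byte level, here is a small C sketch that encodes code points as UTF-16
and puts U+FEFF in front. The function names are hypothetical, and
returning 0 on any error is a simplification chosen for the sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Store one 16-bit code unit as two bytes in the given byte order. */
static void put_unit(unsigned char *p, uint16_t u, int little_endian)
{
    if (little_endian) {
        p[0] = (unsigned char)(u & 0xFF);
        p[1] = (unsigned char)(u >> 8);
    } else {
        p[0] = (unsigned char)(u >> 8);
        p[1] = (unsigned char)(u & 0xFF);
    }
}

/* Encode code points as UTF-16 preceded by a byte order mark (U+FEFF).
   Returns the number of bytes written, or 0 if out_cap is too small
   or a code point is a surrogate or lies above U+10FFFF. */
size_t utf16_encode_with_bom(const uint32_t *cps, size_t count,
                             int little_endian,
                             unsigned char *out, size_t out_cap)
{
    size_t i, n;
    if (out_cap < 2) {
        return 0;
    }
    put_unit(out, 0xFEFF, little_endian);
    n = 2;
    for (i = 0; i < count; i++) {
        uint32_t cp = cps[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return 0;
        }
        if (cp < 0x10000) {
            if (out_cap - n < 2) {
                return 0;
            }
            put_unit(out + n, (uint16_t)cp, little_endian);
            n += 2;
        } else {
            /* Outside the BMP: split into a surrogate pair. */
            if (out_cap - n < 4) {
                return 0;
            }
            cp -= 0x10000;
            put_unit(out + n, (uint16_t)(0xD800 + (cp >> 10)), little_endian);
            put_unit(out + n + 2, (uint16_t)(0xDC00 + (cp & 0x3FF)),
                     little_endian);
            n += 4;
        }
    }
    return n;
}
```

With the mark in front, a reader can tell the byte order from the first
two bytes (FF FE for little-endian, FE FF for big-endian) instead of
guessing.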
On May 22, 2007, at 3:18 AM, Chris Petsos wrote:
>> Michael Rueger wrote:
>>> Chris Petsos wrote:
>>>> Any quick ideas on how we can handle unicode text from and to the
>>>> system clipboard with Squeak?
>>> There has been some work done in Sophie, currently being
>>> integrated with
>>> the OLPC image.
>> I'm working only for X11 (linux) with the OLPC.
>> If you want try on Mac or Win32 soon, see System-Clipboard-Extended
>> category in Sophie.
>> - Takashi
> From what I saw, the System-Clipboard-Extended package uses UTF
> Converters for the internal representation of the data.
> The thing is that we are trying to create a VM where the internal
> representation of the characters will be Unicode.
> This means that the VM we use is sending unicode charcodes to the
> image, we use unicode fonts etc...
> So, a UTF interpreted string will not display properly in our
> image. Unless, we use interpreters for our Unicode chars...
> I think we will have to patch the VM again so that the clipboard
> methods send again unicode streams to the image.
> Don't know which solution of the two is more desirable...
> The related methods that are called when putting to or getting
> from the clipboard are
> int clipboardSize(void)
> int clipboardWriteFromAt(int count, int byteArrayIndex, int
> int clipboardReadIntoAt(int count, int byteArrayIndex, int
> Any help on that Diomidis?
John M. McIntosh <johnmci at smalltalkconsulting.com>
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com