Any quick ideas on how we can handle unicode text from and to the system clipboard with Squeak?
Christos
Michael Rueger wrote:
Chris Petsos wrote:
Any quick ideas on how we can handle unicode text from and to the system clipboard with Squeak?
There has been some work done in Sophie, currently being integrated with the OLPC image.
I'm working only on X11 (Linux) for the OLPC. If you want to try it on Mac or Win32 soon, see the System-Clipboard-Extended category in Sophie.
- Takashi
From what I saw, the System-Clipboard-Extended package uses UTF converters for the internal representation of the data. The thing is that we are trying to create a VM where the internal representation of characters is Unicode. This means that the VM we use sends Unicode character codes to the image, we use Unicode fonts, etc. So a UTF-encoded string will not display properly in our image, unless we use converters for our Unicode chars... I think we will have to patch the VM again so that the clipboard-related functions also send Unicode streams to the image. I don't know which of the two solutions is more desirable...
The related functions that are called when putting something on or getting something from the clipboard are

    int clipboardSize(void)
    int clipboardWriteFromAt(int count, int byteArrayIndex, int startIndex)
    int clipboardReadIntoAt(int count, int byteArrayIndex, int startIndex)

in sqWin32Window.c
Any help on that Diomidis?
Christos.
You're welcome to look at

    Sophie-Clipboard.st
    ClipboardExtendedPlugin.c
    JMMExtendedClipBoardPlugin.1.cs

in the mac os tree / plugins/ClipboardExtended to see how we extended the clipboard logic for Sophie.
Higher up, the extended clipboard class uses MIME type information to indicate the data type. At the lower level it's up to the plugin to determine what, for example, ioReadClipboardData: clipboard format: format means, where clipboard is a 32-bit value (an address) and format is a string value.

The method that is likely not clear is ioGetClipboardFormat: clipboard formatNumber: formatNumber. On the Macintosh you can have an item on the clipboard in many formats, such as a string in UTF-8, UTF-16, ASCII, or MacRoman. ioGetClipboardFormat:formatNumber: returns each format type based on the index number formatNumber.

We converted those results back to MIME types and used them to decide the best format for reading the clipboard. Each platform has helper methods to convert the platform format data to a MIME type, so for example on Windows we had
    clipboardFormatMap
        at: 49510 put: 'text/rtf' asMIMEType;
        at: 1 put: 'text/plain' asMIMEType; "CF_TEXT"
        at: 2 put: 'image/bmp' asMIMEType; "CF_BITMAP"
        at: 12 put: 'audio/wave' asMIMEType; "CF_WAVE"
        at: 13 put: 'text/unicode' asMIMEType; "CF_UNICODETEXT"
        at: 16 put: 'CF_LOCALE'; "CF_LOCALE"
I will note that for Windows we used FFI to make the required calls and did not build a plugin.
So, for example, for textual data we would process MIME types of rtf, utf8, unicode, or plain.
Later you use ioReadClipboardData: clipboard format: format to actually return the data object.
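
For anyone doing this in C rather than through FFI, the format lookup and the "pick the best text format" step described above could look roughly like the sketch below. The format numbers are copied from the Smalltalk map; the names FormatEntry, mimeTypeForFormat and bestTextFormat, and the preference order itself, are assumptions made for illustration and are not taken from Sophie.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One row of the lookup table: Windows clipboard format number -> MIME type. */
typedef struct {
    unsigned int format;
    const char *mimeType;
} FormatEntry;

/* Values copied from the Smalltalk map above. CF_LOCALE (16) is omitted
   because the map stores a plain string for it rather than a MIME type. */
static const FormatEntry formatMap[] = {
    { 49510, "text/rtf" },
    { 1,     "text/plain" },   /* CF_TEXT */
    { 2,     "image/bmp" },    /* CF_BITMAP */
    { 12,    "audio/wave" },   /* CF_WAVE */
    { 13,    "text/unicode" }  /* CF_UNICODETEXT */
};

/* Assumed preference order for text, best first (hypothetical). */
static const char *const textPreference[] = {
    "text/unicode", "text/rtf", "text/plain"
};

/* Return the MIME type for a format number, or NULL if it is not in the table. */
const char *mimeTypeForFormat(unsigned int format)
{
    size_t i;
    for (i = 0; i < sizeof formatMap / sizeof formatMap[0]; i++) {
        if (formatMap[i].format == format)
            return formatMap[i].mimeType;
    }
    return NULL;
}

/* Return the most preferred text format among those offered,
   or 0 if none of them maps to a textual MIME type. */
unsigned int bestTextFormat(const unsigned int *available, size_t count)
{
    size_t p, i;
    assert(available != NULL || count == 0);
    for (p = 0; p < sizeof textPreference / sizeof textPreference[0]; p++) {
        for (i = 0; i < count; i++) {
            const char *mime = mimeTypeForFormat(available[i]);
            if (mime != NULL && strcmp(mime, textPreference[p]) == 0)
                return available[i];
        }
    }
    return 0;
}
```

With CF_TEXT, CF_BITMAP and CF_UNICODETEXT all on the clipboard, bestTextFormat would pick 13 (CF_UNICODETEXT) under this assumed ordering.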
I'll note that when reading Unicode on the Mac it came across as UTF-16 with no byte order mark, so our read method that returned WideString data did:

    readWideStringClipboardData
        | bytes |
        "utf16 plain text has no bom"
        bytes := self readClipboardData: 'public.utf16-plain-text'.
        ^bytes
            ifNil: [bytes]
            ifNotNil: [bytes asString convertFromWithConverter:
                (UTF16TextConverter new useLittleEndian: (SmalltalkImage current endianness = #little))]
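
To make the "no BOM, so assume host byte order" step concrete, here is a minimal C sketch of the same idea: the caller says which byte order to assume and the bytes are decoded into code points. The names hostIsLittleEndian and decodeUtf16NoBom are hypothetical, and this is not how UTF16TextConverter is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF16_DECODE_ERROR ((size_t)-1)

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int hostIsLittleEndian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Read one 16-bit code unit from two bytes in the given byte order. */
static uint32_t readUnit(const unsigned char *p, int littleEndian)
{
    return littleEndian
        ? (uint32_t)p[0] | ((uint32_t)p[1] << 8)
        : ((uint32_t)p[0] << 8) | (uint32_t)p[1];
}

/* Decode BOM-less UTF-16 bytes into code points.
   Returns the number of code points written, or UTF16_DECODE_ERROR on an
   odd byte count, an unpaired surrogate, or a full output buffer. */
size_t decodeUtf16NoBom(const unsigned char *bytes, size_t byteCount,
                        int littleEndian, uint32_t *out, size_t outCapacity)
{
    size_t i = 0, written = 0;
    assert(bytes != NULL && out != NULL);
    if (byteCount % 2 != 0)
        return UTF16_DECODE_ERROR;
    while (i < byteCount) {
        uint32_t unit = readUnit(bytes + i, littleEndian);
        i += 2;
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            /* High surrogate: a low surrogate must follow. */
            uint32_t low;
            if (i >= byteCount)
                return UTF16_DECODE_ERROR;
            low = readUnit(bytes + i, littleEndian);
            i += 2;
            if (low < 0xDC00 || low > 0xDFFF)
                return UTF16_DECODE_ERROR;
            unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            return UTF16_DECODE_ERROR;  /* low surrogate without a high one */
        }
        if (written == outCapacity)
            return UTF16_DECODE_ERROR;
        out[written++] = unit;
    }
    return written;
}
```

A caller mirroring the Smalltalk method would pass hostIsLittleEndian() as the littleEndian argument.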
On writing we did the following and supplied a byte order mark.

    addWideStringClipboardData: aString
        | ba |
        self clearClipboard.
        ba := aString convertToWithConverter: (UTF16TextConverter new useByteOrderMark: true).
        self addClipboardData: ba dataFormat: 'public.utf16-plain-text'
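
The write side, sketched in C under the same caveats: the output starts with U+FEFF so a reader can detect the byte order. The name encodeUtf16WithBom is hypothetical, and the byte order is left as a parameter because which one the converter above emits is not stated here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF16_ENCODE_ERROR ((size_t)-1)

/* Store one 16-bit code unit as two bytes in the given byte order. */
static void putUnit(unsigned char *p, uint32_t unit, int littleEndian)
{
    if (littleEndian) {
        p[0] = (unsigned char)(unit & 0xFF);
        p[1] = (unsigned char)(unit >> 8);
    } else {
        p[0] = (unsigned char)(unit >> 8);
        p[1] = (unsigned char)(unit & 0xFF);
    }
}

/* Encode code points as UTF-16 preceded by a byte order mark.
   Returns the number of bytes written, or UTF16_ENCODE_ERROR on an invalid
   code point or insufficient output space. */
size_t encodeUtf16WithBom(const uint32_t *codePoints, size_t count,
                          int littleEndian,
                          unsigned char *out, size_t outCapacity)
{
    size_t i, used;
    assert(out != NULL && (codePoints != NULL || count == 0));
    if (outCapacity < 2)
        return UTF16_ENCODE_ERROR;
    putUnit(out, 0xFEFF, littleEndian);  /* the byte order mark itself */
    used = 2;
    for (i = 0; i < count; i++) {
        uint32_t cp = codePoints[i];
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return UTF16_ENCODE_ERROR;
        if (cp < 0x10000) {
            if (outCapacity - used < 2)
                return UTF16_ENCODE_ERROR;
            putUnit(out + used, cp, littleEndian);
            used += 2;
        } else {
            /* Outside the BMP: split into a surrogate pair. */
            uint32_t v = cp - 0x10000;
            if (outCapacity - used < 4)
                return UTF16_ENCODE_ERROR;
            putUnit(out + used, 0xD800 + (v >> 10), littleEndian);
            putUnit(out + used + 2, 0xDC00 + (v & 0x3FF), littleEndian);
            used += 4;
        }
    }
    return used;
}
```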
On May 22, 2007, at 3:18 AM, Chris Petsos wrote:
[...]
--
John M. McIntosh johnmci@smalltalkconsulting.com
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
Use UTF-8 to transfer clipboard data. Windows has two nice functions to efficiently convert between UTF-8 and UTF-16 (MultiByteToWideChar and WideCharToMultiByte). By far the easiest solution.
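
On Windows the conversion would normally go through WideCharToMultiByte with CP_UTF8. As a platform-neutral illustration of what that step does, here is a hand-written UTF-16 to UTF-8 sketch; utf16ToUtf8 is a hypothetical name and this is a swapped-in portable equivalent, not the Win32 call itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UTF8_ENCODE_ERROR ((size_t)-1)

/* Convert UTF-16 code units (host byte order) to UTF-8 bytes.
   Returns the number of bytes written, or UTF8_ENCODE_ERROR on an unpaired
   surrogate or insufficient output space. */
size_t utf16ToUtf8(const uint16_t *units, size_t unitCount,
                   unsigned char *out, size_t outCapacity)
{
    size_t i = 0, used = 0;
    assert(units != NULL && out != NULL);
    while (i < unitCount) {
        uint32_t cp = units[i++];
        size_t need;
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: combine with the following low surrogate. */
            if (i >= unitCount || units[i] < 0xDC00 || units[i] > 0xDFFF)
                return UTF8_ENCODE_ERROR;
            cp = 0x10000 + ((cp - 0xD800) << 10) + ((uint32_t)units[i++] - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return UTF8_ENCODE_ERROR;
        }
        need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (outCapacity - used < need)
            return UTF8_ENCODE_ERROR;
        if (need == 1) {
            out[used++] = (unsigned char)cp;
        } else if (need == 2) {
            out[used++] = (unsigned char)(0xC0 | (cp >> 6));
            out[used++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            out[used++] = (unsigned char)(0xE0 | (cp >> 12));
            out[used++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[used++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[used++] = (unsigned char)(0xF0 | (cp >> 18));
            out[used++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[used++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[used++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return used;
}
```

The appeal of this route is that the image only ever sees a byte string in one well-defined encoding, whatever the platform clipboard holds.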
Cheers, - Andreas
vm-dev@lists.squeakfoundation.org