[Vm-dev] primitiveClipboardText mangling line endings

Sat Oct 25 23:52:55 UTC 2014

On Sat, Oct 25, 2014 at 4:18 PM, Levente Uzonyi <leves at elte.hu> wrote:

>
> Hi Eliot,
>
> On Sat, 25 Oct 2014, Eliot Miranda wrote:
>
>> Is there some magic X11 setting that could account for your issue?
>
> No, at least not intentionally. I tried to check if there are any settings
> that could affect the clipboard encoding, but I didn't find anything. It
> seems like only utf-8 is supported.
>
> I'm pretty sure it's related to the VM, because when I copy some text from
> another application and paste it into an image, the line endings are
> converted to CRs, which is very unlikely to happen on Linux.
>
> I tried to copy all 7-bit ASCII characters to the clipboard (except zero,
> which is not possible), because those are encoded identically in UTF-8:
>
> Clipboard default primitiveClipboardText: (1 to: 127) asByteArray.
>
> When I checked it with xclip, it turned out that all the bytes are on the
> clipboard:
>
> $ xclip -o -selection clipboard | hexdump -b;
> 0000000 001 002 003 004 005 006 007 010 011 012 013 014 012 016 017 020
> 0000010 021 022 023 024 025 026 027 030 031 032 033 034 035 036 037 040
> 0000020 041 042 043 044 045 046 047 050 051 052 053 054 055 056 057 060
> 0000030 061 062 063 064 065 066 067 070 071 072 073 074 075 076 077 100
> 0000040 101 102 103 104 105 106 107 110 111 112 113 114 115 116 117 120
> 0000050 121 122 123 124 125 126 127 130 131 132 133 134 135 136 137 140
> 0000060 141 142 143 144 145 146 147 150 151 152 153 154 155 156 157 160
> 0000070 161 162 163 164 165 166 167 170 171 172 173 174 175 176 177
> 000007f
>
> But in the other image, the bytes were filtered:
>
> Clipboard default primitiveClipboardText asByteArray.
> #[9 13 27 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53
> 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78
> 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102
> 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121
> 122 123 124 125 126]
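
To put a number on how much was filtered, here is a rough sketch that
tallies which of the byte values 1..127 are absent from the array the
receiving image reported.  The function names (fillReceived, countMissing)
are made up for illustration and are not part of the VM or the image; the
data is just the list quoted above (9, 13, 27, then 32 through 126).

```c
#include <stddef.h>

/* Hypothetical helper: rebuild the byte list reported by the receiving
   image (9, 13, 27, then 32..126).  buf must hold at least 98 bytes.
   Returns the number of bytes written. */
size_t fillReceived(unsigned char *buf)
{
  size_t n = 0;
  unsigned int b;

  buf[n++] = 9;
  buf[n++] = 13;
  buf[n++] = 27;
  for (b = 32; b <= 126; b++)
    buf[n++] = (unsigned char)b;
  return n;
}

/* Hypothetical helper: count the byte values in 1..127 that never
   occur in received[0..n-1]. */
size_t countMissing(const unsigned char *received, size_t n)
{
  unsigned char seen[128] = {0};
  size_t i, missing = 0;

  for (i = 0; i < n; i++)
    if (received[i] < 128)
      seen[received[i]] = 1;
  for (i = 1; i <= 127; i++)
    if (!seen[i])
      missing++;
  return missing;
}
```

Fed the list above, fillReceived yields 98 bytes and countMissing reports
29 absent values: everything below 32 other than 9, 13 and 27, plus 127.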

Ugh, *why* does X11 have to be so complicated?  Why does Ian's VM code have
to be so complicated?  I've discovered that there's a -textenc flag for the
VM.  If you do

squeak -textenc UTF8 myimage.image

in both images then you'll get all 127 characters copied across.  I'd
immediately make this the default, but
a) there is no command-line argument to select the current default, whatever
that is, and
b) I *don't know* what the current default is called, so I can't figure out
a name for it.  It's not that simple to determine.  Here's the operative code
from platforms/unix/vm-display-X11/sqUnixX11.c:

static char *getSelectionFrom(Atom source)
{
  char * data= NULL;
  size_t bytes= 0;

  /* request the selection */
  Atom target= textEncodingUTF8 ? xaUTF8String
                                : (localeEncoding ? xaCompoundText : XA_STRING);
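
Read on its own, that conditional gives UTF8 first priority, then the
locale's compound text, then plain STRING.  A minimal sketch of the same
decision follows, with the atoms replaced by strings so it builds without
Xlib.  chooseTarget is a hypothetical name, and I am assuming that
xaUTF8String and xaCompoundText stand for the UTF8_STRING and COMPOUND_TEXT
atoms; I haven't traced where they are interned.

```c
/* Hypothetical stand-in for the target selection in getSelectionFrom().
   Returns an atom *name* rather than an Atom so no X11 headers are needed.
   The mapping of the VM's variables to these names is an assumption. */
const char *chooseTarget(int textEncodingUTF8, int localeEncoding)
{
  if (textEncodingUTF8)
    return "UTF8_STRING";     /* assumed to correspond to xaUTF8String */
  if (localeEncoding)
    return "COMPOUND_TEXT";   /* assumed to correspond to xaCompoundText */
  return "STRING";            /* XA_STRING, the predefined atom */
}
```

So with neither flag set the request falls through to STRING, and
textEncodingUTF8 wins even when localeEncoding is also set.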

Further down there's

# if defined(X_HAVE_UTF8_STRING)
      if (uxUTF8Encoding == sqTextEncoding)
        Xutf8TextPropertyToTextList(stDisplay, &textProperty, &strList, &n);
      else
# endif
        XmbTextPropertyToTextList(stDisplay, &textProperty, &strList, &n);

So I guess UTF8 support was added at some later point, which is why it isn't
the default.  Any objections to us making it the default now?

Ugh...
-- 
best,
Eliot