[squeak-dev] unrecognized unicode characters

Levente Uzonyi leves at elte.hu
Fri Sep 2 18:39:41 UTC 2011


On Fri, 2 Sep 2011, Gonzalo Romano wrote:

> I'm using "RePlugin" by Andrew Greenberg, and I'm opening the file
> like this: "aFileEntry readWriteStream", where aFileEntry is a
> DirectoryEntryFile.

Okay, I guess RePlugin is responsible for the problem. It was written for 
pre-3.8 Squeak, so it most likely doesn't support WideStrings. 
PCRE supports UTF-8 encoded strings, so if it's possible to tell 
RePlugin that the text uses that encoding, then you should be able to 
make it work.
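
To illustrate why the encoding matters to a byte-oriented regex engine, here
is a minimal sketch. It is written in Python rather than Squeak and assumes
nothing about RePlugin's actual API; the sample string and variable names are
made up for the example. It only shows that U+2026 and U+2014 each occupy
three bytes in UTF-8, so a matcher that works byte by byte splits them, while
a matcher that works on decoded characters does not.

```python
import re

# Hypothetical sample text containing U+2026 (ellipsis) and U+2014 (em dash).
text = "wait\u2026 done\u2014ok"
utf8 = text.encode("utf-8")

# In UTF-8 the ellipsis is a three-byte sequence.
ellipsis_bytes = "\u2026".encode("utf-8")

# In a bytes pattern "." matches a single byte, so it captures only the
# first byte of the three-byte character.
byte_match = re.match(rb"wait(.)", utf8).group(1)

# In a text pattern "." matches the whole character.
char_match = re.match(r"wait(.)", text).group(1)

# Processing the decoded text and encoding again keeps the output valid UTF-8.
fixed = re.sub("\u2026", "...", text).encode("utf-8")
```

The same idea applies to the file handling: decode the UTF-8 input once,
process characters, and encode once when writing the result.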


Levente

>
> 2011/9/1 Levente Uzonyi <leves at elte.hu>:
>> On Tue, 30 Aug 2011, Gonzalo Romano wrote:
>>
>>> Hi Levente, thanks for your answer. The idea was just to process the
>>> files. I'm sure the files are in UTF-8; I've used squeakToUtf8 to
>>> convert the string and write the files, but no luck.
>>
>> That converter should be fine if your files really have UTF-8 encoding.
>>
>>>
>>> Am I using the right text converter?
>>> I'm doing some stuff with regexps and rewriting the file. These
>>> characters have no translation to ASCII or ISO; could that be the
>>> problem?
>>
>> Which regular expression library do you use? How are you opening the file
>> you're writing the output into?
>>
>>
>> Levente
>>
>>>
>>> 2011/8/29 Levente Uzonyi <leves at elte.hu>:
>>>>
>>>> On Mon, 29 Aug 2011, Gonzalo Romano wrote:
>>>>
>>>>> Hi, I've been working on a script to fix some XML files for a web
>>>>> application, and I'm having some trouble with character encoding.
>>>>> It seems there are some characters that Squeak does not recognize, like
>>>>> "..." -> u2026, u2014, that MS Word uses in its text files...
>>>>> Could anyone confirm this, and maybe provide a workaround?
>>>>> Thanks in advance!
>>>>
>>>> Would you like to display those documents in Squeak or just process the
>>>> files with a program you wrote?
>>>> In the first case you have to install and use a font that contains the
>>>> missing characters (the default font doesn't contain these).
>>>> In the second case you have to make sure that you're using the right text
>>>> converter for your document.
>>>>
>>>>
>>>> Levente
>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Gonzalo, Romano
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Gonzalo, Romano
>>>
>>
>>
>>
>>
>
>
>
> -- 
> Gonzalo, Romano
>
>

