Hi all,
while debugging some encoding issues in Squeak Inbox Talk, I observed a possible encoding issue with the pipermail archives for squeak-dev. Specifically, I'm referring to the downloadable versions, e.g., http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November.txt or http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November.txt.gz.
When I retrieve such a file, which is encoded as UTF-8, using WebClient, the Content-Type header of the server response does not include the charset, so Squeak does not decode the content as UTF-8. I'm not sure, but shouldn't the server indicate this? Of course, I can work around this; I just wanted to document it here. :-)
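For illustration, here is a minimal sketch of the kind of client-side workaround I mean. It is written in Python rather than Squeak, and the function names `charset_from_content_type` and `decode_body` are hypothetical. It reads the charset parameter from a Content-Type value and falls back to UTF-8 when the server omits it. The fallback to UTF-8 is my assumption about these archive files, not something the server states.

```python
from email.message import Message


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset parameter of a Content-Type value.

    Falls back to `default` (an assumption on the client's part) when
    the header is missing or carries no charset parameter.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lower-cased charset, or None.
    return msg.get_content_charset() or default


def decode_body(raw, content_type):
    """Decode raw response bytes using the declared or assumed charset."""
    charset = charset_from_content_type(content_type)
    # errors="replace" keeps decoding going if the guess is wrong.
    return raw.decode(charset, errors="replace")
```

With a header value such as `text/plain` (no charset), `decode_body` treats the bytes as UTF-8; with `text/plain; charset=ISO-8859-1`, it honors the declared charset.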
(Note that this bug report is independent of the multipart encoding issue I analyzed in http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November/222729.... which turned out to be my fault.)
Best, Christoph
--- Sent from Squeak Inbox Talk
Hi,
Here is a second, slightly related issue: The notification mails sent by commits@source.squeak.org do not declare the charset of their messages, for instance this message: http://lists.squeakfoundation.org/pipermail/squeak-dev/2021-November/217286....
This leads to exactly the problem described above: in the monthly archive files, the special character in the patch is replaced by a U+FFFD replacement character.
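To illustrate how such a replacement character can arise, here is a small Python sketch. It assumes the mail body was written in a single-byte encoding such as ISO-8859-1 and later decoded as UTF-8. I have not verified that this is what actually happens in the archive, and the function name `decode_as_utf8` is hypothetical.

```python
def decode_as_utf8(raw):
    """Decode bytes as UTF-8, substituting U+FFFD for invalid sequences."""
    return raw.decode("utf-8", errors="replace")


original = "Gr\u00fc\u00dfe"            # "Grüße"
raw = original.encode("iso-8859-1")     # assumed encoding of the mail body
garbled = decode_as_utf8(raw)           # bytes misread as UTF-8
```

Because the ISO-8859-1 bytes for the non-ASCII characters are not valid UTF-8 sequences, `garbled` contains U+FFFD where the original characters were, and the original text can no longer be recovered from it.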
Best, Christoph
--- Sent from Squeak Inbox Talk
On 2022-11-28T15:07:55+01:00, christoph.thiede@student.hpi.uni-potsdam.de wrote:
Hi all,
while debugging some encoding issues in Squeak Inbox Talk, I observed a possible encoding issue with the pipermail archives for squeak-dev. Specifically, I'm referring to the downloadable versions, e.g., http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November.txt or http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November.txt.gz.
When I retrieve such a file, which is encoded as UTF-8, using WebClient, the Content-Type header of the server response does not include the charset, so Squeak does not decode the content as UTF-8. I'm not sure, but shouldn't the server indicate this? Of course, I can work around this; I just wanted to document it here. :-)
(Note that this bug report is independent of the multipart encoding issue I analyzed in http://lists.squeakfoundation.org/pipermail/squeak-dev/2022-November/222729.... which turned out to be my fault.)
Best, Christoph
Sent from Squeak Inbox Talk
squeak-dev@lists.squeakfoundation.org