[squeak-dev] Porting Speech/TTS to Squeak Trunk (was: Speech.sar is missing)

David T. Lewis lewis at mail.msen.com
Wed Oct 13 22:39:52 UTC 2021


Hi Christoph,

I have to apologize for not replying earlier, I got busy elsewhere and forgot.

It's fine if you want to keep your existing 'c​t' account,
although I would actually recommend that you just create a new
account using 'ct2' or similar, and I can add that as developer
on the http://www.squeaksource.com/Speech project.

I have also added a new validation check on squeaksource.com to
require that the author initials used as an account user ID contain
only ASCII alphanumeric characters or the $. character. It's a bit
culturally insensitive, but I think that this is a reasonable
restriction given that SqueakSource is very easily confused by
multibyte character strings.
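
For illustration only, here is a minimal sketch of such a rule. It is
written in Python rather than the Smalltalk that SqueakSource actually
runs, and the helper name is_valid_initials and the non-empty
requirement are assumptions made for this example, not the code
deployed on squeaksource.com:

```python
import string

# Hypothetical helper, not the actual SqueakSource validation rule:
# accept only ASCII letters, ASCII digits, and the '.' character.
ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + ".")


def is_valid_initials(initials: str) -> bool:
    """Return True if initials are non-empty and use only allowed characters."""
    # The non-empty requirement is an assumption of this sketch.
    return len(initials) > 0 and all(ch in ALLOWED_CHARACTERS for ch in initials)


print(is_valid_initials("ct2"))        # True: plain ASCII initials
print(is_valid_initials("c\u200bt"))   # False: contains U+200B ZERO WIDTH SPACE
print(is_valid_initials("c&#8203;t"))  # False: '&', '#' and ';' are not allowed
```

A generic test such as Python's str.isalnum() would not be enough
here, because it also accepts non-ASCII letters and digits; hence the
explicit ASCII character set.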

We have many contributions in the history of Squeak similar to
those of Tricot Christophe. The author may no longer be active
in the community, but the contributions are meaningful and part
of our shared history :-)

Dave

On Mon, Oct 04, 2021 at 04:21:14AM +0200, christoph.thiede at student.hpi.uni-potsdam.de wrote:
> Hi Dave,
> 
> first, I am sorry that I did not manage to reply earlier to you, and second, I am even more sorry that I have caused you so much extra work. Registering with an invisible Unicode character was really a stupid idea ... Actually, when I chose this username, I was assuming I would already have registered as normal 'ct' earlier but would have forgotten my credentials. It was not my intention to disrespect Tricot Christophe in any way.
> 
> I have just tried to log in as c​t and was able to successfully upload Speech-ct.13 to Squeaksource! (By the way, in my email clients (Squeak Inbox Talk and Outlook.com) all your mentions of this string are displayed correctly.)
> 
> As it works for me now, guided by the motto "Never change a running system", I would stick with this name -- provided that no one else has a problem with it. If anyone is planning to sanitize the credential database for Squeaksource, or if Tricot is still active and feels distracted because of my registration, please let me know and I will agree to remove this account immediately.
> 
> And yes, form validation sounds like a very good idea, I guess. I don't want to know whether the absence of any validation step could probably even be exploited to run something like an XSS attack against Squeaksource, but I promise that I won't try. :D
> 
> By the way, I had already been talking with Marcel some months ago about changing the unofficial conventions for author initials; at our institute, we are also observing far too many users taking the same initials. There is already a changeset lying on his desk that suggests new users choose more unambiguous initials (you can also find it in the attachment).
> 
> Best,
> Christoph
> 
> ---
> Sent from Squeak Inbox Talk
> 
> On 2021-09-25T12:45:46-04:00, lewis at mail.msen.com wrote:
> 
> > OK, I give up on the email conversions. The nine character byte
> > string that I intended to send to you contains these characters:
> > 
> > #($c $& $# $8 $2 $0 $3 $; $t)
> > 
> > If you enter those characters as your author initials when
> > logging in to squeaksource.com, then you should be able to
> > get access. But probably a simpler solution is just to create a
> > new account for 'ct2' to avoid the conflict with the existing
> > 'ct' used by Christophe Tricot. Let me know if you do that,
> > and I'll add the new account as developer on Speech.
> > 
> > Dave
> > 
> > On Sat, Sep 25, 2021 at 11:25:24AM -0400, David T. Lewis wrote:
> > > Hi Christoph,
> > > 
> > > I discovered by accident today that the message below might make
> > > no sense to you because the author initials that I intended to
> > > send to you probably appeared as two unicode characters when you
> > > read it. It looked right to me in my text mode mail reader, but
> > > when I look at it on the mail archive it has been converted back
> > > to unicode.
> > > 
> > > My apologies for the confusion.
> > > 
> > > What I intended to say is that if you log on to squeaksource.com
> > > > with author initials c​t (nine 8 bit ascii characters to
> > > me, and probably nine unicode characters for you), then you should
> > > be able to log in to your existing account.
> > > 
> > > Chatting with Marcel today, he explained that he had a similar
> > > issue (someone on squeaksource.com had previously used 'mt'),
> > > and he solved it by registering himself as 'mt2'. You may want
> > > to do something similar, perhaps registering yourself as 'ct2'
> > > for the squeaksource.com account.
> > > 
> > > I think that I have found a way to protect against this kind
> > > of error in squeaksource. I can see that there are validate
> > > rules in the form input, so I'll see if I can add a rule to
> > > prevent the use of author initials that do not look like
> > > author initials.
> > > 
> > > Dave
> > > 
> > > On Tue, Sep 21, 2021 at 06:12:04PM -0400, David T. Lewis wrote:
> > > > One correction below:
> > > > 
> > > > On Tue, Sep 21, 2021 at 05:23:55PM -0400, David T. Lewis wrote:
> > > > > Hello Christoph,
> > > > > 
> > > > > I figured out what is happening. Attached is your user ID on squeaksource.com
> > > > > exactly as it appears in the squeaksource image. When you entered your
> > > > > author initials, the Unicode characters got converted into a 9-byte string.
> > > > > SqueakSource maintains a dictionary of users, and this is the key to your
> > > > > > account in that dictionary. Your actual author initials are recorded
> > > > > separately in your account information, and these appear as 'ct'.
> > > > 
> > > > Correction - the member initials are not correct in the account information,
> > > > I misread it. Maybe it's time to locate and fix this bug once and for all.
> > > > 
> > > > Dave
> > > > 
> > > > 
> > > > > 
> > > > > This means that if you log in to squeaksource.com using 'c​t' rather
> > > > > than 'ct' as your user ID, you will be able to log in successfully.
> > > > > 
> > > > > That leaves two problems:
> > > > > 
> > > > > 1) Your author initials are the same as those of Christophe Tricot.
> > > > > I don't know if this will cause problems or not, but you should be
> > > > > > aware of it in case of issues.
> > > > > 
> > > > > 2) Using 'c​t' to log in to this repository is not very convenient.
> > > > > 
> > > > > I think that you may well be the first person to explore the concept
> > > > > of using incomprehensible user names as a technique for authentication.
> > > > > This might prove to be an exciting new frontier in computer security ;-)
> > > > > 
> > > > > Note, this bug in SqueakSource has been present since the beginning.
> > > > > We had frequent problems with this back when squeaksource.com was
> > > > > being used for classroom assignments. I had hoped that the issue
> > > > > would somehow have been fixed when I updated squeaksource.com to
> > > > > the latest sources, and ran it on a Squeak 5.3 image. Unfortunately
> > > > > the problem is apparently still with us. I think that we do not see
> > > > > it on our source.squeak.org server because it has a much smaller
> > > > > number of user accounts, and no multibyte user IDs.
> > > > > 
> > > > > Dave
> > > > > 
> > > > > 
> > > > > On Mon, Sep 20, 2021 at 08:15:43PM -0400, David T. Lewis wrote:
> > > > > > Hi Christoph,
> > > > > > 
> > > > > > It's not a Unicode issue, rather it seems that you may not be the
> > > > > > first person to have used initials 'ct' on squeaksource, see attached.
> > > > > > 
> > > > > > Tricot Christophe ('ct') has some legitimate projects on the SqueakSource
> > > > > > server (PicaRobot and BotIncPica).
> > > > > > 
> > > > > > I am not sure of the right way to handle this.
> > > > > > 
> > > > > > Dave
> > > > > > 
> > > > > > 
> > > > > > On Mon, Sep 20, 2021 at 11:29:30PM +0000, Thiede, Christoph wrote:
> > > > > > > Hm ... Still no access (401 Unauthorized). What could this be? Is this because of the unfortunate Unicode space in my initials?
> > > > > > > 
> > > > > > > 
> > > > > > > Best,
> > > > > > > 
> > > > > > > Christoph
> > > > > > > 
> > > > > > > ________________________________
> > > > > > > > From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of David T. Lewis <lewis at mail.msen.com>
> > > > > > > > Sent: Monday, September 20, 2021 23:08:04
> > > > > > > > To: The general-purpose Squeak developers list
> > > > > > > > Subject: Re: [squeak-dev] Porting Speech/TTS to Squeak Trunk (was: Speech.sar is missing)
> > > > > > > 
> > > > > > > Hi Christoph,
> > > > > > > 
> > > > > > > > I added you as developer on http://www.squeaksource.com/Speech.
> > > > > > > You can copy your Speech-ct.11.mcz from inbox to this repo, then
> > > > > > > merge and commit freely. There is no need for reviews or approvals,
> > > > > > > you know what you are doing so please just go ahead and do whatever
> > > > > > > updates you feel are appropriate.
> > > > > > > 
> > > > > > > Thanks!
> > > > > > > Dave
> > > > > > > 
> > > > > > > 
> > > > > > > On Mon, Sep 20, 2021 at 04:24:34PM +0000, Thiede, Christoph wrote:
> > > > > > > > Here are my author initials again in plain Unicode:
> > > > > > > >
> > > > > > > > ct
> > > > > > > >
> > > > > > > > Copying and pasting all three characters should work, I hope. :-)
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > >
> > > > > > > > Christoph
> > > > > > > >
> > > > > > > > ________________________________
> > > > > > > > > From: Squeak-dev <squeak-dev-bounces at lists.squeakfoundation.org> on behalf of Thiede, Christoph
> > > > > > > > > Sent: Saturday, September 18, 2021 18:35:35
> > > > > > > > > To: squeak-dev at lists.squeakfoundation.org; lewis at mail.msen.com
> > > > > > > > > Subject: Re: [squeak-dev] Porting Speech/TTS to Squeak Trunk (was: Speech.sar is missing)
> > > > > > > >
> > > > > > > > Hi Dave,
> > > > > > > >
> > > > > > > > > no problem, thanks for the pointer. I have now registered on Squeaksource with the author initials returned by 'c%E2%80%8Bt' unescapePercents (sorry, it contains a Unicode character that the SMTPClient currently fails to encode correctly; plain 'ct' was already in use ...).
> > > > > > > >
> > > > > > > > I have also updated the Wiki entry<https://wiki.squeak.org/squeak/651> and added a link to this Squeaksource project. :-)
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Christoph
> > > > > > > >
> > > > > > > > ---
> > > > > > > > Sent from Squeak Inbox Talk<https://github.com/hpi-swa-lab/squeak-inbox-talk>
> > > > > > > >
> > > > > > > > On 2021-09-18T10:51:03-04:00, lewis at mail.msen.com wrote:
> > > > > > > >
> > > > > > > > > On Sat, Sep 18, 2021 at 04:01:03PM +0200, christoph.thiede at student.hpi.uni-potsdam.de wrote:
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I have finally found the time to try again to get Speech running under Squeak 5.3. The SAR file in the wiki is incomplete, but this one can be loaded with almost zero errors into a fresh trunk image: https://source.squeak.org/39a/Speech-md.9.mcz
> > > > > > > > > >
> > > > > > > > > > > I have just uploaded Speech-ct.11 to the inbox, which can be loaded without any hiccups into Squeak Trunk. :-) The question is, what shall we do with it? Speech is no longer part of the Trunk (I don't know why), and I have no idea what the best way is to keep it in a maintainable state. Shall we re-add it to the Trunk? Shall I make a separate GitHub repository for it? Open to your ideas. :-)
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > Hi Christoph,
> > > > > > > > >
> > > > > > > > > Please set up an account for yourself on squeaksource.com, exactly
> > > > > > > > > like your ct account on source.squeak.org. There is already a
> > > > > > > > > Speech project there (http://www.squeaksource.com/Speech) that
> > > > > > > > > begins with the Speech-md.9.mcz that you found. I'll add you to
> > > > > > > > > that project so that you can contribute directly. You can copy
> > > > > > > > > your Speech-ct.11 directly there, and merge as needed. The most
> > > > > > > > > recent update in that repository is from 2012, so I'm sure that
> > > > > > > > > it is overdue for some updates :-)
> > > > > > > > >
> > > > > > > > > Presumably someone should also update the SqueakMap entry for
> > > > > > > > > Speech, which apparently points to an old copy that was extracted
> > > > > > > > > from Squeak 3.4.
> > > > > > > > >
> > > > > > > > > Sorry I did not mention this earlier, but I honestly had forgotten
> > > > > > > > > that the squeaksource repository was there, even though I am the
> > > > > > > > > person who set it up long ago.
> > > > > > > > >
> > > > > > > > > Dave
> > > > > > > > >
> > > > > > > > >
> > > > > > > 
> > > > > > > >
> > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > > 
> > > > > 
> > >
> > 
> > 
> ["authorInitials-check.3.cs"]
