[squeak-dev] Re: [Seaside] SqueakSource/Seaside question - has anyone seen this problem before?

David T. Lewis lewis at mail.msen.com
Mon Dec 23 14:30:13 UTC 2013


Thanks Philippe,

This is very helpful, much appreciated.

Dave

On Mon, Dec 23, 2013 at 12:51:52PM +0100, Philippe Marschall wrote:
> On Sun, Dec 22, 2013 at 5:50 PM, David T. Lewis <lewis at mail.msen.com> wrote:
> > I think that we probably have a few issues related to wide strings in the
> > SqueakSource code, and these issues certainly will affect source.squeak.org
> > in the same way that we have seen on squeaksource.com. The only difference
> > being that we do not happen to have any author names with multibyte characters
> > registered on source.squeak.org at the moment.
> >
> > When I was originally loading the old SqueakSource onto our squeak.org servers,
> > I found some problems with the image updating its repository from disk, and
> > at the time I chose to work around them manually in order to get the system
> > up and running. But there seem to be places where the identity of an author
> > is saved in the repository (in the image, not on disk), and stored with
> > possibly different encoding in the MCZ file names on disk, and may be stored
> > with yet another possibly different encoding internally within the MCZ file.
> >
> > The good news is that the squeaksource.com files and image give us enough
> > real life data that we should be able to locate the problem cases and think
> > about how to handle them properly. For example, the ss.log file shows evidence
> > of continuing problems related to six specific files:
> >
> > 2013-12-21T18:39:10.089+00:00 RECOVERING FelTimetable/FelTimetable-M????Sa.53.mcz
> > 2013-12-21T18:39:11.353+00:00 RECOVERING FelTimetable/FelTimetable-M????Sa.52.mcz
> > 2013-12-21T18:39:11.602+00:00 RECOVERING FelTimetable/FelTimetable-M????Sa.55.mcz
> > 2013-12-21T18:39:12.884+00:00 RECOVERING FelTimetable/FelTimetable-M????Sa.66.mcz
> > 2013-12-21T18:39:14.193+00:00 RECOVERING FelTimetable/FelTimetable-M????Sa.54.mcz
> > 2013-12-21T18:39:15.45+00:00 RECOVERING FelTimetable/Seaside2.8a1-M????Sa.49.mcz
> >
> > So some follow up is needed. But maybe not today, for now I'm just happy
> > to have the site running again :-)
> 
> Long story short it's messy and your options are kinda limited. The
> problems stem from the fact that the Seaside version of SqueakSource
> is very old (probably a decade by now). It's unmaintained and missing
> all the Unicode fixes that went in over the past years. There are
> newer versions of SqueakSource available [1] [2] [3] that work with
> newer versions of Seaside. The trouble however is migrating (you'll
> likely have to migrate to a newer version of Squeak as well). But I'm
> sure all the advocates of images and objects will be eager to lend you a
> helping hand.
> 
> Now regarding encoding there are two things you need to decide. What
> should be the internal encoding in the image and what should be the
> external encoding on the web page. If they are different some
> transcoding has to happen for both input and output. In Seaside 3.x
> this is quite easy to do, in Seaside 2.6 not so much. The webpage
> currently seems to use iso-8859-1 as indicated in the XML preamble
> (there is no HTTP header). I assume (without being sure) that the
> internal encoding is Squeak/MacRoman. Which brings us to the question
> how St???ane Munioz ended up in the image. Can you confirm that his name
> is a WideString and ??? is a single instance of Character?
> The obvious choice at this point would be to go for utf-8 external and
> Squeak internal. The easiest way to do this would be to use
> WAKomEncoded but I don't think this is even present in this version of
> Seaside. Remember you'll have to encode all the output and decode all
> the input. For example when I search for "Munioz" under "Members"
> still only half the page renders. There is one downside to this
> approach though and that is that you'll end up having WideStrings in
> the image. WideString has a bad reputation of being slow and buggy.
> Seaside 3.x helps a bit because the response would be encoded on the
> fly and therefore avoid a huge WideString response buffer. To avoid
> this you could use utf-8 internally but that breaks all length related
> methods and you'll have to pay attention when interacting with
> external systems (e.g. the file system).
> 
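
The trade-off described above (transcoding at the boundary versus holding utf-8 bytes internally) can be sketched outside Smalltalk. The following is a minimal Python illustration, not SqueakSource or Seaside code; the helper names decode_input and encode_output and the sample name are assumptions made up for this example.

```python
# Minimal sketch, assuming utf-8 as the external encoding.
# decode_input / encode_output are hypothetical helper names.

EXTERNAL_ENCODING = "utf-8"


def decode_input(raw: bytes, encoding: str = EXTERNAL_ENCODING) -> str:
    """Turn bytes received from the browser into internal text."""
    return raw.decode(encoding)


def encode_output(text: str, encoding: str = EXTERNAL_ENCODING) -> bytes:
    """Turn internal text into bytes for the response."""
    return text.encode(encoding)


# Made-up sample name with one non-ASCII character (U+00EB).
name = "Zo\u00eb"

utf8_bytes = encode_output(name)
latin1_bytes = encode_output(name, "iso-8859-1")

# Length-related methods disagree once text is held as utf-8 bytes:
# three characters, but four bytes in utf-8.
char_count = len(name)
utf8_byte_count = len(utf8_bytes)
latin1_byte_count = len(latin1_bytes)

# Decoding utf-8 bytes with the wrong charset produces one character
# per byte, so the single non-ASCII character shows up as two.
mojibake = decode_input(utf8_bytes, "iso-8859-1")

# Encoding into a charset that lacks the character substitutes "?".
lossy = name.encode("ascii", errors="replace")

# Transcoding in both directions round-trips when the charsets match.
round_trip = decode_input(encode_output(name))
```

Whether a stored name is one wide character or several narrow ones is the same distinction as name versus mojibake here, which is why it matters to check how the name actually sits in the image.
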
> General items:
> One of the optimizations we never had time to implement was installing
> mod_xsendfile [4]. Serving all the MCZ files through the image is very
> inefficient and puts unnecessary pressure on the image. We can't do it
> directly with Apache because we have to do an authentication check
> first. mod_xsendfile would allow the image to tell Apache which file
> to serve.
> From time to time the image would lock up completely. We applied
> several patches that were supposed to make Semaphore thread-safe but
> the issue never fully went away. Some people said this was because
> SqueakSource was never designed to handle this load. I don't
> understand this argument: even if this is the case, that should just
> make the image slow, not lock it up.
> 
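
The mod_xsendfile idea mentioned above (the application does the authentication check, then tells Apache which file to serve) can be sketched as a small WSGI application in Python. This is an illustration under stated assumptions, not SqueakSource code: REPOSITORY_ROOT is a hypothetical path, is_authorized is a placeholder for the real check, and Apache would need mod_xsendfile enabled (XSendFile On, with XSendFilePath covering the repository directory).

```python
import posixpath

# Hypothetical location of the MCZ files on disk.
REPOSITORY_ROOT = "/srv/squeaksource/repository"


def is_authorized(environ) -> bool:
    """Placeholder check: a real application would consult its own
    member and project permissions here."""
    return environ.get("REMOTE_USER") is not None


def app(environ, start_response):
    """Answer with an X-Sendfile header instead of the file contents."""
    if not is_authorized(environ):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"forbidden\n"]

    relative = environ.get("PATH_INFO", "").lstrip("/")
    path = posixpath.normpath(posixpath.join(REPOSITORY_ROOT, relative))

    # Refuse anything that resolves outside the repository directory.
    if not path.startswith(REPOSITORY_ROOT + "/"):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"not found\n"]

    # The body stays empty: Apache reads the header and streams the file.
    start_response("200 OK", [
        ("Content-Type", "application/octet-stream"),
        ("X-Sendfile", path),
    ])
    return [b""]
```

The point of the design is that the application only produces a few header bytes per download, so large files never pass through it.
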
>  [1] http://www.squeaksource.com/ss2.html
>  [2] http://www.squeaksource.com/squeaksource3.html
>  [3] http://ss3.gemstone.com/ss/ss3.html
>  [4] https://tn123.org/mod_xsendfile/
> 
> Cheers
> Philippe

