Swiki Mirroring (was Re: Squeak swiki is back up)

Mark Guzdial guzdial at cc.gatech.edu
Tue Aug 8 14:08:22 UTC 2000


An issue in doing the mirroring using standard software is the 
History links.  When capturing the Swiki, you probably don't need all 
previous versions of a page, but we make them all available so that 
users can "fix" things themselves.  You don't REALLY want a web crawl 
to follow all those history links -- that's a huge amount of 
material, and it's fairly hard on the server to parse out the old 
ones and provide them.
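
For anyone scripting their own crawl, here is a rough sketch of the kind of URL filter I mean. It assumes the history and edit pages can be recognized by the ".version" and ".edit" suffixes that Doug mentions below; the function name and constants are made up for illustration and are not part of any mirroring package.

```python
from urllib.parse import urlsplit

# Assumptions: host and directory come from Doug's description of his crawl;
# the suffixes are the ones he excluded in his HTTrack setup.
MIRROR_HOST = "minnow.cc.gatech.edu"
MIRROR_PREFIX = "/squeak"
SKIP_SUFFIXES = (".edit", ".version")


def should_mirror(url: str) -> bool:
    """Return True if a crawler should fetch this URL for the mirror."""
    parts = urlsplit(url)
    # Stay on the Swiki host.
    if parts.hostname != MIRROR_HOST:
        return False
    # Stay inside the squeak directory and its subdirectories.
    path = parts.path
    if path != MIRROR_PREFIX and not path.startswith(MIRROR_PREFIX + "/"):
        return False
    # Skip edit forms and old page versions.
    return not path.endswith(SKIP_SUFFIXES)
```

A crawler would call should_mirror() on each link before queueing it, so the old versions never get requested from the server at all.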

Georgia Tech's own robot insists on following all the History links 
(despite my best efforts with robots.txt files and at cajoling those 
building the software). If you go to http://www.gatech.edu and search 
for the name of anyone who's ever taken a class using a Swiki, you'll 
find scores of hits, most of them just Versions of a small number of 
pages. 
<sigh>
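
For comparison, this is roughly what a well-behaved robot is supposed to do with a robots.txt file, sketched with Python's standard urllib.robotparser module. The rules and the /squeak/history/ path below are purely hypothetical placeholders, not the actual file on our server. Note that plain robots.txt rules match by path prefix, so they can only keep a robot out of pages that share a common leading path.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not the real robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /squeak/history/
"""


def make_checker(rules_text: str) -> RobotFileParser:
    """Parse robots.txt text into an object a crawler can query."""
    parser = RobotFileParser()
    parser.parse(rules_text.splitlines())
    return parser


checker = make_checker(EXAMPLE_RULES)

# A polite crawler asks before every fetch ("ExampleMirrorBot" is made up).
allowed = checker.can_fetch(
    "ExampleMirrorBot", "http://minnow.cc.gatech.edu/squeak/1")
blocked = checker.can_fetch(
    "ExampleMirrorBot", "http://minnow.cc.gatech.edu/squeak/history/1")
```

None of this helps, of course, if the robot never consults the file in the first place.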

Mark


>For those who mentioned interest in mirroring the Swiki, I've been 
>testing some mirroring software.
>This basically just follows the links from the squeak/1 page and 
>downloads all relevant files under
>minnow.cc.gatech.edu/squeak to my local hard drive.
>
>I think my mirroring attempt actually brought down the Swiki on 
>Saturday night, because the mirroring
>was chugging along successfully for about 15 minutes, then the Swiki 
>went down.  I guess mirroring
>hammers it pretty well, although I did limit the software to use 
>only one connection at a time.  In
>any case, it's a useful reliability test for the Swiki. :)
>
>So, the Swiki apparently isn't too reliable with the older version 
>of the MacOS that minnow uses... (I
>thought it was 7.6?)  As a separate test, I tried mirroring the 
>Squeak Book swiki on coweb (which uses
>a different/newer OS) and that worked fine, although it is a much 
>smaller swiki.
>
>It looks like there's plenty of free mirroring software available 
>for Windows, Linux, Unix, etc.  The
>package I tried was HTTrack for Windows. (http://httrack.free.fr/)
>
>When configuring HTTrack, by default it downloads all 
>html/gif/jpg/png linked files, only within the
>squeak directory and subdirectories.  I also set it to *not* 
>download *.edit and *.version files.  (We
>don't want the *.edit files, and the *.version files don't seem all 
>that necessary, they take up a lot
>of space.)  Doing this causes the mirror to link back to minnow when 
>you hit the "edit" icon, which
>isn't perfect, but seems reasonable.  (Ideally, maybe the edit icon 
>would be disabled.  A search &
>replace script on the html files could do this.)   Also, there's a 
>command-line mode in HTTrack if you
>wanted to run it nightly, etc.  Anyway, the mirrored Squeak Book 
>pages on my hard drive work fine.
>
>So, it sounds like we'll have to wait for the OS upgrade to minnow 
>before attempting a mirror.
>Probably once it's upgraded, it'll be more reliable and the mirror 
>won't be quite as essential, but I
>still think a mirror might be a good idea, since the Swiki is an 
>important resource for a lot of us.
>
>- Doug Way
>  dway at mat.net, @riskmetrics.com
>  RiskMetrics Group, Ann Arbor, MI
>  http://www.riskmetrics.com
>
>
>"Jochen F. Rick" wrote:
> >
> > Hello, as the title line says, the Squeak Swiki is back up.
> > We have yet to hear back from our support services about upgrading the
> > OS. Hopefully, upgrading the OS will help. We have tried every MacVM
> > possible, so I don't think it's a Squeak problem.
> >
> > John McIntosh said that OpenTransport in MacOS8.0 has some serious
> > problems. Thus, we are figuring that to be the problem.
> >
> > Several people have asked about setting up a mirror. My recommendation is
> > to wait until we have CNS upgrade it. If reliability is still a problem,
> > we can then look into other solutions.
> >
> > Peace and Luck!
> >
> > Je77

--------------------------
Mark Guzdial : Georgia Tech : College of Computing : Atlanta, GA 30332-0280
Associate Professor - Learning Sciences & Technologies.
Collaborative Software Lab - http://coweb.cc.gatech.edu/csl/
(404) 894-5618 : Fax (404) 894-0673 : guzdial at cc.gatech.edu
http://www.cc.gatech.edu/gvu/people/Faculty/Mark.Guzdial.html




