Blocks from strings
renggli at student.unibe.ch
Thu Jan 1 20:18:31 UTC 2004
> To see this whole thing in action, first go here:
> This will make the system think you are a spider.
> Now just hit the site normally:
> You will see that all the Seaside gobble-de-goop is now missing from
> all URLs and all pages.
> Go ahead and hover the mouse over any anchor, or any "link" image.
> You will see that the URL that was generated for each link is also a
> static-looking link.
> The site will continue to think you are a spider until one hour after
> your last access.
I suppose you are either working with the Referer header or you are
remembering the IP of the spider accessing robots.txt. Are you sure this
works reliably?
As far as I know, Google runs its spiders on a cluster of Linux boxes. A
site isn't scanned all at once; every page is scheduled, fetched and
indexed from a different machine with a different IP. Are you using a
different trick to keep the 'isSpider' information?
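The IP-remembering scheme speculated about above can be sketched roughly as follows. This is a hypothetical illustration in Python, not Seaside's actual implementation: an IP that fetches robots.txt is flagged as a spider, and the flag expires one hour after that IP's last access, matching the behavior described in the quoted post. The class and method names are invented for this sketch.

```python
import time

SPIDER_TTL = 3600  # seconds: treat an IP as a spider for one hour after its last access


class SpiderRegistry:
    """Hypothetical sketch of per-IP spider tracking (not Seaside code).

    An IP is marked as a spider when it fetches /robots.txt; the mark is
    refreshed on every subsequent access and expires after `ttl` seconds
    of inactivity.
    """

    def __init__(self, ttl=SPIDER_TTL, clock=time.time):
        self.ttl = ttl
        self.clock = clock          # injectable clock, handy for testing
        self.last_seen = {}         # ip -> timestamp of last spider access

    def note_robots_fetch(self, ip):
        # Fetching /robots.txt is the signal that this IP is a spider.
        self.last_seen[ip] = self.clock()

    def is_spider(self, ip):
        ts = self.last_seen.get(ip)
        if ts is None:
            return False
        if self.clock() - ts > self.ttl:
            # Window elapsed since the last access: forget this IP.
            del self.last_seen[ip]
            return False
        # Refresh the window, so expiry is one hour after the *last* access.
        self.last_seen[ip] = self.clock()
        return True
```

As the reply points out, this breaks down for a distributed crawler: if each page is fetched from a different IP, only the machine that happened to fetch robots.txt would ever be flagged.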
More information about the Squeak-dev mailing list