[Seaside] Static sites and spider handling
Cees de Groot
cg at cdegroot.com
Tue Aug 26 10:46:12 CEST 2003
Here's my idea on how to handle spiders for static sites:
1. assumption: a site like http://www.tric.nl/ which has largely static
pages that are directly reachable;
2. when a client fetches /robots.txt, the user agent is added to the UA
database and a special session id based on the user agent string is
created;
3. whenever a session needs to be created and the user agent already has a
value in the UA database, the same session id is reused. This ensures
that URLs appear static to the spider;
4. when a link is followed with a 'robot session id' but the UA doesn't
match, a new session is created - this represents a real user coming in.
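The scheme above could be sketched roughly as follows (a Python sketch of the logic only, not Seaside code; all class and method names here are illustrative):

```python
import hashlib
import secrets


class SpiderSessionRegistry:
    """Illustrative sketch: map known spider user agents to stable session ids."""

    def __init__(self):
        # Stands in for the UA database: user-agent string -> fixed session id.
        self.known_agents = {}

    def register_spider(self, user_agent):
        # Called when a client fetches /robots.txt: remember the UA and
        # derive a deterministic session id from the user-agent string,
        # so the same spider always sees the same URLs.
        session_id = hashlib.md5(user_agent.encode()).hexdigest()[:12]
        self.known_agents[user_agent] = session_id
        return session_id

    def session_for(self, user_agent, presented_id=None):
        # A known spider always gets its fixed id, so URLs look static to it.
        if user_agent in self.known_agents:
            return self.known_agents[user_agent]
        # A 'robot session id' presented by an unknown UA means a real user
        # followed a spider-indexed link: start a fresh session for them.
        return self.new_session()

    def new_session(self):
        return secrets.token_hex(6)
```

A spider that fetched /robots.txt gets the same session id on every visit, while a browser presenting that robot id gets a fresh one.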
I want to make robots.txt accessible, maybe with a simple Comanche module
(or is there a way in Seaside to return data without first redirecting to
a session?), so that all hits on the file can be handled from the
application server image.
Does this sound like a reasonable idea?