[Seaside] Static sites and spider handling

Colin Putney cputney at wiresong.ca
Tue Aug 26 10:15:18 CEST 2003


Cees and Nevin,

One thing I learned while working at Whistler.com is that search engine 
robots are incredibly paranoid and cynical. It would seem that the 
primary design goal of a search engine is to detect and defeat 
spamdexing in order to present an accurate view of the web to its 
users. As a result, the bots absolutely *hate* two-faced web sites, and 
will penalize them in the rankings accordingly.

IMO the best search engine strategy is to keep as much content as 
possible static, and only use dynamic pages when necessary. Obviously 
that's not an option here,
but it would make a good rule of thumb when designing the robot special 
case: to a robot, the site should appear as much as possible like a 
collection of html files served up by name.

Another data point, and maybe you guys already know this, is that links 
from other sites trump just about every other criterion for determining 
a page's rank. So ideally, you want to ensure that all links to 
your site from elsewhere point into the robot index. From there you 
want to get them into a session in a way that doesn't annoy the robots.
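As a rough illustration of that special case (Python here just for sketching; the names `is_robot` and `handle_request` are hypothetical, not anything from Seaside), the idea is: robots asking for a stable, file-like URL get plain static content, while humans hitting the same URL get redirected into a fresh session, so inbound links keep working for both:

```python
# Hypothetical sketch of robot special-casing, not a real Seaside API.
# User-Agent sniffing is crude and crawler signatures change; treat this
# as a heuristic, not a complete solution.

ROBOT_SIGNATURES = ("googlebot", "slurp", "crawler", "spider", "bot")

def is_robot(user_agent: str) -> bool:
    """Guess whether the request comes from a search engine robot."""
    ua = user_agent.lower()
    return any(sig in ua for sig in ROBOT_SIGNATURES)

def handle_request(path: str, user_agent: str) -> tuple[int, str]:
    """Return an illustrative (status, location-or-body) pair.

    Robots see what looks like an HTML file served up by name;
    humans get a 302 into a session keyed off the same path.
    """
    if is_robot(user_agent):
        return (200, f"static{path}.html")
    # The session URL scheme here is invented for illustration.
    return (302, f"/seaside/app?view={path.lstrip('/')}")

print(handle_request("/products", "Googlebot/2.1"))
print(handle_request("/products", "Mozilla/5.0 (Windows)"))
```

The point of the sketch is just that both audiences enter through the same externally-linked URL, so the static pages accumulate the inbound-link rank while humans still end up in a session.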

I figure this will be a difficult project, particularly since the 
search engines are in an arms race with the spamdexers, and the rules 
change constantly. Luckily the optimization community obsesses about 
this stuff, and so the information you need to do it effectively is out 
there, if you can find it.

Good luck,

Colin
