Cees de Groot wrote:

> On Tue, 2003-08-26 at 10:07, Avi Bryant wrote:
>> ['that will not work']
>
> Needless to say, Avi was right. We chatted over this, and the proposed
> solution is roughly the following:
>
> 1. Every time you send GWSession>>addToPath:, you effectively indicate
>    that you are about to set up some semi-static page. This might not
>    always be true, but in that case you simply don't need the code I'm
>    about to write :-).
> 2. So when you do this, a flag is set; the next time an HTMLResponse
>    is seen, its HTMLDocument object is added to a cache keyed by the
>    path (plus the site prefix, etcetera).
> 3. When a robot hits the site (robot detection can still be done
>    through the /robots.txt + User-Agent recognition trick), it gets an
>    index page containing a static link to every page in the cache.
> 4. The robot follows the index page, happily munches all the pages,
>    and puts your site at rank #1. All the pages are rendered without
>    IMG tags or local HREF links, effectively presenting a flattened
>    version of your site to the bot.
>
> I'm working on this (although I don't think I'll finish before
> tonight). The tentative package name is Janus, after the Roman god
> with two faces. I'll probably enhance it so you can have a dictionary,
> keyed by page name, pointing to meta tags, so you can do quality SEO
> with Seaside.

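For concreteness, I read the proposal above as something like the
following sketch. GWSession>>addToPath: and the HTMLResponse/HTMLDocument
objects are Cees's names; the PageCache dictionary, the cachePending
flag, the hook method, and the renderer call are my guesses, so treat
this as pseudocode rather than as Janus itself:

addToPath: aString
	"Step 1: sending this marks the session so that the next
	response gets cached. Assumes we subclass GWSession."
	cachePending := true.
	^ super addToPath: aString

noteResponse: anHtmlResponse
	"Step 2 (a hypothetical hook): wherever the session sees the
	HTMLResponse, stash its document in a cache keyed by the path."
	cachePending ifTrue:
		[PageCache at: self fullPath put: anHtmlResponse document.
		cachePending := false]

renderRobotIndexOn: html
	"Step 3: a detected robot gets an index page with a static link
	to every cached page (the anchor call is schematic)."
	PageCache keysDo:
		[:path | html anchorWithUrl: path text: path]
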
Here's what I tried (and what the result was):

1. For "robot" detection, I merely checked the logs to find the IP range
the Google bots were coming from, then tested for that range (your
robots.txt + User-Agent detection scheme is obviously better).

2. I created an abstract superclass to subclass my components from. It
has the following three methods (note that I also added a 'komRequest'
instance variable to the session class):

isGoogleIP
	"Crude robot detection: in my logs, the Google bots were all
	coming from the 64.68.80.* range."
	^ self session komRequest ipString beginsWith: '64.68.80'

renderContentOn: html
	"Dispatch to the robot-specific or the normal rendering method."
	self isGoogleIP
		ifTrue: [self renderGoogleContentOn: html]
		ifFalse: [self renderNormalContentOn: html]

renderGoogleContentOn: html
	"By default a robot sees the normal content; subclasses override
	this with a flattened version."
	^ self renderNormalContentOn: html

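In hindsight, matching on the User-Agent header would have been less
brittle than a hard-coded IP range. Something like this sketch, where
#userAgent stands in for however your komRequest actually exposes that
header:

isRobotRequest
	"Match common crawler User-Agent substrings. #userAgent is a
	hypothetical accessor on the Comanche request; adapt it to
	whatever your version provides."
	| agent |
	agent := self session komRequest userAgent asLowercase.
	^ (#('googlebot' 'slurp' 'crawler' 'spider')
		detect: [:bot | agent includesSubString: bot]
		ifNone: [nil]) notNil
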
3. I then placed the component code that the Google bot is supposed to
see in #renderGoogleContentOn:, and the normal content in
#renderNormalContentOn:. Common content shared between those two methods
(which is actually most of the render code) is refactored into separate
methods, as in the sketch below.

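The shape of that refactoring, with hypothetical component and helper
names (the anchor calls are schematic; use whatever your renderer
provides):

renderNormalContentOn: html
	"Interactive version: the shared body plus a normal Seaside
	action anchor."
	self renderProductBodyOn: html.
	html anchorWithAction: [self addToCart] text: 'Add to cart'

renderGoogleContentOn: html
	"Robot version: the same shared body, but with a plain static
	link instead of a session-bound action."
	self renderProductBodyOn: html.
	html anchorWithUrl: '/store/cart' text: 'Add to cart'
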
4. For #renderGoogleContentOn:, I turned links into explicit URLs using
the scheme I discussed a couple of months ago (in the Seaside thread
titled "[Seaside] anchors behaving like buttons?"); the sketch below
shows the idea.

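What "explicit URL" meant in practice, roughly (the path prefix and the
#sku and #title accessors are made up for illustration):

renderGoogleLinkTo: aProduct on: html
	"Emit a hand-assembled, bookmarkable href rather than letting
	Seaside register a callback for the anchor."
	html anchorWithUrl: '/store/product/', aProduct sku
		text: aProduct title
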
*******************************

OK, the above was mainly just a clumsy experiment. Did it work? Not
really; I can't tell that it made any difference to the Google bot. Part
of the reason may be that even though I turned links into explicit URLs,
Seaside just changed them right back into the usual funny Seaside URLs,
and I don't think the Google bot liked that.

Plus, it left me with two sets of rendering code, much of which now
needed to be kept in sync.

In other words, I don't have a solution. I only have another data point
of experience.

Nevin

--
Nevin Pratt
Bountiful Baby
http://www.bountifulbaby.com
(801) 992-3137