Documentation Suggestion

Robert Hawley

27 Dec 2007

4:25 a.m.

Documentation Suggestion.

Can we have direct links from within the Squeak image, from each class, method, project, category, package, etc., into a user-editable documentation website? It would effectively be Squeak's own encyclopedia, similar to Wikipedia. Structured documentation could be available with great immediacy, be developed incrementally, and be subject to continual review and rewrite by the whole community.

If there is existing discussion on this idea then it would be useful to have pointers to it. If it has not been much discussed before then I would suggest that this is a possibility worth exploring.

Yours

Bob

Damien Pollet

27 Dec

4:03 p.m.

I have been thinking of doing that, at least to get Google to index Smalltalk code, so that people can see what it looks like without loading an image first. As far as structured documentation is concerned, you already have that from within Smalltalk wherever there are class or method comments.

There is already code in Pier to export comments as LaTeX or HTML; I think that's what Lukas used to generate part of his thesis. SqueakSource has a code browser, but it's not bookmarkable and I don't think it's indexed by search engines.

On 27/12/2007, Robert Hawley rhawley@plymouth.ac.uk wrote:

...

Documentation Suggestion.

Can we have direct links from within the Squeak image, from each class, method, project, category, package, etc., into a user-editable documentation website? It would effectively be Squeak's own encyclopedia, similar to Wikipedia. Structured documentation could be available with great immediacy, be developed incrementally, and be subject to continual review and rewrite by the whole community.

If there is existing discussion on this idea then it would be useful to have pointers to it. If it has not been much discussed before then I would suggest that this is a possibility worth exploring.

Yours

Bob

-- Damien Pollet type less, do more [ | ] http://typo.cdlm.fasmz.org

Lukas Renggli

6:30 p.m.

...

SqueakSource has a code browser but it's not bookmarkable and I don't think it's indexed by search engines.

Philippe made some extensions to SqueakSource so that it can be indexed by Google. We don't know why it doesn't work properly; probably the search engine thinks it is fraud by the big duplication with all the different versions of the same code.

Lukas

-- Lukas Renggli http://www.lukas-renggli.ch

Klaus D. Witzel

6:57 p.m.

New subject: SqueakSource and search engines [was: Documentation Suggestion]

On Thu, 27 Dec 2007 18:30:12 +0100, Lukas Renggli wrote:

...

...
SqueakSource has a code browser but it's not bookmarkable and I don't think it's indexed by search engines.

Philippe made some extensions to SqueakSource so that it can be indexed by Google. We don't know why it doesn't work properly; probably the search engine thinks it is fraud

LOL :)

...

by the big duplication with all the different versions of the same code.

I'm relatively good at search engine optimization (no, not the buzzwords business but the server technical side) because of customer demand (sites with sometimes > 10,000 pages). Could you/Philippe post some example URLs of content which ought to be indexed? I can give it a try and report what I find.

/Klaus

...

Lukas

Lukas Renggli

8:13 p.m.

New subject: SqueakSource and search engines [was: Documentation Suggestion]

...

...
by the big duplication with all the different versions of the same code.

I'm relatively good at search engine optimization (no, not the buzzwords business but the server technical side) because of customer demand (sites with sometimes > 10,000 pages). Could you/Philippe post some example URLs of content which ought to be indexed? I can give it a try and report what I find.

http://www.squeaksource.com/robots.txt
http://www.squeaksource.com/sitemap.xml.gz

Note that the directory listing produces a slightly different result when being visited by a GoogleBot.
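To check how a crawler would read a robots.txt file like the one linked above, the rules can be fed to Python's standard `urllib.robotparser`. This is only a sketch: the sample rules and the page paths below are hypothetical and are not the actual contents of squeaksource.com.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used only for illustration.
# The real file served by squeaksource.com is not reproduced here.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if `user_agent` may fetch `url` under the given rules."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for path in ("/Regex.html", "/private/notes.html"):  # hypothetical paths
        url = "http://www.squeaksource.com" + path
        print(path, is_allowed(SAMPLE_ROBOTS_TXT, "Googlebot", url))
```

Running the same check against the real file would show whether the pages listed in the sitemap are reachable for a given crawler at all.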

Lukas

-- Lukas Renggli http://www.lukas-renggli.ch

Klaus D. Witzel

9:37 p.m.

New subject: SqueakSource and search engines [was: Documentation Suggestion]

On Thu, 27 Dec 2007 20:13:35 +0100, Lukas Renggli wrote:

...

...
...
by the big duplication with all the different versions of the same code.

I'm relatively good at search engine optimization (no, not the buzzwords business but the server technical side) because of customer demand (sites with sometimes > 10,000 pages). Could you/Philippe post some example URLs of content which ought to be indexed? I can give it a try and report what I find.

http://www.squeaksource.com/robots.txt
http://www.squeaksource.com/sitemap.xml.gz

These indeed look reasonable, except for the expiration date in the response headers of /robots.txt and the .mcz files (how would anybody reset that date? It seems to be impossible, and perhaps *this* looks like fraud to them. I personally never go past 12 months, just to be sure that some day it's me who's back in control).

And then there is the absence of a Last-Modified field in the response headers. That shouldn't be hard to add, so that crawlers don't have to work on assumptions and the webmaster has a bit more control over content negotiation. *This* is the hot spot (and not an expiration header past funeral date) when you don't want them to re-index now but may later have file format, content, organizational or conceptual changes which we are unable to imagine now.

Also, the project pages expire a minute or so after the time in the Date header; why would anybody follow their links?

Well then, I tried some of the project pages linked from sitemap.xml, but Google is by no means interested: "We're sorry, but there isn't enough text on this webpage; at least a few paragraphs are necessary to provide results. You can try entering a different URL, or check the box labeled 'Include other pages on my site linked from this URL'."
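The header points above (send a Last-Modified field, keep Expires within roughly 12 months) could be sketched as follows. This is Python for illustration only, not SqueakSource code; the function name and the 365-day cap are assumptions taken from the "never go past 12 months" remark.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# Assumption: cap expiry at about 12 months, as suggested in the message.
MAX_LIFETIME = timedelta(days=365)


def cache_headers(last_modified, now, lifetime):
    """Build Date, Last-Modified and Expires headers (hypothetical helper).

    Both datetimes must be timezone-aware UTC values, because
    format_datetime(..., usegmt=True) rejects anything else.
    """
    lifetime = min(lifetime, MAX_LIFETIME)  # never promise more than the cap
    return {
        "Date": format_datetime(now, usegmt=True),
        "Last-Modified": format_datetime(last_modified, usegmt=True),
        "Expires": format_datetime(now + lifetime, usegmt=True),
    }


if __name__ == "__main__":
    now = datetime(2007, 12, 27, 20, 13, 35, tzinfo=timezone.utc)
    changed = datetime(2007, 12, 1, 9, 0, 0, tzinfo=timezone.utc)
    # A requested ten-year lifetime is reduced to the 365-day cap.
    for name, value in cache_headers(changed, now, timedelta(days=3650)).items():
        print(f"{name}: {value}")
```

With a Last-Modified value present, a crawler can revalidate a page with a conditional request instead of guessing whether it changed.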

There ya go, another incarnation of the Squeak+Documentation problem (seemingly many (most?) authors don't write something up on their SqueakSource entries which then can be put onto the crawler's project pages).

This *can* be the reason (but perhaps also the content type of the .mcz files).

How about putting at least "Squeak, SqueakSource, <project name>, <tags>" into the HTML keywords meta data on project pages?
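A minimal sketch of that keywords suggestion, written in Python purely for illustration (SqueakSource itself is a Squeak application, and the function name here is hypothetical):

```python
from html import escape


def keywords_meta(project_name, tags):
    """Return a keywords <meta> element for a project page (hypothetical helper)."""
    keywords = ["Squeak", "SqueakSource", project_name, *tags]
    # Drop empty entries so a project without tags still yields clean output.
    content = ", ".join(k.strip() for k in keywords if k.strip())
    # Escape quotes and angle brackets so project names cannot break the markup.
    return '<meta name="keywords" content="{}">'.format(escape(content, quote=True))


if __name__ == "__main__":
    print(keywords_meta("Regex", ["regular expressions", "parsing"]))
```

The tags in the usage line are made up; in practice they would come from whatever the project author entered on the project page.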

...

Note that the directory listing produces a slightly different result when being visited by a GoogleBot.

What's in that? I can make more mistakes when attempting to find out than you can imagine. Could you post an example generated by the software from the Regex project (which Google doesn't like to index)? Also, are there differences in the response headers?
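One way to look for such differences is to request the same URL with two User-Agent strings and compare the response headers. The sketch below uses only the Python standard library; the helper names and the user-agent strings are assumptions, and whether the server keys its behaviour on the User-Agent header at all is not confirmed in this thread.

```python
from urllib.request import Request, urlopen


def fetch_headers(url, user_agent):
    """Fetch `url` with the given User-Agent and return its response headers."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return dict(response.headers.items())


def diff_headers(first, second):
    """Return {header: (first_value, second_value)} for headers that differ.

    Header names are compared case-insensitively; a header missing on one
    side shows up with None for that side.
    """
    a = {name.lower(): value for name, value in first.items()}
    b = {name.lower(): value for name, value in second.items()}
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(a.keys() | b.keys())
        if a.get(name) != b.get(name)
    }


if __name__ == "__main__":
    url = "http://www.squeaksource.com/robots.txt"
    browser = fetch_headers(url, "Mozilla/5.0")    # assumed browser-like agent
    crawler = fetch_headers(url, "Googlebot/2.1")  # assumed crawler-like agent
    for name, (left, right) in diff_headers(browser, crawler).items():
        print(f"{name}: {left!r} vs {right!r}")
```

Headers such as Date will naturally differ between two requests, so only the remaining differences are of interest.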

/Klaus

...

Lukas

Damien Pollet

7:24 p.m.

Yeah, in fact I'm not sure Google indexes all versions of repositories with web frontends. For people, I guess it would already be useful if the latest version of each package was indexed.

On 27/12/2007, Lukas Renggli renggli@gmail.com wrote:

...

...
SqueakSource has a code browser but it's not bookmarkable and I don't think it's indexed by search engines.

Philippe made some extensions to SqueakSource so that it can be indexed by Google. We don't know why it doesn't work properly; probably the search engine thinks it is fraud by the big duplication with all the different versions of the same code.

Lukas

-- Lukas Renggli http://www.lukas-renggli.ch

-- Damien Pollet type less, do more [ | ] http://typo.cdlm.fasmz.org

Klaus D. Witzel

8:03 p.m.

On Thu, 27 Dec 2007 19:24:17 +0100, Damien Pollet wrote:

...

Yeah, in fact I'm not sure Google indexes all versions of repositories with web frontends. For people, I guess it would already be useful if the latest version of each package was indexed.

I think that *all* the project pages are indexed (IIRC that was what Philippe made work), see

- http://www.google.com/search?q=SqueakSource+project+page+%22.mcz%22+site%3As...

But the .mcz files listed on those pages don't turn up in search results (or I have just not found a way to get results with .mcz or .zip files listed using Google with site:squeaksource.com).

/Klaus

...

On 27/12/2007, Lukas Renggli wrote:

...
...
SqueakSource has a code browser but it's not bookmarkable and I don't think it's indexed by search engines.

Philippe made some extensions to SqueakSource so that it can be indexed by Google. We don't know why it doesn't work properly; probably the search engine thinks it is fraud by the big duplication with all the different versions of the same code.

Lukas

-- Lukas Renggli http://www.lukas-renggli.ch

tim Rowledge

29 Dec

3:50 a.m.

...

On 27/12/2007, Robert Hawley rhawley@plymouth.ac.uk wrote:

...
Documentation Suggestion.

Can we have direct links from within the Squeak image from each class, method, project, category, package, (etc.,) into a user-editable documentation web-site?

The link part is easy - cmd-6 (on Mac, alt-6 on win32 and probably *nix) in a text editor opens a menu with several options including 'be a web URL link' so that you can include a link within comments. It also has linking to other methods, class comments etc.

...

...
It would effectively be Squeak's own encyclopedia; similar to wikipedia. Structured documentation could be available with great immediacy, be developed incrementally and be subject to continual review and rewrite by the whole community.

Duane Maxwell tried to encourage just such an idea a few years ago and even set up a domain for it. I don't recall many people making the effort to provide any content. We don't even need a new site though; what is wrong with providing the doc on the swiki and linking to it?

tim

-- tim Rowledge; tim@rowledge.org; http://www.rowledge.org/tim
Strange OpCodes: CDC: Clear Disks and Crash

squeak-dev@lists.squeakfoundation.org