DVD's, Project Gutenberg, Full Text Search

Cees de Groot cg at home.cdegroot.com
Tue Jan 29 20:39:04 UTC 2002


Scott A Crosby <crosby at qwes.math.cmu.edu> said:
>Actually, that's such an infinitesimal fraction of the world's literature,
>it could almost be lost without comment. (CTEA *cough* *cough* Remember,
>it doesn't include more than .01% of anything written since 1922. Which
>means it's missing 99% of the world's literature.)
>
Well, I was not talking about the number of characters, but the number of
important works present in Gutenberg (and of course, it's an ongoing effort;
they have only done some 4500 works so far). Shakespeare, the Bible, the old
US documents, et cetera. It's not just a random 4500/1000000th sample of
whatever qualifies as Literature.

(The major thing I have against PG is that it's heavily biased towards English
literature, but that's probably a result of who is volunteering.)

>But, neat idea. Full text indexing of the stories would not be possible,
>but doing the title/author/abstract could be done.
>
I was thinking about just that - full-text indexing the stories would probably
become possible in a couple of years (long live Moore ;-)).

-- 
Cees de Groot               http://www.cdegroot.com     <cg at cdegroot.com>
GnuPG 1024D/E0989E8B 0016 F679 F38D 5946 4ECD  1986 F303 937F E098 9E8B



More information about the Squeak-dev mailing list