Simple Parser for Natural Language?

Luciano Notarfrancesco luciano2 at mail.ru
Mon Jul 26 20:13:25 UTC 1999


> Bob wrote:
>
> At 01:09 AM 7/17/99 +0400, Luciano Notarfrancesco wrote:
> >I was planning to implement NGrammars for doing syntactic analysis in order to do better prosody generation for a text-to-speech system. I believe the N-grams stuff could be useful for your project.
> 
> I've seen statistical techniques used for NL analysis, but never N-grams, because they're not really suited for understanding/analysis.  The typical application in a speech recognition program is to filter the recognizer's hypotheses.  They can be very useful in that application.  Tri-gram filtering usually leaves in a lot of amusingly non-English schmutz.  But four-gram filtering produces utterances which are all valid English utterances.

Bob, N-grams are actually used for ambiguity resolution in NL processing. These techniques have been used for part-of-speech tagging, for estimating lexical probabilities, and for building probabilistically based parsing algorithms. I have read that the Viterbi algorithm, using bigram or trigram probability models, can attain accuracy rates of over 95 percent for part-of-speech tagging.
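
As a minimal sketch of the bigram Viterbi tagging mentioned above (not taken from any particular system; the toy corpus, the tag names, the function names, and the add-one smoothing are all assumptions made for illustration), something like this in Python:

```python
import math
from collections import defaultdict

# Toy hand-tagged corpus, invented purely for illustration.
CORPUS = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("dogs", "NOUN"), ("bark", "VERB")],
    [("the", "DET"), ("bark", "NOUN"), ("falls", "VERB")],
]

def train(corpus):
    """Count tag bigrams (transitions) and word-given-tag pairs (emissions)."""
    trans = defaultdict(lambda: defaultdict(int))  # trans[prev_tag][tag]
    emit = defaultdict(lambda: defaultdict(int))   # emit[tag][word]
    for sentence in corpus:
        prev = "<s>"  # sentence-start pseudo-tag
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit

def log_prob(counts, key, n_outcomes):
    """Add-one smoothed log probability (an assumed, very crude smoothing)."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0) + 1) / (total + n_outcomes))

def viterbi(words, trans, emit):
    """Return the most probable tag sequence under the bigram model."""
    tags = sorted(emit)
    vocab = {w for t in tags for w in emit[t]}
    n_words = len(vocab) + 1  # one extra slot for unknown words
    n_tags = len(tags)

    # best[i][t] = (log prob of best path ending in tag t at word i, previous tag)
    best = [{}]
    for t in tags:
        score = log_prob(trans["<s>"], t, n_tags) + log_prob(emit[t], words[0], n_words)
        best[0][t] = (score, None)

    for i in range(1, len(words)):
        best.append({})
        for t in tags:
            e = log_prob(emit[t], words[i], n_words)
            score, prev = max(
                (best[i - 1][p][0] + log_prob(trans[p], t, n_tags) + e, p)
                for p in tags
            )
            best[i][t] = (score, prev)

    # Follow the back-pointers from the best final tag.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for i in range(len(words) - 1, 0, -1):
        tag = best[i][tag][1]
        path.append(tag)
    return path[::-1]

trans, emit = train(CORPUS)
print(viterbi("the bark falls".split(), trans, emit))  # ['DET', 'NOUN', 'VERB']
```

Here "bark" is ambiguous between noun and verb in the toy corpus, and the tag bigram DET-NOUN is what resolves it. A realistic tagger would need a far larger corpus and better smoothing; the accuracy figure quoted above does not apply to this toy.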

Do you have experience with speech recognition? It would be great to do something in that area for Squeak, don't you think?

Luciano.-




