On Tue, 28 Apr 1998, Paul Fernhout wrote:
Rsynth is public domain I believe. Do you have any plans to put your work with it under the Squeak license (or something similar)? If so, I'd be interested in helping out as a tester or with comments related to getting it to work efficiently or better under Squeak.
It was just an experiment... I'd like to have the time to do it much better. Anyway, everything I do is always public domain, so if you or others want it I can post it. I will send it to you and to a Squeak ftp site next Tuesday (I don't have it here, sorry), and in the meantime I'll try to improve it a little.
How slow is it? What system (processor/speed/memory) are you running it on? Is it as understandable as the original rsynth itself? Are there specific Squeak speed-related problems (like clicking or dropouts or stuttering) that could be worked around?
Roughly, it needed about 10 to 20 seconds to say 'how are you' at a sampling rate of 8 kHz on a 486 DX2 66 with 12 MB of RAM running Linux. It sounds like rsynth, except for intonation (stress), which I have not yet implemented. It can probably be sped up by adding some primitives.
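To put those timings in perspective, here is a rough back-of-the-envelope calculation in Python. The utterance length (about one second for 'how are you') is an assumption, not a measurement, and the function name is just illustrative; the 8 kHz rate, the 66 MHz clock and the 10 to 20 second range are the figures given above.

```python
SAMPLE_RATE_HZ = 8000        # sampling rate quoted above
CPU_CLOCK_HZ = 66_000_000    # a 486 DX2 66 runs at 66 MHz
ASSUMED_UTTERANCE_S = 1.0    # ASSUMPTION: 'how are you' lasts about one second

def cost_per_sample(synthesis_seconds, utterance_seconds=ASSUMED_UTTERANCE_S):
    """Return (samples produced, CPU cycles per sample, slowdown vs. real time)."""
    n_samples = int(SAMPLE_RATE_HZ * utterance_seconds)
    cycles_per_sample = synthesis_seconds * CPU_CLOCK_HZ / n_samples
    realtime_factor = synthesis_seconds / utterance_seconds
    return n_samples, cycles_per_sample, realtime_factor

# Evaluate both ends of the reported 10 to 20 second range.
for seconds in (10, 20):
    n, cycles, factor = cost_per_sample(seconds)
    print(f"{seconds} s: {n} samples, {cycles:.0f} cycles/sample, {factor:.0f}x real time")
```

Under that assumption the synthesizer spends on the order of a hundred thousand clock cycles per output sample, which is why moving the inner loops into primitives looks promising.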
Is it possible to use the Smalltalk->C translator to take your work and build it into the VM (like some current sound primitives)? It would have to be written with the translator restrictions in mind, or the translator would have to be expanded to support it.
I tried to do it, but the translator does not support assigning to instance variables that are not integers (floats, for instance), and it does not support returns other than ^self.
Festival is only free for non-commercial use I believe. If you used that code directly, your work could not be part of the general Squeak distribution under the Squeak license (although I guess it could be distributed as an add on).
I didn't mean using Festival code directly, but rather using a diphone concatenation approach, the way Festival and other synthesizers (most of them, I think) do. The disadvantage of that method is that it needs a large database (between 4 and 12 MB) of sampled spoken units, coded so that the duration, pitch and amplitude of the voice are easy to change; these units are concatenated to form words and phrases. The advantage of this approach is that the resulting voice sounds very natural, to the point that the synthesizer can even be made to sing, as in the Lyricos project.
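To make the idea concrete, here is a minimal Python sketch of concatenating stored units, assuming a tiny made-up database. It is not how Festival does it: the unit names are hypothetical, the "recordings" are plain sine tones standing in for real speech, and duration is changed by simple resampling, which also shifts the pitch. A real system would store the units in a coded form that lets duration and pitch be changed independently.

```python
import math

SAMPLE_RATE = 8000  # Hz

def make_unit(freq_hz, n_samples):
    """Stand-in for a recorded diphone: a sine tone (hypothetical data)."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]

def stretch(unit, new_len):
    """Change duration by linear-interpolation resampling (also shifts pitch)."""
    if new_len <= 1 or len(unit) < 2:
        return unit[:new_len]
    step = (len(unit) - 1) / (new_len - 1)
    out = []
    for i in range(new_len):
        pos = i * step
        lo = int(pos)
        hi = min(lo + 1, len(unit) - 1)
        frac = pos - lo
        out.append(unit[lo] * (1 - frac) + unit[hi] * frac)
    return out

def scale(unit, gain):
    """Change amplitude by a constant gain."""
    return [s * gain for s in unit]

def concatenate(units, overlap=16):
    """Join units, crossfading `overlap` samples at each boundary."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        n = min(overlap, len(out), len(unit))
        start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # fade-in weight for the incoming unit
            out[start + i] = out[start + i] * (1 - w) + unit[i] * w
        out.extend(unit[n:])
    return out

# Hypothetical three-entry database; a real one would hold a few thousand units.
DIPHONES = {
    "h-aw": make_unit(220, 400),
    "aw-a": make_unit(200, 400),
    "a-r": make_unit(180, 400),
}

def synthesize(names, durations, gains, overlap=16):
    """Look up each unit, adjust its duration and amplitude, then join them."""
    units = [scale(stretch(DIPHONES[name], dur), gain)
             for name, dur, gain in zip(names, durations, gains)]
    return concatenate(units, overlap)

samples = synthesize(["h-aw", "aw-a", "a-r"], [400, 320, 400], [1.0, 0.8, 0.6])
print(len(samples), "samples,", len(samples) / SAMPLE_RATE, "seconds")
```

The crossfade at each join is there only to avoid clicks at the boundaries; the size of the database comes from having to store one unit for every sound-to-sound transition in the language.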
regards, Luciano.-