Chapter 8 - Speech synthesis demo

Speech sounds can be minimally specified by a small set of parameters. Each parameter can be described either in terms of how it is produced (its physiological characteristics) or in terms of its physical (acoustic) characteristics.

Some of these parameters are isolated in the synthesized speech tokens in this table. For example, token 1 (linked in the column labeled "1") consists of a monotone voice with only a first formant resonance. The spectrogram of this utterance shows only one formant. Token 4 combines the first three formants, token 5 consists only of stop release bursts and fricative noise, and in token 7 the voice has normal fundamental frequency variation.
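
As a rough illustration of what a token like number 1 contains, the following Python sketch passes a monotone pulse train through a single resonator. It is not the UCLA synthesizer or the Holmes parameter set. The function names, the 100 Hz fundamental, the 500 Hz formant, the 60 Hz bandwidth, and the 16 kHz sample rate are all assumed values chosen for illustration. The resonator is a standard two-pole digital filter of the kind commonly used in formant synthesis.

```python
import math

def impulse_train(f0_hz, duration_s, sample_rate=16000):
    """Crude voicing source: one unit impulse per period of vocal fold vibration."""
    period = round(sample_rate / f0_hz)      # samples per period (constant = monotone)
    n = int(duration_s * sample_rate)
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]

def resonator(signal, freq_hz, bandwidth_hz, sample_rate=16000):
    """Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2]."""
    t = 1.0 / sample_rate
    c = -math.exp(-2.0 * math.pi * bandwidth_hz * t)
    b = 2.0 * math.exp(-math.pi * bandwidth_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c                          # normalizes the gain to 1 at 0 Hz
    y1 = y2 = 0.0                            # previous two output samples
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out

# Assumed illustrative values, not taken from the original synthesis.
source = impulse_train(f0_hz=100.0, duration_s=0.5)
one_formant = resonator(source, freq_hz=500.0, bandwidth_hz=60.0)
```

Summing the outputs of several such resonators, each scaled by its own amplitude, would be one way to approximate a token with more than one formant.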

This speech was synthesized in 1971 by Peter Ladefoged on a synthesizer at UCLA. The values of the parameters were a modified version of a set provided by John Holmes.

	PHYSIOLOGICAL	ACOUSTIC
1	Rate of vibration of the vocal folds	Fundamental frequency
2	First resonance of the vocal tract	Formant 1 frequency
3		Formant 1 amplitude
4	Second resonance of the vocal tract	Formant 2 frequency
5		Formant 2 amplitude
6	Third resonance of the vocal tract	Formant 3 frequency
7		Formant 3 amplitude
8	Fricative and stop bursts	Center frequency of noise
9		Amplitude of noise
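
One way to picture the nine parameters is as a single frame of control values that a synthesizer reads at each time step. The Python sketch below is only an illustration: the class name, field names, units, and example values are assumptions and do not come from the original synthesizer. A parameter left as None stands for one that has been switched off, as in the tokens that isolate part of the signal.

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class SynthFrame:
    """One time step of control values; None means the parameter is switched off."""
    f0_hz: Optional[float] = None            # 1 fundamental frequency
    f1_hz: Optional[float] = None            # 2 formant 1 frequency
    f1_amp: Optional[float] = None           # 3 formant 1 amplitude
    f2_hz: Optional[float] = None            # 4 formant 2 frequency
    f2_amp: Optional[float] = None           # 5 formant 2 amplitude
    f3_hz: Optional[float] = None            # 6 formant 3 frequency
    f3_amp: Optional[float] = None           # 7 formant 3 amplitude
    noise_center_hz: Optional[float] = None  # 8 center frequency of noise
    noise_amp: Optional[float] = None        # 9 amplitude of noise

def active_parameters(frame):
    """Return the names of the parameters that are switched on, in table order."""
    return [f.name for f in fields(frame) if getattr(frame, f.name) is not None]

# Hypothetical frame for a voice with only the first formant (values are made up).
frame = SynthFrame(f0_hz=100.0, f1_hz=500.0, f1_amp=1.0)
print(active_parameters(frame))  # ['f0_hz', 'f1_hz', 'f1_amp']
```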