Difference between revisions of "Klatt Synthesizer Parameters"
Revision as of 12:23, 21 July 2015
This is a list of parameters that can be altered to create different types of synthetic utterances using the Klatt Synthesizer.
Constant Parameters
Sound file duration (du)
The duration of the utterance to be synthesized. This number will be rounded up to the nearest multiple of 'ui', the number of milliseconds in a parameter update time interval.
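The rounding rule can be sketched as follows. This is only an illustration: the function name `rounded_duration` and the example values are hypothetical, and it assumes 'du' is expressed in msec like 'ui'.

```python
def rounded_duration(du_ms: int, ui_ms: int) -> int:
    """Round du_ms up to the nearest multiple of ui_ms (assumed: both in msec)."""
    if ui_ms <= 0:
        raise ValueError("ui_ms must be positive")
    n_intervals = -(-du_ms // ui_ms)  # integer ceiling division
    return n_intervals * ui_ms


print(rounded_duration(503, 5))  # 505
print(rounded_duration(500, 5))  # 500
```

A duration that is already a multiple of the update interval is left unchanged.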
Update interval (ui)
The number of msec of waveform generated between successive updates of the parameter values.
Number of cascade formants (nf)
Specifies how many formants, counting from F1 up to a maximum of F8, are actually in the cascade vocal tract. The default value is 5, which is an appropriate number if the sampling rate is 10,000 samples/sec and the speaker has a vocal tract length of 17 cm.
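One way to see why five formants suit these defaults is the textbook uniform-tube approximation, in which a tube closed at one end resonates at odd multiples of c/(4L). The sketch below counts the resonances that fall below the Nyquist frequency (half the sampling rate). The speed of sound (35,000 cm/sec) and the function name are assumptions made for illustration, not values taken from the synthesizer.

```python
def tube_resonances_below_nyquist(sample_rate_hz, tract_length_cm,
                                  speed_of_sound_cm_s=35000.0):
    """Resonances of a uniform tube (closed at one end) below Nyquist.

    Assumption: resonances at (2n - 1) * c / (4 * L) for n = 1, 2, ...
    """
    nyquist_hz = sample_rate_hz / 2.0
    resonances = []
    n = 1
    while True:
        freq_hz = (2 * n - 1) * speed_of_sound_cm_s / (4.0 * tract_length_cm)
        if freq_hz >= nyquist_hz:
            break
        resonances.append(freq_hz)
        n += 1
    return resonances


formants = tube_resonances_below_nyquist(10000, 17)
print(len(formants))  # 5
```

Under these assumptions the resonances are spaced roughly 1 kHz apart, so a shorter vocal tract or a higher sampling rate changes how many formants fit below Nyquist.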
Source select (ss)
A switch that determines which of two voicing source waveforms is used for synthesis. The default value, 1, causes a low-pass filtered impulse train to be generated, while the value 2 causes a more natural waveform with a definite sharp closing time to be invoked.
Output select (os)
Determines which waveform is saved in the output file. If 'os' has the default value of zero, the normal final output of synthesis is saved. The other output options are:
1. Voicing periodic component alone
2. Aspiration alone
3. Frication alone
4. Glottal source (voicing, turbulence, and aspiration)
5. Glottal source sent to parallel vocal tract (AP) + radiation characteristic
6. Cascade vocal tract, output of nasal zero resonator
7. Cascade vocal tract, output of nasal pole resonator
8. Cascade vocal tract, output of fifth formant
9. Cascade vocal tract, output of fourth formant
10. Cascade vocal tract, output of third formant
11. Cascade vocal tract, output of second formant
12. Cascade vocal tract, output of first formant
13. Parallel vocal tract, output of sixth formant alone
14. Parallel vocal tract, output of fifth formant alone
15. Parallel vocal tract, output of fourth formant alone
16. Parallel vocal tract, output of third formant alone
17. Parallel vocal tract, output of second formant alone
18. Parallel vocal tract, output of first formant alone
19. Parallel vocal tract, output of nasal formant alone
20. Parallel vocal tract, output of bypass path alone
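For scripts that set 'os', the list above can be kept as a simple lookup table. The sketch below is a hypothetical helper, not part of the synthesizer; the names `OUTPUT_SELECT` and `describe_output` are made up for illustration.

```python
# Hypothetical lookup table mirroring the 'os' options listed above.
OUTPUT_SELECT = {
    0: "Normal final output of synthesis",
    1: "Voicing periodic component alone",
    2: "Aspiration alone",
    3: "Frication alone",
    4: "Glottal source (voicing, turbulence, and aspiration)",
    5: "Glottal source sent to parallel vocal tract (AP) + radiation characteristic",
    6: "Cascade vocal tract, output of nasal zero resonator",
    7: "Cascade vocal tract, output of nasal pole resonator",
    8: "Cascade vocal tract, output of fifth formant",
    9: "Cascade vocal tract, output of fourth formant",
    10: "Cascade vocal tract, output of third formant",
    11: "Cascade vocal tract, output of second formant",
    12: "Cascade vocal tract, output of first formant",
    13: "Parallel vocal tract, output of sixth formant alone",
    14: "Parallel vocal tract, output of fifth formant alone",
    15: "Parallel vocal tract, output of fourth formant alone",
    16: "Parallel vocal tract, output of third formant alone",
    17: "Parallel vocal tract, output of second formant alone",
    18: "Parallel vocal tract, output of first formant alone",
    19: "Parallel vocal tract, output of nasal formant alone",
    20: "Parallel vocal tract, output of bypass path alone",
}


def describe_output(os_value: int) -> str:
    """Return the description for an 'os' value, rejecting unknown values."""
    if os_value not in OUTPUT_SELECT:
        raise ValueError(f"'os' must be between 0 and 20, got {os_value}")
    return OUTPUT_SELECT[os_value]


print(describe_output(3))  # Frication alone
```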
Random Seed (rs)
Seed value given to the random number generator routine. Any number from 0 to 99999 can be specified.
Overall gain (g0)
An overall gain control, 'g0', is included to permit the user to adjust the output level without having to modify each source amplitude time function.
Delta of formant bandwidth (db) and delta of formant frequencies (dF)
These parameters are obscure and rarely used. They control the degree of flutter in the values of F1 and B1 as a function of the glottal state, which may improve the naturalness of the voice.
Variable Parameters
Voicing
Amplitude of voicing (av)
Amplitude in dB of the voicing source waveform sent through the cascade vocal tract.
Fundamental frequency of voicing (f0)
Rate at which the vocal folds are currently vibrating in Hz.
Amplitude of turbulence (at)
Amplitude in dB of turbulence noise generated at the glottis during the open phase of a glottal vibration.
Voice spectral tilt (tl)
The (additional) downward tilt of the spectrum of the voicing source, in dB, as realized by a soft one-pole low-pass filter. A value of zero has no effect on the source spectrum, while a value of 24 tilts the spectrum down gradually such that frequency components above about 3 kHz are attenuated by about 24 dB relative to a more normal source spectrum.
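To make the idea of a one-pole tilt filter concrete, the sketch below picks the coefficient of a generic one-pole low-pass filter, y[n] = (1 - a) x[n] + a y[n-1], so that its gain at 3 kHz equals minus the requested tilt. This is an assumed textbook filter form with a closed-form solution for `a`, not the synthesizer's actual implementation; the function names and the 10,000 samples/sec default are illustrative.

```python
import math


def one_pole_coefficient(tilt_db, ref_hz=3000.0, sample_rate_hz=10000.0):
    """Coefficient a of y[n] = (1 - a) * x[n] + a * y[n - 1] (assumed form)
    giving a gain of -tilt_db dB at ref_hz and 0 dB at DC."""
    if tilt_db <= 0:
        return 0.0  # a = 0 passes the input through unchanged
    g2 = 10.0 ** (-tilt_db / 10.0)  # target squared gain at ref_hz
    cos_w = math.cos(2.0 * math.pi * ref_hz / sample_rate_hz)
    # Solve (1 - a)^2 = g2 * (1 - 2 a cos_w + a^2) for the root with 0 < a < 1.
    b = (1.0 - g2 * cos_w) / (1.0 - g2)
    return b - math.sqrt(b * b - 1.0)


def gain_db(a, freq_hz, sample_rate_hz=10000.0):
    """Magnitude response of the one-pole filter at freq_hz, in dB."""
    w = 2.0 * math.pi * freq_hz / sample_rate_hz
    mag2 = (1.0 - a) ** 2 / (1.0 - 2.0 * a * math.cos(w) + a * a)
    return 10.0 * math.log10(mag2)


a = one_pole_coefficient(24)
print(round(gain_db(a, 3000.0), 3))  # -24.0
```

With this filter form, components above 3 kHz are attenuated somewhat more than the nominal value and lower frequencies somewhat less, which matches the gradual tilt described above only approximately.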
Voice skew (vs)
The number of 25-microsecond increments to be alternately added to and subtracted from successive fundamental period durations. This simulates the tendency for alternate periods to be more similar in duration than adjacent periods, which is one aspect of vocal fry.
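The effect of 'vs' on period durations can be sketched as below. The function name is hypothetical, and the choice to lengthen the first period and shorten the second is an assumption; the description above does not say which comes first.

```python
def skewed_periods_us(f0_hz, vs, n_periods):
    """Period durations in microseconds with alternating skew applied.

    Assumption: even-indexed periods are lengthened by vs * 25 microseconds
    and odd-indexed periods are shortened by the same amount.
    """
    base_us = 1e6 / f0_hz
    skew_us = vs * 25.0
    return [base_us + skew_us if i % 2 == 0 else base_us - skew_us
            for i in range(n_periods)]


print(skewed_periods_us(100, 4, 4))  # [10100.0, 9900.0, 10100.0, 9900.0]
```

Alternate periods come out identical while adjacent periods differ by 2 x 'vs' x 25 microseconds, and the average period (and hence the average F0) is unchanged.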
Open Quotient (oq)
A nominal indicator of the width of the glottal pulse when using the default impulse train glottal source, and the exact number of samples in the open period when using the natural voicing source ('ss'=2). The default value, 'oq'=50, corresponds to a 5 msec open portion of the fundamental period at the default sampling rate (10,000 samples/sec) and default F0 (100 Hz).
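The 5 msec figure can be checked with simple arithmetic. The sketch below computes the open duration under two readings of 'oq': as a sample count (as stated for 'ss'=2) and as a percentage of the fundamental period. The percentage reading is an assumption that happens to agree with the sample count at the default settings; both function names are hypothetical.

```python
def open_ms_from_samples(oq, sample_rate_hz=10000.0):
    """Open duration in msec if 'oq' is a number of samples."""
    return 1000.0 * oq / sample_rate_hz


def open_ms_from_percent(oq, f0_hz=100.0):
    """Open duration in msec if 'oq' is read as a percentage of the period
    (an assumed interpretation)."""
    period_ms = 1000.0 / f0_hz
    return (oq / 100.0) * period_ms


print(open_ms_from_samples(50))  # 5.0
print(open_ms_from_percent(50))  # 5.0
```

The two readings diverge as soon as F0 or the sampling rate moves away from its default, so it is worth confirming which one applies before interpreting non-default values.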