Klatt Synthesizer

From Phonlab

The Klatt Synthesizer is speech synthesis software designed by Dennis Klatt in 1980. It has undergone several modifications since then and remains a useful tool for tasks that require speech synthesis. The synthesizer is a hybrid of parallel and cascade formant synthesizers, which allows it to model vowels (using the cascade configuration) as well as fricatives and stop bursts (using the parallel configuration).

How to use

Parameters

The parameters for the synthesized recording fall under two categories: constant and varied. Constant parameters, such as duration (du) or number of cascade formants (nf), cannot change during playback. Varied parameters, such as f0, can be set to change at different time points during playback. See Klatt Synthesizer Parameters for a list and description of possible parameters.

Using a .klp file

One method of using the Klatt Synthesizer is to create a .klp file that contains all of the desired parameters. For the synthesizer to read the file properly and produce the intended output, the parameters must be entered carefully in the format described below. A .klp file can be edited with a spreadsheet program, which is useful for visualizing the parameters and making certain types of changes. However, the saved file cannot contain commas, and its extension must be .klp.

A .klp file consists of three ordered sections, all of which are optional:

1. A block of comments at the top

2. A constant parameters section

3. A varied parameters section

Lines in the comment section begin with ‘#’. Constant parameters are provided one per line. Each consists of the parameter name and its integer value, separated by whitespace. The varied parameters section is introduced by a line consisting of exactly:

_varied_params_

This line is followed by a whitespace-delimited table of varied parameters, in which the columns are the parameters and the rows are the time points. The first line of the table is a header defining the columns. Each column name in this header must be a valid parameter name, or it must begin and end with ‘_’, in which case the column is ignored. Note that the column name "_msec_" begins and ends with '_': this column is ignored by the program and is intended only for human readers.

Because the time column is ignored, one cannot specify parameter changes at arbitrary time points in it and expect those events to occur at those times, and the synthesizer will not notice any inconsistencies in the column. Instead, timing is determined by row position: by default, a 5-second utterance should have 1000 rows of varied parameters (one row per 5 ms). If the _varied_params_ section of your .klp file has only 500 rows, for example, the utterance will last only 2.5 seconds, even if it was specified to last 5 seconds.
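The relationship between row count and duration can be checked with a short calculation before running the synthesizer. The following is only a sketch: it assumes the 5 ms per row implied by the default described above (1000 rows for 5 seconds), it assumes varied values are written as integers, and the function names are hypothetical helpers rather than part of the synthesizer.

```python
MS_PER_ROW = 5  # assumption: implied by "5-second utterance = 1000 rows"


def rows_needed(duration_ms, ms_per_row=MS_PER_ROW):
    """Number of table rows needed to cover duration_ms."""
    return -(-duration_ms // ms_per_row)  # ceiling division


def actual_duration_ms(n_rows, ms_per_row=MS_PER_ROW):
    """Duration that a table with n_rows rows will actually produce."""
    return n_rows * ms_per_row


def varied_table(duration_ms, columns):
    """Build the text of a varied parameters section.

    columns maps a parameter name to a function of time (in ms) that
    returns the value for that row. Values are written as integers,
    which is an assumption of this sketch.
    """
    names = list(columns)
    # The _msec_ column is ignored by the synthesizer; it is for readers only.
    lines = ["_varied_params_", "\t".join(["_msec_"] + names)]
    for row in range(rows_needed(duration_ms)):
        t = row * MS_PER_ROW
        values = [str(int(columns[name](t))) for name in names]
        lines.append("\t".join([str(t)] + values))
    return "\n".join(lines) + "\n"
```

For example, rows_needed(5000) gives 1000 and actual_duration_ms(500) gives 2500, matching the figures above. Generating the table from a function of time keeps the human-readable _msec_ column consistent with the row positions that actually determine timing.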

An example of a .klp file complete with parameters can be found in /klsyn/doc/testklp.klp
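To inspect a file such as this example, the format described in this section can be read with a few lines of Python. This is a minimal sketch and not the synthesizer's own reader: parse_klp is a hypothetical name, comment lines are accepted anywhere before the varied section, and varied values are left as strings because this page does not state their type.

```python
def parse_klp(text):
    """Split .klp text into (comments, constants, varied_rows)."""
    comments, constants, varied_rows = [], {}, []
    header = None        # column names, set once the table header is seen
    in_varied = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not in_varied:
            if line.startswith("#"):
                comments.append(line)
            elif line == "_varied_params_":
                in_varied = True
            else:
                # Constant parameter: name and integer value.
                name, value = line.split()
                constants[name] = int(value)
        elif header is None:
            header = line.split()
        else:
            fields = line.split()
            # Columns whose names begin and end with '_' are ignored.
            row = {
                name: field
                for name, field in zip(header, fields)
                if not (name.startswith("_") and name.endswith("_"))
            }
            varied_rows.append(row)
    return comments, constants, varied_rows
```

Multiplying len(varied_rows) by the time step per row then gives the duration the file will actually produce, which can be compared against the du constant.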

Once you have saved your parameter file with the extension .klp, you can run the Klatt Synthesizer by typing

> klattsyn.py myparamfile.klp

into the terminal. A .wav file will then be created containing the synthesized speech according to the parameters.
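To synthesize several parameter files in one go, the command above can be wrapped in a short script. This sketch assumes klattsyn.py is on your PATH and executable, exactly as in the command shown above; synth_command and synthesize_all are hypothetical helper names, not part of the synthesizer.

```python
import subprocess
from pathlib import Path


def synth_command(klp_path, script="klattsyn.py"):
    """Build the command line shown above for one parameter file."""
    path = Path(klp_path)
    if path.suffix != ".klp":
        raise ValueError(f"{path} does not have the .klp extension")
    return [script, str(path)]


def synthesize_all(folder):
    """Run the synthesizer once for every .klp file in folder."""
    for klp in sorted(Path(folder).glob("*.klp")):
        # check=True stops at the first file the synthesizer rejects.
        subprocess.run(synth_command(klp), check=True)
```

Calling synthesize_all(".") would then run the synthesizer on each .klp file in the current directory in alphabetical order.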

Using the interactive interface

A somewhat simpler and less painstaking method is to use the interactive interface, which can be launched by entering

> klattsyn_interactive.py

into the terminal. From here, you can follow the instructions to design your synthesized speech; saving your work creates both a .klp and a .wav file.

Creating a continuum

A wide range of phonetics experiments require the creation of a continuum between two speech sounds. The Klatt Synthesizer allows one to create such a continuum. To do so, enter

> klp_continuum.py

into the terminal. This will prompt the synthesizer to ask for two .klp files between which it can create a continuum. After entering the two .klp files, the synthesizer will automatically create a continuum between the two sounds.

You can specify how many steps you would like to be interpolated by editing the script. Open the script in a text editor and locate the line defining nsteps, adjusting this value to the total number of steps desired (including the endpoints you specified). Be sure to save the script before running!
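To illustrate what nsteps controls, here is a sketch of interpolating between two sets of constant parameters. It assumes simple linear interpolation with integer rounding, which this page does not confirm is what klp_continuum.py does; interpolate_constants is a hypothetical name.

```python
def interpolate_constants(start, end, nsteps):
    """Return nsteps parameter dicts running from start to end inclusive."""
    if nsteps < 2:
        raise ValueError("nsteps must be at least 2 to include both endpoints")
    if start.keys() != end.keys():
        raise ValueError("both endpoints must define the same parameters")
    steps = []
    for i in range(nsteps):
        frac = i / (nsteps - 1)  # 0.0 at the first endpoint, 1.0 at the last
        steps.append({
            name: round(start[name] + frac * (end[name] - start[name]))
            for name in start
        })
    return steps
```

With endpoints F1 = 240, F2 = 2400 and F1 = 740, F2 = 1400 and nsteps = 5, the F1 values are 240, 365, 490, 615 and 740. Because nsteps counts the endpoints, a value of 5 yields three intermediate stimuli.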

Examples

Interactive

Let's make a very simple sound file using the Interactive Klatt Synthesizer. We'll create the cardinal vowel [i], with F1 = 240 Hz and F2 = 2400 Hz:

Start by bringing up the interactive interface:

> klattsyn_interactive.py 

The following interface appears:

     FILE
       <o> to open a .klp or .fb file
       <s> to save your results in .klp and .wav files.
       <p> to play the synthesized speech.
       <q> to quit without saving anything.

     EDIT
       <c> to input a constant parameter.
       <v> to input a varied parameter trajectory.
       <t> to show a table of parameters. 

We're just making the sound [i], so we don't want any varied parameters. Input a constant parameter by entering "c" which prompts

      which parameter? 

We'll enter "F1" which gives us

      what value should it be? 

Enter 240, which brings us back to the menu screen. Repeat this process to change F2 to 2400. If desired, you can hear what your utterance sounds like by entering 'p' (the result should resemble [i]). Save your creation by entering 's'. The synthesizer will ask you for a file name, and it will create both a .klp and a .wav file according to the set parameters.

Synthesizing a whisper

  • Open the .klp file you just made in a text editing or spreadsheet program
  • Make changes to the .klp file to synthesize a ‘whisper’:
    • Set the bandwidth of the first formant (B1) to 500. This makes the first formant much more diffuse. B1 is a constant parameter, so it is found in the first block of parameters at the top of the .klp file.
    • Change the column header av to ah. This replaces voicing with aspiration.
    • Change the column header af to av, then set all values in that column to 0.
  • Save and review your new work
  • You may need to play around with parameters some more to get an authentic sounding whisper.
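The edits listed above can also be applied with a script instead of by hand. The sketch below is one reading of those steps, not a tested recipe: make_whisper is a hypothetical name, parameter names are taken as written on this page (B1 is matched case-insensitively in case your file spells it differently), and B1 is only changed if it is already present in the file.

```python
RENAME = {"av": "ah", "af": "av"}  # header changes from the steps above


def make_whisper(klp_text, b1=500):
    """Apply the whisper edits to the text of a .klp file."""
    out = []
    state = "constants"   # then "header", then "rows"
    zero_index = None     # position of the column renamed from af to av
    for line in klp_text.splitlines():
        fields = line.split()
        if state == "constants":
            if line.strip() == "_varied_params_":
                state = "header"
            elif len(fields) == 2 and fields[0].lower() == "b1":
                line = f"{fields[0]}\t{b1}"  # widen the first formant
        elif state == "header" and fields:
            if "af" in fields:
                zero_index = fields.index("af")
            # Both renames happen in one pass, so av and af do not collide.
            line = "\t".join(RENAME.get(name, name) for name in fields)
            state = "rows"
        elif state == "rows" and fields:
            if zero_index is not None and zero_index < len(fields):
                fields[zero_index] = "0"  # new av column: no voicing
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"
```

The function takes and returns text, so it can be used by reading the original .klp file, writing the result to a new file with the .klp extension, and running the synthesizer on that file. If the table already has an ah column, the rename would create a duplicate header, so check the result before synthesizing.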