Multi align examples
Revision as of 11:53, 30 November 2018
This page illustrates usage of the multi_align command for forced alignment. For the full set of options, execute:
multi_align --help
The examples on this page use audio that contains the utterance 'The north wind and the sun', either as a single channel or a stereo recording in which the first two words are in the first channel and the remaining words are in the second channel.
Default behavior of multi_align
The only required argument of multi_align is the name of a .wav file to be aligned. By default, the transcript of the audio is expected to be provided by the labels of a textgrid with the same basename as the .wav file and the extension .TextGrid. The screenshot shows the audio and its associated textgrid.
If the audio in the screenshot is saved as nws_mono.wav and the textgrid as nws_mono.TextGrid, then the following command performs alignment:
multi_align nws_mono.wav
The resulting textgrid contains three tiers, named 'phone', 'word', and 'trs'. The first contains the phone alignments, the second contains the word alignments, and the last contains the original transcript labels.
The output filename uses the same name as the inputs, with the extension .multi_align.TextGrid.
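The naming convention above can be checked before running the aligner. The following is a minimal sketch and not part of multi_align: the helper names default_transcript and default_output are hypothetical, and the sketch assumes the names are formed by simply replacing the .wav extension as described above.

```shell
# Hypothetical helpers (not part of multi_align) that mirror the default
# naming convention described above for a given .wav file.

# Print the transcript name that a default run is expected to look for.
default_transcript() {
    printf '%s\n' "${1%.wav}.TextGrid"
}

# Print the output name that a default run is expected to produce.
default_output() {
    printf '%s\n' "${1%.wav}.multi_align.TextGrid"
}
```

For example, `default_transcript nws_mono.wav` prints nws_mono.TextGrid, which can be tested with `[ -f ... ]` before calling multi_align.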
Specify a non-default input transcript
The input transcript does not have to have the same basename as the .wav file. Use the --input parameter to specify the name of the input transcript. The basename of --input is used to form the output filename:
multi_align --input nws_mono.v2.TextGrid nws_mono.wav # output file is nws_mono.v2.multi_align.TextGrid
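A script that needs to locate the result afterwards can derive the output name from the --input value. This is a rough sketch under one assumption taken from the example above: only the final extension of the --input name is dropped before .multi_align.TextGrid is appended. The helper name output_for_input is hypothetical.

```shell
# Hypothetical helper (not part of multi_align): derive the expected output
# filename from the value passed to --input.
# Assumption: only the last extension is removed, as in the example above.
output_for_input() {
    printf '%s\n' "${1%.*}.multi_align.TextGrid"
}
```

With this helper, `output_for_input nws_mono.v2.TextGrid` prints nws_mono.v2.multi_align.TextGrid, matching the comment in the command above.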
Simple alignment of a single utterance
If the audio file contains a single utterance, you might prefer to skip creating a textgrid file and provide the transcript in a simple text file or on the command line. For example, if you have a simple text file named nws_mono.txt:
The north wind and the sun.
Then you can align by specifying that the --input-type is a text file:
multi_align --input-type text nws_mono.wav
By default, multi_align looks for a .txt file that matches the .wav name when --input-type text is used. You can override this default with the --input parameter.
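If the transcript is not yet on disk, the text file can be created from the shell so that the default .txt lookup finds it. This is only a sketch: write_transcript is a hypothetical helper, not part of multi_align, and it assumes a one-line transcript is all the text input needs.

```shell
# Hypothetical helper (not part of multi_align): write a one-line transcript
# to a .txt file whose name matches the given .wav file.
write_transcript() {
    wav=$1
    text=$2
    printf '%s\n' "$text" > "${wav%.wav}.txt"
}

# Example usage (left commented out so that sourcing this file has no side effects):
#   write_transcript nws_mono.wav 'The north wind and the sun.'
#   multi_align --input-type text nws_mono.wav
```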
A second way to do simple alignment is to include the transcript on the command line. To do this, specify that the --input-type is raw. When raw is used, then the --input parameter should contain the transcript rather than a filename:
multi_align --input-type raw --input 'The north wind and the sun' nws_mono.wav
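Because the transcript contains spaces, it has to reach multi_align as a single argument, and a transcript held in a variable must be double-quoted when it is expanded. The wrapper below is a sketch of that quoting. The function name align_raw is hypothetical, and the only multi_align options it uses are the ones shown above.

```shell
# Hypothetical wrapper (not part of multi_align): align a .wav file against a
# transcript given as a string.
#   $1 = transcript text, $2 = .wav file
align_raw() {
    # Double quotes keep the whole transcript as one argument, even when it
    # contains spaces or an apostrophe.
    multi_align --input-type raw --input "$1" "$2"
}

# Example usage:
#   align_raw 'The north wind and the sun' nws_mono.wav
```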