The UC Berkeley Up Project is a speech corpus and longitudinal phonetic study based on the "Up" series of documentary films by director Michael Apted, which shows a set of individuals at seven-year intervals over a period of 42 years.
The corpus
The Up corpus contains audio recordings extracted from the Up films. These recordings comprise:
- 250 utterances
- 11 speakers, 9 of whom have utterances at every age represented
- 21,328 word tokens
- 27,921 vowel tokens
- 41,284 consonant tokens