The Berkeley Phonetics & Phonology Forum ("Phorum") is a weekly talk and discussion series featuring presentations on all aspects of phonology and phonetics.


Mondays 12-1
1303 Dwinelle


Sarah Bakst

Matt Faytak

Berkeley Phonetics and Phonology Forum

Schedule of Talks for Fall 2014

October 27 -

Martin Kohlberger (Universiteit Leiden)
Phonetic evidence for gradience: Spanish resyllabification and voicing reconsidered

Abstract coming soon

December 1 -

Nik Rolle (with Larry Hyman) (UC Berkeley)
Title TBA

December 15 -

Robert Eklund (Linköping University)
The neural correlates of filled pauses: An fMRI study of disfluency perception

A characteristic of spontaneous spoken language is that no one is completely fluent. When speaking (or signing) everyone exhibits a certain degree of disfluency, to use the most common term. The average frequency of disfluency has been reported to be around 6% at word-level (Eklund, 2004; Bortfeld et al., 2001; Brennan & Schober, 2001; Fox Tree, 1995; Oviatt, 1995). This talk presents results from a perception study of disfluency in spontaneous speech (Eklund, 2004:175-196). Unfilled pauses (silences; UPs) and filled pauses (“eh”; FPs) were excerpted from spontaneously produced dialogs and were played to subjects in an fMRI experiment where the subjects acted as silent interlocutors.

The results exhibited increased activity in Primary Auditory Cortex for both types of stimuli. However, and more interestingly, FPs, but not UPs, also elicited significant modulation in the Supplementary Motor Area (Brodmann Area 6). To the best of our knowledge, this study represents the first neurocognitive confirmation of the well-known difference between FPs and all other kinds of speech disfluency. The results also provide an explanation for the previously reported beneficial effect of FPs on reaction times in speech perception, and are also interesting in light of the alleged role of FPs as “floor-holding devices” in human dialog, since FPs seemingly activate motor programs in the listener.

The observed activation also has implications for various other perspectives on human communicative and motor actions, such as mirror neuron theory and the motor theory of speech perception, and may even help shed light on the oft-reported very short inter-speaker intervals (ISIs) in human-human dialog.

Previous Meetings

September 8 -

Myfany Turpin (University of Queensland)
Linguistic fieldwork and song

Jointly hosted by Phorum and FForum


September 15 -

Megha Sundara (UCLA)
Phonetic similarity biases phonological learning in infants

Researchers have suggested that learners are biased to prefer phonological mappings between sounds that are phonetically similar (Steriade, 2001; Peperkamp et al., 2006; Wilson, 2006; White, 2014). We tested this claim in one artificial language study and one natural language study involving 12-month-old infants. Our results demonstrate that infants generalize phonological learning in ways that are not predicted from the input alone and sometimes even fail to learn patterns available in their input. These findings show that input statistics alone cannot account for how infants learn phonological alternations. Instead, phonological learning is biased by a preference for alternations involving phonetically similar segments.

September 22 -

Ronald Sprouse and Keith Johnson (UC Berkeley)
The Berkeley Phonetics Machine

We will demonstrate a virtual Linux machine that we are calling The Berkeley Phonetics Machine. BPM runs in VirtualBox on Mac, PC, and Linux, and has a number of standard and custom software packages for phonetic research.

Custom Berkeley software:

The machine also includes standard packages: opensesame, Praat, Wavesurfer, sox, ffmpeg, edgetrack. In the demo, we will show the machine and discuss how to download and run it.

September 29 -

Larry Hyman (UC Berkeley)
Initial Vowel Length in Lulamogi: Cyclicity or Globality?

Over the past several decades there has been recurrent skepticism concerning cyclic derivations in phonology, one of the most central tenets of traditional generative and lexical phonology and morphology. Some of the proposed cyclic analyses have been argued not to require cyclicity, or to represent lexical relations that are not totally productive (as in certain cases in English). For those surviving cases, a major strategy within optimality theory has been to capture cyclic relations by surface "output-output" (O/O) constraints. Thus, to take a standard example, génerative and derívative have different stress patterns not because they are literally derived from génerate and deríve, but because the stress of each derivative must agree with the output stress of its respective corresponding base. A particularly explicit (and hence falsifiable) component of O/O correspondence is that "a cyclic Base must be a freely occurring expression, a phrase or a free-standing word (Benua 1997, Kager 1999; Kenstowicz 1996; cf. Bermúdez-Otero 2010, Kiparsky 1998, Trommer 2013 for critical discussion and proposed counterevidence)" (Steriade 2013). In this paper I draw on original data from Lulamogi, a previously almost unstudied Bantu language of Uganda, to show that the most insightful analysis of a vowel length alternation requires either cyclicity or global reference to internal morphological structure and, in many cases, a non-free-standing base.

October 6 -

Gregory Finley (UC Berkeley)
Recruitment of dual-source binaural cues by hearing impaired listeners

Speaker's disclaimer: This is not a linguistics talk, but it may appeal to those interested in perception, hearing science, or industry research in general.

In this talk I present a project I undertook while an intern at Starkey Hearing Technologies. Listeners with hearing impairment perform simple binaural tasks such as sound localization nearly as well as normal hearing listeners, but for more complex conditions, such as speech recognition in noise, their binaural abilities fall short. How well would these listeners handle a simple task performed on a complex scene? I devised a study to test lateralization of two sources, a male and a female talker, at fine differences in simulated location. Performance suffered somewhat with age and hearing loss, with some older listeners completely unable to perform the task under some conditions. The results' similarities and differences to those of other experimental paradigms are discussed, as are plans for further research.

October 13 -

Andréa Davis (U Arizona / UC Berkeley)
When more is not better: Variable input in the formation of robust word representations

A number of studies with infants and with young children suggest that hearing words produced by multiple talkers helps learners to develop more robust word representations (Richtsmeier et al. 2009, Rost & McMurray 2009, 2010). Native adult learners, however, do not seem to derive the same benefit from multiple talkers. Two word learning studies, with native English-speaking adults and with second-language English-speaking adults, were conducted. In both studies, participants learned 4 new English-like minimal-pair words either from a single talker or from multiple talkers. They were then tested with a) a perceptual task, in which they saw the two pictures corresponding to a minimal pair, heard one of the pair, and had to choose the picture corresponding to the word they heard, and b) for the native English speakers, a speeded production task, in which they had to repeat the words they had just learned as quickly as possible. Unlike infants and young children, native English-speaking adults did not differ significantly between the multiple talker group and the single talker group, either in perceptual accuracy or in production. However, the second-language English speakers did differ significantly, but only when they were less proficient in English. The more proficient speakers, like the native speakers, did not differ. Taken together, these results suggest that proficiency plays a role in whether learners benefit from variable input.

October 20 -

David Conant (UCSF)
A multi-modal imaging system for simultaneous measurement of speech articulator kinematics compatible with human electrophysiology

Speech articulation involves the rapid, coordinated movement of speech articulators (e.g. lips, jaw, tongue, and larynx). Most neuroscience investigations of speech have relied upon static, binary features instead of dynamic articulator kinematics. However, a complete neurobiological understanding of speech motor control requires determining the relationship between simultaneously recorded neural activity and the kinematics of all articulators. Many speech articulators are internal to the vocal tract, and so simultaneously tracking the kinematics of all articulators is difficult, especially in the context of human electrophysiology recordings.

Here, we describe a non-invasive, multi-modal imaging system for simultaneously tracking the movement of the lips, jaw, tongue and larynx, for use in human neuroscience at the bedside. We combined three non-invasive methods previously used separately: videography to track the lips and jaw, electroglottography to monitor the larynx, and ultrasonography to track the tongue. To characterize this system, we recorded articulator positions and acoustics from six speakers during multiple productions of nine American English vowels.

We first describe processing methods for the robust extraction of kinematic parameters from the raw signals, and alignment/scaling methods to account for artifactual variability across recording conditions. These results generally confirm the importance of tongue height and tongue frontness, and the first and second formants, but show significant cross-subject variability. We used unsupervised matrix factorization techniques to extract 'basis sets' of vocal tract 'shapes' associated with different vowels. This data-driven approach may preserve information about vocal tract 'shape' better than traditional point extraction methods. Finally, we developed a statistical speech synthesizer to convert measurements of the vocal tract to audible speech and were able to reconstruct perceptible speech from measured articulatory features. These results demonstrate a multi-modal experimental system to non-invasively monitor articulator kinematics during speech articulation, and describe novel analytic methods for relating kinematic data to speech acoustics.