Schedule of Talks for Fall 2014
December 1 -
Nik Rolle (with Larry Hyman) (UC Berkeley)
December 15 -
Robert Eklund (Linköping University)
The neural correlates of filled pauses: An fMRI study of disfluency perception
A characteristic of spontaneous spoken language is that no one is completely fluent. When
speaking (or signing) everyone exhibits a certain degree of disfluency, to use the most
common term. The average frequency of disfluency has been reported to be around 6% at the
word level (Eklund, 2004; Bortfeld et al., 2001; Brennan & Schober, 2001; Fox Tree, 1995;
Oviatt, 1995). This talk presents results from a perception study of disfluency in spontaneous speech
(Eklund, 2004:175-196). Unfilled pauses (silences; UPs) and filled pauses (“eh”; FPs) were
excerpted from spontaneously produced dialogs and were played to subjects in an fMRI
experiment where the subjects acted as silent interlocutors.
The results exhibited increased activity in Primary Auditory Cortex for both types of stimuli.
However, and more interestingly, FPs, but not UPs, also elicited significant modulation
in the Supplementary Motor Area (Brodmann Area 6). To the best of our knowledge, this
study represents the first neurocognitive confirmation of the well-known difference between
FPs and all other kinds of speech disfluency. The results also provide an explanation for the
previously reported beneficial effect of FPs on reaction times in speech perception, and are also
interesting in light of the alleged role of FPs as “floor-holding devices” in human dialog, since FPs
seemingly activate motor programs in the listener.
The observed activation also has implications for various other perspectives on human
communicative and motor actions, such as mirror neuron theory and the motor theory of speech
perception, and may even help shed light on the oft-reported very short inter-speaker intervals
(ISIs) in human-human dialog.
September 8 -
Myfany Turpin (University of Queensland)
Linguistic fieldwork and song
Jointly hosted by Phorum and FForum
September 15 -
Megha Sundara (UCLA)
Phonetic similarity biases phonological learning in infants
Researchers have suggested that learners are biased to prefer phonological mappings between sounds that are phonetically similar
(Steriade, 2001; Peperkamp et al., 2006; Wilson, 2006; White, 2014). We tested this claim in one artificial language study and one natural language study
involving 12-month-old infants. Our results demonstrate that infants generalize phonological learning in ways that are not predicted from the input alone
and sometimes even fail to learn patterns available in their input. These findings show that input statistics alone cannot account for how infants learn
phonological alternations. Instead, phonological learning is biased by a preference for alternations involving phonetically similar segments.
September 22 -
Ronald Sprouse and Keith Johnson (UC Berkeley)
The Berkeley Phonetics Machine
We will demonstrate a virtual Linux machine that we are calling The Berkeley Phonetics Machine. BPM runs in VirtualBox on Mac, PC, and Linux, and has
a number of standard and custom software packages for phonetic research.
Custom Berkeley software:
- ifcformant - Inverse Filter Control formant tracking, which seems to be a very good formant tracking method.
- make_text_grids - An implementation of the Penn Forced Aligner for aligning with marked up transcripts.
- wxklsyn - An implementation of the Klatt speech synthesizer.
- xwaves - A collection of speech signal processing tools for command-line scripting.
- convertlabel - A tool for converting between audio annotation formats.
- audiolabel - A Python library for reading audio annotation files.
The machine also includes standard packages: opensesame, Praat, Wavesurfer, sox, ffmpeg, edgetrack. In the demo we will show the machine and discuss how to download and run it.
September 29 -
Larry Hyman (UC Berkeley)
Initial Vowel Length in Lulamogi: Cyclicity or Globality?
Over the past several decades there has been recurrent skepticism concerning cyclic derivations in phonology, one of the most central tenets of
traditional generative and lexical phonology and morphology. Some of the proposed cyclic analyses have been argued not to require cyclicity, or to represent
lexical relations that are not totally productive (as in certain cases in English). For those surviving cases, a major strategy within optimality theory has
been to capture cyclic relations by surface "output-output" (O/O) constraints. Thus, to take a standard example, génerative and derívative have different stress
patterns not because they are literally derived from génerate and deríve, but because the stress of each derivative must agree with the output stress of its
respective corresponding base. A particularly explicit (and hence falsifiable) component of O/O correspondence is that "a cyclic Base must be a freely occurring
expression, a phrase or a free-standing word (Benua 1997, Kager 1999; Kenstowicz 1996; cf. Bermúdez-Otero 2010, Kiparsky 1998, Trommer 2013 for critical discussion
and proposed counterevidence)" (Steriade 2013). In this paper I draw on original data from Lulamogi, a previously almost unstudied Bantu language of Uganda, to
show that the most insightful analysis of a vowel length alternation requires either cyclicity or global reference to internal morphological structure and, in many
cases, a non-free-standing base.
October 6 -
Gregory Finley (UC Berkeley)
Speaker's disclaimer: This is not a linguistics talk, but it may appeal to those interested in perception, hearing science, or industry research in general.
Recruitment of dual-source binaural cues by hearing impaired listeners
In this talk I present a project I undertook while an intern at Starkey Hearing Technologies. Listeners with hearing impairment perform simple binaural
tasks such as sound localization nearly as well as normal hearing listeners, but for more complex conditions, such as speech recognition in noise, their binaural
abilities fall short. How well would these listeners handle a simple task performed on a complex scene? I devised a study to test lateralization of two sources, a
male and a female talker, at fine differences in simulated location. Performance suffered somewhat with age and hearing loss, with some older listeners completely
unable to perform the task under some conditions. The results' similarities and differences to those of other experimental paradigms are discussed, as are plans for further research.
October 13 -
Andréa Davis (U Arizona / UC Berkeley)
When more is not better: Variable input in the formation of robust word representations
A number of studies with infants and with young children suggest that hearing words produced by multiple talkers helps learners to develop more robust
word representations (Richtsmeier et al. 2009, Rost & McMurray 2009, 2010). Native adult learners, however, do not seem to derive the same benefit from multiple talkers.
Two word learning studies, with native English-speaking adults and with second-language English-speaking adults, were conducted. In both studies, participants learned 4
new English-like minimal pair words either from a single talker or from multiple talkers. They were then tested with a) a perceptual task, in which they saw the
two pictures corresponding to a minimal pair, heard one of the pair, and had to choose the picture corresponding to the word they heard, and b) for the native English speakers,
a speeded production task, in which they had to repeat the words they had just learned as quickly as possible. Unlike infants and young children, native English-speaking
adults did not differ significantly between the multiple talker group and the single talker group, either in perceptual accuracy, or in production. However, the second
language English speakers did differ significantly, but only when they were less proficient in English. The more proficient speakers, like the native speakers, did not
differ. Taken together, these results suggest that proficiency plays a role in whether learners benefit from variable input.
October 20 -
David Conant (UCSF)
A multi-modal imaging system for simultaneous measurement of speech articulator kinematics compatible with human electrophysiology
Speech articulation involves the rapid, coordinated movement of speech articulators (e.g. lips, jaw, tongue, and larynx). Most neuroscience investigations
of speech have relied upon static, binary features instead of dynamic articulator kinematics. However, a complete neurobiological understanding of speech motor control
requires determining the relationship between simultaneously recorded neural activity and the kinematics of all articulators. Many speech articulators are internal to the
vocal tract, and so simultaneously tracking the kinematics of all articulators is difficult, especially in the context of human electrophysiology recordings.
Here, we describe a noninvasive, multi-modal imaging system for simultaneously tracking the movement of the lips, jaw, tongue and larynx, suitable for use in
human neuroscience at the bedside. We combined three non-invasive methods previously used separately: videography to track the lips and jaw, electroglottography to monitor
the larynx, and ultrasonography to track the tongue. To characterize this system, we recorded articulator positions and acoustics from six speakers during multiple
productions of nine American English vowels.
We first describe processing methods for the robust extraction of kinematic parameters from the raw signals and alignment/scaling methods to account for artifactual
variability across recording conditions. These results generally confirm the importance of tongue height and tongue frontness, and the first and second formants, but
show significant cross-subject variability. We used unsupervised matrix factorization techniques to extract 'basis sets' of vocal tract 'shapes' associated with different
vowels. This data-driven approach may preserve information about vocal tract 'shape' better than traditional point extraction methods. Finally, we developed a statistical
speech synthesizer to convert measurements of the vocal tract to audible speech and were able to reconstruct perceptible speech from measured articulatory features. These
results demonstrate a multi-modal experimental system to non-invasively monitor articulator kinematics during speech articulation, and describe novel analytic methods for
relating kinematic data to speech acoustics.
October 27 -
Martin Kohlberger (Universiteit Leiden)
Phonetic evidence for gradience: Spanish resyllabification and voicing reconsidered
Abstract coming soon
Talks from Previous Semesters