Several UM linguists will be presenting their work at the 158th meeting of the Acoustical Society of America in San Antonio, Oct. 26-30.
Presenters and abstracts are listed below.
The perceptual time course of coarticulatory nasalization.
Patrice S. Beddor, Julie E. Boland, Andries Coetzee, Kevin McGowan
Abstract:
Listeners’ moment‐by‐moment processing of anticipatory vowel nasalization and a following nasal consonant was investigated. English‐speaking participants’ eye movements were monitored as they heard instructions to look at one of two pictured objects on a computer screen. Trials included pictured pairs for naturally produced words of the form CVNC‐CVC (e.g., bend‐bed), CVNC‐CVNC (bend‐bent), and CVC‐CVC (bed‐bet). Vowels in CVNC words were coarticulatorily nasalized. Results to date show that, when participants heard a CVNC word (bend), they visually fixated the correct picture earlier when the competing picture was CVC (bed)—that is, when the vowel in the competitor would be expected to be non‐nasal—than when the competitor was another CVNC word (bent). Results also suggest that participants often fixated the target CVNC picture in CVNC‐CVC trials after onset of vowel nasalization but before N onset. However, although vowel nasalization facilitated early selection of CVNC over CVC, a non‐nasalized vowel was not similarly helpful for selecting CVC over CVNC. When participants heard CVC (bed), they did not fixate the correct picture earlier when the competing picture was CVNC (bend) than when the competitor was CVC (bet). Findings are interpreted in light of production data for English and perceptual theories.
Nasal coarticulation in clear speech
Anthony Brasher
Abstract:
This study tests whether speakers, when trying to speak clearly, employ variable enhancement strategies as a function of phonetic environment. Using aerodynamic and acoustical methods, this study examines the effects of phonemic context and speaking modality on the spatial and temporal extent of anticipatory nasal coarticulation in English. Target words are English (C)VNCvoiced (e.g., bend) and (C)VNCvoiceless (e.g., bent) words spoken in either clear or citation speech modes. In order to enhance the percept of /n/ in clear speech, speakers increase the duration of the nasal consonant in CVNCvoiced words but marginally increase, or even decrease, /n/ duration in CVNCvoiceless words. While highly variable, airflow results suggest little difference in anticipatory nasalization as a function of speech mode. These results argue against models predicting a global reduction in coarticulation in clear speech.
Effects of prosodic structure on the relative timing of articulators in English lateral production.
Susan S. Lin
Abstract:
Previous research has established that American English speakers tend to produce syllable‐final /l/ with movement of the tongue dorsum preceding movement of the tongue tip. However, the results of these studies differ with respect to the articulator timing in syllable‐initial /l/, with some claiming synchrony (Browman and Goldstein, 1995) and others claiming asynchrony in the direction opposite that of syllable‐final /l/ (Gick, 2003). This study uses ultrasound imaging to investigate the relative timing of the tongue tip and dorsum during production of syllable‐initial and syllable‐final /l/ in multiple prosodic contexts. Prosody has a significant effect on both duration and extent of articulator movement in speech production: onsets of larger prosodic units involve larger and longer movement than onsets of smaller prosodic units (Keating, 2006). The explanation that these effects result from speakers’ attempts to make the segments that initiate phrases and utterances perceptually clearer suggests that examining these segments at varying prosodic positions may provide insight into speakers’ knowledge of speech perception. Current preliminary results show that American English speakers may utilize at least two distinct timing relations in initial laterals, supporting the position that this knowledge may vary between speakers.
Aerodynamic modeling for concatenative speech synthesis.
Kevin B. McGowan
Abstract:
Listeners can perceive and use a wide array of fine‐grained phonetic details, including the detailed coarticulatory influences of adjacent sounds, when perceiving speech. Details like anticipatory nasalization can, for example, potentially provide the listener with a rich network of informative cues and are a key to understanding listeners’ ability to disambiguate speech sounds from seemingly ambiguous input. Unfortunately, these coarticulatory cues are generally missing or contradictory in the output of speech synthesis systems. These systems work by concatenating variable‐length sound units chosen from a large database of recorded speech. Units are chosen to minimize two functions: the cost of aligning a particular unit with the desired speech output (target cost) and the cost of adjoining the next sound to the most recently selected unit (join cost). Generally, these costs are calculated using features which can be automatically extracted from the acoustic speech signal. A unit selection database is created, automatically segmented and automatically labeled with nasal and oral airflow feature vectors. These aerodynamic features are used as a proxy for articulatory information in the calculation of join and target cost functions. Listeners’ mean opinion scores are obtained on output from this system and a baseline acoustic system for comparison.
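For readers unfamiliar with unit selection, the target-cost and join-cost minimization described in the abstract is commonly solved with a dynamic-programming (Viterbi-style) search. The sketch below is only an illustration and is not taken from the presented system: the function names, the two-element (nasal, oral) airflow vectors, and the use of Manhattan distance for both costs are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Hypothetical feature vector: (nasal airflow, oral airflow), arbitrary units.
Unit = Tuple[int, int]


def manhattan(a: Sequence[int], b: Sequence[int]) -> int:
    """Assumed stand-in for both the target cost and the join cost."""
    return sum(abs(x - y) for x, y in zip(a, b))


def select_units(
    targets: List[Unit],
    candidates: List[List[Unit]],
    target_cost=manhattan,
    join_cost=manhattan,
) -> Tuple[List[Unit], int]:
    """Pick one candidate per position, minimizing summed target + join costs."""
    # best[i][j] = (cheapest total cost of a path ending in candidates[i][j],
    #               index of the previous candidate on that path)
    best = [[(target_cost(targets[0], u), -1) for u in candidates[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in candidates[i]:
            # Cheapest way to reach u from any candidate at the previous position.
            cost, prev = min(
                (best[i - 1][k][0] + join_cost(p, u), k)
                for k, p in enumerate(candidates[i - 1])
            )
            row.append((cost + target_cost(targets[i], u), prev))
        best.append(row)

    # Backtrack from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(candidates[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


# Toy data (invented for illustration only).
targets = [(1, 5), (4, 1)]
candidates = [[(0, 5), (2, 4)], [(4, 0), (3, 3)]]
path, total = select_units(targets, candidates)
print(path, total)
```

In this toy example, choosing the closest unit to each target in isolation gives a higher total cost than the jointly optimal sequence, because the join cost between the two individually best units is large. That trade-off is the reason the search considers both costs together.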