The prosodic/lexical interface: Effects of prosodic domain on recognition of onset-embedded words

Mikhail Masharov, Katherine Crosswhite, Joyce McDonough & Michael K. Tanenhaus
University of Rochester

mym@bcs.rochester.edu

 

Recognition of a word is correlated with its uniqueness point, i.e., the point at which the phonetic input disambiguates the word from its lexical alternatives.  Recent work in phonetics demonstrates consistent effects of position in a prosodic domain on the initial portion of a stressed syllable [1].  In strong positions, vowels are lengthened, especially in monosyllabic words, and co-articulation is strongly reduced compared to weak positions.  For cohort pairs such as beaker and beetle, these changes would be expected to delay the point of disambiguation, resulting in longer-lasting lexical competition.  However, for onset-embedded words such as doll-dolphin and pen-pencil, prosodic strengthening would exaggerate differences in vowel duration, making vowel duration a stronger and more reliable cue for distinguishing a monosyllabic word, such as doll, from a disyllabic competitor, such as dolphin [2, 3].

An analysis of productions by 8 native speakers confirmed large and systematic differences in vowel duration for 24 pairs of embedded words, such as doll and dolphin, in weak (medial) and strong (phrase-final) prosodic domains (Put the dolphin/doll below the triangle and Now, click on the doll/dolphin, respectively).  Vowel duration differences between monosyllabic and disyllabic words averaged 30 ms in medial position and 90 ms in final position, with most of the effect due to large increases in vowel duration for monosyllabic words in utterance-final position.
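The duration comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the analysis procedure actually used: the function name, the record layout (pair, position, word type, vowel duration in ms), and the toy values are all assumptions, and the toy values are placeholders chosen to reproduce the reported averages, not measurements from the study.

```python
from collections import defaultdict
from statistics import mean


def mean_duration_difference(measurements):
    """Mean (monosyllabic - disyllabic) vowel duration per prosodic position.

    measurements: iterable of (pair, position, word_type, vowel_ms), where
    word_type is "mono" or "di". This record layout is hypothetical.
    """
    # Collect all tokens for each (pair, position, word type) cell.
    cells = defaultdict(list)
    for pair, position, word_type, vowel_ms in measurements:
        cells[(pair, position, word_type)].append(vowel_ms)

    # For each pair and position, subtract the disyllabic mean from the
    # monosyllabic mean; skip cells that lack a disyllabic counterpart.
    diffs = defaultdict(list)
    for (pair, position, word_type), values in list(cells.items()):
        if word_type != "mono":
            continue
        di_key = (pair, position, "di")
        if di_key in cells:
            diffs[position].append(mean(values) - mean(cells[di_key]))

    # Average the per-pair differences within each position.
    return {position: mean(values) for position, values in diffs.items()}


# Placeholder values for illustration only, NOT data from the study.
toy = [
    ("doll/dolphin", "medial", "mono", 130.0),
    ("doll/dolphin", "medial", "di", 100.0),
    ("doll/dolphin", "final", "mono", 210.0),
    ("doll/dolphin", "final", "di", 120.0),
    ("pen/pencil", "medial", "mono", 120.0),
    ("pen/pencil", "medial", "di", 90.0),
    ("pen/pencil", "final", "mono", 200.0),
    ("pen/pencil", "final", "di", 110.0),
]
result = mean_duration_difference(toy)
```

Averaging within each pair first, and only then across pairs, keeps a pair with many tokens from dominating the position mean.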

We then monitored eye movements as participants followed instructions, recorded by a representative speaker, to click on (Now, click on the doll/dolphin) or move (Put the doll/dolphin below the triangle) objects.  Displays contained four pictures, e.g., a doll, a dolphin, and two unrelated pictures.  Although the phonemic point of disambiguation occurred 100 ms later in utterance-final position than in utterance-medial position, looks to the target and cohort diverged more than 100 ms earlier, demonstrating that listeners were using vowel duration information to help disambiguate between the target and its competitor.  This was confirmed in a second experiment in which we used cross-spliced tokens in strong prosodic domains (e.g., doll from dol/phin, and doll+phin from dolphin).  Misleading vowel information dramatically increased cohort effects and delayed the point where looks diverged between targets and cohorts.
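One way to operationalize the point at which looks to target and cohort diverge is sketched below in Python. The abstract does not state the criterion that was used, so the rule here (the target advantage must exceed a threshold for several consecutive time bins) is an assumption, as are the function name, the default parameter values, and the toy fixation proportions, which are placeholders rather than experimental data.

```python
def divergence_time(times_ms, target_prop, competitor_prop, threshold=0.1, run=3):
    """Return the onset time of the first stretch of `run` consecutive bins in
    which target minus competitor fixation proportion exceeds `threshold`.

    Returns None if no such stretch exists. The criterion is hypothetical.
    """
    streak = 0
    for i, (target, competitor) in enumerate(zip(target_prop, competitor_prop)):
        if target - competitor > threshold:
            streak += 1
            if streak == run:
                # Report the first bin of the qualifying stretch.
                return times_ms[i - run + 1]
        else:
            streak = 0
    return None


# Placeholder fixation proportions for illustration only, NOT data from the study.
times = [0, 50, 100, 150, 200, 250, 300]
target = [0.25, 0.27, 0.30, 0.45, 0.55, 0.65, 0.75]
cohort = [0.25, 0.26, 0.29, 0.30, 0.28, 0.20, 0.15]
onset = divergence_time(times, target, cohort)
```

Requiring a run of consecutive bins, rather than a single bin over threshold, guards against counting a momentary fluctuation as the divergence point.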

The results demonstrate that vowel duration is used as an on-line cue to help disambiguate between polysyllabic words and onset-embedded competitors.  More importantly, the results suggest that models of lexical access will need to take into account information about prosodic domains.

 

References

[1] Fougeron, C., & Keating, P. (1997).  Articulatory strengthening at edges of prosodic domains.  Journal of the Acoustical Society of America, 101, 3728-3740.

[2] Davis, Marslen-Wilson, & Gaskell (in press).  Leading up the lexical garden-path: Segmentation and ambiguity in spoken word recognition.  Journal of Experimental Psychology: Human Perception and Performance.

[3] Salverda, A. P., Dahan, D., & McQueen, J. (2001).  Effects of vowel duration on the processing of onset embedded words.