Comprehension complexity and corpus frequencies in noun phrase conjunction

Timothy Desmet1 & Edward Gibson2
1 Ghent University, 2 Massachusetts Institute of Technology


A number of researchers have proposed that sentence comprehension is frequency driven, such that the ease of understanding a construction depends on its frequency of use (e.g., Jurafsky, 1996; MacDonald & Seidenberg, 1999; Mitchell et al., 1995; Tabor, Juliano & Tanenhaus, 1997).  In apparent contradiction to such accounts, Gibson and Schutze (1999) showed that on-line disambiguation preferences do not always mirror corpus frequencies.  When presented with a syntactic ambiguity in which a conjoined noun phrase can attach to any of three possible sites, participants were faster to read attachments to the first (highest) site (e.g., 1a) than attachments to the second (middle) site (e.g., 1b), although the latter were shown to be more frequent in text corpora (Gibson, Schutze, & Salomon, 1996).

(1) The kids' magazine printed a story about a haunted house near a pond and...
a. one about an old mansion near a river because Halloween was coming soon.
b. one near a river because Halloween was coming soon.

In the present study, we investigated whether a particular feature in the items of Gibson and Schutze --- disambiguation using the pronoun 'one' --- could account for the discrepancy they found.  According to some theoretical accounts, the presence of a pronoun can induce a high attachment preference (e.g., Hemforth, Konieczny, & Scheepers, 2000).  Moreover, an investigation of the Brown and WSJ corpora reveals that most of the instances of NP conjunctions with three possible attachment sites involve conjoining a syntactically and/or semantically parallel NP which does not include a pronoun.  Therefore, we directly compared high and middle attachments that either (a) contained the pronoun 'one' (e.g., 2a and 2b) or (b) were parallel, but did not contain a pronoun (e.g., 2c and 2d).

(2) A column about a soccer team from the suburbs and...
a. one about a baseball team from the city were published in the Sunday edition.
b. one from the city was published in the Sunday edition.
c. an article about a baseball team from the city were published in the Sunday edition.
d. a baseball team from the city was published in the Sunday edition.

A self-paced word-by-word reading study demonstrated that the presence of this pronoun is indeed responsible for the high attachment preference in the conjunction ambiguity.  When the disambiguation contained the pronoun 'one', we replicated the high attachment preference found by Gibson and Schutze (1999).  But when no pronoun was present, we found evidence that the middle attachments were read faster than the high attachments, matching the pattern observed in corpus frequencies.  We conclude that there is no discrepancy between on-line preferences and corpus frequencies for this syntactic ambiguity, and consequently that it provides no reason to assume different processes underlying sentence comprehension and sentence production, as Gibson and Schutze had hypothesized.  We will discuss the ramifications of this finding for frequency-based theories of sentence comprehension.