Syntactic storage costs in sentence comprehension

Evan Chen, Florian Wolf & Edward Gibson
Massachusetts Institute of Technology

echen2@mit.edu

 

It is well known that nested structures like (1a) are much harder to understand than their right-branching counterparts (1b):

(1) a. # The student who the professor who the scientist collaborated with had advised copied the article.
b. The scientist collaborated with the professor who had advised the student who copied the article.

One factor that has been proposed to account for the difference in difficulty is integration distance (Gibson, 1998): Integration distances are much longer at the verbs of a nested structure than at those of a non-nested structure, making the nested structure harder. Another factor that has been proposed to account for the difference is storage: Keeping track of more incomplete phrase structure rules or predicted categories could also make nested structures harder (Yngve, 1960; Chomsky & Miller, 1963; see Gibson, 1998, and Lewis, 1993, for more recent proposals).

This paper used self-paced word-by-word reading to test whether storage costs exist independently of integration differences. To investigate this issue, we compared reading times for sentence regions in which storage costs varied but integrations remained constant. Experiment 1 manipulated the number of verbs needed to form a grammatical sentence, using materials like those in (2):

(2) a. The employee realized that the boss implied that the company planned a layoff and so he sought alternative employment.
b. The employee realized that the boss's implication that the company planned a layoff was not just a rumor.
c. The employee's realization that the boss implied that the company planned a layoff caused a panic.
d. The employee's realization that the boss's implication that the company planned a layoff was not just a rumor caused a panic.

The target region for all four sentence types is the most embedded clause "the company planned a layoff". In (2a), no verbs beyond the target region are needed to make a grammatical sentence. In (2b) and (2c), one verb beyond the target region is required. In (2d), two verbs beyond the target region are required. Storage-cost theories therefore predict that the target region should be read fastest in (2a), more slowly in both (2b) and (2c), and slowest in (2d). The results of the experiment confirmed these predictions.
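The predicted ordering can be written out as a minimal sketch. It assumes only that reading time over the target region rises with the number of verbs still required; the counts are those given for (2a)-(2d), and the names `PENDING_VERBS` and `predicted_reading_order` are hypothetical labels for illustration, not part of the study.

```python
# Verbs still required after the target region, as stated for (2a)-(2d).
PENDING_VERBS = {"2a": 0, "2b": 1, "2c": 1, "2d": 2}


def predicted_reading_order(pending):
    """Group conditions into tiers by pending-verb count, fastest tier first.

    Assumption (illustrative only): more pending verbs means slower reading.
    """
    tiers = {}
    for condition, count in pending.items():
        tiers.setdefault(count, []).append(condition)
    return [sorted(tiers[count]) for count in sorted(tiers)]


print(predicted_reading_order(PENDING_VERBS))
# prints [['2a'], ['2b', '2c'], ['2d']]
```

Conditions in the same tier, here (2b) and (2c), are predicted not to differ from each other in the target region.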

Experiment 2 used materials like those in (3) to investigate whether incomplete dependencies other than those involving an expected verb also incur storage costs:

(3) a. Complement clause

The claim (alleging) that the cop who the mobster attacked ignored the informant might have affected the jury.

b. Relative clause

The claim which the cop who the mobster attacked ignored might have affected the jury.

The target region consists of the embedded material "the cop who the mobster attacked". In (3a), the target material is part of a complement clause (CC) of the noun "claim"; the CC is unambiguous when "alleging" is present and temporarily ambiguous between a CC and an RC when it is absent. In (3b), the target material is part of a relative clause (RC) modifying "claim". Of relevance here, the RC requires the presence of an extra NP dependency position inside the RC (e.g., the gap NP object of "ignored" in (3b)). If storing this expectation is associated with processing cost, then people should read the target region more slowly for the unambiguous RC structure in (3b) than for either the ambiguous or the unambiguous version of the CC structure in (3a). The results of Experiment 2 confirmed this prediction (unambiguous CC vs. unambiguous RC: ps < .05; ambiguous CC vs. unambiguous RC: ps < .01).