Carnegie Mellon University & Massachusetts Institute of Technology
Interactive constraint-based theories of sentence processing have gained increasing support, as a growing body of empirical evidence demonstrates early influences of various factors on comprehension performance. Connectionist networks are one form of model that naturally reflects many properties of constraint-based theories, and thus provide a form in which those theories may be instantiated.
Unfortunately, most of the connectionist language models implemented until now have involved severe limitations. Comprehension and production models have, by and large, been limited to simple sentences with small vocabularies (cf. St. John & McClelland, 1990). Most models that have addressed the problem of complex, multi-clausal sentence processing have been prediction networks (cf. Elman, 1991; Christiansen & Chater, 1999). Although a useful component of a language processing system, prediction does not get at the heart of language: the interface between syntax and semantics.
The current study involves a recurrent neural network model that has been trained to both comprehend and produce a relatively complex subset of English. This language includes such features as tense and number, adjectives and adverbs, prepositional phrases, relative clauses, subordinate clauses, and sentential complements, with roughly 50 noun stems and 50 verb stems, for a total vocabulary of about 300 words. It is broad enough to permit the replication of a wide range of sentence processing experiments.
Critical to the model is the way in which the meanings of complex sentences are to be encoded. Finite slot-filler representations will not suffice, so complex sentence meanings are encoded as sets of propositions. The "encoder" portion of the model is responsible for compressing this set into a single static representation of sentence meaning, which serves as the target of comprehension and the source of production. The comprehension and production systems, which are largely integrated, map between a sequence of words and a message. That the model has properly encoded or comprehended the message can be verified by asking fill-in-the-blank questions. A method has also been developed for obtaining simulated reading times based on the difficulty of both word prediction and semantic integration.
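As a rough illustration of the idea of a sentence meaning as a set of propositions that can be probed with fill-in-the-blank questions, the sketch below uses a purely symbolic stand-in. The three-slot proposition format, the slot names, the example sentence, and the `answer_query` helper are all assumptions made for illustration; in the model itself the message is a single learned static representation and the questions are answered by the network, not by table lookup.

```python
from typing import NamedTuple, Optional


class Proposition(NamedTuple):
    """One proposition; the three-slot layout is an assumed format."""
    relation: str
    head: str
    filler: str


# Hypothetical message for "The dog that chased the cat barked."
message = frozenset({
    Proposition("agent", "chased", "dog"),
    Proposition("patient", "chased", "cat"),
    Proposition("agent", "barked", "dog"),
})


def answer_query(message, relation: Optional[str] = None,
                 head: Optional[str] = None,
                 filler: Optional[str] = None):
    """Return the sorted values that can fill the single blank slot.

    Exactly one of the three slots must be left as None (the blank).
    """
    query = {"relation": relation, "head": head, "filler": filler}
    blanks = [slot for slot, value in query.items() if value is None]
    if len(blanks) != 1:
        raise ValueError("exactly one slot must be left blank")
    blank = blanks[0]

    answers = set()
    for prop in message:
        # A proposition matches if it agrees with every specified slot.
        if all(getattr(prop, slot) == value
               for slot, value in query.items() if value is not None):
            answers.add(getattr(prop, blank))
    return sorted(answers)


# "Who chased?" leaves the filler slot blank.
print(answer_query(message, relation="agent", head="chased"))
```

A query that matches several propositions returns every possible filler, which is one simple way to see why a relative clause makes the message harder to encode than a single-clause sentence: the same entity participates in more than one proposition.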
The model has been extensively tested on a variety of tasks, including the processing of lexical and structural ambiguities, and a range of unambiguous sentence types. It is able to replicate many key aspects of human sentence processing, including sensitivity to lexical and structural frequencies, semantic plausibility, and locality effects. In this presentation I will briefly describe the model, review in detail its processing of several interesting features of English, including the NP/S ambiguity, and discuss some of the lessons learned from its study.