Combining structure and probabilities in a Bayesian model of human sentence processing

Srini Narayanan1 & Daniel Jurafsky2
1 SRI International and ICSI, Berkeley; 2 University of Colorado, Boulder

snarayan@icsi.berkeley.edu

 

Human language processing is sensitive to the frequencies of many kinds of linguistic knowledge, including lexical frequencies, probabilistic relations between words, subcategorization frequencies, and thematic frequencies.  Experimental support for these frequency effects is robust and widespread.  But correctly modeling these frequency effects requires understanding how the different kinds of frequencies or probabilities are combined.  Narayanan and Jurafsky (1998) proposed modeling human language comprehension by treating human comprehenders as Bayesian reasoners, and modeling the comprehension process with Graphical Models (Bayes Nets).  Bayes Nets provide a principled way to combine probabilistic evidence.  In this paper we extend the Narayanan and Jurafsky model to make further predictions about reading time given the probabilities of different parses or interpretations, and we test the model against reading-time data.

In the Bayesian approach, sentence processing proceeds by making a probabilistic decision among interpretations of the input sentence.  Each possible interpretation is assigned a probability; this probability is updated incrementally as each word is input, and the most probable interpretation is chosen.  Assumptions about the dependence and independence of different probabilistic sources are represented in the topology of the graphical model.  Quantitative dependencies between knowledge sources are modeled using conditional probability tables.
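To make the incremental update concrete, the following sketch (ours, not the paper's implementation; the interpretation names, priors, and likelihood values are all invented for exposition) multiplies each candidate interpretation's probability by the likelihood it assigns to the incoming word and renormalizes, keeping track of the most probable interpretation as each word arrives.

    # Minimal sketch of incremental Bayesian update over candidate
    # interpretations.  All names and numbers are illustrative.

    def update(posteriors, likelihoods):
        # Multiply each interpretation's probability by the likelihood
        # of the new word under that interpretation, then renormalize.
        joint = {i: p * likelihoods[i] for i, p in posteriors.items()}
        total = sum(joint.values())
        return {i: p / total for i, p in joint.items()}

    # Two competing parses of an ambiguous prefix, with hypothetical priors.
    posteriors = {"main_verb": 0.8, "reduced_relative": 0.2}

    # Hypothetical per-word likelihoods P(word | interpretation).
    for word_likelihoods in [
        {"main_verb": 0.6, "reduced_relative": 0.3},  # favors main verb
        {"main_verb": 0.1, "reduced_relative": 0.7},  # favors reduced relative
    ]:
        posteriors = update(posteriors, word_likelihoods)
        print(posteriors, "->", max(posteriors, key=posteriors.get))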

The difficulty for Bayesian models lies in making fine-grained reading-time predictions.  The Narayanan and Jurafsky (1998) model predicted extra reading time only when the correct parse had been pruned by the parser because of its low probability.  In our extension of their model, we also predict extra reading time whenever the next word is unpredictable (following Hale, 2001) or whenever a re-ranking of parse preferences occurs.
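The three sources of predicted difficulty can be spelled out as follows.  This sketch is our own illustration of the logic, with invented thresholds and values: it flags extra reading time when the incoming word has high surprisal (Hale, 2001), when the most probable interpretation changes, or when an interpretation falls below a pruning threshold.

    # Sketch of the three difficulty predictors; thresholds are invented.
    import math

    PRUNE_THRESHOLD = 0.05      # hypothetical beam cutoff
    SURPRISAL_THRESHOLD = 3.0   # hypothetical, in bits

    def difficulty_flags(prev_posteriors, new_posteriors, word_prob):
        flags = []
        # Hale (2001): difficulty grows with surprisal, -log P(word).
        surprisal = -math.log2(word_prob)
        if surprisal > SURPRISAL_THRESHOLD:
            flags.append("unpredictable word (%.2f bits)" % surprisal)
        # Re-ranking: the most probable interpretation changes.
        if (max(prev_posteriors, key=prev_posteriors.get)
                != max(new_posteriors, key=new_posteriors.get)):
            flags.append("re-ranking of parse preference")
        # Pruning: an interpretation drops below the beam threshold.
        pruned = [i for i, p in new_posteriors.items() if p < PRUNE_THRESHOLD]
        if pruned:
            flags.append("pruned: " + ", ".join(pruned))
        return flags

    # Example: the disambiguating word strongly favors the dispreferred parse.
    print(difficulty_flags({"main_verb": 0.9, "reduced_relative": 0.1},
                           {"main_verb": 0.04, "reduced_relative": 0.96},
                           word_prob=0.02))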

We tested this extended Bayesian model of human parsing on the experimental data of McRae et al. (1998).  McRae et al. showed that thematic fit influences sentence comprehension, as measured by a gated sentence-completion task and an on-line reading task.  We showed that a Bayesian network incorporating probabilities for thematic and syntactic knowledge sources was able to model both the off-line human judgments and the on-line reading-time difficulty for agent-biased sentences.
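As a rough indication of how such a network combines knowledge sources, the toy computation below (a naive-Bayes simplification of ours, with invented probabilities; it is not the network or the values from the paper) combines a construction-frequency prior with thematic-fit and syntactic evidence for the main-clause/reduced-relative ambiguity studied by McRae et al.

    # Toy combination of thematic and syntactic evidence by Bayes' rule,
    # assuming conditional independence.  All probabilities are invented.

    priors = {"main_clause": 0.92, "reduced_relative": 0.08}

    # P(evidence | interpretation) for two knowledge sources: the thematic
    # fit of the initial NP as an agent, and the presence of a by-phrase.
    likelihood = {
        "good_agent": {"main_clause": 0.7, "reduced_relative": 0.2},
        "by_phrase":  {"main_clause": 0.1, "reduced_relative": 0.6},
    }

    def posterior(evidence):
        scores = {}
        for interp, prior in priors.items():
            p = prior
            for e in evidence:
                p *= likelihood[e][interp]
            scores[interp] = p
        total = sum(scores.values())
        return {i: s / total for i, s in scores.items()}

    print(posterior(["good_agent"]))               # agent bias keeps main clause dominant
    print(posterior(["good_agent", "by_phrase"]))  # by-phrase raises the reduced-relative posterior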

Bayesian models have not been widely applied in psycholinguistics, despite their common use in other areas of psychology such as categorization and learning.  The Bayesian model is similar to constraint-based or connectionist models of sentence processing, but differs in having a principled way to weight and combine evidence.  Our results suggest that our Bayesian approach is able to model psycholinguistic results on evidence combination in human sentence processing, deriving reading-time predictions from probabilities.

 

References

Hale, J. (2001).  A probabilistic Earley parser as a psycholinguistic model.  In Proceedings of NAACL-2001.

McRae, K., Spivey-Knowlton, M. J., & Tanenhaus, M. K. (1998).  Modeling the influence of thematic fit (and other constraints) in on-line sentence comprehension.  Journal of Memory and Language, 38, 283-312.

Narayanan, S., & Jurafsky, D. (1998).  Bayesian models of human sentence processing.  In COGSCI-98, pp. 752-757.  Madison, WI: Lawrence Erlbaum.