A Connectionist Model of Sentence Comprehension and Production

Douglas Rohde

Ph.D. Thesis
School of Computer Science, Carnegie Mellon University
Center for the Neural Basis of Cognition

Download: 1.05M .ps.gz 3.05M .pdf


The most predominant language processing theories have, for some time, been based largely on structured knowledge and relatively simple rules. These symbolic models intentionally segregate syntactic information processing from statistical information as well as semantic, pragmatic, and discourse influences, thereby minimizing the importance of these potential constraints in learning and processing language. While such models have the advantage of being relatively simple and explicit, they are inadequate to account for learning and validated ambiguity resolution phenomena. In recent years, interactive constraint-based theories of sentence processing have gained increasing support, as a growing body of empirical evidence demonstrates early influences of various factors on comprehension performance. Connectionist networks are one form of model that naturally reflect many properties of constraint-based theories, and thus provide a form in which those theories may be instantiated.

Unfortunately, most of the connectionist language models implemented until now have involved severe limitations, restricting the phenomena they could address. Comprehension and production models have, by and large, been limited to simple sentences with small vocabularies (St. John & McClelland, 1990). Most models that have addressed the problem of complex, multi-clausal sentence processing have been prediction networks (Elman, 1991; Christiansen & Chater, 1999). Although a useful component of a language processing system, prediction does not get at the heart of language: the interface between syntax and semantics.

The current thesis focuses on the design and testing of the Connectionist Sentence Comprehension and Production (CSCP) model, a recurrent neural network that has been trained to both comprehend and produce a relatively complex subset of English. This language includes such features as tense and number, adjectives and adverbs, prepositional phrases, relative clauses, subordinate clauses, and sentential complements, with a vocabulary of about 300 total words. It is broad enough that it permits the model to address a wide range of sentence processing phenomena. The experiments reported here involve such issues as the relative comprehensibility of various sentence types, the resolution of lexical ambiguities, generalization to novel sentences, the comprehension of main verb/reduced relative, sentential complement, subordinate clause, and prepositional phrase attachment ambiguities, agreement attraction and other production errors, and structural priming.

The model is able to replicate many key aspects of human sentence processing across these domains, including sensitivity to lexical and structural frequencies, semantic plausibility, inflectional morphology, and locality effects. A critical feature of the model is its suggestion of a tight coupling between comprehension and production and the idea that language production is primarily learned through the formulation and testing of covert predictions during comprehension. I believe this work represents a major advance in the attested ability of connectionist networks to process natural language and a significant step towards a more complete understanding of the human language faculty.


Douglas Rohde, dr@tedlab.mit.edu