A Connectionist Model of Sentence Comprehension and Production
Ph.D. Thesis
School of Computer Science, Carnegie Mellon University
Center for the Neural Basis of Cognition
Abstract:
The predominant language processing theories have, for some time, been
based largely on structured knowledge and relatively simple rules. These
symbolic models intentionally segregate syntactic information processing from
statistical information as well as semantic, pragmatic, and discourse
influences, thereby minimizing the importance of these potential constraints
in learning and processing language. While such models have the advantage of
being relatively simple and explicit, they are inadequate to account for
learning and validated ambiguity resolution phenomena. In recent years,
interactive constraint-based theories of sentence processing have gained
increasing support, as a growing body of empirical evidence demonstrates early
influences of various factors on comprehension performance. Connectionist
networks are one form of model that naturally reflects many properties of
constraint-based theories, and thus provide a form in which those theories may
be instantiated.
Unfortunately, most of the connectionist language models implemented
until now have involved severe limitations, restricting the phenomena
they could address. Comprehension and production models have, by and
large, been limited to simple sentences with small vocabularies
(St. John & McClelland, 1990). Most models that have addressed the
problem of complex, multi-clausal sentence processing have been
prediction networks (Elman, 1991; Christiansen & Chater, 1999).
Although a useful component of a language processing system,
prediction does not get at the heart of language: the interface
between syntax and semantics.
The current thesis focuses on the design and testing of the Connectionist
Sentence Comprehension and Production (CSCP) model, a recurrent neural
network that has been trained to both comprehend and produce a relatively
complex subset of English. This language includes such features as tense and
number, adjectives and adverbs, prepositional phrases, relative clauses,
subordinate clauses, and sentential complements, with a vocabulary of about
300 total words. It is broad enough that it permits the model to address a
wide range of sentence processing phenomena. The experiments reported here
involve such issues as the relative comprehensibility of various sentence
types, the resolution of lexical ambiguities, generalization to novel
sentences, the comprehension of main verb/reduced relative, sentential
complement, subordinate clause, and prepositional phrase attachment
ambiguities, agreement attraction and other production errors, and structural
priming.
The model is able to replicate many key aspects of human sentence processing
across these domains, including sensitivity to lexical and structural
frequencies, semantic plausibility, inflectional morphology, and locality
effects. A critical feature of the model is that it suggests a tight
coupling between comprehension and production, along with the idea that
language production is learned primarily by formulating and testing covert
predictions during comprehension. I believe this work represents a major
advance in the attested ability of connectionist networks to process natural
language and a significant step towards a more complete understanding of the
human language faculty.
Contents:
- 1 Introduction
- 1.1 Why implement models?
- 1.2 Properties of human language processing
- 1.3 Properties of symbolic models
- 1.4 Properties of connectionist models
- 1.5 The CSCP model
- 1.6 Chapter overview
- 2 An Overview of Connectionist Sentence Processing
- 2.1 Parsing
- 2.2 Comprehension
- 2.3 Word prediction
- 2.4 Production
- 2.5 Other language processing models
- 3 Empirical Studies of Sentence Processing
- 3.1 Introduction
- 3.2 Relative clauses
- 3.3 Main verb/reduced-relative ambiguities
- 3.4 Sentential complements
- 3.5 Subordinate clauses
- 3.6 Prepositional phrase attachment
- 3.7 Effects of discourse context
- 3.8 Production
- 3.9 Summary of empirical findings
- 4 Analysis of Syntax Statistics in Parsed Corpora
- 4.1 Extracting syntax statistics isn't easy
- 4.2 Verb phrases
- 4.3 Relative clauses
- 4.4 Sentential noun phrases
- 4.5 Determiners and adjectives
- 4.6 Prepositional phrases
- 4.7 Coordination and subordination
- 4.8 Conclusion
- 5 The Penglish Language
- 5.1 Language features
- 5.2 Penglish grammar
- 5.3 The lexicon
- 5.4 Phonology
- 5.5 Semantics
- 5.6 Statistics
- 6 The CSCP Model
- 6.1 Basic architecture
- 6.2 The semantic system
- 6.3 The comprehension, prediction, and production system
- 6.4 Training
- 6.5 Testing
- 6.6 Claims and limitations of the model
- 7 General Comprehension Results
- 7.1 Overall performance
- 7.2 Representation
- 7.3 Experiment 2: Comparison of sentence types
- 7.4 Lexical ambiguity
- 7.5 Experiment 4: Adverbial attachment
- 7.6 Experiment 5: Prepositional phrase attachment
- 7.7 Reading time
- 7.8 Individual differences
- 8 The Main Verb/Reduced Relative Ambiguity
- 8.1 Empirical results
- 8.2 Experiment 6
- 8.3 Verb frequency effects
- 8.4 Summary
- 9 The Sentential Complement Ambiguity
- 9.1 Empirical results
- 9.2 Experiment 7
- 9.3 Summary
- 10 The Subordinate Clause Ambiguity
- 10.1 Empirical results
- 10.2 Experiment 8
- 10.3 Experiment 9
- 10.4 Experiment 10: Incomplete reanalysis
- 10.5 Summary and discussion
- 11 Relative Clauses
- 11.1 Empirical results
- 11.2 Experiment 11
- 11.3 Discussion
- 12 Production
- 12.1 Word-by-word production
- 12.2 Free production
- 12.3 Agreement attraction
- 12.4 Structural priming
- 12.5 Summary
- 13 Discussion
- 13.1 Summary of results
- 13.2 Accomplishments of the model
- 13.3 Problems with the model
- 13.4 Model versus theory
- 13.5 Properties, principles, processes
- 13.6 Conclusion
Appendices
- A Lens: The Light, Efficient Network Simulator
- A.1 Performance benchmarks
- A.2 Optimizations
- A.3 Parallel training
- A.4 Customization
- A.5 Interface
- A.6 Conclusion
- B SLG: The Simple Language Generator
- B.1 The grammar
- B.2 Resolving the grammar
- B.3 Minimizing the grammar
- B.4 Parsing
- B.5 Word prediction
- B.6 Conclusion
- C TGrep2: A Tool for Searching Parsed Corpora
- C.1 Preparing corpora
- C.2 Command-line arguments
- C.3 Specifying patterns
- C.4 Controlling the output
- C.5 Differences from TGrep
- D Details of the Penglish Language
- D.1 The Penglish SLG grammar
- D.2 The Penglish lexicon
Douglas Rohde, dr@tedlab.mit.edu