Connectionism
Connectionism is an approach to language learning that models acquisition as the strengthening of connections in neural networks through repeated exposure to input. There is no innate grammar module, no rule system, and no symbolic manipulation — only weighted connections between simple processing units that gradually encode the statistical regularities of the input. Linguistic knowledge is stored not as rules but in the network's connection weights, and it surfaces as patterns of activation distributed across the network.
Origins
Connectionism emerged from the Parallel Distributed Processing (PDP) framework of Rumelhart and McClelland (1986), which proposed that cognition arises from the simultaneous activity of many simple interconnected units. Their landmark demonstration was a connectionist model that learned to produce English past tense forms — including the U-shaped developmental curve (correct → overgeneralised → correct) observed in children — without any explicit rules. The model extracted patterns from input frequency and phonological similarity alone.
This was a direct challenge to Chomsky's argument that language requires innate rules. If a simple network could simulate developmental patterns previously attributed to rule learning, perhaps rules were not needed at all.
How Connectionist Learning Works
- Input — the network receives linguistic input (e.g., verb stems paired with past tense forms)
- Weighted connections — connections between units have weights that determine how strongly one unit activates another
- Error-driven learning — the network compares its output to the correct target and adjusts weights to reduce error (typically via backpropagation)
- Generalisation — after sufficient training, the network produces correct outputs for novel inputs, generalising from the patterns in its training data
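The cycle above can be sketched in a few dozen lines of Python. The sketch below uses a single layer of sigmoid units trained by the delta rule (a simpler relative of backpropagation), loosely echoing the past-tense task: each toy "stem" is a vector of four invented binary features, and the regular pattern is "copy the stem and switch on a past-tense marker". The feature coding, training regime, and held-out "novel verbs" are all invented for illustration.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

STEM_BITS = 4              # toy "phonological" features of a verb stem
OUT_BITS = STEM_BITS + 1   # stem features copied, plus one past-tense marker

def target(stem):
    # the regular pattern: reproduce the stem and add the suffix bit
    return stem + [1]

# all 16 possible toy stems; hold two out as "novel verbs" never seen in training
stems = [[(n >> i) & 1 for i in range(STEM_BITS)] for n in range(16)]
novel = [[1, 0, 0, 0], [0, 1, 1, 1]]
train = [s for s in stems if s not in novel]

# one layer of sigmoid output units; weights start at zero
w = [[0.0] * STEM_BITS for _ in range(OUT_BITS)]
b = [0.0] * OUT_BITS
lr = 0.5

for epoch in range(3000):
    for x in train:
        t = target(x)
        for j in range(OUT_BITS):
            y = sigmoid(sum(w[j][i] * x[i] for i in range(STEM_BITS)) + b[j])
            # error-driven learning: adjust each weight in proportion to
            # the output error (gradient of squared error for a sigmoid unit)
            delta = (t[j] - y) * y * (1 - y)
            for i in range(STEM_BITS):
                w[j][i] += lr * delta * x[i]
            b[j] += lr * delta

def produce(stem):
    # threshold each unit's activation at 0.5 to read off a binary output
    return [round(sigmoid(sum(w[j][i] * stem[i] for i in range(STEM_BITS)) + b[j]))
            for j in range(OUT_BITS)]

for s in novel:
    print(s, "->", produce(s))  # applies copy+suffix to stems it never saw
```

No rule "add the suffix" is stored anywhere; the regularity exists only in the trained weights, yet the network extends it to the held-out stems — a toy version of the generalisation step described above.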
Key Findings in SLA
- Rumelhart & McClelland (1986) — past tense acquisition without rules, including developmental overgeneralisation
- Ellis & Schmidt (1997) — connectionist networks trained on artificial plural morphology produced patterns closely resembling adult L2 learners' performance
- MacWhinney (2001) — the Competition Model incorporates connectionist principles, modelling how learners weight and reweight cues across languages
- N. Ellis (2002) — demonstrated that frequency effects across all levels of language are consistent with connectionist learning mechanisms
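The Competition Model's core quantity, cue validity, is standardly defined as availability × reliability: how often a cue is present, times how often it points to the right interpretation when present. The toy corpora below are invented to illustrate how the same computation yields different cue weightings for an English-like input (rigid word order) and an Italian-like input (rich agreement); each item records which noun (A or B) each available cue points to, and which noun is the true agent.

```python
def cue_validity(corpus, cue):
    # availability: how often the cue is present at all
    # reliability: how often it picks the true agent when present
    applicable = [s for s in corpus if cue in s]
    if not applicable:
        return 0.0
    availability = len(applicable) / len(corpus)
    reliability = sum(s[cue] == s["agent"] for s in applicable) / len(applicable)
    return availability * reliability

# invented English-like input: first-noun order is always available and always right
english_like = [
    {"order": "A", "agreement": "A", "agent": "A"},
    {"order": "A", "animacy": "B", "agent": "A"},
    {"order": "A", "agent": "A"},
    {"order": "A", "agreement": "B", "agent": "A"},
    {"order": "A", "animacy": "A", "agent": "A"},
    {"order": "A", "agent": "A"},
]

# invented Italian-like input: agreement is always decisive, word order is not
italian_like = [
    {"order": "B", "agreement": "A", "agent": "A"},
    {"order": "A", "agreement": "A", "agent": "A"},
    {"order": "B", "agreement": "B", "agent": "B"},
    {"order": "A", "agreement": "B", "agent": "B"},
    {"order": "B", "agreement": "A", "agent": "A"},
    {"order": "A", "agreement": "A", "agent": "A"},
]

for corpus, name in [(english_like, "English-like"), (italian_like, "Italian-like")]:
    print(name, {cue: cue_validity(corpus, cue)
                 for cue in ("order", "agreement", "animacy")})
```

Run on these corpora, word order dominates the English-like validities and agreement dominates the Italian-like ones, which is the kind of cross-linguistic reweighting the Competition Model attributes to L2 learners as their cue statistics shift.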
Connectionism vs. Nativism
| Feature | Connectionism | Nativism |
|---|---|---|
| Innate knowledge | None specific to language | Universal Grammar |
| Learning mechanism | Statistical pattern extraction from input | Parameter setting triggered by input |
| Rules | Emergent, not explicitly represented | Explicitly represented in the grammar |
| Role of input | Primary — knowledge is the statistical structure extracted from the input | Trigger — input activates pre-existing knowledge |
| Errors | Result from incomplete pattern learning | Result from incorrect parameter settings or incomplete rule application |
Connectionism vs. Behaviourism
Though both reject innate grammar and emphasise the role of input, connectionism is fundamentally different from behaviourism:
- Behaviourism treats learning as habit formation through stimulus-response-reinforcement chains. Connectionism models learning as statistical pattern extraction across distributed representations.
- Behaviourism cannot account for creativity in language (producing sentences never heard before). Connectionist networks generalise to novel inputs.
- Behaviourism posits no internal representations. Connectionism has rich internal representations — they are just distributed rather than symbolic.
Criticisms
- Poverty of the stimulus — nativists argue that connectionist models only succeed because researchers design the training data carefully. Real input is noisier and sparser.
- Scaling — early connectionist models handled limited phenomena (e.g., past tense). Scaling to full grammar acquisition remains a challenge.
- Biological plausibility — backpropagation, the most common learning algorithm, may not correspond to how biological neurons actually learn.
- Lack of systematicity — Fodor and Pylyshyn (1988) argued that connectionist networks cannot capture the systematic, compositional nature of language (e.g., if you can understand "John loves Mary," you can understand "Mary loves John").
Significance for SLA
Connectionism provides the computational foundation for Emergentism and Usage-Based Theory. If language can be learned through general-purpose statistical mechanisms, then the case for innate Universal Grammar is weakened. Connectionist models also make specific predictions about the role of input frequency, L1 transfer (as prior network weights), and the gradual, item-based nature of L2 development.