Connectionism
Connectionism is an approach to language learning that models acquisition as the strengthening of connections in neural networks through repeated exposure to input. There is no innate grammar module, no rule system, and no symbolic manipulation — only weighted connections between simple processing units that gradually encode the statistical regularities of the input. Linguistic knowledge is stored not as rules but in the network's connection weights, and it surfaces as patterns of activation distributed across the network.
Origins
Connectionism emerged from the Parallel Distributed Processing (PDP) framework of Rumelhart and McClelland (1986), which proposed that cognition arises from the simultaneous activity of many simple interconnected units. Their landmark demonstration was a connectionist model that learned to produce English past tense forms — including the U-shaped developmental curve (correct → overgeneralised → correct) observed in children — without any explicit rules. The model extracted patterns from input frequency and phonological similarity alone.
This was a direct challenge to Chomsky's argument that language requires innate rules. If a simple network could simulate developmental patterns previously attributed to rule learning, perhaps rules were not needed at all.
How Connectionist Learning Works
- Input — the network receives linguistic input (e.g., verb stems paired with past tense forms)
- Weighted connections — connections between units have weights that determine how strongly one unit activates another
- Error-driven learning — the network compares its output to the correct target and adjusts weights to reduce error (typically via backpropagation)
- Generalisation — after sufficient training, the network produces correct outputs for novel inputs, generalising from the patterns in its training data
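The cycle above can be sketched in a few dozen lines of Python. The sketch below uses a single layer of sigmoid units trained by the delta rule (a simpler relative of backpropagation), loosely echoing the past-tense task: each toy "stem" is a vector of four invented binary features, and the regular pattern is "copy the stem and switch on a past-tense marker". The feature coding, training regime, and held-out "novel verbs" are all invented for illustration.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

STEM_BITS = 4              # toy "phonological" features of a verb stem
OUT_BITS = STEM_BITS + 1   # stem features copied, plus one past-tense marker

def target(stem):
    # the regular pattern: reproduce the stem and add the suffix bit
    return stem + [1]

# all 16 possible toy stems; hold two out as "novel verbs" never seen in training
stems = [[(n >> i) & 1 for i in range(STEM_BITS)] for n in range(16)]
novel = [[1, 0, 0, 0], [0, 1, 1, 1]]
train = [s for s in stems if s not in novel]

# one layer of sigmoid output units; weights start at zero
w = [[0.0] * STEM_BITS for _ in range(OUT_BITS)]
b = [0.0] * OUT_BITS
lr = 0.5

for epoch in range(3000):
    for x in train:
        t = target(x)
        for j in range(OUT_BITS):
            y = sigmoid(sum(w[j][i] * x[i] for i in range(STEM_BITS)) + b[j])
            # error-driven learning: adjust each weight in proportion to
            # the output error (gradient of squared error for a sigmoid unit)
            delta = (t[j] - y) * y * (1 - y)
            for i in range(STEM_BITS):
                w[j][i] += lr * delta * x[i]
            b[j] += lr * delta

def produce(stem):
    # threshold each unit's activation at 0.5 to read off a binary output
    return [round(sigmoid(sum(w[j][i] * stem[i] for i in range(STEM_BITS)) + b[j]))
            for j in range(OUT_BITS)]

for s in novel:
    print(s, "->", produce(s))  # applies copy+suffix to stems it never saw
```

No rule "add the suffix" is stored anywhere; the regularity exists only in the trained weights, yet the network extends it to the held-out stems — a toy version of the generalisation step described above.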
Key Findings in SLA
- Rumelhart & McClelland (1986) — past tense acquisition without rules, including developmental overgeneralisation
- Ellis & Schmidt (1997) — connectionist networks trained on artificial plural morphology produced patterns closely resembling adult L2 learners' performance
- MacWhinney (2001) — the Competition Model incorporates connectionist principles, modelling how learners weight and reweight cues across languages
- N. Ellis (2002) — demonstrated that frequency effects across all levels of language are consistent with connectionist learning mechanisms
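The Competition Model's core quantity, cue validity, is standardly defined as availability × reliability: how often a cue is present, times how often it points to the right interpretation when present. The toy corpora below are invented to illustrate how the same computation yields different cue weightings for an English-like input (rigid word order) and an Italian-like input (rich agreement); each item records which noun (A or B) each available cue points to, and which noun is the true agent.

```python
def cue_validity(corpus, cue):
    # availability: how often the cue is present at all
    # reliability: how often it picks the true agent when present
    applicable = [s for s in corpus if cue in s]
    if not applicable:
        return 0.0
    availability = len(applicable) / len(corpus)
    reliability = sum(s[cue] == s["agent"] for s in applicable) / len(applicable)
    return availability * reliability

# invented English-like input: first-noun order is always available and always right
english_like = [
    {"order": "A", "agreement": "A", "agent": "A"},
    {"order": "A", "animacy": "B", "agent": "A"},
    {"order": "A", "agent": "A"},
    {"order": "A", "agreement": "B", "agent": "A"},
    {"order": "A", "animacy": "A", "agent": "A"},
    {"order": "A", "agent": "A"},
]

# invented Italian-like input: agreement is always decisive, word order is not
italian_like = [
    {"order": "B", "agreement": "A", "agent": "A"},
    {"order": "A", "agreement": "A", "agent": "A"},
    {"order": "B", "agreement": "B", "agent": "B"},
    {"order": "A", "agreement": "B", "agent": "B"},
    {"order": "B", "agreement": "A", "agent": "A"},
    {"order": "A", "agreement": "A", "agent": "A"},
]

for corpus, name in [(english_like, "English-like"), (italian_like, "Italian-like")]:
    print(name, {cue: cue_validity(corpus, cue)
                 for cue in ("order", "agreement", "animacy")})
```

Run on these corpora, word order dominates the English-like validities and agreement dominates the Italian-like ones, which is the kind of cross-linguistic reweighting the Competition Model attributes to L2 learners as their cue statistics shift.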
Connectionism vs. Nativism
| Feature | Connectionism | Nativism |
|---|---|---|
| Innate knowledge | None specific to language | Universal Grammar |
| Learning mechanism | Statistical pattern extraction from input | Parameter setting triggered by input |
| Rules | Emergent, not explicitly represented | Explicitly represented in the grammar |
| Role of input | Primary — knowledge is the statistical structure extracted from the input | Trigger — input activates pre-existing knowledge |
| Errors | Result from incomplete pattern learning | Result from incorrect parameter settings or incomplete rule application |
Connectionism vs. Behaviourism
Though both reject innate grammar and emphasise the role of input, connectionism is fundamentally different from behaviourism:
- Behaviourism treats learning as habit formation through stimulus-response-reinforcement chains. Connectionism models learning as statistical pattern extraction across distributed representations.
- Behaviourism cannot account for creativity in language (producing sentences never heard before). Connectionist networks generalise to novel inputs.
- Behaviourism posits no internal representations. Connectionism has rich internal representations — they are just distributed rather than symbolic.
Criticisms
- Poverty of the stimulus — nativists argue that connectionist models only succeed because researchers design the training data carefully. Real input is noisier and sparser.
- Scaling — early connectionist models handled limited phenomena (e.g., past tense). Scaling to full grammar acquisition remains a challenge.
- Biological plausibility — backpropagation, the most common learning algorithm, may not correspond to how biological neurons actually learn.
- Lack of systematicity — Fodor and Pylyshyn (1988) argued that connectionist networks cannot capture the systematic, compositional nature of language (e.g., if you can understand "John loves Mary," you can understand "Mary loves John").
Significance for SLA
Connectionism provides the computational foundation for Emergentism and Usage-Based Theory. If language can be learned through general-purpose statistical mechanisms, then the case for innate Universal Grammar is weakened. Connectionist models also make specific predictions about the role of input frequency, L1 transfer (as prior network weights), and the gradual, item-based nature of L2 development.