Language:
A new kind of evolutionary system

Simon Kirby

Language Evolution and Computation Research Unit
Theoretical and Applied Linguistics, Edinburgh

Language evolution in context

Eight key transition events in the history of life on Earth

(Maynard Smith & Szathmáry 1995):

Replicating molecules → Populations of molecules

Independent replicators → Chromosomes

RNA → DNA

Prokaryotes → Eukaryotes

Asexual clones → Sexual populations

Protists → Animals, plants and fungi

Solitary individuals → Colonies

Primate societies → Human societies (Language)

The growth of evolutionary linguistics

It is not just biologists who are interested in language evolution…

Pinker & Bloom (1990): Landmark paper linking linguistic and evolutionary theory.

Decade before: 96 articles

Decade after: 1095 articles

Evolutionary thinking in mainstream linguistics.

Jackendoff (2002), Foundations of Language: Brain, Meaning, Grammar, Evolution

Linguistic theory in the “big” science journals: Nature, Science, etc.

Computer science, robotics and AI communities actively researching language evolution.

Evolutionary linguistics

Survey two approaches to the why question:

Standard nativism

Evolutionary nativism

Introduce a new approach: the Iterated Learning Model

Present two simulations of the ILM

Argue that language itself is an evolutionary system

The standard nativist answer

Language is unique because our brains are unique.

We (and no other species) are born with a specialised innate cognitive mechanism for learning language.

Language is the way it is because our biology constrains it to be that way.

It’s the job of linguistics to deduce the structure of the language acquisition device: Universal Grammar.

(And that’s it!)

Evolutionary nativism

Why is the Language Acquisition Device the way it is?

The evolutionary psychology position: (Pinker & Bloom 1990)

Language is the way it is because natural selection favoured individuals who were able to learn languages that were useful for communication.

An alternative mechanism?

Some problems with the evolutionary nativism position:

Doesn’t really explain why language is unique

Not specific about how natural selection is going to work

Is everything in language optimally tailored?

Are there alternative mechanisms?

A new hypothesis:

Where does the primary linguistic data come from?

The Iterated Learning Model
(a framework for computational simulation)

What will the agents talk about?

Need some simple but structured “world”.

Simple predicate logic:

Agents can string random characters together to form utterances.

How do agents learn?

Learners try to form a grammar that is consistent with the primary linguistic data they hear.

Fundamental principle: learning is compression.

Two processes on hearing a meaning-signal pair:

A rule is added to the grammar that is specific to that pair.

Search for ways of “compressing” the grammar. Are there pairs of rules that can be subsumed under a single rule? Are there duplicate rules?

Compression process uncovers any generalisations in the data.
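
The two processes above can be illustrated with a minimal Python sketch. It is not the grammar-induction algorithm used in the ILM simulations; it only shows, under simplifying assumptions, what it might mean for duplicate rules to be removed and for two specific rules to be subsumed under one more general rule. The tuple representation of meanings, the helper names (remove_duplicates, try_merge) and the example strings are all hypothetical.

```python
def remove_duplicates(rules):
    """Drop repeated (meaning, signal) rules, keeping first-seen order."""
    return list(dict.fromkeys(rules))


def common_prefix(a, b):
    """Longest string that both a and b start with."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]


def common_suffix(a, b):
    """Longest string that both a and b end with."""
    return common_prefix(a[::-1], b[::-1])[::-1]


def try_merge(rule_a, rule_b):
    """Try to subsume two specific rules under one general rule.

    Assumption for this sketch: a merge is allowed only when the meanings
    differ in exactly one slot and the signals share some prefix or suffix.
    Returns (general_rule, lexicon) or None.
    """
    (m1, s1), (m2, s2) = rule_a, rule_b
    if len(m1) != len(m2):
        return None
    diff = [i for i in range(len(m1)) if m1[i] != m2[i]]
    if len(diff) != 1:
        return None
    pre = common_prefix(s1, s2)
    suf = common_suffix(s1[len(pre):], s2[len(pre):])
    mid1 = s1[len(pre):len(s1) - len(suf)]
    mid2 = s2[len(pre):len(s2) - len(suf)]
    if not (pre or suf) or not mid1 or not mid2:
        return None  # nothing shared, so the pair stays holistic
    slot = diff[0]
    # None marks the variable slot; the signal becomes a (prefix, suffix) frame.
    template = m1[:slot] + (None,) + m1[slot + 1:]
    general = (template, (pre, suf))
    lexicon = {m1[slot]: mid1, m2[slot]: mid2}
    return general, lexicon


a = (("admires", "mary", "john"), "tejmh")
b = (("admires", "mary", "gavin"), "tejmqp")
print(try_merge(a, b))
```

In this toy case the shared material "tejm" becomes a frame with one open slot, and the differing substrings become entries for "john" and "gavin". Two unrelated holistic strings share nothing, so no generalisation is found and both rules are kept as they are.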

A simulation run

1. Start with one learner and one adult speaker, neither of which has a grammar.

2. Choose a meaning at random.

3. Get the speaker to produce a signal for that meaning (it may need to “invent” a random string).

4. Give the meaning-signal pair to the learner.

5. Repeat steps 2-4 one hundred and fifty times.

6. Delete the speaker.

7. Make the learner the new speaker.

8. Introduce a new learner (with no initial grammar).

9. Repeat steps 2-8 thousands of times.
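
The generational loop just described can be sketched in a few lines of Python. This is only an outline of the control flow, under loud assumptions: the learner here simply memorises each meaning-signal pair (a placeholder for the compression-based learner), the speaker does not store the strings it invents, and the tiny meaning space, the function names and the default parameters are all hypothetical.

```python
import itertools
import random
import string

PEOPLE = ("mary", "john", "gavin")
PREDICATES = ("admires", "loves")
# Hypothetical toy meaning space: predicate(agent, patient), no reflexives.
MEANINGS = [
    (p, a, b)
    for p, a, b in itertools.product(PREDICATES, PEOPLE, PEOPLE)
    if a != b
]


def invent(rng, max_len=4):
    """Invent a random string of 1..max_len lowercase letters."""
    length = rng.randint(1, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))


def produce(grammar, meaning, rng):
    """Use the known signal for a meaning, or invent one if there is none."""
    return grammar.get(meaning) or invent(rng)


def learn(grammar, meaning, signal):
    """Placeholder learner: rote memorisation, no generalisation."""
    grammar[meaning] = signal


def run(generations, exposures=150, seed=0):
    rng = random.Random(seed)
    speaker, learner = {}, {}          # neither agent starts with a grammar
    for _ in range(generations):
        for _ in range(exposures):
            meaning = rng.choice(MEANINGS)            # choose a meaning at random
            signal = produce(speaker, meaning, rng)   # speaker produces a signal
            learn(learner, meaning, signal)           # learner receives the pair
        # Delete the speaker, promote the learner, add a fresh learner.
        speaker, learner = learner, {}
    return speaker


if __name__ == "__main__":
    for meaning, signal in sorted(run(generations=10).items()):
        print(signal, meaning)
```

With only twelve meanings and 150 exposures per generation, a learner in this toy version will usually hear every meaning, so there is little pressure from limited exposure. A larger meaning space or a smaller exposures value would be needed before a rote learner starts losing signals between generations.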

Results 1a: initial stages

Initially, speakers have no language, so “invent” random strings of characters.

A protolanguage emerges for some meanings, but no structure. These are holistic expressions:

ldg “Mary admires John”

xkq “Mary loves John”

gj “Mary admires Gavin”

axk “John admires Gavin”

gb “John knows that Mary knows that John admires Gavin”


Results 1b: many generations later…

gj h      f tej     m
John      Mary      admires
“Mary admires John”

gj h      f tej     wp
John      Mary      loves
“Mary loves John”

gj qp     f tej     m
Gavin     Mary      admires
“Mary admires Gavin”

gj qp     f h       m
Gavin     John      admires
“John admires Gavin”

i h     u       i tej   u       gj qp   f h     m
John    knows   Mary    knows   Gavin   John    admires
“John knows that Mary knows that John admires Gavin”


Quantitative results: languages evolve

What’s going on?

There is no biological evolution in the ILM.

There isn’t even any communication; there is no notion of function in the model at all.

So, why are structured languages evolving?

Hypothesis:

Languages themselves are adapting to the conditions of the ILM so that they are learnable.

Only rules that are generalisable from limited exposure are stable.

The poverty of the stimulus ensures that holistic expressions cannot survive.

Can any irregularity survive?

Languages are not completely regular.

Languages are not completely stable.

In the previous simulation, languages evolve to a completely regular fixed-point.

Why would holistic expressions survive?

Why do expressions change?

Another simulation…

Completely accurate transmission of the signal is implausible. Might this be what leads to fixed end points?

Include a least effort principle:

Speakers always use the shortest string possible for a given meaning.

Speakers occasionally drop letters in production.

Simplified meaning-space: 5x5 “paradigm”
i.e., each meaning is a coordinate in 5x5 space
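
A minimal Python sketch of these production assumptions follows. The function names, the drop probability p_drop and the example strings are hypothetical and chosen only for illustration; the slides do not specify how often letters are dropped.

```python
import random

# Each meaning is a coordinate in a 5x5 space.
MEANINGS = [(row, col) for row in range(5) for col in range(5)]


def shortest(candidates):
    """Least effort: pick the shortest available string for a meaning."""
    return min(candidates, key=len)


def maybe_drop_letter(signal, rng, p_drop=0.1):
    """Occasionally drop one letter in production.

    Assumption for this sketch: single-letter signals are never shortened,
    so a signal cannot become empty.
    """
    if len(signal) > 1 and rng.random() < p_drop:
        i = rng.randrange(len(signal))
        return signal[:i] + signal[i + 1:]
    return signal


rng = random.Random(0)
spoken = maybe_drop_letter(shortest(["gjqpfhm", "axk", "tejmqp"]), rng)
print(spoken)
```

Because the learner only ever sees the string that was actually spoken, an occasional dropped letter is passed on as if it were the intended form, which is one way transmission can be made imperfect.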

Results 2a: early protolanguage stage

Results 2b: later stages, regulars

Results 2c: occasional, but short-lived irregulars

Frequency effects

Top ten verbs of English by frequency:

be, have, do, say, make, go, take, come, see, get…

was, had, did, said, made, went, took, came, saw, got…

Add frequency biases in meaning space. (modelled on Zipf’s law)

Meanings in top left of table get spoken more often
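
One way to build such a bias is sketched below in Python. The exact weighting is an assumption made for this example: each meaning (row, col) is given a weight inversely proportional to (row + 1) × (col + 1), so the top-left cell is the most frequent and the bottom-right cell the least. The function names are hypothetical.

```python
import random


def zipf_weights(size=5):
    """Probability for each cell, highest in the top left (assumed weighting)."""
    raw = {
        (row, col): 1.0 / ((row + 1) * (col + 1))
        for row in range(size)
        for col in range(size)
    }
    total = sum(raw.values())
    return {meaning: weight / total for meaning, weight in raw.items()}


def sample_meaning(weights, rng):
    """Draw one meaning according to its probability."""
    meanings = list(weights)
    probs = [weights[m] for m in meanings]
    return rng.choices(meanings, weights=probs, k=1)[0]


weights = zipf_weights()
rng = random.Random(0)
print(sample_meaning(weights, rng), round(weights[(0, 0)], 3))
```

Under this weighting the top-left meaning is spoken 25 times as often as the bottom-right one, so learners get far more evidence about some cells of the table than about others.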

Frequency distribution

Results 3: frequent = irregular, infrequent = regular

Language is an adaptive system in its own right

(At least) two adaptive problems for language:

Must be learnable even under “poverty of the stimulus” conditions

Must be produced by speakers employing least-effort principles

The solution is a language that is compositional where it matters (where learning data is likely to be sparse), and short where it matters (where utterances need to be produced frequently).

Extensions

Specific language universals:

Word-order universals. Why do languages tend to be consistently left- or right-branching?

Universal constraints on question formation, relative clauses, anaphora etc.

What about creolisation?

Need more sophisticated models of population dynamics.

Where do the meanings come from?

Models of meaning-formation and grounding during learning.

Why is language specific to humans?

Explore conditions for emergence of syntactic structure.

Model the biological evolution of mechanisms required for iterated learning itself.

Conclusions

The transition to language is the transition to a new kind of evolutionary system.

The LAD does not directly determine the structure of language.

Explains some language structure without appealing to

hard innate constraints

communicative function

The “poverty of the stimulus” is not a syntactic learnability problem. It is required for the emergence of syntax.

Take home message:

Rather than looking at the way we have adapted to language, we should look more at how language adapts to us.

