1
Language:
A new kind of evolutionary system
  • Simon Kirby

  • Language Evolution and Computation Research Unit
    Theoretical and Applied Linguistics, Edinburgh
2
Language evolution in context
  • Eight key transition events in the history of life on Earth (Maynard Smith & Szathmáry 1995):
  • Replicating molecules → Populations of molecules
  • Independent replicators → Chromosomes
  • RNA → DNA
  • Prokaryotes → Eukaryotes
  • Asexual clones → Sexual populations
  • Protists → Animals, plants and fungi
  • Solitary individuals → Colonies
  • Primate societies → Human societies (Language)
3
The growth of evolutionary linguistics
  • It is not just biologists who are interested in language evolution…
  • Pinker & Bloom (1990): Landmark paper linking linguistic and evolutionary theory.
    • Decade before: 96 articles
    • Decade after: 1095 articles
  • Evolutionary thinking in mainstream linguistics.
    • Jackendoff (2002), Foundations of Language: Brain, Meaning, Grammar, Evolution
  • Linguistic theory in the “big” science journals: Nature, Science, etc.
  • Computer science, robotics and AI communities actively researching language evolution.
4
Evolutionary linguistics
  • Survey two approaches to the why question:
    • Standard nativism
    • Evolutionary nativism
  • Introduce a new approach: the Iterated Learning Model
  • Present two simulations of the ILM
  • Argue that language itself is an evolutionary system
5
The standard nativist answer
  • Language is unique because our brains are unique.
  • We (and no other species) are born with a specialised innate cognitive mechanism for learning language.








  • Language is the way it is because our biology constrains it to be that way.
  • It’s the job of linguistics to deduce the structure of the language acquisition device: Universal Grammar.
  • (And that’s it!)
6
Evolutionary nativism
  • Why is the Language Acquisition Device the way it is?
  • The evolutionary psychology position: (Pinker & Bloom 1990)








  • Language is the way it is because natural selection favoured individuals who were able to learn languages that were useful for communication.


7
An alternative mechanism?
  • Some problems with the evolutionary nativism position:
    • Doesn’t really explain why language is unique
    • Not specific about how natural selection is going to work
    • Is everything in language optimally tailored?
    • Are there alternative mechanisms?

  • A new hypothesis:



8
Where does the primary linguistic data come from?
9
The Iterated Learning Model
(a framework for computational simulation)
10
What will the agents talk about?
  • Need some simple but structured “world”.
  • Simple predicate logic:








  • Agents can string random characters together to form utterances.
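A minimal sketch of such a "world", assuming meanings are nested predicate-argument tuples. The vocabulary is taken from the example sentences on the results slides; the embedding probability, the maximum signal length, and the exclusion of reflexive meanings are illustrative assumptions, not details of the original simulation.

```python
import random
import string

# Assumed vocabulary, taken from the example sentences on the results slides.
INDIVIDUALS = ["mary", "john", "gavin"]
ACTIONS = ["admires", "loves"]
ATTITUDES = ["knows"]

def random_meaning(rng, depth=0, max_depth=2):
    """Return a meaning as a nested tuple: (predicate, agent, patient or proposition)."""
    agent = rng.choice(INDIVIDUALS)
    if depth < max_depth and rng.random() < 0.3:
        # Embedded proposition, e.g. knows(john, admires(mary, gavin)).
        return (rng.choice(ATTITUDES), agent, random_meaning(rng, depth + 1, max_depth))
    # Assumption: no reflexive meanings such as admires(mary, mary).
    patient = rng.choice([x for x in INDIVIDUALS if x != agent])
    return (rng.choice(ACTIONS), agent, patient)

def invent_signal(rng, max_len=8):
    """String random lowercase characters together to form an utterance."""
    length = rng.randint(1, max_len)
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))

rng = random.Random(0)
meaning = random_meaning(rng)
signal = invent_signal(rng)
print(meaning, signal)
```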


11
How do agents learn?
  • Learners try to form a grammar that is consistent with the primary linguistic data they hear.
  • Fundamental principle: learning is compression.
  • Two processes on hearing a meaning-signal pair:
    • A rule is added to the grammar that is specific to that pair.
    • Search for ways of “compressing” the grammar. Are there pairs of rules that can be subsumed under a single rule? Are there duplicate rules?
  • Compression process uncovers any generalisations in the data.
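One way the two processes might look in code is sketched below. It is a deliberately simplified stand-in for the grammar induction used in the model: rules are flat meaning-signal pairs, and two rules are merged only when their meanings differ in exactly one slot and their signals share a prefix or suffix. The names `observe`, `merge_pair` and `common_prefix` are hypothetical, and the example strings are the slide-15 utterances written without spaces.

```python
def common_prefix(a, b):
    """Longest string that both a and b start with."""
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    return a[:n]

def observe(grammar, pair):
    """Process 1: add a pair-specific rule; exact duplicates are not stored twice."""
    if pair not in grammar:
        grammar.append(pair)

def merge_pair(rule_a, rule_b):
    """Process 2: try to subsume two rules under a single general rule.

    Returns (general_rule, lexicon) or None if no generalisation is found.
    The general rule is (meaning with "X" in one slot, (prefix, suffix)).
    """
    (m_a, s_a), (m_b, s_b) = rule_a, rule_b
    if len(m_a) != len(m_b):
        return None
    diff = [i for i in range(len(m_a)) if m_a[i] != m_b[i]]
    if len(diff) != 1:
        return None  # meanings must differ in exactly one slot
    prefix = common_prefix(s_a, s_b)
    rest_a, rest_b = s_a[len(prefix):], s_b[len(prefix):]
    suffix = common_prefix(rest_a[::-1], rest_b[::-1])[::-1]
    mid_a = rest_a[:len(rest_a) - len(suffix)]
    mid_b = rest_b[:len(rest_b) - len(suffix)]
    if not (prefix or suffix) or not mid_a or not mid_b:
        return None  # no shared material, or nothing left to mark the difference
    slot = diff[0]
    general = (m_a[:slot] + ("X",) + m_a[slot + 1:], (prefix, suffix))
    lexicon = {m_a[slot]: mid_a, m_b[slot]: mid_b}
    return general, lexicon

grammar = []
observe(grammar, (("admires", "mary", "john"), "gjhftejm"))
observe(grammar, (("loves", "mary", "john"), "gjhftejwp"))
observe(grammar, (("loves", "mary", "john"), "gjhftejwp"))  # duplicate, ignored
result = merge_pair(grammar[0], grammar[1])
print(result)
```

Two unrelated holistic strings for the same pair of meanings share no material, so `merge_pair` returns `None` for them and both specific rules stay in the grammar.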



12
A simulation run
  1. Start with one learner and one adult speaker, neither of which has a grammar.
  2. Choose a meaning at random.
  3. Get the speaker to produce a signal for that meaning (it may need to “invent” a random string).
  4. Give the meaning-signal pair to the learner.
  5. Repeat steps 2-4 one hundred and fifty times.
  6. Delete the speaker.
  7. Make the learner the new speaker.
  8. Introduce a new learner (with no initial grammar).
  9. Repeat steps 2-8 thousands of times.
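The transmission loop itself can be sketched in a few lines. The step numbers in the comments count the list items above in order. The `Agent` class here is a hypothetical placeholder that only memorises pairs, so this sketch shows the generational turnover but would not by itself produce structured languages; the compression-based learner is what does that work in the model.

```python
import random
import string

class Agent:
    """Placeholder agent: memorises meaning-signal pairs, no compression."""

    def __init__(self):
        self.grammar = {}  # agents start with no grammar

    def produce(self, meaning, rng):
        if meaning not in self.grammar:
            # "Invent" a random string (length 1-5 is an arbitrary assumption).
            length = rng.randint(1, 5)
            self.grammar[meaning] = "".join(
                rng.choice(string.ascii_lowercase) for _ in range(length)
            )
        return self.grammar[meaning]

    def learn(self, meaning, signal):
        self.grammar[meaning] = signal

def iterate(meanings, generations, utterances=150, seed=0):
    rng = random.Random(seed)
    speaker, learner = Agent(), Agent()              # step 1
    for _ in range(generations):                     # step 9
        for _ in range(utterances):                  # step 5
            meaning = rng.choice(meanings)           # step 2
            signal = speaker.produce(meaning, rng)   # step 3
            learner.learn(meaning, signal)           # step 4
        speaker = learner                            # steps 6 and 7
        learner = Agent()                            # step 8
    return speaker

MEANINGS = [
    ("admires", "mary", "john"),
    ("loves", "mary", "john"),
    ("admires", "mary", "gavin"),
    ("admires", "john", "gavin"),
]
final = iterate(MEANINGS, generations=5)
print(final.grammar)
```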
13
Results 1a: initial stages
  • Initially, speakers have no language, so “invent” random strings of characters.
  • A protolanguage emerges for some meanings, but no structure. These are holistic expressions:


  • ldg “Mary admires John”
  • xkq “Mary loves John”
  • gj “Mary admires Gavin”
  • axk “John admires Gavin”
  • gb “John knows that Mary knows that John admires Gavin”



14
 
15
Results 1b: many generations later…
  • gj h    f tej    m
    John    Mary     admires
    “Mary admires John”
  • gj h    f tej    wp
    John    Mary     loves
    “Mary loves John”
  • gj qp   f tej    m
    Gavin   Mary     admires
    “Mary admires Gavin”
  • gj qp   f h      m
    Gavin   John     admires
    “John admires Gavin”
  • i h      u        i tej    u        gj qp    f h      m
    John     knows    Mary     knows    Gavin    John     admires
    “John knows that Mary knows that John admires Gavin”
16
 
17
Quantitative results: languages evolve
18
What’s going on?
  • There is no biological evolution in the ILM.
  • There isn’t even any communication; no notion of function in model at all.
  • So, why are structured languages evolving?
  • Hypothesis:
  • Languages themselves are adapting to the conditions of the ILM, so that they become learnable.
  • Only rules that are generalisable from limited exposure are stable.
  • The poverty of the stimulus ensures that holistic expressions cannot survive.
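A back-of-the-envelope calculation shows why the bottleneck is so hard on holistic expressions. The sketch assumes meanings are drawn uniformly at random, that each learner hears 150 utterances (as in the simulation run), and that a holistic expression is lost whenever its meaning goes unheard for one generation. The size of the meaning space is not given on the slides, so `num_meanings` is a hypothetical parameter.

```python
def p_unheard(num_meanings, utterances=150):
    """Probability that one particular meaning never occurs in a learner's data."""
    return (1 - 1 / num_meanings) ** utterances

def p_survives(num_meanings, generations, utterances=150):
    """Probability that a holistic expression is heard in every one of the generations."""
    return (1 - p_unheard(num_meanings, utterances)) ** generations

# Survival chances shrink geometrically with the number of generations.
for generations in (1, 10, 100):
    print(generations, p_survives(num_meanings=100, generations=generations))
```

A compositional rule does not face this problem in the same way, because it is supported by every utterance that uses it rather than by a single meaning.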
19
Can any irregularity survive?
  • Languages are not completely regular.







  • Languages are not completely stable.
  • In the previous simulation, languages evolve to a completely regular fixed-point.
  • Why would holistic expressions survive?
  • Why do expressions change?


20
Another simulation…
  • Completely accurate transmission of signals is implausible. Might this assumption be what leads to fixed end points?


  • Include a least effort principle:
    • Speakers always use the shortest string possible for a given meaning.
    • Speakers occasionally drop letters in production.

  • Simplified meaning-space: 5x5 “paradigm”
    i.e., each meaning is a coordinate in 5x5 space
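The least-effort principle and the simplified meaning space might be sketched as follows. The drop probability, the rule that single-letter signals are never shortened further, and the function name `produce` are illustrative assumptions; the slides say only that letters are dropped "occasionally".

```python
import random

# Each meaning is a coordinate in a 5x5 space.
MEANINGS = [(a, b) for a in range(5) for b in range(5)]

def produce(candidates, rng, drop_prob=0.01):
    """Pick the shortest available string, then occasionally drop one letter.

    candidates: all strings the speaker's grammar can generate for the meaning.
    drop_prob: hypothetical per-utterance probability of losing a letter.
    """
    signal = min(candidates, key=len)
    # Assumption: a one-letter signal is never reduced to the empty string.
    if len(signal) > 1 and rng.random() < drop_prob:
        i = rng.randrange(len(signal))
        signal = signal[:i] + signal[i + 1:]
    return signal

rng = random.Random(1)
print(produce(["abc", "de"], rng))
```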
21
Results 2a: early protolanguage stage
22
Results 2b: later stages, regulars
23
Results 2c: occasional, but short-lived irregulars
24
Frequency effects

  • Top ten verbs of English by frequency:
    • be, have, do, say, make, go, take, come, see, get…
    • was, had, did, said, made, went, took, came, saw, got…


  • Add frequency biases in meaning space (modelled on Zipf’s law).
  • Meanings in top left of table get spoken more often
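A frequency bias of this kind could be set up as below. The slides do not give the exact weighting, so the inverse-rank product used here is only one assumed Zipf-like choice that makes the top-left cell `(0, 0)` the most frequent; `zipf_weights` and `sample_meaning` are hypothetical names.

```python
import random

def zipf_weights(size=5):
    """Assumed Zipf-like weights: frequency falls off with row and column rank."""
    return {
        (a, b): 1.0 / ((a + 1) * (b + 1))
        for a in range(size)
        for b in range(size)
    }

def sample_meaning(weights, rng):
    """Draw one meaning with probability proportional to its weight."""
    meanings = list(weights)
    return rng.choices(meanings, weights=[weights[m] for m in meanings], k=1)[0]

weights = zipf_weights()
rng = random.Random(0)
print([sample_meaning(weights, rng) for _ in range(5)])
```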


25
Frequency distribution
26
Results 3: frequent = irregular, infrequent = regular
27
Language is an adaptive system in its own right
  • (At least) two adaptive problems for language:
    • Must be learnable even under “poverty of the stimulus” conditions
    • Must be produced by speakers employing least-effort principles


  • The solution is a language that is compositional where it matters (where learning data is likely to be sparse), and short where it matters (where utterances need to be produced frequently).
28
Extensions
  • Specific language universals:
    • Word-order universals. Why do languages tend to be consistently left- or right-branching?
    • Universal constraints on question formation, relative clauses, anaphora etc.

  • What about creolisation?
    • Need more sophisticated models of population dynamics.

  • Where do the meanings come from?
    • Models of meaning-formation and grounding during learning.

  • Why is language specific to humans?
    • Explore conditions for emergence of syntactic structure.
    • Model the biological evolution of mechanisms required for iterated learning itself.
29
Conclusions
  • The transition to language is the transition to a new kind of evolutionary system.
  • The LAD does not directly determine the structure of language.
  • The ILM explains some language structure without appealing to:
    • hard innate constraints
    • communicative function
  • The “poverty of the stimulus” is not a syntactic learnability problem. It is required for the emergence of syntax.



  • Take home message:
  • Rather than looking at the way we have adapted to language, we should look more at how language adapts to us.