Mathematical Linguistics and Language Evolution

Timothy J. O'Donnell (timo@inf.ed.ac.uk) and Willem Zuidema
University of Edinburgh, Language Evolution and Computation Research Unit

Human languages exhibit a combination of computational features that makes them unique systems of communication in nature: large and learned lexicons, combinatorial phonology, compositional semantics, and hierarchical phrase structure. In the field of the evolution of language, controversies have often focused on the complexity of these computational mechanisms. These controversies include debates about innateness, about whether language was exapted, whether it is the result of few or many mutations, and whether it increased in complexity over evolutionary time (see, e.g., Pinker & Bloom, 1990, and the many peer commentaries and the authors' response in the same issue). We analyze these debates and find that at their core they rely, in varying degrees, on two implicit assumptions: (i) that complexity in the computational machinery for processing language is difficult for evolution to achieve, and/or (ii) that complexity is itself a trait which can be selected for or against.

Of the many possible ways of studying computational complexity, formal linguistics has primarily been concerned with situating natural language processes and formalisms on various computational hierarchies, by far the most studied of which is the (extended) Chomsky hierarchy. We ask: how do the two assumptions outlined above fare when analyzed under this notion of complexity, and how does the answer bear on the debates in the field? Such a formal definition could potentially resolve conflicting intuitions about complexity (exemplified, e.g., in Lewontin's and Piattelli-Palmarini's commentaries on Pinker & Bloom, 1990).

We argue that complexity in the automata-theoretic sense is in fact very common in natural systems. We find it plausible that genes can code for systems of small numbers of elements interacting according to simple rules, and there is increasing evidence that such systems are often computationally universal (e.g., Wolfram, 2002). Furthermore, certain classes of neural network models have been shown to be Turing-equivalent (Siegelmann & Sontag, 1991) and capable of efficiently encoding phenomena such as hierarchical phrase structure (Pollack, 1988). We suspect that the brains of many kinds of animals already implement algorithms and computations that are, in the strict automata-theoretic sense, sufficiently complex to represent and process language.

We go on to argue, however, that these grammars and automata are not well suited to serve as phenotypes in biological models. They do, of course, expose interesting differences between grammatical classes on the hierarchy. For instance, the word recognition, or parsing, problem increases in time complexity as one makes certain moves up the hierarchy. Likewise, the levels of the hierarchy can be understood in terms of progressively relaxed memory limitations: from finite memory, to stack-based memory, to stacks with less restrictive push operations, and so on. But it is difficult to see how these differences satisfy evolutionary constraints or can affect fitness. We argue that instead of looking at these formalisms in terms of their place on the hierarchy, we must look more deeply at the properties of language that they are meant to abstract.
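As a minimal illustration of this memory contrast, consider two toy recognizers in Python (the function names and example languages are our own illustrative choices, not part of any particular formalism): a finite-state device, whose only memory is one state drawn from a fixed set, and a pushdown-style device, whose unbounded stack lets it recognize the context-free language a^n b^n, which no finite-state machine can.

    # Toy recognizers illustrating two rungs of the Chomsky hierarchy.

    def recognize_regular(s):
        """Finite-state recognizer for the regular language (ab)*.
        Its only memory is one state drawn from a fixed, finite set."""
        state = 0                          # 0: expecting 'a'; 1: expecting 'b'
        for ch in s:
            if state == 0 and ch == 'a':
                state = 1
            elif state == 1 and ch == 'b':
                state = 0
            else:
                return False
        return state == 0

    def recognize_context_free(s):
        """Pushdown-style recognizer for the context-free language a^n b^n.
        Its memory is an unbounded stack; no finite-state machine suffices."""
        stack = []
        seen_b = False
        for ch in s:
            if ch == 'a' and not seen_b:
                stack.append('a')          # push one symbol per 'a'
            elif ch == 'b' and stack:
                seen_b = True
                stack.pop()                # match each 'b' against a stored 'a'
            else:
                return False
        return not stack                   # accept iff every 'a' was matched

    assert recognize_regular("abab") and not recognize_regular("aab")
    assert recognize_context_free("aaabbb") and not recognize_context_free("aabbb")

The time-complexity asymmetry is visible here in embryo: the finite-state recognizer runs in a single linear pass with constant memory, the pushdown recognizer needs memory proportional to the input length, and recognition for general context-free grammars costs cubic time with standard algorithms such as CYK.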
In sum, we argue that it is not the physical constraints of the general neural architecture that restrict natural language to a specific complexity class. Rather, the requirements of learnability and population coherence, together with the interface conditions of interpretability and producibility under realistic time and noise constraints, select specific classes of computational mechanisms. These mechanisms restrict any language that is to survive either cultural or genetic transmission. We discuss the implications of this view for the debates outlined in the introduction and reach some general conclusions: is it, for instance, theoretically useful to describe the evolution of language as climbing the Chomsky hierarchy (as do, e.g., Hashimoto & Ikegami, 1996)? Finally, we conclude that while the Chomsky hierarchy is a poor model of phenotypic complexity, it is a very good model of language itself. This suggests a way of rescuing it as a tool for evolutionary theory.

References

Hashimoto, T., & Ikegami, T. (1996). The emergence of a net-grammar in communicating agents. BioSystems, 38, 1-14.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-784.
Pollack, J. B. (1988). Recursive auto-associative memory: Devising compositional distributed representations. In Proceedings of the Tenth Annual Conference of the Cognitive Science Society. Lawrence Erlbaum.
Siegelmann, H. T., & Sontag, E. D. (1991). Neural networks are universal computing devices. Technical Report SYCON-91-08, Rutgers Center for Systems and Control.
Wolfram, S. (2002). A New Kind of Science. Champaign, IL: Wolfram Media.