Mathematical Linguistics and Language Evolution

Timothy J. O'Donnell (timo@inf.ed.ac.uk) and Willem Zuidema
University of Edinburgh, Language Evolution and Computation Research Unit

Human languages exhibit a combination of computational features that makes them unique systems of communication in nature: large and learned lexicons, combinatorial phonology, compositional semantics, and hierarchical phrase structure. In the field of the evolution of language, controversies have often focused on the complexity of these computational mechanisms. These controversies include debates about innateness, about whether language was exapted, whether it is the result of few or many mutations, and whether it increased in complexity over evolutionary time (see, e.g., Pinker & Bloom, 1990, and the many peer commentaries and the authors' response in the same issue). We analyze these debates and find that at their core they rely, in varying degrees, on two implicit assumptions: (i) that complexity in the computational machinery for processing language is difficult for evolution to achieve, and/or (ii) that complexity is itself a trait which can be selected for or against.

Of the many possible ways of studying computational complexity, formal linguistics has primarily been concerned with situating natural language processes and formalisms on various computational hierarchies, by far the most studied of which is the (extended) Chomsky hierarchy. We ask: how do the two assumptions outlined above fare when analyzed under this notion of complexity, and how does the answer bear on the debates in the field? Such a formal definition could potentially resolve conflicting intuitions about complexity (exemplified, e.g., in Lewontin's and Piattelli-Palmarini's commentaries on Pinker & Bloom, 1990).

We argue that complexity in the automata-theoretic sense is in fact very common in natural systems. We find it plausible that genes can code for systems of small numbers of elements interacting according to simple rules, and there is increasing evidence that such systems are often computationally universal (e.g., Wolfram, 2002). Furthermore, certain classes of neural network models have been shown to be Turing-equivalent (Siegelmann & Sontag, 1991) and capable of efficiently encoding phenomena such as hierarchical phrase structure (Pollack, 1988). We suspect that the brains of many kinds of animals already implement algorithms and computations that are, in the strict automata-theoretic sense, sufficiently complex to represent and process language.

We go on to argue, however, that these grammars and automata are not well suited to serve as phenotypes in biological models. They do, of course, expose interesting differences between grammatical classes on the hierarchy. For instance, the word recognition, or parsing, problem increases in time complexity as one makes certain moves up the hierarchy. Likewise, the levels of the hierarchy can be understood in terms of progressively relaxed memory limitations: from finite memory, to stack-based memory, to stacks with less restrictive push operations, and so on. But it is difficult to see how these differences satisfy evolutionary constraints or can affect fitness. We argue that instead of looking at these formalisms in terms of their place on the hierarchy, we must look more deeply at the properties of language that they are meant to abstract.
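As a minimal illustration of this memory contrast, consider two toy recognizers in Python (the function names and example languages are our own illustrative choices, not part of any particular formalism): a finite-state device, whose only memory is one state drawn from a fixed set, and a pushdown-style device, whose unbounded stack lets it recognize the context-free language a^n b^n, which no finite-state machine can.

    # Toy recognizers illustrating two rungs of the Chomsky hierarchy.

    def recognize_regular(s):
        """Finite-state recognizer for the regular language (ab)*.
        Its only memory is one state drawn from a fixed, finite set."""
        state = 0                          # 0: expecting 'a'; 1: expecting 'b'
        for ch in s:
            if state == 0 and ch == 'a':
                state = 1
            elif state == 1 and ch == 'b':
                state = 0
            else:
                return False
        return state == 0

    def recognize_context_free(s):
        """Pushdown-style recognizer for the context-free language a^n b^n.
        Its memory is an unbounded stack; no finite-state machine suffices."""
        stack = []
        seen_b = False
        for ch in s:
            if ch == 'a' and not seen_b:
                stack.append('a')          # push one symbol per 'a'
            elif ch == 'b' and stack:
                seen_b = True
                stack.pop()                # match each 'b' against a stored 'a'
            else:
                return False
        return not stack                   # accept iff every 'a' was matched

    assert recognize_regular("abab") and not recognize_regular("aab")
    assert recognize_context_free("aaabbb") and not recognize_context_free("aabbb")

The time-complexity asymmetry is visible here in embryo: the finite-state recognizer runs in a single linear pass with constant memory, the pushdown recognizer needs memory proportional to the input length, and recognition for general context-free grammars costs cubic time with standard algorithms such as CYK.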
In sum, we argue that it is not the physical constraints of the general neural architecture that restrict natural language to a specific complexity class. Rather, the requirements of learnability and population coherence, together with the interface conditions of interpretability and producibility under realistic time and noise constraints, select specific classes of computational mechanisms. These mechanisms restrict any language that is to survive either cultural or genetic transmission. We discuss the implications of this view for the debates outlined in the introduction and reach some general conclusions: is it, for instance, theoretically useful to describe the evolution of language as climbing the Chomsky hierarchy (as do, e.g., Hashimoto & Ikegami, 1996)? Finally, we conclude that while the Chomsky hierarchy is a poor model of phenotypic complexity, it is a very good model of language itself. This suggests a way of rescuing it as a tool for evolutionary theory.

References

Hashimoto, T., & Ikegami, T. (1996). The emergence of a net-grammar in communicating agents. BioSystems, 38, 1-14.
Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-784.
Pollack, J. B. (1988). Recursive auto-associative memory: Devising compositional distributed representations. In Proceedings of the Tenth Annual Conference of the Cognitive Science Society. Lawrence Erlbaum.
Siegelmann, H. T., & Sontag, E. D. (1991). Neural networks are universal computing devices. Technical Report SYCON-91-08, Rutgers Center for Systems and Control.
Wolfram, S. (2002). A New Kind of Science. Champaign, IL: Wolfram Media.