The emergence of syntax

James R Hurford

(Editorial introduction to section on syntax in The Evolutionary Emergence of Language: Social function and the origins of linguistic form, edited by C. Knight, M. Studdert-Kennedy and J. Hurford, (2000) Cambridge University Press. Pp.219-230.).
Note: This HTML version may differ slightly from the printed version; the printed version is the ``authorized'' version.

The papers in this section reflect a movement, in the late 1990s, away from a focus on the genetic evolution of the innate Language Acquisition Device towards accounts invoking also cultural and linguistic evolution. This is not to deny that the human linguistic capacity evolved biologically, but to acknowledge that such evolution was slow and complicatedly entangled with other aspects of human evolution. The whole section is neatly sandwiched by its first and last chapters, contributed by generative linguists. The first of these argues against narrowly biological adaptationist accounts of language evolution, and the last (complementarily but quite independently, as it happens) casts its contribution to language evolution research in the form of a clearly historical exercise in linguistic reconstruction. The papers in the middle of this sandwich are no less meaty, many of them setting out on a complementary quest for accounts of how languages could have evolved relatively rapidly into their particular complex modern shapes by nonbiological mechanisms, within a relatively static biological frame of reference.

Several themes connect papers in this section, reflecting the general movement described above. These themes are:

One might have expected syntactic theorists to have been prominent in the literature on human evolution, given that what is most remarkable about humans is their capacity for syntactically complex language. But until very recently, syntactic theorists have kept away from evolutionary theorizing. This avoidance has tended to apply to linguists in general; paradoxically, mainstream scholarly linguists have typically been outnumbered in speculation on the evolution of language by anthropologists, psychologists and paleontologists (witness the list of contributors to Lock and Peters' massive Handbook (Lock and Peters, 1996)).

Why have linguists traditionally been so reticent about contributing to evolutionary debates? The very complexity of human languages, especially their syntactic components, of which linguists above all (and one might even say only linguists) have been fully aware, is a severe obstacle to theorizing. Although evolutionary biology is a well-established field, one notices a similar reticence among many biologists to engage in evolutionary speculation, because biologists, above all, know how complex the explananda are. Just characterizing the intricacies of human syntax has been work enough for linguists, let alone worrying about how it all might fit into an evolutionary scenario. But the time for turning to the evolution of such complexity had to come.

As the chapters in this section illustrate, the time has come, and linguists are now getting involved in evolutionary theorizing. Less than a decade ago, it would have been impossible to assemble such a coherent set of contributions by scholars who have made their reputations in non-evolutionary aspects of linguistics. Here, now, we have such an assembly of linguists, albeit from markedly different backgrounds, some willing to outline actual mechanisms by which syntactic complexity may have emerged, and others (well, just one, actually) clearly skeptical of such attempts, but taking them seriously enough to mount a closely reasoned counter-argument.

It remains the case that the linguists most concerned with the complexity of the explananda are those with least sympathy for proposed evolutionary explanations. In this section, David Lightfoot's chapter acts as a salutary counterweight to any tendency to assume, without detailed argument, that any universal aspect of the syntax of human languages must be adaptive. Although Lightfoot is the only contributor in this section who voices what might be taken as a negative note on theories of language evolution, in fact his arguments in no way clash with the tenor of the other chapters here. This is an indication of how far and how quickly theorizing about the evolution of syntax has shifted in the decade since Pinker and Bloom's influential paper (Pinker and Bloom, 1990).

Popular science books can give an impression of simple intellectual battlelines drawn up with Dawkins, Pinker and Dennett on one side (the ``adaptationists'') and Chomsky, Gould and Lewontin on the other. As Andrew Carstairs-McCarthy's chapter in this volume notes, ``What both sides in this debate have generally had in common, so far, is an all-or-nothing attitude: either grammar is adaptive or it is not. But from the point of view of evolutionary biology, that attitude seems oversimple.'' Pinker and Bloom's conspicuously drawn parallel between the structure of the eye and the structure of human languages certainly tended to start people thinking along straightforwardly adaptationist lines about the complexity of language. Yet, for all its virtues in reviving discussion of the evolution of language, it is clear that Pinker and Bloom's analogy with the eye is in many ways misplaced. The eye evolved separately many times, whereas human language has only evolved once; the evolution of the eye took tens of millions of years, at least, whereas it seems that the evolution of human language was faster by several orders of magnitude; and there is clearly a social, communicative dimension to the evolution of language lacking in any story about the eye. Pinker and Bloom argued that the innate human capacity to acquire language was likely to have been selected by orthodox Darwinian processes. Thus it seemed, at the beginning of the 1990s, that work on the evolution of syntax was set to take a decidedly adaptationist (and biological) course. Lightfoot's chapter in this volume can be seen in this context. It echoes his position in a commentary on another broadly adaptationist proposal from the early 1990s (see Newmeyer 1991; Lightfoot, 1991). Lightfoot here discusses some quite complex and abstract constraints on grammatical structures, underdetermined by any evidence that learners are likely to observe, and hence attributable to a definite bias in the learning mechanism. He shows how these constraints have a dysfunctional effect, actually preventing speakers from expressing certain messages in what would seem to be the most straightforward way. It should be no surprise that any complex system has advantages and disadvantages; efficiency is always a matter of compromise between costs and benefits.

None of the papers in this section argue for biological adaptation of brain structures as the central mechanism behind the emergence of complex syntax. None of them deny the existence of selective pressures, either, but adaptation is not their prime focus. Thus, Carstairs-McCarthy's central argument is that a particular feature of language, for a long part of its prehistory, was not well adapted. Derek Bickerton advances an explicitly exaptationist position: (much of) syntactic complexity is built on non-linguistic structure that existed before. Alison Wray also argues explicitly that much of grammar is dysfunctional for day-to-day communicative activities. The computational models of Simon Kirby and Jim Hurford are consonant with Bickerton's view, as they assume mental representational structures pre-existing communication, which give rise to linguistic structures. The title of Kirby's paper is ``Syntax without natural selection'', emphasizing its nonadaptationist stance. Kirby, Hurford and Robert Worden all assume a constant biological endowment, in the shape of specific learning mechanisms, which enables cultural evolution of languages. Finally, Fritz Newmeyer's paper in this volume does not continue his earlier (1991) broadly adaptationist arguments, but deals with the evolution of languages within the context of an unchanging biological endowment, and includes an argument specifically against a particular form of adaptive evolution (genetic assimilation, or the Baldwin Effect) as an account of properties of the innate human capacity for syntax.

Lightfoot's paper, and the second part of Newmeyer's, bear testimony to the great complexity of syntactic structure. Non-linguists are apt to dismiss arguments from linguists on the grounds that they are too complicated. This is like concluding that General Relativity can't be true because you can't understand it. And just as it would be wrong to disbelieve in electrons and quarks because you can't experience them directly, it is wrong to discount the linguists' arguments involving invisible (and inaudible) elements of linguistic structure, such as the brackets, indices and traces, which are at the centre of Lightfoot's, and to some extent Newmeyer's, arguments. In the hope that it will help non-linguists to get the drift of such arguments, some points of background from syntactic theory are given at the end of this introduction, in a brief appendix.

The first and last papers in this section, Lightfoot's and Newmeyer's, are by linguists who have made their names within the Chomskyan tradition of generative grammar. The level of technical syntactic detail in their arguments exceeds that of the intervening papers, which tend toward programmatic, `big-picture' statements. There remains a significant gap to be bridged between such programmatic proposals and the degree of detailed knowledge that has now been accumulated about the syntax of languages. I will give three examples.

Carstairs-McCarthy makes a broad structural proposal for the origin of the category `grammatical subject': it arose, he suggests, from a structural parallel with the syllable-structure category `onset'. But in many languages, specific constraints apply to grammatical subjects, as opposed to (direct or indirect) objects. Lightfoot's examples (11)-(14) illustrate this: wh- items cannot be extracted from the subjects of tensed clauses, whereas such extractions from other functional positions are normally grammatical. Carstairs-McCarthy's account of the origin of the category `subject' does not yet address the specific properties that generative grammarians have identified as peculiar to subjects.

Bickerton claims ``In fact, it has proved possible to derive most if not all of the basic principles of universal grammar from a small set of primitives which includes only the obligatory representation of thematic roles, a general economy measure (`all constituents without independent reference must choose the nearest available referent') and a pragmatic assumption (`no two arguments of a clause can refer to the same referent unless one of them is lexically marked to this effect').'' It is not clear that this broad claim can stretch to many of the examples given by Lightfoot.

A third example of the large gap between those concerned with syntactic detail and the `big-picture' (or `broad-brush') theorists of the evolution of syntax is seen in the simple syntaxes of the emergent languages in Kirby's and Hurford's computer models. Their simulated populations converge on languages whose grammars can be completely described on one sheet of paper, obviously falling far short of the complexity of real languages.

So what is the value of such big-picture proposals as are included in this collection? A clue lies in the very course of syntactic theorizing over the last forty years. Undoubtedly, generative grammar has expanded, with the result that vastly more detailed knowledge has now been accumulated about the syntax of languages than ever before. And some of this detailed knowledge is expressible in generalizable form, as in the case of the Subjacency constraints mentioned by Lightfoot and Newmeyer. The field has also moved very fast, so that today's theoretical arguments are typically quite different from those of even a decade ago. This is often not because the old arguments have been settled, but rather the very speed at which the field has changed has resulted in the implications for large areas of language being left unresolved. Furthermore, syntactic theory has fragmented into a host of rival theories. See, for example, the large number, at least a dozen, of alternative generative theories surveyed and summarized in Brown and Miller (1996).

In brief, narrow syntactic theorizing, despite undeniable gains, is in a state of considerable turmoil, and needs to start looking beyond its traditional horizons for explanatory principles of kinds that it has not previously considered. No contributor to this volume denies the central part played by the specific dispositions of human language learners in the structuring of language. But the papers here place language acquisition in the wider context of the arena of use, history and evolution. In this wider context, types of explanation for universals in syntactic structure become available which are either alternative or complementary to those appealing to the special nature of the language acquisition device.

Carstairs-McCarthy draws attention to significant structural parallels between syllable structure and simple clause structure. It is suggested that the human capacity for mental representation of signals was constrained, at both the syllabic and the clausal level, and for many millennia, by the same structural straitjacket. For Carstairs-McCarthy, the step to modern syntactic structures came later, with a capacity to represent recursively embedded structures. It seems reasonable to suppose that a capacity to represent non-self-embedding hierarchical structures may have preceded a capacity to handle recursion. Although Bickerton, Kirby and Hurford take as given a capacity for representing recursively self-embedded (semantic) structures, there is in fact no necessary conflict between Carstairs-McCarthy's proposal and theirs. Carstairs-McCarthy does not suggest that Homo erectus' representations of meanings were constrained in the same way as their representations of possible signals. It is possible, and would be an instructive exercise, to put Carstairs-McCarthy's and Bickerton's (and Wray's) proposals together into a more fine-grained and detailed story.

For Bickerton, recursively hierarchical syntactic structure is an exaptation from pre-existing semantic structure. The major shift between protolanguage and modern language came with a steep rise in signal processing capacity. Although he does not single out recursive self-embedding as a special problem, it is clear that self-embedding does pose special processing problems (for example in a system where the online store of the structure currently being processed cannot reliably record and distinguish more than one structure of the same category). Carstairs-McCarthy's move to recursion and Bickerton's `catastrophic' shift in signal processing capacity perhaps coincided. If they did, both authors concur in locating this shift at around 300,000 years ago, coinciding with the probable first appearance of Homo sapiens sapiens.

Wray argues cogently that protolanguage is still with us. There has been a tendency to talk in terms of Bickertonian protolanguage `giving way to', or being wholly supplanted by, modern language. Bickerton concurs with Wray that such was not the case: `` ... there is no need to suppose that in catastrophic changes of state, one state supersedes or abolishes the other. ... We need not imagine parents who spoke only protolanguage with children who spoke like us.'' Wray's and Bickerton's views coincide on the implications of the high cost of processing grammatically articulated language.

Bickerton's implicit assumptions about the nature of the semantic structure pre-existing modern language are quite different from Wray's explicit view of the semantics of protolanguage. Bickerton envisages discrete and distinct mental categories corresponding to predicates and their arguments; in protolanguage there could have been `words' corresponding to argument concepts and others corresponding to predicate concepts. The step from protolanguage to language involved externalizing the syntax of semantic representations (more or less like predicate logic, without quantifiers, for Bickerton), so that, roughly, the forms corresponding to predicates became verbs, the forms corresponding to arguments became nouns, and the larger forms expressing embedded propositions became subordinate clauses. Thus, for Bickerton, the seeds of reference and predication are already present in the semantics of the protolanguage. The meanings of protolanguage utterances were essentially truth-conditional. (In fact, natural languages do not fit the syntax of predicate logic as neatly as Bickerton implies, as common nouns, adjectives, prepositions and verbs are all semantically predicates, and only proper nouns correspond to the atomic arguments of predicate logic.)

Wray's view of the semantics of protolanguage is quite different, with messages being essentially holistic speech-acts, centrally pragmatic in function, and without inbuilt reference and predication. For Wray, reference and predication emerge with the segmentation of protolanguage utterances into smaller meaningful parts. This illustrates another difference between Bickerton and Wray. One can identify two views of the move from protolanguage to language, which one can label `synthetic' and `analytic'. Bickerton takes the synthetic view. The original words of protolanguage had meanings which became the atomic constituents of the meanings of the larger utterances of full language. The original words of protolanguage were strung together to make the phrases and sentences of full language. Wray takes an analytic view. The original words of protolanguage had meanings which became the meanings of high-level constituents in full language. The original words of protolanguage were dissected into parts which came to express the atomic meanings of full language.

Thus Bickerton and Wray diverge along two separate dimensions. For Bickerton, protolanguage meanings were truth-conditional, and the step to language was synthetic; for Wray, protolanguage meanings were pragmatic (`interpersonal'), and the step to language was analytic. The two dimensions are independent, as Kirby's paper in this section in fact assumes a kind of truth-conditional semantics for presyntactic stages of language, but combines this with a model of an analytic move from syntaxless early language to later syntactic language. Kirby and Hurford differ along the same analytic/synthetic dimension as Wray and Bickerton. One can summarize the positions in a table.

Protolanguage            Move from protolanguage to language was:
meanings were:           Synthetic              Analytic
Truth-conditional        Bickerton, Hurford     Kirby
Pragmatic                (none)                 Wray

These various background assumptions about the move from presyntactic protolanguage to language are not argued for in any great detail by these authors (and especially not by Kirby or Hurford). The present analysis is intended to point out directions for future debate. It is notable that the `Pragmatic/Synthetic' cell in the above table is empty. It is more difficult, though not actually impossible, to conceive of a Pragmatic/Synthetic scenario. In such a scenario, there would presumably have been protolanguage referential `words' for I and you, two of the central atomic components of any speech act meaning (illocution).

Kirby's paper is one of the first in a current spate of research reports which describe fully implemented computational models of the evolution of languages with (simple) syntax in a population of learners. The paper builds on, and significantly extends, the work of Batali (1998) (in the predecessor volume to this one). This new trend, the computational modelling of evolving populations, with individuals endowed with quite complex behaviours, is made possible by the spectacular advances in computing power of the last decade. A collection of such work appears in a companion volume to this (Briscoe, 2000), which contains a survey article on these works (Hurford, 2000). As mentioned earlier, Kirby models this process `analytically'; the essential capacity in his learners which gives rise to syntax is a capacity to segment utterances and generalize over chance coincidences in the meanings of identical segments of different utterances. This is just the same process as envisaged by Wray in her paper.
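The analytic step just described can be made concrete with a toy sketch. The Python below is an illustrative assumption and not Kirby's implementation: meanings are simplified to (predicate, argument) pairs, the forms are invented strings, and the hypothetical function segment_pair looks only for a shared initial substring that coincides with a shared meaning component.

```python
def common_prefix(a, b):
    """Return the longest initial substring shared by strings a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]


def segment_pair(pair_a, pair_b):
    """Try to split two holistic form-meaning pairs at a shared prefix.

    Each pair is (form, (predicate, argument)). Returns a dict mapping each
    meaning component to the substring hypothesised to express it, or None
    when the two pairs offer no usable coincidence.
    """
    (form_a, (pred_a, arg_a)), (form_b, (pred_b, arg_b)) = pair_a, pair_b
    shared = common_prefix(form_a, form_b)
    # No overlap, or one form is wholly contained in the other: give up.
    if not shared or shared == form_a or shared == form_b:
        return None
    rest_a, rest_b = form_a[len(shared):], form_b[len(shared):]
    # Same predicate, different arguments: the shared part names the predicate.
    if pred_a == pred_b and arg_a != arg_b:
        return {pred_a: shared, arg_a: rest_a, arg_b: rest_b}
    # Same argument, different predicates: the shared part names the argument.
    if arg_a == arg_b and pred_a != pred_b:
        return {arg_a: shared, pred_a: rest_a, pred_b: rest_b}
    return None


# Two unanalysed utterances that happen to share both "ka" and the meaning "sees".
lexicon = segment_pair(("kala", ("sees", "john")), ("kasu", ("sees", "mary")))
print(lexicon)  # {'sees': 'ka', 'john': 'la', 'mary': 'su'}
```

A learner equipped with something like this would treat the coincidence as evidence for three smaller form-meaning pairs, which is the sense in which holistic utterances are dissected into parts.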

Hurford's paper in this section follows Kirby's lead, modelling the evolution of a syntactic language from an initial languageless state in a population of learners. Hurford adds recursive power to the model (but see now Kirby, forthcoming) and focusses on the way in which the mechanism of social transmission of language tends to select general (i.e. syntactic) mappings between meanings and forms. Hurford models the evolution of syntax `synthetically', via learners with a capacity to invent constructions in which previously learned atomic forms are included.

Kirby's and Hurford's computational models highlight a kind of language evolution which is not driven by clearly functional pressures, such as a pressure toward easily parsable structures, or toward socially useful meanings. The very fact that the history of a language passes through stages of data-compression (language learning) and exemplification (language use) is sufficient to guide a language over time toward general, regular structures. Kirby and Hurford emphasize different kinds of generalization and regularization, with Kirby arguing persuasively that this historical process favours the evolution of one of the most basic features of language, namely compositionality of meaning. The compositionality principle states that the meaning of a whole is a function of the meanings of its parts and of the way the parts are put together. Compositionality is so basic to language that it is usually simply taken for granted, and the idea of explaining why languages should be organized in this way is seldom contemplated. But Kirby's paper shows how the historical compression-exemplification cycle leads to compositional languages evolving.
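A small sketch may help to show why repeated passage through learning favours compositional mappings. Everything in it is an assumption made for illustration, not a description of Kirby's or Hurford's models: the nine meanings, the invented forms, the fixed two-character segments, and the hypothetical functions learn and faithfully_transmitted. The learner sees only three of the nine meaning-form pairs (the bottleneck) and must express the rest by generalizing.

```python
from itertools import product

PREDICATES = ["sees", "likes", "fears"]
ARGUMENTS = ["john", "mary", "tom"]
MEANINGS = list(product(PREDICATES, ARGUMENTS))

# Compositional language: the form of the whole is built from the forms of
# the parts (predicate form followed by argument form).
PRED_FORMS = {"sees": "ka", "likes": "mo", "fears": "ti"}
ARG_FORMS = {"john": "la", "mary": "su", "tom": "ne"}
COMPOSITIONAL = {(p, a): PRED_FORMS[p] + ARG_FORMS[a] for p, a in MEANINGS}

# Holistic language: one arbitrary, unanalysable form per meaning.
HOLISTIC = dict(zip(MEANINGS, ["pofu", "gide", "raki", "zuba", "wemo",
                               "hyta", "cilo", "dune", "vasi"]))


def learn(observed):
    """Induce a grammar from observed pairs (assumes two-character segments)."""
    pred_forms, arg_forms = {}, {}
    for (pred, arg), form in observed.items():
        pred_forms.setdefault(pred, form[:2])
        arg_forms.setdefault(arg, form[2:])
    grammar = {
        (p, a): pred_forms[p] + arg_forms[a]
        for p, a in MEANINGS
        if p in pred_forms and a in arg_forms
    }
    grammar.update(observed)  # memorised items override generalizations
    return grammar


def faithfully_transmitted(language, sample):
    """Count meanings the learner expresses exactly as the teacher did."""
    observed = {m: language[m] for m in sample}
    learned = learn(observed)
    return sum(learned.get(m) == language[m] for m in MEANINGS)


# The learner is exposed to only three of the nine meanings.
BOTTLENECK = [("sees", "john"), ("likes", "mary"), ("fears", "tom")]
print(faithfully_transmitted(COMPOSITIONAL, BOTTLENECK))  # 9
print(faithfully_transmitted(HOLISTIC, BOTTLENECK))       # 3
```

Under these assumptions the compositional language survives the bottleneck intact, while the holistic language is passed on only for the items actually observed; no functional pressure is involved beyond the fact of transmission itself.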

Worden's paper shares with Kirby's and Hurford's a degree of formal explicitness, and similarly reports on a model implemented computationally. Worden's paper begins to bridge the gap between the level of syntactic detail typical of generative treatments of syntax and the kinds of big-picture theorizing more typical in work on the evolution of language. With a quite specific model of syntactic structure, derived in part from various extant generative theories, Worden, like Kirby and Hurford, explores the implications of the historical passage of a language through the cycle of learning and exemplification, and offers specific explanations for several widespread characteristics of language. Worden differs from Kirby and Hurford in introducing functional pressures into the historical picture, including ease of learning and usefulness of meaning. Kirby and Hurford do not deny the relevance of such functional pressures, but leave them out of consideration in their ``purer'' models, as a way of disentangling the contributions made by function, on the one hand, and the basic fact of historical transmission by exemplification, on the other. Several of the features of language which Worden's model explains are examples of the kind of generality and regularity which Kirby's and Hurford's models also explain. The details of implementation in the three models are all rather different, but they converge in strikingly similar conclusions, with respect to the emergence of syntactic regularities in languages.

Newmeyer's paper here is a strikingly bold attempt, the first of its kind, to argue for a particular ordering of the major sentence constituents in the earliest human language, SOV. Like all such bold speculations (including Bickerton's concept of protolanguage), it will raise many questions. The fact that Newmeyer has thrown this suggestion into the ring, with arguments that can be taken seriously, is of great value. He has shown that it is in fact possible to advance sensible arguments relating to what might previously have been thought to have been an unanswerable question.

Appendix on syntactic notation

Speakers of a language have mental representations of the grammatical structure of sentences in that language. While ordinary native speakers clearly do not, consciously or otherwise, store such scholarly grammatical labels as `Noun', `Verb' and `CP' in their heads (any more than electrons carry little labels on them saying `ELECTRON'), the organization of a person's knowledge of her language in the brain is in terms of such differentiated categories. Speakers mentally represent sentences in their language as partitioned into non-arbitrary chunks, each with its own characteristic contribution to the meaning and overall wellformedness of the sentence. In the notation of Lightfoot's examples, such chunks (`constituents') are marked off by matching left and right square brackets, sometimes labelled with a subscript indicating the structural type of the chunk, e.g. CP[ that Kim hit Tim ] . The structural type of a constituent is crucial to the meaning and well-formedness of the sentence in which it occurs. The wrong type of constituent in the wrong place can be a reason for ungrammaticality.

A person's knowledge of her language includes both positive and negative knowledge --- knowledge of what is a wellformed expression in the language, and also knowledge of what couldn't be a wellformed expression in the language. Linguists argue on the grounds of both wellformed and hypothetical ill-formed examples. Illformed examples are examples that a native speaker, on due reflection, judges to be so. A star, or asterisk, prefixed to an example indicates that the string of words so prefixed is illformed in the language in question. In linguistic argumentation, sets of asterisked and un-asterisked examples are typically juxtaposed and the difference in wellformedness thus indicated is attributed to the effect of some general (`universal') principle of grammatical structure applying to a minimal structural difference between the examples. When reading syntactic arguments, you need to slow down a bit, but it's not as bad as maths. Hint to non-linguists: for each set of juxtaposed examples, identify the minimal differences between asterisked and non-asterisked examples, and relate this to the surrounding discussion.

A speaker's knowledge of her language is abstract enough to include elements which have no phonetic realization. Linguists postulate these in order to give the most general account of the whole (infinite) set of judgements that a speaker can make about strings of words. In Lightfoot's article, such invisible elements are variously indicated by: GAP, 0 (zero), and e (mnemonic for `empty'). To be reminded of the justification for such invisible elements, a simple exercise is to take a short substring containing them out of the whole example, and consider whether, on its own, such a substring would be wellformed. For instance, one of Lightfoot's examples is:

Who_i do you think [ e_i Ray saw e_i ]?

Reading this whole sentence aloud (without the brackets and subscripted e's, as they are inaudible elements of structure), it can be heard that this is a perfectly commonplace English question sentence. But what is the point of the e_i's? Consider just the last e_i in this example. Try saying just the last two words on their own, i.e. *Ray saw. This could be expected to be wellformed, since it forms a proper constituent of the larger sentence (marked as such by the square brackets around it), but, on its own, *Ray saw is, at best, elliptical. Ray saw what?, we are inclined to ask. In the larger sentence Who do you think Ray saw?, this elliptical feeling does not arise, because we, as native English speakers, know that the `gap' after the verb saw is legitimated by the sentence-initial who.
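For readers who find it easier to see the notation handled mechanically, the following sketch separates the audible string from the inaudible structure in an example of this kind. It is only an illustration under stated assumptions: subscripts are written with an underscore, tokens (including brackets) are separated by spaces, and the hypothetical function analyse is not part of any linguistic toolkit.

```python
def analyse(example):
    """Split a bracketed, co-indexed example into what is spoken and what is not.

    Returns the audible word string and, for each index, the overt word that
    legitimates the gaps together with the number of empty elements it binds.
    """
    spoken = []
    chains = {}
    for token in example.split():
        if token in ("[", "]"):  # brackets mark constituents; they are inaudible
            continue
        word, _, index = token.partition("_")
        if index:
            chain = chains.setdefault(index, {"antecedent": None, "gaps": 0})
            if word == "e":
                chain["gaps"] += 1          # an empty element
            else:
                chain["antecedent"] = word  # the overt, legitimating word
        if word != "e":
            spoken.append(word)
    return " ".join(spoken), chains


spoken, chains = analyse("Who_i do you think [ e_i Ray saw e_i ]")
print(spoken)  # Who do you think Ray saw
print(chains)  # {'i': {'antecedent': 'Who', 'gaps': 2}}
```

The output makes the point of the exercise explicit: the two empty elements share an index with who, and it is that shared index which records the dependency between the sentence-initial word and the gap after saw.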

Newmeyer's arguments also include examples with such empty categories. His Japanese examples (17) and (19) are probably particularly hard for a Western non-linguist to disentangle. The first thing to note is that in Japanese, as in most SOV languages, relative clauses precede their head nouns. So, for example, the noun phrase the man who owns a dog would in Japanese look more like a quasi-English string *dog-own man. Furthermore, such preposed relative clauses, as Newmeyer's example shows, can be used to express meanings that in English cannot be expressed with parallel postposed relative clause constructions. Newmeyer's example is parallel to something like the man, who owns the dog that barked, but which comes out in Japanese like the quasi-English *e_i owns dog barked man, where the `gap' shown by e_i is co-referential with man. Finally, the relationship between the invisible element and its legitimating word elsewhere in the sentence is often referred to by linguists, as by Lightfoot here, as `movement', as if the legitimating element had once been in the place of the invisible marker, and migrated, leaving a `trace' behind it. The relationship between the `moved' element and its `original' location is indicated by giving them common subscripts.


References

Batali, John, 1998 ``Computational simulations of the emergence of grammar''. In Approaches to the evolution of language: Social and cognitive bases, edited by James R Hurford, Michael Studdert-Kennedy and Chris Knight, 405-426, Cambridge University Press.
Briscoe, Ted, (ed.) forthcoming, Linguistic Evolution through Language Acquisition: Formal and Computational Models, Cambridge University Press.
Brown, E. Keith and Jim Miller (eds), 1996 Concise Encyclopedia of Syntactic Theories, Pergamon (Elsevier Science Ltd.), Oxford.
Hurford, James R, forthcoming ``Expression/induction models of language evolution: dimensions and issues'', to appear in Linguistic Evolution through Language Acquisition: Formal and Computational Models, edited by Ted Briscoe, Cambridge University Press.
Kirby, Simon, forthcoming, ``Learning, Bottlenecks and the Evolution of Recursive Syntax'', to appear in Linguistic Evolution through Language Acquisition: Formal and Computational Models, edited by Ted Briscoe, Cambridge University Press.
Lightfoot, David, 1991 ``Subjacency and Sex'', Language and Communication, 11:67-69.
Lock, Andrew, and Charles Peters, 1996 Handbook of Human Symbolic Evolution, Clarendon Press, Oxford.
Newmeyer, Frederick, 1991 ``Functional Explanation in Linguistics and the Origin of Language'', Language and Communication, 11:3-28.
Pinker, Steven, and Paul Bloom, 1990 ``Natural Language and Natural Selection'', Behavioral and Brain Sciences, 13:707-784.