LEC talk Friday 2nd October: Ted Gibson

By jon | September 24, 2015

NOTE UNUSUAL DAY, TIME, AND LOCATION
Friday 2nd October, 14:00–15:30
Informatics Forum, 4.31/4.33

Ted Gibson, Department of Brain and Cognitive Sciences, MIT

Information theoretic approaches to language universals

Finding explanations for the observed variation in human languages is the primary goal of linguistics, and promises to shed light on the nature of human cognition. One particularly attractive set of explanations is functional in nature, holding that language universals are grounded in the known properties of human information processing. The idea is that grammars of languages have evolved so that language users can communicate using sentences that are relatively easy to produce and comprehend. In this talk, I summarize results from explorations into several linguistic domains, from an information-processing point of view.

First, we show that all the world’s languages that we can currently analyze minimize syntactic dependency lengths to some degree, as would be expected under information processing considerations. Next, we consider communication-based origins of lexicons and grammars of human languages. Chomsky has famously argued that this is a flawed hypothesis, because of the existence of such phenomena as ambiguity. Contrary to Chomsky, we show that ambiguity out of context is not only not a problem for an information-theoretic approach to language, it is a feature. Furthermore, word lengths are optimized on average according to predictability in context, as would be expected under an information-theoretic analysis. Then we show that language comprehension appears to function as a noisy channel process, in line with communication theory. Given si, the intended sentence, and sp, the perceived sentence, we propose that people maximize P(si | sp), which is equivalent to maximizing the product of the prior P(si) and the likely noise processes P(si → sp). We discuss how thinking of language as communication in this way can explain aspects of the origin of word order, most notably that most human languages are SOV with case-marking, or SVO without case-marking.
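
The noisy-channel proposal in the abstract can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the speaker's model: the candidate sentences, the prior values, the word-level edit-distance noise model, and the names `word_edit_distance`, `noise_likelihood`, `best_interpretation`, and `edit_prob` are all hypothetical choices made for illustration. The noise score is unnormalized, which is enough for comparing candidates against each other.

```python
# Toy noisy-channel comprehension: choose the intended sentence s_i that
# maximizes P(s_i) * P(s_i -> s_p) for a perceived sentence s_p.
# All priors and the noise model are made-up illustrative assumptions.

def word_edit_distance(a, b):
    """Levenshtein distance between two word lists (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete a word from a
                            curr[j - 1] + 1,      # insert a word into a
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def noise_likelihood(intended, perceived, edit_prob=0.05):
    """Hypothetical noise score: each word-level edit costs a factor edit_prob."""
    distance = word_edit_distance(intended.split(), perceived.split())
    return edit_prob ** distance

def best_interpretation(perceived, prior):
    """Return the candidate with the highest prior * noise score, plus all scores."""
    scores = {s: p * noise_likelihood(s, perceived) for s, p in prior.items()}
    return max(scores, key=scores.get), scores

# Assumed priors: the version with "to" is taken to be far more probable.
PRIOR = {
    "the mother gave the candle to the daughter": 0.99,
    "the mother gave the candle the daughter": 0.01,
}

perceived = "the mother gave the candle the daughter"
best, scores = best_interpretation(perceived, PRIOR)
print(best)
```

With these assumed numbers, the candidate that differs from the perceived string by one deleted word still wins, because its much higher prior outweighs the cost of a single edit. Raising the prior of the literal string or lowering `edit_prob` flips the outcome, which is the trade-off the product P(si) · P(si → sp) expresses.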