11 December: Fiona Kirton

Animacy and word order choice in an artificial language learning experiment

Fiona Kirton (CLE, University of Edinburgh)

Tuesday, 11 December
11:30am – 12:30pm
3.10, DSB

A number of improvised gesture studies have investigated how constituent order preferences are influenced by the animacy of the entities interacting in an event. Results from these studies suggest that, when people use improvised gesture to describe events involving an animate agent and inanimate patient (so-called non-reversible events), they show a preference for SOV. In contrast, some early studies find that when people describe events in which both the agent and patient are animate (reversible events), they show a preference for SVO. Data from other studies, however, suggest a more complex picture in relation to reversible events. In this talk, I will discuss competing hypotheses that have been proposed by different authors to account for the observed word order patterns in each study. I will focus in particular on two. The first, the noisy channel hypothesis (Gibson et al., 2013), is an information-theoretic account that proposes that verb-medial orders maximise message recoverability when both entities are animate and could plausibly fulfil the role of agent or patient. The second, which I term the cognitive saliency hypothesis (Meir et al., 2017), suggests that more cognitively basic entities are expressed first, resulting in a preference for SOV when the agent is higher on the animacy scale than the patient, and an approximately even distribution across SOV and OSV when both entities are animate. In contrast to Gibson et al., Meir et al. argue that the preference among some groups for SVO when describing reversible events can be explained in terms of interference from another linguistic system. In this talk, I will discuss an artificial language learning experiment designed to test these two hypotheses. I will present initial results from this study and discuss design ideas for a follow-up experiment.
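The noisy channel intuition can be made concrete with a toy model (my own illustration, not Gibson et al.'s actual formalisation): assume a reversible event in which both nouns are animate and therefore indistinguishable in category, and suppose the channel may delete one noun. A word order supports role recovery only if losing the subject and losing the object leave different surface strings:

```python
def surface(words):
    """Collapse words to categories: after noise, the listener sees only
    that a surviving word is a noun (N) or a verb (V), not which role it bore."""
    return tuple("V" if w == "V" else "N" for w in words)

def recoverable(order):
    """For a reversible event, the surviving noun's role is recoverable
    iff deleting S and deleting O yield distinct surface strings."""
    drop_s = surface([w for w in order if w != "S"])
    drop_o = surface([w for w in order if w != "O"])
    return drop_s != drop_o

for order in [("S", "O", "V"), ("S", "V", "O"), ("O", "S", "V"), ("V", "S", "O")]:
    print("".join(order), recoverable(order))
```

In this toy setting only the verb-medial orders keep the survivor's role recoverable (a lone preverbal noun in SOV or OSV is ambiguous between agent and patient), which is the intuition behind the predicted SVO preference for reversible events.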

4 December: Stella Frank

Machine Learning (re)discovers Language Evolution

Stella Frank (CLE, University of Edinburgh)

Tuesday, 4 December
11:30am – 12:30pm
1.17, DSB

I’ll give an informal introduction to, and overview of, recent work in the ML/NLP space on “emergent language”, featuring agents based on neural networks, trained using reinforcement learning, interacting in increasingly realistic worlds. I hope to provoke a discussion on how this line of work relates to what’s being done at the CLE, and whether/how we can add to it.

27 November: Hannah Rohde

Why are you telling me this? Information content and form

Hannah Rohde (University of Edinburgh)

Tuesday, 27 November
11:00am – 12:30pm
G.32, 7 George Square

Models of discourse posit that communication is guided by estimates of message relevance and informativity. This talk considers production and comprehension together. I present a set of psycholinguistic studies that test how speakers make decisions about what content to convey and what forms to use, and how listeners reverse-engineer the speaker’s intention. One of my primary questions is about coreference and the selection of possible meanings and forms. I use Bayes Rule to characterise this interplay between, on the one hand, speakers’ choices of who to mention and what referential form to use and, on the other hand, listeners’ recovery of the intended referent given a potentially ambiguous referring expression. Choice of mention can be shown to reflect how the discourse is expected to cohere; choice of referring expression depends on the determination of what information is useful to include. The latter choice can be shown to vary with the complexity of the context in which the referring expression will be interpreted. Beyond reference, interlocutors likewise have biases about what messages are probable and what surface forms are typically used to realize such messages. I present work testing possible repercussions of the inclusion of information that is wholly or partly predictable from the discourse context. The results highlight how listeners capitalize on speakers’ (appropriate) redundancy in communication.
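The Bayesian interplay between production and comprehension described above can be sketched with a toy computation (the probabilities here are invented for illustration and are not from the studies in the talk): the listener's posterior over referents combines a prior over who is likely to be mentioned next with the speaker's probability of choosing a given referring expression for each referent.

```python
def listener_posterior(prior, speaker, expression):
    """Bayes' rule: P(referent | expression) is proportional to
    P(expression | referent) * P(referent)."""
    scores = {r: prior[r] * speaker[r].get(expression, 0.0) for r in prior}
    z = sum(scores.values())
    return {r: s / z for r, s in scores.items()}

# invented toy numbers: the previous subject is the expected next mention,
# and speakers pronominalise a prominent referent more readily
prior = {"subject": 0.7, "object": 0.3}
speaker = {
    "subject": {"pronoun": 0.8, "name": 0.2},
    "object":  {"pronoun": 0.4, "name": 0.6},
}

posterior = listener_posterior(prior, speaker, "pronoun")
print(posterior)  # the ambiguous pronoun resolves mostly to the subject
```

The point of the sketch is the division of labour: the prior captures expectations about who will be mentioned, the likelihood captures form choice, and an ambiguous pronoun is resolved by their product.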

6 November: Andres Karjus

Challenges in detecting evolutionary forces in language change using diachronic corpora

Andres Karjus (Centre for Language Evolution, University of Edinburgh)

Tuesday, 6 November
11:30am – 12:30pm
G.32, 7 George Square

Newberry et al. (Detecting evolutionary forces in language change, Nature 551, 2017) tackle an important but difficult problem in linguistics, the testing of selective theories of language change against a null model of drift. They use the Frequency Increment Test (FIT), an application of the t-test to detect signatures of selection in experimental and diachronic data. Having applied the test to a number of relevant examples, they suggest stochasticity has a previously under-appreciated role in language evolution. They also infer the effective population size and show that the strength of drift correlates inversely with corpus frequencies, echoing the analogous observation about small populations in genetics. We replicate their results based on the application of the FIT and find that while the overall observation of the prevalence of drift holds, results produced by this approach on individual time series are highly sensitive to how the corpus is organized into temporal segments (binning). We further investigate the properties of the FIT by using a large and controlled set of time series simulations to systematically explore the range of possible applicability of the test and the artefacts introduced by the binning protocol. The approach proposed by Newberry et al. provides a systematic way of generating hypotheses about language change and broad generalizations in a sample of time series, marking another step forward in research on large scale linguistic data with a deep diachronic dimension. However, we argue that along with the possibilities, the limitations of the approach need to be appreciated. Caution should be exercised with interpreting the results of the FIT (or a similar test) on individual corpus-based linguistic time series, given the limitations of the test, demonstrable bias in certain scenarios, as well as fundamental differences between genetic and linguistic data.
Slides, code and preprint available at: https://andreskarjus.github.io
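The FIT and its sensitivity to binning can be sketched as follows (a simplified illustration, not Newberry et al.'s or Karjus's actual code; the simulated series and bin counts are invented). The test rescales frequency increments so that, under pure drift, they should be mean-zero, and applies a one-sample t-test; rebinning the same series can change the resulting statistic:

```python
import math
import random
from statistics import mean, stdev

def fit_t(freqs, times):
    """Frequency Increment Test statistic: rescaled increments should be
    mean-zero under pure drift; a one-sample t-statistic tests this."""
    y = []
    for i in range(1, len(freqs)):
        v0, v1 = freqs[i - 1], freqs[i]
        dt = times[i] - times[i - 1]
        if 0.0 < v0 < 1.0:  # the increment is undefined at the boundaries
            y.append((v1 - v0) / math.sqrt(2 * v0 * (1 - v0) * dt))
    return mean(y) / (stdev(y) / math.sqrt(len(y)))

def bin_series(counts, totals, n_bins):
    """Collapse a yearly count series into n_bins temporal bins."""
    size = len(counts) // n_bins
    freqs = [sum(counts[b * size:(b + 1) * size]) / sum(totals[b * size:(b + 1) * size])
             for b in range(n_bins)]
    return freqs, list(range(n_bins))

# simulate a neutrally drifting variant: each year's sampling probability
# is the previous year's sample frequency (a simple Wright-Fisher analogue)
rng = random.Random(0)
p, counts, totals = 0.5, [], []
for _ in range(100):
    c = sum(rng.random() < p for _ in range(500))
    counts.append(c)
    totals.append(500)
    p = min(max(c / 500, 0.01), 0.99)  # keep the walk away from fixation

# the same drifting series yields different statistics under different binnings
for n_bins in (5, 10, 25):
    f, t = bin_series(counts, totals, n_bins)
    print(n_bins, round(fit_t(f, t), 2))
```

A steadily rising series produces a large positive t-statistic (a selection-like signature), while the drifting series above typically does not; varying `n_bins` illustrates the binning sensitivity discussed in the abstract.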