Efficient meanings for numerals
Chris Cummins (University of Edinburgh)
Tuesday 12 February
G.32, 7 George Square
The use of number in natural language gives rise to various ambiguities that are difficult to characterise precisely: should reference to “200 people” be understood to invoke an exact interpretation, a lower bound, an upper bound, an approximate interpretation, or some combination of these? In practical terms, this is potentially consequential because of how numerical quantity information feeds into our decision-making. In this talk I aim to explore how subtleties of number interpretation bear upon our subsequent reasoning, but also what governs our interpretative decisions at a more abstract level: does the meaning of number reflect rational principles about how we should use simple signals to convey complex information?
Frequency, stability, and regularity in language evolution
Christine Cuskley (Centre for Language Evolution, University of Edinburgh)
Tuesday 29 January
G.32, 7 George Square
Highly frequent linguistic units are more stable over time: for example, highly frequent words are more robust against change than lower-frequency words. This trend has a functional explanation: forms with high usage frequency are less free to vary, because variation in them is more likely to cause communicative failure. This is analogous to the dynamics of purifying and stabilizing selection in biology: traits with acute survival relevance show strong selection against deleterious alleles (purifying selection), resulting in less variation across the population. This talk will focus on analogous frequency–stability dynamics in language, using agent-based models and experiments which examine (ir)regularisation behaviours in native and non-native speakers of English. Stability in linguistic form across a population is favoured particularly for high-frequency meanings, but the strength of this effect is mediated by dynamic properties of the population.
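The claimed dynamic can be illustrated with a deliberately minimal model (my own sketch, not Cuskley's actual simulation): speakers shift toward a meaning's majority variant at a rate proportional to how often that meaning is used, since communicative failure is costlier for frequent meanings.

```python
# Minimal deterministic sketch (illustrative, not the talk's model):
# a population uses two variant forms for a meaning; each round, usage
# pushes speakers toward the majority variant, and the push is stronger
# for meanings that are used more often.

def converge(p0, usage_freq, rounds=50, rate=0.2):
    """Return the majority-variant proportion after repeated use."""
    p = p0
    for _ in range(rounds):
        p += rate * usage_freq * p * (1 - p)  # selection toward the majority form
    return p

# A high-frequency meaning regularises much faster than a rare one,
# so it shows less residual variation -> more stability over time.
high = converge(0.6, usage_freq=1.0)
low = converge(0.6, usage_freq=0.1)
print(high > low)
```

The logistic-style update is just the simplest way to encode "frequent use amplifies selection against minority variants"; any model with that property shows the same qualitative frequency–stability link.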
Do nouns and verbs exist? Cross-linguistic categories as property clusters
Matt Spike (Centre for Language Evolution, University of Edinburgh)
Tuesday 22 January
G.32, 7 George Square
Can we define linguistic categories in a way which accounts for cross-linguistic variation? Not according to Haspelmath (2007, 2010, 2018), who argues that because there are no natural kinds in language (like ‘gold’ in chemistry or ‘species’ in biology), we should instead make a fundamental distinction between descriptive and comparative linguistic categories. As neither of these is any good for making meaningful generalizations, we should abandon most general use of notions like ‘noun’, ‘verb’, and ‘adjective’, which is bad news for the language sciences. On the other hand, despite categories like ‘species’ and ‘gene’ in biology having all sorts of similar problems, biologists seem to be doing all sorts of science which refers to them, so there may still be hope. In this talk, I first look at how problematic categories have been dealt with in philosophy and the sciences: fuzzy categories are fine, as long as they have clusters of properties which are somewhat consistent over time and space, and as long as they can play a successful role in explanation and prediction. I then ask whether this is true of linguistic categories: using data from the Universal Dependencies treebank, I show that linguistic categories do differentiate themselves into clusters of properties both within and across languages, and that categorical labels predict features of individual languages. Finally, I look at some cognitive and evolutionary implications of this perspective on linguistic categories.
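As a toy illustration of the property-cluster idea (my own sketch; the talk's actual analysis uses the Universal Dependencies treebank), suppose each (language, category) pair is scored on a few hypothetical distributional properties. If categories are real property clusters, same-label items should sit closer together than different-label items, even across languages:

```python
# Toy property-cluster check (invented numbers, not UD data): each
# (language, category) pair is a vector of hypothetical properties,
# e.g. [takes determiners, takes tense marking, heads a subject NP].
from itertools import combinations

data = {
    ("lang1", "noun"): [0.90, 0.10, 0.80],
    ("lang2", "noun"): [0.80, 0.20, 0.90],
    ("lang3", "noun"): [0.85, 0.15, 0.70],
    ("lang1", "verb"): [0.10, 0.90, 0.20],
    ("lang2", "verb"): [0.20, 0.80, 0.10],
    ("lang3", "verb"): [0.15, 0.95, 0.30],
}

def dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def mean_dist(pairs):
    ds = [dist(data[a], data[b]) for a, b in pairs]
    return sum(ds) / len(ds)

same = [(a, b) for a, b in combinations(data, 2) if a[1] == b[1]]
diff = [(a, b) for a, b in combinations(data, 2) if a[1] != b[1]]
within, across = mean_dist(same), mean_dist(diff)

# Same-label items cluster tightly across languages -> the label is a
# useful (if fuzzy) cross-linguistic category.
print(within < across)
```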
Redundancy facilitates learning of grammatical gender and agreement (sometimes)
Kenny Smith (Centre for Language Evolution, University of Edinburgh)
Tuesday 15 January
G.32, 7 George Square
I am interested in whether more complex grammars (which we should generally expect to be harder to learn) might in fact facilitate learning of certain kinds of grammatical relationship. I will present some very preliminary work looking at grammatical gender and agreement. Grammars which have gender (division of nouns into multiple classes) and agreement (systematic covariance between formal properties of two elements) are more complex than those that lack these features; nonetheless, these features are widespread in the languages of the world. This could just reflect the ubiquity of the historical processes by which these systems form, but it’s also worth entertaining the possibility that there are functional pressures at play. It’s been suggested that these features can facilitate processing (Dye et al., 2018, Topics in Cognitive Science, https://onlinelibrary.wiley.com/doi/full/10.1111/tops.12316); I am interested in potential learnability advantages. I’ll present some preliminary experimental work (with Jane Bockmühl and Jenny Culbertson) looking at whether gender agreement within the noun phrase facilitates the learning of grammatical gender, and some neural network models showing similar benefits of redundant marking of grammatical gender. Using the same modelling framework, I’ll also show some preliminary results suggesting that redundant marking of a feature (number or gender) within the noun phrase sometimes facilitates learning of agreement based on that feature between the NP and another constituent (i.e. learning subject-verb agreement for number is easier if number is marked in several places within the NP). This is so far all slightly circular: if you have to learn agreement, having more of that agreement helps, but why bother with agreement in the first place?
I’ll therefore also present some even more preliminary results trying to break out of that circularity, exploring whether subject-verb agreement on one feature facilitates learning a different dependency between the same elements (e.g. does number or gender agreement between subject and verb make it easier to learn selectional restrictions between a verb and its arguments?). In case it’s not clear from all the uses of ‘preliminary’ above, this is very much work in the early stages, so input or advice to desist would be appreciated.
Animacy and word order choice in an artificial language learning experiment
Fiona Kirton (CLE, University of Edinburgh)
Tuesday, 11 December
11:30am – 12:30pm
A number of improvised gesture studies have investigated how constituent order preferences are influenced by the animacy of the entities interacting in an event. Results from these studies suggest that, when people use improvised gesture to describe events involving an animate agent and inanimate patient (so-called non-reversible events), they show a preference for SOV. In contrast, some early studies find that when people describe events in which both the agent and patient are animate (reversible events), they show a preference for SVO. Data from other studies, however, suggest a more complex picture in relation to reversible events. In this talk, I will discuss competing hypotheses that have been proposed by different authors to account for the observed word order patterns in each study. I will focus in particular on two. The first, the noisy channel hypothesis (Gibson et al., 2013), is an information-theoretic account proposing that verb-medial orders maximise message recoverability when both entities are animate and could plausibly fulfil the role of agent or patient. The second, which I term the cognitive saliency hypothesis (Meir et al., 2017), suggests that more cognitively basic entities are expressed first, resulting in a preference for SOV when the agent is higher on the animacy scale than the patient, and an approximately even distribution across SOV and OSV when both entities are animate. In contrast to Gibson et al., Meir et al. argue that the preference among some groups for SVO when describing reversible events can be explained in terms of interference from another linguistic system. I will then present an artificial language learning experiment designed to test these two hypotheses, report initial results from this study, and discuss design ideas for a follow-up experiment.
Machine Learning (re)discovers Language Evolution
Stella Frank (CLE, University of Edinburgh)
Tuesday, 4 December
11:30am – 12:30pm
I’ll give an informal introduction to, and overview of, recent work in the ML/NLP space on “emergent language”, featuring agents based on neural networks, trained using reinforcement learning, interacting in increasingly realistic worlds. I hope to provoke a discussion on how this line of work relates to what’s being done at the CLE, and whether/how we can add to it.
Why are you telling me this? Information content and form
Hannah Rohde (University of Edinburgh)
Tuesday, 27 November
11:00am – 12:30pm
G.32, 7 George Square
Models of discourse posit that communication is guided by estimates of message relevance and informativity. This talk considers production and comprehension together. I present a set of psycholinguistic studies that test how speakers make decisions about what content to convey and what forms to use, and how listeners reverse-engineer the speaker’s intention. One of my primary questions concerns coreference and the selection of possible meanings and forms. I use Bayes’ Rule to characterise the interplay between, on the one hand, speakers’ choices of who to mention and what referential form to use and, on the other hand, listeners’ recovery of the intended referent given a potentially ambiguous referring expression. Choice of mention can be shown to reflect how the discourse is expected to cohere; choice of referring expression depends on the determination of what information is useful to include. The latter choice can be shown to vary with the complexity of the context in which the referring expression will be interpreted. Beyond reference, interlocutors likewise have biases about what messages are probable and what surface forms are typically used to realize such messages. I present work testing possible repercussions of including information that is wholly or partly predictable from the discourse context. The results highlight how listeners capitalize on speakers’ (appropriate) redundancy in communication.
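The Bayesian interplay described here can be sketched as follows (illustrative numbers, not Rohde's experimental estimates): the listener's interpretation P(referent | form) combines the speaker's production model P(form | referent) with a prior over who will be mentioned next.

```python
# Hedged sketch of the Bayesian view of reference (toy probabilities):
# the listener inverts the speaker's production model via Bayes' Rule.

priors = {"subject": 0.7, "object": 0.3}          # next-mention bias P(referent)
production = {                                     # P(form | referent)
    "subject": {"pronoun": 0.6, "name": 0.4},
    "object":  {"pronoun": 0.5, "name": 0.5},
}

def interpret(form):
    """P(referent | form) ∝ P(form | referent) * P(referent)."""
    joint = {r: priors[r] * production[r][form] for r in priors}
    z = sum(joint.values())
    return {r: p / z for r, p in joint.items()}

# An ambiguous pronoun is resolved mostly toward the referent favoured
# by the next-mention prior, weighted by how likely each referent is
# to be pronominalised.
print(interpret("pronoun"))
```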
Challenges in detecting evolutionary forces in language change using diachronic corpora
Andres Karjus (Centre for Language Evolution, University of Edinburgh)
Tuesday, 6 November
11:30am – 12:30pm
G.32, 7 George Square
Newberry et al. (Detecting evolutionary forces in language change, Nature 551, 2017) tackle an important but difficult problem in linguistics, the testing of selective theories of language change against a null model of drift. They use the Frequency Increment Test (FIT), an application of the t-test to detect signatures of selection in experimental and diachronic data. Having applied the test to a number of relevant examples, they suggest stochasticity has a previously under-appreciated role in language evolution. They also infer the effective population size and show that the strength of drift correlates inversely with corpus frequencies, echoing the analogous observation about small populations in genetics. We replicate their results based on the application of the FIT and find that while the overall observation of the prevalence of drift holds, results produced by this approach on individual time series are highly sensitive to how the corpus is organized into temporal segments (binning). We further investigate the properties of the FIT by using a large and controlled set of time series simulations to systematically explore the range of possible applicability of the test and the artefacts introduced by the binning protocol. The approach proposed by Newberry et al. provides a systematic way of generating hypotheses about language change and broad generalizations in a sample of time series, marking another step forward in research on large scale linguistic data with a deep diachronic dimension. However, we argue that along with the possibilities, the limitations of the approach need to be appreciated. Caution should be exercised with interpreting the results of the FIT (or a similar test) on individual corpus-based linguistic time series, given the limitations of the test, demonstrable bias in certain scenarios, as well as fundamental differences between genetic and linguistic data.
Slides, code and preprint available at: https://andreskarjus.github.io
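As a rough sketch of how the FIT works (my reading of Newberry et al.'s description, not their reference implementation): rescale each frequency increment by its expected standard deviation under drift, then test whether the rescaled increments have mean zero with a one-sample t-test.

```python
# Sketch of the Frequency Increment Test (FIT); details such as the
# exact rescaling follow my reading of Newberry et al. (2017), so treat
# this as illustrative rather than authoritative.
from statistics import mean, stdev

def fit_statistic(freqs, times):
    """t statistic over rescaled frequency increments; under pure
    drift the increments should have mean zero."""
    incs = []
    for (x0, t0), (x1, t1) in zip(zip(freqs, times),
                                  zip(freqs[1:], times[1:])):
        # Increment divided by its drift standard deviation.
        incs.append((x1 - x0) / (2 * x0 * (1 - x0) * (t1 - t0)) ** 0.5)
    return mean(incs) / (stdev(incs) / len(incs) ** 0.5)

# A variant rising steadily in frequency yields a large positive
# statistic, which Newberry et al. compare against a Student's t
# distribution with n-1 degrees of freedom.
rising = [0.2, 0.3, 0.45, 0.6, 0.75, 0.85]
print(fit_statistic(rising, list(range(6))))
```

The binning sensitivity discussed in the abstract enters through `freqs` and `times` themselves: regrouping the corpus into different temporal segments changes the increments, and hence the statistic, before the test is even run.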
Perspective-taking is spontaneous but not automatic: evidence from the dot perspective task
Cathleen O’Grady (Centre for Language Evolution, University of Edinburgh)
Tuesday, 16 October
11:30am – 12:30pm
G.32, 7 George Square
Data from a range of different experimental paradigms – in particular (but not only) the dot perspective task – have been interpreted as evidence that humans automatically track the perspective of other individuals. Results from other studies, however, have cast doubt on this interpretation. The issue remains unsettled in significant part because different schools of thought, with different theoretical perspectives, implement the experimental tasks in subtly different ways, making direct comparisons difficult. Here, we resolve these issues. In a series of experiments, we show that perspective-taking in the dot perspective task is not automatic (it is not purely stimulus-driven), but nor is it the product of simple behavioural rules that do not involve mentalizing. Instead, participants do compute the perspectives of other individuals rapidly, unconsciously and involuntarily, but only when attentional systems prompt them to do so (just as, for instance, the visual system puts external objects into focus only as and when required). This finding prompts us to distinguish spontaneous cognitive processes from automatic ones, and we suggest that spontaneous perspective taking may be a computationally efficient means of navigating the social world.