No prior programming expertise is required. However, familiarity with basic UNIX commands, X-windows, an editor (such as EMACS) and email will be indispensable. Competence in these basics can be acquired through (a) the department's own computer induction course early in Term I, and (b) practice.
The course has a practical and a theoretical component.
In the practical component, we will spend about three weeks each on GDE, KIMMO, and corpora. For
the first two of these, students will be expected to compose working
sets of rules which cover interesting linguistic data (of their own
choice, to be negotiated with Prof. Hurford). These rule-sets, and the
accompanying commentary, will constitute part of the assessable work of
the course (see below). The work on corpora will explore various
available corpora and discuss how they might be used. For the section
on corpora, the following book will be found useful:
McEnery, Tom, and Andrew Wilson, 1996, Corpus Linguistics,
Edinburgh Textbooks in Empirical Linguistics, Edinburgh University Press.
The theoretical component will consist of weekly lectures by Prof. Hurford, presenting a survey of mechanical techniques for parsing natural language sentences, in relation to the kinds of grammars specified by linguists. The survey will describe, in terms suitable to non-programmers, such concepts as bottom-up and top-down, deterministic and nondeterministic techniques, and will cover finite state machines, recursive transition networks, augmented transition networks, definite clause grammars, and chart parsers. There will also be substantial discussion of the application of these computational techniques to tasks such as speech recognition, morphological analysis, machine translation, and automatic question answering. Reading for this part of the course will be assigned from week to week.
To be arranged.
Assessment will be by two small practical projects and an essay, each worth one third of the assessment for the course.
The deadlines for these pieces of work will be: