2 March 2010

Updating the Scottish accent map: preliminary formant data from the VOYS corpus

Catherine Dickie (QMU), Christoph Draxler (LMU), Felix Schaeffler (QMU) and Klaus Jänsch (LMU)

The VOYS (Voices of Young Scots) corpus is a speech database of Scottish English spoken by adolescents aged 13 to 18. The recordings were made in Scottish secondary schools between autumn 2008 and autumn 2009, using the web-based WikiSpeech system (Draxler & Jänsch, 2008). All recordings comprise two channels (a Beyerdynamic Opus 54.16/3 headset microphone and an AT3031 table microphone), recorded digitally at 22 kHz and 16 bit.

Data collection has been completed in seven secondary schools in six Scottish locations: Inverness, Aberdeen, Edinburgh (two schools), Jedburgh, Ayr, and Dumfries. Participating schools were selected using the socio-economic indicator of “free school meals”: we chose schools that were close to the local average for this indicator and thus likely to be representative of the area.

Currently, the database contains recordings of 175 speakers, with more than 16,800 utterances and a total recording duration of approximately 30 hours.

The speech material consists of:

1. scripted speech
a) read words and sentences targeting certain sociophonetic aspects of Scottish English;
b) read digits, numbers, currency, date and time expressions, and phonetically rich sentences (this part of the corpus is SpeechDat-compatible; Höge et al., 1999);
c) a read story (the Dog and Duck story; Brown & Docherty, 1995)
2. unscripted speech
This part consists of spontaneous speech, elicited by questions such as “please describe your way to school” or by asking speakers to describe pictures showing dramatic events.

For the analysis presented here, the words and sentences from category 1a were used. There are currently about 7,500 items in this category. These items were automatically segmented and labelled using the Munich Automatic Segmentation System (MAUS; Schiel, 2004). We will present F1 and F2 values for nine Scottish English monophthongs as they are realised in each of the five locations, and discuss the implications for the role of vowel variation in distinguishing regional varieties in Scotland.
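
As an illustration of how per-location formant values of this kind might be summarised, the following minimal Python sketch averages F1 and F2 for each combination of location and vowel. It is not the procedure used for the VOYS analysis: the tuple layout, the vowel labels, the function name `mean_formants` and all numeric values are assumptions invented for the example, and a real analysis would start from the formant measurements taken within the automatically segmented vowel intervals.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (location, vowel label, F1 in Hz, F2 in Hz).
# The values are invented for illustration; they are not VOYS data.
measurements = [
    ("Aberdeen", "i", 300.0, 2300.0),
    ("Aberdeen", "i", 320.0, 2250.0),
    ("Aberdeen", "a", 750.0, 1400.0),
    ("Ayr", "i", 310.0, 2200.0),
    ("Ayr", "a", 700.0, 1500.0),
    ("Ayr", "a", 720.0, 1450.0),
]


def mean_formants(rows):
    """Return {(location, vowel): (mean F1, mean F2)} for the given rows."""
    grouped = defaultdict(list)
    for location, vowel, f1, f2 in rows:
        grouped[(location, vowel)].append((f1, f2))
    return {
        key: (mean(f1 for f1, _ in values), mean(f2 for _, f2 in values))
        for key, values in grouped.items()
    }


summary = mean_formants(measurements)

# Print one line per location/vowel pair, sorted for stable output.
for (location, vowel), (f1, f2) in sorted(summary.items()):
    print(f"{location:10s} {vowel}  F1={f1:6.1f} Hz  F2={f2:6.1f} Hz")
```

Plotting the resulting means in an F1/F2 plane, one point per vowel and location, would then give a direct visual comparison of the vowel spaces across locations.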

A first release of the VOYS corpus with these recordings will be made available in early to mid 2010.
