A Linguistic Atlas of Early Middle English (LAEME): Overview

1. What is in LAEME?
1.1 The principal parts

There are four types of resources in LAEME: (1) a corpus of ‘Tagged Texts’ (CTT); (2) an ‘Index of Sources’ containing records which describe the sources of the tagged texts; (3) a set of maps which display the geographical distributions of a series of linguistic features attested in the texts; and (4) a set of documents, including an ‘Introduction’ to LAEME.

1.2. Arrangement of the Website

1.2.1 Main page
The main window of the LAEME web page comprises two frames. Down the left-hand side of the screen is the Task List. This is a list of links for the different functions of LAEME. The larger part of the screen, to the right of this, is the Task Area. Each frame allows for separate scrolling if the window is zoomed.

1.2.2 Task List
The ‘Front Page’ link, first link at the top of the Task List, will take you to the LAEME ‘Front Page’. Here are to be found links to ‘About LAEME’, ‘Copyright’, ‘Citation’ of LAEME and ‘Funders’. The next link is to this ‘Overview’, which you are reading. Next is ‘Fonts & Browsers’, which gives important information about (a) fonts you will need to download to get full functionality and (b) browser issues. Below this under ‘Documents’ is a drop-down list that allows selection of one of the LAEME documents, including the chapters of the Introduction. A selected document is displayed in the Task Area. After ‘Documents’ follow the Tasks, which allow you to search for and retrieve the data.

Clicking on a Task’s link calls up a menu or form within the Task Area. Within each menu or form there is a symbol, |Task Key|. This provides explanations of the menu items or elements of the form, and the kind and format of the data to be retrieved. Placing your cursor on |Task Key| will cause the explanatory notes to appear. (Note that this is not a conventional ‘pop-up’ window, and it is not necessary to have ‘pop-up’ windows activated to see this.) Moving the cursor from |Task Key| will cause the explanations to disappear. The data retrieved through activating a Task - the result - will be displayed in the Task Area. Within a result there may be links to further data. For example, an entry retrieved from the Index of Sources may contain links to a corpus tagged text. Clicking a filename ending in ‘.tag’ will display the tagged text in a new tab or window. The original result will stay in the Task Area until replaced by the results of another action (a new Task is activated or the browser window is refreshed).

At the foot of the Task List are four links to resources beyond LAEME. The first is a link to the Corpus of Narrative Etymologies, which offers a historical linguistic explanation of, and commentary on, the text words and morphemes in the LAEME CTT. The second is a link to the ‘sister’ website, eLALME, an online version of A Linguistic Atlas of Late Mediaeval English, 1350-1450, revised and augmented. The third is a link to the Middle English Dictionary. The fourth link, ‘Other Resources’, calls up a set of further links to other online linguistic atlases and dictionaries relating to the languages of the British Isles in the late medieval period. (Be aware that the University of Edinburgh is not responsible for the content of the listed external internet sites nor for any consequences of using them.)

1.2.3 Task Area
Clicking on a link in the Task List will call up menus in the Task Area. Completing a menu may lead directly to results or to subsequent menus in the Task Area. Results of Tasks will normally be displayed in this part of the window also.

2. How to access the content of LAEME
2.1 Inside LAEME

2.1.1 The Index of Sources
This is a database of manuscript texts which are the sources of the transcribed and lexico-grammatically tagged texts in the LAEME CTT. The database is searchable by field: e.g. manuscript, date, text (including hands), localisation, script type, bibliography. The amount of information available for each source varies considerably, so search results will be affected accordingly. References to works used in the compilation of LAEME are electronically accessible from the individual entries in the Index of Sources. See further the Index of Sources |Task Key|.

2.1.2 Corpus Files
Both tagged text and text dictionary formats are accessible from this Task. See further the Corpus Files |Task Key|. Tagged texts and text dictionaries are also accessible from the individual entries in the Index of Sources.

2.1.2.1 Tagged Texts
A tagged text is a text that has been transcribed directly from a manuscript or from a facsimile of the manuscript (e.g. microfilm, photostat). The resulting disk text has then been passed through an interactive tagging program to assign to each word or morpheme in the text a description of the word or morpheme that seeks to capture its semantic and grammatical properties in the form of a ‘lexico-grammatical’ tag. For a full explanation of the LAEME tagging and the format of the tagged text, see Introduction, Chapter 4. Cf. also Grammel Commentary. Considerable editorial and textual commentary accompanies each tagged text. The corpus has provided the source material for all the related publications.

The Corpus of Tagged Texts is the core of LAEME; it contains the set of texts transcribed and lexico-grammatically tagged for LAEME and the Text Dictionaries derived from these. Each Tagged Text contains a transcription of an original manuscript text. In the running text each text-word is prefixed by a lexico-grammatical tag. For a full explanation of the LAEME transcription policy and internal format, see Introduction, Chapter 3, §§ 3.3.3 - 3.5. For the treatment of manuscript abbreviations, see §§ 3.4.5 - 7.

2.1.2.1.1 What is a lexico-grammatical tag?
A tag describes the meaning and / or grammatical category and / or grammatical function of a word or morpheme at the point in the text at which the particular instance of the word or morpheme occurs.

A lexico-grammatical tag comprises at least one of two elements: a lexel and / or a grammel. The start of a tag is marked by ‘$’, which is followed by the lexel. The start of the grammel is flagged by ‘/’. The tag ends with ‘_’, which ‘ties’ the full tag to its associated ‘form’.

A lexel is effectively a gloss in the form of the etymologically equivalent modern English word. Where there is no modern English equivalent, an Old English, Old Scandinavian or Middle English etymon is used as appropriate. Sometimes lexels carry further semantic annotations inside {}, for which see ‘Lexel Specifiers’ under ‘Documents’.

The grammel describes the grammatical category and / or function of the text-word at the point in the text where it occurs. This grammatical description follows what might be considered ‘traditional’ grammatical concepts and is intended to be agnostic with respect to any particular grammatical model or theory, whether currently in or out of vogue. The grammatical labels vary in transparency and are in some cases idiosyncratic. (Puzzlement may be dispelled by reference to the Grammel Commentary.)

Tags are grouped into types: Suffixes, Grammatical Words, Inflexions, Numbers, Lexis. Suffixes are endings like ‘-hood’, ‘-dom’, ‘-ly’. Grammatical Words are determiners, demonstrative adjectives and pronouns, personal pronouns, relative pronouns. These are distinguished from the general Lexis by not having a lexel: they are defined solely by the string in their grammel. Inflexions refer to plural endings (e.g. on nouns, adjectives and relative pronouns) and verbal inflexional endings (e.g. present tense indicative, strong past participle, present participle). Numbers are tagged separately from the general lexis and their lexemes are represented by digits. Lexis is anything else with a tag, and the tags in the Lexis will have both a lexel and a grammel.
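As a rough illustration of how the parts of a tag fit together, the following Python sketch splits a tagged token into lexel, lexel specifier, grammel and form. It assumes tokens of the shape $lexel{specifier}/grammel_FORM; the names TaggedToken and parse_tagged_token are invented for this example, the grammel ‘cj’ in the usage note is a placeholder rather than a confirmed LAEME grammel, and real corpus tokens may carry further markup that this sketch does not handle.

```python
import re
from typing import NamedTuple, Optional


class TaggedToken(NamedTuple):
    lexel: Optional[str]      # gloss-like element; absent for grammatical words
    specifier: Optional[str]  # semantic annotation found inside {} on the lexel
    grammel: Optional[str]    # grammatical description
    form: str                 # the manuscript spelling tied to the tag


def parse_tagged_token(token: str) -> TaggedToken:
    """Split an assumed '$lexel{spec}/grammel_FORM' token into its parts."""
    if not token.startswith("$"):
        raise ValueError(f"not a tagged token: {token!r}")
    # '_' ties the tag to its form; split on the first occurrence only.
    tag, sep, form = token[1:].partition("_")
    if not sep:
        raise ValueError(f"no '_' tying tag to form: {token!r}")
    # '/' opens the grammel; everything before it is the lexel part.
    lexel_part, _, grammel = tag.partition("/")
    match = re.fullmatch(r"([^{}]*)(?:\{([^{}]*)\})?", lexel_part)
    if match:
        lexel, specifier = match.group(1), match.group(2)
    else:
        # Unexpected brace layout: keep the lexel part untouched.
        lexel, specifier = lexel_part, None
    return TaggedToken(lexel or None, specifier, grammel or None, form)
```

For instance, parse_tagged_token("$keep/vi_KEPEN") yields the lexel keep, the grammel vi and the form KEPEN, while parse_tagged_token("$/P13NF_HEO") yields no lexel, as expected for a Grammatical Word. A token such as "$til{u}/cj_TIL" would additionally yield the specifier u.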

2.1.2.1.2 Place-names and personal names
Place-names and personal names are not tagged. Place-names are flagged solely by ‘;’ and personal names by ‘'’. These flags are in contrast to ‘$’ in the tagged material proper. Other material is flagged by ‘!’, e.g. roman numerals or forms that are incomplete. See further the Introduction, Chapter 3, § 3.4.10.2.

2.1.2.1.3 Treatment of manuscript lacunae and partial figurae in the disk-text
Sometimes it may be injudicious or impossible to obtain a reading from a piece of text. Such cases are marked in a Corpus text by ‘[ ]’. In other cases, figurae may be partly decipherable and the missing littera(e) can be safely deduced; in this case the transliteration is placed within [ ], e.g. KI[N]G. If no text is discernible, [ ] are left empty, because one should not try to guess a scribe’s intention in such cases. If missing word(s) can be deduced, or supplied from a reliable printed edition of the text, they are included in a comment. (See further Introduction, Chapter 3, § 3.4.3 and § 3.5.5.)

2.1.2.1.4 Treatment of punctuation found in the manuscript text
‘{ }’ are used in combination with other characters to delimit other material. { } alone are used to enclose the encoding of punctuation and features of the layout of the manuscript text. {\} records the end of a line; {\\} the end of a text or of a paragraph. In the LAEME texts three main marks of punctuation are generally met: ‘.’ punctus, punctus elevatus and ‘/’ virgula. A punctus is recorded as {.}, a punctus elevatus as {,‘} and a virgula as {,}. See further Introduction, Chapter 3, § 3.5.1 - 2. The punctuation marks are recorded simply to note their presence. No attempt is made to interpret their significance within the text, e.g. whether a point marks the end of, or a pause in, a phrase or sentence, or whether it is used to demarcate roman numerals. Importantly, no modern punctuation has been added to the transcribed texts.

2.1.2.1.5 Comment and Annotation: the uses of { } with other characters

See further Introduction, Chapter 3, §§ 3.5.6 - 8 and 3.6.

2.1.2.1.6 Deletions and insertions made by scribes
In manuscript texts, words or letters may sometimes be scored through or may be inserted between lines or in a margin. The former are here termed ‘deletions’, the latter ‘insertions’. Scribal deletions are transcribed (where legible). Where a deleted word is incomplete, perhaps because the scribe has miswritten and immediately realized the error before completing the word, the letters written are still transcribed. For the treatment of insertions and deletions in the LAEME CTT, see Introduction, Chapter 3, § 3.5.4.

2.1.2.2 Text Dictionaries
A Text Dictionary is derived from a tagged text. It lists alphanumerically by tag all the lexico-grammatically tagged material in a Tagged Text, followed by all the forms which attest the information in the tag. After each form is given the number of times the form associated with the tag is attested in the text. Text Dictionaries contain the header and any text words flagged with $ in the Tagged Text; they do not contain place-names or personal names, comments or annotations, nor any material flagged by ! in the Tagged Text.
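The derivation just described can be sketched in a few lines of Python. This is only an approximation under stated assumptions: tokens are taken to have the shape $tag_FORM, anything not flagged with $ (such as names or material flagged by !) is skipped, the header is ignored, and ordinary string sorting stands in for the atlas’s own alphanumeric ordering. The function name build_text_dictionary and the sample tokens are invented for illustration.

```python
from collections import Counter, defaultdict


def build_text_dictionary(tokens):
    """Map each tag to its attested forms and their counts in one text."""
    counts = defaultdict(Counter)
    for token in tokens:
        if not token.startswith("$"):
            continue  # names, '!' material and comments are excluded
        tag, sep, form = token[1:].partition("_")
        if sep:
            counts[tag][form] += 1
    # Order tags, and the forms under each tag, by plain string sorting.
    return {tag: dict(sorted(forms.items()))
            for tag, forms in sorted(counts.items())}


sample_tokens = [
    "$keep/vi_KEPEN", "$keep/vi_KEPE", "$keep/vi_KEPEN",
    ";LUNDENE",        # invented place-name token, skipped
    "!XII",            # invented '!' token, skipped
    "$/P13NF_HEO",
]
text_dictionary = build_text_dictionary(sample_tokens)
```

With the invented sample above, the tag keep/vi is listed with KEPE attested once and KEPEN twice, and the place-name and ‘!’ tokens do not appear at all.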

A Text Dictionary is similar to a ‘Linguistic Profile’ in LALME (or eLALME). The principal difference is that it contains a full listing of the linguistic material in the text (aside from the exclusions just noted), where a LALME LP is based on a predetermined set of items.

3. Tag Dictionary

The Tag Dictionary lists all the material in the tagged texts with tags as the head words. Each tag is followed by a list of the forms which have been assigned to that tag. Sections of the tag dictionary can be created by tag category, following the instructions in the |Task Key|.

3.1 Frequency Counts

There are two formats in which to view the Tag Dictionary: (a) a basic listing of tags and forms, and (b) a listing in which each form is followed by a count of the number of times it is attested in the corpus associated with the tag and also of the number of texts in the CTT in which it appears. So a listing of type (a) has the format:

keep/vi IKEPEN KEPAN KEPE KEPEN KEPIN

and a listing of type (b) has the format

keep/vi IKEPEN 1 1 KEPAN 2 1 KEPE 5 4 KEPEN 2 2 KEPIN 1 1

where the first number after the form is the frequency of attestation and the second number is the number of texts in which the form appears associated with the tag - in the example keep/vi.
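For readers who wish to process a type (b) listing, a minimal Python sketch of reading one such line follows. It assumes that the tag and forms contain no internal whitespace and that every form is followed by exactly two integers; the function name parse_tag_dictionary_line is invented for this example.

```python
def parse_tag_dictionary_line(line):
    """Return (tag, [(form, frequency, number_of_texts), ...]) for one line."""
    fields = line.split()
    if not fields:
        raise ValueError("empty line")
    tag, rest = fields[0], fields[1:]
    if len(rest) % 3 != 0:
        raise ValueError("expected repeated 'FORM count texts' triples")
    entries = [
        (rest[i], int(rest[i + 1]), int(rest[i + 2]))
        for i in range(0, len(rest), 3)
    ]
    return tag, entries


tag, entries = parse_tag_dictionary_line(
    "keep/vi IKEPEN 1 1 KEPAN 2 1 KEPE 5 4 KEPEN 2 2 KEPIN 1 1"
)
# Total attestations of the tag across all its forms in the corpus.
total_attestations = sum(freq for _, freq, _ in entries)
```

Applied to the example line, this gives five forms for keep/vi with 11 attestations in all; KEPE, for instance, is attested 5 times across 4 texts.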

3.2 Show negatives in elaborated format

See further the instructions in the |Task Key|. This format applies only to the following tags:

For the treatment of clausal negation in the LAEME CTT, see further Introduction, Chapter 4, § 4.4.5 and see § 4.4.5.3 for use of elaborated format for multiple negation.

4. Form Dictionary

The Form Dictionary lists all the material in the tagged texts with forms as the head words. Each form is followed by a list of the tags which are associated with that form. Sections of the form dictionary can be created using part or all of a form, following the instructions in the |Task Key|.
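The relationship between the two dictionaries can be pictured as an inversion: the same tag-form pairings are indexed by form rather than by tag. The Python sketch below illustrates this on a handful of pairings taken from the keep examples in this Overview; the function name invert_tag_dictionary is invented, and plain string sorting is assumed in place of the atlas’s own ordering.

```python
from collections import defaultdict


def invert_tag_dictionary(tag_dictionary):
    """Turn {tag: [forms]} into {form: [tags]}, both sorted as strings."""
    form_dictionary = defaultdict(list)
    for tag in sorted(tag_dictionary):
        for form in tag_dictionary[tag]:
            form_dictionary[form].append(tag)
    return dict(sorted(form_dictionary.items()))


# A small extract of pairings; a real Tag Dictionary is far larger.
tag_dictionary = {
    "keep/vi": ["KEPE", "KEPEN"],
    "keep/v-imp": ["KEP"],
    "keep/vpp": ["KEP"],
}
form_dictionary = invert_tag_dictionary(tag_dictionary)
```

Here the form KEP becomes a head word followed by the two tags keep/v-imp and keep/vpp, while KEPE and KEPEN are each followed by keep/vi alone.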

4.1 Frequency Counts

There are two formats in which to view the Form Dictionary: (a) a basic listing of forms and associated tags, and (b) a listing in which each tag is followed by a count of the number of times it is attested in the corpus associated with the form and also of the number of texts in the CTT in which it appears. So a listing of type (a) has the format:

KEP keep/nOd keep/nOd{rh} keep/n{rh} keep/v-imp keep/vi-m keep/vpp keep/vpp{rh} keep/vpt23{rh}

and a listing of type (b) has the format

KEP keep/nOd 2 2 keep/nOd{rh} 11 7 keep/n{rh} 1 1 keep/v-imp 4 3 keep/vi-m 3 2 keep/vpp 1 1 keep/vpp{rh} 1 1 keep/vpt23{rh} 1 1

where the first number after the tag is the frequency of attestation and the second number is the number of texts in which the tag appears associated with the form - in the example KEP.

5. Maps

Spatial distributions of data in the CTT are to be found by consulting the set of ‘Feature Maps’ for LAEME.

Note: a new and larger set of Feature Maps is presently being created.

Clicking on the ‘Maps’ task brings up a set of options in the Task Area: (1) ‘Browse/Search Feature Maps’; (2) ‘Combine Feature Maps’; (3) ‘Create a Feature Map’; (4) ‘View Key Map’; (5) ‘Download Key Map’.

5.1 Browse/Search Feature Maps

Clicking on ‘Browse/Search Feature Maps’ brings up a table listing the maps in the set of Feature Maps. These are defined by Map Number, Map Feature and a Map Period. The Map Number is used for internal counting and is only of relevance to the user if the Combine Feature Maps task is implemented (see below § 5.3). Map Feature is in effect the title of the map and is a label for the ‘item(s)’ and ‘feature(s)’ mapped (see below § 5.1.1). It labels the set of tags (whether lexel(s) or grammel-only tag(s)) used to select data for the map and it describes the linguistic feature(s) being mapped. The Map Period, as a default, covers the whole period for LAEME, i.e. 1150 to 1325. (It is possible that future versions of LAEME may include a facility for date banding the maps.)

Clicking on the Map Feature will cause the map to appear in a new tab or window. Note that there is also an option to check a box labelled ‘Show locations of other features’. The full map title appears at the top of the map. The map contains an outline of England, in green on a blue background. The pre-1974 county boundaries are also displayed in yellow. The distribution of a feature is shown as a pattern of red dots, one at each location to which a tagged text with one or more forms containing the feature has been localised. If the box ‘Show locations of other features’ has been checked, then the display indicates (with small blue dots) those texts which do not show the particular feature but which have some other form attested for that item. Small white dots show the locations of texts that have no forms at all attested for that item. The three different sizes of red dot correspond to the frequency of occurrence of the feature in each text placed at the survey location. The three sizes express the relative proportion of occurrence of the feature in a text: a large dot indicates that the relative proportion of the feature, p, is greater than 0.5; a medium-size dot indicates 0.5 >= p > 0.2; and the smallest size of dot indicates 0.2 >= p > 0. Where the option to Combine Feature Maps is invoked (see below § 5.3) and symbols other than filled circles are selected, the other symbols are sized likewise in the resulting maps. Users of LALME will recognize this format from the ‘Dot Maps’ in volume 1, although here the system of ratios is different and the calculation of relative proportions in LAEME is derived from precise frequency counts.
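The three size bands can be restated as a small Python function. This is a sketch of the thresholds given above, not the atlas’s own code; the name dot_size and the returned labels are invented for illustration.

```python
def dot_size(p: float) -> str:
    """Classify a relative proportion p into the three red-dot sizes."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a relative proportion must lie between 0 and 1")
    if p > 0.5:
        return "large"   # p > 0.5
    if p > 0.2:
        return "medium"  # 0.5 >= p > 0.2
    if p > 0.0:
        return "small"   # 0.2 >= p > 0
    return "none"        # the feature does not occur in the text
```

Note that the boundary values fall into the lower band: a proportion of exactly 0.5 gives a medium dot and a proportion of exactly 0.2 gives a small one.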

5.1.1 Definitions of ‘item’, ‘form’ and ‘feature’
An item is a test unit for linguistic comparison. It is a hyperonym for a group of forms, which are functionally equivalent. These can be grouped together as being ‘the same word’, e.g. heo, scho, xe = she, or ‘the same morpheme’, e.g. +, +ith, +es, = ‘3rd person singular indicative inflection’. In these cases, form is more or less equivalent to ‘the spelling adopted for the word or morpheme’. But an item may also comprise a group of forms ‘having the same basic meaning’, e.g. til, fort, unto, þat = until. In this case, form is the equivalent of ‘the words chosen to express the meaning’. In LAEME, such semantic items may be constructed by means of lexel specifiers, e.g. in the case of until, combining the following lexels that have {u} for ‘until’ as a specifier: $a-to{u}, $alforto{u}, $allthat{u}, $allwhat{u}, $betwix-and-til{u}, $betwix-and{u}, $forthat{u}, $forththat{u}, $fortothat{u}, $forto{u}, $into{u}, $in{u}, $oYYaet{u}, $oY{u}, $on{u}, $solongthat{u}, $sothat{u}, $that{u}, $tilthat{u}, $til{u}, $to{u}, $until{u}, $unto{u}, $what{u}. A linguistic feature is a segment comprising all or part of a form, or a set of forms, which realise one or more items: e.g. initial x- in forms of ‘she’; -eþ in 3rd person singular indicative verb inflexions; fort-forms expressing the meaning until.

In constructing input for the Feature Maps in LAEME, an item comprises a set of one or more tags and the associated forms which contain strings representing the defined feature. For example, the tag P13NF she and its associated forms are:

/P13NF *GGE *GHE *HA *HE *HEO *HE[O] *HIE *HO *HOE *HUE *SCHO *SCO *SHE *SHO *zEO +A >g>HO A A+ CHE G2E G2E+ GE GHE H>I>E HA HA+ HE HEO HE[O] HI HIE HIO HO HOE HOO HUE HY HYE SCHA SCHE SCHO SCHO+ SCO SCO+ SCae SE SGE SHE SHE+ SHO YO gHO gHOxxx+ gIE yIE yOE zE zEO zO

If the feature to be mapped was spellings with initial ‘h’, then the feature would be defined by the grammel P13NF and any of the forms having the initial strings H or *H.
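The selection just described amounts to a simple prefix test on the forms attached to the tag. A minimal Python sketch follows, using a small subset of the /P13NF forms listed above; the function name forms_with_initial is invented, and the matching is deliberately case-sensitive on the assumption that upper- and lower-case characters in the transcription are distinct.

```python
def forms_with_initial(forms, prefixes):
    """Keep the forms that begin with any of the given initial strings."""
    return [form for form in forms if form.startswith(tuple(prefixes))]


# A subset of the forms recorded for the tag /P13NF ('she').
p13nf_forms = ["*HEO", "*SCHO", "A", "CHE", "GHE", "HA", "HEO", "HI",
               "SCHO", "SHE", "gHO", "zEO"]

initial_h = forms_with_initial(p13nf_forms, ["H", "*H"])
```

On this subset the selection keeps *HEO, HA, HEO and HI; forms such as GHE, CHE and gHO are excluded because H is not their initial character.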

5.2 Finding Feature Maps

To locate a desired map, type any part of its title (whether item or feature) into the ‘Find maps that have’ search box and click ‘Submit’. Any map with the search string in its title will then be listed for selection.

5.3 Combine Feature Maps

It may be desirable to view the distributions of more than one feature in the same map. ‘Combine Feature Maps’ allows the distributions for up to four feature maps to be combined. Combining means that the feature distributions for more than one map are overlaid one on the other. Clicking on ‘Combine Feature Maps’ brings up a menu allowing you to specify the numbers of the maps you wish to combine. The |Task Key| explains how to select (contrasting) colours and symbols for the display of the combined features.

5.4 Create a Feature Map

This is a new facility not available in previous versions of LAEME. It requires detailed knowledge of the complexity of the LAEME tagging system and the content of the Corpus of Tagged Texts in order for the user to be sure that they are specifying items and features correctly. Having the relevant section of Tag Dictionary up in another window before specifying your map is strongly recommended. Judicious use of wildcards may then be necessary.

5.5 View or Download Key Map

The LAEME Key Map shows (by text number) the positions of those tagged texts that have been localised. Pre-1974 county boundaries appear in grey.

6. County Lists and Item Lists

Clicking on the ‘County List’ task brings up a search form in the Task Area. The |Task Key| explains how to set up the input for the search. The result appears as a dictionary for the selected tag(s) where the heads are the (selected) associated forms ordered alphanumerically. Each form is followed by a list enumerating the counties to which texts containing the form have been localised. The counties are represented by a set of three-character abbreviations. Following each county abbreviation are the reference numbers of the texts. If the ‘Frequency’ button or either of the ‘Relative Proportions’ buttons has been checked then each text number will be followed by a value indicating these (see below § 6.1). Clicking on the text number will bring up a description of the text, abstracted from the LAEME Index of Sources. This also contains links to the full Tagged Text and to the Text Dictionary.

Clicking on the ‘Item List’ task brings up a similar search form and again the |Task Key| explains how to set up the search input. The result appears as a dictionary for the selected tag(s) where the heads are the numbers of the LAEME texts. Each text number is followed by the abbreviation of the county in which it has been localised and then a list of the forms associated with the tag(s). If the Frequency button or either of the Relative Proportions buttons has been checked then each form will be followed by a value indicating these.
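The difference between the two tasks is essentially one of grouping: the same records are arranged by form in a County List and by text in an Item List. The Python sketch below illustrates this with invented records; the county abbreviations AAA and BBB, the text numbers and the function names are placeholders and do not correspond to real LAEME data.

```python
from collections import defaultdict

# Invented sample records: (text number, county abbreviation, form).
RECORDS = [
    (1, "AAA", "HEO"),
    (1, "AAA", "HO"),
    (2, "AAA", "HEO"),
    (3, "BBB", "SCHO"),
]


def county_list(records):
    """Form -> county -> sorted text numbers, as in a County List."""
    grouped = defaultdict(lambda: defaultdict(set))
    for text_no, county, form in records:
        grouped[form][county].add(text_no)
    return {
        form: {county: sorted(nums) for county, nums in sorted(by_county.items())}
        for form, by_county in sorted(grouped.items())
    }


def item_list(records):
    """Text number -> (county, sorted forms), as in an Item List."""
    grouped = {}
    for text_no, county, form in records:
        grouped.setdefault(text_no, (county, set()))[1].add(form)
    return {
        text_no: (county, sorted(forms))
        for text_no, (county, forms) in sorted(grouped.items())
    }
```

With these records, the County List view places HEO under county AAA with texts 1 and 2, whereas the Item List view places text 1 in AAA with the forms HEO and HO.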

6.1 Measures of occurrence

In both County List and Item List tasks, you are asked ‘How do you wish measures of occurrence to be presented?’ Clicking ‘None’ results in no measure being given. ‘Frequency Counts’ gives the number of occurrences of a form associated with the specified tag in a text. ‘Relative Proportions: values’ expresses the relative proportion of the occurrences of the form associated with the specified tag in a text in the range 0.01 to 1. Frequencies and relative proportions occur in [ ] following the forms in Item List or following the number of the text in which the form occurs in County Lists. ‘Relative Proportions: ranges’ expresses the relative proportion of occurrence within a range using parentheses around form (Item List) or text number (County List), e.g. ( TREOW ), ( 182 ): no parentheses means that the relative proportion, p, is greater than half of occurrences, viz. p > 0.5; ( ) signifies 0.5 >= p > 0.2; and (( )) signifies 0.2 >= p > 0.
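A worked example may help to relate the measures to one another. The Python sketch below derives relative proportions from frequency counts and then applies the range notation; it assumes that a form’s relative proportion is its count divided by the total count of all forms for the same tag in the text, and that the lowest band is 0.2 >= p > 0, parallel to the dot sizes in § 5.1. The function names and the counts are invented, and only TREOW is taken from the example above.

```python
def relative_proportions(counts):
    """Convert {form: frequency} for one tag in one text into proportions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no occurrences to compare")
    return {form: n / total for form, n in counts.items()}


def with_range_marks(label, p):
    """Wrap a form or text number in the parentheses that signal its band."""
    if p > 0.5:
        return label              # p > 0.5: no parentheses
    if p > 0.2:
        return f"( {label} )"     # 0.5 >= p > 0.2
    if p > 0:
        return f"(( {label} ))"   # 0.2 >= p > 0
    raise ValueError("an unattested form has no range mark")


# Invented frequency counts for three forms of one tag in one text.
counts = {"TREOW": 6, "TRE": 3, "TREO": 1}
proportions = relative_proportions(counts)
marked = [with_range_marks(form, p) for form, p in proportions.items()]
```

With these counts the proportions are 0.6, 0.3 and 0.1, so the three forms would appear as TREOW, ( TRE ) and (( TREO )) respectively.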

7. Concordances

The LAEME ‘Concordance’ Task allows you to create a ‘Key Word In Context’-type concordance. However, rather than searching on words in the text, the Concordances Task takes as input a set of one or more ‘key tags’ and the forms with which they are associated. Searching on the key tags allows precise specification of the material in the CTT by lexel and / or grammel. Output is a set of one or more concordance records comprising the key tag and its associated form along with a user-specified number of words which constitute the immediately preceding and following context.

Clicking on the ‘Concordances’ task brings up a search form ‘Make a Concordance’ in the Task Area. The |Task Key| explains how to set up the input for the Concordance Search. This then brings up the ‘Create a Concordance’ menu. The default is that the concordance will cover all the texts in the CTT. There is an option to narrow the input to one or more tagged texts specified according to county locations or by individual text number. A key to text numbers and associated counties is linked. The user can also specify the number of words (from 0 to 20) to be retrieved on either side of the key tag in order to provide the immediate contexts for the key forms. The output can be sorted by tag, form, date or file name. The key tags are selected from a partial Tag Dictionary of the tags and associated forms generated from the search string specified in the previous menu. The key tags to be used for the concordance are selected by checking the box beside the relevant tag. There is also a single check box if ALL tags in the partial dictionary are to be used. The output appears in the Task Area as a list of the key forms found along with their tags. Underneath the key tag and key form and their context forms (if any) appear the file name, number and date of the tagged text in which they occur. Clicking on the file number brings up the relevant Index of Sources entry. Clicking on filename.tag or .dic brings up the relevant tagged text or text dictionary.
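The idea of a tag-driven concordance can be sketched as follows in Python. This is a simplified illustration, not the atlas’s implementation: it assumes a text is available as a flat sequence of $tag_FORM tokens, matches key tags exactly, and returns only the forms of the neighbouring tokens as context. The function name concordance and the token sequence are invented and do not represent a real passage.

```python
def concordance(tokens, key_tags, width=2):
    """Return (tag, form, left context, right context) for each key tag hit."""
    if not 0 <= width <= 20:
        raise ValueError("context width must be between 0 and 20 words")
    # Split every token into its tag and its form at the first '_'.
    pairs = []
    for token in tokens:
        tag, _, form = token.lstrip("$").partition("_")
        pairs.append((tag, form))
    records = []
    for i, (tag, form) in enumerate(pairs):
        if tag in key_tags:
            left = [f for _, f in pairs[max(0, i - width):i]]
            right = [f for _, f in pairs[i + 1:i + 1 + width]]
            records.append((tag, form, left, right))
    return records


# Invented token sequence, for illustration only.
tokens = ["$/P13NF_HEO", "$keep/vi_KEPEN", "$keep/nOd_KEP",
          "$/P13NF_SCHO", "$keep/vi_KEPE"]
hits = concordance(tokens, {"keep/vi"}, width=1)
```

With a context of one word on either side, the key tag keep/vi is found twice: KEPEN between HEO and KEP, and KEPE preceded by SCHO with no following context because it ends the sequence.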

8. LAEME Documents

The LAEME Documents are intended to explain the concepts, perspectives and structures of LAEME. The Introduction provides a detailed account of the background and the theoretical framework that underpins LAEME and the methodology adopted to create the Corpus of Tagged Texts. References to works used in the compilation of LAEME are electronically accessible both from the Introduction and from the individual entries in the Index of Sources. Text Keys give filename and file number references to texts in the LAEME CTT as well as listing all the localised texts by region and county. Lexel Specifiers gives a key to the extra semantic annotations to the lexical elements of the tags (‘lexels’). The Grammel Commentary explains the grammatical elements (‘grammels’) of the tags.