[See also names of regions to be used with docs when accessing the OED from PAT.]

Tags: A BL CF D DEF E EQ ET ETN HG HL HO IL IPR L LB LF LQ PQP PS PSA Q QP RX S0 S1 S2 S3 S4 S5 S6 S7 S8 SE SF SN SQ ST su T VD VF VL W XDAT XIL XL XR #

Tagging Structural Elements in the OED

Donna Lee Berg,
Centre for the New Oxford English Dictionary and Text Research, University of Waterloo

(Edited for HTML access by Frank Tompa)

OED documents are structures that allow you to restrict your search to specific dictionary components, such as etymologies, definitions, or quotations. In the database, each such component is defined by descriptive tags which delimit its text. For example, etymologies are preceded by a "begin" tag "<ET>" and followed by a matching "end" tag "</ET>". The structures available for searching are listed in two alphabetical sequences, with the more commonly used documents in the first sequence. A highly simplified outline indicates prototypical organization of an entry.
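
As a rough illustration of how paired "begin" and "end" tags delimit a component, the following Python sketch pulls the text of one structure out of a tagged string. The sample entry and the helper name extract_regions are invented for illustration; actual OED markup is far richer, and this is not how PAT itself locates structures.

```python
import re

def extract_regions(text, tag):
    """Return the text enclosed by each <tag>...</tag> pair (non-nested)."""
    pattern = re.compile(r"<{0}>(.*?)</{0}>".format(re.escape(tag)), re.DOTALL)
    return pattern.findall(text)

# Hypothetical, heavily simplified entry; not actual OED data.
sample = (
    "<E><HG><HL>example</HL></HG>"
    "<ET>hypothetical etymology</ET>"
    "<DEF>hypothetical definition</DEF></E>"
)

print(extract_regions(sample, "ET"))
print(extract_regions(sample, "HL"))
```

Because the pattern requires the closing bracket right after the tag name, asking for "E" does not accidentally match "<ET>".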

Remember that, in searching the OED, PAT considers all letters and symbols, including tags and spaces, as characters. PAT also regards any query as a prefix and locates occurrences that begin with the characters you type. These factors, combined with the fact that PAT interprets the left angle bracket of a tag as a character, have implications if you wish to locate exactly what you type, and nothing else, within a document structure. For instance, you may wish to restrict an "Author" search to "Blake" by using "<A>Blake</A>", avoiding matches to "F.R. Blake" and "O. Blakeston".
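
The effect of prefix matching can be imitated with a minimal Python sketch. It assumes a flat string of hypothetical author fields and a made-up helper, prefix_matches, and it ignores how PAT actually indexes the text; it only shows why including the tags in the query narrows the result.

```python
def prefix_matches(text, query):
    """Return every offset at which `text` continues with `query`.

    Tags and spaces are treated as ordinary characters, and nothing is
    required to follow the query, so the query behaves as a prefix.
    """
    hits = []
    start = text.find(query)
    while start != -1:
        hits.append(start)
        start = text.find(query, start + 1)
    return hits

# Hypothetical author fields, tagged as described above.
authors = "<A>Blake</A><A>F.R. Blake</A><A>O. Blakeston</A>"

loose = prefix_matches(authors, "Blake")         # hits in all three names
exact = prefix_matches(authors, "<A>Blake</A>")  # hits only the bare surname
print(len(loose), len(exact))
```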

For additional information, see:

Author <A>

An author's name normally appears in the printed text in large and small roman capitals as the second element in a quotation following the date and preceding the title of the work. Most authors are cited by initials followed by surname, but surname only is used for well-known authors such as Chaucer and Milton (note that Shakespeare is usually abbreviated as "Shaks."). A single author's name may be cited in several forms. Sir Walter Scott, for instance, appears most frequently as "Scott", but also as "W. Scott" and "Sir W. Scott". In addition, the OED cites several W. Scotts. Dates and titles are useful in such instances; for example, a W. Scott with a publication date of 1635 is obviously not Sir Walter Scott. A number of other OED author conventions may affect searches: "Bible" (notably the 1611 King James version) sometimes appears as author; journal quotations frequently do not give authors' names; and translators are deemed authors of the English words, with the original author's name included as part of the work, e.g., "Marx's Capital".

Note the effect of name variations and tags (<A>...</A>) on searches. Querying "W. Scott" within the author structure will locate not only that form, but "Sir W. Scott" and an "E.W. Scott". Further, since PAT considers "Scott" as a prefix, it matches "W. Scott-Taggart". Hitting the space bar after "Scott" (the usual way of specifying a word ending) results in no matches because authors' surnames are followed by the left angle bracket of the "end" tag which PAT sees as a character; thus one could specify "W. Scott</A>". Similarly, to exclude matches such as "E.W. Scott", the "begin" tag <A> would also be needed.

Bold Sub-Headword <BL>

One of two types of subordinate headword included within entries; so called because they appear in bold type in the printed text. These word forms are commonly either derivatives (typically formed by adding a suffix to the headword), or combinations (separate, hyphenated or single words that combine the headword, usually as the first element, with another existing word). Note, however, that many derivatives and combinations have entries of their own because they have developed meanings and histories distinct from their main word. Bold Sub-Headwords are usually defined and illustrated by quotations, sometimes grouped within pseudo quotation paragraphs. (See Subentry and compare with Italic Sub-Headword.)

IMPORTANT: Note that three forms are included in the Bold Sub-Headword structure: the Lookup Form <LF>, the Stressed Form <SF>, and the Murray Form (the stressed form used in the first edition of the OED, but not included in the printed second edition, nor as a document structure).

Date <D>

Normally the first element in an illustrative quotation. The date given is usually the year in which the cited work was first published, although there are some discrepancies, especially in the dating of texts prepared for the first edition of the OED. Where precise dates could not be established, the date may be qualified by "C." ("c" in the printed text meaning "circa" or "about") or "A." ("a" in the printed text meaning "ante" or "before"), or by replacing the last one or two digits with dots, e.g., "17..". Date of composition is usual for letters, journals, and diaries, while lectures and speeches are assigned the date of their first appearance in print. Although most quotations specify some form of date, there are a few exceptions, the most notable being the many quotations from the Old English epic poem "Beowulf".
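
The date conventions above can be sketched as a small Python parser. The exact shape of the field contents (for instance, whether a space follows "C.") is an assumption, as are the names DATE_RE and parse_date; the sketch simply turns a qualifier and any trailing dots into a range of possible years.

```python
import re

# Assumed field shapes: "1635", "C. 1400", "A. 1300", "17..", "185."
DATE_RE = re.compile(
    r"^(?:(?P<qual>[CcAa])\.?\s*)?(?P<digits>\d{1,4})(?P<dots>\.{0,2})$"
)
QUALIFIERS = {"c": "circa", "a": "ante"}

def parse_date(field):
    """Return (qualifier, earliest_year, latest_year), or None if unparsed."""
    m = DATE_RE.match(field.strip())
    if m is None:
        return None
    qualifier = QUALIFIERS.get((m.group("qual") or "").lower())
    digits, dots = m.group("digits"), m.group("dots")
    if dots and len(digits) + len(dots) != 4:
        return None  # dots are only interpreted in four-position years
    earliest = int(digits + "0" * len(dots))  # "17.." -> 1700
    latest = int(digits + "9" * len(dots))    # "17.." -> 1799
    return (qualifier, earliest, latest)

for field in ["1635", "C. 1400", "A. 1300", "17..", "185."]:
    print(field, parse_date(field))
```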

Definition <DEF>

Generally, a statement explaining the meaning of a headword, sense, or sub-sense, although definitions in the OED take several other forms, including cross-references to another sense within the same entry or within another entry. In addition, a definition may simply describe the way the word functions in some grammatical or syntactical context. Definitions should always be read in conjunction with supporting quotations, since in a historical dictionary, the latter play an important role in establishing meaning and context. In fact, in some cases, the actual explanation of the meaning of a sense is contained in a quotation (see Cross-Reference Date).

Note, however, that this structural tagging was inserted automatically at Waterloo and not confirmed by OED editors, so its utility remains somewhat controversial.

Earliest Quotation <EQ>

The quotation having a date which is chronologically the first in an OED entry. While this facility can be useful, many words have multiple senses and sub-senses, either in current use or in their historical development. The earliest chronological date in the entry must therefore be viewed in the context of the sense which it supports.

Note also that this structural tagging was inserted automatically at Waterloo and not confirmed by OED editors, so its utility remains somewhat controversial.
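
The caution about senses can be made concrete with a short Python sketch. The quotation records and the function name earliest_quotation are hypothetical; the point is only that the chronologically first quotation in an entry may belong to a sense other than the first one listed.

```python
def earliest_quotation(quotations):
    """Return the quotation with the smallest year, or None if there are none."""
    return min(quotations, key=lambda q: q["year"], default=None)

# Hypothetical quotations for a single entry; not actual OED data.
quotes = [
    {"sense": "1", "year": 1580},
    {"sense": "2", "year": 1432},
    {"sense": "1", "year": 1611},
]

first = earliest_quotation(quotes)
print(first["year"], "supports sense", first["sense"])
```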

Entry <E>

Entries are the major structural components of most modern dictionaries. In the printed OED, entries are arranged alphabetically by their headwords (the "subject" of the entry) which appear in dark bold type. There are two types of entries in the OED: main entries and cross-reference entries. Main entries contain comprehensive information about the history and meaning of "main form" headwords. The primary function of cross-reference entries is to direct the user from an obsolete or variant spelling of a word to its relevant main entry (see also Status). Specifying Entry as the document you wish to search for a word or phrase, or as a match point for "combining and comparing" two or more sets, means, therefore, that PAT searches the entire Dictionary and identifies the entries in which your results are located.

Etymology <ET>

Etymologies trace the origin or derivation of headwords and are enclosed in square brackets in the printed text, normally following a variant form list, if included. Since the OED was conceived as a history of the English language, the original policy was to trace non-native words to the foreign word or word element from which they were immediately adopted or formed, and native words to their earliest English form. In practice, however, OED etymologies sometimes exceed these guidelines.

Some etymologies include as their final element a paragraph in small print, tagged as "<note>" in the database. These are referred to as "etymological notes" by OED editors and include supplementary comments or information of an unsubstantiated nature such as "folk" or popular theories. ("<note>" tags are also used to identify various editorial comments in small print in other entry elements.) Etymologies are sometimes attached to individual senses or sub-senses (see Sub-Etymology).

Headword Group <HG>

Defines the initial group of elements in an entry and includes headword, pronunciation, part of speech, and homonym number. Note that, with the exception of the headword, not all of these elements necessarily appear in every entry.

Italic Sub-Headword <IL>

One of two types of subordinate headwords which are included within entries, and so called because they appear in heavy italics in the printed text. This category consists primarily of minor combinations (separate, hyphenated or single words that combine the headword of the entry, usually as their first element, with another word form, but which do not require definition since their meaning is obvious), although it may also include phrases and idioms. Groups of combinations are usually listed alphabetically within one or more senses and are followed by a pseudo quotation paragraph, containing quotations illustrating their use in the same order. (Compare with Bold Sub-Headword.)

IMPORTANT: Note that three forms are included within the Italic Sub-Headword document structure: the Lookup Form <LF>, the Stressed Form <SF>, and the Murray Form (the stressed form used in the first edition of the OED, but not included in the printed second edition, nor as a document).

Label <LB>

In the printed OED, labels are italicized designations, usually abbreviated, which inform Dictionary readers of the boundaries within which a word or sense is, or was, used. In current OED terminology, there are five categories of labels: status (obsolete, rare, colloquial, etc.); regional (indicating a geographical area of usage, such as the U.S.); grammatical (describing the syntactical role of the word or sense, such as plural or collective); semantic (indicating the interpretation given to a word or sense in a particular context, such as figurative, transferred, specific, etc.); and subject (specifying the discipline, profession, trade, etc. in which a word or sense is used).

It is important to note that subject labels in particular are not consistently used and their specificity may vary, often because of historical change. For instance, the label "Natural History" (Nat. Hist.) is found in a number of older entries. Since this discipline has been largely superseded and sub-divided, labelling of more current entries reflects these changes.

(For definitions and explanations of terms used in OED labels, see D.L. Berg, 1993, and for an example of a search for words used in a particular subject field, see D.L. Berg, 1989.)

Language <L>

This structure contains language references in etymologies and sub-etymologies. OED lexicographers identified over 1,000 different language forms (including abbreviations and regional variations) used in these contexts. While the structure is of considerable assistance in extracting languages that played a part in the origin or history of a word, care must be exercised in using this facility to identify the language from which a word passed directly into English (for examples of problems and techniques associated with such searches, see D.L. Berg, 1989.) Also, some further identification refinement is necessary since automatic tagging of forms includes instances where language names appear attributively as adjectives specifying nationality, e.g., Italian wine-makers.

Note that language forms are usually abbreviated, not always consistently, and full forms can be found in the "List of Abbreviations" which appears at the front of each Dictionary volume.

Latest Quotation <LQ>

This term refers to the quotation in an entry for an obsolete word which exemplifies the last located use of the form. In other words, the criterion used for the category is the chronologically most recent date in entries preceded by a "dagger" status symbol indicating that the headword is an obsolete form (see Status).

Lookup Form <LF>

This structure includes most word forms defined in the OED, comprising Headwords <HL>, Bold Sub-Headwords <BL> and Italic Sub-Headwords <IL>. As the document name suggests, these forms are given in the way most users would look them up, that is, without diacritics, stress marks, etc. (see Stressed Form). However, there are some combinations included within entries that may not be located because the OED sometimes lists minor forms in a style similar to the following example from the entry for "orange": "orange-bloom, -grove, -juice, kernel, leaf, -pip..". A computer program inserted the first element in front of (or, in some cases, following) hyphens. Thus, "orange-grove" and "orange-juice" will be located, but further refinement of the program is needed in order to find unhyphenated minor combinations such as "orange kernel" and "orange leaf". These combinations can often be located by searching quotation texts.
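
The expansion described above can be approximated in a few lines of Python. This is not the program used to prepare the database; the function name expand_combinations is invented, and the sketch handles only leading hyphens. It does, however, reproduce the limitation noted above: items listed without a hyphen are left unexpanded.

```python
def expand_combinations(headword, listing):
    """Prefix `headword` to every listed item that begins with a hyphen.

    Items without a leading hyphen are returned unchanged, so unhyphenated
    minor combinations remain incomplete.
    """
    expanded = []
    for item in listing.rstrip(". ").split(","):
        item = item.strip()
        if item.startswith("-"):
            expanded.append(headword + item)
        else:
            expanded.append(item)
    return expanded

listing = "orange-bloom, -grove, -juice, kernel, leaf, -pip.."
print(expand_combinations("orange", listing))
```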

Part of Speech <PS>

A grammatical category (verb, adjective, adverb, etc.). In print, in the case of headwords, the part of speech normally appears in abbreviated form following the pronunciation. A part-of-speech identification may also be used to describe a sense or subordinate headword (see Subentry). Where no part of speech is included, the form may be assumed to be a noun in most cases. Note that in all instances, the OED employs the term "substantive" (abbreviated "sb.") instead of "noun", in keeping with the tradition in early grammars of distinguishing between a "noun substantive" and a "noun adjective". In general, the term "sb." is only applied when it is necessary to differentiate a noun entry from an entry for a word of the same spelling, but with a different part of speech, or sometimes in instances where there are several noun homonyms, in which case a homonym number is added. The more usual convention for noun homonyms is to add the number to the headword itself.

Pronunciation <IPR>

The second edition of the OED employs the International Phonetic Alphabet for transcribing pronunciation, in contrast to the first edition which used a system invented by its primary editor, James Murray (still shown in the database and tagged <MPR>). In print, pronunciation, when given, appears in brackets immediately following the headword. The Dictionary gives the pronunciation of most current, "main" headwords, with the exception of some derivatives and combinations, and some single-syllable words, where pronunciation is self-evident. Stress-marks, indicating emphasis, are sometimes included for these exceptions as well as for obsolete words for which pronunciation is not normally supplied. (See also Stressed Form.)

Pronunciation is, in most cases, in accordance with standard southern British speech, although alternative British or non-British usages may sometimes be included. A special parallels symbol precedes some foreign pronunciation alternatives (see Status).

Pseudonym <PSA>

Where the author of a quotation used an assumed or pen name, he or she is usually cited by the pseudonym which appears in print in the OED within single quotation marks. The latter are eliminated in the case of certain well-known pseudonymous authors such as George Eliot. For authors who have used both their real names and one or more pseudonyms, the name under which the particular cited work was published is normally given.

Pseudo Quotation Paragraph <PQP>

Identifies paragraphs of quotations that illustrate a number of word forms, rather than a single word or sense. These forms are usually Bold Sub-Headwords or Italic Sub-Headwords included within entries and often listed in alphabetical sequence within a single sense, e.g., "television announcer, audience, broadcast, commercial, crew, critic, discussion ..". The accompanying so-called "pseudo" quotation paragraph usually organizes citations in the same order. As an aid to readers, an asterisk often precedes the initial, i.e., chronologically first, quotation in each grouping.

Quotation <Q>

The second edition of the OED contains nearly two and a half million quotations which perform the important function of illustrating the use, form, history, and meaning of word forms in a given sense. Normally quotations pertaining to a particular sense are organized in a quotation paragraph in chronological order by date of publication or composition. Citations typically include the following elements: date <D>; author <A>; work <W> (i.e., title), with the location within the work, such as chapter, page, act, scene, etc.; and the quotation text <T>. Quotations are drawn from all forms of written and published works, including books, manuscripts, journals, newspapers, letters, and diaries, and represent both literary and popular sources.

The policy of the first edition, which dealt with most of the "core" words in the English language, was to include at least one example of use per century. This ratio, however, was increased considerably for entries added in the 1972-86 Supplement and the second edition.

Occasionally, in entries compiled for the first edition, no examples of contemporary usage could be found and illustrations were "made up". Such quotations are introduced by the abbreviation "Mod." (for "modern") and usually appear without a date. (See also Subsidiary Quotation.)

Quotation Paragraph <QP>

Definitions of words and senses are generally followed by a paragraph in smaller print which lists illustrative quotations in chronological sequence (earliest date first). Occasionally, when a sense covers both the literal and figurative use of a word, more than one quotation paragraph is used. (For an exception to these conventions, see Pseudo Quotation Paragraph.)

Quotation Text <T>

For each of the Dictionary's illustrative quotations, this structure contains only the actual phrase or passage extracted from the source, as opposed to the full citation included in the Quotation document structure. The texts are printed and spelled as they appear in the source edition used. Occasionally, a portion of a quotation text is eliminated and the omission is indicated by two dots (..), or three (...), if the elision includes a period. Sometimes an explanatory word may be inserted in square brackets, and the insert may be preceded by the abbreviation "sc." for "scilicet", meaning "understand" or "supply". In instances where the text quoted is a song title, advertisement or other unusual source, this information is usually given in brackets.

Status <ST>

Status tags enclose several types of symbols that usually precede a headword or sense and indicate the form's status in the language. These include the dagger symbol which identifies an obsolete entry or sense (also usually further identified by a label "Obs." following the headword); parallels signifying non-naturalized words or pronunciations; and the so-called "catachrestic" symbol (a reversed paragraph symbol) identifying a confused or erroneous sense. Within the <ST> tag field, these symbols are interpreted by the abbreviations "obs" (for "obsolete"), "ali" (for "alien"), and "err" (for "erroneous").

In addition, status tagging identifies two types of entries:

1. the numerous cross-reference entries, the headwords of which represent obsolete or variant spellings of main words, and which refer the user from these forms to the relevant "main" entry. These are identified by the abbreviation "xref".

2. a small number (387) of "spurious" entries which are entirely enclosed in square brackets. All of these entries were compiled for the first edition and consist of words that are erroneous, false, or could not be authenticated. Their purpose was primarily to correct errors found in earlier dictionaries resulting from copyists' or translators' errors, misprints, or misreadings of the text. These are identified by the abbreviation "spu".

Subentry <SE>

This structure consists mainly of Bold Sub-Headwords (i.e., defined and illustrated combinations and derivatives included within the entry for their main word) together with their definition text. Corresponding quotations can often be found in pseudo quotation paragraphs. (See also Headword and Italic Sub-Headword).

Variant Form <VF>

The OED attempts to include all documented earlier spellings, irregular inflexions, unusual plurals, etc. of headwords, where appropriate or known. These are contained in a Variant Forms List preceding the etymology. Regional labels are sometimes included to indicate the geographic area in which the particular form prevailed (or prevails). Many of these forms also appear as headwords in cross-reference entries (see Status).

Work <W>

Refers to the title of the work which was the source for a quotation. The title usually appears in italics following the author's name and preceding the actual quotation text. The work's text normally includes reference to the specific chapter, page, act, scene, etc. where the cited quotation can be found. Titles are frequently abbreviated, and the definite article "the", the conjunction "and", and the preposition "of" are routinely omitted. Abbreviations used for a single work can vary; for example, Shakespeare's "Comedy of Errors" appears as "Com. Err.", "C. Err." and "Err." (for an example of a search by title, see D.L. Berg, 1989). Some works, such as anonymous early texts like "Beowulf" and "Cursor Mundi", are cited by title only. The Bible is a special case; for example, books of the Bible are sometimes tagged <W> with "Bible" as author, especially for the 1611 King James version (for a discussion of the numerous variations in citing translations of the Bible, see D.L. Berg, 1993).

Identification of early and obscure works is frequently difficult and can be aided by reference to the Bibliography which appears at the end of Volume 20 of the printed text, and which includes most, but not all, of the titles cited. A notable exception, since this is a bibliography of English works, are the many foreign dictionaries and other word books often referred to in etymologies. (For a discussion of problems associated with matching citations in the Dictionary text to the Bibliography, see G.V.J. Townsend, "Citation Matching in the Oxford English Dictionary". UW Centre for the New OED, 1989.)


Cited Form <CF>

A word form in a foreign language, or in an earlier or regional form of English, that is referred to in an etymology, usually in the context of its role in explaining the history or origin of a headword. These forms appear in italics in the printed text preceded by the language of origin (often abbreviated as L., Gr., OF, etc.). Words or phrases in this category that occur in the context of elements other than etymologies or sub-etymologies are somewhat problematic, since they represent automatic tagging of italicized forms which do not necessarily conform to this definition. A few such anomalies will be found in the etymological texts as well. (See also Language.)

Cross-Reference <XR>

Cross-references are widely used in the OED to refer readers to other entries or to another part of the same entry. This structure includes four categories of cross-reference elements: Cross-Reference Headword <XL>, Cross-Reference Italic Sub-Headword <XIL>, Cross-Reference Date <XDAT>, and Relative Cross-Reference <RX>.

Cross-Reference Date <XDAT>

Appears within cross-references and is found primarily in definition texts where users are referred to a quotation by date in the supporting quotation paragraph which follows. Such quotations usually supply supplementary information about the meaning of the word, or sometimes provide the entire explanation of meaning.

Cross-Reference Headword <XL>

Frequent references are found in OED entries to the headwords of other entries, especially in etymologies. In the printed text, the cross-referenced headword is printed in small roman capitals, followed by a homonym number <HO>, if relevant, and sometimes by the specific location within the target entry (see Cross-Reference Sense Number <SN>).

Cross-Reference Italic Sub-Headword <XIL>

These forms are primarily italicized combinations cited in entries which are found in another entry. They are frequently followed by a Cross-Reference Headword, indicating the main entry in which they will be found. The abbreviation "s.v." sometimes precedes the headword reference, meaning the combination is found "sub voce", or "under the word".

Cross-Reference Sense Number <SN>

The Dictionary contains a large number of cross-references within entries which act as pointers to other entries. In such cases, the relevant headword is cited (see Cross-Reference Headword), and the specific sense number (or number and letter) may also be included. If a reference does not include a sense number, the user must search the target entry for the relevant information. (Compare with Sense Number <#>.)

Headword <HL>

The subject of a Dictionary entry which appears in dark bold type in the printed text. An OED headword can be a word, combination, derivative, phrase, prefix, suffix, combining form, abbreviation, acronym, letter of the alphabet, or other lexical entity. Headwords of main entries are usually the most common form of a word in current use, or the most typical of the later forms of an obsolete word. Headwords are sometimes preceded by symbols indicating their status in the language (see Status). Included within this structure are three forms of the headword: the Lookup Form <LF>, the Stressed Form <SF>, and the Murray Form <MF> (the stressed form used in the first edition of the OED, but not included in the printed second edition, nor as a document structure).

Note that it cannot be concluded that a word form is not defined or its use illustrated in the OED if it does not appear as a headword. Many other forms are defined and/or illustrated within entries for their "main" words (see also Bold Sub-Headword and Italic Sub-Headword).

Homonym Number <HO>

Homonym numbers are used to distinguish between or among headwords with the same spelling and part of speech, but which warrant separate entries because of their distinct meanings and histories. The number appears in the text as a superscript attached to a part-of-speech designation, or in the case of some nouns, to the headword itself. The number gives each headword a specific "address" which can be used in Dictionary cross-references (see Cross-Reference Headword).

Relative Cross-Reference <RX>

The OED contains a number of cross-references which use the terms "prec." (preceding) or "next" to indicate to Dictionary users that they should refer to the preceding or next entry, or, in some cases, to the preceding or next sense in the same entry. A frequent use of "prec.", for example, is found in etymologies of entries for derivatives or combinations which combine the headword of the previous or "preceding" entry with a suffix, combining form, or another word. The document file distinguishes this particular type of reference by tagging all the occurrences of "prec." and "next" within cross-references <XR>.

Sense Level 0 <S0>

The various senses and sub-senses in the OED are organized in a hierarchical scheme utilizing numbers and letters to distinguish steps in a headword's development. Sense development is usually chronological, starting with the earliest sense, except for some entries which follow "logical order". The simplest form of identifying senses is linear (1, 2, 3..), but often further subdivisions are required which are ordered a, b, c.. (with the letters in bold type). Further subdivisions are made by italicized series (a), (b), (c).. or (i), (ii), (iii).. , or, occasionally, small Greek letters (alpha, beta..).

When a word's development is not straightforwardly linear (for example, when groups of senses developed simultaneously or diversely), a second level of numbering and lettering employing upper case roman numerals (I, II, III..) identifies branches. Sometimes two parts of speech, such as noun and adjective, are included in one entry, and each "fork" is then identified by the highest level of the scheme, upper case letters (A, B, C..). The two upper levels may be integrated in one entry, and are also occasionally used for other purposes, such as organizing groups of senses syntactically or semantically.

Sense levels 1, 2, 4, 6, and 7 identify groups and senses numbered according to this scheme. Level 1 refers to A, B.. groups; Level 2 to the I, II.. groupings; Level 4 to structures numbered 1, 2..; Level 6 to the a, b.. sub-senses; and Level 7 to the italicized bracketed sub-division of sub-senses - (a), (i), or Greek letters. The remaining numbers are used as follows: Level 0 (zero) identifies unnumbered sense sections, such as initial over-arching text preceding a regular sense numbering, or unnumbered final paragraphs beginning with the word "hence" that usually contain one or more derivatives. Levels 3 and 5 contain increasing numbers of asterisks (*, **, ***..) that provide another means of grouping senses by semantic or syntactical headings in lengthy entries. Level 8 consists of irregular unnumbered senses, such as paragraphs preceded by a "catachrestic" symbol illustrating erroneous use of a sense (see Status).
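
The correspondence between printed markers and sense levels can be summarized in a short Python sketch. The function name candidate_levels is invented, Greek-letter markers are handled only in a bracketed spelled-out form such as "(alpha)", and the sketch returns every level whose numbering style fits, because a marker alone cannot always settle the level (a lone "I" could be a letter or a roman numeral, and asterisks serve both Levels 3 and 5).

```python
import re

def candidate_levels(marker):
    """Return the sense-level tags whose numbering style fits `marker`."""
    if marker == "":
        return ["S0", "S8"]          # unnumbered sections or irregular senses
    if re.fullmatch(r"\*+", marker):
        return ["S3", "S5"]          # *, **, ***..
    if re.fullmatch(r"\([a-z]+\)", marker):
        return ["S7"]                # (a), (ii), (alpha)..
    if re.fullmatch(r"\d+", marker):
        return ["S4"]                # 1, 2, 3..
    if re.fullmatch(r"[a-z]", marker):
        return ["S6"]                # a, b, c..
    candidates = []
    if re.fullmatch(r"[A-Z]", marker):
        candidates.append("S1")      # A, B, C..
    if re.fullmatch(r"[IVX]+", marker):
        candidates.append("S2")      # I, II, III..
    return candidates

for marker in ["A", "II", "**", "1", "b", "(a)", "I"]:
    print(repr(marker), candidate_levels(marker))
```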

Sense Level 1 <S1>

Identifies groups of senses lettered A, B, C.. and is primarily used to separate two (or more) parts of speech (e.g., noun and adjective) when they are included in a single entry.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 2 <S2>

Used to identify groups of senses numbered I, II.. , representing branches of meanings which developed simultaneously or diversely.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 3 <S3>

A structure which takes the form in the printed text of an increasing number of asterisks (*, **, *** ..), and is sometimes used in complex and lengthy entries to group senses under semantic or syntactical headings. Level 5 <S5> is also sometimes used for the same purpose.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 4 <S4>

The most common type of sense development structure in which senses are numbered consecutively 1, 2, 3.. For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 5 <S5>

A structure which takes the form in the printed text of an increasing number of asterisks (*, **, *** ..), and is sometimes used in complex and lengthy entries to group senses under semantic or syntactical headings. Sense Level 3 <S3> is also sometimes used for the same purpose.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 6 <S6>

Identifies the lower-case bold letter structure (a, b, c..) used to subdivide senses. For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 7 <S7>

Identifies the structure using italicized and bracketed letters (a), (b).., or numbers (i), (ii), (iii).., or, rarely, lower case Greek letters (alpha, beta ..) attached to sub-divisions of sub-senses, and usually found in lengthy and complex entries.

For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Level 8 <S8>

Used to identify irregular unnumbered senses. For a further explanation of sense structure and groupings, see Sense Level 0.

Sense Number <#>

A sense is a numbered and/or lettered entry component which includes as its major elements a definition and supporting quotation paragraph. The number or letter enclosed by sense number tags not only serves to organize the senses, it also provides a unique address for each sense, an important feature for cross-referencing. Sense identification is especially important in the OED since some entries contain 100 or more senses; for example, the verb "run" has 82 main senses and over 350 sub-senses. (For an explanation of how senses are structured, see Sense Level 0, and also compare with Cross-Reference Sense Number.)

Stressed Form <SF>

The full form of main headwords, bold sub-headwords, and italic sub-headwords. "Full form" means that each form incorporates diacritics, diphthongs, punctuation, stress marks, etc. as they appear in the printed Dictionary. In the database, each of these typographical elements is tagged, although not all headwords contain such elements, e.g., monosyllabic words and most combinations and derivatives, for which stress is self-evident. (Compare with Lookup Form.)

Sub-Etymology <ETN>

An etymology attached to a particular sense of a headword. These subordinate etymologies appear in square brackets in the printed text, and normally contain historical information relating to the sense of a word which does not lend itself to inclusion in the etymology at the head of the entry.

Subsidiary Quotation <SQ>

This structure contains quotations in square brackets which are occasionally found in quotation paragraphs, usually as the first citation(s). The convention is used when a quotation does not actually employ the word in context, but is in some way relevant to its history. For example, in the case of a word borrowed from another language, the quotation may document its use in the language of origin.

Superscript <su>

Typographical tagging in this category is attached to most text in the Dictionary which appears in superscript, with the exception of homonym numbers. Superscript text includes miscellaneous typographical conventions used in printing Murray pronunciations (see Pronunciation), mathematical functions, etc. In addition, it contains two special superior numbers preceded by a dash (-0 and -1) that sometimes further define the label "rare". In the first instance, the -0 indicates the word was found only in an earlier dictionary rather than a contextual quotation, while -1 means that only one quotation from a text other than a dictionary was found.

Variant Date <VD>

Earlier forms of spelling, irregular inflexions, etc. included in variant forms lists, are assigned century ranges indicating when their usage was prevalent. Centuries appear in abbreviated form, for example, "5-6" indicates fifteenth to sixteenth century.
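
A minimal Python sketch of decoding such a range follows. It generalizes from the single example given above by assuming that a lone digit d stands for century 10 + d and that two-digit values are already full century numbers; that rule, and the function name decode_century_range, are assumptions, and the earliest periods in particular may follow a different convention.

```python
def decode_century_range(code):
    """Expand a variant-date code such as "5-6" into (first, last) centuries."""
    def century(token):
        n = int(token)
        # Assumption: a single digit d abbreviates century 10 + d.
        return n + 10 if n < 10 else n

    parts = code.strip().split("-")
    return (century(parts[0]), century(parts[-1]))

print(decode_century_range("5-6"))
print(decode_century_range("7"))
```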

Variant Forms List <VL>

Lists of documented historical, or sometimes contemporary, variants of a headword's spellings, irregular inflexions, and unusual plurals that normally appear in the printed text immediately before the etymology. Forms are further identified by the century range in which they prevailed. Lists are arranged in chronological order with the earliest variant(s) first. In some cases, two or more branches of forms may have developed simultaneously and these are grouped by lower case italic Greek letters. Illustrative quotations that follow are often referenced by the same Greek letters. (See also Variant Date for conventions used for century ranges.)