9. Patterns - TEI Lex-0 — A baseline encoding for lexicographic data — Clarity for lexicographers. Stability for data. Compatibility for the future.

9. Patterns

9.1. Inheritance of xml:lang

Some elements in TEI Lex-0, such as <entry>, require the xml:lang attribute; others, such as <form> or <quote>, do not. In general, TEI Lex-0, unlike TEI, recommends attaching xml:lang to so-called container elements (for instance, <entry> and <cit>) rather than to individual word forms or textual segments. An element without its own xml:lang inherits the language of its nearest ancestor that has one.

TODO: Add some examples

So how can we extract all orthographic forms in a particular language? We can use an XPath expression like this: //orth[ancestor-or-self::*[@xml:lang][1][@xml:lang='en']].

This XPath expression identifies:

each orth element, regardless of where it is in the document (//)
but only if the element itself or one of its ancestors has the @xml:lang attribute ([ancestor-or-self::*[@xml:lang]])
of the elements found this way, we take only the first, i.e. the one nearest to the orth element, which may be the orth element itself ([1])
finally, we keep the orth element only if the @xml:lang attribute of that nearest element has the value 'en' ([@xml:lang='en'])
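
The same lookup can be sketched outside XPath. The following Python example is a minimal illustration rather than part of the guidelines: it uses only the standard library, whose ElementTree module does not support the ancestor-or-self axis, so the nearest xml:lang is carried down the tree during a recursive walk instead. The sample entry and the function name orths_in_language are assumptions made for the example, and the TEI namespace is omitted for brevity.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, simplified sample data (no TEI namespace).
SAMPLE = """
<body>
  <entry xml:id="dog" xml:lang="en">
    <form type="lemma"><orth>dog</orth></form>
    <sense xml:id="dog.1">
      <cit type="translationEquivalent" xml:lang="de">
        <form><orth>Hund</orth></form>
      </cit>
    </sense>
  </entry>
</body>
"""


def orths_in_language(root, wanted):
    """Return the text of every <orth> whose nearest xml:lang equals `wanted`."""
    found = []

    def walk(element, inherited):
        # An element's own xml:lang overrides the value it inherited.
        lang = element.get(XML_LANG, inherited)
        if element.tag == "orth" and lang == wanted:
            found.append(element.text)
        for child in element:
            walk(child, lang)

    walk(root, None)
    return found


root = ET.fromstring(SAMPLE)
print(orths_in_language(root, "en"))  # ['dog']
print(orths_in_language(root, "de"))  # ['Hund']
```

Neither orth element carries xml:lang itself: the first inherits 'en' from <entry>, while the second inherits 'de' from the nearer <cit>, which is why it is not returned for 'en'.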

If your dictionary uses multiple language tags for one language (such as 'en', 'en-GB' and 'en-US') and you want to capture all language varieties with one XPath expression, you can use the XPath lang() function, as in: //orth[ancestor-or-self::*[@xml:lang][1][lang('en')]].

While the predicate [@xml:lang='en'] will match only those elements whose xml:lang is exactly equal to 'en', the predicate with the function [lang('en')] will match all the elements whose language is tagged as either English (i.e. 'en') or one of its 'sublanguages' such as 'en-GB'.
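
To make the difference concrete, here is a small Python sketch that contrasts the two kinds of match. It is an illustration under stated assumptions, not an official implementation: lang_matches is a hypothetical helper that imitates the rule defined for lang() in XPath 1.0 (the tag equals the argument, or equals it once a suffix starting with '-' is ignored, compared case-insensitively), and the sample entries are invented.

```python
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical, simplified sample data (no TEI namespace).
SAMPLE = """
<body>
  <entry xml:id="colour" xml:lang="en-GB"><form type="lemma"><orth>colour</orth></form></entry>
  <entry xml:id="color" xml:lang="en-US"><form type="lemma"><orth>color</orth></form></entry>
  <entry xml:id="dog" xml:lang="en"><form type="lemma"><orth>dog</orth></form></entry>
  <entry xml:id="hund" xml:lang="de"><form type="lemma"><orth>Hund</orth></form></entry>
</body>
"""


def lang_matches(tag, wanted):
    """Imitate XPath lang(): same tag, or `wanted` plus a '-' suffix, ignoring case."""
    if tag is None:
        return False
    tag, wanted = tag.lower(), wanted.lower()
    return tag == wanted or tag.startswith(wanted + "-")


def orths_matching(root, wanted, exact):
    """Collect <orth> texts by nearest xml:lang, using exact or lang()-style matching."""
    found = []

    def walk(element, inherited):
        lang = element.get(XML_LANG, inherited)
        if element.tag == "orth":
            hit = (lang == wanted) if exact else lang_matches(lang, wanted)
            if hit:
                found.append(element.text)
        for child in element:
            walk(child, lang)

    walk(root, None)
    return found


root = ET.fromstring(SAMPLE)
print(orths_matching(root, "en", exact=True))   # ['dog']
print(orths_matching(root, "en", exact=False))  # ['colour', 'color', 'dog']
```

Because the suffix has to begin with a hyphen, a tag such as 'enm' (Middle English) is not treated as a variety of 'en'.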

If you are new to XPath, you can consult the DARIAH-Campus tutorial "XPath for Dictionary Nerds".