https://disa-dhil.github.io/tei-summer-sessions/
Joey Takeda
Digital Humanities Innovation Lab, SFU | Digital Scholarship in the Arts (DiSA), UBC
August 11, 2025
NFC character | A | m | é | l | i | e | |
---|---|---|---|---|---|---|---|
Composed code point | 0041 | 006d | 00e9 | 006c | 0069 | 0065 | |
Decomposed code point | 0041 | 006d | 0065 | 0301 | 006c | 0069 | 0065 |
NFD character | A | m | e | ◌́ | l | i | e |
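
The composed and decomposed rows above can be reproduced with a short script. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `code_points` is made up for illustration and is not part of the workshop materials.

```python
import unicodedata

# "Amélie" written with the single composed code point U+00E9 for é.
name = "Am\u00e9lie"

nfc = unicodedata.normalize("NFC", name)  # composed form
nfd = unicodedata.normalize("NFD", name)  # decomposed form


def code_points(text):
    """Return the four-digit hex code point of every character in text."""
    return [f"{ord(ch):04x}" for ch in text]


print(code_points(nfc))  # ['0041', '006d', '00e9', '006c', '0069', '0065']
print(code_points(nfd))  # ['0041', '006d', '0065', '0301', '006c', '0069', '0065']

# The two strings look identical on screen but do not compare as equal
# until both are brought to the same normalization form.
print(nfc == nfd)                                 # False
print(unicodedata.normalize("NFC", nfd) == nfc)   # True
```

Normalizing both sides to one form before comparing or searching avoids misses caused by the two encodings of é.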
Base Emoji | U+1F3FB (Light) | U+1F3FC (Medium-Light) | U+1F3FD (Medium) | U+1F3FE (Medium-Dark) | U+1F3FF (Dark) |
---|---|---|---|---|---|
👍 | 👍🏻 | 👍🏼 | 👍🏽 | 👍🏾 | 👍🏿 |
🧑 | 🧑🏻 | 🧑🏼 | 🧑🏽 | 🧑🏾 | 🧑🏿 |
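
Each cell in the table above is a base emoji immediately followed by one of the five modifier code points. A minimal sketch of that composition, assuming illustrative names (`MODIFIERS`, `with_skin_tone`) that do not come from the slides:

```python
# Base emoji, written as escapes so the code points are visible.
THUMBS_UP = "\U0001F44D"  # 👍
PERSON = "\U0001F9D1"     # 🧑

# The five skin tone modifier code points from the table header.
MODIFIERS = {
    "light": "\U0001F3FB",
    "medium-light": "\U0001F3FC",
    "medium": "\U0001F3FD",
    "medium-dark": "\U0001F3FE",
    "dark": "\U0001F3FF",
}


def with_skin_tone(base, tone):
    """Append the modifier code point for `tone` directly after the base emoji."""
    return base + MODIFIERS[tone]


medium_thumb = with_skin_tone(THUMBS_UP, "medium")
dark_person = with_skin_tone(PERSON, "dark")

# Each result renders as one glyph but is stored as two code points.
print(medium_thumb, [f"U+{ord(ch):X}" for ch in medium_thumb])
print(dark_person, [f"U+{ord(ch):X}" for ch in dark_person])
```

Whether the pair is drawn as a single glyph depends on the font and platform; the underlying string is always two code points.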
Description | Emojis | Result |
---|---|---|
Occupation (Person + medical symbol) | 🧑 + ⚕ | 🧑⚕ |
Family (Person + person + baby) | 👩 + 👩 + 👧 | 👩👩👧 |
Objects (Flag + rainbow) | 🏳 + 🌈 | 🏳️🌈 |
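
Combined emoji like these are built by placing a zero width joiner (U+200D) between the component emoji. The sketch below assembles the three rows of the table; the helper `join_emoji` is a hypothetical name, and the variation selector U+FE0F is included where the fully qualified sequences use it.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16, requests emoji presentation


def join_emoji(*parts):
    """Glue emoji components together with zero width joiners."""
    return ZWJ.join(parts)


PERSON = "\U0001F9D1"          # 🧑
MEDICAL = "\u2695" + VS16      # ⚕️
WOMAN = "\U0001F469"           # 👩
GIRL = "\U0001F467"            # 👧
WHITE_FLAG = "\U0001F3F3" + VS16  # 🏳️
RAINBOW = "\U0001F308"         # 🌈

health_worker = join_emoji(PERSON, MEDICAL)
family = join_emoji(WOMAN, WOMAN, GIRL)
rainbow_flag = join_emoji(WHITE_FLAG, RAINBOW)

for label, sequence in [
    ("health worker", health_worker),
    ("family", family),
    ("rainbow flag", rainbow_flag),
]:
    # One visible glyph (on supporting platforms), several code points.
    print(label, sequence, [f"U+{ord(ch):04X}" for ch in sequence])
```

On platforms that lack a glyph for a given sequence, the components are shown side by side instead, which is why these sequences degrade gracefully.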
The root TEI element has an @xml:lang="en"
We don't need an @xml:lang on the first title, since it inherits English from the root
But we do need to say that the second title is in French
And these can nest
The content of the text will be declared to be in English (because of the root @xml:lang)
But individual segments (or the whole body itself) could have a new @xml:lang value
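
The inheritance described above can be checked programmatically. The following is a minimal sketch, not the workshop's own example: the sample document is a deliberately simplified fragment (it is not a complete, valid TEI file), and the function name `effective_langs` is made up for illustration. It uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Simplified, illustrative fragment; a real TEI header needs more elements.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="en">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>An Example</title>
        <title xml:lang="fr">Un exemple</title>
      </titleStmt>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>She said <seg xml:lang="fr">bonjour</seg> to everyone.</p>
    </body>
  </text>
</TEI>"""


def effective_langs(element, inherited=None):
    """Yield (local element name, language in effect) for element and descendants."""
    # An element's own xml:lang wins; otherwise the ancestor's value carries down.
    lang = element.get(XML_LANG, inherited)
    local_name = element.tag.split("}")[-1]
    yield local_name, lang
    for child in element:
        yield from effective_langs(child, lang)


root = ET.fromstring(SAMPLE)
langs = list(effective_langs(root))
for name, lang in langs:
    print(f"{name}: {lang}")
```

In this sample, the first title and the paragraph resolve to `en` from the root, while the second title and the `seg` resolve to `fr` from their own attributes.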
Language subtag (e.g. en), optional script subtag (e.g. Latn), optional region subtag (e.g. CA)

Language Tag | Language |
---|---|
en | English |
en-CA | Canadian English |
es | Spanish |
ru | Russian |
de | German |
zh | Chinese |
ja-Latn | Japanese in Latin script (romaji) |
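
The language, script, and region parts of a tag can be told apart by their length and character type. The sketch below is a deliberately simplified illustration: `parse_language_tag` is a hypothetical helper, it handles only those three subtag kinds (no variants, extensions, or private-use subtags), and it does not check values against the IANA registry.

```python
def parse_language_tag(tag):
    """Split a simple language tag into language, script, and region subtags.

    Simplifying assumptions: the first subtag is the language; a 4-letter
    subtag is a script; a 2-letter or 3-digit subtag is a region.
    Anything else is ignored.
    """
    subtags = tag.split("-")
    result = {"language": subtags[0].lower(), "script": None, "region": None}
    for subtag in subtags[1:]:
        if not subtag.isascii():
            continue
        if len(subtag) == 4 and subtag.isalpha():
            # Conventional casing for scripts: Latn, Cyrl, Hans
            result["script"] = subtag.title()
        elif (len(subtag) == 2 and subtag.isalpha()) or (
            len(subtag) == 3 and subtag.isdigit()
        ):
            # Conventional casing for regions: CA, GB, 419
            result["region"] = subtag.upper()
    return result


for example in ["en", "en-CA", "ja-Latn", "zh-Hant-TW"]:
    print(example, parse_language_tag(example))
```

The casing applied here is only a convention; language tags are compared case-insensitively.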
Use the IANA Subtag Registry Lookup Tool: https://r12a.github.io/app-subtags/ to find the language codes for:
Option 1: Embedded Note
Option 2: Structural, implicit
Option 3: Structural, aligned
Option 4: Link Groups