Encoding Entities: People, Organizations, Time, and Place

Joey Takeda

Digital Humanities Innovation Lab, SFU | Digital Scholarship in the Arts (DiSA), UBC

July 23, 2025

Today

  1. 10:00-12:00: Presentation and Workshop
  2. 12:00-12:30: Lunch
  3. 12:30-1:15: Case Study: Lyon in Mourning
  4. 1:15-2:00: Open Discussion

https://disa-dhil.github.io/tei-summer-sessions/

Before we begin...

  • This is not going to be an introduction to TEI
  • But completely OK if you aren't familiar!
  • Main point is to understand the affordances of TEI for encoding "entities" (people, places, organizations)
  • But also: please feel free to interrupt, ask questions, etc

Today

  1. Basic Principles of Entities (People, Places)
  2. Introduction to the person element
  3. Finding Information about People
  4. Introduction to the place element
  5. Locating places: Demo (Vertexer)

What is an entity?

  • An entity is a specific thing in the world
  • E.g. the things proper nouns refer to: people, places, books, events, organizations
  • Often described as "named" entities (i.e. things with a name), but there are cases where this becomes a bit tricky (as we'll talk about later)

Encoding Entities: Two Parts

  • Definition: The authority/canonical record
  • Where you define information about the entity
  • This is usually defined externally -- e.g. in a separate file or as a "standoff" entity
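
A minimal sketch of what such a definition might look like. The identifier "john1" is a hypothetical project-local value chosen for illustration; the person is E. Pauline Johnson, the featured author used in the activity later in this session.

```xml
<!-- Definition: the canonical record for one person. -->
<!-- "john1" is a hypothetical identifier, not taken from any real project. -->
<person xml:id="john1">
  <persName>E. Pauline Johnson</persName>
</person>
```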

Anatomy of an *Ography

  • Databases of entities are often known as "standoff" in TEI. Usually these are kept in separate files: e.g. a bibliography, an orgography, a placeography (aka a geography?)
  • Contained by a list* element (e.g. listBibl, listPerson)
  • Each entity has a unique @xml:id to identify it canonically
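
One possible arrangement, sketched under assumptions: the list sits in a standOff element, though projects also keep it in the header or in a file of its own. Both xml:id values and the "Jane Doe" entry are invented for illustration.

```xml
<standOff>
  <!-- The list* container: here, a personography -->
  <listPerson>
    <!-- Each entry carries a unique xml:id -->
    <person xml:id="john1">
      <persName>E. Pauline Johnson</persName>
    </person>
    <person xml:id="doe1">
      <persName>Jane Doe</persName>
      <note>Hypothetical second entry.</note>
    </person>
  </listPerson>
</standOff>
```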

Encoding Entities: Two Parts

  • Reference: This is where you mention the person or place in your text in some way and link to the definition
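
A sketch of what a reference might look like in running text. The sentence is invented, and "#john1" and "#vanc1" assume that a person and a place with those xml:id values are defined elsewhere in the same file. If the definitions live in a separate file, the pointer would include the file name (e.g. a hypothetical "personography.xml#john1").

```xml
<p>The poem was first performed by
  <persName ref="#john1">Miss Johnson</persName> in
  <placeName ref="#vanc1">Vancouver</placeName>.</p>
```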

Tagging Entities

  • You can also tag using the canonical reference to an external source
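
A sketch of pointing directly at an external authority record. The VIAF URL below ends in a placeholder, not a real identifier, and would need to be replaced with the actual record URI. The second line assumes you want both a local and an external pointer; @ref accepts several pointers separated by whitespace.

```xml
<p>
  <!-- External pointer only; "PLACEHOLDER" stands in for a real VIAF number -->
  <persName ref="https://viaf.org/viaf/PLACEHOLDER">Pauline Johnson</persName>
  <!-- Local definition plus external record (both values hypothetical) -->
  <persName ref="#john1 https://viaf.org/viaf/PLACEHOLDER">Miss Johnson</persName>
</p>
```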

Where to Find Data: VIAF

  • VIAF: The Virtual International Authority File
  • Linked data service that uses openly available linked data (library catalogues, Wikidata, etc) to create a combined authority record
  • https://viaf.org/en

Other Resources

  • Wikidata
  • Trove: https://trove.nla.gov.au/people/1247624
  • WorldCat: https://id.oclc.org/worldcat/entity/E39PBJq6WD9qK476jfWk4qhvHC
  • Plus many more "identifiers"

Wait, so Why Create Entities?

  • Why create records for people, places, bibliographies if there are already so many resources available about them?
  • Aren't we just writing the same stuff again and again and again?

Why Create Entities?

  • Not all people are in an existing database and not all databases are correct
  • Individual projects will often want to have more precise definitions of the person -- e.g. local definitions for use within the project
  • Provide more in-depth encoding for project-specific needs

Basic Components of an Entity

  • characteristics or traits which do not, by and large, change over time
  • characteristics or states which hold true only at a specific time
  • events or incidents which may lead to a change of state or, less frequently, trait
  • external resources where other information on the subject can be found
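
To make the four components concrete, here is a sketch using TEI's generic trait, state, and event elements plus idno. Every value is invented: "Jane Doe", the dates, and the example.org URL are all hypothetical.

```xml
<person xml:id="doe1">
  <persName>Jane Doe</persName>
  <!-- Trait: does not, by and large, change over time -->
  <trait type="physical">
    <label>Eye colour</label>
    <desc>Brown</desc>
  </trait>
  <!-- State: holds true only at a specific time -->
  <state type="membership" from="1870" to="1885">
    <label>Member of a printers' guild</label>
  </state>
  <!-- Event: may lead to a change of state -->
  <event when="1885">
    <desc>Left the guild.</desc>
  </event>
  <!-- External resource (hypothetical URL) -->
  <idno type="URI">https://example.org/people/jane-doe</idno>
</person>
```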

States vs Traits

  • Very few things are traits!
  • And in most cases, you don't have to decide whether something is a state or a trait
  • Instead, TEI provides many specific elements for describing an entity

People: Names

  • persName (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc.
  • surname (surname) contains a family (inherited) name, as opposed to a given, baptismal, or nick name.
  • forename (forename) contains a forename, given or baptismal name.
  • roleName (role name) contains a name component which indicates that the referent has a particular role or position in society, such as an official title or rank.
  • addName (additional name) contains an additional name component, such as a nickname, epithet, or alias, or any other descriptive phrase used within a personal name.
  • nameLink (name link) contains a connecting phrase or link used within a name but not regarded as part of it, such as van der or of.
  • genName (generational name component) contains a name component used to distinguish otherwise similar names on the basis of the relative ages or generations of the persons named.
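
A sketch of how these name components might combine. The @type values are assumptions, not a fixed vocabulary, and how finely to break down a name is a project decision. The second and third names are there only to illustrate nameLink and genName.

```xml
<listPerson>
  <person>
    <persName>
      <roleName type="honorific">Miss</roleName>
      <forename>Emily</forename>
      <forename>Pauline</forename>
      <surname>Johnson</surname>
      <addName type="alias">Tekahionwake</addName>
    </persName>
  </person>
  <person>
    <!-- nameLink: a connecting word that is not part of the surname -->
    <persName><forename>Vincent</forename> <nameLink>van</nameLink> <surname>Gogh</surname></persName>
  </person>
  <person>
    <!-- genName: distinguishes people by generation -->
    <persName><forename>Charles</forename> <genName>II</genName></persName>
  </person>
</listPerson>
```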

People: States in Time

  • States can use a set of date attributes to locate them in time
  • @when: A precise date
  • @from and @to: The start and end of a date range
  • @notBefore and @notAfter: The earliest and latest possible dates, for when the exact date is uncertain
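
A sketch of the three dating patterns on a single record. The person and all dates are invented for illustration; the attribute values use the usual YYYY or YYYY-MM-DD forms.

```xml
<person xml:id="doe1">
  <persName>Jane Doe</persName>
  <!-- @when: a single precise date -->
  <birth when="1850-06-01">1 June 1850</birth>
  <!-- @from and @to: a known range -->
  <residence from="1870" to="1885">Lyon</residence>
  <!-- @notBefore and @notAfter: earliest and latest possible dates -->
  <death notBefore="1900" notAfter="1905">early 1900s</death>
</person>
```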

People: Events

  • birth (birth) contains information about a person's birth, such as its date and place.
  • death (death) contains information about a person's death, such as its date and place.
  • event (event) contains data relating to anything of significance that happens in time.
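
A sketch of the three event elements together. The dates and places are the commonly cited ones for E. Pauline Johnson and should be checked against your own sources before reuse; "john1" is a hypothetical identifier.

```xml
<person xml:id="john1">
  <persName>E. Pauline Johnson</persName>
  <birth when="1861-03-10">
    <placeName>Six Nations of the Grand River</placeName>
  </birth>
  <death when="1913-03-07">
    <placeName>Vancouver</placeName>
  </death>
  <!-- A generic event for anything that is not a birth or death -->
  <event when="1895">
    <desc>Publication of <title>The White Wampum</title>.</desc>
  </event>
</person>
```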

People: States

  • affiliation (affiliation) contains an informal description of a person's present or past affiliation with some organization, for example an employer or sponsor.
  • age (age) specifies the age of a person.
  • education (education) contains a description of the educational experience of a person.
  • faith (faith) specifies the faith, religion, or belief set of a person.
  • floruit (floruit) contains information about a person's period of activity.
  • gender (gender) specifies the gender identity of a person, persona, or character.

People: States

  • langKnowledge (language knowledge) summarizes the state of a person's linguistic knowledge, either as prose or by a list of langKnown elements.
  • nationality (nationality) contains an informal description of a person's present or past nationality or citizenship.
  • occupation (occupation) contains an informal description of a person's trade, profession or occupation.
  • persName (personal name) contains a proper noun or proper-noun phrase referring to a person, possibly including one or more of the person's forenames, surnames, honorifics, added names, etc.
  • persona provides information about one of the personalities identified for a given individual, where an individual has multiple personalities.
  • persPronouns (personal pronouns) indicates the personal pronouns used, or assumed to be used, by the individual being described.
  • residence (residence) describes a person's present or past places of residence.
  • sex (sex) specifies the sex of an organism.
  • socecStatus (socio-economic status) contains an informal description of a person's perceived social or economic status.
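
A sketch of a few of these state elements on one record. The person, values, and the "#lyon1" pointer are all invented; the point is only that each state can carry its own dates and can link out to other entities.

```xml
<person xml:id="doe1">
  <persName>Jane Doe</persName>
  <occupation from="1870" to="1885">Printer</occupation>
  <residence from="1870" to="1885">
    <!-- Assumes a place with xml:id "lyon1" is defined elsewhere -->
    <placeName ref="#lyon1">Lyon</placeName>
  </residence>
  <nationality>French</nationality>
  <langKnowledge>
    <langKnown tag="fr">French</langKnown>
    <langKnown tag="en">English</langKnown>
  </langKnowledge>
</person>
```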

People: Personae

  • People can also have individual personae, which function as embedded person elements
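
A sketch of a persona nested inside a person, using a pen name as the example. Whether a pen name warrants its own persona is an encoding decision, and the identifiers and @type value here are hypothetical.

```xml
<person xml:id="evan1">
  <persName>Mary Ann Evans</persName>
  <!-- The persona takes the same kinds of content as a person -->
  <persona xml:id="evan1_eliot">
    <persName type="pseudonym">George Eliot</persName>
    <occupation>Novelist</occupation>
  </persona>
</person>
```

A reference in the text could then point at either the person ("#evan1") or the persona ("#evan1_eliot"), depending on which is being named.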

Activity! Encoding People

Activity!

https://thepeopleandthetext.ca/featured-authors/EPaulineJohnson

Discussion

Places

  • Places are very similar to people, but have far fewer elements
  • But conceptually challenging: many changeable and contested states (or traits)
  • Places can also nest inside one another
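
A sketch of nested places. The identifiers and @type values are hypothetical, and the coordinates for Vancouver are approximate (latitude then longitude, separated by a space).

```xml
<listPlace>
  <place xml:id="can1" type="country">
    <placeName>Canada</placeName>
    <place xml:id="bc1" type="province">
      <placeName>British Columbia</placeName>
      <place xml:id="vanc1" type="city">
        <placeName>Vancouver</placeName>
        <location>
          <!-- Approximate coordinates, for illustration only -->
          <geo>49.2827 -123.1207</geo>
        </location>
      </place>
    </place>
  </place>
</listPlace>
```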

Places

  • How to represent nested nature of place?
  • Countries: Top level or nested?
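
One alternative to nesting place elements inside one another, sketched with hypothetical identifiers: keep every place at the top level and record the containing country and region inside location. Neither approach is the correct one; the choice depends on what the project needs to query or display.

```xml
<listPlace>
  <place xml:id="can1" type="country">
    <placeName>Canada</placeName>
  </place>
  <place xml:id="vanc1" type="city">
    <placeName>Vancouver</placeName>
    <location>
      <!-- Points back to the top-level country record -->
      <country ref="#can1">Canada</country>
      <region>British Columbia</region>
    </location>
  </place>
</listPlace>
```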

Activity: Geolocating using Vertexer

Discussion

Lunch

Case Study

Next Sessions

  • August 11 (UBC): Language, Speech, and Thought