Relationship Schema

This file defines how entities connect. It’s a companion to schema-entities.md (which defines entities).


Core Philosophy: Event-Sourced, Computed State

We follow an event-sourcing paradigm for relationships:

  1. Store events, not state — Record “married on date X” and “divorced on date Y”, not “married: true”
  2. Compute current state — Current relationship status is derived from event history
  3. Derive complex relationships — Siblings, grandparents, cousins are computed from primitives
  4. Timestamps on everything — Every relationship has temporal data

| Approach | Characteristics |
| --- | --- |
| Stored state | No history, data conflicts, “when did this change?”, manual sync |
| Events | Full audit trail, temporal queries natural, no conflicts (events are facts) |

Influenced by:

  • FamilySearch GEDCOM — Marriage is an event with date/place, not a boolean
  • Facebook TAO — All associations have timestamps, ordered by time
  • GEDCOM X — Relationships are first-class entities with facts, sources, confidence

Wrong (stored state):

```json
{
  "person": "joe",
  "employer": "acme",
  "employed": true
}
```

Right (events):

```json
{
  "type": "employment.started",
  "person": "joe",
  "organization": "acme",
  "date": "2020-01-15"
}
```

To know whether Joe works at Acme today, find the most recent employment event for that pair and check its type.
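
A minimal sketch of that lookup in TypeScript. Only `employment.started` appears in this document; the `employment.ended` event type, the `EmploymentEvent` shape, and the `isEmployedAt` helper are assumptions made for illustration.

```typescript
// Hypothetical event shape. "employment.ended" is an assumed counterpart
// to the "employment.started" event shown above.
interface EmploymentEvent {
  type: "employment.started" | "employment.ended";
  person: string;
  organization: string;
  date: string; // full ISO 8601 date, so string order equals chronological order
}

// True when the latest event for the pair on or before `asOf` is a start.
function isEmployedAt(
  events: EmploymentEvent[],
  person: string,
  organization: string,
  asOf: string,
): boolean {
  const relevant = events
    .filter(e => e.person === person && e.organization === organization && e.date <= asOf)
    .sort((a, b) => a.date.localeCompare(b.date));
  const latest = relevant[relevant.length - 1];
  return latest !== undefined && latest.type === "employment.started";
}

const history: EmploymentEvent[] = [
  { type: "employment.started", person: "joe", organization: "acme", date: "2020-01-15" },
];
const employedNow = isEmployedAt(history, "joe", "acme", "2026-01-24");
```

Passing `asOf` explicitly keeps the function pure and lets the same code answer “did Joe work at Acme on date X?”.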


Goal: First-class import/export with GEDCOM 7.0 files.

| GEDCOM Structure | Our Mapping | Notes |
| --- | --- | --- |
| FAM record | Couple relationship | Links two partners + children |
| INDI.FAMC | Child-of relationship | With PEDI qualifier |
| INDI.FAMS | Partner-in relationship | Bidirectional with FAM |
| MARR, DIV, ANUL | Events on couple | Date, place, type |
| PEDI | Lineage type | BIRTH, ADOPTED, FOSTER, SEALING |
| ASSO + ROLE | Association | Generic relationships |

| GEDCOM Value | Our Value | Meaning |
| --- | --- | --- |
| BIRTH | biological | Birth parents |
| ADOPTED | adoptive | Legal adoption |
| FOSTER | foster | Foster care |
| SEALING | sealed | LDS temple sealing |

| GEDCOM Tag | Event Type | Description |
| --- | --- | --- |
| MARR | marriage | Legal/customary marriage |
| DIV | divorce | Legal dissolution |
| DIVF | divorce_filed | Filing for divorce |
| ANUL | annulment | Marriage declared void |
| ENGA | engagement | Agreement to marry |
| MARB | marriage_bann | Public notice of intent |
| MARC | marriage_contract | Prenuptial agreement |
| MARL | marriage_license | Legal license obtained |
| MARS | marriage_settlement | Property agreement |

  1. Preserve GEDCOM IDs — Store original @F1@, @I1@ in data.gedcom_id
  2. Round-trip fidelity — What we import, we can export back
  3. Map events, not state — Convert MARR to marriage event, not married flag
  4. Handle NO structure — GEDCOM’s NO MARR = “no marriage occurred” assertion
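
The mapping tables above could be expressed as lookup objects, as in this sketch. The `ParsedFamilyEvent` and `ImportedEvent` shapes and the `toImportedEvent` helper are hypothetical; a real importer would sit behind a GEDCOM parser, and this sketch does not cover the `NO` structure.

```typescript
// PEDI value -> lineage, taken from the table above.
const PEDI_TO_LINEAGE: Record<string, string> = {
  BIRTH: "biological",
  ADOPTED: "adoptive",
  FOSTER: "foster",
  SEALING: "sealed",
};

// GEDCOM family-event tag -> our event type, taken from the table above.
const TAG_TO_EVENT_TYPE: Record<string, string> = {
  MARR: "marriage",
  DIV: "divorce",
  DIVF: "divorce_filed",
  ANUL: "annulment",
  ENGA: "engagement",
  MARB: "marriage_bann",
  MARC: "marriage_contract",
  MARL: "marriage_license",
  MARS: "marriage_settlement",
};

// Hypothetical, already-parsed view of one event under a FAM record.
interface ParsedFamilyEvent {
  tag: string;
  date?: string;
  place?: string;
}

// Hypothetical shape of the event we store on a couple relationship.
interface ImportedEvent {
  type: string;
  date?: string;
  place?: string;
}

// Maps a parsed GEDCOM event to an event record, never to a state flag.
// Returns undefined for tags outside the table so the caller can decide
// how to preserve them for round-trip export.
function toImportedEvent(parsed: ParsedFamilyEvent): ImportedEvent | undefined {
  const type = TAG_TO_EVENT_TYPE[parsed.tag];
  if (type === undefined) return undefined;
  const event: ImportedEvent = { type };
  if (parsed.date !== undefined) event.date = parsed.date;
  if (parsed.place !== undefined) event.place = parsed.place;
  return event;
}
```

Keeping the tables as plain data also makes the export direction a matter of inverting the same objects.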

Following FamilySearch’s elegant model, we use only two base relationship types:

A relationship between two people (spouses, partners, etc.).

```json
{
  "id": "rel_abc123",
  "type": "couple",
  "person1": "person_joe",
  "person2": "person_jane",
  "events": [
    {
      "type": "marriage",
      "date": "2015-06-20",
      "place": "San Francisco, CA"
    }
  ]
}
```

Why not spouse_of? Because:

  • Couples aren’t always spouses (cohabitation, engagement)
  • “Spouse” implies current state; we track events
  • GEDCOM 7 uses FAM which is couple-centric, not marriage-centric

A relationship between a parent and child.

```json
{
  "id": "rel_def456",
  "type": "parent_child",
  "parent": "person_joe",
  "child": "person_joey",
  "lineage": "biological",
  "events": [
    {
      "type": "birth",
      "date": "2018-03-15"
    }
  ]
}
```

Lineage types:

  • biological — Birth parent
  • adoptive — Legal adoption
  • foster — Foster care
  • step — Parent’s partner, not biological
  • guardian — Legal guardian

Everything else is derived from the primitives:

| To Find | Query Pattern |
| --- | --- |
| Siblings | People who share ≥1 parent |
| Full siblings | Share both parents |
| Half-siblings | Share exactly one parent |
| Step-siblings | Parents are coupled but no shared parents |
| Grandparents | Parents of parents |
| Grandchildren | Children of children |
| Aunts/Uncles | Siblings of parents |
| Cousins | Children of aunts/uncles |
| In-laws | Spouses of siblings, siblings of spouse |

  1. No redundancy — Adding a parent automatically updates all derived relationships
  2. No conflicts — Can’t have inconsistent sibling/parent data
  3. Less maintenance — Two relationship types vs dozens
  4. Natural queries — “Find all descendants” is a graph traversal
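
```typescript
// Entity, getParents and getChildren are assumed to be defined elsewhere.
function getSiblings(person: Entity): Entity[] {
  // Key by id so a sibling reached through both parents is returned once,
  // even when the lookups return distinct object instances.
  const siblings = new Map<string, Entity>();
  for (const parent of getParents(person)) {
    for (const child of getChildren(parent)) {
      if (child.id !== person.id) {
        siblings.set(child.id, child);
      }
    }
  }
  return Array.from(siblings.values());
}

// Parents of parents. May contain duplicates if lineages overlap.
function getGrandparents(person: Entity): Entity[] {
  return getParents(person).flatMap(p => getParents(p));
}
```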
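
The full versus half distinction from the table above can be derived the same way. This is a sketch that works on parent ID lists; `SiblingKind` and `classifySibling` are illustrative names, and lineage (biological, adoptive, and so on) is deliberately ignored here.

```typescript
type SiblingKind = "full" | "half";

// Classifies two people by how many parents they share:
// two or more shared parents = full, exactly one = half, none = not siblings.
function classifySibling(parentsA: string[], parentsB: string[]): SiblingKind | undefined {
  const a = Array.from(new Set(parentsA)); // drop duplicate parent IDs
  const b = new Set(parentsB);
  const shared = a.filter(p => b.has(p)).length;
  if (shared === 0) return undefined;
  return shared >= 2 ? "full" : "half";
}
```

When only one parent is recorded for either person, a shared parent is reported as `half`, which may simply reflect incomplete data.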

Relationships don’t have boolean state — they have events.

| Event Type | Meaning | Computes To |
| --- | --- | --- |
| engagement | Agreement to marry | engaged |
| marriage | Legal/customary union | married |
| separation | Living apart | separated |
| divorce_filed | Legal filing | divorce pending |
| divorce | Legal dissolution | divorced |
| annulment | Marriage voided | annulled |
| reconciliation | Back together after separation | married |
| death | One partner died | widowed |

Current status is computed from the most recent event.
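
A sketch of that computation, using the event-to-status table above as a lookup. The `RelationshipEvent` shape and the `coupleStatus` function are illustrative, and the sketch assumes every event carries a full ISO date (partial or missing dates would need extra handling).

```typescript
type CoupleEventType =
  | "engagement" | "marriage" | "separation" | "divorce_filed"
  | "divorce" | "annulment" | "reconciliation" | "death";

interface RelationshipEvent {
  type: CoupleEventType;
  date: string; // full ISO 8601 date, so string order equals chronological order
}

// Status implied by each event type, as listed in the table above.
const STATUS_BY_EVENT: Record<CoupleEventType, string> = {
  engagement: "engaged",
  marriage: "married",
  separation: "separated",
  divorce_filed: "divorce pending",
  divorce: "divorced",
  annulment: "annulled",
  reconciliation: "married",
  death: "widowed",
};

// Status from the most recent event, optionally as of a given date.
// Returns undefined when no event applies.
function coupleStatus(events: RelationshipEvent[], asOf?: string): string | undefined {
  let latest: RelationshipEvent | undefined;
  for (const e of events) {
    if (asOf !== undefined && e.date > asOf) continue;
    if (latest === undefined || e.date >= latest.date) latest = e;
  }
  return latest === undefined ? undefined : STATUS_BY_EVENT[latest.type];
}
```

The optional `asOf` argument gives “what was true on date X?” queries from the same event list.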

| Event Type | Meaning |
| --- | --- |
| birth | Child born to parent |
| adoption | Legal adoption finalized |
| foster_start | Foster placement began |
| foster_end | Foster placement ended |
| guardianship_start | Legal guardianship began |
| guardianship_end | Guardianship ended |
| emancipation | Child legally independent |

```json
{
  "type": "marriage",
  "date": "2015-06-20",
  "place": "San Francisco, CA",
  "data": {
    "ceremony_type": "civil",
    "officiant": "Judge Smith"
  }
}
```

Beyond family, we need other relationship types.

Following GEDCOM’s ASSO pattern for non-family connections:

| Relationship | Description | Computed From |
| --- | --- | --- |
| colleague | Work together | Overlapping employment at same org |
| classmate | School together | Overlapping education at same school |
| roommate | Live together | Overlapping residence at same address |
| collaborator | Project together | Shared project membership |
| mentor_of | Mentorship | Explicit relationship |
| friend_of | Friendship | Explicit relationship |

Hypothesis: Many association relationships can be computed from overlapping events:

  • Colleagues = both employed at same org during overlapping time
  • Classmates = both educated at same institution during overlapping time
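
A sketch of the overlap test behind computed colleagues. The `Tenure` shape and both function names are assumptions for illustration; dates are assumed to be full ISO strings, and periods that merely touch on the same day count as overlapping.

```typescript
interface Tenure {
  person: string;
  organization: string;
  started_at: string;      // full ISO 8601 date
  ended_at: string | null; // null means still ongoing
}

// Two periods overlap when each starts on or before the other ends.
function overlaps(a: Tenure, b: Tenure): boolean {
  const aEnd = a.ended_at ?? "9999-12-31";
  const bEnd = b.ended_at ?? "9999-12-31";
  return a.started_at <= bEnd && b.started_at <= aEnd;
}

// People with a tenure at the same organization that overlaps one of `person`'s.
function colleaguesOf(person: string, tenures: Tenure[]): string[] {
  const mine = tenures.filter(t => t.person === person);
  const found = new Set<string>();
  for (const other of tenures) {
    if (other.person === person) continue;
    if (mine.some(m => m.organization === other.organization && overlaps(m, other))) {
      found.add(other.person);
    }
  }
  return Array.from(found);
}
```

The same function covers classmates or roommates if `organization` is read as the school or address.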

What must be explicit:

  • Friendship (no events imply it)
  • Mentorship (asymmetric, intentional)
  • Godparent (ceremonial role)

| Relationship | From | To | Description |
| --- | --- | --- | --- |
| authored_by | work | person | Creator relationship |
| published_by | work | organization | Publisher |
| performed_by | performance | person | Performer |
| directed_by | work | person | Director |
| produced_by | work | person/org | Producer |

These may also be event-based:

  • “Joe authored the article” = authorship event with date
  • “Jane left the band” = membership_ended event

| Relationship | Description | Examples |
| --- | --- | --- |
| part_of | Containment | Track part of album, chapter part of book |
| member_of | Membership | Person member of organization |
| located_in | Geographic | City located in country |

Every relationship can have metadata beyond type and participants.

| Property | Type | Description |
| --- | --- | --- |
| created_at | datetime | When relationship was created in our system |
| confidence | 0.0-1.0 | For AI-extracted relationships |
| source | string | Where we learned about this |
| notes | string | Additional context |

Couple relationships:

  • events[] — Marriage, divorce, etc.

Parent-child relationships:

  • lineage — biological, adoptive, foster, step, guardian
  • events[] — Birth, adoption, etc.

Employment (computed or explicit):

  • title — Job title
  • department — Department
  • started_at / ended_at — Dates

Some relationships are symmetric — if A relates to B, B relates to A the same way:

  • Siblings
  • Spouses
  • Colleagues
  • Friends

For symmetric relationships, we store one relationship and query from either direction.

Some relationships have different meanings in each direction:

  • Parent → Child (inverse: Child → Parent)
  • Mentor → Mentee
  • Employer → Employee

For asymmetric relationships, we store with directionality and define inverse names:

| Forward | Inverse |
| --- | --- |
| parent_of | child_of |
| mentor_of | mentee_of |
| employs | employed_by |
| authored | authored_by |

When storing:

```json
{
  "type": "parent_child",
  "parent": "person_joe",
  "child": "person_joey"
}
```

When querying “who are Joey’s parents?”:

  • Find all parent_child where child = joey
  • Return the parent values

When querying “who are Joe’s children?”:

  • Find all parent_child where parent = joe
  • Return the child values
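
The two query directions above, sketched against an in-memory list of stored records. The `ParentChild` type and the function names are illustrative; a real implementation would query the relationships store rather than filter an array.

```typescript
interface ParentChild {
  type: "parent_child";
  parent: string;
  child: string;
}

// "Who are X's parents?": match on the child side, return the parent side.
function parentsOf(rels: ParentChild[], child: string): string[] {
  return rels.filter(r => r.child === child).map(r => r.parent);
}

// "Who are X's children?": match on the parent side, return the child side.
function childrenOf(rels: ParentChild[], parent: string): string[] {
  return rels.filter(r => r.parent === parent).map(r => r.child);
}

const stored: ParentChild[] = [
  { type: "parent_child", parent: "person_joe", child: "person_joey" },
  { type: "parent_child", parent: "person_jane", child: "person_joey" },
];
const joeysParents = parentsOf(stored, "person_joey");
```

Each record is stored once with its direction; the inverse view is only a different filter.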

Hypothesis: Most Relationships Are Computed

Claim: The majority of interesting relationships can be derived from events:

  • Colleague = overlapping employment
  • Classmate = overlapping education
  • Neighbor = overlapping residence
  • Fellow traveler = overlapping trips

Implication: We should focus on capturing events accurately, not on creating relationship types for every possible connection.

Open question: What’s the right balance? Some computed relationships need explicit confirmation.

Claim: Every relationship should have temporal bounds, even if imprecise.

Reasoning:

  • Enables “what was true on date X?” queries
  • Enables change tracking
  • Matches how Facebook TAO works
  • Matches GEDCOM’s event-centric model

Open question: What about relationships without known dates? Use null or estimate?

Hypothesis: Relationships Are First-Class Entities

Claim: Relationships should have their own IDs, not just be edges.

Reasoning:

  • Can attach metadata (confidence, source, notes)
  • Can have events (marriage → divorce sequence)
  • Can be referenced by other structures
  • Matches GEDCOM X model

Open question: Does this add too much complexity for simple cases?

Facebook calculates relationship strength internally from:

  • Mutual friends
  • Interaction frequency
  • Profile overlap

Question: Should we expose strength as a property?

Arguments for:

  • Useful for prioritizing in UI
  • Enables “closest friends” queries

Arguments against:

  • Schema.org doesn’t have it
  • Subjective, changes over time
  • Should be computed, not stored?

Current decision: Don’t include in schema. Can be computed by applications.

  • OWA (Open World): Absence of data = unknown (RDF, OWL)
  • CWA (Closed World): Absence of data = false (Databases, SHACL)

Most applications expect CWA:

  • “Show me Joe’s children” expects complete list
  • “Does Joe have a spouse?” expects yes/no

But OWA is more honest:

  • We might not know all of Joe’s children
  • Joe might have a spouse we don’t know about

Current decision: Lean toward CWA for application behavior, but support confidence levels and provenance to indicate uncertainty.


```sql
-- Base relationships table
CREATE TABLE relationships (
  id TEXT PRIMARY KEY,
  type TEXT NOT NULL,        -- 'couple', 'parent_child', 'member_of', etc.
  entity1_id TEXT NOT NULL,
  entity2_id TEXT NOT NULL,
  data JSON,                 -- Type-specific properties
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (entity1_id) REFERENCES entities(id),
  FOREIGN KEY (entity2_id) REFERENCES entities(id)
);

-- Events on relationships
CREATE TABLE relationship_events (
  id TEXT PRIMARY KEY,
  relationship_id TEXT NOT NULL,
  type TEXT NOT NULL,        -- 'marriage', 'divorce', 'birth', etc.
  date TEXT,                 -- ISO date or partial
  place TEXT,
  data JSON,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
  FOREIGN KEY (relationship_id) REFERENCES relationships(id)
);
```

```json
{
  "id": "rel_abc123",
  "type": "couple",
  "entity1_id": "person_joe",
  "entity2_id": "person_jane",
  "data": {},
  "events": [
    {
      "id": "evt_111",
      "type": "engagement",
      "date": "2014-12-25"
    },
    {
      "id": "evt_222",
      "type": "marriage",
      "date": "2015-06-20",
      "place": "San Francisco, CA",
      "data": {
        "ceremony_type": "civil"
      }
    }
  ],
  "created_at": "2026-01-24T12:00:00Z"
}
```

| Source | Key Lesson |
| --- | --- |
| FamilySearch | Two primitives (couple + parent-child), compute everything else |
| GEDCOM 7 | Events not state, marriage is an event with date/place |
| GEDCOM X | Relationships as first-class entities with confidence |
| Facebook TAO | Timestamps on everything, typed associations |

| System | Why Rejected |
| --- | --- |
| Schema.org | Too generic, weak abstractions, no temporal model |
| OGP | Unidirectional, fixed vocabulary, limited metadata |
| FOAF | Simple but no temporal support, limited adoption |

  • Researched Schema.org, Facebook, FamilySearch, OGP, knowledge graphs
  • Found consensus on: timestamps everywhere, bidirectionality matters
  • Found divergence on: relationships as edges vs entities, explicit vs computed
  • Decision: Follow FamilySearch’s event-sourcing model
  • Decision: First-class GEDCOM 7 compatibility
  • Created this document
  • Defined two primitive relationships: couple, parent_child
  • Defined event types for each
  • Documented computed relationship patterns
  • Listed hypotheses and open questions

Missing Relationships (High Confidence — To Implement)

| Relationship | Research Source | Notes |
| --- | --- | --- |
| follows | Facebook, Schema.org | Asymmetric social (A follows B ≠ B follows A) |
| contributed_to | Schema.org | Secondary creator role |
| about | Schema.org | What the work is about |
| mentions | Schema.org | References another entity (weaker than about) |
| from / to | Gmail, Messages | Sender/recipient for messages |
| reply_to | Gmail, Reddit | Message/comment threading |
| comment_on | YouTube, Facebook | What a comment responds to |
| extracted_from | Readwise | Source of a highlight |
| spoken_by | Wikiquote | Speaker of a quote |
| appears_in | Wikiquote | Where a quote was found |

| Relationship | From → To | Notes |
| --- | --- | --- |
| attended_by | event → person | Event attendance |
| organized_by | event → person/org | Event organizer |

| Relationship | Notes |
| --- | --- |
| same_as | Identity link to external entity (Wikidata Q-number, external URL) |

  1. GEDCOM 7 import/export — Build parser and serializer
  2. Computed relationships — Implement sibling, grandparent queries
  3. Test with real data — Import family tree, verify round-trip
  4. Association relationships — Decide which to compute vs store explicitly

Divergent thinking phase. Everything from Google Takeout and Facebook Graph research. Converge later.

Based on research across Google Takeout (23 products), Facebook Graph API, Instagram, Threads, WhatsApp, and Twitter/X.

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| friend_of | Bidirectional | person ↔ person | created_at | Facebook |
| follows | Unidirectional | person → person/org/channel | created_at | FB, IG, Twitter, YouTube |
| blocked | Unidirectional | person → person | created_at | Facebook |
| knows | Bidirectional | person ↔ person | confidence, context | Schema.org |

Design note: Facebook’s friend is bidirectional (creates edges both ways). follow is unidirectional. Consider supporting both patterns.


Family Relationships (Existing + Extensions)

Already defined: couple, parent_child (primitives)

Additional family edges from Facebook (computed from primitives):

| Computed | Query Pattern |
| --- | --- |
| sibling_of | Share ≥1 parent |
| grandparent_of / grandchild_of | Parent of parent |
| aunt_uncle_of / niece_nephew_of | Sibling of parent |
| cousin_of | Child of aunt/uncle |
| in_law | Spouse of sibling, sibling of spouse |

Keep as computed, not stored.


| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| works_at | Unidirectional | person → organization | title, department, started_at, ended_at | Google Contacts, LinkedIn |
| colleague_of | Bidirectional | person ↔ person | organization, overlap_period | Computed |
| manages | Unidirectional | person → person | started_at | Workplace |
| reports_to | Unidirectional | person → person | started_at | Workplace (inverse of manages) |
| founded | Unidirectional | person → organization | date | |
| invested_in | Unidirectional | person/org → organization | amount, date | |

Hypothesis: colleague_of should be computed from overlapping works_at relationships, not stored explicitly.


| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| authored_by | Unidirectional | work → person | role (author, co-author, editor) | OGP, Schema.org |
| directed_by | Unidirectional | video.movie → person | | OGP |
| performed_by | Unidirectional | music.song → person | role (artist, featured) | OGP |
| produced_by | Unidirectional | work → person/org | | OGP |
| published_by | Unidirectional | work → organization | date | |
| contributed_to | Unidirectional | person → work | role, contribution | Schema.org |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| part_of | Unidirectional | entity → container | position | Universal |
| contains | Unidirectional | container → entity | | Inverse of part_of |
| episode_of | Unidirectional | video.episode → video.tv_show | season, episode_number | OGP |
| track_on | Unidirectional | music.song → music.album | disc, track_number | OGP |
| chapter_of | Unidirectional | section → book | | |
| tagged_with | Unidirectional | entity → label/tag | | Keep, Gmail, Blogger |

Note: part_of is the universal containment relationship. Specific forms like episode_of, track_on are just typed part_of with metadata.


| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| from | Unidirectional | message → person | | Gmail, FB, WhatsApp |
| to | Unidirectional | message → person | | Gmail, FB, WhatsApp |
| cc | Unidirectional | message → person | | Gmail |
| bcc | Unidirectional | message → person | | Gmail |
| reply_to | Unidirectional | message → message | | Gmail, Reddit, FB |
| attachment | Unidirectional | message → file | | Gmail |
| participant_in | Unidirectional | person → conversation | role (admin, member), joined_at | Chat, WhatsApp |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| liked | Unidirectional | person → content | created_at | Facebook, IG |
| reacted_to | Unidirectional | person → content | reaction_type, created_at | Facebook |
| commented_on | Unidirectional | comment → content | | FB, IG, YouTube |
| shared | Unidirectional | post → post (original) | share_time | Facebook |
| saved | Unidirectional | person → content | collection, created_at | FB, IG, Google Saved |
| tagged_in | Unidirectional | content → person | x, y (for photos), created_at | FB, IG |
| mentioned | Unidirectional | content → person | | FB, IG, Twitter |
| subscribed_to | Unidirectional | person → channel/podcast | created_at | YouTube, Podcasts |

Design decision: Reactions could be separate relationships per type (loved, wowed, etc.) or one reacted_to with type metadata. Recommend: one relationship with metadata.


| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| located_at | Unidirectional | entity → place | | FB, Google |
| located_in | Unidirectional | place → place | | Geographic hierarchy |
| visited | Unidirectional | person → place | arrived_at, departed_at, confidence | Google Timeline |
| taken_at | Unidirectional | photo/video → place | | Google Photos, FB |
| lives_in | Unidirectional | person → place | since | FB, Contacts |
| born_in | Unidirectional | person → place | | FB |
| hometown | Unidirectional | person → place | | FB |
| checked_in | Unidirectional | person → place | timestamp, post | FB (deprecated) |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| attending | Unidirectional | person → event | rsvp_status, rsvp_time | FB, Google Calendar |
| hosted_by | Unidirectional | event → person/org | | FB, Calendar |
| invited_to | Unidirectional | person → event | invited_by, invited_at | FB, Calendar |
| speaker_at | Unidirectional | person → event | | |
| venue_of | Unidirectional | event → place | | FB, Calendar |

RSVP statuses (from Facebook): attending, maybe, declined, invited (no response)


| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| purchased_by | Unidirectional | order → person | | Gmail schema.org |
| sold_by | Unidirectional | order → organization | | Gmail schema.org |
| contains | Unidirectional | order → product | quantity, price | Gmail schema.org |
| delivered_by | Unidirectional | delivery → organization | | Gmail schema.org |
| part_of_order | Unidirectional | delivery → order | | Gmail schema.org |
| reserved_for | Unidirectional | reservation → person | | Gmail schema.org |
| reserved_at | Unidirectional | reservation → place | | Gmail schema.org |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| held_by | Unidirectional | pass → person | | Google Wallet |
| issued_by | Unidirectional | pass → organization | | Google Wallet |
| valid_for | Unidirectional | pass → event/transit | | Google Wallet |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| extracted_from | Unidirectional | highlight → work | position, page | Readwise, Kindle |
| spoken_by | Unidirectional | quote → person | context | |
| appears_in | Unidirectional | quote → work | page, chapter | |
| review_of | Unidirectional | review → entity | | Schema.org, Goodreads |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| same_as | Bidirectional | entity ↔ external_id | | Wikidata, Schema.org |
| linked_account | Bidirectional | account ↔ account | platform | Meta Accounts Center |
| alias_of | Unidirectional | name → person | | |

| Relationship | Direction | Between | Properties | Source |
| --- | --- | --- | --- | --- |
| appears_in | Unidirectional | person → photo/video | x, y, confidence | Google Photos, FB |
| thumbnail_of | Unidirectional | photo → video | | |
| cover_of | Unidirectional | photo → album/playlist/book | | FB, Spotify |
| transcript_of | Unidirectional | text → audio/video | | YouTube, Podcasts |

From Facebook TAO research, all relationships benefit from metadata:

| Property | Type | Description |
| --- | --- | --- |
| created_at | datetime | When relationship was created |
| confidence | 0.0-1.0 | For AI-extracted relationships |
| source | string | How we learned about this |

Engagement relationships:

  • reaction_type — LIKE, LOVE, HAHA, WOW, SAD, ANGRY, CARE

Tagging relationships:

  • x, y — Position in photo (0-100 coordinates)

Membership relationships:

  • role — admin, moderator, member
  • joined_at — When joined

Event attendance:

  • rsvp_status — attending, maybe, declined, invited
  • rsvp_time — When they responded

Employment:

  • title — Job title
  • department — Department
  • started_at, ended_at — Tenure

Content positioning:

  • position — Order in container (playlist position, track number)
  • season, episode — For TV episodes

From Facebook’s TAO paper:

  1. Single association per type between two objects

    • Can’t like something twice
    • Different types can coexist (like AND comment on same post)
  2. Timestamps required on all associations

    • Enables “most recent” queries
    • Enables temporal filtering
  3. Bidirectional relationships = two associations

    • Friend A↔B stored as (A→B) AND (B→A)
    • Query from either direction efficiently
  4. Association lists ordered by time

    • Most recent first
    • Enables efficient “latest N” queries
  5. Optional data payload

    • Associations carry key-value metadata
    • Type-specific properties (role, coordinates, etc.)
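
The five patterns above can be illustrated with a small in-memory store. This is a sketch of the ideas only, not TAO's actual API: `Association`, `AssociationStore`, and their methods are hypothetical names, and the composite key assumes IDs and types never contain `|`.

```typescript
interface Association {
  id1: string;
  type: string;
  id2: string;
  time: number;                 // epoch milliseconds; required on every association
  data: Record<string, string>; // optional key-value payload
}

class AssociationStore {
  private byKey = new Map<string, Association>();

  // At most one association per (id1, type, id2); adding again overwrites it.
  add(a: Association): void {
    this.byKey.set(`${a.id1}|${a.type}|${a.id2}`, a);
  }

  // A symmetric relationship is stored as two associations, one per direction.
  addSymmetric(a: Association): void {
    this.add(a);
    this.add({ ...a, id1: a.id2, id2: a.id1 });
  }

  // Association list for (id1, type), most recent first, capped at `limit`.
  list(id1: string, type: string, limit = 10): Association[] {
    return Array.from(this.byKey.values())
      .filter(a => a.id1 === id1 && a.type === type)
      .sort((x, y) => y.time - x.time)
      .slice(0, limit);
  }
}
```

A production store would index by `(id1, type)` instead of scanning every association, but the uniqueness and ordering rules are the same.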

Should be computed (not stored):

  • sibling_of — from shared parents
  • grandparent_of — parent of parent
  • colleague_of — overlapping employment
  • classmate_of — overlapping education
  • neighbor_of — overlapping residence

Must be explicit (user declares):

  • friend_of — no events imply it
  • mentor_of — intentional, asymmetric
  • godparent_of — ceremonial role
  • romantic_partner — requires confirmation

Could be either:

  • knows — could be computed from interactions OR explicit
  • worked_with — could be computed from overlapping employment OR explicit for project collaboration

For asymmetric relationships, define both directions:

| Forward | Inverse |
| --- | --- |
| parent_of | child_of |
| authored | authored_by |
| employs | employed_by |
| manages | managed_by / reports_to |
| contains | part_of / contained_in |
| mentor_of | mentee_of |
| hosts | hosted_by |
| follows | followed_by |
| invited | invited_to |
| tagged | tagged_in |

From Facebook research, relationships have lifecycle states:

  1. Pending — Request sent, awaiting confirmation (friend requests)
  2. Active — Confirmed, in effect
  3. Ended — Terminated (unfriend, unfollow, divorce)
  4. Blocked — Explicitly blocked

For event-sourced relationships, track via events:

  • friendship_requested → friendship_confirmed → unfriended
  • engaged → married → divorced

From Facebook:

  • Entity visibility and relationship visibility are independent
  • Both parties can have different privacy settings for the same relationship
  • Some status changes are silent (divorce doesn’t post to feed)

Consider:

  • visibility property on relationships
  • Different views for self vs others vs public

| Aspect | Facebook | Google | Twitter | Our Model |
| --- | --- | --- | --- | --- |
| Relationship storage | TAO (typed edges) | Separate APIs | Snowflake IDs | Event-sourced |
| Bidirectional handling | Two edges | Varies | Two edges | Two edges |
| Timestamps | Required | Usually | Required | Required |
| Metadata on edges | Optional data | Varies | Limited | Full support |
| Privacy per-edge | Yes | Limited | No | TBD |

| Relationship | Priority | Notes |
| --- | --- | --- |
| couple | | Family primitive |
| parent_child | | Family primitive |
| authored_by | High | Content creation |
| part_of | High | Universal containment |
| from / to | High | Messaging |
| reply_to | High | Threading |
| follows | High | Social |
| member_of | High | Groups, organizations |
| attended_by | High | Events |
| located_at | High | Location |
| same_as | High | Identity linking |

| Relationship | Priority | Notes |
| --- | --- | --- |
| works_at | Medium | Professional |
| comment_on | Medium | Engagement |
| tagged_in | Medium | Media |
| liked / reacted_to | Medium | Engagement |
| extracted_from | Medium | Highlights |
| spoken_by / appears_in | Medium | Quotes |
| subscribed_to | Medium | Follows for channels |
| review_of | Medium | Reviews |

| Relationship | Priority | Notes |
| --- | --- | --- |
| purchased_by / sold_by | Lower | Commerce |
| held_by / issued_by | Lower | Passes |
| visited | Lower | Location history |
| transcript_of | Lower | Media |
| thumbnail_of / cover_of | Lower | Media |

  1. Reaction as relationship or entity?

    • Recommendation: Relationship with reaction_type metadata
  2. Generic part_of vs specific containment types?

    • Recommendation: Generic part_of with metadata (position, role)
  3. How to handle bidirectional relationships?

    • Recommendation: Store two edges (TAO pattern)
  4. Privacy per-relationship?

    • Open question: Do we need this complexity?
  5. Computed vs explicit threshold?

    • Recommendation: Compute social inference, require explicit for meaningful relationships
  6. Relationship lifecycle events?

    • Recommendation: Full event sourcing for family/romantic, simpler for content relationships

Design Principles from Research (2026-01-24)

Synthesis of key learnings across all research sources. These principles should guide the convergence phase.

Principle 1: Two Family Relationship Primitives

Source: FamilySearch, GEDCOM 7

  • couple + parent_child express all family relationships
  • Siblings, grandparents, cousins are computed, not stored
  • Avoids redundancy and consistency issues
  • One person can have multiple parent relationships (biological + adoptive)

Principle 2: Event-Sourcing Over Stored State

Source: FamilySearch, GEDCOM X, Facebook TAO

  • Store events, not current state
  • “Joe married Jane on 2020-06-15” not “married: true”
  • Current status is computed from event history
  • Divorce, remarriage are events on the same couple relationship

Source: Facebook TAO, FamilySearch

  • Every relationship has temporal data (created_at, events with dates)
  • Required for “what was true on date X?” queries
  • Association lists ordered by time (TAO pattern)
  • Events have date/place/confidence qualifiers

Principle 4: Relationships Are First-Class Entities

Source: GEDCOM X, property graphs

  • Relationships have their own IDs
  • Can attach metadata: confidence, source, notes
  • Can have events: marriage → separation → reconciliation → divorce
  • Not just foreign keys in a join table

Principle 5: Compute Don’t Store (Derived Relationships)

Source: FamilySearch, Facebook

| Computed | Query Pattern |
| --- | --- |
| Sibling | Share ≥1 parent |
| Grandparent | Parent of parent |
| Colleague | Overlapping employment |
| Classmate | Overlapping education |
| Neighbor | Overlapping residence |

Only store explicit relationships (friendship, mentorship, authorship).

Source: Facebook TAO paper

  1. Single association per type — Can’t “like” something twice
  2. Timestamps required — Enables time-based ordering
  3. Bidirectional = two edges — Store both directions for symmetric
  4. Optional data payload — Relationships carry key-value metadata

Source: FamilySearch research

  • First-class import/export with GEDCOM 7.0 files
  • Map: FAM → couple, FAMC → parent_child
  • Preserve GEDCOM IDs in data.gedcom_id
  • Handle PEDI (lineage types): BIRTH, ADOPTED, FOSTER, SEALING

Source: Schema.org, property graph best practices

| Forward | Inverse |
| --- | --- |
| parent_of | child_of |
| authored_by | author_of |
| employs | employed_by |
| contains | part_of |
| follows | followed_by |

Store in one direction, compute inverse on query.
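
One way to compute the inverse at query time is a name lookup applied while reading edges. The sketch below uses the forward and inverse names from the table above; the `Edge` shape and `edgesFor` are illustrative assumptions rather than a defined API.

```typescript
// Forward name -> inverse name, from the table above.
const INVERSE: Record<string, string> = {
  parent_of: "child_of",
  authored_by: "author_of",
  employs: "employed_by",
  contains: "part_of",
  follows: "followed_by",
};

interface Edge {
  type: string;
  from: string;
  to: string;
}

// Read-time view of every stored edge touching `entity`, named from that
// entity's side. Edges are stored once; inverses are never written back.
function edgesFor(entity: string, stored: Edge[]): Edge[] {
  const out: Edge[] = [];
  for (const e of stored) {
    if (e.from === entity) {
      out.push(e);
      continue;
    }
    const inverse = INVERSE[e.type];
    if (e.to === entity && inverse !== undefined) {
      out.push({ type: inverse, from: e.to, to: e.from });
    }
  }
  return out;
}
```

Edge types without an entry in `INVERSE` are only visible from their stored direction, which makes missing inverse definitions easy to spot.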

Principle 9: Confidence for AI-Extracted Relationships

Source: GEDCOM X, knowledge graph research

  • AI-extracted relationships carry confidence: 0.0-1.0
  • Source tracking: where did we learn this?
  • Status enum: proven, challenged, disproven (from GEDCOM)
  • Humans can confirm/reject

Principle 10: Privacy Per-Relationship (Future)

Source: Facebook research

  • Entity visibility and relationship visibility are independent
  • Both parties can have different privacy settings
  • Some status changes are silent (divorce doesn’t broadcast)
  • Open question: Do we need this complexity now?

Relationship Type Summary (Pre-Convergence)

| Category | Relationships | Notes |
| --- | --- | --- |
| Family Primitives | couple, parent_child | Everything else computed |
| Content Creation | authored_by, directed_by, performed_by | Creator relationships |
| Containment | part_of, member_of | Universal hierarchy |
| Messaging | from, to, reply_to | Threading |
| Social | follows | Asymmetric follow |
| Event | attended_by, organized_by | Event participation |
| Location | located_at | Place associations |
| Identity | same_as | Cross-source linking |

| Category | Relationships | Notes |
| --- | --- | --- |
| Professional | works_at, manages | Employment |
| Engagement | comment_on, liked, reacted_to | User interactions |
| Content | extracted_from, spoken_by, appears_in | Quotes/highlights |
| Subscription | subscribed_to | Follows for channels |
| Review | review_of | Review relationships |

| Category | Relationships | Notes |
| --- | --- | --- |
| Commerce | purchased_by, sold_by, delivered_by | Orders/delivery |
| Passes | held_by, issued_by, valid_for | Wallet passes |
| Location History | visited | Timeline visits |
| Media | transcript_of, thumbnail_of, cover_of | Media relationships |

| Relationship | Computed From |
| --- | --- |
| sibling_of | Shared parents |
| grandparent_of | Parent of parent |
| aunt_uncle_of | Sibling of parent |
| cousin_of | Child of aunt/uncle |
| colleague_of | Overlapping works_at |
| classmate_of | Overlapping education |

Universal Metadata Patterns (Applied to Relationships)

See schema-entities.md for full pattern definitions. This section describes how they apply to relationships specifically.

| Pattern | Application to Relationships |
| --- | --- |
| Evidence Quality | How do we know these two are connected? Quality of the source. |
| Temporal Precision | When did this relationship start/end? (imprecise dates supported) |
| Provenance | Who asserted this connection? From what source? |
| Aliases | N/A (relationships don’t have names) |

```yaml
relationship:
  type: "couple"
  from: "person_abc"
  to: "person_xyz"

  # Temporal (from Pattern 2)
  started_at:
    value: "1920-06-15"
    modifier: "about"
    original: "abt Jun 1920"

  # Evidence (from Pattern 1)
  evidence:
    quality: 2 # Secondary source
    confidence: 0.9
    method: "imported"

  # Provenance (from Pattern 3)
  provenance:
    source: "Marriage certificate, County Records"
    contributor: "user_joe"
    status: "verified"

  # Relationship-specific
  events:
    - type: "marriage"
      date: { value: "1920-06-15", modifier: "about" }
      place: "New York, NY"
```

Events on Relationships Also Get Attribution

Events that occur on relationships (marriage, divorce, etc.) can also carry evidence and provenance:

```yaml
event:
  type: "marriage"
  date:
    value: "1920-06-15"
    modifier: "exact"
  evidence:
    quality: 3 # Primary source (marriage certificate)
    confidence: 1.0
  provenance:
    source: "entity_id_of_certificate"
    contributor: "user_joe"
```
This allows different facts about the same relationship to have different source quality.


  1. Finalize Tier 1 relationships — These are the core schema
  2. Define event types per relationship — What events can attach to couple?
  3. Spec the data model — Relationships table, events table
  4. Apply universal patterns — Evidence, temporal, provenance on relationships
  5. Build GEDCOM 7 import/export — First integration test
  6. Define inverse computation — How to query in both directions