Architecture Decision Record (ADR) 21: Passport Lexicons and Ref-Only AT Protocol Records
Status: Superseded by ADR 27: Zero Trust PDS-Based Provenance
Date: 2026-05-16
Glossary (acronyms in full, with abbreviation in brackets)
- Architecture Decision Record (ADR)
- Authenticated Transfer Protocol (AT Protocol)
- Access Control List (ACL)
- Content Identifier (CID)
- Decentralized Identifier (DID)
- Digital Asset Passport (DAP)
- JavaScript Object Notation (JSON)
- Namespaced Identifier (NSID)
- Open Authorization (OAuth)
- Personal Data Server (PDS)
- Pre-Shared Key (PSK)
Context
Passport receipts bind assets to an owner did:plc (a Decentralized Identifier (DID) method used for Personal Data Server (PDS) identity), an Access Control List, and provenance. Today, gateway-adjacent code may hold receipt-shaped data only in volatile process memory (for example the in-memory drive tree), while file bytes can already be stored in the retrieval blockstore under a Content Identifier (CID). A restart then loses the catalog and Access Control data even though the blocks remain, which is a poor experience for operators and users.
We also want:
- Application-level authorization (owner and Access Control), analogous to how Bluesky reasons about records, without confusing network membership (Pre-Shared Key (PSK) / private mesh) with object-level access control.
- Interoperability: third parties and future clients must validate the same shapes we emit.
- Evolution: schema changes must not silently reinterpret existing Namespaced Identifiers (NSIDs).
The AT Protocol ecosystem standardizes machine-readable schemas as lexicons (each type named by a Namespaced Identifier (NSID)). The human-oriented JavaScript Object Notation (JSON) Schema in docs/passport-receipt-schema.md remains a narrative reference until lexicons become the canonical contract.
Decision
- Lexicon namespace: All Substratum-defined lexicons for this concern use the Namespaced Identifier (NSID) prefix cloud.substratum.*. Publish them for validators and external implementers (static lexicon tree + discovery as we mature).
- Split vocabulary for the Substratum filesystem: Define multiple small lexicons instead of one monolith, for example a receipt (binding + Access Control + provenance), asset metadata (Digital Asset Passport (DAP)-oriented optional fields), and filesystem-facing records that only link those concerns (exact NSIDs and defs to be enumerated when lexicon files are added). Composition (receipt references metadata Content Identifier (CID) vs inline) is an implementation detail as long as types stay separable.
- AT Protocol record surface is ref-only: Repository (“repo”) records (and similar AT Protocol-facing payloads) carry references only: Content Identifiers (CIDs), lexicon $type identifiers, and minimal indexing fields. They do not host bulk bytes via Personal Data Server (PDS) blob upload for passports or file payloads. Content-addressed bytes live in Substratum storage (blockstore / private mesh); the AT Protocol layer names and orders those CIDs for a Drive.
- Required provenance: The receipt lexicon marks provenanceSignature (and its contract) as required for interoperable instances. Gateways reject receipts that fail cryptographic verification against ownerDid. The exact signing payload is versioned beside the lexicon (see evolution below); the lexicon describes fields, not the algorithm bytes.
- Breaking changes → new lexicon NSID: Incompatible changes (required set, semantics, signing input) ship under a new Namespaced Identifier (NSID) (for example a new suffix or versioned name). Existing blobs and records keep their historical $type forever; readers maintain a known-types table during migration windows. Additive optional fields on the same semantics may remain on the same lexicon; anything that breaks old clients or verification warrants a new lexicon id.
- Operational observability: Metrics and logs tag verification and fetch paths by $type so adoption of new lexicon versions is visible.
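As a rough illustration of how the namespace, ref-only, known-types, and observability rules above might combine in a reader, the Python sketch below classifies a record and counts outcomes per $type. It is a sketch under stated assumptions, not the gateway implementation: the NSIDs, the assetCid and signingVersion fields, the size cap, and the helper names are all hypothetical; only $type, ownerDid, and provenanceSignature come from this decision.

```python
from collections import Counter

NAMESPACE = "cloud.substratum."

# Hypothetical NSIDs and required-field sets; the real ones are enumerated
# when the lexicon files are added.
KNOWN_TYPES = {
    "cloud.substratum.passport.receipt": {
        "ownerDid", "provenanceSignature", "assetCid",
    },
    # A breaking change ships under a new NSID; the old one stays readable.
    "cloud.substratum.passport.receiptV2": {
        "ownerDid", "provenanceSignature", "assetCid", "signingVersion",
    },
}

# Assumed cap on a single string field: a cheap guard against inline bulk bytes.
MAX_FIELD_CHARS = 512

# Outcomes tagged by $type so adoption of new lexicon versions is visible.
outcomes = Counter()


def _classify(record: dict, type_id: str) -> str:
    if not type_id.startswith(NAMESPACE):
        return "foreign-namespace"
    required = KNOWN_TYPES.get(type_id)
    if required is None:
        return "unknown-type"
    if required - set(record):
        return "missing-required"
    for value in record.values():
        # Ref-only: no raw bytes and no oversized inline strings.
        if isinstance(value, (bytes, bytearray)):
            return "not-ref-only"
        if isinstance(value, str) and len(value) > MAX_FIELD_CHARS:
            return "not-ref-only"
    return "ok"


def check_record(record: dict) -> str:
    """Return 'ok' or a short rejection reason, and count it by $type."""
    type_id = record.get("$type", "")
    result = _classify(record, type_id)
    outcomes[(type_id or "<missing>", result)] += 1
    return result


receipt = {
    "$type": "cloud.substratum.passport.receipt",
    "ownerDid": "did:plc:example",
    "provenanceSignature": "sig-placeholder",
    "assetCid": "cid-placeholder",
}
print(check_record(receipt))
```

Keeping the table keyed by full NSID means a reader never reinterprets an old $type: a record either matches a type it knows or is rejected with a reason that shows up in the per-type counts.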
Consequences
Positive
- Clear split: identity and pointers on AT Protocol repos, bytes and heavy metadata on Substratum—matches data sovereignty and keeps Personal Data Server (PDS) records small.
- Interoperable validation and code generation path as lexicons land in-repo.
- Predictable evolution without “same Namespaced Identifier (NSID), new meaning” drift.
Negative
- Personal Data Server (PDS) does not attest to bytes it does not store; gateways and clients must resolve CIDs, fetch blocks, and verify signatures—strictly more logic than “blob on Personal Data Server (PDS).”
- Multi-node durability for referenced Content Identifiers (CIDs) depends on retrieval and pinning behavior (see Retrieval layer gaps); refs are useless if blocks are missing from the mesh.
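The extra logic described in these two points can be sketched as a small verify-then-fetch function. This is a minimal Python sketch under assumptions: the Receipt class, its asset_cid field, the exception names, and the injected verify callable are hypothetical, and the blockstore is modelled as a plain mapping from CID string to bytes. The real signing payload is versioned beside the lexicon and is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Mapping


@dataclass(frozen=True)
class Receipt:
    owner_did: str             # ownerDid in the receipt lexicon
    provenance_signature: str  # provenanceSignature in the receipt lexicon
    asset_cid: str             # hypothetical field: CID of the payload block


class BadSignature(Exception):
    """The receipt failed verification against its owner DID."""


class MissingBlock(Exception):
    """The reference is valid, but the block is not available."""


def load_asset(
    receipt: Receipt,
    blockstore: Mapping[str, bytes],
    verify: Callable[[Receipt], bool],
) -> bytes:
    """Reject unverifiable receipts, then fetch the referenced block."""
    if not verify(receipt):
        raise BadSignature(receipt.owner_did)
    block = blockstore.get(receipt.asset_cid)
    if block is None:
        # A ref without its block is useless; surface that distinctly.
        raise MissingBlock(receipt.asset_cid)
    return block


store = {"cid-placeholder": b"file bytes"}
example = Receipt("did:plc:example", "sig-placeholder", "cid-placeholder")


def accept_all(receipt: Receipt) -> bool:
    # Toy stand-in for illustration only; this is NOT real cryptography.
    return True


print(load_asset(example, store, accept_all))
```

Raising distinct errors for a bad signature and a missing block keeps the two failure modes separable in logs: the first is a trust problem, the second a durability problem.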
Neutral
- “Ref-only” applies to the AT Protocol boundary; Substratum still uses content-addressed blocks for payloads. Terminology in runbooks should say AT Protocol ref-only vs Content Identifier (CID)-backed storage to avoid ambiguity.
Related
- Architecture Decision Record (ADR) 12: AT Protocol Integration — identity and Open Authorization (OAuth).
- Architecture Decision Record (ADR) 16: Drive-Centric Domain Vocabulary — Drive as the user-scoped aggregate named by records.
- Architecture Decision Record (ADR) 17: Multi-Tenant Pre-Shared Key (PSK) Injection — mesh membership is not per-object Access Control List.
- docs/passport-receipt-schema.md: narrative schema; lexicons supersede it for machine contracts when published.