Architecture Decision Record (ADR) 29: Private PDS Namespace and Local CAR Emergency Export
Status: Proposed
Date: 2026-05-24
Last Updated: 2026-05-24 (dual-write in ADR 30; separate catalog-sync worker; break-glass mesh out of scope)
Glossary (acronyms in full, with abbreviation in brackets)
- Architecture Decision Record (ADR)
- Access Control List (ACL)
- Authenticated Transfer Protocol (AT Protocol)
- Content Addressable aRchive (CAR)
- Content Identifier (CID)
- Decentralized Identifier (DID)
- Merkle Search Tree (MST)
- Namespaced Identifier (NSID)
- Open Authorization (OAuth)
- Personal Data Server (PDS)
- Pre-Shared Key (PSK)
Context
Public-repo metadata exposure (ADR 27 follow-on)
ADR 27: Zero Trust PDS-Based Provenance places cloud.substratum.passport.receipt records in the owner's public AT Protocol repository. That gives strong MST-backed proof without gateway attestation, but any party that can call com.atproto.repo.getRecord on the owner repo learns:
- Asset CIDs (what content exists),
- access_control grantee DIDs (who it was shared with),
- Timestamps and optional metadata refs.
File bytes remain on the private libp2p mesh (ADR 03, ADR 17), but the passport catalog on the PDS is world-readable metadata. That conflicts with Substratum's data-sovereignty story for families who chose a private storage product.
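To make the exposure concrete, the sketch below builds the kind of unauthenticated request URL that any party could issue against a public repo. The host, DID, and record key are placeholder values, and `get_record_url` is a hypothetical helper, not part of Substratum.

```python
from urllib.parse import urlencode


def get_record_url(pds_host: str, repo_did: str, collection: str, rkey: str) -> str:
    """Build the XRPC URL for com.atproto.repo.getRecord; no credentials are attached."""
    query = urlencode({"repo": repo_did, "collection": collection, "rkey": rkey})
    return f"https://{pds_host}/xrpc/com.atproto.repo.getRecord?{query}"


# Placeholder values for illustration only.
url = get_record_url(
    "pds.example.com",
    "did:plc:owner123",
    "cloud.substratum.passport.receipt",
    "examplerkey",
)
print(url)
```

With a public collection, nothing more than this URL is needed to read the receipt body, including its asset CID and grantee DIDs.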
AT Protocol PDS implementations support private data: repo records that use the same lexicon system but are not broadcast on the relay firehose and require authenticated XRPC to read. This is the practical “private namespace” for Substratum vNext—not a separate NSID prefix, but private collections for existing cloud.substratum.* types.
PDS permanence and user-owned recovery
Users may lose access to a PDS operator (bankruptcy, account termination, regional outage, or voluntary migration). Substratum already treats the PDS as the mesh authority for receipts (ADR 27, ADR 28). If the PDS is permanently unavailable and the user has no export, they lose:
- Cryptographic proof needed to authorize mesh reads,
- Drive names, folder paths, and file names (the filesystem tree),
- Share and removal intent records on grantee repos.
Today much of that tree lives only in the gateway catalog (drive, drive_entry in Postgres). File and folder labels are carried by drive_entry.path (full relative path, including the filename segment) and optional assetMetadata.title on receipts—not by a separate “filename” field. Without PDS-backed private records and local CAR, a dead PDS plus a lost database means the user may retain mesh blocks but not know what they were called or how they were organized.
ADR 04: Sidecar Design and home-base are the natural homes for user-controlled replicas because they already run beside the blockstore and participate in receipt sync (ADR 10, ADR 28).
The AT Protocol ecosystem standardizes repository export as CAR (Content Addressable aRchive) via com.atproto.sync.getRepo and record-oriented import paths. Substratum will maintain local .car snapshots of private Substratum collections so a family can migrate or recover without trusting a dead PDS.
Substratum lexicon inventory (canonical)
All repo collections below are in scope for PDS-private registration and CAR export. Embedded defs (no standalone collection) are listed for completeness.
| NSID | Kind | Published JSON (libs/lexicons/defs/) | Repo today | CAR / private target | Async pipeline (ADR 30 §7) |
|---|---|---|---|---|---|
| cloud.substratum.passport.receipt | Record | Yes | Owner PDS (public) | Private owner collection | Receipt sync |
| cloud.substratum.passport.accessRemovalRequest | Record | Yes | Grantee PDS (public) | Private grantee collection | Inline ingress (grantee OAuth) |
| cloud.substratum.passport.accessRemovalAck | Record | Yes | Owner repo via receipt worker | Private owner collection | Receipt sync (with removal ack job) |
| cloud.substratum.passport.assetMetadata | Object def only | Yes | Embedded in receipt.metadata (optional); future metadata_cid blob | Included via parent receipt (and standalone blob if used) | |
| cloud.substratum.filesystem.driveEntry | Record | Yes | Postgres catalog + outbox lexicon_type; not yet on PDS | Private owner collection (required for tree recovery) | Catalog sync |
| cloud.substratum.filesystem.drive | Record | To publish (Rust DriveMetadata exists in crates/lexicons) | Swarm block + Postgres drive row | Private owner collection | Catalog sync |
| cloud.substratum.filesystem.directory | Record | Planned (ADR 25) | Postgres drive_entry dirs only | Private owner collection when ADR 25 lands | Catalog sync |
Normative collection allowlist (config SUBSTRATUM_PRIVATE_COLLECTIONS, comma-separated):
cloud.substratum.passport.receipt,
cloud.substratum.passport.accessRemovalAck,
cloud.substratum.filesystem.drive,
cloud.substratum.filesystem.driveEntry,
cloud.substratum.filesystem.directory
Grantee repos additionally register:
cloud.substratum.passport.accessRemovalRequest
Filesystem naming: what must survive PDS loss
Users care about human-readable names, not just CIDs. Substratum maps names as follows:
| User-visible concept | Authoritative field(s) | Lexicon / store |
|---|---|---|
| Drive display name | name | cloud.substratum.filesystem.drive (DriveMetadata) + Postgres drive.name |
| Folder path | path where isDirectory: true | cloud.substratum.filesystem.driveEntry (and future filesystem.directory) |
| File path + filename | path (e.g. Photos/2024/vacation.jpg) | cloud.substratum.filesystem.driveEntry — filename is the final path segment |
| File title / caption | title (optional) | cloud.substratum.passport.assetMetadata inline on receipt.metadata |
| MIME / copyright | mimeType, license, … | assetMetadata on receipt |
| ACL / sharing | access_control on receipt; drive-level ACL on drive record | receipt, filesystem.drive |
Normative requirement: ADR 30: Catalog–PDS Dual-Write is mandatory for Phase B onward (dedicated CatalogSyncWorker; ADR 28 receipt sync stays separate). Until dual-write ships, CAR export MUST include a catalog-snapshot.json sidecar (encrypted with the vault) listing drive + drive_entry rows so emergency restore is not blocked by Postgres-only paths. That sidecar is a bridge; the end state is PDS records only in the CAR.
driveEntry record key (rkey): deterministic from (driveId, path) (e.g. hash of normalized path) so imports are idempotent and renames can be modeled as delete + create in a later ADR.
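A minimal sketch of one way to derive such a key, assuming SHA-256 over the drive ID and a normalized path, encoded as lowercase base32. The normalization rules (Unicode NFC, dropping empty and "." segments) and the function names are illustrative assumptions; the ADR only requires that the key be deterministic.

```python
import base64
import hashlib
import unicodedata


def normalize_path(path: str) -> str:
    """Assumed normalization: Unicode NFC, no empty or '.' segments."""
    nfc = unicodedata.normalize("NFC", path)
    segments = [s for s in nfc.split("/") if s not in ("", ".")]
    return "/".join(segments)


def drive_entry_rkey(drive_id: str, path: str) -> str:
    """Derive a stable record key from (driveId, path)."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    material = f"{drive_id}\x00{normalize_path(path)}".encode("utf-8")
    digest = hashlib.sha256(material).digest()
    # Lowercase base32 without padding uses only a-z and 2-7.
    return base64.b32encode(digest).decode("ascii").rstrip("=").lower()


rkey = drive_entry_rkey("drive-1", "Photos/2024/vacation.jpg")
```

Because the key depends only on the drive and the normalized path, replaying the same record during import addresses the same rkey instead of creating a duplicate. Case is deliberately left untouched here, since paths may be case-sensitive.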
Out of scope for this ADR: permissioned spaces (com.atproto.space.*) as a separate realm. If upstream ships spaces broadly, Substratum may align in a later ADR; this ADR uses PDS-private collections first because they match today's createRecord / MST model.
Decision
We will move Substratum AT Protocol records into the PDS private namespace and require local CAR exports on every Substratum runtime that can hold user keys (sidecar, home-base; gateway only as an encrypted, optional cache with explicit consent).
1. Private namespace for cloud.substratum.*
- Collection policy: All Substratum repo collections under cloud.substratum.* are registered as PDS-private with the user's PDS (configuration or lexicon-hosted policy, per PDS capabilities). They remain in the user's signed MST repo but must not appear on the public relay firehose.
- Writes unchanged in trust model: ADR 27 still applies. Records are written with com.atproto.repo.createRecord / putRecord under owner or grantee OAuth, and the PDS signs the commit with the user's #atproto key. Only visibility changes.
- Reads become authenticated: com.atproto.repo.getRecord, listRecords, and sync endpoints require a valid OAuth session (or future delegation) for private collections. The anonymous get_record used by mesh ACL today must be replaced with an authenticated resolver path (see §6).
- Lexicons: No NSID rename. Publish missing JSON defs (cloud.substratum.filesystem.drive, and filesystem.directory when ADR 25 is accepted) under libs/lexicons/defs/. Add optional permissions metadata when the PDS ecosystem stabilizes; until then use the allowlists in Substratum lexicon inventory.
- Supersedes (in part) ADR 27 deployment assumption: ADR 27's signing and MST verification remain authoritative; the assumption that receipts live in the public repo is retired when this ADR is implemented.
3. Dual-write and sync pipelines
Two tracks (see ADR 30 §7):
- Filesystem (ADR 30): catalog_sync_outbox + CatalogSyncWorker converges drive, folder, and file-entry lexicons (cloud.substratum.filesystem.*) from Postgres to the owner PDS.
- Passport (ADR 28 + this ADR): ReceiptSyncWorker continues to own cloud.substratum.passport.*; making those collections PDS-private and switching mesh reads to authenticated getRecord is a refactor on the receipt path, not catalog-sync.
Both are required for CAR recovery and private namespace; event names and scope are in ADR 30—not duplicated here.
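The split between the two tracks can be pictured as a lookup from collection NSID to the pipeline that owns it, following the inventory table. This is a sketch only: the pipeline labels and `pipeline_for` are hypothetical names, not identifiers from ADR 28 or ADR 30.

```python
# Ownership per the lexicon inventory table; the string labels are illustrative.
PIPELINE_BY_COLLECTION = {
    "cloud.substratum.passport.receipt": "receipt-sync",
    "cloud.substratum.passport.accessRemovalAck": "receipt-sync",
    "cloud.substratum.passport.accessRemovalRequest": "inline-ingress",
    "cloud.substratum.filesystem.drive": "catalog-sync",
    "cloud.substratum.filesystem.driveEntry": "catalog-sync",
    "cloud.substratum.filesystem.directory": "catalog-sync",
}


def pipeline_for(collection: str) -> str:
    """Return the pipeline that owns a collection, rejecting unknown NSIDs."""
    try:
        return PIPELINE_BY_COLLECTION[collection]
    except KeyError:
        raise ValueError(f"collection not in Substratum inventory: {collection}") from None
```

Rejecting unknown NSIDs rather than defaulting to one worker keeps a new lexicon from silently landing in the wrong pipeline.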
4. Local CAR vault (emergency export)
Every user-facing node that processes Substratum data maintains a Local CAR Vault on disk:
| Runtime | Role | Vault location (convention) |
|---|---|---|
| Sidecar | Primary for desktop / triangle devices | {data_dir}/substratum/car-vault/{owner_did}/ |
| Home-base | Primary for self-hosted home server | Same layout under home-base data root |
| Gateway | Optional, off by default | Only with explicit opt-in; not the sole copy |
Export triggers:
- After successful PDS write: when ReceiptSyncWorker or CatalogSyncWorker commits a private record, append or roll forward a CAR slice containing that record's block(s) and update a manifest.
- Periodic snapshot: at least daily while online, export all collections in the allowlist (passport + filesystem). Prefer authenticated private getRepo when the PDS supports it; otherwise assemble CAR from outbox commits + listRecords per collection. Until PDS dual-write is complete, attach catalog-snapshot.json (drive + drive_entry rows, including path and name) into the vault bundle beside the CAR.
- User-initiated “Export now”: CLI / installer / file-explorer action writes a portable bundle.
On-disk layout (normative convention):
{data_dir}/substratum/car-vault/{owner_did}/
  manifest.json              # last_rev, collections[], snapshot_times, car_files[]
  latest.car                 # symlink or copy of newest full snapshot
  snapshots/
    {iso8601}-rev-{n}.car
  incremental/
    {iso8601}-{rkey}.car     # optional per-write slices between snapshots
manifest.json (minimum fields):
- owner_did, pds_host (last known), format_version
- collections[]: NSIDs included (must match allowlist; see inventory table)
- lexicon_defs_sha256: hash of bundled libs/lexicons/defs/*.json used at export time (for import validation)
- snapshots[]: { path, repo_rev, created_at, sha256, record_counts_by_collection }
- latest_rev: highest repo revision captured
- catalog_snapshot: optional { path, sha256 } to catalog-snapshot.json when Postgres bridge is used
CAR files use standard AT Protocol repo encoding (DAG-CBOR blocks, MST nodes). Substratum does not invent a parallel archive format.
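As an illustration of the manifest's minimum fields, the following sketch assembles one from a single full snapshot and refuses collections outside the allowlist. The input shape, the `format_version` value of 1, and both function names are assumptions made for the example; the CAR bytes are a placeholder, not a real archive.

```python
import hashlib
import json


def parse_allowlist(raw: str) -> list[str]:
    """Split a comma-separated SUBSTRATUM_PRIVATE_COLLECTIONS value into NSIDs."""
    return [nsid.strip() for nsid in raw.split(",") if nsid.strip()]


def build_manifest(owner_did: str, pds_host: str, allowlist: list[str], snapshot: dict) -> dict:
    """Assemble the minimum manifest fields for one full snapshot (hypothetical input shape)."""
    counts = snapshot["record_counts_by_collection"]
    unexpected = sorted(set(counts) - set(allowlist))
    if unexpected:
        raise ValueError(f"collections outside allowlist: {unexpected}")
    return {
        "format_version": 1,  # assumed starting value
        "owner_did": owner_did,
        "pds_host": pds_host,
        "collections": list(allowlist),
        "lexicon_defs_sha256": snapshot["lexicon_defs_sha256"],
        "snapshots": [{
            "path": snapshot["path"],
            "repo_rev": snapshot["repo_rev"],
            "created_at": snapshot["created_at"],
            "sha256": hashlib.sha256(snapshot["car_bytes"]).hexdigest(),
            "record_counts_by_collection": counts,
        }],
        "latest_rev": snapshot["repo_rev"],
    }


allowlist = parse_allowlist(
    "cloud.substratum.passport.receipt, cloud.substratum.filesystem.driveEntry"
)
manifest = build_manifest("did:plc:example", "pds.example.com", allowlist, {
    "path": "snapshots/2026-05-24T00:00:00Z-rev-1.car",
    "repo_rev": "1",
    "created_at": "2026-05-24T00:00:00Z",
    "car_bytes": b"placeholder bytes, not a real CAR",
    "lexicon_defs_sha256": hashlib.sha256(b"placeholder defs").hexdigest(),
    "record_counts_by_collection": {"cloud.substratum.passport.receipt": 3},
})
print(json.dumps(manifest, indent=2))
```

Recording a SHA-256 per snapshot lets an importer detect a truncated or corrupted file before it attempts to parse any blocks.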
Security:
- CAR vault directories use user-only filesystem permissions (0700).
- Optional encryption at rest (age or OS keychain-wrapped key) for portable exports: default on for gateway-stored copies, off for local sidecar unless the user enables it (performance on home hardware).
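A small sketch of creating the vault directory with user-only permissions on a POSIX system. `ensure_vault_dir` is a hypothetical helper; the path follows the vault location convention above.

```python
import os
from pathlib import Path


def ensure_vault_dir(data_dir: str, owner_did: str) -> Path:
    """Create {data_dir}/substratum/car-vault/{owner_did}/ and restrict it to the owner."""
    vault = Path(data_dir) / "substratum" / "car-vault" / owner_did
    vault.mkdir(parents=True, exist_ok=True)
    # mkdir's mode argument is filtered by the process umask, so set 0700 explicitly.
    os.chmod(vault, 0o700)
    return vault
```

Only the leaf directory is tightened here. Parent directories keep whatever mode the umask gave them, which may or may not be acceptable depending on what else lives under the data root.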
5. Emergency migration and recovery
When a PDS is permanently lost or the user migrates to a new host:
- Import CAR into a new PDS using operator-supported importRepo / per-record replay (see PDS account migration; PDS MOOver recommended).
- Reconcile catalog, in this import order:
  - cloud.substratum.filesystem.drive → drive rows (names, ACL),
  - cloud.substratum.filesystem.driveEntry → rebuild folder/file tree from path (filenames from final segment),
  - cloud.substratum.filesystem.directory (when present) → directory nodes per ADR 25,
  - cloud.substratum.passport.receipt → link assetCid / receiptCid on entries; restore assetMetadata.title for display,
  - If catalog-snapshot.json is present, diff against PDS import and prefer newer updatedAt per row.
- Reconcile mesh: blockstore CIDs referenced by imported receipts must still exist on the triangle (ADR 03); CAR does not replace blob storage.
- Update OAuth / handle: user signs in on the new PDS; ReceiptSyncWorker and CatalogSyncWorker resume with new pds_url.
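Two parts of the catalog reconciliation lend themselves to a short sketch: rebuilding the tree from driveEntry paths, and preferring the newer updatedAt when a catalog snapshot overlaps the PDS import. The row shape (dicts keyed by driveId, path, isDirectory, updatedAt) and the function names are assumptions for illustration, and timestamps are assumed to be ISO 8601.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_tree(entries):
    """Rebuild a nested folder tree from driveEntry paths.

    Folders become dicts; files become None leaves named by the final path segment.
    Assumes no path is recorded as both a file and a folder.
    """
    root = {}
    for entry in entries:
        parts = PurePosixPath(entry["path"]).parts
        if not parts:
            continue  # skip empty paths
        node = root
        for folder in parts[:-1]:
            node = node.setdefault(folder, {})
        if entry.get("isDirectory"):
            node.setdefault(parts[-1], {})
        else:
            node[parts[-1]] = None
    return root


def parse_ts(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def merge_rows(pds_rows, snapshot_rows):
    """Merge rows keyed by (driveId, path), preferring the newer updatedAt."""
    merged = {(r["driveId"], r["path"]): r for r in pds_rows}
    for row in snapshot_rows:
        key = (row["driveId"], row["path"])
        current = merged.get(key)
        if current is None or parse_ts(row["updatedAt"]) > parse_ts(current["updatedAt"]):
            merged[key] = row
    return merged


tree = build_tree([
    {"path": "Photos", "isDirectory": True},
    {"path": "Photos/2024/vacation.jpg"},
    {"path": "notes.txt"},
])
print(tree)
```

Intermediate folders are created on demand, so a tree can still be rebuilt when only file entries survived and the explicit directory rows are missing.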
Product promise: A user who maintained sidecar or home-base CAR vaults can restore drive names, folder paths, file names, and sharing metadata even if their PDS vendor disappears. They cannot recover bytes that were never pinned on the mesh.
6. Read-path and mesh authorization (phased)
| Phase | Behavior |
|---|---|
| A (this ADR) | Document policy, vault layout, collection allowlist; no code change |
| B | ADR 30 dual-write + private PDS registration (ADR 29 §1) |
| C | retrieval uses authenticated PDS client (service token or requester's OAuth forwarded under policy) for receipt + removal lookups |
| D | Sidecar/home-base CAR writer hooks on successful dual-write (outbox success + scheduler); drop catalog-snapshot.json when export verifies full PDS coverage |
| E | Import CLI + operations runbook; file-explorer “Export / Restore” |
Mesh invariant (unchanged): Postgres catalog ACL alone is never sufficient for GetBlock (ADR 28). Private receipts must still be fetched and MST-verified; only the transport becomes authenticated.
Grantee reads: Grantees authorized on a receipt need a defined mechanism to read the owner's private receipt (OAuth scope, short-lived capability JWT, or replicated receipt summary on grantee repo). Phase C must pick one approach; default recommendation: OAuth scope cloud.substratum.passport.read on owner repo via user-delegated session, with gateway caching bounded by acl_version.
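One possible reading of "caching bounded by acl_version" is that a cached receipt is served only while its recorded version equals the current one. The sketch below follows that reading; the class names, the integer version, and the way the caller obtains the current version are all assumptions, since Phase C has not picked a mechanism.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CachedReceipt:
    receipt_cid: str
    acl_version: int          # assumed to be a monotonically increasing integer
    grantee_dids: frozenset


class ReceiptCache:
    """Cache that only returns entries matching the caller's current acl_version."""

    def __init__(self):
        self._entries = {}

    def put(self, asset_cid: str, entry: CachedReceipt) -> None:
        self._entries[asset_cid] = entry

    def get(self, asset_cid: str, current_acl_version: int):
        entry = self._entries.get(asset_cid)
        if entry is None:
            return None
        if entry.acl_version != current_acl_version:
            # Stale ACL: evict so the caller refetches and re-verifies from the PDS.
            del self._entries[asset_cid]
            return None
        return entry
```

A cache miss here means the caller falls back to the authenticated PDS fetch and MST verification; the cache never authorizes a read on its own.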
7. Relationship to other ADRs
| ADR | Relationship |
|---|---|
| 21 | Keep cloud.substratum.* NSIDs and ref-only semantics |
| 22 | DriveMetadata becomes cloud.substratum.filesystem.drive on PDS; Postgres remains query plane |
| 25 | filesystem.directory joins private namespace when first-class directory records ship |
| 27 | Signing model retained; public-repo deployment assumption superseded |
| 28 | Receipt async pipeline (ReceiptSyncWorker); unchanged by catalog dual-write |
| 30 | Dual-write scope, catalog_sync_outbox, CatalogSyncWorker (parallel to receipts) |
| 04 / 10 | Sidecar and home-base own the CAR vault |
| 11 | No return to detached provenanceSignature; CAR is repo export, not a new signature type |
Consequences
Positive
- Metadata sovereignty: Asset CIDs and share graphs are not enumerable from the public repo.
- Vendor independence: Local CAR gives a tangible escape hatch if a PDS shuts down.
- Aligned with AT Protocol direction: Same lexicons and MST commits; uses platform-private data rather than a bespoke metadata database as the sole source of truth.
Negative
- Federation friction: Third parties cannot verify receipts without credentials; intentional tradeoff.
- Implementation cost: Authenticated mesh ACL, grantee receipt access, and CAR lifecycle across three runtimes.
- Storage growth: Snapshots duplicate MST blocks on disk; retention policy required (see below).
- PDS feature dependency: Private collection export APIs vary by PDS version; local incremental export must work when full private getRepo is unavailable.
Retention policy (defaults)
| Artifact | Default retention |
|---|---|
| snapshots/*.car | Last 30 daily snapshots + monthly anchor for 12 months |
| incremental/*.car | Merge into next snapshot; delete after 7 days |
| Gateway copy (if enabled) | 7 days; user must not rely on cloud as only vault |
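The snapshot retention default can be sketched as a selection over snapshot dates. The interpretation is an assumption: "monthly anchor" is taken to mean the earliest snapshot in each calendar month, kept for months up to 12 months older than the current one. `snapshots_to_keep` is a hypothetical helper.

```python
from datetime import date


def month_index(d: date) -> int:
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + (d.month - 1)


def snapshots_to_keep(snapshot_dates, today: date, daily: int = 30, months: int = 12) -> set:
    """Return the snapshot dates to retain; everything else may be pruned."""
    ordered = sorted(set(snapshot_dates), reverse=True)
    keep = set(ordered[:daily])  # the most recent daily snapshots
    anchors = {}
    for d in sorted(ordered):  # oldest first, so the first snapshot of each month wins
        age = month_index(today) - month_index(d)
        if 0 <= age <= months:
            anchors.setdefault(month_index(d), d)
    keep.update(anchors.values())
    return keep
```

Returning the set to keep, rather than deleting inside the function, lets a caller log or dry-run the pruning decision before touching the vault.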
Neutral
- Public Bluesky-style discovery of Substratum files was never a goal; private namespace reinforces product positioning.
Out of scope (follow-on ADRs)
- AT Protocol permissioned spaces (com.atproto.space.*) for multi-user shared realms.
- Cross-account delegation for owner-repo writes without owner online.
- Encrypted CAR passphrase recovery and estate planning (may reference this vault in a future security ADR).
Break-glass recovery via private swarm (trusted peers) — required follow-on
Not in this ADR: pinning local CAR vault snapshots (or per-record repo blocks) on the private libp2p mesh so another trusted peer in the family triangle (ADR 23, ADR 24) can restore when both a member device and the PDS are permanently unavailable.
Why deferred: Normal mesh reads (GetBlock, replication Pin) authorize via live owner-repo passport lookup on the PDS (ADR 27, ADR 28, Swarm Command Security Gaps). If the PDS is down, that path cannot approve fetch of vault CIDs, which is a chicken-and-egg problem unless we add a separate policy.
Must solve in a future ADR (do not weaken default GetBlock PDS checks):
- Vault replication: after local CAR export, pin vault snapshot CIDs across triangle peers (same durability goal as file blocks; ADR 03).
- Break-glass authorization: scoped read (and pin) of registered vault CIDs authorized by triangle membership (e.g. substratum-triangle.json / shared family material), without com.atproto.repo.getRecord to a dead PDS.
- Offline verify + import: surviving peer fetches vault CAR from mesh, verifies MST/signatures inside the archive, imports into a new PDS; optional encryption of vault blobs with a triangle-derived key so PSK membership alone is insufficient.
- Explicit non-goals for break-glass: not a bypass for arbitrary CIDs; not a substitute for grantee PDS records; identity (did:plc / signing keys) recovery remains a separate account-migration problem.
Until that ADR ships, local CAR vault + optional catalog-snapshot.json on each device remain the primary escape hatch; mesh replication of vault data is aspirational only.
Related
- ADR 27: Zero Trust PDS-Based Provenance
- ADR 28: Receipt Sync Queue and Grantee Access Removal
- ADR 21: Passport Lexicons and Ref-Only AT Protocol Records
- ADR 04: Sidecar Design
- ADR 22: Drive Lexicon and Persistence
- ADR 25: Directory Lexicons and First-Class Passports
- Lexicons: libs/lexicons/defs/cloud.substratum.* (see inventory)
- Rust types: crates/lexicons/src/lib.rs (DriveMetadata, DriveEntry, Receipt, AssetMetadata, …)
- ADR 30: Catalog–PDS Dual-Write
- crates/passport-sync/AGENTS.md
- crates/retrieval/src/access_control.rs
- Swarm Command Security Gaps: mesh AuthZ today; break-glass must not erode § design intent