Architecture Decision Record (ADR) 30: Catalog–PDS Dual-Write
Status: Proposed
Date: 2026-05-24
Last Updated: 2026-05-24 (catalog-sync = filesystem lexicons only; passport private namespace separate)
Glossary (acronyms in full, with abbreviation in brackets)
- Architecture Decision Record (ADR)
- Access Control List (ACL)
- Authenticated Transfer Protocol (AT Protocol)
- Content Identifier (CID)
- Decentralized Identifier (DID)
- Namespaced Identifier (NSID)
- Open Authorization (OAuth)
- Personal Data Server (PDS)
Context
Substratum maintains two planes of user data (ADR 28):
| Plane | Store | Used for |
|---|---|---|
| Catalog | PostgreSQL (drive, drive_entry, ACL tables) | File-explorer APIs, RLS, fast listing |
| PDS repo | Owner/grantee AT Protocol repositories | Mesh ACL, cryptographic provenance (ADR 27) |
Today only passport receipts converge from catalog to PDS—and only asynchronously via the receipt sync queue (receipt_sync_outbox, ReceiptSyncWorker in crates/passport-sync). HTTP handlers enqueue; they do not block on com.atproto.repo.createRecord.
Filesystem metadata (drive display names, folder paths, filenames) is largely catalog-only. Ingress sets lexicon_type = cloud.substratum.filesystem.driveEntry on rows but does not write those records to the PDS. That blocks ADR 29 goals: private namespace, CAR export, and PDS-loss recovery of the file tree.
This ADR defines catalog–PDS dual-write for filesystem lexicons—drives, folders, and file entries (cloud.substratum.filesystem.*)—via a catalog-sync pipeline parallel to ADR 28 receipt sync.
In scope (this ADR): filesystem.drive, filesystem.driveEntry, and (when ADR 25 lands) filesystem.directory.
Out of scope (separate refactor): registering cloud.substratum.passport.* as PDS-private collections and authenticated mesh reads (ADR 29). That work stays on the receipt-sync path (ReceiptSyncWorker, grantee inline writes)—same pipeline, different PDS visibility policy—not on CatalogSyncWorker.
Decision
1. Dual-write definition
Catalog dual-write (this ADR): For each user action on drives, folders, or file paths, Substratum MUST:
- Commit catalog rows in PostgreSQL (same transaction boundaries as today, plus catalog_sync_outbox rows where async), and
- Converge cloud.substratum.filesystem.* records on the owner PDS (private namespace when ADR 29 registration ships).
Passport dual-write (receipts, removal acks) already flows through receipt_sync_outbox (ADR 28). Making those collections PDS-private is an ADR 29 refactor on that pipeline—not a catalog-sync concern.
The catalog may be ahead of the PDS; the PDS is mesh authority for receipts. Catalog dual-write does not change mesh ACL rules and does not allow mesh reads from Postgres ACL alone.
2. Two async pipelines (receipt sync unchanged)
Do not fold filesystem dual-write into ReceiptSyncWorker. Keep ADR 28 receipt sync as-is and add a catalog-sync pipeline that mirrors its patterns without sharing implementation:
| Mechanism | Receipt sync (ADR 28) | Catalog sync (this ADR) |
|---|---|---|
| Outbox table | receipt_sync_outbox | catalog_sync_outbox (new) |
| Worker | ReceiptSyncWorker (crates/passport-sync) | CatalogSyncWorker (new crate, e.g. crates/catalog-sync) |
| Ingress enqueue | receipt_sync.rs | catalog_sync.rs (new; same transactional pattern) |
| Lexicon scope | cloud.substratum.passport.* (receipt, ack; removal request inline on grantee repo) | cloud.substratum.filesystem.* only — drive, driveEntry, directory |
| ADR 29 private namespace | Separate refactor on receipt path (authenticated reads + private registration) | Catalog-sync writes filesystem records; private registration follows ADR 29 |
| Catalog status column | receipt_sync on drive_entry / passport rows | pds_sync on drive, drive_entry (or equivalent) |
| Gateway flags | RECEIPT_SYNC_ENABLED, RECEIPT_SYNC_POLL_MS | CATALOG_SYNC_ENABLED, CATALOG_SYNC_POLL_MS |
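As a rough illustration of the gateway flags in the last row, the sketch below reads CATALOG_SYNC_ENABLED and CATALOG_SYNC_POLL_MS through a lookup closure. The struct name, the disabled-by-default behaviour, and the 1000 ms fallback are assumptions made for this example; the ADR does not specify defaults.

```rust
use std::time::Duration;

/// Hypothetical settings holder for the proposed CatalogSyncWorker.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CatalogSyncConfig {
    pub enabled: bool,
    pub poll_interval: Duration,
}

/// Assumed fallback poll interval; not taken from the ADR.
const DEFAULT_POLL_MS: u64 = 1_000;

impl CatalogSyncConfig {
    /// Builds the config from any key lookup, e.g. `|k| std::env::var(k).ok()`.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        // Assumption: the worker stays off unless the flag is explicitly "1" or "true".
        let enabled = lookup("CATALOG_SYNC_ENABLED")
            .map(|v| matches!(v.trim().to_ascii_lowercase().as_str(), "1" | "true"))
            .unwrap_or(false);
        // Unparseable or missing values fall back to the assumed default.
        let poll_ms = lookup("CATALOG_SYNC_POLL_MS")
            .and_then(|v| v.trim().parse::<u64>().ok())
            .unwrap_or(DEFAULT_POLL_MS);
        Self {
            enabled,
            poll_interval: Duration::from_millis(poll_ms),
        }
    }
}
```

Taking a closure rather than reading the process environment directly keeps the parsing testable without mutating global state.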
Shared conventions (copy from ADR 28, do not merge workers):
- Transactional enqueue: catalog mutation + matching outbox row in one Postgres transaction.
- Idempotency keys: (owner_did, subject_id, content_hash); filesystem subject_id encodes drive + path or drive id.
- Per-owner ordering: serialize PDS commits per owner_did within each outbox; receipt and catalog jobs for the same owner may interleave on the PDS but each worker processes its queue in order.
- Claim / RLS: separate claim_catalog_sync_job() with worker session flag (mirror claim_receipt_sync_job()).
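The idempotency and per-owner ordering conventions above can be sketched with an in-memory queue. This is only an illustration: the Outbox type and its methods are hypothetical stand-ins for catalog_sync_outbox and claim_catalog_sync_job(), and real deduplication would presumably rely on a database constraint rather than a HashSet.

```rust
use std::collections::{BTreeMap, HashSet, VecDeque};

/// Idempotency key from the ADR: (owner_did, subject_id, content_hash).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct IdempotencyKey {
    pub owner_did: String,
    pub subject_id: String,
    pub content_hash: String,
}

/// Hypothetical in-memory stand-in for catalog_sync_outbox.
#[derive(Default)]
pub struct Outbox {
    seen: HashSet<IdempotencyKey>,
    // One FIFO queue per owner keeps jobs for a given owner_did in enqueue order.
    per_owner: BTreeMap<String, VecDeque<IdempotencyKey>>,
}

impl Outbox {
    /// Returns false when a job with the same key was already enqueued.
    pub fn enqueue(&mut self, key: IdempotencyKey) -> bool {
        if !self.seen.insert(key.clone()) {
            return false;
        }
        self.per_owner
            .entry(key.owner_did.clone())
            .or_default()
            .push_back(key);
        true
    }

    /// Claims the oldest pending job for one owner.
    pub fn claim_next(&mut self, owner_did: &str) -> Option<IdempotencyKey> {
        self.per_owner.get_mut(owner_did)?.pop_front()
    }
}
```

Because the key includes content_hash, a later change to the same subject produces a new job, while a retried enqueue of identical content is dropped.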
Passport records (cloud.substratum.passport.receipt, removal acks) stay on receipt_sync_outbox / ReceiptSyncWorker only—proven in production; HTTP must not block on owner OAuth for uploads/shares.
Filesystem records (cloud.substratum.filesystem.drive, cloud.substratum.filesystem.driveEntry, future filesystem.directory) use catalog_sync_outbox / CatalogSyncWorker only, with business-event discriminants (§3). Prefer async enqueue when owner PDS OAuth is cold.
Upload finalize may enqueue both outboxes in one transaction (file.uploaded on receipt outbox + file.uploaded / path.ancestors_materialized on catalog outbox); workers run independently.
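A minimal sketch of that upload-finalize step follows, using an in-memory Store in place of Postgres. The Store type, its field layout, and the choice of one path.ancestors_materialized row per missing segment are assumptions for illustration; the point is only that catalog rows and both outboxes change together or not at all.

```rust
/// Hypothetical in-memory stand-in for the tables touched at upload finalize.
#[derive(Debug, Default)]
pub struct Store {
    pub drive_entries: Vec<String>,
    pub receipt_outbox: Vec<(String, String)>, // (event, path)
    pub catalog_outbox: Vec<(String, String)>, // (event, path)
}

impl Store {
    /// Stages every write first, then applies them together, imitating one transaction.
    pub fn finalize_upload(
        &mut self,
        path: &str,
        missing_ancestors: &[&str],
    ) -> Result<(), String> {
        if path.is_empty() {
            // Validation fails before anything is written, so no table changes.
            return Err("empty path".to_string());
        }
        let mut entries = Vec::new();
        let mut catalog = Vec::new();
        for dir in missing_ancestors {
            entries.push(dir.to_string());
            catalog.push(("path.ancestors_materialized".to_string(), dir.to_string()));
        }
        entries.push(path.to_string());
        catalog.push(("file.uploaded".to_string(), path.to_string()));
        let receipt = vec![("file.uploaded".to_string(), path.to_string())];

        // "Commit": catalog rows and both outboxes are updated as one unit.
        self.drive_entries.extend(entries);
        self.catalog_outbox.extend(catalog);
        self.receipt_outbox.extend(receipt);
        Ok(())
    }
}
```

After the commit, each worker would drain its own outbox independently, as the text describes.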
Grantee removal Phase A (accessRemovalRequest on grantee repo) remains inline on the grantee OAuth path when available (ADR 28 §3). Phase B owner convergence (passport.receipt ACL update and passport.accessRemovalAck) uses receipt_sync_outbox / ReceiptSyncWorker only—already implemented in crates/passport-sync; catalog-sync does not handle passport collections.
3. Outbox events are business actions (not DB operations)
The outbox event column stores domain events: something the user (or product) did that must eventually be reflected on the PDS. Names describe user-visible actions, not storage mechanics.
Use:
- Past-tense or short verb phrases: drive.created, file.uploaded, entry.deleted
- One discriminant per distinct user action (do not collapse create/rename/ACL into upsert_requested)
Do not use:
- SQL/ORM terms: upsert, insert, delete_requested, published_requested
- Worker instructions: sync_to_pds, createRecord
- Repo roles as the event name: owner.drive.* (role belongs in job metadata: owner_did, repo)
The worker maps each business event to createRecord / putRecord / deleteRecord; that mapping is implementation detail, not the event name.
Catalog-sync events (catalog_sync_outbox)
Do not add filesystem events to receipt_sync_outbox.
| Business event (event) | User action | PDS convergence |
|---|---|---|
| drive.created | User creates a new drive | filesystem.drive create |
| drive.renamed | User changes drive display name | filesystem.drive put |
| drive.acl_updated | User changes drive-level sharing/ACL | filesystem.drive put |
| folder.created | User creates a folder (mkdir) | filesystem.driveEntry create/put |
| file.uploaded | User completes upload; file appears at path | filesystem.driveEntry create/put |
| path.ancestors_materialized | Product creates missing parent path segments (implicit dirs) | filesystem.driveEntry per segment |
| path.moved | User moves or renames a file or folder | driveEntry delete + create/put |
| entry.deleted | User deletes a file or folder | deleteRecord on driveEntry |
| directory.created | First-class directory record (ADR 25) | filesystem.directory |
Rust enum variants follow the same vocabulary (e.g. CatalogSyncEvent::DriveCreated), with kind_str() returning the dotted business string above. cloud.substratum.passport.* records are never catalog-sync events—see receipt-sync below.
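One possible shape for that enum is sketched below. CatalogSyncEvent, DriveCreated, and kind_str() come from the text; the remaining variant names, the ALL constant, from_kind(), and collection() are assumptions derived from the event table, not a confirmed API.

```rust
/// Business events for catalog_sync_outbox, one variant per user action.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CatalogSyncEvent {
    DriveCreated,
    DriveRenamed,
    DriveAclUpdated,
    FolderCreated,
    FileUploaded,
    PathAncestorsMaterialized,
    PathMoved,
    EntryDeleted,
    DirectoryCreated,
}

impl CatalogSyncEvent {
    /// Every variant, used for parsing stored event strings (hypothetical helper).
    pub const ALL: [Self; 9] = [
        Self::DriveCreated, Self::DriveRenamed, Self::DriveAclUpdated,
        Self::FolderCreated, Self::FileUploaded, Self::PathAncestorsMaterialized,
        Self::PathMoved, Self::EntryDeleted, Self::DirectoryCreated,
    ];

    /// Dotted business string stored in the outbox event column.
    pub fn kind_str(self) -> &'static str {
        match self {
            Self::DriveCreated => "drive.created",
            Self::DriveRenamed => "drive.renamed",
            Self::DriveAclUpdated => "drive.acl_updated",
            Self::FolderCreated => "folder.created",
            Self::FileUploaded => "file.uploaded",
            Self::PathAncestorsMaterialized => "path.ancestors_materialized",
            Self::PathMoved => "path.moved",
            Self::EntryDeleted => "entry.deleted",
            Self::DirectoryCreated => "directory.created",
        }
    }

    /// Inverse of kind_str; storage-style names such as "upsert" are rejected.
    pub fn from_kind(kind: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|e| e.kind_str() == kind)
    }

    /// Filesystem collection the event converges on, per the table above.
    pub fn collection(self) -> &'static str {
        match self {
            Self::DriveCreated | Self::DriveRenamed | Self::DriveAclUpdated => {
                "cloud.substratum.filesystem.drive"
            }
            Self::DirectoryCreated => "cloud.substratum.filesystem.directory",
            _ => "cloud.substratum.filesystem.driveEntry",
        }
    }
}
```

The choice between createRecord, putRecord, and deleteRecord is deliberately left out of the enum here, since the ADR treats that mapping as a worker implementation detail.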
Receipt-sync events (receipt_sync_outbox)
Remain on ADR 28. Target business names (legacy shipped strings in parentheses):
| Business event | User action | Legacy event (until migration) |
|---|---|---|
| file.uploaded | Upload finalize | owner.receipt.published_requested |
| file.access_changed | Owner share, ACL patch, descendant inherit | owner.receipt.published_requested |
| access.removal_acknowledged | Owner converges after grantee removal: updates passport.receipt ACL and writes passport.accessRemovalAck (same worker job) | owner.access_removal_acknowledged |
Not receipt-outbox today: grantee accessRemovalRequest is written inline in ingress; GranteeAccessRemovalRequested / grantee.access_removal_requested exists in types.rs but is not enqueued from ingress (dead path unless a future snapshot job is added).
Splitting file.uploaded vs file.access_changed on the receipt outbox is recommended when ingress is next touched; until then one legacy row may cover both actions.
Same business name, two outboxes: file.uploaded on catalog-sync (driveEntry) and on receipt-sync (passport.receipt) are different jobs with different payloads. This is allowed when the user action spans both planes.
Payload MUST include serialized lexicon record bytes (or sufficient fields to build them) so CatalogSyncWorker does not re-query Postgres for authoritative content during PDS write—mirroring receipt payloads today.
4. Scope: user actions that MUST dual-write
| User action | Catalog tables | Private PDS record(s) | Repo | Outbox event(s) |
|---|---|---|---|---|
| Create drive | drive | cloud.substratum.filesystem.drive | Owner | drive.created |
| Rename drive | drive | filesystem.drive | Owner | drive.renamed |
| Change drive ACL | drive | filesystem.drive | Owner | drive.acl_updated |
| Create folder | drive_entry | filesystem.driveEntry | Owner | folder.created |
| Upload file / finalize | drive_entry, passport | driveEntry + passport.receipt | Owner | file.uploaded (catalog + receipt outboxes) |
| Implicit parent paths | drive_entry rows | driveEntry per segment | Owner | path.ancestors_materialized |
| Move / rename path | drive_entry | driveEntry delete + put | Owner | path.moved |
| Delete file / folder | drive_entry | deleteRecord on driveEntry | Owner | entry.deleted |
| Share / ACL patch (owner) — file/folder | ACL + receipt outbox | passport.receipt | Owner | file.access_changed (receipt only) |
| Share / ACL patch (owner) — drive root | ACL + receipt outbox today | passport.receipt (+ drive ACL tables) | Owner | file.access_changed today; add drive.acl_updated on catalog outbox when filesystem.drive ships |
| Grantee self-removal (intent) | catalog | passport.accessRemovalRequest | Grantee | Inline grantee OAuth in ingress (ADR 28 §3); not catalog-sync |
| Grantee self-removal (owner) | catalog + receipt outbox | passport.receipt + passport.accessRemovalAck | Owner | access.removal_acknowledged (receipt outbox only) |
| First-class directory (ADR 25) | drive_entry | filesystem.directory | Owner | directory.created |
driveEntry rkey: deterministic from (driveId, normalized path) for idempotent PDS writes (ADR 29).
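A deterministic rkey could be derived along the following lines. The normalization rules (collapse empty and "." segments, leading slash, no trailing slash) and the use of 64-bit FNV-1a are assumptions chosen to keep the sketch dependency-free; neither is specified by this ADR, and a production scheme would more plausibly use a cryptographic hash to make collisions negligible.

```rust
/// Assumed normalization: drop empty and "." segments, always start with "/".
fn normalize_path(path: &str) -> String {
    let segments: Vec<&str> = path
        .split('/')
        .filter(|s| !s.is_empty() && *s != ".")
        .collect();
    format!("/{}", segments.join("/"))
}

/// 64-bit FNV-1a, used here only as a stable stand-in hash.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3);
    }
    hash
}

/// Derives the same rkey for the same (drive id, normalized path) on every call.
pub fn drive_entry_rkey(drive_id: &str, path: &str) -> String {
    let normalized = normalize_path(path);
    let mut input = Vec::with_capacity(drive_id.len() + 1 + normalized.len());
    input.extend_from_slice(drive_id.as_bytes());
    // NUL separator so the drive id and the path cannot run into each other.
    input.push(0);
    input.extend_from_slice(normalized.as_bytes());
    // 16 lowercase hex characters; hex stays within the record-key character set.
    format!("{:016x}", fnv1a64(&input))
}
```

Because the key depends only on the drive id and the normalized path, a retried job targets the same record instead of creating a duplicate.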
5. Catalog sync status
Add pds_sync on drive and drive_entry (pending | synced | failed) for filesystem convergence. Leave receipt_sync on passport-bearing rows for receipt-only status (ADR 28); do not overload receipt_sync for drive metadata.
Mesh GetBlock continues to consult owner passport.receipt only; a driveEntry with pds_sync = pending must not imply mesh access without a valid receipt.
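The status values and the mesh rule above might be expressed as follows. PdsSync and mesh_get_block_allowed are hypothetical names for this sketch; the only behaviour taken from the ADR is that the decision depends on a valid passport.receipt and never on pds_sync.

```rust
/// Filesystem convergence status stored in the pds_sync column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PdsSync {
    Pending,
    Synced,
    Failed,
}

impl PdsSync {
    /// String form as listed in the ADR: pending | synced | failed.
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Pending => "pending",
            Self::Synced => "synced",
            Self::Failed => "failed",
        }
    }

    /// Parses a stored column value; unknown strings yield None.
    pub fn parse(value: &str) -> Option<Self> {
        match value {
            "pending" => Some(Self::Pending),
            "synced" => Some(Self::Synced),
            "failed" => Some(Self::Failed),
            _ => None,
        }
    }
}

/// Mesh access is decided by the receipt alone; the pds_sync argument is
/// accepted and deliberately ignored to make that independence explicit.
pub fn mesh_get_block_allowed(has_valid_receipt: bool, _entry_status: PdsSync) -> bool {
    has_valid_receipt
}
```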
6. Bridge until dual-write ships
Per ADR 29: CAR export includes catalog-snapshot.json when filesystem rows lack a corresponding private PDS record. Remove the sidecar once export verifies full dual-write coverage.
7. Scope boundary: catalog-sync vs passport private refactor
| Concern | Pipeline | This ADR? |
|---|---|---|
| Drive / folder / file names and paths on PDS | CatalogSyncWorker → filesystem.drive, filesystem.driveEntry, filesystem.directory | Yes |
| Passport receipts, removal acks, mesh ACL | ReceiptSyncWorker → passport.receipt, passport.accessRemovalAck | No (ADR 28; already shipped) |
| Grantee removal request | Inline ingress → passport.accessRemovalRequest on grantee repo | No (ADR 28) |
| Register passport.* as PDS-private + authenticated mesh reads | Receipt-sync refactor + retrieval (ADR 29) | No (parallel track, not catalog-sync) |
Catalog-sync is only for drive, folder, and file-entry lexicons. Passport private collections are a separate refactor on the existing receipt pipeline; do not add passport handlers to crates/catalog-sync.
8. Gap audit vs shipped ingress (2026-05-24)
| Issue | Shipped behavior | ADR 30 target |
|---|---|---|
| Filesystem not on PDS | create_drive, mkdir, rename, move, delete update Postgres only (ingress/AGENTS.md) | Catalog outbox + pds_sync per §4 |
| Legacy receipt event string | Single owner.receipt.published_requested for upload, share, inherit, drive-root ACL | Split file.uploaded / file.access_changed (+ drive.acl_updated on catalog for drive metadata) |
| GranteeAccessRemovalRequested | Enum + worker branch; never enqueued from ingress | Grantee intent stays inline; remove or repurpose dead variant |
| accessRemovalAck “Phase 2d planned” | Written in ReceiptSyncWorker when removal_request_uri set | Docs updated (ADR 28) |
| Drive root ACL | Receipt outbox only (SubjectType::Drive) | Also drive.acl_updated → filesystem.drive when catalog-sync ships |
| ADR cross-link | — | ADR 28 Related must not say catalog-sync owns passport collections |
Consequences
Positive
- Separation of concerns: ReceiptSyncWorker stays focused on passport mesh authority; CatalogSyncWorker owns filesystem lexicons and private namespace writes.
- Shared mental model per pipeline: catalog intent → outbox → worker → PDS (same as receipts, different queue).
- Filesystem paths and drive names become recoverable from PDS/CAR without Postgres.
- Receipt retry, poison, and per-owner ordering patterns are reused without entangling code paths.
Negative
- Two outboxes and workers to operate, monitor, and test.
- More outbox volume on deep uploads (implicit parent paths).
- UI must surface pds_sync for directories and drives and receipt_sync for file passport state.
- Upload finalize must enqueue both queues consistently when one txn commits.
Neutral
- Dual-write does not replace ADR 22 swarm copies of DriveMetadata immediately; convergence path is PDS-first for new work.
Implementation phases
| Phase | Deliverable |
|---|---|
| 1 | Publish cloud.substratum.filesystem.drive lexicon; document catalog-sync event payloads |
| 2 | Migration: catalog_sync_outbox + claim_catalog_sync_job(); ingress catalog_sync.rs enqueue with business event strings |
| 3 | CatalogSyncWorker + crates/catalog-sync for filesystem.* dual-write (ADR 29 private registration for those collections can ship in parallel) |
| 4 | pds_sync on drive and drive_entry; file-explorer affordances (receipt UI unchanged) |
| 5 | Drop catalog-snapshot.json from CAR when verifier passes |
Related
- ADR 35: Drive Node Delete (Three-Layer Removal). entry.deleted on catalog-sync; receipt tombstone via receipt-sync
- ADR 28: Receipt Sync Queue and Grantee Access Removal (authoritative async receipt writing; unchanged by this ADR)
- crates/catalog-sync/ (proposed): CatalogSyncWorker, mirror crates/passport-sync/AGENTS.md
- ADR 29: Private PDS Namespace and Local CAR Emergency Export (passport private namespace refactor + CAR vault; not catalog-sync scope)
- ADR 27: Zero Trust PDS-Based Provenance
- ADR 22: Drive Lexicon and Persistence
- ADR 25: Directory Lexicons and First-Class Passports
- crates/passport-sync/AGENTS.md
- crates/ingress/AGENTS.md