Architecture Decision Record (ADR) 30: Catalog–PDS Dual-Write

Status: Proposed
Date: 2026-05-24
Last Updated: 2026-05-24 (catalog-sync = filesystem lexicons only; passport private namespace separate)

Glossary (acronyms in full, with abbreviation in brackets)

  • Architecture Decision Record (ADR)
  • Access Control List (ACL)
  • Authenticated Transfer Protocol (AT Protocol)
  • Content Identifier (CID)
  • Decentralized Identifier (DID)
  • Namespaced Identifier (NSID)
  • Open Authorization (OAuth)
  • Personal Data Server (PDS)

Context

Substratum maintains two planes of user data (ADR 28):

| Plane | Store | Used for |
|---|---|---|
| Catalog | PostgreSQL (drive, drive_entry, ACL tables) | File-explorer APIs, RLS, fast listing |
| PDS repo | Owner/grantee AT Protocol repositories | Mesh ACL, cryptographic provenance (ADR 27) |

Today only passport receipts converge from catalog to PDS—and only asynchronously via the receipt sync queue (receipt_sync_outbox, ReceiptSyncWorker in crates/passport-sync). HTTP handlers enqueue; they do not block on com.atproto.repo.createRecord.

Filesystem metadata (drive display names, folder paths, filenames) is largely catalog-only. Ingress sets lexicon_type = cloud.substratum.filesystem.driveEntry on rows but does not write those records to the PDS. That blocks ADR 29 goals: private namespace, CAR export, and PDS-loss recovery of the file tree.

This ADR defines catalog–PDS dual-write for filesystem lexicons—drives, folders, and file entries (cloud.substratum.filesystem.*)—via a catalog-sync pipeline parallel to ADR 28 receipt sync.

In scope (this ADR): filesystem.drive, filesystem.driveEntry, and (when ADR 25 lands) filesystem.directory.

Out of scope (separate refactor): registering cloud.substratum.passport.* as PDS-private collections and authenticated mesh reads (ADR 29). That work stays on the receipt-sync path (ReceiptSyncWorker, grantee inline writes)—same pipeline, different PDS visibility policy—not on CatalogSyncWorker.

Decision

1. Dual-write definition

Catalog dual-write (this ADR): For each user action on drives, folders, or file paths, Substratum MUST:

  1. Commit catalog rows in PostgreSQL (same transaction boundaries as today, plus catalog_sync_outbox rows where async), and
  2. Converge cloud.substratum.filesystem.* records on the owner PDS (private namespace when ADR 29 registration ships).

Passport dual-write (receipts, removal acks) already flows through receipt_sync_outbox (ADR 28). Making those collections PDS-private is an ADR 29 refactor on that pipeline—not a catalog-sync concern.

The catalog may be ahead of the PDS; the PDS is mesh authority for receipts. Catalog dual-write does not change mesh ACL rules and does not allow mesh reads from Postgres ACL alone.

2. Two async pipelines (receipt sync unchanged)

Do not fold filesystem dual-write into ReceiptSyncWorker. Keep ADR 28 receipt sync as-is and add a catalog-sync pipeline that mirrors its patterns without sharing implementation:

| Mechanism | Receipt sync (ADR 28) | Catalog sync (this ADR) |
|---|---|---|
| Outbox table | receipt_sync_outbox | catalog_sync_outbox (new) |
| Worker | ReceiptSyncWorker (crates/passport-sync) | CatalogSyncWorker (new crate, e.g. crates/catalog-sync) |
| Ingress enqueue | receipt_sync.rs | catalog_sync.rs (new; same transactional pattern) |
| Lexicon scope | cloud.substratum.passport.* (receipt, ack; removal request inline on grantee repo) | cloud.substratum.filesystem.* only: drive, driveEntry, directory |
| ADR 29 private namespace | Separate refactor on receipt path (authenticated reads + private registration) | Catalog-sync writes filesystem records; private registration follows ADR 29 |
| Catalog status column | receipt_sync on drive_entry / passport rows | pds_sync on drive, drive_entry (or equivalent) |
| Gateway flags | RECEIPT_SYNC_ENABLED, RECEIPT_SYNC_POLL_MS | CATALOG_SYNC_ENABLED, CATALOG_SYNC_POLL_MS |
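
As a minimal sketch of how the catalog-sync gateway flags might be read, the following parses CATALOG_SYNC_ENABLED and CATALOG_SYNC_POLL_MS through an injected lookup so it can be tested without touching the process environment. The struct name, the accepted truthy values, and the defaults (disabled, 1000 ms) are assumptions for illustration; this ADR does not specify them.

```rust
use std::time::Duration;

/// Hypothetical settings for the catalog-sync worker, read from gateway flags.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CatalogSyncConfig {
    pub enabled: bool,
    pub poll_interval: Duration,
}

impl CatalogSyncConfig {
    /// Builds the config from a key lookup, e.g. `|k| std::env::var(k).ok()`.
    /// Assumed defaults for this sketch: disabled, 1000 ms poll interval.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let enabled = lookup("CATALOG_SYNC_ENABLED")
            .map(|v| matches!(v.trim().to_ascii_lowercase().as_str(), "1" | "true"))
            .unwrap_or(false);
        let poll_ms = lookup("CATALOG_SYNC_POLL_MS")
            .and_then(|v| v.trim().parse::<u64>().ok())
            .unwrap_or(1000);
        Self {
            enabled,
            poll_interval: Duration::from_millis(poll_ms),
        }
    }
}
```

Keeping the lookup injectable means the receipt-sync flags can stay on their own config type, in line with the rule that the two pipelines share patterns but not implementation.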

Shared conventions (copy from ADR 28, do not merge workers):

  • Transactional enqueue: catalog mutation + matching outbox row in one Postgres transaction.
  • Idempotency keys: (owner_did, subject_id, content_hash); filesystem subject_id encodes drive + path or drive id.
  • Per-owner ordering: serialize PDS commits per owner_did within each outbox; receipt and catalog jobs for the same owner may interleave on the PDS but each worker processes its queue in order.
  • Claim / RLS: separate claim_catalog_sync_job() with worker session flag (mirror claim_receipt_sync_job()).
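
The idempotency-key and per-owner-ordering conventions above can be sketched in memory as follows. The OutboxJob type, its numeric id, and the plan_per_owner helper are hypothetical; a real implementation would more likely rely on a unique constraint and the claim function in Postgres than on in-process deduplication.

```rust
use std::collections::{BTreeMap, HashSet};

/// Idempotency key from the shared conventions: (owner_did, subject_id, content_hash).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct IdempotencyKey {
    pub owner_did: String,
    pub subject_id: String,
    pub content_hash: String,
}

/// Hypothetical in-memory view of one catalog_sync_outbox row.
#[derive(Debug, Clone)]
pub struct OutboxJob {
    /// Assumed to increase monotonically with enqueue order.
    pub id: u64,
    pub key: IdempotencyKey,
    pub event: String,
}

/// Drops jobs whose idempotency key was already seen, then groups the rest
/// by owner, preserving enqueue order inside each owner's queue.
pub fn plan_per_owner(mut jobs: Vec<OutboxJob>) -> BTreeMap<String, Vec<OutboxJob>> {
    jobs.sort_by_key(|job| job.id);
    let mut seen = HashSet::new();
    let mut by_owner: BTreeMap<String, Vec<OutboxJob>> = BTreeMap::new();
    for job in jobs {
        // `insert` returns false when the key is a duplicate.
        if seen.insert(job.key.clone()) {
            by_owner
                .entry(job.key.owner_did.clone())
                .or_default()
                .push(job);
        }
    }
    by_owner
}
```

Each owner's vector can then be drained sequentially, while different owners may be processed concurrently.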

Passport records (cloud.substratum.passport.receipt, removal acks) stay on receipt_sync_outbox / ReceiptSyncWorker only—proven in production; HTTP must not block on owner OAuth for uploads/shares.

Filesystem records (cloud.substratum.filesystem.drive, cloud.substratum.filesystem.driveEntry, future filesystem.directory) use catalog_sync_outbox / CatalogSyncWorker only, with business-event discriminants (§3). Prefer async enqueue when owner PDS OAuth is cold.

Upload finalize may enqueue both outboxes in one transaction (file.uploaded on receipt outbox + file.uploaded / path.ancestors_materialized on catalog outbox); workers run independently.
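
To illustrate the "both outboxes in one transaction" behavior for upload finalize, here is a small in-memory sketch. Store and Tx are hypothetical stand-ins for Postgres and a database transaction, and the "drive:path" subject encoding is an assumption; the point is only that either all staged rows become visible or none do.

```rust
/// Which outbox a staged row targets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outbox {
    Receipt,
    Catalog,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct OutboxRow {
    pub outbox: Outbox,
    pub event: &'static str,
    pub subject_id: String,
}

/// Hypothetical stand-in for Postgres: holds committed outbox rows only.
#[derive(Debug, Default)]
pub struct Store {
    pub rows: Vec<OutboxRow>,
}

/// Rows staged here are invisible until `commit`; dropping the value
/// without committing discards them, like a rolled-back transaction.
pub struct Tx<'a> {
    store: &'a mut Store,
    staged: Vec<OutboxRow>,
}

impl Store {
    pub fn begin(&mut self) -> Tx<'_> {
        Tx { store: self, staged: Vec::new() }
    }
}

impl Tx<'_> {
    pub fn enqueue(&mut self, outbox: Outbox, event: &'static str, subject_id: &str) {
        self.staged.push(OutboxRow {
            outbox,
            event,
            subject_id: subject_id.to_string(),
        });
    }

    pub fn commit(self) {
        let Tx { store, staged } = self;
        store.rows.extend(staged);
    }
}

/// Enqueues the receipt job and the catalog job(s) for one upload atomically.
pub fn finalize_upload(store: &mut Store, drive_id: &str, path: &str, created_parents: bool) {
    // Assumed subject encoding: "<drive id>:<path>".
    let subject_id = format!("{drive_id}:{path}");
    let mut tx = store.begin();
    tx.enqueue(Outbox::Receipt, "file.uploaded", &subject_id);
    tx.enqueue(Outbox::Catalog, "file.uploaded", &subject_id);
    if created_parents {
        tx.enqueue(Outbox::Catalog, "path.ancestors_materialized", &subject_id);
    }
    tx.commit();
}
```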

Grantee removal Phase A (accessRemovalRequest on grantee repo) remains inline on the grantee OAuth path when available (ADR 28 §3). Phase B owner convergence (passport.receipt ACL update and passport.accessRemovalAck) uses receipt_sync_outbox / ReceiptSyncWorker only—already implemented in crates/passport-sync; catalog-sync does not handle passport collections.

3. Outbox events are business actions (not DB operations)

The outbox event column stores domain events: something the user (or product) did that must eventually be reflected on the PDS. Names describe user-visible actions, not storage mechanics.

Use:

  • Past-tense or short verb phrases: drive.created, file.uploaded, entry.deleted
  • One discriminant per distinct user action (do not collapse create/rename/ACL into upsert_requested)

Do not use:

  • SQL/ORM terms: upsert, insert, delete_requested, published_requested
  • Worker instructions: sync_to_pds, createRecord
  • Repo roles as the event name: owner.drive.* (role belongs in job metadata: owner_did, repo)

The worker maps each business event to createRecord / putRecord / deleteRecord; that mapping is implementation detail, not the event name.

Catalog-sync events (catalog_sync_outbox)

Do not add filesystem events to receipt_sync_outbox.

| Business event (event) | User action | PDS convergence |
|---|---|---|
| drive.created | User creates a new drive | filesystem.drive create |
| drive.renamed | User changes drive display name | filesystem.drive put |
| drive.acl_updated | User changes drive-level sharing/ACL | filesystem.drive put |
| folder.created | User creates a folder (mkdir) | filesystem.driveEntry create/put |
| file.uploaded | User completes upload; file appears at path | filesystem.driveEntry create/put |
| path.ancestors_materialized | Product creates missing parent path segments (implicit dirs) | filesystem.driveEntry per segment |
| path.moved | User moves or renames a file or folder | driveEntry delete + create/put |
| entry.deleted | User deletes a file or folder | deleteRecord on driveEntry |
| directory.created | First-class directory record (ADR 25) | filesystem.directory |

Rust enum variants follow the same vocabulary (e.g. CatalogSyncEvent::DriveCreated), with kind_str() returning the dotted business string above. cloud.substratum.passport.* records are never catalog-sync events—see receipt-sync below.
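
A minimal sketch of that enum, assuming the nine catalog-sync events listed above. CatalogSyncEvent and kind_str() are named in this ADR; the ALL constant, parse(), PdsOp, and pds_ops() are hypothetical helpers. Where the table says "create/put", the sketch assumes an idempotent put, and it assumes a create for directory.created, which the table leaves open.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CatalogSyncEvent {
    DriveCreated,
    DriveRenamed,
    DriveAclUpdated,
    FolderCreated,
    FileUploaded,
    PathAncestorsMaterialized,
    PathMoved,
    EntryDeleted,
    DirectoryCreated,
}

/// Hypothetical worker-side operation; not part of the event name.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PdsOp {
    Create,
    Put,
    Delete,
}

impl CatalogSyncEvent {
    pub const ALL: [CatalogSyncEvent; 9] = [
        Self::DriveCreated,
        Self::DriveRenamed,
        Self::DriveAclUpdated,
        Self::FolderCreated,
        Self::FileUploaded,
        Self::PathAncestorsMaterialized,
        Self::PathMoved,
        Self::EntryDeleted,
        Self::DirectoryCreated,
    ];

    /// Dotted business string stored in the outbox `event` column.
    pub fn kind_str(self) -> &'static str {
        match self {
            Self::DriveCreated => "drive.created",
            Self::DriveRenamed => "drive.renamed",
            Self::DriveAclUpdated => "drive.acl_updated",
            Self::FolderCreated => "folder.created",
            Self::FileUploaded => "file.uploaded",
            Self::PathAncestorsMaterialized => "path.ancestors_materialized",
            Self::PathMoved => "path.moved",
            Self::EntryDeleted => "entry.deleted",
            Self::DirectoryCreated => "directory.created",
        }
    }

    /// Inverse of `kind_str`; unknown strings (e.g. "upsert") yield `None`.
    pub fn parse(s: &str) -> Option<Self> {
        Self::ALL.iter().copied().find(|e| e.kind_str() == s)
    }

    /// Assumed mapping from business event to PDS operations.
    pub fn pds_ops(self) -> &'static [PdsOp] {
        match self {
            Self::DriveCreated | Self::DirectoryCreated => &[PdsOp::Create],
            Self::DriveRenamed | Self::DriveAclUpdated => &[PdsOp::Put],
            Self::FolderCreated | Self::FileUploaded | Self::PathAncestorsMaterialized => {
                &[PdsOp::Put]
            }
            Self::PathMoved => &[PdsOp::Delete, PdsOp::Put],
            Self::EntryDeleted => &[PdsOp::Delete],
        }
    }
}
```

Keeping pds_ops() separate from kind_str() reflects the rule that the worker mapping is implementation detail and never leaks into the stored event name.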

Receipt-sync events (receipt_sync_outbox)

Remain on ADR 28. Target business names (legacy shipped strings in parentheses):

| Business event | User action | Legacy event (until migration) |
|---|---|---|
| file.uploaded | Upload finalize | owner.receipt.published_requested |
| file.access_changed | Owner share, ACL patch, descendant inherit | owner.receipt.published_requested |
| access.removal_acknowledged | Owner converges after grantee removal: updates passport.receipt ACL and writes passport.accessRemovalAck (same worker job) | owner.access_removal_acknowledged |

Not receipt-outbox today: grantee accessRemovalRequest is written inline in ingress; GranteeAccessRemovalRequested / grantee.access_removal_requested exists in types.rs but is not enqueued from ingress (dead path unless a future snapshot job is added).

Splitting file.uploaded vs file.access_changed on the receipt outbox is recommended when ingress is next touched; until then one legacy row may cover both actions.

Same business name, two outboxes: file.uploaded on catalog-sync (driveEntry) and receipt-sync (passport.receipt) are different jobs with different payloads. This is allowed when the user action spans both planes.

Payload MUST include serialized lexicon record bytes (or sufficient fields to build them) so CatalogSyncWorker does not re-query Postgres for authoritative content during PDS write—mirroring receipt payloads today.
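
One way to picture a self-sufficient payload is sketched below: the worker builds its PDS write purely from the outbox payload and never consults the catalog. CatalogSyncPayload, PdsWrite, and build_pds_write are hypothetical names, and the field layout is an assumption rather than the shipped receipt payload format.

```rust
/// Hypothetical payload stored alongside a catalog_sync_outbox row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CatalogSyncPayload {
    pub collection: String,
    pub rkey: String,
    /// Serialized lexicon record, captured at enqueue time.
    pub record_bytes: Vec<u8>,
}

/// Hypothetical description of the write the worker sends to the owner PDS.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PdsWrite {
    pub repo: String,
    pub collection: String,
    pub rkey: String,
    pub record: Vec<u8>,
}

/// Builds the PDS write from the payload alone (no Postgres re-query).
/// Rejects empty records and any collection outside the filesystem lexicons.
pub fn build_pds_write(owner_did: &str, payload: &CatalogSyncPayload) -> Result<PdsWrite, String> {
    if !payload.collection.starts_with("cloud.substratum.filesystem.") {
        return Err(format!("not a catalog-sync collection: {}", payload.collection));
    }
    if payload.record_bytes.is_empty() {
        return Err("payload carries no record bytes".to_string());
    }
    Ok(PdsWrite {
        repo: owner_did.to_string(),
        collection: payload.collection.clone(),
        rkey: payload.rkey.clone(),
        record: payload.record_bytes.clone(),
    })
}
```

The collection check doubles as a guard for the scope rule: a passport.* payload reaching this worker is treated as an error.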

4. Scope: user actions that MUST dual-write

| User action | Catalog tables | Private PDS record(s) | Repo | Outbox event(s) |
|---|---|---|---|---|
| Create drive | drive | cloud.substratum.filesystem.drive | Owner | drive.created |
| Rename drive | drive | filesystem.drive | Owner | drive.renamed |
| Change drive ACL | drive | filesystem.drive | Owner | drive.acl_updated |
| Create folder | drive_entry | filesystem.driveEntry | Owner | folder.created |
| Upload file / finalize | drive_entry, passport | driveEntry + passport.receipt | Owner | file.uploaded (catalog + receipt outboxes) |
| Implicit parent paths | drive_entry rows | driveEntry per segment | Owner | path.ancestors_materialized |
| Move / rename path | drive_entry | driveEntry delete + put | Owner | path.moved |
| Delete file / folder | drive_entry | deleteRecord on driveEntry | Owner | entry.deleted |
| Share / ACL patch (owner): file/folder | ACL + receipt outbox | passport.receipt | Owner | file.access_changed (receipt only) |
| Share / ACL patch (owner): drive root | ACL + receipt outbox today | passport.receipt (+ drive ACL tables) | Owner | file.access_changed today; add drive.acl_updated on catalog outbox when filesystem.drive ships |
| Grantee self-removal (intent) | catalog | passport.accessRemovalRequest | Grantee | Inline grantee OAuth in ingress (ADR 28 §3); not catalog-sync |
| Grantee self-removal (owner) | catalog + receipt outbox | passport.receipt + passport.accessRemovalAck | Owner | access.removal_acknowledged (receipt outbox only) |
| First-class directory (ADR 25) | drive_entry | filesystem.directory | Owner | directory.created |

driveEntry rkey: deterministic from (driveId, normalized path) for idempotent PDS writes (ADR 29).
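
A deterministic rkey could be derived along the following lines. This is a sketch only: the normalization rules and the use of 64-bit FNV-1a are assumptions chosen to keep the example dependency-free, and ADR 29 may prescribe a different (likely cryptographic) hash and encoding.

```rust
/// Collapses repeated slashes, drops "." segments and any trailing slash,
/// and ensures a single leading slash. ".." is not resolved in this sketch.
pub fn normalize_path(path: &str) -> String {
    let segments: Vec<&str> = path
        .split('/')
        .filter(|s| !s.is_empty() && *s != ".")
        .collect();
    format!("/{}", segments.join("/"))
}

/// 64-bit FNV-1a, used here only as a stand-in for the real hash.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Derives the same rkey for the same (drive id, normalized path) pair,
/// so a retried write targets the same record instead of creating a duplicate.
pub fn drive_entry_rkey(drive_id: &str, path: &str) -> String {
    let normalized = normalize_path(path);
    let mut input = Vec::with_capacity(drive_id.len() + 1 + normalized.len());
    input.extend_from_slice(drive_id.as_bytes());
    // NUL separator keeps ("ab", "/c") distinct from ("a", "b/c").
    input.push(0);
    input.extend_from_slice(normalized.as_bytes());
    // 16 lowercase hex characters.
    format!("{:016x}", fnv1a64(&input))
}
```

Because the key depends only on the drive id and the normalized path, a move changes the rkey, which matches the "delete + create/put" convergence listed for path.moved.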

5. Catalog sync status

Add pds_sync on drive and drive_entry (pending | synced | failed) for filesystem convergence. Leave receipt_sync on passport-bearing rows for receipt-only status (ADR 28); do not overload receipt_sync for drive metadata.

Mesh GetBlock continues to consult owner passport.receipt only; a driveEntry with pds_sync = pending must not imply mesh access without a valid receipt.
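
The separation between sync status and mesh access can be sketched as below. PdsSync mirrors the three states named above; the Receipt struct and its grantee list are a hypothetical simplification of passport.receipt, whose real validity checks are defined elsewhere (ADR 27/28).

```rust
/// Filesystem convergence status stored in the `pds_sync` column.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PdsSync {
    Pending,
    Synced,
    Failed,
}

impl PdsSync {
    pub fn as_str(self) -> &'static str {
        match self {
            Self::Pending => "pending",
            Self::Synced => "synced",
            Self::Failed => "failed",
        }
    }

    pub fn parse(s: &str) -> Option<Self> {
        match s {
            "pending" => Some(Self::Pending),
            "synced" => Some(Self::Synced),
            "failed" => Some(Self::Failed),
            _ => None,
        }
    }
}

/// Hypothetical, simplified view of an owner passport.receipt.
#[derive(Debug, Clone)]
pub struct Receipt {
    pub grantee_dids: Vec<String>,
}

/// Mesh reads depend on the receipt only. `_pds_sync` is accepted and
/// deliberately ignored to make explicit that it never grants access.
pub fn mesh_read_allowed(
    receipt: Option<&Receipt>,
    requester_did: &str,
    _pds_sync: PdsSync,
) -> bool {
    receipt.map_or(false, |r| r.grantee_dids.iter().any(|d| d == requester_did))
}
```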

6. Bridge until dual-write ships

Per ADR 29: CAR export includes catalog-snapshot.json when filesystem rows lack a corresponding private PDS record. Remove the sidecar once export verifies full dual-write coverage.

7. Scope boundary: catalog-sync vs passport private refactor

| Concern | Pipeline | This ADR? |
|---|---|---|
| Drive / folder / file names and paths on PDS | CatalogSyncWorker (filesystem.drive, filesystem.driveEntry, filesystem.directory) | Yes |
| Passport receipts, removal acks, mesh ACL | ReceiptSyncWorker (passport.receipt, passport.accessRemovalAck) | No (ADR 28; already shipped) |
| Grantee removal request | Inline ingress → passport.accessRemovalRequest on grantee repo | No (ADR 28) |
| Register passport.* as PDS-private + authenticated mesh reads | Receipt-sync refactor + retrieval (ADR 29) | No (parallel track, not catalog-sync) |

Catalog-sync is only for drive, folder, and file-entry lexicons. Passport private collections are a separate refactor on the existing receipt pipeline; do not add passport handlers to crates/catalog-sync.

8. Gap audit vs shipped ingress (2026-05-24)

| Issue | Shipped behavior | ADR 30 target |
|---|---|---|
| Filesystem not on PDS | create_drive, mkdir, rename, move, delete update Postgres only (ingress/AGENTS.md) | Catalog outbox + pds_sync per §4 |
| Legacy receipt event string | Single owner.receipt.published_requested for upload, share, inherit, drive-root ACL | Split file.uploaded / file.access_changed (+ drive.acl_updated on catalog for drive metadata) |
| GranteeAccessRemovalRequested | Enum + worker branch; never enqueued from ingress | Grantee intent stays inline; remove or repurpose dead variant |
| accessRemovalAck “Phase 2d planned” | Written in ReceiptSyncWorker when removal_request_uri set | Docs updated (ADR 28) |
| Drive root ACL | Receipt outbox only (SubjectType::Drive) | Also drive.acl_updated → filesystem.drive when catalog-sync ships |
| ADR cross-link | | ADR 28 Related must not say catalog-sync owns passport collections |

Consequences

Positive

  • Separation of concerns: ReceiptSyncWorker stays focused on passport mesh authority; CatalogSyncWorker owns filesystem lexicons and private namespace writes.
  • Shared mental model per pipeline: catalog intent → outbox → worker → PDS (same as receipts, different queue).
  • Filesystem paths and drive names become recoverable from PDS/CAR without Postgres.
  • Receipt retry, poison, and per-owner ordering patterns are reused without entangling code paths.

Negative

  • Two outboxes and workers to operate, monitor, and test.
  • More outbox volume on deep uploads (implicit parent paths).
  • UI must surface pds_sync for directories and drives and receipt_sync for file passport state.
  • Upload finalize must enqueue to both outboxes within the single transaction that commits the catalog rows, so neither job can exist without the other.

Neutral

  • Dual-write does not replace ADR 22 swarm copies of DriveMetadata immediately; convergence path is PDS-first for new work.

Implementation phases

| Phase | Deliverable |
|---|---|
| 1 | Publish cloud.substratum.filesystem.drive lexicon; document catalog-sync event payloads |
| 2 | Migration: catalog_sync_outbox + claim_catalog_sync_job(); ingress catalog_sync.rs enqueue with business event strings |
| 3 | CatalogSyncWorker + crates/catalog-sync for filesystem.* dual-write (ADR 29 private registration for those collections can ship in parallel) |
| 4 | pds_sync on drive and drive_entry; file-explorer affordances (receipt UI unchanged) |
| 5 | Drop catalog-snapshot.json from CAR when verifier passes |