ADR 26: TUS Resumable Drive Uploads
Status: Proposed
Date: 2026-05-16
Context
Drive file and folder uploads today use a single multipart/form-data POST to the gateway (POST /api/v1/drives/{drive_id}/upload). The ingress handler buffers each file entirely (field.bytes().await) before running the domain pipeline: store block in the retrieval swarm, write passport receipt, pin, and persist drive_entry rows under RLS (in crates/ingress/src/router/handlers/drives/upload.rs).
The Web File Explorer (ADR 15, ADR 20) uploads via UploadService and FormData. Folder uploads send many files sequentially with a shared upload_session_id and per-file idempotency_key, but:
- There is no resumability after network failure.
- Progress is invisible in the column/list/grid UI until the full batch completes and the tree is refreshed.
- Server-authoritative progress (receive → swarm → pin → persist) is not exposed; upload_session_id is echoed but not stored for polling or events.
- Idempotency lookup is not yet implemented server-side.
We need a standard, resumable upload transport that fits the modular monolith (ADR 01, ADR 02): tus handles HTTP byte transfer; ingress retains ownership of drive semantics, auth, and finalize into swarm + DB.
Decision
1. Adopt the tus.io protocol for drive uploads
Use tus 1.0 as the resumable upload transport. Integrate via fileloft (fileloft-core, fileloft-axum, fileloft-store-fs) in crates/ingress, mounted under the drive-scoped path:
POST /api/v1/drives/{drive_id}/tus
PATCH /api/v1/drives/{drive_id}/tus/{upload_id}
HEAD /api/v1/drives/{drive_id}/tus/{upload_id}
DELETE /api/v1/drives/{drive_id}/tus/{upload_id} # optional termination
tus endpoints are authenticated the same way as other drive mutations (AuthenticatedDid, AuthenticatedJwt). Shared / read-only browse paths do not expose tus creation.
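Resumability rests on one tus 1.0 rule: a PATCH is accepted only when its Upload-Offset equals the offset the server already holds, and a mismatch is answered with 409 Conflict so the client can re-sync via HEAD. The sketch below illustrates that rule in isolation. fileloft is expected to enforce it internally; `accept_patch` and `PatchError` are hypothetical names, not fileloft's API.

```rust
/// Reasons a tus PATCH is rejected in this sketch (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum PatchError {
    /// Client's Upload-Offset differs from the server's; tus 1.0 answers 409 Conflict.
    OffsetMismatch { server_offset: u64 },
    /// The chunk would write past the declared Upload-Length.
    ExceedsLength,
}

/// Validates one PATCH and returns the new offset to report in Upload-Offset.
pub fn accept_patch(
    server_offset: u64,
    upload_length: u64,
    claimed_offset: u64,
    chunk_len: u64,
) -> Result<u64, PatchError> {
    if claimed_offset != server_offset {
        return Err(PatchError::OffsetMismatch { server_offset });
    }
    let new_offset = server_offset
        .checked_add(chunk_len)
        .ok_or(PatchError::ExceedsLength)?;
    if new_offset > upload_length {
        return Err(PatchError::ExceedsLength);
    }
    Ok(new_offset)
}
```

After a dropped connection the client issues HEAD, reads the server's offset, and resumes PATCHing from there; a stale offset yields `OffsetMismatch` rather than corrupting the spool.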
2. Separate tus completion from Substratum finalize
tus guarantees only that bytes are stored in a tus DataStore (a filesystem spool in v1). Substratum must still compute the asset's CID, pin it in the swarm, and write passport + drive rows.
Add a Substratum-specific finalize step (documented in OpenAPI, not part of tus core):
POST /api/v1/drives/{drive_id}/tus/{upload_id}/finalize → UploadEntryDto
Finalize reuses existing ingress helpers (store_block_in_swarm, persist_uploaded_entry, path sanitization, parent directory creation, ACL inheritance). Do not duplicate domain logic in the tus adapter.
The file explorer marks a file done only after finalize succeeds, not when tus reports 100% offset.
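The distinction between "tus reports 100%" and "file is done" can be captured as a small state function. This is a minimal sketch under assumed names (`UploadState`, `FileStatus`, `can_finalize` are hypothetical and not taken from the ingress crate); the real handler would also run auth and the swarm/DB pipeline.

```rust
/// What the explorer should show for one file (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FileStatus {
    /// tus offset has not reached Upload-Length yet.
    Uploading,
    /// All bytes are spooled, but finalize has not succeeded.
    AwaitingFinalize,
    /// Finalize returned an entry; only now is the file done.
    Done,
}

/// Minimal view of one upload: tus byte state plus the finalize outcome.
pub struct UploadState {
    pub offset: u64,
    pub length: u64,
    pub finalized: bool,
}

/// Maps upload state to the status shown in the UI.
pub fn file_status(state: &UploadState) -> FileStatus {
    if state.finalized {
        FileStatus::Done
    } else if state.offset == state.length {
        FileStatus::AwaitingFinalize
    } else {
        FileStatus::Uploading
    }
}

/// Finalize is only meaningful once tus holds every byte.
pub fn can_finalize(state: &UploadState) -> bool {
    state.offset == state.length && !state.finalized
}
```

A file at full offset therefore stays in `AwaitingFinalize` until the finalize call returns, which is the behaviour the explorer relies on.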
3. Drive context via tus Upload-Metadata
Encode drive placement in tus Upload-Metadata (base64 key/value), validated on creation:
| Key | Purpose |
|---|---|
| path | Parent folder path inside the drive (empty = drive root) |
| relative_path | Full relative path for folder uploads (e.g. Assets/photo.png) |
| filename | Leaf name when needed |
| batch_id | Correlates folder-upload files (replaces ad-hoc upload_session_id in FormData) |
| idempotency_key | {batch_id}:{relative_path} or per-file key |
| content_type | Optional MIME hint |
Paths must pass the same sanitization rules as today's multipart upload (sanitize_rel_path, sanitize_segment).
Enforce Upload-Length ≤ UPLOAD_MAX_FILE_BYTES (global ceiling 1 TiB per ADR 32; account plans may be lower on SaaS).
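As an illustration of the creation-time checks, the sketch below parses a tus Upload-Metadata header (comma-separated pairs, each a key followed by a space and a base64 value; a key may appear without a value), derives the {batch_id}:{relative_path} idempotency key, and applies the length ceiling. All function names are hypothetical, the base64 decoder is a lenient hand-rolled one used only to keep the example dependency-free, and the 1 TiB constant simply mirrors the ceiling stated above. Path sanitization (sanitize_rel_path, sanitize_segment) is not reproduced here.

```rust
use std::collections::HashMap;

/// Global ceiling from the text above: 1 TiB.
pub const UPLOAD_MAX_FILE_BYTES: u64 = 1 << 40;

/// Lenient standard-alphabet base64 decoder; returns None on invalid characters.
fn base64_decode(input: &str) -> Option<Vec<u8>> {
    let mut out = Vec::new();
    let mut buf: u32 = 0;
    let mut bits: u32 = 0;
    for c in input.trim_end_matches('=').bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'+' => 62,
            b'/' => 63,
            _ => return None,
        };
        buf = (buf << 6) | u32::from(v);
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1 << bits) - 1; // keep only the bits not yet emitted
        }
    }
    Some(out)
}

/// Parses `key b64value,key2 b64value2,key3` into decoded UTF-8 strings.
pub fn parse_upload_metadata(header: &str) -> Option<HashMap<String, String>> {
    let mut map = HashMap::new();
    for pair in header.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        let mut parts = pair.splitn(2, ' ');
        let key = parts.next()?;
        let value = match parts.next() {
            Some(b64) => String::from_utf8(base64_decode(b64)?).ok()?,
            None => String::new(), // key without value, e.g. path = drive root
        };
        map.insert(key.to_string(), value);
    }
    Some(map)
}

/// Builds the per-file key in the `{batch_id}:{relative_path}` form.
pub fn idempotency_key(batch_id: &str, relative_path: &str) -> String {
    format!("{batch_id}:{relative_path}")
}

/// Rejects uploads whose declared Upload-Length exceeds the ceiling.
pub fn within_ceiling(upload_length: u64) -> bool {
    upload_length <= UPLOAD_MAX_FILE_BYTES
}
```

For example, the header `filename YS50eHQ=,batch_id YjE=,path` decodes to filename `a.txt`, batch_id `b1`, and an empty path (drive root). A production handler would more likely use an established base64 crate and a stricter decoder.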
4. Server-reported progress beyond tus offset
tus supplies byte offset progress during PATCH. Substratum publishes phase progress for finalize and folder batches:
| Phase | Meaning |
|---|---|
| uploading | tus offset < Upload-Length |
| storing_block | PutBlock to swarm |
| pinning | Swarm pin |
| persisting | DB + passport |
| done | Finalize returned UploadEntryDto |
| error | Terminal failure |
Expose progress via:
GET /api/v1/drives/{drive_id}/upload/batches/{batch_id} # snapshot
GET /api/v1/drives/{drive_id}/upload/batches/{batch_id}/events # SSE
Reuse the SSE pattern already used for system events (ADR 18 alignment). v1 progress store: in-memory Arc<RwLock<...>> on AppState; document migration to DB/Redis for multi-instance gateways as follow-up.
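One possible shape for the v1 in-memory progress store is sketched below, keyed by batch_id and then by relative path. The type and method names (`Phase`, `ProgressStore`, `set_phase`, `snapshot`) are assumptions for illustration; the actual AppState field and DTOs may differ, and SSE fan-out is omitted.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Phases from the table above, with their wire names.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Phase {
    Uploading,
    StoringBlock,
    Pinning,
    Persisting,
    Done,
    Error,
}

impl Phase {
    pub fn as_str(self) -> &'static str {
        match self {
            Phase::Uploading => "uploading",
            Phase::StoringBlock => "storing_block",
            Phase::Pinning => "pinning",
            Phase::Persisting => "persisting",
            Phase::Done => "done",
            Phase::Error => "error",
        }
    }

    /// `done` and `error` end a file's progress stream.
    pub fn is_terminal(self) -> bool {
        matches!(self, Phase::Done | Phase::Error)
    }
}

/// batch_id -> (relative_path -> phase); cheap to clone and share across handlers.
#[derive(Default, Clone)]
pub struct ProgressStore {
    inner: Arc<RwLock<HashMap<String, HashMap<String, Phase>>>>,
}

impl ProgressStore {
    /// Records the latest phase for one file in a batch.
    pub fn set_phase(&self, batch_id: &str, relative_path: &str, phase: Phase) {
        let mut guard = self.inner.write().expect("progress lock poisoned");
        guard
            .entry(batch_id.to_string())
            .or_default()
            .insert(relative_path.to_string(), phase);
    }

    /// Snapshot for the batch endpoint, sorted by path for stable output.
    pub fn snapshot(&self, batch_id: &str) -> Option<Vec<(String, Phase)>> {
        let guard = self.inner.read().expect("progress lock poisoned");
        let files = guard.get(batch_id)?;
        let mut rows: Vec<_> = files.iter().map(|(k, v)| (k.clone(), *v)).collect();
        rows.sort_by(|a, b| a.0.cmp(&b.0));
        Some(rows)
    }
}
```

Because the map lives in process memory, a restart or a second gateway instance loses it, which is why the DB/Redis migration is noted as follow-up.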
5. File explorer integration
- Add tus-js-client and a TusUploadService implementing the existing upload contract (ADR 20).
- Flow per file: create tus upload → PATCH chunks → finalize → optional SSE for phases.
- Folder upload: one batch_id, N tus resources, bounded concurrency (e.g. 2–3).
- UI: optimistic rows in Miller columns (pending/grey, progress ring, highlight on done) driven by batch SSE + tus onProgress.
OpenAPI-generated client covers finalize/batch DTOs only; tus URLs are stable constants aligned with ingress.
6. Keep multipart upload during migration
Retain POST .../upload until tus + finalize + explorer UI are stable. Deprecate in OpenAPI after Phase 5 (see roadmap). No breaking removal without a release note.
7. Ingress module layout
Per crates/ingress/AGENTS.md, extend upload handling as a concern directory:
crates/ingress/src/router/handlers/drives/upload/
mod.rs # shared finalize, re-exports
multipart.rs # existing handle_drive_upload (moved)
tus.rs # fileloft mount + metadata auth
tus_finalize.rs
batch_events.rs # SSE + snapshot
8. Operational constraints (v1)
- Temp storage: filesystem spool under a configurable gateway temp root; TTL cleanup for incomplete uploads (e.g. 24h).
- Chunk size: default 5 MiB per PATCH (tunable); edge/proxy body limits must allow chunk size, not only full file size.
- CORS: expose tus headers (Upload-Offset, Upload-Length, Location, Tus-Resumable, Tus-Version).
- Non-goals (v1): tus concatenation; multi-pod shared spool without shared storage.
Mesh block chunking may still use smaller SWARM_MAX_BLOCK_BYTES; that is separate from the ingress upload ceiling.
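To make the chunk-size and TTL constraints concrete, the sketch below plans PATCH ranges from a resume offset and decides whether an incomplete upload has outlived its TTL. The helper names are hypothetical; the 5 MiB default is the value stated above, and the 24h TTL used in the usage note is only the example figure from the text. In practice the client library (tus-js-client) does the chunking and the spool cleanup would run as a gateway task.

```rust
use std::time::Duration;

/// Default PATCH body size from the constraints above (tunable).
pub const DEFAULT_CHUNK_BYTES: u64 = 5 * 1024 * 1024;

/// Returns (offset, length) pairs covering `resume_offset..upload_length`.
pub fn plan_chunks(resume_offset: u64, upload_length: u64, chunk_bytes: u64) -> Vec<(u64, u64)> {
    assert!(chunk_bytes > 0, "chunk size must be positive");
    let mut chunks = Vec::new();
    let mut offset = resume_offset;
    while offset < upload_length {
        // The final chunk may be shorter than chunk_bytes.
        let len = chunk_bytes.min(upload_length - offset);
        chunks.push((offset, len));
        offset += len;
    }
    chunks
}

/// An incomplete upload is eligible for cleanup once idle for at least `ttl`.
pub fn is_expired(idle: Duration, ttl: Duration) -> bool {
    idle >= ttl
}
```

A 12 MiB file uploads as three PATCH requests of 5, 5, and 2 MiB; after a failure at 10 MiB only the last one is replayed. With a 24h TTL, `is_expired(Duration::from_secs(25 * 3600), Duration::from_secs(24 * 3600))` is true. Any edge or proxy body limit needs to be at least the chosen chunk size.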
Consequences
Positive
- Resumable uploads and standard client libraries (tus-js-client).
- Honest server-side progress through persist/pin, not only browser xhr progress.
- Folder uploads gain batch correlation and reconnect via batch_id + SSE.
- Domain finalize stays in one ingress pipeline; tus remains a replaceable adapter.
Negative
- Two-step client flow (tus + finalize) increases integration complexity vs one POST.
- tus routes are not fully described by OpenAPI; finalize/batch endpoints are the codegen surface.
- Filesystem spool adds disk and cleanup obligations on gateway nodes.
- Single-gateway in-memory batch state does not survive restart or horizontal scale until Phase 5 hardening.
Deferred Decisions and Constraints
| # | Question | Default |
|---|---|---|
| 1 | tus implementation: fileloft vs hand-rolled | fileloft |
| 2 | Temp store: local FS vs S3 | Local FS on gateway (not S3) unless staging policy changes |
| 3 | Default chunk size | 5 MiB |
| 4 | Multipart coexistence | Yes, until the UI fully migrates |
Related
- ADR 11: Cross-Boundary Strategies — OpenAPI for finalize/batch; tus documented here.
- ADR 15: Web File Explorer — upload UX and Miller columns.
- ADR 16: Drive-Centric Domain Vocabulary — drive-scoped paths.
- ADR 20: File Explorer Service Layer — UploadService / tus adapter.
- ADR 22: Drive Lexicon and Persistence — drive_entry persistence after finalize.
- Passport Receipt Schema — receipts written at finalize.
- Operational: apps/file-explorer/AGENTS.md, crates/ingress/AGENTS.md