| title | tags | created | status | owner |
|---|---|---|---|---|
| PRD - Decentralized Custody and Smart-Contract Escrow Roadmap | | 2026-05-28 | draft-for-review | payments + security + operations |
PRD - Decentralized Custody and Smart-Contract Escrow Roadmap
Executive decision
Do not move the whole Amanat escrow flow into a custom smart contract as the next step.
The current architecture already uses on-chain settlement through Request Network, in-house wallet checkout, derived destination wallets, Transaction Safety Provider checks, and an internal funds ledger. The bigger risk is not "lack of blockchain" but centralized custody and administration. A single backend/admin path can still become too powerful unless hot keys, release/refund confirmation, sweep authority, and dispute decisions are split across independent signers and delayed controls.
Recommended path:
- Harden the current hybrid escrow.
- Move custody to multisig/hardware signers.
- Add timelocked governance for global settings.
- Pilot a minimal on-chain escrow only for opt-in/high-value flows after the above is stable.
Current baseline
| Area | Current state | Risk |
|---|---|---|
| Pay-in | Request Network in-house checkout, direct wallet txs, signed webhooks, transaction-safety checks | Still depends on webhook durability and reconciliation |
| Funds tracking | FundsLedgerEntry exists and is append-only | Release/refund ledger enforcement is env-gated by PAYMENT_LEDGER_ENFORCEMENT |
| Deposit routing | Derived destinations per (buyer, sellerOffer, chainId) | Live divergent-destination probe still needs production-grade evidence |
| Sweep custody | Build-only / hot-key signer abstraction | Production must avoid hot-key signing |
| Release/refund | Admin endpoints build and confirm instructions | Needs mandatory multisig/hardware proof and stronger dispute gating |
| Disputes | Model/service/routes exist with legacy status enum | Needs alignment with canonical dispute/escrow state machine |
| Admin control | App RBAC plus optional Trezor proof | No non-centralized admin quorum yet |
Goals
- Remove single-admin and backend-hot-key custody risk.
- Make release/refund/sweep authority require independent signers.
- Preserve the marketplace's human dispute workflow.
- Keep buyer UX close to the current Request Network in-house checkout.
- Add on-chain escrow only where it creates trust benefit larger than audit and UX cost.
Non-goals
- No token-voting DAO for individual buyer/seller disputes.
- No full rewrite of payments into Solidity before ledger, signing, and webhook controls are stable.
- No custom bridge, cross-chain settlement contract, or generalized DeFi protocol.
- No removal of the internal funds ledger; the ledger remains the application accounting source even if custody moves on-chain.
Target trust model
flowchart LR
Buyer["Buyer wallet"] --> PayIn["Request Network / in-house checkout"]
PayIn --> Dest["Per-payment derived destination"]
Dest --> Safety["Transaction Safety Provider\nconfirmations + token/recipient/amount + AML"]
Safety --> Ledger["Internal append-only funds ledger"]
Ledger --> Policy["Release/refund policy engine"]
Policy --> Safe["Safe multisig custody"]
Safe --> Seller["Seller wallet"]
Safe --> BuyerRefund["Buyer refund wallet"]
Admin["Admin UI"] --> Policy
Arb["Arbitrator quorum"] --> Safe
Guardian["Guardian"] --> Pause["Pause / cancel dangerous ops"]
Timelock["Timelock / AccessManager"] --> Policy
Phase 0 - Stabilize The Hybrid Escrow
Timebox: 1-2 weeks
Purpose: Make the current system safe enough that decentralizing custody does not hide application bugs under contract ceremony.
| Work | Owner | Exit criteria |
|---|---|---|
| Turn on PAYMENT_LEDGER_ENFORCEMENT=true in dev, then staging | Backend | Release/refund cannot exceed ledger available balance |
| Backfill/verify ledger entries for active Request Network payments | Backend | Reconciliation report has no unexplained funded payments without ledger rows |
| Enforce dispute hold in every release/refund path | Backend | Opening a dispute blocks release and refund until explicit resolution/override |
| Require Trezor proof in staging with TREZOR_SAFEKEEPING_REQUIRED=true | Backend + frontend | Release/refund without proof is rejected; valid Trezor proof succeeds |
| Add audit entries for release/refund/sweep instruction build and confirm | Backend | Each operation records actor, before/after state, tx hash, signer, and reason |
| Complete the Request Network (RN) durable webhook ingress design | Platform | Worker storage/replay design approved; backend remains the trust oracle |
Decision gate: no Safe migration until release/refund is ledger-gated and dispute-gated in staging.
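The two gates named in this decision gate can be sketched as a single authorization step. This is a minimal in-memory model, not the existing backend code: `FundsLedger`, `LedgerEntry`, `authorize_payout`, and the `enforce_ledger` flag (standing in for PAYMENT_LEDGER_ENFORCEMENT) are hypothetical names, and amounts are assumed to be integer token base units.

```python
from dataclasses import dataclass, field


class ReleaseBlocked(Exception):
    """Raised when a release or refund fails the dispute or ledger gate."""


@dataclass(frozen=True)
class LedgerEntry:
    payment_id: str
    kind: str    # "funded", "released", or "refunded"
    amount: int  # token base units; integers avoid float rounding


@dataclass
class FundsLedger:
    entries: list = field(default_factory=list)

    def append(self, entry: LedgerEntry) -> None:
        # Append-only: existing entries are never edited or deleted.
        self.entries.append(entry)

    def available(self, payment_id: str) -> int:
        funded = sum(e.amount for e in self.entries
                     if e.payment_id == payment_id and e.kind == "funded")
        paid_out = sum(e.amount for e in self.entries
                       if e.payment_id == payment_id
                       and e.kind in ("released", "refunded"))
        return funded - paid_out


def authorize_payout(ledger, payment_id, kind, amount, open_disputes,
                     enforce_ledger=True):
    """Append a payout entry only if the dispute and ledger gates pass."""
    if kind not in ("released", "refunded"):
        raise ValueError("kind must be 'released' or 'refunded'")
    if amount <= 0:
        raise ValueError("amount must be positive")
    # Dispute gate: an open dispute blocks both release and refund.
    if payment_id in open_disputes:
        raise ReleaseBlocked("dispute hold is active")
    # Ledger gate: a payout cannot exceed the ledger available balance.
    if enforce_ledger and amount > ledger.available(payment_id):
        raise ReleaseBlocked("amount exceeds ledger available balance")
    entry = LedgerEntry(payment_id, kind, amount)
    ledger.append(entry)
    return entry
```

In this sketch the dispute gate has no off switch, while the ledger gate follows the flag, so a staging run with the flag enabled exercises both checks on every release and refund path.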
Phase 1 - Move Custody To Safe Multisig
Timebox: 2-4 weeks
Purpose: Remove single-key custody without changing core escrow semantics.
| Work | Owner | Exit criteria |
|---|---|---|
| Create Safe accounts per supported chain | Ops + security | 2-of-3 minimum for dev/staging, 3-of-5 preferred for production |
| Register hardware-backed owners | Ops | At least two owners use Trezor or equivalent hardware wallets |
| Route release/refund/sweep builds to Safe transaction proposals | Backend + frontend | Admin UI builds a Safe transaction instead of direct hot-key tx |
| Confirm Safe execution before ledger release/refund append | Backend | Ledger terminal entry requires verified Safe tx hash |
| Remove production hot-key sweep mode | Ops | DERIVED_DESTINATION_SWEEP_SIGNER=build-only in production |
| Add break-glass policy | Security | Time-limited, alarmed, documented; cannot silently bypass quorum |
Administration model: admins can propose, but custody owners execute. A compromised app admin cannot move funds alone.
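The propose/execute split can be illustrated with a small model. It does not call the Safe contracts or any Safe SDK; `CustodyProposal` and its fields are hypothetical names that only mirror the rule that an app admin proposes while a threshold of custody owners must confirm before execution.

```python
from dataclasses import dataclass, field


@dataclass
class CustodyProposal:
    """Models propose/confirm/execute separation; not the Safe API itself."""
    action: str          # e.g. "release", "refund", or "sweep"
    proposer: str        # app admin; proposing grants no signing power
    owners: frozenset    # custody owners (hardware-backed signers)
    threshold: int       # confirmations required, e.g. 2 of 3
    confirmations: set = field(default_factory=set)
    executed: bool = False

    def __post_init__(self):
        if not 1 <= self.threshold <= len(self.owners):
            raise ValueError("threshold must be between 1 and the owner count")

    def confirm(self, signer: str) -> None:
        if signer not in self.owners:
            raise PermissionError(f"{signer} is not a custody owner")
        # A set makes repeated confirmations by one owner count once.
        self.confirmations.add(signer)

    def execute(self) -> None:
        if self.executed:
            raise RuntimeError("proposal already executed")
        if len(self.confirmations) < self.threshold:
            raise PermissionError("quorum not reached")
        self.executed = True
```

With a 2-of-3 owner set, a proposal from the admin stays inert until two distinct owners confirm it, which is the property the administration model relies on.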
Phase 2 - Durable Payment Evidence And Quarantine
Timebox: parallel with Phase 1
Purpose: Make payment evidence durable and make tainted-funds isolation real.
| Work | Owner | Exit criteria |
|---|---|---|
| Cloudflare Worker receives RN webhooks first | Platform | Raw body, headers, delivery ID, payment reference, timestamp durably stored |
| Replay endpoint/tool for stored webhook deliveries | Backend + ops | Operator can replay by delivery ID/time window/payment reference |
| AML provider behind Transaction Safety Provider | Backend + compliance | A clean verdict allows funding; a sanctions or mixer verdict blocks or quarantines the payment |
| Derived destination quarantine workflow | Backend + admin UI | Failed AML or transfer mismatch prevents sweep into treasury/Safe |
| Live divergent-destination probe | Payments | Two real paid intents to two derived addresses both complete and reconcile |
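The quarantine rows in the table above reduce to one decision per derived destination. The sketch below is an assumption about how that rule might be expressed: `Verdict`, `sweep_decision`, and the dictionary keys are hypothetical, and the real AML provider's verdict vocabulary may differ.

```python
from enum import Enum


class Verdict(Enum):
    CLEAN = "clean"
    SANCTIONS = "sanctions"
    MIXER = "mixer"


def sweep_decision(verdict, expected, observed):
    """Return "sweep" or "quarantine" for one derived destination.

    expected and observed are dicts with "token", "recipient", and
    "amount" keys (hypothetical field names for this sketch).
    """
    # Any non-clean AML verdict keeps funds out of the treasury/Safe.
    if verdict is not Verdict.CLEAN:
        return "quarantine"
    # A transfer mismatch on any checked field also blocks the sweep.
    for key in ("token", "recipient", "amount"):
        if expected.get(key) != observed.get(key):
            return "quarantine"
    return "sweep"
```

Because each payment has its own derived destination, a "quarantine" result isolates that one address without touching funds from other payments.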
Decision gate: no custom escrow contract until webhook replay and per-payment quarantine work operationally.
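Webhook replay, the first half of this gate, can be sketched as a store keyed by delivery ID plus a filterable replay helper. The in-memory dictionary stands in for whatever durable storage the Worker design ends up using, and `StoredDelivery`, `DeliveryStore`, and `replay` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredDelivery:
    delivery_id: str
    payment_reference: str
    received_at: float  # Unix timestamp
    headers: tuple      # raw header pairs, kept for signature re-verification
    raw_body: bytes     # stored unparsed so the backend can re-verify it


class DeliveryStore:
    """In-memory stand-in for durable webhook storage."""

    def __init__(self):
        self._by_id = {}

    def store(self, delivery):
        """Store a delivery; return False if its ID was already stored."""
        if delivery.delivery_id in self._by_id:
            return False
        self._by_id[delivery.delivery_id] = delivery
        return True

    def select(self, delivery_id=None, payment_reference=None,
               start=None, end=None):
        """Filter by delivery ID, payment reference, and/or time window."""
        matches = []
        for d in self._by_id.values():
            if delivery_id is not None and d.delivery_id != delivery_id:
                continue
            if (payment_reference is not None
                    and d.payment_reference != payment_reference):
                continue
            if start is not None and d.received_at < start:
                continue
            if end is not None and d.received_at > end:
                continue
            matches.append(d)
        return sorted(matches, key=lambda d: d.received_at)


def replay(store, handler, **filters):
    """Feed matching deliveries to handler in arrival order; return the count."""
    count = 0
    for delivery in store.select(**filters):
        handler(delivery)
        count += 1
    return count
```

The handler is assumed to be idempotent, since a replay deliberately re-sends deliveries the backend may already have processed.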
Phase 3 - Decentralize Administrative Control
Timebox: 4-6 weeks
Purpose: Split operational permissions without making daily support impossible.
| Control | Recommended design |
|---|---|
| Custody movement | Safe multisig threshold |
| Global escrow settings | Timelock or OpenZeppelin AccessManager-managed roles |
| Contract upgrades, if any | Timelocked multisig, no instant admin upgrade |
| Emergency pause | Guardian role can pause, not withdraw |
| Dispute financial outcome | App records decision; Safe quorum executes release/refund/split |
| Confirmation thresholds | App admin can propose; high-risk decreases require timelock or second approval |
| Break-glass | Time-limited, high-severity alert, postmortem required |
Use token/DAO voting only for protocol-level parameters, and only if the platform later has a real governance community. Do not use broad token voting for individual disputes: it leaks private commercial facts, is slow, and is open to manipulation.
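The confirmation-threshold control can be modelled off-chain as a delayed settings change. This sketch is not OpenZeppelin's AccessManager or a timelock contract; `TimelockedSettings` and `PendingChange` are hypothetical, and it assumes every decrease counts as high-risk while increases apply without delay.

```python
from dataclasses import dataclass


@dataclass
class PendingChange:
    setting: str
    new_value: int
    ready_at: float  # earliest Unix time at which the change may execute
    cancelled: bool = False


class TimelockedSettings:
    def __init__(self, values, delay_seconds):
        self.values = dict(values)
        self.delay_seconds = delay_seconds
        self.pending = []

    def propose(self, setting, new_value, now):
        # Assumption: lowering a value is the high-risk direction, so only
        # decreases wait out the delay.
        is_decrease = new_value < self.values[setting]
        delay = self.delay_seconds if is_decrease else 0
        change = PendingChange(setting, new_value, now + delay)
        self.pending.append(change)
        return change

    def cancel(self, change):
        # Guardian action: stops a pending change but cannot set values.
        change.cancelled = True

    def execute(self, change, now):
        if change.cancelled:
            raise PermissionError("change was cancelled")
        if now < change.ready_at:
            raise PermissionError("timelock has not elapsed")
        self.values[change.setting] = change.new_value
        self.pending.remove(change)
```

Passing `now` explicitly keeps the model deterministic in tests; a real deployment would take the time from the chain or a trusted clock.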
Phase 4 - Minimal Smart-Contract Escrow Pilot
Timebox: 6-10 weeks after Phases 0-3
Purpose: Test whether on-chain escrow improves trust enough for a specific cohort.
Pilot only when one of these is true:
- Average escrow value is high enough that users ask for contract custody.
- Sellers/buyers explicitly demand on-chain proof that funds cannot be swept.
- The platform wants a premium "contract escrow" mode.
- Regulated partners require provable segregation of funds.
Minimal contract shape:
| Function | Notes |
|---|---|
| fund(orderId, token, amount, buyer, seller, deadline) | Buyer funds ERC-20 escrow; order ID is app-generated and hashed |
| confirmDelivery(orderId) | Buyer can release without admin |
| openDispute(orderId) | Either party can pause auto-release before deadline |
| resolve(orderId, releaseAmount, refundAmount, reasonHash) | Arbitrator/multisig executes split |
| claimAfterTimeout(orderId) | Seller can claim after timeout if no dispute |
| refundAfterExpiry(orderId) | Buyer can recover if seller never starts/accepts |
| pause() / unpause() | Guardian pause only; no fund extraction |
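Before any Solidity is written, the core transitions in the table can be prototyped as an off-chain reference model that later state-transition tests compare against. The sketch below is such a model under stated assumptions: `EscrowOrder`, `State`, and `EscrowError` are hypothetical names, `resolve` is assumed to require that the split sums to the funded amount, and `refundAfterExpiry` and `pause()` are left out because the table does not define the acceptance step or pause semantics in enough detail.

```python
from enum import Enum


class State(Enum):
    FUNDED = "funded"
    DISPUTED = "disputed"
    RELEASED = "released"
    RESOLVED = "resolved"


class EscrowError(Exception):
    """Raised when a call is not allowed for the caller or current state."""


class EscrowOrder:
    def __init__(self, amount, buyer, seller, arbitrator, deadline):
        self.amount = amount
        self.buyer = buyer
        self.seller = seller
        self.arbitrator = arbitrator
        self.deadline = deadline
        self.state = State.FUNDED
        self.payouts = {}  # recipient -> amount, set once on a terminal state

    def _require(self, condition, message):
        if not condition:
            raise EscrowError(message)

    def confirm_delivery(self, caller):
        self._require(self.state is State.FUNDED, "order is not releasable")
        self._require(caller == self.buyer, "only the buyer can confirm")
        self.payouts = {self.seller: self.amount}
        self.state = State.RELEASED

    def open_dispute(self, caller, now):
        self._require(self.state is State.FUNDED, "order is not disputable")
        self._require(caller in (self.buyer, self.seller), "not a party")
        self._require(now < self.deadline, "deadline has passed")
        self.state = State.DISPUTED

    def resolve(self, caller, release_amount, refund_amount):
        self._require(self.state is State.DISPUTED, "no open dispute")
        self._require(caller == self.arbitrator, "only the arbitrator")
        self._require(release_amount >= 0 and refund_amount >= 0,
                      "amounts must be non-negative")
        # Assumption: the split must account for every funded unit.
        self._require(release_amount + refund_amount == self.amount,
                      "split must equal the funded amount")
        self.payouts = {self.seller: release_amount, self.buyer: refund_amount}
        self.state = State.RESOLVED

    def claim_after_timeout(self, caller, now):
        self._require(self.state is State.FUNDED, "order is not claimable")
        self._require(caller == self.seller, "only the seller can claim")
        self._require(now >= self.deadline, "deadline not reached")
        self.payouts = {self.seller: self.amount}
        self.state = State.RELEASED
```

Every payout path checks the state first and moves to a terminal state, which is the property the double-release tests listed under the security requirements would target.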
Security requirements:
- Use audited libraries for SafeERC20, reentrancy protection, pausing, and access control.
- Avoid upgradeability unless the timelock and upgrade policy are production-ready.
- External audit before mainnet funds.
- Formal state-transition tests mirroring Funds Ledger and Escrow State Machine Specification.
- Fuzz tests for double release, split math, fee rounding, pausing, and timeout races.
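The split-math and fee-rounding fuzz requirement can be prototyped off-chain before the contract exists. The sketch assumes a platform fee in basis points charged on the released portion and rounded down; the document does not specify the fee model, so `split_with_fee`, `fuzz_split`, and the rounding direction are all illustrative.

```python
import random

BPS = 10_000  # basis points in 100%


def split_with_fee(amount, release_amount, fee_bps):
    """Split escrow into (seller, fee, buyer_refund) in integer base units."""
    if not 0 <= release_amount <= amount:
        raise ValueError("release_amount must be within the funded amount")
    if not 0 <= fee_bps <= BPS:
        raise ValueError("fee_bps must be between 0 and 10000")
    # Floor division: any rounding dust stays with the seller (assumption).
    fee = release_amount * fee_bps // BPS
    seller = release_amount - fee
    refund = amount - release_amount
    return seller, fee, refund


def fuzz_split(iterations=10_000, seed=0):
    """Check conservation and non-negativity on random inputs."""
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    for _ in range(iterations):
        amount = rng.randint(0, 10**24)
        release_amount = rng.randint(0, amount)
        fee_bps = rng.randint(0, BPS)
        seller, fee, refund = split_with_fee(amount, release_amount, fee_bps)
        # No unit may be created or lost by the split.
        assert seller + fee + refund == amount
        assert min(seller, fee, refund) >= 0
    return iterations
```

The same two invariants, conservation and non-negativity, are the ones a contract-level fuzzer would assert against the on-chain `resolve` path.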
Phase 5 - Go / No-Go Criteria
Proceed from hybrid multisig custody to custom escrow only if all are true:
- Safe/Trezor flow has processed real releases/refunds without operational pain.
- Ledger enforcement has run for at least one complete payment cycle.
- Dispute hold cannot be bypassed in tests or manual review.
- Durable webhook ingress and replay are live.
- Per-payment destination quarantine is live.
- Contract audit budget and maintenance owner are approved.
- The user trust/compliance benefit is explicitly documented.
If these are not true, continue improving the hybrid model. A contract that encodes immature off-chain policy will make the system harder to fix, not safer.
Stale documentation corrected in this pass
This roadmap was created together with a focused documentation alignment pass:
- System Overview now reflects Request Network as the primary payment rail, derived destinations, ledger, and existing dispute service.
- System Architecture now shows Request Network webhooks and durable ingress as the target, instead of SHKeeper-only webhook flow.
- Backend Architecture no longer lists removed SHKeeper service folders/routes as the current module map.
- Escrow Flow now reflects hybrid custody, ledger gates, derived destinations, and the recommended multisig-before-contract direction.
- Dispute Flow no longer says the backend dispute service/model/routes do not exist.
- Request Network Integration Constraints now marks the in-house checkout and derived-destination work as implemented/partially implemented rather than only designed.
External references
- Safe threshold custody model: https://docs.safe.global/advanced/smart-account-concepts
- OpenZeppelin AccessManager and timelock guidance: https://docs.openzeppelin.com/contracts/5.x/access-control
- Request Network payment contracts overview: https://docs.request.network/advanced/protocol-overview/contracts