docs: ship in-house RN checkout, scope 5 follow-up tasks (#7-11)

In-house Request Network checkout went fully end-to-end on dev today.
A real 0.01 USDC payment flowed through wallet connect -> approve ->
ERC20FeeProxy.transferFromWithReferenceAndFee -> RN webhook ->
TransactionSafetyProvider -> Payment.status=completed -> page success
state. Tx 0x494c77a29161b5100d8e0b1ac675f1822955d0bb3633ecdbfafb886f84f2f320.

Docs:
- New PRD: Wallet, Multichain, Confirmations, AML, Trezor
  (5 follow-ups, each sized for an independent contributor)
- Updated PRD: Request Network In-House Checkout (phases 0..3 done,
  phase 4 partial, phases 5-6 not started)
- Updated handoff: deployed versions, what is working end-to-end,
  follow-up tasks index

Taskmaster: 5 new top-level tasks (#7..#11) covering ephemeral
destination wallets, multichain proxy registry + USDC/USDT, runtime
confirmation thresholds, optional seller-paid AML screening, and
Trezor signing for admin actions. Tasks are scoped fine-grained so
each is independent enough for kimi to pick up.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Siavash Sameni
2026-05-28 15:50:24 +04:00
parent 37f946fc23
commit 0060b16912
69 changed files with 1513 additions and 147 deletions


@@ -0,0 +1,212 @@
---
title: Handoff - Request Network Confirmation Repair - 2026-05-28
tags: [handoff, operations, payments, request-network, webhook]
created: 2026-05-28
---
# Handoff - Request Network Confirmation Repair - 2026-05-28
## Scope
This handoff covers the Request Network dev payment probe where the buyer callback stayed stuck on "processing payment", plus the local confirmation repair work and the documentation/roadmap updates that followed.
Primary user-reported issue:
- A real BSC Request Network test payment completed on-chain, but Amanat never showed confirmation on `https://dev.amn.gg/payment/callback/?paymentId=6a17e08f1485c1de0ff3cd15`.
## Current Answer
Do **not** run another paid payment test against dev until the local `2.6.26` changes are deployed and the webhook smoke test passes against the deployed stack.
After deploy:
- The original webhook `404` correlation bug should be fixed.
- If Request Network includes a transaction hash and the safety checks pass, the payment should complete.
- If Request Network omits the transaction hash, the webhook should be captured but the payment will remain `transactionSafety.pending` instead of being falsely credited.
## Repositories Touched
Backend:
- `backend/src/services/payment/requestNetwork/requestNetworkRoutes.ts`
- `backend/src/services/payment/requestNetwork/signature.ts`
- `backend/src/services/payment/adapters/requestNetworkAdapter.ts`
- `backend/src/services/payment/reconciliation/requestNetworkReconciliationService.ts`
- `backend/src/services/payment/decentralizedPaymentService.ts`
- `backend/src/services/payment/safety/transactionSafetyProvider.ts`
- `backend/src/models/Payment.ts`
- `backend/scripts/smoke/rn-webhook.sh`
- Request Network webhook/reconciliation tests
- `backend/.env.example`
- `backend/package.json`, `backend/package-lock.json`
Frontend:
- `frontend/src/app/payment/callback/page.tsx`
- Frontend version/env files and Dockerfile
Deployment:
- `deployment/docker-compose.yml`
Docs / Taskmaster:
- `nick-doc/PRD - Request Network In-House Checkout.md`
- `nick-doc/.taskmaster/docs/prd-request-network-in-house-checkout.md`
- `nick-doc/01 - Architecture/Request Network Integration Constraints.md`
- `nick-doc/PRD - Request Network Migration and Funds Management.md`
- `nick-doc/07 - Development/Environment Variables.md`
- `nick-doc/02 - Data Models/Payment.md`
- `nick-doc/08 - Operations/Incident Response.md`
- `nick-doc/08 - Operations/Monitoring.md`
- `nick-doc/README.md`
- Taskmaster subtask `3.13` for durable RN webhook ingress and transaction safety
- Taskmaster subtask `6.1` for deploying confirmation repair before the next paid probe
## Evidence From Dev
Test transaction:
```text
0x3a23febd9abd43d7e0851c1ea86c4ceaf08c11098852cb0425fa074e9c88350b
```
Payment document:
```text
paymentId: 6a17e08f1485c1de0ff3cd15
providerPaymentId: rq-af2d092e18cb41bb39ce4b0c
metadata.requestNetworkRequestId: 011ae38f7b99ef135514b987c9629b520b08e7a740f60d92d682f2f06466993a3f
metadata.requestNetworkPaymentReference: rq-af2d092e18cb41bb39ce4b0c
status before repair: pending
```
Nginx/backend evidence:
```text
POST /api/payment/request-network/webhook -> 404
source IP: 34.34.233.192
observed deliveries: four retries on 2026-05-28
```
Conclusion:
- Request Network did call Amanat.
- The payment succeeded on-chain.
- Amanat failed local confirmation because the webhook handler looked up the wrong reference shape and returned `404`.
- The frontend then kept polling too aggressively and eventually hit `429`.
## Implemented Locally
### Backend confirmation repair
- Webhook lookup now searches all known Request Network correlation keys:
  - `providerPaymentId`
  - `metadata.requestNetworkRequestId`
  - `metadata.requestNetworkPaymentReference`
  - nested `metadata.requestNetworkData.requestId`
  - nested `metadata.requestNetworkData.paymentReference`
- Test webhook bypass is no longer enabled by default.
- New `REQUEST_NETWORK_ALLOW_TEST_WEBHOOKS` env flag controls explicit test-mode acceptance.
- Request Network adapter uses the same test-mode rule.
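The multi-key lookup above can be sketched as a MongoDB-style `$or` filter. This is a minimal illustration, not the actual handler: the function name `buildCorrelationFilter`, the `RnWebhookIds` payload shape, and the choice to try both identifiers against `providerPaymentId` are assumptions. Only the field paths come from this handoff.

```typescript
// Hypothetical payload shape: the identifiers a webhook might carry.
interface RnWebhookIds {
  requestId?: string;
  paymentReference?: string;
}

type CorrelationClause = Record<string, string>;

// Builds a filter that matches a Payment on any known correlation key.
// Returns null when the webhook carries no usable identifier, so the
// caller can reject it instead of running an unbounded query.
function buildCorrelationFilter(
  ids: RnWebhookIds,
): { $or: CorrelationClause[] } | null {
  const clauses: CorrelationClause[] = [];
  if (ids.paymentReference) {
    clauses.push(
      { providerPaymentId: ids.paymentReference },
      { "metadata.requestNetworkPaymentReference": ids.paymentReference },
      { "metadata.requestNetworkData.paymentReference": ids.paymentReference },
    );
  }
  if (ids.requestId) {
    // Assumption: providerPaymentId may also hold the request id.
    clauses.push(
      { providerPaymentId: ids.requestId },
      { "metadata.requestNetworkRequestId": ids.requestId },
      { "metadata.requestNetworkData.requestId": ids.requestId },
    );
  }
  return clauses.length > 0 ? { $or: clauses } : null;
}
```

Returning `null` for an empty payload keeps the "no identifier" case distinct from "identifier present but no match", which is the case that produced the `404` in this incident.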
### Transaction Safety Provider
Added `TransactionSafetyProvider` as the gate between provider event and escrow credit.
Initial checks:
- transaction hash required by default,
- minimum confirmations required by default,
- transfer recipient/token/amount match required by default,
- AML provider placeholder defaults to `none`; non-`none` values block until implemented.
Webhook and reconciliation completion paths both run through the same safety gate.
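A minimal sketch of how such a gate could evaluate the checks listed above. The type names, reason strings, and the order of checks are assumptions for illustration; the real `TransactionSafetyProvider` may differ. The only behavior taken from this handoff is that every failed check leaves the payment pending rather than credited.

```typescript
// Hypothetical config and evidence shapes for the safety gate.
interface SafetyConfig {
  enabled: boolean;
  requireTxHash: boolean;
  requireTransferMatch: boolean;
  minConfirmations: number;
  amlProvider: string;
}

interface SafetyEvidence {
  txHash?: string;
  confirmations?: number;
  transferMatches?: boolean; // recipient, token, and amount all match
}

type SafetyDecision =
  | { status: "approved" }
  | { status: "pending"; reason: string };

function evaluateTransactionSafety(
  config: SafetyConfig,
  evidence: SafetyEvidence,
): SafetyDecision {
  // Assumption: a disabled gate approves everything.
  if (!config.enabled) return { status: "approved" };
  // Any AML provider other than "none" blocks until it is implemented.
  if (config.amlProvider !== "none") {
    return { status: "pending", reason: "aml-provider-not-implemented" };
  }
  if (config.requireTxHash && !evidence.txHash) {
    return { status: "pending", reason: "missing-tx-hash" };
  }
  if ((evidence.confirmations ?? 0) < config.minConfirmations) {
    return { status: "pending", reason: "insufficient-confirmations" };
  }
  if (config.requireTransferMatch && evidence.transferMatches !== true) {
    return { status: "pending", reason: "transfer-mismatch" };
  }
  return { status: "approved" };
}
```

Because both the webhook and reconciliation paths call the same function, a payment cannot be credited through one path while the other would still hold it.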
### Frontend callback repair
- Callback page now unwraps the backend `{ data: { payment } }` shape.
- Socket events handle both `requestId` and `purchaseRequestId`.
- Polling backs off from 3 seconds to 10 seconds.
- Polling stops after terminal states.
- `429`, `401`, and `403` no longer trap the page in misleading behavior.
- Dashboard redirect paths were corrected.
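The polling behavior above can be sketched as a pure delay function. The backoff curve is not stated in this handoff, so the one-second step per attempt is an assumption, as are the function name and the exact set of terminal statuses.

```typescript
// Assumed terminal statuses; only "completed" is confirmed by this handoff.
const TERMINAL_STATUSES = ["completed", "failed", "cancelled", "expired"];

const INITIAL_DELAY_MS = 3000;
const MAX_DELAY_MS = 10000;
const STEP_MS = 1000; // assumption: linear ramp of one second per attempt

// Returns the delay before the next poll, or null when polling should stop.
function nextPollDelayMs(attempt: number, status: string): number | null {
  if (TERMINAL_STATUSES.indexOf(status) !== -1) return null;
  return Math.min(INITIAL_DELAY_MS + attempt * STEP_MS, MAX_DELAY_MS);
}
```

Capping at 10 seconds keeps the request rate low enough to avoid the `429` responses seen during the incident while still updating the page reasonably quickly.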
### Deployment/env
New env vars added to backend/deployment docs:
```text
REQUEST_NETWORK_ALLOW_TEST_WEBHOOKS=false
TRANSACTION_SAFETY_ENABLED=true
TRANSACTION_SAFETY_REQUIRE_TX_HASH=true
TRANSACTION_SAFETY_REQUIRE_TRANSFER_MATCH=true
TRANSACTION_SAFETY_MIN_CONFIRMATIONS=12
TRANSACTION_SAFETY_AML_PROVIDER=none
```
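One way these variables could be read into a typed config is sketched here. The function and type names are hypothetical, and treating the values listed above as the fallback defaults is an assumption; the backend's actual loader is not shown in this handoff.

```typescript
type EnvSource = Record<string, string | undefined>;

interface RnSafetyEnv {
  allowTestWebhooks: boolean;
  safetyEnabled: boolean;
  requireTxHash: boolean;
  requireTransferMatch: boolean;
  minConfirmations: number;
  amlProvider: string;
}

// Only the literal string "true" (case-insensitive) enables a flag.
function parseFlag(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

function loadRnSafetyEnv(env: EnvSource): RnSafetyEnv {
  const parsedMin = parseInt(env["TRANSACTION_SAFETY_MIN_CONFIRMATIONS"] ?? "", 10);
  return {
    // Test bypass stays off unless explicitly enabled.
    allowTestWebhooks: parseFlag(env["REQUEST_NETWORK_ALLOW_TEST_WEBHOOKS"], false),
    safetyEnabled: parseFlag(env["TRANSACTION_SAFETY_ENABLED"], true),
    requireTxHash: parseFlag(env["TRANSACTION_SAFETY_REQUIRE_TX_HASH"], true),
    requireTransferMatch: parseFlag(env["TRANSACTION_SAFETY_REQUIRE_TRANSFER_MATCH"], true),
    // Assumption: fall back to 12 when the value is missing or invalid.
    minConfirmations: isNaN(parsedMin) || parsedMin < 0 ? 12 : parsedMin,
    amlProvider: (env["TRANSACTION_SAFETY_AML_PROVIDER"] ?? "none").trim() || "none",
  };
}
```

Taking the environment as a parameter instead of reading `process.env` directly keeps the loader easy to test with plain objects.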
Versions were bumped together:
```text
frontend: 2.6.26
backend: 2.6.26
```
## Verification Already Run
Backend:
```bash
npm test -- __tests__/request-network-webhook.test.ts __tests__/request-network-adapter.test.ts __tests__/payment-reconciliation.service.test.ts --runInBand
npm run typecheck
REQUEST_NETWORK_ALLOW_TEST_WEBHOOKS=true BASE_URL=https://dev.amn.gg ./scripts/smoke/rn-webhook.sh
git diff --check
```
Frontend:
```bash
npx eslint src/app/payment/callback/page.tsx
npx tsc --noEmit -p tsconfig.json
git diff --check
```
Deployment/docs:
```bash
git diff --check
```
Important note: the smoke test against `dev.amn.gg` used `REQUEST_NETWORK_ALLOW_TEST_WEBHOOKS=true` because the currently deployed dev stack is still old and unsafe. After deploy, rerun without that override and expect unsigned/test callbacks to be rejected.
## Deploy Gate
Before another paid payment:
1. Commit/push/deploy backend, frontend, and deployment changes.
2. Set the new env vars in Arcane/dev deployment.
3. Confirm backend and frontend report `2.6.26`.
4. Run the RN webhook smoke test against dev without test bypass.
5. Tail nginx and backend logs during the next probe.
6. Inspect `Payment.metadata.transactionSafety` if the callback still waits.
## Recommended Next Work
1. Deploy and verify the confirmation repair.
2. Repeat one small dev BSC payment.
3. If it lands in `transactionSafety.pending` due to a missing transaction hash, add Request Network status/search enrichment so safety can resolve the tx hash.
4. Build the Cloudflare Worker durable webhook ingress:
   - receive raw RN payload and headers,
   - durably store delivery evidence,
   - forward to backend,
   - replay by delivery id/time window/payment reference.
5. Pick the first AML/sanctions provider and wire it behind `TRANSACTION_SAFETY_AML_PROVIDER`.
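The durable ingress in item 4 has not been built yet, so the following is only a sketch of the store-then-forward and replay steps under stated assumptions. The `DeliveryStore` interface, the `ForwardToBackend` callback, and the decision to acknowledge with `202` when the backend is unreachable are all hypothetical; a real Worker would back the store with a durable Cloudflare storage product.

```typescript
interface WebhookDelivery {
  deliveryId: string;
  receivedAt: number; // epoch milliseconds
  paymentReference?: string;
  headers: Record<string, string>;
  rawBody: string;
}

// Hypothetical storage abstraction; any durable key-value store would do.
interface DeliveryStore {
  put(delivery: WebhookDelivery): Promise<void>;
  list(): Promise<WebhookDelivery[]>;
}

// Forwards one delivery to the backend and resolves with its HTTP status.
type ForwardToBackend = (delivery: WebhookDelivery) => Promise<number>;

async function ingestDelivery(
  delivery: WebhookDelivery,
  store: DeliveryStore,
  forward: ForwardToBackend,
): Promise<number> {
  // Persist first so the evidence survives a backend outage.
  await store.put(delivery);
  try {
    return await forward(delivery);
  } catch {
    // Assumption: acknowledge anyway, because the delivery can be replayed.
    return 202;
  }
}

interface ReplayFilter {
  deliveryId?: string;
  from?: number;
  to?: number;
  paymentReference?: string;
}

// Re-forwards every stored delivery that matches the filter.
async function replayDeliveries(
  store: DeliveryStore,
  forward: ForwardToBackend,
  filter: ReplayFilter,
): Promise<number> {
  const stored = await store.list();
  const selected = stored.filter(
    (d) =>
      (filter.deliveryId === undefined || d.deliveryId === filter.deliveryId) &&
      (filter.from === undefined || d.receivedAt >= filter.from) &&
      (filter.to === undefined || d.receivedAt <= filter.to) &&
      (filter.paymentReference === undefined ||
        d.paymentReference === filter.paymentReference),
  );
  for (const delivery of selected) {
    await forward(delivery);
  }
  return selected.length;
}
```

Replay only works safely if the backend webhook handler is idempotent, since the same delivery may be forwarded more than once.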
## Operational Rule
For Request Network incidents:
- Real provider webhook returning `404`: stop paid testing; fix correlation/config.
- Webhook returning `202` with `transactionSafety.pending`: evidence was captured, but payment is not safe to credit yet.
- Webhook returning `200`/completed with safety approved: proceed to normal marketplace state checks.
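The rule above can be written as a small lookup for runbooks or alerting. The action names and the `"approved"` safety status string are assumptions for illustration; the status codes and their meanings come from the rule itself.

```typescript
type TriageAction =
  | "stop-paid-testing" // fix correlation/config first
  | "hold-credit" // evidence captured, not safe to credit yet
  | "proceed" // continue to normal marketplace state checks
  | "investigate"; // anything the rule does not cover

function triageWebhookDelivery(
  httpStatus: number,
  safetyStatus?: string,
): TriageAction {
  if (httpStatus === 404) return "stop-paid-testing";
  if (httpStatus === 202 && safetyStatus === "pending") return "hold-credit";
  if (httpStatus === 200 && safetyStatus === "approved") return "proceed";
  return "investigate";
}
```

Combinations outside the three documented cases fall through to `"investigate"` so that an unexpected response is never silently treated as safe.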


@@ -0,0 +1,70 @@
# Handoff: Request Network In-House Checkout — 2026-05-28
Status: **fully end-to-end working on dev.amn.gg as of 2.6.38 backend / 2.6.41 frontend**. A 0.01 USDC payment (tx `0x494c77a2…`) flowed: page render → wallet connect (Rabby/injected) → approve → `transferFromWithReferenceAndFee` → RN webhook → backend marks completed → page flips to "پرداخت تأیید شد ✓" ("Payment confirmed ✓") → continue → `/dashboard/payment/<id>`.
## What's live
- **Backend 2.6.38** — `/api/payment/request-network/intents` returns an `inHouseCheckout` block (destination, tokenAddress, decimals, chainId, proxyAddress, paymentReference 8-byte hex, feeAmount, feeAddress, amountWei). `GET /api/payment/request-network/:paymentId/checkout` rehydrates the block for an existing Payment record (lazy-enriches legacy records that pre-date 2.6.34 by calling RN's `GET /v2/request/:id`). Public `GET /api/version` for the version badge. `PaymentCoordinator.updatePurchaseRequestStatus` guards both `template-checkout-` and `template-tc-` prefixes (plus regex fallback for any non-ObjectId) — earlier the `template-tc-` blindspot crashed webhook processing on template-checkout payments and stranded escrow.
- **Frontend 2.6.41** — `/checkout/request-network/[paymentId]` page with wagmi state machine: connect → switch-chain → check-allowance → approve → pay → wait-for-webhook. Destination + payment-reference + approve-tx + pay-tx hashes are copyable and click through to BscScan. Once a pay tx is in flight the page no longer reverts to "approve" even though the proxy call consumed the allowance. A 10-second `GET /api/payment/:id` poll runs as a fallback when the socket misses `payment-update`. Success-state continue button handles synthetic purchaseRequestId prefixes (`template-checkout-`, `template-tc-`) by routing to `/dashboard/payment/<id>` instead of the 404-prone `/dashboard/request/<syntheticId>`. WagmiProvider is now rendered unconditionally + the checkout page also self-wraps in its own WagmiProvider for defensive isolation.
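Two of the frontend rules above can be sketched as pure functions: choosing the next checkout step, and choosing the continue route. The snapshot shape and function names are hypothetical and the real view is driven by wagmi hooks. The 24-character hex check mirrors the backend's non-ObjectId fallback and is an assumption on the frontend side.

```typescript
type CheckoutStep =
  | "connect"
  | "switch-chain"
  | "check-allowance"
  | "approve"
  | "pay"
  | "wait-for-webhook";

// Hypothetical snapshot of what the page knows at a given moment.
interface CheckoutSnapshot {
  connected: boolean;
  walletChainId?: number;
  requiredChainId: number;
  allowanceSufficient?: boolean; // undefined until the allowance is read
  payTxHash?: string;
}

function nextCheckoutStep(s: CheckoutSnapshot): CheckoutStep {
  if (!s.connected) return "connect";
  if (s.walletChainId !== s.requiredChainId) return "switch-chain";
  // Checked before the allowance: the proxy call consumes the allowance,
  // so an in-flight pay tx must not send the page back to "approve".
  if (s.payTxHash) return "wait-for-webhook";
  if (s.allowanceSufficient === undefined) return "check-allowance";
  return s.allowanceSufficient ? "pay" : "approve";
}

// Synthetic purchaseRequestIds have no request page, so route to the payment.
function continuePath(paymentId: string, purchaseRequestId: string): string {
  const synthetic =
    purchaseRequestId.indexOf("template-checkout-") === 0 ||
    purchaseRequestId.indexOf("template-tc-") === 0 ||
    !/^[0-9a-f]{24}$/i.test(purchaseRequestId);
  return synthetic
    ? `/dashboard/payment/${paymentId}`
    : `/dashboard/request/${purchaseRequestId}`;
}
```

Ordering the pay-in-flight check ahead of the allowance check is what prevents the regression described above, where a consumed allowance would otherwise look like a missing approval.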
Verify which versions are running by hovering the version chip at bottom-left of any page on dev.amn.gg, or `curl https://dev.amn.gg/api/version`.
## Where things stand
A real 0.01 USDC payment ran clean through the in-house path on 2026-05-28. Webhook delivery is durable enough for dev usage; durability for prod is Phase 5 (Cloudflare Worker ingress, not started). Five follow-up tasks were scoped immediately after — see `PRD - Wallet, Multichain, Confirmations, AML, Trezor.md` and Taskmaster `#7..#11`.
## Known issues / open work
- **TypeScript-error CI false-success**: pipelines #40 and #41 reported ✅ green in Woodpecker while `yarn build` was actually failing at the TS step and no image was pushed. Memory entry: `woodpecker_silent_build_fail.md`. **Always verify** `dev-<version>` exists in `git.manko.yoga` before trusting CI green. The wagmi `chainId` field requires `as any` because of its literal-union type — keep that pattern when adding new wagmi calls.
- **Existing/legacy Payment records** (created before backend 2.6.34) don't have RN's request details cached. The GET endpoint lazy-enriches them via `GET /v2/request/:requestId` on first visit, then persists the result. If RN's API is down at that moment, the endpoint falls back to the hosted-page link.
- **Mongo access is denied** to the auto-mode classifier on dev — debugging payment records currently requires either the user's mongo creds or relying on the 409 `debug` block surfaced through the frontend.
- **Wagmi provider isolation (2.6.39)**: The checkout page wraps itself in its own `WagmiProvider`. The root `Web3Provider` also renders `WagmiProvider` unconditionally as of 2.6.38. The doubling is intentional defensiveness — if one provider has an issue, the other still serves the checkout flow. Can be simplified later if both prove stable.
- **PRD Phase 5 — Cloudflare Worker durable webhook ingress** — not started. Taskmaster `3.13`. Current dev relies on `dev.amn.gg` being up at the moment RN's webhook fires. For prod, RN webhooks need to land in a durable Cloudflare Worker that buffers + replays into the backend.
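The lazy-enrichment fallback described for legacy Payment records can be sketched as follows. The record shape, the injected `fetchRequestDetails` and `persist` callbacks, and the result type are hypothetical; the sketch only illustrates the order of operations (use cache, else fetch and persist, else fall back to the hosted page).

```typescript
// Hypothetical view of a legacy Payment record.
interface LegacyPaymentRecord {
  paymentId: string;
  requestId: string;
  hostedPageUrl: string;
  cachedRequestDetails?: Record<string, unknown>;
}

type CheckoutResolution =
  | { kind: "in-house"; details: Record<string, unknown> }
  | { kind: "hosted-fallback"; url: string };

async function resolveCheckout(
  record: LegacyPaymentRecord,
  fetchRequestDetails: (requestId: string) => Promise<Record<string, unknown>>,
  persist: (record: LegacyPaymentRecord) => Promise<void>,
): Promise<CheckoutResolution> {
  // Records created after the cache was introduced skip the remote call.
  if (record.cachedRequestDetails) {
    return { kind: "in-house", details: record.cachedRequestDetails };
  }
  try {
    const details = await fetchRequestDetails(record.requestId);
    record.cachedRequestDetails = details;
    await persist(record);
    return { kind: "in-house", details };
  } catch {
    // RN unreachable (or the save failed): keep the buyer moving.
    return { kind: "hosted-fallback", url: record.hostedPageUrl };
  }
}
```

Persisting on first success means each legacy record costs at most one remote call, and a failed attempt leaves the record unchanged so the next visit retries.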
## Files changed (recent)
Backend (`/Users/manwe/CascadeProjects/escrow/backend`):
- `src/services/payment/requestNetwork/contract.ts` — spreads full RN response into `raw`
- `src/services/payment/requestNetwork/inHouseCheckout.ts` — block builder, reads `paymentReference` from `rnRaw.requestDetails.paymentReference`
- `src/services/payment/requestNetwork/merchantReference.ts`, `tokens.ts`, `proxyAddresses.ts`, `paymentReference.ts` — helpers
- `src/services/payment/requestNetwork/requestNetworkPayInService.ts` — calls `GET /v2/request` after intent creation
- `src/services/payment/requestNetwork/requestNetworkRoutes.ts` - `GET /:paymentId/checkout` + lazy enrichment + debug response
- `src/services/payment/requestNetwork/networkClient.ts` — already had `getRequestStatus`
- `src/app.ts` - `GET /api/version`, exempt from rate limit
- `__tests__/rn-in-house-checkout.test.ts` — 12 unit tests, all green
Frontend (`/Users/manwe/CascadeProjects/escrow/frontend`):
- `src/web3/contracts/rn-fee-proxy.ts` — RN proxy + ERC20 ABIs
- `src/web3/context/wagmi-provider.tsx` — removed the mount-gate that caused `WagmiProviderNotFoundError`
- `src/web3/components/provider-payment.tsx` - `router.push` to in-house page + sessionStorage stash
- `src/sections/payment/checkout/types.ts` + `rn-in-house-checkout-view.tsx` — state machine, local WagmiProvider wrap
- `src/app/checkout/request-network/[paymentId]/page.tsx` — app router entry
- `src/components/version-logger.tsx` — version chip + tooltip showing backend version
## Memory entries added
- `MEMORY.md` index updated with:
  - `arcane_dev_stack.md` (env/project IDs, two-step deploy note)
  - `woodpecker_silent_build_fail.md` (CI green ≠ image pushed)
  - and existing `rn_webhook_event_field.md`, `backend_rate_limits.md`, `telegram_notify_no_parse_mode.md`, `devEscrow_nginx_after_redeploy.md`, `parallel_agents_on_escrow.md`
## Open PRD questions still to decide
From `PRD - Request Network In-House Checkout.md` §10:
- Proxy address universality across chains (currently BSC + Arb confirmed; Task #8 will probe Polygon/ETH/Base)
- API pricing for hosted-UI-less usage (need RN account-mgmt question)
- Approval UX — exact-amount vs MAX_UINT256 (current: exact-amount)
- Cancel / timeout semantics for abandoned intents
- Telemetry events for in-house vs hosted A/B
## Follow-up tasks (Taskmaster + PRD)
Five follow-ups scoped for kimi to pick up independently. Full spec in `PRD - Wallet, Multichain, Confirmations, AML, Trezor.md`. Quick index:
| # | Task | Priority | Depends on |
|---|---------------------------------------------------------------|----------|------------|
| 7 | Per-(buyer, sellerOffer) ephemeral RN destination wallets | high | (sweep step soft-depends on #11) |
| 8 | Multichain RN proxy registry + USDC/USDT support | high | — |
| 9 | Per-chain confirmation thresholds + admin UI | medium | — |
| 10 | Optional AML screening on incoming payments (seller-paid) | medium | — |
| 11 | Trezor signing for admin actions (release/refund/sweep) | high | — |


@@ -260,6 +260,20 @@ If user data may have leaked, treat as sev 1 and follow your data-breach disclos
Use when Request Network payments are failing, stalled, or out of sync with local payment state.
**First triage:**
1. Check whether RN reached nginx:
   ```bash
   grep '/api/payment/request-network/webhook' /opt/backend/nginx/logs/access.log | tail -50
   ```
2. If RN deliveries returned `404`, treat it as a backend correlation/config bug. Do not run another paid probe until the correlation fix is deployed and smoke-tested.
3. If deliveries returned `202` or `200` but the payment is still pending, inspect `metadata.transactionSafety` on the `Payment` document. A safety-pending payment is captured but not credited; look for missing tx hash, insufficient confirmations, transfer mismatch, or AML provider blockers.
4. If Cloudflare Worker durable ingress is enabled, replay from the Worker delivery id/time window after backend repair instead of asking the buyer to pay again.
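To make steps 1 to 3 quicker during an incident, the grep output can be summarized by status code. This sketch assumes nginx's default combined log format, where the request line is quoted and followed by the status code; the function name is hypothetical, and a custom `log_format` would need a different pattern.

```typescript
const RN_WEBHOOK_PATH = "/api/payment/request-network/webhook";

// Counts webhook deliveries per HTTP status in nginx access-log text.
function countWebhookStatuses(
  logText: string,
  path: string = RN_WEBHOOK_PATH,
): Record<string, number> {
  const counts: Record<string, number> = {};
  // Matches e.g.: "POST /some/path HTTP/1.1" 404
  const pattern = /"POST (\S+) HTTP\/[0-9.]+" (\d{3})/;
  for (const line of logText.split("\n")) {
    const match = pattern.exec(line);
    if (!match) continue;
    const [, rawPath = "", status = ""] = match;
    // Ignore any query string when comparing paths.
    const queryStart = rawPath.indexOf("?");
    const requestPath = queryStart === -1 ? rawPath : rawPath.slice(0, queryStart);
    if (requestPath !== path) continue;
    counts[status] = (counts[status] ?? 0) + 1;
  }
  return counts;
}
```

A result such as `{ "404": 4 }` corresponds to step 2, while only `202` or `200` entries point to the safety-pending inspection in step 3.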
**Immediate rollback (minutes):**
1. Stop routing new intents to Request Network by setting:


@@ -181,6 +181,8 @@ Today these are read manually from logs / Sentry. As Prometheus is added, encode
|--------|-------|---------|-------|
| Payment success rate | `db.payments.aggregate([{$group:{_id:"$status",n:{$sum:1}}}])` | > 95 % completed of 24h-old payments | < 90 % |
| Webhook signature failures | log `Webhook verification failed` | 0 | > 0 |
| Request Network webhook 4xx | nginx access log `/api/payment/request-network/webhook` | 0 | any real provider delivery returning 4xx |
| Request Network safety-pending payments | `db.payments.find({"metadata.transactionSafety.status":"pending"})` | explained/short-lived | pending > 10 min without operator note |
| SHKeeper API errors (5xx) | log + Sentry | 0 | > 5/min sustained |
| Payouts stuck in `pending` > 30 min | `db.payments.find({type:'payout',status:'pending',createdAt:{$lt:new Date(Date.now() - 30*60*1000)}})` | empty | non-empty |
| Missing `transactionHash` after `completed` | the same query that drives `fix-transaction-hashes.js` | empty | non-empty |