docs: add latest audit to taskmaster

2026-05-24 09:09:55 +04:00
parent a37b1b1446
commit 393bb17c2e
3 changed files with 916 additions and 19 deletions
--- a/.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md
+++ b/.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md
@@ -0,0 +1,560 @@
 ---
 title: Backend Stack Security and Refactor Assessment
 tags: [audit, security, backend, architecture, payments, refactor]
 created: 2026-05-24
 status: advisory
 ---
 # Backend Stack Security and Refactor Assessment
 ## Purpose
 This document records an advisory assessment of whether Amanat should keep the current Node/Express backend, harden it in place, or migrate at least the security-critical backend surface to another technology stack.
 The conclusion is intentionally strategic rather than implementation-heavy. It should be used as input for architecture review, security planning, and refactor scoping.
 ## Executive summary
 Amanat is not a normal CRUD marketplace. It is a financial escrow platform with authentication, realtime communication, crypto payment intake, payout/release flows, provider webhooks, and dispute-sensitive fund movement.
 The main security risk is not simply "Node is insecure." The larger issue is that the current backend mixes high-risk financial state transitions, webhook handling, realtime room membership, admin operations, test/demo endpoints, and ordinary marketplace APIs in one Express application.
 Moving away from Node/Express may reduce npm supply-chain exposure and improve long-term auditability, but it will not automatically fix the most important risks. The immediate priority should be to define and enforce the correct security architecture:
 - A canonical funds ledger.
 - A strict escrow/payment/dispute state machine.
 - Centralized authorization and ownership checks.
 - Signed webhook handling with idempotency.
 - Server-derived realtime authorization.
 - Secure session handling.
 - A provider-neutral payment abstraction.
 Recommended approach:
 1. Harden the existing backend immediately.
 2. Define the target payment, ledger, and auth architecture in documentation.
 3. Extract or rewrite only the security-critical backend core if the team can support the new stack.
 4. Keep lower-risk marketplace, chat, notification, and dashboard APIs in TypeScript until the core is stable.
 Default recommendation: do not rewrite the entire backend at once. If a rewrite is chosen, start with payment/auth/escrow core services, preferably in Go or Kotlin/Java, while preserving current product behavior behind stable API contracts.
 ## Current system profile
 Observed architecture:
 - Frontend: Next.js, React, MUI, Web3, Socket.IO client.
 - Backend: Express 5, TypeScript, Mongoose, Socket.IO, SHKeeper, Web3 transaction verification, SMTP, OpenAI integration.
 - Storage: MongoDB and Redis, though Redis is not consistently used as a shared state authority for all security-sensitive flows.
 - Realtime: Socket.IO rooms for user, buyer, seller, chat, and purchase-request updates.
 - Payments: SHKeeper pay-in, SHKeeper payout, decentralized/Web3 payment verification, manual/admin payout paths.
 - Docs: existing logical audit and remediation documents already identify several critical flaws.
 The backend currently acts as:
 - API server.
 - Realtime server.
 - Payment orchestrator.
 - Webhook processor.
 - Background-job runner.
 - File upload server.
 - Auth/session issuer.
 - Admin operations surface.
 That is too much responsibility in one process for a financial platform unless the architecture is very tightly controlled.
 ## Code-backed security observations
 These findings are consistent with the existing audit docs and with a representative review of the source.
 ### Payment and funds risks
 - Payment state is largely represented by mutable `Payment.status` and `escrowState` fields rather than an immutable funds ledger.
 - Pay-in, manual confirmation, wallet monitoring, webhook handling, and payout flows can converge on the same records through different paths.
 - Release/refund eligibility is not fully centralized around ledger invariants.
 - The existing docs identify a dispute/escrow race: disputes do not reliably create an enforceable hold before release.
 - `Payment` uses mixed/string-compatible references for some core links, reducing referential integrity and query safety.
 - Some payment mutation/history routes were exposed without sufficient authentication or ownership enforcement.
 - Web3 verification has been documented as relying primarily on transaction receipt success rather than strict token, recipient, and amount verification.
 Security implication: a backend stack change alone will not fix this. The platform needs a funds ledger and state machine first.
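
As an illustration of the stricter Web3 check described above, the following is a minimal sketch that accepts a receipt only when it contains an ERC-20 `Transfer` log from the expected token contract, to the expected escrow address, for at least the expected amount. The type and function names (`TransferLog`, `ExpectedTransfer`, `receiptPaysEscrow`) are hypothetical and not taken from the codebase; only the `Transfer` event topic is the standard ERC-20 value.

```typescript
// keccak256("Transfer(address,address,uint256)"): the standard ERC-20 Transfer event topic.
const TRANSFER_TOPIC =
  "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef";

// Hypothetical shapes for illustration; a real receipt from a Web3 library has more fields.
interface TransferLog {
  address: string;  // contract that emitted the log
  topics: string[]; // [eventSignature, from, to], each a 32-byte hex word
  data: string;     // uint256 amount, hex encoded
}

interface ExpectedTransfer {
  tokenContract: string;
  escrowAddress: string;
  minAmount: bigint; // smallest token unit, decimals already applied
}

// An indexed address is left-padded to 32 bytes; the last 20 bytes are the address.
function topicToAddress(topic: string): string {
  return "0x" + topic.slice(-40).toLowerCase();
}

function matchesExpectedTransfer(log: TransferLog, expected: ExpectedTransfer): boolean {
  const [signature, , recipient] = log.topics;
  if (log.topics.length !== 3 || signature === undefined || recipient === undefined) return false;
  if (signature.toLowerCase() !== TRANSFER_TOPIC) return false;
  if (log.address.toLowerCase() !== expected.tokenContract.toLowerCase()) return false;
  if (topicToAddress(recipient) !== expected.escrowAddress.toLowerCase()) return false;
  if (!/^0x[0-9a-fA-F]+$/.test(log.data)) return false;
  return BigInt(log.data) >= expected.minAmount;
}

// Receipt success alone is not enough: a matching transfer log must also be present.
function receiptPaysEscrow(
  receipt: { status: boolean; logs: TransferLog[] },
  expected: ExpectedTransfer,
): boolean {
  return receipt.status && receipt.logs.some((log) => matchesExpectedTransfer(log, expected));
}
```

A production verifier would also need confirmation depth, chain ID checks, and a stored idempotency fingerprint per transaction hash; those are out of scope for this sketch.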
 ### Authentication and session risks
 - Browser tokens are stored in `localStorage`, increasing impact from XSS.
 - Passkey/WebAuthn behavior is described in the audit docs as stubbed/incomplete, and challenge storage is process-local.
 - Refresh-token behavior differs between auth paths.
 - Admin-sensitive routes need explicit role enforcement, not just authentication.
 Security implication: migration should include a session architecture decision, not just a framework change.
 ### Realtime risks
 - Socket.IO room joins are client-driven by IDs such as `join-user-room`, `join-buyer-room`, and `join-seller-room`.
 - The server should derive room membership from authenticated socket identity, not trust client-supplied user IDs.
 Security implication: realtime authorization needs to be treated like API authorization.
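
One way to express server-derived room membership is sketched below. It assumes that handshake authentication has already attached a verified identity to the socket; the `SocketIdentity` shape and the `user:<id>` style room names are illustrative assumptions, not the current implementation.

```typescript
// Hypothetical identity produced by socket handshake authentication.
interface SocketIdentity {
  userId: string;
  roles: ReadonlyArray<"buyer" | "seller" | "admin">;
}

// The server computes the rooms; the client never supplies a user ID.
function roomsFor(identity: SocketIdentity): string[] {
  const rooms = [`user:${identity.userId}`];
  if (identity.roles.includes("buyer")) rooms.push(`buyer:${identity.userId}`);
  if (identity.roles.includes("seller")) rooms.push(`seller:${identity.userId}`);
  return rooms;
}

// If join events are kept for compatibility, a requested room is honoured
// only when the server would have derived that same room from the identity.
function mayJoin(identity: SocketIdentity, requestedRoom: string): boolean {
  return roomsFor(identity).includes(requestedRoom);
}
```

In a Socket.IO connection handler this could be used as `socket.join(roomsFor(identity))`, with any client-sent `join-user-room` style event either removed or checked through `mayJoin`.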
 ### Rate limiting and abuse controls
 - Global rate limiting is explicitly disabled in the Express app.
 - Sensitive paths need tiered limits: auth, verification, file upload, AI, payment, webhook, chat.
 - AI endpoints and email endpoints can create cost or abuse exposure if not authenticated and rate-limited.
 Security implication: this is an immediate hardening task regardless of backend stack.
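
A tiered limiter can be sketched as a lookup table plus a per-client counter. The example below is a minimal in-memory fixed-window version; the tier names follow the list above, while the numeric limits and the `FixedWindowLimiter` class are illustrative assumptions. A multi-instance deployment would need shared storage such as Redis rather than a process-local `Map`.

```typescript
type Tier = "auth" | "verification" | "upload" | "ai" | "payment" | "webhook" | "chat";

interface TierLimit {
  windowMs: number;
  max: number;
}

// Illustrative numbers only; real limits should come from observed traffic.
const TIER_LIMITS: Record<Tier, TierLimit> = {
  auth: { windowMs: 60000, max: 5 },
  verification: { windowMs: 60000, max: 5 },
  upload: { windowMs: 60000, max: 10 },
  ai: { windowMs: 60000, max: 10 },
  payment: { windowMs: 60000, max: 10 },
  webhook: { windowMs: 60000, max: 120 },
  chat: { windowMs: 60000, max: 60 },
};

class FixedWindowLimiter {
  // Process-local state: a stand-in for a shared store.
  private readonly counters = new Map<string, { windowStart: number; count: number }>();
  private readonly limits: Record<Tier, TierLimit>;

  constructor(limits: Record<Tier, TierLimit>) {
    this.limits = limits;
  }

  // Returns true when the request may proceed, false when it should get HTTP 429.
  allow(tier: Tier, clientKey: string, now: number = Date.now()): boolean {
    const limit = this.limits[tier];
    const key = `${tier}:${clientKey}`;
    const entry = this.counters.get(key);
    if (entry === undefined || now - entry.windowStart >= limit.windowMs) {
      this.counters.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= limit.max) return false;
    entry.count += 1;
    return true;
  }
}
```

Keying the counter by tier and client keeps a burst of chat traffic from consuming the much smaller auth or payment budget.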
 ### Webhook and provider risks
 - Webhooks must be verified using raw-body signatures, not reconstructed JSON when signatures depend on raw bytes.
 - Webhook delivery must be idempotent.
 - Unknown, duplicate, malformed, and failed webhooks should be visible in structured records or dead-letter storage.
 - Provider callbacks should create reconciliation events, not directly release funds.
 Security implication: payment provider integration should be isolated behind a provider-neutral service contract.
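
The webhook rules above can be sketched as follows. The signature scheme here (hex-encoded HMAC-SHA256 over the raw body, plus a unique delivery ID) is an assumption for illustration and is not a description of how SHKeeper or any specific provider signs callbacks; `WebhookInbox` and its fields are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verifies the signature over the exact bytes received, not over re-serialized JSON.
function verifySignature(rawBody: Buffer, signatureHex: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHex, "hex");
  // timingSafeEqual throws on length mismatch, so lengths are compared first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

type WebhookOutcome = "accepted" | "duplicate" | "rejected";

class WebhookInbox {
  // In-memory stand-ins for a durable unique index, a work queue, and dead-letter storage.
  private readonly seenDeliveries = new Set<string>();
  readonly reconciliationQueue: string[] = [];
  readonly deadLetters: { deliveryId: string; reason: string }[] = [];
  private readonly secret: string;

  constructor(secret: string) {
    this.secret = secret;
  }

  receive(deliveryId: string, rawBody: Buffer, signatureHex: string): WebhookOutcome {
    if (!verifySignature(rawBody, signatureHex, this.secret)) {
      this.deadLetters.push({ deliveryId, reason: "bad-signature" });
      return "rejected";
    }
    if (this.seenDeliveries.has(deliveryId)) return "duplicate";
    this.seenDeliveries.add(deliveryId);
    // The verified payload is queued for reconciliation; nothing here releases funds.
    this.reconciliationQueue.push(rawBody.toString("utf8"));
    return "accepted";
  }
}
```

In Express, the raw bytes are typically obtained by mounting `express.raw({ type: "application/json" })` on the webhook route before any JSON body parser runs.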
 ### Supply-chain risks
 The Node/npm ecosystem has real and recurring supply-chain risk. For this codebase, that risk matters because both frontend and backend depend heavily on npm packages.
 Relevant 2026 context:
 - Express published February 2026 security releases, including high-severity Multer issues affecting versions before 2.1.0. The backend manifest currently specifies `multer: ^2.0.2`, so the resolved lockfile version should be reviewed and updated if necessary.
 - Node.js published March 2026 security releases across active release lines.
 - Microsoft reported an Axios npm supply-chain compromise in March 2026. This project uses Axios on frontend and backend.
 - TanStack published a May 2026 npm compromise postmortem. This project uses `@tanstack/react-query`.
 References:
 - Express security release, 2026-02-27: https://expressjs.com/2026/02/27/security-releases.html
 - Node.js March 2026 security releases: https://nodejs.org/en/blog/vulnerability/march-2026-security-releases
 - Microsoft on Axios npm supply-chain compromise: https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/
 - TanStack npm supply-chain compromise postmortem: https://tanstack.com/blog/npm-supply-chain-compromise-postmortem
 Security implication: npm supply-chain controls are required even if the backend is rewritten, because the frontend remains npm-based.
 ## Should the backend move away from Node/Express?
 ### Reasons to keep and harden first
 - The product already exists and has working business flows.
 - A full rewrite risks reintroducing escrow/payment bugs.
 - The most dangerous issues are domain/state/authorization issues, not syntax or framework issues.
 - Hardening can reduce immediate exposure faster than a rewrite.
 - The team may currently be more productive in TypeScript.
 ### Reasons to migrate at least the backend core
 - Financial backend code benefits from a smaller, stricter dependency footprint.
 - Payment, ledger, webhook, and payout flows need strong invariants and auditability.
 - Express makes it easy to accumulate route-level exceptions, test endpoints, and inconsistent middleware.
 - Node/npm supply-chain exposure is material and recurring.
 - TypeScript runtime enforcement is limited unless paired with strict schema validation everywhere.
 - A separate payment core can be more easily audited, threat-modeled, tested, and locked down.
 ### Balanced conclusion
 From a security standpoint, it is reasonable to move the highest-risk backend core away from Node/Express, but only after the target security model is specified.
 Do not begin with a full product rewrite. Begin with a security-critical core extraction:
 - Auth/session/token authority.
 - Payment intent creation.
 - Provider webhook processing.
 - Funds ledger and reconciliation.
 - Release/refund/dispute-hold enforcement.
 - Admin payout approval and audit logging.
 Keep lower-risk modules in the current stack until the core is stable:
 - Marketplace browsing/listing.
 - Request templates.
 - Chat and notifications, after socket authorization fixes.
 - Admin dashboard reads.
 - File upload, after hardening or moving to object storage.
 ## Stack options
 ### Go
 Best fit if the team wants a smaller, operationally simple, security-focused payment core.
 Strengths:
 - Small binaries and deployment footprint.
 - Lower dependency surface than typical Node services.
 - Strong standard library for HTTP, crypto, JSON, and concurrency.
 - Good fit for webhook receivers, ledger services, workers, and reconciliation jobs.
 - Easy to run static analysis and produce reproducible builds.
 Weaknesses:
 - Less ergonomic than TypeScript for rapid product iteration.
 - Requires team comfort with Go idioms.
 - API/schema generation must be designed deliberately.
 Assessment: recommended first choice for a payment/ledger/auth core if the team can maintain Go.
 ### Kotlin/Java with Spring Boot
 Best fit if the team wants enterprise-grade structure, mature auth patterns, and strong ecosystem support.
 Strengths:
 - Mature security and validation ecosystem.
 - Strong typing and tooling.
 - Good for complex domain services and audit-heavy systems.
 - Well-understood operational patterns.
 Weaknesses:
 - Heavier runtime and framework footprint.
 - More ceremony.
 - Slower iteration for a small team.
 Assessment: strong choice for a larger engineering team or enterprise-style compliance needs.
 ### Rust
 Best fit if maximum memory safety and correctness are worth slower delivery.
 Strengths:
 - Strong compile-time safety.
 - Good for cryptographic and high-assurance components.
 - Very low runtime footprint.
 Weaknesses:
 - Higher implementation cost.
 - Smaller hiring pool.
 - Web API development may be slower.
 Assessment: attractive for narrow cryptographic or transaction-verification components, but probably too costly for the whole backend unless the team is already strong in Rust.
 ### Python/FastAPI
 Best fit if rapid backend development and clean API typing are more important than strict compile-time guarantees.
 Strengths:
 - Fast development.
 - Good validation with Pydantic.
 - Good for admin tools and internal services.
 Weaknesses:
 - Supply-chain risk remains.
 - Runtime typing and async behavior require discipline.
 - Less compelling than Go/Kotlin for a financial core.
 Assessment: acceptable for internal services, not the preferred payment-core target.
 ### Continue TypeScript/Node with stronger architecture
 Best fit if team capacity cannot support another backend language yet.
 Required conditions:
 - Strict route registration policy.
 - Runtime validation on every boundary.
 - No test/demo routes in production builds.
 - Full lockfile and package provenance controls.
 - Centralized auth, ownership, and role guards.
 - Ledger-first payment architecture.
 - Secure cookies or a documented token-storage risk acceptance.
 - Socket auth middleware.
 - Redis-backed challenge/idempotency/rate-limit storage.
 Assessment: viable short term, but the security bar must be raised significantly.
 ## Recommended target architecture
 ### Phase 0: Immediate containment
 Goal: reduce current high-risk exposure without broad redesign.
 Actions:
 - Disable or protect test/demo payment and email endpoints in production.
 - Require authentication and ownership checks on all payment, notification, AI, and file routes.
 - Re-enable rate limiting with stricter limits on auth, payment, AI, file upload, and webhook paths.
 - Add admin role checks to admin routes.
 - Stop accepting arbitrary `userId` from clients for private data.
 - Validate all payment mutations through centralized service methods.
 - Lock Socket.IO room membership to server-verified identity.
 - Review and update lockfiles for known vulnerable packages.
 - Rotate any committed or publicly visible secrets.
 ### Phase 1: Architecture specification
 Goal: define the new security model before implementation.
 The documents to produce are listed under "Required documentation before refactor" below.
 ### Phase 2: Payment and ledger extraction
 Goal: move funds logic behind a provider-neutral service.
 Introduce:
 - `FundsAccount`
 - `LedgerEntry`
 - `FundsBalance`
 - `PaymentIntent`
 - `PaymentProviderEvent`
 - `ReleaseInstruction`
 - `RefundInstruction`
 - `DisputeHold`
 Key rule: provider webhooks do not directly release funds. They create verified events and ledger entries.
 ### Phase 3: Backend-core rewrite or service split
 Goal: decide whether the extracted core remains TypeScript or moves to Go/Kotlin.
 Recommended split:
 - `core-payments`: payment intent, webhook, ledger, release/refund, reconciliation.
 - `core-auth`: sessions, passkeys, OAuth, token issuance, session revocation.
 - `marketplace-api`: purchase requests, offers, categories, templates.
 - `realtime-api`: chat, notifications, socket rooms.
 The split can be logical first, physical later.
 ### Phase 4: Full migration only if justified
 Goal: avoid rewriting stable lower-risk product surfaces prematurely.
 Only consider full backend migration after:
 - Payment core is stable.
 - Auth/session model is stable.
 - API contracts are documented and tested.
 - Legacy payment records are migrated or safely read-only.
 - Team has demonstrated production maintenance ability in the new stack.
 ## Required documentation before refactor
 ### 1. Threat Model
 Purpose: identify what must be protected and how it can be attacked.
 Should include:
 - Assets: user accounts, admin accounts, wallet addresses, payment records, funds, webhook secrets, API keys, private notifications.
 - Actors: buyer, seller, admin, support, unauthenticated attacker, compromised user, compromised admin, provider, malicious webhook sender.
 - Trust boundaries: browser, backend, database, Redis, provider APIs, wallet/RPC, admin UI, Socket.IO.
 - Abuse cases: fake payment proof, replayed webhook, arbitrary room join, stolen token, double payout, dispute bypass, email/AI abuse.
 ### 2. Funds Ledger Specification
 Purpose: make money movement auditable and provider-independent.
 Should define:
 - Account model per purchase request/order.
 - Immutable ledger entry types.
 - Derived balance model.
 - Gross amount, provider fees, platform fees, held amount, disputed amount, releasable amount, released amount, refunded amount.
 - Idempotency keys.
 - Reconciliation behavior.
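
To make the ledger idea concrete, here is a minimal sketch of append-only entries with a derived balance. The entry types, the field names, and the `FundsLedger` class are assumptions for illustration; the balance breakdown is simplified (fees are merged, and held and disputed amounts are treated as one figure), and amounts are integers in minor units.

```typescript
type EntryType =
  | "DEPOSIT" | "PROVIDER_FEE" | "PLATFORM_FEE"
  | "DISPUTE_HOLD" | "DISPUTE_HOLD_LIFTED" | "RELEASE" | "REFUND";

interface LedgerEntry {
  readonly idempotencyKey: string;
  readonly accountId: string; // one funds account per purchase request/order
  readonly type: EntryType;
  readonly amount: number;    // positive integer, minor units
}

interface FundsBalance {
  gross: number;
  fees: number;
  disputed: number;
  released: number;
  refunded: number;
  releasable: number;
}

class FundsLedger {
  private readonly entries: LedgerEntry[] = [];
  private readonly seenKeys = new Set<string>();

  // Appends an entry; returns false and changes nothing if the key was already used.
  append(entry: LedgerEntry): boolean {
    if (!Number.isSafeInteger(entry.amount) || entry.amount <= 0) {
      throw new Error("amount must be a positive integer in minor units");
    }
    if (this.seenKeys.has(entry.idempotencyKey)) return false;
    this.seenKeys.add(entry.idempotencyKey);
    this.entries.push(Object.freeze({ ...entry }));
    return true;
  }

  // The balance is always recomputed from entries; it is never stored or mutated.
  balance(accountId: string): FundsBalance {
    const sum = (...types: EntryType[]): number =>
      this.entries
        .filter((e) => e.accountId === accountId && types.includes(e.type))
        .reduce((total, e) => total + e.amount, 0);
    const gross = sum("DEPOSIT");
    const fees = sum("PROVIDER_FEE", "PLATFORM_FEE");
    const disputed = sum("DISPUTE_HOLD") - sum("DISPUTE_HOLD_LIFTED");
    const released = sum("RELEASE");
    const refunded = sum("REFUND");
    return { gross, fees, disputed, released, refunded, releasable: gross - fees - disputed - released - refunded };
  }

  // A release is refused when it would exceed what the ledger says is releasable.
  release(accountId: string, amount: number, idempotencyKey: string): boolean {
    if (this.seenKeys.has(idempotencyKey)) return false; // retry of a recorded release
    if (amount > this.balance(accountId).releasable) {
      throw new Error("release exceeds releasable balance");
    }
    return this.append({ idempotencyKey, accountId, type: "RELEASE", amount });
  }
}
```

Because a dispute hold is just another entry, an open dispute reduces the releasable amount for every code path that goes through `release`, which is the invariant the mutable `Payment.status` model cannot enforce.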
 ### 3. Escrow State Machine
 Purpose: define legal transitions once.
 Should include:
 - Purchase request states.
 - Payment states.
 - Escrow/funds states.
 - Dispute states.
 - Valid transitions and forbidden transitions.
 - Who or what can trigger each transition.
 - Required preconditions for release, refund, cancellation, dispute hold, and admin override.
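
A state machine of this kind can be written as a single transition table that every code path must consult. The states, actions, and actors below are placeholders chosen for illustration; the real sets would come from the specification this section calls for.

```typescript
type EscrowState = "AWAITING_PAYMENT" | "FUNDED" | "DISPUTED" | "RELEASED" | "REFUNDED";
type Action = "confirmPayment" | "openDispute" | "release" | "refund" | "resolveForSeller" | "resolveForBuyer";
type Actor = "provider" | "buyer" | "seller" | "admin";

interface Rule {
  from: EscrowState;
  action: Action;
  actors: Actor[];
  to: EscrowState;
}

// Anything not listed here is a forbidden transition.
const RULES: Rule[] = [
  { from: "AWAITING_PAYMENT", action: "confirmPayment", actors: ["provider"], to: "FUNDED" },
  { from: "FUNDED", action: "openDispute", actors: ["buyer", "seller"], to: "DISPUTED" },
  { from: "FUNDED", action: "release", actors: ["buyer", "admin"], to: "RELEASED" },
  { from: "FUNDED", action: "refund", actors: ["seller", "admin"], to: "REFUNDED" },
  { from: "DISPUTED", action: "resolveForSeller", actors: ["admin"], to: "RELEASED" },
  { from: "DISPUTED", action: "resolveForBuyer", actors: ["admin"], to: "REFUNDED" },
];

// Returns the next state, or throws when the table has no matching rule.
function transition(state: EscrowState, action: Action, actor: Actor): EscrowState {
  const rule = RULES.find(
    (r) => r.from === state && r.action === action && r.actors.includes(actor),
  );
  if (rule === undefined) {
    throw new Error(`forbidden transition: ${actor} cannot ${action} from ${state}`);
  }
  return rule.to;
}
```

Note that `release` has no rule starting from `DISPUTED`, so an open dispute blocks release by construction rather than by a check scattered across routes.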
 ### 4. Authorization Matrix
 Purpose: remove route-by-route ambiguity.
 Should map every endpoint and socket event to:
 - Public, authenticated, owner, seller, buyer, admin, support, or service role.
 - Required ownership checks.
 - Required object state.
 - Rate-limit tier.
 - Audit-log requirement.
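
The matrix can also live in code as data, so that one function answers every authorization question and unlisted operations are denied by default. The operation names, field names, and the rule that admins bypass ownership checks are all assumptions made for this sketch.

```typescript
type Role = "buyer" | "seller" | "admin" | "support" | "service";

interface MatrixEntry {
  allowed: "public" | "authenticated" | Role[];
  ownerOnly: boolean; // caller must own the target object
  rateTier: string;
  audited: boolean;
}

// Hypothetical operations; the real matrix would list every endpoint and socket event.
const MATRIX: Record<string, MatrixEntry> = {
  "GET /marketplace/listings": { allowed: "public", ownerOnly: false, rateTier: "publicRead", audited: false },
  "GET /payments/:id": { allowed: ["buyer", "seller", "admin"], ownerOnly: true, rateTier: "payment", audited: true },
  "POST /admin/payouts": { allowed: ["admin"], ownerOnly: false, rateTier: "payment", audited: true },
  "socket:join-user-room": { allowed: "authenticated", ownerOnly: true, rateTier: "chat", audited: false },
};

interface Principal {
  userId: string;
  roles: Role[];
}

function authorize(operation: string, principal: Principal | null, resourceOwnerId?: string): boolean {
  const entry: MatrixEntry | undefined = MATRIX[operation];
  if (entry === undefined) return false; // unlisted operations are denied
  if (entry.allowed === "public") return true;
  if (principal === null) return false;
  const roles = principal.roles;
  if (entry.allowed !== "authenticated" && !entry.allowed.some((role) => roles.includes(role))) {
    return false;
  }
  // Assumption for this sketch: admins may act on objects they do not own.
  if (entry.ownerOnly && !roles.includes("admin")) {
    return resourceOwnerId !== undefined && resourceOwnerId === principal.userId;
  }
  return true;
}
```

Keeping `rateTier` and `audited` in the same table means the rate limiter and audit logger can read their configuration from the same source as the authorization check.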
 ### 5. Payment Provider Adapter Spec
 Purpose: decouple business logic from SHKeeper, Request Network, manual wallet flow, and future providers.
 Should define:
 - `createPayInIntent`
 - `getPayInStatus`
 - `handleProviderWebhook`
 - `createHostedPaymentLink`
 - `createReleaseInstruction`
 - `createRefundInstruction`
 - `getPayoutStatus`
 - `searchProviderPayments`
 Provider-specific metadata should be namespaced and never become the canonical funds state.
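
The adapter contract might look like the interface below. The method names come from the list above; every signature, type, and the `FakeProvider` test double are assumptions made to show the shape, and the fake performs no signature verification.

```typescript
type PayInStatus = "pending" | "confirmed" | "failed" | "expired";
type PayoutStatus = "pending" | "sent" | "failed";

interface Money { amountMinor: number; currency: string }

interface ProviderEvent {
  provider: string;
  providerPaymentId: string;
  deliveryId: string;
  status: PayInStatus;
  metadata: Record<string, unknown>; // namespaced by provider, never canonical funds state
}

interface PaymentProviderAdapter {
  readonly name: string;
  createPayInIntent(orderId: string, amount: Money): Promise<{ providerPaymentId: string }>;
  getPayInStatus(providerPaymentId: string): Promise<PayInStatus>;
  handleProviderWebhook(rawBody: string, headers: Record<string, string>): Promise<ProviderEvent>;
  createHostedPaymentLink(providerPaymentId: string): Promise<string>;
  createReleaseInstruction(orderId: string, amount: Money, destination: string): Promise<{ payoutId: string }>;
  createRefundInstruction(orderId: string, amount: Money, destination: string): Promise<{ payoutId: string }>;
  getPayoutStatus(payoutId: string): Promise<PayoutStatus>;
  searchProviderPayments(orderId: string): Promise<string[]>;
}

// In-memory test double; a real adapter would call the provider's API.
class FakeProvider implements PaymentProviderAdapter {
  readonly name = "fake";
  private readonly payments = new Map<string, { orderId: string; status: PayInStatus }>();
  private readonly payouts = new Map<string, PayoutStatus>();
  private counter = 0;

  async createPayInIntent(orderId: string, _amount: Money) {
    const providerPaymentId = `fake-pay-${++this.counter}`;
    this.payments.set(providerPaymentId, { orderId, status: "pending" });
    return { providerPaymentId };
  }
  async getPayInStatus(providerPaymentId: string): Promise<PayInStatus> {
    const payment = this.payments.get(providerPaymentId);
    if (payment === undefined) throw new Error("unknown payment");
    return payment.status;
  }
  async handleProviderWebhook(rawBody: string, _headers: Record<string, string>): Promise<ProviderEvent> {
    const parsed = JSON.parse(rawBody) as { providerPaymentId: string; deliveryId: string; status: PayInStatus };
    const payment = this.payments.get(parsed.providerPaymentId);
    if (payment === undefined) throw new Error("unknown payment");
    payment.status = parsed.status;
    return { provider: this.name, ...parsed, metadata: { fake: { receivedBytes: rawBody.length } } };
  }
  async createHostedPaymentLink(providerPaymentId: string) {
    return `https://pay.example.invalid/${providerPaymentId}`;
  }
  private async createPayout(): Promise<{ payoutId: string }> {
    const payoutId = `fake-payout-${++this.counter}`;
    this.payouts.set(payoutId, "pending");
    return { payoutId };
  }
  async createReleaseInstruction(_orderId: string, _amount: Money, _destination: string) {
    return this.createPayout();
  }
  async createRefundInstruction(_orderId: string, _amount: Money, _destination: string) {
    return this.createPayout();
  }
  async getPayoutStatus(payoutId: string): Promise<PayoutStatus> {
    const status = this.payouts.get(payoutId);
    if (status === undefined) throw new Error("unknown payout");
    return status;
  }
  async searchProviderPayments(orderId: string) {
    return [...this.payments].filter(([, p]) => p.orderId === orderId).map(([id]) => id);
  }
}
```

Business logic written against `PaymentProviderAdapter` can then be tested with the fake and later pointed at a SHKeeper or Request Network adapter without changing the funds rules.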
 ### 6. Webhook Security Spec
 Purpose: prevent forged, replayed, or silently failed provider events.
 Should define:
 - Raw-body signature verification.
 - Accepted headers and algorithms.
 - Replay prevention.
 - Delivery ID/idempotency handling.
 - Unknown payment behavior.
 - Duplicate event behavior.
 - Retry semantics.
 - Dead-letter/replay storage.
 - Alerting thresholds.
 ### 7. Session and Auth Architecture
 Purpose: decide how browser sessions should work for a financial platform.
 Should define:
 - Access token lifetime.
 - Refresh token lifetime and rotation.
 - Whether tokens move from `localStorage` to `httpOnly` cookies.
 - CSRF strategy if cookies are used.
 - Passkey/WebAuthn implementation requirements.
 - OAuth requirements.
 - Device/session revocation.
 - Admin step-up authentication for payouts or role changes.
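
If the decision is to move tokens into cookies, the cookie attributes and a basic CSRF origin check could look like the sketch below. The function names, the cookie name, and the choice of `SameSite=Strict` are assumptions; the attribute names themselves are the standard `Set-Cookie` attributes.

```typescript
interface SessionCookieOptions {
  name: string;
  value: string;
  maxAgeSeconds: number;
  path?: string;
}

// Builds a Set-Cookie header value that JavaScript on the page cannot read.
function buildSessionCookie(opts: SessionCookieOptions): string {
  return [
    `${opts.name}=${encodeURIComponent(opts.value)}`,
    `Max-Age=${opts.maxAgeSeconds}`,
    `Path=${opts.path ?? "/"}`,
    "HttpOnly", // not visible to document.cookie, which limits XSS token theft
    "Secure",   // sent over HTTPS only
    "SameSite=Strict",
  ].join("; ");
}

// Simple CSRF guard: state-changing requests must carry an Origin header that matches.
function isRequestOriginAllowed(method: string, originHeader: string | undefined, allowedOrigin: string): boolean {
  const safeMethods = ["GET", "HEAD", "OPTIONS"];
  if (safeMethods.includes(method.toUpperCase())) return true;
  return originHeader === allowedOrigin;
}
```

Scoping the refresh cookie to a narrow `Path` such as the refresh endpoint keeps it off every other request, which reduces exposure if a different route logs or mishandles headers.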
 ### 8. Realtime Authorization Spec
 Purpose: make Socket.IO events subject to the same security model as REST.
 Should define:
 - Socket handshake authentication.
 - Server-derived room membership.
 - Which rooms exist.
 - Who may join each room.
 - Whether room membership changes with request/payment/dispute state.
 - Event payload privacy rules.
 ### 9. Migration Plan
 Purpose: avoid breaking current payments and historical records.
 Should include:
 - SHKeeper legacy read path.
 - New provider feature flag.
 - Ledger backfill strategy.
 - Data validation report before enforcement.
 - Rollback criteria.
 - Cutover date for old webhook routes.
 - Operator manual reconciliation workflow.
 ### 10. Secure Build and Supply-Chain Policy
 Purpose: reduce npm and dependency compromise risk.
 Should define:
 - Package manager and lockfile policy.
 - CI install mode.
 - Dependency update cadence.
 - Security advisory monitoring.
 - npm provenance/signature policy where available.
 - Secrets handling.
 - Production build reproducibility.
 - Separation of frontend npm risk from backend core risk.
 ### 11. Operational Runbooks
 Purpose: make security incidents and payment failures survivable.
 Should include:
 - Failed webhook.
 - Duplicate payment.
 - Missing payment.
 - Stuck release.
 - Disputed release attempt.
 - Compromised admin.
 - Leaked API key.
 - Provider outage.
 - Chain/RPC outage.
 - Suspicious payment proof.
 - npm/package compromise.
 ## Decision framework
 Use the following questions before choosing a rewrite:
 - Is the current goal safe launch, or long-term platform rebuild?
 - Is the team willing to delay feature work for a payment-core redesign?
 - Can the team maintain Go/Kotlin/Rust in production?
 - Is the biggest current risk supply chain, or incorrect money movement?
 - Are admin actions trusted, or should high-risk actions require step-up approval?
 - Should Amanat custody funds, or should the provider/payment network hold or route them?
 - Are disputes central to the product, or rare manual exceptions?
 - Is auditability a regulatory/business requirement or only an internal safety goal?
 ## Recommended decision
 Near term:
 - Harden the current Express backend.
 - Disable unsafe production routes.
 - Add centralized authorization and rate limiting.
 - Fix Web3 verification.
 - Fix Socket.IO authorization.
 - Disable passkeys unless implemented with real WebAuthn.
 - Begin ledger/state-machine documentation immediately.
 Medium term:
 - Build a provider-neutral payment and funds layer.
 - Add immutable ledger entries.
 - Move release/refund/dispute-hold checks into the central payment/funds service.
 - Keep SHKeeper compatibility read-only for legacy records.
 - Add Request Network or another provider behind the adapter if desired.
 Long term:
 - Rewrite the payment/auth/escrow core in Go or Kotlin/Java if the team can support it.
 - Do not rewrite the entire backend until the core is proven.
 - Keep lower-risk modules in TypeScript until there is a business or operational reason to migrate them.
 ## Open questions for leadership and engineering
 1. Is launch timeline more important than a full payment/funds redesign?
 2. Should passkeys be removed from launch scope until production-grade WebAuthn is implemented?
 3. Should browser auth move to `httpOnly` cookies even if that requires CSRF work and frontend changes?
 4. Should every payout require admin step-up authentication or two-person approval?
 5. Should Amanat keep funds in a platform-controlled escrow wallet, or should provider-mediated payment pages become the default?
 6. Is Request Network a desired provider migration, or just one option being explored?
 7. What new backend stack can the team realistically operate for the next two years?
 8. What is the acceptable level of temporary dual-stack complexity during migration?
 9. Do we need formal external penetration testing before public launch?
 10. Who owns security decisions: product, backend, DevOps, or a dedicated security owner?
 ## Relationship to existing docs
 This assessment complements:
 - [[Platform Logical Audit - 2026-05-24]]
 - [[PRD - Platform Audit Remediation Plan (2026-05-24)]]
 - [[PRD - Request Network Migration and Funds Management]]
 - [[Security Architecture]]
 - [[Payment Flow - SHKeeper]]
 - [[Payment Flow - DePay & Web3]]
 - [[Escrow Flow]]
 - [[Dispute Flow]]
 The existing remediation PRD is the tactical hardening plan. This document is the strategic backend-stack and refactor assessment.
--- a/.taskmaster/tasks/task-4.md
+++ b/.taskmaster/tasks/task-4.md
@@ -0,0 +1,21 @@
 # Task 4: Define backend security and refactor strategy from latest audit
 Status: pending  
 Priority: high  
 Source audit: `.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md`
 Convert the backend stack security/refactor assessment into concrete architecture decisions, documentation deliverables, and developer handoff criteria.
 This is an advisory/architecture task. It should run in parallel with immediate backend hardening rather than block urgent remediation.
 Subtasks:
 1. Assign security ownership and launch decision criteria.
 2. Produce threat model for escrow platform.
 3. Specify funds ledger and escrow state machine.
 4. Create authorization matrix for REST and Socket.IO.
 5. Decide session, passkey, and admin step-up architecture.
 6. Specify webhook security and provider adapter contracts.
 7. Define secure build and supply-chain policy.
 8. Make backend-core stack decision.
 9. Create migration and operational runbooks.
--- a/.taskmaster/tasks/tasks.json
+++ b/.taskmaster/tasks/tasks.json
@@ -8,7 +8,8 @@
      "sourcePrds": [
        ".taskmaster/docs/prd-mermaid-diagram-rendering-stabilization.md",
        ".taskmaster/docs/prd-platform-audit-remediation-plan-2026-05-24.md",
-        ".taskmaster/docs/prd-request-network-migration-and-funds-management.md"
+        ".taskmaster/docs/prd-request-network-migration-and-funds-management.md",
        ".taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md"
      ]
    },
    "tasks": [
@@ -22,9 +23,36 @@
        "status": "done",
        "dependencies": [],
        "subtasks": [
-          { "id": 1, "title": "Fix Security Architecture email/password sequence", "description": "Normalize parser-sensitive sequence text in 01 - Architecture/Security Architecture.md.", "details": "Avoid semicolons and ambiguous inline punctuation in sequence messages.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "mmdc parse for the specific block." },
-          { "id": 2, "title": "Fix authentication login and refresh diagrams", "description": "Normalize parser-sensitive token and refresh-token sequence text in Authentication Flow.", "details": "Split method-like or expression-like message text into parser-safe plain text lines.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "mmdc parse for both Authentication Flow blocks." },
-          { "id": 3, "title": "Fix chat, delivery, dispute, OAuth, purchase request, referral, registration, and seller-offer diagrams", "description": "Clean the remaining Mermaid sequence diagrams with invalid or ambiguous syntax.", "details": "Split multi-recipient arrows, remove parser-conflicting semicolon/expression text, and keep intent unchanged.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "Full vault mmdc parser sweep across all Mermaid blocks." }
+          {
+            "id": 1,
+            "title": "Fix Security Architecture email/password sequence",
            "description": "Normalize parser-sensitive sequence text in 01 - Architecture/Security Architecture.md.",
            "details": "Avoid semicolons and ambiguous inline punctuation in sequence messages.",
            "status": "done",
            "priority": "medium",
            "dependencies": [],
            "testStrategy": "mmdc parse for the specific block."
          },
          {
            "id": 2,
            "title": "Fix authentication login and refresh diagrams",
            "description": "Normalize parser-sensitive token and refresh-token sequence text in Authentication Flow.",
            "details": "Split method-like or expression-like message text into parser-safe plain text lines.",
            "status": "done",
            "priority": "medium",
            "dependencies": [],
            "testStrategy": "mmdc parse for both Authentication Flow blocks."
          },
          {
            "id": 3,
            "title": "Fix chat, delivery, dispute, OAuth, purchase request, referral, registration, and seller-offer diagrams",
            "description": "Clean the remaining Mermaid sequence diagrams with invalid or ambiguous syntax.",
            "details": "Split multi-recipient arrows, remove parser-conflicting semicolon/expression text, and keep intent unchanged.",
            "status": "done",
            "priority": "medium",
            "dependencies": [],
            "testStrategy": "Full vault mmdc parser sweep across all Mermaid blocks."
          }
        ]
      },
      {
@@ -37,13 +65,94 @@
        "status": "pending",
        "dependencies": [],
        "subtasks": [
-          { "id": 1, "title": "Secure unauthenticated endpoints and owner enforcement", "description": "Require authenticateToken and owner/admin checks on exposed payment, AI, and legacy notification routes.", "details": "Derive notification userId from authenticated principal. Protect payment history and mutation endpoints. Restrict AI calls to authenticated users with per-user budgets. Add denied-access audit logs.", "status": "pending", "priority": "high", "dependencies": [], "testStrategy": "Unauthorized callers receive 401/403; users cannot access or mutate other users' payments/notifications; admins retain authorized access." },
-          { "id": 2, "title": "Re-enable and scope rate limiting", "description": "Restore global and route-tiered rate limits for public-sensitive paths.", "details": "Use stricter limits for auth, financial, AI, file upload, and verification paths. Keep public reads at relaxed limits. Add observability for 429 spikes.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Exercise configured limits per tier and confirm expected 429 responses without blocking ordinary reads." },
-          { "id": 3, "title": "Replace stubbed passkey/WebAuthn flow", "description": "Implement production-grade WebAuthn registration/authentication and shared challenge storage.", "details": "Use real attestation/assertion verification, Redis-backed TTL challenges, refresh-token persistence/rotation, and deterministic malformed/reused/expired challenge errors.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Registration, login, replay, expired challenge, and refresh-token continuity tests pass." },
-          { "id": 4, "title": "Strengthen DePay/Web3 payment verification", "description": "Verify transaction recipient, token contract, and amount, not only receipt success.", "details": "Decode ERC-20 Transfer logs, compare recipient against escrow address, validate token contract and decimals-adjusted minimum amount, store verifier evidence and idempotency fingerprint.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Reject successful but wrong-recipient/wrong-token/underpaid tx hashes; accept only matching transfers." },
-          { "id": 5, "title": "Lock Socket.IO room joins to authenticated context", "description": "Remove trust in client-supplied user/buyer/seller room IDs.", "details": "Validate socket handshake token, derive server-side room membership, reject mismatched joins, and monitor suspicious join attempts.", "status": "pending", "priority": "medium", "dependencies": [1], "testStrategy": "A user cannot subscribe to another user's rooms; legitimate realtime notifications still arrive." },
-          { "id": 6, "title": "Enforce dispute hold before payout and release operations", "description": "Add payment hold state and central release/refund guards that block disputed funds.", "details": "Introduce explicit dispute hold fields or state, enforce in PaymentCoordinator and payout/release services, return clear 409/423 responses, and backfill/report blocked payments.", "status": "pending", "priority": "medium", "dependencies": [1, 4], "testStrategy": "Open dispute blocks release/refund until resolved or explicitly overridden through authorized path." },
-          { "id": 7, "title": "Align documentation, API references, and runtime enums", "description": "Normalize disputed/payment/request status docs and implementation references after security behavior changes.", "details": "Resolve mismatch around absent dispute module, endpoint names, status enums, and action names across Data Models, API Reference, and Flows.", "status": "pending", "priority": "medium", "dependencies": [1, 2, 3, 4, 5, 6], "testStrategy": "Docs match implemented routes, models, enum values, and state transitions." }
+          {
+            "id": 1,
+            "title": "Secure unauthenticated endpoints and owner enforcement",
+            "description": "Require authenticateToken and owner/admin checks on exposed payment, AI, and legacy notification routes.",
+            "details": "Derive notification userId from authenticated principal. Protect payment history and mutation endpoints. Restrict AI calls to authenticated users with per-user budgets. Add denied-access audit logs.",
+            "status": "pending",
+            "priority": "high",
            "dependencies": [],
            "testStrategy": "Unauthorized callers receive 401/403; users cannot access or mutate other users' payments/notifications; admins retain authorized access."
          },
          {
            "id": 2,
            "title": "Re-enable and scope rate limiting",
            "description": "Restore global and route-tiered rate limits for public-sensitive paths.",
            "details": "Use stricter limits for auth, financial, AI, file upload, and verification paths. Keep public reads at relaxed limits. Add observability for 429 spikes.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Exercise configured limits per tier and confirm expected 429 responses without blocking ordinary reads."
          },
          {
            "id": 3,
            "title": "Replace stubbed passkey/WebAuthn flow",
            "description": "Implement production-grade WebAuthn registration/authentication and shared challenge storage.",
            "details": "Use real attestation/assertion verification, Redis-backed TTL challenges, refresh-token persistence/rotation, and deterministic malformed/reused/expired challenge errors.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Registration, login, replay, expired challenge, and refresh-token continuity tests pass."
          },
          {
            "id": 4,
            "title": "Strengthen DePay/Web3 payment verification",
            "description": "Verify transaction recipient, token contract, and amount, not only receipt success.",
            "details": "Decode ERC-20 Transfer logs, compare recipient against escrow address, validate token contract and decimals-adjusted minimum amount, store verifier evidence and idempotency fingerprint.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Reject successful but wrong-recipient/wrong-token/underpaid tx hashes; accept only matching transfers."
          },
          {
            "id": 5,
            "title": "Lock Socket.IO room joins to authenticated context",
            "description": "Remove trust in client-supplied user/buyer/seller room IDs.",
            "details": "Validate socket handshake token, derive server-side room membership, reject mismatched joins, and monitor suspicious join attempts.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              1
            ],
            "testStrategy": "A user cannot subscribe to another user's rooms; legitimate realtime notifications still arrive."
          },
          {
            "id": 6,
            "title": "Enforce dispute hold before payout and release operations",
            "description": "Add payment hold state and central release/refund guards that block disputed funds.",
            "details": "Introduce explicit dispute hold fields or state, enforce in PaymentCoordinator and payout/release services, return clear 409/423 responses, and backfill/report blocked payments.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              1,
              4
            ],
            "testStrategy": "Open dispute blocks release/refund until resolved or explicitly overridden through authorized path."
          },
          {
            "id": 7,
            "title": "Align documentation, API references, and runtime enums",
            "description": "Normalize disputed/payment/request status docs and implementation references after security behavior changes.",
            "details": "Resolve mismatch around absent dispute module, endpoint names, status enums, and action names across Data Models, API Reference, and Flows.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              1,
              2,
              3,
              4,
              5,
              6
            ],
            "testStrategy": "Docs match implemented routes, models, enum values, and state transitions."
          }
        ]
      },
      {
@@ -54,15 +163,222 @@
        "testStrategy": "Use feature flags, provider fixture tests, webhook signature/idempotency tests, ledger invariant tests, migration dry-run reports, and limited cohort rollout before default provider switch.",
        "priority": "high",
        "status": "pending",
-        "dependencies": [2],
+        "dependencies": [
          2
        ],
        "subtasks": [
-          { "id": 1, "title": "Introduce provider-neutral payment adapter", "description": "Decouple checkout, webhook, and payout flows from SHKeeper-specific routes and metadata.", "details": "Define createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, and searchProviderPayments. Add provider values shkeeper, request_network, manual, admin_wallet and PAYMENT_PROVIDER feature flag.", "status": "pending", "priority": "high", "dependencies": [], "testStrategy": "New provider can be selected by feature flag while existing SHKeeper payments remain readable and process late webhooks." },
+          {
-          { "id": 2, "title": "Implement Request Network pay-in integration", "description": "Create Request Network payment requests or Secure Payment Pages for new checkout flows.", "details": "Store requestId, paymentReference, securePaymentUrl, token, merchantReference, network, invoiceCurrency, and paymentCurrency. Validate supported networks/currencies before creating links.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Buyer receives hosted payment URL; webhook reconciles matching internal payment only after amount/currency/reference validation." },
+            "id": 1,
-          { "id": 3, "title": "Add funds ledger and escrow state machine", "description": "Introduce internal funds accounting independent from provider metadata.", "details": "Add FundsAccount, LedgerEntry, derived FundsBalance, expected/held/releasable/releasing/released/refunded/disputed/failed states, fee representation, and release/refund invariant checks.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Every pay-in creates immutable ledger entries and payout/refund cannot exceed available held funds or bypass dispute holds." },
+            "title": "Introduce provider-neutral payment adapter",
-          { "id": 4, "title": "Build Request Network webhook and reconciliation service", "description": "Process signed Request Network events and repair missed webhook state through reconciliation.", "details": "Add /api/payment/request-network/webhook, verify raw-body x-request-network-signature, store delivery ID/retry/event/request/payment reference/payload hash, support test webhooks, and add scheduled payment search/status reconciliation.", "status": "pending", "priority": "high", "dependencies": [2, 3], "testStrategy": "Invalid signatures reject; duplicate delivery IDs acknowledge without duplicate ledger entries; reconciliation repairs missed state." },
+            "description": "Decouple checkout, webhook, and payout flows from SHKeeper-specific routes and metadata.",
-          { "id": 5, "title": "Implement release, refund, and payout orchestration", "description": "Replace SHKeeper payout tasks and simulated release with auditable transaction instruction and confirmation flows.", "details": "Create release/refund service consuming ledger balances, generate Request Network payout or direct admin wallet instructions, store unsigned tx payloads, signer, submitted hash, confirmation status, provider status, and require admin/operator authorization plus dispute checks.", "status": "pending", "priority": "high", "dependencies": [3, 4], "testStrategy": "Release cannot occur if unpaid, already released, refunded, or disputed; tx hash confirmation updates ledger once; admin can retry/cancel safely." },
+            "details": "Define createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, and searchProviderPayments. Add provider values shkeeper, request_network, manual, admin_wallet and PAYMENT_PROVIDER feature flag.",
-          { "id": 6, "title": "Migrate frontend checkout and admin payment UI", "description": "Update buyer checkout, admin release, seller payout, and payment details for provider-neutral Request Network flows.", "details": "Replace ShkeeperPayment with CryptoPayment/RequestNetworkPayment redirect flow, keep legacy SHKeeper only for legacy records, replace ShkeeperPayout with release queue/admin payout UI, and show provider IDs, payment references, hosted links, ledger balances, webhook/reconciliation status.", "status": "pending", "priority": "medium", "dependencies": [2, 3, 5], "testStrategy": "Request Network checkout does not expect walletAddress; admin UI blocks unsafe release; legacy labels are hidden for Request Network records." },
+            "status": "pending",
-          { "id": 7, "title": "Backfill legacy SHKeeper records and decommission provider-specific code", "description": "Migrate historical SHKeeper payment metadata and safely remove legacy wallet monitor/webhook/payout paths after cutoff.", "details": "Backfill provider namespace, create ledger entries for trusted completed SHKeeper payments, mark legacyProvider, keep webhook tail period, and produce decommission checklist for env vars, docs, labels, routes, and runbooks.", "status": "pending", "priority": "medium", "dependencies": [3, 4, 5, 6], "testStrategy": "Dry-run report includes total, migrated, skipped, ambiguous, failed; no historical transaction hash/invoice/task metadata is lost." }
+            "priority": "high",
            "dependencies": [],
            "testStrategy": "New provider can be selected by feature flag while existing SHKeeper payments remain readable and process late webhooks."
          },
          {
            "id": 2,
            "title": "Implement Request Network pay-in integration",
            "description": "Create Request Network payment requests or Secure Payment Pages for new checkout flows.",
            "details": "Store requestId, paymentReference, securePaymentUrl, token, merchantReference, network, invoiceCurrency, and paymentCurrency. Validate supported networks/currencies before creating links.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Buyer receives hosted payment URL; webhook reconciles matching internal payment only after amount/currency/reference validation."
          },
          {
            "id": 3,
            "title": "Add funds ledger and escrow state machine",
            "description": "Introduce internal funds accounting independent from provider metadata.",
            "details": "Add FundsAccount, LedgerEntry, derived FundsBalance, expected/held/releasable/releasing/released/refunded/disputed/failed states, fee representation, and release/refund invariant checks.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Every pay-in creates immutable ledger entries and payout/refund cannot exceed available held funds or bypass dispute holds."
          },
          {
            "id": 4,
            "title": "Build Request Network webhook and reconciliation service",
            "description": "Process signed Request Network events and repair missed webhook state through reconciliation.",
            "details": "Add /api/payment/request-network/webhook, verify raw-body x-request-network-signature, store delivery ID/retry/event/request/payment reference/payload hash, support test webhooks, and add scheduled payment search/status reconciliation.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              2,
              3
            ],
            "testStrategy": "Invalid signatures reject; duplicate delivery IDs acknowledge without duplicate ledger entries; reconciliation repairs missed state."
          },
          {
            "id": 5,
            "title": "Implement release, refund, and payout orchestration",
            "description": "Replace SHKeeper payout tasks and simulated release with auditable transaction instruction and confirmation flows.",
            "details": "Create release/refund service consuming ledger balances, generate Request Network payout or direct admin wallet instructions, store unsigned tx payloads, signer, submitted hash, confirmation status, provider status, and require admin/operator authorization plus dispute checks.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              3,
              4
            ],
            "testStrategy": "Release cannot occur if unpaid, already released, refunded, or disputed; tx hash confirmation updates ledger once; admin can retry/cancel safely."
          },
          {
            "id": 6,
            "title": "Migrate frontend checkout and admin payment UI",
            "description": "Update buyer checkout, admin release, seller payout, and payment details for provider-neutral Request Network flows.",
            "details": "Replace ShkeeperPayment with CryptoPayment/RequestNetworkPayment redirect flow, keep legacy SHKeeper only for legacy records, replace ShkeeperPayout with release queue/admin payout UI, and show provider IDs, payment references, hosted links, ledger balances, webhook/reconciliation status.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              2,
              3,
              5
            ],
            "testStrategy": "Request Network checkout does not expect walletAddress; admin UI blocks unsafe release; legacy labels are hidden for Request Network records."
          },
          {
            "id": 7,
            "title": "Backfill legacy SHKeeper records and decommission provider-specific code",
            "description": "Migrate historical SHKeeper payment metadata and safely remove legacy wallet monitor/webhook/payout paths after cutoff.",
            "details": "Backfill provider namespace, create ledger entries for trusted completed SHKeeper payments, mark legacyProvider, keep webhook tail period, and produce decommission checklist for env vars, docs, labels, routes, and runbooks.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              3,
              4,
              5,
              6
            ],
            "testStrategy": "Dry-run report includes total, migrated, skipped, ambiguous, failed; no historical transaction hash/invoice/task metadata is lost."
          }
        ]
      },
      {
        "id": 4,
        "title": "Define backend security and refactor strategy from latest audit",
        "description": "Convert the backend stack security/refactor assessment into concrete architecture decisions, documentation deliverables, and developer handoff criteria.",
        "details": "Source audit: .taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md. This task is advisory/architecture-focused and should run in parallel with immediate hardening. It should produce the decision artifacts needed before any backend-core rewrite or provider migration is started.",
        "testStrategy": "Review and sign off each architecture document with backend, payments, frontend, and operations stakeholders. Confirm every open question has an owner or explicit deferred decision before implementation work begins.",
        "priority": "high",
        "status": "pending",
        "dependencies": [],
        "subtasks": [
          {
            "id": 1,
            "title": "Assign security ownership and launch decision criteria",
            "description": "Define who owns security decisions and what must be true before public launch or migration work proceeds.",
            "details": "Answer ownership questions from the audit: security owner, launch safety bar, whether launch prioritizes hardening or redesign, and whether external penetration testing is required.",
            "status": "pending",
            "priority": "high",
            "dependencies": [],
            "testStrategy": "Written owner/RACI and launch gate checklist are accepted by leadership and engineering."
          },
          {
            "id": 2,
            "title": "Produce threat model for escrow platform",
            "description": "Document protected assets, actors, trust boundaries, and abuse cases for the financial marketplace.",
            "details": "Include buyer, seller, admin, support, unauthenticated attacker, compromised user/admin, provider, malicious webhook sender, browser/backend/database/Redis/provider/wallet/Socket.IO trust boundaries, and abuse cases such as fake payment proof, replayed webhook, arbitrary room join, stolen token, double payout, dispute bypass, email abuse, and AI abuse.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              1
            ],
            "testStrategy": "Threat model maps each high-risk finding to at least one mitigation task or accepted risk."
          },
          {
            "id": 3,
            "title": "Specify funds ledger and escrow state machine",
            "description": "Define canonical money movement and legal state transitions before refactor or provider migration.",
            "details": "Create specs for FundsAccount, LedgerEntry, FundsBalance, gross paid, provider fees, platform fees, held, disputed, releasable, released, refunded, idempotency keys, reconciliation behavior, purchase request states, payment states, escrow/funds states, dispute states, valid transitions, forbidden transitions, and release/refund/admin override preconditions.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              2
            ],
            "testStrategy": "Spec can be used to reject double-release, release-during-dispute, underfunded payout, and ambiguous provider-event scenarios."
          },
          {
            "id": 4,
            "title": "Create authorization matrix for REST and Socket.IO",
            "description": "Map every endpoint and realtime event to access level, ownership checks, state preconditions, rate-limit tier, and audit-log requirement.",
            "details": "Include public/authenticated/owner/buyer/seller/admin/support/service-role classifications. Socket.IO rooms must be server-derived from authenticated identity, not client-supplied user IDs.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              2
            ],
            "testStrategy": "No route or socket event remains unmapped; implementation tasks can reference matrix rows directly."
          },
          {
            "id": 5,
            "title": "Decide session, passkey, and admin step-up architecture",
            "description": "Choose browser session model and high-risk admin authentication requirements.",
            "details": "Decide localStorage versus httpOnly cookies, access/refresh token lifetimes, CSRF strategy, refresh rotation, WebAuthn requirements, OAuth requirements, device/session revocation, and whether payouts/role changes require step-up authentication or two-person approval.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              2
            ],
            "testStrategy": "Decision record lists chosen model, rejected alternatives, migration cost, and required implementation tasks."
          },
          {
            "id": 6,
            "title": "Specify webhook security and provider adapter contracts",
            "description": "Define provider-neutral payment interface and signed webhook processing rules.",
            "details": "Document createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, searchProviderPayments, raw-body signature verification, replay prevention, delivery ID idempotency, duplicate/unknown event behavior, retry semantics, dead-letter/replay storage, and alert thresholds.",
            "status": "pending",
            "priority": "high",
            "dependencies": [
              3
            ],
            "testStrategy": "Contracts cover SHKeeper legacy, Request Network, manual/admin wallet, invalid signatures, duplicate deliveries, and missed webhook reconciliation."
          },
          {
            "id": 7,
            "title": "Define secure build and supply-chain policy",
            "description": "Reduce npm/dependency compromise risk across frontend and any remaining Node services.",
            "details": "Specify package manager and lockfile policy, CI install mode, dependency update cadence, advisory monitoring, npm provenance/signature policy where available, secrets handling, reproducible production builds, and separation between frontend npm risk and backend-core risk.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              1
            ],
            "testStrategy": "Policy is actionable in CI and includes response steps for compromised package, leaked token, and vulnerable dependency alerts."
          },
          {
            "id": 8,
            "title": "Make backend-core stack decision",
            "description": "Choose whether the security-critical backend core remains TypeScript or moves to Go/Kotlin/Rust/Python.",
            "details": "Evaluate team capability, two-year maintainability, operational footprint, rewrite cost, dual-stack complexity, auditability, supply-chain exposure, and which modules belong in a payment/auth/escrow core versus the existing marketplace/chat API.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              2,
              3,
              4,
              5,
              6,
              7
            ],
            "testStrategy": "Architecture decision record states chosen stack, scope of extraction, non-goals, migration phases, rollback criteria, and owners."
          },
          {
            "id": 9,
            "title": "Create migration and operational runbooks",
            "description": "Document rollout, rollback, and incident response for the selected backend/funds architecture.",
            "details": "Include SHKeeper legacy read path, provider feature flag, ledger backfill, validation report before enforcement, rollback criteria, webhook cutoff, manual reconciliation, failed webhook, duplicate/missing payment, stuck release, disputed release attempt, compromised admin, leaked API key, provider outage, chain/RPC outage, suspicious payment proof, and npm/package compromise.",
            "status": "pending",
            "priority": "medium",
            "dependencies": [
              8
            ],
            "testStrategy": "Runbooks identify owner, trigger, detection signal, immediate action, recovery action, and post-incident documentation for each scenario."
          }
        ]
      }
    ]