diff --git a/.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md b/.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md new file mode 100644 index 0000000..175e2eb --- /dev/null +++ b/.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md @@ -0,0 +1,560 @@ +--- +title: Backend Stack Security and Refactor Assessment +tags: [audit, security, backend, architecture, payments, refactor] +created: 2026-05-24 +status: advisory +--- + +# Backend Stack Security and Refactor Assessment + +## Purpose + +This document records an advisory assessment of whether Amanat should keep the current Node/Express backend, harden it in place, or migrate at least the security-critical backend surface to another technology stack. + +The conclusion is intentionally strategic rather than implementation-heavy. It should be used as input for architecture review, security planning, and refactor scoping. + +## Executive summary + +Amanat is not a normal CRUD marketplace. It is a financial escrow platform with authentication, realtime communication, crypto payment intake, payout/release flows, provider webhooks, and dispute-sensitive fund movement. + +The main security risk is not simply "Node is insecure." The larger issue is that the current backend mixes high-risk financial state transitions, webhook handling, realtime room membership, admin operations, test/demo endpoints, and ordinary marketplace APIs in one Express application. + +Moving away from Node/Express may reduce npm supply-chain exposure and improve long-term auditability, but it will not automatically fix the most important risks. The immediate priority should be to define and enforce the correct security architecture: + +- A canonical funds ledger. +- A strict escrow/payment/dispute state machine. +- Centralized authorization and ownership checks. +- Signed webhook handling with idempotency. +- Server-derived realtime authorization. +- Secure session handling. +- A provider-neutral payment abstraction. + +Recommended approach: + +1. Harden the existing backend immediately. +2. Define the target payment, ledger, and auth architecture in documentation. +3. Extract or rewrite only the security-critical backend core if the team can support the new stack. +4. Keep lower-risk marketplace, chat, notification, and dashboard APIs in TypeScript until the core is stable. + +Default recommendation: do not rewrite the entire backend at once. If a rewrite is chosen, start with payment/auth/escrow core services, preferably in Go or Kotlin/Java, while preserving current product behavior behind stable API contracts. + +## Current system profile + +Observed architecture: + +- Frontend: Next.js, React, MUI, Web3, Socket.IO client. +- Backend: Express 5, TypeScript, Mongoose, Socket.IO, SHKeeper, Web3 transaction verification, SMTP, OpenAI integration. +- Storage: MongoDB and Redis, though Redis is not consistently used as a shared state authority for all security-sensitive flows. +- Realtime: Socket.IO rooms for user, buyer, seller, chat, and purchase-request updates. +- Payments: SHKeeper pay-in, SHKeeper payout, decentralized/Web3 payment verification, manual/admin payout paths. +- Docs: existing logical audit and remediation documents already identify several critical flaws. + +The backend currently acts as: + +- API server. +- Realtime server. +- Payment orchestrator. +- Webhook processor. +- Background-job runner. +- File upload server. +- Auth/session issuer. +- Admin operations surface. + +That is too much responsibility in one process for a financial platform unless the architecture is very tightly controlled. + +## Code-backed security observations + +These findings are consistent with the existing audit docs and representative source review. + +### Payment and funds risks + +- Payment state is largely represented by mutable `Payment.status` and `escrowState` fields rather than an immutable funds ledger. +- Pay-in, manual confirmation, wallet monitoring, webhook handling, and payout flows can converge on the same records through different paths. +- Release/refund eligibility is not fully centralized around ledger invariants. +- The existing docs identify a dispute/escrow race: disputes do not reliably create an enforceable hold before release. +- `Payment` uses mixed/string-compatible references for some core links, reducing referential integrity and query safety. +- Some payment mutation/history routes were exposed without sufficient authentication or ownership enforcement. +- Web3 verification has been documented as relying primarily on transaction receipt success rather than strict token, recipient, and amount verification. + +Security implication: a backend stack change alone will not fix this. The platform needs a funds ledger and state machine first. + +### Authentication and session risks + +- Browser tokens are stored in `localStorage`, increasing impact from XSS. +- Passkey/WebAuthn behavior is described in the audit docs as stubbed/incomplete and challenge storage is process-local. +- Refresh-token behavior differs between auth paths. +- Admin-sensitive routes need explicit role enforcement, not just authentication. + +Security implication: migration should include a session architecture decision, not just a framework change. + +### Realtime risks + +- Socket.IO room joins are client-driven by IDs such as `join-user-room`, `join-buyer-room`, and `join-seller-room`. +- The server should derive room membership from authenticated socket identity, not trust client-supplied user IDs. + +Security implication: realtime authorization needs to be treated like API authorization. + +### Rate limiting and abuse controls + +- Global rate limiting is explicitly disabled in the Express app. +- Sensitive paths need tiered limits: auth, verification, file upload, AI, payment, webhook, chat. +- AI endpoints and email endpoints can create cost or abuse exposure if not authenticated and rate-limited. + +Security implication: this is an immediate hardening task regardless of backend stack. + +### Webhook and provider risks + +- Webhooks must be verified using raw-body signatures, not reconstructed JSON when signatures depend on raw bytes. +- Webhook delivery must be idempotent. +- Unknown, duplicate, malformed, and failed webhooks should be visible in structured records or dead-letter storage. +- Provider callbacks should create reconciliation events, not directly release funds. + +Security implication: payment provider integration should be isolated behind a provider-neutral service contract. + +### Supply-chain risks + +The Node/npm ecosystem has real and recurring supply-chain risk. For this codebase, that risk matters because both frontend and backend depend heavily on npm packages. + +Relevant 2026 context: + +- Express published February 2026 security releases, including high-severity Multer issues affecting versions before 2.1.0. The backend manifest currently specifies `multer: ^2.0.2`, so the resolved lockfile version should be reviewed and updated if necessary. +- Node.js published March 2026 security releases across active release lines. +- Microsoft reported an Axios npm supply-chain compromise in March 2026. This project uses Axios on frontend and backend. +- TanStack published a May 2026 npm compromise postmortem. This project uses `@tanstack/react-query`. + +References: + +- Express security release, 2026-02-27: https://expressjs.com/2026/02/27/security-releases.html +- Node.js March 2026 security releases: https://nodejs.org/en/blog/vulnerability/march-2026-security-releases +- Microsoft on Axios npm supply-chain compromise: https://www.microsoft.com/en-us/security/blog/2026/04/01/mitigating-the-axios-npm-supply-chain-compromise/ +- TanStack npm supply-chain compromise postmortem: https://tanstack.com/blog/npm-supply-chain-compromise-postmortem + +Security implication: npm supply-chain controls are required even if the backend is rewritten, because the frontend remains npm-based. + +## Should the backend move away from Node/Express? + +### Reasons to keep and harden first + +- The product already exists and has working business flows. +- A full rewrite risks reintroducing escrow/payment bugs. +- The most dangerous issues are domain/state/authorization issues, not syntax or framework issues. +- Hardening can reduce immediate exposure faster than a rewrite. +- The team may currently be more productive in TypeScript. + +### Reasons to migrate at least the backend core + +- Financial backend code benefits from a smaller, stricter dependency footprint. +- Payment, ledger, webhook, and payout flows need strong invariants and auditability. +- Express makes it easy to accumulate route-level exceptions, test endpoints, and inconsistent middleware. +- Node/npm supply-chain exposure is material and recurring. +- TypeScript runtime enforcement is limited unless paired with strict schema validation everywhere. +- A separate payment core can be more easily audited, threat-modeled, tested, and locked down. + +### Balanced conclusion + +It is security-wise reasonable to move the highest-risk backend core away from Node/Express, but only after the target security model is specified. + +Do not begin with a full product rewrite. Begin with a security-critical core extraction: + +- Auth/session/token authority. +- Payment intent creation. +- Provider webhook processing. +- Funds ledger and reconciliation. +- Release/refund/dispute-hold enforcement. +- Admin payout approval and audit logging. + +Keep lower-risk modules in the current stack until the core is stable: + +- Marketplace browsing/listing. +- Request templates. +- Chat and notifications, after socket authorization fixes. +- Admin dashboard reads. +- File upload, after hardening or moving to object storage. + +## Stack options + +### Go + +Best fit if the team wants a smaller, operationally simple, security-focused payment core. + +Strengths: + +- Small binaries and deployment footprint. +- Lower dependency surface than typical Node services. +- Strong standard library for HTTP, crypto, JSON, and concurrency. +- Good fit for webhook receivers, ledger services, workers, and reconciliation jobs. +- Easy to run static analysis and produce reproducible builds. + +Weaknesses: + +- Less ergonomic than TypeScript for rapid product iteration. +- Requires team comfort with Go idioms. +- API/schema generation must be designed deliberately. + +Assessment: recommended first choice for a payment/ledger/auth core if the team can maintain Go. + +### Kotlin/Java with Spring Boot + +Best fit if the team wants enterprise-grade structure, mature auth patterns, and strong ecosystem support. + +Strengths: + +- Mature security and validation ecosystem. +- Strong typing and tooling. +- Good for complex domain services and audit-heavy systems. +- Well-understood operational patterns. + +Weaknesses: + +- Heavier runtime and framework footprint. +- More ceremony. +- Slower iteration for a small team. + +Assessment: strong choice for a larger engineering team or enterprise-style compliance needs. + +### Rust + +Best fit if maximum memory safety and correctness are worth slower delivery. + +Strengths: + +- Strong compile-time safety. +- Good for cryptographic and high-assurance components. +- Very low runtime footprint. + +Weaknesses: + +- Higher implementation cost. +- Smaller hiring pool. +- Web API development may be slower. + +Assessment: attractive for narrow cryptographic or transaction-verification components, but probably too costly for the whole backend unless the team is already strong in Rust. + +### Python/FastAPI + +Best fit if rapid backend development and clean API typing are more important than strict compile-time guarantees. + +Strengths: + +- Fast development. +- Good validation with Pydantic. +- Good for admin tools and internal services. + +Weaknesses: + +- Supply-chain risk remains. +- Runtime typing and async behavior require discipline. +- Less compelling than Go/Kotlin for a financial core. + +Assessment: acceptable for internal services, not the preferred payment-core target. + +### Continue TypeScript/Node with stronger architecture + +Best fit if team capacity cannot support another backend language yet. + +Required conditions: + +- Strict route registration policy. +- Runtime validation on every boundary. +- No test/demo routes in production builds. +- Full lockfile and package provenance controls. +- Centralized auth, ownership, and role guards. +- Ledger-first payment architecture. +- Secure cookies or a documented token-storage risk acceptance. +- Socket auth middleware. +- Redis-backed challenge/idempotency/rate-limit storage. + +Assessment: viable short term, but the security bar must be raised significantly. + +## Recommended target architecture + +### Phase 0: Immediate containment + +Goal: reduce current high-risk exposure without broad redesign. + +Actions: + +- Disable or protect test/demo payment and email endpoints in production. +- Require authentication and ownership checks on all payment, notification, AI, and file routes. +- Re-enable rate limiting with stricter limits on auth, payment, AI, file upload, and webhook paths. +- Add admin role checks to admin routes. +- Stop accepting arbitrary `userId` from clients for private data. +- Validate all payment mutations through centralized service methods. +- Lock Socket.IO room membership to server-verified identity. +- Review and update lockfiles for known vulnerable packages. +- Rotate any committed or publicly visible secrets. + +### Phase 1: Architecture specification + +Goal: define the new security model before implementation. + +Documents to produce are listed in the "Required documentation" section below. + +### Phase 2: Payment and ledger extraction + +Goal: move funds logic behind a provider-neutral service. + +Introduce: + +- `FundsAccount` +- `LedgerEntry` +- `FundsBalance` +- `PaymentIntent` +- `PaymentProviderEvent` +- `ReleaseInstruction` +- `RefundInstruction` +- `DisputeHold` + +Key rule: provider webhooks do not directly release funds. They create verified events and ledger entries. + +### Phase 3: Backend-core rewrite or service split + +Goal: decide whether the extracted core remains TypeScript or moves to Go/Kotlin. + +Recommended split: + +- `core-payments`: payment intent, webhook, ledger, release/refund, reconciliation. +- `core-auth`: sessions, passkeys, OAuth, token issuance, session revocation. +- `marketplace-api`: purchase requests, offers, categories, templates. +- `realtime-api`: chat, notifications, socket rooms. + +The split can be logical first, physical later. + +### Phase 4: Full migration only if justified + +Goal: avoid rewriting stable lower-risk product surfaces prematurely. + +Only consider full backend migration after: + +- Payment core is stable. +- Auth/session model is stable. +- API contracts are documented and tested. +- Legacy payment records are migrated or safely read-only. +- Team has demonstrated production maintenance ability in the new stack. + +## Required documentation before refactor + +### 1. Threat Model + +Purpose: identify what must be protected and how it can be attacked. + +Should include: + +- Assets: user accounts, admin accounts, wallet addresses, payment records, funds, webhook secrets, API keys, private notifications. +- Actors: buyer, seller, admin, support, unauthenticated attacker, compromised user, compromised admin, provider, malicious webhook sender. +- Trust boundaries: browser, backend, database, Redis, provider APIs, wallet/RPC, admin UI, Socket.IO. +- Abuse cases: fake payment proof, replayed webhook, arbitrary room join, stolen token, double payout, dispute bypass, email/AI abuse. + +### 2. Funds Ledger Specification + +Purpose: make money movement auditable and provider-independent. + +Should define: + +- Account model per purchase request/order. +- Immutable ledger entry types. +- Derived balance model. +- Gross amount, provider fees, platform fees, held amount, disputed amount, releasable amount, released amount, refunded amount. +- Idempotency keys. +- Reconciliation behavior. + +### 3. Escrow State Machine + +Purpose: define legal transitions once. + +Should include: + +- Purchase request states. +- Payment states. +- Escrow/funds states. +- Dispute states. +- Valid transitions and forbidden transitions. +- Who or what can trigger each transition. +- Required preconditions for release, refund, cancellation, dispute hold, and admin override. + +### 4. Authorization Matrix + +Purpose: remove route-by-route ambiguity. + +Should map every endpoint and socket event to: + +- Public, authenticated, owner, seller, buyer, admin, support, or service role. +- Required ownership checks. +- Required object state. +- Rate-limit tier. +- Audit-log requirement. + +### 5. Payment Provider Adapter Spec + +Purpose: decouple business logic from SHKeeper, Request Network, manual wallet flow, and future providers. + +Should define: + +- `createPayInIntent` +- `getPayInStatus` +- `handleProviderWebhook` +- `createHostedPaymentLink` +- `createReleaseInstruction` +- `createRefundInstruction` +- `getPayoutStatus` +- `searchProviderPayments` + +Provider-specific metadata should be namespaced and never become the canonical funds state. + +### 6. Webhook Security Spec + +Purpose: prevent forged, replayed, or silently failed provider events. + +Should define: + +- Raw-body signature verification. +- Accepted headers and algorithms. +- Replay prevention. +- Delivery ID/idempotency handling. +- Unknown payment behavior. +- Duplicate event behavior. +- Retry semantics. +- Dead-letter/replay storage. +- Alerting thresholds. + +### 7. Session and Auth Architecture + +Purpose: decide how browser sessions should work for a financial platform. + +Should define: + +- Access token lifetime. +- Refresh token lifetime and rotation. +- Whether tokens move from `localStorage` to `httpOnly` cookies. +- CSRF strategy if cookies are used. +- Passkey/WebAuthn implementation requirements. +- OAuth requirements. +- Device/session revocation. +- Admin step-up authentication for payouts or role changes. + +### 8. Realtime Authorization Spec + +Purpose: make Socket.IO events subject to the same security model as REST. + +Should define: + +- Socket handshake authentication. +- Server-derived room membership. +- Which rooms exist. +- Who may join each room. +- Whether room membership changes with request/payment/dispute state. +- Event payload privacy rules. + +### 9. Migration Plan + +Purpose: avoid breaking current payments and historical records. + +Should include: + +- SHKeeper legacy read path. +- New provider feature flag. +- Ledger backfill strategy. +- Data validation report before enforcement. +- Rollback criteria. +- Cutover date for old webhook routes. +- Operator manual reconciliation workflow. + +### 10. Secure Build and Supply-Chain Policy + +Purpose: reduce npm and dependency compromise risk. + +Should define: + +- Package manager and lockfile policy. +- CI install mode. +- Dependency update cadence. +- Security advisory monitoring. +- npm provenance/signature policy where available. +- Secrets handling. +- Production build reproducibility. +- Separation of frontend npm risk from backend core risk. + +### 11. Operational Runbooks + +Purpose: make security incidents and payment failures survivable. + +Should include: + +- Failed webhook. +- Duplicate payment. +- Missing payment. +- Stuck release. +- Disputed release attempt. +- Compromised admin. +- Leaked API key. +- Provider outage. +- Chain/RPC outage. +- Suspicious payment proof. +- npm/package compromise. + +## Decision framework + +Use the following questions before choosing a rewrite: + +- Is the current goal safe launch, or long-term platform rebuild? +- Is the team willing to delay feature work for a payment-core redesign? +- Can the team maintain Go/Kotlin/Rust in production? +- Is the biggest current risk supply chain, or incorrect money movement? +- Are admin actions trusted, or should high-risk actions require step-up approval? +- Should Amanat custody funds, or should the provider/payment network hold or route them? +- Are disputes central to the product, or rare manual exceptions? +- Is auditability a regulatory/business requirement or only an internal safety goal? + +## Recommended decision + +Near term: + +- Harden the current Express backend. +- Disable unsafe production routes. +- Add centralized authorization and rate limiting. +- Fix Web3 verification. +- Fix Socket.IO authorization. +- Disable passkeys unless implemented with real WebAuthn. +- Begin ledger/state-machine documentation immediately. + +Medium term: + +- Build a provider-neutral payment and funds layer. +- Add immutable ledger entries. +- Move release/refund/dispute-hold checks into the central payment/funds service. +- Keep SHKeeper compatibility read-only for legacy records. +- Add Request Network or another provider behind the adapter if desired. + +Long term: + +- Rewrite the payment/auth/escrow core in Go or Kotlin/Java if the team can support it. +- Do not rewrite the entire backend until the core is proven. +- Keep lower-risk modules in TypeScript until there is a business or operational reason to migrate them. + +## Open questions for leadership and engineering + +1. Is launch timeline more important than a full payment/funds redesign? +2. Should passkeys be removed from launch scope until production-grade WebAuthn is implemented? +3. Should browser auth move to `httpOnly` cookies even if that requires CSRF work and frontend changes? +4. Should every payout require admin step-up authentication or two-person approval? +5. Should Amanat keep funds in a platform-controlled escrow wallet, or should provider-mediated payment pages become the default? +6. Is Request Network a desired provider migration, or just one option being explored? +7. What new backend stack can the team realistically operate for the next two years? +8. What is the acceptable level of temporary dual-stack complexity during migration? +9. Do we need formal external penetration testing before public launch? +10. Who owns security decisions: product, backend, DevOps, or a dedicated security owner? + +## Relationship to existing docs + +This assessment complements: + +- [[Platform Logical Audit - 2026-05-24]] +- [[PRD - Platform Audit Remediation Plan (2026-05-24)]] +- [[PRD - Request Network Migration and Funds Management]] +- [[Security Architecture]] +- [[Payment Flow - SHKeeper]] +- [[Payment Flow - DePay & Web3]] +- [[Escrow Flow]] +- [[Dispute Flow]] + +The existing remediation PRD is the tactical hardening plan. This document is the strategic backend-stack and refactor assessment. diff --git a/.taskmaster/tasks/task-4.md b/.taskmaster/tasks/task-4.md new file mode 100644 index 0000000..3224194 --- /dev/null +++ b/.taskmaster/tasks/task-4.md @@ -0,0 +1,21 @@ +# Task 4: Define backend security and refactor strategy from latest audit + +Status: pending +Priority: high +Source audit: `.taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md` + +Convert the backend stack security/refactor assessment into concrete architecture decisions, documentation deliverables, and developer handoff criteria. + +This is an advisory/architecture task. It should run in parallel with immediate backend hardening rather than block urgent remediation. + +Subtasks: + +1. Assign security ownership and launch decision criteria. +2. Produce threat model for escrow platform. +3. Specify funds ledger and escrow state machine. +4. Create authorization matrix for REST and Socket.IO. +5. Decide session, passkey, and admin step-up architecture. +6. Specify webhook security and provider adapter contracts. +7. Define secure build and supply-chain policy. +8. Make backend-core stack decision. +9. Create migration and operational runbooks. diff --git a/.taskmaster/tasks/tasks.json b/.taskmaster/tasks/tasks.json index 364be00..cfd7d46 100644 --- a/.taskmaster/tasks/tasks.json +++ b/.taskmaster/tasks/tasks.json @@ -8,7 +8,8 @@ "sourcePrds": [ ".taskmaster/docs/prd-mermaid-diagram-rendering-stabilization.md", ".taskmaster/docs/prd-platform-audit-remediation-plan-2026-05-24.md", - ".taskmaster/docs/prd-request-network-migration-and-funds-management.md" + ".taskmaster/docs/prd-request-network-migration-and-funds-management.md", + ".taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md" ] }, "tasks": [ @@ -22,9 +23,36 @@ "status": "done", "dependencies": [], "subtasks": [ - { "id": 1, "title": "Fix Security Architecture email/password sequence", "description": "Normalize parser-sensitive sequence text in 01 - Architecture/Security Architecture.md.", "details": "Avoid semicolons and ambiguous inline punctuation in sequence messages.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "mmdc parse for the specific block." }, - { "id": 2, "title": "Fix authentication login and refresh diagrams", "description": "Normalize parser-sensitive token and refresh-token sequence text in Authentication Flow.", "details": "Split method-like or expression-like message text into parser-safe plain text lines.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "mmdc parse for both Authentication Flow blocks." }, - { "id": 3, "title": "Fix chat, delivery, dispute, OAuth, purchase request, referral, registration, and seller-offer diagrams", "description": "Clean the remaining Mermaid sequence diagrams with invalid or ambiguous syntax.", "details": "Split multi-recipient arrows, remove parser-conflicting semicolon/expression text, and keep intent unchanged.", "status": "done", "priority": "medium", "dependencies": [], "testStrategy": "Full vault mmdc parser sweep across all Mermaid blocks." } + { + "id": 1, + "title": "Fix Security Architecture email/password sequence", + "description": "Normalize parser-sensitive sequence text in 01 - Architecture/Security Architecture.md.", + "details": "Avoid semicolons and ambiguous inline punctuation in sequence messages.", + "status": "done", + "priority": "medium", + "dependencies": [], + "testStrategy": "mmdc parse for the specific block." + }, + { + "id": 2, + "title": "Fix authentication login and refresh diagrams", + "description": "Normalize parser-sensitive token and refresh-token sequence text in Authentication Flow.", + "details": "Split method-like or expression-like message text into parser-safe plain text lines.", + "status": "done", + "priority": "medium", + "dependencies": [], + "testStrategy": "mmdc parse for both Authentication Flow blocks." + }, + { + "id": 3, + "title": "Fix chat, delivery, dispute, OAuth, purchase request, referral, registration, and seller-offer diagrams", + "description": "Clean the remaining Mermaid sequence diagrams with invalid or ambiguous syntax.", + "details": "Split multi-recipient arrows, remove parser-conflicting semicolon/expression text, and keep intent unchanged.", + "status": "done", + "priority": "medium", + "dependencies": [], + "testStrategy": "Full vault mmdc parser sweep across all Mermaid blocks." + } ] }, { @@ -37,13 +65,94 @@ "status": "pending", "dependencies": [], "subtasks": [ - { "id": 1, "title": "Secure unauthenticated endpoints and owner enforcement", "description": "Require authenticateToken and owner/admin checks on exposed payment, AI, and legacy notification routes.", "details": "Derive notification userId from authenticated principal. Protect payment history and mutation endpoints. Restrict AI calls to authenticated users with per-user budgets. Add denied-access audit logs.", "status": "pending", "priority": "high", "dependencies": [], "testStrategy": "Unauthorized callers receive 401/403; users cannot access or mutate other users' payments/notifications; admins retain authorized access." }, - { "id": 2, "title": "Re-enable and scope rate limiting", "description": "Restore global and route-tiered rate limits for public-sensitive paths.", "details": "Use stricter limits for auth, financial, AI, file upload, and verification paths. Keep public reads at relaxed limits. Add observability for 429 spikes.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Exercise configured limits per tier and confirm expected 429 responses without blocking ordinary reads." }, - { "id": 3, "title": "Replace stubbed passkey/WebAuthn flow", "description": "Implement production-grade WebAuthn registration/authentication and shared challenge storage.", "details": "Use real attestation/assertion verification, Redis-backed TTL challenges, refresh-token persistence/rotation, and deterministic malformed/reused/expired challenge errors.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Registration, login, replay, expired challenge, and refresh-token continuity tests pass." }, - { "id": 4, "title": "Strengthen DePay/Web3 payment verification", "description": "Verify transaction recipient, token contract, and amount, not only receipt success.", "details": "Decode ERC-20 Transfer logs, compare recipient against escrow address, validate token contract and decimals-adjusted minimum amount, store verifier evidence and idempotency fingerprint.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Reject successful but wrong-recipient/wrong-token/underpaid tx hashes; accept only matching transfers." }, - { "id": 5, "title": "Lock Socket.IO room joins to authenticated context", "description": "Remove trust in client-supplied user/buyer/seller room IDs.", "details": "Validate socket handshake token, derive server-side room membership, reject mismatched joins, and monitor suspicious join attempts.", "status": "pending", "priority": "medium", "dependencies": [1], "testStrategy": "A user cannot subscribe to another user's rooms; legitimate realtime notifications still arrive." }, - { "id": 6, "title": "Enforce dispute hold before payout and release operations", "description": "Add payment hold state and central release/refund guards that block disputed funds.", "details": "Introduce explicit dispute hold fields or state, enforce in PaymentCoordinator and payout/release services, return clear 409/423 responses, and backfill/report blocked payments.", "status": "pending", "priority": "medium", "dependencies": [1, 4], "testStrategy": "Open dispute blocks release/refund until resolved or explicitly overridden through authorized path." }, - { "id": 7, "title": "Align documentation, API references, and runtime enums", "description": "Normalize disputed/payment/request status docs and implementation references after security behavior changes.", "details": "Resolve mismatch around absent dispute module, endpoint names, status enums, and action names across Data Models, API Reference, and Flows.", "status": "pending", "priority": "medium", "dependencies": [1, 2, 3, 4, 5, 6], "testStrategy": "Docs match implemented routes, models, enum values, and state transitions." } + { + "id": 1, + "title": "Secure unauthenticated endpoints and owner enforcement", + "description": "Require authenticateToken and owner/admin checks on exposed payment, AI, and legacy notification routes.", + "details": "Derive notification userId from authenticated principal. Protect payment history and mutation endpoints. Restrict AI calls to authenticated users with per-user budgets. Add denied-access audit logs.", + "status": "pending", + "priority": "high", + "dependencies": [], + "testStrategy": "Unauthorized callers receive 401/403; users cannot access or mutate other users' payments/notifications; admins retain authorized access." + }, + { + "id": 2, + "title": "Re-enable and scope rate limiting", + "description": "Restore global and route-tiered rate limits for public-sensitive paths.", + "details": "Use stricter limits for auth, financial, AI, file upload, and verification paths. Keep public reads at relaxed limits. Add observability for 429 spikes.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Exercise configured limits per tier and confirm expected 429 responses without blocking ordinary reads." + }, + { + "id": 3, + "title": "Replace stubbed passkey/WebAuthn flow", + "description": "Implement production-grade WebAuthn registration/authentication and shared challenge storage.", + "details": "Use real attestation/assertion verification, Redis-backed TTL challenges, refresh-token persistence/rotation, and deterministic malformed/reused/expired challenge errors.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Registration, login, replay, expired challenge, and refresh-token continuity tests pass." + }, + { + "id": 4, + "title": "Strengthen DePay/Web3 payment verification", + "description": "Verify transaction recipient, token contract, and amount, not only receipt success.", + "details": "Decode ERC-20 Transfer logs, compare recipient against escrow address, validate token contract and decimals-adjusted minimum amount, store verifier evidence and idempotency fingerprint.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Reject successful but wrong-recipient/wrong-token/underpaid tx hashes; accept only matching transfers." + }, + { + "id": 5, + "title": "Lock Socket.IO room joins to authenticated context", + "description": "Remove trust in client-supplied user/buyer/seller room IDs.", + "details": "Validate socket handshake token, derive server-side room membership, reject mismatched joins, and monitor suspicious join attempts.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 1 + ], + "testStrategy": "A user cannot subscribe to another user's rooms; legitimate realtime notifications still arrive." + }, + { + "id": 6, + "title": "Enforce dispute hold before payout and release operations", + "description": "Add payment hold state and central release/refund guards that block disputed funds.", + "details": "Introduce explicit dispute hold fields or state, enforce in PaymentCoordinator and payout/release services, return clear 409/423 responses, and backfill/report blocked payments.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 1, + 4 + ], + "testStrategy": "Open dispute blocks release/refund until resolved or explicitly overridden through authorized path." + }, + { + "id": 7, + "title": "Align documentation, API references, and runtime enums", + "description": "Normalize disputed/payment/request status docs and implementation references after security behavior changes.", + "details": "Resolve mismatch around absent dispute module, endpoint names, status enums, and action names across Data Models, API Reference, and Flows.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 1, + 2, + 3, + 4, + 5, + 6 + ], + "testStrategy": "Docs match implemented routes, models, enum values, and state transitions." + } ] }, { @@ -54,15 +163,222 @@ "testStrategy": "Use feature flags, provider fixture tests, webhook signature/idempotency tests, ledger invariant tests, migration dry-run reports, and limited cohort rollout before default provider switch.", "priority": "high", "status": "pending", - "dependencies": [2], + "dependencies": [ + 2 + ], "subtasks": [ - { "id": 1, "title": "Introduce provider-neutral payment adapter", "description": "Decouple checkout, webhook, and payout flows from SHKeeper-specific routes and metadata.", "details": "Define createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, and searchProviderPayments. Add provider values shkeeper, request_network, manual, admin_wallet and PAYMENT_PROVIDER feature flag.", "status": "pending", "priority": "high", "dependencies": [], "testStrategy": "New provider can be selected by feature flag while existing SHKeeper payments remain readable and process late webhooks." }, - { "id": 2, "title": "Implement Request Network pay-in integration", "description": "Create Request Network payment requests or Secure Payment Pages for new checkout flows.", "details": "Store requestId, paymentReference, securePaymentUrl, token, merchantReference, network, invoiceCurrency, and paymentCurrency. Validate supported networks/currencies before creating links.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Buyer receives hosted payment URL; webhook reconciles matching internal payment only after amount/currency/reference validation." }, - { "id": 3, "title": "Add funds ledger and escrow state machine", "description": "Introduce internal funds accounting independent from provider metadata.", "details": "Add FundsAccount, LedgerEntry, derived FundsBalance, expected/held/releasable/releasing/released/refunded/disputed/failed states, fee representation, and release/refund invariant checks.", "status": "pending", "priority": "high", "dependencies": [1], "testStrategy": "Every pay-in creates immutable ledger entries and payout/refund cannot exceed available held funds or bypass dispute holds." }, - { "id": 4, "title": "Build Request Network webhook and reconciliation service", "description": "Process signed Request Network events and repair missed webhook state through reconciliation.", "details": "Add /api/payment/request-network/webhook, verify raw-body x-request-network-signature, store delivery ID/retry/event/request/payment reference/payload hash, support test webhooks, and add scheduled payment search/status reconciliation.", "status": "pending", "priority": "high", "dependencies": [2, 3], "testStrategy": "Invalid signatures reject; duplicate delivery IDs acknowledge without duplicate ledger entries; reconciliation repairs missed state." }, - { "id": 5, "title": "Implement release, refund, and payout orchestration", "description": "Replace SHKeeper payout tasks and simulated release with auditable transaction instruction and confirmation flows.", "details": "Create release/refund service consuming ledger balances, generate Request Network payout or direct admin wallet instructions, store unsigned tx payloads, signer, submitted hash, confirmation status, provider status, and require admin/operator authorization plus dispute checks.", "status": "pending", "priority": "high", "dependencies": [3, 4], "testStrategy": "Release cannot occur if unpaid, already released, refunded, or disputed; tx hash confirmation updates ledger once; admin can retry/cancel safely." }, - { "id": 6, "title": "Migrate frontend checkout and admin payment UI", "description": "Update buyer checkout, admin release, seller payout, and payment details for provider-neutral Request Network flows.", "details": "Replace ShkeeperPayment with CryptoPayment/RequestNetworkPayment redirect flow, keep legacy SHKeeper only for legacy records, replace ShkeeperPayout with release queue/admin payout UI, and show provider IDs, payment references, hosted links, ledger balances, webhook/reconciliation status.", "status": "pending", "priority": "medium", "dependencies": [2, 3, 5], "testStrategy": "Request Network checkout does not expect walletAddress; admin UI blocks unsafe release; legacy labels are hidden for Request Network records." }, - { "id": 7, "title": "Backfill legacy SHKeeper records and decommission provider-specific code", "description": "Migrate historical SHKeeper payment metadata and safely remove legacy wallet monitor/webhook/payout paths after cutoff.", "details": "Backfill provider namespace, create ledger entries for trusted completed SHKeeper payments, mark legacyProvider, keep webhook tail period, and produce decommission checklist for env vars, docs, labels, routes, and runbooks.", "status": "pending", "priority": "medium", "dependencies": [3, 4, 5, 6], "testStrategy": "Dry-run report includes total, migrated, skipped, ambiguous, failed; no historical transaction hash/invoice/task metadata is lost." } + { + "id": 1, + "title": "Introduce provider-neutral payment adapter", + "description": "Decouple checkout, webhook, and payout flows from SHKeeper-specific routes and metadata.", + "details": "Define createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, and searchProviderPayments. Add provider values shkeeper, request_network, manual, admin_wallet and PAYMENT_PROVIDER feature flag.", + "status": "pending", + "priority": "high", + "dependencies": [], + "testStrategy": "New provider can be selected by feature flag while existing SHKeeper payments remain readable and process late webhooks." + }, + { + "id": 2, + "title": "Implement Request Network pay-in integration", + "description": "Create Request Network payment requests or Secure Payment Pages for new checkout flows.", + "details": "Store requestId, paymentReference, securePaymentUrl, token, merchantReference, network, invoiceCurrency, and paymentCurrency. Validate supported networks/currencies before creating links.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Buyer receives hosted payment URL; webhook reconciles matching internal payment only after amount/currency/reference validation." + }, + { + "id": 3, + "title": "Add funds ledger and escrow state machine", + "description": "Introduce internal funds accounting independent from provider metadata.", + "details": "Add FundsAccount, LedgerEntry, derived FundsBalance, expected/held/releasable/releasing/released/refunded/disputed/failed states, fee representation, and release/refund invariant checks.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Every pay-in creates immutable ledger entries and payout/refund cannot exceed available held funds or bypass dispute holds." + }, + { + "id": 4, + "title": "Build Request Network webhook and reconciliation service", + "description": "Process signed Request Network events and repair missed webhook state through reconciliation.", + "details": "Add /api/payment/request-network/webhook, verify raw-body x-request-network-signature, store delivery ID/retry/event/request/payment reference/payload hash, support test webhooks, and add scheduled payment search/status reconciliation.", + "status": "pending", + "priority": "high", + "dependencies": [ + 2, + 3 + ], + "testStrategy": "Invalid signatures reject; duplicate delivery IDs acknowledge without duplicate ledger entries; reconciliation repairs missed state." + }, + { + "id": 5, + "title": "Implement release, refund, and payout orchestration", + "description": "Replace SHKeeper payout tasks and simulated release with auditable transaction instruction and confirmation flows.", + "details": "Create release/refund service consuming ledger balances, generate Request Network payout or direct admin wallet instructions, store unsigned tx payloads, signer, submitted hash, confirmation status, provider status, and require admin/operator authorization plus dispute checks.", + "status": "pending", + "priority": "high", + "dependencies": [ + 3, + 4 + ], + "testStrategy": "Release cannot occur if unpaid, already released, refunded, or disputed; tx hash confirmation updates ledger once; admin can retry/cancel safely." + }, + { + "id": 6, + "title": "Migrate frontend checkout and admin payment UI", + "description": "Update buyer checkout, admin release, seller payout, and payment details for provider-neutral Request Network flows.", + "details": "Replace ShkeeperPayment with CryptoPayment/RequestNetworkPayment redirect flow, keep legacy SHKeeper only for legacy records, replace ShkeeperPayout with release queue/admin payout UI, and show provider IDs, payment references, hosted links, ledger balances, webhook/reconciliation status.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 2, + 3, + 5 + ], + "testStrategy": "Request Network checkout does not expect walletAddress; admin UI blocks unsafe release; legacy labels are hidden for Request Network records." + }, + { + "id": 7, + "title": "Backfill legacy SHKeeper records and decommission provider-specific code", + "description": "Migrate historical SHKeeper payment metadata and safely remove legacy wallet monitor/webhook/payout paths after cutoff.", + "details": "Backfill provider namespace, create ledger entries for trusted completed SHKeeper payments, mark legacyProvider, keep webhook tail period, and produce decommission checklist for env vars, docs, labels, routes, and runbooks.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 3, + 4, + 5, + 6 + ], + "testStrategy": "Dry-run report includes total, migrated, skipped, ambiguous, failed; no historical transaction hash/invoice/task metadata is lost." + } + ] + }, + { + "id": 4, + "title": "Define backend security and refactor strategy from latest audit", + "description": "Convert the backend stack security/refactor assessment into concrete architecture decisions, documentation deliverables, and developer handoff criteria.", + "details": "Source audit: .taskmaster/docs/audit-backend-stack-security-and-refactor-assessment-2026-05-24.md. This task is advisory/architecture-focused and should run in parallel with immediate hardening. It should produce the decision artifacts needed before any backend-core rewrite or provider migration is started.", + "testStrategy": "Review and sign off each architecture document with backend, payments, frontend, and operations stakeholders. Confirm every open question has an owner or explicit deferred decision before implementation work begins.", + "priority": "high", + "status": "pending", + "dependencies": [], + "subtasks": [ + { + "id": 1, + "title": "Assign security ownership and launch decision criteria", + "description": "Define who owns security decisions and what must be true before public launch or migration work proceeds.", + "details": "Answer ownership questions from the audit: security owner, launch safety bar, whether launch prioritizes hardening or redesign, and whether external penetration testing is required.", + "status": "pending", + "priority": "high", + "dependencies": [], + "testStrategy": "Written owner/RACI and launch gate checklist are accepted by leadership and engineering." + }, + { + "id": 2, + "title": "Produce threat model for escrow platform", + "description": "Document protected assets, actors, trust boundaries, and abuse cases for the financial marketplace.", + "details": "Include buyer, seller, admin, support, unauthenticated attacker, compromised user/admin, provider, malicious webhook sender, browser/backend/database/Redis/provider/wallet/Socket.IO trust boundaries, and abuse cases such as fake payment proof, replayed webhook, arbitrary room join, stolen token, double payout, dispute bypass, email abuse, and AI abuse.", + "status": "pending", + "priority": "high", + "dependencies": [ + 1 + ], + "testStrategy": "Threat model maps each high-risk finding to at least one mitigation task or accepted risk." + }, + { + "id": 3, + "title": "Specify funds ledger and escrow state machine", + "description": "Define canonical money movement and legal state transitions before refactor or provider migration.", + "details": "Create specs for FundsAccount, LedgerEntry, FundsBalance, gross paid, provider fees, platform fees, held, disputed, releasable, released, refunded, idempotency keys, reconciliation behavior, purchase request states, payment states, escrow/funds states, dispute states, valid transitions, forbidden transitions, and release/refund/admin override preconditions.", + "status": "pending", + "priority": "high", + "dependencies": [ + 2 + ], + "testStrategy": "Spec can be used to reject double-release, release-during-dispute, underfunded payout, and ambiguous provider-event scenarios." + }, + { + "id": 4, + "title": "Create authorization matrix for REST and Socket.IO", + "description": "Map every endpoint and realtime event to access level, ownership checks, state preconditions, rate-limit tier, and audit-log requirement.", + "details": "Include public/authenticated/owner/buyer/seller/admin/support/service-role classifications. Socket.IO rooms must be server-derived from authenticated identity, not client-supplied user IDs.", + "status": "pending", + "priority": "high", + "dependencies": [ + 2 + ], + "testStrategy": "No route or socket event remains unmapped; implementation tasks can reference matrix rows directly." + }, + { + "id": 5, + "title": "Decide session, passkey, and admin step-up architecture", + "description": "Choose browser session model and high-risk admin authentication requirements.", + "details": "Decide localStorage versus httpOnly cookies, access/refresh token lifetimes, CSRF strategy, refresh rotation, WebAuthn requirements, OAuth requirements, device/session revocation, and whether payouts/role changes require step-up authentication or two-person approval.", + "status": "pending", + "priority": "high", + "dependencies": [ + 2 + ], + "testStrategy": "Decision record lists chosen model, rejected alternatives, migration cost, and required implementation tasks." + }, + { + "id": 6, + "title": "Specify webhook security and provider adapter contracts", + "description": "Define provider-neutral payment interface and signed webhook processing rules.", + "details": "Document createPayInIntent, getPayInStatus, handleProviderWebhook, createHostedPaymentLink, createReleaseInstruction, createRefundInstruction, getPayoutStatus, searchProviderPayments, raw-body signature verification, replay prevention, delivery ID idempotency, duplicate/unknown event behavior, retry semantics, dead-letter/replay storage, and alert thresholds.", + "status": "pending", + "priority": "high", + "dependencies": [ + 3 + ], + "testStrategy": "Contracts cover SHKeeper legacy, Request Network, manual/admin wallet, invalid signatures, duplicate deliveries, and missed webhook reconciliation." + }, + { + "id": 7, + "title": "Define secure build and supply-chain policy", + "description": "Reduce npm/dependency compromise risk across frontend and any remaining Node services.", + "details": "Specify package manager and lockfile policy, CI install mode, dependency update cadence, advisory monitoring, npm provenance/signature policy where available, secrets handling, reproducible production builds, and separation between frontend npm risk and backend-core risk.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 1 + ], + "testStrategy": "Policy is actionable in CI and includes response steps for compromised package, leaked token, and vulnerable dependency alerts." + }, + { + "id": 8, + "title": "Make backend-core stack decision", + "description": "Choose whether the security-critical backend core remains TypeScript or moves to Go/Kotlin/Rust/Python.", + "details": "Evaluate team capability, two-year maintainability, operational footprint, rewrite cost, dual-stack complexity, auditability, supply-chain exposure, and which modules belong in a payment/auth/escrow core versus the existing marketplace/chat API.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 2, + 3, + 4, + 5, + 6, + 7 + ], + "testStrategy": "Architecture decision record states chosen stack, scope of extraction, non-goals, migration phases, rollback criteria, and owners." + }, + { + "id": 9, + "title": "Create migration and operational runbooks", + "description": "Document rollout, rollback, and incident response for the selected backend/funds architecture.", + "details": "Include SHKeeper legacy read path, provider feature flag, ledger backfill, validation report before enforcement, rollback criteria, webhook cutoff, manual reconciliation, failed webhook, duplicate/missing payment, stuck release, disputed release attempt, compromised admin, leaked API key, provider outage, chain/RPC outage, suspicious payment proof, and npm/package compromise.", + "status": "pending", + "priority": "medium", + "dependencies": [ + 8 + ], + "testStrategy": "Runbooks identify owner, trigger, detection signal, immediate action, recovery action, and post-incident documentation for each scenario." + } ] } ]