---
title: Webhook Security Spec
tags: [webhooks, security, audit, payments]
created: 2026-05-24
status: advisory
reviewers: [backend, security, operations]
---

# Webhook Security Spec

This document defines signed callback handling for all payment and payout providers. It closes the gaps in [[Security Architecture]] by turning webhook behavior into an explicit, auditable contract.

The scope is inbound callbacks only:

- SHKeeper pay-in (`/api/payment/shkeeper/webhook`)
- SHKeeper payout (`/api/payment/shkeeper/payout/webhook`)
- Request Network (`/api/payment/request-network/webhook`)
- Manual/admin reconciliation channels (where applicable)

## 1. Canonical event envelope

All callbacks are normalized by [[Payment Provider Adapter Spec]] into:

```ts
type ProviderCallback = {
  provider: "shkeeper" | "request_network" | "manual_wallet" | "admin_wallet" | string;
  providerPaymentId: string;
  purchaseRequestId?: string;
  requestId?: string;
  deliveryId?: string;
  eventType: string; // e.g., paid, payout_completed, status_update
  status: string; // provider-specific raw status
  normalizedStatus: "pending" | "completed" | "failed" | "cancelled" | "released" | "refunded";
  amount?: string;
  currency?: string;
  transactionHash?: string;
  occurredAt?: string; // ISO 8601 if provided
  receivedAt: string; // server-side receive time
  rawFingerprint: string; // sha256(raw_body)
};
```

Callbacks are processed only through adapter entry points; provider-specific parsing remains private to the adapter.

## 2. Signature verification

### 2.1 Required mechanics

- Verify signatures against raw request bytes, **before JSON parsing**.
- Use constant-time comparison and short-circuit to 401/403 on mismatch.
- Never disable verification outside local-only test tooling.
- Store raw payload hash (`rawFingerprint`) for forensics and idempotency checks.

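The mechanics in 2.1 can be sketched as follows. This is a minimal illustration, not the adapters' actual code: it assumes the provider sends a hex-encoded HMAC-SHA256 of the raw body computed with a shared secret, which this spec does not state and which must be checked against each provider's documentation. The function name `verifyWebhookSignature`, the `VerifyResult` type, and the `errorCode` values are hypothetical, and `401` is chosen here only to match the table in section 4.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical result type; the error codes are illustrative, not from [[Error Codes]].
type VerifyResult =
  | { ok: true; rawFingerprint: string }
  | {
      ok: false;
      statusCode: 401;
      errorCode: "signature_missing" | "signature_malformed" | "signature_mismatch";
    };

// ASSUMPTION: the signature header carries hex(HMAC-SHA256(secret, rawBody)).
function verifyWebhookSignature(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  secret: string,
): VerifyResult {
  if (!signatureHeader) {
    return { ok: false, statusCode: 401, errorCode: "signature_missing" };
  }
  // A SHA-256 digest is 32 bytes, i.e. exactly 64 hex characters.
  if (!/^[0-9a-fA-F]{64}$/.test(signatureHeader)) {
    return { ok: false, statusCode: 401, errorCode: "signature_malformed" };
  }
  // Compute the expected MAC over the raw bytes, before any JSON parsing.
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader, "hex");
  // timingSafeEqual requires equal lengths; the regex above guarantees 32 bytes.
  if (!timingSafeEqual(expected, received)) {
    return { ok: false, statusCode: 401, errorCode: "signature_mismatch" };
  }
  // rawFingerprint = sha256(raw_body), kept for forensics and idempotency checks.
  const rawFingerprint = createHash("sha256").update(rawBody).digest("hex");
  return { ok: true, rawFingerprint };
}

// Usage with made-up values.
const exampleSecret = "local-test-secret";
const exampleBody = Buffer.from('{"status":"PAID"}');
const exampleSignature = createHmac("sha256", exampleSecret).update(exampleBody).digest("hex");
console.log(verifyWebhookSignature(exampleBody, exampleSignature, exampleSecret).ok);
console.log(verifyWebhookSignature(exampleBody, "not-hex", exampleSecret).ok);
```

Validating the header's shape before decoding keeps the constant-time comparison from throwing on a length mismatch, and computing the fingerprint only after a successful check avoids hashing work for rejected requests. A handler that also wants fingerprints for rejected requests (for example, for DLQ entries) would compute the hash earlier.
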
### 2.2 Provider headers

| Provider | Header(s) |
|---|---|
| SHKeeper | `x-shkeeper-signature` |
| Request Network | `x-request-network-signature` |
| Test override (local only) | explicitly documented in deployment notes, never in production |

If the expected signature header is absent or malformed, treat the request as a non-retryable client error.

## 3. Replay prevention and idempotency

For each callback, store and enforce one of the following idempotency keys:

- `deliveryId` + `provider` + `eventType`, or
- `(providerPaymentId, normalizedStatus, provider)` when the provider has no delivery id.

Replay rules:

- First successful write path = **processed**.
- Same key seen again with no state change = **duplicate** (HTTP 200 response, no side effects).
- Same key seen with a different payload hash = **conflict** (HTTP 409, captured to DLQ).

## 4. Unknown and duplicate behavior

| Condition | Response | Side effects |
|---|---|---|
| Signature valid, unknown `providerPaymentId` | `200` (`unknown_payment`) in v1 mode / `404` in strict mode | no state write, record DLQ entry for operator review |
| Known `providerPaymentId`, already terminal | `200` (`duplicate_terminal`) | no state write |
| Known `providerPaymentId`, stale status transition | `200` (`duplicate_or_out_of_order`) | no state write |
| Invalid or unrecognized signature | `401` | no state write |
| Malformed payload | `400` | no state write |

## 5. Retry semantics

- Callback senders (providers) may retry on:
  - transient network failures,
  - 5xx responses or provider-internal timeouts,
  - an explicit retryable status from the endpoint.
- Retry is triggered only on non-2xx codes for SHKeeper and Request Network.
- Recommended handler mapping:
  - `401/400` = do not retry (hard fail),
  - `409` = do not retry until manual release,
  - `500/503` = retry.

## 6. Dead-letter and replay storage

Persist all failed callbacks for at least 7 days in append-only storage:

- `providerWebhookFailures`
  - key fields: `provider`, `deliveryId`, `providerPaymentId`, `requestPath`, `requestHeaders`, `rawFingerprint`, `statusCode`, `errorCode`, `attemptCount`, `nextRetryAt`, `rawBodyRef`, `createdAt`.
- If storage is unavailable, fail closed and raise a high-severity ops alert.

Retention policy:

- 30 days for `success==true`,
- 180 days for `unknown_payment`, `repeated_conflict`, `signature_failure`,
- immediate alert if the retry queue exceeds 500 entries for a provider.

## 7. Alerting thresholds

- `failed_webhook_count` over 1 minute:
  - warning at `> 20`,
  - critical at `> 100`.
- signature failures:
  - warning at `> 5` in 5 minutes,
  - critical at `> 20` in 5 minutes.
- duplicate ratio:
  - warning if `duplicates / total >= 0.15` for 10 minutes.
- dead-letter growth:
  - warning at `+200` new entries/hour,
  - critical at `+500`/hour.

## 8. Required operator signals

Webhook health checks should expose:

- last-seen timestamp by provider,
- delivery backlog depth,
- per-status counters (`processed`, `duplicate`, `unknown`, `conflict`, `signature_failure`),
- DLQ length and oldest entry age.

## 9. Testing requirements

- signature bypass tests (the bypass must remain disabled in staging/prod),
- replay/delivery-id duplicate tests,
- malformed payload tests,
- unknown payment tests,
- non-terminal duplicate suppression tests.

## Related

- [[Payment Provider Adapter Spec]]
- [[Error Codes]]
- [[Backend Funds Migration and Operational Runbooks]]
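
As an illustration of the replay rules in section 3, the sketch below derives an idempotency key and classifies a callback as processed, duplicate, or conflict. It is a sketch under stated assumptions: `ReplayGuard`, `idempotencyKey`, and `CallbackKeyFields` are hypothetical names, the in-memory `Map` stands in for durable storage with an atomic insert, and "no state change" is interpreted as "same `rawFingerprint`".

```typescript
type ReplayOutcome = "processed" | "duplicate" | "conflict";

// Subset of ProviderCallback fields needed for replay checks.
type CallbackKeyFields = {
  provider: string;
  providerPaymentId: string;
  normalizedStatus: string;
  eventType: string;
  deliveryId?: string;
  rawFingerprint: string;
};

// Prefer the delivery-id key; fall back to the payment/status key when the
// provider supplies no delivery id. JSON.stringify of an array avoids
// ambiguity if a field happens to contain a separator character.
function idempotencyKey(cb: CallbackKeyFields): string {
  return cb.deliveryId
    ? JSON.stringify(["delivery", cb.deliveryId, cb.provider, cb.eventType])
    : JSON.stringify(["payment", cb.providerPaymentId, cb.normalizedStatus, cb.provider]);
}

// Hypothetical in-memory guard; production code would need a durable store
// and an atomic "insert if absent" so concurrent deliveries cannot both win.
class ReplayGuard {
  // key -> rawFingerprint of the first delivery seen for that key
  private seen = new Map<string, string>();

  classify(cb: CallbackKeyFields): ReplayOutcome {
    const key = idempotencyKey(cb);
    const firstFingerprint = this.seen.get(key);
    if (firstFingerprint === undefined) {
      this.seen.set(key, cb.rawFingerprint);
      return "processed";
    }
    // ASSUMPTION: an identical payload hash means "no state change".
    return firstFingerprint === cb.rawFingerprint ? "duplicate" : "conflict";
  }
}

// Usage with made-up values.
const guard = new ReplayGuard();
const delivery: CallbackKeyFields = {
  provider: "shkeeper",
  providerPaymentId: "pay_123",
  normalizedStatus: "completed",
  eventType: "paid",
  deliveryId: "dlv_1",
  rawFingerprint: "aaaa",
};
console.log(guard.classify(delivery));
console.log(guard.classify(delivery));
console.log(guard.classify({ ...delivery, rawFingerprint: "bbbb" }));
```

Per section 3, a handler would answer a duplicate with HTTP 200 and no side effects, and a conflict with HTTP 409 plus a DLQ entry.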