Complete task 4 backend security architecture docs
150
09 - Audits/Webhook Security Spec.md
Normal file
@@ -0,0 +1,150 @@
---
title: Webhook Security Spec
tags: [webhooks, security, audit, payments]
created: 2026-05-24
status: advisory
reviewers: [backend, security, operations]
---

# Webhook Security Spec

This document defines signed callback handling for all payment and payout providers.
It closes the gaps in [[Security Architecture]] by turning webhook behavior into an explicit,
auditable contract.

The scope is inbound callbacks only:

- SHKeeper pay-in (`/api/payment/shkeeper/webhook`)
- SHKeeper payout (`/api/payment/shkeeper/payout/webhook`)
- Request Network (`/api/payment/request-network/webhook`)
- Manual/admin reconciliation channels (where applicable)

## 1. Canonical event envelope

All callbacks are normalized by [[Payment Provider Adapter Spec]] into:

```ts
type ProviderCallback = {
  provider: "shkeeper" | "request_network" | "manual_wallet" | "admin_wallet" | string;
  providerPaymentId: string;
  purchaseRequestId?: string;
  requestId?: string;
  deliveryId?: string;
  eventType: string; // e.g., paid, payout_completed, status_update
  status: string; // provider-specific raw status
  normalizedStatus: "pending" | "completed" | "failed" | "cancelled" | "released" | "refunded";
  amount?: string;
  currency?: string;
  transactionHash?: string;
  occurredAt?: string; // ISO 8601 if provided
  receivedAt: string; // server-side receive time
  rawFingerprint: string; // sha256(raw_body)
};
```

Callbacks are processed only through adapter entry points; provider-specific parsing remains private to the adapter.
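
As an illustration of the two server-side fields in the envelope, the following is a minimal sketch assuming a Node.js runtime. The helper names `fingerprint` and `serverFields` are hypothetical, and hex encoding of the digest is an assumption; the spec only states `sha256(raw_body)`.

```ts
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 of the raw request body, hex-encoded (encoding is an assumption).
function fingerprint(rawBody: Uint8Array): string {
  return createHash("sha256").update(rawBody).digest("hex");
}

// Hypothetical helper: the envelope fields the server fills in itself,
// independent of anything the provider sends.
function serverFields(
  rawBody: Uint8Array,
  now: Date = new Date(),
): { receivedAt: string; rawFingerprint: string } {
  return {
    receivedAt: now.toISOString(),
    rawFingerprint: fingerprint(rawBody),
  };
}
```

Hashing the bytes exactly as received, rather than a re-serialized object, keeps the fingerprint stable regardless of how the adapter later parses the payload.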

## 2. Signature verification

### 2.1 Required mechanics

- Verify signatures against the raw request bytes, **before JSON parsing**.
- Use constant-time comparison and reject with 401/403 on mismatch, before any further processing.
- Never disable verification outside local-only test tooling.
- Store the raw payload hash (`rawFingerprint`) for forensics and idempotency checks.
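
The mechanics above could be sketched as follows. This assumes the provider signs the raw body with HMAC-SHA256 and sends a hex digest in its signature header; neither the algorithm nor the encoding is stated in this spec, so check each provider's documentation. `verifySignature` is a hypothetical name.

```ts
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: HMAC-SHA256 over the raw body, hex-encoded in the header.
function verifySignature(
  rawBody: Uint8Array,
  header: string | undefined,
  secret: string,
): boolean {
  // Absent or malformed header: reject before doing any other work.
  if (!header || !/^[0-9a-f]{64}$/i.test(header)) return false;

  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(header, "hex");

  // Both buffers are 32 bytes here; timingSafeEqual throws on unequal lengths.
  return timingSafeEqual(expected, received);
}
```

A handler would call this with the unparsed request bytes and only run `JSON.parse` after it returns `true`.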

### 2.2 Provider headers

| Provider | Header(s) |
|---|---|
| SHKeeper | `x-shkeeper-signature` |
| Request Network | `x-request-network-signature` |
| Test override (local only) | explicitly documented in deployment notes, never in production |

If the expected signature header is absent or malformed, treat the request as a non-retryable client error.

## 3. Replay prevention and idempotency

For each callback, store and enforce one of the following idempotency keys:

- `(provider, deliveryId, eventType)`, or
- `(provider, providerPaymentId, normalizedStatus)` when the provider has no delivery id.

Replay rules:

- First successful write path = **processed**.
- Same key seen again with no state change = **duplicate** (HTTP 200 response, no side effects).
- Same key seen with a different payload hash = **conflict** (HTTP 409, captured to DLQ).
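
The replay rules can be sketched as a small classifier. This is an illustration only: the `Map` stands in for a durable store with a unique constraint on the key, "no state change" is interpreted as "same `rawFingerprint`", and the names `idempotencyKey` and `classify` are hypothetical.

```ts
type KeyedCallback = {
  provider: string;
  deliveryId?: string;
  eventType: string;
  providerPaymentId: string;
  normalizedStatus: string;
  rawFingerprint: string;
};

type Outcome = "processed" | "duplicate" | "conflict";

// Prefer the delivery-id key; fall back to the payment/status key.
function idempotencyKey(cb: KeyedCallback): string {
  const parts = cb.deliveryId
    ? ["delivery", cb.provider, cb.deliveryId, cb.eventType]
    : ["payment", cb.provider, cb.providerPaymentId, cb.normalizedStatus];
  return JSON.stringify(parts); // unambiguous even if a field contains a separator
}

// `seen` maps idempotency key -> fingerprint of the first payload stored under it.
function classify(seen: Map<string, string>, cb: KeyedCallback): Outcome {
  const key = idempotencyKey(cb);
  const prior = seen.get(key);
  if (prior === undefined) {
    seen.set(key, cb.rawFingerprint);
    return "processed";
  }
  return prior === cb.rawFingerprint ? "duplicate" : "conflict";
}
```

In a real handler, `duplicate` would map to HTTP 200 with no side effects and `conflict` to HTTP 409 plus a DLQ entry; the check-then-set would also need to be atomic in the backing store.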

## 4. Unknown and duplicate behavior

| Condition | Response | Side effects |
|---|---|---|
| Signature valid, unknown `providerPaymentId` | `200` (`unknown_payment`) in v1 mode / `404` in strict mode | no state write, record DLQ entry for operator review |
| Known `providerPaymentId`, already terminal | `200` (`duplicate_terminal`) | no state write |
| Known `providerPaymentId`, stale status transition | `200` (`duplicate_or_out_of_order`) | no state write |
| Invalid or missing signature | `401` | no state write |
| Malformed payload | `400` | no state write |
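
One way to keep handlers consistent with this table is to encode it as a single lookup. The sketch below is illustrative; the `Condition` names for the last two rows, the `strict` flag, and the `responseFor` function are hypothetical and not defined elsewhere in this spec.

```ts
type Condition =
  | "unknown_payment"
  | "duplicate_terminal"
  | "duplicate_or_out_of_order"
  | "signature_failure"
  | "malformed_payload";

type HandlerResponse = { status: number; code: Condition; recordDlq: boolean };

// None of these conditions writes payment state; only unknown payments go to the DLQ.
function responseFor(condition: Condition, strict = false): HandlerResponse {
  switch (condition) {
    case "unknown_payment":
      return { status: strict ? 404 : 200, code: condition, recordDlq: true };
    case "duplicate_terminal":
    case "duplicate_or_out_of_order":
      return { status: 200, code: condition, recordDlq: false };
    case "signature_failure":
      return { status: 401, code: condition, recordDlq: false };
    case "malformed_payload":
      return { status: 400, code: condition, recordDlq: false };
  }
}
```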

## 5. Retry semantics

- Callback senders (providers) may retry on:
  - transient network failures,
  - 5xx responses or provider-internal timeouts,
  - an explicit retryable status from the endpoint.
- For SHKeeper and Request Network, retries are triggered only by non-2xx response codes.
- Recommended handler mapping:
  - `400/401` = do not retry (hard fail),
  - `409` = do not retry until manual release,
  - `500/503` = retry.
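
The recommended mapping can be written down as a small function, for example for use in tests or in DLQ tooling that decides whether to schedule a replay. The names `RetryAdvice` and `retryAdvice` are hypothetical, and the fallback for status codes the mapping does not list is an assumption.

```ts
type RetryAdvice = "accepted" | "no_retry" | "hold_for_manual_release" | "retry";

function retryAdvice(status: number): RetryAdvice {
  if (status >= 200 && status < 300) return "accepted"; // 2xx never triggers a retry
  if (status === 400 || status === 401) return "no_retry"; // hard fail
  if (status === 409) return "hold_for_manual_release";
  if (status === 500 || status === 503) return "retry";
  // Assumption: unlisted codes follow their class (5xx retryable, everything else not).
  return status >= 500 ? "retry" : "no_retry";
}
```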

## 6. Dead-letter and replay storage

Persist all failed callbacks for at least 7 days in append-only storage:

- `providerWebhookFailures`
  - key fields: `provider`, `deliveryId`, `providerPaymentId`, `requestPath`, `requestHeaders`, `rawFingerprint`, `statusCode`, `errorCode`, `attemptCount`, `nextRetryAt`, `rawBodyRef`, `createdAt`.
- If storage is unavailable, fail closed and raise a high-severity ops alert.

Retention policy:

- 30 days for `success==true`,
- 180 days for `unknown_payment`, `repeated_conflict`, `signature_failure`,
- immediate alert if retry queue exceeds 500 entries for a provider.
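
A minimal sketch of how the retention rules might be applied when an entry is written. The function names are hypothetical, and treating every other failure as falling back to the 7-day minimum is an assumption, since the policy lists only the cases above.

```ts
const LONG_RETENTION_CODES = new Set([
  "unknown_payment",
  "repeated_conflict",
  "signature_failure",
]);

const MS_PER_DAY = 24 * 60 * 60 * 1000;

function retentionDays(entry: { success: boolean; errorCode?: string }): number {
  if (entry.success) return 30;
  if (entry.errorCode !== undefined && LONG_RETENTION_CODES.has(entry.errorCode)) return 180;
  return 7; // assumption: remaining failures keep the stated 7-day minimum
}

// Expiry timestamp (ISO 8601) derived from the entry's `createdAt`.
function expiresAt(createdAt: string, days: number): string {
  return new Date(Date.parse(createdAt) + days * MS_PER_DAY).toISOString();
}

// Per-provider retry queue alert from the policy above.
function retryQueueAlert(queueDepth: number): boolean {
  return queueDepth > 500;
}
```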

## 7. Alerting thresholds

- `failed_webhook_count` over 1 minute:
  - warning at `> 20`,
  - critical at `> 100`.
- signature failures:
  - warning at `> 5` in 5 minutes,
  - critical at `> 20` in 5 minutes.
- duplicate ratio:
  - warning if `duplicates / total >= 0.15` for 10 minutes.
- dead-letter growth:
  - warning at `+200` new entries/hour,
  - critical at `+500`/hour.
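
The thresholds above could be evaluated with a helper along these lines. It is a sketch only: the metric key names are hypothetical, windowing (for example the 10-minute duration on the duplicate ratio) is left to the metrics system, and reading the dead-letter limits as inclusive is an assumption, since the list gives them without a comparison operator.

```ts
type Severity = "ok" | "warning" | "critical";
type Threshold = { warn: number; crit: number; inclusive: boolean };

const THRESHOLDS: Record<string, Threshold> = {
  failedWebhooksPerMinute: { warn: 20, crit: 100, inclusive: false },
  signatureFailuresPer5Min: { warn: 5, crit: 20, inclusive: false },
  dlqNewEntriesPerHour: { warn: 200, crit: 500, inclusive: true }, // assumption: inclusive
};

function severity(value: number, t: Threshold): Severity {
  const reached = (limit: number) => (t.inclusive ? value >= limit : value > limit);
  if (reached(t.crit)) return "critical";
  if (reached(t.warn)) return "warning";
  return "ok";
}

// Duplicate ratio has a warning level only; an empty window never alerts.
function duplicateRatioWarning(duplicates: number, total: number): boolean {
  return total > 0 && duplicates / total >= 0.15;
}
```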

## 8. Required operator signals

Webhook health checks should expose:

- last-seen timestamp by provider,
- delivery backlog depth,
- per-status counters (`processed`, `duplicate`, `unknown`, `conflict`, `signature_failure`),
- DLQ length and oldest entry age.
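
A possible shape for such a health payload is sketched below. The type and field names are hypothetical; only the listed signals come from this spec. `dlqStats` shows how the DLQ length and oldest entry age could be derived from the `createdAt` values of the stored entries.

```ts
type WebhookHealth = {
  lastSeenAt: Record<string, string>; // provider -> ISO 8601 timestamp
  backlogDepth: number;
  counters: Record<"processed" | "duplicate" | "unknown" | "conflict" | "signature_failure", number>;
  dlqLength: number;
  dlqOldestAgeSeconds: number | null; // null when the DLQ is empty
};

function dlqStats(
  createdAts: string[],
  now: Date,
): Pick<WebhookHealth, "dlqLength" | "dlqOldestAgeSeconds"> {
  if (createdAts.length === 0) return { dlqLength: 0, dlqOldestAgeSeconds: null };
  const oldestMs = Math.min(...createdAts.map((t) => Date.parse(t)));
  return {
    dlqLength: createdAts.length,
    dlqOldestAgeSeconds: Math.floor((now.getTime() - oldestMs) / 1000),
  };
}
```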

## 9. Testing requirements

- signature bypass tests (the bypass must remain disabled in staging/prod),
- replay/delivery-id duplicate tests,
- malformed payload tests,
- unknown payment tests,
- non-terminal duplicate suppression tests.

## Related

- [[Payment Provider Adapter Spec]]
- [[Error Codes]]
- [[Backend Funds Migration and Operational Runbooks]]