---
title: Webhook Security Spec
tags: [webhooks, security, audit, payments]
created: 2026-05-24
status: advisory
reviewers: [backend, security, operations]
---

# Webhook Security Spec

This document defines signed callback handling for all payment and payout providers.
It closes the gaps in [[Security Architecture]] by turning webhook behavior into an explicit,
auditable contract.

The scope is inbound callbacks only:

- SHKeeper pay-in (`/api/payment/shkeeper/webhook`)
- SHKeeper payout (`/api/payment/shkeeper/payout/webhook`)
- Request Network (`/api/payment/request-network/webhook`)
- Manual/admin reconciliation channels (where applicable)

## 1. Canonical event envelope

All callbacks are normalized by [[Payment Provider Adapter Spec]] into:

```ts
type ProviderCallback = {
  provider: "shkeeper" | "request_network" | "manual_wallet" | "admin_wallet" | string;
  providerPaymentId: string;
  purchaseRequestId?: string;
  requestId?: string;
  deliveryId?: string;
  eventType: string; // e.g., paid, payout_completed, status_update
  status: string; // provider-specific raw status
  normalizedStatus: "pending" | "completed" | "failed" | "cancelled" | "released" | "refunded";
  amount?: string;
  currency?: string;
  transactionHash?: string;
  occurredAt?: string; // ISO 8601 if provided
  receivedAt: string; // server-side receive time
  rawFingerprint: string; // sha256(raw_body)
};
```

Callbacks are processed only through adapter entry points; provider-specific parsing remains private to the adapter.

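As a minimal sketch of how the two server-side fields of the envelope could be filled in, the helpers below compute `rawFingerprint` and `receivedAt` from the raw request body. The function names are hypothetical, and the lowercase hex encoding of the digest is an assumption; the spec only states `sha256(raw_body)`.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: SHA-256 of the raw body bytes.
// Assumption: the digest is stored as lowercase hex.
function fingerprintRawBody(rawBody: Buffer): string {
  return createHash("sha256").update(rawBody).digest("hex");
}

// Hypothetical helper: the envelope fields that do not depend on
// provider-specific parsing and can be set as soon as the request arrives.
function receiveMetadata(
  rawBody: Buffer,
  now: Date = new Date(),
): { receivedAt: string; rawFingerprint: string } {
  return {
    receivedAt: now.toISOString(), // server-side receive time, ISO 8601
    rawFingerprint: fingerprintRawBody(rawBody),
  };
}
```

Hashing the bytes exactly as received, rather than a re-serialized JSON object, keeps the fingerprint stable across parsers and key orderings.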
## 2. Signature verification

### 2.1 Required mechanics

- Verify signatures against the raw request bytes, **before JSON parsing**.
- Compare signatures in constant time and reject any mismatch immediately with 401/403.
- Never disable verification outside local-only test tooling.
- Store the raw payload hash (`rawFingerprint`) for forensics and idempotency checks.

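The mechanics above can be sketched as follows. This is an illustration under stated assumptions, not the providers' documented scheme: it assumes the signature is a hex-encoded HMAC-SHA256 of the raw body under a shared secret, and `verifySignature` is a hypothetical name. Check each provider's documentation for the actual algorithm and encoding.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumption: signature = hex(HMAC-SHA256(secret, rawBody)).
// Returns false for an absent, malformed, or mismatching signature.
function verifySignature(
  rawBody: Buffer,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  // Reject absent or malformed headers before doing any crypto work.
  if (!signatureHeader || !/^[0-9a-f]{64}$/i.test(signatureHeader)) {
    return false;
  }
  const expected = createHmac("sha256", secret).update(rawBody).digest();
  const received = Buffer.from(signatureHeader, "hex");
  // timingSafeEqual throws on length mismatch; the regex above
  // guarantees both buffers are exactly 32 bytes.
  return timingSafeEqual(expected, received);
}
```

The handler would call this with the unparsed body buffer and only run `JSON.parse` after it returns `true`.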
### 2.2 Provider headers

| Provider | Header(s) |
|---|---|
| SHKeeper | `x-shkeeper-signature` |
| Request Network | `x-request-network-signature` |
| Test override (local only) | explicitly documented in deployment notes, never in production |

If the expected signature header is absent or malformed, treat the request as a non-retryable client error.

## 3. Replay prevention and idempotency

For each callback, store and enforce uniqueness on one of these keys:

- `deliveryId` + `provider` + `eventType`, or
- `(providerPaymentId, normalizedStatus, provider)` when the provider has no delivery id.

Replay rules:

- First successful write path = **processed**.
- Same key seen again with no state change = **duplicate** (HTTP 200 response, no side effects).
- Same key seen with a different payload hash = **conflict** (HTTP 409, captured to DLQ).

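A minimal sketch of the key selection and replay rules above. `idempotencyKey` and `ReplayGuard` are hypothetical names, the in-memory `Map` stands in for a durable store with a unique constraint, and "no state change" is approximated here as "same `rawFingerprint`".

```typescript
type KeyFields = {
  provider: string;
  providerPaymentId: string;
  eventType: string;
  normalizedStatus: string;
  deliveryId?: string;
};

// Prefer the delivery-id key; fall back to the payment/status key.
// JSON.stringify of an array avoids delimiter collisions between fields.
function idempotencyKey(cb: KeyFields): string {
  return cb.deliveryId
    ? JSON.stringify(["delivery", cb.provider, cb.deliveryId, cb.eventType])
    : JSON.stringify(["payment", cb.provider, cb.providerPaymentId, cb.normalizedStatus]);
}

type ReplayOutcome = "processed" | "duplicate" | "conflict";

// In-memory stand-in for the persistent idempotency table.
class ReplayGuard {
  private seen = new Map<string, string>(); // key -> rawFingerprint

  classify(key: string, rawFingerprint: string): ReplayOutcome {
    const previous = this.seen.get(key);
    if (previous === undefined) {
      this.seen.set(key, rawFingerprint);
      return "processed"; // first write wins
    }
    // Same payload again is a harmless duplicate; a different payload
    // under the same key is a conflict (HTTP 409, DLQ).
    return previous === rawFingerprint ? "duplicate" : "conflict";
  }
}
```

In a real deployment the check-and-insert would need to be atomic (for example a unique index plus an insert that reports conflicts), otherwise two concurrent deliveries could both be classified as processed.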
## 4. Unknown and duplicate behavior

| Condition | Response | Side effects |
|---|---|---|
| Signature valid, unknown `providerPaymentId` | `200` (`unknown_payment`) in v1 mode / `404` in strict mode | no state write, record DLQ entry for operator review |
| Known `providerPaymentId`, already terminal | `200` (`duplicate_terminal`) | no state write |
| Known `providerPaymentId`, stale status transition | `200` (`duplicate_or_out_of_order`) | no state write |
| Invalid signature | `401` | no state write |
| Malformed payload | `400` | no state write |

## 5. Retry semantics

- Callback senders (providers) may retry on:
  - transient network failures,
  - 5xx responses or provider-internal timeouts,
  - an explicit retryable status from the endpoint.
- SHKeeper and Request Network trigger retries only on non-2xx codes.
- Recommended handler mapping:
  - `401/400` = do not retry (hard fail),
  - `409` = do not retry until manual release,
  - `500/503` = retry.

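The recommended mapping can be expressed as a small lookup, sketched below. `retryAdviceFor` and the `RetryAdvice` labels are hypothetical; treating every 5xx as retryable follows the "5xx" bullet above, while treating unlisted 4xx codes as hard failures is an assumption.

```typescript
type RetryAdvice = "no_retry" | "hold_for_manual_release" | "retry";

// Maps the HTTP status returned by the webhook endpoint to the retry
// behavior the spec expects from the sender.
function retryAdviceFor(status: number): RetryAdvice {
  if (status >= 200 && status < 300) return "no_retry"; // retries fire only on non-2xx
  if (status === 409) return "hold_for_manual_release"; // conflict: wait for an operator
  if (status >= 500) return "retry"; // 500/503 and other server-side failures
  return "no_retry"; // 400/401; assumption: other 4xx codes are hard failures too
}
```

Keeping this mapping in one place makes it easier to confirm that a handler never returns a 5xx for a condition that retrying cannot fix, such as a bad signature.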
## 6. Dead-letter and replay storage

Persist all failed callbacks for at least 7 days in append-only storage:

- `providerWebhookFailures`
  - key fields: `provider`, `deliveryId`, `providerPaymentId`, `requestPath`, `requestHeaders`, `rawFingerprint`, `statusCode`, `errorCode`, `attemptCount`, `nextRetryAt`, `rawBodyRef`, `createdAt`.
- If storage is unavailable, fail closed and raise a high-severity ops alert.

Retention policy:

- 30 days for `success==true`,
- 180 days for `unknown_payment`, `repeated_conflict`, `signature_failure`.

Raise an immediate alert if the retry queue exceeds 500 entries for a provider.

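One possible shape for a `providerWebhookFailures` record, with a retention lookup, is sketched below. The field names come from the list above, but their types and optionality are assumptions, as is the reading that any other failure falls back to the 7-day minimum. `retentionDays` is a hypothetical helper.

```typescript
// Assumed types for the key fields listed in this section.
type ProviderWebhookFailure = {
  provider: string;
  deliveryId?: string;
  providerPaymentId?: string;
  requestPath: string;
  requestHeaders: Record<string, string>;
  rawFingerprint: string;
  statusCode: number;
  errorCode: string;
  attemptCount: number;
  nextRetryAt?: string; // ISO 8601, absent when no retry is scheduled
  rawBodyRef: string; // pointer to the stored raw body
  createdAt: string; // ISO 8601
};

const LONG_RETENTION_CODES = new Set([
  "unknown_payment",
  "repeated_conflict",
  "signature_failure",
]);

// Retention in days: 30 for successes, 180 for the listed error codes,
// otherwise the 7-day minimum (assumption for unlisted failures).
function retentionDays(
  record: Partial<Pick<ProviderWebhookFailure, "errorCode">> & { success: boolean },
): number {
  if (record.success) return 30;
  if (record.errorCode !== undefined && LONG_RETENTION_CODES.has(record.errorCode)) return 180;
  return 7;
}
```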
## 7. Alerting thresholds

- `failed_webhook_count` over 1 minute:
  - warning at `> 20`,
  - critical at `> 100`.
- signature failures:
  - warning at `> 5` in 5 minutes,
  - critical at `> 20` in 5 minutes.
- duplicate ratio:
  - warning if `duplicates / total >= 0.15` for 10 minutes.
- dead-letter growth:
  - warning at `+200` new entries/hour,
  - critical at `+500`/hour.

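The thresholds above can be evaluated with a small function like the sketch below. The metric names in `WindowMetrics` are hypothetical, each counter is assumed to be pre-aggregated over the window named in its field, the duplicate ratio is computed over a single 10-minute window rather than checked for being sustained, and the dead-letter bounds are read as inclusive ("at +200").

```typescript
type Severity = "ok" | "warning" | "critical";

// Hypothetical pre-aggregated counters, one per alerting window.
type WindowMetrics = {
  failedWebhooksLastMinute: number;
  signatureFailuresLast5Min: number;
  duplicatesLast10Min: number;
  totalLast10Min: number;
  newDlqEntriesLastHour: number;
};

// Strictly-greater-than grading, matching the `> N` thresholds.
function grade(value: number, warnAbove: number, criticalAbove: number): Severity {
  if (value > criticalAbove) return "critical";
  if (value > warnAbove) return "warning";
  return "ok";
}

function evaluateAlerts(m: WindowMetrics): Record<string, Severity> {
  // Avoid dividing by zero when no callbacks arrived in the window.
  const duplicateRatio = m.totalLast10Min > 0 ? m.duplicatesLast10Min / m.totalLast10Min : 0;
  const dlq = m.newDlqEntriesLastHour;
  return {
    failed_webhook_count: grade(m.failedWebhooksLastMinute, 20, 100),
    signature_failures: grade(m.signatureFailuresLast5Min, 5, 20),
    duplicate_ratio: duplicateRatio >= 0.15 ? "warning" : "ok",
    dead_letter_growth: dlq >= 500 ? "critical" : dlq >= 200 ? "warning" : "ok",
  };
}
```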
## 8. Required operator signals

Webhook health checks should expose:

- last-seen timestamp by provider,
- delivery backlog depth,
- per-status counters (`processed`, `duplicate`, `unknown`, `conflict`, `signature_failure`),
- DLQ length and oldest entry age.

## 9. Testing requirements

- signature bypass tests (the bypass must remain disabled in staging/prod),
- replay/delivery-id duplicate tests,
- malformed payload tests,
- unknown payment tests,
- non-terminal duplicate suppression tests.

## Related

- [[Payment Provider Adapter Spec]]
- [[Error Codes]]
- [[Backend Funds Migration and Operational Runbooks]]