Add full system audit reports and Telegram Mini App debug handoff

- Three-stream audit (security / logic / performance) with 35+ findings
  derived from actual source code, each with file:line and remediation
- Audit Index cross-references criticals across streams into prioritized
  fix tiers: immediately / before soft launch / before public launch
- Telegram Mini App debug handoff documenting what was implemented and
  all remaining work items with exact file lists and test commands
- Updated architecture, data model, auth API, and registration flow docs
  to reflect Telegram auth, TON wallet, and email verification additions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Siavash Sameni
2026-05-24 17:20:08 +04:00
parent 2533bedb91
commit 940ad0c655
12 changed files with 1795 additions and 28 deletions

View File

@@ -0,0 +1,343 @@
---
title: Logic & Correctness Audit — 2026-05-24
tags: [audit, logic, correctness, findings]
created: 2026-05-24
status: open
---
# Logic & Correctness Audit — 2026-05-24
Full-codebase review of business logic, state machines, race conditions, and data integrity. Triggered by completion of Request Network integration, Telegram first-class auth, and the internal funds ledger. Every finding is derived from actual source code — no hypothetical issues are included.
> [!danger] Action required
> 4 CRITICAL findings: two are exploitable in the current production code path (double-spend via concurrent webhooks, simulated-payment bypass); two cause data corruption (orphan User documents, non-atomic create/delete).
---
## CRITICAL
### CRIT-1 — Concurrent Webhooks Can Double-Process the Same Payment
**File:** `src/services/payment/shkeeper/shkeeperWebhook.ts:195241` and `src/services/payment/paymentCoordinator.ts:2560`
**Status:** Open
The duplicate-detection guard uses an in-memory 10-second window. There is no database-level atomic lock. Two simultaneous webhook deliveries for the same `external_id` will both read `payment` before either writes, both pass the in-memory deduplication check, both enter `PaymentCoordinator.coordinatePaymentUpdate`, and both proceed — because `coordinationState` is an in-memory `Map` whose `get/set` pair is not atomic under async concurrency (any `await` between them allows the event loop to interleave).
The `status === 'completed'` DB guard at `paymentCoordinator.ts:54` only protects re-processing of already-completed payments. The first webhook has not yet written `completed` when the second arrives, so both pass.
Side effects of double-processing:
- Duplicate chats created between buyer and seller (`chatService.createChat` has no uniqueness guard).
- Duplicate socket events sent to the buyer.
- Depending on timing, duplicate state transitions on `PurchaseRequest`.
**Remediation:** Replace the coordinator's check with an atomic MongoDB conditional update:
```ts
const result = await Payment.findOneAndUpdate(
{ _id: paymentId, status: { $ne: 'completed' } },
{ $set: { status: 'completed', ... } },
{ new: true }
);
if (!result) return; // already processed
```
For chat creation, add a unique compound index on `{ 'relatedTo.type': 1, 'relatedTo.id': 1 }` and handle the duplicate key error gracefully.
---
### CRIT-2 — Telegram Auto-Provisioning Race Creates Orphan User Documents
**File:** `src/services/auth/authController.ts:377434`
**Status:** Open
The flow is:
1. `TelegramLink.findOne({ telegramUserId })` → null (new user)
2. `new User({...}).save()` — creates User document
3. `TelegramLink.findOneAndUpdate({ telegramUserId }, ..., { upsert: true })`
Two parallel Telegram auth requests for the same `telegramUserId` both find no link, both create a new `User`, and both try to upsert `TelegramLink`. Because `TelegramLink.telegramUserId` has `unique: true`, only one upsert wins. The loser's `User.save()` has already committed, producing a fully-active `User` document with no `TelegramLink` — it can never be authenticated via Telegram and will not be cleaned up.
**Remediation:** Use optimistic locking or a MongoDB transaction. Practical approach without transactions:
```ts
// Step 1: try to upsert the link first
const link = await TelegramLink.findOneAndUpdate(
{ telegramUserId },
{ $setOnInsert: { telegramUserId, source, ... } },
{ upsert: true, new: true, rawResult: true }
);
const isNew = link.lastErrorObject?.upserted != null;
// Step 2: create User only if this process won the upsert
if (isNew) { user = await User.create({...}); await link.value.updateOne({ userId: user._id }); }
else { user = await User.findById(link.value.userId); }
```
---
### CRIT-3 — Simulated Transaction Bypass Active in Production
**File:** `src/services/payment/paymentRoutes.ts:379396`
**Status:** Open
*(Also cited in Security Audit C-3.)*
```ts
if (paymentHash.startsWith('SIM_') || ...) {
isVerified = true;
}
```
No environment guard. A client sending `paymentHash: "SIM_anything"` gets a completed `Payment` and the `PurchaseRequest` advances without any real funds.
**Remediation:** Wrap in `if (process.env.NODE_ENV !== 'production')`.
---
### CRIT-4 — Hardcoded Credential in Production Payment Code
**File:** `src/services/payment/shkeeper/shkeeperPayoutService.ts:224228`
**Status:** Open
*(Also cited in Security Audit C-5.)*
```ts
'Authorization': `Basic ${Buffer.from(`admin:!NMI4WdGkVQ#dQ`).toString('base64')}`
```
**Remediation:** `process.env.SHKEEPER_ADMIN_PASSWORD` — rotate immediately.
---
## HIGH
### HIGH-1 — `updatePurchaseRequest` Has a Read-Modify-Write Race Condition
**File:** `src/services/marketplace/PurchaseRequestService.ts:405425`
**Status:** Open
```ts
const currentRequest = await PurchaseRequest.findById(requestId); // read
// ... validate status ...
await PurchaseRequest.findByIdAndUpdate(requestId, updateData, { new: true }); // write
```
Two concurrent calls both read the same `currentRequest.status`, both pass `isValidStatusProgression`, and both write — potentially into conflicting terminal states or setting the same field twice.
Furthermore, `isValidStatusProgression` allows any status → any terminal state transition (e.g., `pending``completed` in one step), because the guard only blocks backward progression among non-terminal states.
**Remediation:** Use atomic optimistic locking:
```ts
const updated = await PurchaseRequest.findOneAndUpdate(
{ _id: requestId, status: currentStatus }, // guard: status unchanged
{ $set: { status: newStatus } },
{ new: true }
);
if (!updated) throw new ConflictError('State changed concurrently — retry');
```
Additionally, tighten `isValidStatusProgression` to use an explicit allowed-transitions table rather than a blanket terminal-from-anywhere rule.
---
### HIGH-2 — Seller Offer Acceptance Has No Concurrency Guard
**File:** `src/services/marketplace/SellerOfferService.ts:355426`
**Status:** Open
`acceptOffer` reads the offer, then updates it in a separate write. Two simultaneous accept calls for different offers on the same purchase request will both succeed: both set their target offer to `accepted`, and both run `updateMany` to reject all others — each query excluding its own `offerId`. The result is two `accepted` offers on the same purchase request.
**Remediation:** Use `findOneAndUpdate` with a status guard:
```ts
const updated = await SellerOffer.findOneAndUpdate(
{ _id: offerId, status: 'pending' },
{ $set: { status: 'accepted' } },
{ new: true }
);
if (!updated) throw new ConflictError('Offer already processed');
await SellerOffer.updateMany(
{ purchaseRequestId, _id: { $ne: offerId } },
{ $set: { status: 'rejected' } }
);
```
Consider wrapping in a MongoDB transaction to keep the accept + reject-others atomic.
---
### HIGH-3 — `refreshTokens[]` Array Grows Unboundedly
**File:** `src/services/auth/authController.ts:6263` and login handler
**Status:** Open
Every `login` pushes a new refresh token onto `user.refreshTokens[]` without any cap or expiry pruning. The `User` model has no `maxlength` on this array. A user logging in from many devices, or an attacker rapidly requesting tokens, causes the array to grow without limit — increasing document size and slowing any query that reads the user document.
The `refreshToken` rotation endpoint prunes the old token on rotation, but `login` does not prune expired tokens before pushing.
**Remediation:** Before pushing, prune expired tokens:
```ts
const now = Date.now() / 1000;
user.refreshTokens = user.refreshTokens.filter(t => {
try { return jwt.decode(t)?.exp > now; } catch { return false; }
});
```
Enforce a hard cap (e.g. `slice(-10)`) after pruning.
---
### HIGH-4 — Blocked Telegram User Can Bypass Block via Deleted TelegramLink
**File:** `src/services/auth/authController.ts:355363`
**Status:** Open
The block check queries `TelegramLink` where `status = 'blocked'` AND `blockedReason != 'unlinked_by_user'`. If the blocked user's `TelegramLink` is **deleted** entirely (not just status-changed), the check finds nothing, `activeLink` finds nothing, and the code auto-provisions a brand-new `User` with a fresh `TelegramLink` — bypassing the block entirely.
**Remediation:** Store the blocked Telegram user ID on the `User` document (a `blockedTelegramIds` set at the app level, or a `status: 'suspended'` flag tied to the Telegram ID) so the block persists even without a `TelegramLink` record.
---
### HIGH-5 — `verifyEmailWithCode`: Non-Atomic User Create + TempVerification Delete
**File:** `src/services/auth/authController.ts:620633`
**Status:** Open
```ts
await user.save(); // line 630
await TempVerification.findByIdAndDelete(tempVerification._id); // line 633
```
If `user.save()` succeeds but the delete fails (network error, timeout), the `TempVerification` document remains. The next call with the same `email+code` will find it valid, attempt `User.create` again, and hit the unique email index — returning a 500 error instead of a clean "already verified" message. This persists until the MongoDB TTL scan runs (up to 15 min + scan interval).
**Remediation:** Delete the TempVerification **before** saving the user, or check for duplicate-key errors after `user.save()` and delete the TempVerification in the catch handler.
---
## MEDIUM
### MED-1 — Funds Ledger Availability Check Is Not Atomic with Append
**File:** `src/services/payment/orchestration/releaseRefundService.ts:3755,99114`
**Status:** Open
`validateReleaseAvailability` reads a balance via aggregation, checks `availableBalance >= amount`, then `appendFundsLedgerEntry` inserts the entry. Two concurrent release/refund calls can both read the same balance snapshot, both pass the availability check, and both insert entries — resulting in a double-spend from the ledger.
The `idempotencyKey` on `FundsLedgerEntry` prevents exact duplicates when the same `txHash` is used, but does not prevent two different `txHash` values from both passing the availability check concurrently.
**Remediation:** Wrap the check + insert in a MongoDB transaction, or add a single-release constraint: before inserting a `release` entry, verify no prior `release` entry exists for the same `paymentId`.
---
### MED-2 — `updatePurchaseRequestStatus` Bypasses the State Machine Validator
**File:** `src/services/marketplace/PurchaseRequestService.ts:551589`
**Status:** Open
`updatePurchaseRequestStatus` calls `PurchaseRequest.findByIdAndUpdate` directly, without calling `isValidStatusProgression`. The webhook path (`paymentCoordinator.ts:208`) uses this method, meaning payment-triggered state transitions skip the state machine guard entirely.
**Remediation:** Add `isValidStatusProgression(currentStatus, newStatus)` check inside `updatePurchaseRequestStatus` before writing.
---
### MED-3 — `getUserPayments` Queries Non-Existent Field `userId`
**File:** `src/services/payment/paymentService.ts:342346`
**Status:** Open
```ts
query.userId = userId;
```
The `Payment` schema has no top-level `userId` field. The buyer is stored as `buyerId` and the seller as `sellerId`. This query will always return zero results for current payments; only legacy payments with `metadata.legacyUserId` are returned (none for new users). The endpoint appears non-functional for all users.
**Remediation:**
```ts
query.$or = [{ buyerId: userId }, { sellerId: userId }];
```
---
### MED-4 — Payout Created Without Verifying an Inbound Completed Payment Exists
**File:** `src/services/payment/shkeeper/shkeeperPayoutService.ts:4280`
**Status:** Open
`createPayoutTask` takes `buyerId`, `sellerId`, `purchaseRequestId`, and `amount` from the request body. The service checks for duplicate payouts (`existingPayout` idempotency) but does not verify that a corresponding inbound `Payment` with `status: 'completed'` and `direction: 'in'` exists. An admin could accidentally initiate a payout for a purchase request that has never been paid.
**Remediation:**
```ts
const inboundPayment = await Payment.findOne({
purchaseRequestId, direction: 'in', status: 'completed'
});
if (!inboundPayment) throw new AppError(400, 'No completed inbound payment for this request');
```
---
### MED-5 — Unauthenticated `/payment/callback` Endpoint Can Mutate Payment Status
**File:** `src/services/payment/paymentControllerRoutes.ts:20`
**Status:** Open
`POST /callback` is mounted with no authentication middleware. The handler accepts `{ transactionId, status, amount }` and updates a payment's status to any value without verification. Unlike the SHKeeper webhook (which has optional HMAC), this endpoint has zero authentication.
**Remediation:** Either remove the endpoint (the SHKeeper webhook path supersedes it), restrict to internal network only (Nginx `allow` rules), or add HMAC verification.
---
### MED-6 — In-Memory Payment Coordinator State Not Shared Across Replicas
**File:** `src/services/payment/paymentCoordinator.ts:2425`
**Status:** Open
The `coordinationState` Map is process-local. In a multi-process deployment (PM2 cluster, k8s with >1 replica), the 10-second debounce window provides no protection — each process has its own empty map at startup, and webhooks routed to different replicas will both process. The existing DB `status === 'completed'` guard is the only durable protection.
*(The code comment at line 24 acknowledges this. Fix is the same as CRIT-1 — atomic DB conditional update makes this moot.)*
---
### MED-7 — Seller Offer Notification Uses Undefined `offer.title` Field
**File:** `src/services/marketplace/SellerOfferService.ts:403,415`
**Status:** Open
`SellerOffer` does not have a `title` field directly — the title belongs to the associated `PurchaseRequest`. `offer.title` resolves to `undefined` in the notification message body, resulting in broken notification text for buyers and sellers.
**Remediation:** Either populate the `purchaseRequestId` and use `purchaseRequest.title`, or populate the field name from the offer's own descriptive field.
---
## LOW
### LOW-1 — Duplicate Callback Route Definitions
**File:** `src/services/payment/paymentRoutes.ts:29,256`
**Status:** Open
`POST /callback` is defined at line 29 (mapped to `handlePaymentCallback`) and again as `POST /payments/callback` at line 256. These resolve to different URL paths given the router mount, but suggest dead code. No security impact; creates maintenance confusion.
**Remediation:** Remove the dead route definition.
---
### LOW-2 — `paymentCoordinator.ts` Map Grows Without Bound Until Cleanup Fires
**File:** `src/services/payment/paymentCoordinator.ts:35`
**Status:** Open
The cleanup function fires after each coordination cycle but only on active payments. Under a webhook flood before cleanup, the Map accumulates entries. Not a crash risk at current scale; becomes a concern at high webhook volume.
**Remediation:** The atomic DB fix in CRIT-1 eliminates the need for the Map entirely. Until then, add a size cap and a periodic cleanup interval independent of payment processing.
---
## Confirmed PASS
| Check | Result | Source |
|-------|--------|--------|
| TempVerification TTL index | PASS | `TempVerification.ts:59``expireAfterSeconds: 0` on expiry field |
| Legacy email token cleared on first use | PASS | `authController.ts:667` — token set to undefined before save |
| SHKeeper HMAC timing-safe comparison | PASS | `shkeeperWebhook.ts:84``crypto.timingSafeEqual` |
| Socket event DB as source of truth | PASS | DB write committed before emit; emit failure is non-fatal |
| `FundsLedgerEntry` idempotency key (unique) | PASS | `FundsLedgerEntry.ts:86` — sparse unique index |
| `TelegramSession` TTL index | PASS | `TelegramSession.ts:66` |
| Password change clears refresh tokens | PASS | `authController.ts:887` |
---
## Summary
| Severity | Count |
|----------|-------|
| CRITICAL | 4 |
| HIGH | 5 |
| MEDIUM | 7 |
| LOW | 2 |
**Top priority:** CRIT-1 (concurrent webhook double-processing) and CRIT-2 (Telegram race condition) require atomic DB operations. CRIT-3 (simulation bypass) and CRIT-4 (hardcoded credential) are one-line fixes.
---
## Related
- [[Security Audit - 2026-05-24]]
- [[Performance Audit - 2026-05-24]]
- [[Funds Ledger and Escrow State Machine Specification]]
- [[Payment Provider Adapter Spec]]
- [[Authentication Flow]]
- [[Platform Logical Audit - 2026-05-24]]