Add full system audit reports and Telegram Mini App debug handoff

- Three-stream audit (security / logic / performance) with 35+ findings derived from actual source code, each with file:line and remediation - Audit Index cross-references criticals across streams into prioritized fix tiers: immediately / before soft launch / before public launch - Telegram Mini App debug handoff documenting what was implemented and all remaining work items with exact file lists and test commands - Updated architecture, data model, auth API, and registration flow docs to reflect Telegram auth, TON wallet, and email verification additions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 17:20:08 +04:00
parent 2533bedb91
commit 940ad0c655
12 changed files with 1795 additions and 28 deletions
--- a/Audits/Performance
+++ b/Audits/Performance
@@ -0,0 +1,338 @@
+---
+title: Performance Audit — 2026-05-24
+tags: [audit, performance, database, findings]
+created: 2026-05-24
+status: open
+---
+
+# Performance Audit — 2026-05-24
+
+Performance review of MongoDB queries, indexes, in-memory data structures, Socket.IO emission patterns, and pagination. Triggered by completion of Request Network integration, Telegram auth, and the internal funds ledger. Every finding is derived from actual source code.
+
+> [!warning] Scale risk
+> H3 (unbounded seller fan-out) and H4 (full chat document in memory) are time-bombs that will cause outages once user counts and message volumes grow past modest scale. Both have zero performance headroom at their current implementation.
+
+---
+
+## HIGH
+
+### H1 — N+1 Query: `getPurchaseRequestsByBuyer` Issues One Payment Lookup Per Request Row
+**File:** `src/services/marketplace/PurchaseRequestService.ts:516–534`
+**Status:** Open
+
+```ts
+await Promise.all(requests.map(async (request) =>
+  Payment.findOne({ purchaseRequestId: request._id })
+))
+```
+
+A page of 10 requests produces 11 DB round-trips (1 list + 10 payment lookups). With a 50-item page, that is 51 round-trips on every buyer dashboard load.
+
+**Remediation:**
+```ts
+const requestIds = requests.map(r => r._id);
+const payments = await Payment.find({ purchaseRequestId: { $in: requestIds } }).lean();
+const paymentMap = new Map(payments.map(p => [p.purchaseRequestId.toString(), p]));
+// then: requestWithPayment = { ...r, payment: paymentMap.get(r._id.toString()) }
+```
+
+---
+
+### H2 — Missing Index on `Payment.purchaseRequestId`
+**File:** `src/models/Payment.ts:190–196`
+**Status:** Open
+
+The Payment schema indexes `buyerId`, `sellerId`, `status`, `transactionHash`, and `providerPaymentId` — but not `purchaseRequestId`. Every call to `Payment.findOne({ purchaseRequestId })` (used in `getPurchaseRequestsByBuyer`, `getPurchaseRequestById`, payment coordinator, notification handlers) causes a full collection scan.
+
+**Remediation:** Add to `Payment.ts`:
+```ts
+paymentSchema.index({ purchaseRequestId: 1, status: 1 });
+```
+
+---
+
+### H3 — Unbounded Seller Fan-Out on New Public Purchase Request
+**File:** `src/services/marketplace/PurchaseRequestService.ts:190–191`
+**Status:** Open
+
+```ts
+sellers = await User.find({ role: "seller", status: "active" }).select(...)
+```
+
+When a public purchase request is created, every active seller is fetched with no limit. For 10 000 sellers:
+- 10 000 `Notification` documents are inserted (not bulk-inserted — individual creates in a loop).
+- Each insert calls `emitRealTimeNotification` → `getUnreadCount` (an additional `countDocuments` query per seller = 10 000 extra queries).
+- A 50 ms staggered `setTimeout` per seller means the event loop is managing 500 seconds of scheduled callbacks.
+
+**Remediation:**
+1. Cap the fan-out with a configurable batch limit (e.g., 500 most-relevant sellers by category).
+2. Use `Notification.insertMany(docs)` for bulk creation.
+3. Push a single room-broadcast (`io.to('sellers').emit(...)`) rather than per-user socket emits.
+4. Move heavy fan-out to a background job queue (Bull/BullMQ).
+
+---
+
+### H4 — Full Chat Document with All Embedded Messages Loaded for Pagination
+**File:** `src/services/chat/ChatService.ts:370–408`
+**Status:** Open
+
+```ts
+const chat = await Chat.findById(chatId).populate("messages.senderId", ...)
+const messages = chat.messages.sort(...).slice(skip, skip + limit)
+```
+
+A chat with 50 000 messages (~5 KB each = 250 MB) is fully loaded into Node.js memory on every paginated message request. Pagination happens in JavaScript after the full document is fetched.
+
+**Remediation (minimum):** Use MongoDB's `$slice` projection:
+```ts
+const chat = await Chat.findById(chatId, {
+  messages: { $slice: [skip, limit] },
+  participants: 1,
+  unreadCounts: 1
+}).lean();
+```
+**Recommended long-term:** Extract messages into a separate `Message` collection with server-side pagination (`Message.find({ chatId }).sort({ timestamp: -1 }).skip(skip).limit(limit).lean()`).
+
+---
+
+### H5 — Extra `countDocuments` Query on Every Single Notification Creation
+**File:** `src/services/notification/NotificationService.ts:131–152`
+**Status:** Open
+
+`emitRealTimeNotification` calls `emitUnreadCountUpdate`, which calls `getUnreadCount` (a full `countDocuments` query) synchronously within the notification creation hot path. When H3's 10 000-seller fan-out fires, this produces 10 000 additional `countDocuments` queries in rapid succession.
+
+**Remediation:** Maintain an unread count field directly on the `User` or in a Redis hash. Increment on notification insert rather than re-counting. Or skip the re-query and have the client increment its local counter on socket event receipt.
+
+---
+
+### H6 — `getUserPayments` Queries Non-Indexed Field and Has No Default Limit
+**File:** `src/services/payment/paymentService.ts:331–370`
+**Status:** Open
+
+The query builds `query.userId = userId` (line 343), but the Payment schema stores buyers as `buyerId` (ObjectId) — not `userId`. The query hits `metadata.legacyUserId`, which has no index. Additionally, if `options.limit` is not supplied the query returns all matching documents with no limit.
+
+**Remediation:**
+1. Fix field: `query.$or = [{ buyerId: userId }, { sellerId: userId }]`
+2. Add default limit: `.limit(options.limit ?? 100)`
+
+---
+
+## MEDIUM
+
+### M1 — Missing Compound Index `(buyerId, createdAt)` on PurchaseRequest
+**File:** `src/models/PurchaseRequest.ts:360–369`
+**Status:** Open
+
+`getPurchaseRequestsByBuyer` queries `{ buyerId }` with `.sort({ createdAt: -1 })`. The existing single-field index on `buyerId` satisfies the filter but forces MongoDB to sort results post-scan. A compound index eliminates the sort entirely.
+
+**Remediation:** Add to `PurchaseRequest.ts`:
+```ts
+PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
+PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
+PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
+```
+
+---
+
+### M2 — Unanchored Case-Insensitive Regex Search — Full Collection Scan
+**File:** `src/services/marketplace/PurchaseRequestService.ts:703–720`
+**Status:** Open
+
+```ts
+{ title: { $regex: searchTerm, $options: "i" } }
+```
+
+An unanchored, case-insensitive regex cannot use a B-tree index. MongoDB performs a full collection scan with in-memory string matching on every search request.
+
+**Remediation:** Create a text index:
+```ts
+PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
+```
+Query with `{ $text: { $search: searchTerm } }` and sort by `{ score: { $meta: 'textScore' } }`.
+
+---
+
+### M3 — Double-Fetch Pattern in Update Methods (No `.lean()` on Pre-Check)
+**File:** `src/services/marketplace/PurchaseRequestService.ts:406–425,557–572`
+**Status:** Open
+
+Both `updatePurchaseRequest` and `updatePurchaseRequestStatus` call `PurchaseRequest.findById(requestId)` to validate current status, then call `findByIdAndUpdate`. The pre-check `findById` instantiates a full Mongoose document that is immediately discarded.
+
+**Remediation:** Use `findByIdAndUpdate` with `{ new: false }` to get the pre-update document in a single round-trip, or use `.lean().select('status')` on the pre-check to minimize overhead.
+
+---
+
+### M4 — Regex on Embedded `messages.content` Array — Full Array Deserialization
+**File:** `src/services/chat/ChatService.ts:318–320`
+**Status:** Open
+
+```ts
+{ "messages.content": { $regex: filters.search, $options: "i" } }
+```
+
+MongoDB must deserialize the entire embedded `messages` array for every chat document to evaluate this filter. There is no index support for nested array field regex searches.
+
+**Remediation:** Move to a dedicated `Message` collection with a text index on `content`, then query `Message.find({ chatId: { $in: userChatIds }, $text: { $search: searchTerm } })`.
+
+---
+
+### M5 — In-Memory Coordination State Not Shared Across Replicas
+**File:** `src/services/payment/paymentCoordinator.ts:25`
+**Status:** Open
+
+The `coordinationState` Map is process-local. In a multi-replica deployment, each process starts with an empty map; the 10-second debounce provides no cross-process deduplication. The DB guard in CRIT-1 (Logic Audit) is the correct production fix; this Map is supplementary at best.
+
+*(Code comment at line 24 already acknowledges this — a Redis migration is planned.)*
+
+---
+
+### M6 — `getSellers()` Is Unbounded (No Limit)
+**File:** `src/services/marketplace/PurchaseRequestService.ts:728–741`
+**Status:** Open
+
+```ts
+await User.find({ role: "seller", isEmailVerified: true }).select(...).lean()
+```
+
+No `.limit()`. Returns every seller in the collection. This method is exposed via API.
+
+**Remediation:** Add `.limit(500)` at minimum, or paginate with a cursor.
+
+---
+
+### M7 — `user-online` Event Broadcast to All Connected Sockets Globally
+**File:** `src/app.ts:300`
+**Status:** Open
+
+```ts
+socket.broadcast.emit('user-status-change', { userId, status: 'online', ... })
+```
+
+This pushes the online-presence event to every connected socket regardless of whether they have a relationship with the user. With 10 000 concurrent connections, this is 10 000 individual socket writes per login event.
+
+**Remediation:** Restrict to relevant rooms. For a marketplace, "relevant" means chat participants and active purchase request counterparties:
+```ts
+userChatIds.forEach(chatId => {
+  socket.to(`chat-${chatId}`).emit('user-status-change', { userId, status: 'online' });
+});
+```
+
+---
+
+### M8 — Post-Filter After Pagination Causes Wrong `totalItems` Count
+**File:** `src/services/marketplace/PurchaseRequestService.ts:258–337`
+**Status:** Open
+
+The code paginates the base query, then applies visibility filtering (offer ownership, anonymization) after pagination. This means `totalItems` in the response is the pre-filter count, not the count of items the caller can actually see. Users may see fewer items than indicated by the pagination metadata (e.g., "showing 8 of 10" when only 6 are visible).
+
+**Remediation:** Move visibility filtering into the MongoDB query (aggregation pipeline with `$match` conditions) so `countDocuments` reflects the actual visible set.
+
+---
+
+## LOW
+
+### L1 — `getUserChats` Runs Count Query Sequentially After Find
+**File:** `src/services/chat/ChatService.ts:324–335`
+**Status:** Open
+
+`Chat.find(query)` and `Chat.countDocuments(query)` are independent queries run sequentially. They should run in parallel.
+
+**Remediation:**
+```ts
+const [chats, total] = await Promise.all([
+  Chat.find(query).lean(),
+  Chat.countDocuments(query)
+]);
+```
+*(The notification service at `NotificationService.ts:53–61` already does this correctly — apply the same pattern.)*
+
+---
+
+### L2 — `findByIdAndUpdate` Results Not Using `.lean()`
+**File:** `src/services/marketplace/PurchaseRequestService.ts:418–425,566–572`
+**Status:** Open
+
+`findByIdAndUpdate` with `.populate(...)` returns full Mongoose document objects for data that is only read and returned to the client. `.lean()` saves ~15–30% memory and processing per returned document.
+
+**Remediation:** Add `.lean()` to read-only `findByIdAndUpdate` calls.
+
+---
+
+### L3 — Telegram Replay Maps Are Process-Local (Multi-Replica Gap, Same as M5)
+**File:** `src/services/telegram/telegramService.ts:395–396`
+**Status:** Open
+
+```ts
+const initDataReplayWindow = new Map<string, number>();
+const webhookReplayWindow = new Map<string, number>();
+```
+
+`cleanupReplayMap` is called lazily before each check — entries are evicted correctly once the window expires, so the Maps do not grow unboundedly under normal load. However, like `coordinationState`, they are process-local and will not catch replays across multiple Node replicas.
+
+**Remediation:** Redis `SET NX EX` (same as Security Audit H-1 recommendation).
+
+---
+
+### L4 — PasskeyService Challenge Map: 5-Minute Cleanup Is Correct, Process-Local Only
+**File:** `src/services/auth/passkeyService.ts:56–63`
+**Status:** Open
+
+Cleanup is implemented and the code comment (lines 50–55) already notes the Redis migration path. Not a performance risk at current scale. Same multi-replica caveat as L3.
+
+---
+
+## Confirmed PASS
+
+| Check | Result | Source |
+|-------|--------|--------|
+| `TempVerification` TTL index | PASS | `TempVerification.ts:59` — MongoDB-native TTL, no app polling |
+| `TelegramSession` TTL index | PASS | `TelegramSession.ts:66` |
+| `Notification` 90-day TTL index | PASS | `Notification.ts:77` — `expireAfterSeconds: 7776000` |
+| `FundsLedgerEntry` indexes (purchaseRequestId, paymentId, idempotencyKey) | PASS | `FundsLedgerEntry.ts:86–107` |
+| SHKeeper polling rate bounded | PASS | `shkeeperSimpleAuto.ts:64` — 2-minute interval, self-cleaning |
+| Socket.IO room cleanup on disconnect | PASS | Socket.IO removes socket from all rooms automatically on disconnect |
+| `getUserNotifications` count + find in parallel | PASS | `NotificationService.ts:53–61` — already uses `Promise.all` |
+| Telegram replay map bounded (lazy eviction) | PASS | `telegramService.ts:100` — `cleanupReplayMap` evicts expired entries |
+
+---
+
+## Index Additions Summary
+
+The following indexes should be added to eliminate collection scans identified in this audit:
+
+```ts
+// Payment.ts
+paymentSchema.index({ purchaseRequestId: 1, status: 1 });
+
+// PurchaseRequest.ts
+PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
+PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
+PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
+
+// PurchaseRequest.ts (text search)
+PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
+```
+
+Caution: adding indexes increases write latency and storage. Run `db.collection.getIndexes()` before adding to avoid duplicating existing indexes (some `unique: true` fields already create indexes automatically).
+
+---
+
+## Summary
+
+| Severity | Count |
+|----------|-------|
+| HIGH | 6 |
+| MEDIUM | 8 |
+| LOW | 4 |
+
+**Highest urgency:** H3 (seller fan-out) and H4 (full chat in memory) will cause visible degradation with modest growth. H1 (N+1 payments) and H2 (missing index) are easy wins with high query-count impact.
+
+---
+
+## Related
+
+- [[Security Audit - 2026-05-24]]
+- [[Logic Audit - 2026-05-24]]
+- [[Backend Architecture]]
+- [[Data Model Overview]]
+- [[Funds Ledger and Escrow State Machine Specification]]