Files
nick-doc/09 - Audits/Performance Audit - 2026-05-24.md
Siavash Sameni 940ad0c655 Add full system audit reports and Telegram Mini App debug handoff
- Three-stream audit (security / logic / performance) with 35+ findings
  derived from actual source code, each with file:line and remediation
- Audit Index cross-references criticals across streams into prioritized
  fix tiers: immediately / before soft launch / before public launch
- Telegram Mini App debug handoff documenting what was implemented and
  all remaining work items with exact file lists and test commands
- Updated architecture, data model, auth API, and registration flow docs
  to reflect Telegram auth, TON wallet, and email verification additions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-24 17:20:08 +04:00

339 lines
14 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
title: Performance Audit — 2026-05-24
tags: [audit, performance, database, findings]
created: 2026-05-24
status: open
---
# Performance Audit — 2026-05-24
Performance review of MongoDB queries, indexes, in-memory data structures, Socket.IO emission patterns, and pagination. Triggered by completion of Request Network integration, Telegram auth, and the internal funds ledger. Every finding is derived from actual source code.
> [!warning] Scale risk
> H3 (unbounded seller fan-out) and H4 (full chat document in memory) are time-bombs that will cause outages once user counts and message volumes grow past modest scale. Both have zero performance headroom at their current implementation.
---
## HIGH
### H1 — N+1 Query: `getPurchaseRequestsByBuyer` Issues One Payment Lookup Per Request Row
**File:** `src/services/marketplace/PurchaseRequestService.ts:516534`
**Status:** Open
```ts
await Promise.all(requests.map(async (request) =>
Payment.findOne({ purchaseRequestId: request._id })
))
```
A page of 10 requests produces 11 DB round-trips (1 list + 10 payment lookups). With a 50-item page, that is 51 round-trips on every buyer dashboard load.
**Remediation:**
```ts
const requestIds = requests.map(r => r._id);
const payments = await Payment.find({ purchaseRequestId: { $in: requestIds } }).lean();
const paymentMap = new Map(payments.map(p => [p.purchaseRequestId.toString(), p]));
// then: requestWithPayment = { ...r, payment: paymentMap.get(r._id.toString()) }
```
---
### H2 — Missing Index on `Payment.purchaseRequestId`
**File:** `src/models/Payment.ts:190196`
**Status:** Open
The Payment schema indexes `buyerId`, `sellerId`, `status`, `transactionHash`, and `providerPaymentId` — but not `purchaseRequestId`. Every call to `Payment.findOne({ purchaseRequestId })` (used in `getPurchaseRequestsByBuyer`, `getPurchaseRequestById`, payment coordinator, notification handlers) causes a full collection scan.
**Remediation:** Add to `Payment.ts`:
```ts
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
```
---
### H3 — Unbounded Seller Fan-Out on New Public Purchase Request
**File:** `src/services/marketplace/PurchaseRequestService.ts:190191`
**Status:** Open
```ts
sellers = await User.find({ role: "seller", status: "active" }).select(...)
```
When a public purchase request is created, every active seller is fetched with no limit. For 10 000 sellers:
- 10 000 `Notification` documents are inserted (not bulk-inserted — individual creates in a loop).
- Each insert calls `emitRealTimeNotification``getUnreadCount` (an additional `countDocuments` query per seller = 10 000 extra queries).
- A 50 ms staggered `setTimeout` per seller means the event loop is managing 500 seconds of scheduled callbacks.
**Remediation:**
1. Cap the fan-out with a configurable batch limit (e.g., 500 most-relevant sellers by category).
2. Use `Notification.insertMany(docs)` for bulk creation.
3. Push a single room-broadcast (`io.to('sellers').emit(...)`) rather than per-user socket emits.
4. Move heavy fan-out to a background job queue (Bull/BullMQ).
---
### H4 — Full Chat Document with All Embedded Messages Loaded for Pagination
**File:** `src/services/chat/ChatService.ts:370408`
**Status:** Open
```ts
const chat = await Chat.findById(chatId).populate("messages.senderId", ...)
const messages = chat.messages.sort(...).slice(skip, skip + limit)
```
A chat with 50 000 messages (~5 KB each = 250 MB) is fully loaded into Node.js memory on every paginated message request. Pagination happens in JavaScript after the full document is fetched.
**Remediation (minimum):** Use MongoDB's `$slice` projection:
```ts
const chat = await Chat.findById(chatId, {
messages: { $slice: [skip, limit] },
participants: 1,
unreadCounts: 1
}).lean();
```
**Recommended long-term:** Extract messages into a separate `Message` collection with server-side pagination (`Message.find({ chatId }).sort({ timestamp: -1 }).skip(skip).limit(limit).lean()`).
---
### H5 — Extra `countDocuments` Query on Every Single Notification Creation
**File:** `src/services/notification/NotificationService.ts:131152`
**Status:** Open
`emitRealTimeNotification` calls `emitUnreadCountUpdate`, which calls `getUnreadCount` (a full `countDocuments` query) synchronously within the notification creation hot path. When H3's 10 000-seller fan-out fires, this produces 10 000 additional `countDocuments` queries in rapid succession.
**Remediation:** Maintain an unread count field directly on the `User` or in a Redis hash. Increment on notification insert rather than re-counting. Or skip the re-query and have the client increment its local counter on socket event receipt.
---
### H6 — `getUserPayments` Queries Non-Indexed Field and Has No Default Limit
**File:** `src/services/payment/paymentService.ts:331370`
**Status:** Open
The query builds `query.userId = userId` (line 343), but the Payment schema stores buyers as `buyerId` (ObjectId) — not `userId`. The query hits `metadata.legacyUserId`, which has no index. Additionally, if `options.limit` is not supplied the query returns all matching documents with no limit.
**Remediation:**
1. Fix field: `query.$or = [{ buyerId: userId }, { sellerId: userId }]`
2. Add default limit: `.limit(options.limit ?? 100)`
---
## MEDIUM
### M1 — Missing Compound Index `(buyerId, createdAt)` on PurchaseRequest
**File:** `src/models/PurchaseRequest.ts:360369`
**Status:** Open
`getPurchaseRequestsByBuyer` queries `{ buyerId }` with `.sort({ createdAt: -1 })`. The existing single-field index on `buyerId` satisfies the filter but forces MongoDB to sort results post-scan. A compound index eliminates the sort entirely.
**Remediation:** Add to `PurchaseRequest.ts`:
```ts
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
```
---
### M2 — Unanchored Case-Insensitive Regex Search — Full Collection Scan
**File:** `src/services/marketplace/PurchaseRequestService.ts:703720`
**Status:** Open
```ts
{ title: { $regex: searchTerm, $options: "i" } }
```
An unanchored, case-insensitive regex cannot use a B-tree index. MongoDB performs a full collection scan with in-memory string matching on every search request.
**Remediation:** Create a text index:
```ts
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
```
Query with `{ $text: { $search: searchTerm } }` and sort by `{ score: { $meta: 'textScore' } }`.
---
### M3 — Double-Fetch Pattern in Update Methods (No `.lean()` on Pre-Check)
**File:** `src/services/marketplace/PurchaseRequestService.ts:406425,557572`
**Status:** Open
Both `updatePurchaseRequest` and `updatePurchaseRequestStatus` call `PurchaseRequest.findById(requestId)` to validate current status, then call `findByIdAndUpdate`. The pre-check `findById` instantiates a full Mongoose document that is immediately discarded.
**Remediation:** Use `findByIdAndUpdate` with `{ new: false }` to get the pre-update document in a single round-trip, or use `.lean().select('status')` on the pre-check to minimize overhead.
---
### M4 — Regex on Embedded `messages.content` Array — Full Array Deserialization
**File:** `src/services/chat/ChatService.ts:318320`
**Status:** Open
```ts
{ "messages.content": { $regex: filters.search, $options: "i" } }
```
MongoDB must deserialize the entire embedded `messages` array for every chat document to evaluate this filter. There is no index support for nested array field regex searches.
**Remediation:** Move to a dedicated `Message` collection with a text index on `content`, then query `Message.find({ chatId: { $in: userChatIds }, $text: { $search: searchTerm } })`.
---
### M5 — In-Memory Coordination State Not Shared Across Replicas
**File:** `src/services/payment/paymentCoordinator.ts:25`
**Status:** Open
The `coordinationState` Map is process-local. In a multi-replica deployment, each process starts with an empty map; the 10-second debounce provides no cross-process deduplication. The DB guard in CRIT-1 (Logic Audit) is the correct production fix; this Map is supplementary at best.
*(Code comment at line 24 already acknowledges this — a Redis migration is planned.)*
---
### M6 — `getSellers()` Is Unbounded (No Limit)
**File:** `src/services/marketplace/PurchaseRequestService.ts:728741`
**Status:** Open
```ts
await User.find({ role: "seller", isEmailVerified: true }).select(...).lean()
```
No `.limit()`. Returns every seller in the collection. This method is exposed via API.
**Remediation:** Add `.limit(500)` at minimum, or paginate with a cursor.
---
### M7 — `user-online` Event Broadcast to All Connected Sockets Globally
**File:** `src/app.ts:300`
**Status:** Open
```ts
socket.broadcast.emit('user-status-change', { userId, status: 'online', ... })
```
This pushes the online-presence event to every connected socket regardless of whether they have a relationship with the user. With 10 000 concurrent connections, this is 10 000 individual socket writes per login event.
**Remediation:** Restrict to relevant rooms. For a marketplace, "relevant" means chat participants and active purchase request counterparties:
```ts
userChatIds.forEach(chatId => {
socket.to(`chat-${chatId}`).emit('user-status-change', { userId, status: 'online' });
});
```
---
### M8 — Post-Filter After Pagination Causes Wrong `totalItems` Count
**File:** `src/services/marketplace/PurchaseRequestService.ts:258337`
**Status:** Open
The code paginates the base query, then applies visibility filtering (offer ownership, anonymization) after pagination. This means `totalItems` in the response is the pre-filter count, not the count of items the caller can actually see. Users may see fewer items than indicated by the pagination metadata (e.g., "showing 8 of 10" when only 6 are visible).
**Remediation:** Move visibility filtering into the MongoDB query (aggregation pipeline with `$match` conditions) so `countDocuments` reflects the actual visible set.
---
## LOW
### L1 — `getUserChats` Runs Count Query Sequentially After Find
**File:** `src/services/chat/ChatService.ts:324335`
**Status:** Open
`Chat.find(query)` and `Chat.countDocuments(query)` are independent queries run sequentially. They should run in parallel.
**Remediation:**
```ts
const [chats, total] = await Promise.all([
Chat.find(query).lean(),
Chat.countDocuments(query)
]);
```
*(The notification service at `NotificationService.ts:5361` already does this correctly — apply the same pattern.)*
---
### L2 — `findByIdAndUpdate` Results Not Using `.lean()`
**File:** `src/services/marketplace/PurchaseRequestService.ts:418425,566572`
**Status:** Open
`findByIdAndUpdate` with `.populate(...)` returns full Mongoose document objects for data that is only read and returned to the client. `.lean()` saves ~1530% memory and processing per returned document.
**Remediation:** Add `.lean()` to read-only `findByIdAndUpdate` calls.
---
### L3 — Telegram Replay Maps Are Process-Local (Multi-Replica Gap, Same as M5)
**File:** `src/services/telegram/telegramService.ts:395396`
**Status:** Open
```ts
const initDataReplayWindow = new Map<string, number>();
const webhookReplayWindow = new Map<string, number>();
```
`cleanupReplayMap` is called lazily before each check — entries are evicted correctly once the window expires, so the Maps do not grow unboundedly under normal load. However, like `coordinationState`, they are process-local and will not catch replays across multiple Node replicas.
**Remediation:** Redis `SET NX EX` (same as Security Audit H-1 recommendation).
---
### L4 — PasskeyService Challenge Map: 5-Minute Cleanup Is Correct, Process-Local Only
**File:** `src/services/auth/passkeyService.ts:5663`
**Status:** Open
Cleanup is implemented and the code comment (lines 5055) already notes the Redis migration path. Not a performance risk at current scale. Same multi-replica caveat as L3.
---
## Confirmed PASS
| Check | Result | Source |
|-------|--------|--------|
| `TempVerification` TTL index | PASS | `TempVerification.ts:59` — MongoDB-native TTL, no app polling |
| `TelegramSession` TTL index | PASS | `TelegramSession.ts:66` |
| `Notification` 90-day TTL index | PASS | `Notification.ts:77``expireAfterSeconds: 7776000` |
| `FundsLedgerEntry` indexes (purchaseRequestId, paymentId, idempotencyKey) | PASS | `FundsLedgerEntry.ts:86107` |
| SHKeeper polling rate bounded | PASS | `shkeeperSimpleAuto.ts:64` — 2-minute interval, self-cleaning |
| Socket.IO room cleanup on disconnect | PASS | Socket.IO removes socket from all rooms automatically on disconnect |
| `getUserNotifications` count + find in parallel | PASS | `NotificationService.ts:5361` — already uses `Promise.all` |
| Telegram replay map bounded (lazy eviction) | PASS | `telegramService.ts:100``cleanupReplayMap` evicts expired entries |
---
## Index Additions Summary
The following indexes should be added to eliminate collection scans identified in this audit:
```ts
// Payment.ts
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
// PurchaseRequest.ts
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
// PurchaseRequest.ts (text search)
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
```
Caution: adding indexes increases write latency and storage. Run `db.collection.getIndexes()` before adding to avoid duplicating existing indexes (some `unique: true` fields already create indexes automatically).
---
## Summary
| Severity | Count |
|----------|-------|
| HIGH | 6 |
| MEDIUM | 8 |
| LOW | 4 |
**Highest urgency:** H3 (seller fan-out) and H4 (full chat in memory) will cause visible degradation with modest growth. H1 (N+1 payments) and H2 (missing index) are easy wins with high query-count impact.
---
## Related
- [[Security Audit - 2026-05-24]]
- [[Logic Audit - 2026-05-24]]
- [[Backend Architecture]]
- [[Data Model Overview]]
- [[Funds Ledger and Escrow State Machine Specification]]