- Three-stream audit (security / logic / performance) with 35+ findings derived from actual source code, each with file:line and remediation - Audit Index cross-references criticals across streams into prioritized fix tiers: immediately / before soft launch / before public launch - Telegram Mini App debug handoff documenting what was implemented and all remaining work items with exact file lists and test commands - Updated architecture, data model, auth API, and registration flow docs to reflect Telegram auth, TON wallet, and email verification additions Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
339 lines
14 KiB
Markdown
339 lines
14 KiB
Markdown
---
|
||
title: Performance Audit — 2026-05-24
|
||
tags: [audit, performance, database, findings]
|
||
created: 2026-05-24
|
||
status: open
|
||
---
|
||
|
||
# Performance Audit — 2026-05-24
|
||
|
||
Performance review of MongoDB queries, indexes, in-memory data structures, Socket.IO emission patterns, and pagination. Triggered by completion of Request Network integration, Telegram auth, and the internal funds ledger. Every finding is derived from actual source code.
|
||
|
||
> [!warning] Scale risk
|
||
> H3 (unbounded seller fan-out) and H4 (full chat document in memory) are time-bombs that will cause outages once user counts and message volumes grow past modest scale. Both have zero performance headroom at their current implementation.
|
||
|
||
---
|
||
|
||
## HIGH
|
||
|
||
### H1 — N+1 Query: `getPurchaseRequestsByBuyer` Issues One Payment Lookup Per Request Row
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:516–534`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
await Promise.all(requests.map(async (request) =>
|
||
Payment.findOne({ purchaseRequestId: request._id })
|
||
))
|
||
```
|
||
|
||
A page of 10 requests produces 11 DB round-trips (1 list + 10 payment lookups). With a 50-item page, that is 51 round-trips on every buyer dashboard load.
|
||
|
||
**Remediation:**
|
||
```ts
|
||
const requestIds = requests.map(r => r._id);
|
||
const payments = await Payment.find({ purchaseRequestId: { $in: requestIds } }).lean();
|
||
const paymentMap = new Map(payments.map(p => [p.purchaseRequestId.toString(), p]));
|
||
// then: requestWithPayment = { ...r, payment: paymentMap.get(r._id.toString()) }
|
||
```
|
||
|
||
---
|
||
|
||
### H2 — Missing Index on `Payment.purchaseRequestId`
|
||
**File:** `src/models/Payment.ts:190–196`
|
||
**Status:** Open
|
||
|
||
The Payment schema indexes `buyerId`, `sellerId`, `status`, `transactionHash`, and `providerPaymentId` — but not `purchaseRequestId`. Every call to `Payment.findOne({ purchaseRequestId })` (used in `getPurchaseRequestsByBuyer`, `getPurchaseRequestById`, payment coordinator, notification handlers) causes a full collection scan.
|
||
|
||
**Remediation:** Add to `Payment.ts`:
|
||
```ts
|
||
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
|
||
```
|
||
|
||
---
|
||
|
||
### H3 — Unbounded Seller Fan-Out on New Public Purchase Request
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:190–191`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
sellers = await User.find({ role: "seller", status: "active" }).select(...)
|
||
```
|
||
|
||
When a public purchase request is created, every active seller is fetched with no limit. For 10 000 sellers:
|
||
- 10 000 `Notification` documents are inserted (not bulk-inserted — individual creates in a loop).
|
||
- Each insert calls `emitRealTimeNotification` → `getUnreadCount` (an additional `countDocuments` query per seller = 10 000 extra queries).
|
||
- A 50 ms staggered `setTimeout` per seller means the event loop is managing 500 seconds of scheduled callbacks.
|
||
|
||
**Remediation:**
|
||
1. Cap the fan-out with a configurable batch limit (e.g., 500 most-relevant sellers by category).
|
||
2. Use `Notification.insertMany(docs)` for bulk creation.
|
||
3. Push a single room-broadcast (`io.to('sellers').emit(...)`) rather than per-user socket emits.
|
||
4. Move heavy fan-out to a background job queue (Bull/BullMQ).
|
||
|
||
---
|
||
|
||
### H4 — Full Chat Document with All Embedded Messages Loaded for Pagination
|
||
**File:** `src/services/chat/ChatService.ts:370–408`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
const chat = await Chat.findById(chatId).populate("messages.senderId", ...)
|
||
const messages = chat.messages.sort(...).slice(skip, skip + limit)
|
||
```
|
||
|
||
A chat with 50 000 messages (~5 KB each = 250 MB) is fully loaded into Node.js memory on every paginated message request. Pagination happens in JavaScript after the full document is fetched.
|
||
|
||
**Remediation (minimum):** Use MongoDB's `$slice` projection:
|
||
```ts
|
||
const chat = await Chat.findById(chatId, {
|
||
messages: { $slice: [skip, limit] },
|
||
participants: 1,
|
||
unreadCounts: 1
|
||
}).lean();
|
||
```
|
||
**Recommended long-term:** Extract messages into a separate `Message` collection with server-side pagination (`Message.find({ chatId }).sort({ timestamp: -1 }).skip(skip).limit(limit).lean()`).
|
||
|
||
---
|
||
|
||
### H5 — Extra `countDocuments` Query on Every Single Notification Creation
|
||
**File:** `src/services/notification/NotificationService.ts:131–152`
|
||
**Status:** Open
|
||
|
||
`emitRealTimeNotification` calls `emitUnreadCountUpdate`, which calls `getUnreadCount` (a full `countDocuments` query) synchronously within the notification creation hot path. When H3's 10 000-seller fan-out fires, this produces 10 000 additional `countDocuments` queries in rapid succession.
|
||
|
||
**Remediation:** Maintain an unread count field directly on the `User` or in a Redis hash. Increment on notification insert rather than re-counting. Or skip the re-query and have the client increment its local counter on socket event receipt.
|
||
|
||
---
|
||
|
||
### H6 — `getUserPayments` Queries Non-Indexed Field and Has No Default Limit
|
||
**File:** `src/services/payment/paymentService.ts:331–370`
|
||
**Status:** Open
|
||
|
||
The query builds `query.userId = userId` (line 343), but the Payment schema stores buyers as `buyerId` (ObjectId) — not `userId`. The query hits `metadata.legacyUserId`, which has no index. Additionally, if `options.limit` is not supplied the query returns all matching documents with no limit.
|
||
|
||
**Remediation:**
|
||
1. Fix field: `query.$or = [{ buyerId: userId }, { sellerId: userId }]`
|
||
2. Add default limit: `.limit(options.limit ?? 100)`
|
||
|
||
---
|
||
|
||
## MEDIUM
|
||
|
||
### M1 — Missing Compound Index `(buyerId, createdAt)` on PurchaseRequest
|
||
**File:** `src/models/PurchaseRequest.ts:360–369`
|
||
**Status:** Open
|
||
|
||
`getPurchaseRequestsByBuyer` queries `{ buyerId }` with `.sort({ createdAt: -1 })`. The existing single-field index on `buyerId` satisfies the filter but forces MongoDB to sort results post-scan. A compound index eliminates the sort entirely.
|
||
|
||
**Remediation:** Add to `PurchaseRequest.ts`:
|
||
```ts
|
||
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
|
||
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
|
||
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
|
||
```
|
||
|
||
---
|
||
|
||
### M2 — Unanchored Case-Insensitive Regex Search — Full Collection Scan
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:703–720`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
{ title: { $regex: searchTerm, $options: "i" } }
|
||
```
|
||
|
||
An unanchored, case-insensitive regex cannot use a B-tree index. MongoDB performs a full collection scan with in-memory string matching on every search request.
|
||
|
||
**Remediation:** Create a text index:
|
||
```ts
|
||
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
|
||
```
|
||
Query with `{ $text: { $search: searchTerm } }` and sort by `{ score: { $meta: 'textScore' } }`.
|
||
|
||
---
|
||
|
||
### M3 — Double-Fetch Pattern in Update Methods (No `.lean()` on Pre-Check)
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:406–425,557–572`
|
||
**Status:** Open
|
||
|
||
Both `updatePurchaseRequest` and `updatePurchaseRequestStatus` call `PurchaseRequest.findById(requestId)` to validate current status, then call `findByIdAndUpdate`. The pre-check `findById` instantiates a full Mongoose document that is immediately discarded.
|
||
|
||
**Remediation:** Use `findByIdAndUpdate` with `{ new: false }` to get the pre-update document in a single round-trip, or use `.lean().select('status')` on the pre-check to minimize overhead.
|
||
|
||
---
|
||
|
||
### M4 — Regex on Embedded `messages.content` Array — Full Array Deserialization
|
||
**File:** `src/services/chat/ChatService.ts:318–320`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
{ "messages.content": { $regex: filters.search, $options: "i" } }
|
||
```
|
||
|
||
MongoDB must deserialize the entire embedded `messages` array for every chat document to evaluate this filter. There is no index support for nested array field regex searches.
|
||
|
||
**Remediation:** Move to a dedicated `Message` collection with a text index on `content`, then query `Message.find({ chatId: { $in: userChatIds }, $text: { $search: searchTerm } })`.
|
||
|
||
---
|
||
|
||
### M5 — In-Memory Coordination State Not Shared Across Replicas
|
||
**File:** `src/services/payment/paymentCoordinator.ts:25`
|
||
**Status:** Open
|
||
|
||
The `coordinationState` Map is process-local. In a multi-replica deployment, each process starts with an empty map; the 10-second debounce provides no cross-process deduplication. The DB guard in CRIT-1 (Logic Audit) is the correct production fix; this Map is supplementary at best.
|
||
|
||
*(Code comment at line 24 already acknowledges this — a Redis migration is planned.)*
|
||
|
||
---
|
||
|
||
### M6 — `getSellers()` Is Unbounded (No Limit)
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:728–741`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
await User.find({ role: "seller", isEmailVerified: true }).select(...).lean()
|
||
```
|
||
|
||
No `.limit()`. Returns every seller in the collection. This method is exposed via API.
|
||
|
||
**Remediation:** Add `.limit(500)` at minimum, or paginate with a cursor.
|
||
|
||
---
|
||
|
||
### M7 — `user-online` Event Broadcast to All Connected Sockets Globally
|
||
**File:** `src/app.ts:300`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
socket.broadcast.emit('user-status-change', { userId, status: 'online', ... })
|
||
```
|
||
|
||
This pushes the online-presence event to every connected socket regardless of whether they have a relationship with the user. With 10 000 concurrent connections, this is 10 000 individual socket writes per login event.
|
||
|
||
**Remediation:** Restrict to relevant rooms. For a marketplace, "relevant" means chat participants and active purchase request counterparties:
|
||
```ts
|
||
userChatIds.forEach(chatId => {
|
||
socket.to(`chat-${chatId}`).emit('user-status-change', { userId, status: 'online' });
|
||
});
|
||
```
|
||
|
||
---
|
||
|
||
### M8 — Post-Filter After Pagination Causes Wrong `totalItems` Count
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:258–337`
|
||
**Status:** Open
|
||
|
||
The code paginates the base query, then applies visibility filtering (offer ownership, anonymization) after pagination. This means `totalItems` in the response is the pre-filter count, not the count of items the caller can actually see. Users may see fewer items than indicated by the pagination metadata (e.g., "showing 8 of 10" when only 6 are visible).
|
||
|
||
**Remediation:** Move visibility filtering into the MongoDB query (aggregation pipeline with `$match` conditions) so `countDocuments` reflects the actual visible set.
|
||
|
||
---
|
||
|
||
## LOW
|
||
|
||
### L1 — `getUserChats` Runs Count Query Sequentially After Find
|
||
**File:** `src/services/chat/ChatService.ts:324–335`
|
||
**Status:** Open
|
||
|
||
`Chat.find(query)` and `Chat.countDocuments(query)` are independent queries run sequentially. They should run in parallel.
|
||
|
||
**Remediation:**
|
||
```ts
|
||
const [chats, total] = await Promise.all([
|
||
Chat.find(query).lean(),
|
||
Chat.countDocuments(query)
|
||
]);
|
||
```
|
||
*(The notification service at `NotificationService.ts:53–61` already does this correctly — apply the same pattern.)*
|
||
|
||
---
|
||
|
||
### L2 — `findByIdAndUpdate` Results Not Using `.lean()`
|
||
**File:** `src/services/marketplace/PurchaseRequestService.ts:418–425,566–572`
|
||
**Status:** Open
|
||
|
||
`findByIdAndUpdate` with `.populate(...)` returns full Mongoose document objects for data that is only read and returned to the client. `.lean()` saves ~15–30% memory and processing per returned document.
|
||
|
||
**Remediation:** Add `.lean()` to read-only `findByIdAndUpdate` calls.
|
||
|
||
---
|
||
|
||
### L3 — Telegram Replay Maps Are Process-Local (Multi-Replica Gap, Same as M5)
|
||
**File:** `src/services/telegram/telegramService.ts:395–396`
|
||
**Status:** Open
|
||
|
||
```ts
|
||
const initDataReplayWindow = new Map<string, number>();
|
||
const webhookReplayWindow = new Map<string, number>();
|
||
```
|
||
|
||
`cleanupReplayMap` is called lazily before each check — entries are evicted correctly once the window expires, so the Maps do not grow unboundedly under normal load. However, like `coordinationState`, they are process-local and will not catch replays across multiple Node replicas.
|
||
|
||
**Remediation:** Redis `SET NX EX` (same as Security Audit H-1 recommendation).
|
||
|
||
---
|
||
|
||
### L4 — PasskeyService Challenge Map: 5-Minute Cleanup Is Correct, Process-Local Only
|
||
**File:** `src/services/auth/passkeyService.ts:56–63`
|
||
**Status:** Open
|
||
|
||
Cleanup is implemented and the code comment (lines 50–55) already notes the Redis migration path. Not a performance risk at current scale. Same multi-replica caveat as L3.
|
||
|
||
---
|
||
|
||
## Confirmed PASS
|
||
|
||
| Check | Result | Source |
|
||
|-------|--------|--------|
|
||
| `TempVerification` TTL index | PASS | `TempVerification.ts:59` — MongoDB-native TTL, no app polling |
|
||
| `TelegramSession` TTL index | PASS | `TelegramSession.ts:66` |
|
||
| `Notification` 90-day TTL index | PASS | `Notification.ts:77` — `expireAfterSeconds: 7776000` |
|
||
| `FundsLedgerEntry` indexes (purchaseRequestId, paymentId, idempotencyKey) | PASS | `FundsLedgerEntry.ts:86–107` |
|
||
| SHKeeper polling rate bounded | PASS | `shkeeperSimpleAuto.ts:64` — 2-minute interval, self-cleaning |
|
||
| Socket.IO room cleanup on disconnect | PASS | Socket.IO removes socket from all rooms automatically on disconnect |
|
||
| `getUserNotifications` count + find in parallel | PASS | `NotificationService.ts:53–61` — already uses `Promise.all` |
|
||
| Telegram replay map bounded (lazy eviction) | PASS | `telegramService.ts:100` — `cleanupReplayMap` evicts expired entries |
|
||
|
||
---
|
||
|
||
## Index Additions Summary
|
||
|
||
The following indexes should be added to eliminate collection scans identified in this audit:
|
||
|
||
```ts
|
||
// Payment.ts
|
||
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
|
||
|
||
// PurchaseRequest.ts
|
||
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
|
||
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
|
||
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
|
||
|
||
// PurchaseRequest.ts (text search)
|
||
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
|
||
```
|
||
|
||
Caution: adding indexes increases write latency and storage. Run `db.collection.getIndexes()` before adding to avoid duplicating existing indexes (some `unique: true` fields already create indexes automatically).
|
||
|
||
---
|
||
|
||
## Summary
|
||
|
||
| Severity | Count |
|
||
|----------|-------|
|
||
| HIGH | 6 |
|
||
| MEDIUM | 8 |
|
||
| LOW | 4 |
|
||
|
||
**Highest urgency:** H3 (seller fan-out) and H4 (full chat in memory) will cause visible degradation with modest growth. H1 (N+1 payments) and H2 (missing index) are easy wins with high query-count impact.
|
||
|
||
---
|
||
|
||
## Related
|
||
|
||
- [[Security Audit - 2026-05-24]]
|
||
- [[Logic Audit - 2026-05-24]]
|
||
- [[Backend Architecture]]
|
||
- [[Data Model Overview]]
|
||
- [[Funds Ledger and Escrow State Machine Specification]]
|