Add full system audit reports and Telegram Mini App debug handoff

- Three-stream audit (security / logic / performance) with 35+ findings
  derived from actual source code, each with file:line and remediation
- Audit Index cross-references criticals across streams into prioritized
  fix tiers: immediately / before soft launch / before public launch
- Telegram Mini App debug handoff documenting what was implemented and
  all remaining work items with exact file lists and test commands
- Updated architecture, data model, auth API, and registration flow docs
  to reflect Telegram auth, TON wallet, and email verification additions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
Siavash Sameni
2026-05-24 17:20:08 +04:00
parent 2533bedb91
commit 940ad0c655
12 changed files with 1795 additions and 28 deletions

View File

@@ -0,0 +1,338 @@
---
title: Performance Audit — 2026-05-24
tags: [audit, performance, database, findings]
created: 2026-05-24
status: open
---
# Performance Audit — 2026-05-24
Performance review of MongoDB queries, indexes, in-memory data structures, Socket.IO emission patterns, and pagination. Triggered by completion of Request Network integration, Telegram auth, and the internal funds ledger. Every finding is derived from actual source code.
> [!warning] Scale risk
> H3 (unbounded seller fan-out) and H4 (full chat document in memory) are time-bombs that will cause outages once user counts and message volumes grow past modest scale. Both have zero performance headroom at their current implementation.
---
## HIGH
### H1 — N+1 Query: `getPurchaseRequestsByBuyer` Issues One Payment Lookup Per Request Row
**File:** `src/services/marketplace/PurchaseRequestService.ts:516534`
**Status:** Open
```ts
await Promise.all(requests.map(async (request) =>
Payment.findOne({ purchaseRequestId: request._id })
))
```
A page of 10 requests produces 11 DB round-trips (1 list + 10 payment lookups). With a 50-item page, that is 51 round-trips on every buyer dashboard load.
**Remediation:**
```ts
const requestIds = requests.map(r => r._id);
const payments = await Payment.find({ purchaseRequestId: { $in: requestIds } }).lean();
const paymentMap = new Map(payments.map(p => [p.purchaseRequestId.toString(), p]));
// then: requestWithPayment = { ...r, payment: paymentMap.get(r._id.toString()) }
```
---
### H2 — Missing Index on `Payment.purchaseRequestId`
**File:** `src/models/Payment.ts:190196`
**Status:** Open
The Payment schema indexes `buyerId`, `sellerId`, `status`, `transactionHash`, and `providerPaymentId` — but not `purchaseRequestId`. Every call to `Payment.findOne({ purchaseRequestId })` (used in `getPurchaseRequestsByBuyer`, `getPurchaseRequestById`, payment coordinator, notification handlers) causes a full collection scan.
**Remediation:** Add to `Payment.ts`:
```ts
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
```
---
### H3 — Unbounded Seller Fan-Out on New Public Purchase Request
**File:** `src/services/marketplace/PurchaseRequestService.ts:190191`
**Status:** Open
```ts
sellers = await User.find({ role: "seller", status: "active" }).select(...)
```
When a public purchase request is created, every active seller is fetched with no limit. For 10 000 sellers:
- 10 000 `Notification` documents are inserted (not bulk-inserted — individual creates in a loop).
- Each insert calls `emitRealTimeNotification``getUnreadCount` (an additional `countDocuments` query per seller = 10 000 extra queries).
- A 50 ms staggered `setTimeout` per seller means the event loop is managing 500 seconds of scheduled callbacks.
**Remediation:**
1. Cap the fan-out with a configurable batch limit (e.g., 500 most-relevant sellers by category).
2. Use `Notification.insertMany(docs)` for bulk creation.
3. Push a single room-broadcast (`io.to('sellers').emit(...)`) rather than per-user socket emits.
4. Move heavy fan-out to a background job queue (Bull/BullMQ).
---
### H4 — Full Chat Document with All Embedded Messages Loaded for Pagination
**File:** `src/services/chat/ChatService.ts:370408`
**Status:** Open
```ts
const chat = await Chat.findById(chatId).populate("messages.senderId", ...)
const messages = chat.messages.sort(...).slice(skip, skip + limit)
```
A chat with 50 000 messages (~5 KB each = 250 MB) is fully loaded into Node.js memory on every paginated message request. Pagination happens in JavaScript after the full document is fetched.
**Remediation (minimum):** Use MongoDB's `$slice` projection:
```ts
const chat = await Chat.findById(chatId, {
messages: { $slice: [skip, limit] },
participants: 1,
unreadCounts: 1
}).lean();
```
**Recommended long-term:** Extract messages into a separate `Message` collection with server-side pagination (`Message.find({ chatId }).sort({ timestamp: -1 }).skip(skip).limit(limit).lean()`).
---
### H5 — Extra `countDocuments` Query on Every Single Notification Creation
**File:** `src/services/notification/NotificationService.ts:131152`
**Status:** Open
`emitRealTimeNotification` calls `emitUnreadCountUpdate`, which calls `getUnreadCount` (a full `countDocuments` query) synchronously within the notification creation hot path. When H3's 10 000-seller fan-out fires, this produces 10 000 additional `countDocuments` queries in rapid succession.
**Remediation:** Maintain an unread count field directly on the `User` or in a Redis hash. Increment on notification insert rather than re-counting. Or skip the re-query and have the client increment its local counter on socket event receipt.
---
### H6 — `getUserPayments` Queries Non-Indexed Field and Has No Default Limit
**File:** `src/services/payment/paymentService.ts:331370`
**Status:** Open
The query builds `query.userId = userId` (line 343), but the Payment schema stores buyers as `buyerId` (ObjectId) — not `userId`. The query hits `metadata.legacyUserId`, which has no index. Additionally, if `options.limit` is not supplied the query returns all matching documents with no limit.
**Remediation:**
1. Fix field: `query.$or = [{ buyerId: userId }, { sellerId: userId }]`
2. Add default limit: `.limit(options.limit ?? 100)`
---
## MEDIUM
### M1 — Missing Compound Index `(buyerId, createdAt)` on PurchaseRequest
**File:** `src/models/PurchaseRequest.ts:360369`
**Status:** Open
`getPurchaseRequestsByBuyer` queries `{ buyerId }` with `.sort({ createdAt: -1 })`. The existing single-field index on `buyerId` satisfies the filter but forces MongoDB to sort results post-scan. A compound index eliminates the sort entirely.
**Remediation:** Add to `PurchaseRequest.ts`:
```ts
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
```
---
### M2 — Unanchored Case-Insensitive Regex Search — Full Collection Scan
**File:** `src/services/marketplace/PurchaseRequestService.ts:703720`
**Status:** Open
```ts
{ title: { $regex: searchTerm, $options: "i" } }
```
An unanchored, case-insensitive regex cannot use a B-tree index. MongoDB performs a full collection scan with in-memory string matching on every search request.
**Remediation:** Create a text index:
```ts
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
```
Query with `{ $text: { $search: searchTerm } }` and sort by `{ score: { $meta: 'textScore' } }`.
---
### M3 — Double-Fetch Pattern in Update Methods (No `.lean()` on Pre-Check)
**File:** `src/services/marketplace/PurchaseRequestService.ts:406425,557572`
**Status:** Open
Both `updatePurchaseRequest` and `updatePurchaseRequestStatus` call `PurchaseRequest.findById(requestId)` to validate current status, then call `findByIdAndUpdate`. The pre-check `findById` instantiates a full Mongoose document that is immediately discarded.
**Remediation:** Use `findByIdAndUpdate` with `{ new: false }` to get the pre-update document in a single round-trip, or use `.lean().select('status')` on the pre-check to minimize overhead.
---
### M4 — Regex on Embedded `messages.content` Array — Full Array Deserialization
**File:** `src/services/chat/ChatService.ts:318320`
**Status:** Open
```ts
{ "messages.content": { $regex: filters.search, $options: "i" } }
```
MongoDB must deserialize the entire embedded `messages` array for every chat document to evaluate this filter. There is no index support for nested array field regex searches.
**Remediation:** Move to a dedicated `Message` collection with a text index on `content`, then query `Message.find({ chatId: { $in: userChatIds }, $text: { $search: searchTerm } })`.
---
### M5 — In-Memory Coordination State Not Shared Across Replicas
**File:** `src/services/payment/paymentCoordinator.ts:25`
**Status:** Open
The `coordinationState` Map is process-local. In a multi-replica deployment, each process starts with an empty map; the 10-second debounce provides no cross-process deduplication. The DB guard in CRIT-1 (Logic Audit) is the correct production fix; this Map is supplementary at best.
*(Code comment at line 24 already acknowledges this — a Redis migration is planned.)*
---
### M6 — `getSellers()` Is Unbounded (No Limit)
**File:** `src/services/marketplace/PurchaseRequestService.ts:728741`
**Status:** Open
```ts
await User.find({ role: "seller", isEmailVerified: true }).select(...).lean()
```
No `.limit()`. Returns every seller in the collection. This method is exposed via API.
**Remediation:** Add `.limit(500)` at minimum, or paginate with a cursor.
---
### M7 — `user-online` Event Broadcast to All Connected Sockets Globally
**File:** `src/app.ts:300`
**Status:** Open
```ts
socket.broadcast.emit('user-status-change', { userId, status: 'online', ... })
```
This pushes the online-presence event to every connected socket regardless of whether they have a relationship with the user. With 10 000 concurrent connections, this is 10 000 individual socket writes per login event.
**Remediation:** Restrict to relevant rooms. For a marketplace, "relevant" means chat participants and active purchase request counterparties:
```ts
userChatIds.forEach(chatId => {
socket.to(`chat-${chatId}`).emit('user-status-change', { userId, status: 'online' });
});
```
---
### M8 — Post-Filter After Pagination Causes Wrong `totalItems` Count
**File:** `src/services/marketplace/PurchaseRequestService.ts:258337`
**Status:** Open
The code paginates the base query, then applies visibility filtering (offer ownership, anonymization) after pagination. This means `totalItems` in the response is the pre-filter count, not the count of items the caller can actually see. Users may see fewer items than indicated by the pagination metadata (e.g., "showing 8 of 10" when only 6 are visible).
**Remediation:** Move visibility filtering into the MongoDB query (aggregation pipeline with `$match` conditions) so `countDocuments` reflects the actual visible set.
---
## LOW
### L1 — `getUserChats` Runs Count Query Sequentially After Find
**File:** `src/services/chat/ChatService.ts:324335`
**Status:** Open
`Chat.find(query)` and `Chat.countDocuments(query)` are independent queries run sequentially. They should run in parallel.
**Remediation:**
```ts
const [chats, total] = await Promise.all([
Chat.find(query).lean(),
Chat.countDocuments(query)
]);
```
*(The notification service at `NotificationService.ts:5361` already does this correctly — apply the same pattern.)*
---
### L2 — `findByIdAndUpdate` Results Not Using `.lean()`
**File:** `src/services/marketplace/PurchaseRequestService.ts:418425,566572`
**Status:** Open
`findByIdAndUpdate` with `.populate(...)` returns full Mongoose document objects for data that is only read and returned to the client. `.lean()` saves ~1530% memory and processing per returned document.
**Remediation:** Add `.lean()` to read-only `findByIdAndUpdate` calls.
---
### L3 — Telegram Replay Maps Are Process-Local (Multi-Replica Gap, Same as M5)
**File:** `src/services/telegram/telegramService.ts:395396`
**Status:** Open
```ts
const initDataReplayWindow = new Map<string, number>();
const webhookReplayWindow = new Map<string, number>();
```
`cleanupReplayMap` is called lazily before each check — entries are evicted correctly once the window expires, so the Maps do not grow unboundedly under normal load. However, like `coordinationState`, they are process-local and will not catch replays across multiple Node replicas.
**Remediation:** Redis `SET NX EX` (same as Security Audit H-1 recommendation).
---
### L4 — PasskeyService Challenge Map: 5-Minute Cleanup Is Correct, Process-Local Only
**File:** `src/services/auth/passkeyService.ts:5663`
**Status:** Open
Cleanup is implemented and the code comment (lines 5055) already notes the Redis migration path. Not a performance risk at current scale. Same multi-replica caveat as L3.
---
## Confirmed PASS
| Check | Result | Source |
|-------|--------|--------|
| `TempVerification` TTL index | PASS | `TempVerification.ts:59` — MongoDB-native TTL, no app polling |
| `TelegramSession` TTL index | PASS | `TelegramSession.ts:66` |
| `Notification` 90-day TTL index | PASS | `Notification.ts:77``expireAfterSeconds: 7776000` |
| `FundsLedgerEntry` indexes (purchaseRequestId, paymentId, idempotencyKey) | PASS | `FundsLedgerEntry.ts:86107` |
| SHKeeper polling rate bounded | PASS | `shkeeperSimpleAuto.ts:64` — 2-minute interval, self-cleaning |
| Socket.IO room cleanup on disconnect | PASS | Socket.IO removes socket from all rooms automatically on disconnect |
| `getUserNotifications` count + find in parallel | PASS | `NotificationService.ts:5361` — already uses `Promise.all` |
| Telegram replay map bounded (lazy eviction) | PASS | `telegramService.ts:100``cleanupReplayMap` evicts expired entries |
---
## Index Additions Summary
The following indexes should be added to eliminate collection scans identified in this audit:
```ts
// Payment.ts
paymentSchema.index({ purchaseRequestId: 1, status: 1 });
// PurchaseRequest.ts
PurchaseRequestSchema.index({ buyerId: 1, createdAt: -1 });
PurchaseRequestSchema.index({ status: 1, createdAt: -1 });
PurchaseRequestSchema.index({ categoryId: 1, status: 1, createdAt: -1 });
// PurchaseRequest.ts (text search)
PurchaseRequestSchema.index({ title: 'text', description: 'text', tags: 'text' });
```
Caution: adding indexes increases write latency and storage. Run `db.collection.getIndexes()` before adding to avoid duplicating existing indexes (some `unique: true` fields already create indexes automatically).
---
## Summary
| Severity | Count |
|----------|-------|
| HIGH | 6 |
| MEDIUM | 8 |
| LOW | 4 |
**Highest urgency:** H3 (seller fan-out) and H4 (full chat in memory) will cause visible degradation with modest growth. H1 (N+1 payments) and H2 (missing index) are easy wins with high query-count impact.
---
## Related
- [[Security Audit - 2026-05-24]]
- [[Logic Audit - 2026-05-24]]
- [[Backend Architecture]]
- [[Data Model Overview]]
- [[Funds Ledger and Escrow State Machine Specification]]