Concurrency & Performance Test Results — 2026-06-06

Environment

| Item | Value |
|---|---|
| Date | 2026-06-06 |
| Backend version | v2.9.3 → v2.9.5 |
| Target | http://172.18.0.6:5001 (loopback, server → container direct) |
| Payment mode | PAYMENT_MODE=status (no real blockchain) |
| Flow | Full E2E: setup buyer+3 sellers → createRequest → 3 offers → selectOffer → pay → deliver → confirmDelivery |
| Server | 89.58.32.32 (netcup ARM, 6 vCPU) |
| Runner | scripts/smoke/marketplace-e2e-notifications.mjs |

Run 1 — Baseline (rate limiter blocking, v2.9.3)

CONCURRENCY_LEVELS=1,2,4,8,16,32

| Level | Passed | Total | Rate | Failure |
|---|---|---|---|---|
| C1 | 1 | 1 | 100% | |
| C2 | 0 | 2 | 0% | 429 rate limit |
| C4–C32 | 0 | | 0% | 429 rate limit |

Finding: globalLimiter (100 req/15 min) exhausted by concurrent user setup. Added RATE_LIMIT_BYPASS_IPS env var to skip limiter for the Docker host gateway IP.
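
The bypass check described above can be sketched as a pair of pure functions. This is a minimal illustration, not the backend's actual code: the function names, the comma-separated format of RATE_LIMIT_BYPASS_IPS, and the sample IP addresses are all assumptions.

```typescript
// Parse a comma-separated RATE_LIMIT_BYPASS_IPS value into a lookup set.
// ASSUMPTION: the variable holds a comma-separated list of literal IPs.
function parseBypassIps(raw: string | undefined): Set<string> {
  const ips = (raw ?? "")
    .split(",")
    .map((ip) => ip.trim())
    .filter((ip) => ip.length > 0);
  return new Set(ips);
}

// On dual-stack sockets Node reports IPv4 peers as "::ffff:a.b.c.d";
// strip that prefix so the value compares equal to the plain IPv4 form.
function normalizeIp(ip: string): string {
  const prefix = "::ffff:";
  return ip.startsWith(prefix) ? ip.slice(prefix.length) : ip;
}

// True when a request from clientIp should skip the limiter.
function shouldBypassLimiter(clientIp: string, bypassIps: Set<string>): boolean {
  return bypassIps.has(normalizeIp(clientIp));
}

// Hypothetical value; the real gateway IP is not given in this report.
const bypassIps = parseBypassIps("172.18.0.1, 127.0.0.1");
console.log(shouldBypassLimiter("::ffff:172.18.0.1", bypassIps)); // true
console.log(shouldBypassLimiter("203.0.113.7", bypassIps)); // false
```

If the limiter happens to be express-rate-limit (an assumption; this report does not name the library), a predicate of this kind could be passed as its `skip` option.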


Run 2 — Clean baseline (bypass active, UV_THREADPOOL_SIZE default=4)

CONCURRENCY_LEVELS=1,2,4,8,16,32 — run ID 20260606090606

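| Level | Passed | Total | Rate | Failure |
|---|---|---|---|---|
| C1 | 1 | 1 | 100% | |
| C2 | 2 | 2 | 100% | |
| C4 | 4 | 4 | 100% | |
| C8 | 8 | 8 | 100% | |
| C16 | 15 | 16 | 93.75% | 1× admin.create 500 |
| C32 | 10 | 32 | 31% | auth.login + admin.create timeouts |
| Total | 40 | 63 | 63.5% | |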

API Latency (all levels combined)

| API | p50 | p95 | p99 | Max |
|---|---|---|---|---|
| auth.login | 5221ms | 15000ms | 15002ms | 15002ms |
| users.admin.create | 4372ms | 15004ms | 15007ms | 15007ms |
| marketplace.purchaseRequests.create | 315ms | 507ms | 579ms | 579ms |
| marketplace.offers.create | 246ms | 399ms | 448ms | 450ms |
| marketplace.offers.select | 193ms | 455ms | 504ms | 504ms |
| marketplace.purchaseRequests.status.payment | 231ms | 383ms | 512ms | 512ms |
| marketplace.delivery.update | 92ms | 245ms | 258ms | 258ms |
| marketplace.delivery.confirm | 42ms | 96ms | 129ms | 129ms |
| notifications.list | 23ms | 233ms | 592ms | 640ms |

Root cause of C32 failures: bcrypt is CPU-bound; with the default 4 libuv threads, the 128 bcrypt operations issued at C32 (32 flows × 4 hashes each) queue behind 4 slots. p50 login rises from 509ms at C1 to 5221ms across all levels combined.
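
The queueing arithmetic above can be made concrete with a deliberately simplified model. It assumes hashes run in fixed-width "waves" and that the width is capped by both the threadpool size and the 6 vCPUs listed under Environment. The per-hash cost of 250ms is an illustrative assumption, not a measurement from this run, and the function names are hypothetical.

```typescript
// Simplified model: CPU-bound hashes run in waves whose width is capped by
// both the libuv threadpool size and the number of CPU cores.
function hashWaves(ops: number, poolSize: number, cores: number): number {
  const parallel = Math.max(1, Math.min(poolSize, cores));
  return Math.ceil(ops / parallel);
}

// Rough time until the last queued hash finishes.
function worstCaseWaitMs(ops: number, poolSize: number, cores: number, perHashMs: number): number {
  return hashWaves(ops, poolSize, cores) * perHashMs;
}

const bcryptOps = 32 * 4; // 32 flows x 4 hashes each = 128 (from this run)
const perHashMs = 250; // ASSUMPTION: illustrative cost of one bcrypt hash

console.log(hashWaves(bcryptOps, 4, 6)); // 32
console.log(worstCaseWaitMs(bcryptOps, 4, 6, perHashMs)); // 8000
console.log(hashWaves(bcryptOps, 16, 6)); // 22
```

Under this model a pool larger than the core count stops helping: a pool of 16 on 6 vCPUs still needs 22 waves for 128 hashes. The model ignores everything else that shares the threadpool and the CPU, so it is only a rough guide.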

Bugs found during this run:

  1. Selected seller never received the offer-accepted notification: acceptedOffer.id was undefined because toSellerOffer() returns a plain object that exposes _id, not id. Fixed in commit de910aa.
  2. Telegram Mini App URL was the entire comma-separated FRONTEND_URL CORS list, producing ERR_NAME_NOT_RESOLVED. Fixed in commit 6b6319c.
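
The first bug comes down to reading `id` from an object that only carries `_id`. A minimal sketch of a defensive lookup follows; the `OfferLike` shape, the `resolveOfferId` name, and the sample identifier are hypothetical, and the actual change in commit de910aa may look different.

```typescript
// Hypothetical shape: a plain object that carries `_id` but may lack `id`.
interface OfferLike {
  _id: { toString(): string } | string;
  id?: string;
}

// Prefer `id` when present, otherwise stringify `_id`.
function resolveOfferId(offer: OfferLike): string {
  return offer.id ?? String(offer._id);
}

// Stand-in for an ObjectId-like value; the hex string is made up.
const plainOffer: OfferLike = { _id: { toString: () => "665f1c2e9b1e8a0012ab34cd" } };
console.log(plainOffer.id); // undefined
console.log(resolveOfferId(plainOffer)); // 665f1c2e9b1e8a0012ab34cd
```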

Run 3 — After UV_THREADPOOL_SIZE=16

Added UV_THREADPOOL_SIZE=16 to /opt/arcane/data/projects/escrow-dev/.env. Redeployed v2.9.5.

CONCURRENCY_LEVELS=16,20 — run ID 20260606103005

| Level | Passed | Total | Rate |
|---|---|---|---|
| C16 | 16 | 16 | 100% |
| C20 | 20 | 20 | 100% |
| Total | 36 | 36 | 100% |

API Latency (C16+C20 combined)

| API | p50 | p95 | Max |
|---|---|---|---|
| auth.login | 8227ms | 12702ms | 13996ms |
| users.admin.create | 6383ms | 11002ms | 14416ms |
| marketplace.offers.create | 604ms | 1111ms | 1380ms |
| marketplace.offers.select | 758ms | 1359ms | 1675ms |
| marketplace.purchaseRequests.create | 499ms | 1010ms | 1160ms |
| marketplace.delivery.update | 236ms | 379ms | 489ms |
| marketplace.delivery.confirm | 66ms | 218ms | 221ms |
| notifications.list | 92ms | 653ms | 3233ms |

Auth and admin.create are still slow (6–8s p50) but no longer time out. All flows complete successfully.


Run 4 — C24 + C32 (UV_THREADPOOL_SIZE=16)

CONCURRENCY_LEVELS=24,32 — run ID 20260606103348

| Level | Passed | Total | Rate | Failure |
|---|---|---|---|---|
| C24 | 16 | 24 | 66.7% | 8× admin.create 500 (DB unique collision) |
| C32 | 14 | 32 | 43.75% | 6× auth.login timeout, 12× admin.create timeout |
| Total | 30 | 56 | 53.6% | |

New failure mode at C24: users.admin.create returns 500 (not timeout). Likely a DB unique constraint collision when 24 workers simultaneously generate user emails with similar patterns, or a Mongoose/Postgres write conflict. This is a test-harness artifact — in production, 24 users don't register simultaneously.
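
If the collision hypothesis above is correct, one way to rule it out in the harness is to derive every test email from values that are unique per run, level, worker, and role. The sketch below is an assumption about how that could look. This report does not show how the runner currently builds emails, and the `uniqueTestEmail` name and `example.test` domain are hypothetical.

```typescript
// Build an email that is unique per run, concurrency level, worker, and role.
function uniqueTestEmail(runId: string, level: number, worker: number, role: string): string {
  return `e2e-${runId}-c${level}-w${worker}-${role}@example.test`;
}

// One buyer and three sellers per flow, as in the E2E flow described above.
const roles = ["buyer", "seller1", "seller2", "seller3"];
const generated = new Set<string>();
for (let worker = 0; worker < 24; worker++) {
  for (const role of roles) {
    generated.add(uniqueTestEmail("20260606103348", 24, worker, role));
  }
}
console.log(generated.size); // 96
```

If the 500s persist with guaranteed-unique emails, the unique-constraint explanation can be dropped and the write-conflict explanation examined instead.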

Health alert: Gatus fired status=degraded during the C24 wave. The 500 errors on admin.create triggered the health endpoint's degraded status. Recovered immediately after the test.


Summary

| Metric | Value |
|---|---|
| Stable ceiling | C20 (100% pass rate) |
| Soft ceiling | C24 (67%; DB write conflict on concurrent user creation) |
| Hard ceiling | C32 (44%; bcrypt CPU saturation even with threadpool=16) |
| UV_THREADPOOL_SIZE fix | Moved stable ceiling from C8 to C20 |
| Real-world equivalent | C20 ≈ 500–1,500 simultaneous active users (at 15–30s think time) |
| DAU estimate | Safe up to ~5,000–8,000 DAU at current infra |
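
One way to arrive at a figure in the "real-world equivalent" range is a closed-loop approximation: a test worker that issues requests back to back stands in for several real users who pause between requests. The sketch below is not necessarily how the estimate above was derived, and the 0.5s average request time is an assumption chosen for illustration.

```typescript
// Closed-loop approximation: each real user spends thinkS idle and responseS
// waiting per request, so one always-busy worker stands in for
// (thinkS + responseS) / responseS users.
function equivalentUsers(concurrency: number, thinkS: number, responseS: number): number {
  return Math.round((concurrency * (thinkS + responseS)) / responseS);
}

const avgResponseS = 0.5; // ASSUMPTION: average request time, not a measured value
console.log(equivalentUsers(20, 15, avgResponseS)); // 620
console.log(equivalentUsers(20, 30, avgResponseS)); // 1220
```

Both results fall inside the 500–1,500 range, but the output is sensitive to the assumed request time: halving it roughly doubles the user count.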

Bugs fixed as a result of testing

| Bug | Fix |
|---|---|
| Selected seller never gets offer-accepted notification | acceptedOffer.id → String(acceptedOffer._id) in SellerOfferService.ts |
| Telegram Mini App URL was unparseable CORS list | Split FRONTEND_URL on comma, take first entry |
| RATE_LIMIT_BYPASS_IPS env var added | Skip globalLimiter for trusted internal IPs (loopback test runner) |
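
The FRONTEND_URL fix (split on comma, take the first entry) can be sketched as below. The function name and the sample origins are hypothetical; the code in commit 6b6319c is not shown in this report.

```typescript
// Take the first non-empty entry of a comma-separated origin list.
function firstFrontendUrl(raw: string): string | undefined {
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .find((entry) => entry.length > 0);
}

// Hypothetical FRONTEND_URL value holding two CORS origins.
const rawFrontendUrl = "https://app.example.test, https://admin.example.test";
console.log(firstFrontendUrl(rawFrontendUrl)); // https://app.example.test
```

Returning `undefined` for an empty value lets the caller decide whether to fail fast or fall back, rather than building a link from an empty string.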

Recommendations

  1. UV_THREADPOOL_SIZE=16 — already applied to dev env. Apply to production env file as well.
  2. Reduce bcrypt rounds 12 → 10: roughly 4× faster per hash, since cost doubles with each round, and 10 still meets the OWASP-recommended minimum work factor. Apply in authService.ts, userRoutes.ts, userController.ts, init-admin.ts.
  3. Test harness improvement — pre-pool users before concurrent phase to eliminate admin.create as a concurrency bottleneck. See scripts/smoke/marketplace-realistic-load.mjs.
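
The "4× faster" figure in recommendation 2 follows from bcrypt's cost parameter being a base-2 exponent. A small worked calculation, with a hypothetical helper name:

```typescript
// bcrypt's work grows as 2^rounds, so the relative cost of moving from
// fromRounds to toRounds is 2^(toRounds - fromRounds).
function relativeHashCost(fromRounds: number, toRounds: number): number {
  return Math.pow(2, toRounds - fromRounds);
}

console.log(relativeHashCost(12, 10)); // 0.25
console.log(1 / relativeHashCost(12, 10)); // 4
```

Existing hashes keep their original cost because the round count is stored inside each bcrypt hash, so a change like this only affects passwords hashed after it ships.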

Feature idea noted during testing

Counter-offer mechanism (eBay-style): Allow a seller to propose a counter-price on an existing offer rather than only accepting or rejecting. Buyer can accept/reject/counter again. This would add a natural negotiation loop to the marketplace without requiring full escrow re-entry. Low implementation cost on the offer state machine; high UX value for high-value transactions.


Raw report files

Stored on the test server at /tmp/e2e-reports/:

  • marketplace-e2e-20260606090606.{json,md} — Run 2 (baseline)
  • marketplace-e2e-20260606103005.{json,md} — Run 3 (C16+C20)
  • marketplace-e2e-20260606103348.{json,md} — Run 4 (C24+C32)