Commit Graph

447 Commits

Siavash Sameni
1de280fe04 fix(nat): working NAT tickle + smart filter debug + timeout diags
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
Fixes from real-world 5G↔Starlink testing:

NAT tickle fix:
- tokio::net::UdpSocket::bind() doesn't set SO_REUSEADDR, so binding
  to the same port as quinn silently failed. Now uses socket2::Socket
  with explicit SO_REUSEADDR + SO_REUSEPORT (via libc on unix); sketched below.
- Tickle now logs success/failure for debugging.
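
A minimal sketch of the reuse-flag bind, assuming socket2 0.5 plus libc;
tickle_socket_on is an illustrative name, not the actual wzp-client helper:

  use socket2::{Domain, Protocol, Socket, Type};
  use std::net::SocketAddr;

  // Bind a second UDP socket on the same local port as the quinn endpoint
  // so the tickle packet leaves from that address.
  fn tickle_socket_on(local: SocketAddr) -> std::io::Result<std::net::UdpSocket> {
      let socket = Socket::new(Domain::for_address(local), Type::DGRAM, Some(Protocol::UDP))?;
      socket.set_reuse_address(true)?; // SO_REUSEADDR
      #[cfg(unix)]
      unsafe {
          // SO_REUSEPORT set via libc, as described above.
          use std::os::fd::AsRawFd;
          let one: libc::c_int = 1;
          libc::setsockopt(
              socket.as_raw_fd(),
              libc::SOL_SOCKET,
              libc::SO_REUSEPORT,
              &one as *const _ as *const libc::c_void,
              std::mem::size_of_val(&one) as libc::socklen_t,
          );
      }
      socket.bind(&local.into())?;
      Ok(socket.into())
  }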

Diagnostic fixes:
- connect:dual_path_race_start shows both dial_order_raw and
  dial_order_smart so we can see what filtering removed
- Grace-period timeout (relay wins first, direct still running)
  now fills "timeout:grace" diags for unrecorded candidates
- Previously candidate_diags was empty when relay won the race

Dependencies:
- Added socket2 = "0.5" to wzp-client

593 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:58:13 +04:00
Siavash Sameni
bc6d327ebb feat(nat): smart candidate filtering + acceptor NAT tickle + 4s timeout
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
Major P2P improvements for cross-network calls:

Smart candidate filtering (smart_dial_order):
- Strip LAN candidates when the peer's public IP differs from ours
  (a private address like 172.16.x.x is unreachable from a different network)
- Strip all IPv6 candidates (Phase 7 disabled, wastes dial slots)
- Only keep mapped + reflexive for cross-network calls (sketched below)
- LAN candidates preserved when both peers share the same public IP
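
A sketch of the filtering with a hypothetical signature (the real
smart_dial_order may also carry ordering and candidate metadata):

  use std::net::{IpAddr, SocketAddr};

  fn smart_dial_order(
      our_public: IpAddr,
      peer_public: IpAddr,
      candidates: &[SocketAddr],
  ) -> Vec<SocketAddr> {
      let same_network = our_public == peer_public;
      candidates
          .iter()
          .copied()
          .filter(|addr| match addr.ip() {
              IpAddr::V6(_) => false,                             // Phase 7 disabled: drop IPv6
              IpAddr::V4(v4) if v4.is_private() => same_network,  // LAN only behind the same NAT
              _ => true,                                          // keep mapped + reflexive
          })
          .collect()
  }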

Acceptor NAT tickle:
- A-role sends a 1-byte UDP packet to each peer candidate BEFORE
  accepting. This opens the NAT pinhole for return traffic from
  the Dialer's IP — critical for address-restricted NATs that only
  allow inbound from IPs they've seen outbound traffic to.
- Uses SO_REUSEADDR on the same port as the quinn endpoint.

Direct timeout increased from 2s to 4s:
- Cross-network QUIC handshakes through CGNAT can take 2-3s
- 2s was too aggressive for 5G/LTE networks

Diagnostic fix:
- Record "timeout:4s" for candidates still in-flight when the
  timeout fires (previously these had no diagnostic entry)

5 new tests for smart_dial_order edge cases.
593 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:42:02 +04:00
Siavash Sameni
c478224d67 fix(ui): remove buffer clear that wiped connect events
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m35s
The callDebugBuffer.length=0 reset in showCallScreen() ran AFTER the
connect command returned, wiping all connect: events (path_negotiated,
race_start, race_done, candidate_diags). Only media: events survived
because they arrived after the clear.

Removed all automatic buffer clearing. The reverse().find() already
handles stale data by picking the most recent event. The manual
"Clear log" button (line 624) is the only way to clear now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:25:13 +04:00
Siavash Sameni
16dcc75514 fix(ui): move buffer clear from call-end to call-start
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 3m42s
Clearing callDebugBuffer in showConnectScreen() wiped all debug
events the moment a call ended, so the user saw empty logs. Moved
the clear to showCallScreen() instead — the buffer is reset at the
START of a new call, not the end. This way:

- After hanging up, all events from the call are still visible
- Starting a new call clears stale data from the previous one
- The reverse().find() for P2P badge still gets fresh data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:17:16 +04:00
Siavash Sameni
db5751985e fix(ui): replace findLast with reverse().find() for WebView compat
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
findLast() requires Chrome 97+ / Android WebView 97+. Older Android
devices crash with TypeError in pollStatus(), killing all status
updates including the debug log. Use [...arr].reverse().find() which
works everywhere.

Also pass peerMappedAddr in the direct-call connect invoke.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:06:07 +04:00
Siavash Sameni
c0dd6c06ff feat(debug): per-candidate dial diagnostics in dual-path race
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m24s
Added CandidateDiag struct (sketched below) to RaceResult with per-candidate:
- address attempted
- result (ok / skipped:ipv6 / error:reason)
- elapsed time in ms
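
Roughly this shape (field names inferred from the list above, not copied
from the code):

  use std::net::SocketAddr;

  #[derive(Debug, Clone)]
  struct CandidateDiag {
      addr: SocketAddr,   // address attempted
      result: String,     // "ok", "skipped:ipv6", or "error:<reason>"
      elapsed_ms: u64,    // time from dial start to the recorded outcome
  }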

Surfaced in call-debug events:
- connect:dual_path_race_start now includes dial_order + peer_mapped
- connect:dual_path_race_done now includes candidate_diags array

Upgraded dual_path tracing from debug to info for IPv6 skips and
dial failures so they appear in logcat/console.

Helps diagnose why P2P fails on specific networks (5G CGNAT,
address-restricted NATs, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:16:34 +04:00
Siavash Sameni
6805caae0e fix(ui): P2P badge showing stale status from previous call
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m47s
The callDebugBuffer persisted across calls, so .find() returned the
path_negotiated event from Call 1 (P2P Direct) when rendering the
badge during Call 2 (Relay). Two fixes:

1. Clear callDebugBuffer in showConnectScreen() between calls
2. Use .findLast() instead of .find() so the most recent event wins

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:02:06 +04:00
Siavash Sameni
5a03da72d3 feat(ui): selectable NAT detection mode + netcheck Tauri command
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
detect_nat_type now accepts an optional `mode` parameter:
- "relay" — relay-based Reflect only (original behavior)
- "stun" — public STUN servers only (no relay needed)
- "both" — relay + STUN in parallel (default, highest confidence)

New run_netcheck Tauri command exposes the full network diagnostic
(NAT type, IPv4/v6, port mapping, relay latencies, port allocation)
to the JS frontend.

JS usage:
  await invoke('detect_nat_type', { relays, mode: 'stun' })
  await invoke('run_netcheck', { relays })

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:43:17 +04:00
Siavash Sameni
e3e63a40a0 feat(nat): wire hard NAT port prediction into call flow (#85)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m27s
End-to-end integration of sequential port prediction:

- place_call: spawns background detect_port_allocation() + sends
  HardNatProbe signal after offer (doesn't delay call setup)
- answer_call: same for AcceptTrusted answers (privacy mode skips)
- Signal recv loop: stashes HardNatProbe in SignalState.peer_hard_nat_probe
- connect: reads the peer's probe; if it is Sequential{delta}, runs
  predict_ports() and adds the predicted addrs to PeerCandidates.local
  for the dual-path race
- parse_sequential_delta() helper for "sequential(delta=N)" strings
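
The parsing is roughly this (illustrative sketch; the real helper may
accept more variants):

  fn parse_sequential_delta(pattern: &str) -> Option<u16> {
      let rest = pattern.strip_prefix("sequential(delta=")?;
      rest.strip_suffix(')')?.parse().ok()
  }

  // parse_sequential_delta("sequential(delta=2)") == Some(2)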

The full flow: both peers independently detect their NAT's port
allocation, exchange HardNatProbe via relay, and the connect command
uses the peer's sequence to predict which ports to dial — all before
the dual-path race starts.

588 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:39:40 +04:00
Siavash Sameni
7b4bce69d5 docs: update all docs for hard NAT detection + relay wiring
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m36s
- PROGRESS.md: hard NAT Phase A, relay cross-wiring, 588 tests
- ARCHITECTURE.md: hard NAT port prediction diagram + pattern table
- PRD-p2p-direct.md: Phase 8.6 split into a/b/c/d with status
- PRD-hard-nat.md: Phase A done, B signal ready, effort table updated
- PRD-netcheck.md: port_allocation field + probe documented

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:33:12 +04:00
Siavash Sameni
ec1bdf3cd5 feat(nat): hard NAT port allocation detection + prediction + HardNatProbe signal (#29)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 31s
Build Release Binaries / build-amd64 (push) Failing after 3m30s
Phase A of hard NAT traversal (PRD-hard-nat.md):

- PortAllocation enum: PortPreserving / Sequential{delta} / Random / Unknown
- detect_port_allocation(): sequential STUN probes from single socket,
  analyzes port sequence for allocation pattern
- classify_port_allocation(): pure function with jitter tolerance,
  wraparound handling, 60% threshold for noisy sequences
- predict_ports(): generates the target port range from last_port + delta
  (sketched below)
- HardNatProbe signal message: carries port_sequence, allocation
  pattern, external_ip for peer coordination
- Relay forwards HardNatProbe to call peer
- Netcheck gains port_allocation field + format_report display
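
A sketch of the prediction step; the count parameter and names are
illustrative:

  fn predict_ports(last_port: u16, delta: u16, count: u16) -> Vec<u16> {
      (1..=count)
          .map(|i| last_port.wrapping_add(delta.wrapping_mul(i)))
          .collect()
  }

  // predict_ports(40_000, 2, 4) == [40_002, 40_004, 40_006, 40_008]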

588 tests pass (17 new), 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:29:35 +04:00
Siavash Sameni
ee14862376 docs: add PRD for hard NAT traversal (port prediction + birthday attack)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 22s
Build Release Binaries / build-amd64 (push) Failing after 3m26s
4-phase design:
A. Port allocation pattern detection (sequential vs random)
B. Sequential port prediction (~80% success, <2s)
C. Birthday attack for random NATs (98% success, ~10s)
D. Hybrid waterfall with background relay-to-direct upgrade

Taskmaster tasks #84-87 added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:20:19 +04:00
Siavash Sameni
f83361895e docs: add PRDs for Phase 8 Tailscale-inspired features
Some checks failed
Mirror to GitHub / mirror (push) Failing after 23s
Build Release Binaries / build-amd64 (push) Failing after 3m35s
5 new PRDs:
- PRD-public-stun.md — RFC 5389 STUN client
- PRD-portmap.md — NAT-PMP/PCP/UPnP port mapping
- PRD-ice-regather.md — Mid-call ICE re-gathering
- PRD-netcheck.md — Network diagnostic
- PRD-relay-selection.md — Region-based relay selection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:08:46 +04:00
Siavash Sameni
0857d190ed chore: rename legacy Android build script to prevent accidental use
Some checks failed
Mirror to GitHub / mirror (push) Failing after 30s
Build Release Binaries / build-amd64 (push) Failing after 3m23s
build-android-docker.sh builds the old Kotlin app in android/app/
(18M APK), not the live Tauri app (209M). Renamed to
build-android-docker-LEGACY.sh so it's never picked by accident.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:42:23 +04:00
Siavash Sameni
5d431c0721 fix(android): restore tauri::Emitter import for Docker builder toolchain
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Has been cancelled
Edition 2024 on local macOS auto-resolves the Emitter trait, but the
Docker builder's Rust/Tauri version requires the explicit import for
AppHandle::emit() to resolve. The resulting unused-import warning is
accepted in local builds to keep the CI build working.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:34:23 +04:00
Siavash Sameni
8fcf1be341 feat(nat): Tailscale-inspired STUN/ICE + port mapping + mid-call re-gathering (#28)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 23s
Build Release Binaries / build-amd64 (push) Failing after 6m8s
Phase 8: 5 new modules bringing NAT traversal close to Tailscale's approach.

- stun.rs: RFC 5389 STUN client — public server reflexive discovery,
  XOR-MAPPED-ADDRESS parsing, parallel probe with retry, STUN fallback
  in desktop try_reflect_own_addr() (XOR decode sketched below)
- portmap.rs: NAT-PMP (RFC 6886) + PCP (RFC 6887) + UPnP IGD port
  mapping — gateway discovery, acquire/release/refresh lifecycle,
  new PeerCandidates.mapped candidate type in dial order
- ice_agent.rs: candidate lifecycle — gather(), re_gather(),
  apply_peer_update() with monotonic generation counter,
  CandidateUpdate signal message forwarded by relay
- netcheck.rs: comprehensive diagnostic — NAT type, IPv4/v6,
  port mapping availability, relay latencies, CLI --netcheck
- relay_map.rs: RTT-sorted relay map, preferred() selection,
  populate_from_ack() for RegisterPresenceAck.available_relays
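
For reference, the IPv4 XOR-MAPPED-ADDRESS decode boils down to XOR-ing the
port and address with the RFC 5389 magic cookie (sketch; attr is the 8-byte
attribute value, and real parsing also handles IPv6 and the attribute header):

  use std::net::{Ipv4Addr, SocketAddrV4};

  const STUN_MAGIC_COOKIE: u32 = 0x2112_A442;

  fn decode_xor_mapped_v4(attr: &[u8; 8]) -> SocketAddrV4 {
      // attr layout: reserved, family, x-port (2 bytes), x-address (4 bytes)
      let x_port = u16::from_be_bytes([attr[2], attr[3]]);
      let port = x_port ^ (STUN_MAGIC_COOKIE >> 16) as u16;
      let x_addr = u32::from_be_bytes([attr[4], attr[5], attr[6], attr[7]]);
      SocketAddrV4::new(Ipv4Addr::from(x_addr ^ STUN_MAGIC_COOKIE), port)
  }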

Relay: CallRegistry stores + cross-wires caller/callee_mapped_addr
into CallSetup.peer_mapped_addr. Region config + available_relays
populated from federation peers in RegisterPresenceAck.

Desktop: place_call/answer_call call acquire_port_mapping() and
fill caller/callee_mapped_addr. STUN+relay combined NAT detection.

571 tests pass (66 new), 0 regressions, 0 warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:17:17 +04:00
Siavash Sameni
9377a9009c feat(quality): bandwidth probing for upward adaptive quality (#10)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 3m36s
After 30s stable at a tier, the AdaptiveQualityController actively
probes the next tier up by switching the encoder and observing for 5s.
If loss/RTT stay within the target tier's thresholds, the upgrade
commits. If >1 bad report, the probe aborts with a 60s cooldown.

Probing is disabled on cellular (studio tiers aren't classified there)
and skipped when already at Studio64k (highest tier).

This complements the passive upgrade path (10 consecutive good reports)
by actively discovering that a path can sustain higher quality, rather
than waiting for the classification to drift upward.
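
Simplified sketch of the probe decision; constants and names are stand-ins
for the controller's real ones, and the 30s stability gate is assumed to be
checked by the caller before a probe starts:

  use std::time::{Duration, Instant};

  const PROBE_WINDOW: Duration = Duration::from_secs(5);
  const MAX_BAD_REPORTS: u32 = 1;

  struct ProbeState {
      started: Instant,
      bad_reports: u32,
  }

  enum ProbeOutcome {
      KeepProbing,
      Commit, // window elapsed with at most one bad report: upgrade sticks
      Abort,  // more than one bad report: revert and start the 60s cooldown
  }

  fn check_probe(state: &mut ProbeState, report_is_bad: bool) -> ProbeOutcome {
      if report_is_bad {
          state.bad_reports += 1;
      }
      if state.bad_reports > MAX_BAD_REPORTS {
          ProbeOutcome::Abort
      } else if state.started.elapsed() >= PROBE_WINDOW {
          ProbeOutcome::Commit
      } else {
          ProbeOutcome::KeepProbing
      }
  }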

New: ProbeState struct, check_probe() method, 4 constants, 5 tests.
377 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:47:21 +04:00
Siavash Sameni
4471797edf docs: update all PRDs and PROGRESS to current state (2026-04-13)
Some checks failed
Mirror to GitHub / mirror (push) Has been cancelled
Build Release Binaries / build-amd64 (push) Has been cancelled
Updated 6 PRDs with implementation status:
- PRD-adaptive-quality: P2P quality done, bandwidth probing remains
- PRD-protocol-analyzer: all 5 phases documented
- PRD-relay-concurrency: DashMap + clone-before-send done
- PRD-p2p-direct: P2P adaptive quality update
- PRD-engine-dedup: all phases done
- PROGRESS.md: test count 372+, 3 new change sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:40:56 +04:00
Siavash Sameni
425c67a08a feat(analyzer): replay, HTML report, encrypted decode stub (#15, #16, #17)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m31s
#15 - Replay mode: --replay <file.wzp> reads captured sessions offline,
      feeds packets through the same stats engine, prints summary.
      CaptureReader mirrors CaptureWriter's binary format.

#16 - HTML report: --html <report.html> generates self-contained HTML
      with Chart.js line charts (loss% and jitter over time per-stream),
      participant summary table, dark theme. Works with live sessions
      (after exit) or replay mode.

#17 - Encrypted decode: --key <hex> flag accepted and stored. Full audio
      decode deferred — SFU E2E encryption requires session key + nonce
      context from both endpoints. Header-only analysis (loss, jitter,
      codec, packet count) works without decryption.

Usage:
  wzp-analyzer --replay session.wzp --html report.html
  wzp-analyzer relay:4433 --room test --capture out.wzp --html report.html

372 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:31:28 +04:00
Siavash Sameni
88ca3e099a feat: wzp-analyzer binary — protocol analyzer with TUI (#13, #14, #15)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m20s
New binary: wzp-analyzer joins a room as a passive observer and displays
real-time per-participant quality metrics.

Features:
- Passive observation: connects to relay, receives all media, never sends
- Participant detection: identifies senders by sequence number streams
- Per-participant stats: packets, loss%, jitter, codec, codec switches
- TUI mode (ratatui): color-coded table (green/yellow/red by loss),
  10 FPS refresh, session header, quit with q/Ctrl+C
- No-TUI mode: prints stats to stderr every 2s (for headless/CI use)
- Capture mode: binary .wzp format with microsecond timestamps for
  offline replay (magic WZP\x01, JSON header, per-packet records)
- Session summary on exit

Usage:
  wzp-analyzer 193.180.213.68:4433 --room general
  wzp-analyzer 193.180.213.68:4433 --room general --no-tui --duration 60
  wzp-analyzer 193.180.213.68:4433 --room general --capture session.wzp

372 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:26:46 +04:00
Siavash Sameni
1e82811cc1 feat(p2p): adaptive quality on direct calls (#23)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m37s
P2P calls now adapt codec quality based on observed network conditions,
matching what relay calls already had.

Three-layer implementation:
- QualityReport::from_path_stats(): construct reports from local quinn
  stats (loss%, RTT, jitter) without needing relay-generated reports
  (sketched below)
- CallEncoder.pending_quality_report: one-shot attachment to next
  source packet (consumed on encode, not repeated)
- Engine send tasks: generate quality report every 50 frames (~1s)
  from quinn_path_stats() and attach via set_pending_quality_report()
- Engine recv tasks: self-observe from own QUIC path stats every 50
  packets, feed to AdaptiveQualityController for P2P adaptation
  (works even if peer isn't sending quality reports yet)

Both relay and P2P calls now have adaptive quality. On relay calls,
both peer-sent reports AND local observations feed the controller.
Hysteresis (3 consecutive bad reports to downgrade) prevents thrashing.
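
Construction from local stats is roughly this (field names and the clamping
are inferred from the commit message, not taken from the real QualityReport):

  struct QualityReport {
      loss_pct: u8,
      rtt_ms: u16,
      jitter_ms: u16,
  }

  fn from_path_stats(lost: u64, sent: u64, rtt_ms: u64, jitter_ms: u64) -> QualityReport {
      let loss_pct = if sent == 0 { 0 } else { ((lost * 100) / sent).min(100) as u8 };
      QualityReport {
          loss_pct,
          rtt_ms: rtt_ms.min(u16::MAX as u64) as u16,
          jitter_ms: jitter_ms.min(u16::MAX as u64) as u16,
      }
  }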

372 tests passing (+4 new: from_path_stats encoding, clamping, zero
values, encoder quality report attachment).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:14:06 +04:00
Siavash Sameni
81b5522942 refactor: clap CLI parser, safety docs, dead code docs, cross-refs
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 4m1s
Audit items 6, 8, 9, 10:

#6 - Relay CLI: replaced 154-line manual parse_args() with clap derive
     (13 flags/options preserved, auto --help, --version from build hash; sketched below)
#8 - wzp-native: added # Safety docs to all 3 unsafe extern "C" fns
#9 - wzp-crypto: documented x25519_static_secret/public as reserved for
     future static-key federation auth (not dead code, intentionally unused)
#10 - Cross-references between quality.rs ↔ dred_tuner.rs module docs
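
For #6, the derive style looks roughly like this (flags shown are examples,
not the relay's real 13 options):

  use clap::Parser;

  #[derive(Parser)]
  #[command(name = "wzp-relay", version)]
  struct Args {
      /// Address to listen on.
      #[arg(long, default_value = "0.0.0.0:4433")]
      listen: String,
      /// Enable the debug tap.
      #[arg(long)]
      debug_tap: bool,
  }

  fn main() {
      let args = Args::parse();
      println!("listening on {}", args.listen);
  }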

368 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:40:49 +04:00
Siavash Sameni
d539a6dfb9 test(federation): 29 tests for federation.rs (was 0), engine dedup PRD
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m45s
Federation test coverage (crates/wzp-relay/tests/federation.rs):
- room_hash: determinism, uniqueness, length, case sensitivity (5)
- is_global_room: static config, call-* implicit, exact match (3)
- resolve_global_room: static + call-* resolution (2)
- global_room_hash: canonical names, fallthrough, independence (4)
- forward_to_peers: zero peers, live QUIC datagram delivery (2)
- broadcast_signal: zero peers, live QUIC signal delivery (2)
- send_signal_to_peer: unknown fingerprint error (1)
- peer lookup: fingerprint normalization, IP, trust priority (5)
- accessors: local_tls_fp, cross_relay_tx, remote_participants (3)
- integration: full media egress over live QUIC link (1)
- edge case: exact room match (1)

Total relay tests: 120 (was 91). Full suite: 368 passing.

Also added PRD-engine-dedup.md for the engine.rs helper extraction
completed in the previous commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:35:04 +04:00
Siavash Sameni
ba12aae439 refactor: extract shared engine helpers, federation clone-before-send, constants
Some checks failed
Mirror to GitHub / mirror (push) Failing after 30s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Engine deduplication (PRD-engine-dedup.md):
- build_call_config(): shared CallConfig construction (was 23 lines × 2)
- codec_to_profile(): shared CodecId → QualityProfile mapping (was 19 lines × 2)
- run_signal_task(): shared signal handler (was 48 lines × 2)
- Net -39 lines from engine.rs, 6 duplicated blocks → single-line calls

Quick wins from REFACTOR-codebase-audit.md:
- 6 magic number constants extracted (CAPTURE_POLL_MS, RECV_TIMEOUT_MS, etc.)
- DRED_POLL_INTERVAL moved from 2 local defs to 1 module-level const
- federation.rs: forward_to_peers, broadcast_signal, send_signal_to_peer
  now clone peer list and release lock before sending (was holding Mutex
  across async I/O — last lock-during-send pattern eliminated; sketched below)
- main.rs: close_transport() helper replaces 12 silent .ok() calls with
  debug-level logging
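
The clone-before-send shape, as a sketch (Peer and send_datagram are
placeholders for the federation types):

  use std::sync::Arc;
  use tokio::sync::Mutex;

  struct Peer { /* connection handle elided */ }
  impl Peer {
      async fn send_datagram(&self, _payload: &[u8]) -> std::io::Result<()> { Ok(()) }
  }

  async fn forward_to_peers(peers: &Mutex<Vec<Arc<Peer>>>, payload: &[u8]) {
      // Hold the lock only long enough to copy the peer list...
      let snapshot: Vec<Arc<Peer>> = peers.lock().await.clone();
      // ...then do the async I/O with no lock held.
      for peer in snapshot {
          if let Err(e) = peer.send_datagram(payload).await {
              tracing::debug!("forward to peer failed: {e}");
          }
      }
  }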

314 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:22:44 +04:00
Siavash Sameni
fdb78e08bd docs: full codebase refactoring audit with prioritized suggestions
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
Comprehensive analysis across all 8 crates + Tauri engine covering:
- engine.rs: 35% duplication between Android/desktop (350+ lines)
- SignalMessage: 36 variants mixing orthogonal concerns
- federation.rs: zero test coverage on 1,132 lines of complex logic
- peer_links: lock held across async sends (last lock-during-I/O)
- Magic numbers, error handling, CLI parsing, unsafe docs
- Priority matrix: 10 items ranked by effort/impact/risk

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:35:59 +04:00
Siavash Sameni
3a51db998a docs: relay concurrency refactor guide + PRD update for DashMap
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 8m3s
REFACTOR-relay-concurrency.md: complete post-DashMap analysis with
current lock inventory, 4 prioritized suggestions (clone-before-send,
peer_links DashMap, quality atomics, arc-swap snapshots), decision
matrix, and concurrency diagram.

PRD-relay-concurrency.md: updated to recommend DashMap as primary
approach (was Option A per-room locks).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:27:26 +04:00
Siavash Sameni
a52b011fb5 feat(relay): replace global Mutex<RoomManager> with DashMap sharding
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m41s
Eliminates the single-lock bottleneck for media forwarding. Before:
all participants across all rooms competed for one Mutex. Now rooms
are stored in DashMap (64 internal shards with per-shard RwLocks).

Changes:
- RoomManager.rooms: HashMap → DashMap<String, Room> (sketched below)
- Per-room quality tracking (qualities, current_tier moved into Room)
- Arc<Mutex<RoomManager>> → Arc<RoomManager> everywhere
- 20 .lock().await sites removed across room.rs, main.rs, federation.rs, ws.rs
- federation forward_to_peers: clone peer list, release lock, then send
- ACL uses std::sync::Mutex (rarely accessed, non-async)
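
A sketch of the sharded layout (Room's fields are placeholders); entry()
locks only the shard holding the key, so two tasks touching different rooms
never contend:

  use dashmap::DashMap;

  struct Room {
      participants: Vec<String>,
  }

  struct RoomManager {
      rooms: DashMap<String, Room>,
  }

  impl RoomManager {
      fn join(&self, room: &str, who: &str) {
          self.rooms
              .entry(room.to_string())
              .or_insert_with(|| Room { participants: Vec::new() })
              .participants
              .push(who.to_string());
      }
  }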

Concurrency improvement:
- Before: 100 rooms × 10 people = 1000 tasks → 1 Mutex
- After: distributed across 64 DashMap shards, ~15 tasks per shard avg
- Rooms are fully independent — room A never blocks room B

314 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:17:57 +04:00
Siavash Sameni
2514151a89 docs: PRD for relay concurrency — per-room lock sharding
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Full analysis of relay lock contention with precise inventory of every
lock acquisition in the hot path. Evaluates 4 design options:
A) Per-room Arc<Mutex<Room>> (recommended — 100x improvement for multi-room)
B) DashMap (good but less explicit)
C) Channel-based fan-out (over-engineered for current scale)
D) Snapshot-on-change via arc-swap (best perf, more complex)

Phase 1: per-room locks, Phase 2: federation lock fix, Phase 3: quality
tracking out of critical path. Estimated 1.5-2.5 days total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:01:21 +04:00
Siavash Sameni
f265fd772d docs: relay concurrency model, Opus6k fix, build script fixes
Some checks failed
Mirror to GitHub / mirror (push) Failing after 34s
Build Release Binaries / build-amd64 (push) Failing after 3m56s
- ARCHITECTURE.md: new "Relay Concurrency Model" section documenting
  threading, shared state locking table, scaling characteristics, and
  the RoomManager Mutex as primary bottleneck
- PROGRESS.md: Opus6k frame starvation fix, build script fixes
- PRD-dred-integration.md: Opus6k frame starvation bug documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:54:37 +04:00
Siavash Sameni
9ae9441de4 fix(audio): check capture ring available before read (fixes Opus6k choppy)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m58s
Partial reads from the capture ring consumed samples that were then
discarded when the send loop retried from buf[0]. For 20ms codecs this
was invisible (single Oboe burst fills 960 samples in one read), but
40ms codecs (Opus6k, 1920 samples) needed 2 bursts — the first partial
read consumed 960 real samples and threw them away.

Result: Opus6k produced ~11 frames/s instead of 25 (~44% of expected).

Fix: expose wzp_native_audio_capture_available() and check it before
reading, matching the desktop capture_ring.available() pattern. Partial
reads no longer occur because we only read when enough samples exist.
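
The loop shape after the fix, sketched against a generic ring (the real code
checks wzp_native_audio_capture_available() over FFI):

  trait CaptureRing {
      fn available(&self) -> usize;          // samples currently buffered
      fn read(&mut self, buf: &mut [i16]);   // copies exactly buf.len() samples
  }

  fn read_one_frame(ring: &mut impl CaptureRing, frame: &mut [i16]) -> bool {
      // Only read when a full frame is buffered, so a partial read can never
      // consume samples that would then be thrown away on retry.
      if ring.available() < frame.len() {
          return false; // wait for the next capture burst
      }
      ring.read(frame);
      true
  }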

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:46:15 +04:00
Siavash Sameni
d9e7e72978 docs: update PROGRESS, PRDs for completed tasks #9, #11, #12, #27
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m50s
- PROGRESS.md: add 2026-04-13 section with 5-tier quality, QualityDirective
  handling, debug tap enhancements, dual_path fix, keystore sync
- PRD-coordinated-codec.md: Phase 3 marked complete (client directive handling)
- PRD-adaptive-quality.md: milestone table updated with Done/Pending status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:34:01 +04:00
Siavash Sameni
8ff0c548a7 fix(audio): update frame_samples on codec profile switch, fix buf sizing
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Has been cancelled
frame_samples was immutable — when adaptive quality switched from 20ms
(Opus24k, 960 samples) to 40ms (Opus6k, 1920 samples), the send loop
kept reading 960 samples and feeding half-sized frames to the encoder.
This caused Opus6k to produce ~11 frames/s instead of 25, making audio
choppy.

Fix (sketched below):
- frame_samples is now mut and updated on profile switch
- buf sized for max frame (1920) with frame_samples-bounded slices
- RMS, mute, encode, and capture reads all use &buf[..frame_samples]
- Applied to both Android and desktop send tasks
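
Sketch of the buffer handling (engine types elided; the numbers match the
commit message):

  const MAX_FRAME_SAMPLES: usize = 1920; // 40ms @ 48kHz (Opus6k)

  fn on_profile_switch(
      frame_samples: &mut usize,              // now mutable, updated on switch
      buf: &mut [i16; MAX_FRAME_SAMPLES],     // sized once for the largest frame
      new_frame_samples: usize,               // 960 for 20ms codecs, 1920 for 40ms
  ) {
      *frame_samples = new_frame_samples;
      // RMS, mute, encode and capture reads all use this slice, never the full buffer.
      let frame: &mut [i16] = &mut buf[..*frame_samples];
      let _ = frame;
  }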

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:33:02 +04:00
Siavash Sameni
f17420aa98 fix(build): sync keystores from persistent cache before build
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
Keystores are gitignored so git reset --hard deletes them. The build
script now copies them from a persistent $BASE_DIR/data/keystore/ cache
into the source tree before building. This ensures both primary and alt
servers always have signing keys available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:11:28 +04:00
Siavash Sameni
d424515542 feat: 5-tier quality classification, QualityDirective handling, debug tap stats
Some checks failed
Mirror to GitHub / mirror (push) Failing after 31s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
- Extend Tier enum from 3 to 6 levels: Studio64k/48k/32k + Good +
  Degraded + Catastrophic with asymmetric hysteresis (down:3, up:5,
  studio:10)
- Handle QualityDirective signals in both desktop and Android engines
  — relay-coordinated codec switching now works end-to-end
- Add periodic TAP STATS to debug tap: packets in/out, fan-out avg,
  seq gaps, codecs seen (every 5s)
- Mark task #2 done (ParticipantInfo in federation signals already
  implemented)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:23:48 +04:00
Siavash Sameni
ea5fc17c34 fix(relay): debug tap signal logging, dual_path test regression, PRD updates
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m39s
Mirror to GitHub / mirror (push) Failing after 28s
- Add log_signal() and log_event() to DebugTap for RoomUpdate,
  QualityDirective, join/leave lifecycle events (task #11)
- Fix dual_path.rs Phase 7 regression: add missing ipv6_endpoint arg
  to 3 race() call sites
- Update PRDs to reflect actual implementation status: mark adaptive
  quality, coordinated codec, P2P, network awareness, protocol analyzer
- Update PROGRESS.md with QualityDirective gap and dual_path regression

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 09:54:52 +04:00
Siavash Sameni
1a7dd935ee fix(build): add zipalign + apksigner signing to build.sh
Some checks failed
Mirror to GitHub / mirror (push) Failing after 43s
Build Release Binaries / build-amd64 (push) Failing after 3m44s
build.sh was producing unsigned APKs because it reimplemented the Docker
build inline without the signing step from build-tauri-android.sh. Now
uses the same pipeline: find keystore (release preferred, debug fallback),
zipalign -f 4, apksigner sign with keystore credentials.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:13:20 +04:00
Siavash Sameni
a7c2261b70 fix(build): clean stale APKs before build, prefer release APK on upload
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m50s
find was picking up a cached 384MB debug APK over the fresh 25MB release
APK because the old file was listed first. Now:
1. Delete all APKs before the build starts (clean slate)
2. On upload, prefer *release*.apk over any other match

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:08:06 +04:00
Siavash Sameni
eca0bb7531 Merge branch 'opus-DRED-v2'
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m26s
2026-04-12 19:57:35 +04:00
Siavash Sameni
d249b32ee5 test+docs: add tests for QualityDirective, ParticipantQuality; update docs
- QualityDirective signal roundtrip tests (with/without reason)
- ParticipantQuality unit tests (initial tier, degradation, weakest-link)
- Updated PROGRESS.md with desktop adaptive quality, relay coordinated
  switching, Oboe state polling entries
- Updated ARCHITECTURE.md SFU fan-out rules with QualityDirective
- Updated PRD-coordinated-codec.md with implementation status
- 312 tests passing across all modified crates

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:56:46 +04:00
Siavash Sameni
22045bc5e6 feat: adaptive quality in desktop, relay quality directive, Oboe state polling
- Wire AdaptiveQualityController into desktop engine send/recv tasks
  (mirrors Android pattern: AtomicU8 pending_profile, auto-mode check)
- Wire same into Android engine send task (was only in recv before)
- QualityDirective SignalMessage variant for relay-initiated codec switch
- ParticipantQuality tracking in relay RoomManager (per-participant
  AdaptiveQualityController, weakest-link tier computation)
- Relay broadcasts QualityDirective to all participants when room-wide
  tier degrades (coordinated codec switching)
- Oboe stream state polling: poll getState() for up to 2s after
  requestStart() to ensure both streams reach Started before proceeding
  (fixes intermittent silent calls on cold start, Nothing Phone A059)

Tasks: #7, #25, #26, #31, #35

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:54:04 +04:00
Siavash Sameni
766c9df442 feat(dred): continuous DRED tuning, PMTUD, extended Opus6k window
- DredTuner: maps live network metrics (loss/RTT/jitter) to continuous
  DRED duration every ~500ms instead of discrete tier-locked values.
  Includes jitter-spike detection for pre-emptive Starlink-style boost
  (mapping sketched below).
- Opus6k DRED extended from 500ms to 1040ms (max libopus 1.5 supports)
- PMTUD: quinn MtuDiscoveryConfig with upper_bound=1452, 300s interval
- TrunkedForwarder respects discovered MTU (was hard-coded 1200)
- QuinnPathSnapshot exposes quinn internal stats + discovered MTU
- AudioEncoder trait: set_expected_loss() + set_dred_duration() methods
- PathMonitor: sliding-window jitter variance for spike detection
- Integrated into both Android and desktop send tasks in engine.rs
- 14 new tests (10 tuner unit + 4 encoder integration)
- Updated ARCHITECTURE.md, PROGRESS.md, PRD-dred-integration, PRD-mtu
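
The mapping is continuous rather than tier-locked; a rough sketch (the
thresholds and the linear ramp are illustrative, not the tuner's real curve):

  use std::time::Duration;

  const DRED_MAX: Duration = Duration::from_millis(1040); // libopus 1.5 ceiling

  fn dred_duration_for(loss_pct: f32, jitter_spike: bool) -> Duration {
      if jitter_spike {
          return DRED_MAX; // pre-emptive boost on a Starlink-style jitter spike
      }
      // Ramp from 0ms at 0% loss up to the cap at 20% loss.
      let frac = (loss_pct / 20.0).clamp(0.0, 1.0);
      DRED_MAX.mul_f32(frac)
  }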

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:38:37 +04:00
Siavash Sameni
6f43415285 merge opus-DRED-v2 into main
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m25s
50 commits: BT audio routing, network change detection, Hangup call_id,
per-arch APK builds, setCommunicationDevice API 31+, deferred
MODE_IN_COMMUNICATION, Oboe BT mode, build signing, doc updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:41:57 +04:00
Siavash Sameni
24cc74d93c fix(audio): clear BT SCO communication device on call end
Without clearCommunicationDevice(), the BT headset stays locked in SCO
mode after the call. Media playback (video, music) can't route to BT
A2DP, requiring a device reboot to restore normal audio.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:40:44 +04:00
Siavash Sameni
300ea66d13 docs: update DESIGN, ARCHITECTURE, PRDs, PROGRESS for BT + network + build changes
Reflects the current reality: setCommunicationDevice API 31+, deferred
MODE_IN_COMMUNICATION, BT-mode Oboe (bt_active flag), per-arch builds,
Hangup call_id fix, and network monitoring integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:39:59 +04:00
Siavash Sameni
114d69e488 fix: use tracing::warn! instead of bare warn! in engine.rs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:31:12 +04:00
Siavash Sameni
15c237ceea fix(audio): defer MODE_IN_COMMUNICATION to call start, restore on end
Root cause: MainActivity set MODE_IN_COMMUNICATION at app launch,
hijacking system audio routing immediately — BT A2DP music dropped to
earpiece, and the pre-existing communication mode confused subsequent
setCommunicationDevice calls for BT SCO.

Fix: MainActivity now only sets volumes. MODE_IN_COMMUNICATION is set
via JNI right before Oboe audio_start() in CallEngine, and MODE_NORMAL
is restored after audio_stop() when the call ends.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:29:59 +04:00
Siavash Sameni
a37c8b30fe fix(native): add missing bt_active field to stall detector config
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:25:11 +04:00
Siavash Sameni
137fe5f084 fix(bluetooth): BT SCO mode skips 48kHz + VoiceCommunication on capture
Root cause: Oboe capture at 48kHz with InputPreset::VoiceCommunication
cannot open against a BT SCO device (only supports 8/16kHz). The stream
silently falls back to the built-in mic, delivering zeros.

Fix: add bt_active flag to WzpOboeConfig. When set, capture skips
setSampleRate and setInputPreset, letting the system route to BT SCO
at its native rate. Oboe's SampleRateConversionQuality::Best resamples
to 48kHz for our ring buffers. Playout uses Usage::Media in BT mode.

New API: wzp_native_audio_start_bt() for BT mode, called from
set_bluetooth_sco(on=true). Normal audio_start() restores the
standard config when switching back to earpiece/speaker.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:23:19 +04:00
Siavash Sameni
5dfb5b3581 fix(bluetooth): use Shared mode for Oboe + delay restart for BT route
Two fixes for BT audio silence:

1. Switch Oboe streams from Exclusive to Shared sharing mode. Exclusive
   mode bypasses Oboe's internal resampler, so opening a 48kHz stream
   against a BT SCO device (8/16kHz only) fails at the AudioPolicy
   level. Shared mode lets Oboe's resampler bridge the gap.

2. Add 500ms post-SCO delay before Oboe restart. The audio policy needs
   time to apply the bt-sco route after setCommunicationDevice returns.
   Without the delay, Oboe opens against the old device (handset).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:14:06 +04:00
Siavash Sameni
fd0ccf8e99 fix(bluetooth): enable Oboe sample rate conversion for BT SCO (8/16kHz)
BT SCO devices only support 8kHz or 16kHz but our Oboe streams request
48kHz. Without resampling, AudioPolicyManager rejects the input stream
("getInputProfile could not find profile for... sampling rate 48000").

Fix: add setSampleRateConversionQuality(Best) to both capture and
playout stream builders. Oboe resamples internally so our ring buffers
stay at 48kHz regardless of the hardware sample rate.

Also removes the broken setBluetoothScoOn/isBluetoothScoOn calls from
stop_bluetooth_sco — it now just calls stopBluetoothSco() unconditionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:08:48 +04:00