wz-phone

Author	SHA1	Message	Date
Siavash Sameni	1120c7b579	feat(signal): PresenceList broadcast for lobby user discovery New signal infrastructure for the lobby-first UI: - PresenceUser struct: { fingerprint, alias } - SignalMessage::PresenceList: relay broadcasts full user list to all signal clients on every register/deregister - SignalHub::presence_list(): builds the list from connected clients - SignalHub::broadcast(): sends to ALL signal clients - Relay calls broadcast on register + unregister - Desktop emits "presence_list" signal-event to JS frontend This gives clients real-time visibility of who's online via the signal channel, without needing to join a voice room first. 603 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:12:47 +04:00
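The register/unregister broadcast described in 1120c7b579 can be sketched with the standard library alone. This is an illustration only: the `Client` type, the in-memory `outbox`, and the `last_seen` helper are hypothetical stand-ins for the relay's real signal transports, and the method signatures are assumptions rather than the project's actual API.

```rust
use std::collections::BTreeMap;

/// One lobby entry, mirroring the `{ fingerprint, alias }` shape from the commit.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PresenceUser {
    pub fingerprint: String,
    pub alias: String,
}

/// Hypothetical per-client state: an alias plus every list pushed to that client.
#[derive(Default)]
struct Client {
    alias: String,
    outbox: Vec<Vec<PresenceUser>>,
}

/// Hypothetical in-memory hub keyed by fingerprint.
#[derive(Default)]
pub struct SignalHub {
    clients: BTreeMap<String, Client>,
}

impl SignalHub {
    /// Build the full user list from the currently connected clients.
    pub fn presence_list(&self) -> Vec<PresenceUser> {
        self.clients
            .iter()
            .map(|(fp, c)| PresenceUser { fingerprint: fp.clone(), alias: c.alias.clone() })
            .collect()
    }

    /// Deliver the list to ALL connected signal clients.
    pub fn broadcast(&mut self, list: &[PresenceUser]) {
        for client in self.clients.values_mut() {
            client.outbox.push(list.to_vec());
        }
    }

    /// Register a client, then broadcast the refreshed list.
    pub fn register(&mut self, fingerprint: &str, alias: &str) {
        let client = Client { alias: alias.to_string(), outbox: Vec::new() };
        self.clients.insert(fingerprint.to_string(), client);
        let list = self.presence_list();
        self.broadcast(&list);
    }

    /// Unregister a client, then broadcast the refreshed list.
    pub fn unregister(&mut self, fingerprint: &str) {
        self.clients.remove(fingerprint);
        let list = self.presence_list();
        self.broadcast(&list);
    }

    /// Most recent list delivered to one client (test helper, hypothetical).
    pub fn last_seen(&self, fingerprint: &str) -> Option<&Vec<PresenceUser>> {
        self.clients.get(fingerprint).and_then(|c| c.outbox.last())
    }
}
```

Sending the full list on every change, rather than a delta, keeps clients stateless with respect to presence: a client that missed one message is corrected by the next.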
Siavash Sameni	bb23976076	feat(quality): upgrade negotiation + asymmetric quality signals (#28, #29, #30) New SignalMessage variants for P2P quality coordination: UpgradeProposal/UpgradeResponse/UpgradeConfirm (#28): - Consensual quality upgrade flow — proposer sends desired profile, peer accepts/rejects based on own conditions, confirm commits both - All carry call_id for relay routing QualityCapability (#30): - Peer reports its max sustainable profile — enables asymmetric encoding where each side uses its own best quality instead of forcing everyone to the weakest link Relay forwards all 4 signals to the call peer (same pattern as MediaPathReport, CandidateUpdate, HardNatProbe). Desktop signal recv loop handles all 4 with debug logging. Encoder switching TODOs noted for wiring into CallEngine. 4 new serde roundtrip tests. 603 total, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:25:34 +04:00
Siavash Sameni	488efcb614	feat(ui): birthday attack toggle in settings (default off) New setting: "Birthday attack (opens extra ports for hard NAT)" - Default: OFF — no extra latency on call setup - When ON: waits up to 3s for peer's birthday ports if peer has non-cone NAT, adds them to the dial race Gated end-to-end: Settings → localStorage → JS invoke → Rust connect param → birthday wait + target injection. LAN/cone calls unaffected regardless of setting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:54:22 +04:00
Siavash Sameni	8c360186df	feat(nat): wire birthday attack end-to-end into connect flow Complete Dialer-side birthday attack integration: - SignalState stores peer_birthday_ports from HardNatBirthdayStart - connect command: if peer's HardNatProbe shows non-cone NAT, waits up to 3s for birthday ports to arrive (Acceptor needs time to open 32 sockets + STUN-probe each) - When birthday ports arrive, generate_dialer_targets() builds hit list (known ports + random fill) and adds them to PeerCandidates - All birthday targets go into the dual-path race as extra candidates - LAN/cone calls skip the wait entirely (gated on allocation type) Full waterfall now: 1. Standard candidates (reflexive + mapped) → immediate 2. Port prediction (sequential delta) → immediate 3. Birthday targets (if non-cone peer) → +3s wait 4. All of above raced in parallel via JoinSet 5. Relay runs concurrently with 500ms head-start 599 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:50:11 +04:00
Siavash Sameni	f06f9073ae	feat(nat): birthday attack module + HardNatBirthdayStart signal (#86, #87) Birthday attack for random symmetric NATs: - birthday.rs: open_acceptor_ports() opens N sockets, STUN-probes each to learn external ports. generate_dialer_targets() builds hit list (known ports first, then random fill). spray_dialer() sprays QUIC connects with rate limiting, first success wins. - Default: 32 acceptor ports, 128 dialer probes, 20ms interval Signal coordination: - HardNatBirthdayStart { acceptor_ports, external_ip } sent by Acceptor when peer's HardNatProbe shows random/sequential NAT - Relay forwards it like other call signals - Desktop recv loop handles and logs it Hybrid waterfall integration: - On receiving HardNatProbe with non-cone allocation, Acceptor auto-opens birthday ports and sends BirthdayStart - Sockets kept alive 10s for NAT mapping persistence - Dialer spray integration into race() pending (needs transport hot-swap for background upgrade) 6 new tests, 599 total, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:44:36 +04:00
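The "known ports first, then random fill" hit list from f06f9073ae could look roughly like the sketch below. The signature, the explicit `seed` parameter, the xorshift generator, the 1024..=65535 port range, and the size cap are all assumptions made so the example runs on the standard library alone; the real `generate_dialer_targets()` in birthday.rs may differ on every one of these points.

```rust
use std::collections::HashSet;
use std::net::{IpAddr, SocketAddr};

/// Build a dial hit list: the peer's known external ports first, then
/// pseudo-random ports until `total` unique targets exist.
pub fn generate_dialer_targets(
    external_ip: IpAddr,
    known_ports: &[u16],
    total: usize,
    seed: u64,
) -> Vec<SocketAddr> {
    // Cap the list so the random fill always terminates (assumed limit).
    let total = total.min(16_384);
    let mut seen: HashSet<u16> = HashSet::new();
    let mut targets = Vec::with_capacity(total);

    // 1. Known ports, in the order the Acceptor reported them, de-duplicated.
    for &port in known_ports {
        if targets.len() == total {
            break;
        }
        if port != 0 && seen.insert(port) {
            targets.push(SocketAddr::new(external_ip, port));
        }
    }

    // 2. Random fill using xorshift64 (assumption: any uniform source would do).
    let mut state = seed | 1; // xorshift state must be non-zero
    while targets.len() < total {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        // Map into the non-privileged range 1024..=65535.
        let port = 1024 + (state % (65536 - 1024)) as u16;
        if seen.insert(port) {
            targets.push(SocketAddr::new(external_ip, port));
        }
    }
    targets
}
```

With the defaults named in the commit, a caller would pass the 32 acceptor ports from `HardNatBirthdayStart` as `known_ports` and 128 as `total`, so the guaranteed-open ports are always dialed before any guesses.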
Siavash Sameni	6c49d7436f	feat(ui): direct-only mode setting (no relay fallback) New toggle in Settings → "Direct-only mode (no relay fallback)": - Default: OFF (normal behavior, relay fallback on P2P failure) - When ON: connect returns error if P2P fails, with full candidate_diags in the debug log showing why each candidate failed. Call never falls back to relay. Useful for testing NAT traversal — you see the exact failure reason instead of the call silently working through relay. Wired end-to-end: - Settings.directOnly persisted in localStorage - Passed as directOnly param to Rust connect command - connect:path_negotiated shows direct_only flag - connect:direct_only_failed emits on failure with diags Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:04:45 +04:00
Siavash Sameni	1de280fe04	fix(nat): working NAT tickle + smart filter debug + timeout diags Fixes from real-world 5G↔Starlink testing: NAT tickle fix: - tokio::net::UdpSocket::bind() doesn't set SO_REUSEADDR, so binding to the same port as quinn silently failed. Now uses socket2::Socket with explicit SO_REUSEADDR + SO_REUSEPORT (via libc on unix). - Tickle now logs success/failure for debugging. Diagnostic fixes: - connect:dual_path_race_start shows both dial_order_raw and dial_order_smart so we can see what filtering removed - Grace-period timeout (relay wins first, direct still running) now fills "timeout:grace" diags for unrecorded candidates - Previously candidate_diags was empty when relay won the race Dependencies: - Added socket2 = "0.5" to wzp-client 593 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:58:13 +04:00
Siavash Sameni	bc6d327ebb	feat(nat): smart candidate filtering + acceptor NAT tickle + 4s timeout Major P2P improvements for cross-network calls: Smart candidate filtering (smart_dial_order): - Strip LAN candidates when peer's public IP differs from ours (172.16.x.x is unreachable from a different network) - Strip all IPv6 candidates (Phase 7 disabled, wastes dial slots) - Only keep mapped + reflexive for cross-network calls - LAN candidates preserved when both peers share the same public IP Acceptor NAT tickle: - A-role sends a 1-byte UDP packet to each peer candidate BEFORE accepting. This opens the NAT pinhole for return traffic from the Dialer's IP — critical for address-restricted NATs that only allow inbound from IPs they've seen outbound traffic to. - Uses SO_REUSEADDR on the same port as the quinn endpoint. Direct timeout increased from 2s to 4s: - Cross-network QUIC handshakes through CGNAT can take 2-3s - 2s was too aggressive for 5G/LTE networks Diagnostic fix: - Record "timeout:4s" for candidates still in-flight when the timeout fires (previously these had no diagnostic entry) 5 new tests for smart_dial_order edge cases. 593 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:42:02 +04:00
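The filtering rules listed for `smart_dial_order` in bc6d327ebb reduce to two predicates, sketched here. The `Kind` and `Candidate` types and the function signature are hypothetical; only the rules themselves (drop IPv6, drop LAN candidates on cross-network calls, keep mapped and reflexive) come from the commit message.

```rust
use std::net::{IpAddr, SocketAddr};

/// Hypothetical candidate classification used only for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Lan,
    Reflexive,
    Mapped,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Candidate {
    pub addr: SocketAddr,
    pub kind: Kind,
}

/// Filter the raw dial order:
/// - IPv6 candidates are always dropped (direct IPv6 dialing is disabled).
/// - LAN candidates survive only when both peers share one public IP.
/// - Mapped and reflexive IPv4 candidates always survive.
pub fn smart_dial_order(
    raw: &[Candidate],
    own_public_ip: IpAddr,
    peer_public_ip: IpAddr,
) -> Vec<Candidate> {
    let same_network = own_public_ip == peer_public_ip;
    raw.iter()
        .copied()
        .filter(|c| c.addr.is_ipv4())
        .filter(|c| same_network || c.kind != Kind::Lan)
        .collect()
}
```

The relative order of surviving candidates is preserved, so whatever priority the gatherer assigned still determines which address is dialed first.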
Siavash Sameni	c0dd6c06ff	feat(debug): per-candidate dial diagnostics in dual-path race Added CandidateDiag struct to RaceResult with per-candidate: - address attempted - result (ok / skipped:ipv6 / error:reason) - elapsed time in ms Surfaced in call-debug events: - connect:dual_path_race_start now includes dial_order + peer_mapped - connect:dual_path_race_done now includes candidate_diags array Upgraded dual_path tracing from debug to info for IPv6 skips and dial failures so they appear in logcat/console. Helps diagnose why P2P fails on specific networks (5G CGNAT, address-restricted NATs, etc). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:16:34 +04:00
Siavash Sameni	5a03da72d3	feat(ui): selectable NAT detection mode + netcheck Tauri command detect_nat_type now accepts optional `mode` parameter: - "relay" — relay-based Reflect only (original behavior) - "stun" — public STUN servers only (no relay needed) - "both" — relay + STUN in parallel (default, highest confidence) New run_netcheck Tauri command exposes the full network diagnostic (NAT type, IPv4/v6, port mapping, relay latencies, port allocation) to the JS frontend. JS usage: await invoke('detect_nat_type', { relays, mode: 'stun' }) await invoke('run_netcheck', { relays }) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:43:17 +04:00
Siavash Sameni	e3e63a40a0	feat(nat): wire hard NAT port prediction into call flow (#85) End-to-end integration of sequential port prediction: - place_call: spawns background detect_port_allocation() + sends HardNatProbe signal after offer (doesn't delay call setup) - answer_call: same for AcceptTrusted answers (privacy mode skips) - Signal recv loop: stashes HardNatProbe in SignalState.peer_hard_nat_probe - connect: reads peer's probe, if Sequential{delta} runs predict_ports() and adds predicted addrs to PeerCandidates.local for the dual-path race - parse_sequential_delta() helper for "sequential(delta=N)" strings The full flow: both peers independently detect their NAT's port allocation, exchange HardNatProbe via relay, and the connect command uses the peer's sequence to predict which ports to dial — all before the dual-path race starts. 588 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:39:40 +04:00
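A minimal sketch of the two helpers named in e3e63a40a0, assuming the probe string has exactly the `sequential(delta=N)` form quoted in the commit. The signatures, the signed `delta`, and the `count` parameter of `predict_ports` are assumptions; the project's real functions may take different arguments.

```rust
/// Parse a port-allocation description of the form "sequential(delta=N)".
/// Returns None for any other string, such as "random".
pub fn parse_sequential_delta(s: &str) -> Option<i32> {
    s.trim()
        .strip_prefix("sequential(delta=")?
        .strip_suffix(')')?
        .parse()
        .ok()
}

/// Predict the next `count` external ports of a NAT that allocates
/// sequentially, starting from the last port observed by the probe.
/// Prediction stops as soon as a port would leave the valid 1..=65535 range.
pub fn predict_ports(last_observed: u16, delta: i32, count: usize) -> Vec<u16> {
    let mut predicted = Vec::with_capacity(count);
    let mut port = i64::from(last_observed);
    for _ in 0..count {
        port += i64::from(delta);
        if port < 1 || port > 65_535 {
            break;
        }
        predicted.push(port as u16);
    }
    predicted
}
```

For example, a peer whose probe reports `sequential(delta=2)` with a last observed port of 40000 would yield candidates on ports 40002, 40004, and so on, which the connect flow can append to the peer's candidate list before the race starts.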
Siavash Sameni	8fcf1be341	feat(nat): Tailscale-inspired STUN/ICE + port mapping + mid-call re-gathering (#28) Phase 8: 5 new modules bringing NAT traversal close to Tailscale's approach. - stun.rs: RFC 5389 STUN client — public server reflexive discovery, XOR-MAPPED-ADDRESS parsing, parallel probe with retry, STUN fallback in desktop try_reflect_own_addr() - portmap.rs: NAT-PMP (RFC 6886) + PCP (RFC 6887) + UPnP IGD port mapping — gateway discovery, acquire/release/refresh lifecycle, new PeerCandidates.mapped candidate type in dial order - ice_agent.rs: candidate lifecycle — gather(), re_gather(), apply_peer_update() with monotonic generation counter, CandidateUpdate signal message forwarded by relay - netcheck.rs: comprehensive diagnostic — NAT type, IPv4/v6, port mapping availability, relay latencies, CLI --netcheck - relay_map.rs: RTT-sorted relay map, preferred() selection, populate_from_ack() for RegisterPresenceAck.available_relays Relay: CallRegistry stores + cross-wires caller/callee_mapped_addr into CallSetup.peer_mapped_addr. Region config + available_relays populated from federation peers in RegisterPresenceAck. Desktop: place_call/answer_call call acquire_port_mapping() and fill caller/callee_mapped_addr. STUN+relay combined NAT detection. 571 tests pass (66 new), 0 regressions, 0 warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:17:17 +04:00
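The XOR-MAPPED-ADDRESS parsing mentioned for stun.rs follows a fixed layout defined in RFC 5389: one reserved byte, one family byte, a port XORed with the top 16 bits of the magic cookie, and an address XORed with the cookie (plus the transaction ID for IPv6). The function below is a standalone sketch of that decoding; its name and signature are assumptions and it is not taken from the project's stun.rs.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

/// STUN magic cookie from RFC 5389.
const MAGIC_COOKIE: u32 = 0x2112_A442;

/// Decode the value of an XOR-MAPPED-ADDRESS attribute.
/// `value` is the attribute payload (without the 4-byte attribute header);
/// `transaction_id` is the 96-bit ID from the STUN message header.
pub fn parse_xor_mapped_address(value: &[u8], transaction_id: &[u8; 12]) -> Option<SocketAddr> {
    if value.len() < 4 {
        return None;
    }
    let family = value[1];
    // X-Port = port XOR the most significant 16 bits of the magic cookie.
    let port = u16::from_be_bytes([value[2], value[3]]) ^ (MAGIC_COOKIE >> 16) as u16;

    match family {
        // IPv4: X-Address = address XOR magic cookie.
        0x01 => {
            let s = value.get(4..8)?;
            let addr = u32::from_be_bytes([s[0], s[1], s[2], s[3]]) ^ MAGIC_COOKIE;
            Some(SocketAddr::new(IpAddr::V4(Ipv4Addr::from(addr)), port))
        }
        // IPv6: X-Address = address XOR (magic cookie || transaction ID).
        0x02 => {
            let mut raw = [0u8; 16];
            raw.copy_from_slice(value.get(4..20)?);
            let mut mask = [0u8; 16];
            mask[..4].copy_from_slice(&MAGIC_COOKIE.to_be_bytes());
            mask[4..].copy_from_slice(transaction_id);
            for (byte, m) in raw.iter_mut().zip(mask.iter()) {
                *byte ^= *m;
            }
            Some(SocketAddr::new(IpAddr::V6(Ipv6Addr::from(raw)), port))
        }
        _ => None,
    }
}
```

The XOR step exists because some NATs rewrite any byte sequence in a payload that matches the packet's own source address; obfuscating the address prevents that rewrite from corrupting the reflexive result.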
Siavash Sameni	137fe5f084	fix(bluetooth): BT SCO mode skips 48kHz + VoiceCommunication on capture Root cause: Oboe capture at 48kHz with InputPreset::VoiceCommunication cannot open against a BT SCO device (only supports 8/16kHz). The stream silently falls back to builtin mic, delivering zeros. Fix: add bt_active flag to WzpOboeConfig. When set, capture skips setSampleRate and setInputPreset, letting the system route to BT SCO at its native rate. Oboe's SampleRateConversionQuality::Best resamples to 48kHz for our ring buffers. Playout uses Usage::Media in BT mode. New API: wzp_native_audio_start_bt() for BT mode, called from set_bluetooth_sco(on=true). Normal audio_start() restores the standard config when switching back to earpiece/speaker. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 17:23:19 +04:00
Siavash Sameni	5dfb5b3581	fix(bluetooth): use Shared mode for Oboe + delay restart for BT route Two fixes for BT audio silence: 1. Switch Oboe streams from Exclusive to Shared sharing mode. Exclusive mode bypasses Oboe's internal resampler, so opening a 48kHz stream against a BT SCO device (8/16kHz only) fails at the AudioPolicy level. Shared mode lets Oboe's resampler bridge the gap. 2. Add 500ms post-SCO delay before Oboe restart. The audio policy needs time to apply the bt-sco route after setCommunicationDevice returns. Without the delay, Oboe opens against the old device (handset). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 17:14:06 +04:00
Siavash Sameni	7e8dc400dc	fix(bluetooth): wait for SCO link before Oboe restart + detect A2DP devices Three fixes for Bluetooth audio not working: 1. is_bluetooth_available() now checks for TYPE_BLUETOOTH_A2DP (8) in addition to TYPE_BLUETOOTH_SCO (7) — many headsets only register as A2DP until SCO is explicitly started. 2. set_bluetooth_sco(on=true) polls isBluetoothScoOn() for up to 3s before restarting Oboe. startBluetoothSco() is async — the SCO link takes 500ms-2s to establish. Without waiting, Oboe opens against earpiece and audio goes nowhere. 3. Frontend skips redundant set_speakerphone(false) when transitioning to BT — start_bluetooth_sco() handles speaker-off internally, avoiding a double Oboe restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 16:46:56 +04:00
Siavash Sameni	a798634b3d	fix(signal): add call_id to Hangup — prevents stale hangup killing new calls Root cause: Hangup had no call_id field. The relay forwarded hangups to ALL active calls for a user. When user A hung up call 1 and user B immediately placed call 2, the relay's processing of A's hangup would also kill call 2 (race window ~1-2s). Fix: add optional call_id to Hangup (backwards-compatible via serde skip_serializing_if). When present, the relay only ends the named call. Old clients send call_id=None and get the legacy broadcast behavior. Also: clear pending_path_report in Hangup recv handler and internal_deregister to prevent stale oneshot channels from blocking subsequent call setups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 16:39:21 +04:00
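The relay-side rule from a798634b3d (end only the named call when `call_id` is present, fall back to the legacy per-user broadcast when it is absent) can be sketched as a pure function. `ActiveCall` and `calls_to_end` are hypothetical names for illustration; the relay's actual registry types are not shown in the commit message.

```rust
/// Hypothetical view of one active call in the relay's registry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ActiveCall {
    pub call_id: String,
    pub caller: String,
    pub callee: String,
}

/// Decide which calls a Hangup from `user` should end.
/// - `Some(id)`: only the named call, and only if `user` is a participant.
/// - `None` (old clients): every active call involving `user`.
pub fn calls_to_end(active: &[ActiveCall], user: &str, call_id: Option<&str>) -> Vec<String> {
    active
        .iter()
        .filter(|c| c.caller == user || c.callee == user)
        .filter(|c| call_id.map_or(true, |id| c.call_id == id))
        .map(|c| c.call_id.clone())
        .collect()
}
```

Checking participation even when an ID is supplied means a client cannot end a call it is not part of by guessing its identifier.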
Siavash Sameni	4c1ad841e1	feat(android): Bluetooth audio routing + network change detection + per-arch APK builds Bluetooth: wire existing AudioRouteManager SCO support through both app variants. Replace binary speaker toggle with 3-way route cycling (Earpiece → Speaker → Bluetooth). Tauri side adds JNI bridge functions (start/stop/query SCO, device availability) and Oboe stream restart. Network awareness: integrate Android ConnectivityManager to detect WiFi/cellular transitions and feed them to AdaptiveQualityController via lock-free AtomicU8 signaling. Enables proactive quality downgrade and FEC boost on network handoffs. Build: add --arch flag to build-tauri-android.sh supporting arm64, armv7, or all (separate per-arch APKs for smaller tester binaries). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 16:07:41 +04:00
Siavash Sameni	29cd23fe39	fix(p2p): connection cleanup — 4 fixes for stale/dead connections PRD 4: Disable IPv6 direct dial/accept temporarily. IPv6 QUIC handshakes succeed but connections die immediately on datagram send ("connection lost"). IPv4 candidates work reliably. IPv6 candidates still gathered but filtered at dial time. PRD 1: Close losing transport after Phase 6 negotiation. The non-selected transport now gets an explicit QUIC close frame instead of silently dropping after 30s idle timeout. Prevents phantom connections from polluting future accept() calls. PRD 2: Harden accept loop with max 3 stale retries. Stale connections are explicitly closed (conn.close) and counted. After 3 stale connections, the accept loop aborts instead of spinning until the race timeout. PRD 3: Resource cleanup — close old IPv6 endpoint before creating a new one in place_call/answer_call. Add Drop impl to CallEngine so tasks are signalled to stop on ungraceful shutdown. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 15:11:50 +04:00
Siavash Sameni	1eb82d77b8	feat(relay+client): relay reports build version in Ack Add relay_build field to RegisterPresenceAck so the client logs which relay version it connected to. Shows in the debug log as register_signal:ack_received {"relay_build":"f843a93"}. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:27:58 +04:00
Siavash Sameni	b79073c649	Revert "fix(connect): trust direct path on peer report timeout" This reverts commit `82b439595c`.	2026-04-12 14:10:44 +04:00
Siavash Sameni	82b439595c	fix(connect): trust direct path on peer report timeout When peers are on different relays, MediaPathReport can't be forwarded — causing a 3s timeout and false relay fallback even though direct P2P works perfectly. Fix: on timeout, if local_direct_ok is true AND the direct transport's connection is still alive (no close_reason), trust the direct path instead of falling back to relay. The timeout indicates a relay forwarding issue, not a direct path failure. Also fix ALT build paste URL (paste.tbs.manko.yoga not amn.gg). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:07:44 +04:00
Siavash Sameni	1904b19d05	fix(direct): validate A-role accepted connection, skip stale ones The Acceptor's accept() on the shared signal endpoint can dequeue a stale QUIC connection from a previous call that the Dialer has already dropped. This results in "connection lost" errors when media datagrams are sent — 100% drops on both sides. Fix: after accepting a connection, check close_reason(). If the connection is already closed, log a warning and re-accept. Also verify max_datagram_size() is available before returning. Additionally: emit transport details (remote addr, max_datagram, close_reason) in the call_engine_starting debug event so stale connection issues are visible in the user-facing debug log. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 13:50:21 +04:00
Siavash Sameni	4cfcd5117f	fix(connect): install MediaPathReport oneshot BEFORE race starts The peer's MediaPathReport can arrive while our dual_path::race is still running. Previously, the oneshot was created AFTER the race completed, so the recv loop had nowhere to deliver the report — it was silently dropped, causing a 3s timeout and false relay fallback on ~50% of calls. Fix: create the oneshot and install it in SignalState BEFORE starting the race. The oneshot::Receiver buffers the value so the connect command can read it immediately after the race finishes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 13:06:13 +04:00
Siavash Sameni	bd6733b2e5	feat(signal): advertise build version in Offer/Answer Add caller_build_version / callee_build_version (git short hash) to DirectCallOffer and DirectCallAnswer so peers can identify each other's build in debug logs. Also log own build at register time. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 12:43:55 +04:00
Siavash Sameni	c2d298beb5	feat(net): Phase 7 — dual-socket IPv4+IPv6 ICE Adds a dedicated IPv6 QUIC endpoint (IPV6_V6ONLY=1 via socket2) alongside the existing IPv4 signal endpoint for proper dual-stack P2P connectivity. Previous [::]:0 dual-stack attempt broke IPv4 on Android; this uses separate sockets per address family like WebRTC/libwebrtc. - create_ipv6_endpoint(): socket2-based IPv6-only UDP socket, tries same port as IPv4 signal EP, falls back to ephemeral - local_host_candidates(v4_port, v6_port): now gathers IPv6 global-unicast (2000::/3) and unique-local (fc00::/7) addrs - dual_path::race(): A-role accepts on both v4+v6 via select!, D-role routes each candidate to matching-AF endpoint - Graceful fallback: if IPv6 unavailable, .ok() → None → pure IPv4 behavior identical to pre-Phase-7 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 11:54:13 +04:00
Siavash Sameni	aee41a638d	fix(audio+net): revert dual-stack [::]:0, add Oboe playout stall auto-restart Two fixes: ## Revert [::]:0 dual-stack sockets → back to 0.0.0.0:0 Android's IPV6_V6ONLY=1 default on some kernels (confirmed on Nothing Phone) makes [::]:0 IPv6-only, silently killing ALL IPv4 traffic. This broke P2P direct calls: IPv4 LAN candidates (172.16.81.x) couldn't complete QUIC handshakes through the IPv6-only socket, causing local_direct_ok=false and relay fallback on every call after the first. Reverted all bind sites to 0.0.0.0:0 (reliable IPv4). IPv6 host candidates are disabled in local_host_candidates() until a proper dual-socket approach (one IPv4 + one IPv6 endpoint, Phase 7) is implemented. ## Fix A (task #35): Oboe playout callback stall auto-restart The Nothing Phone's Oboe playout callback fires once (cb#0) and then stops draining the ring on ~50% of cold-launch calls. Fix D+C (stop+prime from previous commit) didn't help because audio_stop is a no-op on cold launch. New approach: self-healing watchdog in audio_write_playout. Tracks the playout ring's read_idx across writes. If read_idx hasn't advanced in 50 consecutive writes (~1 second), the Oboe playout callback has stopped: 1. Log "playout STALL detected" 2. Call wzp_oboe_stop() to tear down the stuck streams 3. Clear both ring buffers (prevent stale data reads) 4. Call wzp_oboe_start() to rebuild fresh streams 5. Log success/failure 6. Return 0 (caller retries on next frame) This is the same teardown+rebuild that "rejoin" does — but triggered automatically from the first stalled call instead of requiring the user to hang up and redial. The watchdog runs on every write so it fires within 1s of the stall starting. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 11:24:16 +04:00
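The stall detection in aee41a638d (restart when `read_idx` has not advanced for 50 consecutive writes) can be isolated from the Oboe teardown as a small counter. The `PlayoutWatchdog` type and `on_write` method are hypothetical; in the commit this logic lives inside `audio_write_playout`, and the caller would perform the stop, ring clear, and start steps when the function returns true.

```rust
/// Number of consecutive writes without ring progress that counts as a stall
/// (about one second, per the commit).
pub const STALL_WRITES: u32 = 50;

/// Tracks whether the playout callback is still draining the ring buffer.
#[derive(Debug, Default)]
pub struct PlayoutWatchdog {
    last_read_idx: u64,
    stalled_writes: u32,
}

impl PlayoutWatchdog {
    /// Call once per write with the ring's current read index.
    /// Returns true when the caller should tear down and rebuild the streams.
    pub fn on_write(&mut self, read_idx: u64) -> bool {
        if read_idx != self.last_read_idx {
            // The callback consumed data since the last write: healthy.
            self.last_read_idx = read_idx;
            self.stalled_writes = 0;
            return false;
        }
        self.stalled_writes += 1;
        if self.stalled_writes >= STALL_WRITES {
            // Reset so the rebuilt stream gets a full window before the next restart.
            self.stalled_writes = 0;
            return true;
        }
        false
    }
}
```

Counting writes rather than measuring wall-clock time keeps the check free of timer calls on the audio path, at the cost of tying the detection window to the frame cadence.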
Siavash Sameni	9fb92967eb	fix(net): bind all endpoints to [::]:0 for dual-stack IPv4+IPv6 Every QUIC endpoint was bound to 0.0.0.0:0 (IPv4-only). This silently killed ALL IPv6 host candidates: the Dialer couldn't send packets to [2a0d:...] addresses (wrong address family on the socket), and the Acceptor couldn't receive incoming IPv6 QUIC handshakes. The IPv6 candidates were gathered and advertised in DirectCallOffer/Answer but were completely non-functional. On same-LAN with dual-stack (which both test phones have), this meant: - JoinSet fanned out 3+ candidates (2× IPv6 + 1× IPv4) - IPv6 dials failed silently or timed out - IPv4 dial worked but competed with failed IPv6 for JoinSet attention - Sometimes the JoinSet returned an IPv6 failure before the IPv4 success, causing unnecessary fallback to relay Fix: bind to [::]:0 (IPv6 any) instead of 0.0.0.0:0. On dual-stack systems (Linux/Android default), [::]:0 creates a socket that handles BOTH: - IPv6 natively (global unicast, ULA) - IPv4 via v4-mapped addresses (::ffff:172.16.81.x) One socket, both protocols. All 7 bind sites updated: - register_signal (signal endpoint) - do_register_signal - ping_relay - probe_reflect_addr (fresh endpoint fallback) - dual_path::race (A-role fresh, D-role fresh, relay fresh) With this fix, same-LAN P2P should prefer the IPv6 path (no NAT, direct routing, lower latency) and fall through to IPv4 if IPv6 fails — relay is the last resort after ALL candidates are exhausted. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 11:09:06 +04:00
Siavash Sameni	134ee3a77f	fix(engine): pass is_direct_p2p explicitly instead of deriving from is_some Critical Phase 6 bug: when the negotiation agreed on relay path but delivered the relay transport via pre_connected_transport, CallEngine saw is_some() = true → is_direct_p2p = true → skipped perform_handshake. The relay couldn't authenticate the participant → room join silently failed → recv_fr: 0, both sides sending into the void. Fix: add explicit is_direct_p2p: bool parameter to CallEngine:: start (both android and desktop branches). The connect command sets it from the Phase 6 negotiation result (use_direct), not from whether pre_connected_transport is Some. Now relay-negotiated calls correctly run perform_handshake, and direct P2P calls correctly skip it. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:34:21 +04:00
Siavash Sameni	e61397ca85	fix(connect): remove pre-Phase-6 same-IP heuristic The commit `de007ec` added a heuristic that forced relay-only when peers had different public IPs. That was a stopgap for the race condition where one side picked Direct and the other picked Relay. Phase 6 (`f5542ef`) solved this properly via MediaPathReport negotiation, but the heuristic wasn't cleaned up and was still running BEFORE the Phase 6 code — suppressing the race entirely for cross-network calls. Removed. Phase 6 negotiation now handles ALL cases: both sides race, exchange reports, and agree on the same path before committing media. Cross-network calls that can't go P2P will have both sides report direct_ok=false and agree on relay. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:23:36 +04:00
Siavash Sameni	f5542ef822	feat(p2p): Phase 6 — ICE-style path negotiation Before Phase 6, each side's dual-path race ran independently and committed to whichever transport completed first. When one side picked Direct and the other picked Relay, they sent media to different places — TX > 0 RX: 0 on both, completely silent call. Phase 6 adds a negotiation step: after the local race completes, each side sends a MediaPathReport { call_id, direct_ok, winner } to the peer through the relay. Both wait for the other's report before committing a transport to the CallEngine. The decision rule is simple: if BOTH report direct_ok = true, use direct; if EITHER reports false, BOTH use relay. ## Wire protocol New `SignalMessage::MediaPathReport { call_id, direct_ok, race_winner }`. The relay forwards it to the call peer via the same signal_hub routing used for DirectCallOffer/Answer. The cross-relay dispatcher also forwards it. ## dual_path::race restructured Returns `RaceResult` instead of `(Arc<QuinnTransport>, WinningPath)`: - `direct_transport: Option<Arc<QuinnTransport>>` - `relay_transport: Option<Arc<QuinnTransport>>` - `local_winner: WinningPath` Both paths are run as spawned tasks. After the first completes, a 1s grace period lets the loser also finish. The connect command gets BOTH transports (when available) and picks the right one based on the negotiation outcome. The unused transport is dropped. ## connect command flow (revised) 1. Run race() → RaceResult with both transports 2. Send MediaPathReport to relay with our direct_ok 3. Install oneshot; wait for peer's report (3s timeout) 4. Decision: both direct_ok → use direct; else → use relay 5. Start CallEngine with the agreed transport If the peer never responds (old build, timeout), falls back to relay — backward compatible. ## Relay forwarding MediaPathReport is forwarded like DirectCallOffer/Answer: via signal_hub.send_to(peer_fp) for same-relay calls, and via cross-relay dispatcher for federated calls. 
## Debug log events - `connect:dual_path_race_done` — local race result - `connect:path_report_sent` — our report to the peer - `connect:peer_report_received` — peer's report - `connect:peer_report_timeout` — peer didn't respond (3s) - `connect:path_negotiated` — final agreed path with reasons Full workspace test: 423 passing (no regressions). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:03:42 +04:00
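The Phase 6 decision rule in f5542ef822 is small enough to state as a function: direct is used only when both sides report `direct_ok = true`, and a missing peer report (old build or 3s timeout) falls back to relay. The `Path` enum and `negotiate_path` name are hypothetical; the peer's report is modeled as `Option<bool>` where `None` means no report arrived.

```rust
/// The transport both peers commit to after negotiation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Path {
    Direct,
    Relay,
}

/// Apply the Phase 6 rule.
/// `local_direct_ok`: whether our own race produced a working direct transport.
/// `peer_direct_ok`: the peer's reported `direct_ok`, or None on timeout.
pub fn negotiate_path(local_direct_ok: bool, peer_direct_ok: Option<bool>) -> Path {
    match peer_direct_ok {
        Some(true) if local_direct_ok => Path::Direct,
        // Either side failed, or the peer never answered: both use relay.
        _ => Path::Relay,
    }
}
```

Because both peers evaluate the same symmetric rule over the same pair of values, they reach the same answer without a further round trip, which is what removes the TX > 0, RX: 0 split described above.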
Siavash Sameni	de007ec2fd	fix(p2p): skip direct P2P when peers are on different public IPs Race condition: when two phones are on different networks (WiFi vs LTE, home vs office, etc.), each side's dual-path race runs independently. One side may pick Direct while the other picks Relay, causing both to send media to different places — TX > 0, RX: 0 on both sides, completely silent call. Root cause: the dual-path race doesn't have a negotiation step. Each side picks the first transport that completes a QUIC handshake, which may be a different path than the other side picked. On same-LAN this doesn't matter because direct always wins on both (the 500ms relay delay guarantees it). On cross- network, the asymmetry bites. Heuristic fix: compare own_reflex_addr IP to peer_reflex_addr IP. If they're different → different networks → force relay-only (set role = None, which skips the dual-path race entirely). Same public IP means same LAN / same NAT: → LAN host candidates work, direct always wins on both sides → Safe for P2P Different public IPs means cross-network: → Direct may work on one side but not the other → Relay is the safe choice for both This preserves the proven same-LAN P2P and eliminates the broken cross-network case. The full fix is ICE-style path negotiation (Phase 6) where both sides exchange connectivity check results through the signal plane and agree on a winner before committing media — but that's a 500+ line protocol change. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 09:50:56 +04:00
Siavash Sameni	2427630472	fix(connect): make peerLocalAddrs optional + skip handshake on direct P2P Two regressions from Phase 5.5/5.6: 1. Room connect broken: the connect Tauri command required peerLocalAddrs as a Vec<String>, but the room-join JS path doesn't pass it (only the direct-call setup handler does). Error: "invalid args 'peerLocalAddrs' for command 'connect': command connect missing required key peerLocalAddrs". Fix: change to Option<Vec<String>>, unwrap_or_default() at usage sites. Room connect works again with zero peer addrs. 2. Direct P2P call connects but then CallEngine fails with "expected CallAnswer, got Discriminant(0)". Root cause: after the dual-path race picked a direct P2P transport, CallEngine still ran perform_handshake() on it. That handshake is a relay-specific protocol — sends a CallOffer signal and waits for CallAnswer back. On a direct QUIC connection to a phone, there's nobody running accept_handshake, so the handshake reads garbage from the peer's first media packet and errors. Fix: track is_direct_p2p = pre_connected_transport.is_some() and skip perform_handshake when true. The direct connection is already TLS-encrypted by QUIC, and both peers' identities were verified through the signal channel (DirectCallOffer/ Answer carry identity_pub + ephemeral_pub + signature). Both android and desktop branches updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 08:09:32 +04:00
Siavash Sameni	16793be36f	fix(p2p): Phase 5.6 — direct-path head start + hangup propagation + media debug events Three fixes from a field-test log where same-LAN calls were still losing the dual-path race to the relay path, peers were getting stuck on an empty call screen when the other side hung up, and 1-way audio was hard to diagnose because the GUI debug log had no media-level events. ## 1. Direct-path 500ms head start (dual_path.rs) The race was resolving in ~105ms with Relay winning even when both phones were on the same MikroTik LAN with valid IPv6 host candidates. Root cause: the relay dial is a plain outbound QUIC connect that completes in whatever the client→relay RTT is (~100ms), while the direct path needs the PEER to also process its CallSetup, spin up its own race, and complete at least one LAN dial back to us. That cross-client sequence reliably takes longer than 100ms, so relay always won. Fix: delay the relay_fut with `tokio::time::sleep(500ms)` before starting its connect. Same-LAN direct dials complete in 30-50ms typically, so the head start gives direct plenty of time to win cleanly. Users on setups where direct genuinely can't work (LTE-to-LTE cross-carrier) pay 500ms extra on the relay fallback, which is invisible for a call setup. ## 2. Hangup propagation via a new hangup_call command (lib.rs + main.ts) The hangup button was calling `disconnect` which stopped the local media engine but never sent a SignalMessage::Hangup to the relay. The peer never got notified and was stuck on the call screen with silent audio. My earlier fix (commit `e75b045`) only handled the RECEIVE side — auto-dismiss call screen on recv:Hangup — but the SEND side was still missing. New Tauri command `hangup_call`: 1. Acquire state.signal.lock(), send SignalMessage::Hangup over the signal transport (best-effort; log + continue if signal is down) 2. 
Acquire state.engine.lock(), stop the CallEngine. The JS hangupBtn click handler now calls hangup_call with a fallback to raw disconnect if the command is missing (older builds). ## 3. Media debug events (engine.rs + lib.rs) Threaded tauri::AppHandle into CallEngine::start so the send/recv tasks can emit call-debug events when the user has debug logs enabled. Added on the Android branch (desktop branch accepts the arg for API symmetry but doesn't emit yet): - media:first_send: emitted when the first encoded frame is handed to the transport. Useful for 1-way audio diagnosis: if this fires on side A but side B never sees media:first_recv, A's outbound is broken. - media:first_recv: emitted when the first packet from the peer arrives. Mirror of first_send. - media:send_heartbeat: every 2s with frames_sent, last_rms, last_pkt_bytes, short_reads, drops. A stalled last_rms (== 0) tells you the mic isn't producing samples; a frozen frames_sent tells you the encode pipeline hung. - media:recv_heartbeat: every 2s with recv_fr, decoded_frames, last_written, written_samples, decode_errs, codec. Mirror invariants for the inbound direction. All four are gated by `call_debug_logs_enabled()` via `emit_call_debug`, so they only show up in the GUI log when the user has the Call Flow Debug Logs checkbox on. `tracing::info!` still runs unconditionally so logcat (adb) keeps its copy regardless. The `emit_call_debug` fn in lib.rs is now `pub(crate)` so engine.rs can call it via `crate::emit_call_debug`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:55:41 +04:00
Siavash Sameni	fa038df057	feat(p2p): Phase 5.5 — ICE LAN host candidates (IPv4 + IPv6) Same-LAN P2P was failing because MikroTik masquerade (like most consumer NATs) doesn't support NAT hairpinning — the advertised WAN reflex addr is unreachable from a peer on the same LAN as the advertiser. Phase 5 got us Cone NAT classification and fixed the measurement artifact, but same-LAN direct dials still had nowhere to land. Phase 5.5 adds ICE-style host candidates: each client enumerates its LAN-local network interface addresses, includes them in the DirectCallOffer/Answer alongside the reflex addr, and the dual-path race fans out to ALL peer candidates in parallel. Same-LAN peers find each other via their RFC1918 IPv4 + ULA / global-unicast IPv6 addresses without touching the NAT at all. Dual-stack IPv6 is in scope from the start — on modern ISPs (including Starlink) the v6 path often works even when v4 hairpinning doesn't, because there's no NAT on the v6 side. ## Changes ### `wzp_client::reflect::local_host_candidates(port)` (new) Enumerates network interfaces via `if-addrs` and returns SocketAddrs paired with the caller's port. Filters: - IPv4: RFC1918 (10/8, 172.16/12, 192.168/16) + CGNAT (100.64/10) - IPv6: global unicast (2000::/3) + ULA (fc00::/7) - Skipped: loopback, link-local (169.254, fe80::), public v4 (already covered by reflex-addr), unspecified Safe from any thread, one `getifaddrs(3)` syscall. ### Wire protocol (wzp-proto/packet.rs) Three new `#[serde(default, skip_serializing_if = "Vec::is_empty")]` fields, backward-compat with pre-5.5 clients/relays by construction: - `DirectCallOffer.caller_local_addrs: Vec<String>` - `DirectCallAnswer.callee_local_addrs: Vec<String>` - `CallSetup.peer_local_addrs: Vec<String>` ### Call registry (wzp-relay/call_registry.rs) `DirectCall` gains `caller_local_addrs` + `callee_local_addrs` Vec<String> fields. New `set_caller_local_addrs` / `set_callee_local_addrs` setters. Follow the same pattern as the reflex addr fields. 
### Relay cross-wiring (wzp-relay/main.rs) Both the local-call and cross-relay-federation paths now track the local_addrs through the registry and inject them into the CallSetup's peer_local_addrs. Cross-wiring is identical to the existing peer_direct_addr logic — each party's CallSetup carries the OTHER party's LAN candidates. ### Client side (desktop/src-tauri/lib.rs) - `place_call`: gathers local host candidates via `local_host_candidates(signal_endpoint.local_addr().port())` and includes them in `DirectCallOffer.caller_local_addrs`. The port match is critical — it's the Phase 5 shared signal socket, so incoming dials to these addrs land on the same endpoint that's already listening. - `answer_call`: same, AcceptTrusted only (privacy mode keeps LAN addrs hidden too, for consistency with the reflex addr). - `connect` Tauri command: new `peer_local_addrs: Vec<String>` arg. Builds a `PeerCandidates` bundle and passes it to the dual-path race. - Recv loop's CallSetup handler: destructures + forwards the new field to JS via the signal-event payload. ### `dual_path::race` (wzp-client/dual_path.rs) Signature change: takes `PeerCandidates` (reflex + local Vec) instead of a single SocketAddr. The D-role branch now fans out N parallel dials via `tokio::task::JoinSet` — one per candidate — and the first successful dial wins (losers are aborted immediately via `set.abort_all()`). Only when ALL candidates have failed do we return Err; individual candidate failures are just traced at debug level and the race waits for the others. LAN host candidates are tried BEFORE the reflex addr in `PeerCandidates::dial_order()` — they're faster when they work, and the reflex addr is the fallback for the not-on-same-LAN case. ### JS side (desktop/main.ts) `connect` invoke now passes `peerLocalAddrs: data.peer_local_addrs ?? []` alongside the existing `peerDirectAddr`. 
### Tests All existing test callsites updated for the new Vec<String> fields (defaults to Vec::new() in tests — they don't exercise the multi-candidate path). `dual_path.rs` integration tests wrap the single `dead_peer` / `acceptor_listen_addr` in a `PeerCandidates { reflexive: Some(_), local: Vec::new() }`. Full workspace test: 423 passing (same as before 5.5). ## Expected behavior on the reporter's setup Two phones behind MikroTik, both on the same LAN: place_call:host_candidates {"local_addrs": ["192.168.88.21:XXX", "2001:...:YY:XXX"]} recv:DirectCallAnswer {"callee_local_addrs": ["192.168.88.22:ZZZ", "2001:...:WW:ZZZ"]} recv:CallSetup {"peer_direct_addr":"150.228.49.65:NN", "peer_local_addrs":["192.168.88.22:ZZZ","2001:...:WW:ZZZ"]} connect:dual_path_race_start {"peer_reflex":"...","peer_local":[...]} dual_path: direct dial succeeded on candidate 0 ← LAN v4 wins connect:dual_path_race_won {"path":"Direct"} Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:34:49 +04:00
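The address filter that fa038df057 describes for `local_host_candidates` can be approximated with the standard library alone. This sketch assumes the interface addresses have already been enumerated (the commit uses `if-addrs` for that step); `is_host_candidate` and `filter_candidates` are hypothetical helper names, not the crate's API.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

/// IPv4: keep RFC1918 private ranges and CGNAT (100.64.0.0/10).
/// Loopback, link-local, unspecified and public v4 all fail both checks.
fn is_candidate_v4(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    let cgnat = o[0] == 100 && (o[1] & 0xC0) == 64;
    ip.is_private() || cgnat
}

/// IPv6: keep global unicast (2000::/3) and ULA (fc00::/7).
/// Loopback (::1) and link-local (fe80::/10) match neither prefix.
fn is_candidate_v6(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    let global_unicast = (first & 0xE000) == 0x2000;
    let ula = (first & 0xFE00) == 0xFC00;
    global_unicast || ula
}

fn is_host_candidate(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => is_candidate_v4(v4),
        IpAddr::V6(v6) => is_candidate_v6(v6),
    }
}

/// Pair every accepted interface address with the caller's port,
/// preserving the input order.
fn filter_candidates(addrs: &[IpAddr], port: u16) -> Vec<SocketAddr> {
    addrs
        .iter()
        .copied()
        .filter(|ip| is_host_candidate(*ip))
        .map(|ip| SocketAddr::new(ip, port))
        .collect()
}
```

Keeping the filter a pure function over `IpAddr` values means it can be unit-tested without touching real network interfaces.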
Siavash Sameni	1618ff6c9d	feat(p2p): Phase 5 — single-socket architecture (Nebula-style) Before Phase 5 WarzonePhone used THREE separate UDP sockets per client: 1. Signal endpoint (register_signal, client-only) 2. Reflect probe endpoints (one fresh socket per relay probe) 3. Dual-path race endpoint (fresh per call setup) This broke two things in production on port-preserving NATs (MikroTik masquerade, most consumer routers): a. Phase 2 NAT detection was WRONG. Each probe used a fresh internal port, so MikroTik mapped each one to a different external port, and the classifier saw "different port per relay" and labeled it SymmetricPort. The real NAT was cone-like but measurement via fresh sockets hid that. b. Phase 3.5 dual-path P2P race was BROKEN. The reflex addr we advertised in DirectCallOffer was observed by the signal endpoint's socket. The actual dual-path race listened on a DIFFERENT fresh socket, on a different internal (and therefore external) port. Peers dialed the advertised addr and hit MikroTik's mapping for the signal socket, which forwarded to the signal endpoint — a client-only endpoint that doesn't accept incoming connections. Direct path silently failed, relay always won the race. Nebula-style fix: one socket for everything. The signal endpoint is now dual-purpose (client + server_config), and both the reflect probes and the dual-path race reuse it instead of creating fresh ones. MikroTik's port-preservation then gives us a stable external port across all flows → classifier correctly sees Cone NAT → advertised reflex addr is the actual listening port → direct dials from peers land on the right socket → `endpoint.accept()` in the A-role branch of the dual-path race picks up the incoming connection. ## Changes ### `register_signal` (desktop/src-tauri/src/lib.rs) - Endpoint now created with `Some(server_config())` instead of `None`. The socket can now accept incoming QUIC connections as well as dial outbound. 
- Every code path that previously read `sig.endpoint` for the relay-dial reuse benefits automatically — same socket is now ALSO listening for peer dials. ### `probe_reflect_addr` (wzp-client/src/reflect.rs) - New `existing_endpoint: Option<Endpoint>` arg. `Some` reuses the caller's socket (production: pass the signal endpoint). `None` creates a fresh one (tests + pre-registration). - Removed the `drop(endpoint)` at the end — was correct for fresh endpoints (explicit early socket close) but incorrect for shared ones. End-of-scope drop does the right thing in both cases via Arc semantics. ### `detect_nat_type` (wzp-client/src/reflect.rs) - New `shared_endpoint: Option<Endpoint>` arg, forwarded to every probe in the JoinSet fan-out. One shared socket means the classifier sees the true NAT type. ### `detect_nat_type` Tauri command (desktop/src-tauri/src/lib.rs) - Reads `state.signal.endpoint` and passes it as the shared endpoint. Falls back to None when not registered. NAT detection now produces accurate classifications against MikroTik / most consumer NATs. ### `dual_path::race` (wzp-client/src/dual_path.rs) - New `shared_endpoint: Option<Endpoint>` arg. - A-role: when `Some`, reuses it for `accept()`. This is the critical change — the reflex addr advertised to peers is now the address listening for incoming direct dials. - D-role: when `Some`, reuses it for the outbound direct dial. MikroTik keeps the same external port for the dial as for the signal flow → direct dial through a cone-mapped NAT. - Relay path: also reuses the shared endpoint so MikroTik has a single consistent mapping across the whole call (saves one extra external port and makes firewall traces cleaner). - When `None`, falls back to fresh per-role endpoints as before. ### `connect` Tauri command (desktop/src-tauri/src/lib.rs) - Reads `state.signal.endpoint` once when acquiring own reflex addr and passes it through to `dual_path::race`. 
### Tests - `wzp-client/tests/dual_path.rs` and `wzp-relay/tests/multi_reflect.rs` updated to pass `None` for the new endpoint arg — tests use fresh sockets and that's fine because the loopback harness doesn't care about port-preserving NAT behavior. Full workspace test: 423 passing (no regressions). ## Expected behavior after this commit on real hardware Behind MikroTik + Starlink-bypass (the reporter's setup): - Phase 2 NAT detect → Cone NAT (was SymmetricPort — false positive from the measurement artifact) - Phase 3.5 direct-P2P dial → succeeds for both cone-cone and cone-CGNAT cases where the remote side was previously blocked by our own socket mismatch - LTE ↔ LTE cross-carrier → still likely relay fallback; that's genuinely strict symmetric and needs Phase 5.5 port prediction. ## Phase 5.5 (next, separate PRD) Multi-candidate port prediction + ICE-style candidate aggregation for truly strict symmetric NATs. Not needed for the 95% case — Phase 5 alone fixes most consumer-router setups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 19:47:20 +04:00
Siavash Sameni	b7a48bf13b	feat(ui): incoming-call ring tone + system notification Previously: incoming calls silently popped an "Accept/Reject" panel. Easy to miss — no audible cue, no system-level alert if the app was backgrounded. Now the incoming-call path triggers both a synthesized ring tone and a system notification banner. ## Ring tone (desktop/src/main.ts) New `Ringer` class using Web Audio API directly — no external asset files, no new npm dep. Synthesizes a classic NANP two-tone cadence (440Hz + 480Hz sine mix, 2s tone + 4s silence, looped) through an envelope-gated gain node that ramps on/off to avoid clicks. Audible on every Tauri-supported platform because WebView carries Web Audio. - `start()` — lazily creates AudioContext on first use (platforms that require a user gesture for AudioContext creation still work because the incoming-call event is user-adjacent from the webview's perspective), starts setInterval(6000) loop. - `stop()` — clears the timer AND disconnects any active oscillators so there's no tail audio. - Active-nodes array is swept every cycle so it doesn't grow unbounded across long rings. Hooked into signal-event handlers: - `"incoming"` → `ringer.start()` + notifyIncomingCall - `"answered"`, `"setup"`, `"hangup"` → `ringer.stop()` - Accept/Reject button click handlers → `ringer.stop()` as the first thing they do (before any await) ## System notification (desktop/src-tauri + main.ts) Added `tauri-plugin-notification = "2"` to the Tauri app and registered in the builder. Capabilities updated with the four notification permissions. Frontend calls the plugin commands via the generic `invoke` instead of adding `@tauri-apps/plugin-notification` as a JS dep — Tauri plugins expose `plugin:notification\|notify` etc. directly. Flow: 1. `is_permission_granted` — check cached 2. If not granted → `request_permission` (Android prompts the user once, cached thereafter) 3. 
`notify` with title="Incoming call", body="From <alias>" All wrapped in try/catch with console.debug fallback; a missing plugin or denied permission is non-fatal, and the visible panel + ring tone still alert the user. ## Known gaps (deferred) - Android native system ringtone (RingtoneManager) + full-screen intent for lockscreen-visible ringer. Requires platform-specific Java/Kotlin glue in the Tauri Android shell, which is a bigger lift. - Desktop window flash / taskbar attention-seek on incoming call when app is backgrounded. - Vibration pattern on Android. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 18:46:13 +04:00
Siavash Sameni	20375eceb9	feat(signal): transparent reconnect + auto-swap on relay change Two related UX fixes, same state-machine surface: 1. Relay drops / goes offline / restarts: the client now auto- reconnects in the background instead of silently falling to "not registered" and requiring the user to tap Deregister + Register. 2. User switches relay in settings: client auto-swaps — close old transport, register against new, all transparent. ## Signal state additions (desktop/src-tauri/src/lib.rs) - `SignalState.desired_relay_addr: Option<String>` — what the user CURRENTLY wants. `Some(x)` means "keep me connected to x", `None` means "user explicitly asked for idle". This is the pivot that distinguishes "connection dropped, retry" from "user deregistered, stop". - `SignalState.reconnect_in_progress: bool` — single-flight guard so concurrent triggers (recv-loop exit + manual register_signal + another recv-loop exit after a brief success) don't spawn duplicate supervisors. ## Refactor The old `register_signal` Tauri command was doing the whole connect + Register + spawn-recv-loop flow inline. Split into: - `internal_deregister(signal_state, keep_desired)` — shared teardown helper that nulls out transport/endpoint/call state and optionally clears `desired_relay_addr`. - `do_register_signal(signal_state, app, relay)` — core connect + register + spawn-recv-loop flow, callable from both the Tauri command and the reconnect supervisor. Returns an explicit `impl Future<...> + Send` to avoid auto-trait inference bailing inside the tokio::spawn chain (rustc loses the Send trail through the recv-loop spawn inside the fn body). - `register_signal` Tauri command — now thin: if already registered to the same relay, no-op; otherwise internal_deregister(keep_desired=false), set desired_relay_addr = Some(new), call do_register_signal. The Rust side handles the "change of server" transition entirely on its own, no deregister+register dance from JS needed. 
- `deregister` Tauri command — internal_deregister(keep_desired = false) so the recv-loop exit path sees the cleared desired addr and does NOT spawn a supervisor. ## Reconnect supervisor New `signal_reconnect_supervisor(signal_state, app, relay)` task. Spawned from the recv-loop exit path when the loop exits unexpectedly AND `desired_relay_addr.is_some()` AND no supervisor is already running. - Exponential backoff: 1s, 2s, 4s, 8s, 15s, 30s (capped at 30s, never gives up). First attempt is immediate (attempt 0 skips the wait). - On each iteration checks whether `desired_relay_addr` was cleared (user deregistered mid-flight) or another path already re-registered; either short-circuits the supervisor. - Also detects if the user changed relays while the supervisor was sleeping — resets the backoff counter and retries against the new addr. - On success, exits so the newly-spawned recv loop owns the connection from that point. If THAT drops again, a fresh supervisor spawns. - Emits `call-debug-log` and `signal-event` events at every state transition so the GUI can display "reconnecting...", "registered" banners. ## UI wiring (desktop/src/main.ts) - signal-event handler gets two new cases: - `"reconnecting"` — amber "🔄 reconnecting to <relay>…" in the registered banner area - `"registered"` — green "✓ registered (<fp prefix>…)" to clear the reconnecting badge - Relay-selection click handler checks if a signal is currently registered and, if the user picked a different relay, fires `register_signal` with the new address. Rust side handles the swap transparently. Full workspace test: 423 passing (no regressions). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 18:40:11 +04:00
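The backoff schedule listed for the reconnect supervisor in 20375eceb9 (1s, 2s, 4s, 8s, 15s, 30s, capped, first attempt immediate) can be expressed as a small lookup. `reconnect_delay` is a hypothetical name, and mapping attempt 1 to the first table entry is an assumption about how the supervisor counts attempts.

```rust
use std::time::Duration;

/// Delays in seconds for attempts 1..=6; later attempts reuse the last entry.
const BACKOFF_SECS: [u64; 6] = [1, 2, 4, 8, 15, 30];

/// Delay to wait before the given attempt.
/// Attempt 0 is immediate; the schedule never exceeds 30s and never ends.
fn reconnect_delay(attempt: usize) -> Duration {
    if attempt == 0 {
        return Duration::ZERO;
    }
    // Clamp the index so every attempt past the table gets the 30s cap.
    let idx = (attempt - 1).min(BACKOFF_SECS.len() - 1);
    Duration::from_secs(BACKOFF_SECS[idx])
}
```

A table is used instead of doubling because the listed sequence is not a pure power of two (8s is followed by 15s, then 30s).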
Siavash Sameni	da08723fe7	fix(signal): forward-compat: log+continue on unknown SignalMessage variants Both sides of the signal channel previously broke their recv loop on any deserialize error, which meant adding a new variant in one build silently killed signal connections from peers running an older build. This bit us during Phase 1 testing: a new client sending SignalMessage::Reflect to a pre-Phase-1 relay caused the relay to drop the whole signal connection, which looked like "Error: not registered" on the next place_call. Fix: - New TransportError::Deserialize(String) variant in wzp-proto carries serde errors as a distinct category. - wzp-transport/reliable.rs::recv_signal returns Deserialize on serde_json::from_slice failures (was wrapped in Internal). - wzp-relay/main.rs signal loop matches on Deserialize → warn + continue (instead of break). - desktop/src-tauri/lib.rs recv loop does the same. Other TransportError variants (ConnectionLost, Io, Internal) still break the loop; only pure parse failures are recoverable. This means future SignalMessage variant additions are backward-compat by construction: older peers will see "unknown variant, continuing" in their logs while newer peers can keep evolving the protocol. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 18:13:31 +04:00
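The log-and-continue pattern from da08723fe7 can be illustrated without the real transport. In this sketch the `TransportError` enum is a simplified stand-in with only two variants, messages are plain strings, and `run_recv_loop` is a hypothetical function that consumes pre-recorded results instead of reading from a QUIC stream.

```rust
/// Simplified stand-in for the transport error type (assumption: the real
/// enum has more variants, such as Io and Internal).
#[derive(Debug)]
enum TransportError {
    /// The bytes arrived but could not be parsed into a known message.
    Deserialize(String),
    /// The connection itself is gone.
    ConnectionLost,
}

/// Drain incoming results the way the recv loop is described:
/// parse failures are logged and skipped, anything else ends the loop.
/// Returns the handled messages and the number of skipped parse failures.
fn run_recv_loop<I>(incoming: I) -> (Vec<String>, usize)
where
    I: IntoIterator<Item = Result<String, TransportError>>,
{
    let mut handled = Vec::new();
    let mut skipped = 0;
    for item in incoming {
        match item {
            Ok(msg) => handled.push(msg),
            Err(TransportError::Deserialize(why)) => {
                // Recoverable: a newer peer sent a variant we do not know.
                eprintln!("unknown variant, continuing: {why}");
                skipped += 1;
            }
            Err(other) => {
                // Not recoverable: stop reading from this connection.
                eprintln!("signal loop exiting: {other:?}");
                break;
            }
        }
    }
    (handled, skipped)
}
```

Messages that arrive after a fatal error are never processed, which mirrors the distinction the commit draws between parse failures and connection-level failures.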
Siavash Sameni	59ce52f8e8	feat(p2p): Phase 3.5 dual-path QUIC race + GUI call-flow debug logs Two features in one commit because they ship and test together: Phase 3.5 closes the hole-punching loop and the call-flow debug logs give the user live visibility into every step of a call so real-hardware testing of the new P2P path is debuggable. ## Phase 3.5 — dual-path QUIC connect race Completes the hole-punching work Phase 3 scaffolded. On receiving a CallSetup with peer_direct_addr, the client now actually races a direct QUIC handshake against the relay dial and uses whichever completes first. Symmetric role assignment avoids the two-conns- per-call problem: - Both peers compare `own_reflex_addr` vs `peer_reflex_addr` lexicographically. - Smaller addr → Acceptor (A-role): builds a server-capable dual endpoint, awaits an incoming QUIC session. Does NOT dial. - Larger addr → Dialer (D-role): builds a client-only endpoint, dials the peer's addr with `call-<id>` SNI. Does NOT listen. - Both sides always dial the relay in parallel as fallback. - `tokio::select!` with `biased` preference for direct, `tokio::pin!` so each branch can await the losing opposite as fallback. - Direct timeout 2s, relay fallback timeout 5s (so 7s worst case from CallSetup to "no media path" error). New crate module `wzp_client::dual_path::{race, WinningPath}` (moved here from desktop/src-tauri so it's testable from a workspace test). `determine_role` in `wzp_client::reflect` is pure-function and unit-tested. ### CallEngine integration - New `pre_connected_transport: Option<Arc<QuinnTransport>>` arg on both android + desktop `CallEngine::start` branches. Skips the internal wzp_transport::connect step when Some. Backward- compat: None keeps Phase 0 relay-only behavior. - `connect` Tauri command reads own_reflex_addr from SignalState, computes role, runs the race, passes the winning transport into CallEngine. 
If ANY input is missing (no peer addr, no own addr, equal addrs), falls back to classic relay path — identical to pre-Phase-3.5 behavior. ### Tests (9 new, all passing) - 6 unit tests for `determine_role` truth table in `wzp-client/src/reflect.rs` (smaller=Acceptor, larger=Dialer, port-only diff, equal, missing-side, symmetry) - 3 integration tests in `crates/wzp-client/tests/dual_path.rs`: * `dual_path_direct_wins_on_loopback` — two-endpoint test rig, Dialer wins direct path vs loopback mock relay * `dual_path_relay_wins_when_direct_is_dead` — dead peer port, 2s direct timeout, relay fallback wins * `dual_path_errors_cleanly_when_both_paths_dead` — <10s error, no hang ## GUI call-flow debug logs Runtime-toggled structured events at every step of a call so the user can see where a call progressed or stalled on real hardware. Modeled on the existing DRED_VERBOSE_LOGS pattern. ### Rust side - `static CALL_DEBUG_LOGS: AtomicBool` + `emit_call_debug(&app, step, details)` helper. Always logs via `tracing::info!` (logcat always has a copy); GUI Tauri `call-debug-log` event only fires when the flag is on. - Tauri commands `set_call_debug_logs` / `get_call_debug_logs`. ### Instrumented steps (24 emit_call_debug sites) - `register_signal`: start, identity loaded, endpoint created, connect failed/ok, RegisterPresence sent, ack received/failed, recv loop spawning - Recv loop: CallRinging, DirectCallOffer (w/ caller_reflexive_addr), DirectCallAnswer (w/ callee_reflexive_addr), CallSetup (w/ peer_direct_addr), Hangup - `place_call`: start, reflect query start/ok/none, offer sent, send failed - `answer_call`: start, reflect query start/ok/none or privacy skip, answer sent, send failed - `connect`: start, dual_path_race_start (w/ role), won (w/ path), failed, skipped (w/ reasons), call_engine_starting/ started/failed ### JS side - New `callDebugLogs: boolean` field on Settings type. 
- Boot-time hydrate of the Rust flag from localStorage so the choice survives restarts (like `dredDebugLogs`). - Settings panel: new "Call flow debug logs" checkbox alongside the DRED toggle. - New "Call Debug Log" section that ONLY shows when the flag is on. Rolling in-memory buffer of the last 200 events, rendered as monospace `HH:MM:SS.mmm step {details}` lines with auto- scroll and a Clear button. - `listen("call-debug-log", ...)` subscribed at app startup, appends to the buffer, re-renders on every event. Full workspace test goes from 404 → 413 passing. Clippy clean on touched crates. PRD: .taskmaster/docs/prd_phase35_dual_path_race.txt Tasks: 61-69 all completed Next: APK + desktop build carrying everything — Phase 2 NAT detect, Phase 3 advertising, Phase 3.5 dual-path + call debug logs, plus the earlier Android first-join diagnostics — so the user can validate the P2P path on real hardware with live per-step visibility into where any failures happen. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 14:06:44 +04:00
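The symmetric role assignment described for Phase 3.5 in 59ce52f8e8 reduces to one comparison. This sketch compares the addresses as strings, which is one reading of "lexicographically"; the actual `determine_role` in `wzp_client::reflect` may take different argument types, so the signature here is an assumption.

```rust
use std::cmp::Ordering;

#[derive(Debug, PartialEq, Clone, Copy)]
enum Role {
    /// Smaller address: listens for the incoming direct connection.
    Acceptor,
    /// Larger address: dials the peer.
    Dialer,
}

/// Returns None when either address is missing or both are equal,
/// in which case the caller falls back to the relay-only path.
fn determine_role(own: Option<&str>, peer: Option<&str>) -> Option<Role> {
    let (own, peer) = (own?, peer?);
    match own.cmp(peer) {
        Ordering::Less => Some(Role::Acceptor),
        Ordering::Greater => Some(Role::Dialer),
        Ordering::Equal => None,
    }
}
```

Because both peers evaluate the same comparison with the arguments swapped, they always land on opposite roles without exchanging any extra message.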
Siavash Sameni	39277bf3a0	feat(hole-punching): advertise peer reflexive addrs in DirectCall flow — Phase 3 Completes the signal-plane plumbing for P2P direct calling: both peers now learn their own server-reflexive address (Phase 1 Reflect), include it in DirectCallOffer / DirectCallAnswer, and the relay cross-wires them into each side's CallSetup so the client knows the OTHER party's direct addr. Dual-path QUIC race is scaffolded but deferred to Phase 3.5 — this commit ships the full advertising layer so real-hardware testing can confirm the addrs flow end-to-end before adding the concurrent-connect logic. Wire protocol (wzp-proto/src/packet.rs): - DirectCallOffer gains optional `caller_reflexive_addr` - DirectCallAnswer gains optional `callee_reflexive_addr` - CallSetup gains optional `peer_direct_addr` - All #[serde(default, skip_serializing_if = "Option::is_none")] so pre-Phase-3 peers and relays stay backward compatible by construction — the new fields are elided from the JSON on the wire when None, and older clients parse the JSON ignoring any fields they don't know. - 2 new roundtrip tests (Some + None cases, old-JSON parse-back). Call registry (wzp-relay/src/call_registry.rs): - DirectCall gains caller_reflexive_addr + callee_reflexive_addr. - set_caller_reflexive_addr / set_callee_reflexive_addr setters. - 2 new unit tests: stores and returns addrs, clearing works. Relay cross-wiring (wzp-relay/src/main.rs): - On DirectCallOffer: stash the caller's addr in the registry. - On DirectCallAnswer: stash the callee's addr (only set by AcceptTrusted answers — privacy-mode leaves it None). - Send two different CallSetup messages: one to the caller with peer_direct_addr=callee_addr, and one to the callee with peer_direct_addr=caller_addr. The cross-wiring means each side gets the OTHER party's direct addr, not its own. - Logs `p2p_viable=true` when both sides advertised. 
Client advertising (desktop/src-tauri/src/lib.rs): - New `try_reflect_own_addr` helper that reuses the Phase 1 oneshot pattern WITHOUT holding state.signal.lock() across the await (critical: the recv loop reacquires the same mutex to fire the oneshot, so holding it would deadlock). - `place_call` queries reflect first and includes the returned addr in DirectCallOffer. Falls back to None on any failure — call still proceeds via the relay path. - `answer_call` queries reflect ONLY on AcceptTrusted so AcceptGeneric keeps the callee's IP private by design. Reject and AcceptGeneric both pass None. - recv loop's CallSetup handler destructures and forwards peer_direct_addr to the JS layer in the signal-event payload. Client scaffolding for dual-path (desktop/src-tauri/src/lib.rs + desktop/src/main.ts): - `connect` Tauri command gets a new optional `peer_direct_addr` argument. Currently LOGS the addr but still uses the relay path for the media connection — Phase 3.5 will swap in a tokio::select! race between direct dial + relay dial. Scaffolding lands here so the JS wire is stable, real-hardware testing can confirm advertising works end-to-end, and Phase 3.5 is a pure Rust change with no JS touches. - JS setup handler forwards `data.peer_direct_addr` to invoke. Back-compat with the CLI client (crates/wzp-client/src/cli.rs): - CLI test harness updated for the new fields — always passes None for both reflex addrs (no hole-punching). Also destructures peer_direct_addr: _ in its CallSetup handler. 
Tests (8 new, all passing): - wzp-proto: hole_punching_optional_fields_roundtrip, hole_punching_backward_compat_old_json_parses - wzp-relay call_registry: call_registry_stores_reflexive_addrs, call_registry_clearing_reflex_addr_works - wzp-relay integration: crates/wzp-relay/tests/hole_punching.rs * both_peers_advertise_reflex_addrs_cross_wire_in_setup * privacy_mode_answer_omits_callee_addr_from_setup * pre_phase3_caller_leaves_both_setups_relay_only * neither_peer_advertises_both_setups_are_relay_only Full workspace test goes from 396 → 404 passing. PRD: .taskmaster/docs/prd_hole_punching.txt Tasks: 53-60 all completed (58 = scaffolding-only; 3.5 follow-up) Next up: Phase 3.5 — dual-path QUIC connect race. With the advertising layer live, this becomes a focused change: on CallSetup-with-peer_direct_addr, start a server-capable dual endpoint, and tokio::select! across (direct dial, relay dial, inbound accept). Whichever QUIC handshake completes first wins, the losers drop, 2s direct timeout falls back to relay. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 13:37:04 +04:00
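The relay cross-wiring from 39277bf3a0 can be sketched with plain structs. The field names follow the commit message, but the struct shapes are trimmed to the relevant fields and `cross_wire` is a hypothetical helper, not the relay's actual function.

```rust
/// Registry entry, reduced to the two advertised addresses.
struct DirectCall {
    caller_reflexive_addr: Option<String>,
    /// None when the callee answered in privacy mode.
    callee_reflexive_addr: Option<String>,
}

/// CallSetup, reduced to the one field that differs per recipient.
#[derive(Debug, PartialEq)]
struct CallSetup {
    peer_direct_addr: Option<String>,
}

/// Build the two setups: each side receives the OTHER party's address.
/// The bool mirrors the `p2p_viable` log flag (both sides advertised).
fn cross_wire(call: &DirectCall) -> (CallSetup, CallSetup, bool) {
    let to_caller = CallSetup {
        peer_direct_addr: call.callee_reflexive_addr.clone(),
    };
    let to_callee = CallSetup {
        peer_direct_addr: call.caller_reflexive_addr.clone(),
    };
    let p2p_viable =
        call.caller_reflexive_addr.is_some() && call.callee_reflexive_addr.is_some();
    (to_caller, to_callee, p2p_viable)
}
```

In the privacy-mode case the caller's setup carries no peer address, so the caller has nothing to dial directly and stays on the relay path.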
Siavash Sameni	8d903f16c6	feat(reflect): multi-relay NAT type detection — Phase 2 Builds on Phase 1's SignalMessage::Reflect to probe N relays in parallel through transient QUIC connections and classify the client's NAT type for the future P2P hole-punching path. No wire protocol changes — Phase 1's Reflect/ReflectResponse pair is reused unchanged. New client-side module (crates/wzp-client/src/reflect.rs): - probe_reflect_addr(relay, timeout_ms): opens a throwaway quinn::Endpoint (fresh ephemeral source port per probe, essential for NAT-type detection — sharing one endpoint would make a symmetric NAT look like a cone NAT), connects to _signal, sends RegisterPresence with zero identity, consumes the Ack, sends Reflect, awaits ReflectResponse, cleanly closes. - detect_nat_type(relays, timeout_ms): parallel probes via tokio::task::JoinSet (bounded by slowest probe not sum) and returns a NatDetection with per-probe results + aggregate classification. - classify_nat(probes): pure-function classifier split out for network-free unit tests. Rules: * 0-1 successful probes → Unknown * 2+ successes, same ip same port → Cone (P2P viable) * 2+ successes, same ip diff ports → SymmetricPort (relay) * 2+ successes, different ips → Multiple (treat as symmetric) Tauri command (desktop/src-tauri/src/lib.rs): - detect_nat_type({ relays: [{ name, address }] }) -> NatDetection as JSON. Takes the relay list from JS because localStorage owns the config. Parse-up-front so a malformed entry fails clean instead of as a probe error. 1500ms per-probe timeout. UI (desktop/index.html + src/main.ts): - New "NAT type" row + "Detect NAT" button in the Network settings section. 
Renders per-probe status (name, address, observed addr, latency, or error) plus the colored verdict: * green Cone — shows consensus addr * amber SymmetricPort / Multiple — must relay * gray Unknown — not enough data Tests: - 7 unit tests in wzp-client/src/reflect.rs covering every classifier branch (empty, 1 success, 2 identical, 2 diff ports, 2 diff ips, success+failure mix, pure-failure). - 3 integration tests in crates/wzp-relay/tests/multi_reflect.rs: * probe_reflect_addr_happy_path — single mock relay end-to-end * detect_nat_type_two_loopback_relays_is_cone — two concurrent relays, asserts both see 127.0.0.1 and classifier returns Cone or SymmetricPort (accepted because the test harness uses fresh ephemeral ports per probe which look like SymmetricPort on single-host loopback) * detect_nat_type_dead_relay_is_unknown — alive + dead port mix, asserts the dead probe surfaces an error string and the aggregator returns Unknown (only 1 success) Full workspace test goes from 386 → 396 passing. PRD: .taskmaster/docs/prd_multi_relay_reflect.txt Tasks: 47-52 all completed Next up: hole-punching (Phase 3) — use the reflected address in DirectCallOffer/Answer and CallSetup so peers attempt a direct QUIC handshake to each other, with relay fallback on timeout. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:47:12 +04:00
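The classifier rules listed for classify_nat(probes) can be sketched as a pure function. This is a minimal illustration, not the code in crates/wzp-client/src/reflect.rs: the `NatType` enum and the `Option<SocketAddr>` probe representation are assumptions made for the example (the commit describes a richer NatDetection with per-probe results).

```rust
use std::collections::HashSet;
use std::net::SocketAddr;

/// Aggregate verdict, mirroring the four outcomes named in the commit message.
/// (Hypothetical type; the real result type may carry more detail.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NatType {
    Unknown,
    Cone,
    SymmetricPort,
    Multiple,
}

/// Classify from per-probe results. `None` stands for a probe that failed
/// or timed out; `Some(addr)` is the address a relay reported seeing.
fn classify_nat(probes: &[Option<SocketAddr>]) -> NatType {
    let seen: Vec<SocketAddr> = probes.iter().flatten().copied().collect();

    // 0-1 successful probes: not enough data to compare mappings.
    if seen.len() < 2 {
        return NatType::Unknown;
    }

    // Different public IPs across relays: treat as symmetric.
    let ips: HashSet<_> = seen.iter().map(|a| a.ip()).collect();
    if ips.len() > 1 {
        return NatType::Multiple;
    }

    // Same IP: one shared port means a stable mapping (Cone),
    // differing ports mean a per-destination mapping (SymmetricPort).
    let ports: HashSet<_> = seen.iter().map(|a| a.port()).collect();
    if ports.len() == 1 {
        NatType::Cone
    } else {
        NatType::SymmetricPort
    }
}
```

Keeping the classifier free of I/O is what lets the unit tests cover every branch without a network, as the commit notes.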
Siavash Sameni	921856eba9	feat(reflect): QUIC-native NAT reflection ("STUN for QUIC") — Phase 1 Lets a client ask its registered relay "what IP:port do you see for me?" over the existing TLS-authenticated signal channel, returning the client's server-reflexive address as a SocketAddr. Replaces the need for a classic STUN deployment and becomes the bootstrap step for future P2P hole-punching: once both peers know their own reflex addrs, they can advertise them in DirectCallOffer and attempt a direct QUIC handshake to each other. Wire protocol (wzp-proto): - SignalMessage::Reflect — unit variant, client -> relay - SignalMessage::ReflectResponse { observed_addr: String } — relay -> client - JSON-serde, appended at end of enum: zero ordinal concerns, backward compat with pre-Phase-1 relays by construction (older relays log "unexpected message" and drop; newer clients time out cleanly within 1s). Relay handler (wzp-relay/src/main.rs, signal loop): - New match arm next to Ping reuses the already-bound `addr` from connection.remote_address() and replies with observed_addr as a string. debug!-level log on success, warn!-level on send failure. Client side (desktop/src-tauri/src/lib.rs): - SignalState gains pending_reflect: Option<oneshot::Sender<SocketAddr>>. - get_reflected_address Tauri command installs the oneshot before sending Reflect and awaits it with a 1s timeout; cleans up on every exit path (send failure, timeout, parse error). - recv loop's new ReflectResponse arm fires the pending sender or emits a debug log for unsolicited responses — never crashes the loop on malformed input. - Integrated into invoke_handler! alongside the other signal commands. UI (desktop/index.html + src/main.ts): - New "Network" section in settings panel with a "Detect" button that displays the reflected address or a categorized warning ("register first" / "relay does not support reflection" / error). 
Tests (crates/wzp-relay/tests/reflect.rs — 3 new, all passing): - reflect_happy_path: client on loopback gets back 127.0.0.1:<its own port> - reflect_two_clients_distinct_ports: two concurrent clients see their own distinct ports, proving per-connection remote_address - reflect_old_relay_times_out: mock relay that ignores Reflect — client times out between 1000-1200ms and does not hang Also pre-existing test bit-rot unrelated to this PR — fixed so the full workspace `cargo test` goes green: - handshake_integration tests in wzp-client, wzp-relay and featherchat_compat in wzp-crypto all missed the `alias` field addition to CallOffer and the 3-arg form of perform_handshake plus 4-tuple return of accept_handshake. Updated to the current API surface. Results: cargo test --workspace --exclude wzp-android: 386 passed cargo check --workspace: clean cargo clippy: no new warnings in touched files Verification excludes wzp-android because it's dead code on this branch (Tauri mobile uses wzp-native instead) and can't link -llog on macOS host — unchanged status quo. PRD: .taskmaster/docs/prd_reflect_over_quic.txt Tasks: 39-46 all completed Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 12:29:07 +04:00
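The pending_reflect pattern described above (install a waiter before sending Reflect, await with a timeout, clean up on every exit path) can be sketched with the standard library alone. This is an assumption-laden illustration: it substitutes std::sync::mpsc and a blocking recv_timeout for the tokio oneshot and async timeout the commit describes, and `ReflectError`, `on_reflect_response`, and the `send_reflect` closure are hypothetical names.

```rust
use std::net::SocketAddr;
use std::sync::mpsc::{self, RecvTimeoutError, Sender};
use std::sync::Mutex;
use std::time::Duration;

/// Stand-in for the signal state: at most one reflect request in flight.
#[derive(Default)]
struct SignalState {
    pending_reflect: Mutex<Option<Sender<SocketAddr>>>,
}

#[derive(Debug, PartialEq, Eq)]
enum ReflectError {
    SendFailed,
    Timeout,
    BadResponse,
}

impl SignalState {
    /// Recv-loop side. Returns true only if a waiter received a valid address.
    /// Unsolicited or malformed responses are ignored rather than panicking.
    fn on_reflect_response(&self, observed_addr: &str) -> bool {
        let waiter = self.pending_reflect.lock().unwrap().take();
        match (waiter, observed_addr.parse::<SocketAddr>()) {
            (Some(tx), Ok(addr)) => tx.send(addr).is_ok(),
            // Malformed address: the sender is dropped here, which wakes the waiter.
            _ => false,
        }
    }

    /// Command side. `send_reflect` is a hypothetical hook that puts the
    /// Reflect message on the wire and reports whether the send succeeded.
    fn get_reflected_address(
        &self,
        send_reflect: impl FnOnce() -> bool,
        timeout: Duration,
    ) -> Result<SocketAddr, ReflectError> {
        let (tx, rx) = mpsc::channel();
        // Install the waiter first so a fast reply cannot be missed.
        *self.pending_reflect.lock().unwrap() = Some(tx);

        if !send_reflect() {
            self.pending_reflect.lock().unwrap().take();
            return Err(ReflectError::SendFailed);
        }
        match rx.recv_timeout(timeout) {
            Ok(addr) => Ok(addr),
            Err(RecvTimeoutError::Timeout) => {
                // Older relays never answer; clear the slot before returning.
                self.pending_reflect.lock().unwrap().take();
                Err(ReflectError::Timeout)
            }
            Err(RecvTimeoutError::Disconnected) => Err(ReflectError::BadResponse),
        }
    }
}
```

Whatever the outcome, the pending slot is empty when the call returns, so a late or stray ReflectResponse falls into the unsolicited branch instead of resolving a later request.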
Siavash Sameni	578ff8cff4	feat(debug): GUI toggle for DRED verbose logs + macOS mic permission DRED verbose logs (off by default — keeps logcat clean in normal use): - wzp-codec: DRED_VERBOSE_LOGS atomic flag with dred_verbose_logs() / set_dred_verbose_logs() helpers - opus_enc: gate "DRED enabled" + libopus version logs behind the flag - desktop/src-tauri/engine.rs: gate DredRecvState parse log, reconstruction log, classical PLC log, and DRED-counter fields in the Android recv heartbeat (non-verbose path still logs basic recv stats) - Tauri commands set_dred_verbose_logs / get_dred_verbose_logs - Settings panel gets a "DRED debug logs (verbose, dev only)" checkbox; preference persists in wzp-settings localStorage and is pushed to Rust on save and on app boot macOS mic permission: - Add desktop/src-tauri/Info.plist with NSMicrophoneUsageDescription. Without it, modern macOS silently denies CoreAudio capture for ad-hoc-signed Tauri builds — capture starts but every callback hands you zeros. Symptom: phones could not hear desktop client, desktop could still hear phones (playout has no TCC gate). The Tauri 2 bundler auto-merges this file into WarzonePhone.app's Contents/Info.plist on the next build, so first launch will pop the standard mic prompt. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 09:48:32 +04:00
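The flag-plus-helpers arrangement described for wzp-codec could look roughly like the sketch below. The names DRED_VERBOSE_LOGS, dred_verbose_logs() and set_dred_verbose_logs() come from the commit message; the `Ordering::Relaxed` choice and the `dred_log_line` call site are assumptions added for illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Process-wide switch, off by default so normal runs stay quiet.
static DRED_VERBOSE_LOGS: AtomicBool = AtomicBool::new(false);

/// Read the current setting. Relaxed ordering is assumed to be enough
/// because the flag only gates logging and guards no other data.
pub fn dred_verbose_logs() -> bool {
    DRED_VERBOSE_LOGS.load(Ordering::Relaxed)
}

/// Flip the setting, e.g. from a settings-panel command.
pub fn set_dred_verbose_logs(on: bool) {
    DRED_VERBOSE_LOGS.store(on, Ordering::Relaxed);
}

/// Hypothetical call site: returns the line that would be logged,
/// or None when verbose logging is disabled.
fn dred_log_line(frames_recovered: u32) -> Option<String> {
    if dred_verbose_logs() {
        Some(format!("DRED reconstructed {frames_recovered} frames"))
    } else {
        None
    }
}
```

A static atomic avoids threading a config handle through the codec and the engine; the UI only has to push the persisted preference once on boot and again on save.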
Siavash Sameni	510eae2089	feat(direct-call): call history, recent contacts, deregister button Persistent JSON-backed call history for the direct-call screen so users can see what they've placed / received / missed and dial back with one click. Also fixes two small latent UX issues reported alongside. Backend (Rust) - new crate/module desktop/src-tauri/src/history.rs: thread-safe in-process store (OnceLock<RwLock<Vec<CallHistoryEntry>>>) backed by <APP_DATA_DIR>/call_history.json. Atomic writes via temp+rename. Max 200 entries, FIFO pruning. CallDirection { Placed, Received, Missed }. - Log hooks in the signal loop + commands: * place_call → Placed entry (with target fingerprint) * DirectCallOffer → Missed entry up front; upgraded to Received inside answer_call when accept_mode != Reject via history::mark_received_if_pending(call_id). If user rejects or never answers, it stays Missed. - New Tauri commands: * get_call_history() → all entries, newest first * get_recent_contacts() → unique peers by fp, newest interaction first * clear_call_history() → wipes JSON + in-memory * deregister() → tears down signal transport + endpoint Backend emits `history-changed` events so the UI can live-refresh without polling. Frontend (main.ts + index.html + style.css) - Direct-call panel now has: * Recent contacts chip row (top 6 unique peers). Click a chip → dial. * Call history list (up to 50 rows). Direction icon (↗ placed, ↙ received, ✗ missed), peer alias/fp, relative timestamp, callback button. Both click handlers populate target-fp and fire place_call. * Deregister button in the "registered" header — calls the new deregister command, tears down the signal transport, returns the UI to the pre-register state. * Clear-history link in the history header. 
- Subscribes to `history-changed` events so the list updates the moment the backend logs a new entry. Also refreshed on register + after a clear. - Nothing is rendered until there is data — empty sections stay hidden. Tasks #20 + #21 (small UX items bundled in) - Default room "general" for new installations: the html input value attribute is now "general" and loadSettings() defaults match. Existing users' localStorage still wins. - Random alias on desktop: already latent but confirmed working — the startup IIFE at main.ts:374 calls get_app_info() and prefills the alias input from derive_alias(seed) when the input is empty. No code change needed, just verified it flows through the same path as the Android client. Known follow-ups (deferred to step 6 polish) - Call duration tracking (currently all entries have no duration field) - Hangup signal from an unanswered incoming should emit history-changed so the missed state is visible even when the user never tapped accept - Android UI layout fit-check on the smaller Nothing screen	2026-04-10 11:03:36 +04:00
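The in-memory half of the history store (FIFO pruning at 200 entries, the Missed-to-Received upgrade, and recent contacts deduplicated by fingerprint) can be sketched as follows. This is an illustration under assumptions: the `History` wrapper, the field names on `CallHistoryEntry`, and the oldest-first storage order are invented for the example, and JSON persistence plus the OnceLock<RwLock<…>> wrapper are left out.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CallDirection {
    Placed,
    Received,
    Missed,
}

/// Field names are assumptions; the commit only names the type.
#[derive(Debug, Clone, PartialEq, Eq)]
struct CallHistoryEntry {
    call_id: String,
    peer_fp: String,
    direction: CallDirection,
}

const MAX_ENTRIES: usize = 200;

/// Entries are kept oldest-first here; readers reverse for "newest first".
#[derive(Default)]
struct History {
    entries: Vec<CallHistoryEntry>,
}

impl History {
    /// Append and prune the oldest entries once the cap is exceeded (FIFO).
    fn push(&mut self, entry: CallHistoryEntry) {
        self.entries.push(entry);
        if self.entries.len() > MAX_ENTRIES {
            let excess = self.entries.len() - MAX_ENTRIES;
            self.entries.drain(..excess);
        }
    }

    /// Upgrade an incoming call logged as Missed once the user accepts it.
    /// Returns false if no pending Missed entry exists for this call_id.
    fn mark_received_if_pending(&mut self, call_id: &str) -> bool {
        let pending = self
            .entries
            .iter_mut()
            .rev()
            .find(|e| e.call_id == call_id && e.direction == CallDirection::Missed);
        match pending {
            Some(entry) => {
                entry.direction = CallDirection::Received;
                true
            }
            None => false,
        }
    }

    /// Unique peer fingerprints, most recent interaction first.
    fn recent_contacts(&self, limit: usize) -> Vec<String> {
        let mut seen = HashSet::new();
        let mut out = Vec::new();
        for entry in self.entries.iter().rev() {
            if out.len() == limit {
                break;
            }
            if seen.insert(entry.peer_fp.as_str()) {
                out.push(entry.peer_fp.clone());
            }
        }
        out
    }
}
```

Logging the incoming offer as Missed up front means a rejected or unanswered call needs no extra bookkeeping; only the accept path has to touch the entry again.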
Siavash Sameni	76a4c53e21	fix(android-audio): spawn_blocking for Oboe restart — unblock tokio executor Build `4c6aac6` added a stop+sleep+start Oboe restart inside the set_speakerphone Tauri command, but calling wzp_native::audio_stop() and audio_start() synchronously from an async fn blocks the tokio executor thread — those FFI calls wait for AAudio to finalise the stream teardown/bringup, which takes ~400ms each on Nothing phone (Pixel is fast enough to hide the bug). Reproduced on Nothing: 7 rapid Speaker button clicks across ~30 seconds, each restarting Oboe. After the 5th click the engine send and recv tokio tasks froze for 22 seconds — decoded_frames stuck at 1159 across 9 heartbeats, send_drops growing from 148 to 1720 as encoded frames couldn't make it past `send_t.send_media(pkt).await`. At 08:40:48 the runtime finally caught up and processed a 911-frame burst at once (buffered QUIC datagrams flooding through). Classic "blocking sync call in async context" anti-pattern. Fix: run the stop + start sequence inside tokio::task::spawn_blocking so the Oboe teardown + reopen happens on a dedicated blocking thread, leaving the tokio runtime free to keep driving the send and recv tasks. AAudio's requestStop returns only after the stream is actually in Stopped state, so the explicit sleep that bridged stop and start is no longer needed and is dropped. Send and recv tasks still see a ~500ms window of empty reads / partial writes during the blocking restart, but they get SCHEDULED through it — network packets keep being received + decoded + dropped into the playout ring, and captured mic samples keep being encoded + sent through quinn. No more executor starvation, no more 22-second audio dropouts, no more send_drops burst. 
Pixel still worked before this fix only because its AAudio teardown is fast enough to not exceed the scheduler's cooperative yield interval — same bug was latent on both devices, Nothing just made it visible.	2026-04-10 08:45:54 +04:00
Siavash Sameni	4c6aac654a	fix(android-audio): restart Oboe on speakerphone toggle + unbreak button UI Build `4f2ad65` wired the Speaker button to AudioManager.setSpeakerphoneOn but user testing found that flipping speakerphone on an active Oboe VoiceCommunication stream silently tears down the AAudio streams on Pixel-class devices — both capture and playout stop producing data. Only ending the call and rejoining brings audio back (because the fresh Oboe open runs with the new routing already applied). Also the earpiece state showed up red in the UI because the button was getting the `.muted` CSS class when speakerphoneOn=false. Earpiece is a valid routing state, not a muted one. Fix set_speakerphone Tauri command: 1. Flip AudioManager.setSpeakerphoneOn via JNI (as before). 2. If the Oboe backend is currently running, stop it, sleep 50 ms to let AAudio finalise the transition, then start it again. The Rust send/recv tokio tasks keep running across the gap — they just read zero samples and write into the preserved ring buffers for a few frames, which is acceptable. The AudioBackend singleton's ring state is preserved across stop+start because it's in a 'static OnceLock. 3. Debounce the UI click via speakerphoneBusy + spkBtn.disabled so users can't queue up multiple toggles during the restart window. Fix main.ts Speaker button: - Remove the `.muted` classList toggle (added `.speaker-on` for CSS). - Update label text to "🔊 Speaker" / "🔈 Earpiece" for clarity. - On showCallScreen(), invoke is_speakerphone_on to sync the label with the real AudioManager state, so it matches reality after a rejoin (which was another symptom the user hit — the button label desynced from the actual routing after ending and restarting a call). - Debounce click + disable button while the restart is in flight. 
Drops #[allow(dead_code)] from wzp_native::audio_is_running now that it is actually called from the set_speakerphone restart guard.	2026-04-10 07:35:12 +04:00
Siavash Sameni	0178cbd91d	android(audio): Speaker button toggles earpiece↔speaker via JNI (WIP, untested) Build `9e37201` confirmed on-device that Usage::VoiceCommunication + MODE_IN_COMMUNICATION + speakerphoneOn=false routes Oboe playout to the handset earpiece and the callback drains the ring correctly. Next step: let the user flip speakerphoneOn at runtime so the existing Speaker button actually switches audio routing instead of just gating writes. - Cargo.toml (android target): pull in `jni = 0.21` and `ndk-context = 0.1`. Both are already transitively in the lockfile via Tauri/Wry, so this just promotes them to direct deps. - desktop/src-tauri/src/android_audio.rs: new module. Grabs the JavaVM + current Activity from `ndk_context::android_context()`, attaches a JNI thread, calls `activity.getSystemService("audio")` to get the AudioManager, and exposes `set_speakerphone(bool)` + `is_speakerphone_on()` helpers that call the AudioManager method of the same name. All gated behind `#[cfg(target_os = "android")]`. - lib.rs: adds `mod android_audio;` (android only), two new Tauri commands `set_speakerphone(on)` and `is_speakerphone_on()` — desktop gets no-op stubs so the same frontend invoke() works everywhere. Both registered in the invoke_handler. - desktop/src/main.ts: the Speaker button (previously toggled the playout-write gate via `toggle_speaker`) now calls `set_speakerphone` and reads back the new routing state. Labels switched from "Spk" / "Spk Off" to "Earpiece" / "Speaker" so users can't be confused into thinking clicking turns audio off. pollStatus no longer clobbers the spkBtn label based on engine spk_muted, since the two concepts are now decoupled. WIP because this has NOT been built or tested yet — committing at night to save the work. 
Tomorrow: build #50 with this change, smoke-test the Handset↔Speaker toggle, then move on to call history + last-contacts UI and the Speaker-button mute bug on the other phone.	2026-04-09 22:00:34 +04:00
Siavash Sameni	49f101d785	fix(android): reuse signal endpoint for direct-call media connection Direct-call accept hangs forever at the QUIC handshake on Android. Logs from `d7b37a5` showed: CallEngine::start (android) invoked relay=172.16.81.172:4433 room=call-… resolved relay addr identity loaded endpoint created, dialing relay ← reached ← nothing, 90s+, no error The "connect failed" and "QUIC connection established" log lines never fire, meaning endpoint.connect_with(…).await never makes progress. Repro is 100%: SFU room join (one endpoint) works perfectly; direct call (opens a SECOND quinn::Endpoint on top of the signal one) hangs in the QUIC handshake. Creating two quinn::Endpoints on Android's AAudio-adjacent UDP stack apparently causes the second one's datagrams to never reach the relay (the server never sees the Initial packet). Rather than fight the platform, quinn is happy to multiplex multiple Connections on a single Endpoint — so we reuse the signal endpoint for the media connection. - SignalState now stores the quinn::Endpoint alongside the QuinnTransport. register_signal populates both at the same time. - CallEngine::start (both android and desktop branches) takes an Option<wzp_transport::Endpoint>. Some → reuse (direct-call path, after register_signal). None → create fresh (SFU room join path). - The connect tauri command reads state.signal.endpoint and threads it through to CallEngine::start, so the direct-call auto-connect (fired by the "setup" signal-event in main.ts) lands on the existing UDP socket. - wzp_transport re-exports quinn::Endpoint so wzp-desktop doesn't need to depend on quinn directly. - Also wraps the android connect in tokio::time::timeout(10s) so future hangs become deterministic "connect TIMED OUT" errors in logcat instead of silent deadlock. 
Same fix applies verbatim to the desktop client — the user suspects direct call is broken there too and this was likely always the cause, just never surfaced because desktop was only tested via SFU rooms.	2026-04-09 20:29:51 +04:00
Siavash Sameni	d7b37a5749	diag: tracing for direct-call signal loop + CallEngine::start stages User reports tapping "answer" on an incoming direct call does nothing visible, and suspects the same may affect desktop. The signal recv loop had no tracing at all, so we can't tell whether CallSetup is being received, whether the recv loop died silently, or whether CallEngine::start is failing between "identity loaded" and "connected to relay, handshake complete". - register_signal recv loop now logs every message type with fields (CallRinging, DirectCallOffer, DirectCallAnswer, CallSetup, Hangup, unhandled), plus a warn! on recv errors and a final warn when the loop exits. - place_call / answer_call commands log entry + success / error. The answer_call error path logs the underlying send_signal error so we can see it in logcat instead of only in the JS error toast. - CallEngine::start android branch logs relay/room/alias on entry, logs "endpoint created, dialing relay" between create_endpoint and connect, "QUIC connection established, performing handshake" between connect and perform_handshake, and promotes all three potential failures to explicit error! logs so a silent hang / error becomes visible in logcat. No functional changes — pure diagnostics. Stacks on `b35a6b7` (the Oboe stack-pointer-escape fix) so build #43 carries both.	2026-04-09 19:17:03 +04:00
Siavash Sameni	5beea7de40	phase 3(android): unify connect/disconnect/toggle_*/get_status commands Step 3 of the Tauri Android rewrite was still returning "audio backend not yet wired on Android (step 3)" because the cfg-gated Android stubs for connect/disconnect/toggle_mic/toggle_speaker/get_status were shadowing the real commands. Now that CallEngine::start() has a real Android body (phase 3, commit `fdbe502`), the gates are unnecessary. - Drop the #[cfg(not(target_os = "android"))] gates from all five engine-backed Tauri commands. - Delete the Android stub block (~50 LOC of "not connected" boilerplate). - Ungate `use engine::CallEngine;` and the AppState.engine field so both targets share the same Mutex<Option<CallEngine>>. - CallEngine::stop() now calls crate::wzp_native::audio_stop() on Android so the mic + speaker are released between calls, matching the desktop behaviour where dropping _audio_handle tears down CPAL. Direct-call flow on Android: peer sends DirectCallOffer → user accepts via answer_call → relay sends signal "setup" event → main.ts auto-invokes connect(relay, room) → CallEngine::start() runs the Android branch → wzp_native::audio_start() brings up Oboe → send/recv tasks stream PCM through the dlopen boundary.	2026-04-09 18:53:54 +04:00


57 Commits