wz-phone

Author	SHA1	Message	Date
Siavash Sameni	a08a37b5eb	fix(video): stabilize relay streams and remote rendering	2026-05-26 07:18:22 +04:00
Siavash Sameni	ca164ada5c	fix(relay): forward legacy h264 room video stream	2026-05-25 20:46:41 +04:00
Siavash Sameni	2d58bae9ba	chore(relay): log video forwarding decisions in debug tap	2026-05-25 20:42:24 +04:00
Siavash Sameni	d57ebe3d2c	fix(video): force h264 and trace frame pipeline	2026-05-25 20:03:11 +04:00
Siavash Sameni	06253fdeeb	feat(video+desktop): camera capture, video UI, E2E AEAD wiring, test fixes Blockers 4 & 5: browser getUserMedia → JPEG IPC → Rust I420 pipeline; remote video strip renders decoded frames via canvas; EncryptingTransport wraps QuinnTransport so WZP AEAD is applied to all media (C2 fix). Test fixes: HandshakeResult.session destructuring across relay/client/crypto integration tests; video_codecs field added to all CallOffer/CallAnswer structs; wzp-video pipeline_roundtrip integration tests added. PRD docs: five Kimi-ready specs for E2E encryption, Android NDK 0.9 migration, quality upgrade flow, wire-format hardening, and clippy debt. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 15:30:26 +04:00
Siavash Sameni	52a6f5e048	fix(audit): address C2, C3, M4, M5 from 2026-05-25 audit C2: Add EncryptingTransport wrapper — all media I/O now goes through ChaChaSession encrypt/decrypt before hitting the QUIC datagram path. cli.rs run_live/run_silence/run_file_mode accept Arc<dyn MediaTransport> and receive a wrapped transport after the handshake. C3: Wire VideoScorer::observe() into both plain and trunked forwarding loops in room.rs. Packets from participants with Abusive verdict are dropped before forwarding. last_bwe_kbps tracked from quality reports. M4: Widen FEC repair symbol index from u8 to u16 throughout (FecEncoder::generate_repair, FecDecoder::add_symbol, all call sites in call.rs, bench.rs, pipeline.rs, wzp-android). Eliminates theoretical wrapping when num_source + repair_count > 255. M5: Track last_encrypt_timestamp in ChaChaSession. debug_assert in encrypt() that timestamp is non-decreasing across calls (including post- rekey). complete_rekey() explicitly preserves last_encrypt_timestamp to prevent accidental timestamp reset regressions. 583 tests passing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 06:20:05 +04:00
Siavash Sameni	12b0d9738f	fix(wzp-crypto): derive AEAD nonces from MediaHeader.seq, not recv_seq The previous scheme built ChaCha20-Poly1305 nonces from an internal recv_seq counter that incremented once per decrypt() call. Under in-order delivery recv_seq stayed in sync with the sender's send_seq, but any out-of-order or lost packet caused them to diverge permanently — every subsequent packet then used the wrong nonce and AEAD decryption failed for the rest of the session. Fix: parse the MediaHeader at the top of both encrypt() and decrypt() and use header.seq as the nonce input. Both sides now derive the nonce from the same wire field, surviving reordering by construction. send_seq / recv_seq are kept as pure packet counters for the rekey interval trigger; they no longer affect nonce derivation. All tests updated to pass valid v2 MediaHeader bytes instead of raw byte literals (the new code requires a parseable header for nonce derivation). New test decrypt_survives_out_of_order_delivery encrypts 5 packets and delivers them out of order (indices 0,2,1,4,3); this test would have failed under the old counter-based scheme. Fixes audit finding C1 from AUDIT-2026-05-25.md. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-25 06:00:01 +04:00
Siavash Sameni	9334aa5ccd	T6.1: AV1 encoder/decoder with HW probe + SVT-AV1 SW fallback - New: av1_obu.rs — OBU framer, depacketizer, keyframe detection, LEB128 helpers - New: dav1d.rs — SW AV1 decoder wrapper (shiguredo_dav1d) - New: svt_av1.rs — SW AV1 encoder wrapper (shiguredo_svt_av1) - Add CodecId::Av1Main = 12 with match-arm fixes in downstream crates - Add VideoToolboxAv1Decoder for macOS M3+ HW decode - Add MediaCodecAv1Encoder/Decoder for Android (video/av01) - Add extract_sequence_header_obu() helper for AV1 decoder CSD - Add 10-frame encode-decode roundtrip test (svt_av1 + dav1d) - Fix clippy unused import in dav1d.rs - 15 tests; all workspace tests pass; cargo fmt clean	2026-05-12 18:44:44 +04:00
Siavash Sameni	f16d650721	T6.2: Tier F video scorer — keyframe periodicity, I/P ratio, BWE responsiveness + 10 tests	2026-05-12 17:42:39 +04:00
Siavash Sameni	517d0ebfe0	T5.7.1: Unify Verdict enum into wzp_relay::verdict, drop RepeatAbusive variant	2026-05-12 16:49:04 +04:00
Siavash Sameni	ffded2a913	clippy: fix wzp-relay lint issues (empty doc, unused var, TokenExhausted, Default, dead field)	2026-05-12 15:40:55 +04:00
Siavash Sameni	fdfaed5390	fmt: cargo fmt --all	2026-05-12 15:40:02 +04:00
Siavash Sameni	dbbab0decf	T5.8: Tier G response policy — Verdict enum + ResponsePolicy + typed Hangup::PolicyViolation + 9 tests	2026-05-12 15:13:20 +04:00
Siavash Sameni	5fda5ecc52	T5.7: Tier F audio scorer — IAT CoV + silence fraction + bitrate + Q-flag + bimodality + 11 tests	2026-05-12 15:09:28 +04:00
Siavash Sameni	2bbb664df4	T5.6: Per-receiver layer selection at SFU — ReceiverState + hysteresis + forwarding filter	2026-05-12 15:05:32 +04:00
Siavash Sameni	b197651557	T5.4: H.265 encoder/decoder wrappers — VideoToolbox + MediaCodec, CodecId::H265Main	2026-05-12 14:50:20 +04:00
Siavash Sameni	001d94f9ae	T4.7 rework: make should_forward_pli take now: Instant + 6 unit tests - Refactor should_forward_pli(room, stream_id) -> should_forward_pli(room, stream_id, now: Instant) so the 200 ms dedup window is deterministically testable. - Update the one caller in run_participant_signals to pass Instant::now(). - Add 6 PLI unit tests covering: * first PLI forwards * duplicate within 200 ms suppressed * after 200 ms forwards again * different streams independent * different rooms independent * no stream owner returns None Addresses reviewer CR on T4.7 (line drawn at T4.6 — stateful relay features must have state-transition tests). wzp-relay tests: 93 -> 99 pass.	2026-05-12 11:39:35 +04:00
Siavash Sameni	36b0421d68	T4.7: PLI suppression at SFU — 200 ms dedup window per (room, stream_id)	2026-05-12 11:25:25 +04:00
Siavash Sameni	828fbea2ea	T4.6: SFU keyframe cache — per-(room,sender,stream) I-frame replay on join	2026-05-12 10:54:04 +04:00
Siavash Sameni	490d2d31c6	T4.1: wzp-video crate scaffold + H.264 NAL framer + depacketizer	2026-05-12 07:22:54 +04:00
Siavash Sameni	f1b86e0fed	T3.5: Tier E per-session token bucket	2026-05-12 06:45:56 +04:00
Siavash Sameni	017c371611	T3.4: Tier D per-codec payload size sanity	2026-05-12 06:24:40 +04:00
Siavash Sameni	e73f8a7150	T3.3: SignalMessage version field	2026-05-12 06:11:59 +04:00
Siavash Sameni	f3398adb95	T3.1: RoomManager concurrency — Arc<RwLock<Room>> per room	2026-05-11 21:12:04 +04:00
Siavash Sameni	54c1a35186	T2.3-T2.6: BWE guard, relay conformance Tier A/B/C, Prometheus metrics	2026-05-11 20:50:22 +04:00
Siavash Sameni	6f81487778	T1.6: Protocol version negotiation in handshake	2026-05-11 15:53:04 +04:00
Siavash Sameni	c93d302656	T1.5: Migrate emit/parse sites to v2 wire format	2026-05-11 12:37:32 +04:00
Siavash Sameni	defd8eab07	fix(signal): send PresenceList directly to new client after ack The broadcast alone wasn't reaching the first client because its recv loop hadn't started yet when the second client registered. Now the relay sends PresenceList directly to the new client (right after RegisterPresenceAck) AND broadcasts to all others. This guarantees every client gets the full user list: - New client: via direct send (queued before recv loop starts) - Existing clients: via broadcast Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:20:37 +04:00
Siavash Sameni	1120c7b579	feat(signal): PresenceList broadcast for lobby user discovery New signal infrastructure for the lobby-first UI: - PresenceUser struct: { fingerprint, alias } - SignalMessage::PresenceList: relay broadcasts full user list to all signal clients on every register/deregister - SignalHub::presence_list(): builds the list from connected clients - SignalHub::broadcast(): sends to ALL signal clients - Relay calls broadcast on register + unregister - Desktop emits "presence_list" signal-event to JS frontend This gives clients real-time visibility of who's online via the signal channel, without needing to join a voice room first. 603 tests pass, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 18:12:47 +04:00
Siavash Sameni	bb23976076	feat(quality): upgrade negotiation + asymmetric quality signals (#28, #29, #30) New SignalMessage variants for P2P quality coordination: UpgradeProposal/UpgradeResponse/UpgradeConfirm (#28): - Consensual quality upgrade flow — proposer sends desired profile, peer accepts/rejects based on own conditions, confirm commits both - All carry call_id for relay routing QualityCapability (#30): - Peer reports its max sustainable profile — enables asymmetric encoding where each side uses its own best quality instead of forcing everyone to the weakest link Relay forwards all 4 signals to the call peer (same pattern as MediaPathReport, CandidateUpdate, HardNatProbe). Desktop signal recv loop handles all 4 with debug logging. Encoder switching TODOs noted for wiring into CallEngine. 4 new serde roundtrip tests. 603 total, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:25:34 +04:00
Siavash Sameni	f06f9073ae	feat(nat): birthday attack module + HardNatBirthdayStart signal (#86, #87) Birthday attack for random symmetric NATs: - birthday.rs: open_acceptor_ports() opens N sockets, STUN-probes each to learn external ports. generate_dialer_targets() builds hit list (known ports first, then random fill). spray_dialer() sprays QUIC connects with rate limiting, first success wins. - Default: 32 acceptor ports, 128 dialer probes, 20ms interval Signal coordination: - HardNatBirthdayStart { acceptor_ports, external_ip } sent by Acceptor when peer's HardNatProbe shows random/sequential NAT - Relay forwards it like other call signals - Desktop recv loop handles and logs it Hybrid waterfall integration: - On receiving HardNatProbe with non-cone allocation, Acceptor auto-opens birthday ports and sends BirthdayStart - Sockets kept alive 10s for NAT mapping persistence - Dialer spray integration into race() pending (needs transport hot-swap for background upgrade) 6 new tests, 599 total, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 16:44:36 +04:00
Siavash Sameni	ec1bdf3cd5	feat(nat): hard NAT port allocation detection + prediction + HardNatProbe signal (#29) Phase A of hard NAT traversal (PRD-hard-nat.md): - PortAllocation enum: PortPreserving / Sequential{delta} / Random / Unknown - detect_port_allocation(): sequential STUN probes from single socket, analyzes port sequence for allocation pattern - classify_port_allocation(): pure function with jitter tolerance, wraparound handling, 60% threshold for noisy sequences - predict_ports(): generates target port range from last_port + delta - HardNatProbe signal message: carries port_sequence, allocation pattern, external_ip for peer coordination - Relay forwards HardNatProbe to call peer - Netcheck gains port_allocation field + format_report display 588 tests pass (17 new), 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:29:35 +04:00
Siavash Sameni	8fcf1be341	feat(nat): Tailscale-inspired STUN/ICE + port mapping + mid-call re-gathering (#28) Phase 8: 5 new modules bringing NAT traversal close to Tailscale's approach. - stun.rs: RFC 5389 STUN client — public server reflexive discovery, XOR-MAPPED-ADDRESS parsing, parallel probe with retry, STUN fallback in desktop try_reflect_own_addr() - portmap.rs: NAT-PMP (RFC 6886) + PCP (RFC 6887) + UPnP IGD port mapping — gateway discovery, acquire/release/refresh lifecycle, new PeerCandidates.mapped candidate type in dial order - ice_agent.rs: candidate lifecycle — gather(), re_gather(), apply_peer_update() with monotonic generation counter, CandidateUpdate signal message forwarded by relay - netcheck.rs: comprehensive diagnostic — NAT type, IPv4/v6, port mapping availability, relay latencies, CLI --netcheck - relay_map.rs: RTT-sorted relay map, preferred() selection, populate_from_ack() for RegisterPresenceAck.available_relays Relay: CallRegistry stores + cross-wires caller/callee_mapped_addr into CallSetup.peer_mapped_addr. Region config + available_relays populated from federation peers in RegisterPresenceAck. Desktop: place_call/answer_call call acquire_port_mapping() and fill caller/callee_mapped_addr. STUN+relay combined NAT detection. 571 tests pass (66 new), 0 regressions, 0 warnings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:17:17 +04:00
Siavash Sameni	81b5522942	refactor: clap CLI parser, safety docs, dead code docs, cross-refs Audit items 6, 8, 9, 10: #6 - Relay CLI: replaced 154-line manual parse_args() with clap derive (13 flags/options preserved, auto --help, --version from build hash) #8 - wzp-native: added # Safety docs to all 3 unsafe extern "C" fns #9 - wzp-crypto: documented x25519_static_secret/public as reserved for future static-key federation auth (not dead code, intentionally unused) #10 - Cross-references between quality.rs ↔ dred_tuner.rs module docs 368 tests passing, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 15:40:49 +04:00
Siavash Sameni	d539a6dfb9	test(federation): 29 tests for federation.rs (was 0), engine dedup PRD Federation test coverage (crates/wzp-relay/tests/federation.rs): - room_hash: determinism, uniqueness, length, case sensitivity (5) - is_global_room: static config, call-* implicit, exact match (3) - resolve_global_room: static + call-* resolution (2) - global_room_hash: canonical names, fallthrough, independence (4) - forward_to_peers: zero peers, live QUIC datagram delivery (2) - broadcast_signal: zero peers, live QUIC signal delivery (2) - send_signal_to_peer: unknown fingerprint error (1) - peer lookup: fingerprint normalization, IP, trust priority (5) - accessors: local_tls_fp, cross_relay_tx, remote_participants (3) - integration: full media egress over live QUIC link (1) - edge case: exact room match (1) Total relay tests: 120 (was 91). Full suite: 368 passing. Also added PRD-engine-dedup.md for the engine.rs helper extraction completed in the previous commit. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 15:35:04 +04:00
Siavash Sameni	ba12aae439	refactor: extract shared engine helpers, federation clone-before-send, constants Engine deduplication (PRD-engine-dedup.md): - build_call_config(): shared CallConfig construction (was 23 lines × 2) - codec_to_profile(): shared CodecId → QualityProfile mapping (was 19 lines × 2) - run_signal_task(): shared signal handler (was 48 lines × 2) - Net -39 lines from engine.rs, 6 duplicated blocks → single-line calls Quick wins from REFACTOR-codebase-audit.md: - 6 magic number constants extracted (CAPTURE_POLL_MS, RECV_TIMEOUT_MS, etc.) - DRED_POLL_INTERVAL moved from 2 local defs to 1 module-level const - federation.rs: forward_to_peers, broadcast_signal, send_signal_to_peer now clone peer list and release lock before sending (was holding Mutex across async I/O — last lock-during-send pattern eliminated) - main.rs: close_transport() helper replaces 12 silent .ok() calls with debug-level logging 314 tests passing, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 15:22:44 +04:00
Siavash Sameni	a52b011fb5	feat(relay): replace global Mutex<RoomManager> with DashMap sharding Eliminates the single-lock bottleneck for media forwarding. Before: all participants across all rooms competed for one Mutex. Now rooms are stored in DashMap (64 internal shards with per-shard RwLocks). Changes: - RoomManager.rooms: HashMap → DashMap<String, Room> - Per-room quality tracking (qualities, current_tier moved into Room) - Arc<Mutex<RoomManager>> → Arc<RoomManager> everywhere - 20 .lock().await sites removed across room.rs, main.rs, federation.rs, ws.rs - federation forward_to_peers: clone peer list, release lock, then send - ACL uses std::sync::Mutex (rarely accessed, non-async) Concurrency improvement: - Before: 100 rooms × 10 people = 1000 tasks → 1 Mutex - After: distributed across 64 DashMap shards, ~15 tasks per shard avg - Rooms are fully independent — room A never blocks room B 314 tests passing, 0 regressions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 12:17:57 +04:00
Siavash Sameni	d424515542	feat: 5-tier quality classification, QualityDirective handling, debug tap stats - Extend Tier enum from 3 to 6 levels: Studio64k/48k/32k + Good + Degraded + Catastrophic with asymmetric hysteresis (down:3, up:5, studio:10) - Handle QualityDirective signals in both desktop and Android engines — relay-coordinated codec switching now works end-to-end - Add periodic TAP STATS to debug tap: packets in/out, fan-out avg, seq gaps, codecs seen (every 5s) - Mark task #2 done (ParticipantInfo in federation signals already implemented) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 10:23:48 +04:00
Siavash Sameni	ea5fc17c34	fix(relay): debug tap signal logging, dual_path test regression, PRD updates - Add log_signal() and log_event() to DebugTap for RoomUpdate, QualityDirective, join/leave lifecycle events (task #11) - Fix dual_path.rs Phase 7 regression: add missing ipv6_endpoint arg to 3 race() call sites - Update PRDs to reflect actual implementation status: mark adaptive quality, coordinated codec, P2P, network awareness, protocol analyzer - Update PROGRESS.md with QualityDirective gap and dual_path regression Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 09:54:52 +04:00
Siavash Sameni	d249b32ee5	test+docs: add tests for QualityDirective, ParticipantQuality; update docs - QualityDirective signal roundtrip tests (with/without reason) - ParticipantQuality unit tests (initial tier, degradation, weakest-link) - Updated PROGRESS.md with desktop adaptive quality, relay coordinated switching, Oboe state polling entries - Updated ARCHITECTURE.md SFU fan-out rules with QualityDirective - Updated PRD-coordinated-codec.md with implementation status - 312 tests passing across all modified crates Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 19:56:46 +04:00
Siavash Sameni	22045bc5e6	feat: adaptive quality in desktop, relay quality directive, Oboe state polling - Wire AdaptiveQualityController into desktop engine send/recv tasks (mirrors Android pattern: AtomicU8 pending_profile, auto-mode check) - Wire same into Android engine send task (was only in recv before) - QualityDirective SignalMessage variant for relay-initiated codec switch - ParticipantQuality tracking in relay RoomManager (per-participant AdaptiveQualityController, weakest-link tier computation) - Relay broadcasts QualityDirective to all participants when room-wide tier degrades (coordinated codec switching) - Oboe stream state polling: poll getState() for up to 2s after requestStart() to ensure both streams reach Started before proceeding (fixes intermittent silent calls on cold start, Nothing Phone A059) Tasks: #7, #25, #26, #31, #35 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 19:54:04 +04:00
Siavash Sameni	766c9df442	feat(dred): continuous DRED tuning, PMTUD, extended Opus6k window - DredTuner: maps live network metrics (loss/RTT/jitter) to continuous DRED duration every ~500ms instead of discrete tier-locked values. Includes jitter-spike detection for pre-emptive Starlink-style boost. - Opus6k DRED extended from 500ms to 1040ms (max libopus 1.5 supports) - PMTUD: quinn MtuDiscoveryConfig with upper_bound=1452, 300s interval - TrunkedForwarder respects discovered MTU (was hard-coded 1200) - QuinnPathSnapshot exposes quinn internal stats + discovered MTU - AudioEncoder trait: set_expected_loss() + set_dred_duration() methods - PathMonitor: sliding-window jitter variance for spike detection - Integrated into both Android and desktop send tasks in engine.rs - 14 new tests (10 tuner unit + 4 encoder integration) - Updated ARCHITECTURE.md, PROGRESS.md, PRD-dred-integration, PRD-mtu Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 19:38:37 +04:00
Siavash Sameni	a798634b3d	fix(signal): add call_id to Hangup — prevents stale hangup killing new calls Root cause: Hangup had no call_id field. The relay forwarded hangups to ALL active calls for a user. When user A hung up call 1 and user B immediately placed call 2, the relay's processing of A's hangup would also kill call 2 (race window ~1-2s). Fix: add optional call_id to Hangup (backwards-compatible via serde skip_serializing_if). When present, the relay only ends the named call. Old clients send call_id=None and get the legacy broadcast behavior. Also: clear pending_path_report in Hangup recv handler and internal_deregister to prevent stale oneshot channels from blocking subsequent call setups. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 16:39:21 +04:00
Siavash Sameni	4d66d3769d	fix(relay): set peer_relay_fp on originating relay when answer arrives The originating relay (where the caller is) never set peer_relay_fp because the call was created locally. When the callee's answer arrived via federation, the cross-relay dispatcher handled it but didn't mark the call as cross-relay. This meant the caller's MediaPathReport was delivered via local hub.send_to() to a peer fingerprint that isn't connected locally — silently dropped. Fix: in the cross-relay answer dispatcher, call reg.set_peer_relay_fp(call_id, Some(origin_relay_fp)) so the originating relay knows to forward MediaPathReport via federation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:49:34 +04:00
Siavash Sameni	1eb82d77b8	feat(relay+client): relay reports build version in Ack Add relay_build field to RegisterPresenceAck so the client logs which relay version it connected to. Shows in the debug log as register_signal:ack_received {"relay_build":"f843a93"}. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:27:58 +04:00
Siavash Sameni	f843a934fe	fix(relay): forward MediaPathReport across federation MediaPathReport was only delivered via local signal_hub, so calls between peers on different relays always hit peer_report_timeout and fell back to relay — even when direct P2P worked perfectly. Fix: check peer_relay_fp in call_registry (same pattern as DirectCallAnswer). If the peer is on a remote relay, wrap in FederatedSignalForward and send via federation link. Also fix the cross-relay dispatcher to deliver to BOTH caller and callee (not just caller), since the report can come from either side. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 14:14:30 +04:00
Siavash Sameni	f5542ef822	feat(p2p): Phase 6 — ICE-style path negotiation Before Phase 6, each side's dual-path race ran independently and committed to whichever transport completed first. When one side picked Direct and the other picked Relay, they sent media to different places — TX > 0 RX: 0 on both, completely silent call. Phase 6 adds a negotiation step: after the local race completes, each side sends a MediaPathReport { call_id, direct_ok, winner } to the peer through the relay. Both wait for the other's report before committing a transport to the CallEngine. The decision rule is simple: if BOTH report direct_ok = true, use direct; if EITHER reports false, BOTH use relay. ## Wire protocol New `SignalMessage::MediaPathReport { call_id, direct_ok, race_winner }`. The relay forwards it to the call peer via the same signal_hub routing used for DirectCallOffer/Answer. The cross-relay dispatcher also forwards it. ## dual_path::race restructured Returns `RaceResult` instead of `(Arc<QuinnTransport>, WinningPath)`: - `direct_transport: Option<Arc<QuinnTransport>>` - `relay_transport: Option<Arc<QuinnTransport>>` - `local_winner: WinningPath` Both paths are run as spawned tasks. After the first completes, a 1s grace period lets the loser also finish. The connect command gets BOTH transports (when available) and picks the right one based on the negotiation outcome. The unused transport is dropped. ## connect command flow (revised) 1. Run race() → RaceResult with both transports 2. Send MediaPathReport to relay with our direct_ok 3. Install oneshot; wait for peer's report (3s timeout) 4. Decision: both direct_ok → use direct; else → use relay 5. Start CallEngine with the agreed transport If the peer never responds (old build, timeout), falls back to relay — backward compatible. ## Relay forwarding MediaPathReport is forwarded like DirectCallOffer/Answer: via signal_hub.send_to(peer_fp) for same-relay calls, and via cross-relay dispatcher for federated calls. 
## Debug log events - `connect:dual_path_race_done` — local race result - `connect:path_report_sent` — our report to the peer - `connect:peer_report_received` — peer's report - `connect:peer_report_timeout` — peer didn't respond (3s) - `connect:path_negotiated` — final agreed path with reasons Full workspace test: 423 passing (no regressions). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 10:03:42 +04:00
Siavash Sameni	026940d492	fix(federation): diagnostic logging for cross-relay media routing Added warn-level log in handle_datagram when a federation datagram arrives but no matching local room is found. Prints: - room_hash (8-byte tag from the datagram) - active_rooms (all rooms the relay currently has) - seq + peer label This diagnoses the cross-relay recv_fr=0 issue: if media IS arriving from the peer relay but the room hash doesn't match any active room, the log tells us exactly what hash is expected vs what rooms exist locally. If no datagram log fires at all, the issue is upstream (peer relay not forwarding, federation link down, etc.). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 09:27:34 +04:00
Siavash Sameni	6cd61fc63b	feat(federation): Phase 4.1 — call-* rooms are implicitly global All rooms with names starting with 'call-' are now treated as global rooms by the federation pipeline. This enables relay- mediated media fallback for cross-relay direct calls: when Alice on Relay A and Bob on Relay B both join the same call-<id> room, the federation media forwarding pipeline (GlobalRoomActive announcements + datagram forwarding + presence replication) kicks in automatically without any runtime registration step. Previously, cross-relay direct calls that couldn't go P2P (symmetric NAT on either side) failed with "no media path" because the call-<id> room wasn't in the configured global_rooms set and media datagrams weren't forwarded across the federation link. The relay's existing ACL for call-* rooms (only the two authorized fingerprints from the call registry can join) prevents random clients from creating or eavesdropping on call rooms. ## Changes ### `is_global_room` (federation.rs) Added `room.starts_with("call-")` check before the static global_rooms set lookup. Returns true immediately for any call-prefixed room. ### `resolve_global_room` (federation.rs) Return type changed from `Option<&str>` to `Option<String>` (owned) because call-* room names aren't stored on `self` — they come from the caller and resolve to themselves as the canonical name. The 13 callers continue to work via String/&str auto-deref; 4 HashMap lookups needed explicit `.as_str()` or `&` borrows. Full workspace test: 423 passing (no regressions). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 08:55:01 +04:00
Siavash Sameni	fa038df057	feat(p2p): Phase 5.5 — ICE LAN host candidates (IPv4 + IPv6) Same-LAN P2P was failing because MikroTik masquerade (like most consumer NATs) doesn't support NAT hairpinning — the advertised WAN reflex addr is unreachable from a peer on the same LAN as the advertiser. Phase 5 got us Cone NAT classification and fixed the measurement artifact, but same-LAN direct dials still had nowhere to land. Phase 5.5 adds ICE-style host candidates: each client enumerates its LAN-local network interface addresses, includes them in the DirectCallOffer/Answer alongside the reflex addr, and the dual-path race fans out to ALL peer candidates in parallel. Same-LAN peers find each other via their RFC1918 IPv4 + ULA / global-unicast IPv6 addresses without touching the NAT at all. Dual-stack IPv6 is in scope from the start — on modern ISPs (including Starlink) the v6 path often works even when v4 hairpinning doesn't, because there's no NAT on the v6 side. ## Changes ### `wzp_client::reflect::local_host_candidates(port)` (new) Enumerates network interfaces via `if-addrs` and returns SocketAddrs paired with the caller's port. Filters: - IPv4: RFC1918 (10/8, 172.16/12, 192.168/16) + CGNAT (100.64/10) - IPv6: global unicast (2000::/3) + ULA (fc00::/7) - Skipped: loopback, link-local (169.254, fe80::), public v4 (already covered by reflex-addr), unspecified Safe from any thread, one `getifaddrs(3)` syscall. ### Wire protocol (wzp-proto/packet.rs) Three new `#[serde(default, skip_serializing_if = "Vec::is_empty")]` fields, backward-compat with pre-5.5 clients/relays by construction: - `DirectCallOffer.caller_local_addrs: Vec<String>` - `DirectCallAnswer.callee_local_addrs: Vec<String>` - `CallSetup.peer_local_addrs: Vec<String>` ### Call registry (wzp-relay/call_registry.rs) `DirectCall` gains `caller_local_addrs` + `callee_local_addrs` Vec<String> fields. New `set_caller_local_addrs` / `set_callee_local_addrs` setters. Follow the same pattern as the reflex addr fields. 
### Relay cross-wiring (wzp-relay/main.rs) Both the local-call and cross-relay-federation paths now track the local_addrs through the registry and inject them into the CallSetup's peer_local_addrs. Cross-wiring is identical to the existing peer_direct_addr logic — each party's CallSetup carries the OTHER party's LAN candidates. ### Client side (desktop/src-tauri/lib.rs) - `place_call`: gathers local host candidates via `local_host_candidates(signal_endpoint.local_addr().port())` and includes them in `DirectCallOffer.caller_local_addrs`. The port match is critical — it's the Phase 5 shared signal socket, so incoming dials to these addrs land on the same endpoint that's already listening. - `answer_call`: same, AcceptTrusted only (privacy mode keeps LAN addrs hidden too, for consistency with the reflex addr). - `connect` Tauri command: new `peer_local_addrs: Vec<String>` arg. Builds a `PeerCandidates` bundle and passes it to the dual-path race. - Recv loop's CallSetup handler: destructures + forwards the new field to JS via the signal-event payload. ### `dual_path::race` (wzp-client/dual_path.rs) Signature change: takes `PeerCandidates` (reflex + local Vec) instead of a single SocketAddr. The D-role branch now fans out N parallel dials via `tokio::task::JoinSet` — one per candidate — and the first successful dial wins (losers are aborted immediately via `set.abort_all()`). Only when ALL candidates have failed do we return Err; individual candidate failures are just traced at debug level and the race waits for the others. LAN host candidates are tried BEFORE the reflex addr in `PeerCandidates::dial_order()` — they're faster when they work, and the reflex addr is the fallback for the not-on-same-LAN case. ### JS side (desktop/main.ts) `connect` invoke now passes `peerLocalAddrs: data.peer_local_addrs ?? []` alongside the existing `peerDirectAddr`. 
### Tests All existing test callsites updated for the new Vec<String> fields (defaults to Vec::new() in tests — they don't exercise the multi-candidate path). `dual_path.rs` integration tests wrap the single `dead_peer` / `acceptor_listen_addr` in a `PeerCandidates { reflexive: Some(_), local: Vec::new() }`. Full workspace test: 423 passing (same as before 5.5). ## Expected behavior on the reporter's setup Two phones behind MikroTik, both on the same LAN: place_call:host_candidates {"local_addrs": ["192.168.88.21:XXX", "2001:...:YY:XXX"]} recv:DirectCallAnswer {"callee_local_addrs": ["192.168.88.22:ZZZ", "2001:...:WW:ZZZ"]} recv:CallSetup {"peer_direct_addr":"150.228.49.65:NN", "peer_local_addrs":["192.168.88.22:ZZZ","2001:...:WW:ZZZ"]} connect:dual_path_race_start {"peer_reflex":"...","peer_local":[...]} dual_path: direct dial succeeded on candidate 0 ← LAN v4 wins connect:dual_path_race_won {"path":"Direct"} Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 07:34:49 +04:00
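
The candidate filter described in commit fa038df057 (Phase 5.5) can be sketched as below. This is a minimal illustration, not the project's code: the commit says the real `local_host_candidates(port)` enumerates interfaces through `if-addrs`, whereas this sketch takes a list of addresses as input so it needs only the standard library. The names `is_v4_host_candidate`, `is_v6_host_candidate`, and `filter_host_candidates` are hypothetical.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr, SocketAddr};

/// IPv4 addresses kept as LAN host candidates: RFC1918 private ranges
/// (10/8, 172.16/12, 192.168/16) plus CGNAT (100.64/10).
/// Loopback, link-local, unspecified, and public v4 all fail both checks.
fn is_v4_host_candidate(ip: Ipv4Addr) -> bool {
    let o = ip.octets();
    // 100.64.0.0/10: first octet 100, top two bits of the second octet are 01.
    let cgnat = o[0] == 100 && (o[1] & 0xC0) == 0x40;
    ip.is_private() || cgnat
}

/// IPv6 addresses kept as host candidates: global unicast (2000::/3)
/// and unique local addresses (fc00::/7).
/// Loopback (::1), unspecified (::), and link-local (fe80::/10) fail both checks.
fn is_v6_host_candidate(ip: Ipv6Addr) -> bool {
    let first = ip.segments()[0];
    let global_unicast = (first & 0xE000) == 0x2000;
    let ula = (first & 0xFE00) == 0xFC00;
    global_unicast || ula
}

/// Hypothetical helper: keep only the addresses that pass the filters above
/// and pair each with the caller's port, preserving input order.
fn filter_host_candidates(addrs: &[IpAddr], port: u16) -> Vec<SocketAddr> {
    addrs
        .iter()
        .copied()
        .filter(|ip| match ip {
            IpAddr::V4(v4) => is_v4_host_candidate(*v4),
            IpAddr::V6(v6) => is_v6_host_candidate(*v6),
        })
        .map(|ip| SocketAddr::new(ip, port))
        .collect()
}
```

Passing the signal endpoint's local port as `port` mirrors the commit's note that the port match is critical: dials to these addresses must land on the socket that is already listening.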

125 Commits
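
The path decision rule from commit f5542ef822 (Phase 6) is small enough to state as code. The sketch below is an illustration under assumptions: `PathReport` is a hypothetical stand-in carrying only the `direct_ok` flag of `MediaPathReport`, and `AgreedPath` and `negotiate` are invented names. A missing peer report (the 3s timeout case, or an old build) is modelled as `None`.

```rust
/// Transport both sides commit to after negotiation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgreedPath {
    Direct,
    Relay,
}

/// Hypothetical reduced form of a MediaPathReport: only the flag the
/// decision rule needs.
#[derive(Debug, Clone, Copy)]
struct PathReport {
    direct_ok: bool,
}

/// Decision rule as described in the commit message:
/// use Direct only if BOTH sides report direct_ok = true;
/// if either reports false, or the peer never reports, use Relay.
fn negotiate(local: PathReport, peer: Option<PathReport>) -> AgreedPath {
    match peer {
        Some(p) if local.direct_ok && p.direct_ok => AgreedPath::Direct,
        _ => AgreedPath::Relay,
    }
}
```

Because the rule is symmetric in the two reports, both peers evaluating it on the same pair of reports reach the same answer, which is what prevents the split Direct/Relay choice the commit describes.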