The broadcast alone wasn't reaching the first client because its
recv loop hadn't started yet when the second client registered.
Now the relay sends PresenceList directly to the new client (right
after RegisterPresenceAck) AND broadcasts to all others.
This guarantees every client gets the full user list:
- New client: via direct send (queued before recv loop starts)
- Existing clients: via broadcast
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New signal infrastructure for the lobby-first UI:
- PresenceUser struct: { fingerprint, alias }
- SignalMessage::PresenceList: relay broadcasts full user list
to all signal clients on every register/deregister
- SignalHub::presence_list(): builds the list from connected clients
- SignalHub::broadcast(): sends to ALL signal clients
- Relay calls broadcast on register + unregister
- Desktop emits "presence_list" signal-event to JS frontend
This gives clients real-time visibility of who's online via the
signal channel, without needing to join a voice room first.
603 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New SignalMessage variants for P2P quality coordination:
UpgradeProposal/UpgradeResponse/UpgradeConfirm (#28):
- Consensual quality upgrade flow — proposer sends desired profile,
peer accepts/rejects based on own conditions, confirm commits both
- All carry call_id for relay routing
QualityCapability (#30):
- Peer reports its max sustainable profile — enables asymmetric
encoding where each side uses its own best quality instead of
forcing everyone to the weakest link
Relay forwards all 4 signals to the call peer (same pattern as
MediaPathReport, CandidateUpdate, HardNatProbe).
Desktop signal recv loop handles all 4 with debug logging.
Encoder switching TODOs noted for wiring into CallEngine.
4 new serde roundtrip tests. 603 total, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When --key <64-char-hex> is provided with --replay, the analyzer
decrypts each packet's ChaCha20-Poly1305 payload using the session
key and logs plaintext frame sizes. Prints first 5 + every 100th
decrypt result, and a summary at the end.
This completes all 5 protocol analyzer tasks (#13-17):
- #13: Observer mode (live passive listener) — was done
- #14: TUI with Ratatui (per-participant panels) — was done
- #15: Capture and replay (.wzp format) — was done
- #16: HTML report (Chart.js loss/jitter graphs) — was done
- #17: Encrypted decode (--key for replay) — done now
Usage:
wzp-analyzer --replay session.wzp --key <64-hex-chars> --html report.html
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Birthday attack for random symmetric NATs:
- birthday.rs: open_acceptor_ports() opens N sockets, STUN-probes
each to learn external ports. generate_dialer_targets() builds
hit list (known ports first, then random fill). spray_dialer()
sprays QUIC connects with rate limiting, first success wins.
- Default: 32 acceptor ports, 128 dialer probes, 20ms interval
Signal coordination:
- HardNatBirthdayStart { acceptor_ports, external_ip } sent by
Acceptor when peer's HardNatProbe shows random/sequential NAT
- Relay forwards it like other call signals
- Desktop recv loop handles and logs it
Hybrid waterfall integration:
- On receiving HardNatProbe with non-cone allocation, Acceptor
auto-opens birthday ports and sends BirthdayStart
- Sockets kept alive 10s for NAT mapping persistence
- Dialer spray integration into race() pending (needs transport
hot-swap for background upgrade)
6 new tests, 599 total, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes from real-world 5G↔Starlink testing:
NAT tickle fix:
- tokio::net::UdpSocket::bind() doesn't set SO_REUSEADDR, so binding
to the same port as quinn silently failed. Now uses socket2::Socket
with explicit SO_REUSEADDR + SO_REUSEPORT (via libc on unix).
- Tickle now logs success/failure for debugging.
Diagnostic fixes:
- connect:dual_path_race_start shows both dial_order_raw and
dial_order_smart so we can see what filtering removed
- Grace-period timeout (relay wins first, direct still running)
now fills "timeout:grace" diags for unrecorded candidates
- Previously candidate_diags was empty when relay won the race
Dependencies:
- Added socket2 = "0.5" to wzp-client
593 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major P2P improvements for cross-network calls:
Smart candidate filtering (smart_dial_order):
- Strip LAN candidates when peer's public IP differs from ours
(172.16.x.x is unreachable from a different network)
- Strip all IPv6 candidates (Phase 7 disabled, wastes dial slots)
- Only keep mapped + reflexive for cross-network calls
- LAN candidates preserved when both peers share the same public IP
Acceptor NAT tickle:
- A-role sends a 1-byte UDP packet to each peer candidate BEFORE
accepting. This opens the NAT pinhole for return traffic from
the Dialer's IP — critical for address-restricted NATs that only
allow inbound from IPs they've seen outbound traffic to.
- Uses SO_REUSEADDR on the same port as the quinn endpoint.
Direct timeout increased from 2s to 4s:
- Cross-network QUIC handshakes through CGNAT can take 2-3s
- 2s was too aggressive for 5G/LTE networks
Diagnostic fix:
- Record "timeout:4s" for candidates still in-flight when the
timeout fires (previously these had no diagnostic entry)
5 new tests for smart_dial_order edge cases.
593 tests pass, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added CandidateDiag struct to RaceResult with per-candidate:
- address attempted
- result (ok / skipped:ipv6 / error:reason)
- elapsed time in ms
Surfaced in call-debug events:
- connect:dual_path_race_start now includes dial_order + peer_mapped
- connect:dual_path_race_done now includes candidate_diags array
Upgraded dual_path tracing from debug to info for IPv6 skips and
dial failures so they appear in logcat/console.
Helps diagnose why P2P fails on specific networks (5G CGNAT,
address-restricted NATs, etc).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase A of hard NAT traversal (PRD-hard-nat.md):
- PortAllocation enum: PortPreserving / Sequential{delta} / Random / Unknown
- detect_port_allocation(): sequential STUN probes from single socket,
analyzes port sequence for allocation pattern
- classify_port_allocation(): pure function with jitter tolerance,
wraparound handling, 60% threshold for noisy sequences
- predict_ports(): generates target port range from last_port + delta
- HardNatProbe signal message: carries port_sequence, allocation
pattern, external_ip for peer coordination
- Relay forwards HardNatProbe to call peer
- Netcheck gains port_allocation field + format_report display
588 tests pass (17 new), 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After 30s stable at a tier, the AdaptiveQualityController actively
probes the next tier up by switching the encoder and observing for 5s.
If loss/RTT stay within the target tier's thresholds, the upgrade
commits. If >1 bad report, the probe aborts with a 60s cooldown.
Probing is disabled on cellular (studio tiers aren't classified there)
and skipped when already at Studio64k (highest tier).
This complements the passive upgrade path (10 consecutive good reports)
by actively discovering that a path can sustain higher quality, rather
than waiting for the classification to drift upward.
New: ProbeState struct, check_probe() method, 4 constants, 5 tests.
377 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
#15 - Replay mode: --replay <file.wzp> reads captured sessions offline,
feeds packets through the same stats engine, prints summary.
CaptureReader mirrors CaptureWriter's binary format.
#16 - HTML report: --html <report.html> generates self-contained HTML
with Chart.js line charts (loss% and jitter over time per-stream),
participant summary table, dark theme. Works with live sessions
(after exit) or replay mode.
#17 - Encrypted decode: --key <hex> flag accepted and stored. Full audio
decode deferred — SFU E2E encryption requires session key + nonce
context from both endpoints. Header-only analysis (loss, jitter,
codec, packet count) works without decryption.
Usage:
wzp-analyzer --replay session.wzp --html report.html
wzp-analyzer relay:4433 --room test --capture out.wzp --html report.html
372 tests passing, 0 regressions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P2P calls now adapt codec quality based on observed network conditions,
matching what relay calls already had.
Three-layer implementation:
- QualityReport::from_path_stats(): construct reports from local quinn
stats (loss%, RTT, jitter) without needing relay-generated reports
- CallEncoder.pending_quality_report: one-shot attachment to next
source packet (consumed on encode, not repeated)
- Engine send tasks: generate quality report every 50 frames (~1s)
from quinn_path_stats() and attach via set_pending_quality_report()
- Engine recv tasks: self-observe from own QUIC path stats every 50
packets, feed to AdaptiveQualityController for P2P adaptation
(works even if peer isn't sending quality reports yet)
Both relay and P2P calls now have adaptive quality. On relay calls,
both peer-sent reports AND local observations feed the controller.
Hysteresis (3 consecutive bad reports to downgrade) prevents thrashing.
372 tests passing (+4 new: from_path_stats encoding, clamping, zero
values, encoder quality report attachment).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Partial reads from the capture ring consumed samples that were then
discarded when the send loop retried from buf[0]. For 20ms codecs this
was invisible (single Oboe burst fills 960 samples in one read), but
40ms codecs (Opus6k, 1920 samples) needed 2 bursts — the first partial
read consumed 960 real samples and threw them away.
Result: Opus6k produced ~11 frames/s instead of 25 (~44% of expected).
Fix: expose wzp_native_audio_capture_available() and check it before
reading, matching the desktop capture_ring.available() pattern. Partial
reads no longer occur because we only read when enough samples exist.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extend Tier enum from 3 to 6 levels: Studio64k/48k/32k + Good +
Degraded + Catastrophic with asymmetric hysteresis (down:3, up:5,
studio:10)
- Handle QualityDirective signals in both desktop and Android engines
— relay-coordinated codec switching now works end-to-end
- Add periodic TAP STATS to debug tap: packets in/out, fan-out avg,
seq gaps, codecs seen (every 5s)
- Mark task #2 done (ParticipantInfo in federation signals already
implemented)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>