14 Commits

Author SHA1 Message Date
Siavash Sameni
766c9df442 feat(dred): continuous DRED tuning, PMTUD, extended Opus6k window
- DredTuner: maps live network metrics (loss/RTT/jitter) to continuous
  DRED duration every ~500ms instead of discrete tier-locked values.
  Includes jitter-spike detection for pre-emptive Starlink-style boost.
- Opus6k DRED extended from 500ms to 1040ms (max libopus 1.5 supports)
- PMTUD: quinn MtuDiscoveryConfig with upper_bound=1452, 300s interval
- TrunkedForwarder respects discovered MTU (was hard-coded 1200)
- QuinnPathSnapshot exposes quinn internal stats + discovered MTU
- AudioEncoder trait: set_expected_loss() + set_dred_duration() methods
- PathMonitor: sliding-window jitter variance for spike detection
- Integrated into both Android and desktop send tasks in engine.rs
- 14 new tests (10 tuner unit + 4 encoder integration)
- Updated ARCHITECTURE.md, PROGRESS.md, PRD-dred-integration, PRD-mtu

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:38:37 +04:00
Siavash Sameni
40955bd11c debug(media): add connection diagnostics for direct P2P drops
When direct P2P calls show 100% datagram drops, we need to know
WHY send_media() fails. This commit adds:

- Remote address + stable_id logging on A-role accept and D-role
  dial success (dual_path.rs) — tells us which candidate won
- Remote address + max_datagram_size on engine transport init —
  verifies datagrams are negotiated
- last_send_err in send heartbeat — captures the actual error
  from send_datagram() failures
- QuinnTransport::remote_address() helper

Also fixes UI badge: was looking for wrong event name
("dual_path_race_won" → "path_negotiated").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:29:58 +04:00
Siavash Sameni
c2d298beb5 feat(net): Phase 7 — dual-socket IPv4+IPv6 ICE
Adds a dedicated IPv6 QUIC endpoint (IPV6_V6ONLY=1 via socket2)
alongside the existing IPv4 signal endpoint for proper dual-stack
P2P connectivity. Previous [::]:0 dual-stack attempt broke IPv4
on Android; this uses separate sockets per address family like
WebRTC/libwebrtc.

- create_ipv6_endpoint(): socket2-based IPv6-only UDP socket,
  tries same port as IPv4 signal EP, falls back to ephemeral
- local_host_candidates(v4_port, v6_port): now gathers IPv6
  global-unicast (2000::/3) and unique-local (fc00::/7) addrs
- dual_path::race(): A-role accepts on both v4+v6 via select!,
  D-role routes each candidate to matching-AF endpoint
- Graceful fallback: if IPv6 unavailable, .ok() → None → pure
  IPv4 behavior identical to pre-Phase-7

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 11:54:13 +04:00
Siavash Sameni
da08723fe7 fix(signal): forward-compat — log+continue on unknown SignalMessage variants
Both sides of the signal channel previously broke their recv loop
on any deserialize error, which meant adding a new variant in one
build silently killed signal connections from peers running an
older build. This bit us during Phase 1 testing: a new client
sending SignalMessage::Reflect to a pre-Phase-1 relay caused the
relay to drop the whole signal connection, which looked like
"Error: not registered" on the next place_call.

Fix:
- New TransportError::Deserialize(String) variant in wzp-proto
  carries serde errors as a distinct category.
- wzp-transport/reliable.rs::recv_signal returns Deserialize on
  serde_json::from_slice failures (was wrapped in Internal).
- wzp-relay/main.rs signal loop matches on Deserialize → warn +
  continue (instead of break).
- desktop/src-tauri/lib.rs recv loop does the same.

Other TransportError variants (ConnectionLost, Io, Internal) still
break the loop — only pure parse failures are recoverable.

This means future SignalMessage variant additions are backward-
compat by construction: older peers will see "unknown variant,
continuing" in their logs while newer peers can keep evolving the
protocol.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:13:31 +04:00
Siavash Sameni
49f101d785 fix(android): reuse signal endpoint for direct-call media connection
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
Direct-call accept hangs forever at the QUIC handshake on Android. Logs
from d7b37a5 showed:
  CallEngine::start (android) invoked relay=172.16.81.172:4433 room=call-…
  resolved relay addr
  identity loaded
  endpoint created, dialing relay   ← reached
                                    ← nothing, 90s+, no error
The "connect failed" and "QUIC connection established" log lines never
fire, meaning endpoint.connect_with(…).await never makes progress.

Repro is 100%: SFU room join (one endpoint) works perfectly; direct call
(opens a SECOND quinn::Endpoint on top of the signal one) hangs in the
QUIC handshake. Creating two quinn::Endpoints on Android's AAudio-adjacent
UDP stack apparently causes the second one's datagrams to never reach the
relay (the server never sees the Initial packet). Rather than fight the
platform, quinn is happy to multiplex multiple Connections on a single
Endpoint — so we reuse the signal endpoint for the media connection.

- SignalState now stores the quinn::Endpoint alongside the QuinnTransport.
  register_signal populates both at the same time.
- CallEngine::start (both android and desktop branches) takes an
  Option<wzp_transport::Endpoint>. Some → reuse (direct-call path, after
  register_signal). None → create fresh (SFU room join path).
- The connect tauri command reads state.signal.endpoint and threads it
  through to CallEngine::start, so the direct-call auto-connect (fired by
  the "setup" signal-event in main.ts) lands on the existing UDP socket.
- wzp_transport re-exports quinn::Endpoint so wzp-desktop doesn't need to
  depend on quinn directly.
- Also wraps the android connect in tokio::time::timeout(10s) so future
  hangs become deterministic "connect TIMED OUT" errors in logcat
  instead of silent deadlock.

Same fix applies verbatim to the desktop client — the user suspects
direct call is broken there too and this was likely always the cause,
just never surfaced because desktop was only tested via SFU rooms.
2026-04-09 20:29:51 +04:00
Siavash Sameni
f935bd69cd fix: rewrite seq/fec for federation-delivered packets
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 2m48s
Mirror to GitHub / mirror (push) Failing after 4m2s
- Time-based dedup (2s TTL) replaces fixed-window dedup — consecutive
  senders with same seq numbers no longer collide
- Raw byte forwarding for federation local delivery (no re-serialization)
- Jitter buffer resets on large backward seq jumps (>100)
- recv_media skips malformed datagrams instead of returning connection-closed
- SIGTERM handler for clean QUIC shutdown on wzp-client
- JSONL event log infrastructure (--event-log flag) for protocol analysis
- FEC disabled on GOOD profile for federation debugging (fec_ratio=0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:55:06 +04:00
Siavash Sameni
6be36e43c2 feat: relay federation infrastructure — room bridging, loop prevention, peer connections
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 2m1s
Phase 1 of relay federation:

1. Signal messages: FederationRoomJoin/Leave/ParticipantUpdate added
   to SignalMessage enum for relay-to-relay room coordination.

2. Room changes: ParticipantOrigin (Local/Federated) tracking, loop
   prevention (federated media only forwards to local participants),
   ParticipantSender::Federation with 8-byte room-hash prefixed
   datagrams, merged participant lists (local + remote), new methods:
   join_federated(), update_federated_participants(), local_senders(),
   active_rooms(), local_participants().

3. FederationManager: connects to configured peers via QUIC with SNI
   "_federation", reconnects with exponential backoff (5s-300s),
   exchanges FederationRoomJoin signals, runs recv loops for both
   signals and media datagrams, creates virtual participants in rooms.

4. Accept-side: _federation SNI handling in main.rs, unknown peer
   gets helpful "add to relay.toml" log message, recognized peers
   handed off to FederationManager.

TODO: TLS fingerprint verification — currently outbound connections
use client_config() which doesn't present a cert, so inbound
verification fails. Need mutual TLS or URL-based peer matching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:30:18 +04:00
Siavash Sameni
087bfd2335 feat: deterministic TLS certificate from relay identity seed
The relay's TLS certificate is now derived from the persisted
Ed25519 seed via HKDF, so the same seed produces the same cert
and the same TLS fingerprint across restarts. This fixes the
"Server Key Changed" warnings on every relay restart.

Implementation: HKDF-SHA256(seed, "wzp-tls-ed25519") → Ed25519
signing key → PKCS8 DER → rcgen KeyPair → self-signed cert.

Also adds tls_fingerprint() helper (SHA-256 of DER cert, hex with
colons) and prints it on startup. This is the prerequisite for
relay federation (peers verify each other by TLS fingerprint).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:10:08 +04:00
Claude
a9c4260b4e fix: close QUIC connection on hangup so relay removes participant immediately
stop_call() now calls close_now() on the stored transport handle before
killing the tokio runtime. This sends a QUIC CONNECTION_CLOSE frame so
the relay's recv loop breaks immediately, triggering leave() + RoomUpdate
broadcast. Previously the runtime was killed first, so transport.close()
never ran and the relay kept stale participants until idle timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 04:58:24 +00:00
Claude
a23d9f5e41 feat: foreground service, dB gain sliders, speaker routing, live network stats
- Wire CallService foreground service for background calls (microphone type)
- Add Voice Volume + Mic Gain sliders (-20 to +20 dB) applied in Kotlin
- Connect AudioRouteManager for real speaker toggle via AudioManager
- Feed quinn QUIC RTT into PathMonitor, display Loss/RTT/Jitter from live data
- Nuclear teardown between calls — recreate engine + audio pipeline each call
- Fix re-entrant teardown loop from CallService notification callback
- Park audio threads as daemons to avoid libcrypto TLS destructor crash on exit
- Remove duplicate wakelocks from Activity (service owns them now)
- Strip AEC + denoise from capture path, keep AGC only (incremental approach)
- Fix .so copy target: libwzp_android.so not libwzp.so

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:45:00 +00:00
Claude
26e9c55f1f feat: Android VoIP client — Phase 1 (audio quality, network adaptation, crate skeleton)
- New wzp-android crate with Oboe C++ backend, lock-free SPSC ring buffers,
  engine orchestrator, codec pipeline, and Android Gradle project structure
- AEC (NLMS adaptive filter), AGC (two-stage with fast attack/slow release),
  windowed-sinc FIR resampler replacing linear interpolation (wzp-codec)
- Opus encoder tuning: complexity 7 default, set_expected_loss support
- Mobile jitter buffer: asymmetric EMA (fast up/slow down), handoff spike
  detection with 2s cooldown, configurable safety margin
- Network-aware quality control: cellular-specific thresholds, faster
  downgrade on cellular, proactive tier drop on WiFi→cellular handoff,
  FEC ratio boost during network transitions
- Handoff detection in PathMonitor via RTT jitter spike analysis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:07:55 +00:00
Siavash Sameni
0dc381e948 feat: protocol improvements — live trunking, mini-frames, noise suppression, adaptive jitter
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full

T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode

Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough

T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate

272 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:24:53 +04:00
Siavash Sameni
ce6aacb25f fix: bridge pairing + auto-reconnect + test stability
Bridge mode rewrite:
- First client echoes while waiting, checks every 100ms if paired
- Second client triggers bridge immediately, first exits echo loop
- After bridge ends, slot is cleared for the next pair
- No more two tasks competing for the same transport recv

Web client auto-reconnect:
- On WebSocket close/error, automatically reconnects after 1s
- Keeps retrying as long as the user hasn't clicked Disconnect

Test fix:
- Install rustls crypto provider in transport config tests
  (fixes race condition when running full workspace tests)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 19:49:27 +04:00
Siavash Sameni
51e893590c feat: WarzonePhone lossy VoIP protocol — Phase 1 complete
Rust workspace with 7 crates implementing a custom VoIP protocol
designed for extremely lossy connections (5-70% loss, 100-500kbps,
300-800ms RTT). 89 tests passing across all crates.

Crates:
- wzp-proto: Wire format, traits, adaptive quality controller, jitter buffer, session FSM
- wzp-codec: Opus encoder/decoder (audiopus), Codec2 stubs, adaptive switching, resampling
- wzp-fec: RaptorQ fountain codes, interleaving, block management (proven 30-70% loss recovery)
- wzp-crypto: X25519+ChaCha20-Poly1305, Warzone identity compatible, anti-replay, rekeying
- wzp-transport: QUIC via quinn with DATAGRAM frames, path monitoring, signaling streams
- wzp-relay: Integration stub (Phase 2)
- wzp-client: Integration stub (Phase 2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:45:07 +04:00