Siavash Sameni 1618ff6c9d feat(p2p): Phase 5 — single-socket architecture (Nebula-style)
Before Phase 5 WarzonePhone used THREE separate UDP sockets per
client:

  1. Signal endpoint         (register_signal, client-only)
  2. Reflect probe endpoints (one fresh socket per relay probe)
  3. Dual-path race endpoint (fresh per call setup)

This broke two things in production on port-preserving NATs
(MikroTik masquerade, most consumer routers):

  a. Phase 2 NAT detection was WRONG. Each probe used a fresh
     internal port, so MikroTik mapped each one to a different
     external port, and the classifier saw "different port per
     relay" and labeled it SymmetricPort. The real NAT was
     cone-like but measurement via fresh sockets hid that.

  b. Phase 3.5 dual-path P2P race was BROKEN. The reflex addr
     we advertised in DirectCallOffer was observed by the signal
     endpoint's socket. The actual dual-path race listened on a
     DIFFERENT fresh socket, on a different internal (and
     therefore external) port. Peers dialed the advertised addr
     and hit MikroTik's mapping for the signal socket, which
     forwarded to the signal endpoint — a client-only endpoint
     that doesn't accept incoming connections. Direct path
     silently failed, relay always won the race.

Nebula-style fix: one socket for everything. The signal endpoint
is now dual-purpose (client + server_config), and both the
reflect probes and the dual-path race reuse it instead of
creating fresh ones. MikroTik's port-preservation then gives us
a stable external port across all flows → classifier correctly
sees Cone NAT → advertised reflex addr is the actual listening
port → direct dials from peers land on the right socket →
`endpoint.accept()` in the A-role branch of the dual-path race
picks up the incoming connection.

## Changes

### `register_signal` (desktop/src-tauri/src/lib.rs)
- Endpoint now created with `Some(server_config())` instead of
  `None`. The socket can now accept incoming QUIC connections as
  well as dial outbound.
- Every code path that previously read `sig.endpoint` for the
  relay-dial reuse benefits automatically — same socket is now
  ALSO listening for peer dials.

### `probe_reflect_addr` (wzp-client/src/reflect.rs)
- New `existing_endpoint: Option<Endpoint>` arg. `Some` reuses
  the caller's socket (production: pass the signal endpoint).
  `None` creates a fresh one (tests + pre-registration).
- Removed the `drop(endpoint)` at the end — was correct for
  fresh endpoints (explicit early socket close) but incorrect
  for shared ones. End-of-scope drop does the right thing in
  both cases via Arc semantics.

### `detect_nat_type` (wzp-client/src/reflect.rs)
- New `shared_endpoint: Option<Endpoint>` arg, forwarded to
  every probe in the JoinSet fan-out. One shared socket means
  the classifier sees the true NAT type.

### `detect_nat_type` Tauri command (desktop/src-tauri/src/lib.rs)
- Reads `state.signal.endpoint` and passes it as the shared
  endpoint. Falls back to None when not registered. NAT detection
  now produces accurate classifications against MikroTik / most
  consumer NATs.

### `dual_path::race` (wzp-client/src/dual_path.rs)
- New `shared_endpoint: Option<Endpoint>` arg.
- A-role: when `Some`, reuses it for `accept()`. This is the
  critical change — the reflex addr advertised to peers is now
  the address listening for incoming direct dials.
- D-role: when `Some`, reuses it for the outbound direct dial.
  MikroTik keeps the same external port for the dial as for
  the signal flow → direct dial through a cone-mapped NAT.
- Relay path: also reuses the shared endpoint so MikroTik has
  a single consistent mapping across the whole call (saves one
  extra external port and makes firewall traces cleaner).
- When `None`, falls back to fresh per-role endpoints as before.

### `connect` Tauri command (desktop/src-tauri/src/lib.rs)
- Reads `state.signal.endpoint` once when acquiring own reflex
  addr and passes it through to `dual_path::race`.

### Tests
- `wzp-client/tests/dual_path.rs` and
  `wzp-relay/tests/multi_reflect.rs` updated to pass `None` for
  the new endpoint arg — tests use fresh sockets and that's
  fine because the loopback harness doesn't care about
  port-preserving NAT behavior.

Full workspace test: 423 passing (no regressions).

## Expected behavior after this commit on real hardware

Behind MikroTik + Starlink-bypass (the reporter's setup):
- Phase 2 NAT detect → **Cone NAT** (was SymmetricPort — false
  positive from the measurement artifact)
- Phase 3.5 direct-P2P dial → succeeds for both cone-cone and
  cone-CGNAT cases where the remote side was previously blocked
  by our own socket mismatch
- LTE ↔ LTE cross-carrier → still likely relay fallback; that's
  genuinely strict symmetric and needs Phase 5.5 port prediction.

## Phase 5.5 (next, separate PRD)

Multi-candidate port prediction + ICE-style candidate aggregation
for truly strict symmetric NATs. Not needed for the 95% case —
Phase 5 alone fixes most consumer-router setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 19:47:20 +04:00

WarzonePhone

Custom lossy VoIP protocol built in Rust. E2E encrypted, FEC-protected, adaptive quality, designed for hostile network conditions.

Quick Start

# Build
cargo build --release

# Run relay
./target/release/wzp-relay --listen 0.0.0.0:4433

# Send a test tone
./target/release/wzp-client --send-tone 5 relay-addr:4433

# Web bridge (browser calls)
./target/release/wzp-web --port 8080 --relay 127.0.0.1:4433 --tls
# Open https://localhost:8080/room-name in two browser tabs

Architecture

See docs/ARCHITECTURE.md for the full system architecture with Mermaid diagrams covering:

  • System overview and data flow
  • Crate dependency graph (8 crates)
  • Wire formats (MediaHeader, MiniHeader, TrunkFrame, SignalMessage)
  • Cryptographic handshake (X25519 + Ed25519 + ChaCha20-Poly1305)
  • Identity model (BIP39 seed, featherChat compatible)
  • Quality profiles (GOOD/DEGRADED/CATASTROPHIC)
  • FEC protection (RaptorQ with interleaving)
  • Adaptive jitter buffer (NetEq-inspired)
  • Telemetry stack (Prometheus + Grafana)
  • Deployment topology

Features

  • 3 quality tiers: Opus 24k (28.8 kbps) / Opus 6k (9 kbps) / Codec2 1200 (2.4 kbps)
  • RaptorQ FEC: Recovers from 20-100% packet loss depending on tier
  • E2E encryption: ChaCha20-Poly1305 with X25519 key exchange
  • Adaptive jitter buffer: EMA-based playout delay tracking
  • Silence suppression: VAD + comfort noise (~50% bandwidth savings)
  • ML noise removal: RNNoise (nnnoiseless pure Rust port)
  • Mini-frames: 67% header compression for steady-state packets
  • Trunking: Multiplex sessions into batched datagrams
  • featherChat integration: Shared BIP39 identity, token auth, call signaling
  • Prometheus metrics: Relay, web bridge, inter-relay probes
  • Grafana dashboard: Pre-built JSON with 18 panels

Documentation

Document Description
ARCHITECTURE.md Full system architecture with diagrams
TELEMETRY.md Prometheus metrics specification
INTEGRATION_TASKS.md featherChat integration tracker
WZP-FC-SHARED-CRATES.md Shared crate strategy
grafana-dashboard.json Importable Grafana dashboard

Binaries

Binary Description
wzp-relay Relay daemon (SFU room mode, forward mode, probes)
wzp-client CLI client (send-tone, record, live mic, echo-test, drift-test, sweep)
wzp-web Browser bridge (HTTPS + WebSocket + AudioWorklet)
wzp-bench Component benchmarks

Linux Build

./scripts/build-linux.sh --prepare   # Create Hetzner VM + install deps
./scripts/build-linux.sh --build     # Build release binaries
./scripts/build-linux.sh --transfer  # Download to target/linux-x86_64/
./scripts/build-linux.sh --destroy   # Delete VM

Tests

cargo test --workspace   # 272 tests

License

MIT OR Apache-2.0

Description
No description provided
Readme 147 MiB
Languages
Rust 78%
Kotlin 7.9%
Shell 6.7%
TypeScript 3.2%
C++ 1.5%
Other 2.6%