feat(reflect): multi-relay NAT type detection — Phase 2

Builds on Phase 1's SignalMessage::Reflect to probe N relays in
parallel through transient QUIC connections and classify the
client's NAT type for the future P2P hole-punching path. No wire
protocol changes — Phase 1's Reflect/ReflectResponse pair is
reused unchanged.

New client-side module (crates/wzp-client/src/reflect.rs):
- probe_reflect_addr(relay, timeout_ms): opens a throwaway
  quinn::Endpoint (fresh ephemeral source port per probe,
  essential for NAT-type detection — sharing one endpoint would
  make a symmetric NAT look like a cone NAT), connects to _signal,
  sends RegisterPresence with zero identity, consumes the Ack,
  sends Reflect, awaits ReflectResponse, cleanly closes.
- detect_nat_type(relays, timeout_ms): runs the probes in parallel
  via tokio::task::JoinSet, so total wall time is bounded by the
  slowest probe rather than the sum of all probes, and returns a
  NatDetection with per-probe results + aggregate classification.
- classify_nat(probes): pure-function classifier split out for
  network-free unit tests. Rules:
    * 0-1 successful probes              → Unknown
    * 2+ successes, same ip same port    → Cone (P2P viable)
    * 2+ successes, same ip diff ports   → SymmetricPort (relay)
    * 2+ successes, different ips        → Multiple (treat as
                                             symmetric)
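
A minimal sketch of the classifier rules above, for illustration
only. The ProbeOutcome alias and the exact signature are
assumptions, not the types in reflect.rs; only the four verdict
names come from this commit. It also assumes that any port
disagreement among successful probes counts as SymmetricPort.

```rust
use std::collections::HashSet;
use std::net::{IpAddr, SocketAddr};

/// Aggregate verdict; variant names follow the rules listed above.
#[derive(Debug, PartialEq, Eq)]
pub enum NatType {
    Unknown,
    Cone,
    SymmetricPort,
    Multiple,
}

/// Hypothetical per-probe outcome: `Ok` holds the address the relay
/// reflected back, `Err` holds an error string for a failed probe.
pub type ProbeOutcome = Result<SocketAddr, String>;

/// Pure function: no I/O, so it can be unit-tested without a network.
pub fn classify_nat(probes: &[ProbeOutcome]) -> NatType {
    // Only successful probes carry an observed address.
    let observed: Vec<SocketAddr> = probes
        .iter()
        .filter_map(|p| p.as_ref().ok().copied())
        .collect();

    // 0-1 successes: nothing to compare against.
    if observed.len() < 2 {
        return NatType::Unknown;
    }

    // Different public IPs across relays: treat as symmetric.
    let ips: HashSet<IpAddr> = observed.iter().map(|a| a.ip()).collect();
    if ips.len() > 1 {
        return NatType::Multiple;
    }

    // Same IP: one shared port means Cone, otherwise SymmetricPort.
    let ports: HashSet<u16> = observed.iter().map(|a| a.port()).collect();
    if ports.len() == 1 {
        NatType::Cone
    } else {
        NatType::SymmetricPort
    }
}
```

Failed probes are dropped before counting, so one success plus
one failure still yields Unknown.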

Tauri command (desktop/src-tauri/src/lib.rs):
- detect_nat_type({ relays: [{ name, address }] }) -> NatDetection
  as JSON. Takes the relay list from JS because localStorage
  owns the config. Relay addresses are parsed up front so a
  malformed entry fails the whole call cleanly instead of
  surfacing as a probe error. 1500ms per-probe timeout.
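
The parse-up-front step can be sketched as a standalone helper.
parse_relays and its (name, address) tuple input are hypothetical
names used for illustration; the Tauri command in this commit
does the same work inline on its RelayArg values.

```rust
use std::net::SocketAddr;

/// Hypothetical helper: validate every relay address before any
/// network work starts. The first malformed entry aborts the call.
pub fn parse_relays(
    relays: Vec<(String, String)>,
) -> Result<Vec<(String, SocketAddr)>, String> {
    relays
        .into_iter()
        .map(|(name, address)| {
            address
                .parse::<SocketAddr>()
                .map(|addr| (name, addr))
                .map_err(|e| format!("bad relay address {address:?}: {e}"))
        })
        // Collecting into a Result stops at the first Err.
        .collect()
}
```

SocketAddr parsing accepts only literal ip:port strings, so a
hostname such as "relay.example.com:4433" is rejected at this
step rather than resolved.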

UI (desktop/index.html + src/main.ts):
- New "NAT type" row + "Detect NAT" button in the Network
  settings section. Renders per-probe status (name, address,
  observed addr, latency, or error) plus the colored verdict:
    * green  Cone — shows consensus addr
    * amber  SymmetricPort / Multiple — must relay
    * gray   Unknown — not enough data

Tests:
- 7 unit tests in wzp-client/src/reflect.rs covering every
  classifier branch (empty, 1 success, 2 identical, 2 diff ports,
  2 diff ips, success+failure mix, pure-failure).
- 3 integration tests in crates/wzp-relay/tests/multi_reflect.rs:
    * probe_reflect_addr_happy_path — single mock relay end-to-end
    * detect_nat_type_two_loopback_relays_is_cone — two concurrent
      relays, asserts both see 127.0.0.1 and classifier returns
      Cone or SymmetricPort (both accepted because each probe
      uses a fresh ephemeral source port, which can look like
      SymmetricPort on single-host loopback)
    * detect_nat_type_dead_relay_is_unknown — alive + dead port
      mix, asserts the dead probe surfaces an error string and
      the aggregator returns Unknown (only 1 success)

Full workspace test goes from 386 → 396 passing.

PRD: .taskmaster/docs/prd_multi_relay_reflect.txt
Tasks: 47-52 all completed

Next up: hole-punching (Phase 3) — use the reflected address in
DirectCallOffer/Answer and CallSetup so peers attempt a direct
QUIC handshake to each other, with relay fallback on timeout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Siavash Sameni
2026-04-11 12:47:12 +04:00
parent 921856eba9
commit 8d903f16c6
6 changed files with 701 additions and 1 deletions

@@ -719,6 +719,53 @@ async fn get_reflected_address(
}
}
/// Phase 2 of the "STUN for QUIC" rollout — probe multiple relays
/// in parallel to classify this client's NAT type. See
/// `wzp_client::reflect` for the per-probe logic and the pure
/// classifier.
///
/// This does NOT touch the registered `SignalState` — each probe
/// opens a fresh throwaway QUIC endpoint so the OS gives it a
/// fresh ephemeral source port. Sharing one endpoint across probes
/// would make a symmetric NAT look like a cone NAT, which is
/// exactly the failure mode we're trying to detect.
///
/// Takes the relay list from JS because the GUI owns the relay
/// config (localStorage `wzp-settings.relays`). Frontend passes it
/// in; Rust side just does the network work.
#[tauri::command]
async fn detect_nat_type(
    relays: Vec<RelayArg>,
) -> Result<serde_json::Value, String> {
    // Parse relay args up front so a single malformed entry fails
    // the whole call cleanly instead of surfacing as a probe error
    // at the end.
    let mut parsed = Vec::with_capacity(relays.len());
    for r in relays {
        let addr: std::net::SocketAddr = r
            .address
            .parse()
            .map_err(|e| format!("bad relay address {:?}: {e}", r.address))?;
        parsed.push((r.name, addr));
    }
    // 1500ms per probe is generous: a same-host probe is < 10ms,
    // a cross-continent probe is typically < 300ms, and we want
    // to tolerate a one-off packet loss during connect.
    let detection = wzp_client::reflect::detect_nat_type(parsed, 1500).await;
    serde_json::to_value(&detection).map_err(|e| format!("serialize: {e}"))
}
/// Deserialization shim for the relay list coming from JS. The
/// `wzp-settings.relays` array in localStorage has more fields
/// (rtt, serverFingerprint, knownFingerprint) but we only need
/// name + address here.
#[derive(serde::Deserialize)]
struct RelayArg {
    name: String,
    address: String,
}
#[tauri::command]
async fn get_signal_status(state: tauri::State<'_, Arc<AppState>>) -> Result<serde_json::Value, String> {
let sig = state.signal.lock().await;
@@ -805,7 +852,7 @@ pub fn run() {
ping_relay, get_identity, get_app_info,
connect, disconnect, toggle_mic, toggle_speaker, get_status,
register_signal, place_call, answer_call, get_signal_status,
-get_reflected_address,
+get_reflected_address, detect_nat_type,
deregister,
set_speakerphone, is_speakerphone_on,
get_call_history, get_recent_contacts, clear_call_history,