From 7b4bce69d55a59161dacb28ef3765ded51c491bc Mon Sep 17 00:00:00 2001
From: Siavash Sameni
Date: Tue, 14 Apr 2026 11:33:12 +0400
Subject: [PATCH] docs: update all docs for hard NAT detection + relay wiring

- PROGRESS.md: hard NAT Phase A, relay cross-wiring, 588 tests
- ARCHITECTURE.md: hard NAT port prediction diagram + pattern table
- PRD-p2p-direct.md: Phase 8.6 split into a/b/c/d with status
- PRD-hard-nat.md: Phase A done, B signal ready, effort table updated
- PRD-netcheck.md: port_allocation field + probe documented

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/ARCHITECTURE.md   | 36 ++++++++++++++++++++++++++++++++++++
 docs/PRD-hard-nat.md   | 20 ++++++++++----------
 docs/PRD-netcheck.md   |  2 ++
 docs/PRD-p2p-direct.md | 15 +++++++++++----
 docs/PROGRESS.md       | 19 ++++++++++++++++++-
 5 files changed, 77 insertions(+), 15 deletions(-)

diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md
index 68068d3..98bb8be 100644
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -1179,3 +1179,39 @@ dual_path::race() with new candidates (TODO: transport hot-swap)
 | `RegisterPresenceAck` | `relay_region`, `available_relays` | Relay mesh metadata for auto-selection |
 
 All new fields use `#[serde(default, skip_serializing_if)]` for backward compatibility with older clients/relays.
+
+### Hard NAT Port Prediction
+
+For symmetric NATs that don't support port mapping, the system detects the NAT's port allocation pattern:
+
+```
+Single socket → 5 STUN servers (sequential probes)
+         │
+         ▼
+Observed ports: [40001, 40002, 40003, 40004, 40005]
+         │
+         ▼
+classify_port_allocation() → Sequential { delta: 1 }
+         │
+         ▼
+predict_ports(last=40005, delta=1, offset=0, spread=2)
+  → [40004, 40005, 40006, 40007, 40008]
+         │
+         ▼
+HardNatProbe signal → peer
+         │
+         ▼
+Peer dials predicted port range in parallel
+```
+
+| Pattern | Detection | Traversal Strategy |
+|---------|-----------|-------------------|
+| Port-preserving | All probes return same port | Standard hole-punch |
+| Sequential (delta=N) | Consistent N-increment | Predict next port, dial range |
+| Random | No pattern | Birthday attack or relay |
+| Unknown | < 3 probes succeeded | Relay fallback |
+
+The classifier tolerates:
+- **Jitter**: ±1 from dominant delta (concurrent flow grabbed a port)
+- **Wraparound**: 65535 → 1 treated as delta=+2, not -65534
+- **Noise**: 60% threshold — if most deltas agree, call it sequential
diff --git a/docs/PRD-hard-nat.md b/docs/PRD-hard-nat.md
index c6b2167..b59c61a 100644
--- a/docs/PRD-hard-nat.md
+++ b/docs/PRD-hard-nat.md
@@ -1,8 +1,8 @@
 # PRD: Hard NAT Traversal (Port Prediction + Birthday Attack)
 
-> Phase: Design
-> Status: Not started
-> Crate: wzp-client, wzp-proto
+> Phase: Partial implementation
+> Status: Phase A done, Phase B signal ready, C-D not started (2026-04-14)
+> Crate: wzp-client, wzp-proto, wzp-relay
 
 ## Problem
 
@@ -193,14 +193,14 @@ HardNatBirthdayStart {
 
 ## Effort Estimate
 
-| Phase | Scope | Effort |
-|-------|-------|--------|
-| A | Port allocation pattern detection | 1 day |
-| B | Sequential port prediction + coordination | 2 days |
-| C | Birthday attack (256 sockets + 1024 probes) | 3 days |
-| D | Hybrid waterfall + background upgrade | 2 days |
+| Phase | Scope | Effort | Status |
+|-------|-------|--------|--------|
+| A | Port allocation pattern detection | 1 day | **Done** — `PortAllocation` enum, `detect_port_allocation()`, `classify_port_allocation()`, `predict_ports()`, 17 tests |
+| B | Sequential port prediction + coordination | 2 days | **Signal ready** — `HardNatProbe` signal + relay forwarding done. `dual_path::race()` integration pending |
+| C | Birthday attack (256 sockets + 1024 probes) | 3 days | Not started |
+| D | Hybrid waterfall + background upgrade | 2 days | Not started |
 
-**Total**: ~8 days. Recommend starting with Phase A (detection) which is useful for netcheck even without the attack. Phase B (sequential prediction) covers ~40% of hard NATs with minimal complexity. Phase C (birthday) is the most complex and lowest ROI.
+**Total**: ~8 days. Phase A is done and feeds into netcheck. Phase B has signal plumbing complete — needs `dual_path::race()` integration to actually dial predicted ports. Phase C (birthday) is the most complex and lowest ROI.
 
 ## Success Criteria
 
diff --git a/docs/PRD-netcheck.md b/docs/PRD-netcheck.md
index f6ab372..02a2793 100644
--- a/docs/PRD-netcheck.md
+++ b/docs/PRD-netcheck.md
@@ -34,6 +34,7 @@ pub struct NetcheckReport {
     pub gateway: Option,
     pub duration_ms: u32,
     pub stun_probes: Vec,
+    pub port_allocation: Option,
 }
 ```
 
@@ -43,6 +44,7 @@ pub struct NetcheckReport {
 3. **Port mapping** — `acquire_port_mapping()` to detect NAT-PMP/PCP/UPnP
 4. **Gateway** — `default_gateway()` for the router address
 5. **IPv6** — attempt to bind `[::]:0` and send to an IPv6 STUN server
+6. **Port allocation** — `detect_port_allocation()` probes STUN servers from single socket to classify NAT pattern as PortPreserving/Sequential/Random (feeds into hard NAT prediction)
 
 **Derived fields**:
 - `nat_type` / `reflexive_addr` — from `classify_nat()` on STUN probes
diff --git a/docs/PRD-p2p-direct.md b/docs/PRD-p2p-direct.md
index a5f8285..77db163 100644
--- a/docs/PRD-p2p-direct.md
+++ b/docs/PRD-p2p-direct.md
@@ -152,7 +152,10 @@ The existing relay connection carries `IceCandidate` signals. No new infrastruct
 | 8.3 | Mid-call ICE re-gathering + CandidateUpdate signal | 2 days | Done (signal plane; transport hot-swap TODO) |
 | 8.4 | Netcheck diagnostic | 1 day | Done |
 | 8.5 | Region-based relay selection (data model) | 1 day | Done |
-| 8.6 | Hard NAT traversal (birthday attack) | — | Deferred |
+| 8.6a | Hard NAT: port allocation detection | 1 day | Done |
+| 8.6b | Hard NAT: sequential port prediction signal | 1 day | Done (signal + prediction fn; dial integration pending) |
+| 8.6c | Hard NAT: birthday attack (256×1024 probes) | 3 days | Not started |
+| 8.6d | Hard NAT: hybrid waterfall + background upgrade | 2 days | Not started |
 
 ## Implementation Status (2026-04-13)
 
@@ -200,6 +203,10 @@ Added 5 new modules to bring NAT traversal capability close to Tailscale's:
 - `relay_map.rs`: `RelayMap` sorted by RTT with `preferred()` selection
 - `RegisterPresenceAck` extended with `relay_region` + `available_relays`
 
-### Phase 8.6: Hard NAT Traversal (Deferred)
-- Birthday-attack port prediction deferred — 2-5s probing latency is excessive for VoIP call setup
-- Phases 8.1-8.2 cover the vast majority of NAT configurations
+### Phase 8.6: Hard NAT Traversal (Phase A done, B-D pending)
+- **Phase A (Done)**: Port allocation pattern detection — `PortAllocation` enum (`PortPreserving`/`Sequential{delta}`/`Random`/`Unknown`), `detect_port_allocation()` probes N STUN servers from single socket, `classify_port_allocation()` with wraparound + jitter tolerance, `predict_ports()` for sequential NATs
+- **Phase B (signal ready)**: `HardNatProbe` signal message carries `port_sequence`, `allocation`, `external_ip` — relay forwarding implemented. Actual dial-to-predicted-ports integration into `dual_path::race()` pending.
+- **Phase C (not started)**: Birthday attack (256 sockets × 1024 probes) for random NATs
+- **Phase D (not started)**: Hybrid waterfall with background relay-to-direct upgrade
+- `NetcheckReport.port_allocation` populated automatically from `detect_port_allocation()`
+- See `docs/PRD-hard-nat.md` for full design
diff --git a/docs/PROGRESS.md b/docs/PROGRESS.md
index dd029a1..b2a4a6b 100644
--- a/docs/PROGRESS.md
+++ b/docs/PROGRESS.md
@@ -332,7 +332,7 @@ Run with `wzp-bench --all`. Representative results (Apple M-series, single core)
 
 ### Phase 8: Tailscale-Inspired STUN/ICE Enhancements (2026-04-14)
 
-5 new modules in `wzp-client`, 64 new unit tests (363 total across client/proto/relay).
+5 new modules in `wzp-client`, 83 new unit tests (588 total across workspace).
 
 #### Public STUN Client (`stun.rs`)
 - Minimal RFC 5389 STUN Binding Request/Response over raw UDP
@@ -372,3 +372,20 @@ Run with `wzp-bench --all`. Representative results (Apple M-series, single core)
 - `populate_from_ack()` — parses `RegisterPresenceAck.available_relays`
 - Stale detection (`needs_reprobe()`, `stale_entries()`)
 - `RegisterPresenceAck` extended with `relay_region` and `available_relays`
+
+#### Hard NAT Port Allocation Detection (`stun.rs` Phase A)
+- `PortAllocation` enum: `PortPreserving` / `Sequential { delta }` / `Random` / `Unknown`
+- `detect_port_allocation()` — sequential STUN probes from single socket, analyzes external port sequence
+- `classify_port_allocation()` — pure classifier with wraparound handling, jitter tolerance (±1), 60% threshold for noisy sequences
+- `predict_ports(last_port, delta, offset, spread)` — generates target port range for sequential NATs
+- `HardNatProbe` signal message for peer coordination (carries port_sequence, allocation, external_ip)
+- Relay forwards `HardNatProbe` to call peer
+- `NetcheckReport.port_allocation` field populated automatically
+- 17 new tests for classification, prediction, serde, Display
+
+#### Relay End-to-End Wiring (2026-04-14)
+- `CallRegistry` stores + cross-wires `caller_mapped_addr`/`callee_mapped_addr` into `CallSetup.peer_mapped_addr`
+- `RelayConfig` extended with `region` + `advertised_addr` fields
+- `RegisterPresenceAck` populates `relay_region` from config, `available_relays` from federation peers
+- Desktop `place_call`/`answer_call` call `acquire_port_mapping()` and fill mapped addr fields
+- Legacy `build-android-docker.sh` renamed to `build-android-docker-LEGACY.sh` to prevent accidental use
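
Reviewer note: the classification and prediction rules documented in this patch can be illustrated with a minimal Rust sketch. It is not the `wzp-client` code. The function names and the `PortAllocation` variants come from the docs above, but the signatures, the `i16` delta type, the `wrapped_delta` helper, the tie-breaking rule, and the reading of `offset` and `spread` (centre at `last_port + delta * (offset + 1)`, then ±`spread` ports) are assumptions chosen to reproduce the documented example.

```rust
use std::cmp::Reverse;
use std::collections::HashMap;

/// Variant names follow the docs; the field type (i16) is an assumption.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PortAllocation {
    PortPreserving,
    Sequential { delta: i16 },
    Random,
    Unknown,
}

/// Hypothetical helper: signed port step modulo 65536,
/// so 65535 -> 1 is read as +2 rather than -65534.
fn wrapped_delta(prev: u16, next: u16) -> i16 {
    next.wrapping_sub(prev) as i16
}

/// Pure classifier over the externally observed ports, in probe order.
pub fn classify_port_allocation(ports: &[u16]) -> PortAllocation {
    // Fewer than 3 successful probes: not enough evidence.
    if ports.len() < 3 {
        return PortAllocation::Unknown;
    }
    if ports.iter().all(|&p| p == ports[0]) {
        return PortAllocation::PortPreserving;
    }
    let deltas: Vec<i16> = ports
        .windows(2)
        .map(|w| wrapped_delta(w[0], w[1]))
        .collect();

    // Dominant delta = most frequent step. Ties prefer the smaller
    // magnitude, then the larger value, so the result is deterministic.
    let mut counts: HashMap<i16, usize> = HashMap::new();
    for &d in &deltas {
        *counts.entry(d).or_insert(0) += 1;
    }
    let dominant = counts
        .iter()
        .map(|(&d, &c)| (d, c))
        .max_by_key(|&(d, c)| (c, Reverse(d.unsigned_abs()), d))
        .map(|(d, _)| d)
        .unwrap_or(0);

    // Jitter tolerance: a step within +/-1 of the dominant delta still agrees.
    let agreeing = deltas
        .iter()
        .filter(|&&d| (i32::from(d) - i32::from(dominant)).abs() <= 1)
        .count();

    // 60% threshold, in integer arithmetic.
    if agreeing * 10 >= deltas.len() * 6 {
        if dominant == 0 {
            // Assumption: a mostly-stable port counts as port-preserving.
            PortAllocation::PortPreserving
        } else {
            PortAllocation::Sequential { delta: dominant }
        }
    } else {
        PortAllocation::Random
    }
}

/// Candidate ports for a sequential NAT, wrapping modulo 65536.
pub fn predict_ports(last_port: u16, delta: i16, offset: u16, spread: u16) -> Vec<u16> {
    let centre = i64::from(last_port) + i64::from(delta) * (i64::from(offset) + 1);
    let spread = i64::from(spread);
    (-spread..=spread)
        .map(|k| (centre + k).rem_euclid(65536) as u16)
        .collect()
}
```

Under these assumptions, `predict_ports(40005, 1, 0, 2)` yields the five ports shown in the ARCHITECTURE.md diagram. Working on wrapped signed deltas keeps the wraparound case out of the threshold logic entirely, and the integer comparison avoids floating-point rounding at exactly 60%.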