feat(nat): Tailscale-inspired STUN/ICE + port mapping + mid-call re-gathering (#28)

Phase 8: 5 new modules bringing NAT traversal close to Tailscale's approach.

- stun.rs: RFC 5389 STUN client (public server reflexive discovery, XOR-MAPPED-ADDRESS parsing, parallel probe with retry, STUN fallback in desktop try_reflect_own_addr())
- portmap.rs: NAT-PMP (RFC 6886) + PCP (RFC 6887) + UPnP IGD port mapping (gateway discovery, acquire/release/refresh lifecycle, new PeerCandidates.mapped candidate type in dial order)
- ice_agent.rs: candidate lifecycle (gather(), re_gather(), apply_peer_update() with monotonic generation counter, CandidateUpdate signal message forwarded by relay)
- netcheck.rs: comprehensive diagnostic (NAT type, IPv4/v6, port mapping availability, relay latencies, CLI --netcheck)
- relay_map.rs: RTT-sorted relay map, preferred() selection, populate_from_ack() for RegisterPresenceAck.available_relays

Relay: CallRegistry stores + cross-wires caller/callee_mapped_addr into CallSetup.peer_mapped_addr. Region config + available_relays populated from federation peers in RegisterPresenceAck.

Desktop: place_call/answer_call call acquire_port_mapping() and fill caller/callee_mapped_addr. STUN+relay combined NAT detection.

571 tests pass (66 new), 0 regressions, 0 warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@@ -1100,3 +1100,82 @@ BT SCO only supports 8/16kHz. When `bt_active=1`, Oboe capture skips `setSampleR
### Hangup Signal Fix

`SignalMessage::Hangup` now carries an optional `call_id` field. The relay uses it to end only the specified call instead of broadcasting to all active calls for the user, which prevents a race where a hangup for call 1 kills a newly placed call 2.

## Phase 8: Tailscale-Inspired NAT Traversal (2026-04-14)

Five new modules in `wzp-client` bring NAT traversal capability close to Tailscale's approach:

```
┌──────────────────────────────────────────────────────────────────────┐
│                    wzp-client NAT Traversal Stack                    │
│                                                                      │
│  ┌─────────────┐  ┌──────────────┐  ┌──────────────────────────┐     │
│  │ stun.rs     │  │ portmap.rs   │  │ reflect.rs (existing)    │     │
│  │ RFC 5389    │  │ NAT-PMP      │  │ Relay-based STUN         │     │
│  │ Public      │  │ PCP          │  │ Multi-relay NAT detect   │     │
│  │ STUN        │  │ UPnP IGD     │  │                          │     │
│  └──────┬──────┘  └──────┬───────┘  └────────────┬─────────────┘     │
│         │                │                       │                   │
│         └────────────────┼───────────────────────┘                   │
│                          │                                           │
│                  ┌───────▼────────┐                                  │
│                  │ ice_agent.rs   │                                  │
│                  │ Gather / Re-   │                                  │
│                  │  gather / Apply│                                  │
│                  └───────┬────────┘                                  │
│                          │                                           │
│              ┌───────────┼───────────┐                               │
│              │           │           │                               │
│      ┌───────▼───┐   ┌───▼───┐   ┌───▼──────────┐                    │
│      │ netcheck  │   │ dual_ │   │ relay_map.rs │                    │
│      │ .rs       │   │ path  │   │ RTT-sorted   │                    │
│      │ Diagnostic│   │ .rs   │   │ relay list   │                    │
│      └───────────┘   │ Race  │   └──────────────┘                    │
│                      └───────┘                                       │
└──────────────────────────────────────────────────────────────────────┘
```

### Candidate Types

| Type | Source | Priority | When Used |
|------|--------|----------|-----------|
| Host | `local_host_candidates()` | 1 (highest) | Same-LAN peers |
| Port-mapped | `portmap::acquire_port_mapping()` | 2 | Router supports NAT-PMP/PCP/UPnP |
| Server-reflexive | `stun::discover_reflexive()` or relay Reflect | 3 | Cone NAT |
| Relay | Relay address (fallback) | 4 (lowest) | Always available |
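
The priority column can be read as a dial order. The following is a minimal sketch of that ordering, not the actual `wzp-client` code: `CandidateKind`, `Candidate`, and `dial_order()` are hypothetical names introduced only for illustration, and the numeric values simply mirror the table.

```rust
use std::net::SocketAddr;

/// Hypothetical candidate kinds; the discriminants mirror the priority
/// column of the table (lower value = dialed first).
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum CandidateKind {
    Host = 1,
    PortMapped = 2,
    ServerReflexive = 3,
    Relay = 4,
}

/// Hypothetical candidate record: a kind plus the address to dial.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Candidate {
    kind: CandidateKind,
    addr: SocketAddr,
}

/// Sort candidates into dial order: host, port-mapped, reflexive, relay.
/// `sort_by_key` is stable, so candidates of the same kind keep their
/// original relative order.
fn dial_order(mut candidates: Vec<Candidate>) -> Vec<Candidate> {
    candidates.sort_by_key(|c| c.kind);
    candidates
}
```

Passing a list that starts with a relay candidate and ends with a host candidate returns the same entries with the host candidate first and the relay candidate last.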

### Signal Flow for Mid-Call Re-Gathering

```
Network change (WiFi → cellular)
    │
    ▼
IceAgent::re_gather()
    ├── stun::discover_reflexive()
    ├── portmap::acquire_port_mapping()
    └── local_host_candidates()
    │
    ▼
SignalMessage::CandidateUpdate { generation: N+1, ... }
    │
    ▼ (via relay)
Peer's IceAgent::apply_peer_update()
    │
    ▼
PeerCandidates { reflexive, local, mapped }
    │
    ▼
dual_path::race() with new candidates (TODO: transport hot-swap)
```
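
The `generation` field in the flow above is what lets the receiving side discard updates that arrive late or out of order. A minimal sketch of such a check follows, assuming a `u64` counter. `PeerState` and the field layout of `CandidateUpdate` are hypothetical; only the `generation` idea and the `apply_peer_update()` name come from the text.

```rust
use std::net::SocketAddr;

/// Hypothetical shape of a mid-call update. Only `generation` is taken from
/// the text; the other fields are assumptions for illustration.
#[derive(Debug, Clone)]
struct CandidateUpdate {
    generation: u64,
    reflexive: Option<SocketAddr>,
    mapped: Option<SocketAddr>,
    local: Vec<SocketAddr>,
}

/// Hypothetical holder for the newest peer candidate set seen so far.
#[derive(Debug, Default)]
struct PeerState {
    last_generation: Option<u64>,
    current: Option<CandidateUpdate>,
}

impl PeerState {
    /// Accept the update only if its generation is strictly newer than any
    /// generation applied before. Returns `true` when the update was applied.
    fn apply_peer_update(&mut self, update: CandidateUpdate) -> bool {
        if let Some(last) = self.last_generation {
            if update.generation <= last {
                return false; // stale or duplicate update: ignore it
            }
        }
        self.last_generation = Some(update.generation);
        self.current = Some(update);
        true
    }
}
```

With this rule, an update carrying generation 1 that arrives after generation 2 has been applied is dropped, so the peer never falls back to an older candidate set.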

### New SignalMessage Variants & Fields

| Signal | New Fields | Purpose |
|--------|-----------|---------|
| `DirectCallOffer` | `caller_mapped_addr` | Port-mapped address from NAT-PMP/PCP/UPnP |
| `DirectCallAnswer` | `callee_mapped_addr` | Same, callee side |
| `CallSetup` | `peer_mapped_addr` | Relay cross-wires mapped addr to peer |
| `CandidateUpdate` | (new variant) | Mid-call candidate re-gathering |
| `RegisterPresenceAck` | `relay_region`, `available_relays` | Relay mesh metadata for auto-selection |

All new fields use `#[serde(default, skip_serializing_if)]` for backward compatibility with older clients/relays.
@@ -105,15 +105,25 @@ Sentinel value `0xFF` means "no change pending". The recv task polls on every re

~~The Tauri engine doesn't use `AdaptiveQualityController`; quality is resolved once at call start.~~ **Update (2026-04-13):** Desktop now has `AdaptiveQualityController` wired into the recv task with `pending_profile` AtomicU8 bridge. Network monitoring on desktop is now feasible: the blocker was adaptive quality, which is done. Remaining work: platform-specific network change detection (macOS: `SCNetworkReachability` or `NWPathMonitor`; Linux: `netlink` socket).

### Mid-Call ICE Re-gathering
### Mid-Call ICE Re-gathering: PARTIALLY IMPLEMENTED (2026-04-14)

When the device's IP address changes, ideally we should:
1. Re-gather local host candidates (`local_host_candidates()`)
2. Re-probe STUN (`probe_reflect_addr()`)
3. Send updated candidates to the peer (`CandidateUpdate` signal message)
4. Attempt new dual-path race for path upgrade
When the device's IP address changes, the flow is now:
1. Re-gather local host candidates (`local_host_candidates()`) ✅
2. Re-probe STUN and re-acquire the port mapping (`stun::discover_reflexive()` + `portmap::acquire_port_mapping()`) ✅
3. Send updated candidates to the peer (`CandidateUpdate` signal message) ✅
4. Relay forwards `CandidateUpdate` to peer (same pattern as `MediaPathReport`) ✅
5. Peer receives the update and can parse it via `IceAgent::apply_peer_update()` ✅
6. Attempt new dual-path race for path upgrade: **NOT YET WIRED** (transport hot-swap)

`NetworkMonitor.onIpChanged` fires on `onLinkPropertiesChanged`; the hook is ready, but the signaling and re-racing logic is not yet implemented.
`NetworkMonitor.onIpChanged` fires on `onLinkPropertiesChanged`; the hook is ready.
The signaling plane is fully implemented via `IceAgent` + `CandidateUpdate`.
Remaining: wire `onIpChanged` → JNI → `pending_ice_regather` AtomicBool → recv task → `ice_agent.re_gather()` → transport swap.
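
The remaining wiring hinges on a one-bit handoff between the platform callback and the recv task. The sketch below shows one way such a flag could work, assuming a process-wide `AtomicBool`. The static's name mirrors `pending_ice_regather` from the text, while `on_ip_changed()` and `take_regather_request()` are hypothetical helpers, not existing functions.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Set by the platform network callback, consumed by the recv task.
/// Hypothetical sketch: the real flag may live inside the call engine state.
static PENDING_ICE_REGATHER: AtomicBool = AtomicBool::new(false);

/// Called from the network-change hook (for example via JNI on Android).
fn on_ip_changed() {
    PENDING_ICE_REGATHER.store(true, Ordering::Release);
}

/// Called once per recv-loop iteration. `swap` clears the flag and returns
/// the previous value in a single atomic step, so several IP-change events
/// between two polls collapse into one re-gather request.
fn take_regather_request() -> bool {
    PENDING_ICE_REGATHER.swap(false, Ordering::AcqRel)
}
```

In the recv loop this would be polled as `if take_regather_request() { /* call ice_agent.re_gather() */ }`, which keeps the callback side free of any locking.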

New modules added in Phase 8 (Tailscale-inspired):
- `crates/wzp-client/src/ice_agent.rs`: candidate lifecycle management
- `crates/wzp-client/src/stun.rs`: public STUN server probing (independent of relay)
- `crates/wzp-client/src/portmap.rs`: NAT-PMP/PCP/UPnP port mapping
- `crates/wzp-client/src/netcheck.rs`: comprehensive network diagnostic

## Testing

@@ -142,11 +142,17 @@ The existing relay connection carries `IceCandidate` signals. No new infrastruct
|-------|-------|--------|--------|
| 1 | STUN client + candidate gathering | 2 days | Done |
| 2 | QUIC hole punching + identity verification | 3 days | Done |
| 3 | Adaptive quality on P2P connection | 2 days | Pending (needs 5-tier classification, task #9) |
| 3 | Adaptive quality on P2P connection | 2 days | Done (#23) |
| 4 | Hybrid mode (relay + P2P, seamless migration) | 3 days | Done |
| 5 | Single-socket Nebula (shared signal+direct endpoint) | 2 days | Done |
| 6 | ICE path negotiation + dual-path race | 3 days | Done |
| 7 | IPv6 dual-socket | 2 days | Done (but `dual_path.rs` integration tests are broken: missing `ipv6_endpoint` arg) |
| 8.1 | Public STUN client (RFC 5389) | 1 day | Done |
| 8.2 | PCP/PMP/UPnP port mapping | 2 days | Done |
| 8.3 | Mid-call ICE re-gathering + CandidateUpdate signal | 2 days | Done (signal plane; transport hot-swap TODO) |
| 8.4 | Netcheck diagnostic | 1 day | Done |
| 8.5 | Region-based relay selection (data model) | 1 day | Done |
| 8.6 | Hard NAT traversal (birthday attack) | n/a | Deferred |

## Implementation Status (2026-04-13)

@@ -162,3 +168,38 @@ P2P adaptive quality (#23) now implemented:
- Both peers self-observe network quality from QUIC path stats
- Quality reports generated every ~1s and attached to outgoing packets
- AdaptiveQualityController drives codec switching on both P2P and relay calls

## Update (2026-04-14): Phase 8 — Tailscale-Inspired Enhancements

Added 5 new modules to bring NAT traversal capability close to Tailscale's:

### Phase 8.1: Public STUN Client (Done)
- `stun.rs`: RFC 5389 Binding Request/Response over raw UDP
- Independent reflexive discovery via public STUN servers (Google, Cloudflare)
- `detect_nat_type_with_stun()` combines relay + STUN probes for higher confidence
- STUN fallback in desktop's `try_reflect_own_addr()` when relay reflection fails

### Phase 8.2: PCP/PMP/UPnP Port Mapping (Done)
- `portmap.rs`: NAT-PMP (RFC 6886), PCP (RFC 6887), UPnP IGD
- Gateway discovery (macOS + Linux), try NAT-PMP → PCP → UPnP in sequence
- New candidate type: `PeerCandidates.mapped` + signal fields `caller_mapped_addr`/`callee_mapped_addr`/`peer_mapped_addr`
- Dial order: host → mapped → reflexive (mapped helps on symmetric NATs)
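
To make the NAT-PMP step concrete, here is a minimal sketch of the RFC 6886 wire format for a UDP mapping request and its response. It is not the code in `portmap.rs`: `encode_map_request()`, `decode_map_response()`, and `MapResponse` are hypothetical names, and socket I/O, retries, and timeouts are left out.

```rust
/// NAT-PMP opcode for a UDP mapping request (RFC 6886).
const OP_MAP_UDP: u8 = 1;

/// Build the 12-byte mapping request that is sent to the gateway on UDP 5351.
fn encode_map_request(internal_port: u16, suggested_external: u16, lifetime_secs: u32) -> [u8; 12] {
    let mut buf = [0u8; 12];
    buf[0] = 0; // version 0 identifies NAT-PMP
    buf[1] = OP_MAP_UDP;
    // bytes 2..4 are reserved and stay zero
    buf[4..6].copy_from_slice(&internal_port.to_be_bytes());
    buf[6..8].copy_from_slice(&suggested_external.to_be_bytes());
    buf[8..12].copy_from_slice(&lifetime_secs.to_be_bytes());
    buf
}

/// Hypothetical decoded view of a successful mapping response.
#[derive(Debug, PartialEq, Eq)]
struct MapResponse {
    external_port: u16,
    lifetime_secs: u32,
}

/// Parse the 16-byte mapping response. Returns `None` when the reply is too
/// short, has the wrong version or opcode, or carries a non-zero result code.
fn decode_map_response(buf: &[u8]) -> Option<MapResponse> {
    if buf.len() < 16 || buf[0] != 0 || buf[1] != 128 + OP_MAP_UDP {
        return None;
    }
    let result_code = u16::from_be_bytes([buf[2], buf[3]]);
    if result_code != 0 {
        return None;
    }
    // bytes 4..8: seconds since the gateway's epoch (ignored in this sketch)
    // bytes 8..10: internal port, echoed back by the gateway
    let external_port = u16::from_be_bytes([buf[10], buf[11]]);
    let lifetime_secs = u32::from_be_bytes([buf[12], buf[13], buf[14], buf[15]]);
    Some(MapResponse { external_port, lifetime_secs })
}
```

The gateway may grant a different external port or a shorter lifetime than requested, so a caller should use the decoded values rather than the ones it asked for.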

### Phase 8.3: Mid-Call ICE Re-Gathering (Done — signal plane)
- `ice_agent.rs`: `IceAgent` with `gather()`, `re_gather()`, `apply_peer_update()`
- `SignalMessage::CandidateUpdate` with monotonic generation counter
- Relay forwards `CandidateUpdate` like `MediaPathReport`
- Desktop handles and emits to JS frontend
- Transport hot-swap: designed but not yet wired into live call engine

### Phase 8.4: Netcheck Diagnostic (Done)
- `netcheck.rs`: comprehensive network diagnostic (NAT type, reflexive addr, IPv4/v6, port mapping, relay latencies)
- CLI: `wzp-client --netcheck <relay>`

### Phase 8.5: Region-Based Relay Selection (Done — data model)
- `relay_map.rs`: `RelayMap` sorted by RTT with `preferred()` selection
- `RegisterPresenceAck` extended with `relay_region` + `available_relays`

### Phase 8.6: Hard NAT Traversal (Deferred)
- Birthday-attack port prediction is deferred: its 2-5s probing latency is excessive for VoIP call setup
- Phases 8.1-8.2 cover the vast majority of NAT configurations

@@ -329,3 +329,46 @@ Run with `wzp-bench --all`. Representative results (Apple M-series, single core)
- APK signing: added zipalign + apksigner pipeline to `build.sh` (was in `build-tauri-android.sh` only)
- Keystore persistence: `$BASE_DIR/data/keystore/` cache synced into source tree before build
- Fixes: 384MB debug APK uploaded instead of 25MB release; unsigned APK on alt server

### Phase 8: Tailscale-Inspired STUN/ICE Enhancements (2026-04-14)

5 new modules in `wzp-client`, 64 new unit tests (363 total across client/proto/relay).

#### Public STUN Client (`stun.rs`)
- Minimal RFC 5389 STUN Binding Request/Response over raw UDP
- XOR-MAPPED-ADDRESS (preferred) + MAPPED-ADDRESS (fallback) parsing
- Default servers: `stun.l.google.com:19302`, `stun1.l.google.com:19302`, `stun.cloudflare.com:3478`
- `discover_reflexive()`: first-success parallel probe across N servers
- `probe_stun_servers()`: full results for NAT classification
- Integrated into `detect_nat_type_with_stun()` combining relay + STUN probes
- Desktop STUN fallback in `try_reflect_own_addr()` when relay reflection fails
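
For readers unfamiliar with the wire format, the sketch below shows a bare RFC 5389 Binding Request and IPv4 XOR-MAPPED-ADDRESS decoding. It is an illustration under stated assumptions rather than the contents of `stun.rs`: `binding_request()` and `parse_xor_mapped_v4()` are hypothetical names, IPv6 and the MAPPED-ADDRESS fallback are omitted, and a real client should also check that the response echoes its transaction ID.

```rust
use std::net::{Ipv4Addr, SocketAddr, SocketAddrV4};

const MAGIC_COOKIE: u32 = 0x2112_A442;
const BINDING_REQUEST: u16 = 0x0001;
const ATTR_XOR_MAPPED_ADDRESS: u16 = 0x0020;

/// Build a 20-byte Binding Request with no attributes.
fn binding_request(txn_id: [u8; 12]) -> [u8; 20] {
    let mut msg = [0u8; 20];
    msg[0..2].copy_from_slice(&BINDING_REQUEST.to_be_bytes());
    // msg[2..4] is the attribute length, which stays 0 here
    msg[4..8].copy_from_slice(&MAGIC_COOKIE.to_be_bytes());
    msg[8..20].copy_from_slice(&txn_id);
    msg
}

/// Walk the attributes of a STUN message and decode the first IPv4
/// XOR-MAPPED-ADDRESS. Returns `None` if it is absent or malformed.
fn parse_xor_mapped_v4(msg: &[u8]) -> Option<SocketAddr> {
    if msg.len() < 20 || msg[4..8] != MAGIC_COOKIE.to_be_bytes() {
        return None;
    }
    let body_len = u16::from_be_bytes([msg[2], msg[3]]) as usize;
    let mut rest = msg.get(20..20 + body_len)?;
    while rest.len() >= 4 {
        let attr_type = u16::from_be_bytes([rest[0], rest[1]]);
        let attr_len = u16::from_be_bytes([rest[2], rest[3]]) as usize;
        let value = rest.get(4..4 + attr_len)?;
        // value layout: reserved(1) family(1) x-port(2) x-address(4 for IPv4)
        if attr_type == ATTR_XOR_MAPPED_ADDRESS && attr_len >= 8 && value[1] == 0x01 {
            let port = u16::from_be_bytes([value[2], value[3]]) ^ (MAGIC_COOKIE >> 16) as u16;
            let raw = u32::from_be_bytes([value[4], value[5], value[6], value[7]]);
            let ip = Ipv4Addr::from(raw ^ MAGIC_COOKIE);
            return Some(SocketAddr::V4(SocketAddrV4::new(ip, port)));
        }
        // attribute values are padded to a 4-byte boundary
        let padded = (attr_len + 3) & !3;
        rest = rest.get(4 + padded..).unwrap_or(&[]);
    }
    None
}
```

The XOR with the magic cookie exists so that NATs which rewrite literal IP addresses inside payloads do not corrupt the reflexive address, which is why XOR-MAPPED-ADDRESS is preferred over MAPPED-ADDRESS.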

#### PCP/PMP/UPnP Port Mapping (`portmap.rs`)
- **NAT-PMP** (RFC 6886): UDP to gateway:5351, external address + port mapping
- **PCP** (RFC 6887): PCP MAP opcode, IPv4-mapped IPv6 client address
- **UPnP IGD**: SSDP M-SEARCH discovery + SOAP `AddPortMapping`/`GetExternalIPAddress`
- Gateway discovery: macOS (`route -n get default`), Linux (`/proc/net/route`)
- `acquire_port_mapping()` tries NAT-PMP → PCP → UPnP, first success wins
- `release_port_mapping()` + `spawn_refresh()` for lifecycle management
- Signal protocol: `caller_mapped_addr`/`callee_mapped_addr` on offer/answer, `peer_mapped_addr` on CallSetup
- `PeerCandidates.mapped`: new candidate type in dial order (host → mapped → reflexive)
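
As an illustration of the Linux gateway-discovery step, the sketch below parses the text of `/proc/net/route` to find the default gateway. `parse_default_gateway()` is a hypothetical helper, not a function from `portmap.rs`, and it assumes a little-endian host, where the kernel prints each IPv4 address as a byte-reversed hexadecimal word.

```rust
use std::net::Ipv4Addr;

/// Extract the default gateway from the contents of `/proc/net/route`.
/// Assumption: little-endian host, so "0101A8C0" means 192.168.1.1.
fn parse_default_gateway(route_table: &str) -> Option<Ipv4Addr> {
    // The first line is the column header (Iface, Destination, Gateway, ...).
    for line in route_table.lines().skip(1) {
        let fields: Vec<&str> = line.split_whitespace().collect();
        if fields.len() < 3 {
            continue;
        }
        // The default route has an all-zero destination.
        if fields[1] != "00000000" {
            continue;
        }
        let raw = match u32::from_str_radix(fields[2], 16) {
            Ok(v) => v,
            Err(_) => continue,
        };
        if raw == 0 {
            continue; // on-link default route without a gateway address
        }
        // Undo the little-endian byte order to recover the dotted-quad order.
        return Some(Ipv4Addr::from(raw.to_le_bytes()));
    }
    None
}
```

A caller would typically feed it `std::fs::read_to_string("/proc/net/route")` and then send the NAT-PMP or PCP request to the returned address on port 5351.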

#### Mid-Call ICE Re-Gathering (`ice_agent.rs`)
- `IceAgent`: owns candidate lifecycle with `gather()`, `re_gather()`, `apply_peer_update()`
- Monotonic generation counter prevents stale or reordered candidate updates from being applied
- `SignalMessage::CandidateUpdate`: new signal for mid-call candidate exchange
- Relay forwards `CandidateUpdate` to call peer (same pattern as `MediaPathReport`)
- Desktop handles `CandidateUpdate` in signal recv loop, emits to JS frontend
- Transport hot-swap architecture designed (TODO: wire into live call engine)

#### Netcheck Diagnostic (`netcheck.rs`)
- `NetcheckReport`: NAT type, reflexive addr, IPv4/v6, port mapping, relay latencies, gateway
- `run_netcheck()`: parallel probes for STUN + relay + portmap + IPv6
- `format_report()`: human-readable diagnostic output
- CLI: `wzp-client --netcheck <relay>` runs diagnostic

#### Region-Based Relay Selection (`relay_map.rs`)
- `RelayMap` sorted by RTT, `preferred()` returns lowest-latency reachable relay
- `populate_from_ack()`: parses `RegisterPresenceAck.available_relays`
- Stale detection (`needs_reprobe()`, `stale_entries()`)
- `RegisterPresenceAck` extended with `relay_region` and `available_relays`
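
The relay-selection data model can be pictured as a small RTT-sorted list. The sketch below is one possible shape under stated assumptions, not the actual `relay_map.rs`: `preferred()` and `stale_entries()` are named in the list above, but their signatures here, along with `RelayEntry`, `insert()`, and `record_rtt()`, are hypothetical.

```rust
use std::time::{Duration, Instant};

/// Hypothetical relay entry; field names are illustrative.
#[derive(Debug, Clone)]
struct RelayEntry {
    addr: String,
    region: String,
    rtt: Option<Duration>,      // None = not probed yet or unreachable
    probed_at: Option<Instant>, // when `rtt` was last measured
}

#[derive(Debug, Default)]
struct RelayMap {
    entries: Vec<RelayEntry>,
}

impl RelayMap {
    /// Add a relay that has not been probed yet.
    fn insert(&mut self, addr: &str, region: &str) {
        self.entries.push(RelayEntry {
            addr: addr.to_string(),
            region: region.to_string(),
            rtt: None,
            probed_at: None,
        });
    }

    /// Store a fresh RTT sample and keep the list sorted:
    /// probed relays by ascending RTT, unprobed relays last.
    fn record_rtt(&mut self, addr: &str, rtt: Duration, now: Instant) {
        if let Some(e) = self.entries.iter_mut().find(|e| e.addr == addr) {
            e.rtt = Some(rtt);
            e.probed_at = Some(now);
        }
        self.entries.sort_by_key(|e| (e.rtt.is_none(), e.rtt));
    }

    /// Lowest-latency relay that has a measured RTT, if any.
    fn preferred(&self) -> Option<&RelayEntry> {
        self.entries.iter().find(|e| e.rtt.is_some())
    }

    /// Entries never probed or whose last probe is older than `max_age`.
    fn stale_entries(&self, now: Instant, max_age: Duration) -> Vec<&RelayEntry> {
        self.entries
            .iter()
            .filter(|e| match e.probed_at {
                Some(t) => now.saturating_duration_since(t) > max_age,
                None => true,
            })
            .collect()
    }
}
```

Sorting on the tuple `(rtt.is_none(), rtt)` keeps unreachable or unprobed relays at the end without a separate list, so `preferred()` only has to return the first probed entry.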