feat(dred): continuous DRED tuning, PMTUD, extended Opus6k window

- DredTuner: maps live network metrics (loss/RTT/jitter) to continuous DRED duration every ~500ms instead of discrete tier-locked values. Includes jitter-spike detection for pre-emptive Starlink-style boost.
- Opus6k DRED extended from 500ms to 1040ms (max libopus 1.5 supports)
- PMTUD: quinn MtuDiscoveryConfig with upper_bound=1452, 300s interval
- TrunkedForwarder respects discovered MTU (was hard-coded 1200)
- QuinnPathSnapshot exposes quinn internal stats + discovered MTU
- AudioEncoder trait: set_expected_loss() + set_dred_duration() methods
- PathMonitor: sliding-window jitter variance for spike detection
- Integrated into both Android and desktop send tasks in engine.rs
- 14 new tests (10 tuner unit + 4 encoder integration)
- Updated ARCHITECTURE.md, PROGRESS.md, PRD-dred-integration, PRD-mtu

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:38:37 +04:00
parent 24cc74d93c
commit 766c9df442
19 changed files with 829 additions and 32 deletions
--- a/docs/ARCHITECTURE.md
+++ b/docs/ARCHITECTURE.md
@@ -103,11 +103,13 @@ sequenceDiagram
    participant RNN as RNNoise<br/>(2 x 480)
    participant VAD as SilenceDetector
    participant Codec as Opus / Codec2
+    participant DT as DredTuner<br/>(wzp-proto)
    participant FEC as RaptorQ FEC
    participant INT as Interleaver<br/>(depth=3)
    participant HDR as MediaHeader<br/>(12B or Mini 4B)
    participant Enc as ChaCha20-Poly1305
    participant QUIC as QUIC Datagram
+    participant QPS as QuinnPathSnapshot

    Mic->>Ring: f32 x 512 (macOS callback)
    Ring->>Ring: Accumulate to 960 samples
@@ -118,10 +120,19 @@ sequenceDiagram
    else Silence (>100ms)
        VAD->>Codec: ComfortNoise (every 200ms)
    end
-    Codec->>FEC: Compressed bytes (pad to 256B symbol)
-    FEC->>FEC: Accumulate block (5-10 symbols)
-    FEC->>INT: Source + repair symbols
-    INT->>HDR: Interleaved packets
+
+    Note over QPS,DT: Every 25 frames (~500ms)
+    QPS->>DT: loss_pct, rtt_ms, jitter_ms
+    DT->>Codec: set_dred_duration() + set_expected_loss()
+
+    alt Opus tier (any bitrate)
+        Codec->>HDR: Compressed bytes + DRED side-channel (no RaptorQ)
+    else Codec2 tier
+        Codec->>FEC: Compressed bytes (pad to 256B symbol)
+        FEC->>FEC: Accumulate block (5-10 symbols)
+        FEC->>INT: Source + repair symbols
+        INT->>HDR: Interleaved packets
+    end
    HDR->>Enc: Header as AAD
    Enc->>QUIC: Encrypted payload + 16B tag
 ```
@@ -134,6 +145,9 @@ sequenceDiagram
 - Silence detection uses VAD + 100ms hangover before switching to ComfortNoise
 - FEC symbols are padded to **256 bytes** with a 2-byte LE length prefix
 - MiniHeaders (4 bytes) replace full headers (12 bytes) for 49 of every 50 frames
+- DRED tuner polls quinn path stats every 25 frames (~500ms) and adjusts DRED lookback duration continuously
+- Opus tiers bypass RaptorQ entirely -- DRED handles loss recovery at the codec layer
+- Opus6k DRED window: 1040ms (maximum libopus allows)

 ## Audio Decode Pipeline

@@ -154,13 +168,30 @@ sequenceDiagram
    Dec->>AR: Decrypt (header = AAD)
    AR->>AR: Check seq window (reject replay)
    AR->>HDR: Verified packet
-    HDR->>DEINT: MediaHeader + payload
-    DEINT->>FEC: Reordered symbols by block
-    FEC->>FEC: Attempt decode (need K of K+R)
-    FEC->>JIT: Recovered audio frames
+
+    alt Opus packet
+        HDR->>JIT: Direct to jitter buffer (no FEC/interleave)
+    else Codec2 packet
+        HDR->>DEINT: MediaHeader + payload
+        DEINT->>FEC: Reordered symbols by block
+        FEC->>FEC: Attempt decode (need K of K+R)
+        FEC->>JIT: Recovered audio frames
+    end
+
    JIT->>JIT: BTreeMap ordered by seq
    JIT->>JIT: Wait until depth >= target
-    JIT->>Codec: Pop lowest seq frame
+
+    alt Packet present
+        JIT->>Codec: Pop lowest seq frame
+    else Packet missing (Opus)
+        JIT->>Codec: DRED reconstruction (neural)
+        alt DRED fails or unavailable
+            Codec->>Codec: Classical PLC fallback
+        end
+    else Packet missing (Codec2)
+        Codec->>Codec: Classical PLC
+    end
+
    Codec->>Ring: PCM i16 x 960
    Ring->>SPK: Audio callback pulls samples
 ```
@@ -172,6 +203,8 @@ sequenceDiagram
 - Jitter buffer target: **10 packets (200ms)** for client, **50 packets (1s)** for relay
 - Desktop client uses **direct playout** (no jitter buffer) with lock-free ring
 - Codec2 frames at 8 kHz are resampled to 48 kHz transparently
+- DRED reconstruction: on packet loss, decoder tries neural DRED reconstruction before falling back to classical PLC
+- Sender-side jitter-spike detection pre-emptively boosts the DRED duration to the tier ceiling when jitter variance spikes by more than 30%

 ## Relay SFU Forwarding

@@ -348,7 +381,7 @@ Used for 49 of every 50 frames (~1s cycle). Saves 8 bytes per packet (67% header
  [session_id: 2][len: u16][payload: len]  x count
 ```

-Packs multiple session packets into one QUIC datagram. Maximum 10 entries or 1200 bytes, flushed every 5ms.
+Packs multiple session packets into one QUIC datagram. Maximum 10 entries or PMTUD-discovered MTU (starts at 1200, grows to ~1452 on Ethernet), flushed every 5ms.
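
The entry and byte limits above can be sketched as a small batcher. This is an illustrative sketch only, not the actual `TrunkedForwarder`: the `TrunkBatch` type and its methods are hypothetical, little-endian field encoding is an assumption, and any datagram-level header outside the repeated entries is ignored.

```rust
/// Per-entry overhead taken from the layout above:
/// 2-byte session_id + 2-byte u16 length.
const ENTRY_OVERHEAD: usize = 4;
/// Entry cap stated in the text.
const MAX_ENTRIES: usize = 10;

/// Hypothetical batcher (illustration only, not the real TrunkedForwarder).
struct TrunkBatch {
    buf: Vec<u8>,
    entries: usize,
    max_bytes: usize,
}

impl TrunkBatch {
    fn new(mtu: usize) -> Self {
        Self { buf: Vec::with_capacity(mtu), entries: 0, max_bytes: mtu }
    }

    /// Refresh the byte budget from the most recently discovered MTU.
    fn set_max_bytes(&mut self, mtu: usize) {
        self.max_bytes = mtu;
    }

    /// Append one session packet. Returns false when the batch is full
    /// (entry cap or byte budget) and the caller should flush first.
    fn try_push(&mut self, session_id: u16, payload: &[u8]) -> bool {
        let needed = ENTRY_OVERHEAD + payload.len();
        if self.entries >= MAX_ENTRIES
            || payload.len() > u16::MAX as usize
            || self.buf.len() + needed > self.max_bytes
        {
            return false;
        }
        // ASSUMPTION: little-endian encoding for both fixed-width fields.
        self.buf.extend_from_slice(&session_id.to_le_bytes());
        self.buf.extend_from_slice(&(payload.len() as u16).to_le_bytes());
        self.buf.extend_from_slice(payload);
        self.entries += 1;
        true
    }

    /// Hand the accumulated bytes to the caller and reset the batch.
    fn flush(&mut self) -> Vec<u8> {
        self.entries = 0;
        std::mem::take(&mut self.buf)
    }
}
```

With 246-byte payloads (250 bytes per entry), a 1200-byte budget holds four entries, while a 1452-byte budget holds a fifth.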

 ### QualityReport (4 bytes, optional trailer)

@@ -361,6 +394,40 @@ Byte 3: bitrate_cap_kbps (0-255 kbps)

 Appended to a media packet when the Q flag is set in the MediaHeader.

+## Path MTU Discovery
+
+Quinn's PLPMTUD is enabled with:
+- `initial_mtu`: 1200 bytes (QUIC minimum, always safe)
+- `upper_bound`: 1452 bytes of UDP payload (1500-byte Ethernet MTU minus the 40-byte IPv6 and 8-byte UDP headers)
+- `interval`: 300s (re-probe every 5 minutes)
+- `black_hole_cooldown`: 30s (faster retry on lossy links)
+
+The discovered MTU is exposed via `QuinnPathSnapshot::current_mtu` and used by:
+- `TrunkedForwarder`: refreshes `max_bytes` on every send to fill larger datagrams
+- Future video framer: larger MTU = fewer application-layer fragments per frame
+
+## Continuous DRED Tuning
+
+Instead of locking DRED duration to 3 discrete quality tiers, the `DredTuner` (in `wzp-proto::dred_tuner`) maps live path quality to a continuous DRED duration:
+
+| Input | Source | Update Rate |
+|-------|--------|-------------|
+| Loss % | `QuinnPathSnapshot::loss_pct` (from quinn ACK frames) | Every 25 packets (~500ms) |
+| RTT ms | `QuinnPathSnapshot::rtt_ms` (quinn congestion controller) | Every 25 packets |
+| Jitter ms | `PathMonitor::jitter_ms` (EWMA of RTT variance) | Every 25 packets |
+
+### Mapping Logic
+
+- **Baseline**: codec-tier default (Studio=100ms, Good=200ms, Degraded=500ms)
+- **Ceiling**: codec-tier max (Studio=300ms, Good=500ms, Degraded=1040ms)
+- **Continuous**: linear interpolation between baseline and ceiling based on loss (0%->baseline, 40%->ceiling)
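+- **RTT phantom loss**: RTT above 200ms adds a phantom-loss contribution on top of measured loss, keeping DRED generous on high-latency paths
+- **Jitter spike**: a jitter spike of more than 30% over the EWMA pre-emptively boosts DRED to the ceiling and holds it there for a ~5s cooldown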
+
+### Output
+
+`DredTuning { dred_frames: u8, expected_loss_pct: u8 }` -> fed to `CallEncoder::apply_dred_tuning()` -> `OpusEncoder::set_dred_duration()` + `set_expected_loss()`
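
The mapping described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the code in `wzp-proto::dred_tuner`: `DredTunerSketch`, `TierBounds`, the phantom-loss slope, the EWMA smoothing factor, and the 10ms-per-DRED-frame unit are all hypothetical. Only the tier bounds, the 40% loss point, the 200ms RTT threshold, the 30% spike threshold, and the ~5s cooldown come from the text.

```rust
/// Per-tier DRED bounds in milliseconds (values from the list above).
#[derive(Clone, Copy)]
struct TierBounds { baseline_ms: u32, ceiling_ms: u32 }

const STUDIO: TierBounds = TierBounds { baseline_ms: 100, ceiling_ms: 300 };
const GOOD: TierBounds = TierBounds { baseline_ms: 200, ceiling_ms: 500 };
const DEGRADED: TierBounds = TierBounds { baseline_ms: 500, ceiling_ms: 1040 };

#[derive(Debug, PartialEq)]
struct DredTuning { dred_frames: u8, expected_loss_pct: u8 }

const FULL_LOSS_PCT: f32 = 40.0;     // loss at which the ceiling is reached
const RTT_THRESHOLD_MS: f32 = 200.0; // RTT above this adds phantom loss
const SPIKE_RATIO: f32 = 1.30;       // >30% over the EWMA counts as a spike
const COOLDOWN_TICKS: u32 = 10;      // ~5s at one update per ~500ms
// ASSUMPTIONS (not given in the text):
const PHANTOM_LOSS_PER_100MS: f32 = 5.0; // phantom loss % per 100ms of excess RTT
const EWMA_ALPHA: f32 = 0.125;           // jitter smoothing factor
const MS_PER_DRED_FRAME: u32 = 10;       // unit behind `dred_frames`

/// Hypothetical tuner, for illustration only.
struct DredTunerSketch { bounds: TierBounds, jitter_ewma: Option<f32>, cooldown: u32 }

impl DredTunerSketch {
    fn new(bounds: TierBounds) -> Self {
        Self { bounds, jitter_ewma: None, cooldown: 0 }
    }

    /// One tuning step, intended to run every ~25 frames.
    fn update(&mut self, loss_pct: f32, rtt_ms: f32, jitter_ms: f32) -> DredTuning {
        // Compare the new sample with the running average before folding it in.
        let spiked = match self.jitter_ewma {
            Some(avg) if avg > 0.0 => jitter_ms > avg * SPIKE_RATIO,
            _ => false,
        };
        self.jitter_ewma = Some(match self.jitter_ewma {
            Some(avg) => avg + EWMA_ALPHA * (jitter_ms - avg),
            None => jitter_ms,
        });
        if spiked {
            self.cooldown = COOLDOWN_TICKS;
        } else {
            self.cooldown = self.cooldown.saturating_sub(1);
        }

        // Excess RTT is converted to phantom loss and added to measured loss.
        let phantom = (rtt_ms - RTT_THRESHOLD_MS).max(0.0) / 100.0 * PHANTOM_LOSS_PER_100MS;
        let effective = (loss_pct + phantom).clamp(0.0, FULL_LOSS_PCT);

        // Linear interpolation: 0% -> baseline, 40% -> ceiling.
        let span = (self.bounds.ceiling_ms - self.bounds.baseline_ms) as f32;
        let duration_ms = if self.cooldown > 0 {
            self.bounds.ceiling_ms
        } else {
            self.bounds.baseline_ms + (span * effective / FULL_LOSS_PCT).round() as u32
        };

        DredTuning {
            dred_frames: (duration_ms / MS_PER_DRED_FRAME) as u8,
            expected_loss_pct: effective.round() as u8,
        }
    }
}
```

Under these assumptions, a Degraded-tier tuner at 20% loss and low RTT lands halfway between baseline and ceiling (770ms, or 77 frames), and any loss at or above 40% pins the duration at 1040ms.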
+
 ## Signal Message Handshake Flow

 ```mermaid