# WZP Protocol Audit > Protocol-level review of WZP as of 2026-05-11. See `WZP-SPEC.md` for the spec being audited. ## Strengths - **QUIC datagrams instead of raw UDP + SRTP** — buys TLS 1.3, PLPMTUD, path migration, and ACK-based loss/RTT estimation. Quinn's `PathSnapshot` feeding `DredTuner` is something WebRTC stacks build from scratch. - **Continuous DRED tuning.** Mapping RTT / loss / jitter to a continuous Opus DRED lookback window is genuinely better than discrete tiers — most stacks treat DRED as on/off. - **MiniHeader (49/50).** At 50 pps that is ~400 B/s saved per stream; meaningful at scale. - **SFU never decodes.** Preserves E2E. Most SFUs (LiveKit, Janus) terminate SRTP at the SFU. - **RaptorQ for low-bitrate Codec2 + DRED for Opus.** Correct split — DRED is cheaper than FEC at high bitrate; RaptorQ shines when you can afford many small symbols. ## Weaknesses ### W1. `u16` sequence wraps every ~21 minutes at 50 pps Anti-replay window is 64 packets so wrap is safe for replay. **But** the jitter buffer's `BTreeMap` will misorder across the wrap boundary if a packet is delayed more than ~32 k frames. Widen to `u32` (or version the field). ### W2. `fec_block_id: u8` wraps every 256 blocks (~25 s at 5-frame blocks) A late-joining peer or a slow reconstructor can collide block IDs. Widen to `u16` or carry an epoch counter. ### W3. `timestamp_ms` rebase behavior at rekey is unspecified Rekey every 65,536 packets (~22 min). If `timestamp_ms` resets, downstream sync glitches. If it does not, document explicitly. ### W4. `MiniHeader` has no `seq` Receiver infers absolute seq from the most recent full header + frame count. One missed full header (every 50 frames = 1 s) leaves 49 packets with unknown absolute seq. Acceptable for audio with short jitter buffers — **fatal for video** where one missed full header can desync an entire GOP. **Add `seq_delta: u8` to MiniHeader before video lands.** ### W5. `QualityReport` placement vs. AEAD A 4-byte trailer on encrypted media is fine **iff it sits inside the AEAD payload**. If it is outside, anything stripping the last 4 bytes corrupts decryption and creates a downgrade vector. Verify in `packet.rs`; if outside, move it inside or AAD-bind it. ### W6. Adaptive controller is loss / RTT-only — no bandwidth estimator Quinn exposes `cwnd` and `bytes_in_flight`, but `AdaptiveQualityController` does not consume them. Under low utilization you cannot detect that you *could* upgrade to Opus 64 k. **For video this is mandatory** — without BWE you will either oscillate or never use available capacity. ### W7. No NACK / explicit retransmit path For audio with DRED + FEC this is fine. For video keyframes it is wasteful — an I-frame is 50–200 packets, protecting at 50 % FEC doubles bitrate. A NACK path is cheap and far cheaper than blanket FEC for I-frames. ### W8. TrunkFrame batching multiplies AEAD cost Each inner payload is its own AEAD operation. At 10 entries that is 10× ChaCha calls per recv. Fine on x86 / ARM with AES-NI / NEON; profile on weak Android (Nothing A059 baseline). ### W9. `CodecID` is 4 bits → max 16 codecs; 9 already used Adding H.264, H.265, AV1, VP9 takes you to 13. Land the widening **before** deployment — either steal from `reserved` / `csrc_count` to make CodecID 8-bit, or split into `MediaType:2 / CodecID:6`. Doing this post-deployment is painful. ### W10. No `MediaType` field Audio vs. video vs. data is implicit in CodecID. A 2-bit `MediaType` lets the SFU apply per-type policy (drop video first under congestion, prioritize audio fan-out) without a codec lookup. ### W11. Anti-replay window 64 packets is tight for video One keyframe burst can be 100+ packets; a single reordered earlier packet stalls the window. Bump to 256 or 1024 for video streams, or maintain a per-stream window. ### W12. `SignalMessage` has no version byte Bincode + `#[serde(default, skip_serializing_if)]` covers field additions but not variant removal or semantic change. Lead every variant with `version: u8`. ### W13. RoomManager Mutex per-packet — **RESOLVED** Already flagged in `ARCHITECTURE.md`. At ~1500 pps/sender for video this becomes a real ceiling. **Resolution (T3.1):** `RoomManager` now stores `DashMap>>` instead of `DashMap`. The DashMap guard is held only long enough to clone the `Arc`; all per-room operations (fan-out `others()`, quality `observe_quality()`, join/leave) then acquire the room-level `std::sync::RwLock`. This lets concurrent `others()` calls share a read lock while writers hold the write lock, eliminating the per-packet DashMap contention that was the original concern. ### W14. No receiver → sender congestion feedback beyond inline QualityReport For video you need REMB-style or transport-CC-style explicit BWE feedback at ~50 ms cadence, independent of media packets. ## Priorities | Priority | Issue | Why | |---|---|---| | P0 | W9 (CodecID width), W10 (MediaType), W4 (MiniHeader seq_delta) | Wire-format changes — must land before video, painful to change post-deploy | | P0 | W1 (seq u16 → u32) | Same window; audio benefits too | | P1 | W6 (BWE), W14 (transport feedback) | Blocking for usable video; improves audio adaptation | | P1 | W5 (QualityReport in AEAD) | Security correctness | | P2 | W2 (fec_block_id width), W11 (anti-replay window), W12 (signal version byte) | Long-tail correctness | | P2 | W7 (NACK path), W13 (RoomManager lock) | Video performance, not correctness | | P3 | W3 (timestamp rebase doc), W8 (AEAD profiling) | Documentation / measurement | ## Resolution status (2026-05-11) The v2 wire format specified in `ROAD-TO-VIDEO.md` Phase V1 addresses: | Issue | Resolved by | |---|---| | W1 (seq u16 → u32) | `sequence: u32` in MediaHeader v2 | | W4 (MiniHeader seq) | `seq_delta: u8` added; MiniHeader v2 is 5 B | | W9 (CodecID width) | Widened to 8-bit (room for 256) | | W10 (MediaType) | Explicit `media_type: u8` byte | W6 / W14 (BWE + TransportFeedback) addressed in Phase V2. W7 (NACK) addressed in Phase V2 / V4. Others remain open. ## Known pre-existing clippy debt (as of T1.5.2) Measured at commit `c93d302` on `experimental-ui` (2026-05-11). `cargo clippy --workspace --all-targets -- -D warnings` fails in two crates with **pre-existing** errors (verified against `HEAD~1`). These are not introduced by any Wave 1 task; they should be cleaned up in a dedicated hygiene sprint or accepted as known debt. ### `wzp-codec` — 9 errors | Category | Count | Lint | Files | |---|---|---|---| | Manual saturating sub | 1 | `clippy::implicit_saturating_sub` | `aec.rs:117` | | Needless range loop | 2 | `clippy::needless_range_loop` | `aec.rs:164`, `resample.rs:51` | | Manual `div_ceil` | 2 | `clippy::manual_div_ceil` | `codec2_dec.rs:48`, `codec2_enc.rs:48` | | Manual `clamp` | 2 | `clippy::manual_clamp` | `denoise.rs:59`, `opus_enc.rs:250` | | Manual ASCII case-cmp | 1 | `clippy::manual_ascii_check` | `opus_enc.rs:99` | | Same-item push in loop | 1 | `clippy::same_item_push` | `resample.rs:184` | ### `warzone-protocol` (submodule `deps/featherchat`) — 3 errors | Category | Count | Lint | Files | |---|---|---|---| | `clone` on `Copy` type | 1 | `clippy::clone_on_copy` | `ratchet.rs:202` | | Missing `Default` impl | 2 | `clippy::new_without_default` | `types.rs:59`, `types.rs:69` | **Policy:** New tasks must not add *new* clippy errors in crates they touch. The 12 errors above are grandfathered; a follow-up cleanup task should be scheduled to fix them (especially the `wzp-codec` ones, which are straightforward mechanical replacements).