docs: protocol audit 2026-05-25, update architecture + Obsidian vault

Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings
  (4 critical, 2 high, 5 medium, 4 low) with code references and fix
  effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit
  items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project
  structure; wire format updated to v2 (16B header, 5B MiniHeader);
  relay concurrency section corrected (DashMap+RwLock is current, not
  a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702;
  current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (/🟡/🔴/🔲
  per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader
  updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1);
  version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/,
  Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Siavash Sameni
2026-05-25 06:00:17 +04:00
parent 12b0d9738f
commit ed8a7ae5aa
120 changed files with 22781 additions and 65 deletions


@@ -2,7 +2,7 @@
> Distilled from `docs/ARCHITECTURE.md` and the `wzp-proto` crate. Authoritative wire details live in `crates/wzp-proto/src/packet.rs`.
>
> **Status:** v1 (audio-only) is the deployed protocol. v2 (audio + video, 16 B header, MediaType, u32 seq, etc.) is specified in `ROAD-TO-VIDEO.md` Phase V1 and supersedes this document when implemented.
> **Status:** v2 is the deployed protocol (audio + video, 16 B header, MediaType, u32 seq). v1 clients are rejected with `Hangup::ProtocolVersionMismatch`.
## Layer summary
@@ -16,42 +16,47 @@
| Loss recovery | **RaptorQ FEC + Opus DRED + classical PLC** | NACK / PLI + reference-picture selection |
| Adaptive | 3-tier hysteresis (Good / Degraded / Catastrophic) + continuous DRED tuner | Per-frame bitrate ladder |
| Topology | SFU rooms + inter-relay federation + P2P via ICE | Mesh ≤ ~3, SFU above, Apple relays |
| Header | 12 B `MediaHeader` / 4 B `MiniHeader` (49 of 50), 4 B `QualityReport` trailer | RTP 12 B + extensions |
| Header | 16 B `MediaHeader` v2 / 5 B `MiniHeader` (49 of 50), 4 B `QualityReport` trailer | RTP 12 B + extensions |
## Distinctive choices
- **QUIC datagrams instead of raw UDP + SRTP.** Brings TLS 1.3, PLPMTUD, path migration, and ACK-based RTT/loss estimation for free.
- **Continuous DRED tuning.** Maps live `(loss%, RTT, jitter)` to a continuous Opus DRED lookback window. Most stacks treat DRED as discrete tiers.
- **MiniHeader (4 B for 49/50 packets).** Saves ~8 B/packet ≈ 400 B/s/stream at 50 pps.
- **MiniHeader (5 B for 49/50 packets).** Saves ~11 B/packet ≈ 550 B/s/stream at 50 pps vs. the full 16 B header.
- **E2E-preserving SFU.** The relay forwards encrypted datagrams; it never decrypts media. Room membership uses SNI = `hash(room_name)`.
- **Codec coordination via `QualityReport` trailer.** Receivers attach 4-byte loss/RTT/jitter/cap to media packets; the SFU broadcasts `QualityDirective` so all senders in a room converge on the same tier.
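The MiniHeader savings quoted above can be checked with a short calculation. This is a minimal sketch, not code from the repository: the function names are hypothetical, and it counts only the header sizes stated in this document (one full header per 50 packets, a MiniHeader on the other 49), ignoring any frame-type prefix byte.

```rust
/// Average header bytes per packet when one full header is sent every
/// `cycle` packets and a mini header is sent on the remaining packets.
/// (Hypothetical helper, for illustration only.)
pub fn avg_header_bytes(full: u32, mini: u32, cycle: u32) -> f64 {
    assert!(cycle > 0, "cycle must be at least 1");
    f64::from(full + mini * (cycle - 1)) / f64::from(cycle)
}

/// Bytes per second saved per stream, compared with sending the full
/// header on every packet, at `pps` packets per second.
pub fn saved_bytes_per_sec(full: u32, mini: u32, cycle: u32, pps: u32) -> f64 {
    (f64::from(full) - avg_header_bytes(full, mini, cycle)) * f64::from(pps)
}
```

Under these assumptions, `saved_bytes_per_sec(16, 5, 50, 50)` comes to about 539 B/s for v2 and `saved_bytes_per_sec(12, 4, 50, 50)` to about 392 B/s for v1, consistent with the rounded figures of 550 and 400 in the bullets.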
## Wire format (current — v1)
## Wire format (current — v2)
### `MediaHeader` (12 bytes)
### `MediaHeader` v2 (16 bytes, byte-aligned)
```
Byte 0: [V:1][T:1][CodecID:4][Q:1][FecRatioHi:1]
Byte 1: [FecRatioLo:6][unused:2]
Bytes 2-3: sequence (u16 BE)
Bytes 4-7: timestamp_ms (u32 BE)
Byte 8: fec_block_id (u8)
Byte 9: fec_symbol_idx (u8)
Byte 10: reserved
Byte 11: csrc_count
Byte 0: version (u8) 0x02
Byte 1: flags (u8) [T:1][Q:1][KeyFrame:1][FrameEnd:1][reserved:4]
Byte 2: media_type (u8) 0=audio, 1=video, 2=data, 3=control
Byte 3: codec_id (u8) 0-255 (see codec table)
Byte 4: stream_id (u8) simulcast layer; 0=base
Byte 5: fec_ratio (u8) 0..200 → 0.0..2.0
Bytes 6-9: sequence (u32 BE)
Bytes 10-13: timestamp_ms (u32 BE)
Bytes 14-15: fec_block_id (u16 BE)
```
| Field | Bits | Meaning |
|---|---|---|
| V | 1 | Protocol version |
| T | 1 | 1 = FEC repair packet |
| CodecID | 4 | See codec table |
| Q | 1 | QualityReport trailer present |
| FecRatio | 7 | 0..127 → 0.0..2.0 |
| sequence | 16 | Wrapping packet seq |
| version | 8 | Must be `0x02`; v1 clients receive `Hangup::ProtocolVersionMismatch` |
| T (bit 7 of flags) | 1 | 1 = FEC repair packet |
| Q (bit 6 of flags) | 1 | QualityReport trailer present |
| KeyFrame (bit 5 of flags) | 1 | Packet belongs to a video I-frame |
| FrameEnd (bit 4 of flags) | 1 | Last packet of an access unit |
| reserved (bits 3-0 of flags) | 4 | Must be zero |
| media_type | 8 | 0=audio, 1=video, 2=data, 3=control |
| codec_id | 8 | See codec table (widened from v1's 4-bit field) |
| stream_id | 8 | Simulcast layer; 0=base layer |
| fec_ratio | 8 | 0..200 → 0.0..2.0 |
| sequence | 32 | Monotonically increasing packet seq (not reset by rekey) |
| timestamp_ms | 32 | ms since session start. Monotonic across the full session; **not reset by rekey** |
| fec_block_id | 8 | FEC source block ID |
| fec_symbol_idx | 8 | Symbol index in block |
| fec_block_id | 16 | FEC source block ID |
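The v2 layout above maps directly onto a fixed-offset parser. The following is a sketch under stated assumptions, not the implementation in `crates/wzp-proto/src/packet.rs`: the struct, the `ParseError` enum, and the function name are hypothetical, and rejecting non-zero reserved bits is one possible reading of "must be zero".

```rust
/// Decoded v2 header fields (hypothetical type, for illustration only).
#[derive(Debug, PartialEq, Eq)]
pub struct MediaHeaderV2 {
    pub is_repair: bool,          // T, bit 7 of flags
    pub has_quality_report: bool, // Q, bit 6 of flags
    pub key_frame: bool,          // bit 5 of flags
    pub frame_end: bool,          // bit 4 of flags
    pub media_type: u8,
    pub codec_id: u8,
    pub stream_id: u8,
    pub fec_ratio: u8,
    pub sequence: u32,
    pub timestamp_ms: u32,
    pub fec_block_id: u16,
}

/// Hypothetical error type; the real crate may report these differently.
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    TooShort,
    VersionMismatch(u8),
    ReservedBitsSet,
}

/// Parse the first 16 bytes of `buf` as a v2 `MediaHeader`.
pub fn parse_media_header_v2(buf: &[u8]) -> Result<MediaHeaderV2, ParseError> {
    if buf.len() < 16 {
        return Err(ParseError::TooShort);
    }
    if buf[0] != 0x02 {
        return Err(ParseError::VersionMismatch(buf[0]));
    }
    let flags = buf[1];
    // Assumption: a receiver rejects packets whose reserved bits are set.
    if (flags & 0x0F) != 0 {
        return Err(ParseError::ReservedBitsSet);
    }
    Ok(MediaHeaderV2 {
        is_repair: (flags & 0x80) != 0,
        has_quality_report: (flags & 0x40) != 0,
        key_frame: (flags & 0x20) != 0,
        frame_end: (flags & 0x10) != 0,
        media_type: buf[2],
        codec_id: buf[3],
        stream_id: buf[4],
        fec_ratio: buf[5],
        sequence: u32::from_be_bytes([buf[6], buf[7], buf[8], buf[9]]),
        timestamp_ms: u32::from_be_bytes([buf[10], buf[11], buf[12], buf[13]]),
        fec_block_id: u16::from_be_bytes([buf[14], buf[15]]),
    })
}
```

Because every field is byte-aligned in v2, the parser needs no bit shifting beyond the four flag masks, which is the main practical difference from the packed v1 layout.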
### Codec table
@@ -66,13 +71,18 @@ Byte 11: csrc_count
| 6 | Opus 32k | 32 kbps | 48 kHz | 20 ms |
| 7 | Opus 48k | 48 kbps | 48 kHz | 20 ms |
| 8 | Opus 64k | 64 kbps | 48 kHz | 20 ms |
| 9 | H.264 Baseline | — | — | — |
| 10 | H.264 Main | — | — | — |
| 11 | H.265 Main | — | — | — |
| 12 | AV1 Main | — | — | — |
### `MiniHeader` (4 bytes, compressed — 49 of every 50 packets)
### `MiniHeader` v2 (5 bytes, compressed — 49 of every 50 packets)
```
[FRAME_TYPE_MINI = 0x01]
Bytes 0-1: timestamp_delta_ms (u16 BE)
Bytes 2-3: payload_len (u16 BE)
Byte 0: seq_delta (u8)
Bytes 1-2: timestamp_delta_ms (u16 BE)
Bytes 3-4: payload_len (u16 BE)
```
Full header sent every 50th packet to resync.
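The 5-byte v2 `MiniHeader` can be encoded and decoded as sketched below. The type and method names are hypothetical. The sketch also assumes that both deltas are relative to the previous packet on the same stream; the document does not say whether they are instead relative to the last full header.

```rust
/// v2 MiniHeader fields (hypothetical type, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MiniHeaderV2 {
    pub seq_delta: u8,
    pub timestamp_delta_ms: u16,
    pub payload_len: u16,
}

impl MiniHeaderV2 {
    pub const LEN: usize = 5;

    /// Serialize to the 5-byte wire layout (multi-byte fields big-endian).
    pub fn encode(&self) -> [u8; Self::LEN] {
        let ts = self.timestamp_delta_ms.to_be_bytes();
        let len = self.payload_len.to_be_bytes();
        [self.seq_delta, ts[0], ts[1], len[0], len[1]]
    }

    /// Parse the first 5 bytes of `buf`; returns `None` if it is too short.
    pub fn decode(buf: &[u8]) -> Option<Self> {
        if buf.len() < Self::LEN {
            return None;
        }
        Some(Self {
            seq_delta: buf[0],
            timestamp_delta_ms: u16::from_be_bytes([buf[1], buf[2]]),
            payload_len: u16::from_be_bytes([buf[3], buf[4]]),
        })
    }

    /// Reconstruct absolute (sequence, timestamp_ms) from the previous
    /// packet's values. Assumption: deltas are relative to the previous
    /// packet, and both counters wrap at u32::MAX.
    pub fn resolve(&self, prev_seq: u32, prev_ts_ms: u32) -> (u32, u32) {
        (
            prev_seq.wrapping_add(u32::from(self.seq_delta)),
            prev_ts_ms.wrapping_add(u32::from(self.timestamp_delta_ms)),
        )
    }
}
```

A receiver would seed `prev_seq` and `prev_ts_ms` from each full header it sees, so the resync every 50th packet bounds how far a lost MiniHeader can skew the reconstruction.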
@@ -95,6 +105,12 @@ Byte 2: jitter_ms (0-255 ms)
Byte 3: bitrate_cap_kbps (0-255 kbps)
```
### Version negotiation
- `version=0x02` in `MediaHeader` is a hard switch — there is no fallback negotiation.
- Both endpoints must speak v2. A v1 peer receives `Hangup::ProtocolVersionMismatch` immediately.
- Relays inspect only `version` and `media_type`; they never downgrade or translate between versions.
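The hard-switch rule above might look like this on the relay side. This is only a sketch: the `Hangup` enum here is a one-variant stand-in for the real type, and `MediaType`, `RelayAction`, and `relay_inspect` are hypothetical names. It assumes the datagram starts with a full `MediaHeader`, and that a datagram that is too short or carries an unknown `media_type` is dropped rather than treated as a version mismatch.

```rust
/// Stand-in for the real hangup reason type; only the variant named in
/// this document is modelled.
#[derive(Debug, PartialEq, Eq)]
pub enum Hangup {
    ProtocolVersionMismatch,
}

#[derive(Debug, PartialEq, Eq)]
pub enum MediaType {
    Audio,
    Video,
    Data,
    Control,
}

/// Hypothetical relay decision for one datagram.
#[derive(Debug, PartialEq, Eq)]
pub enum RelayAction {
    Forward(MediaType),
    Drop,
    Hangup(Hangup),
}

/// Inspect only `version` (byte 0) and `media_type` (byte 2); the
/// payload is never decrypted or translated.
pub fn relay_inspect(datagram: &[u8]) -> RelayAction {
    if datagram.len() < 16 {
        return RelayAction::Drop; // assumption: truncated headers are dropped
    }
    if datagram[0] != 0x02 {
        // No fallback negotiation: any other version ends the session.
        return RelayAction::Hangup(Hangup::ProtocolVersionMismatch);
    }
    match datagram[2] {
        0 => RelayAction::Forward(MediaType::Audio),
        1 => RelayAction::Forward(MediaType::Video),
        2 => RelayAction::Forward(MediaType::Data),
        3 => RelayAction::Forward(MediaType::Control),
        _ => RelayAction::Drop, // assumption: unknown media types are dropped
    }
}
```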
## Session lifecycle
```