docs: protocol audit 2026-05-25, update architecture + Obsidian vault

Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings (4 critical, 2 high, 5 medium, 4 low) with code references and fix effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project structure; wire format updated to v2 (16B header, 5B MiniHeader); relay concurrency section corrected (DashMap+RwLock is current, not a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702; current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (✅/🟡/🔴/🔲 per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1); version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/, Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 06:00:17 +04:00
parent 12b0d9738f
commit ed8a7ae5aa
120 changed files with 22781 additions and 65 deletions
--- a/vault/PRDs/PRD-video-simulcast.md
+++ b/vault/PRDs/PRD-video-simulcast.md
@@ -0,0 +1,111 @@
+---
+tags: [prd, wzp]
+type: prd
+---
+
+# PRD: Simulcast + Per-Receiver Layer Selection
+
+> **Status:** proposed
+> **Resolves:** Road-to-video Phases V5 + V6 (simulcast at sender, layer selection at SFU).
+> **Depends on:** PRD #5 (video v1), PRD #7 (VideoQualityController).
+
+## Problem
+
+In a multi-peer video room, peers have wildly different link quality. A single uplink stream forces a choice: encode for the worst peer (everyone sees SD) or encode for the best peer (poor peers drop out). Simulcast solves this: the sender uploads multiple independent layers, and the SFU forwards the appropriate layer to each receiver based on that receiver's current link quality.
+
+WZP's v2 wire format already reserves `stream_id: u8` for this. This PRD wires it up.
+
+## Goals
+
+- Sender emits 2–3 simultaneous H.264/H.265/AV1 streams per source (different bitrate/resolution).
+- Each layer tagged by `stream_id` (0 = base/SD, 1 = mid/HD, 2 = high/FHD).
+- SFU selects per-receiver which layer to forward, based on that receiver's last `QualityReport` / BWE.
+- Layer switches are seamless (next keyframe boundary) and don't require sender involvement.
+- Mixed-quality rooms work: best peer gets FHD, worst peer gets SD, no peer holds the room back.
+
+## Non-goals
+
+- SVC (per-layer temporal scalability within one bitstream). Simulcast achieves the same outcome with a simpler encoder.
+- Audio simulcast (audio is small; not worth the encode cost).
+
+## Design
+
+### Sender side
+
+Three encoder instances per source:
+
+| `stream_id` | Resolution | Target bitrate | Frame rate |
+|---|---|---|---|
+| 0 (low) | 480×270 | 150 kbps | 15 fps |
+| 1 (mid) | 960×540 | 600 kbps | 30 fps |
+| 2 (high) | 1920×1080 | 2.5 Mbps | 30 fps |
+
+The resolution/bitrate ladder is configurable per profile. All encoders share the same input frames, which are downsampled for the low and mid layers.
+
+Each layer is an independent stream with its own `sequence`, `timestamp_ms`, and FEC blocks. It is identified on the wire by the `stream_id` byte in `MediaHeader` v2.
+
+### SFU forwarding
+
+`RoomManager` per-receiver state:
+
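+```rust
+use std::sync::atomic::{AtomicU32, AtomicU8};
+
+/// Per-receiver forwarding state held by `RoomManager`.
+pub struct ReceiverState {
+    fingerprint: Fingerprint,
+    bwe_kbps: AtomicU32,       // latest bandwidth estimate, in kbps
+    loss_pct: AtomicU8,        // latest reported packet loss, in percent
+    selected_layer: AtomicU8,  // `stream_id` being forwarded; per (sender, source_stream)
+}
+```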
+
+Layer selection logic (run periodically per receiver):
+
+```
+if receiver.bwe_kbps > HIGH_THRESHOLD && receiver.loss_pct < 2:
+    selected_layer = high
+elif receiver.bwe_kbps > MID_THRESHOLD:
+    selected_layer = mid
+else:
+    selected_layer = low
+```
+
+Hysteresis: a receiver must qualify for the new tier continuously for 3 s before the SFU switches.
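+
+A minimal sketch of the selection rule combined with the 3 s hold, under stated assumptions: the threshold values, `LayerSelector`, and `target_layer` are hypothetical (this PRD names `HIGH_THRESHOLD` and `MID_THRESHOLD` but gives no numbers), and the hold is applied to both upgrades and downgrades.
+
+```rust
+use std::time::{Duration, Instant};
+
+// Hypothetical values; the PRD does not specify the thresholds.
+const HIGH_THRESHOLD_KBPS: u32 = 3_000;
+const MID_THRESHOLD_KBPS: u32 = 800;
+const HOLD: Duration = Duration::from_secs(3);
+
+/// Tier implied by the latest report (0 = low, 1 = mid, 2 = high).
+fn target_layer(bwe_kbps: u32, loss_pct: u8) -> u8 {
+    if bwe_kbps > HIGH_THRESHOLD_KBPS && loss_pct < 2 {
+        2
+    } else if bwe_kbps > MID_THRESHOLD_KBPS {
+        1
+    } else {
+        0
+    }
+}
+
+/// Hypothetical per-receiver hysteresis state.
+struct LayerSelector {
+    selected: u8,
+    pending: Option<(u8, Instant)>, // candidate tier and when it was first seen
+}
+
+impl LayerSelector {
+    fn new() -> Self {
+        Self { selected: 0, pending: None }
+    }
+
+    /// Feed one report; returns the layer to forward after applying the hold.
+    fn update(&mut self, bwe_kbps: u32, loss_pct: u8, now: Instant) -> u8 {
+        let target = target_layer(bwe_kbps, loss_pct);
+        if target == self.selected {
+            self.pending = None; // back on the current tier: cancel the candidate
+        } else {
+            match self.pending {
+                Some((tier, since)) if tier == target => {
+                    if now.duration_since(since) >= HOLD {
+                        self.selected = target;
+                        self.pending = None;
+                    }
+                }
+                // New or changed candidate: restart the hold timer.
+                _ => self.pending = Some((target, now)),
+            }
+        }
+        self.selected
+    }
+}
+```
+
+Passing `now` in explicitly keeps the selector deterministic under test. A production version would presumably read `bwe_kbps` and `loss_pct` from the atomics in `ReceiverState`.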
+
+On layer switch:
+- SFU continues forwarding the old layer until the next keyframe arrives on the new layer.
+- If no keyframe on the new layer within 500 ms, SFU emits PLI to sender for that layer.
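+
+The two switch rules above could be sketched as a small per-receiver state machine. `LayerSwitch`, `Decision`, and the packet-driven timeout check are assumptions made for illustration; a real SFU might drive the 500 ms deadline from a timer instead.
+
+```rust
+use std::time::{Duration, Instant};
+
+const PLI_TIMEOUT: Duration = Duration::from_millis(500);
+
+/// What the SFU should do with one incoming packet (hypothetical type).
+struct Decision {
+    forward: bool,        // forward this packet to the receiver
+    send_pli: Option<u8>, // layer to request a keyframe for, if any
+}
+
+/// Hypothetical per-receiver switch state for one sender.
+struct LayerSwitch {
+    current: u8,
+    pending: Option<(u8, Instant)>, // target layer and when the switch was requested
+    pli_sent: bool,
+}
+
+impl LayerSwitch {
+    /// Record that the selection logic picked a new layer.
+    fn request(&mut self, target: u8, now: Instant) {
+        self.pending = if target != self.current { Some((target, now)) } else { None };
+        self.pli_sent = false;
+    }
+
+    /// Decide per packet: keep the old layer until a keyframe arrives on the new one.
+    fn on_packet(&mut self, stream_id: u8, is_keyframe: bool, now: Instant) -> Decision {
+        let mut send_pli = None;
+        if let Some((target, since)) = self.pending {
+            if stream_id == target && is_keyframe {
+                // Keyframe on the new layer: complete the switch here.
+                self.current = target;
+                self.pending = None;
+                return Decision { forward: true, send_pli: None };
+            }
+            if !self.pli_sent && now.duration_since(since) >= PLI_TIMEOUT {
+                // No keyframe within 500 ms: ask the sender once, for that layer only.
+                self.pli_sent = true;
+                send_pli = Some(target);
+            }
+        }
+        Decision { forward: stream_id == self.current, send_pli }
+    }
+}
+```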
+
+### Per-layer keyframe cache
+
+The PRD #5 keyframe cache is extended to hold one entry per `(room, sender, stream_id)`. A new joiner receives the most recent keyframe from the layer matched to their BWE.
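+
+As an illustration only, a cache keyed by `(room, sender, stream_id)` might look like the sketch below. `KeyframeCache`, the string identifiers, and the raw `Vec<u8>` payload are assumptions; the actual PRD #5 cache types are not shown in this document.
+
+```rust
+use std::collections::HashMap;
+
+/// (room, sender, stream_id); string identifiers are an assumption.
+type CacheKey = (String, String, u8);
+
+/// Hypothetical per-layer keyframe cache: one most-recent keyframe per key.
+#[derive(Default)]
+struct KeyframeCache {
+    entries: HashMap<CacheKey, Vec<u8>>,
+}
+
+impl KeyframeCache {
+    /// Replace the cached keyframe for this layer with the newest one.
+    fn store(&mut self, room: &str, sender: &str, stream_id: u8, keyframe: Vec<u8>) {
+        self.entries
+            .insert((room.to_owned(), sender.to_owned(), stream_id), keyframe);
+    }
+
+    /// Look up the keyframe for the layer already chosen for a new joiner.
+    fn for_joiner(&self, room: &str, sender: &str, layer: u8) -> Option<&[u8]> {
+        self.entries
+            .get(&(room.to_owned(), sender.to_owned(), layer))
+            .map(|k| k.as_slice())
+    }
+}
+```
+
+The joiner's layer would come from the same selection logic used for established receivers, so the cache itself stays unaware of BWE.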
+
+### Layer-aware PLI suppression
+
+PLI is layer-scoped. Sender refreshes only the requested layer, not all three.
+
+## Implementation outline
+
+1. `VideoQualityController` extended to drive 3 encoder instances per source (T5.5).
+2. Frame distributor: downsample input frame for low/mid layers before encode.
+3. Per-layer state on `MediaHeader` (already in v2 via `stream_id`).
+4. SFU `ReceiverState` and selection logic (T5.6).
+5. Per-layer keyframe cache (extension of PRD #5).
+6. Per-layer PLI plumbing.
+7. Telemetry: `wzp_room_layer_distribution{stream_id}` histogram.
+
+## Acceptance criteria
+
+- 3-encoder uplink works on M1 within 8 % CPU at 1080p30 / 540p30 / 270p15.
+- 4-peer room with shaped links (5 Mbps, 1 Mbps, 500 kbps, 100 kbps): each peer receives the highest layer their link supports.
+- Layer switch under improving link conditions occurs within 5 s of bandwidth recovery.
+- No peer's bandwidth degradation holds back any other peer.
+
+## Risks
+
+- **3-encoder CPU cost on mid/low-end Android.** Mitigation: dynamic layer count — drop high layer if encoder queue grows; some devices may only support 2 layers.
+- **Frame-rate drift between layers** (independent encoders running). Mitigation: shared frame clock; low/mid layers drop frames if needed to stay aligned.
+- **SFU per-receiver state bloat.** Mitigation: only allocate state for active receivers; 80 B/receiver/sender bound.
+- **Layer switch causing brief visible flicker.** Mitigation: switch only at keyframes; UI may show momentary resolution change but no glitch.
+
+## Effort
+
+~7 engineer-days (Wave 5 tasks T5.5 + T5.6).