docs: protocol audit 2026-05-25, update architecture + Obsidian vault

Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings
  (4 critical, 2 high, 5 medium, 4 low) with code references and fix
  effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit
  items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project
  structure; wire format updated to v2 (16B header, 5B MiniHeader);
  relay concurrency section corrected (DashMap+RwLock is current, not
  a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702;
  current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (✅/🟡/🔴/🔲
  per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader
  updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1);
  version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/,
  Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Siavash Sameni
2026-05-25 06:00:17 +04:00
parent 12b0d9738f
commit ed8a7ae5aa
120 changed files with 22781 additions and 65 deletions


@@ -0,0 +1,165 @@
---
tags: [prd, wzp]
type: prd
---
# PRD: Video Quality Controller + PriorityMode
> **Status:** proposed
> **Resolves:** Road-to-video Phase V5 (video adaptive controller, audio-priority gate, ScreenShare slide-mode).
> **Depends on:** PRD #3 (BWE), PRD #5 (video v1).
## Problem
Audio and video share a finite bandwidth budget. The FaceTime model (audio has absolute priority, video is elastic on top) is right for the default voice/video call, but it is wrong for screen sharing and presentations, where a frozen slide deck is worse than slightly degraded audio.
We need: a single `VideoQualityController` consuming BWE, with a policy gate driven by a user/product-selectable `PriorityMode`.
## Goals
- `PriorityMode` enum carried on `QualityProfile`.
- Per-mode allocation gates: `AudioFirst`, `VideoFirst`, `ScreenShare`, `Balanced`.
- Mid-call `SetPriorityMode` signal for runtime override.
- ScreenShare slide-fallback: when bandwidth drops below SD video floor, encoder switches to single-I-frame-every-N-seconds mode (no wire format change).
- Sensible defaults per call type (voice/video call → AudioFirst; presentation app → ScreenShare).
## Non-goals
- Multi-stream priority (e.g., one HD + one screen-share in the same session — separate work).
- Custom user-defined modes; only the four enum variants.
## Design
### `PriorityMode`
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
pub enum PriorityMode {
AudioFirst, // default for voice/video calls
VideoFirst, // user override
ScreenShare, // video + slide fallback; audio = intelligible speech only
Balanced, // proportional split
}
```
Carried on `QualityProfile`:
```rust
pub struct QualityProfile {
    ...
    pub priority_mode: PriorityMode, // default AudioFirst
    pub video_bitrate_kbps: Option<u32>,
    pub video_resolution: Option<(u16, u16)>,
    pub video_fps: Option<u8>,
}
```
Mid-call change:
```rust
SignalMessage::SetPriorityMode {
    version: u8,
    mode: PriorityMode,
}
```
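Because the controller later stores the mode in an `AtomicU8`, applying a `SetPriorityMode` signal needs a stable mapping between the enum and a byte. The following is a minimal sketch of that round trip, not the project's implementation: the numeric values, `to_u8`/`from_u8`, and the `ModeCell` wrapper are all hypothetical, and the serde derives are left out so the block compiles without external crates.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PriorityMode {
    AudioFirst,
    VideoFirst,
    ScreenShare,
    Balanced,
}

impl PriorityMode {
    /// Hypothetical byte encoding; the PRD does not fix numeric values.
    pub fn to_u8(self) -> u8 {
        match self {
            PriorityMode::AudioFirst => 0,
            PriorityMode::VideoFirst => 1,
            PriorityMode::ScreenShare => 2,
            PriorityMode::Balanced => 3,
        }
    }

    /// Returns `None` for bytes that do not name a known mode.
    pub fn from_u8(raw: u8) -> Option<Self> {
        match raw {
            0 => Some(PriorityMode::AudioFirst),
            1 => Some(PriorityMode::VideoFirst),
            2 => Some(PriorityMode::ScreenShare),
            3 => Some(PriorityMode::Balanced),
            _ => None,
        }
    }
}

/// Hypothetical lock-free holder for the current mode.
pub struct ModeCell(AtomicU8);

impl ModeCell {
    pub fn new(mode: PriorityMode) -> Self {
        ModeCell(AtomicU8::new(mode.to_u8()))
    }

    /// Called when a `SetPriorityMode` signal arrives mid-call.
    pub fn apply_signal(&self, mode: PriorityMode) {
        self.0.store(mode.to_u8(), Ordering::Release);
    }

    /// Falls back to `AudioFirst` (the documented default) on an unknown byte.
    pub fn current(&self) -> PriorityMode {
        PriorityMode::from_u8(self.0.load(Ordering::Acquire))
            .unwrap_or(PriorityMode::AudioFirst)
    }
}
```

Falling back to `AudioFirst` on an unknown byte is an assumption made here; rejecting the signal would be an equally reasonable choice.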
### Allocation gates
```
let bwe = bandwidth_estimator.target_send_bps();
match priority_mode {
    AudioFirst => {
        audio_budget = max(24_kbps, audio_tier_min); // audio floor first
        video_budget = bwe.saturating_sub(audio_budget);
        // video → 0 before audio degrades below floor
    }
    VideoFirst => {
        video_budget = max(video_floor, target_video_bps);
        audio_budget = bwe.saturating_sub(video_budget);
        // audio degrades to Opus 16k floor first
    }
    ScreenShare => {
        // Audio gets just enough for intelligible speech.
        audio_budget = 16_kbps;
        video_budget = bwe.saturating_sub(audio_budget);
        if video_budget < SD_VIDEO_FLOOR {
            encoder.set_mode(EncoderMode::SlideFallback);
        }
    }
    Balanced => {
        audio_budget = (bwe as f64 * 0.15) as u64;
        video_budget = bwe - audio_budget;
    }
}
```
```
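The gates above are pseudocode. One way to turn them into a pure, testable function is sketched below. The `Budget` and `AllocInputs` types are hypothetical, the constants take the 24 kbps, 16 kbps, and "e.g., 150 kbps" figures from this PRD as assumed values, and the `Balanced` split uses integer arithmetic in place of the float multiply. The sketch mirrors the pseudocode as written, so `VideoFirst` does not clamp audio at the 16 kbps floor mentioned in its comment.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PriorityMode {
    AudioFirst,
    VideoFirst,
    ScreenShare,
    Balanced,
}

/// Result of one allocation pass (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Budget {
    pub audio_bps: u64,
    pub video_bps: u64,
    pub slide_fallback: bool,
}

/// Values the pseudocode reads from its surroundings (hypothetical struct).
#[derive(Debug, Clone, Copy)]
pub struct AllocInputs {
    pub bwe_bps: u64,
    pub audio_tier_min_bps: u64,
    pub video_floor_bps: u64,
    pub target_video_bps: u64,
}

// Assumed values taken from the figures quoted in this PRD.
const AUDIO_FLOOR_BPS: u64 = 24_000;
const SPEECH_ONLY_BPS: u64 = 16_000;
const SD_VIDEO_FLOOR_BPS: u64 = 150_000;

pub fn allocate(mode: PriorityMode, input: AllocInputs) -> Budget {
    let bwe = input.bwe_bps;
    match mode {
        PriorityMode::AudioFirst => {
            // Audio floor is reserved first; video takes what is left.
            let audio = AUDIO_FLOOR_BPS.max(input.audio_tier_min_bps);
            Budget { audio_bps: audio, video_bps: bwe.saturating_sub(audio), slide_fallback: false }
        }
        PriorityMode::VideoFirst => {
            // Video is reserved first; audio takes what is left.
            let video = input.video_floor_bps.max(input.target_video_bps);
            Budget { audio_bps: bwe.saturating_sub(video), video_bps: video, slide_fallback: false }
        }
        PriorityMode::ScreenShare => {
            // Audio gets a fixed speech-only budget; slide fallback is
            // requested when video falls below the SD floor.
            let video = bwe.saturating_sub(SPEECH_ONLY_BPS);
            Budget {
                audio_bps: SPEECH_ONLY_BPS,
                video_bps: video,
                slide_fallback: video < SD_VIDEO_FLOOR_BPS,
            }
        }
        PriorityMode::Balanced => {
            // 15% to audio, computed in integers (divide first to avoid overflow).
            let audio = bwe / 100 * 15;
            Budget { audio_bps: audio, video_bps: bwe - audio, slide_fallback: false }
        }
    }
}
```

Returning `slide_fallback` as data, rather than calling the encoder inside the match, keeps the policy free of side effects; the caller can then decide when to switch encoder mode.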
### `VideoQualityController`
```rust
pub struct VideoQualityController {
    bwe: Arc<BandwidthEstimator>,
    mode: AtomicU8, // PriorityMode
    encoder: Arc<dyn VideoEncoder>,
    loss_pct: AtomicU8,
    rtt_ms: AtomicU32,
    encoder_queue_ms: AtomicU32,
}
impl VideoQualityController {
    pub fn tick(&self) {
        let budget = self.allocate();
        let target = self.derive_target(budget); // (bitrate, fps, resolution, layer)
        self.encoder.set_target(target);
    }
}
```
`derive_target` maps `(budget, loss, rtt, queue)` to encoder parameters via a step table. The output is smoothed so that no parameter changes by more than 2× per second.
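The 2× per second limit can be read as a multiplicative rate limiter on the target bitrate. The sketch below is one possible interpretation, not the PRD's algorithm: `limit_step` is a hypothetical helper, it assumes the limit scales as `2^elapsed` for ticks shorter or longer than one second, and it assumes a ramp from zero is unconstrained because a ratio to zero is undefined.

```rust
/// Clamp `desired_bps` so it is at most a factor of 2 per second away from
/// `prev_bps`, in either direction. `elapsed_secs` is the time since the
/// previous target was applied.
pub fn limit_step(prev_bps: u64, desired_bps: u64, elapsed_secs: f64) -> u64 {
    if prev_bps == 0 {
        // Assumption: starting from zero there is no ratio to limit.
        return desired_bps;
    }
    // Negative or NaN elapsed time is treated as zero, giving factor 1.
    let factor = 2f64.powf(elapsed_secs.max(0.0));
    let hi = prev_bps as f64 * factor;
    let lo = prev_bps as f64 / factor;
    (desired_bps as f64).clamp(lo, hi).round() as u64
}
```

With a 1 s tick, a jump from 500 kbps toward 2 Mbps would be held at 1 Mbps on the first tick and reach 2 Mbps on the second.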
### ScreenShare slide-fallback
Pure encoder policy:
- Normal video: continuous frames, target fps (5-15 for screen content).
- When `video_budget < SD_VIDEO_FLOOR` (e.g., 150 kbps): switch to slide mode.
- Slide mode: emit one high-quality I-frame every 2-5 s. No P-frames. Encoder prefers H.265 or AV1 (text legibility).
- Wire format: `KeyFrame=1` on every packet, `FrameEnd=1` on last packet of slide. No new fields.
Receiver doesn't know slide mode is on — just sees keyframes arriving slowly.
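The flag rule for a slide (`KeyFrame=1` on every packet, `FrameEnd=1` on the last) can be illustrated with a small packetizer. This is a sketch under stated assumptions: `SlidePacket` and `packetize_slide` are hypothetical names, the flags are modelled as booleans because the bit layout is defined in WZP-SPEC.md rather than here, and the payload size is supplied by the caller.

```rust
/// Hypothetical in-memory view of one packet of a slide.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SlidePacket {
    pub key_frame: bool,
    pub frame_end: bool,
    pub payload: Vec<u8>,
}

/// Split one encoded I-frame into packets of at most `max_payload` bytes.
/// Every packet is marked as a keyframe; only the final one ends the frame.
/// An empty frame yields no packets.
pub fn packetize_slide(frame: &[u8], max_payload: usize) -> Vec<SlidePacket> {
    assert!(max_payload > 0, "max_payload must be non-zero");
    let chunks: Vec<&[u8]> = frame.chunks(max_payload).collect();
    let last = chunks.len().saturating_sub(1);
    chunks
        .iter()
        .enumerate()
        .map(|(i, chunk)| SlidePacket {
            key_frame: true,
            frame_end: i == last,
            payload: chunk.to_vec(),
        })
        .collect()
}
```

Nothing in the output distinguishes a slide from an ordinary keyframe, which matches the point that the receiver needs no knowledge of slide mode.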
### Defaults
| Product flow | Default mode |
|---|---|
| Voice call | AudioFirst (no video) |
| Video call | AudioFirst |
| Screen share | ScreenShare |
| User toggle in settings | VideoFirst or Balanced |
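The table can be expressed as a default lookup plus an optional user override. The sketch below is illustrative only: `ProductFlow`, `default_mode`, and `effective_mode` are hypothetical names, and treating the settings toggle as an `Option` that wins over the default is an assumption about how the last row is meant to apply.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PriorityMode {
    AudioFirst,
    VideoFirst,
    ScreenShare,
    Balanced,
}

/// Hypothetical enumeration of the product flows in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ProductFlow {
    VoiceCall,
    VideoCall,
    ScreenShare,
}

/// Default mode per product flow, as listed in the table.
pub fn default_mode(flow: ProductFlow) -> PriorityMode {
    match flow {
        ProductFlow::VoiceCall | ProductFlow::VideoCall => PriorityMode::AudioFirst,
        ProductFlow::ScreenShare => PriorityMode::ScreenShare,
    }
}

/// A user setting, when present, takes precedence over the flow default.
pub fn effective_mode(flow: ProductFlow, user_override: Option<PriorityMode>) -> PriorityMode {
    user_override.unwrap_or_else(|| default_mode(flow))
}
```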
## Implementation outline
1. `PriorityMode` enum + serde + `QualityProfile` field (T5.1).
2. `SetPriorityMode` signal variant (T5.1).
3. `VideoQualityController::new` + `tick` (T5.2).
4. Per-mode allocation gates (T5.2).
5. `EncoderMode::SlideFallback` in `wzp-video` (T5.3).
6. Integration: `CallEngine` honors `SetPriorityMode` within 1 s.
7. UI plumbing for runtime toggle (out of scope here; tracked by platform team).
## Acceptance criteria
- 100 kbps shaped link, `AudioFirst`: audio holds Opus 24 k, video drops to 0.
- 100 kbps shaped link, `ScreenShare`: audio holds Opus 16 k, video in slide mode emits 1 I-frame / 3 s.
- 100 kbps shaped link, `VideoFirst`: audio drops to Opus 16 k, video holds floor.
- 5 Mbps link, `AudioFirst`: video reaches HD within 10 s.
- `SetPriorityMode` mid-call applied within 1 s.
## Risks
- **Mode flapping under unstable BWE.** Mitigation: 10 s dwell time before allowing mode-driven encoder reconfiguration.
- **Slide mode mistaken for poor connection by users.** Mitigation: UI indicator distinguishing "slide mode active" from "poor connection".
- **AudioFirst floor too aggressive for low-bandwidth music calls.** Mitigation: when audio profile is `Opus 64k music`, floor raised to 48 k.
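The dwell-time mitigation for mode flapping could be sketched as a small gate that refuses reconfiguration until enough time has passed since the last accepted one. `DwellGate` is a hypothetical type. This sketch gates automatic, BWE-driven reconfiguration only; whether an explicit `SetPriorityMode` should bypass the gate is not stated in this PRD.

```rust
use std::time::{Duration, Instant};

/// Hypothetical gate enforcing a minimum time between reconfigurations.
pub struct DwellGate {
    dwell: Duration,
    last_change: Option<Instant>,
}

impl DwellGate {
    pub fn new(dwell: Duration) -> Self {
        DwellGate { dwell, last_change: None }
    }

    /// Returns true and records `now` if a reconfiguration is allowed;
    /// returns false if the previous one happened less than `dwell` ago.
    pub fn try_reconfigure(&mut self, now: Instant) -> bool {
        match self.last_change {
            Some(prev) if now.saturating_duration_since(prev) < self.dwell => false,
            _ => {
                self.last_change = Some(now);
                true
            }
        }
    }
}
```

Passing `now` in as a parameter, instead of calling `Instant::now()` inside the gate, keeps the behaviour deterministic under test.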
## Effort
~6 engineer-days (Wave 5 tasks T5.1-T5.3).