From d66d583583816a8cef0d9f7ecaf6ff992fe707dc Mon Sep 17 00:00:00 2001 From: Siavash Sameni Date: Tue, 7 Apr 2026 21:25:33 +0400 Subject: [PATCH] docs: PRD for adaptive quality control (auto codec) Covers the full design for runtime codec switching based on network conditions: 3-tier basic (GOOD/DEGRADED/CATASTROPHIC), extended 5-tier with studio levels, and bandwidth probing. Details the existing QualityAdapter infrastructure, what's missing (report ingestion, profile switch loop, cross-task signaling via AtomicU8), and implementation plan for both Android and desktop engines. Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/PRD-adaptive-quality.md | 201 +++++++++++++++++++++++++++++++++++ 1 file changed, 201 insertions(+) create mode 100644 docs/PRD-adaptive-quality.md diff --git a/docs/PRD-adaptive-quality.md b/docs/PRD-adaptive-quality.md new file mode 100644 index 0000000..7f5376d --- /dev/null +++ b/docs/PRD-adaptive-quality.md @@ -0,0 +1,201 @@ +# PRD: Adaptive Quality Control (Auto Codec) + +## Problem + +When a user selects "Auto" quality, the system currently just starts at Opus 24k (GOOD) and never changes. There is no runtime adaptation — if the network degrades mid-call, audio breaks up instead of gracefully stepping down to a lower bitrate codec. Conversely, if the network is excellent, the user stays on 24k when they could have studio-quality 64k. + +The relay already sends `QualityReport` messages with loss % and RTT, and a `QualityAdapter` exists in `call.rs` that classifies network conditions into GOOD/DEGRADED/CATASTROPHIC — but none of this is wired into the Android or desktop engines. + +## Solution + +Wire the existing `QualityAdapter` into both engines so that "Auto" mode continuously monitors network quality and switches codecs mid-call. The full quality range should be used: + +``` +Excellent network → Studio 64k (best quality) +Good network → Opus 24k (default) +Degraded network → Opus 6k (lower bitrate, more FEC) +Poor network → Codec2 3.2k (vocoder, heavy FEC) +Catastrophic → Codec2 1.2k (minimum viable voice) +``` + +## Architecture + +``` + ┌─────────────────────┐ + Relay ──────────► │ QualityReport │ loss %, RTT, jitter + │ (every ~1s) │ + └────────┬────────────┘ + │ + ▼ + ┌─────────────────────┐ + │ QualityAdapter │ classify + hysteresis + │ (3-report window) │ + └────────┬────────────┘ + │ recommend new profile + ▼ + ┌──────────────┴──────────────┐ + │ │ + ▼ ▼ + ┌────────────────┐ ┌────────────────┐ + │ Encoder │ │ Decoder │ + │ set_profile() │ │ (auto-switch │ + │ + FEC update │ │ already works)│ + └────────────────┘ └────────────────┘ +``` + +## Existing Infrastructure + +### What already exists (in `crates/wzp-client/src/call.rs`) + +1. **`QualityAdapter`** (lines 97-196): + - Sliding window of `QualityReport` messages + - `classify()`: loss > 15% or RTT > 200ms → CATASTROPHIC, loss > 5% or RTT > 100ms → DEGRADED, else → GOOD + - `should_switch()`: hysteresis — requires 3 consecutive reports recommending the same profile before switching + - Prevents oscillation between profiles + +2. **`QualityReport`** (in `wzp-proto/src/packet.rs`): + - Sent by relay piggy-backed on media packets + - Fields: `loss_pct` (u8, 0-255 scaled), `rtt_4ms` (u8, RTT in 4ms units), `jitter_ms`, `bitrate_cap_kbps` + +3. **`CallEncoder::set_profile()`** / **`CallDecoder` auto-switch**: + - Encoder can switch codec mid-stream + - Decoder already auto-detects incoming codec from packet headers + +### What's missing + +1. **QualityReport ingestion** — neither Android engine nor desktop engine reads quality reports from the relay +2. **Profile switch loop** — no periodic check that feeds reports to `QualityAdapter` and applies recommended switches +3. **Upward adaptation** — `QualityAdapter` only classifies into 3 tiers (GOOD/DEGRADED/CATASTROPHIC). Needs extension to recommend studio tiers when conditions are excellent (loss < 1%, RTT < 50ms) +4. **Notification to UI** — when quality changes, the UI should show the current active codec + +## Requirements + +### Phase 1: Basic Adaptive (3-tier) + +**Both Android and Desktop:** + +1. **Ingest QualityReports**: In the recv loop, extract `quality_report` from incoming `MediaPacket`s when present. Feed to `QualityAdapter`. + +2. **Periodic quality check**: Every 1 second (or on each QualityReport), call `adapter.should_switch(¤t_profile)`. If it returns `Some(new_profile)`: + - Switch the encoder: `encoder.set_profile(new_profile)` + - Update FEC encoder: `fec_enc = create_encoder(&new_profile)` + - Update frame size if changed (e.g., 20ms → 40ms) + - Log the switch + +3. **Frame size adaptation on switch**: When switching from 20ms to 40ms frames (or vice versa): + - Android: update `frame_samples` variable, resize `capture_buf` + - Desktop: same — the send loop reads `frame_samples` dynamically + +4. **UI indicator**: Show current active codec in the call screen stats line. + - Android: add to `CallStats` and display in stats text + - Desktop: add to `get_status` response and display in stats div + +5. **Only in Auto mode**: Adaptive switching should only happen when the user selected "Auto". If they manually selected a profile, respect their choice. + +### Phase 2: Extended Range (5-tier) + +Extend `QualityAdapter::classify()` to use the full codec range: + +| Condition | Profile | Codec | +|-----------|---------|-------| +| loss < 1% AND RTT < 30ms | STUDIO_64K | Opus 64k | +| loss < 1% AND RTT < 50ms | STUDIO_48K | Opus 48k | +| loss < 2% AND RTT < 80ms | STUDIO_32K | Opus 32k | +| loss < 5% AND RTT < 100ms | GOOD | Opus 24k | +| loss < 15% AND RTT < 200ms | DEGRADED | Opus 6k | +| loss >= 15% OR RTT >= 200ms | CATASTROPHIC | Codec2 1.2k | + +With hysteresis: +- **Downgrade**: 3 consecutive reports (fast reaction to degradation) +- **Upgrade**: 5 consecutive reports (slow, cautious improvement) +- **Studio upgrade**: 10 consecutive reports (very conservative — avoid bouncing to 64k on brief good patches) + +### Phase 3: Bandwidth Probing + +Rather than relying solely on loss/RTT: +1. Start at GOOD +2. After 10 seconds of stable call, probe upward by switching to STUDIO_32K +3. If no quality degradation after 5 seconds, probe to STUDIO_48K +4. If degradation detected, immediately fall back +5. This discovers the true available bandwidth rather than guessing from loss stats + +## Implementation Plan + +### Android (`crates/wzp-android/src/engine.rs`) + +```rust +// In the recv loop, after decoding: +if let Some(ref qr) = pkt.quality_report { + quality_adapter.ingest(qr); +} + +// Periodic check (every 50 frames ≈ 1 second): +if auto_profile && frames_decoded % 50 == 0 { + if let Some(new_profile) = quality_adapter.should_switch(¤t_profile) { + info!(from = ?current_profile.codec, to = ?new_profile.codec, "auto: switching quality"); + let _ = encoder_ref.lock().set_profile(new_profile); + fec_enc_ref.lock() = create_encoder(&new_profile); + current_profile = new_profile; + frame_samples = frame_samples_for(&new_profile); + // Resize capture buffer if needed + } +} +``` + +**Challenge**: The encoder is in the send task and the quality reports arrive in the recv task. Need shared state (AtomicU8 for profile index, or a channel). + +**Recommended approach**: Use an `AtomicU8` that the recv task writes and the send task reads: +```rust +let pending_profile = Arc::new(AtomicU8::new(0xFF)); // 0xFF = no change + +// Recv task: when adapter recommends switch +pending_profile.store(new_profile_index, Ordering::Release); + +// Send task: check at frame boundary +let p = pending_profile.swap(0xFF, Ordering::Acquire); +if p != 0xFF { /* apply switch */ } +``` + +### Desktop (`desktop/src-tauri/src/engine.rs`) + +Same pattern. The desktop engine already has separate send/recv tasks with shared atomics for mic_muted, etc. Add a `pending_profile: Arc` following the same pattern. + +### Desktop CLI (`crates/wzp-client/src/call.rs`) + +The `CallEncoder` already has `set_profile()`. The `CallDecoder` already auto-switches. Just need to: +1. Add `QualityAdapter` to `CallDecoder` +2. Feed quality reports in `ingest()` +3. Check `should_switch()` in `decode_next()` +4. Emit the recommendation via a callback or return value + +## Testing + +1. **Local test with tc/netem**: Use Linux traffic control to simulate loss/latency: + ```bash + # Simulate 10% loss, 150ms RTT + tc qdisc add dev lo root netem loss 10% delay 75ms + # Run 2 clients in auto mode, verify they switch to DEGRADED + ``` + +2. **CLI test**: Run `wzp-client --profile auto` between two instances with simulated network conditions + +3. **Relay quality reports**: Verify the relay actually sends QualityReport messages. If it doesn't yet, that needs to be implemented first (check relay code). + +## Open Questions + +1. **Does the relay currently send QualityReports?** If not, Phase 1 is blocked until the relay implements per-client loss/RTT tracking and report generation. The relay sees all packets and can compute loss % per sender. + +2. **Codec2 3.2k placement**: Should auto mode use Codec2 3.2k between DEGRADED and CATASTROPHIC? It's 20ms frames (lower latency than Opus 6k's 40ms) but speech-only quality. + +3. **Cross-client adaptation**: If client A is on GOOD and client B auto-adapts to CATASTROPHIC, client A still sends Opus 24k. Client B can decode it fine (auto-switch on recv). But should A also be told to lower quality to save B's bandwidth? This requires signaling between clients. + +## Milestones + +| Phase | Scope | Effort | Dependency | +|-------|-------|--------|------------| +| 0 | Verify relay sends QualityReports | 0.5 day | None | +| 1a | Wire QualityAdapter in Android engine | 1 day | Phase 0 | +| 1b | Wire QualityAdapter in desktop engine | 1 day | Phase 0 | +| 1c | UI indicator (current codec) | 0.5 day | Phase 1a/1b | +| 2 | Extended 5-tier classification | 0.5 day | Phase 1 | +| 3 | Bandwidth probing | 2 days | Phase 2 |