# PRD: Coordinated Codec Switching (Relay-Judged Quality)

## Problem

The current adaptive quality system (`QualityAdapter` in call.rs) exists but isn't wired into either engine. Clients encode at a fixed quality chosen at call start. When network conditions change mid-call, audio degrades instead of gracefully stepping down. When conditions improve, clients stay on low quality unnecessarily.

Additionally, in SFU mode with multiple participants, uncoordinated codec switching creates asymmetry: if client A upgrades to 64k while B stays on 24k, bandwidth is wasted. Participants should switch together.

## Solution

The **relay acts as the quality judge** since it sees both sides of every connection. It monitors packet loss, jitter, and RTT per participant, then signals quality recommendations. Clients react to these signals with coordinated codec switches.

## Architecture

```
┌──────────┐        ┌─────────┐        ┌──────────┐
│ Client A │◄──────►│  Relay  │◄──────►│ Client B │
│          │        │ (judge) │        │          │
│ Encoder  │        │         │        │ Encoder  │
│ Decoder  │        │ Monitor │        │ Decoder  │
└──────────┘        │ per-peer│        └──────────┘
                    │ quality │
                    └────┬────┘
                         │
                  Quality Signals:
                  - StableSignal (conditions good)
                  - DegradeSignal (conditions bad)
                  - UpgradeProposal (try higher quality?)
                  - UpgradeConfirm (all agreed, switch at T)
```

## Quality Classification (Relay-Side)

The relay monitors each participant's connection quality:

| Condition | Classification | Action |
|-----------|---------------|--------|
| loss >= 15% OR RTT >= 200ms | Critical | Immediate downgrade signal |
| loss >= 5% OR RTT >= 100ms | Degraded | Downgrade signal after 3 reports |
| loss < 2% AND RTT < 80ms | Good | Stable signal |
| loss < 1% AND RTT < 50ms for 30s | Excellent | Upgrade proposal |
| loss < 0.5% AND RTT < 30ms for 60s | Studio | Studio upgrade proposal |

## Coordinated Switching Protocol

### Downgrade (fast, safety-first)

1. Relay detects degradation for ANY participant
2. Relay sends `QualityDirective { recommended_profile: DEGRADED, .. }` to ALL participants
3. ALL participants immediately switch encoder to the recommended profile
4. No negotiation: downgrade is mandatory and instant

### Upgrade (slow, consensual)

1. Relay detects sustained good conditions for ALL participants (threshold: 30s stable)
2. Relay sends `UpgradeProposal { target_profile, switch_delay_ms }` to all
3. Each client responds with `UpgradeResponse { accepted }` (accept or reject)
4. If ALL accept within 5s → Relay sends `UpgradeConfirm { profile, switch_at_session_ms }`
5. All clients switch encoder at the agreed timestamp (relative to session clock)
6. If ANY rejects or times out → upgrade cancelled, stay on current profile

### Asymmetric Encoding (SFU optimization)

In SFU mode, each client encodes independently. The relay could allow:

- Client A (strong connection): encode at 64k
- Client B (weak connection): encode at 6k
- Relay forwards A's 64k to B's decoder (auto-switch handles it)
- B benefits from A's quality without needing to send at 64k

This requires NO protocol changes: each client independently follows the relay's recommendation for its own encoding quality. The decoder already handles any codec.

### Split Network Consideration

If participant A has great quality but participant C has terrible quality:

- Option 1: **Match weakest link**: everyone encodes at C's level (current approach, simple)
- Option 2: **Per-participant recommendations**: A encodes at 64k, C encodes at 6k. B (good connection) receives and decodes both. Works because decoders auto-switch per packet.
- Option 3: **Relay transcoding**: relay re-encodes A's 64k as 6k for C. Adds CPU on relay, but saves bandwidth for C. Future feature.

Recommended: start with Option 1 (match weakest), add Option 2 later.
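
As a rough illustration of Option 1, the room-wide level is the worst per-participant classification. The sketch below assumes a hypothetical `QualityClass` enum declared worst-to-best (variant names mirror the classification table; the real type in the codebase may differ) and a hypothetical `room_class` helper.

```rust
/// Hypothetical classification enum, declared worst-to-best so that the
/// derived `Ord` makes `min()` select the weakest link.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum QualityClass {
    Critical,
    Degraded,
    Good,
    Excellent,
    Studio,
}

/// Option 1 (match weakest link): the room runs at the worst class observed
/// for any participant. Returns `None` for an empty room.
fn room_class(participants: &[QualityClass]) -> Option<QualityClass> {
    participants.iter().copied().min()
}
```

With this ordering, a room containing one `Degraded` participant resolves to `Degraded` no matter how good the other connections are, which is the behaviour Option 1 asks for. Option 2 would skip the reduction and send each participant its own class.
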
## Signal Messages (New/Modified)

```rust
/// New or modified `SignalMessage` variants (existing variants omitted).
/// `QualityProfile` is the existing profile type.
enum SignalMessage {
    /// Quality signal from relay to client
    QualityDirective {
        /// Recommended profile to use for encoding
        recommended_profile: QualityProfile,
        /// Reason for the recommendation
        reason: QualityReason,
    },

    /// Upgrade proposal from relay
    UpgradeProposal {
        target_profile: QualityProfile,
        /// Milliseconds from now when the switch would happen
        switch_delay_ms: u32,
    },

    /// Client response to upgrade proposal
    UpgradeResponse {
        accepted: bool,
    },

    /// Confirmed upgrade: all clients switch at this time
    UpgradeConfirm {
        profile: QualityProfile,
        /// Session-relative timestamp to switch (ms since call start)
        switch_at_session_ms: u64,
    },
}

enum QualityReason {
    /// Network conditions require this quality level
    NetworkCondition,
    /// Coordinated upgrade: all participants agreed
    CoordinatedUpgrade,
    /// Coordinated downgrade: weakest link determines level
    CoordinatedDowngrade,
}
```

## Relay-Side Implementation

### Per-Participant Quality Tracking

```rust
use std::collections::VecDeque;
use std::time::Instant;

struct ParticipantQuality {
    /// Sliding window of recent observations.
    /// NOTE: the element type `f32` is an assumption; the source omits the
    /// generic parameters.
    loss_samples: VecDeque<f32>,   // last 30 seconds
    rtt_samples: VecDeque<f32>,    // last 30 seconds
    jitter_samples: VecDeque<f32>,
    /// Current classification
    classification: QualityClass,
    /// How long current classification has been stable
    stable_since: Instant,
}
```

### Quality Monitor Task (on relay)

Runs alongside the SFU forwarding loop:

1. Every 1 second, compute per-participant quality from QUIC connection stats
2. Classify each participant
3. If ANY participant degrades → send downgrade to ALL
4. If ALL participants stable for threshold → propose upgrade
5. Track upgrade negotiation state

### Integration with Existing Code

The relay already has access to:

- `QuinnTransport::path_quality()` → loss, RTT, jitter, bandwidth estimates
- `QualityReport` embedded in media packet headers
- Per-session metrics in `RelayMetrics`

The quality monitor just needs to read these existing metrics and produce signals.
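
Steps 2 and 3 of the monitor task can be sketched as a small per-participant tracker that applies the downgrade thresholds from the classification table. Everything named here (`Sample`, `classify`, `DowngradeTracker`) is hypothetical. The sketch also makes two assumptions the PRD does not spell out: "after 3 reports" means 3 consecutive bad reports, and values that fall in the table's gaps (for example loss between 2% and 5%) are treated as `Good`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum QualityClass {
    Critical,
    Degraded,
    Good,
}

/// One per-second observation (hypothetical shape).
#[derive(Debug, Clone, Copy)]
struct Sample {
    loss_pct: f32,
    rtt_ms: f32,
}

/// Instantaneous classification using the table's downgrade thresholds.
/// Assumption: anything that is neither Critical nor Degraded counts as Good.
fn classify(s: Sample) -> QualityClass {
    if s.loss_pct >= 15.0 || s.rtt_ms >= 200.0 {
        QualityClass::Critical
    } else if s.loss_pct >= 5.0 || s.rtt_ms >= 100.0 {
        QualityClass::Degraded
    } else {
        QualityClass::Good
    }
}

/// Number of consecutive bad reports before Degraded takes effect.
const DEGRADED_REPORTS: u32 = 3;

#[derive(Debug, Default)]
struct DowngradeTracker {
    bad_streak: u32,
}

impl DowngradeTracker {
    /// Feeds one sample and returns the class the relay should act on:
    /// Critical immediately, Degraded only once the streak reaches
    /// `DEGRADED_REPORTS`, Good otherwise.
    fn observe(&mut self, s: Sample) -> QualityClass {
        match classify(s) {
            QualityClass::Critical => {
                // A critical report also counts toward the bad streak.
                self.bad_streak = self.bad_streak.saturating_add(1);
                QualityClass::Critical
            }
            QualityClass::Degraded => {
                self.bad_streak = self.bad_streak.saturating_add(1);
                if self.bad_streak >= DEGRADED_REPORTS {
                    QualityClass::Degraded
                } else {
                    QualityClass::Good
                }
            }
            QualityClass::Good => {
                self.bad_streak = 0;
                QualityClass::Good
            }
        }
    }
}
```

The relay would keep one tracker per participant, call `observe` once per tick, and broadcast a directive only when the room-wide result changes. The upgrade side (sustained Excellent or Studio conditions) needs a timer rather than a counter and is left out here.
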
## Client-Side Implementation

### Handling Quality Signals

In the recv loop (both Android engine and desktop engine):

```rust
SignalMessage::QualityDirective { recommended_profile, .. } => {
    // Immediate: switch encoder to recommended profile
    encoder.set_profile(recommended_profile)?;
    // Rebuild the FEC encoder and frame size to match the new profile
    fec_enc = create_encoder(&recommended_profile);
    frame_samples = frame_samples_for(&recommended_profile);
    info!(codec = ?recommended_profile.codec, "quality directive: switched");
}
```

### P2P Quality (simpler case)

For P2P calls (no relay), both clients directly observe quality:

1. Each client runs its own `QualityAdapter` on the direct connection
2. When quality changes, client proposes to peer via signal
3. Simpler negotiation: only 2 parties, no relay middleman
4. Same coordinated switching logic, just peer-to-peer signals

## Backporting P2P → Relay

The quality monitoring and codec switching logic is identical:

- **P2P**: client observes quality directly → proposes switch to peer
- **Relay**: relay observes quality → proposes switch to all clients

The only differences are WHO makes the decision (client vs relay) and HOW many participants need to agree (2 vs N).

Implementation strategy: build for P2P first (simpler, 2 parties), then wrap the same logic with relay-mediated signals for SFU mode.

## Milestones

| Phase | Scope | Effort |
|-------|-------|--------|
| 1 | Relay-side quality monitor (per-participant tracking) | 1 day |
| 2 | Downgrade signal (immediate, match weakest) | 1 day |
| 3 | Client handling of QualityDirective | 1 day (both engines) |
| 4 | Upgrade proposal + negotiation protocol | 2 days |
| 5 | P2P quality adaptation (direct observation) | 1 day |
| 6 | Per-participant asymmetric encoding (Option 2) | 1 day |

## Implementation Status (2026-04-13)

Phases 1-2 are implemented. Phase 3 had a critical gap that has since been closed (see "Phase 3 completed" below).
### What was built

- **`QualityDirective` signal** (`crates/wzp-proto/src/packet.rs`): New `SignalMessage` variant with `recommended_profile` and optional `reason`
- **`ParticipantQuality`** (`crates/wzp-relay/src/room.rs`): Per-participant quality tracking using `AdaptiveQualityController`, created on join, removed on leave
- **Weakest-link broadcast**: `observe_quality()` method computes room-wide worst tier, broadcasts `QualityDirective` to all participants when tier changes
- **Desktop engine handling** (`desktop/src-tauri/src/engine.rs`): `AdaptiveQualityController` in recv task, `pending_profile` AtomicU8 bridge to send task, auto-mode profile switching based on **inbound quality reports**

### Phase 3 completed (2026-04-13)

Both engines now handle `QualityDirective` signals from the relay:

- **Desktop** (`engine.rs`): both P2P and relay signal tasks match `QualityDirective`, extract `recommended_profile`, store index via `sig_pending_profile.store(idx, Release)`. Send task picks it up at the next frame boundary.
- **Android** (`engine.rs`): signal task matches `QualityDirective`, stores via `pending_profile_recv.store(idx, Release)`.

Relay-coordinated codec switching is now end-to-end: relay monitors → broadcasts directive → clients switch.

### Phase remaining

- Phase 4: Upgrade proposal/negotiation protocol for quality recovery (task #28)
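
One possible shape for the Phase 4 relay-side negotiation state, following the upgrade rules in the protocol section (all accept within 5s → confirm; any reject or timeout → cancel). This is a sketch under stated assumptions, not the planned implementation: `UpgradeNegotiation`, `Outcome`, the `u32` participant IDs, and passing the session clock in as `now_ms` are all hypothetical.

```rust
use std::collections::HashMap;

/// Window in which every participant must accept (5s per the protocol).
const ACCEPT_TIMEOUT_MS: u64 = 5_000;

#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    Pending,
    Confirmed,
    Cancelled,
}

/// Tracks one in-flight upgrade proposal. Times are session-clock
/// milliseconds supplied by the caller, which keeps the logic testable.
struct UpgradeNegotiation {
    started_at_ms: u64,
    /// Participant id → response (`None` until the client answers).
    responses: HashMap<u32, Option<bool>>,
}

impl UpgradeNegotiation {
    fn new(participants: &[u32], now_ms: u64) -> Self {
        Self {
            started_at_ms: now_ms,
            responses: participants.iter().map(|&id| (id, None)).collect(),
        }
    }

    fn timed_out(&self, now_ms: u64) -> bool {
        now_ms.saturating_sub(self.started_at_ms) >= ACCEPT_TIMEOUT_MS
    }

    /// Records an `UpgradeResponse`. Late responses and responses from
    /// unknown participants are ignored.
    fn record(&mut self, id: u32, accepted: bool, now_ms: u64) {
        if self.timed_out(now_ms) {
            return;
        }
        if let Some(slot) = self.responses.get_mut(&id) {
            *slot = Some(accepted);
        }
    }

    /// Resolves the proposal: any reject cancels, unanimous accept confirms,
    /// and an expired window cancels whatever is still outstanding.
    fn poll(&self, now_ms: u64) -> Outcome {
        if self.responses.values().any(|r| *r == Some(false)) {
            Outcome::Cancelled
        } else if self.responses.values().all(|r| *r == Some(true)) {
            Outcome::Confirmed
        } else if self.timed_out(now_ms) {
            Outcome::Cancelled
        } else {
            Outcome::Pending
        }
    }
}
```

On `Confirmed` the relay would send `UpgradeConfirm` with a `switch_at_session_ms` in the near future. On `Cancelled` it would drop the proposal and leave everyone on the current profile. A participant leaving mid-negotiation is not handled in this sketch.
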