wz-phone/docs/PRD-coordinated-codec.md
Siavash Sameni d9e7e72978
docs: update PROGRESS, PRDs for completed tasks #9, #11, #12, #27
- PROGRESS.md: add 2026-04-13 section with 5-tier quality, QualityDirective
  handling, debug tap enhancements, dual_path fix, keystore sync
- PRD-coordinated-codec.md: Phase 3 marked complete (client directive handling)
- PRD-adaptive-quality.md: milestone table updated with Done/Pending status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:34:01 +04:00


PRD: Coordinated Codec Switching (Relay-Judged Quality)

Problem

The current adaptive quality system (QualityAdapter in call.rs) exists but isn't wired into either engine. Clients encode at a fixed quality chosen at call start. When network conditions change mid-call, audio degrades instead of gracefully stepping down. When conditions improve, clients stay on low quality unnecessarily.

Additionally, in SFU mode with multiple participants, uncoordinated codec switching creates asymmetry: if client A upgrades to 64k while B stays on 24k, bandwidth is wasted. Participants should switch together.

Solution

The relay acts as the quality judge since it sees both sides of every connection. It monitors packet loss, jitter, and RTT per participant, then signals quality recommendations. Clients react to these signals with coordinated codec switches.

Architecture

┌─────────┐        ┌─────────┐        ┌─────────┐
│ Client A │◄──────►│  Relay  │◄──────►│ Client B │
│          │        │ (judge) │        │          │
│ Encoder  │        │         │        │ Encoder  │
│ Decoder  │        │ Monitor │        │ Decoder  │
└─────────┘        │ per-peer│        └─────────┘
                    │ quality │
                    └────┬────┘
                         │
                    Quality Signals:
                    - StableSignal (conditions good)
                    - DegradeSignal (conditions bad)
                    - UpgradeProposal (try higher quality?)
                    - UpgradeConfirm (all agreed, switch at T)

Quality Classification (Relay-Side)

The relay monitors each participant's connection quality:

| Condition | Classification | Action |
|---|---|---|
| loss >= 15% OR RTT >= 200ms | Critical | Immediate downgrade signal |
| loss >= 5% OR RTT >= 100ms | Degraded | Downgrade signal after 3 reports |
| loss < 2% AND RTT < 80ms | Good | Stable signal |
| loss < 1% AND RTT < 50ms for 30s | Excellent | Upgrade proposal |
| loss < 0.5% AND RTT < 30ms for 60s | Studio | Studio upgrade proposal |
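The thresholds in the table can be read as an ordered series of checks, worst class first. The following is a minimal sketch under assumptions, not the relay's actual classifier: `classify` and its `stable_secs` parameter are hypothetical names, a single stability counter stands in for whatever per-threshold tracking the relay uses, and the "after 3 reports" debounce for Degraded is not modeled. The table does not say what happens between the Good and Degraded bands (for example 3% loss at 90ms RTT), so the sketch returns `None` there and assumes the caller keeps the previous classification.

```rust
/// Quality classes from the table, ordered worst to best.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum QualityClass {
    Critical,
    Degraded,
    Good,
    Excellent,
    Studio,
}

/// Classify one participant from its latest measurements.
///
/// `stable_secs` (hypothetical) is how long the measurements have stayed
/// at their current level. Returns `None` for the band the table leaves
/// unspecified; the caller is assumed to keep the prior class.
pub fn classify(loss_pct: f32, rtt_ms: u32, stable_secs: u64) -> Option<QualityClass> {
    // Bad conditions use OR: either metric alone is enough.
    if loss_pct >= 15.0 || rtt_ms >= 200 {
        return Some(QualityClass::Critical);
    }
    if loss_pct >= 5.0 || rtt_ms >= 100 {
        return Some(QualityClass::Degraded);
    }
    // Good conditions use AND, and the top classes also need sustained time.
    if loss_pct < 0.5 && rtt_ms < 30 && stable_secs >= 60 {
        return Some(QualityClass::Studio);
    }
    if loss_pct < 1.0 && rtt_ms < 50 && stable_secs >= 30 {
        return Some(QualityClass::Excellent);
    }
    if loss_pct < 2.0 && rtt_ms < 80 {
        return Some(QualityClass::Good);
    }
    None
}
```

Checking the downgrade rows first means a single bad metric always wins over an otherwise clean connection, which matches the safety-first intent of the downgrade path.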

Coordinated Switching Protocol

Downgrade (fast, safety-first)

  1. Relay detects degradation for ANY participant
  2. Relay sends QualityDirective { recommended_profile: DEGRADED } to ALL participants
  3. ALL participants immediately switch encoder to the recommended profile
  4. No negotiation — downgrade is mandatory and instant

Upgrade (slow, consensual)

  1. Relay detects sustained good conditions for ALL participants (threshold: 30s stable)
  2. Relay sends UpgradeProposal { target_profile, switch_delay_ms } to all
  3. Each client responds with UpgradeResponse { accepted: true } or UpgradeResponse { accepted: false }
  4. If ALL accept within 5s → Relay sends UpgradeConfirm { profile, switch_at_session_ms }
  5. All clients switch encoder at the agreed timestamp (relative to session clock)
  6. If ANY rejects or times out → upgrade cancelled, stay on current profile
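The upgrade steps amount to a small state machine on the relay: wait for every participant, cancel on the first rejection or on timeout, confirm once the last acceptance arrives. The sketch below is illustrative only. `UpgradeNegotiation`, `Outcome`, the `u64` participant ids, and the choice to compute the switch time as proposal time plus `switch_delay_ms` are all assumptions; timestamps are passed in as session-relative milliseconds so the logic stays deterministic.

```rust
use std::collections::HashSet;

/// How long clients have to answer a proposal (5s per the protocol).
const ACCEPT_TIMEOUT_MS: u64 = 5_000;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Pending,
    /// Everyone accepted: switch at this session-relative time (ms).
    Confirm { switch_at_session_ms: u64 },
    Cancelled,
}

/// Hypothetical relay-side tracker for one upgrade proposal.
pub struct UpgradeNegotiation {
    waiting_on: HashSet<u64>, // participant ids that have not answered yet
    proposed_at_ms: u64,
    switch_delay_ms: u32,
    outcome: Outcome,
}

impl UpgradeNegotiation {
    pub fn new(participants: &[u64], now_ms: u64, switch_delay_ms: u32) -> Self {
        Self {
            waiting_on: participants.iter().copied().collect(),
            proposed_at_ms: now_ms,
            switch_delay_ms,
            outcome: Outcome::Pending,
        }
    }

    /// Feed one client's answer into the negotiation.
    pub fn on_response(&mut self, from: u64, accepted: bool, now_ms: u64) -> Outcome {
        // Late answers cannot revive a finished or timed-out negotiation.
        if self.on_tick(now_ms) != Outcome::Pending {
            return self.outcome;
        }
        if !accepted {
            self.outcome = Outcome::Cancelled;
        } else {
            self.waiting_on.remove(&from);
            if self.waiting_on.is_empty() {
                self.outcome = Outcome::Confirm {
                    switch_at_session_ms: self.proposed_at_ms
                        + u64::from(self.switch_delay_ms),
                };
            }
        }
        self.outcome
    }

    /// Call periodically so a silent client cancels the upgrade.
    pub fn on_tick(&mut self, now_ms: u64) -> Outcome {
        let elapsed = now_ms.saturating_sub(self.proposed_at_ms);
        if self.outcome == Outcome::Pending && elapsed > ACCEPT_TIMEOUT_MS {
            self.outcome = Outcome::Cancelled;
        }
        self.outcome
    }
}
```

Under this reading, `switch_delay_ms` needs to be longer than the 5s acceptance window, otherwise the confirmed switch time could already be in the past when the last acceptance arrives.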

Asymmetric Encoding (SFU optimization)

In SFU mode, each client encodes independently. The relay could allow:

  • Client A (strong connection): encode at 64k
  • Client B (weak connection): encode at 6k
  • Relay forwards A's 64k to B's decoder (auto-switch handles it)
  • B benefits from A's quality without needing to send at 64k

This requires NO protocol changes — just each client independently following the relay's recommendation for their own encoding quality. The decoder already handles any codec.

Split Network Consideration

If participant A has great quality but participant C has terrible quality:

  • Option 1: Match weakest link — everyone encodes at C's level (current approach, simple)
  • Option 2: Per-participant recommendations — A encodes at 64k, C encodes at 6k. B (good connection) receives and decodes both. Works because decoders auto-switch per packet.
  • Option 3: Relay transcoding — relay re-encodes A's 64k as 6k for C. Adds CPU on relay, but saves bandwidth for C. Future feature.

Recommended: start with Option 1 (match weakest), add Option 2 later.
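To make the difference between Option 1 and Option 2 concrete, here is a rough sketch of how the relay might turn per-participant sustainable tiers into encoding recommendations. The `Tier` enum is a hypothetical three-level stand-in built from the bitrates mentioned in this document (the real `QualityProfile` has more levels), and `Strategy` and `recommend` are illustrative names.

```rust
use std::collections::HashMap;

/// Hypothetical tiers, ordered lowest to highest bitrate.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Tier {
    Kbps6,
    Kbps24,
    Kbps64,
}

pub enum Strategy {
    /// Option 1: everyone encodes at the weakest participant's tier.
    MatchWeakest,
    /// Option 2: each participant encodes at its own sustainable tier.
    PerParticipant,
}

/// Map each participant id to the tier it should encode at.
/// `sustainable` holds the best tier each participant's link can carry.
pub fn recommend(strategy: Strategy, sustainable: &HashMap<u64, Tier>) -> HashMap<u64, Tier> {
    match strategy {
        Strategy::MatchWeakest => match sustainable.values().copied().min() {
            // The room-wide floor is applied to every participant.
            Some(floor) => sustainable.keys().map(|&id| (id, floor)).collect(),
            None => HashMap::new(), // empty room: nothing to recommend
        },
        Strategy::PerParticipant => sustainable.clone(),
    }
}
```

With A and B able to sustain 64k and C only 6k, `MatchWeakest` recommends 6k to all three, while `PerParticipant` leaves A and B at 64k. Moving from Option 1 to Option 2 later would then only change the strategy, not the signal format.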

Signal Messages (New/Modified)

/// New/modified variants of SignalMessage (existing variants omitted)
enum SignalMessage {
    /// Quality signal from relay to client
    QualityDirective {
        /// Recommended profile to use for encoding
        recommended_profile: QualityProfile,
        /// Reason for the recommendation
        reason: QualityReason,
    },

    /// Upgrade proposal from relay
    UpgradeProposal {
        target_profile: QualityProfile,
        /// Milliseconds from now when the switch would happen
        switch_delay_ms: u32,
    },

    /// Client response to upgrade proposal
    UpgradeResponse {
        accepted: bool,
    },

    /// Confirmed upgrade: all clients switch at this time
    UpgradeConfirm {
        profile: QualityProfile,
        /// Session-relative timestamp to switch (ms since call start)
        switch_at_session_ms: u64,
    },
}

enum QualityReason {
    /// Network conditions require this quality level
    NetworkCondition,
    /// Coordinated upgrade: all participants agreed
    CoordinatedUpgrade,
    /// Coordinated downgrade: weakest link determines level
    CoordinatedDowngrade,
}

Relay-Side Implementation

Per-Participant Quality Tracking

use std::collections::VecDeque;
use std::time::Instant;

struct ParticipantQuality {
    /// Sliding window of recent observations
    loss_samples: VecDeque<f32>,    // last 30 seconds
    rtt_samples: VecDeque<u32>,     // last 30 seconds
    jitter_samples: VecDeque<u32>,
    /// Current classification
    classification: QualityClass,
    /// How long current classification has been stable
    stable_since: Instant,
}
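One way to keep a sample queue bounded to "the last 30 seconds" is a fixed-capacity ring over `VecDeque`. The sketch below assumes one sample per second (so a capacity of 30 covers roughly 30 seconds); `SampleWindow` and its methods are hypothetical and are not taken from the relay code.

```rust
use std::collections::VecDeque;

/// Fixed-capacity window of recent samples (hypothetical helper).
/// At one sample per second, capacity 30 spans about 30 seconds.
pub struct SampleWindow {
    samples: VecDeque<f32>,
    capacity: usize,
}

impl SampleWindow {
    pub fn new(capacity: usize) -> Self {
        Self {
            samples: VecDeque::with_capacity(capacity),
            capacity,
        }
    }

    /// Add a sample, evicting the oldest one once the window is full.
    pub fn push(&mut self, value: f32) {
        if self.samples.len() == self.capacity {
            self.samples.pop_front();
        }
        self.samples.push_back(value);
    }

    /// Mean of the samples currently held, or `None` if empty.
    pub fn mean(&self) -> Option<f32> {
        if self.samples.is_empty() {
            return None;
        }
        let sum: f32 = self.samples.iter().sum();
        Some(sum / self.samples.len() as f32)
    }
}
```

A count-bounded window only approximates a time window; if samples can arrive irregularly, storing a timestamp alongside each value and evicting by age would be the safer variant.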

Quality Monitor Task (on relay)

Runs alongside the SFU forwarding loop:

  1. Every 1 second, compute per-participant quality from QUIC connection stats
  2. Classify each participant
  3. If ANY participant degrades → send downgrade to ALL
  4. If ALL participants stable for threshold → propose upgrade
  5. Track upgrade negotiation state
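The decision part of the monitor loop (steps 3 to 5) could look roughly like the following. This is a sketch under assumptions: `Observation`, `Action`, and `decide` are hypothetical names, "stable" is interpreted as class Good or better held for at least the threshold, and the caller is assumed to suppress repeated downgrade broadcasts when the room-wide tier has not changed.

```rust
/// Quality classes, ordered worst to best.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum QualityClass {
    Critical,
    Degraded,
    Good,
    Excellent,
    Studio,
}

/// One participant's state at this tick (hypothetical shape).
pub struct Observation {
    pub class: QualityClass,
    /// Seconds the participant has held its current class.
    pub stable_secs: u64,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Hold,
    DowngradeAll,
    ProposeUpgrade,
}

/// Decide what the monitor should do for one room on this tick.
/// `negotiating` is true while an upgrade proposal is already in flight.
pub fn decide(room: &[Observation], upgrade_after_secs: u64, negotiating: bool) -> Action {
    if room.is_empty() {
        return Action::Hold;
    }
    // Step 3: any degraded participant forces a room-wide downgrade.
    if room.iter().any(|o| o.class <= QualityClass::Degraded) {
        return Action::DowngradeAll;
    }
    // Step 4: propose an upgrade only when everyone has been stable long enough.
    let all_stable = room
        .iter()
        .all(|o| o.class >= QualityClass::Good && o.stable_secs >= upgrade_after_secs);
    // Step 5: never start a second negotiation while one is pending.
    if all_stable && !negotiating {
        Action::ProposeUpgrade
    } else {
        Action::Hold
    }
}
```

Keeping the decision as a pure function of the observations makes it straightforward to unit test separately from the 1-second timer and the QUIC stats collection.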

Integration with Existing Code

The relay already has access to:

  • QuinnTransport::path_quality() → loss, RTT, jitter, bandwidth estimates
  • QualityReport embedded in media packet headers
  • Per-session metrics in RelayMetrics

The quality monitor just needs to read these existing metrics and produce signals.

Client-Side Implementation

Handling Quality Signals

In the recv loop (both Android engine and desktop engine):

SignalMessage::QualityDirective { recommended_profile, .. } => {
    // Immediate: switch encoder to recommended profile
    encoder.set_profile(recommended_profile)?;
    fec_enc = create_encoder(&recommended_profile);
    frame_samples = frame_samples_for(&recommended_profile);
    info!(codec = ?recommended_profile.codec, "quality directive: switched");
}

P2P Quality (simpler case)

For P2P calls (no relay), both clients directly observe quality:

  1. Each client runs its own QualityAdapter on the direct connection
  2. When quality changes, client proposes to peer via signal
  3. Simpler negotiation: only 2 parties, no relay middleman
  4. Same coordinated switching logic, just peer-to-peer signals

Backporting P2P → Relay

The quality monitoring and codec switching logic is identical:

  • P2P: client observes quality directly → proposes switch to peer
  • Relay: relay observes quality → proposes switch to all clients

The only difference is WHO makes the decision (client vs relay) and HOW many participants need to agree (2 vs N).

Implementation strategy: build for P2P first (simpler, 2 parties), then wrap the same logic with relay-mediated signals for SFU mode.

Milestones

| Phase | Scope | Effort |
|---|---|---|
| 1 | Relay-side quality monitor (per-participant tracking) | 1 day |
| 2 | Downgrade signal (immediate, match weakest) | 1 day |
| 3 | Client handling of QualityDirective | 1 day (both engines) |
| 4 | Upgrade proposal + negotiation protocol | 2 days |
| 5 | P2P quality adaptation (direct observation) | 1 day |
| 6 | Per-participant asymmetric encoding (Option 2) | 1 day |

Implementation Status (2026-04-13)

Phases 1-3 are implemented. Phase 3 (client handling of QualityDirective) was completed on 2026-04-13.

What was built

  • QualityDirective signal (crates/wzp-proto/src/packet.rs): New SignalMessage variant with recommended_profile and optional reason
  • ParticipantQuality (crates/wzp-relay/src/room.rs): Per-participant quality tracking using AdaptiveQualityController, created on join, removed on leave
  • Weakest-link broadcast: observe_quality() method computes room-wide worst tier, broadcasts QualityDirective to all participants when tier changes
  • Desktop engine handling (desktop/src-tauri/src/engine.rs): AdaptiveQualityController in recv task, pending_profile AtomicU8 bridge to send task, auto-mode profile switching based on inbound quality reports

Phase 3 completed (2026-04-13)

Both engines now handle QualityDirective signals from the relay:

  • Desktop (engine.rs): both P2P and relay signal tasks match QualityDirective, extract recommended_profile, store index via sig_pending_profile.store(idx, Release). Send task picks it up at the next frame boundary.
  • Android (engine.rs): signal task matches QualityDirective, stores via pending_profile_recv.store(idx, Release).

Relay-coordinated codec switching is now end-to-end: relay monitors → broadcasts directive → clients switch.
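The hand-off between the signal task and the send task described above can be pictured as a single atomic slot. The sketch below is an approximation rather than the engines' code: `NO_PENDING`, `publish_profile`, and `take_pending` are hypothetical names, and the sentinel value is an assumption, since this document does not say how "no pending switch" is encoded.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Sentinel meaning "no switch pending" (assumed encoding).
pub const NO_PENDING: u8 = u8::MAX;

/// Signal task side: publish the index of the profile to switch to.
/// A later directive simply overwrites an earlier one not yet consumed.
pub fn publish_profile(pending: &AtomicU8, profile_idx: u8) {
    pending.store(profile_idx, Ordering::Release);
}

/// Send task side, called at a frame boundary: take the pending index,
/// if any, and reset the slot so the switch is applied only once.
pub fn take_pending(pending: &AtomicU8) -> Option<u8> {
    match pending.swap(NO_PENDING, Ordering::AcqRel) {
        NO_PENDING => None,
        idx => Some(idx),
    }
}
```

Because the send task only reads the slot between frames, the encoder is never reconfigured in the middle of a frame, and the signal task never blocks on the audio path.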

Remaining work

  • Phase 4: Upgrade proposal/negotiation protocol for quality recovery (task #28)