manawenuz/wz-phone

Siavash Sameni ea5fc17c34

fix(relay): debug tap signal logging, dual_path test regression, PRD updates

- Add log_signal() and log_event() to DebugTap for RoomUpdate,
  QualityDirective, join/leave lifecycle events (task #11)
- Fix dual_path.rs Phase 7 regression: add missing ipv6_endpoint arg
  to 3 race() call sites
- Update PRDs to reflect actual implementation status: mark adaptive
  quality, coordinated codec, P2P, network awareness, protocol analyzer
- Update PROGRESS.md with QualityDirective gap and dual_path regression

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-13 09:54:52 +04:00


PRD: Adaptive Quality Control (Auto Codec)

Problem

When a user selects "Auto" quality, the system currently just starts at Opus 24k (GOOD) and never changes. There is no runtime adaptation — if the network degrades mid-call, audio breaks up instead of gracefully stepping down to a lower bitrate codec. Conversely, if the network is excellent, the user stays on 24k when they could have studio-quality 64k.

The relay already sends QualityReport messages with loss % and RTT, and a QualityAdapter exists in call.rs that classifies network conditions into GOOD/DEGRADED/CATASTROPHIC — but none of this is wired into the Android or desktop engines.

Solution

Wire the existing QualityAdapter into both engines so that "Auto" mode continuously monitors network quality and switches codecs mid-call. The full quality range should be used:

Excellent network  →  Studio 64k (best quality)
Good network       →  Opus 24k (default)
Degraded network   →  Opus 6k (lower bitrate, more FEC)
Poor network       →  Codec2 3.2k (vocoder, heavy FEC)
Catastrophic       →  Codec2 1.2k (minimum viable voice)

Architecture

                    ┌─────────────────────┐
  Relay ──────────► │  QualityReport      │  loss %, RTT, jitter
                    │  (every ~1s)        │
                    └────────┬────────────┘
                             │
                             ▼
                    ┌─────────────────────┐
                    │  QualityAdapter     │  classify + hysteresis
                    │  (3-report window)  │
                    └────────┬────────────┘
                             │ recommend new profile
                             ▼
              ┌──────────────┴──────────────┐
              │                             │
              ▼                             ▼
     ┌────────────────┐           ┌────────────────┐
     │  Encoder        │           │  Decoder        │
     │  set_profile()  │           │  (auto-switch   │
     │  + FEC update   │           │   already works)│
     └────────────────┘           └────────────────┘

Existing Infrastructure

What already exists (in `crates/wzp-client/src/call.rs`)

QualityAdapter (lines 97-196):
- Sliding window of QualityReport messages
- classify(): loss > 15% or RTT > 200ms → CATASTROPHIC, loss > 5% or RTT > 100ms → DEGRADED, else → GOOD
- should_switch(): hysteresis — requires 3 consecutive reports recommending the same profile before switching
- Prevents oscillation between profiles
QualityReport (in wzp-proto/src/packet.rs):
- Sent by relay piggy-backed on media packets
- Fields: loss_pct (u8, 0-255 scaled), rtt_4ms (u8, RTT in 4ms units), jitter_ms, bitrate_cap_kbps
CallEncoder::set_profile() / CallDecoder auto-switch:
- Encoder can switch codec mid-stream
- Decoder already auto-detects incoming codec from packet headers
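
The field encodings and the three-tier thresholds listed above can be sketched as follows. This is a minimal illustration, not the code in call.rs: the struct below is a hypothetical mirror of two of the wire fields, the `Tier` enum and helper names are made up for the example, and the linear mapping of `loss_pct` from 0-255 onto 0-100% is an assumption.

```rust
/// Hypothetical mirror of two QualityReport wire fields; the real struct
/// lives in wzp-proto and may differ.
#[derive(Debug, Clone, Copy)]
pub struct QualityReport {
    /// Packet loss, 0-255. ASSUMPTION: maps linearly onto 0-100 %.
    pub loss_pct: u8,
    /// Round-trip time in units of 4 ms.
    pub rtt_4ms: u8,
}

/// Illustrative name for the three existing tiers.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Good,
    Degraded,
    Catastrophic,
}

impl QualityReport {
    /// Loss as a percentage, under the linear-scaling assumption.
    pub fn loss_percent(&self) -> f32 {
        self.loss_pct as f32 * 100.0 / 255.0
    }

    /// RTT in milliseconds (the wire value counts 4 ms steps).
    pub fn rtt_ms(&self) -> u32 {
        self.rtt_4ms as u32 * 4
    }
}

/// Thresholds as described for classify(): the worst tier is checked first.
pub fn classify(report: &QualityReport) -> Tier {
    let loss = report.loss_percent();
    let rtt = report.rtt_ms();
    if loss > 15.0 || rtt > 200 {
        Tier::Catastrophic
    } else if loss > 5.0 || rtt > 100 {
        Tier::Degraded
    } else {
        Tier::Good
    }
}
```

Because `rtt_4ms` is a u8, the largest RTT a report can carry under this encoding is 255 × 4 = 1020 ms.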

What's been implemented since PRD was written

QualityReport ingestion — ~~neither Android engine nor desktop engine reads quality reports from the relay~~ Done: both Android (crates/wzp-android/src/engine.rs) and desktop (desktop/src-tauri/src/engine.rs) recv tasks ingest quality reports and feed AdaptiveQualityController
Profile switch loop — ~~no periodic check~~ Done: pending_profile AtomicU8 bridges recv→send task in both engines; send task applies profile switch at frame boundary
Notification to UI — ~~when quality changes, the UI should show the current active codec~~ Done: tx_codec/rx_codec in desktop EngineStatus; currentCodec/peerCodec in Android CallStats

What's still missing

Upward adaptation — QualityAdapter only classifies into 3 tiers (GOOD/DEGRADED/CATASTROPHIC). Needs extension to recommend studio tiers when conditions are excellent (loss < 1%, RTT < 50ms). See Phase 2 below.
Relay QualityDirective handling — relay broadcasts coordinated quality directives but neither engine processes them (signals are silently discarded). See PRD-coordinated-codec.md for details.

Requirements

Phase 1: Basic Adaptive (3-tier)

Both Android and Desktop:

Ingest QualityReports: In the recv loop, extract quality_report from incoming MediaPackets when present. Feed to QualityAdapter.
Periodic quality check: Every 1 second (or on each QualityReport), call adapter.should_switch(&current_profile). If it returns Some(new_profile):
- Switch the encoder: encoder.set_profile(new_profile)
- Update FEC encoder: fec_enc = create_encoder(&new_profile)
- Update frame size if changed (e.g., 20ms → 40ms)
- Log the switch
Frame size adaptation on switch: When switching from 20ms to 40ms frames (or vice versa):
- Android: update frame_samples variable, resize capture_buf
- Desktop: same — the send loop reads frame_samples dynamically
UI indicator: Show current active codec in the call screen stats line.
- Android: add to CallStats and display in stats text
- Desktop: add to get_status response and display in stats div
Only in Auto mode: Adaptive switching should only happen when the user selected "Auto". If they manually selected a profile, respect their choice.
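
The frame-size arithmetic behind the 20ms/40ms switch can be sketched as below. The function names, the `i16` sample type, and the choice to discard a partially captured frame on resize are assumptions made for illustration; the PRD does not state the sample rate, so it is a parameter here.

```rust
/// Samples per frame for a given sample rate and frame duration.
pub fn frame_samples(sample_rate_hz: u32, frame_ms: u32) -> usize {
    (sample_rate_hz as u64 * frame_ms as u64 / 1000) as usize
}

/// Frames sent per second for a given frame duration.
/// Useful when a "once per second" check is counted in frames.
pub fn frames_per_second(frame_ms: u32) -> u32 {
    1000 / frame_ms
}

/// Resize the capture buffer for the new frame size.
/// ASSUMPTION: any partially captured frame is dropped on a switch.
pub fn resize_capture(buf: &mut Vec<i16>, samples: usize) {
    buf.clear();
    buf.resize(samples, 0);
}
```

At a hypothetical 48 kHz, a 20ms frame holds 960 samples and a 40ms frame holds 1920. A check that fires every 50 frames runs once per second at 20ms per frame but only every two seconds at 40ms, so a frame-counted interval would need to follow the active frame duration.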

Phase 2: Extended Range (5-tier)

Extend QualityAdapter::classify() to use the full codec range. Rows are evaluated top to bottom and the first matching row wins:

Condition	Profile	Codec
loss < 1% AND RTT < 30ms	STUDIO_64K	Opus 64k
loss < 1% AND RTT < 50ms	STUDIO_48K	Opus 48k
loss < 2% AND RTT < 80ms	STUDIO_32K	Opus 32k
loss < 5% AND RTT < 100ms	GOOD	Opus 24k
loss < 15% AND RTT < 200ms	DEGRADED	Opus 6k
loss >= 15% OR RTT >= 200ms	CATASTROPHIC	Codec2 1.2k
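
The table above can be read as a first-match chain. The sketch below assumes that reading; the `Profile` enum and function name are illustrative and take loss in percent and RTT in milliseconds rather than the raw wire fields.

```rust
/// Illustrative enum covering the profiles named in the table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Studio64k,
    Studio48k,
    Studio32k,
    Good,
    Degraded,
    Catastrophic,
}

/// Evaluate the table rows in order; the first row whose
/// loss AND RTT bounds both hold decides the profile.
pub fn classify_extended(loss_pct: f32, rtt_ms: u32) -> Profile {
    if loss_pct < 1.0 && rtt_ms < 30 {
        Profile::Studio64k
    } else if loss_pct < 1.0 && rtt_ms < 50 {
        Profile::Studio48k
    } else if loss_pct < 2.0 && rtt_ms < 80 {
        Profile::Studio32k
    } else if loss_pct < 5.0 && rtt_ms < 100 {
        Profile::Good
    } else if loss_pct < 15.0 && rtt_ms < 200 {
        Profile::Degraded
    } else {
        // Equivalent to the last row: loss >= 15% OR RTT >= 200ms.
        Profile::Catastrophic
    }
}
```

Under this reading a link with very low loss but high latency, for example 0.5% loss at 90ms RTT, lands on GOOD rather than a studio tier, because each studio row requires both bounds to hold.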

With hysteresis:

Downgrade: 3 consecutive reports (fast reaction to degradation)
Upgrade: 5 consecutive reports (slow, cautious improvement)
Studio upgrade: 10 consecutive reports (very conservative — avoid bouncing to 64k on brief good patches)
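
One way to express the asymmetric hysteresis above is a streak counter whose threshold depends on the direction of the move. This is a sketch under assumptions: the `Hysteresis` type, its `observe` method, and the ordering of the `Profile` enum are hypothetical and are not the existing should_switch() implementation.

```rust
/// Illustrative profile enum, declared worst to best so the
/// derived ordering can be used to tell upgrades from downgrades.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Profile {
    Catastrophic,
    Degraded,
    Good,
    Studio32k,
    Studio48k,
    Studio64k,
}

impl Profile {
    fn is_studio(self) -> bool {
        self >= Profile::Studio32k
    }
}

/// Counts consecutive reports that recommend the same non-current profile.
pub struct Hysteresis {
    candidate: Option<Profile>,
    streak: u32,
}

impl Hysteresis {
    pub fn new() -> Self {
        Self { candidate: None, streak: 0 }
    }

    /// Consecutive reports required before moving from `current` to `target`.
    fn required(current: Profile, target: Profile) -> u32 {
        if target < current {
            3 // downgrade: react quickly
        } else if target.is_studio() {
            10 // studio upgrade: very conservative
        } else {
            5 // ordinary upgrade
        }
    }

    /// Feed the profile recommended by the latest report.
    /// Returns Some(target) once the streak is long enough to switch.
    pub fn observe(&mut self, current: Profile, recommended: Profile) -> Option<Profile> {
        if recommended == current {
            // Conditions match the active profile; forget any candidate.
            self.candidate = None;
            self.streak = 0;
            return None;
        }
        if self.candidate == Some(recommended) {
            self.streak += 1;
        } else {
            self.candidate = Some(recommended);
            self.streak = 1;
        }
        if self.streak >= Self::required(current, recommended) {
            self.candidate = None;
            self.streak = 0;
            Some(recommended)
        } else {
            None
        }
    }
}
```

In this sketch a single report that disagrees with the current candidate restarts the count, which is what keeps a brief good patch from triggering a studio upgrade.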

Phase 3: Bandwidth Probing

Rather than relying solely on loss/RTT:

Start at GOOD
After 10 seconds of stable call, probe upward by switching to STUDIO_32K
If no quality degradation after 5 seconds, probe to STUDIO_48K
If degradation detected, immediately fall back
This discovers the true available bandwidth rather than guessing from loss stats
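
The probing steps above could be modeled as a small time-driven state machine. The sketch below is one possible shape, with several assumptions: the `Prober` type and `tick` method are hypothetical, degradation is passed in as a boolean decided elsewhere, "fall back" is taken to mean a return to GOOD, and probing stops at STUDIO_48K because that is the last step the list names.

```rust
use std::time::Duration;

/// Illustrative subset of profiles used by the probing steps.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Profile {
    Good,
    Studio32k,
    Studio48k,
}

/// Hypothetical prober: tracks how long the call has been stable
/// at the current profile and steps up after the listed waits.
pub struct Prober {
    current: Profile,
    stable_for: Duration,
}

impl Prober {
    pub fn new() -> Self {
        Self { current: Profile::Good, stable_for: Duration::ZERO }
    }

    /// Advance the clock by `dt`. `degraded` reports whether quality
    /// degradation was observed during that interval.
    pub fn tick(&mut self, dt: Duration, degraded: bool) -> Profile {
        if degraded {
            // ASSUMPTION: fall back all the way to GOOD and restart the timer.
            self.current = Profile::Good;
            self.stable_for = Duration::ZERO;
            return self.current;
        }
        self.stable_for += dt;
        // Wait time and next step for each profile, per the list above.
        let step = match self.current {
            Profile::Good => Some((Duration::from_secs(10), Profile::Studio32k)),
            Profile::Studio32k => Some((Duration::from_secs(5), Profile::Studio48k)),
            Profile::Studio48k => None,
        };
        if let Some((wait, next)) = step {
            if self.stable_for >= wait {
                self.current = next;
                self.stable_for = Duration::ZERO;
            }
        }
        self.current
    }
}
```

Driving `tick` with one-second intervals, the prober would stay at GOOD for nine ticks, move to STUDIO_32K on the tenth, and move to STUDIO_48K five clean ticks later; any tick with `degraded = true` returns it to GOOD.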

Implementation Plan

Android (`crates/wzp-android/src/engine.rs`)

// In the recv loop, after decoding:
if let Some(ref qr) = pkt.quality_report {
    quality_adapter.ingest(qr);
}

// Periodic check (every 50 frames ≈ 1 second at 20ms per frame):
if auto_profile && frames_decoded % 50 == 0 {
    if let Some(new_profile) = quality_adapter.should_switch(&current_profile) {
        info!(from = ?current_profile.codec, to = ?new_profile.codec, "auto: switching quality");
        // Switch the codec first, then rebuild the FEC encoder to match it.
        let _ = encoder_ref.lock().set_profile(new_profile);
        // Assign through the lock guard to replace the FEC encoder in place.
        *fec_enc_ref.lock() = create_encoder(&new_profile);
        current_profile = new_profile;
        frame_samples = frame_samples_for(&new_profile);
        // Resize capture buffer if needed
    }
}

Challenge: The encoder lives in the send task, but quality reports arrive in the recv task, so the two tasks need shared state (an AtomicU8 holding a profile index, or a channel).

Recommended approach: Use an AtomicU8 that the recv task writes and the send task reads:

use std::sync::Arc;
use std::sync::atomic::{AtomicU8, Ordering};

let pending_profile = Arc::new(AtomicU8::new(0xFF)); // 0xFF = no change

// Recv task: when adapter recommends switch
pending_profile.store(new_profile_index, Ordering::Release);

// Send task: check at frame boundary (swap also clears the pending request)
let p = pending_profile.swap(0xFF, Ordering::Acquire);
if p != 0xFF { /* apply switch */ }

Desktop (`desktop/src-tauri/src/engine.rs`)

Same pattern. The desktop engine already has separate send/recv tasks with shared atomics for mic_muted, etc. Add a pending_profile: Arc<AtomicU8> following the same pattern.
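
The AtomicU8 handoff leaves the profile-to-index mapping open. A minimal sketch of one way to fill it in follows; the `Profile` enum, its index values, and the two helper functions are hypothetical, and only the 0xFF sentinel comes from the snippet above.

```rust
use std::sync::atomic::{AtomicU8, Ordering};

/// Sentinel meaning "no switch requested", as in the snippet above.
pub const NO_CHANGE: u8 = 0xFF;

/// Illustrative profiles with explicit u8 indices (values are an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Profile {
    Good = 0,
    Degraded = 1,
    Catastrophic = 2,
}

impl Profile {
    /// Map an index back to a profile; the sentinel and unknown values give None.
    pub fn from_index(index: u8) -> Option<Self> {
        match index {
            0 => Some(Profile::Good),
            1 => Some(Profile::Degraded),
            2 => Some(Profile::Catastrophic),
            _ => None,
        }
    }
}

/// Recv side: publish the profile the adapter recommends.
/// A later request overwrites an earlier one that was not yet consumed.
pub fn request_switch(slot: &AtomicU8, profile: Profile) {
    slot.store(profile as u8, Ordering::Release);
}

/// Send side: take the pending request (if any) at a frame boundary
/// and reset the slot to the sentinel in the same atomic operation.
pub fn take_pending(slot: &AtomicU8) -> Option<Profile> {
    Profile::from_index(slot.swap(NO_CHANGE, Ordering::AcqRel))
}
```

In an engine the slot would be shared between tasks as an `Arc<AtomicU8>`. Since the slot holds a single value, only the most recent recommendation survives if the recv task publishes twice between frame boundaries, which suits a "latest wins" policy.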

Desktop CLI (`crates/wzp-client/src/call.rs`)

The CallEncoder already has set_profile(). The CallDecoder already auto-switches. Just need to:

Add QualityAdapter to CallDecoder
Feed quality reports in ingest()
Check should_switch() in decode_next()
Emit the recommendation via a callback or return value

Testing

Local test with tc/netem: Use Linux traffic control to simulate loss/latency:

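# Simulate 10% loss and 150ms RTT on loopback (75ms delay in each direction; requires root)
tc qdisc add dev lo root netem loss 10% delay 75ms
# Run 2 clients in auto mode, verify they switch to DEGRADED
# Remove the netem rule when done
tc qdisc del dev lo root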

CLI test: Run wzp-client --profile auto between two instances with simulated network conditions
Relay quality reports: Verify the relay actually sends QualityReport messages. If it doesn't yet, that needs to be implemented first (check relay code).

Open Questions

Does the relay currently send QualityReports? If not, Phase 1 is blocked until the relay implements per-client loss/RTT tracking and report generation. The relay sees all packets and can compute loss % per sender.
Codec2 3.2k placement: Should auto mode use Codec2 3.2k between DEGRADED and CATASTROPHIC? It's 20ms frames (lower latency than Opus 6k's 40ms) but speech-only quality.
Cross-client adaptation: If client A is on GOOD and client B auto-adapts to CATASTROPHIC, client A still sends Opus 24k. Client B can decode it fine (auto-switch on recv). But should A also be told to lower quality to save B's bandwidth? This requires signaling between clients.

Milestones

Phase	Scope	Effort	Dependency
0	Verify relay sends QualityReports	0.5 day	None
1a	Wire QualityAdapter in Android engine	1 day	Phase 0
1b	Wire QualityAdapter in desktop engine	1 day	Phase 0
1c	UI indicator (current codec)	0.5 day	Phase 1a/1b
2	Extended 5-tier classification	0.5 day	Phase 1
3	Bandwidth probing	2 days	Phase 2
