From 898c1ea32bc98f460ae2d9344da330ff866a89d5 Mon Sep 17 00:00:00 2001
From: Siavash Sameni
Date: Wed, 8 Apr 2026 08:12:12 +0400
Subject: [PATCH] docs: PRDs for P2P direct calls and coordinated codec switching
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

PRD-p2p-direct.md: STUN-based NAT traversal for direct QUIC connections
between clients. True E2E with mutual TLS cert pinning via identity
fingerprints. Hybrid mode: try P2P, fall back to relay. 4 phases: STUN
discovery, hole punching, P2P adaptive quality, seamless relay-to-P2P
migration.

PRD-coordinated-codec.md: Relay acts as quality judge: it monitors
per-participant loss/RTT/jitter and sends quality directives. Downgrade is
immediate (match weakest link), upgrade is consensual (all participants
must agree, synchronized switch at agreed timestamp). Covers asymmetric
encoding in SFU and P2P→relay backporting strategy.

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 docs/PRD-coordinated-codec.md | 198 ++++++++++++++++++++++++++++++++++
 docs/PRD-p2p-direct.md        | 146 +++++++++++++++++++++++++
 2 files changed, 344 insertions(+)
 create mode 100644 docs/PRD-coordinated-codec.md
 create mode 100644 docs/PRD-p2p-direct.md

diff --git a/docs/PRD-coordinated-codec.md b/docs/PRD-coordinated-codec.md
new file mode 100644
index 0000000..e181855
--- /dev/null
+++ b/docs/PRD-coordinated-codec.md
@@ -0,0 +1,198 @@
+# PRD: Coordinated Codec Switching (Relay-Judged Quality)
+
+## Problem
+
+The current adaptive quality system (`QualityAdapter` in call.rs) exists but isn't wired into either engine. Clients encode at a fixed quality chosen at call start. When network conditions change mid-call, audio degrades instead of gracefully stepping down. When conditions improve, clients stay on low quality unnecessarily.
+
+Additionally, in SFU mode with multiple participants, uncoordinated codec switching creates asymmetry: if client A upgrades to 64k while B stays on 24k, bandwidth is wasted.
+Participants should switch together.
+
+## Solution
+
+The **relay acts as the quality judge** since it sees both sides of every connection. It monitors packet loss, jitter, and RTT per participant, then signals quality recommendations. Clients react to these signals with coordinated codec switches.
+
+## Architecture
+
+```
+┌──────────┐        ┌─────────┐        ┌──────────┐
+│ Client A │◄──────►│  Relay  │◄──────►│ Client B │
+│          │        │ (judge) │        │          │
+│ Encoder  │        │         │        │ Encoder  │
+│ Decoder  │        │ Monitor │        │ Decoder  │
+└──────────┘        │ per-peer│        └──────────┘
+                    │ quality │
+                    └────┬────┘
+                         │
+                  Quality Signals:
+                  - StableSignal (conditions good)
+                  - DegradeSignal (conditions bad)
+                  - UpgradeProposal (try higher quality?)
+                  - UpgradeConfirm (all agreed, switch at T)
+```
+
+## Quality Classification (Relay-Side)
+
+The relay monitors each participant's connection quality:
+
+| Condition | Classification | Action |
+|-----------|---------------|--------|
+| loss >= 15% OR RTT >= 200ms | Critical | Immediate downgrade signal |
+| loss >= 5% OR RTT >= 100ms | Degraded | Downgrade signal after 3 reports |
+| loss < 2% AND RTT < 80ms | Good | Stable signal |
+| loss < 1% AND RTT < 50ms for 30s | Excellent | Upgrade proposal |
+| loss < 0.5% AND RTT < 30ms for 60s | Studio | Studio upgrade proposal |
+
+## Coordinated Switching Protocol
+
+### Downgrade (fast, safety-first)
+
+1. Relay detects degradation for ANY participant
+2. Relay sends `QualityDirective { recommended_profile, reason: CoordinatedDowngrade }` to ALL participants
+3. ALL participants immediately switch encoder to the recommended profile
+4. No negotiation: downgrade is mandatory and immediate
+
+### Upgrade (slow, consensual)
+
+1. Relay detects sustained good conditions for ALL participants (threshold: 30s stable)
+2. Relay sends `UpgradeProposal { target_profile, switch_delay_ms }` to all
+3. Each client responds with `UpgradeResponse { accepted }` (accept or reject)
+4. If ALL accept within 5s → Relay sends `UpgradeConfirm { profile, switch_at_session_ms }`
+5.
+   All clients switch encoder at the agreed timestamp (relative to session clock)
+6. If ANY rejects or times out → upgrade cancelled, stay on current profile
+
+### Asymmetric Encoding (SFU optimization)
+
+In SFU mode, each client encodes independently. The relay could allow:
+- Client A (strong connection): encode at 64k
+- Client B (weak connection): encode at 6k
+- Relay forwards A's 64k to B's decoder (auto-switch handles it)
+- B benefits from A's quality without needing to send at 64k
+
+This requires NO protocol changes, just each client independently following the relay's recommendation for their own encoding quality. The decoder already handles any codec.
+
+### Split Network Consideration
+
+If participant A has great quality but participant C has terrible quality:
+- Option 1: **Match weakest link**: everyone encodes at C's level (current approach, simple)
+- Option 2: **Per-participant recommendations**: A encodes at 64k, C encodes at 6k. B (good connection) receives and decodes both. Works because decoders auto-switch per packet.
+- Option 3: **Relay transcoding**: relay re-encodes A's 64k as 6k for C. Adds CPU on relay, but saves bandwidth for C. Future feature.
+
+Recommended: start with Option 1 (match weakest), add Option 2 later.
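
The classification table and Option 1 (match weakest link) can be sketched roughly as follows. This is only an illustration under stated assumptions: `QualityClass`, `classify`, `room_class`, and the `held_secs` parameter are hypothetical names that do not come from the codebase, and the `Fair` variant is an assumed catch-all for samples the table leaves unclassified (for example 3% loss at 60ms RTT). The "after 3 reports" debounce for Degraded is not modeled.

```rust
/// Quality classes from the classification table, declared worst to best so
/// that the derived `Ord` makes `min()` pick the weakest participant.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum QualityClass {
    Critical,
    Degraded,
    /// Assumption: catch-all for samples the table does not classify.
    Fair,
    Good,
    Excellent,
    Studio,
}

/// Classify one participant from a loss percentage, an RTT in milliseconds,
/// and how long (in seconds) the current conditions have been held.
/// `held_secs` is a simplification of the 30s/60s windows in the table.
pub fn classify(loss_pct: f32, rtt_ms: u32, held_secs: u32) -> QualityClass {
    if loss_pct >= 15.0 || rtt_ms >= 200 {
        QualityClass::Critical
    } else if loss_pct >= 5.0 || rtt_ms >= 100 {
        QualityClass::Degraded
    } else if loss_pct < 0.5 && rtt_ms < 30 && held_secs >= 60 {
        QualityClass::Studio
    } else if loss_pct < 1.0 && rtt_ms < 50 && held_secs >= 30 {
        QualityClass::Excellent
    } else if loss_pct < 2.0 && rtt_ms < 80 {
        QualityClass::Good
    } else {
        QualityClass::Fair
    }
}

/// Option 1: the room-wide class is the worst class of any participant.
/// Returns `None` for an empty room.
pub fn room_class(participants: &[QualityClass]) -> Option<QualityClass> {
    participants.iter().copied().min()
}
```

With this shape, a room containing one Degraded participant resolves to Degraded no matter how good the others are, which is the "match weakest" behavior described above. Option 2 would skip `room_class` and send each participant its own class.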
+
+## Signal Messages (New/Modified)
+
+```rust
+/// Quality signal from relay to client
+QualityDirective {
+    /// Recommended profile to use for encoding
+    recommended_profile: QualityProfile,
+    /// Reason for the recommendation
+    reason: QualityReason,
+}
+
+enum QualityReason {
+    /// Network conditions require this quality level
+    NetworkCondition,
+    /// Coordinated upgrade: all participants agreed
+    CoordinatedUpgrade,
+    /// Coordinated downgrade: weakest link determines level
+    CoordinatedDowngrade,
+}
+
+/// Upgrade proposal from relay
+UpgradeProposal {
+    target_profile: QualityProfile,
+    /// Milliseconds from now when the switch would happen
+    switch_delay_ms: u32,
+}
+
+/// Client response to upgrade proposal
+UpgradeResponse {
+    accepted: bool,
+}
+
+/// Confirmed upgrade: all clients switch at this time
+UpgradeConfirm {
+    profile: QualityProfile,
+    /// Session-relative timestamp to switch (ms since call start)
+    switch_at_session_ms: u64,
+}
+```
+
+## Relay-Side Implementation
+
+### Per-Participant Quality Tracking
+
+```rust
+struct ParticipantQuality {
+    /// Sliding window of recent observations
+    /// (element types are an assumption; the originals were lost)
+    loss_samples: VecDeque<f32>,   // last 30 seconds
+    rtt_samples: VecDeque<f32>,    // last 30 seconds
+    jitter_samples: VecDeque<f32>,
+    /// Current classification
+    classification: QualityClass,
+    /// How long current classification has been stable
+    stable_since: Instant,
+}
+```
+
+### Quality Monitor Task (on relay)
+
+Runs alongside the SFU forwarding loop:
+1. Every 1 second, compute per-participant quality from QUIC connection stats
+2. Classify each participant
+3. If ANY participant degrades → send downgrade to ALL
+4. If ALL participants stable for threshold → propose upgrade
+5.
+   Track upgrade negotiation state
+
+### Integration with Existing Code
+
+The relay already has access to:
+- `QuinnTransport::path_quality()` → loss, RTT, jitter, bandwidth estimates
+- `QualityReport` embedded in media packet headers
+- Per-session metrics in `RelayMetrics`
+
+The quality monitor just needs to read these existing metrics and produce signals.
+
+## Client-Side Implementation
+
+### Handling Quality Signals
+
+In the recv loop (both Android engine and desktop engine):
+```rust
+SignalMessage::QualityDirective { recommended_profile, .. } => {
+    // Immediate: switch encoder to recommended profile
+    encoder.set_profile(recommended_profile)?;
+    fec_enc = create_encoder(&recommended_profile);
+    frame_samples = frame_samples_for(&recommended_profile);
+    info!(codec = ?recommended_profile.codec, "quality directive: switched");
+}
+```
+
+### P2P Quality (simpler case)
+
+For P2P calls (no relay), both clients directly observe quality:
+1. Each client runs its own `QualityAdapter` on the direct connection
+2. When quality changes, client proposes to peer via signal
+3. Simpler negotiation: only 2 parties, no relay middleman
+4. Same coordinated switching logic, just peer-to-peer signals
+
+## Backporting P2P → Relay
+
+The quality monitoring and codec switching logic is identical:
+- **P2P**: client observes quality directly → proposes switch to peer
+- **Relay**: relay observes quality → proposes switch to all clients
+
+The only difference is WHO makes the decision (client vs relay) and HOW many participants need to agree (2 vs N).
+
+Implementation strategy: build for P2P first (simpler, 2 parties), then wrap the same logic with relay-mediated signals for SFU mode.
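
Because the P2P and relay cases differ only in who drives the negotiation and how many parties must agree, a single negotiation tracker could plausibly serve both. The sketch below is one possible shape, not the project's implementation: `UpgradeNegotiation`, `Outcome`, `on_response`, `on_tick`, and the `u64` participant ids are all hypothetical. It encodes the rules from the upgrade protocol: every participant must accept before the deadline, and any rejection or timeout cancels.

```rust
use std::collections::HashSet;
use std::time::{Duration, Instant};

/// Result of an upgrade negotiation at a given moment.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Outcome {
    Pending,
    Confirmed,
    Cancelled,
}

/// Tracks one upgrade proposal across N participants (N = 2 for P2P).
pub struct UpgradeNegotiation {
    /// Hypothetical participant ids that have not accepted yet.
    waiting_on: HashSet<u64>,
    /// Responses at or after this instant count as a timeout.
    deadline: Instant,
    outcome: Outcome,
}

impl UpgradeNegotiation {
    pub fn new(
        participants: impl IntoIterator<Item = u64>,
        now: Instant,
        timeout: Duration,
    ) -> Self {
        Self {
            waiting_on: participants.into_iter().collect(),
            deadline: now + timeout,
            outcome: Outcome::Pending,
        }
    }

    /// Record one response. A rejection or a late response cancels the
    /// upgrade; the last on-time acceptance confirms it. Once the outcome
    /// is decided, further responses do not change it.
    pub fn on_response(&mut self, from: u64, accepted: bool, now: Instant) -> Outcome {
        if self.outcome == Outcome::Pending {
            if !accepted || now >= self.deadline {
                self.outcome = Outcome::Cancelled;
            } else {
                self.waiting_on.remove(&from);
                if self.waiting_on.is_empty() {
                    self.outcome = Outcome::Confirmed;
                }
            }
        }
        self.outcome
    }

    /// Periodic check so a silent participant still triggers the timeout.
    pub fn on_tick(&mut self, now: Instant) -> Outcome {
        if self.outcome == Outcome::Pending && now >= self.deadline {
            self.outcome = Outcome::Cancelled;
        }
        self.outcome
    }
}
```

In relay mode the relay would own one such tracker per proposal and send `UpgradeConfirm` when it reports `Confirmed`; in P2P mode the proposing client would own it with two participants. Passing `now` explicitly keeps the logic deterministic under test.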
+
+## Milestones
+
+| Phase | Scope | Effort |
+|-------|-------|--------|
+| 1 | Relay-side quality monitor (per-participant tracking) | 1 day |
+| 2 | Downgrade signal (immediate, match weakest) | 1 day |
+| 3 | Client handling of QualityDirective | 1 day (both engines) |
+| 4 | Upgrade proposal + negotiation protocol | 2 days |
+| 5 | P2P quality adaptation (direct observation) | 1 day |
+| 6 | Per-participant asymmetric encoding (Option 2) | 1 day |
diff --git a/docs/PRD-p2p-direct.md b/docs/PRD-p2p-direct.md
new file mode 100644
index 0000000..374f1a6
--- /dev/null
+++ b/docs/PRD-p2p-direct.md
@@ -0,0 +1,146 @@
+# PRD: Peer-to-Peer Direct Calls (No Relay)
+
+## Problem
+
+All calls currently route through a relay, even 1-on-1 calls between clients that could reach each other directly. This adds latency (2x hop), creates a single point of failure, and requires trusting the relay operator (even though media is encrypted, the relay sees metadata).
+
+## Solution
+
+For 1-on-1 calls, clients attempt a direct QUIC connection using STUN-discovered addresses. If NAT traversal succeeds, media flows directly between peers. If it fails, fall back to relay-assisted mode (current behavior).
+
+## Architecture
+
+```
+Preferred (P2P):
+  Client A ←──QUIC direct──→ Client B
+  (no relay in media path, true E2E)
+
+Fallback (Relay):
+  Client A ──→ Relay ──→ Client B
+  (current model)
+
+Hybrid discovery:
+  Client A → Relay (signaling only) → Client B
+     ↓                                   ↓
+  STUN server                       STUN server
+     ↓                                   ↓
+  Discover public IP:port           Discover public IP:port
+     ↓                                   ↓
+  Exchange candidates via relay signaling
+     ↓                                   ↓
+  Attempt direct QUIC connection ←──→
+```
+
+## Why P2P = True E2E
+
+- QUIC TLS handshake establishes encrypted tunnel directly between A and B
+- No third party sees the traffic
+- Certificate pinning via identity fingerprints: each client derives their TLS cert from their Ed25519 seed (same as relay identity).
+  During QUIC handshake, both sides verify the peer's cert fingerprint against the known identity
+- MITM elimination: if A knows B's fingerprint (from prior call, QR code, or identity server), any interceptor presents a different cert → fingerprint mismatch → connection rejected
+- Stronger guarantee than relay-assisted: user doesn't need to trust relay operator
+
+## Requirements
+
+### Phase 1: STUN Discovery
+
+1. **STUN client**: lightweight UDP-based STUN client to discover public IP:port
+   - Use existing public STUN servers (stun.l.google.com:19302, etc.)
+   - Or run a STUN server alongside the relay
+   - Discover: local addresses, server-reflexive addresses (STUN), relay candidates (the existing relay as fallback; no separate TURN server)
+
+2. **Candidate gathering**: on call initiation, gather all candidates:
+   - Host candidates: local network interfaces
+   - Server-reflexive: STUN-discovered public IP:port
+   - Relay candidate: the relay's address (fallback)
+
+3. **Candidate exchange**: via relay signaling channel (existing `IceCandidate` signal message)
+   - A sends candidates to relay → relay forwards to B
+   - B sends candidates to relay → relay forwards to A
+
+### Phase 2: Direct Connection
+
+1. **QUIC hole punching**: both clients simultaneously attempt QUIC connections to each other's candidates
+   - Quinn supports connecting to multiple addresses
+   - First successful connection wins
+   - Timeout after 3 seconds, fall back to relay
+
+2. **Identity verification**: during QUIC handshake, verify peer's TLS cert fingerprint
+   - `server_config_from_seed()` already exists; derive client cert from identity seed
+   - Both sides present certs (mutual TLS)
+   - Verify fingerprint matches expected identity
+
+3.
+   **Media flow**: once connected, use existing `QuinnTransport` for media + signals
+   - Same `send_media()` / `recv_media()` API
+   - Same codec pipeline, FEC, jitter buffer
+   - No code changes needed in the call engine
+
+### Phase 3: Adaptive Quality (P2P)
+
+P2P connections have direct quality visibility, with no relay middleman:
+
+1. Both clients observe RTT, loss, jitter directly from QUIC stats
+2. Adapt codec quality based on direct observations
+3. Since only 2 participants, coordinated switching is simple: propose → ack → switch
+
+This is the simplest case for adaptive quality. Once proven, backport the logic to relay-assisted mode.
+
+### Phase 4: Hybrid Mode
+
+1. **Call initiation**: always connect to relay for signaling
+2. **Parallel attempt**: while relay call is active, attempt P2P in background
+3. **Seamless migration**: if P2P succeeds, migrate media path from relay to direct
+   - Both clients switch simultaneously
+   - Relay connection kept alive for signaling (presence, room updates)
+4. **Fallback**: if P2P connection drops, seamlessly fall back to relay
+
+## Security Properties
+
+| Property | Relay Mode | P2P Mode |
+|----------|-----------|----------|
+| Encryption | ChaCha20-Poly1305 (app layer) | QUIC TLS 1.3 + ChaCha20-Poly1305 |
+| Key exchange | Via relay signaling | Direct QUIC handshake |
+| Identity verification | TOFU (server fingerprint) | Mutual TLS cert pinning |
+| Metadata privacy | Relay sees who talks to whom | No third party in the media path (relay still carries signaling) |
+| MITM resistance | Depends on relay trust | Strong (cert pinning) |
+| Forward secrecy | ECDH ephemeral keys | QUIC built-in + app-layer rekey |
+
+## Implementation Notes
+
+### STUN in Rust
+
+Use `stun-rs` or `webrtc-rs` crate for STUN client. Minimal: just need Binding Request/Response to discover server-reflexive address.
+
+### Quinn Hole Punching
+
+Quinn's `Endpoint` can both listen and connect.
+For hole punching:
+```rust
+let endpoint = create_endpoint(bind_addr, Some(server_config))?;
+// Send connect to peer's address (opens NAT pinhole)
+let conn = connect(&endpoint, peer_addr, "peer", client_config).await?;
+// Simultaneously, peer connects to our address
+// First successful handshake wins
+```
+
+### Client TLS Certificate
+
+Already have `server_config_from_seed()` for relays. Create `client_config_from_seed()` that presents a TLS client certificate derived from the identity seed. The peer verifies this cert's fingerprint.
+
+### Signaling via Relay
+
+The existing relay connection carries `IceCandidate` signals. No new infrastructure needed: just use the relay as a dumb signaling pipe for candidate exchange.
+
+## Non-Goals (v1)
+
+- SFU over P2P (P2P is 1-on-1 only; multi-party uses relay SFU)
+- TURN server (relay acts as the fallback, no separate TURN)
+- mDNS local discovery (future)
+- Mesh P2P for multi-party (future, complex)
+
+## Milestones
+
+| Phase | Scope | Effort |
+|-------|-------|--------|
+| 1 | STUN client + candidate gathering | 2 days |
+| 2 | QUIC hole punching + identity verification | 3 days |
+| 3 | Adaptive quality on P2P connection | 2 days |
+| 4 | Hybrid mode (relay + P2P, seamless migration) | 3 days |