From 993cf9ab7fe53b3b393808bb1cf0bc2436dffc16 Mon Sep 17 00:00:00 2001 From: Siavash Sameni Date: Sat, 28 Mar 2026 16:41:39 +0400 Subject: [PATCH] docs: full system architecture with Mermaid diagrams + project README MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit ARCHITECTURE.md covers the entire system with 13 Mermaid diagrams: - System overview (send/recv pipeline, relay SFU) - Crate dependency graph (8 crates + featherChat) - Wire formats (MediaHeader, MiniHeader, TrunkFrame, QualityReport, SignalMessage) - Quality profiles with adaptive switching thresholds - Cryptographic handshake sequence (X25519 + Ed25519) - Identity model (BIP39 seed → HKDF → Ed25519/X25519 → Fingerprint) - Relay modes (Room SFU, Forward, Probe) - Web bridge architecture (Browser ↔ WS ↔ QUIC) - FEC protection pipeline (RaptorQ + interleaving) - Telemetry stack (Prometheus → Grafana) - Session state machine - Audio processing detail (denoise → VAD → encode → FEC → encrypt) - Adaptive jitter buffer flow - Deployment topology (multi-region) - featherChat integration sequence README.md: quick start, feature list, documentation index, build instructions. Co-Authored-By: Claude Opus 4.6 (1M context) --- README.md | 87 +++++ docs/ARCHITECTURE.md | 834 ++++++++++++++++++++++++++++--------------- 2 files changed, 643 insertions(+), 278 deletions(-) create mode 100644 README.md diff --git a/README.md b/README.md new file mode 100644 index 0000000..9a86fb1 --- /dev/null +++ b/README.md @@ -0,0 +1,87 @@ +# WarzonePhone + +Custom lossy VoIP protocol built in Rust. E2E encrypted, FEC-protected, adaptive quality, designed for hostile network conditions. + +## Quick Start + +```bash +# Build +cargo build --release + +# Run relay +./target/release/wzp-relay --listen 0.0.0.0:4433 + +# Send a test tone +./target/release/wzp-client --send-tone 5 relay-addr:4433 + +# Web bridge (browser calls) +./target/release/wzp-web --port 8080 --relay 127.0.0.1:4433 --tls +# Open https://localhost:8080/room-name in two browser tabs +``` + +## Architecture + +See [docs/ARCHITECTURE.md](docs/ARCHITECTURE.md) for the full system architecture with Mermaid diagrams covering: + +- System overview and data flow +- Crate dependency graph (8 crates) +- Wire formats (MediaHeader, MiniHeader, TrunkFrame, SignalMessage) +- Cryptographic handshake (X25519 + Ed25519 + ChaCha20-Poly1305) +- Identity model (BIP39 seed, featherChat compatible) +- Quality profiles (GOOD/DEGRADED/CATASTROPHIC) +- FEC protection (RaptorQ with interleaving) +- Adaptive jitter buffer (NetEq-inspired) +- Telemetry stack (Prometheus + Grafana) +- Deployment topology + +## Features + +- **3 quality tiers**: Opus 24k (28.8 kbps) / Opus 6k (9 kbps) / Codec2 1200 (2.4 kbps) +- **RaptorQ FEC**: Recovers from 20-100% packet loss depending on tier +- **E2E encryption**: ChaCha20-Poly1305 with X25519 key exchange +- **Adaptive jitter buffer**: EMA-based playout delay tracking +- **Silence suppression**: VAD + comfort noise (~50% bandwidth savings) +- **ML noise removal**: RNNoise (nnnoiseless pure Rust port) +- **Mini-frames**: 67% header compression for steady-state packets +- **Trunking**: Multiplex sessions into batched datagrams +- **featherChat integration**: Shared BIP39 identity, token auth, call signaling +- **Prometheus metrics**: Relay, web bridge, inter-relay probes +- **Grafana dashboard**: Pre-built JSON with 18 panels + +## Documentation + +| Document | Description | +|----------|-------------| +| [ARCHITECTURE.md](docs/ARCHITECTURE.md) | Full system architecture with diagrams | +| [TELEMETRY.md](docs/TELEMETRY.md) | Prometheus metrics specification | +| [INTEGRATION_TASKS.md](docs/INTEGRATION_TASKS.md) | featherChat integration tracker | +| [WZP-FC-SHARED-CRATES.md](docs/WZP-FC-SHARED-CRATES.md) | Shared crate strategy | +| [grafana-dashboard.json](docs/grafana-dashboard.json) | Importable Grafana dashboard | + +## Binaries + +| Binary | Description | +|--------|-------------| +| `wzp-relay` | Relay daemon (SFU room mode, forward mode, probes) | +| `wzp-client` | CLI client (send-tone, record, live mic, echo-test, drift-test, sweep) | +| `wzp-web` | Browser bridge (HTTPS + WebSocket + AudioWorklet) | +| `wzp-bench` | Component benchmarks | + +## Linux Build + +```bash +./scripts/build-linux.sh --prepare # Create Hetzner VM + install deps +./scripts/build-linux.sh --build # Build release binaries +./scripts/build-linux.sh --transfer # Download to target/linux-x86_64/ +./scripts/build-linux.sh --destroy # Delete VM +``` + +## Tests + +```bash +cargo test --workspace # 272 tests +``` + +## License + +MIT OR Apache-2.0 diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index fe74d3b..c482b23 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -1,329 +1,607 @@ -# WarzonePhone Protocol Design & Architecture +# WarzonePhone Architecture -## Network Topology +> Custom lossy VoIP protocol built in Rust. E2E encrypted, FEC-protected, adaptive quality, designed for hostile network conditions. -``` - Lossy / censored link - ◄──────────────────────► - ┌────────┐ ┌─────────┐ ┌─────────┐ ┌─────────────┐ - │ Client │─QUIC─│ Relay A │─QUIC─│ Relay B │─QUIC─│ Destination │ - └────────┘ └─────────┘ └─────────┘ └─────────────┘ - │ │ │ │ - Encode Forward Forward Decode - FEC FEC FEC FEC - Encrypt (opaque) (opaque) Decrypt +## System Overview + +```mermaid +graph TB + subgraph "Client A (Browser/CLI)" + MIC[Microphone] --> DN[NoiseSupressor
RNNoise ML] + DN --> SD[SilenceDetector
VAD + Hangover] + SD --> ENC[CallEncoder
Opus/Codec2] + ENC --> FEC_E[FEC Encoder
RaptorQ] + FEC_E --> CRYPT_E[ChaCha20-Poly1305
Encrypt] + CRYPT_E --> QUIC_S[QUIC Datagram
Send] + + QUIC_R[QUIC Datagram
Recv] --> CRYPT_D[ChaCha20-Poly1305
Decrypt] + CRYPT_D --> FEC_D[FEC Decoder
RaptorQ] + FEC_D --> JIT[JitterBuffer
Adaptive Playout] + JIT --> DEC[CallDecoder
Opus/Codec2] + DEC --> SPK[Speaker] + end + + subgraph "Relay (SFU)" + ACCEPT[Accept QUIC] --> AUTH{Auth?} + AUTH -->|token| VALIDATE[POST /v1/auth/validate] + AUTH -->|no auth| HS + VALIDATE --> HS[Crypto Handshake
X25519 + Ed25519] + HS --> ROOM[Room Manager
Named Rooms via SNI] + ROOM --> FWD[Forward to
Other Participants] + end + + subgraph "Client B" + B_SPK[Speaker] + B_MIC[Microphone] + end + + QUIC_S -->|UDP/QUIC| ACCEPT + FWD -->|UDP/QUIC| QUIC_R + B_MIC -.->|same pipeline| ACCEPT + FWD -.->|same pipeline| B_SPK + + style MIC fill:#4a9eff + style SPK fill:#4a9eff + style B_MIC fill:#4a9eff + style B_SPK fill:#4a9eff + style ROOM fill:#ff9f43 + style CRYPT_E fill:#ee5a24 + style CRYPT_D fill:#ee5a24 ``` -In the simplest deployment a single relay serves as the meeting point (room mode, SFU). Clients connect directly to one relay, which forwards media to all other participants in the same room. For censorship-resistant links, two relays can be chained: a client-facing relay forwards all traffic to a remote relay via QUIC. +## Crate Dependency Graph -Room names are carried in the QUIC SNI field during the TLS handshake, so a single relay can host many independent rooms without additional signaling. +```mermaid +graph TD + PROTO[wzp-proto
Types, Traits, Wire Format] -## Protocol Stack + CODEC[wzp-codec
Opus + Codec2 + RNNoise] + FEC[wzp-fec
RaptorQ FEC] + CRYPTO[wzp-crypto
ChaCha20 + Identity] + TRANSPORT[wzp-transport
QUIC/Quinn] -``` -┌──────────────────────────────────────────────┐ -│ Application (Opus / Codec2 audio) │ wzp-codec -├──────────────────────────────────────────────┤ -│ Redundancy (RaptorQ FEC + interleaving) │ wzp-fec -├──────────────────────────────────────────────┤ -│ Crypto (ChaCha20-Poly1305 + AEAD) │ wzp-crypto -├──────────────────────────────────────────────┤ -│ Transport (QUIC DATAGRAM + reliable stream) │ wzp-transport -├──────────────────────────────────────────────┤ -│ Obfuscation (Phase 2 — trait defined) │ wzp-proto::ObfuscationLayer -└──────────────────────────────────────────────┘ + RELAY[wzp-relay
Relay Daemon] + CLIENT[wzp-client
CLI + Call Engine] + WEB[wzp-web
Browser Bridge] + + PROTO --> CODEC + PROTO --> FEC + PROTO --> CRYPTO + PROTO --> TRANSPORT + + CODEC --> CLIENT + FEC --> CLIENT + CRYPTO --> CLIENT + TRANSPORT --> CLIENT + CODEC --> RELAY + FEC --> RELAY + CRYPTO --> RELAY + TRANSPORT --> RELAY + + CLIENT --> WEB + TRANSPORT --> WEB + CRYPTO --> WEB + + FC[warzone-protocol
featherChat Identity] -.->|path dep| CRYPTO + + style PROTO fill:#6c5ce7 + style RELAY fill:#ff9f43 + style CLIENT fill:#00b894 + style WEB fill:#0984e3 + style FC fill:#fd79a8 ``` -Audio and FEC are end-to-end between caller and callee. The relay operates on opaque, encrypted, FEC-protected packets. Crypto keys are never shared with relays. - -## Wire Format +## Wire Formats ### MediaHeader (12 bytes) ``` -Byte 0: [V:1][T:1][CodecID:4][Q:1][FecRatioHi:1] -Byte 1: [FecRatioLo:6][unused:2] -Byte 2-3: Sequence number (big-endian u16) -Byte 4-7: Timestamp in ms since session start (big-endian u32) -Byte 8: FEC block ID (wrapping u8) -Byte 9: FEC symbol index within block -Byte 10: Reserved / flags -Byte 11: CSRC count (for future mixing) +Byte 0: [V:1][T:1][CodecID:4][Q:1][FecHi:1] +Byte 1: [FecLo:6][unused:2] +Bytes 2-3: sequence (u16 BE) +Bytes 4-7: timestamp_ms (u32 BE) +Byte 8: fec_block_id (u8) +Byte 9: fec_symbol_idx (u8) +Byte 10: reserved +Byte 11: csrc_count + +V = version (0), T = is_repair, CodecID = codec, Q = quality_report appended ``` -Field details: - -| Field | Bits | Description | -|-------|------|-------------| -| V | 1 | Protocol version (0 = v1) | -| T | 1 | 1 = FEC repair packet, 0 = source media | -| CodecID | 4 | Codec identifier (0=Opus24k, 1=Opus16k, 2=Opus6k, 3=Codec2_3200, 4=Codec2_1200) | -| Q | 1 | QualityReport trailer appended | -| FecRatio | 7 | FEC ratio encoded as 7-bit value (0-127 maps to 0.0-2.0) | -| Seq | 16 | Wrapping packet sequence number | -| Timestamp | 32 | Milliseconds since session start | -| FEC block | 8 | Source block ID (wrapping) | -| FEC symbol | 8 | Symbol index within the FEC block | -| Reserved | 8 | Reserved flags | -| CSRC count | 8 | Contributing source count (future) | - -Defined in `crates/wzp-proto/src/packet.rs` as `MediaHeader`. - -### QualityReport (4 bytes) - -Appended to a media packet when the Q flag is set. +### MiniHeader (4 bytes, compressed) ``` -Byte 0: loss_pct — 0-255 maps to 0-100% loss -Byte 1: rtt_4ms — RTT in 4ms units (0-255 = 0-1020ms) -Byte 2: jitter_ms — Jitter in milliseconds -Byte 3: bitrate_cap — Max receive bitrate in kbps +Bytes 0-1: timestamp_delta_ms (u16 BE) +Bytes 2-3: payload_len (u16 BE) + +Preceded by FRAME_TYPE_MINI (0x01). Full header every 50 frames (~1s). +Saves 8 bytes/packet (67% header reduction). ``` -Defined in `crates/wzp-proto/src/packet.rs` as `QualityReport`. - -### MediaPacket - -A complete media packet on the wire: +### TrunkFrame (batched datagrams) ``` -[MediaHeader: 12 bytes][Payload: variable][QualityReport: 4 bytes if Q=1] +[count:u16] + [session_id:2][len:u16][payload:len] x count + +Packs multiple session packets into one QUIC datagram. +Max 10 entries or 1200 bytes, flushed every 5ms. ``` -Defined in `crates/wzp-proto/src/packet.rs` as `MediaPacket`. - -### SignalMessage (reliable stream) - -Signaling uses length-prefixed JSON over reliable QUIC bidirectional streams. Each message opens a new bidi stream, writes a 4-byte big-endian length prefix followed by the JSON payload, then finishes the send side. - -Variants defined in `crates/wzp-proto/src/packet.rs`: - -- `CallOffer` — identity_pub, ephemeral_pub, signature, supported_profiles -- `CallAnswer` — identity_pub, ephemeral_pub, signature, chosen_profile -- `IceCandidate` — NAT traversal candidate string -- `Rekey` — new_ephemeral_pub, signature -- `QualityUpdate` — report, recommended_profile -- `Ping` / `Pong` — timestamp_ms for RTT measurement -- `Hangup` — reason (Normal, Busy, Declined, Timeout, Error) - -## FEC Strategy - -WarzonePhone uses **RaptorQ fountain codes** (via the `raptorq` crate) for forward error correction. This is implemented in `crates/wzp-fec/`. - -### Block Structure - -Audio frames are grouped into FEC blocks. Each block contains a fixed number of source symbols (configured per quality profile). Each source symbol is a single encoded audio frame, zero-padded to a uniform 256-byte symbol size with a 2-byte little-endian length prefix. - -### Encoding Process - -1. Audio frames are added to the encoder as source symbols -2. When a block is full (`frames_per_block` symbols), repair symbols are generated -3. The repair ratio determines how many repair symbols: `ceil(num_source * ratio)` -4. Both source and repair packets are transmitted with the block ID and symbol index in the header - -### Decoding Process - -1. Received symbols (source or repair) are fed to the decoder keyed by block ID -2. The decoder attempts reconstruction when sufficient symbols arrive -3. RaptorQ can recover the full block from any `K` symbols out of `K + R` total (where K = source count, R = repair count) -4. Old blocks are expired via wrapping u8 distance - -### Interleaving - -The `Interleaver` spreads symbols from multiple FEC blocks across transmission slots in round-robin fashion. With depth=3, a burst loss of 6 consecutive packets damages at most 2 symbols per block instead of 6 symbols in one block. - -### FEC Configuration by Quality Tier - -| Tier | Frames/Block | Repair Ratio | Total Bandwidth Overhead | -|------|-------------|-------------|-------------------------| -| GOOD | 5 | 0.2 (20%) | 1.2x | -| DEGRADED | 10 | 0.5 (50%) | 1.5x | -| CATASTROPHIC | 8 | 1.0 (100%) | 2.0x | - -## Adaptive Quality - -Three quality tiers drive codec and FEC selection. The controller is implemented in `crates/wzp-proto/src/quality.rs` as `AdaptiveQualityController`. - -### Tier Thresholds - -| Tier | Loss | RTT | Codec | FEC Ratio | -|------|------|-----|-------|-----------| -| GOOD | < 10% | < 400ms | Opus 24kbps, 20ms frames | 0.2 | -| DEGRADED | 10-40% or 400-600ms | | Opus 6kbps, 40ms frames | 0.5 | -| CATASTROPHIC | > 40% or > 600ms | | Codec2 1200bps, 40ms frames | 1.0 | - -### Hysteresis - -- **Downgrade**: Triggers after 3 consecutive reports in a worse tier (fast reaction) -- **Upgrade**: Triggers after 10 consecutive reports in a better tier (slow, cautious) -- **Step limit**: Upgrades move only one tier at a time (Catastrophic -> Degraded -> Good) -- **History**: A sliding window of 20 recent reports is maintained for smoothing -- **Force mode**: Manual `force_profile()` disables adaptive logic entirely - -### QualityProfile Constants - -```rust -GOOD: Opus24k, fec=0.2, 20ms, 5 frames/block → 28.8 kbps total -DEGRADED: Opus6k, fec=0.5, 40ms, 10 frames/block → 9.0 kbps total -CATASTROPHIC: Codec2_1200, fec=1.0, 40ms, 8 frames/block → 2.4 kbps total -``` - -## Encryption - -Implemented in `crates/wzp-crypto/`. - -### Identity Model (Warzone-Compatible) - -- **Seed**: 32-byte random value (BIP39 mnemonic for backup) -- **Ed25519**: Derived via `HKDF(seed, "warzone-ed25519-identity")` -- signing/identity -- **X25519**: Derived via `HKDF(seed, "warzone-x25519-identity")` -- encryption -- **Fingerprint**: `SHA-256(Ed25519_pub)[:16]` -- 128-bit identifier - -### Per-Call Key Exchange - -1. Each side generates an ephemeral X25519 keypair -2. Ephemeral public keys are exchanged via `CallOffer`/`CallAnswer` signaling -3. Signatures are computed: `Ed25519_sign(ephemeral_pub || context_string)` -4. Shared secret: `X25519_DH(our_ephemeral_secret, peer_ephemeral_pub)` -5. Session key: `HKDF(shared_secret, "warzone-session-key")` -> 32 bytes - -### Nonce Construction (12 bytes, not transmitted) +### QualityReport (4 bytes, optional) ``` -session_id[0..4] || sequence_number (u32 BE) || direction (1 byte) || padding (3 bytes zero) +Byte 0: loss_pct (0-255 maps to 0-100%) +Byte 1: rtt_4ms (0-255 maps to 0-1020ms) +Byte 2: jitter_ms +Byte 3: bitrate_cap_kbps ``` -- `session_id`: First 4 bytes of `SHA-256(session_key)` -- `direction`: 0 = Send, 1 = Recv -- Nonces are derived deterministically, saving 12 bytes per packet - -### AEAD Encryption - -- Algorithm: ChaCha20-Poly1305 -- AAD: The 12-byte MediaHeader (authenticated but not encrypted) -- Tag: 16 bytes appended to ciphertext -- Overhead per packet: 16 bytes - -### Rekeying - -- Trigger: Every 2^16 packets (65536) -- Process: New ephemeral X25519 exchange, mixed with old key via HKDF -- Key evolution: `HKDF(old_key as salt, new_DH_result, "warzone-rekey")` -- Old key is zeroized after derivation (forward secrecy) -- Sequence counters reset to 0 after rekey - -### Anti-Replay - -- Sliding window of 1024 packets using a bitmap -- Sequence numbers too old (> 1024 behind highest seen) are rejected -- Handles u16 wrapping correctly (RFC 1982 serial number arithmetic) -- Implemented in `crates/wzp-crypto/src/anti_replay.rs` as `AntiReplayWindow` - -## Jitter Buffer - -Implemented in `crates/wzp-proto/src/jitter.rs` as `JitterBuffer`. - -- **Structure**: BTreeMap keyed by sequence number for ordered playout -- **Target depth**: 50 packets (1 second) default -- **Max depth**: 250 packets (5 seconds at 20ms/frame) -- **Min depth**: 25 packets (0.5 seconds) before playout begins -- **Sequence wrapping**: RFC 1982 serial number arithmetic for u16 -- **Duplicate handling**: Silently dropped -- **Late packets**: Packets arriving after their sequence has been played out are dropped -- **Overflow**: When buffer exceeds max depth, oldest packets are evicted - -### Playout Results - -- `Packet(MediaPacket)` -- normal delivery -- `Missing { seq }` -- gap detected, decoder should generate PLC -- `NotReady` -- buffer not yet filled to minimum depth - -### Known Limitations - -- No adaptive depth adjustment based on observed jitter (target_depth is configurable but not self-tuning in the current implementation) -- No timestamp-based playout scheduling (uses sequence-number ordering only) -- Jitter buffer drift has been observed during long echo tests - -## Session State Machine - -Defined in `crates/wzp-proto/src/session.rs`: +### SignalMessage (JSON over reliable QUIC stream) ``` -Idle -> Connecting -> Handshaking -> Active <-> Rekeying -> Active - | - Closed +[4-byte length prefix][serde_json payload] + +Variants: + CallOffer { identity_pub, ephemeral_pub, signature, supported_profiles } + CallAnswer { identity_pub, ephemeral_pub, signature, chosen_profile } + IceCandidate { candidate } + Hangup { reason: Normal|Busy|Declined|Timeout|Error } + AuthToken { token } + Hold, Unhold, Mute, Unmute + Transfer { target_fingerprint, relay_addr } + TransferAck + Rekey { new_ephemeral_pub, signature } + QualityUpdate { report, recommended_profile } + Ping/Pong { timestamp_ms } ``` -- Media flows during both `Active` and `Rekeying` states -- Any state can transition to `Closed` via `Terminate` or `ConnectionLost` -- Invalid transitions produce a `TransitionError` +## Quality Profiles + +```mermaid +graph LR + subgraph GOOD ["GOOD (28.8 kbps)"] + G_C[Opus 24kbps] + G_F[FEC 20%] + G_FR[20ms frames] + end + + subgraph DEGRADED ["DEGRADED (9.0 kbps)"] + D_C[Opus 6kbps] + D_F[FEC 50%] + D_FR[40ms frames] + end + + subgraph CATASTROPHIC ["CATASTROPHIC (2.4 kbps)"] + C_C[Codec2 1200bps] + C_F[FEC 100%] + C_FR[40ms frames] + end + + GOOD -->|"loss>5% or RTT>100ms
3 consecutive reports"| DEGRADED + DEGRADED -->|"loss>15% or RTT>200ms
3 consecutive"| CATASTROPHIC + CATASTROPHIC -->|"loss<5% and RTT<100ms
3 consecutive"| DEGRADED + DEGRADED -->|"loss<5% and RTT<100ms
3 consecutive"| GOOD + + style GOOD fill:#00b894 + style DEGRADED fill:#fdcb6e + style CATASTROPHIC fill:#e17055 +``` + +## Cryptographic Handshake + +```mermaid +sequenceDiagram + participant C as Caller + participant R as Relay/Callee + + Note over C: Derive identity from seed
Ed25519 + X25519 via HKDF + + C->>C: Generate ephemeral X25519 + C->>C: Sign(ephemeral_pub || "call-offer") + C->>R: CallOffer { identity_pub, ephemeral_pub, signature, profiles } + + R->>R: Verify Ed25519 signature + R->>R: Generate ephemeral X25519 + R->>R: shared_secret = DH(eph_b, eph_a) + R->>R: session_key = HKDF(shared_secret, "warzone-session-key") + R->>R: Sign(ephemeral_pub || "call-answer") + R->>C: CallAnswer { identity_pub, ephemeral_pub, signature, chosen_profile } + + C->>C: Verify signature + C->>C: shared_secret = DH(eph_a, eph_b) + C->>C: session_key = HKDF(shared_secret) + + Note over C,R: Both have identical ChaCha20-Poly1305 session key + C->>R: Encrypted media (QUIC datagrams) + R->>C: Encrypted media (QUIC datagrams) + + Note over C,R: Rekey every 65,536 packets
New ephemeral DH + HKDF mix +``` + +## Identity Model (featherChat Compatible) + +```mermaid +graph TD + SEED[32-byte Seed
BIP39 Mnemonic 24 words] --> HKDF1[HKDF
salt=None
info=warzone-ed25519] + SEED --> HKDF2[HKDF
salt=None
info=warzone-x25519] + + HKDF1 --> ED[Ed25519 SigningKey
Digital Signatures] + HKDF2 --> X25519[X25519 StaticSecret
Key Agreement] + + ED --> VKEY[Ed25519 VerifyingKey
Public] + X25519 --> XPUB[X25519 PublicKey
Public] + + VKEY --> FP[Fingerprint
SHA-256 pubkey truncated 16 bytes
xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx] + + style SEED fill:#6c5ce7 + style FP fill:#fd79a8 + style ED fill:#ee5a24 + style X25519 fill:#00b894 +``` ## Relay Modes -### Room Mode (Default, SFU) +```mermaid +graph TB + subgraph "Room Mode (Default SFU)" + C1[Client 1] -->|QUIC SNI=room-hash| RM[Room Manager] + C2[Client 2] -->|QUIC SNI=room-hash| RM + C3[Client 3] -->|QUIC SNI=room-hash| RM + RM --> R1[Room abc123] + R1 -->|fan-out| C1 + R1 -->|fan-out| C2 + R1 -->|fan-out| C3 + end -- Clients join named rooms via QUIC SNI -- When a participant sends a packet, the relay forwards it to all other participants -- No transcoding -- packets are forwarded opaquely -- Rooms are auto-created when the first participant joins and auto-deleted when empty -- Managed by `RoomManager` in `crates/wzp-relay/src/room.rs` + subgraph "Forward Mode with --remote" + C4[Client] -->|QUIC| RA[Relay A] + RA -->|FEC decode then jitter then FEC encode| RB[Relay B] + RB -->|QUIC| C5[Client] + end -### Forward Mode (`--remote`) + subgraph "Probe Mode with --probe" + PA[Relay A] -->|Ping 1/s ~50 bytes| PB[Relay B] + PB -->|Pong| PA + PA --> PM[Prometheus
RTT Loss Jitter Up/Down] + end -- All incoming traffic is forwarded to a remote relay via QUIC -- Two-pipeline architecture: upstream (client->remote) and downstream (remote->client) -- Each direction has its own `RelayPipeline` with FEC decode/encode and jitter buffering -- Intended for chaining relays across censored/lossy boundaries - -### Relay Pipeline (Forward Mode) - -Implemented in `crates/wzp-relay/src/pipeline.rs` as `RelayPipeline`: - -``` -Inbound: recv -> FEC decode -> jitter buffer -> pop -Outbound: packet -> assign seq -> FEC encode -> repair packets -> send + style RM fill:#ff9f43 + style R1 fill:#fdcb6e + style PM fill:#0984e3 ``` -The pipeline does NOT decode/re-encode audio. It operates on FEC-protected packets, managing loss recovery and re-FEC-encoding for the next hop. +## Web Bridge Architecture -## Transport +```mermaid +sequenceDiagram + participant B as Browser + participant W as wzp-web + participant R as wzp-relay -Implemented in `crates/wzp-transport/` using QUIC via the `quinn` crate. + B->>W: HTTPS GET /room-name + W->>B: index.html (SPA) -### QUIC Configuration + B->>W: WebSocket /ws/room-name + Note over B,W: Optional auth JSON message -- ALPN protocol: `wzp` -- Idle timeout: 30 seconds -- Keep-alive interval: 5 seconds -- DATAGRAM extension enabled (for unreliable media) -- Datagram receive buffer: 64 KB -- Receive window: 256 KB -- Send window: 128 KB -- Stream receive window: 64 KB per stream -- Initial RTT estimate: 300ms (tuned for high-latency links) + W->>R: QUIC connect (SNI = hashed room name) + Note over W,R: AuthToken then Handshake then Join Room -### Media Transport + loop Every 20ms + B->>W: WS Binary Int16 x 960 PCM + W->>W: CallEncoder Opus + FEC + W->>R: QUIC Datagram encrypted + end -- **Unreliable media**: QUIC DATAGRAM frames (no retransmission, no head-of-line blocking) -- **Reliable signaling**: QUIC bidirectional streams with length-prefixed JSON framing + loop Incoming audio + R->>W: QUIC Datagram + W->>W: CallDecoder FEC + Opus + W->>B: WS Binary Int16 x 960 PCM + end -### Path Quality Monitoring + Note over B: AudioWorklet
WZPCaptureProcessor mic to 960 frames
WZPPlaybackProcessor ring buffer to speaker +``` -`PathMonitor` in `crates/wzp-transport/src/path_monitor.rs` tracks: +## FEC Protection (RaptorQ) -- **Loss**: EWMA-smoothed percentage from sent/received packet counts -- **RTT**: EWMA-smoothed round-trip time (alpha=0.1) -- **Jitter**: EWMA of RTT variance (|current_rtt - previous_rtt|) -- **Bandwidth**: Estimated from bytes received over elapsed time +```mermaid +graph LR + subgraph "Encoder" + F1[Frame 1] --> BLK[Source Block
5-10 frames] + F2[Frame 2] --> BLK + F3[Frame 3] --> BLK + F4[Frame 4] --> BLK + F5[Frame 5] --> BLK + BLK --> SRC[5 Source Symbols] + BLK --> REP[1-10 Repair Symbols
ratio dependent] + SRC --> INT[Interleaver
depth=3] + REP --> INT + end -### Codec Selection by Tier + subgraph "Network" + INT --> LOSS{Packet Loss} + LOSS -->|some lost| RCV[Received Symbols] + end -| Codec | Sample Rate | Frame Duration | Bitrate | Use Case | -|-------|------------|----------------|---------|----------| -| Opus24k | 48 kHz | 20ms (960 samples) | 24 kbps | Good conditions | -| Opus16k | 48 kHz | 20ms | 16 kbps | Moderate conditions | -| Opus6k | 48 kHz | 40ms (1920 samples) | 6 kbps | Degraded conditions | -| Codec2_3200 | 8 kHz | 20ms (160 samples) | 3.2 kbps | Poor conditions | -| Codec2_1200 | 8 kHz | 40ms (320 samples) | 1.2 kbps | Catastrophic conditions | + subgraph "Decoder" + RCV --> DEINT[De-interleaver] + DEINT --> RAPTORQ[RaptorQ Decoder
Reconstruct from
any K of K+R symbols] + RAPTORQ --> OUT[Original Frames] + end -Opus operates at 48 kHz natively. When Codec2 is selected, the adaptive codec layer handles 48 kHz <-> 8 kHz resampling transparently using a simple linear resampler (6:1 decimation/interpolation). + style LOSS fill:#e17055 + style RAPTORQ fill:#00b894 +``` + +## Telemetry Stack + +```mermaid +graph TB + subgraph "Relay" + RM[RelayMetrics
sessions rooms packets] + SM[SessionMetrics
per-session jitter loss RTT] + PM[ProbeMetrics
inter-relay RTT loss] + RM --> PROM1[GET /metrics :9090] + SM --> PROM1 + PM --> PROM1 + end + + subgraph "Web Bridge" + WM[WebMetrics
connections frames latency] + WM --> PROM2[GET /metrics :8080] + end + + subgraph "Client" + CM[JitterStats + QualityAdapter] + CM --> JSONL[--metrics-file
JSONL 1 line/sec] + end + + PROM1 --> GRAF[Grafana Dashboard
4 rows 18 panels] + PROM2 --> GRAF + JSONL --> ANALYSIS[Offline Analysis] + + style GRAF fill:#ff6b6b + style PROM1 fill:#0984e3 + style PROM2 fill:#0984e3 +``` + +## Session State Machine + +```mermaid +stateDiagram-v2 + [*] --> Idle + Idle --> Connecting: connect + Connecting --> Handshaking: QUIC established + Handshaking --> Active: CallOffer/Answer complete + Active --> Rekeying: 65536 packets + Rekeying --> Active: new key derived + Active --> Closed: Hangup/Error/Timeout + Rekeying --> Closed: Error + Connecting --> Closed: Timeout + Handshaking --> Closed: Signature fail + + note right of Active: Media flows + note right of Rekeying: Media continues while rekeying +``` + +## Audio Processing Pipeline Detail + +```mermaid +graph TD + subgraph "Capture 20ms at 48kHz = 960 samples" + MIC[Microphone / AudioWorklet] --> PCM[PCM i16 x 960] + PCM --> RNN[RNNoise Denoise
2 x 480 samples] + RNN --> VAD{Silent?} + VAD -->|Yes over 100ms| CN[ComfortNoise packet
every 200ms] + VAD -->|No or Hangover| OPUS[Opus/Codec2 Encode] + end + + subgraph "FEC + Crypto" + OPUS --> SYMBOL[Pad to 256-byte symbol] + CN --> SYMBOL + SYMBOL --> BLOCK[Accumulate block
5-10 symbols] + BLOCK --> RAPTOR[RaptorQ encode
+ repair symbols] + RAPTOR --> INTERLEAVE[Interleave depth=3] + INTERLEAVE --> HDR[Add MediaHeader
or MiniHeader] + HDR --> ENCRYPT[ChaCha20-Poly1305
header=AAD payload=encrypted] + ENCRYPT --> QUIC[QUIC Datagram] + end + + style RNN fill:#a29bfe + style ENCRYPT fill:#ee5a24 + style RAPTOR fill:#00b894 +``` + +## Adaptive Jitter Buffer + +```mermaid +graph TD + PKT[Incoming Packet] --> SEQ{Sequence Check} + SEQ -->|Duplicate| DROP[Drop + AntiReplay] + SEQ -->|Valid| BUF[BTreeMap Buffer
ordered by seq] + + BUF --> ADAPT[AdaptivePlayoutDelay
EMA jitter tracking] + ADAPT --> TARGET[target_delay =
ceil jitter_ema/20ms + 2] + + BUF --> READY{depth >= target?} + READY -->|No| WAIT[Wait / Underrun++] + READY -->|Yes| POP[Pop lowest seq] + POP --> DECODE[Decode to PCM] + DECODE --> PLAY[Playout] + + BUF --> OVERFLOW{depth > max?} + OVERFLOW -->|Yes| EVICT[Drop oldest
Overrun++] + + style ADAPT fill:#fdcb6e + style DROP fill:#e17055 + style EVICT fill:#e17055 +``` + +## Deployment Topology + +```mermaid +graph TB + subgraph "Region A" + RA[wzp-relay A
:4433 UDP] + WA[wzp-web A
:8080 HTTPS] + WA --> RA + end + + subgraph "Region B" + RB[wzp-relay B
:4433 UDP] + WB[wzp-web B
:8080 HTTPS] + WB --> RB + end + + RA <-->|Probe 1/s| RB + + BA[Browser A] -->|WSS| WA + BB[Browser B] -->|WSS| WB + CA[CLI Client] -->|QUIC| RA + + PROM[Prometheus] -->|scrape| RA + PROM -->|scrape| RB + PROM -->|scrape| WA + PROM --> GRAF[Grafana] + + FC[featherChat Server] -->|auth validate| RA + FC -->|auth validate| RB + + style RA fill:#ff9f43 + style RB fill:#ff9f43 + style GRAF fill:#ff6b6b + style FC fill:#fd79a8 +``` + +## featherChat Integration Flow + +```mermaid +sequenceDiagram + participant A as User A WZP Client + participant FC as featherChat Server + participant R as WZP Relay + participant B as User B WZP Client + + Note over A,B: Both users share BIP39 seed = same identity + + A->>FC: WS CallSignal Offer payload=JSON SignalMessage + FC->>B: WS CallSignal Offer payload + relay_addr + room + + B->>R: QUIC connect SNI = hashed room + B->>R: AuthToken fc_bearer_token + R->>FC: POST /v1/auth/validate token + FC->>R: valid true fingerprint ... + B->>R: CallOffer then CallAnswer handshake + + A->>R: QUIC connect same room + A->>R: AuthToken + Handshake + + Note over A,B: Both in same room media flows E2E encrypted + A->>R: Encrypted media + R->>B: Forward SFU no decryption + B->>R: Encrypted media + R->>A: Forward +``` + +## Bandwidth Usage + +| Profile | Audio | FEC Overhead | Total | Use Case | +|---------|-------|-------------|-------|----------| +| **GOOD** | 24 kbps (Opus) | 20% = 4.8 kbps | **28.8 kbps** | WiFi, LTE, good links | +| **DEGRADED** | 6 kbps (Opus) | 50% = 3 kbps | **9.0 kbps** | 3G, congested WiFi | +| **CATASTROPHIC** | 1.2 kbps (Codec2) | 100% = 1.2 kbps | **2.4 kbps** | Satellite, extreme loss | + +With silence suppression: ~50% savings in typical conversations. +With mini-frames: 8 bytes/packet saved (67% header reduction). +With trunking: shared QUIC overhead across multiplexed sessions. + +## Project Structure + +``` +warzonePhone/ +├── Cargo.toml # Workspace root +├── crates/ +│ ├── wzp-proto/ # Protocol types, traits, wire format +│ │ └── src/ +│ │ ├── codec_id.rs # CodecId, QualityProfile +│ │ ├── error.rs # Error types +│ │ ├── jitter.rs # JitterBuffer, AdaptivePlayoutDelay +│ │ ├── packet.rs # MediaHeader, MiniHeader, TrunkFrame, SignalMessage +│ │ ├── quality.rs # Tier, AdaptiveQualityController +│ │ ├── session.rs # SessionState machine +│ │ └── traits.rs # AudioEncoder, FecEncoder, CryptoSession, etc. +│ ├── wzp-codec/ # Audio codecs +│ │ └── src/ +│ │ ├── adaptive.rs # AdaptiveEncoder/Decoder (Opus + Codec2) +│ │ ├── denoise.rs # NoiseSupressor (RNNoise/nnnoiseless) +│ │ └── silence.rs # SilenceDetector, ComfortNoise +│ ├── wzp-fec/ # Forward error correction +│ │ └── src/ +│ │ ├── encoder.rs # RaptorQFecEncoder +│ │ ├── decoder.rs # RaptorQFecDecoder +│ │ └── interleave.rs # Interleaver (burst protection) +│ ├── wzp-crypto/ # Cryptography + identity +│ │ └── src/ +│ │ ├── identity.rs # Seed, Fingerprint, hash_room_name +│ │ ├── handshake.rs # WarzoneKeyExchange (X25519 + Ed25519) +│ │ ├── session.rs # ChaChaSession (ChaCha20-Poly1305) +│ │ ├── nonce.rs # Deterministic nonce construction +│ │ ├── anti_replay.rs # Sliding window replay protection +│ │ └── rekey.rs # Forward secrecy rekeying +│ ├── wzp-transport/ # QUIC transport layer +│ │ └── src/lib.rs # QuinnTransport, send/recv media/signal/trunk +│ ├── wzp-relay/ # Relay daemon +│ │ └── src/ +│ │ ├── main.rs # CLI, connection loop, auth + handshake +│ │ ├── room.rs # RoomManager, TrunkedForwarder +│ │ ├── pipeline.rs # RelayPipeline (forward mode) +│ │ ├── session_mgr.rs # SessionManager (limits, lifecycle) +│ │ ├── auth.rs # featherChat token validation +│ │ ├── handshake.rs # Relay-side accept_handshake +│ │ ├── metrics.rs # Prometheus RelayMetrics + per-session +│ │ ├── probe.rs # Inter-relay probes + ProbeMesh +│ │ └── trunk.rs # TrunkBatcher +│ ├── wzp-client/ # Call engine + CLI +│ │ └── src/ +│ │ ├── cli.rs # CLI arg parsing + main +│ │ ├── call.rs # CallEncoder, CallDecoder, QualityAdapter +│ │ ├── handshake.rs # Client-side perform_handshake +│ │ ├── featherchat.rs # CallSignal bridge +│ │ ├── echo_test.rs # Automated echo quality test +│ │ ├── drift_test.rs # Clock drift measurement +│ │ ├── sweep.rs # Jitter buffer parameter sweep +│ │ ├── metrics.rs # JSONL telemetry writer +│ │ └── bench.rs # Component benchmarks +│ └── wzp-web/ # Browser bridge +│ ├── src/ +│ │ ├── main.rs # Axum server, WS handler, TLS +│ │ └── metrics.rs # Prometheus WebMetrics +│ └── static/ +│ ├── index.html # SPA UI (room, PTT, level meter) +│ └── audio-processor.js # AudioWorklet (capture + playback) +├── deps/featherchat/ # Git submodule +├── docs/ +│ ├── ARCHITECTURE.md # This file +│ ├── TELEMETRY.md # Metrics specification +│ ├── INTEGRATION_TASKS.md # featherChat task tracker +│ ├── WZP-FC-SHARED-CRATES.md # Shared crate strategy +│ └── grafana-dashboard.json # Pre-built Grafana dashboard +└── scripts/ + └── build-linux.sh # Hetzner VM build +``` + +## Test Coverage + +272 tests across all crates, 0 failures. + +| Crate | Tests | Key Coverage | +|-------|-------|-------------| +| wzp-proto | 41 | Wire format, jitter buffer, quality tiers, mini-frames, trunking | +| wzp-codec | 31 | Opus/Codec2 roundtrip, silence detection, noise suppression | +| wzp-fec | 22 | RaptorQ encode/decode, loss recovery, interleaving | +| wzp-crypto | 34 + 28 compat | Encrypt/decrypt, handshake, anti-replay, featherChat identity compat | +| wzp-transport | 2 | QUIC connection setup | +| wzp-relay | 40 + 4 integration | Room ACL, session mgmt, metrics, probes, mesh, trunking | +| wzp-client | 30 + 2 integration | Encoder/decoder, quality adapter, silence, drift, sweep | +| wzp-web | 2 | Metrics |