# WarzonePhone Architecture
> Custom lossy VoIP protocol built in Rust. E2E encrypted, FEC-protected, adaptive quality, designed for hostile network conditions.
## System Overview
```mermaid
graph TB
subgraph "Client A (Desktop / Android / CLI)"
MIC[Microphone] --> DN[NoiseSuppressor<br/>RNNoise ML]
DN --> SD[SilenceDetector<br/>VAD + Hangover]
SD --> ENC[CallEncoder<br/>Opus / Codec2]
ENC --> FEC_E[FEC Encoder<br/>RaptorQ]
FEC_E --> CRYPT_E[ChaCha20-Poly1305<br/>Encrypt]
CRYPT_E --> QUIC_S[QUIC Datagram<br/>Send]
QUIC_R[QUIC Datagram<br/>Recv] --> CRYPT_D[ChaCha20-Poly1305<br/>Decrypt]
CRYPT_D --> FEC_D[FEC Decoder<br/>RaptorQ]
FEC_D --> JIT[JitterBuffer<br/>Adaptive Playout]
JIT --> DEC[CallDecoder<br/>Opus / Codec2]
DEC --> SPK[Speaker]
end
subgraph "Relay (SFU)"
ACCEPT[Accept QUIC] --> AUTH{Auth?}
AUTH -->|token| VALIDATE[POST /v1/auth/validate]
AUTH -->|no auth| HS
VALIDATE --> HS[Crypto Handshake<br/>X25519 + Ed25519]
HS --> ROOM[Room Manager<br/>Named Rooms via SNI]
ROOM --> FWD[Forward to<br/>Other Participants]
end
subgraph "Client B"
B_SPK[Speaker]
B_MIC[Microphone]
end
QUIC_S -->|UDP / QUIC| ACCEPT
FWD -->|UDP / QUIC| QUIC_R
B_MIC -.->|same pipeline| ACCEPT
FWD -.->|same pipeline| B_SPK
style MIC fill:#4a9eff,color:#fff
style SPK fill:#4a9eff,color:#fff
style B_MIC fill:#4a9eff,color:#fff
style B_SPK fill:#4a9eff,color:#fff
style ROOM fill:#ff9f43,color:#fff
style CRYPT_E fill:#ee5a24,color:#fff
style CRYPT_D fill:#ee5a24,color:#fff
```
## Crate Dependency Graph
```mermaid
graph TD
PROTO["wzp-proto<br/>Types, Traits, Wire Format"]
CODEC["wzp-codec<br/>Opus + Codec2 + RNNoise"]
FEC["wzp-fec<br/>RaptorQ FEC"]
CRYPTO["wzp-crypto<br/>ChaCha20 + Identity"]
TRANSPORT["wzp-transport<br/>QUIC / Quinn"]
RELAY["wzp-relay<br/>Relay Daemon"]
CLIENT["wzp-client<br/>CLI + Call Engine"]
WEB["wzp-web<br/>Browser Bridge"]
PROTO --> CODEC
PROTO --> FEC
PROTO --> CRYPTO
PROTO --> TRANSPORT
CODEC --> CLIENT
FEC --> CLIENT
CRYPTO --> CLIENT
TRANSPORT --> CLIENT
CODEC --> RELAY
FEC --> RELAY
CRYPTO --> RELAY
TRANSPORT --> RELAY
CLIENT --> WEB
TRANSPORT --> WEB
CRYPTO --> WEB
FC["warzone-protocol<br/>featherChat Identity"] -.->|path dep| CRYPTO
style PROTO fill:#6c5ce7,color:#fff
style RELAY fill:#ff9f43,color:#fff
style CLIENT fill:#00b894,color:#fff
style WEB fill:#0984e3,color:#fff
style FC fill:#fd79a8,color:#fff
```
**Star pattern**: Each leaf crate (`wzp-codec`, `wzp-fec`, `wzp-crypto`, `wzp-transport`) depends only on `wzp-proto`, and no leaf depends on another leaf. The integration crates sit on top: `wzp-relay` and `wzp-client` depend on all four leaves, while `wzp-web` depends on `wzp-client`, `wzp-transport`, and `wzp-crypto`.
## Audio Encode Pipeline
```mermaid
sequenceDiagram
participant Mic as Microphone<br/>(48kHz)
participant Ring as SPSC Ring<br/>(lock-free)
participant RNN as RNNoise<br/>(2 x 480)
participant VAD as SilenceDetector
participant Codec as Opus / Codec2
participant FEC as RaptorQ FEC
participant INT as Interleaver<br/>(depth=3)
participant HDR as MediaHeader<br/>(12B or Mini 4B)
participant Enc as ChaCha20-Poly1305
participant QUIC as QUIC Datagram
Mic->>Ring: f32 x 512 (macOS callback)
Ring->>Ring: Accumulate to 960 samples
Ring->>RNN: PCM i16 x 960 (20ms frame)
RNN->>VAD: Denoised audio
alt Speech active (or hangover)
VAD->>Codec: Encode active frame
else Silence (>100ms)
VAD->>Codec: ComfortNoise (every 200ms)
end
Codec->>FEC: Compressed bytes (pad to 256B symbol)
FEC->>FEC: Accumulate block (5-10 symbols)
FEC->>INT: Source + repair symbols
INT->>HDR: Interleaved packets
HDR->>Enc: Header as AAD
Enc->>QUIC: Encrypted payload + 16B tag
```
### Key Details
- macOS delivers **512 f32** samples per callback (not configurable to 960)
- Ring buffer accumulates to **960 samples** (20ms at 48 kHz) for codec frame
- RNNoise processes **2 x 480** samples (ML-based noise suppression via nnnoiseless)
- Silence detection uses VAD + 100ms hangover before switching to ComfortNoise
- FEC symbols are padded to **256 bytes** with a 2-byte LE length prefix
- MiniHeaders (4 bytes) replace full headers (12 bytes) for 49 of every 50 frames
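
The symbol padding step can be sketched as follows. This is a minimal illustration, not the `wzp-fec` implementation: the function names are hypothetical, and it assumes the 2-byte length prefix counts toward the 256-byte symbol, which the text above does not state.

```rust
/// Fixed RaptorQ symbol size used by the encode pipeline.
const SYMBOL_SIZE: usize = 256;
/// Little-endian u16 length prefix (assumed to live inside the symbol).
const LEN_PREFIX: usize = 2;

/// Pads one compressed frame into a fixed-size symbol.
/// Returns `None` when the frame does not fit behind the prefix.
pub fn pad_symbol(frame: &[u8]) -> Option<[u8; SYMBOL_SIZE]> {
    if frame.len() > SYMBOL_SIZE - LEN_PREFIX {
        return None;
    }
    let mut symbol = [0u8; SYMBOL_SIZE];
    symbol[..LEN_PREFIX].copy_from_slice(&(frame.len() as u16).to_le_bytes());
    symbol[LEN_PREFIX..LEN_PREFIX + frame.len()].copy_from_slice(frame);
    Some(symbol)
}

/// Recovers the original frame from a padded symbol.
/// Returns `None` when the stored length points past the symbol.
pub fn unpad_symbol(symbol: &[u8; SYMBOL_SIZE]) -> Option<&[u8]> {
    let len = u16::from_le_bytes([symbol[0], symbol[1]]) as usize;
    symbol.get(LEN_PREFIX..LEN_PREFIX + len)
}
```

The prefix lets the decoder strip the zero padding after FEC reconstruction without knowing the codec's frame size in advance.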
## Audio Decode Pipeline
```mermaid
sequenceDiagram
participant QUIC as QUIC Datagram
participant Dec as ChaCha20-Poly1305
participant AR as Anti-Replay<br/>(sliding window)
participant HDR as Header Parse
participant DEINT as De-interleaver
participant FEC as RaptorQ FEC<br/>(reconstruct)
participant JIT as JitterBuffer<br/>(BTreeMap)
participant Codec as Opus / Codec2
participant Ring as SPSC Ring<br/>(lock-free)
participant SPK as Speaker
QUIC->>Dec: Encrypted packet
Dec->>AR: Decrypt (header = AAD)
AR->>AR: Check seq window (reject replay)
AR->>HDR: Verified packet
HDR->>DEINT: MediaHeader + payload
DEINT->>FEC: Reordered symbols by block
FEC->>FEC: Attempt decode (need K of K+R)
FEC->>JIT: Recovered audio frames
JIT->>JIT: BTreeMap ordered by seq
JIT->>JIT: Wait until depth >= target
JIT->>Codec: Pop lowest seq frame
Codec->>Ring: PCM i16 x 960
Ring->>SPK: Audio callback pulls samples
```
### Key Details
- Anti-replay uses a **64-packet sliding window** to reject duplicates
- FEC decoder needs any **K of K+R** symbols to reconstruct a block
- Jitter buffer target: **10 packets (200ms)** for client, **50 packets (1s)** for relay
- Desktop client uses **direct playout** (no jitter buffer) with lock-free ring
- Codec2 frames at 8 kHz are resampled to 48 kHz transparently
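
A 64-packet sliding window over the wrapping `u16` sequence number can be kept in a single `u64` bitmap. The sketch below shows one common way to do it; the type and method names are hypothetical and the real `anti_replay.rs` may differ in how it treats the first packet or very old packets.

```rust
/// Replay filter: bit `i` of `bitmap` is set when `highest - i` was seen.
pub struct ReplayWindow {
    highest: u16,
    bitmap: u64,
    started: bool,
}

impl ReplayWindow {
    pub fn new() -> Self {
        Self { highest: 0, bitmap: 0, started: false }
    }

    /// Returns `true` and records `seq` if it is fresh.
    /// Returns `false` for a duplicate or a packet older than 64 behind.
    pub fn check_and_update(&mut self, seq: u16) -> bool {
        if !self.started {
            self.started = true;
            self.highest = seq;
            self.bitmap = 1;
            return true;
        }
        // Signed distance so that 0 counts as "one after" 65535.
        let diff = seq.wrapping_sub(self.highest) as i16;
        if diff > 0 {
            let shift = diff as u32;
            self.bitmap = if shift >= 64 { 0 } else { self.bitmap << shift };
            self.bitmap |= 1;
            self.highest = seq;
            true
        } else {
            let behind = (-(diff as i32)) as u32;
            if behind >= 64 {
                return false; // fell out of the window
            }
            let mask = 1u64 << behind;
            if self.bitmap & mask != 0 {
                return false; // already seen
            }
            self.bitmap |= mask;
            true
        }
    }
}
```

Out-of-order packets inside the window are still accepted once, which matters because the interleaver deliberately reorders symbols.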
## Relay SFU Forwarding
```mermaid
graph TB
subgraph "Room Mode (Default SFU)"
C1[Client 1<br/>Alice] -->|"QUIC SNI=room-hash"| RM[Room Manager]
C2[Client 2<br/>Bob] -->|"QUIC SNI=room-hash"| RM
C3[Client 3<br/>Charlie] -->|"QUIC SNI=room-hash"| RM
RM --> R1["Room 'podcast'"]
R1 -->|"fan-out (skip sender)"| C1
R1 -->|"fan-out (skip sender)"| C2
R1 -->|"fan-out (skip sender)"| C3
end
subgraph "Forward Mode (--remote)"
C4[Client] -->|QUIC| RA[Relay A]
RA -->|"FEC decode<br/>jitter buffer<br/>FEC re-encode"| RB[Relay B<br/>--remote]
RB -->|QUIC| C5[Client]
end
subgraph "Probe Mode (--probe)"
PA[Relay A] -->|"Ping 1/s<br/>~50 bytes"| PB[Relay B]
PB -->|Pong| PA
PA --> PM[Prometheus<br/>RTT / Loss / Jitter]
end
style RM fill:#ff9f43,color:#fff
style R1 fill:#fdcb6e
style PM fill:#0984e3,color:#fff
```
### SFU Fan-out Rules
1. Each incoming datagram is forwarded to all other participants in the room
2. The sender is excluded from fan-out (no echo)
3. If one send fails, the relay continues to the next participant (best-effort)
4. The relay never decodes or re-encodes audio (preserves E2E encryption)
5. With trunking enabled, packets to the same receiver are batched into TrunkFrames (flushed every 5ms)
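
Rules 1 to 4 above amount to a short loop. The following is a sketch under assumptions: `DatagramSink` is a hypothetical stand-in for a participant's QUIC datagram sender, participants are keyed by a made-up `u32` id, and trunking (rule 5) is left out.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one participant's outgoing datagram channel.
pub trait DatagramSink {
    fn send(&mut self, datagram: &[u8]) -> Result<(), String>;
}

/// Forwards an opaque (still encrypted) datagram to everyone except the sender.
/// A failed send is skipped so one dead peer cannot stall the room.
/// Returns how many participants the datagram was delivered to.
pub fn fan_out<S: DatagramSink>(
    room: &mut HashMap<u32, S>,
    sender: u32,
    datagram: &[u8],
) -> usize {
    let mut delivered = 0;
    for (id, sink) in room.iter_mut() {
        if *id == sender {
            continue; // no echo back to the sender
        }
        if sink.send(datagram).is_ok() {
            delivered += 1;
        }
    }
    delivered
}
```

The relay only ever handles the datagram as bytes, which is what keeps it out of the end-to-end encryption boundary.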
## Federation Topology
```mermaid
graph TB
subgraph "Relay A (EU)"
A_R["Room Manager"]
A_F["Federation<br/>Manager"]
A1["Alice (local)"]
A2["Bob (local)"]
end
subgraph "Relay B (US)"
B_R["Room Manager"]
B_F["Federation<br/>Manager"]
B1["Charlie (local)"]
end
subgraph "Relay C (APAC)"
C_R["Room Manager"]
C_F["Federation<br/>Manager"]
C1["Dave (local)"]
end
A1 -->|media| A_R
A2 -->|media| A_R
B1 -->|media| B_R
C1 -->|media| C_R
A_F <-->|"SNI='_federation'<br/>GlobalRoomActive<br/>media forward"| B_F
A_F <-->|"SNI='_federation'<br/>GlobalRoomActive<br/>media forward"| C_F
B_F <-->|"SNI='_federation'<br/>GlobalRoomActive<br/>media forward"| C_F
A_R --> A_F
B_R --> B_F
C_R --> C_F
style A_F fill:#6c5ce7,color:#fff
style B_F fill:#6c5ce7,color:#fff
style C_F fill:#6c5ce7,color:#fff
style A_R fill:#ff9f43,color:#fff
style B_R fill:#ff9f43,color:#fff
style C_R fill:#ff9f43,color:#fff
```
### Federation Protocol Flow
```mermaid
sequenceDiagram
participant RA as Relay A
participant RB as Relay B
Note over RA: Startup: connect to configured peers
RA->>RB: QUIC connect (SNI="_federation")
RA->>RB: FederationHello { tls_fingerprint }
RB->>RB: Verify fingerprint against [[trusted]]
Note over RA,RB: Federation link established
Note over RA: Alice joins global room "podcast"
RA->>RB: GlobalRoomActive { room: "podcast" }
Note over RB: Charlie joins global room "podcast"
RB->>RA: GlobalRoomActive { room: "podcast" }
Note over RA,RB: Media bridging active
loop Every media packet in global room
RA->>RB: [room_hash:8][encrypted_media]
RB->>RA: [room_hash:8][encrypted_media]
end
Note over RA: Last local participant leaves
RA->>RB: GlobalRoomInactive { room: "podcast" }
```
## Wire Formats
### MediaHeader (12 bytes)
```
Byte 0: [V:1][T:1][CodecID:4][Q:1][FecRatioHi:1]
Byte 1: [FecRatioLo:6][unused:2]
Bytes 2-3: sequence (u16 BE)
Bytes 4-7: timestamp_ms (u32 BE)
Byte 8: fec_block_id (u8)
Byte 9: fec_symbol_idx (u8)
Byte 10: reserved
Byte 11: csrc_count
```
| Field | Bits | Description |
|-------|------|-------------|
| V (version) | 1 | Protocol version (0 = v1) |
| T (is_repair) | 1 | 1 = FEC repair packet, 0 = source media |
| CodecID | 4 | Codec identifier (0-8, see table below) |
| Q | 1 | 1 = QualityReport trailer appended |
| FecRatio | 7 | FEC ratio encoded as 0-127 mapping to 0.0-2.0 |
| sequence | 16 | Wrapping packet sequence number |
| timestamp_ms | 32 | Milliseconds since session start |
| fec_block_id | 8 | FEC source block ID (wrapping) |
| fec_symbol_idx | 8 | Symbol index within FEC block |
| reserved | 8 | Reserved flags |
| csrc_count | 8 | Contributing source count (future mixing) |
#### CodecID Values
| Value | Codec | Bitrate | Sample Rate | Frame Duration |
|-------|-------|---------|-------------|---------------|
| 0 | Opus 24k | 24 kbps | 48 kHz | 20ms |
| 1 | Opus 16k | 16 kbps | 48 kHz | 20ms |
| 2 | Opus 6k | 6 kbps | 48 kHz | 40ms |
| 3 | Codec2 3200 | 3.2 kbps | 8 kHz | 20ms |
| 4 | Codec2 1200 | 1.2 kbps | 8 kHz | 40ms |
| 5 | ComfortNoise | 0 | 48 kHz | 20ms |
| 6 | Opus 32k | 32 kbps | 48 kHz | 20ms |
| 7 | Opus 48k | 48 kbps | 48 kHz | 20ms |
| 8 | Opus 64k | 64 kbps | 48 kHz | 20ms |
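
The bit layout of the 12-byte header can be illustrated with a pack/unpack pair. This is a sketch rather than the `wzp-proto` code: it assumes fields are packed most-significant-bit first in the order shown (so `V` is bit 7 of byte 0 and `FecRatioLo` occupies the top six bits of byte 1), and the method names are hypothetical.

```rust
/// Illustrative MediaHeader; field names follow the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct MediaHeader {
    pub version: u8,             // 1 bit
    pub is_repair: bool,         // T flag
    pub codec_id: u8,            // 4 bits
    pub has_quality_report: bool, // Q flag
    pub fec_ratio: u8,           // 7 bits, 0..=127
    pub sequence: u16,
    pub timestamp_ms: u32,
    pub fec_block_id: u8,
    pub fec_symbol_idx: u8,
    pub reserved: u8,
    pub csrc_count: u8,
}

impl MediaHeader {
    pub fn to_bytes(&self) -> [u8; 12] {
        let ratio = self.fec_ratio & 0x7F;
        let mut b = [0u8; 12];
        b[0] = ((self.version & 0x01) << 7)
            | ((self.is_repair as u8) << 6)
            | ((self.codec_id & 0x0F) << 2)
            | ((self.has_quality_report as u8) << 1)
            | (ratio >> 6); // FecRatioHi
        b[1] = (ratio & 0x3F) << 2; // FecRatioLo, low 2 bits unused
        b[2..4].copy_from_slice(&self.sequence.to_be_bytes());
        b[4..8].copy_from_slice(&self.timestamp_ms.to_be_bytes());
        b[8] = self.fec_block_id;
        b[9] = self.fec_symbol_idx;
        b[10] = self.reserved;
        b[11] = self.csrc_count;
        b
    }

    pub fn from_bytes(b: &[u8; 12]) -> Self {
        Self {
            version: b[0] >> 7,
            is_repair: (b[0] & 0x40) != 0,
            codec_id: (b[0] >> 2) & 0x0F,
            has_quality_report: (b[0] & 0x02) != 0,
            fec_ratio: ((b[0] & 0x01) << 6) | (b[1] >> 2),
            sequence: u16::from_be_bytes([b[2], b[3]]),
            timestamp_ms: u32::from_be_bytes([b[4], b[5], b[6], b[7]]),
            fec_block_id: b[8],
            fec_symbol_idx: b[9],
            reserved: b[10],
            csrc_count: b[11],
        }
    }

    /// Maps the 7-bit field linearly onto 0.0..=2.0.
    pub fn fec_ratio_f32(&self) -> f32 {
        f32::from(self.fec_ratio & 0x7F) * 2.0 / 127.0
    }
}
```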
### MiniHeader (4 bytes, compressed)
```
[FRAME_TYPE_MINI: 0x01]
Bytes 0-1: timestamp_delta_ms (u16 BE)
Bytes 2-3: payload_len (u16 BE)
```
Used for 49 of every 50 frames (~1s cycle). Saves 8 bytes per packet (67% header reduction). Full header is sent every 50th frame to resynchronize state.
### TrunkFrame (batched datagrams)
```
[count: u16]
[session_id: 2][len: u16][payload: len] x count
```
Packs multiple session packets into one QUIC datagram. Maximum 10 entries or 1200 bytes, flushed every 5ms.
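
A batcher that respects both limits might look like the sketch below. Several details are assumptions not stated above: `count` and `len` are written big-endian (matching the other header fields), the 1200-byte cap is applied to the whole encoded frame, and `TrunkSketch` / `decode_trunk` are hypothetical names. The 5ms flush timer is left to the caller.

```rust
pub const MAX_ENTRIES: usize = 10;
pub const MAX_BYTES: usize = 1200;

/// Accumulates (session_id, payload) entries for one TrunkFrame.
pub struct TrunkSketch {
    entries: Vec<([u8; 2], Vec<u8>)>,
    encoded_len: usize,
}

impl TrunkSketch {
    pub fn new() -> Self {
        // 2 bytes are always spent on the leading `count` field.
        Self { entries: Vec::new(), encoded_len: 2 }
    }

    /// Returns `false` when the entry does not fit; the caller should
    /// flush the current frame and retry on a fresh one.
    pub fn push(&mut self, session_id: [u8; 2], payload: &[u8]) -> bool {
        let needed = 4 + payload.len(); // session_id + len + payload
        if self.entries.len() >= MAX_ENTRIES || self.encoded_len + needed > MAX_BYTES {
            return false;
        }
        self.entries.push((session_id, payload.to_vec()));
        self.encoded_len += needed;
        true
    }

    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(self.encoded_len);
        out.extend_from_slice(&(self.entries.len() as u16).to_be_bytes());
        for (sid, payload) in &self.entries {
            out.extend_from_slice(sid);
            out.extend_from_slice(&(payload.len() as u16).to_be_bytes());
            out.extend_from_slice(payload);
        }
        out
    }
}

/// Splits a TrunkFrame back into entries; `None` on truncated input.
pub fn decode_trunk(buf: &[u8]) -> Option<Vec<([u8; 2], Vec<u8>)>> {
    let count = u16::from_be_bytes([*buf.first()?, *buf.get(1)?]) as usize;
    let mut pos = 2;
    let mut out = Vec::new();
    for _ in 0..count {
        let hdr = buf.get(pos..pos + 4)?;
        let len = u16::from_be_bytes([hdr[2], hdr[3]]) as usize;
        let payload = buf.get(pos + 4..pos + 4 + len)?;
        out.push(([hdr[0], hdr[1]], payload.to_vec()));
        pos += 4 + len;
    }
    Some(out)
}
```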
### QualityReport (4 bytes, optional trailer)
```
Byte 0: loss_pct (0-255 maps to 0-100%)
Byte 1: rtt_4ms (0-255 maps to 0-1020ms, resolution 4ms)
Byte 2: jitter_ms (0-255ms)
Byte 3: bitrate_cap_kbps (0-255 kbps)
```
Appended to a media packet when the Q flag is set in the MediaHeader.
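
The scaling of each byte is easy to get wrong, so here is a small encode/decode sketch. The struct and its field types are hypothetical; it assumes values above the representable range are clamped rather than rejected.

```rust
/// Illustrative QualityReport in natural units.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct QualityReport {
    pub loss_pct: f32,        // 0.0..=100.0
    pub rtt_ms: u32,          // clamped to 1020 on the wire
    pub jitter_ms: u8,
    pub bitrate_cap_kbps: u8,
}

impl QualityReport {
    pub fn to_bytes(&self) -> [u8; 4] {
        // 0-100% is stretched over the full 0-255 byte range.
        let loss = (self.loss_pct.clamp(0.0, 100.0) * 255.0 / 100.0).round() as u8;
        // RTT is stored in 4ms steps.
        let rtt = (self.rtt_ms / 4).min(255) as u8;
        [loss, rtt, self.jitter_ms, self.bitrate_cap_kbps]
    }

    pub fn from_bytes(b: [u8; 4]) -> Self {
        Self {
            loss_pct: f32::from(b[0]) * 100.0 / 255.0,
            rtt_ms: u32::from(b[1]) * 4,
            jitter_ms: b[2],
            bitrate_cap_kbps: b[3],
        }
    }
}
```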
## Signal Message Handshake Flow
```mermaid
sequenceDiagram
participant C as Client
participant R as Relay
C->>R: QUIC Connect (SNI = hashed room name)
alt Auth enabled (--auth-url)
C->>R: SignalMessage::AuthToken { token }
R->>R: POST auth_url to validate
R-->>C: (connection closed if invalid)
end
C->>R: CallOffer { identity_pub, ephemeral_pub, signature, supported_profiles }
R->>R: Verify Ed25519 signature
R->>R: Generate ephemeral X25519
R->>R: shared_secret = DH(eph_relay, eph_client)
R->>R: session_key = HKDF(shared_secret, "warzone-session-key")
R->>C: CallAnswer { identity_pub, ephemeral_pub, signature, chosen_profile }
C->>C: Verify signature
C->>C: Derive same session_key
Note over C,R: Session established -- both have ChaCha20-Poly1305 key
C->>R: RoomUpdate (join notification broadcast)
loop Media exchange
C->>R: QUIC Datagram (encrypted media)
R->>C: QUIC Datagram (forwarded from others)
end
opt Every 65,536 packets
C->>R: Rekey { new_ephemeral_pub, signature }
R->>C: Rekey { new_ephemeral_pub, signature }
Note over C,R: New session key via fresh DH
end
C->>R: Hangup { reason: Normal }
R->>R: Remove from room, broadcast RoomUpdate
```
## Client Architecture
### Desktop Engine (Tauri)
```mermaid
graph TB
subgraph "Tauri Frontend (HTML/JS)"
UI[Connect / Call UI]
SET[Settings Panel]
end
subgraph "Tauri Rust Backend"
CMD[Tauri Commands<br/>connect/disconnect/toggle]
ENG[WzpEngine<br/>State Machine]
end
subgraph "Audio I/O"
CPAL_C[CPAL Capture<br/>or VoiceProcessingIO]
RING_C[SPSC Ring<br/>Capture]
RING_P[SPSC Ring<br/>Playout]
CPAL_P[CPAL Playback<br/>or VoiceProcessingIO]
end
subgraph "Network Tasks (tokio)"
SEND[Send Loop<br/>encode + encrypt]
RECV[Recv Loop<br/>decrypt + decode]
SIG[Signal Handler<br/>room updates]
end
UI --> CMD
SET --> CMD
CMD --> ENG
ENG --> SEND
ENG --> RECV
ENG --> SIG
CPAL_C --> RING_C --> SEND
RECV --> RING_P --> CPAL_P
style ENG fill:#00b894,color:#fff
style SEND fill:#0984e3,color:#fff
style RECV fill:#0984e3,color:#fff
```
Key design decisions:
- **Lock-free SPSC rings** between audio callbacks and network tasks (no mutex on audio thread)
- **VoiceProcessingIO** on macOS for OS-level AEC (CPAL uses HalOutput which has no AEC)
- **Direct playout** -- no jitter buffer on client; audio callback pulls from ring
- **Release builds required** -- debug builds too slow for real-time audio
### Android Engine (Kotlin + JNI)
```mermaid
graph TB
subgraph "Compose UI"
CALL[CallActivity]
SET[SettingsScreen]
VM[CallViewModel]
end
subgraph "Service Layer"
SVC[CallService<br/>Foreground Service]
PIPE[AudioPipeline<br/>AudioTrack + AudioRecord]
end
subgraph "Rust Engine (JNI)"
JNI[WzpEngine.kt<br/>JNI bridge]
NATIVE[libwzp_android.so<br/>Rust call engine]
end
subgraph "Android Audio"
REC[AudioRecord<br/>+ AEC effect]
TRK[AudioTrack<br/>low-latency]
end
CALL --> VM
SET --> VM
VM --> SVC
SVC --> PIPE
PIPE --> JNI
JNI --> NATIVE
REC --> PIPE
PIPE --> TRK
style NATIVE fill:#00b894,color:#fff
style SVC fill:#ff9f43,color:#fff
style PIPE fill:#0984e3,color:#fff
```
Key design decisions:
- **Foreground service** keeps audio alive when the screen is off
- **AudioRecord + AudioTrack** with Android's built-in AEC (AudioEffect)
- **Lock-free AudioRing** with preallocated Vec (not push/pop) to avoid allocation on audio thread
- **JNI bridge** marshals PCM frames between Kotlin and Rust
### CLI Architecture
```mermaid
graph TB
subgraph "CLI Modes"
LIVE[--live<br/>Mic + Speaker]
TONE[--send-tone<br/>Sine Generator]
FILE[--send-file<br/>PCM Reader]
ECHO[--echo-test<br/>Quality Analysis]
DRIFT[--drift-test<br/>Clock Analysis]
SWEEP[--sweep<br/>Buffer Sweep]
end
subgraph "Call Engine"
ENCODE[CallEncoder<br/>codec + FEC]
DECODE[CallDecoder<br/>FEC + codec]
QA[QualityAdapter<br/>adaptive switching]
end
subgraph "Transport"
QUIC[QuinnTransport<br/>send/recv media + signal]
HS[Handshake<br/>X25519 + Ed25519]
end
LIVE --> ENCODE
TONE --> ENCODE
FILE --> ENCODE
ENCODE --> QUIC
QUIC --> DECODE
ECHO --> ENCODE
ECHO --> DECODE
DRIFT --> ENCODE
HS --> QUIC
style ENCODE fill:#00b894,color:#fff
style DECODE fill:#00b894,color:#fff
style QUIC fill:#0984e3,color:#fff
```
## Adaptive Quality System
```mermaid
graph LR
subgraph GOOD ["GOOD (28.8 kbps)"]
G_C[Opus 24kbps]
G_F[FEC 20%]
G_FR[20ms frames]
end
subgraph DEGRADED ["DEGRADED (9.0 kbps)"]
D_C[Opus 6kbps]
D_F[FEC 50%]
D_FR[40ms frames]
end
subgraph CATASTROPHIC ["CATASTROPHIC (2.4 kbps)"]
C_C[Codec2 1200bps]
C_F[FEC 100%]
C_FR[40ms frames]
end
GOOD -->|"loss>10% or RTT>400ms<br/>3 consecutive reports"| DEGRADED
DEGRADED -->|"loss>40% or RTT>600ms<br/>3 consecutive"| CATASTROPHIC
CATASTROPHIC -->|"loss<10% and RTT<400ms<br/>10 consecutive"| DEGRADED
DEGRADED -->|"loss<10% and RTT<400ms<br/>10 consecutive"| GOOD
style GOOD fill:#00b894,color:#fff
style DEGRADED fill:#fdcb6e
style CATASTROPHIC fill:#e17055,color:#fff
```
Hysteresis prevents tier flapping: **fast downgrade** (3 reports, or 2 on cellular) and **slow upgrade** (10 reports, one tier at a time).
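
The hysteresis can be expressed as two streak counters. The sketch below uses the thresholds from the diagram; `HysteresisController` and `on_report` are hypothetical names and not the `AdaptiveQualityController` API, and resetting both streaks on every tier change is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Good,
    Degraded,
    Catastrophic,
}

pub struct HysteresisController {
    tier: Tier,
    bad_streak: u32,
    good_streak: u32,
    down_after: u32,
    up_after: u32,
}

impl HysteresisController {
    pub fn new(cellular: bool) -> Self {
        Self {
            tier: Tier::Good,
            bad_streak: 0,
            good_streak: 0,
            down_after: if cellular { 2 } else { 3 }, // fast downgrade
            up_after: 10,                              // slow upgrade
        }
    }

    /// Feeds one quality report and returns the tier in effect afterwards.
    pub fn on_report(&mut self, loss_pct: f32, rtt_ms: u32) -> Tier {
        let bad = match self.tier {
            Tier::Good => loss_pct > 10.0 || rtt_ms > 400,
            Tier::Degraded => loss_pct > 40.0 || rtt_ms > 600,
            Tier::Catastrophic => false, // nowhere lower to go
        };
        let good = self.tier != Tier::Good && loss_pct < 10.0 && rtt_ms < 400;
        self.bad_streak = if bad { self.bad_streak + 1 } else { 0 };
        self.good_streak = if good { self.good_streak + 1 } else { 0 };

        if self.bad_streak >= self.down_after {
            self.tier = match self.tier {
                Tier::Good => Tier::Degraded,
                _ => Tier::Catastrophic,
            };
            self.bad_streak = 0;
            self.good_streak = 0;
        } else if self.good_streak >= self.up_after {
            // Upgrades move one tier at a time.
            self.tier = match self.tier {
                Tier::Catastrophic => Tier::Degraded,
                _ => Tier::Good,
            };
            self.bad_streak = 0;
            self.good_streak = 0;
        }
        self.tier
    }
}
```

A single clean report in the middle of a bad run resets the downgrade streak, which is what keeps a brief loss spike from forcing a codec switch.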
## Cryptographic Handshake
```mermaid
sequenceDiagram
participant C as Caller
participant R as Relay / Callee
Note over C: Derive identity from seed<br/>Ed25519 + X25519 via HKDF
C->>C: Generate ephemeral X25519
C->>C: Sign(ephemeral_pub || "call-offer")
C->>R: CallOffer { identity_pub, ephemeral_pub, signature, profiles }
R->>R: Verify Ed25519 signature
R->>R: Generate ephemeral X25519
R->>R: shared_secret = DH(eph_b, eph_a)
R->>R: session_key = HKDF(shared_secret, "warzone-session-key")
R->>R: Sign(ephemeral_pub || "call-answer")
R->>C: CallAnswer { identity_pub, ephemeral_pub, signature, profile }
C->>C: Verify signature
C->>C: shared_secret = DH(eph_a, eph_b)
C->>C: session_key = HKDF(shared_secret)
Note over C,R: Both have identical ChaCha20-Poly1305 session key
C->>R: Encrypted media (QUIC datagrams)
R->>C: Encrypted media (QUIC datagrams)
Note over C,R: Rekey every 65,536 packets<br/>New ephemeral DH + HKDF mix
```
## Identity Model
```mermaid
graph TD
SEED["32-byte Seed<br/>(BIP39 Mnemonic: 24 words)"] --> HKDF1["HKDF<br/>salt=None<br/>info='warzone-ed25519'"]
SEED --> HKDF2["HKDF<br/>salt=None<br/>info='warzone-x25519'"]
HKDF1 --> ED["Ed25519 SigningKey<br/>Digital Signatures"]
HKDF2 --> X25519["X25519 StaticSecret<br/>Key Agreement"]
ED --> VKEY["Ed25519 VerifyingKey<br/>(Public)"]
X25519 --> XPUB["X25519 PublicKey<br/>(Public)"]
VKEY --> FP["Fingerprint<br/>SHA-256(pubkey) truncated 16 bytes<br/>xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx"]
style SEED fill:#6c5ce7,color:#fff
style FP fill:#fd79a8,color:#fff
style ED fill:#ee5a24,color:#fff
style X25519 fill:#00b894,color:#fff
```
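
The last step of the diagram, turning a digest into the displayed fingerprint, can be sketched with the standard library alone. The function name is hypothetical, the caller is assumed to supply the SHA-256 digest of the public key (hashing itself needs an external crate and is omitted), and lowercase hex is an assumption.

```rust
/// Formats the first 16 bytes of a SHA-256 digest as eight
/// colon-separated groups of four hex characters.
pub fn format_fingerprint(digest: &[u8; 32]) -> String {
    digest[..16]
        .chunks(2)
        .map(|pair| format!("{:02x}{:02x}", pair[0], pair[1]))
        .collect::<Vec<_>>()
        .join(":")
}
```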
## Adaptive Jitter Buffer
```mermaid
graph TD
PKT[Incoming Packet] --> SEQ{Sequence Check}
SEQ -->|Duplicate| DROP[Drop + AntiReplay]
SEQ -->|Valid| BUF["BTreeMap Buffer<br/>(ordered by seq)"]
BUF --> ADAPT["AdaptivePlayoutDelay<br/>(EMA jitter tracking)"]
ADAPT --> TARGET["target_delay =<br/>ceil(jitter_ema / 20ms) + 2"]
BUF --> READY{"depth >= target?"}
READY -->|No| WAIT["Wait (Underrun++)"]
READY -->|Yes| POP[Pop lowest seq]
POP --> DECODE[Decode to PCM]
DECODE --> PLAY[Playout]
BUF --> OVERFLOW{"depth > max?"}
OVERFLOW -->|Yes| EVICT["Drop oldest (Overrun++)"]
style ADAPT fill:#fdcb6e
style DROP fill:#e17055,color:#fff
style EVICT fill:#e17055,color:#fff
```
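
The flow above reduces to a target-depth formula plus an ordered map. The sketch below is illustrative only: `JitterSketch` is a hypothetical name, the EMA update is left out, and ordering by raw `u16` ignores sequence wraparound, which a production buffer has to handle.

```rust
use std::collections::BTreeMap;

/// target_delay = ceil(jitter_ema / 20ms) + 2, in packets.
pub fn target_delay_packets(jitter_ema_ms: f32) -> usize {
    (jitter_ema_ms / 20.0).ceil() as usize + 2
}

pub struct JitterSketch {
    frames: BTreeMap<u16, Vec<u8>>,
    max_depth: usize,
    pub underruns: u64,
    pub overruns: u64,
}

impl JitterSketch {
    pub fn new(max_depth: usize) -> Self {
        Self { frames: BTreeMap::new(), max_depth, underruns: 0, overruns: 0 }
    }

    /// Inserts a frame; duplicates are dropped, and the oldest frame is
    /// evicted when the buffer grows past `max_depth`.
    pub fn push(&mut self, seq: u16, frame: Vec<u8>) {
        if self.frames.contains_key(&seq) {
            return;
        }
        self.frames.insert(seq, frame);
        while self.frames.len() > self.max_depth {
            self.frames.pop_first();
            self.overruns += 1;
        }
    }

    /// Pops the lowest sequence number once `depth >= target`;
    /// otherwise counts an underrun and returns `None`.
    pub fn pop(&mut self, target: usize) -> Option<(u16, Vec<u8>)> {
        if self.frames.len() < target {
            self.underruns += 1;
            return None;
        }
        self.frames.pop_first()
    }
}
```

With 45ms of smoothed jitter the formula gives `ceil(2.25) + 2 = 5` packets, i.e. 100ms of playout delay at 20ms frames.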
## FEC Protection (RaptorQ)
```mermaid
graph LR
subgraph "Encoder"
F1[Frame 1] --> BLK["Source Block<br/>(5-10 frames)"]
F2[Frame 2] --> BLK
F3[Frame 3] --> BLK
F4[Frame 4] --> BLK
F5[Frame 5] --> BLK
BLK --> SRC[5 Source Symbols]
BLK --> REP["1-10 Repair Symbols<br/>(ratio dependent)"]
SRC --> INT["Interleaver<br/>(depth=3)"]
REP --> INT
end
subgraph "Network"
INT --> LOSS{Packet Loss}
LOSS -->|some lost| RCV[Received Symbols]
end
subgraph "Decoder"
RCV --> DEINT[De-interleaver]
DEINT --> RAPTORQ["RaptorQ Decoder<br/>Reconstruct from<br/>any K of K+R symbols"]
RAPTORQ --> OUT[Original Frames]
end
style LOSS fill:#e17055,color:#fff
style RAPTORQ fill:#00b894,color:#fff
```
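
The "1-10 repair symbols" range follows from the FEC ratio applied to a 5-symbol block. The arithmetic can be sketched as below; the function names are hypothetical, rounding up is an assumption, and the "any K of K+R" recovery test is the idealized rule from the diagram rather than a statement about the RaptorQ decoder's exact overhead.

```rust
/// Repair symbols for a block of `source_symbols`, with the FEC ratio
/// given in percent (20 = 0.2, 200 = 2.0). Rounds up.
pub fn repair_symbols(source_symbols: usize, fec_percent: usize) -> usize {
    (source_symbols * fec_percent + 99) / 100
}

/// Idealized recovery rule: any K received symbols rebuild the block.
pub fn block_recoverable(source_symbols: usize, received: usize) -> bool {
    received >= source_symbols
}

/// True if a block still decodes after losing `lost` of its K + R symbols.
pub fn survives_loss(source_symbols: usize, fec_percent: usize, lost: usize) -> bool {
    let total = source_symbols + repair_symbols(source_symbols, fec_percent);
    lost <= total && block_recoverable(source_symbols, total - lost)
}
```

For a 5-symbol block this yields 1 repair symbol at 20%, 3 at 50%, 5 at 100%, and 10 at the maximum ratio of 2.0, so the tolerated loss per block equals the repair count.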
## Telemetry Stack
```mermaid
graph TB
subgraph "Relay"
RM["RelayMetrics<br/>sessions, rooms, packets"]
SM["SessionMetrics<br/>per-session jitter, loss, RTT"]
PM["ProbeMetrics<br/>inter-relay RTT, loss"]
RM --> PROM1["GET /metrics :9090"]
SM --> PROM1
PM --> PROM1
end
subgraph "Web Bridge"
WM["WebMetrics<br/>connections, frames, latency"]
WM --> PROM2["GET /metrics :8080"]
end
subgraph "Client"
CM["JitterStats + QualityAdapter"]
CM --> JSONL["--metrics-file<br/>JSONL 1 line/sec"]
end
PROM1 --> GRAF["Grafana Dashboard<br/>4 rows, 18 panels"]
PROM2 --> GRAF
JSONL --> ANALYSIS[Offline Analysis]
style GRAF fill:#ff6b6b,color:#fff
style PROM1 fill:#0984e3,color:#fff
style PROM2 fill:#0984e3,color:#fff
```
## Deployment Topology
```mermaid
graph TB
subgraph "Region A"
RA["wzp-relay A<br/>:4433 UDP"]
WA["wzp-web A<br/>:8080 HTTPS"]
WA --> RA
end
subgraph "Region B"
RB["wzp-relay B<br/>:4433 UDP"]
WB["wzp-web B<br/>:8080 HTTPS"]
WB --> RB
end
RA <-->|"Probe 1/s + Federation"| RB
BA[Browser A] -->|WSS| WA
BB[Browser B] -->|WSS| WB
CA[CLI Client] -->|QUIC| RA
DA[Desktop Client] -->|QUIC| RA
MA[Android Client] -->|QUIC| RB
PROM[Prometheus] -->|scrape| RA
PROM -->|scrape| RB
PROM -->|scrape| WA
PROM --> GRAF[Grafana]
FC[featherChat Server] -->|auth validate| RA
FC -->|auth validate| RB
style RA fill:#ff9f43,color:#fff
style RB fill:#ff9f43,color:#fff
style GRAF fill:#ff6b6b,color:#fff
style FC fill:#fd79a8,color:#fff
```
## Session State Machine
```mermaid
stateDiagram-v2
[*] --> Idle
Idle --> Connecting: connect()
Connecting --> Handshaking: QUIC established
Handshaking --> Active: CallOffer/Answer complete
Active --> Rekeying: 65,536 packets
Rekeying --> Active: new key derived
Active --> Closed: Hangup / Error / Timeout
Rekeying --> Closed: Error
Connecting --> Closed: Timeout
Handshaking --> Closed: Signature fail
note right of Active: Media flows (encrypted)
note right of Rekeying: Media continues while rekeying
```
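
The transitions in the diagram map directly onto a pure function over `(state, event)` pairs. This is a sketch: the event enum and `transition` function are hypothetical and are not taken from `session.rs`; returning `None` for any pair not drawn in the diagram is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionState {
    Idle,
    Connecting,
    Handshaking,
    Active,
    Rekeying,
    Closed,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SessionEvent {
    Connect,
    QuicEstablished,
    HandshakeComplete,
    RekeyThreshold, // 65,536 packets sent under the current key
    KeyDerived,
    Hangup,
    Error,
    Timeout,
    SignatureFail,
}

/// Returns the next state, or `None` if the diagram has no such edge.
pub fn transition(state: SessionState, event: SessionEvent) -> Option<SessionState> {
    match (state, event) {
        (SessionState::Idle, SessionEvent::Connect) => Some(SessionState::Connecting),
        (SessionState::Connecting, SessionEvent::QuicEstablished) => Some(SessionState::Handshaking),
        (SessionState::Connecting, SessionEvent::Timeout) => Some(SessionState::Closed),
        (SessionState::Handshaking, SessionEvent::HandshakeComplete) => Some(SessionState::Active),
        (SessionState::Handshaking, SessionEvent::SignatureFail) => Some(SessionState::Closed),
        (SessionState::Active, SessionEvent::RekeyThreshold) => Some(SessionState::Rekeying),
        (SessionState::Active, SessionEvent::Hangup)
        | (SessionState::Active, SessionEvent::Error)
        | (SessionState::Active, SessionEvent::Timeout) => Some(SessionState::Closed),
        (SessionState::Rekeying, SessionEvent::KeyDerived) => Some(SessionState::Active),
        (SessionState::Rekeying, SessionEvent::Error) => Some(SessionState::Closed),
        _ => None,
    }
}
```

Keeping the table in one `match` makes illegal transitions, such as leaving `Closed`, fail loudly instead of silently mutating state.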
## Project Structure
```
warzonePhone/
├── Cargo.toml  # Workspace root
├── crates/
│   ├── wzp-proto/  # Protocol types, traits, wire format
│   │   └── src/
│   │       ├── codec_id.rs  # CodecId, QualityProfile
│   │       ├── error.rs  # Error types
│   │       ├── jitter.rs  # JitterBuffer, AdaptivePlayoutDelay
│   │       ├── packet.rs  # MediaHeader, MiniHeader, TrunkFrame, SignalMessage
│   │       ├── quality.rs  # Tier, AdaptiveQualityController
│   │       ├── session.rs  # SessionState machine
│   │       └── traits.rs  # AudioEncoder, FecEncoder, CryptoSession, etc.
│   ├── wzp-codec/  # Audio codecs
│   │   └── src/
│   │       ├── adaptive.rs  # AdaptiveEncoder/Decoder (Opus + Codec2)
│   │       ├── denoise.rs  # NoiseSuppressor (RNNoise / nnnoiseless)
│   │       └── silence.rs  # SilenceDetector, ComfortNoise
│   ├── wzp-fec/  # Forward error correction
│   │   └── src/
│   │       ├── encoder.rs  # RaptorQFecEncoder
│   │       ├── decoder.rs  # RaptorQFecDecoder
│   │       └── interleave.rs  # Interleaver (burst protection)
│   ├── wzp-crypto/  # Cryptography + identity
│   │   └── src/
│   │       ├── identity.rs  # Seed, Fingerprint, hash_room_name
│   │       ├── handshake.rs  # WarzoneKeyExchange (X25519 + Ed25519)
│   │       ├── session.rs  # ChaChaSession (ChaCha20-Poly1305)
│   │       ├── nonce.rs  # Deterministic nonce construction
│   │       ├── anti_replay.rs  # Sliding window replay protection
│   │       └── rekey.rs  # Forward secrecy rekeying
│   ├── wzp-transport/  # QUIC transport layer
│   │   └── src/lib.rs  # QuinnTransport, send/recv media/signal/trunk
│   ├── wzp-relay/  # Relay daemon
│   │   └── src/
│   │       ├── main.rs  # CLI, connection loop, auth + handshake
│   │       ├── config.rs  # RelayConfig, TOML parsing
│   │       ├── room.rs  # RoomManager, TrunkedForwarder
│   │       ├── pipeline.rs  # RelayPipeline (forward mode)
│   │       ├── session_mgr.rs  # SessionManager (limits, lifecycle)
│   │       ├── auth.rs  # featherChat token validation
│   │       ├── handshake.rs  # Relay-side accept_handshake
│   │       ├── metrics.rs  # Prometheus RelayMetrics + per-session
│   │       ├── probe.rs  # Inter-relay probes + ProbeMesh
│   │       ├── federation.rs  # FederationManager, global rooms
│   │       ├── presence.rs  # PresenceRegistry
│   │       ├── route.rs  # RouteResolver
│   │       ├── trunk.rs  # TrunkBatcher
│   │       └── ws.rs  # WebSocket handler for browser clients
│   ├── wzp-client/  # Call engine + CLI
│   │   └── src/
│   │       ├── cli.rs  # CLI arg parsing + main
│   │       ├── call.rs  # CallEncoder, CallDecoder, QualityAdapter
│   │       ├── handshake.rs  # Client-side perform_handshake
│   │       ├── featherchat.rs  # CallSignal bridge
│   │       ├── echo_test.rs  # Automated echo quality test
│   │       ├── drift_test.rs  # Clock drift measurement
│   │       ├── sweep.rs  # Jitter buffer parameter sweep
│   │       ├── metrics.rs  # JSONL telemetry writer
│   │       └── bench.rs  # Component benchmarks
│   └── wzp-web/  # Browser bridge
│       ├── src/
│       │   ├── main.rs  # Axum server, WS handler, TLS
│       │   └── metrics.rs  # Prometheus WebMetrics
│       └── static/
│           ├── index.html  # SPA UI (room, PTT, level meter)
│           └── audio-processor.js  # AudioWorklet (capture + playback)
├── android/  # Android app (Kotlin + JNI)
│   └── app/src/main/java/com/wzp/
│       ├── audio/  # AudioPipeline, AudioRouteManager
│       ├── engine/  # WzpEngine (JNI), CallStats, WzpCallback
│       ├── ui/  # CallActivity, SettingsScreen, Identicon
│       ├── data/  # SettingsRepository
│       ├── net/  # RelayPinger
│       ├── service/  # CallService (foreground)
│       └── debug/  # DebugReporter
├── desktop/  # Desktop app (Tauri)
│   └── dist/  # Built frontend (HTML/JS/CSS)
├── deps/featherchat/  # Git submodule
├── docs/  # Documentation
├── scripts/  # Build scripts
│   └── build-linux.sh  # Hetzner VM build
└── tools/  # Development tools
## Test Coverage
272 tests across all crates, 0 failures:
| Crate | Tests | Key Coverage |
|-------|-------|-------------|
| wzp-proto | 41 | Wire format, jitter buffer, quality tiers, mini-frames, trunking |
| wzp-codec | 31 | Opus/Codec2 roundtrip, silence detection, noise suppression |
| wzp-fec | 22 | RaptorQ encode/decode, loss recovery, interleaving |
| wzp-crypto | 34 + 28 compat | Encrypt/decrypt, handshake, anti-replay, featherChat identity |
| wzp-transport | 2 | QUIC connection setup |
| wzp-relay | 40 + 4 integration | Room ACL, session mgmt, metrics, probes, mesh, trunking |
| wzp-client | 30 + 2 integration | Encoder/decoder, quality adapter, silence, drift, sweep |
| wzp-web | 2 | Metrics |
## Audio Backend Architecture (Platform Matrix)
WarzonePhone's audio I/O goes through one of four backends depending on the target platform and feature flags. All backends expose the same public API (`AudioCapture::start() → AudioCapture { ring(), stop() }`) via conditional re-exports in `crates/wzp-client/src/lib.rs`, so the `CallEngine` above the audio layer doesn't know or care which backend is running.
```
         ┌─────────────────────────────────────────────┐
         │       CallEngine (platform-agnostic)        │
         │   reads PCM from AudioCapture::ring()       │
         │   writes PCM to AudioPlayback::ring()       │
         └────────────────────┬────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        │                     │                     │
        ▼                     ▼                     ▼
┌───────────────┐    ┌────────────────┐     ┌───────────────┐
│ audio_io      │    │ audio_vpio     │     │ audio_wasapi  │
│ (CPAL)        │    │ (Core Audio    │     │ (Windows      │
│               │    │ VoiceProc IO)  │     │  IAudioClient2│
│ All platforms │    │ macOS only     │     │ Windows       │
│ (baseline)    │    │ feature=vpio   │     │ feature=      │
│               │    │                │     │ windows-aec   │
└───────────────┘    └────────────────┘     └───────────────┘
        │
        ▼ on Android only
┌───────────────┐
│ wzp-native    │
│ (Oboe bridge  │
│ via dlopen)   │
│               │
│ Android only  │
│ libloading    │
└───────────────┘
```
### Backend selection matrix
| Platform | Capture | Playback | OS AEC | Feature flags |
|---|---|---|---|---|
| macOS | VoiceProcessingIO (native Core Audio) | CPAL | **Yes** — Apple's hardware-accelerated AEC (same AEC as FaceTime, iMessage audio, Voice Memos) | `audio`, `vpio` |
| Windows (AEC build) | Direct WASAPI with `AudioCategory_Communications` | CPAL | **Yes** — Windows routes the capture stream through the driver's communications APO chain (AEC + NS + AGC), driver-dependent quality | `audio`, `windows-aec` |
| Windows (baseline) | CPAL (WASAPI shared mode) | CPAL | No | `audio` |
| Linux | CPAL (ALSA / PulseAudio) | CPAL | No | `audio` |
| Android (Tauri Mobile) | Oboe via `wzp-native` cdylib, `Usage::VoiceCommunication` + `MODE_IN_COMMUNICATION` | Same Oboe stream | Depends on device (some Android devices apply AEC to the voice-communication stream, most do not) | none (`wzp-client` compiled with `default-features = false`) |
### Why `wzp-native` is a standalone cdylib
On Android, the audio backend lives in a separate cdylib crate (`crates/wzp-native`) that `wzp-desktop`'s lib crate loads at runtime via `libloading`. It is **not** linked as a regular Rust dep.
This is deliberate. rust-lang/rust#104707 documents that a crate with `crate-type = ["cdylib", "staticlib"]` leaks non-exported symbols from the staticlib into the cdylib. On Android, that caused Bionic's private `__init_tcb` / `pthread_create` symbols to be bound LOCALLY inside our `.so` instead of resolved dynamically against `libc.so` at `dlopen` time — which crashed the app at launch as soon as `tao` tried to `std::thread::spawn()` from the JNI `onCreate` callback.
Keeping `wzp-native` in its own cdylib and loading it via `libloading` means:
1. The app's own `.so` has `crate-type = ["cdylib", "rlib"]` only — no `staticlib`, no symbol leak.
2. `libwzp_native.so` is loaded via `System.loadLibrary` from the JVM side (or `dlopen` from Rust), which triggers the normal Bionic resolver and binds all private symbols against `libc.so` at load time.
3. The C/C++ Oboe bridge is fully isolated inside `libwzp_native.so`'s symbol space — no chance of its archives leaking into `wzp-desktop`'s `.so`.
See `docs/BRANCH-android-rewrite.md` for the full incident postmortem and `docs/incident-tauri-android-init-tcb.md` for the debug log.
### Vendored `audiopus_sys` for libopus / clang-cl cross-compile
The workspace root carries a vendored copy of `audiopus_sys` at `vendor/audiopus_sys/` with a patched `opus/CMakeLists.txt`. This is needed because libopus 1.3.1 gates its per-file `-msse4.1` / `-mssse3` `COMPILE_FLAGS` behind `if(NOT MSVC)`, and under `clang-cl` (used by `cargo-xwin` for Windows cross-compiles) CMake sets `MSVC=1` unconditionally — so the SIMD source files compile without the required target feature and fail to link the intrinsic `always_inline` functions.
The patch introduces an `MSVC_CL` variable that is true only for real `cl.exe` (distinguished via `CMAKE_C_COMPILER_ID STREQUAL "MSVC"`), and flips the eight `if(NOT MSVC)` SIMD guards to `if(NOT MSVC_CL)` so clang-cl gets the GCC-style per-file flags. Wired in via `[patch.crates-io] audiopus_sys = { path = "vendor/audiopus_sys" }` at the workspace root.
This does not affect macOS or Linux builds — on those platforms `MSVC=0` everywhere so the patched logic behaves identically to upstream.
Upstream tracking: xiph/opus#256, xiph/opus PR #257 (both stale).