# Architecture
## System Overview
The Android client is a four-layer stack: Kotlin UI, JNI bridge, Rust engine, and C++ audio I/O. Each layer communicates through well-defined interfaces with minimal coupling.
```mermaid
graph TB
subgraph "Kotlin (Main Thread)"
CA[CallActivity]
VM[CallViewModel]
UI[InCallScreen
Compose UI]
CA --> VM
VM --> UI
end
subgraph "JNI Bridge"
JB[jni_bridge.rs
panic-safe FFI]
end
subgraph "Rust Engine"
ENG[WzpEngine
Orchestrator]
CT[Codec Thread
20ms real-time loop]
NET[Tokio Runtime
2 async workers]
PIPE[Pipeline
Encode/Decode/FEC/Jitter]
end
subgraph "C++ Audio"
OBOE[Oboe Bridge
Capture + Playout callbacks]
RB[Ring Buffers
Lock-free SPSC]
end
subgraph "Network"
QUIC[QUIC Connection
quinn]
RELAY[WZP Relay
SFU Room]
end
VM <-->|"JNI calls
+ JSON stats"| JB
JB <--> ENG
ENG --> CT
ENG --> NET
CT <--> PIPE
CT <-->|"Atomic R/W"| RB
OBOE <-->|"Atomic R/W"| RB
CT <-->|"mpsc channels"| NET
NET <-->|"QUIC datagrams
+ streams"| QUIC
QUIC <--> RELAY
```
## Thread Model
The engine uses four distinct thread contexts, each with specific responsibilities and real-time constraints.
```mermaid
graph LR
subgraph "Android Main Thread"
UI_T["UI + JNI calls
startCall / stopCall / getStats"]
end
subgraph "Oboe Audio Thread (system)"
AUD["Capture callback: mic → ring buf
Playout callback: ring buf → speaker
⚡ Highest priority, no allocations"]
end
subgraph "Codec Thread (wzp-codec)"
COD["20ms loop:
1. Read capture ring buf
2. AEC → AGC → Encode
3. Send to network channel
4. Recv from network channel
5. FEC → Jitter → Decode
6. Write playout ring buf
⚡ Pinned to big core, RT priority"]
end
subgraph "Tokio Runtime (2 workers)"
NET_S["Send task:
Channel → MediaPacket → QUIC datagram"]
NET_R["Recv task:
QUIC datagram → MediaPacket → Channel"]
HS["Handshake:
CallOffer → CallAnswer"]
end
UI_T -->|"mpsc command channel"| COD
COD -->|"tokio::mpsc send_tx"| NET_S
NET_R -->|"tokio::mpsc recv_tx"| COD
AUD <-->|"Atomic ring buffers"| COD
```
### Thread Priorities and Constraints
| Thread | Priority | Allocations | Blocking | Lock-free |
|--------|----------|-------------|----------|-----------|
| Oboe audio | SCHED_FIFO (system) | None | Never | Yes |
| Codec | RT priority, big core | Pre-allocated buffers | sleep(remainder of 20ms) | Ring buf: yes, Stats: Mutex |
| Tokio workers | Normal | Allowed | Async only | N/A |
| Main/JNI | Normal | Allowed | Allowed | N/A |
## Call Lifecycle
```mermaid
sequenceDiagram
participant User
participant UI as InCallScreen
participant VM as CallViewModel
participant ENG as WzpEngine (JNI)
participant NET as Tokio Network
participant RELAY as WZP Relay
User->>UI: Tap CALL
UI->>VM: startCall()
VM->>ENG: init() + startCall(relay, room)
ENG->>ENG: Create tokio runtime
ENG->>NET: Spawn network task
NET->>RELAY: QUIC connect (SNI = room name)
RELAY-->>NET: Connection established
Note over NET,RELAY: Crypto Handshake
NET->>RELAY: CallOffer {identity_pub, ephemeral_pub, signature, profiles}
RELAY-->>NET: CallAnswer {ephemeral_pub, chosen_profile, signature}
NET->>NET: Derive ChaCha20-Poly1305 session
ENG->>ENG: Spawn codec thread
Note over ENG: State → Active
loop Every 20ms
ENG->>ENG: Read mic → AEC → AGC → Encode
ENG->>NET: Encoded frame via channel
NET->>RELAY: MediaPacket via QUIC DATAGRAM
RELAY->>NET: MediaPacket from other peer
NET->>ENG: MediaPacket via channel
ENG->>ENG: FEC → Jitter → Decode → Speaker
end
User->>UI: Tap END
UI->>VM: stopCall()
VM->>ENG: stopCall()
ENG->>ENG: Set running=false, send Stop command
ENG->>ENG: Join codec thread
ENG->>NET: Drop tokio runtime
NET->>RELAY: Connection close
```
## Audio Pipeline Detail
```mermaid
graph LR
subgraph "Capture Path"
MIC[Microphone] -->|"48kHz i16"| OBOE_C[Oboe Capture
Callback]
OBOE_C -->|"ring_write()"| RB_C[Capture
Ring Buffer]
RB_C -->|"read_capture()"| AEC[Echo
Canceller]
AEC --> AGC[Auto Gain
Control]
AGC --> ENC[AdaptiveEncoder
Opus 24k]
ENC -->|"Vec u8"| FEC_E[RaptorQ
FEC Encoder]
FEC_E -->|"send_tx"| CHAN_S[Send Channel]
end
subgraph "Network"
CHAN_S --> PKT_S[MediaPacket
Header + Payload]
PKT_S -->|"QUIC DATAGRAM"| RELAY[Relay SFU]
RELAY -->|"QUIC DATAGRAM"| PKT_R[MediaPacket
Deserialize]
PKT_R -->|"recv_tx"| CHAN_R[Recv Channel]
end
subgraph "Playout Path"
CHAN_R --> FEC_D[RaptorQ
FEC Decoder]
FEC_D --> JB[Jitter Buffer
10-250 pkts]
JB --> DEC[AdaptiveDecoder
Opus 24k]
DEC -->|"48kHz i16"| AEC_REF[AEC Far-End
Reference]
DEC -->|"write_playout()"| RB_P[Playout
Ring Buffer]
RB_P -->|"ring_read()"| OBOE_P[Oboe Playout
Callback]
OBOE_P --> SPK[Speaker]
end
```
### Audio Parameters
| Parameter | Value | Notes |
|-----------|-------|-------|
| Sample rate | 48,000 Hz | Opus native rate |
| Channels | 1 (mono) | VoIP only |
| Frame size | 960 samples | 20ms at 48kHz |
| Ring buffer | 7,680 samples | 160ms (8 frames) |
| Bit depth | 16-bit signed int | PCM format |
| AEC tail | 100ms | Echo canceller filter length |
## Crypto Handshake
```mermaid
sequenceDiagram
participant Client as Android Client
participant Relay as WZP Relay
Note over Client: Identity seed (32 bytes, random per launch)
Note over Client: HKDF → Ed25519 signing key + X25519 static key
Client->>Client: Generate ephemeral X25519 keypair
Client->>Client: Sign(ephemeral_pub || "call-offer") with Ed25519
Client->>Relay: SignalMessage::CallOffer
{identity_pub, ephemeral_pub, signature, [GOOD, DEGRADED, CATASTROPHIC]}
Relay->>Relay: Verify Ed25519 signature
Relay->>Relay: Generate own ephemeral X25519
Relay->>Relay: Sign(ephemeral_pub || "call-answer")
Relay->>Relay: DH(relay_ephemeral, client_ephemeral) → shared secret
Relay->>Relay: HKDF(shared_secret) → ChaCha20-Poly1305 key
Relay->>Client: SignalMessage::CallAnswer
{identity_pub, ephemeral_pub, signature, chosen_profile=GOOD}
Client->>Client: Verify relay signature
Client->>Client: DH(client_ephemeral, relay_ephemeral) → same shared secret
Client->>Client: HKDF(shared_secret) → same ChaCha20-Poly1305 key
Note over Client,Relay: Both sides now have identical session key
Note over Client,Relay: Media packets can be encrypted (not yet applied)
```
### Key Derivation Chain
```
Identity Seed (32 bytes, random)
│
├── HKDF(seed, info="warzone-ed25519") → Ed25519 signing key
│ └── Public key = identity_pub (32 bytes)
│ └── SHA-256(identity_pub)[:16] = fingerprint (16 bytes)
│
└── HKDF(seed, info="warzone-x25519") → X25519 static key (unused currently)
Per-Call Ephemeral:
Random X25519 keypair → ephemeral_pub (sent in CallOffer)
Session Key:
DH(our_ephemeral_secret, peer_ephemeral_pub) → shared_secret
HKDF(shared_secret, info="warzone-session-key") → ChaCha20-Poly1305 key (32 bytes)
```
## QUIC Transport
```mermaid
graph TB
subgraph "QUIC Connection"
EP[Client Endpoint
0.0.0.0:0 UDP]
CONN[Connection to Relay
SNI = room name]
subgraph "Unreliable Channel"
DG_S[Send DATAGRAM
MediaPacket serialized]
DG_R[Recv DATAGRAM
MediaPacket deserialized]
end
subgraph "Reliable Channel"
ST_S[Open bidi stream
JSON length-prefixed
SignalMessage]
ST_R[Accept bidi stream
JSON length-prefixed
SignalMessage]
end
EP --> CONN
CONN --> DG_S
CONN --> DG_R
CONN --> ST_S
CONN --> ST_R
end
```
### QUIC Configuration (VoIP-tuned)
| Setting | Value | Rationale |
|---------|-------|-----------|
| ALPN | `wzp` | Protocol identification |
| Idle timeout | 30s | Keep connection alive during silence |
| Keep-alive | 5s | Prevent NAT timeout |
| Datagram receive buffer | 65 KB | Buffer for burst arrivals |
| Flow control (recv) | 256 KB | Conservative for VoIP |
| Flow control (send) | 128 KB | Prevent bufferbloat |
| TLS | Self-signed certs | Development mode |
| Certificate verification | Disabled | Client accepts any cert |
## MediaPacket Wire Format
```
12-byte header:
┌─────────────────────────────────────────────────┐
│ Byte 0: V(1) T(1) CodecID(4) Q(1) FecHi(1) │
│ Byte 1: FecLo(6) unused(2) │
│ Byte 2-3: Sequence number (u16 BE) │
│ Byte 4-7: Timestamp ms (u32 BE) │
│ Byte 8: FEC block ID │
│ Byte 9: FEC symbol index │
│ Byte 10: Reserved │
│ Byte 11: CSRC count │
├─────────────────────────────────────────────────┤
│ Payload: Opus-encoded audio frame │
├─────────────────────────────────────────────────┤
│ Optional: QualityReport (4 bytes, if Q=1) │
│ loss_pct(u8) rtt_4ms(u8) jitter_ms(u8) │
│ bitrate_cap_kbps(u8) │
└─────────────────────────────────────────────────┘
```
## Relay Room Mode (SFU)
```mermaid
graph LR
subgraph "Room: android"
P1[Phone A
QUIC conn] -->|MediaPacket| RELAY[Relay SFU]
RELAY -->|MediaPacket| P2[Phone B
QUIC conn]
P2 -->|MediaPacket| RELAY
RELAY -->|MediaPacket| P1
end
Note1["Room name from QUIC TLS SNI
No auth required
Packets forwarded to all others"]
```
The relay operates as a Selective Forwarding Unit:
1. Client connects via QUIC, room name extracted from TLS SNI
2. Crypto handshake completes (relay has its own ephemeral identity)
3. Client joins named room
4. All received media packets are forwarded to every other participant in the room
5. Signaling messages are not forwarded (point-to-point with relay)
## Adaptive Quality System
```mermaid
graph TD
QR[QualityReport
loss%, RTT, jitter] --> AQC[AdaptiveQualityController]
AQC -->|"loss<10%, RTT<400ms"| GOOD[GOOD
Opus 24kbps
FEC 20%
20ms frames]
AQC -->|"loss 10-40%
RTT 400-600ms"| DEG[DEGRADED
Opus 6kbps
FEC 50%
40ms frames]
AQC -->|"loss>40%
RTT>600ms"| CAT[CATASTROPHIC
Codec2 1.2kbps
FEC 100%
40ms frames]
GOOD -->|"Hysteresis:
sustained degradation"| DEG
DEG -->|"Sustained improvement"| GOOD
DEG -->|"Further degradation"| CAT
CAT -->|"Improvement"| DEG
```
| Profile | Codec | Bitrate | FEC Ratio | Frame Size | FEC Block |
|---------|-------|---------|-----------|------------|-----------|
| GOOD | Opus 24k | 24 kbps | 20% | 20ms | 5 frames |
| DEGRADED | Opus 6k | 6 kbps | 50% | 40ms | 10 frames |
| CATASTROPHIC | Codec2 1.2k | 1.2 kbps | 100% | 40ms | 8 frames |
## Module Dependency Graph
```mermaid
graph BT
PROTO[wzp-proto
Types, traits, jitter,
quality, session]
CODEC[wzp-codec
Opus, Codec2, AEC,
AGC, resampling]
FEC[wzp-fec
RaptorQ fountain codes]
CRYPTO[wzp-crypto
Ed25519, X25519,
ChaCha20-Poly1305]
TRANSPORT[wzp-transport
QUIC, datagrams,
signaling streams]
ANDROID[wzp-android
Engine, JNI bridge,
Oboe audio, pipeline]
RELAY[wzp-relay
SFU, rooms, auth,
metrics, probes]
CODEC --> PROTO
FEC --> PROTO
CRYPTO --> PROTO
TRANSPORT --> PROTO
ANDROID --> PROTO
ANDROID --> CODEC
ANDROID --> FEC
ANDROID --> CRYPTO
ANDROID --> TRANSPORT
RELAY --> PROTO
RELAY --> CRYPTO
RELAY --> TRANSPORT
```
## File Map
### Kotlin (`android/app/src/main/java/com/wzp/`)
| File | Purpose |
|------|---------|
| `WzpApplication.kt` | App entry, notification channel creation |
| `engine/WzpEngine.kt` | JNI wrapper for native engine |
| `engine/WzpCallback.kt` | Callback interface for engine events |
| `engine/CallStats.kt` | Stats data class with JSON deserialization |
| `ui/call/CallActivity.kt` | Activity host, permissions, theme |
| `ui/call/CallViewModel.kt` | MVVM state holder, stats polling |
| `ui/call/InCallScreen.kt` | Compose UI (idle + in-call states) |
| `service/CallService.kt` | Foreground service, wake/wifi locks |
| `audio/AudioRouteManager.kt` | Speaker/earpiece/Bluetooth routing |
### Rust (`crates/wzp-android/src/`)
| File | Purpose |
|------|---------|
| `lib.rs` | Module declarations |
| `jni_bridge.rs` | JNI FFI (panic-safe, proper jni crate) |
| `engine.rs` | Call orchestrator (threads, channels, lifecycle) |
| `pipeline.rs` | Codec pipeline (AEC, AGC, encode, FEC, jitter, decode) |
| `audio_android.rs` | Oboe backend, SPSC ring buffers, RT scheduling |
| `commands.rs` | Engine command enum |
| `stats.rs` | CallState/CallStats types (serde) |
### C++ (`crates/wzp-android/cpp/`)
| File | Purpose |
|------|---------|
| `oboe_bridge.h` | FFI header for Rust-C++ audio interface |
| `oboe_bridge.cpp` | Oboe capture/playout callbacks, ring buffer I/O |
| `oboe_stub.cpp` | No-op stub for non-Android builds |
### Build
| File | Purpose |
|------|---------|
| `android/app/build.gradle.kts` | Android build config, cargo-ndk task |
| `crates/wzp-android/Cargo.toml` | Rust dependencies (cdylib output) |
| `crates/wzp-android/build.rs` | C++ compilation, Oboe fetch |