feat: wire QUIC transport, JNI bridge, connect UI + add docs

- Replace raw FFI with proper `jni` crate for string marshalling
- Wire QUIC transport in engine: connect to relay, crypto handshake
  (CallOffer/CallAnswer, X25519+Ed25519), send/recv MediaPackets
- Feed received packets into jitter buffer (was previously ignored)
- Add connect screen UI with CALL button (idle state) and in-call
  controls (mute, speaker, hang up, live stats)
- Hardcode relay 172.16.81.125:4433, room "android"
- Add comprehensive docs in docs/android/:
  architecture.md (8 mermaid diagrams), build-guide.md,
  debugging.md, maintenance.md, roadmap.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Claude
2026-04-05 04:43:49 +00:00
parent 780309fede
commit 8d5f6fe044
16 changed files with 1496 additions and 398 deletions

41
docs/android/README.md Normal file

@@ -0,0 +1,41 @@
# WarzonePhone Android Client
The WZP Android client is a native VoIP application built with Kotlin/Jetpack Compose on top of a Rust audio engine. It connects to WZP relay servers over QUIC, providing encrypted voice calls with adaptive quality, forward error correction, and acoustic echo cancellation.
## Quick Start
1. **Build**: `cd android && ./gradlew assembleRelease` (requires NDK 26.1, cargo-ndk)
2. **Install**: `adb install app/build/outputs/apk/release/app-release.apk`
3. **Run**: Open "WZ Phone", tap **CALL** to connect to the hardcoded relay
4. **Relay**: Must be running at the configured address (default `172.16.81.125:4433`)
## Current State (April 2025)
| Feature | Status |
|---------|--------|
| QUIC transport to relay | Working |
| Crypto handshake (X25519 + Ed25519) | Working |
| Opus 24k encoding/decoding | Working |
| Oboe audio I/O (48kHz mono) | Working |
| AEC / AGC signal processing | Working |
| RaptorQ FEC | Wired (repair symbols not sent yet) |
| Jitter buffer | Working |
| Adaptive quality switching | Codec-ready, not network-driven yet |
| Authentication (featherChat) | Skipped (relay has no --auth-url) |
| Media encryption (ChaCha20-Poly1305) | Session derived but not applied to packets |
| Foreground service / wake locks | Implemented, not started from UI |
## Documentation Index
- [Architecture](architecture.md) - System design, data flow diagrams, thread model
- [Build Guide](build-guide.md) - Build environment setup, dependencies, signing
- [Debugging](debugging.md) - Crash diagnosis, logcat filters, common issues
- [Maintenance](maintenance.md) - Code map, dependency management, upgrade paths
- [Roadmap](roadmap.md) - Planned work and known gaps
## Key Design Decisions
- **Rust native engine**: All audio processing, codecs, FEC, crypto, and networking run in Rust. Kotlin is UI-only.
- **Lock-free audio**: SPSC ring buffers with atomic ordering between Oboe C++ callbacks and the Rust codec thread. No mutexes in the audio path.
- **cargo-ndk**: The native library (`libwzp_android.so`) is cross-compiled for `arm64-v8a` using cargo-ndk, invoked automatically by Gradle's `cargoNdkBuild` task.
- **Single-activity Compose**: One `CallActivity` hosts all UI via Jetpack Compose with `CallViewModel` as the state holder.


@@ -0,0 +1,400 @@
# Architecture
## System Overview
The Android client is a four-layer stack: Kotlin UI, JNI bridge, Rust engine, and C++ audio I/O. Each layer communicates through well-defined interfaces with minimal coupling.
```mermaid
graph TB
subgraph "Kotlin (Main Thread)"
CA[CallActivity]
VM[CallViewModel]
UI[InCallScreen<br/>Compose UI]
CA --> VM
VM --> UI
end
subgraph "JNI Bridge"
JB[jni_bridge.rs<br/>panic-safe FFI]
end
subgraph "Rust Engine"
ENG[WzpEngine<br/>Orchestrator]
CT[Codec Thread<br/>20ms real-time loop]
NET[Tokio Runtime<br/>2 async workers]
PIPE[Pipeline<br/>Encode/Decode/FEC/Jitter]
end
subgraph "C++ Audio"
OBOE[Oboe Bridge<br/>Capture + Playout callbacks]
RB[Ring Buffers<br/>Lock-free SPSC]
end
subgraph "Network"
QUIC[QUIC Connection<br/>quinn]
RELAY[WZP Relay<br/>SFU Room]
end
VM <-->|"JNI calls<br/>+ JSON stats"| JB
JB <--> ENG
ENG --> CT
ENG --> NET
CT <--> PIPE
CT <-->|"Atomic R/W"| RB
OBOE <-->|"Atomic R/W"| RB
CT <-->|"mpsc channels"| NET
NET <-->|"QUIC datagrams<br/>+ streams"| QUIC
QUIC <--> RELAY
```
## Thread Model
The engine uses four distinct thread contexts, each with specific responsibilities and real-time constraints.
```mermaid
graph LR
subgraph "Android Main Thread"
UI_T["UI + JNI calls<br/>startCall / stopCall / getStats"]
end
subgraph "Oboe Audio Thread (system)"
AUD["Capture callback: mic → ring buf<br/>Playout callback: ring buf → speaker<br/>⚡ Highest priority, no allocations"]
end
subgraph "Codec Thread (wzp-codec)"
COD["20ms loop:<br/>1. Read capture ring buf<br/>2. AEC → AGC → Encode<br/>3. Send to network channel<br/>4. Recv from network channel<br/>5. FEC → Jitter → Decode<br/>6. Write playout ring buf<br/>⚡ Pinned to big core, RT priority"]
end
subgraph "Tokio Runtime (2 workers)"
NET_S["Send task:<br/>Channel → MediaPacket → QUIC datagram"]
NET_R["Recv task:<br/>QUIC datagram → MediaPacket → Channel"]
HS["Handshake:<br/>CallOffer → CallAnswer"]
end
UI_T -->|"mpsc command channel"| COD
COD -->|"tokio::mpsc send_tx"| NET_S
NET_R -->|"tokio::mpsc recv_tx"| COD
AUD <-->|"Atomic ring buffers"| COD
```
### Thread Priorities and Constraints
| Thread | Priority | Allocations | Blocking | Lock-free |
|--------|----------|-------------|----------|-----------|
| Oboe audio | SCHED_FIFO (system) | None | Never | Yes |
| Codec | RT priority, big core | Pre-allocated buffers | sleep(remainder of 20ms) | Ring buf: yes, Stats: Mutex |
| Tokio workers | Normal | Allowed | Async only | N/A |
| Main/JNI | Normal | Allowed | Allowed | N/A |
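
The "sleep(remainder of 20ms)" entry for the codec thread can be illustrated with a deadline-based loop. The following is a minimal sketch, not the engine's actual code: `sleep_remainder` and `run_codec_loop` are hypothetical names, and the deadlines are computed from the loop start so that one slow tick does not shift every later tick.

```rust
use std::time::{Duration, Instant};

/// Frame period of the codec loop (20 ms, as in the table above).
const FRAME_PERIOD: Duration = Duration::from_millis(20);

/// Time left until the deadline of tick `tick` (0-based), measured from `start`.
/// Returns zero when the tick has already overrun its deadline.
fn sleep_remainder(start: Instant, now: Instant, tick: u32) -> Duration {
    let deadline = start + FRAME_PERIOD * (tick + 1);
    deadline.saturating_duration_since(now)
}

/// Hypothetical loop driver: runs `work` once per frame for `ticks` frames,
/// sleeping for whatever is left of each 20 ms slot.
fn run_codec_loop(ticks: u32, mut work: impl FnMut(u32)) {
    let start = Instant::now();
    for tick in 0..ticks {
        work(tick);
        std::thread::sleep(sleep_remainder(start, Instant::now(), tick));
    }
}
```

Calling `run_codec_loop(50, |_| { /* encode + decode one frame */ })` would run for about one second. A real-time implementation would likely also need priority and core pinning, which this sketch leaves out.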
## Call Lifecycle
```mermaid
sequenceDiagram
participant User
participant UI as InCallScreen
participant VM as CallViewModel
participant ENG as WzpEngine (JNI)
participant NET as Tokio Network
participant RELAY as WZP Relay
User->>UI: Tap CALL
UI->>VM: startCall()
VM->>ENG: init() + startCall(relay, room)
ENG->>ENG: Create tokio runtime
ENG->>NET: Spawn network task
NET->>RELAY: QUIC connect (SNI = room name)
RELAY-->>NET: Connection established
Note over NET,RELAY: Crypto Handshake
NET->>RELAY: CallOffer {identity_pub, ephemeral_pub, signature, profiles}
RELAY-->>NET: CallAnswer {ephemeral_pub, chosen_profile, signature}
NET->>NET: Derive ChaCha20-Poly1305 session
ENG->>ENG: Spawn codec thread
Note over ENG: State → Active
loop Every 20ms
ENG->>ENG: Read mic → AEC → AGC → Encode
ENG->>NET: Encoded frame via channel
NET->>RELAY: MediaPacket via QUIC DATAGRAM
RELAY->>NET: MediaPacket from other peer
NET->>ENG: MediaPacket via channel
ENG->>ENG: FEC → Jitter → Decode → Speaker
end
User->>UI: Tap END
UI->>VM: stopCall()
VM->>ENG: stopCall()
ENG->>ENG: Set running=false, send Stop command
ENG->>ENG: Join codec thread
ENG->>NET: Drop tokio runtime
NET->>RELAY: Connection close
```
## Audio Pipeline Detail
```mermaid
graph LR
subgraph "Capture Path"
MIC[Microphone] -->|"48kHz i16"| OBOE_C[Oboe Capture<br/>Callback]
OBOE_C -->|"ring_write()"| RB_C[Capture<br/>Ring Buffer]
RB_C -->|"read_capture()"| AEC[Echo<br/>Canceller]
AEC --> AGC[Auto Gain<br/>Control]
AGC --> ENC[AdaptiveEncoder<br/>Opus 24k]
ENC -->|"Vec u8"| FEC_E[RaptorQ<br/>FEC Encoder]
FEC_E -->|"send_tx"| CHAN_S[Send Channel]
end
subgraph "Network"
CHAN_S --> PKT_S[MediaPacket<br/>Header + Payload]
PKT_S -->|"QUIC DATAGRAM"| RELAY[Relay SFU]
RELAY -->|"QUIC DATAGRAM"| PKT_R[MediaPacket<br/>Deserialize]
PKT_R -->|"recv_tx"| CHAN_R[Recv Channel]
end
subgraph "Playout Path"
CHAN_R --> FEC_D[RaptorQ<br/>FEC Decoder]
FEC_D --> JB[Jitter Buffer<br/>10-250 pkts]
JB --> DEC[AdaptiveDecoder<br/>Opus 24k]
DEC -->|"48kHz i16"| AEC_REF[AEC Far-End<br/>Reference]
DEC -->|"write_playout()"| RB_P[Playout<br/>Ring Buffer]
RB_P -->|"ring_read()"| OBOE_P[Oboe Playout<br/>Callback]
OBOE_P --> SPK[Speaker]
end
```
### Audio Parameters
| Parameter | Value | Notes |
|-----------|-------|-------|
| Sample rate | 48,000 Hz | Opus native rate |
| Channels | 1 (mono) | VoIP only |
| Frame size | 960 samples | 20ms at 48kHz |
| Ring buffer | 7,680 samples | 160ms (8 frames) |
| Bit depth | 16-bit signed int | PCM format |
| AEC tail | 100ms | Echo canceller filter length |
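
To make the ring buffer sizing and the lock-free hand-off concrete, here is a small SPSC ring in safe Rust. It is a sketch under assumptions: the real buffers live in `audio_android.rs` and the Oboe bridge and may be structured differently, and the `SpscRing` type and its methods are invented for illustration. Only the constants (960-sample frames, 7,680-sample capacity) come from the table above.

```rust
use std::sync::atomic::{AtomicI16, AtomicUsize, Ordering};

const FRAME_SAMPLES: usize = 960; // 20 ms at 48 kHz
const RING_CAPACITY: usize = 8 * FRAME_SAMPLES; // 7,680 samples = 160 ms

/// Single-producer / single-consumer ring of i16 samples.
/// Indices grow monotonically; the slot is `index % capacity`.
struct SpscRing {
    slots: Vec<AtomicI16>,
    write_idx: AtomicUsize, // advanced only by the producer
    read_idx: AtomicUsize,  // advanced only by the consumer
}

impl SpscRing {
    fn new(capacity: usize) -> Self {
        SpscRing {
            slots: (0..capacity).map(|_| AtomicI16::new(0)).collect(),
            write_idx: AtomicUsize::new(0),
            read_idx: AtomicUsize::new(0),
        }
    }

    /// Producer side: copies as many samples as fit and returns that count.
    fn write(&self, samples: &[i16]) -> usize {
        let w = self.write_idx.load(Ordering::Relaxed);
        let r = self.read_idx.load(Ordering::Acquire);
        let free = self.slots.len() - (w - r);
        let n = samples.len().min(free);
        for (i, &s) in samples[..n].iter().enumerate() {
            self.slots[(w + i) % self.slots.len()].store(s, Ordering::Relaxed);
        }
        // Release publishes the sample stores above to the consumer.
        self.write_idx.store(w + n, Ordering::Release);
        n
    }

    /// Consumer side: fills `out` with available samples and returns that count.
    fn read(&self, out: &mut [i16]) -> usize {
        let r = self.read_idx.load(Ordering::Relaxed);
        let w = self.write_idx.load(Ordering::Acquire);
        let n = out.len().min(w - r);
        for (i, slot) in out[..n].iter_mut().enumerate() {
            *slot = self.slots[(r + i) % self.slots.len()].load(Ordering::Relaxed);
        }
        // Release tells the producer these slots are free again.
        self.read_idx.store(r + n, Ordering::Release);
        n
    }
}
```

With eight frames of capacity, the ninth `write` of a full frame returns 0 until the consumer has read something, which is the overrun case a capture callback has to tolerate without blocking.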
## Crypto Handshake
```mermaid
sequenceDiagram
participant Client as Android Client
participant Relay as WZP Relay
Note over Client: Identity seed (32 bytes, random per launch)
Note over Client: HKDF → Ed25519 signing key + X25519 static key
Client->>Client: Generate ephemeral X25519 keypair
Client->>Client: Sign(ephemeral_pub || "call-offer") with Ed25519
Client->>Relay: SignalMessage::CallOffer<br/>{identity_pub, ephemeral_pub, signature, [GOOD, DEGRADED, CATASTROPHIC]}
Relay->>Relay: Verify Ed25519 signature
Relay->>Relay: Generate own ephemeral X25519
Relay->>Relay: Sign(ephemeral_pub || "call-answer")
Relay->>Relay: DH(relay_ephemeral, client_ephemeral) → shared secret
Relay->>Relay: HKDF(shared_secret) → ChaCha20-Poly1305 key
Relay->>Client: SignalMessage::CallAnswer<br/>{identity_pub, ephemeral_pub, signature, chosen_profile=GOOD}
Client->>Client: Verify relay signature
Client->>Client: DH(client_ephemeral, relay_ephemeral) → same shared secret
Client->>Client: HKDF(shared_secret) → same ChaCha20-Poly1305 key
Note over Client,Relay: Both sides now have identical session key
Note over Client,Relay: Media packets can be encrypted (not yet applied)
```
### Key Derivation Chain
```
Identity Seed (32 bytes, random)
├── HKDF(seed, info="warzone-ed25519") → Ed25519 signing key
│   └── Public key = identity_pub (32 bytes)
│       └── SHA-256(identity_pub)[:16] = fingerprint (16 bytes)
└── HKDF(seed, info="warzone-x25519") → X25519 static key (unused currently)

Per-Call Ephemeral:
  Random X25519 keypair → ephemeral_pub (sent in CallOffer)

Session Key:
  DH(our_ephemeral_secret, peer_ephemeral_pub) → shared_secret
  HKDF(shared_secret, info="warzone-session-key") → ChaCha20-Poly1305 key (32 bytes)
```
## QUIC Transport
```mermaid
graph TB
subgraph "QUIC Connection"
EP[Client Endpoint<br/>0.0.0.0:0 UDP]
CONN[Connection to Relay<br/>SNI = room name]
subgraph "Unreliable Channel"
DG_S[Send DATAGRAM<br/>MediaPacket serialized]
DG_R[Recv DATAGRAM<br/>MediaPacket deserialized]
end
subgraph "Reliable Channel"
ST_S[Open bidi stream<br/>JSON length-prefixed<br/>SignalMessage]
ST_R[Accept bidi stream<br/>JSON length-prefixed<br/>SignalMessage]
end
EP --> CONN
CONN --> DG_S
CONN --> DG_R
CONN --> ST_S
CONN --> ST_R
end
```
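The reliable channel carries "JSON length-prefixed" signaling messages. One plausible framing is sketched below; the 4-byte big-endian prefix, the size limit, and the function names `write_frame` / `read_frame` are assumptions for illustration, since the docs do not state the prefix width used by `wzp-transport`.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length, then the JSON bytes.
/// The prefix width and byte order are assumptions of this sketch.
fn write_frame<W: Write>(w: &mut W, json: &str) -> io::Result<()> {
    if json.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "frame too large"));
    }
    w.write_all(&(json.len() as u32).to_be_bytes())?;
    w.write_all(json.as_bytes())
}

/// Reads one frame, rejecting lengths above `max_len` so that a corrupt
/// prefix cannot trigger a huge allocation.
fn read_frame<R: Read>(r: &mut R, max_len: usize) -> io::Result<String> {
    let mut len_buf = [0u8; 4];
    r.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > max_len {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame exceeds limit"));
    }
    let mut body = vec![0u8; len];
    r.read_exact(&mut body)?;
    String::from_utf8(body).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

The same pair works over any `Read`/`Write` implementation, so it can be exercised against an in-memory `Vec<u8>` before being attached to a QUIC stream.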
### QUIC Configuration (VoIP-tuned)
| Setting | Value | Rationale |
|---------|-------|-----------|
| ALPN | `wzp` | Protocol identification |
| Idle timeout | 30s | Keep connection alive during silence |
| Keep-alive | 5s | Prevent NAT timeout |
| Datagram receive buffer | 65 KB | Buffer for burst arrivals |
| Flow control (recv) | 256 KB | Conservative for VoIP |
| Flow control (send) | 128 KB | Prevent bufferbloat |
| TLS | Self-signed certs | Development mode |
| Certificate verification | Disabled | Client accepts any cert |
## MediaPacket Wire Format
```
12-byte header:
┌─────────────────────────────────────────────────┐
│ Byte 0:   V(1) T(1) CodecID(4) Q(1) FecHi(1)    │
│ Byte 1:   FecLo(6) unused(2)                    │
│ Byte 2-3: Sequence number (u16 BE)              │
│ Byte 4-7: Timestamp ms (u32 BE)                 │
│ Byte 8:   FEC block ID                          │
│ Byte 9:   FEC symbol index                      │
│ Byte 10:  Reserved                              │
│ Byte 11:  CSRC count                            │
├─────────────────────────────────────────────────┤
│ Payload: Opus-encoded audio frame               │
├─────────────────────────────────────────────────┤
│ Optional: QualityReport (4 bytes, if Q=1)       │
│   loss_pct(u8) rtt_4ms(u8) jitter_ms(u8)        │
│   bitrate_cap_kbps(u8)                          │
└─────────────────────────────────────────────────┘
```
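The header layout above can be packed and unpacked with a few shifts and masks. This is a sketch rather than the `wzp-proto` implementation: the struct and field names are invented, and it assumes fields are laid out most-significant-bit first within bytes 0 and 1, with `FecHi`/`FecLo` forming one 7-bit value.

```rust
/// Fields of the 12-byte header. Names follow the diagram; the MSB-first
/// bit order inside bytes 0 and 1 is an assumption of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct MediaHeader {
    version: u8,       // 1 bit
    t_flag: u8,        // 1 bit ("T" in the diagram)
    codec_id: u8,      // 4 bits
    has_quality: bool, // Q bit: a QualityReport follows the payload
    fec_ratio: u8,     // 7 bits: FecHi(1) + FecLo(6)
    seq: u16,
    timestamp_ms: u32,
    fec_block: u8,
    fec_symbol: u8,
    csrc_count: u8,
}

impl MediaHeader {
    fn encode(&self) -> [u8; 12] {
        let mut b = [0u8; 12];
        b[0] = ((self.version & 1) << 7)
            | ((self.t_flag & 1) << 6)
            | ((self.codec_id & 0x0F) << 2)
            | ((self.has_quality as u8) << 1)
            | ((self.fec_ratio >> 6) & 1);
        b[1] = (self.fec_ratio & 0x3F) << 2;
        b[2..4].copy_from_slice(&self.seq.to_be_bytes());
        b[4..8].copy_from_slice(&self.timestamp_ms.to_be_bytes());
        b[8] = self.fec_block;
        b[9] = self.fec_symbol;
        // b[10] is reserved and stays zero.
        b[11] = self.csrc_count;
        b
    }

    fn decode(b: &[u8; 12]) -> MediaHeader {
        MediaHeader {
            version: b[0] >> 7,
            t_flag: (b[0] >> 6) & 1,
            codec_id: (b[0] >> 2) & 0x0F,
            has_quality: ((b[0] >> 1) & 1) == 1,
            fec_ratio: ((b[0] & 1) << 6) | (b[1] >> 2),
            seq: u16::from_be_bytes([b[2], b[3]]),
            timestamp_ms: u32::from_be_bytes([b[4], b[5], b[6], b[7]]),
            fec_block: b[8],
            fec_symbol: b[9],
            csrc_count: b[11],
        }
    }
}
```

A round trip through `encode` and `decode` should return the original header as long as each field fits its bit width; values that are too wide are silently masked in this sketch.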
## Relay Room Mode (SFU)
```mermaid
graph LR
subgraph "Room: android"
P1[Phone A<br/>QUIC conn] -->|MediaPacket| RELAY[Relay SFU]
RELAY -->|MediaPacket| P2[Phone B<br/>QUIC conn]
P2 -->|MediaPacket| RELAY
RELAY -->|MediaPacket| P1
end
Note1["Room name from QUIC TLS SNI<br/>No auth required<br/>Packets forwarded to all others"]
```
The relay operates as a Selective Forwarding Unit:
1. Client connects via QUIC, room name extracted from TLS SNI
2. Crypto handshake completes (relay has its own ephemeral identity)
3. Client joins named room
4. All received media packets are forwarded to every other participant in the room
5. Signaling messages are not forwarded (point-to-point with relay)
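
The forwarding rule in step 4 can be sketched with an in-memory room. This is illustrative only: `Room`, `join`, and `forward` are hypothetical names, and the per-participant `Vec` stands in for whatever send path the relay actually uses.

```rust
use std::collections::HashMap;

type ParticipantId = u32;

/// In-memory stand-in for a relay room: each participant has an outbound queue.
#[derive(Default)]
struct Room {
    outboxes: HashMap<ParticipantId, Vec<Vec<u8>>>,
}

impl Room {
    fn join(&mut self, id: ParticipantId) {
        self.outboxes.entry(id).or_default();
    }

    /// Queues `packet` for every participant except the sender.
    /// Returns how many copies were queued.
    fn forward(&mut self, sender: ParticipantId, packet: &[u8]) -> usize {
        let mut copies = 0;
        for (&id, outbox) in self.outboxes.iter_mut() {
            if id != sender {
                outbox.push(packet.to_vec());
                copies += 1;
            }
        }
        copies
    }
}
```

With three participants, one incoming packet produces two outgoing copies, so relay egress grows with room size even though the relay never decodes audio.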
## Adaptive Quality System
```mermaid
graph TD
QR[QualityReport<br/>loss%, RTT, jitter] --> AQC[AdaptiveQualityController]
AQC -->|"loss<10%, RTT<400ms"| GOOD[GOOD<br/>Opus 24kbps<br/>FEC 20%<br/>20ms frames]
AQC -->|"loss 10-40%<br/>RTT 400-600ms"| DEG[DEGRADED<br/>Opus 6kbps<br/>FEC 50%<br/>40ms frames]
AQC -->|"loss>40%<br/>RTT>600ms"| CAT[CATASTROPHIC<br/>Codec2 1.2kbps<br/>FEC 100%<br/>40ms frames]
GOOD -->|"Hysteresis:<br/>sustained degradation"| DEG
DEG -->|"Sustained improvement"| GOOD
DEG -->|"Further degradation"| CAT
CAT -->|"Improvement"| DEG
```
| Profile | Codec | Bitrate | FEC Ratio | Frame Size | FEC Block |
|---------|-------|---------|-----------|------------|-----------|
| GOOD | Opus 24k | 24 kbps | 20% | 20ms | 5 frames |
| DEGRADED | Opus 6k | 6 kbps | 50% | 40ms | 10 frames |
| CATASTROPHIC | Codec2 1.2k | 1.2 kbps | 100% | 40ms | 8 frames |
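
As a rough illustration of the thresholds in the diagram, the sketch below decodes a 4-byte `QualityReport` and picks a target profile. Several points are assumptions: `rtt_4ms` is read as RTT in 4 ms units, `loss_pct` as a plain percentage, and the loss and RTT conditions are combined by taking the worse of the two. Hysteresis, which the diagram requires before actually switching, is deliberately left out.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Profile {
    Good,
    Degraded,
    Catastrophic,
}

/// Decoded QualityReport trailer (field meanings partly assumed, see above).
struct QualityReport {
    loss_pct: u8,
    rtt_ms: u32,
    jitter_ms: u8,
    bitrate_cap_kbps: u8,
}

fn parse_report(b: [u8; 4]) -> QualityReport {
    QualityReport {
        loss_pct: b[0],
        rtt_ms: b[1] as u32 * 4, // rtt_4ms is assumed to count 4 ms steps
        jitter_ms: b[2],
        bitrate_cap_kbps: b[3],
    }
}

/// Stateless classification using the thresholds from the diagram.
fn target_profile(r: &QualityReport) -> Profile {
    if r.loss_pct > 40 || r.rtt_ms > 600 {
        Profile::Catastrophic
    } else if r.loss_pct >= 10 || r.rtt_ms >= 400 {
        Profile::Degraded
    } else {
        Profile::Good
    }
}
```

A controller built on this would presumably feed `target_profile` into a counter and only switch after several consecutive reports agree, which is one common way to implement the hysteresis shown above.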
## Module Dependency Graph
```mermaid
graph BT
PROTO[wzp-proto<br/>Types, traits, jitter,<br/>quality, session]
CODEC[wzp-codec<br/>Opus, Codec2, AEC,<br/>AGC, resampling]
FEC[wzp-fec<br/>RaptorQ fountain codes]
CRYPTO[wzp-crypto<br/>Ed25519, X25519,<br/>ChaCha20-Poly1305]
TRANSPORT[wzp-transport<br/>QUIC, datagrams,<br/>signaling streams]
ANDROID[wzp-android<br/>Engine, JNI bridge,<br/>Oboe audio, pipeline]
RELAY[wzp-relay<br/>SFU, rooms, auth,<br/>metrics, probes]
CODEC --> PROTO
FEC --> PROTO
CRYPTO --> PROTO
TRANSPORT --> PROTO
ANDROID --> PROTO
ANDROID --> CODEC
ANDROID --> FEC
ANDROID --> CRYPTO
ANDROID --> TRANSPORT
RELAY --> PROTO
RELAY --> CRYPTO
RELAY --> TRANSPORT
```
## File Map
### Kotlin (`android/app/src/main/java/com/wzp/`)
| File | Purpose |
|------|---------|
| `WzpApplication.kt` | App entry, notification channel creation |
| `engine/WzpEngine.kt` | JNI wrapper for native engine |
| `engine/WzpCallback.kt` | Callback interface for engine events |
| `engine/CallStats.kt` | Stats data class with JSON deserialization |
| `ui/call/CallActivity.kt` | Activity host, permissions, theme |
| `ui/call/CallViewModel.kt` | MVVM state holder, stats polling |
| `ui/call/InCallScreen.kt` | Compose UI (idle + in-call states) |
| `service/CallService.kt` | Foreground service, wake/wifi locks |
| `audio/AudioRouteManager.kt` | Speaker/earpiece/Bluetooth routing |
### Rust (`crates/wzp-android/src/`)
| File | Purpose |
|------|---------|
| `lib.rs` | Module declarations |
| `jni_bridge.rs` | JNI FFI (panic-safe, proper jni crate) |
| `engine.rs` | Call orchestrator (threads, channels, lifecycle) |
| `pipeline.rs` | Codec pipeline (AEC, AGC, encode, FEC, jitter, decode) |
| `audio_android.rs` | Oboe backend, SPSC ring buffers, RT scheduling |
| `commands.rs` | Engine command enum |
| `stats.rs` | CallState/CallStats types (serde) |
### C++ (`crates/wzp-android/cpp/`)
| File | Purpose |
|------|---------|
| `oboe_bridge.h` | FFI header for Rust-C++ audio interface |
| `oboe_bridge.cpp` | Oboe capture/playout callbacks, ring buffer I/O |
| `oboe_stub.cpp` | No-op stub for non-Android builds |
### Build
| File | Purpose |
|------|---------|
| `android/app/build.gradle.kts` | Android build config, cargo-ndk task |
| `crates/wzp-android/Cargo.toml` | Rust dependencies (cdylib output) |
| `crates/wzp-android/build.rs` | C++ compilation, Oboe fetch |

155
docs/android/build-guide.md Normal file

@@ -0,0 +1,155 @@
# Build Guide
## Prerequisites
| Tool | Version | Purpose |
|------|---------|---------|
| JDK | 17 | Android Gradle builds |
| Android SDK | 34 | Compile SDK |
| Android NDK | 26.1.10909125 | Native C++/Rust compilation |
| Rust | 1.85+ | Native engine (edition 2024) |
| cargo-ndk | latest | Cross-compile Rust → Android |
| `aarch64-linux-android` target | - | Rust target for ARM64 |
### Install the Rust Android target and cargo-ndk
```bash
rustup target add aarch64-linux-android
cargo install cargo-ndk
```
### Environment Variables
```bash
export JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"
export ANDROID_HOME="$HOME/android-sdk"
export ANDROID_NDK_HOME="$ANDROID_HOME/ndk/26.1.10909125"
# For manual cargo-ndk builds (Gradle sets these automatically):
export CC_aarch64_linux_android="$ANDROID_NDK_HOME/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android21-clang"
export CXX_aarch64_linux_android="$ANDROID_NDK_HOME/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android21-clang++"
export AR_aarch64_linux_android="$ANDROID_NDK_HOME/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-ar"
```
## Build Commands
### Full Build (Gradle drives everything)
```bash
cd android
./gradlew assembleRelease
```
This runs:
1. `cargoNdkBuild` task: invokes `cargo ndk -t arm64-v8a -o app/src/main/jniLibs build --release -p wzp-android`
2. Compiles Kotlin/Compose code
3. Packages APK with signing
### Native Library Only
```bash
cargo ndk -t arm64-v8a -o android/app/src/main/jniLibs build --release -p wzp-android
```
Output: `android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so`
### Skip Native Rebuild
If the `.so` hasn't changed:
```bash
cd android
./gradlew assembleRelease -x cargoNdkBuild
```
### Debug Build
```bash
cd android
./gradlew assembleDebug
```
Debug APK is ~8.9 MB (unstripped `.so`), release is ~6.9 MB.
## Signing
### Debug
```
Keystore: android/keystore/wzp-debug.jks
Password: android
Key alias: wzp-debug
```
### Release
```
Keystore: android/keystore/wzp-release.jks
Password: wzphone2024
Key alias: wzp-release
```
Both keystores are checked into the repo for development convenience. For production, replace with proper key management.
## Build Artifacts
| Artifact | Path | Size |
|----------|------|------|
| Debug APK | `android/app/build/outputs/apk/debug/app-debug.apk` | ~8.9 MB |
| Release APK | `android/app/build/outputs/apk/release/app-release.apk` | ~6.9 MB |
| Native lib | `android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so` | ~5 MB |
## ABI Support
Currently only `arm64-v8a` (ARM64) is built. This covers 95%+ of modern Android devices.
To add more ABIs, edit `build.gradle.kts`:
```kotlin
ndk { abiFilters += listOf("arm64-v8a", "armeabi-v7a") }
```
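Then update the cargo-ndk command in the `cargoNdkBuild` task, and install the matching Rust target (`rustup target add armv7-linux-androideabi` for `armeabi-v7a`):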
```kotlin
commandLine("cargo", "ndk", "-t", "arm64-v8a", "-t", "armeabi-v7a", ...)
```
## Oboe Dependency
The Oboe C++ audio library is fetched at build time by `build.rs`:
1. Attempts `git clone` of Oboe 1.8.1 into `$OUT_DIR/oboe`
2. If successful, compiles `oboe_bridge.cpp` with Oboe headers
3. If clone fails (no network), falls back to `oboe_stub.cpp` (no-op audio)
This means the **first build requires internet access** to fetch Oboe. Subsequent builds use the cached checkout.
## Common Build Issues
### `cargo ndk` not found
```bash
cargo install cargo-ndk
```
### Missing Android target
```bash
rustup target add aarch64-linux-android
```
### NDK not found
Ensure `ANDROID_NDK_HOME` points to the NDK directory containing `toolchains/llvm/`.
### C++ compilation errors
Check that `CXX_aarch64_linux_android` points to a valid clang++ from the NDK.
### Gradle daemon issues
```bash
./gradlew --stop
./gradlew assembleRelease --no-daemon
```

214
docs/android/debugging.md Normal file

@@ -0,0 +1,214 @@
# Debugging Guide
## Crash on Launch
### Symptom: App crashes immediately after opening
**Most likely cause: Namespace mismatch in AndroidManifest.xml**
The Gradle namespace is `com.wzp.phone` but all Kotlin classes are in package `com.wzp.*`. If the manifest uses shorthand names (`.WzpApplication`, `.ui.call.CallActivity`), Android resolves them as `com.wzp.phone.WzpApplication` which doesn't exist.
**Fix**: Always use fully-qualified class names in the manifest:
```xml
<!-- WRONG -->
<application android:name=".WzpApplication">
<activity android:name=".ui.call.CallActivity">
<!-- CORRECT -->
<application android:name="com.wzp.WzpApplication">
<activity android:name="com.wzp.ui.call.CallActivity">
```
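The resolution rule behind this crash is simple string concatenation. The sketch below models only the leading-dot case described above; `resolve_class_name` is a hypothetical helper written for illustration, not an Android API.

```rust
/// Models how a manifest `android:name` with a leading dot is expanded:
/// the namespace is prepended. Fully-qualified names pass through unchanged.
fn resolve_class_name(namespace: &str, name: &str) -> String {
    if name.starts_with('.') {
        format!("{}{}", namespace, name)
    } else {
        name.to_string()
    }
}
```

With namespace `com.wzp.phone`, the shorthand `.WzpApplication` expands to `com.wzp.phone.WzpApplication`, while the class actually lives at `com.wzp.WzpApplication`, hence the crash.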
### Symptom: Crash in `System.loadLibrary("wzp_android")`
The native `.so` is missing or incompatible. Check:
```bash
# Verify the .so exists in the APK
unzip -l app-release.apk | grep libwzp
# Should show: lib/arm64-v8a/libwzp_android.so
# Verify ABI matches device
adb shell getprop ro.product.cpu.abi
# Should return: arm64-v8a
```
### Symptom: Crash when calling `nativeGetStats()` (returns null jstring)
The JNI bridge should return a valid `jstring` rather than a null pointer. As a safeguard, the Kotlin side declares the native method's return type as `String?` (nullable), falls back to `"{}"` on null, and wraps the call in try/catch:
```kotlin
fun getStats(): String {
    if (nativeHandle == 0L) return "{}"
    return try {
        nativeGetStats(nativeHandle) ?: "{}"
    } catch (_: Exception) {
        "{}"
    }
}
```
### Symptom: Tracing subscriber panic
`tracing_subscriber::fmt()` writes to stdout, which is not connected to logcat on Android. The init call was removed. If you need logging, use the `android_logger` crate instead.
## Logcat Filters
### View all WZP logs
```bash
adb logcat -s wzp-android:V wzp-codec:V wzp-net:V
```
### View Rust tracing output (if android_logger is added)
```bash
adb logcat | grep -E "(wzp|WzpEngine|CallActivity)"
```
### View Oboe audio logs
```bash
adb logcat -s AAudio:V oboe:V
```
### View native crashes
```bash
adb logcat -s DEBUG:V libc:V
```
Look for `signal 11 (SIGSEGV)` or `signal 6 (SIGABRT)` with a backtrace in `libwzp_android.so`.
### Symbolicate native crash
```bash
# Find the .so with debug symbols (before stripping)
SO_PATH="target/aarch64-linux-android/release/libwzp_android.so"
# Use addr2line from NDK
"$ANDROID_NDK_HOME/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-addr2line" \
  -e "$SO_PATH" -f 0x<address_from_crash>
```
## Network Issues
### Call stuck on "Connecting..."
The QUIC handshake to the relay is failing. Common causes:
1. **Relay not running**: Verify the relay is listening:
```bash
nc -zvu 172.16.81.125 4433
```
2. **Wrong relay address**: Hardcoded in `CallViewModel.kt`:
```kotlin
const val DEFAULT_RELAY = "172.16.81.125:4433"
```
3. **QUIC blocked by firewall**: QUIC uses UDP. Many networks block UDP traffic. Ensure UDP port 4433 is open.
4. **TLS handshake failure**: The client uses `client_config()` which disables certificate verification. If the relay's QUIC config changed, this may fail.
### Connected but no audio
1. **Microphone permission denied**: Check Android settings. The app requests `RECORD_AUDIO` on first launch.
2. **Oboe failed to start**: The codec thread logs this. Check logcat for "failed to start audio".
3. **Ring buffer underrun**: The stats overlay shows "Under" count. High underruns mean the codec thread isn't keeping up.
4. **Network not forwarding**: If both phones show "Active" but frame counters aren't increasing, the relay may not be forwarding. Check relay logs.
### High packet loss
The stats overlay shows loss percentage. Common causes:
- Wi-Fi congestion (try cellular or move closer to AP)
- UDP throttling by carrier/ISP
- Relay overloaded (check relay metrics)
## Audio Issues
### Echo
AEC (Acoustic Echo Cancellation) is enabled by default with a 100ms tail. If echo persists:
- The AEC may need a longer tail for the specific acoustic environment
- Speaker volume too high overwhelms the canceller
- Check that `last_decoded_farend` is being set (playout path working)
### Robot voice / glitching
Usually caused by jitter buffer underruns. The jitter buffer adapts between 10-250 packets. Check:
- `jitter_buffer_depth` in stats (should be > 0 during active call)
- `underruns` counter (should not climb rapidly)
- Network jitter (high jitter_ms causes adaptation)
### No sound from speaker
1. Check `isSpeaker` state in the UI
2. Oboe playout stream may have failed — check logcat for Oboe errors
3. Ring buffer might be empty — check `framesDecoded` counter
## JNI Issues
### `UnsatisfiedLinkError: No implementation found for...`
The JNI function name doesn't match. JNI names must follow the pattern:
```
Java_com_wzp_engine_WzpEngine_<methodName>
```
If the package structure changes, all JNI function names must be updated in `jni_bridge.rs`.
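
The naming pattern can be generated mechanically, which helps when checking `jni_bridge.rs` against the Kotlin package. The sketch below covers only ASCII identifiers: package separators become `_` and a literal underscore becomes `_1`, following the JNI name-mangling rules. The helper name `jni_symbol` is hypothetical, and overloaded methods (which append a mangled signature) are out of scope.

```rust
/// Builds the exported symbol name for a native method, for ASCII names only.
/// `class_fqn` is the dotted class name, e.g. "com.wzp.engine.WzpEngine".
fn jni_symbol(class_fqn: &str, method: &str) -> String {
    fn mangle(s: &str) -> String {
        let mut out = String::new();
        for c in s.chars() {
            match c {
                '.' | '/' => out.push('_'),    // package separator
                '_' => out.push_str("_1"),     // escaped underscore
                other => out.push(other),
            }
        }
        out
    }
    format!("Java_{}_{}", mangle(class_fqn), mangle(method))
}
```

For example, `jni_symbol("com.wzp.engine.WzpEngine", "nativeGetStats")` yields `Java_com_wzp_engine_WzpEngine_nativeGetStats`, which is the name the Rust export has to carry.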
### Panic across FFI boundary
A Rust panic that escapes into the JVM aborts the process with `SIGABRT`, so every JNI function wraps its body in `panic::catch_unwind()` and returns a safe default if the body panics:
| Function | Panic return |
|----------|--------------|
| `nativeInit` | 0 (null handle) |
| `nativeStartCall` | -1 (error) |
| `nativeGetStats` | `JObject::null()` |
| Others | void (silently swallowed) |
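
The pattern in the table can be sketched with a small generic guard. This is an illustration under assumptions, not the contents of `jni_bridge.rs`: `ffi_guard` and `native_start_call` are hypothetical, and a real export would be `extern "system"` and take `JNIEnv` and class arguments from the `jni` crate.

```rust
use std::panic::{self, UnwindSafe};

/// Runs `body`, returning `fallback` if it panics instead of letting the
/// panic unwind into the caller (which for a JNI export would be the JVM).
fn ffi_guard<T, F>(fallback: T, body: F) -> T
where
    F: FnOnce() -> T + UnwindSafe,
{
    panic::catch_unwind(body).unwrap_or(fallback)
}

/// Hypothetical stand-in for a JNI export: 0 on success, -1 on panic.
fn native_start_call(relay: &str) -> i32 {
    ffi_guard(-1, || {
        if relay.is_empty() {
            panic!("relay address missing");
        }
        0
    })
}
```

The default panic hook still prints the panic message before the fallback is returned, which can be useful when hunting for the cause in logs. Note that `catch_unwind` only helps when the crate is built with unwinding panics; with `panic = "abort"` the process terminates regardless.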
### Thread safety
All JNI methods must be called from the same thread (Android main thread). The `EngineHandle` is a raw pointer — concurrent access is undefined behavior.
## Stats JSON Format
The `nativeGetStats()` returns JSON matching this Rust struct:
```json
{
  "state": "Active",
  "duration_secs": 42.5,
  "quality_tier": 0,
  "loss_pct": 0.5,
  "rtt_ms": 45,
  "jitter_ms": 12,
  "jitter_buffer_depth": 3,
  "frames_encoded": 2125,
  "frames_decoded": 2100,
  "underruns": 5
}
```
Kotlin deserializes this via `CallStats.fromJson()` using `org.json.JSONObject` (Android built-in, no library needed).
## Diagnostic Checklist
When something doesn't work, check in this order:
1. **APK installed for correct ABI?** (`arm64-v8a` only)
2. **Manifest class names fully qualified?** (no leading-dot shorthand such as `.WzpApplication`)
3. **Relay running and reachable?** (`nc -zvu <host> <port>`)
4. **Microphone permission granted?**
5. **Stats polling working?** (check if frame counters increment)
6. **Logcat for native crashes?** (`adb logcat -s DEBUG:V`)
7. **Network connectivity?** (UDP port open, no firewall)

190
docs/android/maintenance.md Normal file

@@ -0,0 +1,190 @@
# Maintenance Guide
## Code Map — Where to Change Things
### Changing the relay address or room
Edit `CallViewModel.kt`:
```kotlin
companion object {
    const val DEFAULT_RELAY = "172.16.81.125:4433"
    const val DEFAULT_ROOM = "android"
}
```
For a proper settings screen, add a new Composable in `ui/` that persists to `SharedPreferences` and passes values to `viewModel.startCall(relay, room)`.
### Adding authentication
1. In `CallViewModel.startCall()`, pass a token parameter
2. In `engine.rs`, after QUIC connect but before CallOffer, send:
```rust
transport.send_signal(&SignalMessage::AuthToken { token: auth_token }).await?;
```
3. Wait for the relay to accept before proceeding to handshake
4. Start relay with `--auth-url <featherchat-endpoint>`
### Enabling media encryption
The crypto session is already derived in `engine.rs` but not applied to packets. To enable:
1. Pass `_session` (currently unused) to the send/recv tasks
2. Before `transport.send_media()`, encrypt the payload:
```rust
let mut ciphertext = Vec::new();
session.encrypt(&header_bytes, &payload, &mut ciphertext)?;
packet.payload = Bytes::from(ciphertext);
```
3. After `transport.recv_media()`, decrypt:
```rust
let mut plaintext = Vec::new();
session.decrypt(&header_bytes, &pkt.payload, &mut plaintext)?;
pkt.payload = Bytes::from(plaintext);
```
### Adding a new codec / quality profile
1. Define the profile in `wzp-proto/src/codec_id.rs`
2. Implement `AudioEncoder`/`AudioDecoder` traits in `wzp-codec`
3. Register in `AdaptiveEncoder`/`AdaptiveDecoder` switch logic
4. Add to `supported_profiles` in the CallOffer (engine.rs)
### Changing audio parameters
- **Sample rate**: Change `FRAME_SAMPLES` in `audio_android.rs` and `WzpOboeConfig.sample_rate` in `oboe_bridge.cpp`. Must match the codec's expected rate.
- **Frame duration**: Change `FRAME_SAMPLES` (960 = 20ms at 48kHz, 1920 = 40ms)
- **Ring buffer size**: Change `RING_CAPACITY` in `audio_android.rs`
- **AEC tail length**: Change the `100` in `Pipeline::new()` → `EchoCanceller::new(48000, 100)`
### Adding x86_64 support (emulator)
1. `build.gradle.kts`: add `"x86_64"` to `abiFilters`
2. `cargoNdkBuild` task: add `-t x86_64`
3. `build.rs`: handle `x86_64-linux-android` target for Oboe
4. Note: Oboe in the emulator uses a different audio HAL — audio quality will differ
## Dependency Overview
### Rust Crate Dependencies (wzp-android)
| Crate | Version | Purpose | Upgrade risk |
|-------|---------|---------|--------------|
| `jni` | 0.21 | Java FFI | Low — stable API |
| `tokio` | 1.x | Async runtime | Low |
| `quinn` | 0.11 | QUIC transport | Medium — breaking changes between 0.x |
| `rustls` | 0.23 | TLS for QUIC | Medium — tied to quinn version |
| `serde_json` | 1.x | Stats serialization | Low |
| `anyhow` | 1.x | Error handling | Low |
| `tracing` | 0.1 | Logging | Low |
| `rand` | 0.8 | Random seed generation | Low |
### Workspace Crate Dependencies
| Crate | Purpose | Key trait |
|-------|---------|-----------|
| `wzp-proto` | Shared types and traits | `MediaTransport`, `AudioEncoder`, `KeyExchange` |
| `wzp-codec` | Opus + Codec2 + signal processing | `AdaptiveEncoder`, `EchoCanceller` |
| `wzp-fec` | RaptorQ FEC | `RaptorQFecEncoder` |
| `wzp-crypto` | Key exchange + encryption | `WarzoneKeyExchange`, `ChaChaSession` |
| `wzp-transport` | QUIC connection management | `QuinnTransport`, `connect()` |
### Android/Kotlin Dependencies
| Library | Version | Purpose |
|---------|---------|---------|
| `compose-bom` | 2024.01.00 | Compose version alignment |
| `material3` | (from BOM) | UI components |
| `activity-compose` | 1.8.2 | Activity integration |
| `lifecycle-runtime-ktx` | 2.7.0 | ViewModel + coroutines |
| `core-ktx` | 1.12.0 | Kotlin extensions |
## Updating Dependencies
### Rust
```bash
cargo update -p wzp-android
cargo ndk -t arm64-v8a build --release -p wzp-android
```
Watch for `quinn`/`rustls` version coupling. They must be compatible:
- quinn 0.11 requires rustls 0.23
### Android/Kotlin
Update versions in `android/app/build.gradle.kts`. Key compatibility:
- `kotlinCompilerExtensionVersion` must match the Kotlin version
- `compose-bom` version determines all Compose library versions
- `compileSdk` and `targetSdk` should stay in sync
### NDK
If upgrading the NDK:
1. Update `ndkVersion` in `build.gradle.kts`
2. Update `ANDROID_NDK_HOME` environment variable
3. Update `CC_aarch64_linux_android` and friends
4. Verify Oboe still builds with the new toolchain
## Key Invariants to Preserve
1. **JNI function names must match package structure**: If the Kotlin package changes, all `Java_com_wzp_engine_WzpEngine_*` functions in `jni_bridge.rs` must be renamed.
2. **Manifest uses fully-qualified class names**: Never use `.ClassName` shorthand because the Gradle namespace (`com.wzp.phone`) differs from the Kotlin package (`com.wzp`).
3. **Stats JSON field names are snake_case**: serde emits the Rust field names unchanged, and those are snake_case. Kotlin's `CallStats.fromJson()` expects `duration_secs`, `loss_pct`, etc., so adding a `rename_all` attribute or renaming a field breaks parsing.
4. **Ring buffer ordering**: Producer uses Release store on write index, consumer uses Acquire load. Breaking this causes torn reads.
5. **Codec thread owns Pipeline**: Pipeline is `!Send` (Opus encoder state). It must never be accessed from another thread.
6. **panic::catch_unwind on all JNI functions**: A Rust panic unwinding across the FFI boundary is undefined behavior. Every JNI-exposed function must catch panics.
7. **Channel capacity (64)**: Both `send_tx` and `recv_tx` are bounded at 64 packets. If the network is slow, packets are dropped (`try_send` best-effort).
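
The best-effort behavior in invariant 7 can be demonstrated with a bounded standard-library channel. This is a stand-in sketch: the engine uses `tokio::mpsc`, whereas the code below uses `std::sync::mpsc::sync_channel` so it runs without extra crates, and `SendStats` / `send_best_effort` are hypothetical names.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};

const CHANNEL_CAPACITY: usize = 64;

/// Outcome counters for best-effort sends.
#[derive(Default, Debug, PartialEq)]
struct SendStats {
    sent: u64,
    dropped: u64,
}

/// Never blocks: a full or closed channel just drops the packet.
fn send_best_effort(tx: &SyncSender<Vec<u8>>, packet: Vec<u8>, stats: &mut SendStats) {
    match tx.try_send(packet) {
        Ok(()) => stats.sent += 1,
        Err(TrySendError::Full(_)) => stats.dropped += 1, // consumer is behind
        Err(TrySendError::Disconnected(_)) => stats.dropped += 1, // receiver gone
    }
}
```

Dropping on a full queue keeps the producing thread on schedule, at the cost of audio gaps. Counting drops, as done here, is one way to make that trade-off visible in stats.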
## Testing
### Unit Tests (Rust)
```bash
# Run all workspace tests (host, not Android)
cargo test
# Run only wzp-android tests (uses oboe_stub.cpp on host)
cargo test -p wzp-android
```
Note: Pipeline, codec, FEC, crypto tests run on the host. Audio tests use stubs.
### On-Device Testing
1. Build and install debug APK
2. Open app, tap CALL
3. Verify in logcat:
- `WzpEngine created via JNI`
- `connecting to relay...`
- `QUIC connected to relay`
- `CallOffer sent`
- `handshake complete, call active`
- `codec thread started`
4. Check stats overlay: frame counters should increment
5. Speak into mic — other connected device should hear audio
### Stress Testing
- Run a call for 30+ minutes — check for memory leaks (stats should be stable)
- Kill and restart the relay — client should eventually get a connection error
- Toggle mute rapidly — verify no crashes
- Switch speaker on/off — verify audio route changes
## Performance Monitoring
Key metrics to watch during a call:
| Metric | Healthy Range | Warning | Critical |
|--------|--------------|---------|----------|
| frames_encoded | Increasing ~50/sec | Stalled | 0 |
| frames_decoded | Increasing ~50/sec | Stalled | 0 |
| underruns | < 5/min | > 20/min | > 100/min |
| jitter_buffer_depth | 2-5 | 0 or >10 | N/A |
| loss_pct | < 5% | 5-20% | > 20% |
| rtt_ms | < 100ms | 100-300ms | > 500ms |
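The thresholds in the table can be turned into a simple health check. The sketch below is one possible reading of the `loss_pct` and frame-counter rows only; the `Health` enum and both function names are hypothetical and not part of the engine. It assumes a caller samples the stats periodically and keeps the previous counter value.

```rust
/// Hypothetical health levels matching the table columns.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Warning,
    Critical,
}

/// loss_pct row: < 5% healthy, 5-20% warning, > 20% critical.
pub fn classify_loss(loss_pct: f32) -> Health {
    if loss_pct > 20.0 {
        Health::Critical
    } else if loss_pct >= 5.0 {
        Health::Warning
    } else {
        Health::Healthy
    }
}

/// frames_encoded / frames_decoded rows: a counter still at 0 is critical,
/// a counter that did not advance since the previous sample is stalled
/// (warning), and an advancing counter is healthy.
pub fn classify_frames(previous: u64, current: u64) -> Health {
    if current == 0 {
        Health::Critical
    } else if current == previous {
        Health::Warning
    } else {
        Health::Healthy
    }
}
```

For example, `classify_frames(100, 150)` reports `Healthy` and `classify_loss(12.5)` reports `Warning`. Note that the roadmap lists loss, RTT, and jitter as always 0 until network metrics are fed into the stats, so a loss check like this only becomes meaningful once that gap is closed.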

112
docs/android/roadmap.md Normal file

@@ -0,0 +1,112 @@
# Roadmap & Known Gaps
## Current State Summary
The Android client can connect to a WZP relay, complete the crypto handshake, and exchange audio in real-time. Two phones on the same network can talk to each other through the relay.
## What Works (April 2025)
- QUIC transport to relay with room-based SFU
- Full crypto handshake (X25519 ephemeral + Ed25519 signatures)
- Opus 24kbps encoding/decoding at 48kHz
- Lock-free audio I/O via Oboe (capture + playout)
- AEC (acoustic echo cancellation) with 100ms tail
- AGC (automatic gain control)
- RaptorQ FEC encoder/decoder (wired to pipeline)
- Adaptive jitter buffer (10-250 packets)
- UI with connect/disconnect, mute, speaker, live stats
- Random identity seed per app launch
## Known Gaps
### P0 — Must fix for usable calls
| Gap | Impact | Where to fix |
|-----|--------|--------------|
| **Media encryption not applied** | Audio payloads carry no application-layer encryption; only QUIC's transport encryption protects them | `engine.rs`: pass `_session` to send/recv, encrypt/decrypt payloads |
| **FEC repair symbols not sent** | No loss recovery — audio gaps on packet loss | `engine.rs` send task — call `fec_encoder.generate_repair()` and send repair packets |
| **Quality reports not sent** | Relay can't monitor quality, no adaptive switching | `engine.rs` — periodically attach `QualityReport` to MediaPacket header |
| **CallService not started** | Call dies when app is backgrounded | `CallViewModel.startCall()` — call `CallService.start(context)` |
### P1 — Important for production
| Gap | Impact | Where to fix |
|-----|--------|--------------|
| **Hardcoded relay address** | Can't change server without rebuild | Add settings screen with `SharedPreferences` |
| **No reconnection logic** | Connection drop = call over | `engine.rs` network task — detect disconnect, retry with backoff |
| **No adaptive quality switching** | Stays on GOOD profile even in bad conditions | Wire `AdaptiveQualityController` to network path quality from `QuinnTransport` |
| **Identity seed not persisted** | New identity every launch | Save seed to Android Keystore or SharedPreferences |
| **No Bluetooth audio routing** | `AudioRouteManager` exists but not wired to UI | Add Bluetooth button to InCallScreen, call `AudioRouteManager` methods |
| **No ringtone/notification for incoming** | Only outgoing calls supported | Need signaling for call setup (currently both sides initiate independently) |
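For the "retry with backoff" item in the table above, a capped exponential backoff is one common shape. The sketch below is synchronous and uses only the standard library; the function names, the 500 ms base, and the 30 s cap are assumptions for illustration, and the real network task is likely async and would use its runtime's sleep instead.

```rust
use std::time::Duration;

// Assumed values for illustration; not taken from the engine.
const BASE_DELAY: Duration = Duration::from_millis(500);
const MAX_DELAY: Duration = Duration::from_secs(30);

/// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at `max`.
pub fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    // checked_shl returns None once the shift reaches 32 bits; saturate instead.
    let factor = 1u32.checked_shl(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Calls `connect` until it succeeds or `max_attempts` attempts have failed.
/// `sleep` is injected so callers (and tests) decide how to wait.
/// `connect` is always called at least once.
pub fn retry_with_backoff<T, E>(
    max_attempts: u32,
    mut connect: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut attempt = 0u32;
    loop {
        match connect() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(err) if attempt + 1 >= max_attempts => return Err(err),
            Err(_) => {
                sleep(backoff_delay(attempt, BASE_DELAY, MAX_DELAY));
                attempt += 1;
            }
        }
    }
}
```

A caller could pass `std::thread::sleep` as the `sleep` argument on a dedicated thread. Adding random jitter to each delay is a common refinement when many clients may reconnect to the same relay at once, but it is left out here to keep the sketch deterministic.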
### P2 — Nice to have
| Gap | Impact | Where to fix |
|-----|--------|--------------|
| **No android_logger** | Rust tracing output lost on Android | Add `android_logger` crate, init in `nativeInit()` |
| **Stats don't include network metrics** | Loss/RTT/jitter always 0 | Feed `QuinnTransport.path_quality()` back to stats |
| **No ProGuard/R8 minification** | Release APK larger than necessary | Enable `isMinifyEnabled = true` in build.gradle.kts |
| **Single ABI (arm64-v8a)** | No support for older 32-bit devices or emulators | Add `armeabi-v7a` and `x86_64` to cargo-ndk build |
| **No call history** | Can't see past calls | Add Room database for call log |
| **No contact integration** | Manual relay/room entry | Add contacts with fingerprint-based identity |
## Architecture Evolution Plan
### Phase 1: Make Calls Reliable (current → next)
```
[x] QUIC connection to relay
[x] Crypto handshake
[x] Audio encode/decode pipeline
[ ] Media encryption (ChaCha20-Poly1305)
[ ] FEC repair packet transmission
[ ] Foreground service for background calls
[ ] Reconnection on network change
```
### Phase 2: Quality & Polish
```
[ ] Adaptive quality (GOOD → DEGRADED → CATASTROPHIC switching)
[ ] Quality reports in MediaPacket headers
[ ] Network path quality display (real RTT, loss, jitter)
[ ] Settings screen (relay, room, seed persistence)
[ ] Bluetooth/wired headset audio routing
[ ] Rust android_logger for debugging
```
### Phase 3: Production Features
```
[ ] featherChat authentication
[ ] Persistent identity (Android Keystore)
[ ] Push notifications for incoming calls
[ ] Multi-party rooms (already supported by relay)
[ ] Call transfer
[ ] End-to-end encryption (bypass relay decryption)
```
## Dependency Upgrade Path
### quinn 0.11 → 0.12 (when released)
Quinn 0.12 will likely require rustls 0.24. Update both together:
1. `Cargo.toml`: bump quinn and rustls versions
2. Check `client_config()` and `server_config()` in wzp-transport for API changes
3. DATAGRAM API may change — check `send_datagram()` / `read_datagram()`
### Compose BOM 2024.01 → 2025.x
Material3 1.2+ deprecates the `LinearProgressIndicator` overload that takes `progress: Float` in favor of one that takes `progress: () -> Float`. If upgrading the BOM:
```kotlin
// Old (current):
LinearProgressIndicator(progress = level, ...)
// New (Material3 1.2+):
LinearProgressIndicator(progress = { level }, ...)
```
### Kotlin 1.9 → 2.x
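Starting with Kotlin 2.0, the Compose compiler ships as a Kotlin Gradle plugin (`org.jetbrains.kotlin.plugin.compose`) that is versioned together with Kotlin. When upgrading, apply that plugin, remove `kotlinCompilerExtensionVersion` from `composeOptions`, and bump the Kotlin Gradle plugin version in the same change.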