manawenuz/wz-phone

Siavash Sameni d9b2e0fd53

docs: comprehensive documentation — design, architecture, admin, user guide

4 files, 2,511 lines covering the entire WarzonePhone project:

DESIGN.md (591 lines): system overview, codec system (9 variants),
FEC (RaptorQ), transport (QUIC/quinn), security (Ed25519/X25519/
ChaCha20/HKDF/BIP39/TOFU), federation (global rooms), jitter buffer.
Mermaid diagrams for audio pipelines and crate dependencies.

ARCHITECTURE.md (874 lines): 15 mermaid diagrams — system overview,
encode/decode pipelines, relay SFU, federation topology/protocol,
signal handshake, client architectures (desktop/android/CLI), wire
format tables (MediaHeader/MiniHeader/QualityReport), project tree.

ADMINISTRATION.md (587 lines): relay deployment (binary/Docker/systemd),
complete TOML config reference, CLI flags table, federation setup
(peers/trusted/global_rooms), 3 example configs, Prometheus metrics,
auth, identity persistence, 12-item troubleshooting guide.

USER_GUIDE.md (459 lines): all clients — desktop (settings, quality
slider, key warning, shortcuts), Android (8-level quality slider,
server management, identity backup), CLI (flags table, 8 usage
patterns). Identity system, quality profiles when-to-use guide.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-08 08:21:13 +04:00

WarzonePhone User Guide

This guide covers all WarzonePhone client applications: Desktop (Tauri), Android, CLI, and Web.

Desktop Client (Tauri)

The desktop client is a Tauri application with a native Rust audio engine and a web-based UI. It runs on macOS, Windows, and Linux.

Connect Screen

When you launch the desktop client, you see the connect screen with:

Relay selector -- click the relay button to open the Manage Relays dialog. The button shows the relay name, address, connection status (verified/new/changed/offline), and round-trip time (RTT)
Room -- enter a room name. Clients in the same room hear each other. Room names are hashed before being sent to the relay for privacy
Alias -- your display name shown to other participants
OS Echo Cancel -- checkbox to enable macOS VoiceProcessingIO (Apple's FaceTime-grade AEC). Strongly recommended when using speakers
Connect button -- connects to the selected relay and joins the room
Identity info -- your identicon and fingerprint are shown at the bottom. Click to copy

Recent rooms are displayed below the form for quick reconnection. Click any recent room to select it and its associated relay.

In-Call Screen

Once connected, the in-call screen shows:

Room name and call timer at the top
Status indicator -- green when connected, yellow when reconnecting
Audio level meter -- real-time visualization of outgoing audio
Participant list -- identicon, alias, and fingerprint for each participant. Your own entry is highlighted with a badge
Controls -- Mic toggle, Hang Up, Speaker toggle
Stats bar -- TX and RX frame rates

Settings Panel

Open with the gear icon or Cmd+, (Ctrl+, on Windows/Linux). Contains:

Connection

Default Room -- room name used on next connect
Alias -- display name

Audio

Quality slider -- 5 levels:

Position	Profile	Description
0	Auto	Adaptive quality based on network conditions
1	Opus 24k	Good conditions (28.8 kbps with FEC)
2	Opus 6k	Degraded conditions (9.0 kbps with FEC)
3	Codec2 3.2k	Poor conditions (4.8 kbps with FEC)
4	Codec2 1.2k	Catastrophic conditions (2.4 kbps with FEC)

OS Echo Cancellation -- macOS VoiceProcessingIO toggle
Automatic Gain Control -- normalize mic volume

Identity

Fingerprint -- your public identity fingerprint
Identity file -- stored at ~/.wzp/identity

Recent Rooms

History of recently joined rooms with relay association
Clear History button

Manage Relays Dialog

Open by clicking the relay selector button on the connect screen:

Relay list -- each entry shows name, address, identicon (from server fingerprint), lock status, and RTT
Select -- click a relay to make it the default
Remove -- click the X button to delete a relay
Add Relay -- enter name and host:port to add a new relay
Ping -- relays are automatically pinged when the dialog opens. RTT and server fingerprint are updated

Key Change Warning Dialog

If a relay's TLS fingerprint has changed since your last connection, a warning dialog appears:

Shows the previously known fingerprint and the new fingerprint
Accept New Key -- trust the new fingerprint and proceed
Cancel -- abort the connection

This is the TOFU (Trust on First Use) model. Fingerprint changes typically mean the relay was restarted with a new identity. However, they could also indicate a man-in-the-middle attack.

Keyboard Shortcuts

Shortcut	Action	Context
m	Toggle microphone	In-call
s	Toggle speaker	In-call
q	Hang up	In-call
Cmd+, (Ctrl+,)	Open/close settings	Any
Escape	Close dialog/settings	Any
Enter	Connect	Connect screen (when room/alias field is focused)

Audio Engine

The desktop audio engine uses:

CPAL for audio I/O (CoreAudio on macOS, WASAPI on Windows, ALSA on Linux)
VoiceProcessingIO on macOS for OS-level echo cancellation (opt-in via checkbox)
Lock-free SPSC ring buffers between audio threads and network threads
Direct playout -- no jitter buffer on the client (the relay buffers instead)
Audio callbacks deliver 512 f32 samples at 48 kHz on macOS (accumulated to 960-sample frames for codec)
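
The accumulation step in the last bullet can be sketched as follows. This is a minimal illustration in Python, not the engine's Rust code: the class name `FrameAccumulator` and its `push` method are hypothetical, and a plain list stands in for the lock-free ring buffers the real engine uses.

```python
FRAME_SAMPLES = 960  # one codec frame, as stated in the bullet above


class FrameAccumulator:
    """Collects variable-size callback buffers into fixed-size codec frames.

    Hypothetical helper for illustration only.
    """

    def __init__(self, frame_samples=FRAME_SAMPLES):
        self.frame_samples = frame_samples
        self.pending = []  # samples not yet emitted as a full frame

    def push(self, samples):
        """Append one callback's samples and return every complete frame."""
        self.pending.extend(samples)
        frames = []
        while len(self.pending) >= self.frame_samples:
            frames.append(self.pending[:self.frame_samples])
            del self.pending[:self.frame_samples]
        return frames
```

With 512-sample callbacks, the first call yields no frame, the second yields one frame with 64 samples left over, and 15 callbacks (7,680 samples) yield exactly 8 frames.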

Audio Quality Notes

Always use Release builds for real-time audio. Debug builds are too slow for wzp-codec, nnnoiseless, audiopus, and raptorq
VoiceProcessingIO is strongly recommended on macOS. Software AEC does not work well with the round-trip latency (~35-45ms)
The quality slider only affects the encode side. Decoding always accepts all codecs

Auto-Reconnect

If the connection drops, the client automatically attempts to reconnect with exponential backoff (1s, 2s, 4s, 8s, capped at 10s). After 5 failed attempts, the client returns to the connect screen. The status dot shows yellow during reconnection.
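
The retry schedule described above can be written out as a short calculation. This is a sketch under assumptions: the function name `backoff_delays` is hypothetical, and it assumes the fifth delay is the 10-second cap (doubling 8s would give 16s, which exceeds the cap).

```python
def backoff_delays(max_attempts=5, base_s=1.0, cap_s=10.0):
    """Return the wait before each reconnect attempt, in seconds.

    Doubles the delay on every attempt and clamps it to cap_s.
    """
    return [min(base_s * 2 ** attempt, cap_s) for attempt in range(max_attempts)]


DELAYS = backoff_delays()  # [1.0, 2.0, 4.0, 8.0, 10.0]
TOTAL_WAIT_S = sum(DELAYS)  # 25.0
```

Under these assumptions the client would spend about 25 seconds waiting, plus connection timeouts, before returning to the connect screen.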

Android Client

The Android client is built with Kotlin and Jetpack Compose, using JNI to call the Rust audio engine.

Call Screen

The main call screen shows:

Server selector -- tap to choose from configured servers
Room name -- enter the room to join
Connect/Disconnect button
Participant list with identicons and aliases
Audio level visualization
Mute/Unmute button

Settings Screen

The settings screen is organized into sections:

Identity

Display Name -- your alias shown to other participants
Fingerprint -- displayed with an identicon. Tap to copy
Copy Key -- copy the 64-character hex seed to clipboard for backup
Restore Key -- paste a previously backed-up hex seed to restore your identity

Audio Defaults

Voice Volume -- playout gain slider (-20 dB to +20 dB)
Mic Gain -- capture gain slider (-20 dB to +20 dB)
Echo Cancellation (AEC) -- toggle Android's built-in AEC. Disable if audio sounds distorted

Quality slider -- 8 levels from best to lowest:

Position	Profile	Bitrate	Color
0	Studio 64k	70.4 kbps	Green
1	Studio 48k	52.8 kbps	Green
2	Studio 32k	35.2 kbps	Green
3	Auto	Adaptive	Yellow-green
4	Opus 24k	28.8 kbps	Yellow-green
5	Opus 6k	9.0 kbps	Yellow
6	Codec2 3.2k	4.8 kbps	Orange
7	Codec2 1.2k	2.4 kbps	Red

Note: The decoder always accepts all codecs; the quality setting only affects encoding.

Servers

Server chips -- tap to select, X to remove (built-in servers cannot be removed)
Add Server -- enter host, port (default 4433), and optional label
Force Ping -- servers are pinged on dialog open to measure RTT

Network

Prefer IPv6 -- toggle to prefer IPv6 connections when available

Room

Default Room -- the room name pre-filled on the call screen

Identity Backup and Restore

Your identity is a 32-byte seed stored as a 64-character hex string. To back up:

Go to Settings > Identity
Tap Copy Key
Store the hex string securely

To restore on a new device:

Go to Settings > Identity
Tap Restore Key
Paste the 64-character hex string
Tap Restore (key is staged)
Tap Save to apply

The same seed produces the same fingerprint on any device or platform.
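
Before pasting a backed-up key, it can help to check that it is well formed. The sketch below validates the 64-character hex format described above; `parse_seed_hex` is a hypothetical helper, not part of any WarzonePhone client, and it does not derive keys or fingerprints.

```python
HEX_DIGITS = set("0123456789abcdef")


def parse_seed_hex(text):
    """Return the 32-byte seed encoded by a 64-character hex string.

    Surrounding whitespace is ignored and upper-case digits are accepted.
    Raises ValueError if the string is not exactly 64 hex characters.
    """
    cleaned = text.strip().lower()
    if len(cleaned) != 64:
        raise ValueError("seed must be exactly 64 hex characters")
    if any(ch not in HEX_DIGITS for ch in cleaned):
        raise ValueError("seed contains non-hex characters")
    return bytes.fromhex(cleaned)
```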

CLI Client (wzp-client)

The CLI client is a command-line tool for testing, recording, and live audio.

Usage

wzp-client [options] [relay-addr]

Default relay address: 127.0.0.1:4433

Flags Reference

Flag	Description
`--live`	Live mic/speaker mode. Requires `--features audio` at build time
`--send-tone <secs>`	Send a 440 Hz test tone for N seconds
`--send-file <file>`	Send a raw PCM file (48 kHz mono s16le)
`--record <file.raw>`	Record received audio to raw PCM file
`--echo-test <secs>`	Run automated echo quality test for N seconds. Produces a windowed analysis with loss%, SNR, correlation
`--drift-test <secs>`	Run automated clock-drift measurement for N seconds
`--sweep`	Run jitter buffer parameter sweep (local, no network). Tests different buffer configurations
`--seed <hex>`	Identity seed as 64 hex characters. Compatible with featherChat
`--mnemonic <words...>`	Identity seed as BIP39 mnemonic (24 words). All remaining non-flag words are consumed
`--room <name>`	Room name. Hashed before sending for privacy
`--token <token>`	featherChat bearer token for relay authentication
`--metrics-file <path>`	Write JSONL telemetry to file (1 line/sec)
`--help`, `-h`	Print help and exit

Common Usage Patterns

Connectivity Test (Silence)

# Send 250 silence frames (5 seconds) and exit
wzp-client 127.0.0.1:4433

Live Audio Call

# Terminal 1
wzp-relay

# Terminal 2: Alice
wzp-client --live --room myroom 127.0.0.1:4433

# Terminal 3: Bob
wzp-client --live --room myroom 127.0.0.1:4433

Both capture from mic and play received audio. Press Ctrl+C to stop.

Send Test Tone and Record

# Terminal 1
wzp-relay

# Terminal 2: Send 10 seconds of 440 Hz tone
wzp-client --send-tone 10 127.0.0.1:4433

# Terminal 3: Record what is received
wzp-client --record call.raw 127.0.0.1:4433

Play the recording:

ffplay -f s16le -ar 48000 -ac 1 call.raw

Send Audio File

# Convert to raw PCM first
ffmpeg -i song.mp3 -f s16le -ar 48000 -ac 1 song.raw

# Send through relay
wzp-client --send-file song.raw 127.0.0.1:4433

Echo Quality Test

wzp-relay &
wzp-client --echo-test 30 127.0.0.1:4433

Produces a windowed analysis showing loss percentage, SNR, correlation, and quality degradation trends.

Clock Drift Test

wzp-relay &
wzp-client --drift-test 60 127.0.0.1:4433

Measures clock drift between the send and receive paths over the specified duration.

Jitter Buffer Sweep

# Runs locally, no network needed
wzp-client --sweep

Tests different jitter buffer configurations and prints results.

With Identity and Auth

# Using hex seed
wzp-client --seed 0123456789abcdef...64chars --room secure-room --token my-bearer-token relay.example.com:4433

# Using BIP39 mnemonic
wzp-client --mnemonic abandon abandon abandon ... zoo --room secure-room relay.example.com:4433

With JSONL Telemetry

wzp-client --live --metrics-file /tmp/call.jsonl relay.example.com:4433

Writes one JSON object per second:

{
  "ts": "2026-04-07T12:00:00Z",
  "buffer_depth": 45,
  "underruns": 0,
  "overruns": 0,
  "loss_pct": 1.2,
  "rtt_ms": 34,
  "jitter_ms": 8,
  "frames_sent": 50,
  "frames_received": 49,
  "quality_profile": "GOOD"
}
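
A telemetry file like this can be summarized with a few lines of Python. The sketch assumes each record occupies a single line in the file (the example above is pretty-printed for readability) and uses only the `loss_pct`, `rtt_ms`, and `jitter_ms` fields shown; the function name `summarize_telemetry` is hypothetical.

```python
import json


def summarize_telemetry(lines):
    """Summarize an iterable of JSONL lines (one JSON object per line).

    Returns None when there are no records.
    """
    records = [json.loads(line) for line in lines if line.strip()]
    if not records:
        return None
    count = len(records)
    return {
        "seconds": count,  # one record per second, per the guide
        "mean_loss_pct": sum(r["loss_pct"] for r in records) / count,
        "mean_rtt_ms": sum(r["rtt_ms"] for r in records) / count,
        "max_jitter_ms": max(r["jitter_ms"] for r in records),
    }
```

To use it on a recorded call, open the file and pass the file object directly, for example `summarize_telemetry(open("/tmp/call.jsonl"))`.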

Audio File Format

All raw PCM files use:

Property	Value
Sample rate	48 kHz
Channels	1 (mono)
Sample format	signed 16-bit little-endian (s16le)

Conversion commands:

# WAV to raw PCM
ffmpeg -i input.wav -f s16le -ar 48000 -ac 1 output.raw

# MP3 to raw PCM
ffmpeg -i input.mp3 -f s16le -ar 48000 -ac 1 output.raw

# Raw PCM to WAV
ffmpeg -f s16le -ar 48000 -ac 1 -i input.raw output.wav

# Play raw PCM
ffplay -f s16le -ar 48000 -ac 1 file.raw
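
If ffmpeg is not available, a test file in this format can be generated directly. The following is a minimal sketch using only the Python standard library; the function names, the 0.5 amplitude, and the output file name are illustrative choices, not project defaults.

```python
import math
import struct

SAMPLE_RATE = 48000  # Hz, mono, as in the table above


def tone_s16le(freq_hz=440.0, seconds=1.0, amplitude=0.5):
    """Return a sine tone as raw signed 16-bit little-endian mono PCM bytes."""
    count = int(round(SAMPLE_RATE * seconds))
    samples = [
        int(amplitude * 32767 * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE))
        for i in range(count)
    ]
    # "<" selects little-endian, "h" is a signed 16-bit integer.
    return struct.pack("<%dh" % count, *samples)


def write_tone(path, **kwargs):
    """Write the tone to a file that --send-file or ffplay can read."""
    with open(path, "wb") as handle:
        handle.write(tone_s16le(**kwargs))
```

Calling `write_tone("tone.raw", seconds=5.0)` produces a 480,000-byte file (48,000 samples per second, 2 bytes per sample, 5 seconds) that can be played with the ffplay command above.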

Web Client (Browser)

The web client runs in a browser via the wzp-web bridge server.

Setup

# Start relay
wzp-relay

# Start web bridge
wzp-web --port 8080 --relay 127.0.0.1:4433

# For remote access (requires TLS for mic)
wzp-web --port 8443 --relay 127.0.0.1:4433 --tls

Open http://localhost:8080/room-name (or https://... with TLS).

Features

Open mic (default) and push-to-talk modes
PTT via on-screen button, mouse hold, or spacebar
Audio level meter
Auto-reconnection on disconnect

Audio Processing

The web client uses AudioWorklet (preferred) with a ScriptProcessorNode fallback:

Capture: Accumulates Float32 samples into 960-sample (20ms) Int16 frames
Playback: Ring buffer capped at 200ms (9600 samples at 48 kHz)
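
The numbers above can be illustrated with a short sketch. It is written in Python for consistency with the other examples in this guide, not in the browser's JavaScript, and it makes two assumptions that the guide does not state: samples are scaled by 32767 after clamping to [-1.0, 1.0], and a full playback buffer discards its oldest samples.

```python
from collections import deque

SAMPLE_RATE = 48000
FRAME_SAMPLES = 960           # 20 ms at 48 kHz
MAX_PLAYBACK_SAMPLES = 9600   # 200 ms at 48 kHz


def float_to_int16(samples):
    """Convert float samples in [-1.0, 1.0] to 16-bit integers, clamping overshoot."""
    out = []
    for value in samples:
        clamped = max(-1.0, min(1.0, value))
        out.append(int(round(clamped * 32767)))  # assumed scale factor
    return out


def make_playback_buffer():
    """Bounded buffer; deque(maxlen=...) silently drops the oldest samples when full."""
    return deque(maxlen=MAX_PLAYBACK_SAMPLES)


FRAME_MS = FRAME_SAMPLES * 1000 / SAMPLE_RATE          # 20.0
BUFFER_MS = MAX_PLAYBACK_SAMPLES * 1000 / SAMPLE_RATE  # 200.0
```

Capping the playback buffer bounds the added latency: at most ten 20 ms frames can be queued before older audio is discarded.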

Identity System

Overview

Your identity is a 32-byte cryptographic seed that derives:

Ed25519 signing key -- authenticates handshake messages
X25519 key agreement key -- derives shared session encryption keys
Fingerprint -- SHA-256 of the public key, truncated to 16 bytes, displayed as xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx
Identicon -- deterministic visual avatar generated from the fingerprint
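
The fingerprint format in the list above can be reproduced with the standard library. This sketch covers only the hashing and display step; it takes the public key bytes as given, does not derive them from the seed, and does not assume which of the two public keys is hashed. The function name `format_fingerprint` is hypothetical.

```python
import hashlib


def format_fingerprint(public_key):
    """SHA-256 the key, keep the first 16 bytes, show as 8 colon-separated groups."""
    digest = hashlib.sha256(public_key).digest()[:16]
    hex_text = digest.hex()  # 32 hex characters
    return ":".join(hex_text[i:i + 4] for i in range(0, len(hex_text), 4))
```

Sixteen bytes encode as 32 hex characters, which split into the eight 4-character groups shown in the `xxxx:xxxx:...` pattern.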

Seed Sources

Source	Description
Auto-generated	Created on first run, stored in `~/.wzp/identity` (desktop/CLI) or app storage (Android)
`--seed <hex>`	64-character hex string (CLI)
`--mnemonic <words>`	24-word BIP39 mnemonic (CLI)
Copy Key / Restore Key	Hex backup/restore (Android settings)

BIP39 Mnemonic Backup

The 32-byte seed can be represented as a 24-word BIP39 mnemonic for human-readable backup. The same mnemonic produces the same identity on any platform or device.
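
The 24-word length follows from the BIP39 encoding rules: the entropy is extended with a checksum of one bit per 32 bits of entropy, and each word encodes 11 bits. A short calculation (the function name is illustrative):

```python
def bip39_word_count(entropy_bytes):
    """Number of mnemonic words BIP39 produces for the given entropy size."""
    entropy_bits = entropy_bytes * 8
    checksum_bits = entropy_bits // 32  # one checksum bit per 32 entropy bits
    return (entropy_bits + checksum_bits) // 11  # each word carries 11 bits


WORDS_FOR_SEED = bip39_word_count(32)  # 256 + 8 = 264 bits -> 24 words
```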

featherChat Compatibility

The identity derivation uses the same HKDF scheme as featherChat (Warzone messenger). The same seed produces the same fingerprint in both systems, allowing a unified identity across messaging and calling.

Trust on First Use (TOFU)

Clients remember the fingerprints of relays and peers they connect to. On subsequent connections, if a fingerprint changes, the client warns the user. This detects a man-in-the-middle attack that begins after the first connection, but the first connection itself is trusted blindly unless you verify the fingerprint through another channel.
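
The TOFU check can be sketched as a small lookup table. This is an illustration only: the class `TofuStore` is hypothetical, it keeps fingerprints in memory rather than on disk, and it assumes a fingerprint is pinned automatically on first contact. The status strings mirror the verified/new/changed states shown in the desktop relay selector.

```python
class TofuStore:
    """Remembers the first fingerprint seen for each address."""

    def __init__(self):
        self.known = {}  # address -> pinned fingerprint

    def check(self, address, fingerprint):
        """Classify a connection as 'new', 'verified', or 'changed'."""
        pinned = self.known.get(address)
        if pinned is None:
            self.known[address] = fingerprint  # trust on first use
            return "new"
        if pinned == fingerprint:
            return "verified"
        return "changed"  # caller should warn the user before proceeding

    def accept(self, address, fingerprint):
        """Replace the pinned fingerprint after the user accepts the new key."""
        self.known[address] = fingerprint
```

A "changed" result does not update the stored fingerprint by itself; only an explicit `accept` does, which matches the Accept New Key / Cancel choice in the warning dialog.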

Quality Profiles Explained

When to Use Each Profile

Profile	Total Bandwidth	Best For	Trade-offs
Studio 64k	70.4 kbps	LAN calls, music, podcasting	Highest quality, needs good network
Studio 48k	52.8 kbps	Good WiFi, wired connections	Near-studio quality
Studio 32k	35.2 kbps	Reliable WiFi, LTE	Very good quality with lower bandwidth
Auto	Adaptive	Most users	Automatically switches based on network conditions
Opus 24k	28.8 kbps	General use, moderate networks	Good speech quality, reasonable bandwidth
Opus 6k	9.0 kbps	3G networks, congested WiFi	Intelligible speech, some artifacts
Codec2 3.2k	4.8 kbps	Poor connections	Robotic but intelligible, narrowband
Codec2 1.2k	2.4 kbps	Satellite links, extreme loss	Minimal intelligibility, last resort
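
The gap between a profile's name and its total bandwidth can be turned into an overhead percentage. The calculation below assumes that the number in each profile name is the codec bitrate in kbps and that the remainder of the total is overhead such as FEC; the guide does not break the totals down, so treat the result as a rough reading of the table.

```python
# (codec kbps assumed from the profile name, total kbps from the table)
PROFILES = {
    "Studio 64k": (64.0, 70.4),
    "Studio 48k": (48.0, 52.8),
    "Studio 32k": (32.0, 35.2),
    "Opus 24k": (24.0, 28.8),
    "Opus 6k": (6.0, 9.0),
    "Codec2 3.2k": (3.2, 4.8),
    "Codec2 1.2k": (1.2, 2.4),
}


def overhead_pct(codec_kbps, total_kbps):
    """Extra bandwidth on top of the codec bitrate, as a percentage."""
    return round((total_kbps / codec_kbps - 1.0) * 100.0, 1)


OVERHEAD = {name: overhead_pct(codec, total) for name, (codec, total) in PROFILES.items()}
```

Under these assumptions the Studio profiles carry 10% overhead, Opus 24k carries 20%, Opus 6k and Codec2 3.2k carry 50%, and Codec2 1.2k carries 100%. The lower the bitrate, the larger the share of bandwidth spent on protection.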

Auto Mode

Auto mode starts at the Good (Opus 24k) profile and adapts based on observed network quality:

Downgrade -- 3 consecutive bad quality reports (2 on cellular) trigger a step down
Upgrade -- 10 consecutive good quality reports trigger a step up (one tier at a time)
Network handoff -- switching from WiFi to cellular triggers a preemptive one-tier downgrade plus a 10-second FEC boost

Auto mode uses three tiers (Good, Degraded, Catastrophic). It does not use the Studio profiles, which must be selected manually.
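
The downgrade and upgrade rules can be modeled as a small state machine. This is a sketch, not the client's implementation: the class `AutoQuality` is hypothetical, it assumes the streak counter resets after each tier change, it treats each quality report as simply good or bad, and it leaves out the network-handoff rule.

```python
TIERS = ["Good", "Degraded", "Catastrophic"]


class AutoQuality:
    """Tracks consecutive quality reports and steps one tier at a time."""

    def __init__(self, cellular=False):
        self.tier = 0  # Auto mode starts at Good
        self.good_streak = 0
        self.bad_streak = 0
        self.down_threshold = 2 if cellular else 3
        self.up_threshold = 10

    def on_report(self, good):
        """Feed one quality report; return the tier name now in effect."""
        if good:
            self.good_streak += 1
            self.bad_streak = 0
            if self.good_streak >= self.up_threshold and self.tier > 0:
                self.tier -= 1
                self.good_streak = 0  # assumed: streak restarts after a step
        else:
            self.bad_streak += 1
            self.good_streak = 0
            if self.bad_streak >= self.down_threshold and self.tier < len(TIERS) - 1:
                self.tier += 1
                self.bad_streak = 0  # assumed: streak restarts after a step
        return TIERS[self.tier]
```

The asymmetry between the thresholds means the encoder drops quality quickly when conditions worsen and recovers slowly, which avoids oscillating between tiers on a marginal link.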

Manual Override

When you select a specific profile (not Auto), adaptive switching is disabled. The encoder stays at the selected profile regardless of network conditions. This is useful when you know your network quality and want consistent encoding, or when you want to force a specific bitrate.

Note: The decoder always accepts all codecs. A manual quality selection only affects what you send, not what you receive.
