The relay now supports loading configuration from a TOML file via
--config <path>. CLI flags override TOML values. All fields have
serde defaults so a minimal config only needs what you want to change.
Example relay.toml:
listen_addr = "0.0.0.0:4433"
[[peers]]
url = "193.180.213.68:4433"
fingerprint = "1a:39:38:..."
label = "Pangolin EU"
Federation hint on startup now shows TOML format with TLS fingerprint
(not Ed25519 identity fingerprint), since TLS fingerprint is what
peers actually verify. Configured peers are logged on startup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The relay's TLS certificate is now derived from the persisted
Ed25519 seed via HKDF, so the same seed produces the same cert
and the same TLS fingerprint across restarts. This fixes the
"Server Key Changed" warnings on every relay restart.
Implementation: HKDF-SHA256(seed, "wzp-tls-ed25519") → Ed25519
signing key → PKCS8 DER → rcgen KeyPair → self-signed cert.
Also adds tls_fingerprint() helper (SHA-256 of DER cert, hex with
colons) and prints it on startup. This is the prerequisite for
relay federation (peers verify each other by TLS fingerprint).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On startup, the relay detects its outbound IP (via UDP socket trick)
and prints a ready-to-copy YAML snippet for other relays to federate:
federation: to peer with this relay, add to peers config:
- url: "193.180.213.68:4433"
fingerprint: "a5d6:e3c6:..."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Auto codec: new "Auto" position on quality slider (JNI index 7).
When selected, the engine uses the relay's chosen_profile from
CallAnswer instead of the local preference. Slider now has 8
positions: Studio 64k → Auto → Codec2 1.2k.
2. Force ping: added refresh button (↻) in Manage Relays dialog
header. Calls pingAllServers() to re-check all relays on demand.
3. Delete relay fix: the X button was inside a Surface(onClick=...)
which swallowed the touch event. Replaced with a separate Surface
that properly intercepts the click.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same fix as Android — the CallOffer now includes STUDIO_64K/48K/32K
so the relay can negotiate studio quality levels.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallOffer only advertised GOOD/DEGRADED/CATASTROPHIC. When a
client uses a studio profile, the relay's choose_profile couldn't
pick it. Now advertises all 6 profiles (studio 64k/48k/32k + good +
degraded + catastrophic) in both Android engine and shared handshake.
Also: the relay MUST be rebuilt with the new CodecId variants,
otherwise it will fail to deserialize CallOffer messages containing
studio QualityProfiles in supported_profiles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Wire protocol: add Opus 32k/48k/64k (CodecId 6/7/8) + STUDIO
profiles with is_opus() helper. Opus enc/dec accept all Opus variants.
2. JNI bridge: expand profile_from_int to 7 levels (0-6) mapping to
GOOD, DEGRADED, CATASTROPHIC, Codec2_3200, STUDIO_32K/48K/64K.
3. Settings UI: replace radio buttons with Material3 Slider — 7 stops
from Studio 64k (green) to Codec2 1.2k (dark red), matching desktop.
4. Key-change warning: AlertDialog on connect when server fingerprint
has changed. Shows old vs new fingerprint, Accept New Key or Cancel.
Accepting saves the new fingerprint and proceeds with the call.
5. Engine recv: handle studio codec IDs in auto-switch path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds three new codec IDs (Opus32k=6, Opus48k=7, Opus64k=8) and
corresponding STUDIO_32K, STUDIO_48K, STUDIO_64K quality profiles.
All use 20ms frames with minimal FEC (10%) for maximum quality on
good networks.
Updated across: wire protocol (codec_id.rs), encoder/decoder
(opus_enc/dec.rs), adaptive codec switch (call.rs), CLI
(--profile studio-64k), desktop engine + UI slider (8 quality
levels from Studio 64k green to Codec2 1.2k red).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The send loop was hardcoded to 960 samples (20ms/Opus24k), causing
DEGRADED (Opus 6k, 40ms) and CATASTROPHIC (Codec2 1200, 40ms) to
fail — the encoder needed 1920 samples but only got 960.
Changes:
- capture_buf, ring read threshold, and timestamp increment are now
computed from profile.frame_duration_ms (960 for 20ms, 1920 for 40ms)
- decode_buf sized to MAX_FRAME_SAMPLES (1920) to handle any incoming codec
- recv codec switch now uses correct QualityProfile per codec (was
inheriting original profile's frame_duration_ms, breaking cross-codec)
- added ComfortNoise guard on recv path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallDecoder now inspects each incoming packet's codec_id and
automatically switches the audio decoder if it differs from the
current profile. This enables cross-codec interop where one client
sends Opus and the other sends Codec2 — previously the receiver
would try to decode with the wrong codec, producing garbled audio.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enables debugging Codec2 by allowing forced codec selection from CLI.
Supports: good, degraded, catastrophic, codec2-3200, codec2-1200.
Frame size, timing, and jitter buffer are all adjusted dynamically
based on the selected profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay identity:
- Stored in ~/.wzp/relay-identity (hex-encoded 32-byte seed)
- Generated on first run, reused on restart
- Fingerprint stays consistent across relay restarts
Linux build script (scripts/build-linux-notify.sh):
- Fire and forget: Hetzner VM → build all binaries → upload to rustypaste → ntfy notify → destroy VM
- Builds: wzp-relay, wzp-client, wzp-client-audio, wzp-web, wzp-bench
- Packages as tar.gz, uploads to rustypaste
- --keep flag to preserve VM
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay recognizes SNI "ping" and returns immediately — no handshake,
no stream accept, no timeout error logs. Client closes after QUIC
connect for RTT measurement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ping was a static JNI method that loaded the .so before nativeInit,
crashing jemalloc. Now ping is an instance method on WzpEngine:
- Engine is created once (nativeInit), reused for both ping and call
- pingRelay() uses same tokio runtime pattern as startCall()
- Auto-pings all servers on app launch (after engine init)
- No process restart needed
- TOFU fingerprints saved on first successful ping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same fix as Android: Box::new([0i16; 16384]) allocates 32KB on the
stack before moving to heap. Use vec![].into_boxed_slice() for
direct heap allocation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The jni crate emits VERBOSE logs for every JNI method lookup (~10 lines
per call, 100+ calls/sec on audio threads). This floods logcat, consumes
CPU, and triggers system kills. Filter to only show INFO+ for our crates
and WARN+ for everything else.
Also fix build script: clean full Rust target to ensure libc++_shared.so
is always copied by cargo-ndk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds nativeWriteAudioDirect / nativeReadAudioDirect JNI functions
that accept a DirectByteBuffer instead of ShortArray. The buffer's
native memory is accessed directly by Rust via pointer — no
GetShortArrayRegion / SetShortArrayRegion, no GC-managed array
copies on the audio hot path.
This fixes SIGBUS crashes on Android 16 where ART's concurrent
mark-compact GC crashes when flipping thread roots during JNI
array operations on MAX_PRIORITY audio threads.
Old ShortArray methods kept for backward compatibility.
AudioPipeline switched to use Direct variants.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AudioRing: use vec![].into_boxed_slice() instead of Box::new([]) to
avoid 32KB stack allocation that crashes scudo on Android
- JNI bridge: wrap tracing_subscriber init in catch_unwind to survive
sharded_slab allocation failures on some devices
- Engine: per-step encode profiling (avg_agc_us, avg_opus_us, avg_fec_us,
avg_send_us) logged every 5 seconds in send stats
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds average microsecond timings for each encode step:
- avg_agc_us: AGC processing
- avg_opus_us: Opus encoding
- avg_fec_us: FEC encode + repair generation
- avg_send_us: QUIC send_media
- avg_total_us: sum of above
Logged every 5 seconds in send stats. Resets each interval.
Use to identify which step is bottlenecking the encode loop
on devices where fps drops below 50.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same fix as Android (4af7c5f): writer never touches read_pos,
reader self-corrects when lapped. Power-of-2 capacity (16384),
bitmask indexing, overflow/underrun counters.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rust tracing subscriber was never initialized — all info!/warn!/error!
calls in the engine went to /dev/null. This meant our send/recv health
logging was invisible and we couldn't confirm the congestion fix was
active.
Now initializes tracing-android layer on first nativeInit(), routing
all Rust logs to logcat under tag "wzp_android". Also expanded logcat
filter in DebugReporter to capture engine-level log lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
send_datagram() returns Err(Blocked) when the QUIC congestion window
is full. This is transient — the window reopens once ACKs arrive.
Previously, all send paths treated this as fatal (break/return),
which killed the send task and cascaded via tokio::select! to kill
the entire call.
Now: log warning, drop the packet, continue. Brief audio glitch
(20-100ms) instead of complete call death. FEC on the receiver
side recovers most dropped packets.
Fixed in:
- CLI run_live send task (continue + error counter)
- CLI run_file_mode send paths (2 locations)
- Desktop engine send task
Also hardened recv tasks: transient errors (non-closed/reset)
are survived instead of causing exit.
Matches the fix applied to Android client (engine.rs).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: send_media() returns Err(Blocked) when QUIC congestion
window is full. The send task treated ANY send error as fatal (break),
killing the entire call. Now send errors drop the packet and continue.
Also hardened recv task to survive transient errors and added health
logging (recv gap tracking, periodic stats) to both send and recv.
Relay: added comprehensive debug logging — recv gaps, lock contention,
forward latency, send errors — all per-participant with 5s stats.
Other changes:
- AEC toggle in Settings (persisted, applied on next call)
- Debug report: records call audio (WAV), RMS histogram (CSV), logcat,
stats. Emailed as zip via Android share intent after call ends.
- Replaced LinearProgressIndicator with Box (compose version compat)
- FileProvider for sharing debug zip attachments
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New desktop/ directory with Tauri v2 + Vite + TypeScript
- Rust backend: CallEngine wrapping wzp-client audio + transport
- Web frontend: connect screen, in-call screen with participants,
mic/speaker mute, keyboard shortcuts (m/s/q)
- Dark theme UI, settings persistence via localStorage
- Platform-aware --os-aec: warns on Windows/Linux (not yet implemented)
- Workspace updated to include desktop/src-tauri
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mic mute: the send loop now zeros the capture buffer when muted instead
of relying on write_audio() to skip writes. Previously stale ring data
and AGC amplification of near-silence caused crackling artifacts.
AEC: attach Android's hardware AcousticEchoCanceler to the AudioRecord
session. Also attach NoiseSuppressor when available. Both are released
on capture stop.
Room UI: deduplicate participants by fingerprint so ghost entries from
stale relay state don't show duplicate names.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UI: deduplicate room participants by fingerprint so ghost entries from
stale relay state don't show duplicates.
Engine: after select! ends, call close_now() + connection.closed() with
500ms timeout to wait for the relay to acknowledge the CONNECTION_CLOSE.
Previously the close frame was queued but the runtime died before quinn
could retransmit if the first packet was lost.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shutdown_background() killed the tokio runtime before quinn could send the
CONNECTION_CLOSE frame on the wire, so the relay never knew the client left.
Now use shutdown_timeout(500ms) to give quinn time to flush the close frame,
matching the desktop client pattern (which uses 2s timeout).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace Mutex-based CPAL callbacks with atomic SPSC ring buffers
- Proper async send/recv loops (no block_on), 20ms playout tick
- Add signal task for RoomUpdate presence display
- Add --alias, --raw-room flags and key persistence (~/.wzp/identity)
- Add SetAlias signal variant and relay-side handling
- Graceful Ctrl+C shutdown with force-quit on second press
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
stop_call() now calls close_now() on the stored transport handle before
killing the tokio runtime. This sends a QUIC CONNECTION_CLOSE frame so
the relay's recv loop breaks immediately, triggering leave() + RoomUpdate
broadcast. Previously the runtime was killed first, so transport.close()
never ran and the relay kept stale participants until idle timeout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Settings now uses draft state — changes only persist on explicit Save
- Back button discards unsaved changes
- Added applyServers() for batch server updates
- Added missing alias field to CallOffer in featherchat tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add SettingsScreen with identity (alias, key backup/restore), audio defaults,
server management, network prefs, and default room
- SettingsRepository persists all settings via SharedPreferences
- Auto-generate random display names on first launch (e.g. "Swift Wolf")
- Thread alias through CallOffer → relay handshake → RoomUpdate broadcast
- Derive caller fingerprint from identity key in relay handshake (fixes null
fingerprints when --auth-url is not set)
- Persist identity seed for stable fingerprints across reconnects
- Add alias field to SignalMessage::CallOffer (serde default for backward compat)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add RoomUpdate signal message to wzp-proto with participant count + list
- Add RoomParticipant struct (fingerprint + optional alias)
- Store fingerprint/alias in relay Participant struct
- Broadcast RoomUpdate to all room members on join and leave
- Add signal recv task in Android engine to handle RoomUpdate
- Surface room_participant_count + room_participants in CallStats JSON
- Show "X in room" with participant names in Android in-call UI
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire CallService foreground service for background calls (microphone type)
- Add Voice Volume + Mic Gain sliders (-20 to +20 dB) applied in Kotlin
- Connect AudioRouteManager for real speaker toggle via AudioManager
- Feed quinn QUIC RTT into PathMonitor, display Loss/RTT/Jitter from live data
- Nuclear teardown between calls — recreate engine + audio pipeline each call
- Fix re-entrant teardown loop from CallService notification callback
- Park audio threads as daemons to avoid libcrypto TLS destructor crash on exit
- Remove duplicate wakelocks from Activity (service owns them now)
- Strip AEC + denoise from capture path, keep AGC only (incremental approach)
- Fix .so copy target: libwzp_android.so not libwzp.so
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire AutoGainControl on both capture (mic → encode) and playout
(decode → speaker) paths to normalize volume levels
- Add server list with add/remove custom server dialog
- Add IPv4/IPv6 preference toggle for DNS resolution
- Resolve DNS hostnames to IP in Kotlin before passing to Rust engine
- Revert to IP addresses for default servers (DNS still broken on QUIC)
AGC confirmed working — voice levels noticeably improved in testing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AudioPipeline: Kotlin AudioRecord/AudioTrack on JVM threads, PCM
shuttled to Rust via lock-free ring buffers + JNI
- FEC: RaptorQ fountain codes on encode (5 frames/block, 20% repair
ratio for GOOD profile), decoder feeds repair symbols for recovery
- Real audio level meter from mic RMS (replaces fake animation)
- Room name editable in UI (default: "android")
- Relay changed to pangolin.manko.yoga:4433
- Stats overlay shows FEC recovered count
- CallState now synced from polled stats (fixes "Connecting" stuck bug)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pthread_create crashes on Android due to static bionic __init_tcb stubs
in the Rust std prebuilt rlibs. This is unfixable without rebuilding std.
Solution: run the entire call (QUIC connect, handshake, media send/recv)
on a single tokio current_thread runtime. The JNI startCall() now blocks,
so Kotlin dispatches it to Dispatchers.IO (JVM thread, not pthread).
Audio pipeline temporarily simplified to silence frames — will restore
once threading is solved (either via Java Thread or rebuilding std).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>