1. Auto codec: new "Auto" position on quality slider (JNI index 7).
When selected, the engine uses the relay's chosen_profile from
CallAnswer instead of the local preference. Slider now has 8
positions: Studio 64k → Auto → Codec2 1.2k.
2. Force ping: added refresh button (↻) in Manage Relays dialog
header. Calls pingAllServers() to re-check all relays on demand.
3. Delete relay fix: the X button was inside a Surface(onClick=...)
which swallowed the touch event. Replaced with a separate Surface
that properly intercepts the click.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallOffer only advertised GOOD/DEGRADED/CATASTROPHIC. When a
client uses a studio profile, the relay's choose_profile couldn't
pick it. Now advertises all 6 profiles (studio 64k/48k/32k + good +
degraded + catastrophic) in both Android engine and shared handshake.
Also: the relay MUST be rebuilt with the new CodecId variants,
otherwise it will fail to deserialize CallOffer messages containing
studio QualityProfiles in supported_profiles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Wire protocol: add Opus 32k/48k/64k (CodecId 6/7/8) + STUDIO
profiles with is_opus() helper. Opus enc/dec accept all Opus variants.
2. JNI bridge: expand profile_from_int to 7 levels (0-6) mapping to
GOOD, DEGRADED, CATASTROPHIC, Codec2_3200, STUDIO_32K/48K/64K.
3. Settings UI: replace radio buttons with Material3 Slider — 7 stops
from Studio 64k (green) to Codec2 1.2k (dark red), matching desktop.
4. Key-change warning: AlertDialog on connect when server fingerprint
has changed. Shows old vs new fingerprint, Accept New Key or Cancel.
Accepting saves the new fingerprint and proceeds with the call.
5. Engine recv: handle studio codec IDs in auto-switch path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The send loop was hardcoded to 960 samples (20ms/Opus24k), causing
DEGRADED (Opus 6k, 40ms) and CATASTROPHIC (Codec2 1200, 40ms) to
fail — the encoder needed 1920 samples but only got 960.
Changes:
- capture_buf, ring read threshold, and timestamp increment are now
computed from profile.frame_duration_ms (960 for 20ms, 1920 for 40ms)
- decode_buf sized to MAX_FRAME_SAMPLES (1920) to handle any incoming codec
- recv codec switch now uses correct QualityProfile per codec (was
inheriting original profile's frame_duration_ms, breaking cross-codec)
- added ComfortNoise guard on recv path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay identity:
- Stored in ~/.wzp/relay-identity (hex-encoded 32-byte seed)
- Generated on first run, reused on restart
- Fingerprint stays consistent across relay restarts
Linux build script (scripts/build-linux-notify.sh):
- Fire and forget: Hetzner VM → build all binaries → upload to rustypaste → ntfy notify → destroy VM
- Builds: wzp-relay, wzp-client, wzp-client-audio, wzp-web, wzp-bench
- Packages as tar.gz, uploads to rustypaste
- --keep flag to preserve VM
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay recognizes SNI "ping" and returns immediately — no handshake,
no stream accept, no timeout error logs. Client closes after QUIC
connect for RTT measurement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ping was a static JNI method that loaded the .so before nativeInit,
crashing jemalloc. Now ping is an instance method on WzpEngine:
- Engine is created once (nativeInit), reused for both ping and call
- pingRelay() uses same tokio runtime pattern as startCall()
- Auto-pings all servers on app launch (after engine init)
- No process restart needed
- TOFU fingerprints saved on first successful ping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The jni crate emits VERBOSE logs for every JNI method lookup (~10 lines
per call, 100+ calls/sec on audio threads). This floods logcat, consumes
CPU, and triggers system kills. Filter to only show INFO+ for our crates
and WARN+ for everything else.
Also fix build script: clean full Rust target to ensure libc++_shared.so
is always copied by cargo-ndk.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds nativeWriteAudioDirect / nativeReadAudioDirect JNI functions
that accept a DirectByteBuffer instead of ShortArray. The buffer's
native memory is accessed directly by Rust via pointer — no
GetShortArrayRegion / SetShortArrayRegion, no GC-managed array
copies on the audio hot path.
This fixes SIGBUS crashes on Android 16 where ART's concurrent
mark-compact GC crashes when flipping thread roots during JNI
array operations on MAX_PRIORITY audio threads.
Old ShortArray methods kept for backward compatibility.
AudioPipeline switched to use Direct variants.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AudioRing: use vec![].into_boxed_slice() instead of Box::new([]) to
avoid 32KB stack allocation that crashes scudo on Android
- JNI bridge: wrap tracing_subscriber init in catch_unwind to survive
sharded_slab allocation failures on some devices
- Engine: per-step encode profiling (avg_agc_us, avg_opus_us, avg_fec_us,
avg_send_us) logged every 5 seconds in send stats
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds average microsecond timings for each encode step:
- avg_agc_us: AGC processing
- avg_opus_us: Opus encoding
- avg_fec_us: FEC encode + repair generation
- avg_send_us: QUIC send_media
- avg_total_us: sum of above
Logged every 5 seconds in send stats. Resets each interval.
Use to identify which step is bottlenecking the encode loop
on devices where fps drops below 50.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rust tracing subscriber was never initialized — all info!/warn!/error!
calls in the engine went to /dev/null. This meant our send/recv health
logging was invisible and we couldn't confirm the congestion fix was
active.
Now initializes tracing-android layer on first nativeInit(), routing
all Rust logs to logcat under tag "wzp_android". Also expanded logcat
filter in DebugReporter to capture engine-level log lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Root cause: send_media() returns Err(Blocked) when QUIC congestion
window is full. The send task treated ANY send error as fatal (break),
killing the entire call. Now send errors drop the packet and continue.
Also hardened recv task to survive transient errors and added health
logging (recv gap tracking, periodic stats) to both send and recv.
Relay: added comprehensive debug logging — recv gaps, lock contention,
forward latency, send errors — all per-participant with 5s stats.
Other changes:
- AEC toggle in Settings (persisted, applied on next call)
- Debug report: records call audio (WAV), RMS histogram (CSV), logcat,
stats. Emailed as zip via Android share intent after call ends.
- Replaced LinearProgressIndicator with Box (compose version compat)
- FileProvider for sharing debug zip attachments
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Mic mute: the send loop now zeros the capture buffer when muted instead
of relying on write_audio() to skip writes. Previously stale ring data
and AGC amplification of near-silence caused crackling artifacts.
AEC: attach Android's hardware AcousticEchoCanceler to the AudioRecord
session. Also attach NoiseSuppressor when available. Both are released
on capture stop.
Room UI: deduplicate participants by fingerprint so ghost entries from
stale relay state don't show duplicate names.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
UI: deduplicate room participants by fingerprint so ghost entries from
stale relay state don't show duplicates.
Engine: after select! ends, call close_now() + connection.closed() with
500ms timeout to wait for the relay to acknowledge the CONNECTION_CLOSE.
Previously the close frame was queued but the runtime died before quinn
could retransmit if the first packet was lost.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
shutdown_background() killed the tokio runtime before quinn could send the
CONNECTION_CLOSE frame on the wire, so the relay never knew the client left.
Now use shutdown_timeout(500ms) to give quinn time to flush the close frame,
matching the desktop client pattern (which uses 2s timeout).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
stop_call() now calls close_now() on the stored transport handle before
killing the tokio runtime. This sends a QUIC CONNECTION_CLOSE frame so
the relay's recv loop breaks immediately, triggering leave() + RoomUpdate
broadcast. Previously the runtime was killed first, so transport.close()
never ran and the relay kept stale participants until idle timeout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Settings now uses draft state — changes only persist on explicit Save
- Back button discards unsaved changes
- Added applyServers() for batch server updates
- Added missing alias field to CallOffer in featherchat tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add SettingsScreen with identity (alias, key backup/restore), audio defaults,
server management, network prefs, and default room
- SettingsRepository persists all settings via SharedPreferences
- Auto-generate random display names on first launch (e.g. "Swift Wolf")
- Thread alias through CallOffer → relay handshake → RoomUpdate broadcast
- Derive caller fingerprint from identity key in relay handshake (fixes null
fingerprints when --auth-url is not set)
- Persist identity seed for stable fingerprints across reconnects
- Add alias field to SignalMessage::CallOffer (serde default for backward compat)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add RoomUpdate signal message to wzp-proto with participant count + list
- Add RoomParticipant struct (fingerprint + optional alias)
- Store fingerprint/alias in relay Participant struct
- Broadcast RoomUpdate to all room members on join and leave
- Add signal recv task in Android engine to handle RoomUpdate
- Surface room_participant_count + room_participants in CallStats JSON
- Show "X in room" with participant names in Android in-call UI
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire CallService foreground service for background calls (microphone type)
- Add Voice Volume + Mic Gain sliders (-20 to +20 dB) applied in Kotlin
- Connect AudioRouteManager for real speaker toggle via AudioManager
- Feed quinn QUIC RTT into PathMonitor, display Loss/RTT/Jitter from live data
- Nuclear teardown between calls — recreate engine + audio pipeline each call
- Fix re-entrant teardown loop from CallService notification callback
- Park audio threads as daemons to avoid libcrypto TLS destructor crash on exit
- Remove duplicate wakelocks from Activity (service owns them now)
- Strip AEC + denoise from capture path, keep AGC only (incremental approach)
- Fix .so copy target: libwzp_android.so not libwzp.so
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Wire AutoGainControl on both capture (mic → encode) and playout
(decode → speaker) paths to normalize volume levels
- Add server list with add/remove custom server dialog
- Add IPv4/IPv6 preference toggle for DNS resolution
- Resolve DNS hostnames to IP in Kotlin before passing to Rust engine
- Revert to IP addresses for default servers (DNS still broken on QUIC)
AGC confirmed working — voice levels noticeably improved in testing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AudioPipeline: Kotlin AudioRecord/AudioTrack on JVM threads, PCM
shuttled to Rust via lock-free ring buffers + JNI
- FEC: RaptorQ fountain codes on encode (5 frames/block, 20% repair
ratio for GOOD profile), decoder feeds repair symbols for recovery
- Real audio level meter from mic RMS (replaces fake animation)
- Room name editable in UI (default: "android")
- Relay changed to pangolin.manko.yoga:4433
- Stats overlay shows FEC recovered count
- CallState now synced from polled stats (fixes "Connecting" stuck bug)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pthread_create crashes on Android due to static bionic __init_tcb stubs
in the Rust std prebuilt rlibs. This is unfixable without rebuilding std.
Solution: run the entire call (QUIC connect, handshake, media send/recv)
on a single tokio current_thread runtime. The JNI startCall() now blocks,
so Kotlin dispatches it to Dispatchers.IO (JVM thread, not pthread).
Audio pipeline temporarily simplified to silence frames — will restore
once threading is solved (either via Java Thread or rebuilding std).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The getauxval override (dlsym wrapper) fixes SIGSEGV in
init_have_lse_atomics at library load time. The current_thread
tokio runtime avoids SEGV_ACCERR in pthread_create/__init_tcb.
Both fixes are required together.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multi-thread tokio runtime crashes with SEGV_ACCERR in __init_tcb
during pthread_create on Android (static bionic stubs from CRT).
Switch to current_thread runtime which runs network I/O on the
calling thread without spawning additional OS threads.
Also: clean up build.rs — use only libc++_shared.so (dynamic),
remove getauxval_fix.c hack, remove static c++/c++abi linking.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
compiler-rt's init_have_lse_atomics calls getauxval(AT_HWCAP) at
library load time. The static getauxval from the CRT reads from
__libc_auxv which is NULL in shared libraries → SIGSEGV at 0x0.
Fix: compile getauxval_fix.c that provides a getauxval() which uses
dlsym(RTLD_DEFAULT) to find the real bionic getauxval at runtime.
Also switch to libc++_shared.so (bundled in APK) to avoid pulling
in static libc stubs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Set cpp_link_stdlib(None) to suppress cc crate's automatic linking
- Explicitly link both c++_static and c++abi with NDK sysroot search path
- Fixes RTTI vtable symbol (_ZTVN10__cxxabiv117__class_type_infoE) error
- Verified: only liblog.so remains as dynamic dependency
Closes#001
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous fix linked c++_static but not c++abi. Android NDK splits the
static C++ runtime into two archives: libc++_static.a (STL) and
libc++abi.a (RTTI/exceptions). Without c++abi, dlopen fails on
_ZTVN10__cxxabiv117__class_type_infoE.
Now using cpp_link_stdlib(None) to suppress cc crate auto-linking, then
explicitly linking both c++_static and c++abi via cargo:rustc-link-lib.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The app crashed immediately when loading libwzp_android.so because the
cc crate's default dynamic linking produced a runtime dependency on
libc++_shared.so, which was never packaged into the APK.
Adding .cpp_link_stdlib(Some("c++_static")) to build.rs bakes the C++
runtime into libwzp_android.so directly, eliminating the missing .so.
See issues/001-libc++-shared-crash.md for full diagnosis and logcat trace.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- CallActivity no longer auto-starts a call on launch
- CallViewModel lazily inits engine only when startCall() is called
- nativeGetStats nullable return handled safely in Kotlin
- Removed tracing_subscriber::fmt() which panics on Android (no stdout)
- All JNI calls wrapped in try/catch on Kotlin side
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New wzp-android crate with Oboe C++ backend, lock-free SPSC ring buffers,
engine orchestrator, codec pipeline, and Android Gradle project structure
- AEC (NLMS adaptive filter), AGC (two-stage with fast attack/slow release),
windowed-sinc FIR resampler replacing linear interpolation (wzp-codec)
- Opus encoder tuning: complexity 7 default, set_expected_loss support
- Mobile jitter buffer: asymmetric EMA (fast up/slow down), handoff spike
detection with 2s cooldown, configurable safety margin
- Network-aware quality control: cellular-specific thresholds, faster
downgrade on cellular, proactive tier drop on WiFi→cellular handoff,
FEC ratio boost during network transitions
- Handoff detection in PathMonitor via RTT jitter spike analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RelayLink: QUIC connection to peer relay (SNI "_relay") for forwarding
specific sessions. Methods: connect, forward, add/remove_session, is_idle.
RelayLinkManager: manages connections to multiple peers.
- get_or_connect: lazy connection establishment
- forward_to: send media packet to specific peer
- register/unregister_session: track which sessions use which links
- Auto-closes idle links on session unregister
Protocol: added SignalMessage::SessionForward { session_id,
target_fingerprint, source_relay } and SessionForwardAck { session_id,
room_name } for relay-link session setup signaling.
Building block for P3-T7 (call setup over mesh) which wires
route resolution + relay links + handshake into a complete flow.
62 relay tests + 42 proto tests passing (7 new relay_link tests).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RouteResolver queries PresenceRegistry to determine how to reach a target:
- Route::Local — connected to this relay
- Route::DirectPeer(addr) — on a directly connected peer relay
- Route::Chain(addrs) — multi-hop (structure ready, single-hop for now)
- Route::NotFound — not in any known relay
Protocol: added SignalMessage::RouteQuery { fingerprint, ttl } and
RouteResponse { fingerprint, found, relay_chain } for peer-to-peer
route queries over probe connections.
HTTP API: GET /route/:fingerprint returns JSON with route type + chain.
Relay handles incoming RouteQuery on probe connections: looks up locally,
replies with RouteResponse. TTL decremented for future multi-hop forwarding.
55 relay tests + 42 proto tests passing (7 new route tests).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PresenceRegistry tracks who is connected where:
- register_local/unregister_local for directly connected users
- update_peer for fingerprints reported by peer relays
- lookup returns Local or Remote(addr)
- expire_stale removes entries older than timeout
Gossip via probe connections:
- New SignalMessage::PresenceUpdate { fingerprints, relay_addr }
- Probes send local fingerprints every 10s alongside Ping/Pong
- Receiving relay updates its remote presence table
HTTP API on metrics port:
- GET /presence — all known fingerprints + locations
- GET /presence/:fingerprint — single lookup
- GET /peers — peer relays + their connected users
Wired into relay main:
- Registry created at startup
- register_local after auth+handshake
- unregister_local on disconnect
- Passed to probe mesh and metrics server
Also marks FC-10 as DONE in integration tracker.
48 relay tests + 42 proto tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ServeDir now falls back to index.html for unknown paths (SPA routing).
https://host:port/manwe loads the page with room input pre-filled as "manwe".
JS getRoom() already reads the path, now the page actually loads.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>