Major rewrite of relay federation replacing virtual participants with
a clean router model:
1. Global rooms: [[global_rooms]] in TOML config declares rooms that
are bridged across federation. Each relay is a router + local SFU.
2. Room events: RoomManager emits LocalJoin/LocalLeave via broadcast
channel when rooms transition between empty and non-empty.
3. GlobalRoomActive/Inactive signals: relays announce when they have
local participants in global rooms. Peers track active state and
forward media accordingly. Announcements propagate for multi-hop.
4. Media forwarding: separated from SFU loop. Local participant sends
via mpsc channel → egress task → forward_to_peers() → room-hash
tagged datagrams to active peer links. Inbound datagrams delivered
to local participants + forwarded to other active peers (multi-hop).
5. Loop prevention: don't forward back to source relay.
6. Room name hashing: is_global_room() checks both plain name and
hash (clients hash room names for SNI privacy).
Removed: ParticipantSender::Federation, federated_participants, virtual
participant join/leave, periodic room polling. Rooms now only contain
local participants.
Signaling tested: 3-relay chain (A→B←C) correctly propagates
GlobalRoomActive through B to both A and C. Media forwarding plumbing
in place but needs final debugging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added [[trusted]] config: relay B can accept inbound federation
from relay A by fingerprint alone, without knowing A's address.
A connects to B with [[peers]], B trusts A with [[trusted]].
- FederationHello signal: outbound connections send their TLS
fingerprint as first signal. The accepting relay verifies it
against [[peers]] (by IP) or [[trusted]] (by fingerprint).
- Tested 3-relay chain: A→B←C. Both A and C connect to B, B trusts
both. B correctly accepts both inbound connections. Room
announcements flow A→B and C→B.
- Remaining: B needs to announce rooms back to A and C on the same
connection so media can flow A→B→C. Currently A has no virtual
participant for B, so media doesn't reach B's SFU for forwarding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds --debug-tap <room> flag (or debug_tap in TOML config) that logs
every media packet's header metadata passing through a room. Use '*'
for all rooms.
Output (via tracing target "debug_tap"):
TAP room=... dir=in addr=... seq=31 codec=Opus24k ts=520
fec_block=5 fec_sym=1 repair=false len=65 fan_out=1
Shows: direction, source address, sequence number, codec ID, timestamp,
FEC block/symbol, repair flag, payload size, and fan-out count.
No decryption needed — headers are not encrypted.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added debug logging to federation signal path. Fixed the announce/recv
flow: outbound link's announce_task sends FederationRoomJoin, peer's
inbound signal_task receives it and creates virtual participant.
Tested: two relays on localhost with mutual TOML config, client A
sends tone via relay A, client B records via relay B — audio received
through federation (0.1s/RMS 7291/PASS).
Room announcement delay is ~1s (poll interval). The full pipeline:
client join → room created → announce_task detects → sends signal →
peer receives → creates virtual participant → SFU loop forwards
media via room-hash-tagged datagrams → peer demuxes → local delivery.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two tools:
1. --debug-tap on relay: logs packet header metadata (seq, codec, ts,
FEC, repair, size) per room without decryption. 0.5 day effort.
2. wzp-analyzer standalone: joins room as observer, decodes audio,
shows TUI with per-participant waveforms + quality stats + FEC
recovery rates. Capture/replay and HTML reports. 5-8 days total.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Inbound federation connections now matched by source IP against
configured peer URLs (QUIC clients don't present TLS certs, so
fingerprint matching fails for inbound direction).
- Added periodic room announcement task (1s poll) that sends
FederationRoomJoin to peers when new rooms appear with local
participants. Handles rooms created after federation link is up.
- Added find_peer_by_addr() to FederationManager.
Federation link topology: each relay pair has 2 connections (outbound
from each side). Outbound sends signals, peer's inbound receives them.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1 of relay federation:
1. Signal messages: FederationRoomJoin/Leave/ParticipantUpdate added
to SignalMessage enum for relay-to-relay room coordination.
2. Room changes: ParticipantOrigin (Local/Federated) tracking, loop
prevention (federated media only forwards to local participants),
ParticipantSender::Federation with 8-byte room-hash prefixed
datagrams, merged participant lists (local + remote), new methods:
join_federated(), update_federated_participants(), local_senders(),
active_rooms(), local_participants().
3. FederationManager: connects to configured peers via QUIC with SNI
"_federation", reconnects with exponential backoff (5s-300s),
exchanges FederationRoomJoin signals, runs recv loops for both
signals and media datagrams, creates virtual participants in rooms.
4. Accept-side: _federation SNI handling in main.rs, unknown peer
gets helpful "add to relay.toml" log message, recognized peers
handed off to FederationManager.
TODO: TLS fingerprint verification — currently outbound connections
use client_config() which doesn't present a cert, so inbound
verification fails. Need mutual TLS or URL-based peer matching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The relay now supports loading configuration from a TOML file via
--config <path>. CLI flags override TOML values. All fields have
serde defaults so a minimal config only needs what you want to change.
Example relay.toml:
listen_addr = "0.0.0.0:4433"
[[peers]]
url = "193.180.213.68:4433"
fingerprint = "1a:39:38:..."
label = "Pangolin EU"
Federation hint on startup now shows TOML format with TLS fingerprint
(not Ed25519 identity fingerprint), since TLS fingerprint is what
peers actually verify. Configured peers are logged on startup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The relay's TLS certificate is now derived from the persisted
Ed25519 seed via HKDF, so the same seed produces the same cert
and the same TLS fingerprint across restarts. This fixes the
"Server Key Changed" warnings on every relay restart.
Implementation: HKDF-SHA256(seed, "wzp-tls-ed25519") → Ed25519
signing key → PKCS8 DER → rcgen KeyPair → self-signed cert.
Also adds tls_fingerprint() helper (SHA-256 of DER cert, hex with
colons) and prints it on startup. This is the prerequisite for
relay federation (peers verify each other by TLS fingerprint).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
On startup, the relay detects its outbound IP (via UDP socket trick)
and prints a ready-to-copy YAML snippet for other relays to federate:
federation: to peer with this relay, add to peers config:
- url: "193.180.213.68:4433"
fingerprint: "a5d6:e3c6:..."
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Documents the relay TLS identity bug (cert regenerates on restart
because server_config() creates a new keypair every time, ignoring
the persisted Ed25519 seed) and the full federation design:
- YAML config with mutual peer trust (url + fingerprint)
- QUIC connections between peers, fingerprint verification
- Room bridging: media forwarding for shared room names
- Merged participant presence across relays
- Helpful log message for unconfigured peer connection attempts
- No transcoding, no re-encryption, no central coordinator
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers the full design for runtime codec switching based on network
conditions: 3-tier basic (GOOD/DEGRADED/CATASTROPHIC), extended
5-tier with studio levels, and bandwidth probing. Details the
existing QualityAdapter infrastructure, what's missing (report
ingestion, profile switch loop, cross-task signaling via AtomicU8),
and implementation plan for both Android and desktop engines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Auto codec: new "Auto" position on quality slider (JNI index 7).
When selected, the engine uses the relay's chosen_profile from
CallAnswer instead of the local preference. Slider now has 8
positions: Studio 64k → Auto → Codec2 1.2k.
2. Force ping: added refresh button (↻) in Manage Relays dialog
header. Calls pingAllServers() to re-check all relays on demand.
3. Delete relay fix: the X button was inside a Surface(onClick=...)
which swallowed the touch event. Replaced with a separate Surface
that properly intercepts the click.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same fix as Android — the CallOffer now includes STUDIO_64K/48K/32K
so the relay can negotiate studio quality levels.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallOffer only advertised GOOD/DEGRADED/CATASTROPHIC. When a
client uses a studio profile, the relay's choose_profile couldn't
pick it. Now advertises all 6 profiles (studio 64k/48k/32k + good +
degraded + catastrophic) in both Android engine and shared handshake.
Also: the relay MUST be rebuilt with the new CodecId variants,
otherwise it will fail to deserialize CallOffer messages containing
studio QualityProfiles in supported_profiles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. Wire protocol: add Opus 32k/48k/64k (CodecId 6/7/8) + STUDIO
profiles with is_opus() helper. Opus enc/dec accept all Opus variants.
2. JNI bridge: expand profile_from_int to 7 levels (0-6) mapping to
GOOD, DEGRADED, CATASTROPHIC, Codec2_3200, STUDIO_32K/48K/64K.
3. Settings UI: replace radio buttons with Material3 Slider — 7 stops
from Studio 64k (green) to Codec2 1.2k (dark red), matching desktop.
4. Key-change warning: AlertDialog on connect when server fingerprint
has changed. Shows old vs new fingerprint, Accept New Key or Cancel.
Accepting saves the new fingerprint and proceeds with the call.
5. Engine recv: handle studio codec IDs in auto-switch path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PRD-local-recording.md: Dual-path architecture for podcast-quality
interviews — local lossless WAV recording alongside live call, with
sync markers for post-session alignment, resumable upload to a
self-hosted mixer service that produces normalized multi-track output.
PRD-studio-quality.md: Documents the Opus 32k/48k/64k studio tiers,
when to use them, cross-codec interop, and backward compatibility.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds three new codec IDs (Opus32k=6, Opus48k=7, Opus64k=8) and
corresponding STUDIO_32K, STUDIO_48K, STUDIO_64K quality profiles.
All use 20ms frames with minimal FEC (10%) for maximum quality on
good networks.
Updated across: wire protocol (codec_id.rs), encoder/decoder
(opus_enc/dec.rs), adaptive codec switch (call.rs), CLI
(--profile studio-64k), desktop engine + UI slider (8 quality
levels from Studio 64k green to Codec2 1.2k red).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the relay's server key changes (e.g. after restart), show a
styled in-app warning dialog instead of the ugly browser confirm().
The dialog shows old vs new fingerprints and lets the user accept
the new key or cancel. Accepting updates the saved fingerprint and
refreshes the relay button state.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cargo-ndk doesn't always copy libc++_shared.so into jniLibs. The
build script now finds it in the NDK and copies it manually if
missing, preventing the build check from failing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The send loop was hardcoded to 960 samples (20ms/Opus24k), causing
DEGRADED (Opus 6k, 40ms) and CATASTROPHIC (Codec2 1200, 40ms) to
fail — the encoder needed 1920 samples but only got 960.
Changes:
- capture_buf, ring read threshold, and timestamp increment are now
computed from profile.frame_duration_ms (960 for 20ms, 1920 for 40ms)
- decode_buf sized to MAX_FRAME_SAMPLES (1920) to handle any incoming codec
- recv codec switch now uses correct QualityProfile per codec (was
inheriting original profile's frame_duration_ms, breaking cross-codec)
- added ComfortNoise guard on recv path
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace the quality dropdown with a range slider in the settings
panel. The slider goes from Auto (green) through Opus 24k, Opus 6k
(yellow), Codec2 3.2k (orange) to Codec2 1.2k (dark red). The
track uses a green-to-red gradient and the label color updates
to match the selected level. Removed the quality dropdown from
the connect screen — quality is now settings-only.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds a Quality dropdown (Auto / Opus 24k / Opus 6k / Codec2 3.2k /
Codec2 1.2k) to both the connect screen and settings panel. The
selected profile is passed through to the engine which configures
the encoder and decoder accordingly.
The desktop engine recv path now auto-switches the decoder codec
when incoming packets use a different codec than expected, enabling
cross-codec interop between clients on different quality settings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallDecoder now inspects each incoming packet's codec_id and
automatically switches the audio decoder if it differs from the
current profile. This enables cross-codec interop where one client
sends Opus and the other sends Codec2 — previously the receiver
would try to decode with the wrong codec, producing garbled audio.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Enables debugging Codec2 by allowing forced codec selection from CLI.
Supports: good, degraded, catastrophic, codec2-3200, codec2-1200.
Frame size, timing, and jitter buffer are all adjusted dynamically
based on the selected profile.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server may be reachable even if ping failed (transient timeout).
User should always be able to try connecting. Fingerprint change
still shows confirm dialog (accept/reject).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Both Android and Linux build scripts now send ntfy notification
when build fails, not just on success.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same Docker image as Android build. Separate cache dirs (cache-linux/)
to avoid conflicts when running both builds simultaneously.
Builds: wzp-relay, wzp-client, wzp-client-audio, wzp-web, wzp-bench
Uploads tar.gz to rustypaste, notifies ntfy.sh/wzp.
Usage:
./scripts/build-linux-docker.sh --pull # fire and forget
./scripts/build-linux-docker.sh --pull --install # wait + download
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay identity:
- Stored in ~/.wzp/relay-identity (hex-encoded 32-byte seed)
- Generated on first run, reused on restart
- Fingerprint stays consistent across relay restarts
Linux build script (scripts/build-linux-notify.sh):
- Fire and forget: Hetzner VM → build all binaries → upload to rustypaste → ntfy notify → destroy VM
- Builds: wzp-relay, wzp-client, wzp-client-audio, wzp-web, wzp-bench
- Packages as tar.gz, uploads to rustypaste
- --keep flag to preserve VM
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay recognizes SNI "ping" and returns immediately — no handshake,
no stream accept, no timeout error logs. Client closes after QUIC
connect for RTT measurement.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ping was a static JNI method that loaded the .so before nativeInit,
crashing jemalloc. Now ping is an instance method on WzpEngine:
- Engine is created once (nativeInit), reused for both ping and call
- pingRelay() uses same tokio runtime pattern as startCall()
- Auto-pings all servers on app launch (after engine init)
- No process restart needed
- TOFU fingerprints saved on first successful ping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Ping button: pings all servers via native QUIC, saves RTT + fingerprint
to SharedPreferences, then exits process (System.exit)
- On restart: loads saved ping results (no native .so loading needed)
- Avoids jemalloc crash: native lib only loaded once per process lifetime
- Removed broken UDP probe (QUIC servers don't respond to it)
- SettingsRepository: savePingRtt/loadPingRtt for cached results
- PingResult: added reachable field
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AudioPipeline.debugRecording defaults to false (was true)
- SettingsRepository: persist debug_recording preference
- CallViewModel: debugRecording StateFlow + setter, wired to AudioPipeline
- Only records PCM + RMS when explicitly enabled in settings
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace WzpEngine.pingRelay() (JNI, loads native .so, crashes jemalloc
on Android 16 MTE) with pure Kotlin DatagramSocket UDP probe.
- RelayPinger: sends QUIC Version Negotiation trigger packet, measures
RTT from response. No native lib, no JNI, zero crash risk.
- Periodic: pings all servers every 5 seconds via coroutine
- Server fingerprint: filled lazily on first real QUIC connection
(TOFU still works, just delayed)
- Lock status: OFFLINE when ping fails, NEW until first connection
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Same fix as Android: Box::new([0i16; 16384]) allocates 32KB on the
stack before moving to heap. Use vec![].into_boxed_slice() for
direct heap allocation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backport from desktop client to Android:
Identicons:
- New Identicon.kt composable: deterministic 5x5 symmetric Canvas pattern
from fingerprint hash (same algorithm as desktop identicon.ts)
- Participant list shows identicon + name + tappable fingerprint
- Settings page shows identicon next to fingerprint
CopyableFingerprint:
- Tap any fingerprint text to copy to clipboard with Toast feedback
- Used in participant list and settings page
Recent rooms:
- SettingsRepository: persists last 5 (relay, room) pairs
- CallViewModel: saves on startCall, exposes as StateFlow
- InCallScreen: clickable chips that fill room + select matching server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous attempt allocated DirectByteBuffer as local variables inside
runCapture/runPlayout. ART's JIT On-Stack Replacement nulled them
when recompiling the hot loop mid-execution.
Fix: allocate as class fields on AudioPipeline (captureDirectBuf,
playoutDirectBuf). Object fields live on the heap, immune to OSR
stack frame replacement.
Eliminates JNI array copies (GetShortArrayRegion/SetShortArrayRegion)
from the audio hot path, preventing ART GC SIGBUS crashes on
Android 16 with concurrent mark-compact GC.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>