Adds infrastructure for building the Tauri 2.x Android app. This is the
pivot away from the Kotlin+JNI approach, whose stack overflow, libcrypto
TLS crash, and thread lifecycle problems are documented in the incident report:
- scripts/Dockerfile.android-builder: extended to support both the
legacy Kotlin+JNI pipeline (cargo-ndk + Gradle) and the new Tauri
mobile pipeline (tauri-cli + Node/npm). Adds Node.js 20 LTS, API
level 36 + build-tools 35.0.0, and additional apt packages.
- scripts/build-tauri-android.sh: fire-and-forget remote build via
Docker on SepehrHomeserverdk, with ntfy.sh notifications and
rustypaste upload of the resulting APK. Mirrors the pattern of
build-tauri-android-docker.sh but targets the new Tauri pipeline.
- docs/incident-tauri-android-init-tcb.md: postmortem of the Kotlin+JNI
crash cascade that drove the Tauri mobile rewrite decision. Covers
the __init_tcb / pthread_create bionic private symbol leak, the
staticlib + cdylib crate-type interaction, the Dispatchers.IO 512 KB
thread stack overflow, and the tokio runtime / libcrypto TLS race.
- scripts/mint-tmux.sh, scripts/prep-linux-mint.sh: general dev
infrastructure (tmux + Linux Mint workstation prep scripts).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SignalManager (NEW):
- Dedicated Rust struct with its own QUIC connection to _signal
- Separate JNI handle (nativeSignalConnect/GetState/PlaceCall/etc)
- Kotlin wrapper polls state every 500ms via getState() JSON
- Lives independently of WzpEngine — survives across calls
- connect() blocks briefly on a thread with an 8MB stack, then the recv loop runs on a dedicated thread
WzpEngine (CLEANED):
- Back to pure media-only role (audio, codec, FEC, jitter)
- Removed start_signaling/place_call/answer_call methods
- Removed signal_transport/signal_fingerprint from EngineState
CallViewModel:
- Two separate managers: signalManager (persistent) + engine (per-call)
- Two separate polling loops: signalPollJob + statsJob
- Auto-connect to media room when signal polling detects "setup" state
- hangupDirectCall() ends media but keeps signal alive
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Don't set callState for signal-only states (prevents auto-join room)
- Store signal transport + fingerprint in EngineState after registration
- place_call/answer_call send directly via signal transport (not command channel)
- Spawn small threads for async signal sends (non-blocking)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Spawn signaling on dedicated thread with 4MB stack instead of using
Android's IO dispatcher thread (insufficient stack for tokio + QUIC)
- Add "direct-call-v1" version marker to home screen subtitle
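A minimal sketch of spawning work on a thread with an explicit 4 MB stack, using only the Rust standard library. The helper name `spawn_signal_thread` and the thread name are hypothetical; the real signaling entry point is not shown in this log.

```rust
use std::io;
use std::thread::{self, JoinHandle};

/// 4 MB, the stack size named in the commit message.
const SIGNAL_STACK_BYTES: usize = 4 * 1024 * 1024;

/// Spawns `work` on a dedicated thread with a 4 MB stack.
/// Hypothetical helper: the project's own function names may differ.
fn spawn_signal_thread<T, F>(work: F) -> io::Result<JoinHandle<T>>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    thread::Builder::new()
        .name("wzp-signal".into()) // assumed name, useful in crash traces
        .stack_size(SIGNAL_STACK_BYTES)
        .spawn(work)
}
```

`thread::Builder::spawn` returns an `io::Result`, so a failure to allocate the stack surfaces as an error rather than a panic.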
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Mode toggle: "Room" vs "Direct Call" tabs on pre-connection screen
- Direct Call mode: Register button → registers on relay signal channel
- After registration: shows fingerprint dial pad + incoming call panel
- Incoming call: green Accept / red Reject buttons with caller info
- Ringing state display while waiting for callee
- CallSetup auto-connects to media room
- CallStats extended: sas_code, incoming_call_id/fp/alias fields
- CallViewModel: registerForCalls(), placeDirectCall(), answerIncomingCall()
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
git pull fails when refs are stale from concurrent builds. Switch to
git gc + git fetch + git reset --hard origin/branch for robustness.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When relay listens on 0.0.0.0, derive the actual IP from the client's
connection address for the CallSetup message.
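The substitution could look roughly like this. `advertised_addr` and `observed_ip` are hypothetical names, and how the relay obtains the concrete IP from the connection is left out of the sketch.

```rust
use std::net::{IpAddr, SocketAddr};

/// Picks the address to put into CallSetup.
/// `listen` is the relay's bind address; `observed_ip` is the concrete IP
/// taken from the client's connection (assumed to be available to the caller).
fn advertised_addr(listen: SocketAddr, observed_ip: IpAddr) -> SocketAddr {
    if listen.ip().is_unspecified() {
        // 0.0.0.0 (or ::) is not routable, so substitute the observed IP.
        SocketAddr::new(observed_ip, listen.port())
    } else {
        listen
    }
}
```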
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Derive a 4-digit code from the shared DH secret via HKDF with label
"warzone-sas-code". Both peers compute the same code; a MITM relay
produces a different one. Users compare verbally during the call.
- CryptoSession::sas_code() -> Option<u32> on the trait
- ChaChaSession stores and returns the SAS
- HKDF derivation in WarzoneKeyExchange::derive_session()
- Tests: both peers match, MITM produces different code
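As an illustration of the last step only, the sketch below reduces four bytes of HKDF output to a 4-digit code. The HKDF call itself is omitted (it needs a crypto crate), and the big-endian read and modulo are assumptions, not necessarily what derive_session() does.

```rust
/// Reduces 4 bytes of key material to a code in 0..=9999.
/// Assumption: `okm` is the HKDF output for the "warzone-sas-code" label.
fn sas_from_okm(okm: [u8; 4]) -> u32 {
    u32::from_be_bytes(okm) % 10_000
}

/// Zero-pads the code so both peers always read out four digits.
fn format_sas(code: u32) -> String {
    format!("{:04}", code)
}
```

Both peers feed the same bytes into the same function, so they obtain the same code; a relay that terminates the key exchange on each side ends up with different shared secrets and therefore different bytes.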
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Call rooms (call-*) restricted to the two authorized participants only
- Room capacity enforced at 2 for call rooms
- Unauthorized clients get immediate connection close
- Unified fingerprint format: SHA-256(Ed25519 pub)[:16] as xxxx:xxxx:...
Used consistently in signal registration, handshake, and ACL checks
Tested: Alice+Bob authorized, attacker rejected with "not authorized"
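A sketch of the admission check described above, under the assumption that the relay keeps a per-room set of authorized fingerprints. `RoomAcl`, `may_join`, and the "room full" message are hypothetical; only the "not authorized" message comes from the test note.

```rust
use std::collections::HashSet;

/// Hypothetical ACL: the two fingerprints allowed into a call room.
struct RoomAcl {
    allowed: HashSet<String>,
}

/// Call rooms hold exactly two participants.
const CALL_ROOM_CAPACITY: usize = 2;

/// Decides whether `fingerprint` may join `room`.
/// `current` is the number of participants already in the room.
fn may_join(
    room: &str,
    fingerprint: &str,
    acl: Option<&RoomAcl>,
    current: usize,
) -> Result<(), &'static str> {
    if !room.starts_with("call-") {
        return Ok(()); // ordinary rooms are not restricted in this sketch
    }
    match acl {
        Some(a) if a.allowed.contains(fingerprint) => {
            if current >= CALL_ROOM_CAPACITY {
                Err("room full")
            } else {
                Ok(())
            }
        }
        // Unknown call room or unlisted fingerprint: close immediately.
        _ => Err("not authorized"),
    }
}
```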
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New feature: call someone directly by fingerprint through the relay.
- Client connects with SNI "_signal" for persistent signaling
- RegisterPresence/RegisterPresenceAck for relay registration
- DirectCallOffer routed to target by fingerprint
- DirectCallAnswer with AcceptGeneric/AcceptTrusted/Reject modes
- Relay creates private room (call-{id}), sends CallSetup to both
- Both clients connect to private room for media (existing SFU path)
- Hangup forwarding + cleanup on disconnect
- Desktop CLI: --signal + --call <fingerprint> for testing
- CallRegistry tracks call state (Pending/Ringing/Active/Ended)
- SignalHub manages persistent signaling connections
Tested: Alice calls Bob by fingerprint, relay routes offer, Bob
auto-accepts, both join private room, media flows bidirectionally.
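The four CallRegistry states can be modeled as a small state machine. The allowed transitions below are an assumption inferred from the flow described above (offer, ring, answer, hangup), not taken from the code.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CallState {
    Pending,
    Ringing,
    Active,
    Ended,
}

/// Returns true if moving from `from` to `to` is allowed.
/// Assumed rules: Pending -> Ringing -> Active, and any live call may end.
fn can_transition(from: CallState, to: CallState) -> bool {
    use CallState::*;
    match (from, to) {
        (Pending, Ringing) | (Ringing, Active) => true,
        (Ended, _) => false, // Ended is terminal
        (_, Ended) => true,  // hangup, reject, or disconnect cleanup
        _ => false,
    }
}
```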
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cargo.lock changes from Docker builds caused pull conflicts. The script now
runs git reset --hard + git clean -fd to guarantee a clean state before pulling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Time-based dedup (2s TTL) replaces fixed-window dedup — consecutive
senders with same seq numbers no longer collide
- Raw byte forwarding for federation local delivery (no re-serialization)
- Jitter buffer resets on large backward seq jumps (>100)
- recv_media skips malformed datagrams instead of returning connection-closed
- SIGTERM handler for clean QUIC shutdown on wzp-client
- JSONL event log infrastructure (--event-log flag) for protocol analysis
- FEC disabled on GOOD profile for federation debugging (fec_ratio=0.0)
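The time-based dedup in the first bullet might be sketched as follows. `TtlDedup` is a hypothetical type, the key is simplified to a `u64`, and `now` is passed in explicitly so the behavior is easy to test.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Remembers keys for `ttl` and reports whether a key was seen recently.
struct TtlDedup {
    ttl: Duration,
    seen: HashMap<u64, Instant>,
}

impl TtlDedup {
    fn new(ttl: Duration) -> Self {
        TtlDedup { ttl, seen: HashMap::new() }
    }

    /// Returns true if `key` is new, or if its last sighting has expired.
    fn check_and_insert(&mut self, key: u64, now: Instant) -> bool {
        // Purge expired entries first; O(n) per call, kept simple on purpose.
        let ttl = self.ttl;
        self.seen.retain(|_, first_seen| now.duration_since(*first_seen) < ttl);
        if self.seen.contains_key(&key) {
            return false; // duplicate within the TTL window
        }
        self.seen.insert(key, now);
        true
    }
}
```

With a 2-second TTL, a sequence number reused by a later sender is accepted again once the earlier entry has aged out, which is the collision the fixed-window scheme could not handle.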
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Federation media from different senders had conflicting seq numbers,
FEC block IDs, and Opus decoder state. The relay now assigns fresh
monotonic seq/fec_block/fec_symbol to all federation-delivered packets,
ensuring clients see a clean continuous stream regardless of sender changes.
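One way to picture the renumbering is a per-stream counter that stamps every outgoing packet. The field widths and the fixed `symbols_per_block` are assumptions made for the sketch; the actual header layout is not given here.

```rust
/// Header values assigned by the relay (field widths are assumed).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Stamp {
    seq: u16,
    fec_block: u8,
    fec_symbol: u8,
}

/// Hands out monotonic seq/fec_block/fec_symbol values.
struct Resequencer {
    next: Stamp,
    symbols_per_block: u8,
}

impl Resequencer {
    fn new(symbols_per_block: u8) -> Self {
        assert!(symbols_per_block > 0, "a block needs at least one symbol");
        Resequencer {
            next: Stamp { seq: 0, fec_block: 0, fec_symbol: 0 },
            symbols_per_block,
        }
    }

    /// Returns the stamp for the next packet and advances the counters.
    fn stamp(&mut self) -> Stamp {
        let out = self.next;
        self.next.seq = self.next.seq.wrapping_add(1);
        self.next.fec_symbol += 1;
        if self.next.fec_symbol == self.symbols_per_block {
            // Block is full: start the next one at symbol 0.
            self.next.fec_symbol = 0;
            self.next.fec_block = self.next.fec_block.wrapping_add(1);
        }
        out
    }
}
```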
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When propagating GlobalRoomActive to other peers, use tagged participants
(with relay_label set to the originating relay) instead of the raw
untagged participants. This shows "Relay C" instead of "Relay B" when
C's participants are forwarded through hub B to A.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a new sender reuses the same block_id values as a previous sender,
the FEC decoder was silently dropping all data because blocks were marked
as "already decoded". Now blocks older than 2 seconds are automatically
reset when new data arrives for them.
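A simplified model of the reset rule, tracking only the "already decoded" flag per block. `BlockTracker` and its methods are hypothetical, and the real decoder state is richer than a boolean.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Blocks older than this are reset when new data arrives for them.
const BLOCK_MAX_AGE: Duration = Duration::from_secs(2);

struct BlockState {
    decoded: bool,
    started: Instant,
}

struct BlockTracker {
    blocks: HashMap<u8, BlockState>,
}

impl BlockTracker {
    fn new() -> Self {
        BlockTracker { blocks: HashMap::new() }
    }

    /// Returns true if a symbol for `block_id` should reach the decoder.
    fn on_symbol(&mut self, block_id: u8, now: Instant) -> bool {
        let state = self
            .blocks
            .entry(block_id)
            .or_insert(BlockState { decoded: false, started: now });
        if now.duration_since(state.started) > BLOCK_MAX_AGE {
            // Stale entry: the id is being reused, so start a fresh block.
            state.decoded = false;
            state.started = now;
        }
        !state.decoded
    }

    fn mark_decoded(&mut self, block_id: u8) {
        if let Some(state) = self.blocks.get_mut(&block_id) {
            state.decoded = true;
        }
    }
}
```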
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Dedup key now includes source peer fingerprint hash, preventing
packets from different senders with same room+seq from being dropped
as duplicates (was silently killing all multi-hop audio)
- Build scripts default to --pull (use --no-pull to skip)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Deduplicate remote participants by fingerprint in all merge sites
(canonical == raw room name caused double-lookup, doubling every remote participant)
- GlobalRoomInactive now propagates updated participant list to other peers
(hub relay B was not informing A when C's participants left)
- Add 15-second stale presence sweeper that purges remote participants
from peers that stop sending data (safety net for QUIC timeout delays)
- Add @Synchronized to WzpEngine.getStats/stopCall/destroy to prevent
TOCTOU race between stats polling coroutine and engine teardown (SIGSEGV)
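The fingerprint dedup from the first bullet can be sketched as a merge that keeps the first entry per fingerprint. The `Participant` fields and `merge_unique` are illustrative names, not the project's types.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, PartialEq)]
struct Participant {
    fingerprint: String,
    alias: String,
}

/// Merges local and remote lists, keeping the first entry per fingerprint.
fn merge_unique(local: &[Participant], remote: &[Participant]) -> Vec<Participant> {
    let mut seen = HashSet::new();
    local
        .iter()
        .chain(remote.iter())
        // `insert` returns false for a fingerprint that was already seen.
        .filter(|p| seen.insert(p.fingerprint.clone()))
        .cloned()
        .collect()
}
```

Because the filter keys on the fingerprint alone, looking up the same remote list twice (once by canonical name, once by raw name) no longer doubles the entries.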
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Android default room changed from 'android' to 'general'
- Relay choose_profile capped at GOOD (Opus 24k) — studio tiers
(32k/48k/64k) cause high packet loss on federation paths due to
larger datagrams exceeding path MTU. Will re-enable after MTU
discovery is implemented.
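Capping a negotiated profile is a one-line `min` once the tiers are ordered. Apart from GOOD, the variant names below are hypothetical placeholders for the lower and studio tiers.

```rust
/// Quality tiers in ascending order; derive(Ord) follows declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Profile {
    Degraded, // hypothetical lower tier
    Good,     // Opus 24k
    Studio32, // hypothetical names for the 32k/48k/64k tiers
    Studio48,
    Studio64,
}

/// Highest profile the relay will hand out until MTU discovery exists.
const MAX_PROFILE: Profile = Profile::Good;

fn cap_profile(requested: Profile) -> Profile {
    requested.min(MAX_PROFILE)
}
```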
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The git hash was read inside Docker (/build/source), where .git doesn't
exist. It is now read from $BASE_DIR/data/source before Docker runs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ntfy messages now show: "WZP Linux [abc1234] ready!" and
"WZP Android [abc1234] done! APK: url" so you can verify which
commit was built without checking relay version remotely.
Also added PRD-mtu-discovery.md for QUIC Path MTU Discovery.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a new federation link is established, announce not only LOCAL
global rooms but also rooms from OTHER peers (remote_participants).
This fixes multi-hop: when R2 connects to R3, R2 tells R3 about
R1's rooms that R2 learned about earlier.
Previously, only local rooms were announced on link setup. If R1
had a client but R2 had no clients, R2 wouldn't tell R3 about R1.
Also added diagnostic logging for room announcements on link setup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for 3-relay chain (R1→R2→R3):
1. Room lookup in handle_datagram: hub relay (R2) has no local
participants, so active_rooms() was empty and datagrams were
silently dropped. Now also checks global_rooms config directly,
allowing hub relays to forward without local clients.
2. Multi-hop forwarding: removed active_rooms filter — forward to
ALL connected peers except source. The receiving peer decides
whether to deliver or forward further.
3. Android relay_label: native RoomMember now includes relay_label
from RoomUpdate signal. Kotlin UI reads it for relay grouping.
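The forwarding rule in fix 2 reduces to "every connected peer except the one the datagram came from". In this sketch peers are identified by plain string labels, which is an assumption.

```rust
/// Returns the peers a datagram from `source` should be forwarded to.
fn forward_targets<'a>(peers: &'a [String], source: &str) -> Vec<&'a str> {
    peers
        .iter()
        .map(String::as_str)
        .filter(|peer| *peer != source) // never echo back to the sender
        .collect()
}
```

From the hub's point of view in the R1→R2→R3 chain, R2 is connected to R1 and R3, so a datagram arriving from R1 is forwarded to R3 only.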
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Participants now grouped by relay on Android:
- Green dot + "THIS RELAY" for local participants
- Blue dot + relay label for federated participants
Added relayLabel to RoomMember data class, parsed from
relay_label JSON field. UI groups and renders with headers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a remote relay's room goes inactive (all participants left),
the receiving relay now:
1. Clears remote_participants for that peer+room
2. Broadcasts updated RoomUpdate to local clients with the remote
participant removed
3. Updates federation_active_rooms metric
Previously, remote participants lingered in the participant list
after disconnect, causing ghost entries and stale media forwarding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Connects to a relay over QUIC with SNI "version", reads build hash
from a unidirectional stream, prints "<relay> <git-hash>" and exits.
Usage: wzp-client --version-check 172.16.81.175:4434
Output: 172.16.81.175:4434 8dbda3e
Relay side: detects "version" SNI, opens uni stream, writes
BUILD_GIT_HASH, waits 100ms for client to read, closes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
wzp-relay --version prints "wzp-relay <short-git-hash>".
Build hash also logged on startup: version=abc1234.
Enables verifying deployed relay matches expected build.
Also fixed federation-test.sh: use kill -INT (SIGINT) instead of SIGTERM so
clients save recordings before exit. Added a save delay.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RoomParticipant.relay_label identifies which relay a participant is
connected to. Local participants have None, federated participants
get tagged with the peer relay's label when storing remote_participants.
This enables clients to group participants by relay in the UI.
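Grouping by the optional label might look like this on the client side. The struct is trimmed to two fields for the sketch, and `group_by_relay` is a hypothetical helper.

```rust
use std::collections::BTreeMap;

#[derive(Debug, Clone, PartialEq)]
struct RoomParticipant {
    alias: String,
    /// None for local participants, Some(label) for federated ones.
    relay_label: Option<String>,
}

/// Groups participant aliases by relay. The `None` key (local relay)
/// sorts first because `None < Some(_)` for `Option`.
fn group_by_relay(ps: &[RoomParticipant]) -> BTreeMap<Option<String>, Vec<String>> {
    let mut groups = BTreeMap::new();
    for p in ps {
        groups
            .entry(p.relay_label.clone())
            .or_insert_with(Vec::new)
            .push(p.alias.clone());
    }
    groups
}
```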
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. CLI client now sends raw room names (no hash), matching Android
JNI and Desktop Tauri. All three clients are now consistent.
2. When a client joins a global room, the relay merges federated
remote participants into the initial RoomUpdate. Previously,
clients that joined after the GlobalRoomActive signal only saw
local participants. Now they see everyone immediately.
3. Added get_remote_participants() to FederationManager for querying
cached remote participants from all peer links.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>