Phase 4 lays the telemetry foundation for distinguishing DRED recoveries
from classical PLC in production: a new SignalMessage variant, two new
per-session Prometheus counters on the relay side, and a highlighted
loss-recovery section in the Android DebugReporter.
The periodic emitter (client → relay) and Grafana panel are deferred to
Phase 4b — this commit ships the protocol surface, the relay sink, and
the immediate user-visible debug output. Once 4b lands the full path
(emitter → relay → Prometheus → Grafana), the metrics here will
automatically start receiving data.
Scope decision — why not extend QualityReport instead:
The existing wire-format QualityReport is a fixed 4-byte media packet
trailer. Adding counter fields to it would shift the binary layout and
break backward compatibility (old receivers would parse the last 4
bytes of the extended trailer as the QualityReport, corrupting audio). Using a
new SignalMessage variant on the reliable QUIC signal stream sidesteps
the wire-format problem entirely — serde JSON enums tolerate unknown
variants gracefully on old receivers, and the signal channel is the
right layer for periodic telemetry aggregates.
Changes:
wzp-proto/src/packet.rs:
- New SignalMessage::LossRecoveryUpdate variant carrying:
* dred_reconstructions: u64 (monotonic since call start)
* classical_plc_invocations: u64 (monotonic)
* frames_decoded: u64 (for rate calculation)
- All three fields tagged #[serde(default)] for forward compat.
wzp-client/src/featherchat.rs:
- Added a match arm so signal_to_call_type() handles the new
variant (treat as Offer for featherChat bridging purposes).
wzp-relay/src/metrics.rs:
- Two new IntCounterVec metrics on the relay, labeled by session_id:
* wzp_relay_session_dred_reconstructions_total
* wzp_relay_session_classical_plc_total
- New method update_session_loss_recovery(session_id, dred, plc)
applies monotonic deltas: if the incoming totals exceed the
current counter, the difference is inc_by'd. If the incoming
totals are LOWER (client restart or counter reset), the
Prometheus counter holds steady until the client catches up.
This matches the existing update_session_buffer delta pattern.
- remove_session_metrics() now cleans up the two new labels.
- New test session_loss_recovery_monotonic_delta exercises:
* initial population (10 DRED, 2 PLC)
* forward advance (25, 5 → delta +15, +3)
* lower values ignored (client reset → counters unchanged)
* client catches up (30, 8 → advances to new max)
- Existing session_metrics_cleanup test extended to cover the
new counters.
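The monotonic-delta rule above can be sketched with plain u64s standing in for the labeled Prometheus counters (names are illustrative, not the actual wzp-relay API):

```rust
// Sketch of update_session_loss_recovery's delta logic. A plain u64
// stands in for each IntCounterVec label; a real implementation would
// call inc_by(new - prev) on the Prometheus counter instead.
#[derive(Default)]
pub struct SessionLossRecovery {
    dred_total: u64,
    plc_total: u64,
}

impl SessionLossRecovery {
    /// Apply incoming monotonic totals. Counters only move forward:
    /// lower values (client restart / counter reset) are ignored until
    /// the client catches back up past the previous maximum.
    pub fn update(&mut self, dred: u64, plc: u64) {
        if dred > self.dred_total {
            self.dred_total = dred;
        }
        if plc > self.plc_total {
            self.plc_total = plc;
        }
    }

    pub fn totals(&self) -> (u64, u64) {
        (self.dred_total, self.plc_total)
    }
}
```

The assertions mirror the session_loss_recovery_monotonic_delta test cases listed above.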
android/app/src/main/java/com/wzp/debug/DebugReporter.kt:
- Phase 4 users — and incident responders — need to quickly see
whether DRED is actually firing during a call. The stats JSON
already carries the counters (after Phase 3c), but they were
buried in the trailing JSON dump. Added a dedicated
"=== Loss Recovery ===" section to the meta preamble that
extracts dred_reconstructions, classical_plc_invocations,
frames_decoded, and fec_recovered from the JSON and displays
them plainly, plus computed percentages when frames_decoded > 0.
- New extractLongField helper: tiny hand-rolled JSON integer
extractor. We don't want to pull in a full JSON parser for this
single use case and CallStats has a flat, well-known schema.
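The extraction idea, sketched in Rust rather than the Kotlin of the actual helper (name and exact behavior are illustrative):

```rust
// Sketch of the extractLongField idea: find `"field":<digits>` in a
// flat, well-known JSON object without a full parser. Not robust
// against nested objects or string values containing the key —
// acceptable for a flat schema like CallStats.
pub fn extract_long_field(json: &str, field: &str) -> Option<u64> {
    let needle = format!("\"{}\":", field);
    let start = json.find(&needle)? + needle.len();
    let rest = json[start..].trim_start();
    let end = rest
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(rest.len());
    rest[..end].parse().ok()
}
```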
Verification:
- cargo check --workspace: zero errors
- cargo test -p wzp-proto --lib: 63 passing
- cargo test -p wzp-codec --lib: 68 passing
- cargo test -p wzp-client --lib: 35 passing (+1 ignored probe)
- cargo test -p wzp-relay --lib: 68 passing (+1 new Phase 4 test)
- cargo check -p wzp-android --lib: zero errors
- Android APK build verified earlier today (unridden-alfonso.apk
via the remote Docker builder) — Phase 0–3c confirmed to compile
end-to-end on the NDK target.
Phase 4b remaining (not blocking this commit):
- Periodic LossRecoveryUpdate emitter in wzp-client/src/call.rs and
wzp-android/src/engine.rs (every ~5 s)
- Relay-side handler in main.rs that matches the new variant and
calls metrics.update_session_loss_recovery
- Grafana "Loss recovery breakdown" panel in docs/grafana-dashboard.json
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 3b of the DRED integration — wires the Phase 3a FFI primitives
into the desktop receive path. When the jitter buffer reports a missing
Opus frame, CallDecoder now attempts to reconstruct the audio from the
most recently parsed DRED side-channel state before falling through to
classical PLC.
Architectural refinement vs the PRD's literal wording: the PRD said
"jitter buffer takes a Box<dyn DredReconstructor>". After checking deps,
wzp-transport depends only on wzp-proto (not wzp-codec). Putting DRED
state in the jitter buffer would require a new cross-crate dep and
couple the codec-agnostic buffer to libopus. Instead, this commit keeps
the DRED state ring and reconstruction dispatch inside CallDecoder (one
layer up from the jitter buffer), intercepting the existing
PlayoutResult::Missing signal. Same lookahead/backfill semantics,
cleaner layering, zero change to wzp-transport.
Changes:
CallDecoder field type: Box<dyn AudioDecoder> → AdaptiveDecoder.
Required because Phase 3b calls the inherent reconstruct_from_dred
method, which cannot live on the AudioDecoder trait without dragging
libopus DredState through wzp-proto. In practice AdaptiveDecoder was
the only AudioDecoder implementor anyway — the trait abstraction was
buying nothing. Method call sites unchanged because AdaptiveDecoder
also implements AudioDecoder.
New CallDecoder fields:
- dred_decoder: DredDecoderHandle
- dred_parse_scratch: DredState (scratch for parse_into)
- last_good_dred: DredState (cached most-recent valid state)
- last_good_dred_seq: Option<u16>
- dred_reconstructions: u64 (Phase 4 telemetry)
- classical_plc_invocations: u64 (Phase 4 telemetry)
CallDecoder::ingest — on Opus non-repair packets, parse DRED into the
scratch state. On success (samples_available > 0), std::mem::swap the
scratch into last_good_dred and record the seq. This is O(1) per
packet, zero allocation after construction (the two DredState buffers
are allocated once in new() and reused forever).
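The scratch/last_good promotion can be sketched as follows (a stand-in DredState; the real type wraps libopus state, and the names are illustrative):

```rust
use std::mem;

// Sketch of the zero-allocation swap in ingest: both buffers are
// allocated once, and a successful parse is promoted to `last_good`
// with an O(1) mem::swap instead of a clone.
#[derive(Default)]
pub struct DredState {
    pub samples_available: u32,
    pub data: Vec<u8>, // stands in for the real libopus DRED state
}

#[derive(Default)]
pub struct DredCache {
    pub scratch: DredState,
    pub last_good: DredState,
    pub last_good_seq: Option<u16>,
}

impl DredCache {
    /// `parsed_samples` stands in for the result of parse_into on the
    /// scratch state; > 0 means the packet carried usable DRED bytes.
    pub fn ingest(&mut self, seq: u16, parsed_samples: u32) {
        self.scratch.samples_available = parsed_samples;
        if parsed_samples > 0 {
            mem::swap(&mut self.scratch, &mut self.last_good);
            self.last_good_seq = Some(seq);
        }
    }
}
```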
CallDecoder::decode_next — on PlayoutResult::Missing(seq) for Opus
profiles: if last_good_dred_seq > seq and the seq delta × frame_samples
fits within samples_available, call audio_dec.reconstruct_from_dred
and bump dred_reconstructions. Otherwise fall through to classical
PLC and bump classical_plc_invocations. The Codec2 path always falls
through to classical PLC since DRED is libopus-only and
AdaptiveDecoder::reconstruct_from_dred rejects Codec2 tiers
explicitly.
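The eligibility check can be sketched as a pure function (constants and names illustrative; 960 samples is one 20 ms frame at 48 kHz):

```rust
// Sketch of the test decode_next applies before attempting DRED
// reconstruction on PlayoutResult::Missing(seq).
pub fn dred_can_reconstruct(
    missing_seq: u16,
    last_good_seq: Option<u16>,
    samples_available: u32,
    frame_samples: u32, // 960 for 20 ms at 48 kHz
) -> bool {
    match last_good_seq {
        // The cached state must come from a packet *after* the gap
        // (DRED reconstructs backward), and the gap must fit inside
        // the available lookback window.
        Some(good) if good > missing_seq => {
            let delta = (good - missing_seq) as u32;
            delta * frame_samples <= samples_available
        }
        // No state yet, or state predates the gap → classical PLC.
        _ => false,
    }
}
```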
OpusDecoder and AdaptiveDecoder: new inherent reconstruct_from_dred
method that delegates to the underlying DecoderHandle. Needed to
bridge CallDecoder's wzp-client code to the Phase 3a FFI wrappers
without touching the AudioDecoder trait.
CRITICAL FINDING — raised DRED loss floor from 5% to 15%:
Phase 3b testing discovered that libopus 1.5's DRED emission window
scales aggressively with OPUS_SET_PACKET_LOSS_PERC. Empirical data
(see probe_dred_samples_available_by_loss_floor, an #[ignore]'d
diagnostic test in call.rs):
  loss_pct   samples_available   effective_ms
      5%                  720    15 ms  (useless!)
     10%                 2640    55 ms
     15%                 4560    95 ms
     20%                 6480    135 ms
     25%+      8400 (capped)     175 ms (~87% of 200 ms configured)
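The effective_ms column is just the sample count at the 48 kHz decode rate, i.e. 48 samples per millisecond:

```rust
// Effective DRED window in ms = samples_available / 48 (48 kHz audio).
pub fn dred_window_ms(samples_available: u32) -> u32 {
    samples_available / 48
}
```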
The Phase 1 default of 5% produced only a 15 ms reconstruction window
— too small to even cover a single 20 ms Opus frame. DRED was
effectively disabled even though it was emitting bytes. Raised the
floor to 15% (95 ms window) as the minimum that actually provides
single-frame loss recovery. This updates Phase 1's DRED_LOSS_FLOOR_PCT
constant in opus_enc.rs and the accompanying module docstring.
Trade-off: 15% assumed loss slightly increases encoder bitrate overhead
on clean networks. Measured via the existing phase1 bitrate probe:
Before (5% floor): 3649 bytes/sec at Opus 24k + 300 Hz sine
After (15% floor): 3568 bytes/sec at Opus 24k + 300 Hz sine
The delta is within noise — 15% isn't meaningfully more expensive than
5% on this signal, which suggests the DRED emission size is signal-
dependent rather than loss-dependent for small values. Net result: we
get a 6x larger reconstruction window for essentially free.
Tests (+3 DRED recovery, +1 #[ignore]'d probe):
- opus_single_packet_loss_is_recovered_via_dred — full encode → ingest
→ decode_next loop with one packet dropped mid-stream. Asserts
dred_reconstructions ≥ 1 and observes the exact counter deltas.
- opus_lossless_ingest_never_triggers_dred_or_plc — baseline behavior,
lossless stream never takes the Missing branch.
- codec2_loss_falls_through_to_classical_plc — Codec2 never
reconstructs via DRED even if state were populated (which it won't
be — Codec2 packets don't carry DRED bytes).
- probe_dred_samples_available_by_loss_floor — #[ignore]'d diagnostic
that sweeps loss_pct values and prints the resulting DRED window
sizes. Kept for future tuning work.
New CallDecoder introspection accessors (public but undocumented in
the PRD): last_good_dred_seq() and last_good_dred_samples_available()
for test diagnostics and future telemetry surfaces in Phase 4.
Verification:
- cargo check --workspace: zero errors
- cargo test -p wzp-codec --lib: 68 passing (Phase 3a baseline held)
- cargo test -p wzp-client --lib: 35 passing (+3 Phase 3b tests,
+1 ignored diagnostic, no regressions)
Next up: Phase 3c mirrors this on the Android engine.rs receive path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 2 of the DRED integration (docs/PRD-dred-integration.md). With
Phase 1 having enabled DRED on every Opus profile, the app-level RaptorQ
layer is now redundant overhead on those tiers: +20% bitrate, +40–100 ms
receive-side latency (block wait), +CPU for stats we never used. This
phase removes RaptorQ from the Opus encode and decode paths on both the
desktop (wzp-client/call.rs) and Android (wzp-android/engine.rs) sides.
Codec2 tiers keep RaptorQ with their current ratios unchanged — DRED is
libopus-only and Codec2 has no neural equivalent.
Encoder changes (the real bandwidth / CPU win):
- CallEncoder::encode_frame and engine.rs encode loop now gate the
RaptorQ path on !codec.is_opus():
- Opus source packets emit fec_block=0, fec_symbol=0,
fec_ratio_encoded=0 in the MediaHeader
- fec_enc.add_source_symbol is skipped on Opus
- generate_repair + repair packet emission is skipped on Opus
- block_id and frame_in_block counters stay frozen at 0 for Opus
- Codec2 path is byte-for-byte identical to pre-Phase-2 behavior.
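The encode-path gate can be sketched as follows (enum, struct, and field names are illustrative stand-ins for the real MediaHeader types):

```rust
// Sketch of the Phase 2 gate: Opus packets get zeroed FEC header
// fields because DRED replaces RaptorQ on those tiers; Codec2 keeps
// its pre-Phase-2 values byte-for-byte.
#[derive(PartialEq)]
pub enum Codec {
    Opus,
    Codec2,
}

pub struct FecHeader {
    pub fec_block: u32,
    pub fec_symbol: u32,
    pub fec_ratio_encoded: u8,
}

pub fn fec_header_for(codec: &Codec, block: u32, symbol: u32, ratio: u8) -> FecHeader {
    if *codec == Codec::Opus {
        FecHeader { fec_block: 0, fec_symbol: 0, fec_ratio_encoded: 0 }
    } else {
        FecHeader { fec_block: block, fec_symbol: symbol, fec_ratio_encoded: ratio }
    }
}
```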
Decoder changes (mostly cleanup, since both live decoder paths were
already reading audio directly from source packets and only using the
RaptorQ decoder output for stats):
- CallDecoder::ingest skips fec_dec.add_symbol on Opus packets. Source
packets still flow to the jitter buffer; Opus repair packets from old
senders are dropped cleanly (repair packets never hit the jitter
buffer either).
- engine.rs recv loop skips fec_dec.add_symbol, fec_dec.try_decode, and
fec_dec.expire_before on Opus packets. The `fec_recovered` stat
counter becomes Codec2-only (a separate DRED reconstruction counter
lands in Phase 4).
Wire-format backward compat verified at pre-flight:
- Old receiver + new sender: the engine.rs/pipeline.rs path gates on
  non-zero fec_block/fec_symbol, which now never fire for Opus, so the
RaptorQ decoder simply isn't fed. Audio flows normally. Desktop
CallDecoder's old path accumulated packets into the stale-eviction
HashMap, which cleans up after 2s — harmless.
- New receiver + old sender: new receiver skips RaptorQ on Opus so
old-sender repair packets are ignored entirely (no crash, no double-
decode). Loses the (previously vestigial) RaptorQ recovery benefit,
which was never actually active in the audio path. Source packets
still decode normally.
- No wire format version bump required. MediaHeader is unchanged; we
just zero the FEC fields on Opus packets.
Test changes:
- Removed `encoder_generates_repair_on_full_block` — asserted the old
(pre-Phase-2) RaptorQ-on-Opus behavior and is now incorrect. Replaced
with two symmetric tests:
- `opus_source_packets_have_zero_fec_header_fields` — verifies
Phase 2 invariants on Opus packets
- `opus_encoder_never_emits_repair_packets` — runs 20 frames of
non-silent sine wave through a GOOD-profile encoder, asserts
exactly 20 output packets, zero repair
- `codec2_encoder_generates_repair_on_full_block` — same shape as
the old test but on CATASTROPHIC profile (Codec2 1200, 8
frames/block, ratio 1.0) to verify Codec2 path still emits
repairs as before
Verification:
- cargo check --workspace: zero errors
- cargo test -p wzp-codec --lib: 61 passing (Phase 1 baseline held)
- cargo test -p wzp-client --lib: 32 passing (+3 new Phase 2 tests,
-1 old test removed)
- cargo check -p wzp-android --lib: zero errors (host link of
wzp-android tests fails on -llog per pre-existing Android-only
build.rs, unrelated to this work; integration build via
build-and-notify.sh will validate Android end-to-end)
- Pre-existing broken integration test in
crates/wzp-client/tests/handshake_integration.rs (SignalMessage
schema drift) is NOT caused by this commit — baseline had the same
3 compile errors before Phase 2. Flagged as a separate cleanup task.
Expected observable effects on a real call:
- Opus 24k outgoing bitrate drops from ~28.8 kbps (ratio 0.2 RaptorQ)
to ~25 kbps (base 24 kbps + DRED ~1–10 kbps signal-dependent)
- Opus receive-side latency drops ~40 ms on clean network (no more
block wait — jitter buffer emits as soon as a source packet arrives)
- Codec2 calls show no latency or bitrate change
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New feature: call someone directly by fingerprint through the relay.
- Client connects with SNI "_signal" for persistent signaling
- RegisterPresence/RegisterPresenceAck for relay registration
- DirectCallOffer routed to target by fingerprint
- DirectCallAnswer with AcceptGeneric/AcceptTrusted/Reject modes
- Relay creates private room (call-{id}), sends CallSetup to both
- Both clients connect to private room for media (existing SFU path)
- Hangup forwarding + cleanup on disconnect
- Desktop CLI: --signal + --call <fingerprint> for testing
- CallRegistry tracks call state (Pending/Ringing/Active/Ended)
- SignalHub manages persistent signaling connections
Tested: Alice calls Bob by fingerprint, relay routes offer, Bob
auto-accepts, both join private room, media flows bidirectionally.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Time-based dedup (2s TTL) replaces fixed-window dedup — consecutive
senders with same seq numbers no longer collide
- Raw byte forwarding for federation local delivery (no re-serialization)
- Jitter buffer resets on large backward seq jumps (>100)
- recv_media skips malformed datagrams instead of returning connection-closed
- SIGTERM handler for clean QUIC shutdown on wzp-client
- JSONL event log infrastructure (--event-log flag) for protocol analysis
- FEC disabled on GOOD profile for federation debugging (fec_ratio=0.0)
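The backward-jump reset condition can be sketched as a pure check (threshold and names illustrative): a new sender reusing low sequence numbers shows up as a large backward jump, while small backward steps are just reordering.

```rust
// Sketch of the jitter-buffer reset test on an incoming seq.
pub fn should_reset(last_seq: u16, incoming: u16, threshold: u16) -> bool {
    last_seq > incoming && last_seq - incoming > threshold
}
```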
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Connects to a relay over QUIC with SNI "version", reads build hash
from a unidirectional stream, prints "<relay> <git-hash>" and exits.
Usage: wzp-client --version-check 172.16.81.175:4434
Output: 172.16.81.175:4434 8dbda3e
Relay side: detects "version" SNI, opens uni stream, writes
BUILD_GIT_HASH, waits 100ms for client to read, closes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1. CLI client now sends raw room names (no hash), matching Android
JNI and Desktop Tauri. All three clients are now consistent.
2. When a client joins a global room, the relay merges federated
remote participants into the initial RoomUpdate. Previously,
clients that joined after the GlobalRoomActive signal only saw
local participants. Now they see everyone immediately.
3. Added get_remote_participants() to FederationManager for querying
cached remote participants from all peer links.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Major rewrite of relay federation replacing virtual participants with
a clean router model:
1. Global rooms: [[global_rooms]] in TOML config declares rooms that
are bridged across federation. Each relay is a router + local SFU.
2. Room events: RoomManager emits LocalJoin/LocalLeave via broadcast
channel when rooms transition between empty and non-empty.
3. GlobalRoomActive/Inactive signals: relays announce when they have
local participants in global rooms. Peers track active state and
forward media accordingly. Announcements propagate for multi-hop.
4. Media forwarding: separated from SFU loop. Local participant sends
via mpsc channel → egress task → forward_to_peers() → room-hash
tagged datagrams to active peer links. Inbound datagrams delivered
to local participants + forwarded to other active peers (multi-hop).
5. Loop prevention: don't forward back to source relay.
6. Room name hashing: is_global_room() checks both plain name and
hash (clients hash room names for SNI privacy).
Removed: ParticipantSender::Federation, federated_participants, virtual
participant join/leave, periodic room polling. Rooms now only contain
local participants.
Signaling tested: 3-relay chain (A→B←C) correctly propagates
GlobalRoomActive through B to both A and C. Media forwarding plumbing
in place but needs final debugging.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Added [[trusted]] config: relay B can accept inbound federation
from relay A by fingerprint alone, without knowing A's address.
A connects to B with [[peers]], B trusts A with [[trusted]].
- FederationHello signal: outbound connections send their TLS
fingerprint as first signal. The accepting relay verifies it
against [[peers]] (by IP) or [[trusted]] (by fingerprint).
- Tested 3-relay chain: A→B←C. Both A and C connect to B, B trusts
both. B correctly accepts both inbound connections. Room
announcements flow A→B and C→B.
- Remaining: B needs to announce rooms back to A and C on the same
connection so media can flow A→B→C. Currently A has no virtual
participant for B, so media doesn't reach B's SFU for forwarding.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 1 of relay federation:
1. Signal messages: FederationRoomJoin/Leave/ParticipantUpdate added
to SignalMessage enum for relay-to-relay room coordination.
2. Room changes: ParticipantOrigin (Local/Federated) tracking, loop
prevention (federated media only forwards to local participants),
ParticipantSender::Federation with 8-byte room-hash prefixed
datagrams, merged participant lists (local + remote), new methods:
join_federated(), update_federated_participants(), local_senders(),
active_rooms(), local_participants().
3. FederationManager: connects to configured peers via QUIC with SNI
"_federation", reconnects with exponential backoff (5s-300s),
exchanges FederationRoomJoin signals, runs recv loops for both
signals and media datagrams, creates virtual participants in rooms.
4. Accept-side: _federation SNI handling in main.rs, unknown peer
gets helpful "add to relay.toml" log message, recognized peers
handed off to FederationManager.
TODO: TLS fingerprint verification — currently outbound connections
use client_config() which doesn't present a cert, so inbound
verification fails. Need mutual TLS or URL-based peer matching.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CallOffer only advertised GOOD/DEGRADED/CATASTROPHIC. When a
client uses a studio profile, the relay's choose_profile couldn't
pick it. Now advertises all 6 profiles (studio 64k/48k/32k + good +
degraded + catastrophic) in both Android engine and shared handshake.
Also: the relay MUST be rebuilt with the new CodecId variants,
otherwise it will fail to deserialize CallOffer messages containing
studio QualityProfiles in supported_profiles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Settings now uses draft state — changes only persist on explicit Save
- Back button discards unsaved changes
- Added applyServers() for batch server updates
- Added missing alias field to CallOffer in featherchat tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add SettingsScreen with identity (alias, key backup/restore), audio defaults,
server management, network prefs, and default room
- SettingsRepository persists all settings via SharedPreferences
- Auto-generate random display names on first launch (e.g. "Swift Wolf")
- Thread alias through CallOffer → relay handshake → RoomUpdate broadcast
- Derive caller fingerprint from identity key in relay handshake (fixes null
fingerprints when --auth-url is not set)
- Persist identity seed for stable fingerprints across reconnects
- Add alias field to SignalMessage::CallOffer (serde default for backward compat)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New wzp-android crate with Oboe C++ backend, lock-free SPSC ring buffers,
engine orchestrator, codec pipeline, and Android Gradle project structure
- AEC (NLMS adaptive filter), AGC (two-stage with fast attack/slow release),
windowed-sinc FIR resampler replacing linear interpolation (wzp-codec)
- Opus encoder tuning: complexity 7 default, set_expected_loss support
- Mobile jitter buffer: asymmetric EMA (fast up/slow down), handoff spike
detection with 2s cooldown, configurable safety margin
- Network-aware quality control: cellular-specific thresholds, faster
downgrade on cellular, proactive tier drop on WiFi→cellular handoff,
FEC ratio boost during network transitions
- Handoff detection in PathMonitor via RTT jitter spike analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RelayLink: QUIC connection to peer relay (SNI "_relay") for forwarding
specific sessions. Methods: connect, forward, add/remove_session, is_idle.
RelayLinkManager: manages connections to multiple peers.
- get_or_connect: lazy connection establishment
- forward_to: send media packet to specific peer
- register/unregister_session: track which sessions use which links
- Auto-closes idle links on session unregister
Protocol: added SignalMessage::SessionForward { session_id,
target_fingerprint, source_relay } and SessionForwardAck { session_id,
room_name } for relay-link session setup signaling.
Building block for P3-T7 (call setup over mesh) which wires
route resolution + relay links + handshake into a complete flow.
62 relay tests + 42 proto tests passing (7 new relay_link tests).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RouteResolver queries PresenceRegistry to determine how to reach a target:
- Route::Local — connected to this relay
- Route::DirectPeer(addr) — on a directly connected peer relay
- Route::Chain(addrs) — multi-hop (structure ready, single-hop for now)
- Route::NotFound — not in any known relay
Protocol: added SignalMessage::RouteQuery { fingerprint, ttl } and
RouteResponse { fingerprint, found, relay_chain } for peer-to-peer
route queries over probe connections.
HTTP API: GET /route/:fingerprint returns JSON with route type + chain.
Relay handles incoming RouteQuery on probe connections: looks up locally,
replies with RouteResponse. TTL decremented for future multi-hop forwarding.
55 relay tests + 42 proto tests passing (7 new route tests).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PresenceRegistry tracks who is connected where:
- register_local/unregister_local for directly connected users
- update_peer for fingerprints reported by peer relays
- lookup returns Local or Remote(addr)
- expire_stale removes entries older than timeout
Gossip via probe connections:
- New SignalMessage::PresenceUpdate { fingerprints, relay_addr }
- Probes send local fingerprints every 10s alongside Ping/Pong
- Receiving relay updates its remote presence table
HTTP API on metrics port:
- GET /presence — all known fingerprints + locations
- GET /presence/:fingerprint — single lookup
- GET /peers — peer relays + their connected users
Wired into relay main:
- Registry created at startup
- register_local after auth+handshake
- unregister_local on disconnect
- Passed to probe mesh and metrics server
Also marks FC-10 as DONE in integration tracker.
48 relay tests + 42 proto tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Client: sends SignalMessage::Hangup(Normal) before closing in all modes
(send-tone, file mode, silence mode) so the relay knows the session ended.
Relay: downgrades "timed out" / "reset" / "closed" recv errors from
ERROR to INFO since these are normal disconnect scenarios.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove unused imports in featherchat.rs (tracing, QualityProfile)
- Remove unused comfort_noise field from CallEncoder (cn_level is used instead)
- Prefix unused _metrics_file in CliArgs
- Prefix unused _addr in Participant
- Remove unused RoomSlot struct and rooms field from web AppState
- Remove unused HashMap import from web main
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full
T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode
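The full-vs-mini cadence is a simple modulus (interval and names illustrative): a periodic full header lets late joiners and post-loss receivers resynchronize the MiniFrameContext.

```rust
// Sketch of the mini-frame cadence decision in the encoder.
pub fn use_full_header(frame_index: u64, full_interval: u64, mini_enabled: bool) -> bool {
    !mini_enabled || frame_index % full_interval == 0
}
```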
Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough
T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate
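The EMA jitter-tracking idea can be sketched as follows (the smoothing coefficient, margin multiplier, and clamp bounds are illustrative, not the tuned wzp values):

```rust
// Sketch of EMA-based jitter tracking for adaptive playout delay.
pub struct JitterEma {
    ema_ms: f64,
    alpha: f64,
}

impl JitterEma {
    pub fn new(alpha: f64) -> Self {
        Self { ema_ms: 0.0, alpha }
    }

    /// Feed the absolute deviation of one inter-arrival gap from the
    /// nominal frame interval (20 ms).
    pub fn observe(&mut self, deviation_ms: f64) {
        self.ema_ms = self.alpha * deviation_ms + (1.0 - self.alpha) * self.ema_ms;
    }

    /// Target playout delay: a multiple of smoothed jitter, clamped
    /// between a floor and a ceiling.
    pub fn target_delay_ms(&self, min_ms: f64, max_ms: f64) -> f64 {
        (3.0 * self.ema_ms).clamp(min_ms, max_ms)
    }
}
```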
272 tests passing across all crates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WZP-S-4: Room access control
- hash_room_name() in wzp-crypto: SHA-256("featherchat-group:"+name)[:16]
- CLI --room flag hashes before SNI, web bridge does the same
- RoomManager gains ACL: with_acl(), allow(), is_authorized()
- join() returns Result, rejects unauthorized fingerprints
WZP-S-5: Crypto handshake wired into all live paths
- CLI: perform_handshake() after connect, before any mode
- Relay: accept_handshake() after auth, before room join
- Web bridge: perform_handshake() after auth, before audio
- Relay generates ephemeral identity at startup
WZP-S-6: Web bridge featherChat auth
- --auth-url flag: browsers send {"type":"auth","token":"..."} as first WS msg
- Validates against featherChat, passes token to relay
- --cert/--key flags for production TLS (replaces self-signed)
WZP-S-7: wzp-proto standalone
- Cargo.toml uses explicit versions (no workspace inheritance)
- FC can use as git dependency
WZP-S-9: All 6 hardcoded assumptions resolved
- Auth, hashed rooms, mandatory handshake, real TLS certs,
profile negotiation, token validation
CLI also gains --room and --token flags.
179 tests passing across all crates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WZP-S-2: Relay token authentication
- New --auth-url flag: relay calls POST {url} with bearer token
- Clients must send SignalMessage::AuthToken as first signal
- Relay validates against featherChat's /v1/auth/validate endpoint
- Rejects unauthenticated clients before they join rooms
- New auth.rs module with validate_token() + tests
WZP-S-3: featherChat signaling bridge
- New featherchat.rs module for CallSignal interop
- WzpCallPayload: wraps SignalMessage + relay_addr + room name
- encode_call_payload/decode_call_payload for JSON serialization
- CallSignalType enum mirrors featherChat's variant
- signal_to_call_type maps WZP signals to FC types
Protocol: Added SignalMessage::AuthToken { token } variant
129 tests passing across all crates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New identity module (wzp-crypto/src/identity.rs) mirrors featherChat's
warzone-protocol identity.rs exactly:
- Seed: 32 bytes, from hex or BIP39 mnemonic (24 words)
- HKDF derivation: same salt (None), same info strings
- Fingerprint: SHA-256(Ed25519 pub)[:16], same xxxx:xxxx format
- Cross-verified: test proves identity module matches KeyExchange trait
CLI flags:
- --seed <64 hex chars>: use a specific identity
- --mnemonic <24 words>: use BIP39 mnemonic from featherChat
- Without either: generates ephemeral identity
Also adds featherChat as git submodule at deps/featherchat for reference.
32 crypto tests passing (27 original + 5 identity tests).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The --record recv loop was using while-drain which exhausted the jitter
buffer and stopped decoding after the first burst. Now decodes once per
source packet, matching the live mode fix.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- --send-file <file.raw> sends a raw PCM file (48kHz mono s16le) through relay
- Combine with --record: --send-file talk.raw --record echo.raw <relay>
- Fixed all unused import warnings in echo_test.rs
Convert any audio to test format:
ffmpeg -i input.mp3 -ar 48000 -ac 1 -f s16le input.raw
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New --echo-test <secs> flag sends a 440Hz tone through relay echo,
records the return, and analyzes quality in 5-second windows:
- Per-window: frames sent/received, loss %, SNR (dB), correlation
- Detects quality degradation over time (compares first vs second half)
- Reports jitter buffer stats (depth, lost, late packets)
- Diagnoses jitter buffer drift and packet loss accumulation
Also exposes jitter_stats() on CallDecoder for diagnostics.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Reduced jitter buffer min_depth from 25 (500ms) to 3 (60ms) for fast start
- Fixed live recv loop: decode once per source packet instead of draining
the jitter buffer dry (which advanced seq past future packets)
- Fixed Ok(None) handling: connection closed, not "no packet yet"
Live echo test confirmed working with continuous audio.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- --record now handles Ctrl+C: saves PCM file before exiting
- Relay without --remote runs in echo mode (loops packets back to sender)
instead of sink mode, enabling single-relay audio testing
- recv task returns collected PCM via channel for clean file write
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- cpal is now behind an 'audio' feature flag (off by default)
- --live mode requires --features audio at build time
- --send-tone and --record work on headless servers without audio libs
- Linux build script no longer installs libasound2-dev
Build for headless: cargo build --release
Build with mic/speakers: cargo build --release --features audio
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CLI modes:
- --send-tone <secs>: send 440Hz test tone (no mic needed)
- --record <file.raw>: save received audio to raw PCM file
- --help: usage info
- Combine: --send-tone 10 --record out.raw
Raw PCM format: 48kHz mono s16le
Play with: ffplay -f s16le -ar 48000 -ac 1 out.raw
Build scripts:
- scripts/build-linux.sh: Hetzner VPS build with auto-cleanup
- scripts/cleanup-builder.sh: kill stale builders
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The bench tool now auto-calculates the FEC ratio needed to survive
the requested loss percentage, matching how the adaptive quality
controller would behave in production.
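For illustration only, one plausible shape of that calculation — with loss fraction l, the repair-to-source ratio must cover at least l / (1 - l) of the block. The actual bench formula and any safety margin may differ.

```rust
// Illustrative sketch: derive a repair ratio from a target loss
// percentage. `margin` is a hypothetical over-provisioning factor.
pub fn fec_ratio_for_loss(loss_pct: f64, margin: f64) -> f64 {
    let l = (loss_pct / 100.0).clamp(0.0, 0.95);
    (l / (1.0 - l)) * margin
}
```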
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>