Commit Graph

44 Commits

Author SHA1 Message Date
Siavash Sameni
662b14a2af feat(codec): Phase 3b — CallDecoder DRED reconstruction on packet loss
Phase 3b of the DRED integration — wires the Phase 3a FFI primitives
into the desktop receive path. When the jitter buffer reports a missing
Opus frame, CallDecoder now attempts to reconstruct the audio from the
most recently parsed DRED side-channel state before falling through to
classical PLC.

Architectural refinement vs the PRD's literal wording: the PRD said
"jitter buffer takes a Box<dyn DredReconstructor>". After checking deps,
wzp-transport depends only on wzp-proto (not wzp-codec). Putting DRED
state in the jitter buffer would require a new cross-crate dep and
couple the codec-agnostic buffer to libopus. Instead, this commit keeps
the DRED state ring and reconstruction dispatch inside CallDecoder (one
layer up from the jitter buffer), intercepting the existing
PlayoutResult::Missing signal. Same lookahead/backfill semantics,
cleaner layering, zero change to wzp-transport.

Changes:

  CallDecoder field type: Box<dyn AudioDecoder> → AdaptiveDecoder.
  Required because Phase 3b calls the inherent reconstruct_from_dred
  method, which cannot live on the AudioDecoder trait without dragging
  libopus DredState through wzp-proto. In practice AdaptiveDecoder was
  the only AudioDecoder implementor anyway — the trait abstraction was
  buying nothing. Method call sites unchanged because AdaptiveDecoder
  also implements AudioDecoder.

  New CallDecoder fields:
    - dred_decoder: DredDecoderHandle
    - dred_parse_scratch: DredState  (scratch for parse_into)
    - last_good_dred: DredState      (cached most-recent valid state)
    - last_good_dred_seq: Option<u16>
    - dred_reconstructions: u64      (Phase 4 telemetry)
    - classical_plc_invocations: u64 (Phase 4 telemetry)

  CallDecoder::ingest — on Opus non-repair packets, parse DRED into the
  scratch state. On success (samples_available > 0), std::mem::swap the
  scratch into last_good_dred and record the seq. This is O(1) per
  packet, zero allocation after construction (the two DredState buffers
  are allocated once in new() and reused forever).

  CallDecoder::decode_next — on PlayoutResult::Missing(seq) for Opus
  profiles: if last_good_dred_seq > seq and the seq delta × frame_samples
  fits within samples_available, call audio_dec.reconstruct_from_dred
  and bump dred_reconstructions. Otherwise fall through to classical
  PLC and bump classical_plc_invocations. The Codec2 path always falls
  through to classical PLC since DRED is libopus-only and
  AdaptiveDecoder::reconstruct_from_dred rejects Codec2 tiers
  explicitly.

  OpusDecoder and AdaptiveDecoder: new inherent reconstruct_from_dred
  method that delegates to the underlying DecoderHandle. Needed to
  bridge CallDecoder's wzp-client code to the Phase 3a FFI wrappers
  without touching the AudioDecoder trait.

CRITICAL FINDING — raised DRED loss floor from 5% to 15%:

Phase 3b testing discovered that libopus 1.5's DRED emission window
scales aggressively with OPUS_SET_PACKET_LOSS_PERC. Empirical data
(see probe_dred_samples_available_by_loss_floor, an #[ignore]'d
diagnostic test in call.rs):

  loss_pct   samples_available   effective_ms
    5%        720                  15 ms  (useless!)
   10%        2640                 55 ms
   15%        4560                 95 ms
   20%        6480                135 ms
   25%+       8400 (capped)       175 ms  (~87% of 200 ms configured)

The Phase 1 default of 5% produced only a 15 ms reconstruction window
— too small to even cover a single 20 ms Opus frame. DRED was
effectively disabled even though it was emitting bytes. Raised the
floor to 15% (95 ms window) as the minimum that actually provides
single-frame loss recovery. This updates Phase 1's DRED_LOSS_FLOOR_PCT
constant in opus_enc.rs and the accompanying module docstring.

Trade-off: 15% assumed loss slightly increases encoder bitrate overhead
on clean networks. Measured via the existing phase1 bitrate probe:

  Before (5% floor):  3649 bytes/sec at Opus 24k + 300 Hz sine
  After  (15% floor): 3568 bytes/sec at Opus 24k + 300 Hz sine

The delta is within noise — 15% isn't meaningfully more expensive than
5% on this signal, which suggests the DRED emission size is signal-
dependent rather than loss-dependent for small values. Net result: we
get a 6x larger reconstruction window for essentially free.

Tests (+3 DRED recovery, +1 #[ignore]'d probe):
- opus_single_packet_loss_is_recovered_via_dred — full encode → ingest
  → decode_next loop with one packet dropped mid-stream. Asserts
  dred_reconstructions ≥ 1 and observes the exact counter deltas.
- opus_lossless_ingest_never_triggers_dred_or_plc — baseline behavior,
  lossless stream never takes the Missing branch.
- codec2_loss_falls_through_to_classical_plc — Codec2 never
  reconstructs via DRED even if state were populated (which it won't
  be — Codec2 packets don't carry DRED bytes).
- probe_dred_samples_available_by_loss_floor — #[ignore]'d diagnostic
  that sweeps loss_pct values and prints the resulting DRED window
  sizes. Kept for future tuning work.

New CallDecoder introspection accessors (public but undocumented in
the PRD): last_good_dred_seq() and last_good_dred_samples_available()
for test diagnostics and future telemetry surfaces in Phase 4.

Verification:
- cargo check --workspace: zero errors
- cargo test -p wzp-codec --lib: 68 passing (Phase 3a baseline held)
- cargo test -p wzp-client --lib: 35 passing (+3 Phase 3b tests,
  +1 ignored diagnostic, no regressions)

Next up: Phase 3c mirrors this on the Android engine.rs receive path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:55:25 +04:00
Siavash Sameni
d5c298d0b5 feat(codec): Phase 2 — remove RaptorQ from Opus tiers, Codec2 unchanged
Phase 2 of the DRED integration (docs/PRD-dred-integration.md). With
Phase 1 having enabled DRED on every Opus profile, the app-level RaptorQ
layer is now redundant overhead on those tiers: +20% bitrate, +40–100 ms
receive-side latency (block wait), +CPU for stats we never used. This
phase removes RaptorQ from the Opus encode and decode paths on both the
desktop (wzp-client/call.rs) and Android (wzp-android/engine.rs) sides.
Codec2 tiers keep RaptorQ with their current ratios unchanged — DRED is
libopus-only and Codec2 has no neural equivalent.

Encoder changes (the real bandwidth / CPU win):
- CallEncoder::encode_frame and engine.rs encode loop now gate the
  RaptorQ path on !codec.is_opus():
    - Opus source packets emit fec_block=0, fec_symbol=0,
      fec_ratio_encoded=0 in the MediaHeader
    - fec_enc.add_source_symbol is skipped on Opus
    - generate_repair + repair packet emission is skipped on Opus
    - block_id and frame_in_block counters stay frozen at 0 for Opus
- Codec2 path is byte-for-byte identical to pre-Phase-2 behavior.

Decoder changes (mostly cleanup, since both live decoder paths were
already reading audio directly from source packets and only using the
RaptorQ decoder output for stats):
- CallDecoder::ingest skips fec_dec.add_symbol on Opus packets. Source
  packets still flow to the jitter buffer; Opus repair packets from old
  senders are dropped cleanly (repair packets never hit the jitter
  buffer either).
- engine.rs recv loop skips fec_dec.add_symbol, fec_dec.try_decode, and
  fec_dec.expire_before on Opus packets. The `fec_recovered` stat
  counter becomes Codec2-only (a separate DRED reconstruction counter
  lands in Phase 4).

Wire-format backward compat verified at pre-flight:
- Old receiver + new sender: engine.rs pipeline.rs path gates on
  non-zero fec_block/fec_symbol which now never fire for Opus, so the
  RaptorQ decoder simply isn't fed. Audio flows normally. Desktop
  CallDecoder's old path accumulated packets into the stale-eviction
  HashMap, which cleans up after 2s — harmless.
- New receiver + old sender: new receiver skips RaptorQ on Opus so
  old-sender repair packets are ignored entirely (no crash, no double-
  decode). Loses the (previously vestigial) RaptorQ recovery benefit,
  which was never actually active in the audio path. Source packets
  still decode normally.
- No wire format version bump required. MediaHeader is unchanged; we
  just zero the FEC fields on Opus packets.

Test changes:
- Removed `encoder_generates_repair_on_full_block` — asserted the old
  (pre-Phase-2) RaptorQ-on-Opus behavior and is now incorrect. Replaced
  with two symmetric tests:
    - `opus_source_packets_have_zero_fec_header_fields` — verifies
      Phase 2 invariants on Opus packets
    - `opus_encoder_never_emits_repair_packets` — runs 20 frames of
      non-silent sine wave through a GOOD-profile encoder, asserts
      exactly 20 output packets, zero repair
    - `codec2_encoder_generates_repair_on_full_block` — same shape as
      the old test but on CATASTROPHIC profile (Codec2 1200, 8
      frames/block, ratio 1.0) to verify Codec2 path still emits
      repairs as before

Verification:
- cargo check --workspace: zero errors
- cargo test -p wzp-codec --lib: 61 passing (Phase 1 baseline held)
- cargo test -p wzp-client --lib: 32 passing (+3 new Phase 2 tests,
  -1 old test removed)
- cargo check -p wzp-android --lib: zero errors (host link of
  wzp-android tests fails on -llog per pre-existing Android-only
  build.rs, unrelated to this work; integration build via
  build-and-notify.sh will validate Android end-to-end)
- Pre-existing broken integration test in
  crates/wzp-client/tests/handshake_integration.rs (SignalMessage
  schema drift) is NOT caused by this commit — baseline had the same
  3 compile errors before Phase 2. Flagged as a separate cleanup task.

Expected observable effects on a real call:
- Opus 24k outgoing bitrate drops from ~28.8 kbps (ratio 0.2 RaptorQ)
  to ~25 kbps (base 24 kbps + DRED ~1–10 kbps signal-dependent)
- Opus receive-side latency drops ~40 ms on clean network (no more
  block wait — jitter buffer emits as soon as a source packet arrives)
- Codec2 calls show no latency or bitrate change

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:42:33 +04:00
Siavash Sameni
3351cb6473 feat: direct 1:1 calling via relay signaling (Phase 1)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 35s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
New feature: call someone directly by fingerprint through the relay.

- Client connects with SNI "_signal" for persistent signaling
- RegisterPresence/RegisterPresenceAck for relay registration
- DirectCallOffer routed to target by fingerprint
- DirectCallAnswer with AcceptGeneric/AcceptTrusted/Reject modes
- Relay creates private room (call-{id}), sends CallSetup to both
- Both clients connect to private room for media (existing SFU path)
- Hangup forwarding + cleanup on disconnect
- Desktop CLI: --signal + --call <fingerprint> for testing
- CallRegistry tracks call state (Pending/Ringing/Active/Ended)
- SignalHub manages persistent signaling connections

Tested: Alice calls Bob by fingerprint, relay routes offer, Bob
auto-accepts, both join private room, media flows bidirectionally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 05:35:16 +04:00
Siavash Sameni
f935bd69cd fix: rewrite seq/fec for federation-delivered packets
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 2m48s
Mirror to GitHub / mirror (push) Failing after 4m2s
- Time-based dedup (2s TTL) replaces fixed-window dedup — consecutive
  senders with same seq numbers no longer collide
- Raw byte forwarding for federation local delivery (no re-serialization)
- Jitter buffer resets on large backward seq jumps (>100)
- recv_media skips malformed datagrams instead of returning connection-closed
- SIGTERM handler for clean QUIC shutdown on wzp-client
- JSONL event log infrastructure (--event-log flag) for protocol analysis
- FEC disabled on GOOD profile for federation debugging (fec_ratio=0.0)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:55:06 +04:00
Siavash Sameni
5c24adf1c1 feat: remote version query — wzp-client --version-check <relay>
Some checks failed
Mirror to GitHub / mirror (push) Failing after 1m32s
Build Release Binaries / build-amd64 (push) Failing after 2m16s
Connects to a relay over QUIC with SNI "version", reads build hash
from a unidirectional stream, prints "<relay> <git-hash>" and exits.

Usage: wzp-client --version-check 172.16.81.175:4434
Output: 172.16.81.175:4434 8dbda3e

Relay side: detects "version" SNI, opens uni stream, writes
BUILD_GIT_HASH, waits 100ms for client to read, closes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:47:37 +04:00
Siavash Sameni
a3ebf5616f fix: unified raw room names + merged presence on join
Some checks failed
Mirror to GitHub / mirror (push) Failing after 42s
Build Release Binaries / build-amd64 (push) Failing after 2m1s
1. CLI client now sends raw room names (no hash), matching Android
   JNI and Desktop Tauri. All three clients are now consistent.

2. When a client joins a global room, the relay merges federated
   remote participants into the initial RoomUpdate. Previously,
   clients that joined after the GlobalRoomActive signal only saw
   local participants. Now they see everyone immediately.

3. Added get_remote_participants() to FederationManager for querying
   cached remote participants from all peer links.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:09:15 +04:00
Siavash Sameni
b00db5dfdc feat: federation rewrite — global rooms router model
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 1m52s
Major rewrite of relay federation replacing virtual participants with
a clean router model:

1. Global rooms: [[global_rooms]] in TOML config declares rooms that
   are bridged across federation. Each relay is a router + local SFU.

2. Room events: RoomManager emits LocalJoin/LocalLeave via broadcast
   channel when rooms transition between empty and non-empty.

3. GlobalRoomActive/Inactive signals: relays announce when they have
   local participants in global rooms. Peers track active state and
   forward media accordingly. Announcements propagate for multi-hop.

4. Media forwarding: separated from SFU loop. Local participant sends
   via mpsc channel → egress task → forward_to_peers() → room-hash
   tagged datagrams to active peer links. Inbound datagrams delivered
   to local participants + forwarded to other active peers (multi-hop).

5. Loop prevention: don't forward back to source relay.

6. Room name hashing: is_global_room() checks both plain name and
   hash (clients hash room names for SNI privacy).

Removed: ParticipantSender::Federation, federated_participants, virtual
participant join/leave, periodic room polling. Rooms now only contain
local participants.

Signaling tested: 3-relay chain (A→B←C) correctly propagates
GlobalRoomActive through B to both A and C. Media forwarding plumbing
in place but needs final debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:54:38 +04:00
Siavash Sameni
bc8bb3d790 feat: [[trusted]] config + FederationHello for one-sided federation
Some checks failed
Mirror to GitHub / mirror (push) Failing after 34s
Build Release Binaries / build-amd64 (push) Failing after 1m53s
- Added [[trusted]] config: relay B can accept inbound federation
  from relay A by fingerprint alone, without knowing A's address.
  A connects to B with [[peers]], B trusts A with [[trusted]].

- FederationHello signal: outbound connections send their TLS
  fingerprint as first signal. The accepting relay verifies it
  against [[peers]] (by IP) or [[trusted]] (by fingerprint).

- Tested 3-relay chain: A→B←C. Both A and C connect to B, B trusts
  both. B correctly accepts both inbound connections. Room
  announcements flow A→B and C→B.

- Remaining: B needs to announce rooms back to A and C on the same
  connection so media can flow A→B→C. Currently A has no virtual
  participant for B, so media doesn't reach B's SFU for forwarding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 06:49:20 +04:00
Siavash Sameni
6be36e43c2 feat: relay federation infrastructure — room bridging, loop prevention, peer connections
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 2m1s
Phase 1 of relay federation:

1. Signal messages: FederationRoomJoin/Leave/ParticipantUpdate added
   to SignalMessage enum for relay-to-relay room coordination.

2. Room changes: ParticipantOrigin (Local/Federated) tracking, loop
   prevention (federated media only forwards to local participants),
   ParticipantSender::Federation with 8-byte room-hash prefixed
   datagrams, merged participant lists (local + remote), new methods:
   join_federated(), update_federated_participants(), local_senders(),
   active_rooms(), local_participants().

3. FederationManager: connects to configured peers via QUIC with SNI
   "_federation", reconnects with exponential backoff (5s-300s),
   exchanges FederationRoomJoin signals, runs recv loops for both
   signals and media datagrams, creates virtual participants in rooms.

4. Accept-side: _federation SNI handling in main.rs, unknown peer
   gets helpful "add to relay.toml" log message, recognized peers
   handed off to FederationManager.

TODO: TLS fingerprint verification — currently outbound connections
use client_config() which doesn't present a cert, so inbound
verification fails. Need mutual TLS or URL-based peer matching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:30:18 +04:00
Siavash Sameni
c8bcc5c974 fix: advertise studio profiles in handshake supported_profiles
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 2m7s
Mirror to GitHub / mirror (push) Failing after 35s
The CallOffer only advertised GOOD/DEGRADED/CATASTROPHIC. When a
client uses a studio profile, the relay's choose_profile couldn't
pick it. Now advertises all 6 profiles (studio 64k/48k/32k + good +
degraded + catastrophic) in both Android engine and shared handshake.

Also: the relay MUST be rebuilt with the new CodecId variants,
otherwise it will fail to deserialize CallOffer messages containing
studio QualityProfiles in supported_profiles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 19:39:31 +04:00
Claude
7eb136fcb3 fix: settings save button (back=discard), fix missing alias in featherchat tests
- Settings now uses draft state — changes only persist on explicit Save
- Back button discards unsaved changes
- Added applyServers() for batch server updates
- Added missing alias field to CallOffer in featherchat tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 04:30:23 +00:00
Claude
0835c36d0f feat: settings page with persistence, client alias in handshake, fix null fingerprints
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m34s
- Add SettingsScreen with identity (alias, key backup/restore), audio defaults,
  server management, network prefs, and default room
- SettingsRepository persists all settings via SharedPreferences
- Auto-generate random display names on first launch (e.g. "Swift Wolf")
- Thread alias through CallOffer → relay handshake → RoomUpdate broadcast
- Derive caller fingerprint from identity key in relay handshake (fixes null
  fingerprints when --auth-url is not set)
- Persist identity seed for stable fingerprints across reconnects
- Add alias field to SignalMessage::CallOffer (serde default for backward compat)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 03:56:33 +00:00
Claude
8bf073aa80 fix: handle RoomUpdate variant in wzp-client signal type mapping
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m37s
Build Release Binaries / release (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:54:36 +00:00
Claude
e7b1c3372a feat: Android VoIP client — Phase 2 (JNI bridge, Compose UI, AEC pipeline wiring)
- JNI bridge with 8 extern functions (init, startCall, stopCall, setMute,
  setSpeaker, getStats, forceProfile, destroy) with panic catching
- Kotlin engine layer: WzpEngine JNI wrapper, WzpCallback interface,
  CallStats data class with JSON deserialization
- Jetpack Compose UI: InCallScreen with quality indicator (green/yellow/red),
  mute/speaker/hangup buttons, stats overlay, duration timer
- CallActivity with RECORD_AUDIO permission handling, Material3 theme
- CallService foreground service with WakeLock, WiFi lock, notification
- AudioRouteManager for speaker/earpiece/Bluetooth SCO switching
- AEC wired into CallEncoder pipeline: AEC → AGC → denoise → silence → encode
- AEC farend reference fed from decode path to encode path in pipeline
- Engine exposes set_aec_enabled/set_agc_enabled via AtomicBool flags

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:16:38 +00:00
Claude
26e9c55f1f feat: Android VoIP client — Phase 1 (audio quality, network adaptation, crate skeleton)
- New wzp-android crate with Oboe C++ backend, lock-free SPSC ring buffers,
  engine orchestrator, codec pipeline, and Android Gradle project structure
- AEC (NLMS adaptive filter), AGC (two-stage with fast attack/slow release),
  windowed-sinc FIR resampler replacing linear interpolation (wzp-codec)
- Opus encoder tuning: complexity 7 default, set_expected_loss support
- Mobile jitter buffer: asymmetric EMA (fast up/slow down), handoff spike
  detection with 2s cooldown, configurable safety margin
- Network-aware quality control: cellular-specific thresholds, faster
  downgrade on cellular, proactive tier drop on WiFi→cellular handoff,
  FEC ratio boost during network transitions
- Handoff detection in PathMonitor via RTT jitter spike analysis

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:07:55 +00:00
Siavash Sameni
e595fe6591 feat: P3-T6 per-session forwarding — relay links for hop-by-hop media
RelayLink: QUIC connection to peer relay (SNI "_relay") for forwarding
specific sessions. Methods: connect, forward, add/remove_session, is_idle.

RelayLinkManager: manages connections to multiple peers.
- get_or_connect: lazy connection establishment
- forward_to: send media packet to specific peer
- register/unregister_session: track which sessions use which links
- Auto-closes idle links on session unregister

Protocol: added SignalMessage::SessionForward { session_id,
target_fingerprint, source_relay } and SessionForwardAck { session_id,
room_name } for relay-link session setup signaling.

Building block for P3-T7 (call setup over mesh) which wires
route resolution + relay links + handshake into a complete flow.

62 relay tests + 42 proto tests passing (7 new relay_link tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 18:45:36 +04:00
Siavash Sameni
326aa491cc feat: P3-T5 route resolution — find relay path to any fingerprint
RouteResolver queries PresenceRegistry to determine how to reach a target:
- Route::Local — connected to this relay
- Route::DirectPeer(addr) — on a directly connected peer relay
- Route::Chain(addrs) — multi-hop (structure ready, single-hop for now)
- Route::NotFound — not in any known relay

Protocol: added SignalMessage::RouteQuery { fingerprint, ttl } and
RouteResponse { fingerprint, found, relay_chain } for peer-to-peer
route queries over probe connections.

HTTP API: GET /route/:fingerprint returns JSON with route type + chain.

Relay handles incoming RouteQuery on probe connections: looks up locally,
replies with RouteResponse. TTL decremented for future multi-hop forwarding.

55 relay tests + 42 proto tests passing (7 new route tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 18:38:24 +04:00
Siavash Sameni
464e95a4bd feat: P3-T4 relay presence registry — gossip fingerprints across relay mesh
PresenceRegistry tracks who is connected where:
- register_local/unregister_local for directly connected users
- update_peer for fingerprints reported by peer relays
- lookup returns Local or Remote(addr)
- expire_stale removes entries older than timeout

Gossip via probe connections:
- New SignalMessage::PresenceUpdate { fingerprints, relay_addr }
- Probes send local fingerprints every 10s alongside Ping/Pong
- Receiving relay updates its remote presence table

HTTP API on metrics port:
- GET /presence — all known fingerprints + locations
- GET /presence/:fingerprint — single lookup
- GET /peers — peer relays + their connected users

Wired into relay main:
- Registry created at startup
- register_local after auth+handshake
- unregister_local on disconnect
- Passed to probe mesh and metrics server

Also marks FC-10 as DONE in integration tracker.

48 relay tests + 42 proto tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 17:36:55 +04:00
Siavash Sameni
9e7fea7633 test: P2-T1-S5 long-session regression — 60s call with drift/loss assertions
3 tests in crates/wzp-client/tests/long_session.rs:

1. long_session_no_drift — 3000 frames (60s) through full encoder/decoder
   pipeline, asserts >95% decoded, 0 overruns, 0 underruns

2. long_session_with_simulated_loss — drops every 20th packet + reorders,
   asserts >90% decoded, confirms PLC fills gaps (2999/3000)

3. long_session_stats_consistency — verifies stats.total_decoded matches
   actual decoded count over 60s (no accounting drift)

Completes P2-T1-S5. Phase 2 is now fully done.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 20:59:27 +04:00
Siavash Sameni
6310864b0b fix: client sends Hangup before disconnect, relay handles timeouts gracefully
Client: sends SignalMessage::Hangup(Normal) before closing in all modes
(send-tone, file mode, silence mode) so the relay knows the session ended.

Relay: downgrades "timed out" / "reset" / "closed" recv errors from
ERROR to INFO since these are normal disconnect scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:15:47 +04:00
Siavash Sameni
4d2c9838c5 fix: eliminate all compiler warnings across client, relay, web
- Remove unused imports in featherchat.rs (tracing, QualityProfile)
- Remove unused comfort_noise field from CallEncoder (cn_level is used instead)
- Prefix unused _metrics_file in CliArgs
- Prefix unused _addr in Participant
- Remove unused RoomSlot struct and rooms field from web AppState
- Remove unused HashMap import from web main

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:13:48 +04:00
Siavash Sameni
ab8a7f7a96 fix: client exits after --send-tone completes (was hanging on recv task)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:04:44 +04:00
Siavash Sameni
6d5ee55393 fix: install rustls crypto provider in wzp-client (same as relay/web)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:45:26 +04:00
Siavash Sameni
0dc381e948 feat: protocol improvements — live trunking, mini-frames, noise suppression, adaptive jitter
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full

T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode

Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough

T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate

272 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:24:53 +04:00
Siavash Sameni
34cd1017c1 feat: IAX2-inspired protocol improvements — trunking, mini-frames, silence suppression, call control (P2-T6/T7/T8/T9)
WZP-P2-T6: Trunking
- TrunkFrame/TrunkEntry: pack N session packets into one datagram
- Wire format: [count:u16][session_id:2][len:u16][payload]...
- TrunkBatcher: batches by count (10) or bytes (1200), flushes on limit
- 5 tests: encode/decode roundtrip, empty frame, batcher fill/flush, byte limit

WZP-P2-T7: Mini-frames
- MiniHeader: 4-byte delta header (timestamp_delta + payload_len)
- FRAME_TYPE_FULL (0x00) / FRAME_TYPE_MINI (0x01) discriminator
- MiniFrameContext: expands mini-headers to full by tracking baseline
- Saves 8 bytes per packet (5 vs 13 bytes with type prefix)
- 5 tests: encode/decode, wire size, context expand, no baseline, size comparison

WZP-P2-T8: Silence suppression
- SilenceDetector: RMS-based detection with hangover (5 frames = 100ms)
- ComfortNoise: low-level random noise generator
- CodecId::ComfortNoise variant for CN packets
- CallEncoder: suppresses silent frames, sends 1-byte CN every 200ms
- CallDecoder: generates comfort noise on CN packets
- ~50% bandwidth savings in typical conversations
- 6 tests: silence/speech detection, hangover, CN generation, RMS math, suppression

WZP-P2-T9: Call control signals
- SignalMessage: Hold, Unhold, Mute, Unmute, Transfer, TransferAck
- CallSignalType mapping in featherchat.rs for all new variants
- 4 serde roundtrip tests + signal type mapping tests

255 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:13:05 +04:00
Siavash Sameni
39f6908478 feat: Prometheus metrics on relay + web bridge, client JSONL export (T5-S1/S3/S4)
WZP-P2-T5-S1: Relay Prometheus /metrics
- RelayMetrics: active_sessions, active_rooms, packets/bytes_forwarded,
  auth_attempts (ok/fail), handshake_duration histogram
- --metrics-port flag spawns HTTP server
- Wired into auth, handshake, session, and packet forwarding paths
- 2 tests

WZP-P2-T5-S3: Web bridge Prometheus /metrics
- WebMetrics: active_connections, frames_bridged (up/down),
  auth_failures, handshake_latency histogram
- Added /metrics route to existing axum app
- Wired into WS connect/disconnect, auth, handshake, send/recv loops
- 2 tests

WZP-P2-T5-S4: Client --metrics-file JSONL
- ClientMetricsSnapshot with all telemetry fields
- MetricsWriter: writes one JSON line per second to file
- snapshot_from_stats() converts JitterStats to snapshot
- --metrics-file <path> flag
- 3 tests

223 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 12:44:57 +04:00
Siavash Sameni
59a00d371b feat: jitter buffer instrumentation — drift test, telemetry, parameter sweep
WZP-P2-T1-S1: Automated drift measurement
- New drift_test.rs: DriftTestConfig, DriftResult, run_drift_test()
- CLI --drift-test <secs>: sends tone, measures actual vs expected duration
- Interpretation tiers: EXCELLENT (<50ms) / GOOD / FAIR / POOR
- 2 unit tests: drift math verification, config defaults

WZP-P2-T1-S2: Jitter buffer telemetry
- JitterStats gains: total_decoded, underruns, overruns, max_depth_seen
- JitterBuffer: record_underrun(), record_decode(), reset_stats()
- CallDecoder: stats() getter, reset_stats()
- JitterTelemetry: periodic tracing::info! logger with configurable interval
- 4 unit tests: ingestion tracking, underrun tracking, reset, interval

WZP-P2-T1-S3: Parameter sweep
- New sweep.rs: SweepConfig, SweepResult, run_local_sweep()
- Tests 20 jitter buffer configs (5 target × 4 max depths) locally
- CLI --sweep: runs sweep, prints ASCII comparison table
- No network needed — pure encoder→decoder pipeline test
- 3 unit tests: config defaults, local sweep runs, table formatting

216 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:26:40 +04:00
Siavash Sameni
524d1145bb feat: complete WZP Phase 2 (T2/T3/T4) — adaptive quality, AudioWorklet, sessions
WZP-P2-T2: Adaptive quality switching
- QualityAdapter with sliding window of QualityReports
- Hysteresis: 3 consecutive reports before switching profiles
- Thresholds: loss>15%/rtt>200ms→CATASTROPHIC, loss>5%/rtt>100ms→DEGRADED
- CallConfig::from_profile() constructor
- 5 unit tests: good/degraded/catastrophic conditions, hysteresis, recovery

WZP-P2-T3: AudioWorklet migration (web bridge)
- audio-processor.js: WZPCaptureProcessor + WZPPlaybackProcessor
- Capture: buffers 128-sample AudioWorklet blocks → 960-sample frames
- Playback: ring buffer, Int16→Float32 conversion in worklet
- ScriptProcessorNode fallback if AudioWorklet unavailable
- Existing UI preserved (connect, room, PTT)

WZP-P2-T4: Concurrent session management (relay)
- SessionManager tracks active sessions with HashMap
- Enforces max_sessions limit from RelayConfig
- create_session/remove_session lifecycle
- Wired into relay main: session created after auth+handshake,
  cleaned up after run_participant returns
- 7 unit tests: create/remove, max enforced, room tracking, info, expiry

207 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 10:20:51 +04:00
Siavash Sameni
59069bfba2 feat: complete all WZP-S integration tasks (S-4/5/6/7/9)
WZP-S-4: Room access control
- hash_room_name() in wzp-crypto: SHA-256("featherchat-group:"+name)[:16]
- CLI --room flag hashes before SNI, web bridge does the same
- RoomManager gains ACL: with_acl(), allow(), is_authorized()
- join() returns Result, rejects unauthorized fingerprints

WZP-S-5: Crypto handshake wired into all live paths
- CLI: perform_handshake() after connect, before any mode
- Relay: accept_handshake() after auth, before room join
- Web bridge: perform_handshake() after auth, before audio
- Relay generates ephemeral identity at startup

WZP-S-6: Web bridge featherChat auth
- --auth-url flag: browsers send {"type":"auth","token":"..."} as first WS msg
- Validates against featherChat, passes token to relay
- --cert/--key flags for production TLS (replaces self-signed)

WZP-S-7: wzp-proto standalone
- Cargo.toml uses explicit versions (no workspace inheritance)
- FC can use as git dependency

WZP-S-9: All 6 hardcoded assumptions resolved
- Auth, hashed rooms, mandatory handshake, real TLS certs,
  profile negotiation, token validation

CLI also gains --room and --token flags.
179 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:59:05 +04:00
Siavash Sameni
ad16ddb903 feat: WZP-S-2 relay auth + WZP-S-3 featherChat signaling bridge
WZP-S-2: Relay token authentication
- New --auth-url flag: relay calls POST {url} with bearer token
- Clients must send SignalMessage::AuthToken as first signal
- Relay validates against featherChat's /v1/auth/validate endpoint
- Rejects unauthenticated clients before they join rooms
- New auth.rs module with validate_token() + tests

WZP-S-3: featherChat signaling bridge
- New featherchat.rs module for CallSignal interop
- WzpCallPayload: wraps SignalMessage + relay_addr + room name
- encode_call_payload/decode_call_payload for JSON serialization
- CallSignalType enum mirrors featherChat's variant
- signal_to_call_type maps WZP signals to FC types

Protocol: Added SignalMessage::AuthToken { token } variant

129 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:23:46 +04:00
Siavash Sameni
12cdfe6c8a feat: featherChat-compatible identity — seed, mnemonic, fingerprint
New identity module (wzp-crypto/src/identity.rs) mirrors featherChat's
warzone-protocol identity.rs exactly:
- Seed: 32 bytes, from hex or BIP39 mnemonic (24 words)
- HKDF derivation: same salt (None), same info strings
- Fingerprint: SHA-256(Ed25519 pub)[:16], same xxxx:xxxx format
- Cross-verified: test proves identity module matches KeyExchange trait

CLI flags:
- --seed <64 hex chars>: use a specific identity
- --mnemonic <24 words>: use BIP39 mnemonic from featherChat
- Without either: generates ephemeral identity

Also adds featherChat as git submodule at deps/featherchat for reference.

32 crypto tests passing (27 original + 5 identity tests).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:09:38 +04:00
Siavash Sameni
bddcfb1440 fix: remove unused variable warning in cli.rs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:15:09 +04:00
Siavash Sameni
a04b8271cc fix: record mode decode-per-packet (same fix as live mode)
The --record recv loop was using while-drain which exhausted the jitter
buffer and stopped decoding after the first burst. Now decodes once per
source packet, matching the live mode fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:05:10 +04:00
Siavash Sameni
d5390db7af feat: --send-file for real audio testing + fix warnings
- --send-file <file.raw> sends a raw PCM file (48kHz mono s16le) through relay
- Combine with --record: --send-file talk.raw --record echo.raw <relay>
- Fixed all unused import warnings in echo_test.rs

Convert any audio to test format:
  ffmpeg -i input.mp3 -ar 48000 -ac 1 -f s16le input.raw

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:51:55 +04:00
Siavash Sameni
28d5a3a9ad feat: automated echo quality test with time-window analysis
New --echo-test <secs> flag sends a 440Hz tone through relay echo,
records the return, and analyzes quality in 5-second windows:
- Per-window: frames sent/received, loss %, SNR (dB), correlation
- Detects quality degradation over time (compares first vs second half)
- Reports jitter buffer stats (depth, lost, late packets)
- Diagnoses jitter buffer drift and packet loss accumulation

Also exposes jitter_stats() on CallDecoder for diagnostics.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:44:08 +04:00
Siavash Sameni
0723f52d76 fix: live audio playback working — jitter buffer and decode loop fixes
- Reduced jitter buffer min_depth from 25 (500ms) to 3 (60ms) for fast start
- Fixed live recv loop: decode once per source packet instead of draining
  the jitter buffer dry (which advanced seq past future packets)
- Fixed Ok(None) handling: connection closed, not "no packet yet"

Live echo test confirmed working with continuous audio.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 17:09:33 +04:00
Siavash Sameni
b147de5ae9 fix: graceful Ctrl+C recording + relay echo mode
- --record now handles Ctrl+C: saves PCM file before exiting
- Relay without --remote runs in echo mode (loops packets back to sender)
  instead of sink mode, enabling single-relay audio testing
- recv task returns collected PCM via channel for clean file write

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:32:12 +04:00
Siavash Sameni
df80ad5343 fix: make cpal/ALSA optional — headless Linux builds work without libasound
- cpal is now behind an 'audio' feature flag (off by default)
- --live mode requires --features audio at build time
- --send-tone and --record work on headless servers without audio libs
- Linux build script no longer installs libasound2-dev

Build for headless: cargo build --release
Build with mic/speakers: cargo build --release --features audio

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:24:44 +04:00
Siavash Sameni
708fb268bc feat: file-based audio testing + Hetzner build scripts
CLI modes:
- --send-tone <secs>: send 440Hz test tone (no mic needed)
- --record <file.raw>: save received audio to raw PCM file
- --help: usage info
- Combine: --send-tone 10 --record out.raw

Raw PCM format: 48kHz mono s16le
Play with: ffplay -f s16le -ar 48000 -ac 1 out.raw

Build scripts:
- scripts/build-linux.sh: Hetzner VPS build with auto-cleanup
- scripts/cleanup-builder.sh: kill stale builders

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 16:11:59 +04:00
Siavash Sameni
85f472d824 fix: scale FEC ratio with loss rate in benchmarks
The bench tool now auto-calculates the FEC ratio needed to survive
the requested loss percentage, matching how the adaptive quality
controller would behave in production.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:21:21 +04:00
Siavash Sameni
3c99503eb1 fix: IPv6 support, client address family matching, gitignore cleanup
- Client auto-detects IPv4/IPv6 from relay address and binds accordingly
- Relay defaults to 0.0.0.0:4433, use --listen [::]:4433 for IPv6
- .gitignore excludes .claude/, swap files
- Fix pipeline drain infinite loop in benchmarks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 14:17:49 +04:00
Siavash Sameni
79f9ff1596 feat: Phase 3 — crypto handshake, codec2, benchmarks, audio I/O, relay forwarding
E2E crypto handshake:
- Client/relay handshake via SignalMessage (CallOffer/CallAnswer)
- X25519 ephemeral key exchange with Ed25519 identity signatures
- Integration tests proving bidirectional encrypt/decrypt

Codec2 integration:
- Pure Rust codec2 crate (v0.3) — no C bindings needed
- MODE_3200 (160 samples/20ms, 8 bytes) and MODE_1200 (320 samples/40ms, 6 bytes)
- 11 new tests including encode/decode roundtrip and adaptive switching

Relay forwarding:
- Bidirectional client → remote forwarding with pipeline processing
- CLI args: --listen, --remote
- Periodic stats logging, clean shutdown via tokio::select!

Benchmark tool (wzp-bench):
- Codec roundtrip, FEC recovery, crypto throughput, full pipeline benchmarks
- Sine wave PCM generator for realistic testing

Audio I/O (cpal):
- AudioCapture (microphone) and AudioPlayback (speakers) at 48kHz mono
- CLI --live mode: mic → encode → send / recv → decode → speakers

120 tests passing, 0 failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:43:22 +04:00
Siavash Sameni
43d7f70fe9 feat: Phase 2 — relay daemon and client library with integration pipelines
wzp-relay:
- RelayPipeline: ingest → FEC decode → jitter buffer → FEC encode → send
- SessionManager: tracks active calls, idle expiry
- RelayConfig: TOML-based configuration
- Binary: accepts QUIC connections, receives media packets

wzp-client:
- CallEncoder: mic PCM → Opus encode → FEC → MediaPackets
- CallDecoder: MediaPackets → FEC decode → jitter → Opus decode → PCM
- CLI binary: connects to relay, sends test silence frames

99 tests passing across all 7 crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:08:33 +04:00
Siavash Sameni
51e893590c feat: WarzonePhone lossy VoIP protocol — Phase 1 complete
Rust workspace with 7 crates implementing a custom VoIP protocol
designed for extremely lossy connections (5-70% loss, 100-500kbps,
300-800ms RTT). 89 tests passing across all crates.

Crates:
- wzp-proto: Wire format, traits, adaptive quality controller, jitter buffer, session FSM
- wzp-codec: Opus encoder/decoder (audiopus), Codec2 stubs, adaptive switching, resampling
- wzp-fec: RaptorQ fountain codes, interleaving, block management (proven 30-70% loss recovery)
- wzp-crypto: X25519+ChaCha20-Poly1305, Warzone identity compatible, anti-replay, rekeying
- wzp-transport: QUIC via quinn with DATAGRAM frames, path monitoring, signaling streams
- wzp-relay: Integration stub (Phase 2)
- wzp-client: Integration stub (Phase 2)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 12:45:07 +04:00