Commit Graph

17 Commits

Author SHA1 Message Date
Siavash Sameni
daf7bcd9ba chore(warnings): sweep the workspace — zero warnings on lib + bin targets
Addressed every rustc warning surfaced by \`cargo check --workspace
--release --lib --bins\` on opus-DRED-v2. Split across three
categories:

## Real bugs surfaced by the audit (fix, don't silence)

- **crates/wzp-relay/src/federation.rs** — the per-peer RTT monitor
  task computed \`rtt_ms\` every 5 s and threw it on the floor. The
  \`wzp_federation_peer_rtt_ms\` gauge has been registered in
  metrics.rs the whole time but was never receiving samples, leaving
  the Grafana panel blank. Wired it up: the task now calls
  \`fm_rtt.metrics.federation_peer_rtt_ms.with_label_values(&[&label_rtt]).set(rtt_ms)\`
  on every sample. Fixes three warnings (\`rtt_ms\`, \`fm_rtt\`,
  \`label_rtt\` were all captured for this task and all dead).

## Dead code removal

- **crates/wzp-relay/src/federation.rs** — removed \`local_delivery_seq:
  AtomicU16\` field and its initializer. It was described in comments
  as "per-room seq counter for federation media delivered to local
  clients" but was declared, initialized to 0, and never read or
  written anywhere else. Genuine half-wired feature; deletable with
  zero behavior change.
- **crates/wzp-relay/src/room.rs** — removed \`let recv_start =
  Instant::now()\` at the top of a recv loop that was never read.
  Separate variable \`last_recv_instant\` already measures the actual
  gap that's used for the \`max_recv_gap_ms\` stat.
- **crates/wzp-client/src/cli.rs** — removed \`let my_fp = fp.clone()\`
  from the signal loop setup. Cloned but never used in any match arm.

## Stub-intent warnings (underscore + explanatory comment)

- **crates/wzp-relay/src/handshake.rs** — \`choose_profile\` hardcodes
  \`QualityProfile::GOOD\` and ignores its \`supported\` parameter.
  Comment already documented "Cap at GOOD (24k) for now — studio
  tiers not yet tested for federation reliability". Renamed to
  \`_supported\`, expanded the comment to explicitly note the future
  plan (pick highest supported ≤ relay ceiling).
- **crates/wzp-relay/src/federation.rs** — \`forward_to_peers\` takes
  \`room_name: &str\` but only uses \`room_hash\`. The caller
  (handle_datagram) passes the name for caller-site symmetry with
  other helpers; kept the param shape and underscored the binding
  with a comment noting it's reserved for future per-name logging.

## Cosmetic fixes

- **crates/wzp-relay/src/event_log.rs** — dropped \`use std::sync::Arc\`
  (unused).
- **crates/wzp-relay/src/signal_hub.rs** — trimmed \`use tracing::{info,
  warn}\` to \`use tracing::info\`. Also removed unnecessary \`mut\` on
  \`hub\` binding in the \`register_unregister\` test.
- **crates/wzp-relay/src/room.rs** — trimmed \`use tracing::{debug,
  error, info, trace, warn}\` to \`{error, info, warn}\`. Also removed
  unnecessary \`mut\` on \`mgr\` binding in the \`room_join_leave\` test.
- **crates/wzp-relay/src/main.rs** — removed unnecessary \`mut\` on the
  \`config\` destructured binding from \`parse_args()\`; and dropped
  \`ref caller_alias\` from the \`DirectCallOffer\` match pattern since
  the relay just forwards the full \`msg\` (caller_alias is preserved
  end-to-end, we don't need to read it on the relay).
- **crates/wzp-crypto/tests/featherchat_compat.rs** — dropped
  \`CallSignalType\` from a \`use wzp_client::featherchat::{...}\`
  (unused in the test body). Note: this test file has pre-existing
  compile errors from SignalMessage schema drift unrelated to this
  sweep; that's tracked separately.

## Crate-level annotation

- **crates/wzp-android/src/lib.rs** — added
  \`#![allow(dead_code, unused_imports, unused_variables, unused_mut)]\`
  with a doc block explaining the crate is dead code since the Tauri
  mobile rewrite. The legacy Kotlin+JNI Android app that consumed
  this crate was replaced by desktop/src-tauri (live Android recv
  path) + crates/wzp-native (Oboe bridge). Rather than piecemeal
  cleanup of a crate that shouldn't be maintained, the whole-crate
  allow keeps CI clean until someone removes the crate entirely. Kills
  all 6 wzp-android warnings (4 unused imports/vars, 1 unused \`mut\`
  on a JNI env param, 1 dead \`command_rx\` field) in one line.

## Not touched

- **deps/featherchat/warzone/crates/warzone-protocol/src/x3dh.rs** —
  3 unused-variable warnings in \`alice_spk_secret\`, \`alice_bundle\`,
  \`bob_bundle_bytes\`. This is a vendored third-party submodule;
  upstream's problem, not ours. Would need to be reported to
  featherchat upstream if we care.

## Verification

- \`cargo check --workspace --release --lib --bins\` → 0 warnings, 0 errors
- \`cargo check --workspace --release --all-targets\` → only the 3
  featherchat submodule warnings remain, plus the pre-existing 3
  broken integration tests (SignalMessage schema drift from Phase 2,
  tracked separately and explicitly out of scope).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:28:26 +04:00
Siavash Sameni
54cb6c3b71 feat: relay_label in RoomParticipant + tagged remote participants
Some checks failed
Mirror to GitHub / mirror (push) Failing after 44s
Build Release Binaries / build-amd64 (push) Failing after 2m26s
RoomParticipant.relay_label identifies which relay a participant is
connected to. Local participants have None, federated participants
get tagged with the peer relay's label when storing remote_participants.

This enables clients to group participants by relay in the UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:22:53 +04:00
Siavash Sameni
8080713098 feat: federated presence — RoomUpdate includes remote participants
Some checks failed
Mirror to GitHub / mirror (push) Failing after 42s
Build Release Binaries / build-amd64 (push) Failing after 2m29s
GlobalRoomActive signal now carries participant list from the
announcing relay. When received, the relay:
1. Stores remote participants per peer link
2. Broadcasts merged RoomUpdate to local clients (local + all remote)

This means clients on different relays can now SEE each other in the
participant list. Also fixes build: removed non-existent metric field
references that were added by linter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:52:27 +04:00
Siavash Sameni
d52b8befd6 fix: canonical room hash for federation — handles hashed vs raw room names
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 2m13s
Different clients send different room names:
- Android: raw "general" as SNI
- Desktop: hash_room_name("general") = "f09ae11d..." as SNI

Federation datagrams are tagged with an 8-byte room hash. Previously,
each relay computed the hash from the client-provided room name,
causing mismatches between relays with different client types.

Fix: resolve_global_room() maps any room name (raw or hashed) to the
canonical [[global_rooms]] name. global_room_hash() always uses the
canonical name for federation hashing. handle_datagram uses both raw
and canonical hash matching to find the local room.

Also: run_participant now receives the pre-computed federation_room_hash
so the egress uses the canonical hash, not the client-specific name.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 10:31:26 +04:00
Siavash Sameni
b00db5dfdc feat: federation rewrite — global rooms router model
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 1m52s
Major rewrite of relay federation replacing virtual participants with
a clean router model:

1. Global rooms: [[global_rooms]] in TOML config declares rooms that
   are bridged across federation. Each relay is a router + local SFU.

2. Room events: RoomManager emits LocalJoin/LocalLeave via broadcast
   channel when rooms transition between empty and non-empty.

3. GlobalRoomActive/Inactive signals: relays announce when they have
   local participants in global rooms. Peers track active state and
   forward media accordingly. Announcements propagate for multi-hop.

4. Media forwarding: separated from SFU loop. Local participant sends
   via mpsc channel → egress task → forward_to_peers() → room-hash
   tagged datagrams to active peer links. Inbound datagrams delivered
   to local participants + forwarded to other active peers (multi-hop).

5. Loop prevention: don't forward back to source relay.

6. Room name hashing: is_global_room() checks both plain name and
   hash (clients hash room names for SNI privacy).

Removed: ParticipantSender::Federation, federated_participants, virtual
participant join/leave, periodic room polling. Rooms now only contain
local participants.

Signaling tested: 3-relay chain (A→B←C) correctly propagates
GlobalRoomActive through B to both A and C. Media forwarding plumbing
in place but needs final debugging.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 07:54:38 +04:00
Siavash Sameni
ea51d068e6 feat: --debug-tap for relay packet header logging
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 1m57s
Adds --debug-tap <room> flag (or debug_tap in TOML config) that logs
every media packet's header metadata passing through a room. Use '*'
for all rooms.

Output (via tracing target "debug_tap"):
  TAP room=... dir=in addr=... seq=31 codec=Opus24k ts=520
      fec_block=5 fec_sym=1 repair=false len=65 fan_out=1

Shows: direction, source address, sequence number, codec ID, timestamp,
FEC block/symbol, repair flag, payload size, and fan-out count.
No decryption needed — headers are not encrypted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 06:34:22 +04:00
Siavash Sameni
6be36e43c2 feat: relay federation infrastructure — room bridging, loop prevention, peer connections
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 2m1s
Phase 1 of relay federation:

1. Signal messages: FederationRoomJoin/Leave/ParticipantUpdate added
   to SignalMessage enum for relay-to-relay room coordination.

2. Room changes: ParticipantOrigin (Local/Federated) tracking, loop
   prevention (federated media only forwards to local participants),
   ParticipantSender::Federation with 8-byte room-hash prefixed
   datagrams, merged participant lists (local + remote), new methods:
   join_federated(), update_federated_participants(), local_senders(),
   active_rooms(), local_participants().

3. FederationManager: connects to configured peers via QUIC with SNI
   "_federation", reconnects with exponential backoff (5s-300s),
   exchanges FederationRoomJoin signals, runs recv loops for both
   signals and media datagrams, creates virtual participants in rooms.

4. Accept-side: _federation SNI handling in main.rs, unknown peer
   gets helpful "add to relay.toml" log message, recognized peers
   handed off to FederationManager.

TODO: TLS fingerprint verification — currently outbound connections
use client_config() which doesn't present a cert, so inbound
verification fails. Need mutual TLS or URL-based peer matching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 22:30:18 +04:00
Claude
20922455bd fix: send task crash on QUIC congestion + AEC toggle + debug reporter
Root cause: send_media() returns Err(Blocked) when QUIC congestion
window is full. The send task treated ANY send error as fatal (break),
killing the entire call. Now send errors drop the packet and continue.

Also hardened recv task to survive transient errors and added health
logging (recv gap tracking, periodic stats) to both send and recv.

Relay: added comprehensive debug logging — recv gaps, lock contention,
forward latency, send errors — all per-participant with 5s stats.

Other changes:
- AEC toggle in Settings (persisted, applied on next call)
- Debug report: records call audio (WAV), RMS histogram (CSV), logcat,
  stats. Emailed as zip via Android share intent after call ends.
- Replaced LinearProgressIndicator with Box (compose version compat)
- FileProvider for sharing debug zip attachments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 07:38:56 +00:00
Claude
2d4b8eebd5 feat: RoomUpdate protocol — broadcast participant list on join/leave
- Add RoomUpdate signal message to wzp-proto with participant count + list
- Add RoomParticipant struct (fingerprint + optional alias)
- Store fingerprint/alias in relay Participant struct
- Broadcast RoomUpdate to all room members on join and leave
- Add signal recv task in Android engine to handle RoomUpdate
- Surface room_participant_count + room_participants in CallStats JSON
- Show "X in room" with participant names in Android in-call UI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:12:24 +00:00
Siavash Sameni
aa09275015 feat: WebSocket support in relay — browsers connect directly, no bridge
Implements WS_RELAY_SPEC.md: relay handles both QUIC and WebSocket clients
in shared rooms, eliminating the wzp-web bridge server.

Room abstraction (room.rs):
- New ParticipantSender enum: Quic(transport) | WebSocket(mpsc::Sender)
- send_raw() sends PCM bytes to either transport type
- join_ws() convenience method for WS clients
- Forwarding loops handle mixed QUIC+WS rooms:
  QUIC→QUIC: send_media (trunked if enabled)
  QUIC→WS: send_raw payload bytes
  WS→QUIC: send_raw wraps in MediaPacket
  WS→WS: send_raw binary

WebSocket handler (ws.rs):
- GET /ws/{room} → WebSocket upgrade via axum
- Auth: first msg {"type":"auth","token":"..."} → validates against FC
- mpsc channel bridges room fan-out to WS binary frames
- Session + presence lifecycle matches QUIC path
- Optional static file serving via --static-dir (tower-http ServeDir)

Config: --ws-port 8080, --static-dir ./static
Proto: MediaHeader::default_pcm() for WS→QUIC wrapping

63 relay + 54 proto tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 14:38:33 +04:00
Siavash Sameni
6310864b0b fix: client sends Hangup before disconnect, relay handles timeouts gracefully
Client: sends SignalMessage::Hangup(Normal) before closing in all modes
(send-tone, file mode, silence mode) so the relay knows the session ended.

Relay: downgrades "timed out" / "reset" / "closed" recv errors from
ERROR to INFO since these are normal disconnect scenarios.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:15:47 +04:00
Siavash Sameni
4d2c9838c5 fix: eliminate all compiler warnings across client, relay, web
- Remove unused imports in featherchat.rs (tracing, QualityProfile)
- Remove unused comfort_noise field from CallEncoder (cn_level is used instead)
- Prefix unused _metrics_file in CliArgs
- Prefix unused _addr in Participant
- Remove unused RoomSlot struct and rooms field from web AppState
- Remove unused HashMap import from web main

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 15:13:48 +04:00
Siavash Sameni
0dc381e948 feat: protocol improvements — live trunking, mini-frames, noise suppression, adaptive jitter
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full

T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode

Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough

T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate

272 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:24:53 +04:00
Siavash Sameni
216ebf4a25 feat: per-session metrics + inter-relay health probe (T5-S2/S5)
WZP-P2-T5-S2: Per-session Prometheus metrics
- 5 new per-session gauges/counters: buffer_depth, loss_pct, rtt_ms,
  underruns, overruns — all labeled by session_id
- update_session_quality() reads QualityReport from packet headers
- update_session_buffer() tracks jitter buffer state per session
- remove_session_metrics() cleans up labels on disconnect
- Delta-aware counter increments avoid double-counting
- 2 tests: session_quality_update, session_metrics_cleanup

WZP-P2-T5-S5: Inter-relay health probe
- New probe.rs: ProbeConfig, ProbeMetrics, SlidingWindow, ProbeRunner
- --probe <addr> flag (repeatable) spawns background probe per target
- Sends Ping/s over QUIC, receives Pong, computes RTT/loss/jitter
- SlidingWindow(60): tracks last 60 pings, loss = missed pongs,
  jitter = std deviation of RTT
- Prometheus gauges: wzp_probe_rtt_ms, loss_pct, jitter_ms, up
  with target label
- Probe connections use SNI "_probe" — relay responds with Pong loop,
  skipping auth/handshake
- Auto-reconnect with 5s backoff on disconnect
- 6 tests: metrics_register, rtt/loss/jitter calculation,
  window eviction, empty edge cases

231 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:09:52 +04:00
Siavash Sameni
39f6908478 feat: Prometheus metrics on relay + web bridge, client JSONL export (T5-S1/S3/S4)
WZP-P2-T5-S1: Relay Prometheus /metrics
- RelayMetrics: active_sessions, active_rooms, packets/bytes_forwarded,
  auth_attempts (ok/fail), handshake_duration histogram
- --metrics-port flag spawns HTTP server
- Wired into auth, handshake, session, and packet forwarding paths
- 2 tests

WZP-P2-T5-S3: Web bridge Prometheus /metrics
- WebMetrics: active_connections, frames_bridged (up/down),
  auth_failures, handshake_latency histogram
- Added /metrics route to existing axum app
- Wired into WS connect/disconnect, auth, handshake, send/recv loops
- 2 tests

WZP-P2-T5-S4: Client --metrics-file JSONL
- ClientMetricsSnapshot with all telemetry fields
- MetricsWriter: writes one JSON line per second to file
- snapshot_from_stats() converts JitterStats to snapshot
- --metrics-file <path> flag
- 3 tests

223 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 12:44:57 +04:00
Siavash Sameni
59069bfba2 feat: complete all WZP-S integration tasks (S-4/5/6/7/9)
WZP-S-4: Room access control
- hash_room_name() in wzp-crypto: SHA-256("featherchat-group:"+name)[:16]
- CLI --room flag hashes before SNI, web bridge does the same
- RoomManager gains ACL: with_acl(), allow(), is_authorized()
- join() returns Result, rejects unauthorized fingerprints

WZP-S-5: Crypto handshake wired into all live paths
- CLI: perform_handshake() after connect, before any mode
- Relay: accept_handshake() after auth, before room join
- Web bridge: perform_handshake() after auth, before audio
- Relay generates ephemeral identity at startup

WZP-S-6: Web bridge featherChat auth
- --auth-url flag: browsers send {"type":"auth","token":"..."} as first WS msg
- Validates against featherChat, passes token to relay
- --cert/--key flags for production TLS (replaces self-signed)

WZP-S-7: wzp-proto standalone
- Cargo.toml uses explicit versions (no workspace inheritance)
- FC can use as git dependency

WZP-S-9: All 6 hardcoded assumptions resolved
- Auth, hashed rooms, mandatory handshake, real TLS certs,
  profile negotiation, token validation

CLI also gains --room and --token flags.
179 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:59:05 +04:00
Siavash Sameni
d8330525ef feat: multi-party rooms (SFU) + push-to-talk radio mode
Room-based SFU relay:
- Clients join named rooms (room name from QUIC SNI)
- Each participant's packets forwarded to all others (no mixing)
- Multiple rooms run concurrently on one relay
- Web bridge passes room name from URL path to relay

Push-to-talk (radio mode):
- Toggle "Radio mode" checkbox after connecting
- Hold PTT button or spacebar to transmit
- Release to mute mic (receive-only)
- Works on desktop (spacebar) and mobile (touch)

URL routing:
- /myroom → joins room "myroom"
- Room name input field as fallback

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 20:36:19 +04:00