Commit Graph

7 Commits

Author SHA1 Message Date
Siavash Sameni
aa09275015 feat: WebSocket support in relay — browsers connect directly, no bridge
Implements WS_RELAY_SPEC.md: relay handles both QUIC and WebSocket clients
in shared rooms, eliminating the wzp-web bridge server.

Room abstraction (room.rs):
- New ParticipantSender enum: Quic(transport) | WebSocket(mpsc::Sender)
- send_raw() sends PCM bytes to either transport type
- join_ws() convenience method for WS clients
- Forwarding loops handle mixed QUIC+WS rooms:
  QUIC→QUIC: send_media (trunked if enabled)
  QUIC→WS: send_raw payload bytes
  WS→QUIC: send_raw wraps in MediaPacket
  WS→WS: send_raw binary

WebSocket handler (ws.rs):
- GET /ws/{room} → WebSocket upgrade via axum
- Auth: first msg {"type":"auth","token":"..."} → validates against FC
- mpsc channel bridges room fan-out to WS binary frames
- Session + presence lifecycle matches QUIC path
- Optional static file serving via --static-dir (tower-http ServeDir)

Config: --ws-port 8080, --static-dir ./static
Proto: MediaHeader::default_pcm() for WS→QUIC wrapping

63 relay + 54 proto tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-30 14:38:33 +04:00
Siavash Sameni
0dc381e948 feat: protocol improvements — live trunking, mini-frames, noise suppression, adaptive jitter
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full

T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode

Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough

T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate

272 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 14:24:53 +04:00
Siavash Sameni
a64b79d953 feat: probe mesh mode + Grafana dashboard (T5-S6/S7) — completes T5
WZP-P2-T5-S6: Probe mesh mode
- ProbeMesh coordinator: wraps multiple ProbeRunners, spawns all concurrently
- mesh_summary(): scans registry, formats human-readable health table
- /mesh HTTP endpoint on metrics port alongside /metrics
- --probe-mesh flag, --mesh-status for CLI diagnostics
- Replaces individual probe spawn loop with ProbeMesh::run_all()
- 4 tests: mesh creation, empty/populated summary, zero targets

WZP-P2-T5-S7: Grafana dashboard
- docs/grafana-dashboard.json — importable directly into Grafana
- Row 1: Relay Health (sessions, rooms, packets/s, bytes/s, auth, handshake)
- Row 2: Call Quality (buffer depth, loss%, RTT, underruns per session)
- Row 3: Inter-Relay Mesh (RTT heatmap, loss, jitter, probe up/down)
- Row 4: Web Bridge (connections, frames bridged, auth failures, latency)
- Datasource variable ${DS_PROMETHEUS}, auto-refresh 10s
- Color thresholds: loss 2%/5%, RTT 100ms/300ms, probe up=green/down=red

T5 Telemetry & Observability is now COMPLETE (all 7 subtasks).
235 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:18:50 +04:00
Siavash Sameni
216ebf4a25 feat: per-session metrics + inter-relay health probe (T5-S2/S5)
WZP-P2-T5-S2: Per-session Prometheus metrics
- 5 new per-session gauges/counters: buffer_depth, loss_pct, rtt_ms,
  underruns, overruns — all labeled by session_id
- update_session_quality() reads QualityReport from packet headers
- update_session_buffer() tracks jitter buffer state per session
- remove_session_metrics() cleans up labels on disconnect
- Delta-aware counter increments avoid double-counting
- 2 tests: session_quality_update, session_metrics_cleanup

WZP-P2-T5-S5: Inter-relay health probe
- New probe.rs: ProbeConfig, ProbeMetrics, SlidingWindow, ProbeRunner
- --probe <addr> flag (repeatable) spawns background probe per target
- Sends Ping/s over QUIC, receives Pong, computes RTT/loss/jitter
- SlidingWindow(60): tracks last 60 pings, loss = missed pongs,
  jitter = std deviation of RTT
- Prometheus gauges: wzp_probe_rtt_ms, loss_pct, jitter_ms, up
  with target label
- Probe connections use SNI "_probe" — relay responds with Pong loop,
  skipping auth/handshake
- Auto-reconnect with 5s backoff on disconnect
- 6 tests: metrics_register, rtt/loss/jitter calculation,
  window eviction, empty edge cases

231 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 13:09:52 +04:00
Siavash Sameni
39f6908478 feat: Prometheus metrics on relay + web bridge, client JSONL export (T5-S1/S3/S4)
WZP-P2-T5-S1: Relay Prometheus /metrics
- RelayMetrics: active_sessions, active_rooms, packets/bytes_forwarded,
  auth_attempts (ok/fail), handshake_duration histogram
- --metrics-port flag spawns HTTP server
- Wired into auth, handshake, session, and packet forwarding paths
- 2 tests

WZP-P2-T5-S3: Web bridge Prometheus /metrics
- WebMetrics: active_connections, frames_bridged (up/down),
  auth_failures, handshake_latency histogram
- Added /metrics route to existing axum app
- Wired into WS connect/disconnect, auth, handshake, send/recv loops
- 2 tests

WZP-P2-T5-S4: Client --metrics-file JSONL
- ClientMetricsSnapshot with all telemetry fields
- MetricsWriter: writes one JSON line per second to file
- snapshot_from_stats() converts JitterStats to snapshot
- --metrics-file <path> flag
- 3 tests

223 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 12:44:57 +04:00
Siavash Sameni
ad16ddb903 feat: WZP-S-2 relay auth + WZP-S-3 featherChat signaling bridge
WZP-S-2: Relay token authentication
- New --auth-url flag: relay calls POST {url} with bearer token
- Clients must send SignalMessage::AuthToken as first signal
- Relay validates against featherChat's /v1/auth/validate endpoint
- Rejects unauthenticated clients before they join rooms
- New auth.rs module with validate_token() + tests

WZP-S-3: featherChat signaling bridge
- New featherchat.rs module for CallSignal interop
- WzpCallPayload: wraps SignalMessage + relay_addr + room name
- encode_call_payload/decode_call_payload for JSON serialization
- CallSignalType enum mirrors featherChat's variant
- signal_to_call_type maps WZP signals to FC types

Protocol: Added SignalMessage::AuthToken { token } variant

129 tests passing across all crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 09:23:46 +04:00
Siavash Sameni
43d7f70fe9 feat: Phase 2 — relay daemon and client library with integration pipelines
wzp-relay:
- RelayPipeline: ingest → FEC decode → jitter buffer → FEC encode → send
- SessionManager: tracks active calls, idle expiry
- RelayConfig: TOML-based configuration
- Binary: accepts QUIC connections, receives media packets

wzp-client:
- CallEncoder: mic PCM → Opus encode → FEC → MediaPackets
- CallDecoder: MediaPackets → FEC decode → jitter → Opus decode → PCM
- CLI binary: connects to relay, sends test silence frames

99 tests passing across all 7 crates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 13:08:33 +04:00