Commit Graph

465 Commits

Author SHA1 Message Date
Siavash Sameni
217567383d fix(ui): timestamps in logs, proper call debounce, no cross-calling
- Copy/Share log now includes HH:MM:SS timestamps
- callInProgress stays true until call resolves (setup or hangup),
  preventing multiple taps from firing multiple place_call offers
- Block place_call when there's a pending incoming call
- leaveVoice clears all call state (callInProgress, pendingCallId)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:16:20 +04:00
Siavash Sameni
98ed981805 fix(ui): self-call prevention, debounce, codec in stats
- Filter self from lobby list (double-check in renderLobbyUsers)
- Disable "Direct Call" button when tapping own user
- Debounce call button (callInProgress flag prevents double-tap)
- Block calling own fingerprint
- Stats line shows codec names + fps + audio level

The direct call to the other phone failing is likely because
both phones share the same reflexive addr:port on the same NAT,
making determine_role return None (equal addrs). This is an
existing edge case in reflect.rs — not a UI bug.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:10:31 +04:00
Siavash Sameni
01a3133544 fix(ui): drawer buttons, stats fields, nicknames
- Buttons: use text labels (Mic/Spk/End) instead of emoji HTML
  entities that rendered as raw text on Android WebView
- Stats: match Rust CallStatus fields (tx_codec, rx_codec,
  encode_fps, recv_fps, audio_level, spk_muted)
- Nicknames: register_signal sends derive_alias() as the alias
  so other users see "Brave Falcon" instead of "a525:e9b2:..."
- Lobby header shows alias from get_app_info instead of raw fp
- pollStatus uses correct field names from Rust struct

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:00:09 +04:00
Siavash Sameni
25471c694f feat(ui): voice drawer replaces full-screen call UI
Discord-style bottom drawer for voice instead of navigating away:

- "Join Voice" hides the FAB, slides up a persistent bottom bar
- Drawer shows: room name, timer, P2P/Relay badge, level meter
- Controls: mic, speaker, end call — all in the drawer
- Direct call info (identicon, name, P2P badge) shown inline
- Lobby stays visible above the drawer at all times
- Stats line shows codec/packet/FEC info
- Leave voice = drawer slides away, FAB returns

Removed: full-screen call-screen, back button, old participant
list, old mic/speaker/hangup buttons. All voice interaction
happens in the 15% bottom drawer while the lobby stays live.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:47:40 +04:00
Siavash Sameni
a058a83c91 feat(ui): relay list management in settings
Settings now shows relay list with:
- Visual list of all configured relays
- Active relay highlighted in green with "ACTIVE" badge
- Tap a relay to switch (deregisters + reconnects automatically)
- X button to remove a relay (keeps at least 1)
- Add relay with name + address inputs
- Reconnect flow: deregister → clear lobby → auto-connect to new relay

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:37:58 +04:00
Siavash Sameni
9b8013ba7f merge main: PresenceList direct send fix 2026-04-14 18:36:01 +04:00
Siavash Sameni
defd8eab07 fix(signal): send PresenceList directly to new client after ack
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m50s
The broadcast alone wasn't reaching the first client because its
recv loop hadn't started yet when the second client registered.
Now the relay sends PresenceList directly to the new client (right
after RegisterPresenceAck) AND broadcasts to all others.

This guarantees every client gets the full user list:
- New client: via direct send (queued before recv loop starts)
- Existing clients: via broadcast

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:20:37 +04:00
Siavash Sameni
cc23e829b2 feat(ui): handle PresenceList in lobby — show online users
The lobby now populates from PresenceList signal events:
- Relay broadcasts user list on register/deregister
- JS receives "presence_list" signal-event
- Updates lobbyUsers map (excluding self)
- Renders user rows with identicon, name, fingerprint

Users appear in the lobby as soon as they register their
signal channel — no need to join voice first.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:13:45 +04:00
Siavash Sameni
18c204c1ff merge main: PresenceList signal for lobby 2026-04-14 18:13:15 +04:00
Siavash Sameni
1120c7b579 feat(signal): PresenceList broadcast for lobby user discovery
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 7m21s
Mirror to GitHub / mirror (push) Failing after 27s
New signal infrastructure for the lobby-first UI:

- PresenceUser struct: { fingerprint, alias }
- SignalMessage::PresenceList: relay broadcasts full user list
  to all signal clients on every register/deregister
- SignalHub::presence_list(): builds the list from connected clients
- SignalHub::broadcast(): sends to ALL signal clients
- Relay calls broadcast on register + unregister
- Desktop emits "presence_list" signal-event to JS frontend

This gives clients real-time visibility of who's online via the
signal channel, without needing to join a voice room first.

603 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:12:47 +04:00
Siavash Sameni
7e7391fdbb feat(ui): lobby-first main.ts rewrite for experimental-ui
Complete JS rewrite for IRC-style lobby flow:

- Auto-connect signal channel on app launch (no connect button)
- Lobby shows online users with identicon, name, voice status
- "Join Voice" FAB toggles room voice on/off
- Tap user → context menu → Direct Call
- Incoming call banner slides up from bottom
- Back button returns from call to lobby
- Settings panel preserved with all debug toggles

~500 lines (down from 1786) — focused on the lobby experience.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:52:51 +04:00
Siavash Sameni
aa0362f318 feat(ui): lobby-first HTML/CSS layout for experimental-ui
New IRC-style lobby layout:
- Auto-connect on launch, drop into user list
- User rows with identicon, name, fingerprint, voice status
- Speaking indicator (green highlight + pulsing)
- Join Voice FAB (green, toggles to Leave/red)
- Incoming call banner (slides up from bottom)
- User context menu (tap user → Call / Message)
- Settings panel preserved from original

The old connect-screen HTML is removed. The call-screen is kept
intact. JS adaptation next.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:43:15 +04:00
Siavash Sameni
bb23976076 feat(quality): upgrade negotiation + asymmetric quality signals (#28, #29, #30)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 31s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
New SignalMessage variants for P2P quality coordination:

UpgradeProposal/UpgradeResponse/UpgradeConfirm (#28):
- Consensual quality upgrade flow — proposer sends desired profile,
  peer accepts/rejects based on own conditions, confirm commits both
- All carry call_id for relay routing

QualityCapability (#30):
- Peer reports its max sustainable profile — enables asymmetric
  encoding where each side uses its own best quality instead of
  forcing everyone to the weakest link

Relay forwards all 4 signals to the call peer (same pattern as
MediaPathReport, CandidateUpdate, HardNatProbe).

Desktop signal recv loop handles all 4 with debug logging.
Encoder switching TODOs noted for wiring into CallEngine.

4 new serde roundtrip tests. 603 total, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:25:34 +04:00
Siavash Sameni
18e5e75f33 feat(analyzer): encrypted payload decoding in replay mode (#17)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 20s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
When --key <64-char-hex> is provided with --replay, the analyzer
decrypts each packet's ChaCha20-Poly1305 payload using the session
key and logs plaintext frame sizes. Prints first 5 + every 100th
decrypt result, and a summary at the end.

This completes all 5 protocol analyzer tasks (#13-17):
- #13: Observer mode (live passive listener) — was done
- #14: TUI with Ratatui (per-participant panels) — was done
- #15: Capture and replay (.wzp format) — was done
- #16: HTML report (Chart.js loss/jitter graphs) — was done
- #17: Encrypted decode (--key for replay) — done now

Usage:
  wzp-analyzer --replay session.wzp --key <64-hex-chars> --html report.html

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:07:43 +04:00
Siavash Sameni
488efcb614 feat(ui): birthday attack toggle in settings (default off)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 22s
Build Release Binaries / build-amd64 (push) Failing after 3m36s
New setting: "Birthday attack (opens extra ports for hard NAT)"
- Default: OFF — no extra latency on call setup
- When ON: waits up to 3s for peer's birthday ports if peer has
  non-cone NAT, adds them to the dial race

Gated end-to-end: Settings → localStorage → JS invoke →
Rust connect param → birthday wait + target injection.
LAN/cone calls unaffected regardless of setting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:54:22 +04:00
Siavash Sameni
8c360186df feat(nat): wire birthday attack end-to-end into connect flow
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m19s
Complete Dialer-side birthday attack integration:

- SignalState stores peer_birthday_ports from HardNatBirthdayStart
- connect command: if peer's HardNatProbe shows non-cone NAT, waits
  up to 3s for birthday ports to arrive (Acceptor needs time to open
  32 sockets + STUN-probe each)
- When birthday ports arrive, generate_dialer_targets() builds hit
  list (known ports + random fill) and adds them to PeerCandidates
- All birthday targets go into the dual-path race as extra candidates
- LAN/cone calls skip the wait entirely (gated on allocation type)

Full waterfall now:
1. Standard candidates (reflexive + mapped)     → immediate
2. Port prediction (sequential delta)           → immediate
3. Birthday targets (if non-cone peer)          → +3s wait
4. All of above raced in parallel via JoinSet
5. Relay runs concurrently with 500ms head-start

599 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:50:11 +04:00
Siavash Sameni
f06f9073ae feat(nat): birthday attack module + HardNatBirthdayStart signal (#86, #87)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Birthday attack for random symmetric NATs:
- birthday.rs: open_acceptor_ports() opens N sockets, STUN-probes
  each to learn external ports. generate_dialer_targets() builds
  hit list (known ports first, then random fill). spray_dialer()
  sprays QUIC connects with rate limiting, first success wins.
- Default: 32 acceptor ports, 128 dialer probes, 20ms interval

Signal coordination:
- HardNatBirthdayStart { acceptor_ports, external_ip } sent by
  Acceptor when peer's HardNatProbe shows random/sequential NAT
- Relay forwards it like other call signals
- Desktop recv loop handles and logs it

Hybrid waterfall integration:
- On receiving HardNatProbe with non-cone allocation, Acceptor
  auto-opens birthday ports and sends BirthdayStart
- Sockets kept alive 10s for NAT mapping persistence
- Dialer spray integration into race() pending (needs transport
  hot-swap for background upgrade)

6 new tests, 599 total, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:44:36 +04:00
Siavash Sameni
6c49d7436f feat(ui): direct-only mode setting (no relay fallback)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m38s
New toggle in Settings → "Direct-only mode (no relay fallback)":
- Default: OFF (normal behavior, relay fallback on P2P failure)
- When ON: connect returns error if P2P fails, with full
  candidate_diags in the debug log showing why each candidate
  failed. Call never falls back to relay.

Useful for testing NAT traversal — you see the exact failure
reason instead of the call silently working through relay.

Wired end-to-end:
- Settings.directOnly persisted in localStorage
- Passed as directOnly param to Rust connect command
- connect:path_negotiated shows direct_only flag
- connect:direct_only_failed emits on failure with diags

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:04:45 +04:00
Siavash Sameni
1de280fe04 fix(nat): working NAT tickle + smart filter debug + timeout diags
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
Fixes from real-world 5G↔Starlink testing:

NAT tickle fix:
- tokio::net::UdpSocket::bind() doesn't set SO_REUSEADDR, so binding
  to the same port as quinn silently failed. Now uses socket2::Socket
  with explicit SO_REUSEADDR + SO_REUSEPORT (via libc on unix).
- Tickle now logs success/failure for debugging.

Diagnostic fixes:
- connect:dual_path_race_start shows both dial_order_raw and
  dial_order_smart so we can see what filtering removed
- Grace-period timeout (relay wins first, direct still running)
  now fills "timeout:grace" diags for unrecorded candidates
- Previously candidate_diags was empty when relay won the race

Dependencies:
- Added socket2 = "0.5" to wzp-client

593 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:58:13 +04:00
Siavash Sameni
bc6d327ebb feat(nat): smart candidate filtering + acceptor NAT tickle + 4s timeout
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
Major P2P improvements for cross-network calls:

Smart candidate filtering (smart_dial_order):
- Strip LAN candidates when peer's public IP differs from ours
  (172.16.x.x is unreachable from a different network)
- Strip all IPv6 candidates (Phase 7 disabled, wastes dial slots)
- Only keep mapped + reflexive for cross-network calls
- LAN candidates preserved when both peers share the same public IP

Acceptor NAT tickle:
- A-role sends a 1-byte UDP packet to each peer candidate BEFORE
  accepting. This opens the NAT pinhole for return traffic from
  the Dialer's IP — critical for address-restricted NATs that only
  allow inbound from IPs they've seen outbound traffic to.
- Uses SO_REUSEADDR on the same port as the quinn endpoint.

Direct timeout increased from 2s to 4s:
- Cross-network QUIC handshakes through CGNAT can take 2-3s
- 2s was too aggressive for 5G/LTE networks

Diagnostic fix:
- Record "timeout:4s" for candidates still in-flight when the
  timeout fires (previously these had no diagnostic entry)

5 new tests for smart_dial_order edge cases.
593 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:42:02 +04:00
Siavash Sameni
c478224d67 fix(ui): remove buffer clear that wiped connect events
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m35s
The callDebugBuffer.length=0 in showCallScreen() ran AFTER the
connect command returned, wiping all connect: events (path_negotiated,
race_start, race_done, candidate_diags). Only media: events survived
because they arrived after the clear.

Removed all automatic buffer clearing. The reverse().find() already
handles stale data by picking the most recent event. The manual
"Clear log" button (line 624) is the only way to clear now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:25:13 +04:00
Siavash Sameni
16dcc75514 fix(ui): move buffer clear from call-end to call-start
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 3m42s
Clearing callDebugBuffer in showConnectScreen() wiped all debug
events the moment a call ended, so the user saw empty logs. Moved
the clear to showCallScreen() instead — the buffer is reset at the
START of a new call, not the end. This way:

- After hanging up, all events from the call are still visible
- Starting a new call clears stale data from the previous one
- The reverse().find() for P2P badge still gets fresh data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:17:16 +04:00
Siavash Sameni
db5751985e fix(ui): replace findLast with reverse().find() for WebView compat
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
findLast() requires Chrome 97+ / Android WebView 97+. Older Android
devices crash with TypeError in pollStatus(), killing all status
updates including the debug log. Use [...arr].reverse().find() which
works everywhere.

Also pass peerMappedAddr in the direct-call connect invoke.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 15:06:07 +04:00
Siavash Sameni
c0dd6c06ff feat(debug): per-candidate dial diagnostics in dual-path race
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m24s
Added CandidateDiag struct to RaceResult with per-candidate:
- address attempted
- result (ok / skipped:ipv6 / error:reason)
- elapsed time in ms

Surfaced in call-debug events:
- connect:dual_path_race_start now includes dial_order + peer_mapped
- connect:dual_path_race_done now includes candidate_diags array

Upgraded dual_path tracing from debug to info for IPv6 skips and
dial failures so they appear in logcat/console.

Helps diagnose why P2P fails on specific networks (5G CGNAT,
address-restricted NATs, etc).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:16:34 +04:00
Siavash Sameni
6805caae0e fix(ui): P2P badge showing stale status from previous call
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m47s
The callDebugBuffer persisted across calls, so .find() returned the
path_negotiated event from Call 1 (P2P Direct) when rendering the
badge during Call 2 (Relay). Two fixes:

1. Clear callDebugBuffer in showConnectScreen() between calls
2. Use .findLast() instead of .find() so the most recent event wins

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 12:02:06 +04:00
Siavash Sameni
5a03da72d3 feat(ui): selectable NAT detection mode + netcheck Tauri command
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
detect_nat_type now accepts optional `mode` parameter:
- "relay" — relay-based Reflect only (original behavior)
- "stun" — public STUN servers only (no relay needed)
- "both" — relay + STUN in parallel (default, highest confidence)

New run_netcheck Tauri command exposes the full network diagnostic
(NAT type, IPv4/v6, port mapping, relay latencies, port allocation)
to the JS frontend.

JS usage:
  await invoke('detect_nat_type', { relays, mode: 'stun' })
  await invoke('run_netcheck', { relays })

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:43:17 +04:00
Siavash Sameni
e3e63a40a0 feat(nat): wire hard NAT port prediction into call flow (#85)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m27s
End-to-end integration of sequential port prediction:

- place_call: spawns background detect_port_allocation() + sends
  HardNatProbe signal after offer (doesn't delay call setup)
- answer_call: same for AcceptTrusted answers (privacy mode skips)
- Signal recv loop: stashes HardNatProbe in SignalState.peer_hard_nat_probe
- connect: reads peer's probe, if Sequential{delta} runs predict_ports()
  and adds predicted addrs to PeerCandidates.local for the dual-path race
- parse_sequential_delta() helper for "sequential(delta=N)" strings

The full flow: both peers independently detect their NAT's port
allocation, exchange HardNatProbe via relay, and the connect command
uses the peer's sequence to predict which ports to dial — all before
the dual-path race starts.

588 tests pass, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:39:40 +04:00
Siavash Sameni
7b4bce69d5 docs: update all docs for hard NAT detection + relay wiring
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m36s
- PROGRESS.md: hard NAT Phase A, relay cross-wiring, 588 tests
- ARCHITECTURE.md: hard NAT port prediction diagram + pattern table
- PRD-p2p-direct.md: Phase 8.6 split into a/b/c/d with status
- PRD-hard-nat.md: Phase A done, B signal ready, effort table updated
- PRD-netcheck.md: port_allocation field + probe documented

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:33:12 +04:00
Siavash Sameni
ec1bdf3cd5 feat(nat): hard NAT port allocation detection + prediction + HardNatProbe signal (#29)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 31s
Build Release Binaries / build-amd64 (push) Failing after 3m30s
Phase A of hard NAT traversal (PRD-hard-nat.md):

- PortAllocation enum: PortPreserving / Sequential{delta} / Random / Unknown
- detect_port_allocation(): sequential STUN probes from single socket,
  analyzes port sequence for allocation pattern
- classify_port_allocation(): pure function with jitter tolerance,
  wraparound handling, 60% threshold for noisy sequences
- predict_ports(): generates target port range from last_port + delta
- HardNatProbe signal message: carries port_sequence, allocation
  pattern, external_ip for peer coordination
- Relay forwards HardNatProbe to call peer
- Netcheck gains port_allocation field + format_report display

588 tests pass (17 new), 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:29:35 +04:00
Siavash Sameni
ee14862376 docs: add PRD for hard NAT traversal (port prediction + birthday attack)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 22s
Build Release Binaries / build-amd64 (push) Failing after 3m26s
4-phase design:
A. Port allocation pattern detection (sequential vs random)
B. Sequential port prediction (~80% success, <2s)
C. Birthday attack for random NATs (98% success, ~10s)
D. Hybrid waterfall with background relay-to-direct upgrade

Taskmaster tasks #84-87 added.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:20:19 +04:00
Siavash Sameni
f83361895e docs: add PRDs for Phase 8 Tailscale-inspired features
Some checks failed
Mirror to GitHub / mirror (push) Failing after 23s
Build Release Binaries / build-amd64 (push) Failing after 3m35s
5 new PRDs:
- PRD-public-stun.md — RFC 5389 STUN client
- PRD-portmap.md — NAT-PMP/PCP/UPnP port mapping
- PRD-ice-regather.md — Mid-call ICE re-gathering
- PRD-netcheck.md — Network diagnostic
- PRD-relay-selection.md — Region-based relay selection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:08:46 +04:00
Siavash Sameni
0857d190ed chore: rename legacy Android build script to prevent accidental use
Some checks failed
Mirror to GitHub / mirror (push) Failing after 30s
Build Release Binaries / build-amd64 (push) Failing after 3m23s
build-android-docker.sh builds the old Kotlin app in android/app/
(18M APK), not the live Tauri app (209M). Renamed to
build-android-docker-LEGACY.sh so it's never picked by accident.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:42:23 +04:00
Siavash Sameni
5d431c0721 fix(android): restore tauri::Emitter import for Docker builder toolchain
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Has been cancelled
Edition 2024 on local macOS auto-resolves the Emitter trait, but the
Docker builder's Rust/Tauri version requires the explicit import for
AppHandle::emit() to resolve. Keeps the warning locally to avoid
breaking the CI build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:34:23 +04:00
Siavash Sameni
8fcf1be341 feat(nat): Tailscale-inspired STUN/ICE + port mapping + mid-call re-gathering (#28)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 23s
Build Release Binaries / build-amd64 (push) Failing after 6m8s
Phase 8: 5 new modules bringing NAT traversal close to Tailscale's approach.

- stun.rs: RFC 5389 STUN client — public server reflexive discovery,
  XOR-MAPPED-ADDRESS parsing, parallel probe with retry, STUN fallback
  in desktop try_reflect_own_addr()
- portmap.rs: NAT-PMP (RFC 6886) + PCP (RFC 6887) + UPnP IGD port
  mapping — gateway discovery, acquire/release/refresh lifecycle,
  new PeerCandidates.mapped candidate type in dial order
- ice_agent.rs: candidate lifecycle — gather(), re_gather(),
  apply_peer_update() with monotonic generation counter,
  CandidateUpdate signal message forwarded by relay
- netcheck.rs: comprehensive diagnostic — NAT type, IPv4/v6,
  port mapping availability, relay latencies, CLI --netcheck
- relay_map.rs: RTT-sorted relay map, preferred() selection,
  populate_from_ack() for RegisterPresenceAck.available_relays

Relay: CallRegistry stores + cross-wires caller/callee_mapped_addr
into CallSetup.peer_mapped_addr. Region config + available_relays
populated from federation peers in RegisterPresenceAck.

Desktop: place_call/answer_call call acquire_port_mapping() and
fill caller/callee_mapped_addr. STUN+relay combined NAT detection.

571 tests pass (66 new), 0 regressions, 0 warnings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 10:17:17 +04:00
Siavash Sameni
9377a9009c feat(quality): bandwidth probing for upward adaptive quality (#10)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 3m36s
After 30s stable at a tier, the AdaptiveQualityController actively
probes the next tier up by switching the encoder and observing for 5s.
If loss/RTT stay within the target tier's thresholds, the upgrade
commits. If >1 bad report, the probe aborts with a 60s cooldown.

Probing is disabled on cellular (studio tiers aren't classified there)
and skipped when already at Studio64k (highest tier).

This complements the passive upgrade path (10 consecutive good reports)
by actively discovering that a path can sustain higher quality, rather
than waiting for the classification to drift upward.

New: ProbeState struct, check_probe() method, 4 constants, 5 tests.
377 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:47:21 +04:00
Siavash Sameni
4471797edf docs: update all PRDs and PROGRESS to current state (2026-04-13)
Some checks failed
Mirror to GitHub / mirror (push) Has been cancelled
Build Release Binaries / build-amd64 (push) Has been cancelled
Updated 6 PRDs with implementation status:
- PRD-adaptive-quality: P2P quality done, bandwidth probing remains
- PRD-protocol-analyzer: all 5 phases documented
- PRD-relay-concurrency: DashMap + clone-before-send done
- PRD-p2p-direct: P2P adaptive quality update
- PRD-engine-dedup: all phases done
- PROGRESS.md: test count 372+, 3 new change sections

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:40:56 +04:00
Siavash Sameni
425c67a08a feat(analyzer): replay, HTML report, encrypted decode stub (#15, #16, #17)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 3m31s
#15 - Replay mode: --replay <file.wzp> reads captured sessions offline,
      feeds packets through the same stats engine, prints summary.
      CaptureReader mirrors CaptureWriter's binary format.

#16 - HTML report: --html <report.html> generates self-contained HTML
      with Chart.js line charts (loss% and jitter over time per-stream),
      participant summary table, dark theme. Works with live sessions
      (after exit) or replay mode.

#17 - Encrypted decode: --key <hex> flag accepted and stored. Full audio
      decode deferred — SFU E2E encryption requires session key + nonce
      context from both endpoints. Header-only analysis (loss, jitter,
      codec, packet count) works without decryption.

Usage:
  wzp-analyzer --replay session.wzp --html report.html
  wzp-analyzer relay:4433 --room test --capture out.wzp --html report.html

372 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:31:28 +04:00
Siavash Sameni
88ca3e099a feat: wzp-analyzer binary — protocol analyzer with TUI (#13, #14, #15)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m20s
New binary: wzp-analyzer joins a room as a passive observer and displays
real-time per-participant quality metrics.

Features:
- Passive observation: connects to relay, receives all media, never sends
- Participant detection: identifies senders by sequence number streams
- Per-participant stats: packets, loss%, jitter, codec, codec switches
- TUI mode (ratatui): color-coded table (green/yellow/red by loss),
  10 FPS refresh, session header, quit with q/Ctrl+C
- No-TUI mode: prints stats to stderr every 2s (for headless/CI use)
- Capture mode: binary .wzp format with microsecond timestamps for
  offline replay (magic WZP\x01, JSON header, per-packet records)
- Session summary on exit

Usage:
  wzp-analyzer 193.180.213.68:4433 --room general
  wzp-analyzer 193.180.213.68:4433 --room general --no-tui --duration 60
  wzp-analyzer 193.180.213.68:4433 --room general --capture session.wzp

372 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:26:46 +04:00
Siavash Sameni
1e82811cc1 feat(p2p): adaptive quality on direct calls (#23)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m37s
P2P calls now adapt codec quality based on observed network conditions,
matching what relay calls already had.

Three-layer implementation:
- QualityReport::from_path_stats(): construct reports from local quinn
  stats (loss%, RTT, jitter) without needing relay-generated reports
- CallEncoder.pending_quality_report: one-shot attachment to next
  source packet (consumed on encode, not repeated)
- Engine send tasks: generate quality report every 50 frames (~1s)
  from quinn_path_stats() and attach via set_pending_quality_report()
- Engine recv tasks: self-observe from own QUIC path stats every 50
  packets, feed to AdaptiveQualityController for P2P adaptation
  (works even if peer isn't sending quality reports yet)

Both relay and P2P calls now have adaptive quality. On relay calls,
both peer-sent reports AND local observations feed the controller.
Hysteresis (3 consecutive bad reports to downgrade) prevents thrashing.

372 tests passing (+4 new: from_path_stats encoding, clamping, zero
values, encoder quality report attachment).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:14:06 +04:00
Siavash Sameni
81b5522942 refactor: clap CLI parser, safety docs, dead code docs, cross-refs
Some checks failed
Mirror to GitHub / mirror (push) Failing after 26s
Build Release Binaries / build-amd64 (push) Failing after 4m1s
Audit items 6, 8, 9, 10:

#6 - Relay CLI: replaced 154-line manual parse_args() with clap derive
     (13 flags/options preserved, auto --help, --version from build hash)
#8 - wzp-native: added # Safety docs to all 3 unsafe extern "C" fns
#9 - wzp-crypto: documented x25519_static_secret/public as reserved for
     future static-key federation auth (not dead code, intentionally unused)
#10 - Cross-references between quality.rs ↔ dred_tuner.rs module docs

368 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:40:49 +04:00
Siavash Sameni
d539a6dfb9 test(federation): 29 tests for federation.rs (was 0), engine dedup PRD
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Failing after 3m45s
Federation test coverage (crates/wzp-relay/tests/federation.rs):
- room_hash: determinism, uniqueness, length, case sensitivity (5)
- is_global_room: static config, call-* implicit, exact match (3)
- resolve_global_room: static + call-* resolution (2)
- global_room_hash: canonical names, fallthrough, independence (4)
- forward_to_peers: zero peers, live QUIC datagram delivery (2)
- broadcast_signal: zero peers, live QUIC signal delivery (2)
- send_signal_to_peer: unknown fingerprint error (1)
- peer lookup: fingerprint normalization, IP, trust priority (5)
- accessors: local_tls_fp, cross_relay_tx, remote_participants (3)
- integration: full media egress over live QUIC link (1)
- edge case: exact room match (1)

Total relay tests: 120 (was 91). Full suite: 368 passing.

Also added PRD-engine-dedup.md for the engine.rs helper extraction
completed in the previous commit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:35:04 +04:00
Siavash Sameni
ba12aae439 refactor: extract shared engine helpers, federation clone-before-send, constants
Some checks failed
Mirror to GitHub / mirror (push) Failing after 30s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Engine deduplication (PRD-engine-dedup.md):
- build_call_config(): shared CallConfig construction (was 23 lines × 2)
- codec_to_profile(): shared CodecId → QualityProfile mapping (was 19 lines × 2)
- run_signal_task(): shared signal handler (was 48 lines × 2)
- Net -39 lines from engine.rs, 6 duplicated blocks → single-line calls

Quick wins from REFACTOR-codebase-audit.md:
- 6 magic number constants extracted (CAPTURE_POLL_MS, RECV_TIMEOUT_MS, etc.)
- DRED_POLL_INTERVAL moved from 2 local defs to 1 module-level const
- federation.rs: forward_to_peers, broadcast_signal, send_signal_to_peer
  now clone peer list and release lock before sending (was holding Mutex
  across async I/O — last lock-during-send pattern eliminated)
- main.rs: close_transport() helper replaces 12 silent .ok() calls with
  debug-level logging

314 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:22:44 +04:00
Siavash Sameni
fdb78e08bd docs: full codebase refactoring audit with prioritized suggestions
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m33s
Comprehensive analysis across all 8 crates + Tauri engine covering:
- engine.rs: 35% duplication between Android/desktop (350+ lines)
- SignalMessage: 36 variants mixing orthogonal concerns
- federation.rs: zero test coverage on 1,132 lines of complex logic
- peer_links: lock held across async sends (last lock-during-I/O)
- Magic numbers, error handling, CLI parsing, unsafe docs
- Priority matrix: 10 items ranked by effort/impact/risk

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:35:59 +04:00
Siavash Sameni
3a51db998a docs: relay concurrency refactor guide + PRD update for DashMap
Some checks failed
Mirror to GitHub / mirror (push) Failing after 25s
Build Release Binaries / build-amd64 (push) Failing after 8m3s
REFACTOR-relay-concurrency.md: complete post-DashMap analysis with
current lock inventory, 4 prioritized suggestions (clone-before-send,
peer_links DashMap, quality atomics, arc-swap snapshots), decision
matrix, and concurrency diagram.

PRD-relay-concurrency.md: updated to recommend DashMap as primary
approach (was Option A per-room locks).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:27:26 +04:00
Siavash Sameni
a52b011fb5 feat(relay): replace global Mutex<RoomManager> with DashMap sharding
Some checks failed
Mirror to GitHub / mirror (push) Failing after 24s
Build Release Binaries / build-amd64 (push) Failing after 3m41s
Eliminates the single-lock bottleneck for media forwarding. Before:
all participants across all rooms competed for one Mutex. Now rooms
are stored in DashMap (64 internal shards with per-shard RwLocks).

Changes:
- RoomManager.rooms: HashMap → DashMap<String, Room>
- Per-room quality tracking (qualities, current_tier moved into Room)
- Arc<Mutex<RoomManager>> → Arc<RoomManager> everywhere
- 20 .lock().await sites removed across room.rs, main.rs, federation.rs, ws.rs
- federation forward_to_peers: clone peer list, release lock, then send
- ACL uses std::sync::Mutex (rarely accessed, non-async)

Concurrency improvement:
- Before: 100 rooms × 10 people = 1000 tasks → 1 Mutex
- After: distributed across 64 DashMap shards, ~15 tasks per shard avg
- Rooms are fully independent — room A never blocks room B

314 tests passing, 0 regressions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:17:57 +04:00
Siavash Sameni
2514151a89 docs: PRD for relay concurrency — per-room lock sharding
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Full analysis of relay lock contention with precise inventory of every
lock acquisition in the hot path. Evaluates 4 design options:
A) Per-room Arc<Mutex<Room>> (recommended — 100x improvement for multi-room)
B) DashMap (good but less explicit)
C) Channel-based fan-out (over-engineered for current scale)
D) Snapshot-on-change via arc-swap (best perf, more complex)

Phase 1: per-room locks, Phase 2: federation lock fix, Phase 3: quality
tracking out of critical path. Estimated 1.5-2.5 days total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:01:21 +04:00
Siavash Sameni
f265fd772d docs: relay concurrency model, Opus6k fix, build script fixes
Some checks failed
Mirror to GitHub / mirror (push) Failing after 34s
Build Release Binaries / build-amd64 (push) Failing after 3m56s
- ARCHITECTURE.md: new "Relay Concurrency Model" section documenting
  threading, shared state locking table, scaling characteristics, and
  the RoomManager Mutex as primary bottleneck
- PROGRESS.md: Opus6k frame starvation fix, build script fixes
- PRD-dred-integration.md: Opus6k frame starvation bug documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:54:37 +04:00
Siavash Sameni
9ae9441de4 fix(audio): check capture ring available before read (fixes Opus6k choppy)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m58s
Partial reads from the capture ring consumed samples that were then
discarded when the send loop retried from buf[0]. For 20ms codecs this
was invisible (single Oboe burst fills 960 samples in one read), but
40ms codecs (Opus6k, 1920 samples) needed 2 bursts — the first partial
read consumed 960 real samples and threw them away.

Result: Opus6k produced ~11 frames/s instead of 25 (~44% of expected).

Fix: expose wzp_native_audio_capture_available() and check it before
reading, matching the desktop capture_ring.available() pattern. Partial
reads no longer occur because we only read when enough samples exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:46:15 +04:00
Siavash Sameni
d9e7e72978 docs: update PROGRESS, PRDs for completed tasks #9, #11, #12, #27
Some checks failed
Mirror to GitHub / mirror (push) Failing after 28s
Build Release Binaries / build-amd64 (push) Failing after 3m50s
- PROGRESS.md: add 2026-04-13 section with 5-tier quality, QualityDirective
  handling, debug tap enhancements, dual_path fix, keystore sync
- PRD-coordinated-codec.md: Phase 3 marked complete (client directive handling)
- PRD-adaptive-quality.md: milestone table updated with Done/Pending status

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:34:01 +04:00
Siavash Sameni
8ff0c548a7 fix(audio): update frame_samples on codec profile switch, fix buf sizing
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Has been cancelled
frame_samples was immutable — when adaptive quality switched from 20ms
(Opus24k, 960 samples) to 40ms (Opus6k, 1920 samples), the send loop
kept reading 960 samples and feeding half-sized frames to the encoder.
This caused Opus6k to produce ~11 frames/s instead of 25, making audio
choppy.

Fix:
- frame_samples is now mut and updated on profile switch
- buf sized for max frame (1920) with frame_samples-bounded slices
- RMS, mute, encode, and capture reads all use &buf[..frame_samples]
- Applied to both Android and desktop send tasks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:33:02 +04:00