Commit Graph

44 Commits

Author SHA1 Message Date
Siavash Sameni
9ae9441de4 fix(audio): check capture ring available before read (fixes Opus6k choppy)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 32s
Build Release Binaries / build-amd64 (push) Failing after 3m58s
Partial reads from the capture ring consumed samples that were then
discarded when the send loop retried from buf[0]. For 20ms codecs this
was invisible (single Oboe burst fills 960 samples in one read), but
40ms codecs (Opus6k, 1920 samples) needed 2 bursts — the first partial
read consumed 960 real samples and threw them away.

Result: Opus6k produced ~11 frames/s instead of 25 (~44% of expected).

Fix: expose wzp_native_audio_capture_available() and check it before
reading, matching the desktop capture_ring.available() pattern. Partial
reads no longer occur because we only read when enough samples exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:46:15 +04:00
Siavash Sameni
8ff0c548a7 fix(audio): update frame_samples on codec profile switch, fix buf sizing
Some checks failed
Mirror to GitHub / mirror (push) Failing after 27s
Build Release Binaries / build-amd64 (push) Has been cancelled
frame_samples was immutable — when adaptive quality switched from 20ms
(Opus24k, 960 samples) to 40ms (Opus6k, 1920 samples), the send loop
kept reading 960 samples and feeding half-sized frames to the encoder.
This caused Opus6k to produce ~11 frames/s instead of 25, making audio
choppy.

Fix:
- frame_samples is now mut and updated on profile switch
- buf sized for max frame (1920) with frame_samples-bounded slices
- RMS, mute, encode, and capture reads all use &buf[..frame_samples]
- Applied to both Android and desktop send tasks

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:33:02 +04:00
Siavash Sameni
d424515542 feat: 5-tier quality classification, QualityDirective handling, debug tap stats
Some checks failed
Mirror to GitHub / mirror (push) Failing after 31s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
- Extend Tier enum from 3 to 6 levels: Studio64k/48k/32k + Good +
  Degraded + Catastrophic with asymmetric hysteresis (down:3, up:5,
  studio:10)
- Handle QualityDirective signals in both desktop and Android engines
  — relay-coordinated codec switching now works end-to-end
- Add periodic TAP STATS to debug tap: packets in/out, fan-out avg,
  seq gaps, codecs seen (every 5s)
- Mark task #2 done (ParticipantInfo in federation signals already
  implemented)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:23:48 +04:00
Siavash Sameni
22045bc5e6 feat: adaptive quality in desktop, relay quality directive, Oboe state polling
- Wire AdaptiveQualityController into desktop engine send/recv tasks
  (mirrors Android pattern: AtomicU8 pending_profile, auto-mode check)
- Wire same into Android engine send task (was only in recv before)
- QualityDirective SignalMessage variant for relay-initiated codec switch
- ParticipantQuality tracking in relay RoomManager (per-participant
  AdaptiveQualityController, weakest-link tier computation)
- Relay broadcasts QualityDirective to all participants when room-wide
  tier degrades (coordinated codec switching)
- Oboe stream state polling: poll getState() for up to 2s after
  requestStart() to ensure both streams reach Started before proceeding
  (fixes intermittent silent calls on cold start, Nothing Phone A059)

Tasks: #7, #25, #26, #31, #35

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:54:04 +04:00
Siavash Sameni
766c9df442 feat(dred): continuous DRED tuning, PMTUD, extended Opus6k window
- DredTuner: maps live network metrics (loss/RTT/jitter) to continuous
  DRED duration every ~500ms instead of discrete tier-locked values.
  Includes jitter-spike detection for pre-emptive Starlink-style boost.
- Opus6k DRED extended from 500ms to 1040ms (max libopus 1.5 supports)
- PMTUD: quinn MtuDiscoveryConfig with upper_bound=1452, 300s interval
- TrunkedForwarder respects discovered MTU (was hard-coded 1200)
- QuinnPathSnapshot exposes quinn internal stats + discovered MTU
- AudioEncoder trait: set_expected_loss() + set_dred_duration() methods
- PathMonitor: sliding-window jitter variance for spike detection
- Integrated into both Android and desktop send tasks in engine.rs
- 14 new tests (10 tuner unit + 4 encoder integration)
- Updated ARCHITECTURE.md, PROGRESS.md, PRD-dred-integration, PRD-mtu

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 19:38:37 +04:00
Siavash Sameni
24cc74d93c fix(audio): clear BT SCO communication device on call end
Without clearCommunicationDevice(), the BT headset stays locked in SCO
mode after the call. Media playback (video, music) can't route to BT
A2DP, requiring a device reboot to restore normal audio.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:40:44 +04:00
Siavash Sameni
114d69e488 fix: use tracing::warn! instead of bare warn! in engine.rs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:31:12 +04:00
Siavash Sameni
15c237ceea fix(audio): defer MODE_IN_COMMUNICATION to call start, restore on end
Root cause: MainActivity set MODE_IN_COMMUNICATION at app launch,
hijacking system audio routing immediately — BT A2DP music dropped to
earpiece, and the pre-existing communication mode confused subsequent
setCommunicationDevice calls for BT SCO.

Fix: MainActivity now only sets volumes. MODE_IN_COMMUNICATION is set
via JNI right before Oboe audio_start() in CallEngine, and MODE_NORMAL
is restored after audio_stop() when the call ends.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:29:59 +04:00
Siavash Sameni
29cd23fe39 fix(p2p): connection cleanup — 4 fixes for stale/dead connections
PRD 4: Disable IPv6 direct dial/accept temporarily. IPv6 QUIC
handshakes succeed but connections die immediately on datagram
send ("connection lost"). IPv4 candidates work reliably. IPv6
candidates still gathered but filtered at dial time.

PRD 1: Close losing transport after Phase 6 negotiation. The
non-selected transport now gets an explicit QUIC close frame
instead of silently dropping after 30s idle timeout. Prevents
phantom connections from polluting future accept() calls.

PRD 2: Harden accept loop with max 3 stale retries. Stale
connections are explicitly closed (conn.close) and counted.
After 3 stale connections, the accept loop aborts instead of
spinning until the race timeout.

PRD 3: Resource cleanup — close old IPv6 endpoint before
creating a new one in place_call/answer_call. Add Drop impl
to CallEngine so tasks are signalled to stop on ungraceful
shutdown.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:11:50 +04:00
Siavash Sameni
40955bd11c debug(media): add connection diagnostics for direct P2P drops
When direct P2P calls show 100% datagram drops, we need to know
WHY send_media() fails. This commit adds:

- Remote address + stable_id logging on A-role accept and D-role
  dial success (dual_path.rs) — tells us which candidate won
- Remote address + max_datagram_size on engine transport init —
  verifies datagrams are negotiated
- last_send_err in send heartbeat — captures the actual error
  from send_datagram() failures
- QuinnTransport::remote_address() helper

Also fixes UI badge: was looking for wrong event name
("dual_path_race_won" → "path_negotiated").

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:29:58 +04:00
Siavash Sameni
9f2ff6a6ec fix(android-audio): Fix D+C — stop+prime cycle on every call start
Addresses the first-join no-audio regression (tasks #35-37) where
the Oboe playout callback fires once (cb#0) and then stops
draining the ring on the Nothing Phone, causing written_samples
to freeze at 7679 (ring capacity minus one burst). Second call
(rejoin) always works because audio_stop tears down the streams
and audio_start rebuilds them fresh.

Two combined fixes:

**Fix D (task #37)**: always call audio_stop() before audio_start()
at the top of CallEngine::start. On a cold launch this is a no-op
(streams not yet started). On subsequent calls it guarantees a
clean teardown before rebuild — the same thing rejoin does. Added
a 50ms pause between stop and start to let the Android HAL release
the audio session.

**Fix C (task #36)**: after audio_start(), immediately write 960
samples (20ms) of silence into the playout ring. This ensures the
Oboe playout callback has data to drain on its first invocation.
On devices where an empty-ring first callback causes the stream
to self-pause (Nothing Phone's Qualcomm HAL), the priming data
keeps the callback loop alive until real decoded audio arrives
from the recv task.

Together these cover the two most likely root causes:
1. Stale Oboe state from a previous audio_start that didn't
   clean up properly → Fix D forces a clean rebuild
2. Playout callback self-pausing on an empty ring → Fix C
   ensures the ring is non-empty at callback time

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:50:58 +04:00
Siavash Sameni
134ee3a77f fix(engine): pass is_direct_p2p explicitly instead of deriving from is_some
Critical Phase 6 bug: when the negotiation agreed on relay path
but delivered the relay transport via pre_connected_transport,
CallEngine saw is_some() = true → is_direct_p2p = true → skipped
perform_handshake. The relay couldn't authenticate the participant
→ room join silently failed → recv_fr: 0, both sides sending
into the void.

Fix: add explicit is_direct_p2p: bool parameter to CallEngine::
start (both android and desktop branches). The connect command
sets it from the Phase 6 negotiation result (use_direct), not
from whether pre_connected_transport is Some.

Now relay-negotiated calls correctly run perform_handshake,
and direct P2P calls correctly skip it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:34:21 +04:00
Siavash Sameni
0a973b234b fix(engine): import tauri::Emitter for AppHandle::emit on Android target 2026-04-12 09:29:56 +04:00
Siavash Sameni
0ccf4ed6b5 feat(call): media health watchdog — warn user when no audio arrives
When a P2P direct call establishes successfully but the underlying
network path dies (phone switched from WiFi to LTE mid-call, or
cross-relay media forwarding isn't working), the call stays up
silently with recv_fr frozen at 0. No feedback to the user.

New watchdog in the Android recv task: tracks consecutive
heartbeat ticks (2s each) where recv_fr hasn't advanced. After 3
ticks (6s) with no new packets, emits:

- call-event { kind: "media-degraded" } — user-facing warning
  banner: "No audio — connection may be lost. Try hanging up and
  reconnecting, or switch to a different relay."
- call-debug media:no_recv_timeout for the debug log

If packets resume (recv_fr advances), clears the banner via:
- call-event { kind: "media-recovered" }

JS listener creates/removes a red-tinted banner dynamically at
the top of the call screen. Banner is also cleaned up on
showConnectScreen (call end).

This covers:
- Direct P2P that established on WiFi but died when the phone
  switched to LTE (stale NAT mapping, unreachable peer)
- Cross-relay calls where federation media isn't forwarding
  (relay not upgraded, not federated, etc.)
- Any other "connected but silent" scenario

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 09:18:38 +04:00
Siavash Sameni
2427630472 fix(connect): make peerLocalAddrs optional + skip handshake on direct P2P
Two regressions from Phase 5.5/5.6:

1. Room connect broken: the connect Tauri command required
   peerLocalAddrs as a Vec<String>, but the room-join JS path
   doesn't pass it (only the direct-call setup handler does).
   Error: "invalid args 'peerLocalAddrs' for command 'connect':
   command connect missing required key peerLocalAddrs".

   Fix: change to Option<Vec<String>>, unwrap_or_default() at
   usage sites. Room connect works again with zero peer addrs.

2. Direct P2P call connects but then CallEngine fails with
   "expected CallAnswer, got Discriminant(0)". Root cause: after
   the dual-path race picked a direct P2P transport, CallEngine
   still ran perform_handshake() on it. That handshake is a
   relay-specific protocol — sends a CallOffer signal and waits
   for CallAnswer back. On a direct QUIC connection to a phone,
   there's nobody running accept_handshake, so the handshake
   reads garbage from the peer's first media packet and errors.

   Fix: track is_direct_p2p = pre_connected_transport.is_some()
   and skip perform_handshake when true. The direct connection
   is already TLS-encrypted by QUIC, and both peers' identities
   were verified through the signal channel (DirectCallOffer/
   Answer carry identity_pub + ephemeral_pub + signature). Both
   android and desktop branches updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 08:09:32 +04:00
Siavash Sameni
16793be36f fix(p2p): Phase 5.6 — direct-path head start + hangup propagation + media debug events
Three fixes from a field-test log where same-LAN calls were
still losing the dual-path race to the relay path, peers were
getting stuck on an empty call screen when the other side
hung up, and 1-way audio was hard to diagnose because the
GUI debug log had no media-level events.

## 1. Direct-path 500ms head start (dual_path.rs)

The race was resolving in ~105ms with Relay winning even when
both phones were on the same MikroTik LAN with valid IPv6 host
candidates. Root cause: the relay dial is a plain outbound QUIC
connect that completes in whatever the client→relay RTT is
(~100ms), while the direct path needs the PEER to also process
its CallSetup, spin up its own race, and complete at least one
LAN dial back to us. That cross-client sequence reliably takes
longer than 100ms, so relay always won.

Fix: delay the relay_fut with `tokio::time::sleep(500ms)` before
starting its connect. Same-LAN direct dials complete in 30-50ms
typically, so the head start gives direct plenty of time to win
cleanly. Users on setups where direct genuinely can't work
(LTE-to-LTE cross-carrier) pay 500ms extra on the relay fallback,
which is invisible for a call setup.

## 2. Hangup propagation via a new hangup_call command (lib.rs + main.ts)

The hangup button was calling `disconnect` which stopped the
local media engine but never sent a SignalMessage::Hangup to
the relay. The peer never got notified and was stuck on the
call screen with silent audio. My earlier fix (commit e75b045)
only handled the RECEIVE side — auto-dismiss call screen on
recv:Hangup — but the SEND side was still missing.

New Tauri command `hangup_call`:
  1. Acquire state.signal.lock(), send SignalMessage::Hangup
     over the signal transport (best-effort; log + continue if
     signal is down)
  2. Acquire state.engine.lock(), stop the CallEngine

JS hangupBtn click handler now calls hangup_call with a fallback
to raw disconnect if the command is missing (older builds).

## 3. Media debug events (engine.rs + lib.rs)

Threaded tauri::AppHandle into CallEngine::start so the send/
recv tasks can emit call-debug events when the user has debug
logs enabled. Added on the Android branch (desktop branch
accepts the arg for API symmetry but doesn't emit yet):

  - media:first_send — emitted when the first encoded frame is
    handed to the transport. Useful for 1-way audio diagnosis:
    if this fires on side A but side B never sees media:first_recv,
    A's outbound is broken.
  - media:first_recv — emitted when the first packet from the
    peer arrives. Mirror of first_send.
  - media:send_heartbeat — every 2s with frames_sent, last_rms,
    last_pkt_bytes, short_reads, drops. A stalled last_rms
    (== 0) tells you the mic isn't producing samples; a frozen
    frames_sent tells you the encode pipeline hung.
  - media:recv_heartbeat — every 2s with recv_fr, decoded_frames,
    last_written, written_samples, decode_errs, codec. Mirror
    invariants for the inbound direction.

All four are gated by `call_debug_logs_enabled()` via
`emit_call_debug`, so they only show up in the GUI log when the
user has the Call Flow Debug Logs checkbox on. Tracing::info!
still runs unconditionally so logcat (adb) keeps its copy
regardless.

The `emit_call_debug` fn in lib.rs is now `pub(crate)` so
engine.rs can call it via `crate::emit_call_debug`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 07:55:41 +04:00
Siavash Sameni
59ce52f8e8 feat(p2p): Phase 3.5 dual-path QUIC race + GUI call-flow debug logs
Two features in one commit because they ship and test together:
Phase 3.5 closes the hole-punching loop and the call-flow debug
logs give the user live visibility into every step of a call so
real-hardware testing of the new P2P path is debuggable.

## Phase 3.5 — dual-path QUIC connect race

Completes the hole-punching work Phase 3 scaffolded. On receiving
a CallSetup with peer_direct_addr, the client now actually races a
direct QUIC handshake against the relay dial and uses whichever
completes first. Symmetric role assignment avoids the two-conns-
per-call problem:

- Both peers compare `own_reflex_addr` vs `peer_reflex_addr`
  lexicographically.
- Smaller addr → **Acceptor** (A-role): builds a server-capable
  dual endpoint, awaits an incoming QUIC session. Does NOT dial.
- Larger addr → **Dialer** (D-role): builds a client-only
  endpoint, dials the peer's addr with `call-<id>` SNI. Does NOT
  listen.
- Both sides always dial the relay in parallel as fallback.
- `tokio::select!` with `biased` preference for direct, `tokio::pin!`
  so each branch can await the losing opposite as fallback.
- Direct timeout 2s, relay fallback timeout 5s (so 7s worst case
  from CallSetup to "no media path" error).

New crate module `wzp_client::dual_path::{race, WinningPath}`
(moved here from desktop/src-tauri so it's testable from a
workspace test). `determine_role` in `wzp_client::reflect` is
pure-function and unit-tested.

### CallEngine integration
- New `pre_connected_transport: Option<Arc<QuinnTransport>>` arg
  on both android + desktop `CallEngine::start` branches. Skips
  the internal wzp_transport::connect step when Some. Backward-
  compat: None keeps Phase 0 relay-only behavior.
- `connect` Tauri command reads own_reflex_addr from SignalState,
  computes role, runs the race, passes the winning transport
  into CallEngine. If ANY input is missing (no peer addr, no own
  addr, equal addrs), falls back to classic relay path —
  identical to pre-Phase-3.5 behavior.

### Tests (9 new, all passing)
- 6 unit tests for `determine_role` truth table in
  `wzp-client/src/reflect.rs` (smaller=Acceptor, larger=Dialer,
  port-only diff, equal, missing-side, symmetry)
- 3 integration tests in `crates/wzp-client/tests/dual_path.rs`:
    * `dual_path_direct_wins_on_loopback` — two-endpoint test
      rig, Dialer wins direct path vs loopback mock relay
    * `dual_path_relay_wins_when_direct_is_dead` — dead peer
      port, 2s direct timeout, relay fallback wins
    * `dual_path_errors_cleanly_when_both_paths_dead` — <10s
      error, no hang

## GUI call-flow debug logs

Runtime-toggled structured events at every step of a call so the
user can see where a call progressed or stalled on real hardware.
Modeled on the existing DRED_VERBOSE_LOGS pattern.

### Rust side
- `static CALL_DEBUG_LOGS: AtomicBool` + `emit_call_debug(&app,
  step, details)` helper. Always logs via `tracing::info!`
  (logcat always has a copy); GUI Tauri `call-debug-log` event
  only fires when the flag is on.
- Tauri commands `set_call_debug_logs` / `get_call_debug_logs`.

### Instrumented steps (24 emit_call_debug sites)
- `register_signal`: start, identity loaded, endpoint created,
  connect failed/ok, RegisterPresence sent, ack received/failed,
  recv loop spawning
- Recv loop: CallRinging, DirectCallOffer (w/ caller_reflexive_addr),
  DirectCallAnswer (w/ callee_reflexive_addr), CallSetup (w/
  peer_direct_addr), Hangup
- `place_call`: start, reflect query start/ok/none, offer sent,
  send failed
- `answer_call`: start, reflect query start/ok/none or privacy
  skip, answer sent, send failed
- `connect`: start, dual_path_race_start (w/ role), won (w/
  path), failed, skipped (w/ reasons), call_engine_starting/
  started/failed

### JS side
- New `callDebugLogs: boolean` field on Settings type.
- Boot-time hydrate of the Rust flag from localStorage so the
  choice survives restarts (like `dredDebugLogs`).
- Settings panel: new "Call flow debug logs" checkbox alongside
  the DRED toggle.
- New "Call Debug Log" section that ONLY shows when the flag is
  on. Rolling in-memory buffer of the last 200 events, rendered
  as monospace `HH:MM:SS.mmm step {details}` lines with auto-
  scroll and a Clear button.
- `listen("call-debug-log", ...)` subscribed at app startup,
  appends to the buffer, re-renders on every event.

Full workspace test goes from 404 → 413 passing. Clippy clean
on touched crates.

PRD: .taskmaster/docs/prd_phase35_dual_path_race.txt
Tasks: 61-69 all completed

Next: APK + desktop build carrying everything — Phase 2 NAT
detect, Phase 3 advertising, Phase 3.5 dual-path + call debug
logs, plus the earlier Android first-join diagnostics — so the
user can validate the P2P path on real hardware with live
per-step visibility into where any failures happen.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 14:06:44 +04:00
Siavash Sameni
7e7968b2f9 diag(android-engine): first-join no-audio ordering instrumentation
Adds a single call_t0 = Instant::now() at the top of the Android
CallEngine::start path, threaded through send + recv tasks as
send_t0 / recv_t0, and tags the following milestones with
t_ms_since_call_start so we can build a clean side-by-side log of
first-call vs rejoin:

  1. QUIC connection established
  2. handshake complete
  3. wzp-native audio_start returned (+ how long audio_start itself took)
  4. send task spawned
  5. send: first full capture frame read (+ short_reads_before count)
  6. send: first non-zero capture RMS
  7. recv task spawned
  8. recv: first media packet received
  9. recv: first successful decode
 10. recv: first playout-ring write

Combined with the existing C++-side cb#0 logs in
crates/wzp-native/cpp/oboe_bridge.cpp ("capture cb#0", "playout
cb#0") this gives us full-pipeline ordering with no native-side
changes needed.

PRD: .taskmaster/docs/prd_android_first_join_no_audio.txt
Task: 32 (first task in the chain — diagnostics before any fix
attempts so we know which of the 5 suspect causes is real).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:00:20 +04:00
Siavash Sameni
578ff8cff4 feat(debug): GUI toggle for DRED verbose logs + macOS mic permission
DRED verbose logs (off by default — keeps logcat clean in normal use):
- wzp-codec: DRED_VERBOSE_LOGS atomic flag with dred_verbose_logs() /
  set_dred_verbose_logs() helpers
- opus_enc: gate "DRED enabled" + libopus version logs behind the flag
- desktop/src-tauri/engine.rs: gate DredRecvState parse log,
  reconstruction log, classical PLC log, and DRED-counter fields in
  the Android recv heartbeat (non-verbose path still logs basic recv
  stats)
- Tauri commands set_dred_verbose_logs / get_dred_verbose_logs
- Settings panel gets a "DRED debug logs (verbose, dev only)"
  checkbox; preference persists in wzp-settings localStorage and is
  pushed to Rust on save and on app boot

macOS mic permission:
- Add desktop/src-tauri/Info.plist with NSMicrophoneUsageDescription.
  Without it, modern macOS silently denies CoreAudio capture for
  ad-hoc-signed Tauri builds — capture starts but every callback
  hands you zeros. Symptom: phones could not hear desktop client,
  desktop could still hear phones (playout has no TCC gate). The
  Tauri 2 bundler auto-merges this file into WarzonePhone.app's
  Contents/Info.plist on the next build, so first launch will pop
  the standard mic prompt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:48:32 +04:00
Siavash Sameni
16890576fb feat(observability): logcat-visible DRED proof of life on Android
Adds enough INFO-level logging that an opus-DRED-v2 APK on Android can
be verified end-to-end by reading logcat alone — no debugger, no
Prometheus, no telemetry pipeline required. Three observation points:

1. Encoder construction (opus_enc.rs)
   - Bumped the "DRED enabled" log from debug! to info! so the
     per-call DRED config is in logcat by default. Each call's first
     OpusEncoder construction logs codec, dred_frames, dred_ms,
     loss_floor_pct.
   - Added a one-shot static OnceLock that logs `opusic_c::version()`
     the first time an OpusEncoder is built in the process. This is
     the smoking gun for "is the new libopus actually loaded" — pre-
     Phase-0 audiopus shipped libopus 1.3 with no DRED, post-Phase-0
     should print 1.5.2 here.

2. DRED state ingest (DredRecvState::ingest_opus in
   desktop/src-tauri/src/engine.rs)
   - First successful parse on a call logs immediately so we can see
     "DRED is on the wire" in logcat.
   - Subsequent parses sample every 100th to confirm steady-state
     samples_available without drowning the log.
   - New parses_total / parses_with_data counters track the parse
     rate vs the success rate (a packet without DRED in it returns
     `available == 0`, so a low ratio means the encoder isn't
     emitting DRED bytes).

3. DRED reconstruction events (DredRecvState::fill_gap_to)
   - Every DRED reconstruction logs at INFO with missing_seq,
     anchor_seq, offset_samples, offset_ms, samples_available,
     gap_size, and the running total. These events are rare on a
     clean network and we want to know exactly which gap was filled.
   - First three classical PLC fills + every 50th thereafter log so
     we can see when DRED couldn't cover a gap (offset out of range,
     no good state, or reconstruct error).

4. Recv heartbeat (Android start() in engine.rs)
   - Existing 2-second heartbeat now includes dred_recv,
     classical_plc, dred_parses_with_data, dred_parses_total
     so a steady-state call shows the cumulative counters in
     logcat without parsing.

How to verify on a real call:

  adb logcat -s 'RustStdoutStderr:*' | grep -i 'dred\|libopus version'

Expected output sequence on a successful Opus call:
  - "linked libopus version libopus_version=libopus 1.5.2-..."  (once per process)
  - "opus encoder: DRED enabled codec=Opus24k dred_frames=20 dred_ms=200 loss_floor_pct=15"  (per call)
  - "DRED state parsed from Opus packet seq=N samples_available=4560 ms=95 ..."  (after first DRED-bearing packet)
  - "recv heartbeat (android) ... dred_recv=0 classical_plc=0 dred_parses_with_data=58 dred_parses_total=58"  (every 2s)

If you see "linked libopus version libopus 1.3" — the FFI swap didn't
take. If dred_parses_with_data stays at 0 while dred_parses_total
climbs — the sender isn't emitting DRED (check the encoder's loss
floor and the receiver's libopus version). If gaps trigger
"classical PLC fill" instead of "DRED reconstruction fired" —
DRED state coverage is too small for the observed loss pattern,
and the loss floor or DRED duration policy needs tuning.

Verification:
- cargo check -p wzp-codec -p wzp-client: 0 errors
- cargo check -p wzp-desktop: 0 Rust errors (only the pre-existing
  tauri::generate_context!() proc macro panic on missing ../dist
  which fires at host check time, irrelevant on the remote build)
- cargo test -p wzp-codec --lib: 69 passing (no regressions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:58:03 +04:00
Siavash Sameni
dfbe21fe6e feat(tauri-engine): Phase 3b/3c re-port — DRED reconstruction on the live Tauri mobile engine
The original Phase 3b landed on wzp-client/CallDecoder and Phase 3c
landed on wzp-android/src/engine.rs. Both of those are DEAD CODE on
feat/desktop-audio-rewrite: the legacy Kotlin app in android/app/ is
not built by the Tauri mobile pipeline, and the Tauri engine bypasses
CallDecoder by calling wzp_codec::create_decoder directly.

The live Android call engine lives at desktop/src-tauri/src/engine.rs
with two `pub async fn start<F>` functions — one cfg-gated on Android
(Oboe via wzp-native) and one for desktop (CPAL). Both recv tasks
were using `let mut decoder = wzp_codec::create_decoder(...)` which
returns `Box<dyn AudioDecoder>` and doesn't expose the inherent
`reconstruct_from_dred` method.

Changes:

New helper struct `DredRecvState` at the top of engine.rs, wrapping:
  - DredDecoderHandle (libopus DRED side-channel parser)
  - DredState scratch (for parse_into)
  - DredState last_good (cached valid state, swapped on success)
  - last_good_seq: Option<u16> (DRED anchor sequence)
  - expected_seq: Option<u16> (for gap detection)
  - dred_reconstructions / classical_plc_invocations counters

With three methods:
  - ingest_opus(seq, payload): parse DRED, swap on success
  - fill_gap_to(decoder, current_seq, frame_samples, scratch, emit):
    detect gap back from expected_seq, reconstruct each missing
    frame via DRED if state covers it, fall through to classical
    decoder.decode_lost() when it doesn't. Calls emit() once per
    frame with a slice the caller uses for AGC + playout write.
  - reset_on_profile_switch(): invalidate tracking when codec changes

Both recv tasks (Android @ ~line 297 and desktop @ ~line 907):
  - Decoder type changed from `Box<dyn AudioDecoder>` via
    `wzp_codec::create_decoder` to concrete `AdaptiveDecoder::new(profile)`
    so we can call the inherent reconstruct_from_dred method.
  - Added `use wzp_proto::traits::AudioDecoder;` at the top of
    engine.rs to bring decode/decode_lost/set_profile trait methods
    into scope on the concrete type.
  - New `current_profile` local alongside `current_codec` (used for
    frame_duration lookups that drive the DRED sample offset math).
  - On codec/profile switch, call dred_recv.reset_on_profile_switch()
    because the cached DRED state is tied to the old profile's
    frame rate.
  - For each arriving Opus source packet:
      1. dred_recv.ingest_opus(seq, payload) — parse DRED
      2. dred_recv.fill_gap_to(...) — detect gap and reconstruct
         missing frames, each emitted through a closure that does
         AGC + playout write (wzp_native on Android, playout_ring
         on desktop)
      3. Normal decoder.decode() fallthrough for the current packet
         (unchanged)
  - Codec2 packets skip the DRED path entirely (is_opus() gate) —
    libopus can't reconstruct Codec2 audio.

Ordering invariant: gap reconstruction writes to playout BEFORE the
current packet's decoded audio, preserving temporal order since the
playout ring is FIFO. The closure captures the `spk_muted` flag once
before the gap loop to avoid mid-gap-fill state changes.

Kept `crates/wzp-android/src/engine.rs` and `crates/wzp-android/src/
stats.rs` from the earlier Phase 3c commit as-is — they're dead code
on feat/desktop-audio-rewrite but harmless, and deleting them would
diverge this branch from an independently-useful intermediate state.
The old Phase 3c commit (505a834) stays as historical reference.

Verification:
- cargo check -p wzp-codec -p wzp-client -p wzp-relay: 0 errors
- cargo check -p wzp-desktop: only pre-existing `tauri::generate_context!()`
  panic on missing ../dist (Vite output not built on host) — no Rust
  compile errors from our changes
- cargo test -p wzp-codec --lib: 69 passing (unchanged)
- cargo test -p wzp-client --lib: 35 passing + 1 ignored (unchanged)

Next: scripts/build-tauri-android.sh to get the actual Tauri APK —
NOT build-and-notify.sh which builds the dead legacy android/app.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:31:09 +04:00
Siavash Sameni
2fd94651e4 fix(desktop): direct calls used wrong identity file — mac identity mismatch
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m40s
The non-Android branch of CallEngine::start loaded the seed from
\$HOME/.wzp/identity directly, while register_signal in lib.rs goes
through the shared load_or_create_seed() helper which resolves via
APP_DATA_DIR → Tauri's app_data_dir(). On macOS those are two
completely different files:

  register_signal → ~/Library/Application Support/com.wzp.desktop/.wzp/identity
  CallEngine::start (old) → ~/.wzp/identity

On a fresh install they end up holding two different random seeds.
Register and CallEngine then derive two different fingerprints from
those seeds, and when a direct call comes in the relay routes it to
"you" under the register_signal fingerprint, but once CallEngine tries
to join the call-* room it advertises a DIFFERENT fingerprint — which
fails the call_registry ACL check on the relay side (only the two
authorised participants of a call can join its room). Silent hang, the
call never completes.

Android hit this bug earlier in the week and was fixed by switching
its CallEngine::start branch to `crate::load_or_create_seed()`.
Backport the same single-line change to the desktop branch so both
platforms share one identity source of truth.

Also bring the desktop branch up to parity with the android branch on
diagnostic logging:
- log CallEngine::start entry with relay/room/alias/quality/has_reuse
- log endpoint.local_addr on reuse / create
- log "QUIC connection established, performing handshake" between
  connect() and perform_handshake() so a hang at either step is
  immediately localisable
- map_err all three potential failure points (create_endpoint,
  connect, perform_handshake) to an explicit error! trace
2026-04-10 12:15:23 +04:00
Siavash Sameni
cfa9ff67cf fix(android-audio): VoIP mode + speakerphone + debug PCM recorder
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Has been cancelled
Build 96be740 logs proved the entire software pipeline is healthy:
  capture heartbeat:   calls=1100 to_write=960 full_drops=0 total_written=1056000
  recv heartbeat:      decoded_frames=1035 last_written=960 decode_errs=0
  recv decoded PCM:    range=[-13564..9244] rms=8044 (real audio)
  playout WRITE:       in_len=960 written=960 rms=2318 (real audio into the ring)
  playout heartbeat:   calls=1100 nonempty=1099 total_played_real=1055040

1055040 samples / 48000 Hz = 22s — exactly matches wall-clock elapsed,
meaning Oboe IS calling our playout callback at the expected rate and
WE ARE handing it real PCM every 20ms. User still heard nothing. Ergo
Oboe accepted the PCM and routed it to a silent output. Two fixes:

1) MainActivity.kt: switch to MODE_IN_COMMUNICATION + speakerphone ON
   right after permissions are granted, and crank STREAM_VOICE_CALL to
   max. Without this, an Oboe Usage::VoiceCommunication stream gets
   opened, the OS creates a real AAudio pipeline, the callback fires on
   schedule — and audio goes to either the earpiece at muted volume or
   a "call not active" dead end. Logs the audio mode + volume levels
   before and after the switch so we can confirm the state change in
   logcat next run.

2) oboe_bridge.cpp: revert Usage::Media → VoiceCommunication (the mode
   that matches MODE_IN_COMMUNICATION), pin the audio API to AAudio
   explicitly instead of letting Oboe fall back to OpenSLES (which has
   its own silent-drop failure modes on some devices), and add getState
   + getXRunCount to the playout heartbeat so we'll see silent stream
   disconnects instead of reading zeros forever.

3) engine.rs recv task: dump the first ~10s of post-AGC decoded PCM to
   `<app_data_dir>/decoded.pcm` as raw i16 LE so we can adb pull it and
   play it back locally:
       adb shell run-as com.wzp.desktop cat .wzp/decoded.pcm > decoded.pcm
       ffmpeg -f s16le -ar 48000 -ac 1 -i decoded.pcm decoded.wav
   This divorces "is our decoder actually producing audible audio" from
   "is Android's audio stack playing it". If the recorded WAV sounds
   correct when played on a laptop, the decoder is fine and 100% of the
   remaining bug surface is AudioManager / Oboe routing.

4) engine.rs: also log when spk_muted=true blocks the write. User
   reported the Speaker button in the UI has inconsistent semantics
   between desktop and android — adding this log rules out the accidental
   "first click muted playback" theory for good.
2026-04-09 21:24:26 +04:00
Siavash Sameni
96be740fd9 diag(android-audio): aggressive logging across the whole Oboe pipeline
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
User confirmed: mac hears android, android does not hear mac. So Oboe
capture works end-to-end but Oboe playout on Android silently drops
audio even though QUIC forwards the packets. Archaeology on the legacy
wzp-android crate also revealed that the "last known good" Android audio
path NEVER used Oboe in production — it used Kotlin AudioRecord +
AudioTrack via JNI, and cpp/oboe_bridge.cpp was dead code. So every time
we've "tested" Oboe end-to-end this week was the first production use,
and any of its config knobs could be the bug.

Instrumenting every stage of the pipeline so one smoke-test log dump can
isolate the layer at fault:

C++ (oboe_bridge.cpp)
  - Log the ACTUAL stream parameters after openStream for both capture
    and playout (sample rate, channels, format, framesPerBurst,
    framesPerDataCallback, bufferCapacityInFrames, sharing, perf mode).
    Oboe may silently override values we requested — e.g. if we ask for
    48kHz mono but the device gives us 44.1kHz stereo our 960-sample
    frames are the wrong duration and the pipeline drifts.
  - Capture callback: on cb#0 log sample range+RMS of the first frame
    to prove we get real mic data (not zeros). Every 50 callbacks
    (~1s at 20ms burst) log calls, numFrames, ring available_write,
    bytes actually written, ring_full_drops, total_written.
  - Playout callback: on cb#0 log numFrames + ring state. On the FIRST
    non-empty read log sample range+RMS so we can tell if the samples
    coming out of the ring are real audio or zeros. Every 50 callbacks
    log calls, nonempty count, numFrames, ring available_read,
    underrun_frames, total_played_real.

Rust wzp-native (src/lib.rs)
  - wzp_native_audio_write_playout now logs the first 3 writes and then
    every 50th: in_len, written, sample range, RMS, ring write/read
    cursors before, available_read and available_write after. Reveals
    ring-overflow and whether the engine is actually handing us audio.
  - Minimal android logcat shim via __android_log_write extern — no
    new crate dependency.
  - AudioBackend grows a `playout_write_log_count` AtomicU64 to gate
    the write-side log throttle.

Rust engine.rs (android branch)
  - Recv task: log sample range + RMS for the first 3 decoded PCM
    frames and then every 100th. Reveals whether decoder.decode is
    producing real audio or silent buffers.
  - Recv task: if audio_write_playout returns fewer samples than we
    handed it (partial write → ring nearly full) warn about it in the
    first 10 frames.
  - Recv heartbeat every 2s: recv_fr, decoded_frames, last_decode_n,
    last_written, written_samples, decode_errs, codec.

Expected flow in a healthy log:
  capture cb#0: numFrames=960 range=[-1200..900] rms=180          ← mic OK
  capture stream opened: actualSR=48000 Ch=1 ...                   ← no override
  playout stream opened: actualSR=48000 Ch=1 ...
  CallEngine::start invoked ... → connected → audio started
  recv: first media packet received ...
  recv: decoded PCM sample range decoded_frames=1 range=[-300..250] rms=92
  playout WRITE #0: in_len=960 written=960 range=[-300..250] rms=92
  playout FIRST nonempty read: to_read=960 range=[-300..250] rms=92
  playout heartbeat: calls=50 nonempty=50 underrun=0 ...
  recv heartbeat: decoded_frames=100 last_written=960 ...

If any of those are missing/zero we know the exact stage to fix.
2026-04-09 21:13:29 +04:00
Siavash Sameni
8c4d640f89 fix(android): playout Usage::Media + relay CallSetup advertises real IP
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Three real bugs, one smoke-test session's worth of progress.

1. RELAY: wrong advertised addr in CallSetup
   The direct-call CallSetup computed `relay_addr = addr.ip()` where
   `addr = connection.remote_address()` — i.e. the CLIENT'S IP, not the
   relay's. So the relay was telling both parties "the call room is at
   the answerer's IP:4433", which meant each client dialed either the
   other client (no server listening) or themselves. Both endpoint.connect
   calls hung forever and the call never happened.
   Fix: compute the relay's own advertised IP once at startup. If the
   listen addr is 0.0.0.0, probe the primary outbound interface via the
   classic UDP-bind-and-connect(8.8.8.8:80) trick to discover the LAN
   IP the OS would use to reach external hosts. Thread the resulting
   advertised_addr_str into the CallSetup sender for both parties.

2. RELAY: accept loop serialized QUIC handshakes
   Previously the main accept loop called `wzp_transport::accept` which
   did both `endpoint.accept().await` AND `incoming.await` (the server-
   side QUIC handshake). A single slow handshake therefore blocked every
   subsequent client from being accepted. Unroll the helper here and
   move `incoming.await` into the per-connection spawned task, so every
   handshake runs in parallel. Also log "accept queue: new Incoming",
   "QUIC handshake complete", and "QUIC handshake failed" so we can tell
   immediately whether a client's packets are reaching the relay at all.

3. ANDROID: playout was routed to the silent in-call stream
   The Oboe playout stream was configured with Usage::VoiceCommunication,
   which routes to the Android in-call earpiece stream. That stream is
   silent unless the Activity has called AudioManager.setMode(
   IN_COMMUNICATION) and, even then, only the earpiece/BT headset get
   audio (not the loud speaker). Result: android→mac calls worked
   because mac had a normal media output, but mac→android calls were
   silent even though packets flowed through the relay just fine.
   Switch to Usage::Media + ContentType::Speech so Oboe routes to the
   loud speaker and uses the media volume slider. A later polish step
   will wire setMode + setSpeakerphoneOn from MainActivity.kt so we can
   go back to VoiceCommunication for AEC and proximity-sensor routing.

Plus: heartbeat tracing every 2s in the send/recv tasks — frames_sent,
last_rms, last_pkt_bytes, short_reads on the send side; decoded_frames,
last_decode_n, last_written, decode_errs on the recv side. Will make the
next "no sound" regression trivial to localize.
2026-04-09 20:55:10 +04:00
Siavash Sameni
49f101d785 fix(android): reuse signal endpoint for direct-call media connection
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
Direct-call accept hangs forever at the QUIC handshake on Android. Logs
from d7b37a5 showed:
  CallEngine::start (android) invoked relay=172.16.81.172:4433 room=call-…
  resolved relay addr
  identity loaded
  endpoint created, dialing relay   ← reached
                                    ← nothing, 90s+, no error
The "connect failed" and "QUIC connection established" log lines never
fire, meaning endpoint.connect_with(…).await never makes progress.

Repro is 100%: SFU room join (one endpoint) works perfectly; direct call
(opens a SECOND quinn::Endpoint on top of the signal one) hangs in the
QUIC handshake. Creating two quinn::Endpoints on Android's AAudio-adjacent
UDP stack apparently causes the second one's datagrams to never reach the
relay (the server never sees the Initial packet). Rather than fight the
platform, quinn is happy to multiplex multiple Connections on a single
Endpoint — so we reuse the signal endpoint for the media connection.

- SignalState now stores the quinn::Endpoint alongside the QuinnTransport.
  register_signal populates both at the same time.
- CallEngine::start (both android and desktop branches) takes an
  Option<wzp_transport::Endpoint>. Some → reuse (direct-call path, after
  register_signal). None → create fresh (SFU room join path).
- The connect tauri command reads state.signal.endpoint and threads it
  through to CallEngine::start, so the direct-call auto-connect (fired by
  the "setup" signal-event in main.ts) lands on the existing UDP socket.
- wzp_transport re-exports quinn::Endpoint so wzp-desktop doesn't need to
  depend on quinn directly.
- Also wraps the android connect in tokio::time::timeout(10s) so future
  hangs become deterministic "connect TIMED OUT" errors in logcat
  instead of silent deadlock.

Same fix applies verbatim to the desktop client — the user suspects
direct call is broken there too and this was likely always the cause,
just never surfaced because desktop was only tested via SFU rooms.
2026-04-09 20:29:51 +04:00
Siavash Sameni
d7b37a5749 diag: tracing for direct-call signal loop + CallEngine::start stages
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m57s
User reports tapping "answer" on an incoming direct call does nothing
visible, and suspects the same may affect desktop. The signal recv loop
had no tracing at all, so we can't tell whether CallSetup is being
received, whether the recv loop died silently, or whether
CallEngine::start is failing between "identity loaded" and
"connected to relay, handshake complete".

- register_signal recv loop now logs every message type with fields
  (CallRinging, DirectCallOffer, DirectCallAnswer, CallSetup, Hangup,
  unhandled), plus a warn! on recv errors and a final warn when the
  loop exits.
- place_call / answer_call commands log entry + success / error. The
  answer_call error path logs the underlying send_signal error so we
  can see it in logcat instead of only in the JS error toast.
- CallEngine::start android branch logs relay/room/alias on entry,
  logs "endpoint created, dialing relay" between create_endpoint and
  connect, "QUIC connection established, performing handshake" between
  connect and perform_handshake, and promotes all three potential
  failures to explicit error! logs so a silent hang / error becomes
  visible in logcat.

No functional changes — pure diagnostics. Stacks on b35a6b7 (the Oboe
stack-pointer-escape fix) so build #43 carries both.
2026-04-09 19:17:03 +04:00
Siavash Sameni
5beea7de40 phase 3(android): unify connect/disconnect/toggle_*/get_status commands
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
Step 3 of the Tauri Android rewrite was still returning "audio backend not
yet wired on Android (step 3)" because the cfg-gated Android stubs for
connect/disconnect/toggle_mic/toggle_speaker/get_status were shadowing the
real commands. Now that CallEngine::start() has a real Android body (phase
3, commit fdbe502), the gates are unnecessary.

- Drop the #[cfg(not(target_os = "android"))] gates from all five
  engine-backed Tauri commands.
- Delete the Android stub block (~50 LOC of "not connected" boilerplate).
- Ungate `use engine::CallEngine;` and the AppState.engine field so both
  targets share the same Mutex<Option<CallEngine>>.
- CallEngine::stop() now calls crate::wzp_native::audio_stop() on Android so
  the mic + speaker are released between calls, matching the desktop
  behaviour where dropping _audio_handle tears down CPAL.

Direct-call flow on Android: peer sends DirectCallOffer → user accepts via
answer_call → relay sends signal "setup" event → main.ts auto-invokes
connect(relay, room) → CallEngine::start() runs the Android branch →
wzp_native::audio_start() brings up Oboe → send/recv tasks stream PCM
through the dlopen boundary.
2026-04-09 18:53:54 +04:00
Siavash Sameni
fdbe502524 phase 3(android): wire CallEngine::start to wzp-native audio FFI
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 3m57s
Replaces the Android-side CallEngine::start() stub with a real implementation
that mirrors the desktop start() body but routes all PCM through the
standalone wzp-native cdylib loaded at startup via libloading instead of
using CPAL.

- desktop/src-tauri/src/wzp_native.rs: new module with a static
  OnceLock<libloading::Library> + cached raw fn pointers for every symbol
  we need (version, hello, audio_start/stop, read_capture, write_playout,
  is_running, capture/playout_latency_ms). init() resolves everything once
  at startup; accessors return default values if init() never ran.

- desktop/src-tauri/src/lib.rs: drop the inline dlopen smoke test, add
  `mod wzp_native;` behind target_os="android", and invoke
  wzp_native::init() from the Tauri setup() callback so the library is
  loaded + all symbols cached before any CallEngine can touch audio.

- desktop/src-tauri/src/engine.rs: the Android #[cfg] branch of
  CallEngine::start() now does the full QUIC handshake + signal loop +
  Opus send/recv tasks, calling wzp_native::audio_start() /
  audio_read_capture() / audio_write_playout() instead of the desktop
  CPAL rings. SyncWrapper now holds a placeholder Box<()> on Android
  because the audio backend lives in a process-global singleton inside
  libwzp_native.so rather than being owned per-engine.

Next step: build #39 on the remote docker builder and smoke-test on
Pixel 6 that the Connect button in the UI successfully brings up Oboe
and streams audio through the dlopen boundary.
2026-04-09 18:42:27 +04:00
Siavash Sameni
19fd3dd9cc step C fix: ungate wzp_proto imports used by resolve_quality() on Android
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Build #20 failed to compile on Android because I over-gated the
wzp_proto imports to non-Android. resolve_quality() is compiled on
every platform (it's outside the CallEngine impl) and references
QualityProfile + CodecId — both platform-independent types from
wzp_proto. Move those back to an unconditional import. tracing stays
gated (only the desktop start() body logs; the Android stub is silent).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:59:00 +04:00
Siavash Sameni
c69195fe06 step C(android): compile engine.rs on Android with a stub CallEngine::start
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Has been cancelled
Third incremental variable. Previously the engine module was cfg-gated
out of the Android build entirely (`#[cfg(not(target_os = "android"))]
mod engine;` in lib.rs). Now it's always compiled, so any link-time
effect of having engine.rs in the compilation unit can be measured
against the working baseline from build #19.

Changes kept deliberately small:
- lib.rs: drop the cfg gate on `mod engine;`. `use engine::CallEngine`
  stays gated because the Android-specific connect/disconnect/... stubs
  in lib.rs don't reference the type.
- engine.rs: the `wzp_client::{audio_io, call}` imports + CodecId +
  QualityProfile are gated to non-Android (they require the `audio`
  feature on wzp-client which Android doesn't pull in). On Android we
  keep only the MediaTransport import for transport.close(). The impl
  block now has two `start()` methods: the full CPAL-backed one for
  desktop, and a 6-line Android stub that returns `Err("audio engine
  not yet wired on Android")` so attempts to `connect` from the UI
  fail cleanly.

Goal: verify that linking in the compiled engine module (plus the
types it references) on Android doesn't regress the working baseline.
Home screen should still render and register_signal should still work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:56:02 +04:00
Siavash Sameni
530993854f revert(android): roll back to build #6 (35642d1) — pre-oboe known-good state
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m51s
Spent 10+ builds chasing a __init_tcb+4 / pthread_create SIGSEGV after
adding the oboe audio backend. Every "fix" made things worse. Reverting
all Android-specific files to the state at 35642d1 (build #6), which
was the last commit where the Tauri Android app actually launched,
rendered the home screen, and successfully registered on a relay.

Reverted files (all back to their 35642d1 content):
  - desktop/src-tauri/Cargo.toml        (no build-dep cc, no tracing-android)
  - desktop/src-tauri/build.rs          (git hash only, no Oboe / cc build)
  - desktop/src-tauri/src/lib.rs        (engine cfg-gated on non-android)
  - desktop/src-tauri/src/main.rs       (two-line desktop entry)
  - desktop/src-tauri/src/engine.rs     (desktop-only audio setup)
  - scripts/Dockerfile.android-builder  (no android24→26 clang shim)
  - scripts/build-tauri-android.sh      (no linker env vars / manifest patch)

Deleted (were added between b314138 and e2e023d):
  - desktop/src-tauri/cpp/getauxval_fix.c
  - desktop/src-tauri/cpp/oboe_bridge.{h,cpp}
  - desktop/src-tauri/cpp/oboe_stub.cpp
  - desktop/src-tauri/src/oboe_audio.rs

Next: rebuild image on remote (to drop the baked-in clang shim), build
an APK, install on Pixel 6, verify the UI renders the same way build #6
did. From there we add features back ONE at a time so we can actually
bisect which one triggers the tao::ndk_glue crash. User's rule:
"if you want to change stack, change incrementally, so we can debug".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:22:57 +04:00
Siavash Sameni
b314138caf feat(android): oboe/AAudio audio backend + runtime mic permission (step 3)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
This is the big one — the Tauri Android app now has a real audio stack
capable of full-duplex VoIP, reusing the proven C++ Oboe bridge from the
legacy wzp-android crate.

Architecture:
- desktop/src-tauri/cpp/  — copies of oboe_bridge.{h,cpp}, oboe_stub.cpp,
  and getauxval_fix.c from crates/wzp-android/cpp/. build.rs clones
  google/oboe@1.8.1 into OUT_DIR and compiles the bridge + all Oboe
  sources as "oboe_bridge" static lib, linking against shared libc++
  (static would pull broken libc stubs that SIGSEGV in .so libraries).
- src/oboe_audio.rs  — Rust side: an SPSC ring buffer matching the C++
  bridge's AtomicI32 layout, plus OboeHandle::start() which returns
  (capture_ring, playout_ring, owning_handle). The ring exposes the same
  (available / read / write) methods as wzp_client::audio_ring::AudioRing
  so CallEngine treats both backends interchangeably.
- src/engine.rs  — compiled on every platform now. A cfg-switched type
  alias picks wzp_client::audio_ring::AudioRing on desktop and
  crate::oboe_audio::AudioRing on Android. The audio setup block has
  three branches: VPIO/CPAL on macOS, CPAL on Linux/Windows, Oboe on
  Android. Send/recv tasks are identical across platforms.
- src/lib.rs  — removes all the "step 3 not done" Android stubs. The
  engine module is no longer cfg-gated; connect / disconnect / toggle_mic
  / toggle_speaker / get_status are single implementations used by both
  desktop and Android. Identity path resolves via app.path().app_data_dir()
  from the Tauri setup() callback (already wired in step 1).

Runtime mic permission:
- scripts/build-tauri-android.sh now injects RECORD_AUDIO + MODIFY_AUDIO_
  SETTINGS into gen/android/app/src/main/AndroidManifest.xml after init,
  and overwrites MainActivity.kt with a version that calls
  ActivityCompat.requestPermissions in onCreate. This is idempotent:
  every build re-applies the patches so tauri re-init can't regress them.

Cargo.toml:
- cc is now an unconditional build-dep (build.rs runs on the host, so
  target-gating build-deps doesn't work).
- wzp-client is now a dep on every platform. On Android it gets default
  features only (no "audio"/"vpio") so CPAL isn't dragged in — oboe_audio
  provides the capture/playout rings instead.
- tracing-android is added on Android so tracing events flow into logcat.

build.rs also gained embedded git hash (WZP_GIT_HASH) capture, which is
shown under the fingerprint on the home screen — already committed in
7639aaf, reinstated here alongside the Oboe build logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:40:38 +04:00
Siavash Sameni
395a0c557e feat: TX/RX codec badges on desktop call screen
Some checks failed
Mirror to GitHub / mirror (push) Failing after 34s
Build Release Binaries / build-amd64 (push) Failing after 2m1s
Desktop now shows codec badges like Android:
- Green TX badge: e.g. "Opus64k"
- Blue RX badge: e.g. "Opus24k"
Displayed in the stats line below the call controls.

Engine tracks tx_codec (set on encoder init) and rx_codec (updated
from incoming packet headers). Passed through EngineStatus → CallStatus
→ frontend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:03:20 +04:00
Siavash Sameni
da593f9510 feat: relay-grouped participant rendering + relay_label in protocol
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 1m47s
RoomParticipant now has optional relay_label field. Desktop client
groups participants by relay: "This Relay" (green dot) for local,
peer label (blue dot) for federated. Shows all relays in the chain
including intermediate ones.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:22:05 +04:00
Siavash Sameni
a8c2011445 feat: add Opus 32k/48k/64k studio quality tiers
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Has been cancelled
Adds three new codec IDs (Opus32k=6, Opus48k=7, Opus64k=8) and
corresponding STUDIO_32K, STUDIO_48K, STUDIO_64K quality profiles.
All use 20ms frames with minimal FEC (10%) for maximum quality on
good networks.

Updated across: wire protocol (codec_id.rs), encoder/decoder
(opus_enc/dec.rs), adaptive codec switch (call.rs), CLI
(--profile studio-64k), desktop engine + UI slider (8 quality
levels from Studio 64k green to Codec2 1.2k red).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:31:05 +04:00
Siavash Sameni
369347ce54 fix: remove unused FRAME_SAMPLES_20MS constant in desktop engine
Some checks failed
Mirror to GitHub / mirror (push) Failing after 35s
Build Release Binaries / build-amd64 (push) Failing after 1m53s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:54:13 +04:00
Siavash Sameni
85c2146760 feat: quality profile selection in desktop settings
Some checks failed
Mirror to GitHub / mirror (push) Failing after 35s
Build Release Binaries / build-amd64 (push) Failing after 1m58s
Adds a Quality dropdown (Auto / Opus 24k / Opus 6k / Codec2 3.2k /
Codec2 1.2k) to both the connect screen and settings panel. The
selected profile is passed through to the engine which configures
the encoder and decoder accordingly.

The desktop engine recv path now auto-switches the decoder codec
when incoming packets use a different codec than expected, enabling
cross-codec interop between clients on different quality settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:44:17 +04:00
Siavash Sameni
dddf5d2e2d feat: relay ping with RTT display, fix dead_code warning
Some checks failed
Build Release Binaries / build-amd64 (push) Has been cancelled
- New ping_relay Tauri command: QUIC connect with 3s timeout, returns RTT ms
- Relay status shown next to input field: "42ms" (green) or "offline" (red)
- Auto-pings on app startup and debounced on relay input change
- Fix SyncWrapper dead_code warning with #[allow(dead_code)]

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 12:41:28 +04:00
Siavash Sameni
21f5b24cbf fix: keep audio handles alive for call duration, fix Send+Sync
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m39s
The VPIO/CPAL audio handles were dropped at the end of start(),
killing the audio unit immediately. Audio I/O stopped working
after the first frame.

- Store audio handle in CallEngine via SyncWrapper
- Drop MutexGuard before returning from status() (Send future)
- Audio streams now live for the entire call duration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 12:00:16 +04:00
Siavash Sameni
9b733010ab fix: blocking_lock panic in status(), fingerprint copy-to-clipboard
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m13s
- Change status() from blocking_lock to async lock().await —
  fixes "Cannot block the current thread from within a runtime" panic
  that froze the call timer and broke audio
- Click fingerprint to copy to clipboard (both connect and settings screens)
- Show "Copied!" feedback on click

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:53:31 +04:00
Siavash Sameni
80d5bd7628 fix: survive QUIC congestion — drop packets instead of killing send task
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m14s
send_datagram() returns Err(Blocked) when the QUIC congestion window
is full. This is transient — the window reopens once ACKs arrive.
Previously, all send paths treated this as fatal (break/return),
which killed the send task and cascaded via tokio::select! to kill
the entire call.

Now: log warning, drop the packet, continue. Brief audio glitch
(20-100ms) instead of complete call death. FEC on the receiver
side recovers most dropped packets.

Fixed in:
- CLI run_live send task (continue + error counter)
- CLI run_file_mode send paths (2 locations)
- Desktop engine send task

Also hardened recv tasks: transient errors (non-closed/reset)
are survived instead of causing exit.

Matches the fix applied to Android client (engine.rs).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:48:20 +04:00
Siavash Sameni
f726f8cfa4 feat: desktop GUI enhancements — audio level, call timer, VPIO, settings
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m47s
- Audio level meter with log-scale RMS visualization
- Call duration timer
- VPIO (OS AEC) wired through to engine with fallback to CPAL
- "You" badge on own participant entry
- Recent rooms list (click to reuse)
- Enter key to connect from form fields
- Improved dark theme with pulse animation on status dot
- Settings persistence via localStorage (relay, room, alias, AEC, recent rooms)
- Fingerprint display on connect screen
- Keyboard shortcuts skip input fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:40:07 +04:00
Siavash Sameni
e468454464 feat: Tauri desktop GUI app with call engine
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m27s
- New desktop/ directory with Tauri v2 + Vite + TypeScript
- Rust backend: CallEngine wrapping wzp-client audio + transport
- Web frontend: connect screen, in-call screen with participants,
  mic/speaker mute, keyboard shortcuts (m/s/q)
- Dark theme UI, settings persistence via localStorage
- Platform-aware --os-aec: warns on Windows/Linux (not yet implemented)
- Workspace updated to include desktop/src-tauri

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:25:54 +04:00