fix(p2p): Phase 5.6 — direct-path head start + hangup propagation + media debug events

Three fixes from a field-test log where same-LAN calls were
still losing the dual-path race to the relay path, peers were
getting stuck on an empty call screen when the other side
hung up, and 1-way audio was hard to diagnose because the
GUI debug log had no media-level events.

## 1. Direct-path 500ms head start (dual_path.rs)

The race was resolving in ~105ms with Relay winning even when
both phones were on the same MikroTik LAN with valid IPv6 host
candidates. Root cause: the relay dial is a plain outbound QUIC
connect that completes in whatever the client→relay RTT is
(~100ms), while the direct path needs the PEER to also process
its CallSetup, spin up its own race, and complete at least one
LAN dial back to us. That cross-client sequence reliably takes
longer than 100ms, so relay always won.

Fix: delay the relay_fut with `tokio::time::sleep(500ms)` before
starting its connect. Same-LAN direct dials complete in 30-50ms
typically, so the head start gives direct plenty of time to win
cleanly. Users on setups where direct genuinely can't work
(LTE-to-LTE cross-carrier) pay 500ms extra on the relay fallback,
which is invisible for a call setup.

## 2. Hangup propagation via a new hangup_call command (lib.rs + main.ts)

The hangup button was calling `disconnect` which stopped the
local media engine but never sent a SignalMessage::Hangup to
the relay. The peer never got notified and was stuck on the
call screen with silent audio. My earlier fix (commit e75b045)
only handled the RECEIVE side — auto-dismiss call screen on
recv:Hangup — but the SEND side was still missing.

New Tauri command `hangup_call`:
  1. Acquire state.signal.lock(), send SignalMessage::Hangup
     over the signal transport (best-effort; log + continue if
     signal is down)
  2. Acquire state.engine.lock(), stop the CallEngine

JS hangupBtn click handler now calls hangup_call with a fallback
to raw disconnect if the command is missing (older builds).

## 3. Media debug events (engine.rs + lib.rs)

Threaded tauri::AppHandle into CallEngine::start so the send/
recv tasks can emit call-debug events when the user has debug
logs enabled. Added on the Android branch (desktop branch
accepts the arg for API symmetry but doesn't emit yet):

  - media:first_send — emitted when the first encoded frame is
    handed to the transport. Useful for 1-way audio diagnosis:
    if this fires on side A but side B never sees media:first_recv,
    A's outbound is broken.
  - media:first_recv — emitted when the first packet from the
    peer arrives. Mirror of first_send.
  - media:send_heartbeat — every 2s with frames_sent, last_rms,
    last_pkt_bytes, short_reads, drops. A stalled last_rms
    (== 0) tells you the mic isn't producing samples; a frozen
    frames_sent tells you the encode pipeline hung.
  - media:recv_heartbeat — every 2s with recv_fr, decoded_frames,
    last_written, written_samples, decode_errs, codec. Mirror
    invariants for the inbound direction.

All four are gated by `call_debug_logs_enabled()` via
`emit_call_debug`, so they only show up in the GUI log when the
user has the Call Flow Debug Logs checkbox on. Tracing::info!
still runs unconditionally so logcat (adb) keeps its copy
regardless.

The `emit_call_debug` fn in lib.rs is now `pub(crate)` so
engine.rs can call it via `crate::emit_call_debug`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
Siavash Sameni
2026-04-12 07:55:41 +04:00
parent fa038df057
commit 16793be36f
4 changed files with 193 additions and 4 deletions

View File

@@ -63,7 +63,7 @@ fn set_call_debug_logs_internal(on: bool) {
/// Also mirrors to `tracing::info!` so logcat keeps its copy
/// regardless of the flag — the toggle only controls the GUI
/// overlay, not the underlying Android log stream.
fn emit_call_debug(
pub(crate) fn emit_call_debug(
app: &tauri::AppHandle,
step: &str,
details: serde_json::Value,
@@ -469,7 +469,8 @@ async fn connect(
let app_clone = app.clone();
emit_call_debug(&app, "connect:call_engine_starting", serde_json::json!({}));
match CallEngine::start(relay, room, alias, os_aec, quality, reuse_endpoint, pre_connected_transport, move |event_kind, message| {
let app_for_engine = app.clone();
match CallEngine::start(relay, room, alias, os_aec, quality, reuse_endpoint, pre_connected_transport, app_for_engine, move |event_kind, message| {
let _ = app_clone.emit(
"call-event",
CallEvent {
@@ -1531,6 +1532,73 @@ async fn deregister(state: tauri::State<'_, Arc<AppState>>) -> Result<(), String
Ok(())
}
/// End the current call, telling the peer via a signal-plane
/// `Hangup` message before tearing down the local media engine.
///
/// Prior to this command existing, the hangup button just called
/// `disconnect` which stopped the local engine but didn't notify
/// the peer — so the OTHER party stayed on the call screen with
/// nothing to hear. The relay DOES notice the media connection
/// closing but doesn't forward anything to the peer on its own,
/// so a real `SignalMessage::Hangup` is the only reliable signal.
///
/// Best-effort: if the signal transport is down (e.g. the relay
/// dropped us mid-call), we still tear down the engine locally
/// and return success. The peer's CallEngine will eventually
/// notice the media side dying and the signal-event hangup
/// handler will fire on receiving it from their signal loop if
/// the relay is still up on their side.
#[tauri::command]
async fn hangup_call(
state: tauri::State<'_, Arc<AppState>>,
app: tauri::AppHandle,
) -> Result<(), String> {
use wzp_proto::SignalMessage;
emit_call_debug(&app, "hangup_call:start", serde_json::json!({}));
// Step 1: send Hangup over the signal channel so the relay
// forwards it to the peer. Do this FIRST so the peer gets
// the notification even if the engine shutdown takes a beat.
{
let sig = state.signal.lock().await;
if let Some(ref transport) = sig.transport {
match transport
.send_signal(&SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal,
})
.await
{
Ok(()) => {
tracing::info!("hangup_call: Hangup signal sent to relay");
emit_call_debug(&app, "hangup_call:signal_sent", serde_json::json!({}));
}
Err(e) => {
tracing::warn!(error = %e, "hangup_call: failed to send Hangup signal");
emit_call_debug(
&app,
"hangup_call:signal_send_failed",
serde_json::json!({ "error": e.to_string() }),
);
}
}
} else {
tracing::debug!("hangup_call: no signal transport, skipping Hangup send");
emit_call_debug(&app, "hangup_call:no_signal_transport", serde_json::json!({}));
}
}
// Step 2: tear down the local media engine.
let mut engine_lock = state.engine.lock().await;
if let Some(engine) = engine_lock.take() {
engine.stop().await;
emit_call_debug(&app, "hangup_call:engine_stopped", serde_json::json!({}));
} else {
emit_call_debug(&app, "hangup_call:no_engine", serde_json::json!({}));
}
Ok(())
}
// ─── App entry point ─────────────────────────────────────────────────────────
/// Shared Tauri app builder. Used by the desktop `main.rs` and the mobile
@@ -1598,6 +1666,7 @@ pub fn run() {
connect, disconnect, toggle_mic, toggle_speaker, get_status,
register_signal, place_call, answer_call, get_signal_status,
get_reflected_address, detect_nat_type,
hangup_call,
deregister,
set_speakerphone, is_speakerphone_on,
get_call_history, get_recent_contacts, clear_call_history,