22 Commits

Author SHA1 Message Date
Siavash Sameni
75bc72a884 docs: add BRANCH-android-rewrite.md and update ARCH/ADMIN/USER_GUIDE
Documents the android-rewrite branch story end-to-end:
- Why the Kotlin+JNI stack was abandoned (stack overflow, libcrypto
  TLS race, __init_tcb TCB leak, ring runtime reuse crash)
- The Tauri 2.x Mobile pivot that reuses the desktop codebase verbatim
- Android-specific pieces: wzp-native standalone cdylib loaded via
  libloading, android_audio.rs JVM routing, Oboe audio config quirks
- Build pipeline via build-tauri-android.sh + wzp-android-builder image
- Known quirks (API 34/36 coexistence, NDK path absolutes, etc.)

Also appends shared-doc sections (identical on both branches):
- ARCHITECTURE.md: "Audio Backend Architecture (Platform Matrix)"
  covering CPAL / VPIO / WASAPI / Oboe backends, selection matrix,
  the wzp-native cdylib rationale, and the vendored audiopus_sys fix.
- ADMINISTRATION.md: "Build Pipelines" with Docker images
  (wzp-android-builder, wzp-windows-builder), per-pipeline usage
  (Android APK, Linux x86_64, Windows .exe), the Hetzner Cloud
  alternative, ntfy/rustypaste integration, and credential locations.
- USER_GUIDE.md: "Direct 1:1 Calling (Desktop + Android)" covering
  history + recent contacts + deregister UI, and "Windows AEC
  Variants" explaining the AEC vs noAEC builds and driver caveats.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:20:12 +04:00
Siavash Sameni
6aa52accef feat(android): Tauri 2.x mobile build infrastructure
Adds infrastructure for building the Tauri 2.x Android app (the pivot
away from the Kotlin+JNI approach whose stack overflow / libcrypto TLS
crash / thread lifecycle hell is documented in the incident report):

- scripts/Dockerfile.android-builder: extended to support both the
  legacy Kotlin+JNI pipeline (cargo-ndk + Gradle) and the new Tauri
  mobile pipeline (tauri-cli + Node/npm). Adds Node.js 20 LTS, API
  level 36 + build-tools 35.0.0, and additional apt packages.
- scripts/build-tauri-android.sh: fire-and-forget remote build via
  Docker on SepehrHomeserverdk, with ntfy.sh notifications and
  rustypaste upload of the resulting APK. Mirrors the pattern of
  build-tauri-android-docker.sh but targets the new Tauri pipeline.
- docs/incident-tauri-android-init-tcb.md: postmortem of the Kotlin+JNI
  crash cascade that drove the Tauri mobile rewrite decision. Covers
  the __init_tcb / pthread_create bionic private symbol leak, the
  staticlib + cdylib crate-type interaction, the Dispatchers.IO 512 KB
  thread stack overflow, and the tokio runtime / libcrypto TLS race.
- scripts/mint-tmux.sh, scripts/prep-linux-mint.sh: general dev
  infrastructure (tmux + Linux Mint workstation prep scripts).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:06:46 +04:00
Siavash Sameni
d0c17317ea fix: generate seed if empty on register (fresh install), add JNI debug logging
Some checks failed
Mirror to GitHub / mirror (push) Failing after 41s
Build Release Binaries / build-amd64 (push) Failing after 3m38s
2026-04-09 10:21:59 +04:00
Siavash Sameni
5799d18aee debug: add tracing to nativeSignalConnect entry
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
2026-04-09 10:17:13 +04:00
Siavash Sameni
46c9ee1be3 fix: single thread for entire signal lifecycle — runtime never dropped (libcrypto TLS fix)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m52s
2026-04-09 10:11:33 +04:00
Siavash Sameni
b53eae9192 fix: split start() into connect+register (inline) + run() (separate thread) — avoids thread::spawn closure stack overflow
Some checks failed
Mirror to GitHub / mirror (push) Failing after 35s
Build Release Binaries / build-amd64 (push) Failing after 3m26s
2026-04-09 10:02:07 +04:00
Siavash Sameni
a3f54566d4 fix: call nativeSignalConnect from 8MB Java Thread, not Dispatchers.IO
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 3m54s
2026-04-09 09:50:30 +04:00
Siavash Sameni
76e9fe5e43 fix: single thread+runtime for signal lifecycle — avoids ring/libcrypto TLS conflict on pthread_exit
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
2026-04-09 09:44:46 +04:00
Siavash Sameni
b0a89d4f39 docs: PRD for desktop direct calling backport + UI fixes
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
2026-04-09 09:39:50 +04:00
Siavash Sameni
abc96e8887 refactor: separate SignalManager from WzpEngine for direct calling
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m40s
SignalManager (NEW):
- Dedicated Rust struct with its own QUIC connection to _signal
- Separate JNI handle (nativeSignalConnect/GetState/PlaceCall/etc)
- Kotlin wrapper polls state every 500ms via getState() JSON
- Lives independently of WzpEngine — survives across calls
- connect() blocks briefly on 8MB thread, then recv loop runs on dedicated thread

WzpEngine (CLEANED):
- Back to pure media-only role (audio, codec, FEC, jitter)
- Removed start_signaling/place_call/answer_call methods
- Removed signal_transport/signal_fingerprint from EngineState

CallViewModel:
- Two separate managers: signalManager (persistent) + engine (per-call)
- Two separate polling loops: signalPollJob + statsJob
- Auto-connect to media room when signal polling detects "setup" state
- hangupDirectCall() ends media but keeps signal alive

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 09:34:36 +04:00
Siavash Sameni
3a6ae61f8d fix: show real identity fingerprint (SHA-256 full format) on Android home screen
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 1m30s
2026-04-09 09:12:47 +04:00
Siavash Sameni
4c536d256b fix: install rustls crypto provider once in nativeInit, not per-thread (libcrypto TLS conflict)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 4m18s
2026-04-09 09:07:40 +04:00
Siavash Sameni
b0ec9ff4ab fix: signal mode UI + place_call via stored signal transport
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
- Don't set callState for signal-only states (prevents auto-join room)
- Store signal transport + fingerprint in EngineState after registration
- place_call/answer_call send directly via signal transport (not command channel)
- Spawn small threads for async signal sends (non-blocking)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:58:22 +04:00
Siavash Sameni
5855533a39 fix: start stats polling before blocking startSignaling call
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 3m46s
2026-04-09 08:38:06 +04:00
Siavash Sameni
ed09c2e8cc fix: use block_on pattern for signaling (same as start_call) — no thread::spawn
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m50s
2026-04-09 08:33:08 +04:00
Siavash Sameni
f44306cc17 fix: move ALL signaling code into JNI-spawned 8MB thread — zero Rust on caller stack
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m51s
2026-04-09 08:19:48 +04:00
Siavash Sameni
0b821585ab fix: call nativeStartSignaling from Java Thread with 8MB stack, not Kotlin IO dispatcher
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m32s
2026-04-09 08:10:22 +04:00
Siavash Sameni
faec332a8c fix: remove panic::catch_unwind from nativeStartSignaling — stack overflow on Android
Some checks failed
Mirror to GitHub / mirror (push) Failing after 42s
Build Release Binaries / build-amd64 (push) Failing after 3m28s
2026-04-09 08:04:47 +04:00
Siavash Sameni
fe9ae276dc fix: move all crypto/network work to spawned 8MB thread — Android stack too small
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m25s
2026-04-09 07:16:54 +04:00
Siavash Sameni
4fbf6770c4 fix: Android signal thread stack overflow + add version marker to UI
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m47s
- Spawn signaling on dedicated thread with 4MB stack instead of using
  Android's IO dispatcher thread (insufficient stack for tokio + QUIC)
- Add "direct-call-v1" version marker to home screen subtitle

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 07:10:07 +04:00
Siavash Sameni
30a893a73f fix: remove duplicate TextAlign import causing Android build failure
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m34s
2026-04-09 06:54:45 +04:00
Siavash Sameni
d46f3b1deb fix: show more Gradle output in build log for debugging
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m55s
2026-04-09 06:48:14 +04:00
108 changed files with 2199 additions and 26138 deletions

3560
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -10,8 +10,6 @@ members = [
"crates/wzp-client", "crates/wzp-client",
"crates/wzp-web", "crates/wzp-web",
"crates/wzp-android", "crates/wzp-android",
"crates/wzp-native",
"desktop/src-tauri",
] ]
[workspace.package] [workspace.package]
@@ -37,14 +35,7 @@ quinn = "0.11"
raptorq = "2" raptorq = "2"
# Codec # Codec
# opusic-c: high-level safe bindings over libopus 1.5.2 (encoder side). audiopus = "0.3.0-rc.0"
# opusic-sys: raw FFI for the decoder side — we build our own DecoderHandle
# because opusic-c::Decoder.inner is pub(crate) and cannot be reached for the
# Phase 3 DRED reconstruction path. See docs/PRD-dred-integration.md.
# Pinned exactly (no caret) for reproducible libopus 1.5.2 across the fleet.
opusic-c = { version = "=1.5.5", default-features = false, features = ["bundled", "dred"] }
opusic-sys = { version = "=0.6.0", default-features = false, features = ["bundled"] }
bytemuck = "1"
codec2 = "0.3" codec2 = "0.3"
# Crypto # Crypto
@@ -62,29 +53,3 @@ wzp-fec = { path = "crates/wzp-fec" }
wzp-crypto = { path = "crates/wzp-crypto" } wzp-crypto = { path = "crates/wzp-crypto" }
wzp-transport = { path = "crates/wzp-transport" } wzp-transport = { path = "crates/wzp-transport" }
wzp-client = { path = "crates/wzp-client" } wzp-client = { path = "crates/wzp-client" }
# Fast dev profile: optimized but with debug info and incremental compilation.
# Use with: cargo run --profile dev-fast
[profile.dev-fast]
inherits = "dev"
opt-level = 2
# Optimize heavy compute deps even in debug builds —
# real-time audio needs < 20ms per frame, impossible unoptimized.
[profile.dev.package.nnnoiseless]
opt-level = 3
[profile.dev.package.opusic-sys]
opt-level = 3
[profile.dev.package.raptorq]
opt-level = 3
[profile.dev.package.wzp-codec]
opt-level = 3
[profile.dev.package.wzp-fec]
opt-level = 3
# Phase 0 (opus-DRED): removed the [patch.crates-io] audiopus_sys = { path =
# "vendor/audiopus_sys" } block. That patch existed to fix a Windows clang-cl
# SIMD compile bug in libopus 1.3.1. With the swap to opusic-sys (libopus
# 1.5.2), the upstream SIMD gating was fixed and the vendor patch is
# obsolete. The vendor/audiopus_sys directory itself should be deleted as
# part of the same cleanup — see the commit that follows this Phase 0.

View File

@@ -46,14 +46,6 @@ class DebugReporter(private val context: Context) {
val zipFile = File(context.cacheDir, "wzp_debug_${timestamp}.zip") val zipFile = File(context.cacheDir, "wzp_debug_${timestamp}.zip")
ZipOutputStream(BufferedOutputStream(FileOutputStream(zipFile))).use { zos -> ZipOutputStream(BufferedOutputStream(FileOutputStream(zipFile))).use { zos ->
// Phase 4: extract DRED / classical PLC counters from the
// stats JSON so they're visible in the meta preamble at a
// glance, not buried in the trailing JSON dump.
val dredReconstructions = extractLongField(finalStatsJson, "dred_reconstructions")
val classicalPlc = extractLongField(finalStatsJson, "classical_plc_invocations")
val framesDecoded = extractLongField(finalStatsJson, "frames_decoded")
val fecRecovered = extractLongField(finalStatsJson, "fec_recovered")
// 1. Call metadata // 1. Call metadata
val meta = buildString { val meta = buildString {
appendLine("=== WZ Phone Debug Report ===") appendLine("=== WZ Phone Debug Report ===")
@@ -66,18 +58,6 @@ class DebugReporter(private val context: Context) {
appendLine("Device: ${android.os.Build.MANUFACTURER} ${android.os.Build.MODEL}") appendLine("Device: ${android.os.Build.MANUFACTURER} ${android.os.Build.MODEL}")
appendLine("Android: ${android.os.Build.VERSION.RELEASE} (API ${android.os.Build.VERSION.SDK_INT})") appendLine("Android: ${android.os.Build.VERSION.RELEASE} (API ${android.os.Build.VERSION.SDK_INT})")
appendLine() appendLine()
appendLine("=== Loss Recovery ===")
appendLine("Frames decoded: $framesDecoded")
appendLine("DRED reconstructions: $dredReconstructions (Opus neural recovery)")
appendLine("Classical PLC: $classicalPlc (fallback)")
appendLine("RaptorQ FEC recovered: $fecRecovered (Codec2 only)")
if (framesDecoded > 0) {
val dredPct = 100.0 * dredReconstructions / framesDecoded
val plcPct = 100.0 * classicalPlc / framesDecoded
appendLine("DRED rate: ${"%.2f".format(dredPct)}%")
appendLine("Classical PLC rate: ${"%.2f".format(plcPct)}%")
}
appendLine()
appendLine("=== Final Stats ===") appendLine("=== Final Stats ===")
appendLine(finalStatsJson) appendLine(finalStatsJson)
} }
@@ -215,28 +195,4 @@ class DebugReporter(private val context: Context) {
FileInputStream(file).use { it.copyTo(zos) } FileInputStream(file).use { it.copyTo(zos) }
zos.closeEntry() zos.closeEntry()
} }
/**
* Tiny JSON field extractor — pulls an integer value for a top-level
* field like `"dred_reconstructions":42`. We don't want to pull in a
* full JSON parser just for the debug preamble, and the CallStats
* output is a flat record with well-known field names.
*
* Returns 0 if the field is missing or unparseable.
*/
private fun extractLongField(json: String, field: String): Long {
val key = "\"$field\":"
val idx = json.indexOf(key)
if (idx < 0) return 0
var i = idx + key.length
// Skip whitespace
while (i < json.length && json[i].isWhitespace()) i++
val start = i
while (i < json.length && (json[i].isDigit() || json[i] == '-')) i++
return try {
json.substring(start, i).toLong()
} catch (_: NumberFormatException) {
0
}
}
} }

View File

@@ -0,0 +1,97 @@
package com.wzp.engine
import org.json.JSONObject
/**
* Persistent signal connection for direct 1:1 calls.
* Separate from WzpEngine — survives across calls.
*
* Lifecycle: connect() → [placeCall/answerCall] → destroy()
*/
class SignalManager {
private var handle: Long = 0L
val isConnected: Boolean get() = handle != 0L
/**
* Connect to relay and register for direct calls.
* MUST be called from a thread with sufficient stack (8MB).
* Blocks briefly during QUIC connect + register, then returns.
*/
fun connect(relay: String, seedHex: String): Boolean {
if (handle != 0L) return true // already connected
handle = nativeSignalConnect(relay, seedHex)
return handle != 0L
}
/** Get current signal state as parsed object. Non-blocking. */
fun getState(): SignalState {
if (handle == 0L) return SignalState()
val json = nativeSignalGetState(handle) ?: return SignalState()
return try {
val obj = JSONObject(json)
SignalState(
status = obj.optString("status", "idle"),
fingerprint = obj.optString("fingerprint", ""),
incomingCallId = if (obj.isNull("incoming_call_id")) null else obj.optString("incoming_call_id"),
incomingCallerFp = if (obj.isNull("incoming_caller_fp")) null else obj.optString("incoming_caller_fp"),
incomingCallerAlias = if (obj.isNull("incoming_caller_alias")) null else obj.optString("incoming_caller_alias"),
callSetupRelay = if (obj.isNull("call_setup_relay")) null else obj.optString("call_setup_relay"),
callSetupRoom = if (obj.isNull("call_setup_room")) null else obj.optString("call_setup_room"),
callSetupId = if (obj.isNull("call_setup_id")) null else obj.optString("call_setup_id"),
)
} catch (e: Exception) {
SignalState()
}
}
/** Place a direct call to a target fingerprint. */
fun placeCall(targetFp: String): Int {
if (handle == 0L) return -1
return nativeSignalPlaceCall(handle, targetFp)
}
/** Answer an incoming call. mode: 0=Reject, 1=AcceptTrusted, 2=AcceptGeneric */
fun answerCall(callId: String, mode: Int = 2): Int {
if (handle == 0L) return -1
return nativeSignalAnswerCall(handle, callId, mode)
}
/** Send hangup signal. */
fun hangup() {
if (handle != 0L) nativeSignalHangup(handle)
}
/** Destroy the signal manager. */
fun destroy() {
if (handle != 0L) {
nativeSignalDestroy(handle)
handle = 0L
}
}
// JNI native methods
private external fun nativeSignalConnect(relay: String, seed: String): Long
private external fun nativeSignalGetState(handle: Long): String?
private external fun nativeSignalPlaceCall(handle: Long, targetFp: String): Int
private external fun nativeSignalAnswerCall(handle: Long, callId: String, mode: Int): Int
private external fun nativeSignalHangup(handle: Long)
private external fun nativeSignalDestroy(handle: Long)
companion object {
init { System.loadLibrary("wzp_android") }
}
}
/** Signal connection state. */
data class SignalState(
val status: String = "idle",
val fingerprint: String = "",
val incomingCallId: String? = null,
val incomingCallerFp: String? = null,
val incomingCallerAlias: String? = null,
val callSetupRelay: String? = null,
val callSetupRoom: String? = null,
val callSetupId: String? = null,
)

View File

@@ -159,6 +159,18 @@ class WzpEngine(private val callback: WzpCallback) {
private external fun nativeWriteAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, sampleCount: Int): Int private external fun nativeWriteAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, sampleCount: Int): Int
private external fun nativeReadAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, maxSamples: Int): Int private external fun nativeReadAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, maxSamples: Int): Int
private external fun nativeDestroy(handle: Long) private external fun nativeDestroy(handle: Long)
companion object {
init { System.loadLibrary("wzp_android") }
/** Get the identity fingerprint for a seed hex. No engine needed. */
@JvmStatic
private external fun nativeGetFingerprint(seedHex: String): String?
/** Compute the full identity fingerprint (xxxx:xxxx:...) from a seed hex string. */
@JvmStatic
fun getFingerprint(seedHex: String): String = nativeGetFingerprint(seedHex) ?: ""
}
private external fun nativePingRelay(handle: Long, relay: String): String? private external fun nativePingRelay(handle: Long, relay: String): String?
private external fun nativeStartSignaling(handle: Long, relay: String, seed: String, token: String, alias: String): Int private external fun nativeStartSignaling(handle: Long, relay: String, seed: String, token: String, alias: String): Int
private external fun nativePlaceCall(handle: Long, targetFp: String): Int private external fun nativePlaceCall(handle: Long, targetFp: String): Int
@@ -208,11 +220,6 @@ class WzpEngine(private val callback: WzpCallback) {
return nativeAnswerCall(nativeHandle, callId, mode) return nativeAnswerCall(nativeHandle, callId, mode)
} }
companion object {
init {
System.loadLibrary("wzp_android")
}
}
} }
/** Integer constants matching the Rust [CallState] enum ordinals. */ /** Integer constants matching the Rust [CallState] enum ordinals. */

View File

@@ -141,9 +141,9 @@ class CallViewModel : ViewModel(), WzpCallback {
private val _targetFingerprint = MutableStateFlow("") private val _targetFingerprint = MutableStateFlow("")
val targetFingerprint: StateFlow<String> = _targetFingerprint.asStateFlow() val targetFingerprint: StateFlow<String> = _targetFingerprint.asStateFlow()
/** Signal connection state: 0=idle, 5=registered, 6=ringing, 7=incoming */ /** Signal state string: "idle", "registered", "ringing", "incoming", "setup" */
private val _signalState = MutableStateFlow(0) private val _signalState = MutableStateFlow("idle")
val signalState: StateFlow<Int> = _signalState.asStateFlow() val signalState: StateFlow<String> = _signalState.asStateFlow()
/** Incoming call info */ /** Incoming call info */
private val _incomingCallId = MutableStateFlow<String?>(null) private val _incomingCallId = MutableStateFlow<String?>(null)
@@ -155,32 +155,80 @@ class CallViewModel : ViewModel(), WzpCallback {
private val _incomingCallerAlias = MutableStateFlow<String?>(null) private val _incomingCallerAlias = MutableStateFlow<String?>(null)
val incomingCallerAlias: StateFlow<String?> = _incomingCallerAlias.asStateFlow() val incomingCallerAlias: StateFlow<String?> = _incomingCallerAlias.asStateFlow()
/** Separate signal manager (persistent, survives calls) */
private var signalManager: com.wzp.engine.SignalManager? = null
private var signalPollJob: Job? = null
fun setCallMode(mode: Int) { _callMode.value = mode } fun setCallMode(mode: Int) { _callMode.value = mode }
fun setTargetFingerprint(fp: String) { _targetFingerprint.value = fp } fun setTargetFingerprint(fp: String) { _targetFingerprint.value = fp }
/** Register on relay for direct calls */ /** Register on relay for direct calls */
fun registerForCalls() { fun registerForCalls() {
if (engine == null) {
engine = WzpEngine(this).also { it.init() }
}
val serverIdx = _selectedServer.value val serverIdx = _selectedServer.value
val serverList = _servers.value val serverList = _servers.value
if (serverIdx >= serverList.size) return if (serverIdx >= serverList.size) return
val relay = serverList[serverIdx].address val relay = serverList[serverIdx].address
val seed = _seedHex.value var seed = _seedHex.value
val alias = _alias.value // Generate seed if empty (fresh install or cleared storage)
if (seed.isEmpty()) {
viewModelScope.launch(Dispatchers.IO) { val newSeed = ByteArray(32).also { java.security.SecureRandom().nextBytes(it) }
seed = newSeed.joinToString("") { "%02x".format(it) }
_seedHex.value = seed
settings?.saveSeedHex(seed)
Log.i(TAG, "generated new identity seed")
}
val resolvedRelay = resolveToIp(relay) ?: relay val resolvedRelay = resolveToIp(relay) ?: relay
val result = engine?.startSignaling(resolvedRelay, seed, "", alias)
if (result == 0) { // nativeSignalConnect has JNI overhead — must be on a thread with enough stack.
_signalState.value = 5 // Registered // Dispatchers.IO threads overflow. Use explicit Java Thread.
startStatsPolling() Thread(null, {
try {
val mgr = com.wzp.engine.SignalManager()
val ok = mgr.connect(resolvedRelay, seed)
viewModelScope.launch {
if (ok) {
signalManager = mgr
startSignalPolling()
} else { } else {
_errorMessage.value = "Failed to register on relay" _errorMessage.value = "Failed to register on relay"
} }
} }
} catch (e: Exception) {
viewModelScope.launch {
_errorMessage.value = "Register error: ${e.message}"
}
}
}, "wzp-signal-init", 8 * 1024 * 1024).start()
}
/** Poll signal manager state every 500ms */
private fun startSignalPolling() {
signalPollJob?.cancel()
signalPollJob = viewModelScope.launch {
while (isActive) {
val mgr = signalManager
if (mgr != null && mgr.isConnected) {
val state = mgr.getState()
_signalState.value = state.status
_incomingCallId.value = state.incomingCallId
_incomingCallerFp.value = state.incomingCallerFp
_incomingCallerAlias.value = state.incomingCallerAlias
// Auto-connect to media room when call is set up
if (state.status == "setup" && state.callSetupRelay != null && state.callSetupRoom != null) {
Log.i(TAG, "CallSetup: connecting to ${state.callSetupRelay} room ${state.callSetupRoom}")
startCallInternal(state.callSetupRelay, state.callSetupRoom)
}
}
delay(500L)
}
}
}
private fun stopSignalPolling() {
signalPollJob?.cancel()
signalPollJob = null
} }
/** Place a direct call to the target fingerprint */ /** Place a direct call to the target fingerprint */
@@ -190,24 +238,28 @@ class CallViewModel : ViewModel(), WzpCallback {
_errorMessage.value = "Enter a fingerprint to call" _errorMessage.value = "Enter a fingerprint to call"
return return
} }
engine?.placeCall(target) signalManager?.placeCall(target)
_signalState.value = 6 // Ringing
} }
/** Answer an incoming direct call */ /** Answer an incoming direct call */
fun answerIncomingCall(mode: Int = 2) { fun answerIncomingCall(mode: Int = 2) {
val callId = _incomingCallId.value ?: return val callId = _incomingCallId.value ?: return
engine?.answerCall(callId, mode) signalManager?.answerCall(callId, mode)
} }
/** Reject an incoming direct call */ /** Reject an incoming direct call */
fun rejectIncomingCall() { fun rejectIncomingCall() {
val callId = _incomingCallId.value ?: return val callId = _incomingCallId.value ?: return
engine?.answerCall(callId, 0) // 0 = Reject signalManager?.answerCall(callId, 0)
_signalState.value = 5 // Back to registered }
_incomingCallId.value = null
_incomingCallerFp.value = null /** Hang up direct call — media ends, signal stays alive */
_incomingCallerAlias.value = null fun hangupDirectCall() {
signalManager?.hangup()
engine?.stopCall()
engine?.destroy()
engine = null
engineInitialized = false
} }
companion object { companion object {
@@ -685,30 +737,10 @@ class CallViewModel : ViewModel(), WzpCallback {
val s = CallStats.fromJson(json) val s = CallStats.fromJson(json)
lastCallDuration = s.durationSecs lastCallDuration = s.durationSecs
_stats.value = s _stats.value = s
// Only update callState from media engine stats (not signal)
if (s.state != 0) { if (s.state != 0) {
_callState.value = s.state _callState.value = s.state
} }
// Track signal state changes for direct calling
if (s.state in 5..7) {
_signalState.value = s.state
}
// Incoming call detection
if (s.state == 7) { // IncomingCall
_incomingCallId.value = s.incomingCallId
_incomingCallerFp.value = s.incomingCallerFp
_incomingCallerAlias.value = s.incomingCallerAlias
}
// CallSetup: auto-connect to media room
if (s.state == 1 && s.incomingCallId != null && s.incomingCallId.contains("|")) {
// Format: "relay_addr|room_name"
val parts = s.incomingCallId.split("|", limit = 2)
if (parts.size == 2) {
val mediaRelay = parts[0]
val mediaRoom = parts[1]
Log.i(TAG, "CallSetup: connecting to $mediaRelay room $mediaRoom")
startCallInternal(mediaRelay, mediaRoom)
}
}
if (s.state == 2 && !audioStarted) { if (s.state == 2 && !audioStarted) {
startAudio() startAudio()
} }

View File

@@ -165,7 +165,7 @@ fun InCallScreen(
color = Color.White color = Color.White
) )
Text( Text(
text = "ENCRYPTED VOICE", text = "ENCRYPTED VOICE \u2022 direct-call-v1",
style = MaterialTheme.typography.labelSmall.copy(letterSpacing = 3.sp), style = MaterialTheme.typography.labelSmall.copy(letterSpacing = 3.sp),
color = TextDim color = TextDim
) )
@@ -219,7 +219,7 @@ fun InCallScreen(
// Mode toggle: Room vs Direct Call // Mode toggle: Room vs Direct Call
val callMode by viewModel.callMode.collectAsState() val callMode by viewModel.callMode.collectAsState()
val signalState by viewModel.signalState.collectAsState() val signalState by viewModel.signalState.collectAsState() // "idle"/"registered"/"ringing"/etc
val targetFp by viewModel.targetFingerprint.collectAsState() val targetFp by viewModel.targetFingerprint.collectAsState()
val incomingCallId by viewModel.incomingCallId.collectAsState() val incomingCallId by viewModel.incomingCallId.collectAsState()
val incomingCallerFp by viewModel.incomingCallerFp.collectAsState() val incomingCallerFp by viewModel.incomingCallerFp.collectAsState()
@@ -309,7 +309,7 @@ fun InCallScreen(
} }
} else { } else {
// ── Direct call mode ── // ── Direct call mode ──
if (signalState < 5) { if (signalState == "idle") {
// Not registered yet // Not registered yet
SectionLabel("ALIAS") SectionLabel("ALIAS")
OutlinedTextField( OutlinedTextField(
@@ -333,7 +333,7 @@ fun InCallScreen(
color = Color.White color = Color.White
) )
} }
} else if (signalState == 5) { } else if (signalState == "registered" || signalState == "incoming") {
// Registered — show dial pad // Registered — show dial pad
Text( Text(
"\u2705 Registered — waiting for calls", "\u2705 Registered — waiting for calls",
@@ -403,8 +403,7 @@ fun InCallScreen(
color = Color.White color = Color.White
) )
} }
} else if (signalState == 6) { } else if (signalState == "ringing") {
// Ringing
Text( Text(
"\uD83D\uDD14 Ringing...", "\uD83D\uDD14 Ringing...",
color = Yellow, color = Yellow,
@@ -412,11 +411,10 @@ fun InCallScreen(
textAlign = TextAlign.Center, textAlign = TextAlign.Center,
modifier = Modifier.fillMaxWidth() modifier = Modifier.fillMaxWidth()
) )
} else if (signalState == 7) { } else if (signalState == "setup") {
// Incoming call (state 7 also handled above in registered view)
Text( Text(
"\uD83D\uDCDE Incoming call...", "Connecting to call...",
color = Green, color = Accent,
style = MaterialTheme.typography.titleMedium, style = MaterialTheme.typography.titleMedium,
textAlign = TextAlign.Center, textAlign = TextAlign.Center,
modifier = Modifier.fillMaxWidth() modifier = Modifier.fillMaxWidth()
@@ -431,14 +429,16 @@ fun InCallScreen(
Spacer(modifier = Modifier.height(20.dp)) Spacer(modifier = Modifier.height(20.dp))
// Identity // Identity — compute real fingerprint from seed
val fp = if (seedHex.length >= 16) seedHex.take(16) else "" val fullFp = remember(seedHex) {
if (seedHex.length >= 64) com.wzp.engine.WzpEngine.getFingerprint(seedHex) else ""
}
Row(verticalAlignment = Alignment.CenterVertically) { Row(verticalAlignment = Alignment.CenterVertically) {
if (fp.isNotEmpty()) { if (fullFp.isNotEmpty()) {
Identicon(fingerprint = seedHex, size = 28.dp) Identicon(fingerprint = fullFp, size = 28.dp)
Spacer(modifier = Modifier.width(8.dp)) Spacer(modifier = Modifier.width(8.dp))
CopyableFingerprint( CopyableFingerprint(
fingerprint = fp.chunked(4).joinToString(":"), fingerprint = fullFp,
style = MaterialTheme.typography.bodySmall.copy(fontFamily = FontFamily.Monospace), style = MaterialTheme.typography.bodySmall.copy(fontFamily = FontFamily.Monospace),
color = TextDim color = TextDim
) )

View File

@@ -14,10 +14,8 @@ use std::sync::{Arc, Mutex};
use std::time::Instant; use std::time::Instant;
use bytes::Bytes; use bytes::Bytes;
use tracing::{debug, error, info, warn}; use tracing::{error, info, warn};
use wzp_codec::AdaptiveDecoder;
use wzp_codec::agc::AutoGainControl; use wzp_codec::agc::AutoGainControl;
use wzp_codec::dred_ffi::{DredDecoderHandle, DredState};
use wzp_crypto::{KeyExchange, WarzoneKeyExchange}; use wzp_crypto::{KeyExchange, WarzoneKeyExchange};
use wzp_fec::{RaptorQFecDecoder, RaptorQFecEncoder}; use wzp_fec::{RaptorQFecDecoder, RaptorQFecEncoder};
use wzp_proto::{ use wzp_proto::{
@@ -203,7 +201,6 @@ impl WzpEngine {
/// Returns JSON `{"rtt_ms":N,"server_fingerprint":"hex"}` or error. /// Returns JSON `{"rtt_ms":N,"server_fingerprint":"hex"}` or error.
pub fn ping_relay(&self, address: &str) -> Result<String, anyhow::Error> { pub fn ping_relay(&self, address: &str) -> Result<String, anyhow::Error> {
let addr: SocketAddr = address.parse()?; let addr: SocketAddr = address.parse()?;
let _ = rustls::crypto::ring::default_provider().install_default();
let rt = tokio::runtime::Builder::new_current_thread() let rt = tokio::runtime::Builder::new_current_thread()
.enable_all() .enable_all()
@@ -247,154 +244,7 @@ impl WzpEngine {
} }
/// Start persistent signaling connection for direct calls. /// Start persistent signaling connection for direct calls.
/// Spawns a background task that maintains the `_signal` connection. // Signal methods (start_signaling, place_call, answer_call) moved to signal_mgr.rs
pub fn start_signaling(
&mut self,
relay_addr: &str,
seed_hex: &str,
token: Option<&str>,
alias: Option<&str>,
) -> Result<(), anyhow::Error> {
use wzp_proto::{MediaTransport, SignalMessage};
let addr: SocketAddr = relay_addr.parse()?;
let seed = if seed_hex.is_empty() {
wzp_crypto::Seed::generate()
} else {
wzp_crypto::Seed::from_hex(seed_hex).map_err(|e| anyhow::anyhow!(e))?
};
let identity = seed.derive_identity();
let pub_id = identity.public_identity();
let identity_pub = *pub_id.signing.as_bytes();
let fp = pub_id.fingerprint.to_string();
let token = token.map(|s| s.to_string());
let alias = alias.map(|s| s.to_string());
let state = self.state.clone();
let seed_bytes = seed.0;
info!(fingerprint = %fp, relay = %addr, "starting signaling");
// Create runtime for signaling (separate from call runtime)
let rt = tokio::runtime::Builder::new_multi_thread()
.worker_threads(1)
.enable_all()
.build()?;
let signal_state = state.clone();
rt.spawn(async move {
let _ = rustls::crypto::ring::default_provider().install_default();
let bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let endpoint = match wzp_transport::create_endpoint(bind, None) {
Ok(e) => e,
Err(e) => { error!("signal endpoint: {e}"); return; }
};
let client_cfg = wzp_transport::client_config();
let conn = match wzp_transport::connect(&endpoint, addr, "_signal", client_cfg).await {
Ok(c) => c,
Err(e) => { error!("signal connect: {e}"); return; }
};
let transport = std::sync::Arc::new(wzp_transport::QuinnTransport::new(conn));
// Auth if token provided
if let Some(ref tok) = token {
let _ = transport.send_signal(&SignalMessage::AuthToken { token: tok.clone() }).await;
}
// Register presence
let _ = transport.send_signal(&SignalMessage::RegisterPresence {
identity_pub,
signature: vec![],
alias: alias.clone(),
}).await;
// Wait for ack
match transport.recv_signal().await {
Ok(Some(SignalMessage::RegisterPresenceAck { success: true, .. })) => {
info!(fingerprint = %fp, "signal: registered");
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::Registered;
}
other => {
error!("signal registration failed: {other:?}");
return;
}
}
// Signal recv loop
loop {
if !signal_state.running.load(Ordering::Relaxed) {
break;
}
match transport.recv_signal().await {
Ok(Some(SignalMessage::CallRinging { call_id })) => {
info!(call_id = %call_id, "signal: ringing");
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::Ringing;
}
Ok(Some(SignalMessage::DirectCallOffer { caller_fingerprint, caller_alias, call_id, .. })) => {
info!(from = %caller_fingerprint, call_id = %call_id, "signal: incoming call");
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::IncomingCall;
stats.incoming_call_id = Some(call_id);
stats.incoming_caller_fp = Some(caller_fingerprint);
stats.incoming_caller_alias = caller_alias;
}
Ok(Some(SignalMessage::DirectCallAnswer { call_id, accept_mode, .. })) => {
info!(call_id = %call_id, mode = ?accept_mode, "signal: call answered");
}
Ok(Some(SignalMessage::CallSetup { call_id, room, relay_addr })) => {
info!(call_id = %call_id, room = %room, relay = %relay_addr, "signal: call setup");
// Connect to media room via the existing start_call mechanism
// Store the room info so Kotlin can call startCall with it
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::Connecting;
// Store call setup info for Kotlin to pick up
stats.incoming_call_id = Some(format!("{relay_addr}|{room}"));
}
Ok(Some(SignalMessage::Hangup { reason })) => {
info!(reason = ?reason, "signal: call ended by remote");
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::Closed;
stats.incoming_call_id = None;
stats.incoming_caller_fp = None;
stats.incoming_caller_alias = None;
}
Ok(Some(_)) => {}
Ok(None) => {
info!("signal: connection closed");
break;
}
Err(e) => {
error!("signal recv error: {e}");
break;
}
}
}
let mut stats = signal_state.stats.lock().unwrap();
stats.state = crate::stats::CallState::Closed;
});
self.tokio_runtime = Some(rt);
Ok(())
}
/// Place a direct call to a target fingerprint via the signal connection.
pub fn place_call(&self, target_fingerprint: &str) -> Result<(), anyhow::Error> {
let _ = self.state.command_tx.send(EngineCommand::PlaceCall {
target_fingerprint: target_fingerprint.to_string(),
});
Ok(())
}
/// Answer an incoming direct call.
pub fn answer_call(&self, call_id: &str, mode: wzp_proto::CallAcceptMode) -> Result<(), anyhow::Error> {
let _ = self.state.command_tx.send(EngineCommand::AnswerCall {
call_id: call_id.to_string(),
accept_mode: mode,
});
Ok(())
}
pub fn set_mute(&self, muted: bool) { pub fn set_mute(&self, muted: bool) {
self.state.muted.store(muted, Ordering::Relaxed); self.state.muted.store(muted, Ordering::Relaxed);
@@ -458,7 +308,6 @@ async fn run_call(
alias: Option<&str>, alias: Option<&str>,
state: Arc<EngineState>, state: Arc<EngineState>,
) -> Result<(), anyhow::Error> { ) -> Result<(), anyhow::Error> {
let _ = rustls::crypto::ring::default_provider().install_default();
let bind_addr: SocketAddr = "0.0.0.0:0".parse().unwrap(); let bind_addr: SocketAddr = "0.0.0.0:0".parse().unwrap();
let endpoint = wzp_transport::create_endpoint(bind_addr, None)?; let endpoint = wzp_transport::create_endpoint(bind_addr, None)?;
@@ -532,12 +381,9 @@ async fn run_call(
stats.state = CallState::Active; stats.state = CallState::Active;
} }
// Initialize codec (Opus or Codec2 based on profile). // Initialize codec (Opus or Codec2 based on profile)
// Phase 3c: decoder is a concrete AdaptiveDecoder (not Box<dyn
// AudioDecoder>) so the recv task can call reconstruct_from_dred on
// gaps detected via sequence tracking.
let mut encoder = wzp_codec::create_encoder(profile); let mut encoder = wzp_codec::create_encoder(profile);
let mut decoder = AdaptiveDecoder::new(profile).expect("failed to create adaptive decoder"); let mut decoder = wzp_codec::create_decoder(profile);
// Initialize FEC encoder/decoder // Initialize FEC encoder/decoder
let mut fec_enc = wzp_fec::create_encoder(&profile); let mut fec_enc = wzp_fec::create_encoder(&profile);
@@ -670,19 +516,6 @@ async fn run_call(
t_opus_us += t0.elapsed().as_micros() as u64; t_opus_us += t0.elapsed().as_micros() as u64;
let encoded = &encode_buf[..encoded_len]; let encoded = &encode_buf[..encoded_len];
// Phase 2: Opus tiers bypass RaptorQ (DRED handles loss recovery
// at the codec layer). Codec2 tiers keep RaptorQ unchanged.
let is_opus = current_profile.codec.is_opus();
let (hdr_fec_block, hdr_fec_symbol, hdr_fec_ratio) = if is_opus {
(0u8, 0u8, 0u8)
} else {
(
block_id,
frame_in_block,
MediaHeader::encode_fec_ratio(current_profile.fec_ratio),
)
};
// Build source packet // Build source packet
let s = seq.fetch_add(1, Ordering::Relaxed); let s = seq.fetch_add(1, Ordering::Relaxed);
let t = ts.fetch_add(frame_samples as u32, Ordering::Relaxed); let t = ts.fetch_add(frame_samples as u32, Ordering::Relaxed);
@@ -693,11 +526,11 @@ async fn run_call(
is_repair: false, is_repair: false,
codec_id: current_profile.codec, codec_id: current_profile.codec,
has_quality_report: false, has_quality_report: false,
fec_ratio_encoded: hdr_fec_ratio, fec_ratio_encoded: MediaHeader::encode_fec_ratio(current_profile.fec_ratio),
seq: s, seq: s,
timestamp: t, timestamp: t,
fec_block: hdr_fec_block, fec_block: block_id,
fec_symbol: hdr_fec_symbol, fec_symbol: frame_in_block,
reserved: 0, reserved: 0,
csrc_count: 0, csrc_count: 0,
}, },
@@ -727,16 +560,14 @@ async fn run_call(
t_send_us += t0.elapsed().as_micros() as u64; t_send_us += t0.elapsed().as_micros() as u64;
frames_sent += 1; frames_sent += 1;
// Codec2-only: feed RaptorQ and emit repair packets when the // Feed encoded frame to FEC encoder
// block is full. Opus tiers skip this entire block — DRED
// (enabled in Phase 1) provides codec-layer loss recovery.
let t0 = Instant::now(); let t0 = Instant::now();
if !is_opus {
if let Err(e) = fec_enc.add_source_symbol(encoded) { if let Err(e) = fec_enc.add_source_symbol(encoded) {
warn!("fec add_source error: {e}"); warn!("fec add_source error: {e}");
} }
frame_in_block += 1; frame_in_block += 1;
// When block is full, generate repair packets
if frame_in_block >= current_profile.frames_per_block { if frame_in_block >= current_profile.frames_per_block {
match fec_enc.generate_repair(current_profile.fec_ratio) { match fec_enc.generate_repair(current_profile.fec_ratio) {
Ok(repairs) => { Ok(repairs) => {
@@ -787,7 +618,6 @@ async fn run_call(
block_id = block_id.wrapping_add(1); block_id = block_id.wrapping_add(1);
frame_in_block = 0; frame_in_block = 0;
} }
}
t_fec_us += t0.elapsed().as_micros() as u64; t_fec_us += t0.elapsed().as_micros() as u64;
t_frames += 1; t_frames += 1;
@@ -829,27 +659,7 @@ async fn run_call(
let mut last_stats_log = Instant::now(); let mut last_stats_log = Instant::now();
let mut quality_ctrl = AdaptiveQualityController::new(); let mut quality_ctrl = AdaptiveQualityController::new();
let mut last_peer_codec: Option<CodecId> = None; let mut last_peer_codec: Option<CodecId> = None;
info!("recv task started (Opus + RaptorQ FEC)");
// Phase 3c: DRED reconstruction state. Unlike the desktop
// CallDecoder (which sits behind a jitter buffer that emits
// Missing signals), engine.rs reads packets directly from the
// transport and decodes straight into the playout ring. Gap
// detection is therefore done via sequence-number tracking:
// when a packet arrives with seq > expected_seq, the frames in
// between are missing and we attempt to reconstruct them via
// DRED before decoding the newly-arrived packet.
let mut dred_decoder =
DredDecoderHandle::new().expect("opus_dred_decoder_create failed");
let mut dred_parse_scratch =
DredState::new().expect("opus_dred_alloc failed (scratch)");
let mut last_good_dred =
DredState::new().expect("opus_dred_alloc failed (good state)");
let mut last_good_dred_seq: Option<u16> = None;
let mut expected_seq: Option<u16> = None;
let mut dred_reconstructions: u64 = 0;
let mut classical_plc_invocations: u64 = 0;
info!("recv task started (Opus + DRED + Codec2/RaptorQ)");
loop { loop {
if !state.running.load(Ordering::Relaxed) { if !state.running.load(Ordering::Relaxed) {
break; break;
@@ -891,21 +701,14 @@ async fn run_call(
let is_repair = pkt.header.is_repair; let is_repair = pkt.header.is_repair;
let pkt_block = pkt.header.fec_block; let pkt_block = pkt.header.fec_block;
let pkt_symbol = pkt.header.fec_symbol; let pkt_symbol = pkt.header.fec_symbol;
let pkt_is_opus = pkt.header.codec_id.is_opus();
// Phase 2: Opus packets bypass RaptorQ entirely — DRED // Feed every packet (source + repair) to FEC decoder
// (enabled Phase 1) handles codec-layer loss recovery,
// and feeding these symbols into the RaptorQ decoder
// would accumulate block_id=0 duplicates that never
// decode. Codec2 packets still feed RaptorQ.
if !pkt_is_opus {
let _ = fec_dec.add_symbol( let _ = fec_dec.add_symbol(
pkt_block, pkt_block,
pkt_symbol, pkt_symbol,
is_repair, is_repair,
&pkt.payload, &pkt.payload,
); );
}
// Source packets: decode directly // Source packets: decode directly
if !is_repair && pkt.header.codec_id != CodecId::ComfortNoise { if !is_repair && pkt.header.codec_id != CodecId::ComfortNoise {
@@ -928,13 +731,6 @@ async fn run_call(
}; };
info!(from = ?decoder.codec_id(), to = ?pkt.header.codec_id, "recv: switching decoder"); info!(from = ?decoder.codec_id(), to = ?pkt.header.codec_id, "recv: switching decoder");
let _ = decoder.set_profile(switch_profile); let _ = decoder.set_profile(switch_profile);
// Profile switch invalidates the cached DRED
// state because samples_available is measured
// in the old profile's sample rate. Reset the
// tracking so we don't try to reconstruct with
// stale offsets.
last_good_dred_seq = None;
expected_seq = None;
} }
// Track peer codec for UI display // Track peer codec for UI display
if last_peer_codec != Some(pkt.header.codec_id) { if last_peer_codec != Some(pkt.header.codec_id) {
@@ -943,109 +739,6 @@ async fn run_call(
stats.peer_codec = format!("{:?}", pkt.header.codec_id); stats.peer_codec = format!("{:?}", pkt.header.codec_id);
} }
} }
// Phase 3c: Opus path — parse DRED state out of
// the current packet FIRST so last_good_dred
// reflects the freshest available reconstruction
// source, then attempt gap recovery against it
// BEFORE decoding this packet's audio. Ordering
// matters because the playout ring is FIFO — gap
// samples must be written before this packet's
// samples, which come next.
if pkt_is_opus {
// Update DRED state from the current packet.
match dred_decoder.parse_into(&mut dred_parse_scratch, &pkt.payload) {
Ok(available) if available > 0 => {
std::mem::swap(
&mut dred_parse_scratch,
&mut last_good_dred,
);
last_good_dred_seq = Some(pkt.header.seq);
}
Ok(_) => {
// Packet carried no DRED — keep cached state.
}
Err(e) => {
debug!("DRED parse error (ignored): {e}");
}
}
// Detect and fill gap from last-expected to this packet.
const MAX_GAP_FRAMES: u16 = 16;
if let Some(expected) = expected_seq {
let gap = pkt.header.seq.wrapping_sub(expected);
if gap > 0 && gap <= MAX_GAP_FRAMES {
let current_profile_frame_samples =
(48_000 * profile.frame_duration_ms as i32) / 1000;
let available = last_good_dred.samples_available();
let pcm_slice_len =
current_profile_frame_samples as usize;
for gap_idx in 0..gap {
let missing_seq = expected.wrapping_add(gap_idx);
// Offset from the DRED anchor (last_good_dred_seq)
// back to the missing seq, in samples. Skip if
// the anchor is not ahead of missing (defensive).
let offset_samples = match last_good_dred_seq {
Some(anchor) => {
let delta = anchor.wrapping_sub(missing_seq);
if delta == 0 || delta > MAX_GAP_FRAMES {
-1 // skip DRED, use PLC
} else {
delta as i32 * current_profile_frame_samples
}
}
None => -1,
};
let reconstructed = if offset_samples > 0
&& offset_samples <= available
{
decoder
.reconstruct_from_dred(
&last_good_dred,
offset_samples,
&mut decode_buf[..pcm_slice_len],
)
.ok()
} else {
None
};
match reconstructed {
Some(samples) => {
playout_agc.process_frame(
&mut decode_buf[..samples],
);
state
.playout_ring
.write(&decode_buf[..samples]);
dred_reconstructions += 1;
frames_decoded += 1;
}
None => {
// Fall through to classical PLC.
if let Ok(samples) =
decoder.decode_lost(&mut decode_buf)
{
playout_agc
.process_frame(&mut decode_buf[..samples]);
state
.playout_ring
.write(&decode_buf[..samples]);
classical_plc_invocations += 1;
frames_decoded += 1;
}
}
}
}
}
}
// Advance the expected-seq tracker for the next arrival.
expected_seq = Some(pkt.header.seq.wrapping_add(1));
}
match decoder.decode(&pkt.payload, &mut decode_buf) { match decoder.decode(&pkt.payload, &mut decode_buf) {
Ok(samples) => { Ok(samples) => {
playout_agc.process_frame(&mut decode_buf[..samples]); playout_agc.process_frame(&mut decode_buf[..samples]);
@@ -1057,21 +750,12 @@ async fn run_call(
if let Ok(samples) = decoder.decode_lost(&mut decode_buf) { if let Ok(samples) = decoder.decode_lost(&mut decode_buf) {
playout_agc.process_frame(&mut decode_buf[..samples]); playout_agc.process_frame(&mut decode_buf[..samples]);
state.playout_ring.write(&decode_buf[..samples]); state.playout_ring.write(&decode_buf[..samples]);
// This is a decode-error fallback (not a
// detected gap), so count it as PLC.
classical_plc_invocations += 1;
} }
} }
} }
} }
// Codec2-only: try FEC recovery and expire old blocks. // Try FEC recovery
// Opus packets skip both — the Phase 2 Opus path has no
// RaptorQ state to query or clean up. The `fec_recovered`
// counter is now effectively Codec2-only, which is
// correct because DRED reconstructions will be counted
// separately once Phase 3 lands (new telemetry field).
if !pkt_is_opus {
if let Ok(Some(recovered_frames)) = fec_dec.try_decode(pkt_block) { if let Ok(Some(recovered_frames)) = fec_dec.try_decode(pkt_block) {
fec_recovered += recovered_frames.len() as u64; fec_recovered += recovered_frames.len() as u64;
if fec_recovered % 50 == 1 { if fec_recovered % 50 == 1 {
@@ -1088,13 +772,10 @@ async fn run_call(
if pkt_block > 3 { if pkt_block > 3 {
fec_dec.expire_before(pkt_block.wrapping_sub(3)); fec_dec.expire_before(pkt_block.wrapping_sub(3));
} }
}
let mut stats = state.stats.lock().unwrap(); let mut stats = state.stats.lock().unwrap();
stats.frames_decoded = frames_decoded; stats.frames_decoded = frames_decoded;
stats.fec_recovered = fec_recovered; stats.fec_recovered = fec_recovered;
stats.dred_reconstructions = dred_reconstructions;
stats.classical_plc_invocations = classical_plc_invocations;
drop(stats); drop(stats);
// Periodic stats every 5 seconds // Periodic stats every 5 seconds
@@ -1102,8 +783,6 @@ async fn run_call(
info!( info!(
frames_decoded, frames_decoded,
fec_recovered, fec_recovered,
dred_reconstructions,
classical_plc_invocations,
recv_errors, recv_errors,
max_recv_gap_ms, max_recv_gap_ms,
playout_avail = state.playout_ring.available(), playout_avail = state.playout_ring.available(),

View File

@@ -77,6 +77,9 @@ pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeInit(
) -> jlong { ) -> jlong {
let result = panic::catch_unwind(|| { let result = panic::catch_unwind(|| {
init_logging(); init_logging();
// Install rustls crypto provider ONCE on the main thread.
// Must not be called per-thread — conflicts with Android's system libcrypto.so TLS keys.
let _ = rustls::crypto::ring::default_provider().install_default();
let handle = Box::new(EngineHandle { let handle = Box::new(EngineHandle {
engine: WzpEngine::new(), engine: WzpEngine::new(),
}); });
@@ -360,88 +363,149 @@ pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativePingRelay<'a>(
.unwrap_or(JObject::null().into_raw()) .unwrap_or(JObject::null().into_raw())
} }
/// Get the identity fingerprint for a seed hex string.
/// Returns the full fingerprint (xxxx:xxxx:...) or empty string on error.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeGetFingerprint<'a>(
mut env: JNIEnv<'a>,
_class: JClass,
seed_hex_j: JString,
) -> jstring {
let seed_hex: String = env.get_string(&seed_hex_j).map(|s| s.into()).unwrap_or_default();
let fp = if seed_hex.is_empty() {
String::new()
} else {
match wzp_crypto::Seed::from_hex(&seed_hex) {
Ok(seed) => {
let id = seed.derive_identity();
id.public_identity().fingerprint.to_string()
}
Err(_) => String::new(),
}
};
env.new_string(&fp)
.map(|s| s.into_raw())
.unwrap_or(JObject::null().into_raw())
}
// ── Direct calling JNI functions ── // ── Direct calling JNI functions ──
/// Start persistent signaling connection to relay for direct calls. // ── SignalManager JNI functions ──
/// Returns 0 on success, -1 on error.
/// Opaque handle for SignalManager (separate from EngineHandle).
struct SignalHandle {
mgr: crate::signal_mgr::SignalManager,
}
unsafe fn signal_ref(handle: jlong) -> &'static SignalHandle {
unsafe { &*(handle as *const SignalHandle) }
}
/// Connect to relay for signaling. Returns handle (jlong) or 0 on error.
/// Blocks up to 10s waiting for the internal signal thread to connect.
#[unsafe(no_mangle)] #[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeStartSignaling<'a>( pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalConnect<'a>(
mut env: JNIEnv<'a>, mut env: JNIEnv<'a>,
_class: JClass, _class: JClass,
handle: jlong, relay_j: JString,
relay_addr_j: JString, seed_j: JString,
seed_hex_j: JString, ) -> jlong {
token_j: JString, info!("nativeSignalConnect: entered");
alias_j: JString, let relay: String = env.get_string(&relay_j).map(|s| s.into()).unwrap_or_default();
) -> jint { let seed: String = env.get_string(&seed_j).map(|s| s.into()).unwrap_or_default();
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| { info!(relay = %relay, seed_len = seed.len(), "nativeSignalConnect: parsed strings");
let h = unsafe { handle_ref(handle) };
let relay_addr: String = env.get_string(&relay_addr_j).map(|s| s.into()).unwrap_or_default();
let seed_hex: String = env.get_string(&seed_hex_j).map(|s| s.into()).unwrap_or_default();
let token: String = env.get_string(&token_j).map(|s| s.into()).unwrap_or_default();
let alias: String = env.get_string(&alias_j).map(|s| s.into()).unwrap_or_default();
h.engine.start_signaling( // start() spawns an internal thread (connect+register+recv, ONE runtime, never dropped).
&relay_addr, // Blocks up to 10s waiting for the connect+register to complete.
&seed_hex, match crate::signal_mgr::SignalManager::start(&relay, &seed) {
if token.is_empty() { None } else { Some(&token) }, Ok(mgr) => {
if alias.is_empty() { None } else { Some(&alias) }, let handle = Box::new(SignalHandle { mgr });
) Box::into_raw(handle) as jlong
})); }
Err(e) => {
match result { error!("signal connect failed: {e}");
Ok(Ok(())) => 0, 0
Ok(Err(e)) => { error!("start_signaling failed: {e}"); -1 } }
Err(_) => { error!("start_signaling panicked"); -1 }
} }
} }
/// Place a direct call to a target fingerprint. /// Get signal state as JSON string.
/// Returns 0 on success, -1 on error.
#[unsafe(no_mangle)] #[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativePlaceCall<'a>( pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalGetState<'a>(
mut env: JNIEnv<'a>, mut env: JNIEnv<'a>,
_class: JClass, _class: JClass,
handle: jlong, handle: jlong,
target_fp_j: JString, ) -> jstring {
) -> jint { if handle == 0 { return JObject::null().into_raw(); }
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| { let h = signal_ref(handle);
let h = unsafe { handle_ref(handle) }; let json = h.mgr.get_state_json();
let target: String = env.get_string(&target_fp_j).map(|s| s.into()).unwrap_or_default(); env.new_string(&json)
h.engine.place_call(&target) .map(|s| s.into_raw())
})); .unwrap_or(JObject::null().into_raw())
}
match result { /// Place a direct call.
Ok(Ok(())) => 0, #[unsafe(no_mangle)]
Ok(Err(e)) => { error!("place_call failed: {e}"); -1 } pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalPlaceCall<'a>(
Err(_) => { error!("place_call panicked"); -1 } mut env: JNIEnv<'a>,
_class: JClass,
handle: jlong,
target_j: JString,
) -> jint {
if handle == 0 { return -1; }
let h = signal_ref(handle);
let target: String = env.get_string(&target_j).map(|s| s.into()).unwrap_or_default();
match h.mgr.place_call(&target) {
Ok(()) => 0,
Err(e) => { error!("place_call: {e}"); -1 }
} }
} }
/// Answer an incoming direct call. /// Answer an incoming call.
/// mode: 0=Reject, 1=AcceptTrusted, 2=AcceptGeneric
#[unsafe(no_mangle)] #[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeAnswerCall<'a>( pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalAnswerCall<'a>(
mut env: JNIEnv<'a>, mut env: JNIEnv<'a>,
_class: JClass, _class: JClass,
handle: jlong, handle: jlong,
call_id_j: JString, call_id_j: JString,
mode: jint, mode: jint,
) -> jint { ) -> jint {
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| { if handle == 0 { return -1; }
let h = unsafe { handle_ref(handle) }; let h = signal_ref(handle);
let call_id: String = env.get_string(&call_id_j).map(|s| s.into()).unwrap_or_default(); let call_id: String = env.get_string(&call_id_j).map(|s| s.into()).unwrap_or_default();
let accept_mode = match mode { let accept_mode = match mode {
0 => wzp_proto::CallAcceptMode::Reject, 0 => wzp_proto::CallAcceptMode::Reject,
1 => wzp_proto::CallAcceptMode::AcceptTrusted, 1 => wzp_proto::CallAcceptMode::AcceptTrusted,
_ => wzp_proto::CallAcceptMode::AcceptGeneric, _ => wzp_proto::CallAcceptMode::AcceptGeneric,
}; };
h.engine.answer_call(&call_id, accept_mode) match h.mgr.answer_call(&call_id, accept_mode) {
})); Ok(()) => 0,
Err(e) => { error!("answer_call: {e}"); -1 }
match result {
Ok(Ok(())) => 0,
Ok(Err(e)) => { error!("answer_call failed: {e}"); -1 }
Err(_) => { error!("answer_call panicked"); -1 }
} }
} }
/// Send hangup signal.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalHangup(
_env: JNIEnv,
_class: JClass,
handle: jlong,
) {
if handle == 0 { return; }
let h = signal_ref(handle);
h.mgr.hangup();
}
/// Destroy the signal manager and free resources.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_SignalManager_nativeSignalDestroy(
_env: JNIEnv,
_class: JClass,
handle: jlong,
) {
if handle == 0 { return; }
let h = signal_ref(handle);
h.mgr.stop();
// Reclaim the Box
let _ = unsafe { Box::from_raw(handle as *mut SignalHandle) };
}

View File

@@ -8,24 +8,12 @@
//! //!
//! On non-Android targets, the Oboe C++ layer compiles as a stub, //! On non-Android targets, the Oboe C++ layer compiles as a stub,
//! allowing `cargo check` and unit tests on the host. //! allowing `cargo check` and unit tests on the host.
//!
//! ## Status
//!
//! **Dead code as of the Tauri mobile rewrite.** The legacy Kotlin+JNI
//! Android app that consumed this crate was replaced by a Tauri 2.x
//! Mobile app (see `desktop/src-tauri/src/engine.rs` for the live
//! Android audio recv path and `crates/wzp-native/` for the Oboe
//! bridge). We keep this crate in the workspace for reference and to
//! preserve the commit history, but it is not built by any shipping
//! target. Allow the accumulated leftover warnings so CI/workspace
//! checks stay clean — any real cleanup should happen as part of
//! removing the crate entirely, not piecemeal.
#![allow(dead_code, unused_imports, unused_variables, unused_mut)]
pub mod audio_android; pub mod audio_android;
pub mod audio_ring; pub mod audio_ring;
pub mod commands; pub mod commands;
pub mod engine; pub mod engine;
pub mod pipeline; pub mod pipeline;
pub mod signal_mgr;
pub mod stats; pub mod stats;
pub mod jni_bridge; pub mod jni_bridge;

View File

@@ -0,0 +1,288 @@
//! Persistent signal connection manager for direct 1:1 calls.
//!
//! Separate from the media engine — survives across calls.
//! Connects to relay via `_signal` SNI, registers presence,
//! and handles call signaling (offer/answer/setup/hangup).
use std::net::SocketAddr;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};
use tracing::{error, info, warn};
use wzp_proto::{MediaTransport, SignalMessage};
/// Signal connection status.
#[derive(Clone, Debug, Default, serde::Serialize)]
pub struct SignalState {
pub status: String, // "idle", "registered", "ringing", "incoming", "setup"
pub fingerprint: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub incoming_call_id: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub incoming_caller_fp: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub incoming_caller_alias: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub call_setup_relay: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub call_setup_room: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
pub call_setup_id: Option<String>,
}
/// Manages a persistent `_signal` QUIC connection to a relay.
pub struct SignalManager {
transport: Arc<wzp_transport::QuinnTransport>,
state: Arc<Mutex<SignalState>>,
running: Arc<AtomicBool>,
}
impl SignalManager {
/// Create SignalManager and start connect+register+recv on a background thread.
/// Returns immediately. The internal thread runs forever.
/// CRITICAL: tokio runtime must never be dropped on Android (libcrypto TLS conflict).
pub fn start(relay_addr: &str, seed_hex: &str) -> Result<Self, anyhow::Error> {
let addr: SocketAddr = relay_addr.parse()?;
let seed = if seed_hex.is_empty() {
wzp_crypto::Seed::generate()
} else {
wzp_crypto::Seed::from_hex(seed_hex).map_err(|e| anyhow::anyhow!(e))?
};
let identity = seed.derive_identity();
let pub_id = identity.public_identity();
let identity_pub = *pub_id.signing.as_bytes();
let fp = pub_id.fingerprint.to_string();
let state = Arc::new(Mutex::new(SignalState {
status: "connecting".into(),
fingerprint: fp.clone(),
..Default::default()
}));
let running = Arc::new(AtomicBool::new(true));
// Channel to receive transport after connect succeeds
let (transport_tx, transport_rx) = std::sync::mpsc::channel();
let bg_state = Arc::clone(&state);
let bg_running = Arc::clone(&running);
let ret_state = Arc::clone(&state);
let ret_running = Arc::clone(&running);
// ONE thread, ONE runtime, NEVER dropped.
// Connect + register + recv loop all happen here.
std::thread::Builder::new()
.name("wzp-signal".into())
.stack_size(4 * 1024 * 1024)
.spawn(move || {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("tokio runtime");
rt.block_on(async move {
info!(fingerprint = %fp, relay = %addr, "signal: connecting");
let bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let endpoint = match wzp_transport::create_endpoint(bind, None) {
Ok(e) => e,
Err(e) => {
error!("signal endpoint: {e}");
bg_state.lock().unwrap().status = "idle".into();
return;
}
};
let client_cfg = wzp_transport::client_config();
let conn = match wzp_transport::connect(&endpoint, addr, "_signal", client_cfg).await {
Ok(c) => c,
Err(e) => {
error!("signal connect: {e}");
bg_state.lock().unwrap().status = "idle".into();
return;
}
};
let transport = Arc::new(wzp_transport::QuinnTransport::new(conn));
// Register
if let Err(e) = transport.send_signal(&SignalMessage::RegisterPresence {
identity_pub, signature: vec![], alias: None,
}).await {
error!("signal register: {e}");
bg_state.lock().unwrap().status = "idle".into();
return;
}
match transport.recv_signal().await {
Ok(Some(SignalMessage::RegisterPresenceAck { success: true, .. })) => {
info!(fingerprint = %fp, "signal: registered");
bg_state.lock().unwrap().status = "registered".into();
// Send transport to caller
let _ = transport_tx.send(transport.clone());
}
other => {
error!("signal registration failed: {other:?}");
bg_state.lock().unwrap().status = "idle".into();
return;
}
}
// Recv loop — runs forever
loop {
if !running.load(Ordering::Relaxed) { break; }
match transport.recv_signal().await {
Ok(Some(SignalMessage::CallRinging { call_id })) => {
info!(call_id = %call_id, "signal: ringing");
let mut s = state.lock().unwrap();
s.status = "ringing".into();
}
Ok(Some(SignalMessage::DirectCallOffer { caller_fingerprint, caller_alias, call_id, .. })) => {
info!(from = %caller_fingerprint, call_id = %call_id, "signal: incoming call");
let mut s = state.lock().unwrap();
s.status = "incoming".into();
s.incoming_call_id = Some(call_id);
s.incoming_caller_fp = Some(caller_fingerprint);
s.incoming_caller_alias = caller_alias;
}
Ok(Some(SignalMessage::DirectCallAnswer { call_id, accept_mode, .. })) => {
info!(call_id = %call_id, mode = ?accept_mode, "signal: call answered");
}
Ok(Some(SignalMessage::CallSetup { call_id, room, relay_addr })) => {
info!(call_id = %call_id, room = %room, relay = %relay_addr, "signal: call setup");
let mut s = state.lock().unwrap();
s.status = "setup".into();
s.call_setup_relay = Some(relay_addr);
s.call_setup_room = Some(room);
s.call_setup_id = Some(call_id);
}
Ok(Some(SignalMessage::Hangup { reason })) => {
info!(reason = ?reason, "signal: hangup");
let mut s = state.lock().unwrap();
s.status = "registered".into();
s.incoming_call_id = None;
s.incoming_caller_fp = None;
s.incoming_caller_alias = None;
s.call_setup_relay = None;
s.call_setup_room = None;
s.call_setup_id = None;
}
Ok(Some(_)) => {}
Ok(None) => {
info!("signal: connection closed");
break;
}
Err(e) => {
error!("signal recv error: {e}");
break;
}
}
}
bg_state.lock().unwrap().status = "idle".into();
}); // block_on
// Runtime intentionally NOT dropped — lives until thread exits.
// This prevents ring/libcrypto TLS cleanup conflict on Android.
// The thread is parked here forever (block_on returned = connection lost).
std::thread::park();
})?; // thread spawn
// Wait for transport (up to 10s)
let transport = transport_rx.recv_timeout(std::time::Duration::from_secs(10))
.map_err(|_| anyhow::anyhow!("signal connect timeout — check relay address"))?;
Ok(Self { transport, state: ret_state, running: ret_running })
}
/// Get current state (non-blocking).
pub fn get_state(&self) -> SignalState {
self.state.lock().unwrap().clone()
}
/// Get state as JSON string.
pub fn get_state_json(&self) -> String {
serde_json::to_string(&self.get_state()).unwrap_or_else(|_| "{}".into())
}
/// Place a direct call.
pub fn place_call(&self, target_fp: &str) -> Result<(), anyhow::Error> {
let fp = self.state.lock().unwrap().fingerprint.clone();
let target = target_fp.to_string();
let call_id = format!("{:016x}", std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH).unwrap().as_nanos());
let transport = self.transport.clone();
// Send on a small thread (async send needs a runtime)
std::thread::Builder::new()
.name("wzp-call-send".into())
.spawn(move || {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all().build().expect("rt");
rt.block_on(async {
let _ = transport.send_signal(&SignalMessage::DirectCallOffer {
caller_fingerprint: fp,
caller_alias: None,
target_fingerprint: target,
call_id,
identity_pub: [0u8; 32],
ephemeral_pub: [0u8; 32],
signature: vec![],
supported_profiles: vec![wzp_proto::QualityProfile::GOOD],
}).await;
});
})?;
Ok(())
}
/// Answer an incoming call.
pub fn answer_call(&self, call_id: &str, mode: wzp_proto::CallAcceptMode) -> Result<(), anyhow::Error> {
let call_id = call_id.to_string();
let transport = self.transport.clone();
std::thread::Builder::new()
.name("wzp-answer-send".into())
.spawn(move || {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all().build().expect("rt");
rt.block_on(async {
let _ = transport.send_signal(&SignalMessage::DirectCallAnswer {
call_id,
accept_mode: mode,
identity_pub: None,
ephemeral_pub: None,
signature: None,
chosen_profile: Some(wzp_proto::QualityProfile::GOOD),
}).await;
});
})?;
Ok(())
}
/// Send hangup.
pub fn hangup(&self) {
let transport = self.transport.clone();
let state = self.state.clone();
std::thread::spawn(move || {
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all().build().expect("rt");
rt.block_on(async {
let _ = transport.send_signal(&SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal,
}).await;
});
let mut s = state.lock().unwrap();
s.status = "registered".into();
s.incoming_call_id = None;
s.incoming_caller_fp = None;
s.incoming_caller_alias = None;
s.call_setup_relay = None;
s.call_setup_room = None;
s.call_setup_id = None;
});
}
/// Stop the signal connection.
pub fn stop(&self) {
self.running.store(false, Ordering::Release);
self.transport.connection().close(0u32.into(), b"shutdown");
}
}

View File

@@ -58,16 +58,8 @@ pub struct CallStats {
pub frames_decoded: u64, pub frames_decoded: u64,
/// Number of playout underruns (buffer empty when audio needed). /// Number of playout underruns (buffer empty when audio needed).
pub underruns: u64, pub underruns: u64,
/// Frames recovered by RaptorQ FEC (Codec2 tiers only; Opus bypasses /// Frames recovered by FEC.
/// RaptorQ per Phase 2).
pub fec_recovered: u64, pub fec_recovered: u64,
/// Phase 3c: Opus frames reconstructed via DRED side-channel data.
/// Only increments on the Opus tiers; always zero for Codec2.
pub dred_reconstructions: u64,
/// Phase 3c: Opus frames filled via classical Opus PLC because no DRED
/// state covered the gap, plus any decode-error fallbacks. Codec2 loss
/// also increments this counter via the Codec2 PLC path.
pub classical_plc_invocations: u64,
/// Playout ring overflow count (reader was lapped by writer). /// Playout ring overflow count (reader was lapped by writer).
pub playout_overflows: u64, pub playout_overflows: u64,
/// Playout ring underrun count (reader found empty buffer). /// Playout ring underrun count (reader found empty buffer).

View File

@@ -23,71 +23,10 @@ serde_json = "1"
chrono = "0.4" chrono = "0.4"
rustls = { version = "0.23", default-features = false, features = ["ring", "std"] } rustls = { version = "0.23", default-features = false, features = ["ring", "std"] }
cpal = { version = "0.15", optional = true } cpal = { version = "0.15", optional = true }
libc = "0.2"
# coreaudio-rs is Apple-framework-only; gate it to macOS so enabling
# the `vpio` feature from a non-macOS target builds cleanly instead of
# pulling in a crate that can only link against Apple frameworks.
[target.'cfg(target_os = "macos")'.dependencies]
coreaudio-rs = { version = "0.11", optional = true }
# Windows-only: direct WASAPI bindings for the `windows-aec` feature.
# `windows` is Microsoft's official Rust COM bindings crate. We pull in
# only the audio + COM subfeatures we need — the crate is organized as
# a massive optional-feature tree, so enabling just these keeps compile
# times reasonable (~5s for these features vs ~60s for the full crate).
[target.'cfg(target_os = "windows")'.dependencies]
windows = { version = "0.58", optional = true, features = [
"Win32_Foundation",
"Win32_Media_Audio",
"Win32_Security",
"Win32_System_Com",
"Win32_System_Com_StructuredStorage",
"Win32_System_Threading",
"Win32_System_Variant",
] }
# Linux-only: WebRTC AEC (Audio Processing Module) bindings for the
# `linux-aec` feature. This is the 0.3.x line of the `tonarino/
# webrtc-audio-processing` crate, which links against Debian's
# `libwebrtc-audio-processing-dev` apt package (0.3-1+b1 on Bookworm).
#
# Note: we attempted the 2.x line with its `bundled` sub-feature first
# (which would give us AEC3 instead of AEC2), but both the crates.io
# tarball AND the upstream git `main` branch of webrtc-audio-processing-sys
# 2.0.3 hit a `meson setup --reconfigure` bug where the build.rs passes
# --reconfigure unconditionally even on first-run empty build dirs,
# causing the bundled build to fail with "Directory does not contain a
# valid build tree". The 0.x line doesn't use bundled mode and sidesteps
# this entirely by linking the apt-provided library. AEC2 is older than
# AEC3 but still the same algorithm family — this is what PulseAudio's
# module-echo-cancel and PipeWire's filter-chain use by default on
# current Debian-family distros.
[target.'cfg(target_os = "linux")'.dependencies]
webrtc-audio-processing = { version = "0.3", optional = true }
[features] [features]
default = [] default = []
audio = ["cpal"] audio = ["cpal"]
# vpio enables coreaudio-rs but that dep is itself gated to macOS above,
# so enabling this feature on Windows/Linux is a no-op (the audio_vpio
# module is also #[cfg(target_os = "macos")] in lib.rs).
vpio = ["dep:coreaudio-rs"]
# windows-aec enables a direct WASAPI capture backend that opens the
# microphone under AudioCategory_Communications, turning on Windows's
# OS-level communications audio processing (AEC + noise suppression +
# AGC). The `windows` dep is itself target-gated to Windows above, so
# enabling this feature on non-Windows targets is a no-op (the
# audio_wasapi module is also #[cfg(target_os = "windows")] in lib.rs).
windows-aec = ["dep:windows"]
# linux-aec enables a CPAL + WebRTC AEC3 capture/playback backend that
# runs the WebRTC Audio Processing Module (same algo as Chrome / Zoom /
# Teams) in-process, using the playback PCM as the reference signal for
# echo cancellation. The webrtc-audio-processing dep is target-gated to
# Linux above, so enabling this feature on non-Linux targets is a no-op
# (the audio_linux_aec module is also #[cfg(target_os = "linux")] in
# lib.rs).
linux-aec = ["dep:webrtc-audio-processing"]
[[bin]] [[bin]]
name = "wzp-client" name = "wzp-client"

View File

@@ -3,10 +3,12 @@
//! Both structs use 48 kHz, mono, i16 format to match the WarzonePhone codec //! Both structs use 48 kHz, mono, i16 format to match the WarzonePhone codec
//! pipeline. Frames are 960 samples (20 ms at 48 kHz). //! pipeline. Frames are 960 samples (20 ms at 48 kHz).
//! //!
//! Audio callbacks are **lock-free**: they read/write directly to an `AudioRing` //! The cpal `Stream` type is not `Send`, so each struct spawns a dedicated OS
//! (atomic SPSC ring buffer). No Mutex, no channel, no allocation on the hot path. //! thread that owns the stream. The public API exposes only `Send + Sync`
//! channel handles.
use std::sync::atomic::{AtomicBool, Ordering}; use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc;
use std::sync::Arc; use std::sync::Arc;
use anyhow::{anyhow, Context}; use anyhow::{anyhow, Context};
@@ -14,8 +16,6 @@ use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};
use cpal::{SampleFormat, SampleRate, StreamConfig}; use cpal::{SampleFormat, SampleRate, StreamConfig};
use tracing::{info, warn}; use tracing::{info, warn};
use crate::audio_ring::AudioRing;
/// Number of samples per 20 ms frame at 48 kHz mono. /// Number of samples per 20 ms frame at 48 kHz mono.
pub const FRAME_SAMPLES: usize = 960; pub const FRAME_SAMPLES: usize = 960;
@@ -23,25 +23,23 @@ pub const FRAME_SAMPLES: usize = 960;
// AudioCapture // AudioCapture
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
/// Captures microphone input via CPAL and writes PCM into a lock-free ring buffer. /// Captures microphone input and yields 960-sample PCM frames.
/// ///
/// The cpal stream lives on a dedicated OS thread; this handle is `Send + Sync`. /// The cpal stream lives on a dedicated OS thread; this handle is `Send + Sync`.
pub struct AudioCapture { pub struct AudioCapture {
ring: Arc<AudioRing>, rx: mpsc::Receiver<Vec<i16>>,
running: Arc<AtomicBool>, running: Arc<AtomicBool>,
} }
impl AudioCapture { impl AudioCapture {
/// Create and start capturing from the default input device at 48 kHz mono. /// Create and start capturing from the default input device at 48 kHz mono.
pub fn start() -> Result<Self, anyhow::Error> { pub fn start() -> Result<Self, anyhow::Error> {
let ring = Arc::new(AudioRing::new()); let (tx, rx) = mpsc::sync_channel::<Vec<i16>>(64);
let running = Arc::new(AtomicBool::new(true)); let running = Arc::new(AtomicBool::new(true));
let (init_tx, init_rx) = std::sync::mpsc::sync_channel::<Result<(), String>>(1);
let ring_cb = ring.clone();
let running_clone = running.clone(); let running_clone = running.clone();
let (init_tx, init_rx) = mpsc::sync_channel::<Result<(), String>>(1);
std::thread::Builder::new() std::thread::Builder::new()
.name("wzp-audio-capture".into()) .name("wzp-audio-capture".into())
.spawn(move || { .spawn(move || {
@@ -61,51 +59,53 @@ impl AudioCapture {
let use_f32 = !supports_i16_input(&device)?; let use_f32 = !supports_i16_input(&device)?;
let buf = Arc::new(std::sync::Mutex::new(
Vec::<i16>::with_capacity(FRAME_SAMPLES),
));
let err_cb = |e: cpal::StreamError| { let err_cb = |e: cpal::StreamError| {
warn!("input stream error: {e}"); warn!("input stream error: {e}");
}; };
let logged_cb_size = Arc::new(AtomicBool::new(false));
let stream = if use_f32 { let stream = if use_f32 {
let ring = ring_cb.clone(); let buf = buf.clone();
let tx = tx.clone();
let running = running_clone.clone(); let running = running_clone.clone();
let logged = logged_cb_size.clone();
device.build_input_stream( device.build_input_stream(
&config, &config,
move |data: &[f32], _: &cpal::InputCallbackInfo| { move |data: &[f32], _: &cpal::InputCallbackInfo| {
if !running.load(Ordering::Relaxed) { if !running.load(Ordering::Relaxed) {
return; return;
} }
if !logged.swap(true, Ordering::Relaxed) { let mut lock = buf.lock().unwrap();
eprintln!("[audio] capture callback: {} f32 samples", data.len()); for &s in data {
lock.push(f32_to_i16(s));
if lock.len() == FRAME_SAMPLES {
let frame = lock.drain(..).collect();
let _ = tx.try_send(frame);
} }
let mut tmp = [0i16; FRAME_SAMPLES];
for chunk in data.chunks(FRAME_SAMPLES) {
let n = chunk.len();
for i in 0..n {
tmp[i] = f32_to_i16(chunk[i]);
}
ring.write(&tmp[..n]);
} }
}, },
err_cb, err_cb,
None, None,
)? )?
} else { } else {
let ring = ring_cb.clone(); let buf = buf.clone();
let tx = tx.clone();
let running = running_clone.clone(); let running = running_clone.clone();
let logged = logged_cb_size.clone();
device.build_input_stream( device.build_input_stream(
&config, &config,
move |data: &[i16], _: &cpal::InputCallbackInfo| { move |data: &[i16], _: &cpal::InputCallbackInfo| {
if !running.load(Ordering::Relaxed) { if !running.load(Ordering::Relaxed) {
return; return;
} }
if !logged.swap(true, Ordering::Relaxed) { let mut lock = buf.lock().unwrap();
eprintln!("[audio] capture callback: {} i16 samples", data.len()); for &s in data {
lock.push(s);
if lock.len() == FRAME_SAMPLES {
let frame = lock.drain(..).collect();
let _ = tx.try_send(frame);
}
} }
ring.write(data);
}, },
err_cb, err_cb,
None, None,
@@ -114,6 +114,7 @@ impl AudioCapture {
stream.play().context("failed to start input stream")?; stream.play().context("failed to start input stream")?;
// Signal success to the caller before parking.
let _ = init_tx.send(Ok(())); let _ = init_tx.send(Ok(()));
// Keep stream alive until stopped. // Keep stream alive until stopped.
@@ -134,12 +135,15 @@ impl AudioCapture {
.map_err(|_| anyhow!("capture thread exited before signaling"))? .map_err(|_| anyhow!("capture thread exited before signaling"))?
.map_err(|e| anyhow!("{e}"))?; .map_err(|e| anyhow!("{e}"))?;
Ok(Self { ring, running }) Ok(Self { rx, running })
} }
/// Get a reference to the capture ring buffer for direct polling. /// Read the next frame of 960 PCM samples (blocking until available).
pub fn ring(&self) -> &Arc<AudioRing> { ///
&self.ring /// Returns `None` when the stream has been stopped or the channel is
/// disconnected.
pub fn read_frame(&self) -> Option<Vec<i16>> {
self.rx.recv().ok()
} }
/// Stop capturing. /// Stop capturing.
@@ -148,35 +152,27 @@ impl AudioCapture {
} }
} }
impl Drop for AudioCapture {
fn drop(&mut self) {
self.stop();
}
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// AudioPlayback // AudioPlayback
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
/// Plays PCM through the default output device, reading from a lock-free ring buffer. /// Plays PCM frames through the default output device at 48 kHz mono.
/// ///
/// The cpal stream lives on a dedicated OS thread; this handle is `Send + Sync`. /// The cpal stream lives on a dedicated OS thread; this handle is `Send + Sync`.
pub struct AudioPlayback { pub struct AudioPlayback {
ring: Arc<AudioRing>, tx: mpsc::SyncSender<Vec<i16>>,
running: Arc<AtomicBool>, running: Arc<AtomicBool>,
} }
impl AudioPlayback { impl AudioPlayback {
/// Create and start playback on the default output device at 48 kHz mono. /// Create and start playback on the default output device at 48 kHz mono.
pub fn start() -> Result<Self, anyhow::Error> { pub fn start() -> Result<Self, anyhow::Error> {
let ring = Arc::new(AudioRing::new()); let (tx, rx) = mpsc::sync_channel::<Vec<i16>>(64);
let running = Arc::new(AtomicBool::new(true)); let running = Arc::new(AtomicBool::new(true));
let (init_tx, init_rx) = std::sync::mpsc::sync_channel::<Result<(), String>>(1);
let ring_cb = ring.clone();
let running_clone = running.clone(); let running_clone = running.clone();
let (init_tx, init_rx) = mpsc::sync_channel::<Result<(), String>>(1);
std::thread::Builder::new() std::thread::Builder::new()
.name("wzp-audio-playback".into()) .name("wzp-audio-playback".into())
.spawn(move || { .spawn(move || {
@@ -196,40 +192,62 @@ impl AudioPlayback {
let use_f32 = !supports_i16_output(&device)?; let use_f32 = !supports_i16_output(&device)?;
// Shared ring of samples the cpal callback drains from.
let ring = Arc::new(std::sync::Mutex::new(
std::collections::VecDeque::<i16>::with_capacity(FRAME_SAMPLES * 8),
));
// Background drainer: moves frames from the mpsc channel into the ring.
{
let ring = ring.clone();
let running = running_clone.clone();
std::thread::Builder::new()
.name("wzp-playback-drain".into())
.spawn(move || {
while running.load(Ordering::Relaxed) {
match rx.recv_timeout(std::time::Duration::from_millis(100)) {
Ok(frame) => {
let mut lock = ring.lock().unwrap();
lock.extend(frame);
while lock.len() > FRAME_SAMPLES * 16 {
lock.pop_front();
}
}
Err(mpsc::RecvTimeoutError::Timeout) => {}
Err(mpsc::RecvTimeoutError::Disconnected) => break,
}
}
})?;
}
let err_cb = |e: cpal::StreamError| { let err_cb = |e: cpal::StreamError| {
warn!("output stream error: {e}"); warn!("output stream error: {e}");
}; };
let stream = if use_f32 { let stream = if use_f32 {
let ring = ring_cb.clone(); let ring = ring.clone();
device.build_output_stream( device.build_output_stream(
&config, &config,
move |data: &mut [f32], _: &cpal::OutputCallbackInfo| { move |data: &mut [f32], _: &cpal::OutputCallbackInfo| {
let mut tmp = [0i16; FRAME_SAMPLES]; let mut lock = ring.lock().unwrap();
for chunk in data.chunks_mut(FRAME_SAMPLES) { for sample in data.iter_mut() {
let n = chunk.len(); *sample = match lock.pop_front() {
let read = ring.read(&mut tmp[..n]); Some(s) => i16_to_f32(s),
for i in 0..read { None => 0.0,
chunk[i] = i16_to_f32(tmp[i]); };
}
// Fill remainder with silence if ring underran
for i in read..n {
chunk[i] = 0.0;
}
} }
}, },
err_cb, err_cb,
None, None,
)? )?
} else { } else {
let ring = ring_cb.clone(); let ring = ring.clone();
device.build_output_stream( device.build_output_stream(
&config, &config,
move |data: &mut [i16], _: &cpal::OutputCallbackInfo| { move |data: &mut [i16], _: &cpal::OutputCallbackInfo| {
let read = ring.read(data); let mut lock = ring.lock().unwrap();
// Fill remainder with silence if ring underran for sample in data.iter_mut() {
for sample in &mut data[read..] { *sample = lock.pop_front().unwrap_or(0);
*sample = 0;
} }
}, },
err_cb, err_cb,
@@ -239,6 +257,7 @@ impl AudioPlayback {
stream.play().context("failed to start output stream")?; stream.play().context("failed to start output stream")?;
// Signal success to the caller before parking.
let _ = init_tx.send(Ok(())); let _ = init_tx.send(Ok(()));
// Keep stream alive until stopped. // Keep stream alive until stopped.
@@ -259,12 +278,12 @@ impl AudioPlayback {
.map_err(|_| anyhow!("playback thread exited before signaling"))? .map_err(|_| anyhow!("playback thread exited before signaling"))?
.map_err(|e| anyhow!("{e}"))?; .map_err(|e| anyhow!("{e}"))?;
Ok(Self { ring, running }) Ok(Self { tx, running })
} }
/// Get a reference to the playout ring buffer for direct writing. /// Write a frame of PCM samples for playback.
pub fn ring(&self) -> &Arc<AudioRing> { pub fn write_frame(&self, pcm: &[i16]) {
&self.ring let _ = self.tx.try_send(pcm.to_vec());
} }
/// Stop playback. /// Stop playback.
@@ -273,16 +292,11 @@ impl AudioPlayback {
} }
} }
impl Drop for AudioPlayback {
fn drop(&mut self) {
self.stop();
}
}
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
// Helpers // Helpers
// --------------------------------------------------------------------------- // ---------------------------------------------------------------------------
/// Check if the input device supports i16 at 48 kHz mono.
fn supports_i16_input(device: &cpal::Device) -> Result<bool, anyhow::Error> { fn supports_i16_input(device: &cpal::Device) -> Result<bool, anyhow::Error> {
let supported = device let supported = device
.supported_input_configs() .supported_input_configs()
@@ -299,6 +313,7 @@ fn supports_i16_input(device: &cpal::Device) -> Result<bool, anyhow::Error> {
Ok(false) Ok(false)
} }
/// Check if the output device supports i16 at 48 kHz mono.
fn supports_i16_output(device: &cpal::Device) -> Result<bool, anyhow::Error> { fn supports_i16_output(device: &cpal::Device) -> Result<bool, anyhow::Error> {
let supported = device let supported = device
.supported_output_configs() .supported_output_configs()

View File

@@ -1,537 +0,0 @@
//! Linux AEC backend: CPAL capture + playback wired through the WebRTC Audio
//! Processing Module (AEC3 + noise suppression + high-pass filter).
//!
//! This is the same algorithm used by Chrome WebRTC, Zoom, Teams, Jitsi, and
//! any other "serious" Linux VoIP app. It runs in-process — no dependency on
//! PulseAudio's module-echo-cancel or PipeWire's filter-chain, so it works
//! identically on ALSA / PulseAudio / PipeWire systems.
//!
//! ## Architecture
//!
//! A single module-level `Arc<Mutex<Processor>>` is shared between the
//! capture and playback paths. On each 20 ms frame (960 samples @ 48 kHz
//! mono):
//!
//! - **Playback path**: `LinuxAecPlayback::start` spawns the usual CPAL
//! output thread, but wraps each chunk in a call to
//! `Processor::process_render_frame` **before** handing it to CPAL. That
//! gives APM an authoritative reference of exactly what's going out to
//! the speakers (same approach Zoom/Teams/Jitsi use). The AEC then knows
//! what to cancel when it sees echo in the capture stream.
//!
//! - **Capture path**: `LinuxAecCapture::start` spawns the usual CPAL
//! input thread, and runs `Processor::process_capture_frame` on each
//! incoming mic chunk **in place** before pushing it into the ring
//! buffer. The AEC subtracts the echo using the render reference it
//! saw on the playback side.
//!
//! APM is strict about frame size: it requires exactly 10 ms = 480 samples
//! per call at 48 kHz. Our pipeline uses 20 ms = 960 samples, so each 20 ms
//! frame is split into two 480-sample halves, APM is called twice, and the
//! halves are stitched back together.
//!
//! APM only accepts f32 samples in `[-1.0, 1.0]`, so we convert i16 → f32
//! before the call and f32 → i16 after (with clamping on the return path).
//!
//! ## Stream delay
//!
//! AEC needs to know roughly how long it takes between a sample being passed
//! to `process_render_frame` and its echo showing up at `process_capture_frame`
//! — i.e. the round trip through CPAL playback → speaker → air → microphone
//! → CPAL capture. AEC3's internal estimator tracks this within a window
//! around whatever hint we give it. We hardcode 60 ms as a reasonable
//! starting point for typical Linux audio stacks; the delay estimator does
//! the fine-tuning automatically.
//!
//! ## Thread safety
//!
//! The 0.3.x line of `webrtc-audio-processing` takes `&mut self` on both
//! `process_capture_frame` and `process_render_frame`, so the `Processor`
//! needs a `Mutex` around it for cross-thread sharing. The capture and
//! playback threads each acquire the lock briefly (sub-millisecond per
//! 10 ms frame) so contention is minimal at our frame rates.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex, OnceLock};
use anyhow::{anyhow, Context};
use cpal::traits::{DeviceTrait, HostTrait, StreamTrait};
use cpal::{SampleFormat, SampleRate, StreamConfig};
use tracing::{info, warn};
use webrtc_audio_processing::{
Config, EchoCancellation, EchoCancellationSuppressionLevel, InitializationConfig,
NoiseSuppression, NoiseSuppressionLevel, Processor, NUM_SAMPLES_PER_FRAME,
};
use crate::audio_ring::AudioRing;
/// 20 ms at 48 kHz, mono — matches the rest of the pipeline and the codec.
pub const FRAME_SAMPLES: usize = 960;
/// APM requires strict 10 ms frames at 48 kHz = 480 samples per call.
/// Imported from the webrtc-audio-processing crate so we can't drift out
/// of sync with whatever sample rate / frame length the C++ lib is using.
const APM_FRAME_SAMPLES: usize = NUM_SAMPLES_PER_FRAME as usize;
const APM_NUM_CHANNELS: usize = 1;
/// Round-trip delay hint passed to APM; the estimator refines from here.
/// 60 ms is a reasonable default for CPAL on ALSA / PulseAudio / PipeWire.
#[allow(dead_code)]
const STREAM_DELAY_MS: i32 = 60;
// ---------------------------------------------------------------------------
// Shared APM instance
// ---------------------------------------------------------------------------
/// Module-level lazily-initialized APM. Shared between capture and playback
/// so they operate on the same echo-cancellation state — the render frames
/// pushed by playback are what the capture path subtracts from the mic input.
/// Wrapped in a Mutex because the 0.3.x Processor takes `&mut self` on both
/// process_capture_frame and process_render_frame.
static PROCESSOR: OnceLock<Arc<Mutex<Processor>>> = OnceLock::new();
fn get_or_init_processor() -> anyhow::Result<Arc<Mutex<Processor>>> {
if let Some(p) = PROCESSOR.get() {
return Ok(p.clone());
}
let init_config = InitializationConfig {
num_capture_channels: APM_NUM_CHANNELS as i32,
num_render_channels: APM_NUM_CHANNELS as i32,
..Default::default()
};
let mut processor = Processor::new(&init_config)
.map_err(|e| anyhow!("webrtc APM init failed: {e:?}"))?;
let config = Config {
echo_cancellation: Some(EchoCancellation {
suppression_level: EchoCancellationSuppressionLevel::High,
stream_delay_ms: Some(STREAM_DELAY_MS),
enable_delay_agnostic: true,
enable_extended_filter: true,
}),
noise_suppression: Some(NoiseSuppression {
suppression_level: NoiseSuppressionLevel::High,
}),
enable_high_pass_filter: true,
// AGC left off for now — it can fight the Opus encoder's own gain
// staging and the adaptive-quality controller. Add later if users
// report low mic levels.
..Default::default()
};
processor.set_config(config);
let arc = Arc::new(Mutex::new(processor));
let _ = PROCESSOR.set(arc.clone());
info!(
stream_delay_ms = STREAM_DELAY_MS,
"webrtc APM initialized (AEC High + NS High + HPF, AGC off)"
);
Ok(arc)
}
// ---------------------------------------------------------------------------
// Helpers: i16 ↔ f32 and APM frame processing
// ---------------------------------------------------------------------------
#[inline]
fn i16_to_f32(s: i16) -> f32 {
s as f32 / 32768.0
}
#[inline]
fn f32_to_i16(s: f32) -> i16 {
(s.clamp(-1.0, 1.0) * 32767.0) as i16
}
/// Feed a 20 ms (960-sample) playback frame to APM as the render reference.
/// Splits into two 10 ms halves because APM is strict about frame size.
/// Takes the Mutex-wrapped Processor and locks briefly around each call.
fn push_render_frame_20ms(apm: &Mutex<Processor>, pcm: &[i16]) {
debug_assert_eq!(pcm.len(), FRAME_SAMPLES);
let mut buf = [0f32; APM_FRAME_SAMPLES];
for half in pcm.chunks_exact(APM_FRAME_SAMPLES) {
for (i, &s) in half.iter().enumerate() {
buf[i] = i16_to_f32(s);
}
match apm.lock() {
Ok(mut p) => {
if let Err(e) = p.process_render_frame(&mut buf) {
warn!("webrtc APM process_render_frame failed: {e:?}");
}
}
Err(_) => {
warn!("webrtc APM mutex poisoned in render path");
return;
}
}
}
}
/// Run a 20 ms (960-sample) capture frame through APM's echo cancellation
/// in place. Splits into two 10 ms halves, runs APM on each, stitches
/// results back into the caller's buffer. Briefly holds the Mutex once
/// per 10 ms half.
fn process_capture_frame_20ms(apm: &Mutex<Processor>, pcm: &mut [i16]) {
debug_assert_eq!(pcm.len(), FRAME_SAMPLES);
let mut buf = [0f32; APM_FRAME_SAMPLES];
for half in pcm.chunks_exact_mut(APM_FRAME_SAMPLES) {
for (i, &s) in half.iter().enumerate() {
buf[i] = i16_to_f32(s);
}
match apm.lock() {
Ok(mut p) => {
if let Err(e) = p.process_capture_frame(&mut buf) {
warn!("webrtc APM process_capture_frame failed: {e:?}");
}
}
Err(_) => {
warn!("webrtc APM mutex poisoned in capture path");
return;
}
}
for (i, d) in half.iter_mut().enumerate() {
*d = f32_to_i16(buf[i]);
}
}
}
// ---------------------------------------------------------------------------
// LinuxAecCapture — CPAL mic + WebRTC AEC capture-side processing
// ---------------------------------------------------------------------------
/// Microphone capture with WebRTC AEC3 applied in place before the codec
/// sees the samples. Mirrors the public API of `audio_io::AudioCapture` so
/// downstream code doesn't change.
pub struct LinuxAecCapture {
ring: Arc<AudioRing>,
running: Arc<AtomicBool>,
}
impl LinuxAecCapture {
pub fn start() -> Result<Self, anyhow::Error> {
// Eagerly init the APM so the playback side can find it already
// configured, and so init errors surface on the caller thread
// instead of silently failing inside the capture thread.
let apm = get_or_init_processor()?;
let ring = Arc::new(AudioRing::new());
let running = Arc::new(AtomicBool::new(true));
let (init_tx, init_rx) = std::sync::mpsc::sync_channel::<Result<(), String>>(1);
let ring_cb = ring.clone();
let running_clone = running.clone();
let apm_capture = apm.clone();
std::thread::Builder::new()
.name("wzp-audio-capture-linuxaec".into())
.spawn(move || {
let result = (|| -> Result<(), anyhow::Error> {
let host = cpal::default_host();
let device = host
.default_input_device()
.ok_or_else(|| anyhow!("no default input audio device found"))?;
info!(device = %device.name().unwrap_or_default(), "LinuxAEC: using input device");
let config = StreamConfig {
channels: 1,
sample_rate: SampleRate(48_000),
buffer_size: cpal::BufferSize::Default,
};
let use_f32 = !supports_i16_input(&device)?;
let err_cb = |e: cpal::StreamError| {
warn!("LinuxAEC input stream error: {e}");
};
// Leftover buffer for when CPAL gives us partial frames.
// We need exactly 960-sample chunks to feed APM.
let leftover = std::sync::Mutex::new(Vec::<i16>::with_capacity(FRAME_SAMPLES * 4));
let stream = if use_f32 {
let ring = ring_cb.clone();
let running = running_clone.clone();
let apm = apm_capture.clone();
device.build_input_stream(
&config,
move |data: &[f32], _: &cpal::InputCallbackInfo| {
if !running.load(Ordering::Relaxed) {
return;
}
let mut lv = leftover.lock().unwrap();
lv.reserve(data.len());
for &s in data {
lv.push(f32_to_i16(s));
}
drain_frames_through_apm(&mut lv, &apm, &ring);
},
err_cb,
None,
)?
} else {
let ring = ring_cb.clone();
let running = running_clone.clone();
let apm = apm_capture.clone();
device.build_input_stream(
&config,
move |data: &[i16], _: &cpal::InputCallbackInfo| {
if !running.load(Ordering::Relaxed) {
return;
}
let mut lv = leftover.lock().unwrap();
lv.extend_from_slice(data);
drain_frames_through_apm(&mut lv, &apm, &ring);
},
err_cb,
None,
)?
};
stream.play().context("failed to start LinuxAEC input stream")?;
let _ = init_tx.send(Ok(()));
info!("LinuxAEC capture started (AEC3 active)");
while running_clone.load(Ordering::Relaxed) {
std::thread::park_timeout(std::time::Duration::from_millis(200));
}
drop(stream);
Ok(())
})();
if let Err(e) = result {
let _ = init_tx.send(Err(e.to_string()));
}
})?;
init_rx
.recv()
.map_err(|_| anyhow!("LinuxAEC capture thread exited before signaling"))?
.map_err(|e| anyhow!("{e}"))?;
Ok(Self { ring, running })
}
pub fn ring(&self) -> &Arc<AudioRing> {
&self.ring
}
pub fn stop(&self) {
self.running.store(false, Ordering::Relaxed);
}
}
impl Drop for LinuxAecCapture {
fn drop(&mut self) {
self.stop();
}
}
/// Pull whole 960-sample frames out of the leftover buffer, run them through
/// APM's capture-side processing, and push to the ring. Leaves any partial
/// sub-960 remainder in `leftover` for the next callback.
fn drain_frames_through_apm(leftover: &mut Vec<i16>, apm: &Mutex<Processor>, ring: &AudioRing) {
let mut frame = [0i16; FRAME_SAMPLES];
while leftover.len() >= FRAME_SAMPLES {
frame.copy_from_slice(&leftover[..FRAME_SAMPLES]);
process_capture_frame_20ms(apm, &mut frame);
ring.write(&frame);
leftover.drain(..FRAME_SAMPLES);
}
}
// ---------------------------------------------------------------------------
// LinuxAecPlayback — CPAL speaker output + WebRTC AEC render-side tee
// ---------------------------------------------------------------------------
/// Speaker playback with a render-side tee: each frame written to CPAL is
/// ALSO fed to APM via `process_render_frame` as the echo-cancellation
/// reference signal. This is the "tee the playback ring" approach (Zoom,
/// Teams, Jitsi) — deterministic, does not depend on PulseAudio loopback or
/// PipeWire monitor sources.
pub struct LinuxAecPlayback {
ring: Arc<AudioRing>,
running: Arc<AtomicBool>,
}
impl LinuxAecPlayback {
pub fn start() -> Result<Self, anyhow::Error> {
let apm = get_or_init_processor()?;
let ring = Arc::new(AudioRing::new());
let running = Arc::new(AtomicBool::new(true));
let (init_tx, init_rx) = std::sync::mpsc::sync_channel::<Result<(), String>>(1);
let ring_cb = ring.clone();
let running_clone = running.clone();
let apm_render = apm.clone();
std::thread::Builder::new()
.name("wzp-audio-playback-linuxaec".into())
.spawn(move || {
let result = (|| -> Result<(), anyhow::Error> {
let host = cpal::default_host();
let device = host
.default_output_device()
.ok_or_else(|| anyhow!("no default output audio device found"))?;
info!(device = %device.name().unwrap_or_default(), "LinuxAEC: using output device");
let config = StreamConfig {
channels: 1,
sample_rate: SampleRate(48_000),
buffer_size: cpal::BufferSize::Default,
};
let use_f32 = !supports_i16_output(&device)?;
let err_cb = |e: cpal::StreamError| {
warn!("LinuxAEC output stream error: {e}");
};
// Same 960-sample batching approach as the capture side:
// CPAL may ask for N samples in a callback where N doesn't
// divide 960. We accumulate partial frames in a Vec and
// feed APM as soon as we have a whole 20 ms frame.
let carry = std::sync::Mutex::new(Vec::<i16>::with_capacity(FRAME_SAMPLES * 4));
let stream = if use_f32 {
let ring = ring_cb.clone();
let apm = apm_render.clone();
device.build_output_stream(
&config,
move |data: &mut [f32], _: &cpal::OutputCallbackInfo| {
fill_output_and_tee_f32(data, &ring, &apm, &carry);
},
err_cb,
None,
)?
} else {
let ring = ring_cb.clone();
let apm = apm_render.clone();
device.build_output_stream(
&config,
move |data: &mut [i16], _: &cpal::OutputCallbackInfo| {
fill_output_and_tee_i16(data, &ring, &apm, &carry);
},
err_cb,
None,
)?
};
stream.play().context("failed to start LinuxAEC output stream")?;
let _ = init_tx.send(Ok(()));
info!("LinuxAEC playback started (render tee active)");
while running_clone.load(Ordering::Relaxed) {
std::thread::park_timeout(std::time::Duration::from_millis(200));
}
drop(stream);
Ok(())
})();
if let Err(e) = result {
let _ = init_tx.send(Err(e.to_string()));
}
})?;
init_rx
.recv()
.map_err(|_| anyhow!("LinuxAEC playback thread exited before signaling"))?
.map_err(|e| anyhow!("{e}"))?;
Ok(Self { ring, running })
}
pub fn ring(&self) -> &Arc<AudioRing> {
&self.ring
}
pub fn stop(&self) {
self.running.store(false, Ordering::Relaxed);
}
}
impl Drop for LinuxAecPlayback {
fn drop(&mut self) {
self.stop();
}
}
fn fill_output_and_tee_i16(
data: &mut [i16],
ring: &AudioRing,
apm: &Mutex<Processor>,
carry: &std::sync::Mutex<Vec<i16>>,
) {
let read = ring.read(data);
for s in &mut data[read..] {
*s = 0;
}
tee_render_samples(data, apm, carry);
}
fn fill_output_and_tee_f32(
data: &mut [f32],
ring: &AudioRing,
apm: &Mutex<Processor>,
carry: &std::sync::Mutex<Vec<i16>>,
) {
let mut tmp = vec![0i16; data.len()];
let read = ring.read(&mut tmp);
for s in &mut tmp[read..] {
*s = 0;
}
for (d, &s) in data.iter_mut().zip(tmp.iter()) {
*d = i16_to_f32(s);
}
tee_render_samples(&tmp, apm, carry);
}
/// Push CPAL-bound samples into APM's render-side input for echo cancellation.
/// Uses a carry buffer to batch into exact 960-sample (20 ms) frames.
fn tee_render_samples(samples: &[i16], apm: &Mutex<Processor>, carry: &std::sync::Mutex<Vec<i16>>) {
let mut lv = carry.lock().unwrap();
lv.extend_from_slice(samples);
while lv.len() >= FRAME_SAMPLES {
let mut frame = [0i16; FRAME_SAMPLES];
frame.copy_from_slice(&lv[..FRAME_SAMPLES]);
push_render_frame_20ms(apm, &frame);
lv.drain(..FRAME_SAMPLES);
}
}
// ---------------------------------------------------------------------------
// CPAL format helpers (duplicated from audio_io.rs to keep the modules
// independent — each backend file is a self-contained unit)
// ---------------------------------------------------------------------------
fn supports_i16_input(device: &cpal::Device) -> Result<bool, anyhow::Error> {
let supported = device
.supported_input_configs()
.context("failed to query input configs")?;
for cfg in supported {
if cfg.sample_format() == SampleFormat::I16
&& cfg.min_sample_rate() <= SampleRate(48_000)
&& cfg.max_sample_rate() >= SampleRate(48_000)
&& cfg.channels() >= 1
{
return Ok(true);
}
}
Ok(false)
}
fn supports_i16_output(device: &cpal::Device) -> Result<bool, anyhow::Error> {
let supported = device
.supported_output_configs()
.context("failed to query output configs")?;
for cfg in supported {
if cfg.sample_format() == SampleFormat::I16
&& cfg.min_sample_rate() <= SampleRate(48_000)
&& cfg.max_sample_rate() >= SampleRate(48_000)
&& cfg.channels() >= 1
{
return Ok(true);
}
}
Ok(false)
}

View File

@@ -1,122 +0,0 @@
//! Lock-free SPSC ring buffer — "Reader-Detects-Lap" architecture.
//!
//! SPSC invariant: the producer ONLY writes `write_pos`, the consumer
//! ONLY writes `read_pos`. Neither thread touches the other's cursor.
//!
//! On overflow (writer laps the reader), the writer simply overwrites
//! old buffer data. The reader detects the lap via `available() >
//! RING_CAPACITY` and snaps its own `read_pos` forward.
//!
//! Capacity is a power of 2 for bitmask indexing (no modulo).
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
/// Ring buffer capacity — power of 2 for bitmask indexing.
/// 16384 samples = 341.3ms at 48kHz mono.
const RING_CAPACITY: usize = 16384; // 2^14
const RING_MASK: usize = RING_CAPACITY - 1;
/// Lock-free single-producer single-consumer ring buffer for i16 PCM samples.
pub struct AudioRing {
buf: Box<[i16]>,
/// Monotonically increasing write cursor. ONLY written by producer.
write_pos: AtomicUsize,
/// Monotonically increasing read cursor. ONLY written by consumer.
read_pos: AtomicUsize,
/// Incremented by reader when it detects it was lapped (overflow).
overflow_count: AtomicU64,
/// Incremented by reader when ring is empty (underrun).
underrun_count: AtomicU64,
}
// SAFETY: AudioRing is SPSC — one thread writes (producer), one reads (consumer).
// The producer only writes write_pos. The consumer only writes read_pos.
// Neither thread writes the other's cursor. Buffer indices are derived from
// the owning thread's cursor, ensuring no concurrent access to the same index.
unsafe impl Send for AudioRing {}
unsafe impl Sync for AudioRing {}
impl AudioRing {
pub fn new() -> Self {
debug_assert!(RING_CAPACITY.is_power_of_two());
Self {
buf: vec![0i16; RING_CAPACITY].into_boxed_slice(),
write_pos: AtomicUsize::new(0),
read_pos: AtomicUsize::new(0),
overflow_count: AtomicU64::new(0),
underrun_count: AtomicU64::new(0),
}
}
/// Number of samples available to read (clamped to capacity).
pub fn available(&self) -> usize {
let w = self.write_pos.load(Ordering::Acquire);
let r = self.read_pos.load(Ordering::Relaxed);
w.wrapping_sub(r).min(RING_CAPACITY)
}
/// Write samples into the ring. Returns number of samples written.
///
/// If the ring is full, old data is silently overwritten. The reader
/// will detect the lap and self-correct. The writer NEVER touches
/// `read_pos`.
pub fn write(&self, samples: &[i16]) -> usize {
let count = samples.len().min(RING_CAPACITY);
let w = self.write_pos.load(Ordering::Relaxed);
for i in 0..count {
unsafe {
let ptr = self.buf.as_ptr() as *mut i16;
*ptr.add((w + i) & RING_MASK) = samples[i];
}
}
self.write_pos
.store(w.wrapping_add(count), Ordering::Release);
count
}
/// Read samples from the ring into `out`. Returns number of samples read.
///
/// If the writer has lapped the reader (overflow), `read_pos` is snapped
/// forward to the oldest valid data.
pub fn read(&self, out: &mut [i16]) -> usize {
let w = self.write_pos.load(Ordering::Acquire);
let mut r = self.read_pos.load(Ordering::Relaxed);
let mut avail = w.wrapping_sub(r);
// Lap detection: writer has overwritten our unread data.
if avail > RING_CAPACITY {
r = w.wrapping_sub(RING_CAPACITY);
avail = RING_CAPACITY;
self.overflow_count.fetch_add(1, Ordering::Relaxed);
}
let count = out.len().min(avail);
if count == 0 {
if w == r {
self.underrun_count.fetch_add(1, Ordering::Relaxed);
}
return 0;
}
for i in 0..count {
out[i] = unsafe { *self.buf.as_ptr().add((r + i) & RING_MASK) };
}
self.read_pos
.store(r.wrapping_add(count), Ordering::Release);
count
}
/// Number of overflow events (reader was lapped by writer).
pub fn overflow_count(&self) -> u64 {
self.overflow_count.load(Ordering::Relaxed)
}
/// Number of underrun events (reader found empty buffer).
pub fn underrun_count(&self) -> u64 {
self.underrun_count.load(Ordering::Relaxed)
}
}

View File

@@ -1,179 +0,0 @@
//! macOS Voice Processing I/O — uses Apple's VoiceProcessingIO audio unit
//! for hardware-accelerated echo cancellation, AGC, and noise suppression.
//!
//! VoiceProcessingIO is a combined input+output unit that knows what's going
//! to the speaker, so it can cancel the echo from the mic signal internally.
//! This is the same engine FaceTime and other Apple apps use.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use anyhow::Context;
use coreaudio::audio_unit::audio_format::LinearPcmFlags;
use coreaudio::audio_unit::render_callback::{self, data};
use coreaudio::audio_unit::{AudioUnit, Element, IOType, SampleFormat, Scope, StreamFormat};
use coreaudio::sys;
use tracing::info;
use crate::audio_ring::AudioRing;
/// Number of samples per 20 ms frame at 48 kHz mono.
pub const FRAME_SAMPLES: usize = 960;
/// Combined capture + playback via macOS VoiceProcessingIO.
///
/// The OS handles AEC internally — no manual far-end feeding needed.
pub struct VpioAudio {
capture_ring: Arc<AudioRing>,
playout_ring: Arc<AudioRing>,
_audio_unit: AudioUnit,
running: Arc<AtomicBool>,
}
impl VpioAudio {
/// Start VoiceProcessingIO with AEC enabled.
pub fn start() -> Result<Self, anyhow::Error> {
let capture_ring = Arc::new(AudioRing::new());
let playout_ring = Arc::new(AudioRing::new());
let running = Arc::new(AtomicBool::new(true));
let mut au = AudioUnit::new(IOType::VoiceProcessingIO)
.context("failed to create VoiceProcessingIO audio unit")?;
// Must uninitialize before configuring properties.
au.uninitialize()
.context("failed to uninitialize VPIO for configuration")?;
// Enable input (mic) on Element::Input (bus 1).
let enable: u32 = 1;
au.set_property(
sys::kAudioOutputUnitProperty_EnableIO,
Scope::Input,
Element::Input,
Some(&enable),
)
.context("failed to enable VPIO input")?;
// Output (speaker) is enabled by default on VPIO, but be explicit.
au.set_property(
sys::kAudioOutputUnitProperty_EnableIO,
Scope::Output,
Element::Output,
Some(&enable),
)
.context("failed to enable VPIO output")?;
// Configure stream format: 48kHz mono f32 non-interleaved
let stream_format = StreamFormat {
sample_rate: 48_000.0,
sample_format: SampleFormat::F32,
flags: LinearPcmFlags::IS_FLOAT
| LinearPcmFlags::IS_PACKED
| LinearPcmFlags::IS_NON_INTERLEAVED,
channels: 1,
};
let asbd = stream_format.to_asbd();
// Input: set format on Output scope of Input element
// (= the format the AU delivers to us from the mic)
au.set_property(
sys::kAudioUnitProperty_StreamFormat,
Scope::Output,
Element::Input,
Some(&asbd),
)
.context("failed to set input stream format")?;
// Output: set format on Input scope of Output element
// (= the format we feed to the AU for the speaker)
au.set_property(
sys::kAudioUnitProperty_StreamFormat,
Scope::Input,
Element::Output,
Some(&asbd),
)
.context("failed to set output stream format")?;
// Set up input callback (mic capture with AEC applied)
let cap_ring = capture_ring.clone();
let cap_running = running.clone();
let logged = Arc::new(AtomicBool::new(false));
au.set_input_callback(
move |args: render_callback::Args<data::NonInterleaved<f32>>| {
if !cap_running.load(Ordering::Relaxed) {
return Ok(());
}
let mut buffers = args.data.channels();
if let Some(ch) = buffers.next() {
if !logged.swap(true, Ordering::Relaxed) {
eprintln!("[vpio] capture callback: {} f32 samples", ch.len());
}
let mut tmp = [0i16; FRAME_SAMPLES];
for chunk in ch.chunks(FRAME_SAMPLES) {
let n = chunk.len();
for i in 0..n {
tmp[i] = (chunk[i].clamp(-1.0, 1.0) * i16::MAX as f32) as i16;
}
cap_ring.write(&tmp[..n]);
}
}
Ok(())
},
)
.context("failed to set input callback")?;
// Set up output callback (speaker playback — AEC uses this as reference)
let play_ring = playout_ring.clone();
au.set_render_callback(
move |mut args: render_callback::Args<data::NonInterleaved<f32>>| {
let mut buffers = args.data.channels_mut();
if let Some(ch) = buffers.next() {
let mut tmp = [0i16; FRAME_SAMPLES];
for chunk in ch.chunks_mut(FRAME_SAMPLES) {
let n = chunk.len();
let read = play_ring.read(&mut tmp[..n]);
for i in 0..read {
chunk[i] = tmp[i] as f32 / i16::MAX as f32;
}
for i in read..n {
chunk[i] = 0.0;
}
}
}
Ok(())
},
)
.context("failed to set render callback")?;
au.initialize().context("failed to initialize VoiceProcessingIO")?;
au.start().context("failed to start VoiceProcessingIO")?;
info!("VoiceProcessingIO started (OS-level AEC enabled)");
Ok(Self {
capture_ring,
playout_ring,
_audio_unit: au,
running,
})
}
pub fn capture_ring(&self) -> &Arc<AudioRing> {
&self.capture_ring
}
pub fn playout_ring(&self) -> &Arc<AudioRing> {
&self.playout_ring
}
pub fn stop(&self) {
self.running.store(false, Ordering::Relaxed);
}
}
impl Drop for VpioAudio {
fn drop(&mut self) {
self.stop();
}
}

View File

@@ -1,332 +0,0 @@
//! Direct WASAPI microphone capture with Windows's OS-level AEC enabled.
//!
//! Bypasses CPAL and opens the default capture endpoint directly via
//! `IMMDeviceEnumerator` + `IAudioClient2::SetClientProperties`, setting
//! `AudioClientProperties.eCategory = AudioCategory_Communications`. That's
//! the switch that tells Windows "this is a VoIP call" — the OS then
//! enables its communications audio processing chain (AEC, noise
//! suppression, automatic gain control) for the stream. AEC operates at
//! the OS level using the currently-playing audio as the reference
//! signal, so it cancels echo from our CPAL playback (and any other app's
//! audio) without us having to plumb a reference signal ourselves.
//!
//! Platform: Windows only, compiled only when the `windows-aec` feature
//! is enabled. Mirrors the public API of `audio_io::AudioCapture` so
//! `wzp-client`'s lib.rs can transparently re-export either one as
//! `AudioCapture`.
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use anyhow::{anyhow, Context};
use tracing::{info, warn};
use windows::core::{Interface, GUID};
use windows::Win32::Foundation::{CloseHandle, BOOL, WAIT_OBJECT_0};
use windows::Win32::Media::Audio::{
eCapture, eCommunications, AudioCategory_Communications, AudioClientProperties,
IAudioCaptureClient, IAudioClient, IAudioClient2, IMMDeviceEnumerator, MMDeviceEnumerator,
AUDCLNT_SHAREMODE_SHARED, AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK, AUDCLNT_STREAMFLAGS_SRC_DEFAULT_QUALITY, WAVEFORMATEX,
WAVE_FORMAT_PCM,
};
use windows::Win32::System::Com::{
CoCreateInstance, CoInitializeEx, CoUninitialize, CLSCTX_ALL, COINIT_MULTITHREADED,
};
use windows::Win32::System::Threading::{CreateEventW, WaitForSingleObject, INFINITE};
use crate::audio_ring::AudioRing;
/// 20 ms at 48 kHz, mono. Matches the rest of the audio pipeline.
pub const FRAME_SAMPLES: usize = 960;
/// Microphone capture via WASAPI with Windows's communications AEC enabled.
///
/// The WASAPI capture stream runs on a dedicated OS thread. This handle is
/// `Send + Sync`. Dropping it stops the stream and joins the thread.
pub struct WasapiAudioCapture {
ring: Arc<AudioRing>,
running: Arc<AtomicBool>,
thread: Option<std::thread::JoinHandle<()>>,
}
impl WasapiAudioCapture {
/// Open the default communications microphone, enable OS AEC, and start
/// streaming PCM into a lock-free ring buffer.
///
/// Returns only after the capture thread has successfully initialized
/// the stream, or propagates the error back to the caller.
pub fn start() -> Result<Self, anyhow::Error> {
let ring = Arc::new(AudioRing::new());
let running = Arc::new(AtomicBool::new(true));
let (init_tx, init_rx) = std::sync::mpsc::sync_channel::<Result<(), String>>(1);
let ring_cb = ring.clone();
let running_cb = running.clone();
let thread = std::thread::Builder::new()
.name("wzp-audio-capture-wasapi".into())
.spawn(move || {
let result = unsafe { capture_thread_main(ring_cb, running_cb.clone(), &init_tx) };
if let Err(e) = result {
warn!("wasapi capture thread exited with error: {e}");
// If we failed before signaling init, signal now so the
// caller unblocks. Double-send is harmless (channel is
// bounded to 1 and we only hit the second send path on
// late errors).
let _ = init_tx.send(Err(e.to_string()));
}
})
.context("failed to spawn WASAPI capture thread")?;
init_rx
.recv()
.map_err(|_| anyhow!("WASAPI capture thread exited before signaling init"))?
.map_err(|e| anyhow!("{e}"))?;
Ok(Self {
ring,
running,
thread: Some(thread),
})
}
/// Get a reference to the capture ring buffer for direct polling.
pub fn ring(&self) -> &Arc<AudioRing> {
&self.ring
}
/// Stop capturing.
pub fn stop(&self) {
self.running.store(false, Ordering::Relaxed);
}
}
impl Drop for WasapiAudioCapture {
fn drop(&mut self) {
self.stop();
if let Some(handle) = self.thread.take() {
// Join best-effort. The thread loop polls `running` every 200ms
// via a short WaitForSingleObject timeout, so it should exit
// within ~200ms of `stop()`.
let _ = handle.join();
}
}
}
// ---------------------------------------------------------------------------
// WASAPI thread entry point — everything below this line runs on the
// dedicated wzp-audio-capture-wasapi thread.
// ---------------------------------------------------------------------------
unsafe fn capture_thread_main(
ring: Arc<AudioRing>,
running: Arc<AtomicBool>,
init_tx: &std::sync::mpsc::SyncSender<Result<(), String>>,
) -> Result<(), anyhow::Error> {
// COM init for the capture thread. MULTITHREADED because we're not
// running a message pump. Must be balanced by CoUninitialize on exit.
CoInitializeEx(None, COINIT_MULTITHREADED)
.ok()
.context("CoInitializeEx failed")?;
// Use a guard struct so CoUninitialize runs even on early returns.
struct ComGuard;
impl Drop for ComGuard {
fn drop(&mut self) {
unsafe { CoUninitialize() };
}
}
let _com_guard = ComGuard;
let enumerator: IMMDeviceEnumerator =
CoCreateInstance(&MMDeviceEnumerator, None, CLSCTX_ALL)
.context("CoCreateInstance(MMDeviceEnumerator) failed")?;
// eCommunications role (not eConsole) — this picks the device the user
// has designated for communications in Sound Settings. It's the one
// Windows's AEC is actually tuned for and the one Teams/Zoom use.
let device = enumerator
.GetDefaultAudioEndpoint(eCapture, eCommunications)
.context("GetDefaultAudioEndpoint(eCapture, eCommunications) failed")?;
if let Ok(name) = device_name(&device) {
info!(device = %name, "opening WASAPI communications capture endpoint");
}
let audio_client: IAudioClient = device
.Activate(CLSCTX_ALL, None)
.context("IMMDevice::Activate(IAudioClient) failed")?;
// IAudioClient2 exposes SetClientProperties, which is the ONLY way to
// set AudioCategory_Communications pre-Initialize. Calling it on the
// base IAudioClient would not compile, and setting it after Initialize
// is a no-op.
let audio_client2: IAudioClient2 = audio_client
.cast()
.context("QueryInterface IAudioClient2 failed")?;
let mut props = AudioClientProperties {
cbSize: std::mem::size_of::<AudioClientProperties>() as u32,
bIsOffload: BOOL(0),
eCategory: AudioCategory_Communications,
// 0 = AUDCLNT_STREAMOPTIONS_NONE. The `windows` crate doesn't
// export the enum constant in all versions, so use 0 directly.
Options: Default::default(),
};
audio_client2
.SetClientProperties(&mut props as *mut _)
.context("SetClientProperties(AudioCategory_Communications) failed")?;
// Request 48 kHz mono i16 directly. AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM
// tells Windows to do any needed format conversion inside the audio
// engine rather than rejecting our format. SRC_DEFAULT_QUALITY picks
// the standard Windows resampler quality (fine for voice).
let wave_format = WAVEFORMATEX {
wFormatTag: WAVE_FORMAT_PCM as u16,
nChannels: 1,
nSamplesPerSec: 48_000,
nAvgBytesPerSec: 48_000 * 2, // 1 ch * 2 bytes/sample * 48000 Hz
nBlockAlign: 2, // 1 ch * 2 bytes/sample
wBitsPerSample: 16,
cbSize: 0,
};
// 1,000,000 hns = 100 ms buffer (hns = 100-nanosecond units). Windows
// treats this as the minimum; the engine may give us a larger one.
const BUFFER_DURATION_HNS: i64 = 1_000_000;
audio_client
.Initialize(
AUDCLNT_SHAREMODE_SHARED,
AUDCLNT_STREAMFLAGS_EVENTCALLBACK
| AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM
| AUDCLNT_STREAMFLAGS_SRC_DEFAULT_QUALITY,
BUFFER_DURATION_HNS,
0,
&wave_format,
Some(&GUID::zeroed()),
)
.context("IAudioClient::Initialize failed — Windows rejected communications-mode 48k mono i16")?;
// Event-driven capture: Windows signals this handle each time a new
// audio packet is available. We wait on it from the loop below.
let event = CreateEventW(None, false, false, None)
.context("CreateEventW failed")?;
audio_client
.SetEventHandle(event)
.context("SetEventHandle failed")?;
let capture_client: IAudioCaptureClient = audio_client
.GetService()
.context("IAudioClient::GetService(IAudioCaptureClient) failed")?;
audio_client.Start().context("IAudioClient::Start failed")?;
// Signal to the parent thread that init succeeded before entering the
// hot loop. From this point on, errors get logged but don't propagate
// back to the caller (they'd just cause the ring buffer to stop
// filling, which the main thread detects as underruns).
let _ = init_tx.send(Ok(()));
info!("WASAPI communications-mode capture started with OS AEC enabled");
let mut logged_first_packet = false;
// Main capture loop. Exit when `running` goes false (from Drop or an
// explicit stop() call).
while running.load(Ordering::Relaxed) {
// 200 ms timeout so we check `running` regularly even if the audio
// engine stops delivering packets (e.g. device unplugged).
let wait = WaitForSingleObject(event, 200);
if wait.0 != WAIT_OBJECT_0.0 {
// Timeout or failure — just loop and re-check running.
continue;
}
// Drain all available packets. Windows may have queued more than
// one since we were last scheduled.
loop {
let packet_length = match capture_client.GetNextPacketSize() {
Ok(n) => n,
Err(e) => {
warn!("GetNextPacketSize failed: {e}");
break;
}
};
if packet_length == 0 {
break;
}
let mut buffer_ptr: *mut u8 = std::ptr::null_mut();
let mut num_frames: u32 = 0;
let mut flags: u32 = 0;
let mut device_position: u64 = 0;
let mut qpc_position: u64 = 0;
if let Err(e) = capture_client.GetBuffer(
&mut buffer_ptr,
&mut num_frames,
&mut flags,
Some(&mut device_position),
Some(&mut qpc_position),
) {
warn!("GetBuffer failed: {e}");
break;
}
if num_frames > 0 && !buffer_ptr.is_null() {
if !logged_first_packet {
info!(
frames = num_frames,
flags, "WASAPI capture: first packet received"
);
logged_first_packet = true;
}
// Because we asked for 48 kHz mono i16, each frame is
// exactly one i16. Windows's AUTOCONVERTPCM handles the
// conversion from whatever the engine mix format is.
let samples = std::slice::from_raw_parts(
buffer_ptr as *const i16,
num_frames as usize,
);
ring.write(samples);
}
if let Err(e) = capture_client.ReleaseBuffer(num_frames) {
warn!("ReleaseBuffer failed: {e}");
break;
}
}
}
info!("WASAPI capture thread stopping");
let _ = audio_client.Stop();
let _ = CloseHandle(event);
// _com_guard drops here, calling CoUninitialize.
// Silence INFINITE unused-import warning — it's referenced by the
// `windows` crate's WaitForSingleObject alternative but we use the
// 200 ms timeout variant instead. Explicit suppression for clarity.
let _ = INFINITE;
Ok(())
}
// ---------------------------------------------------------------------------
// Helpers
// ---------------------------------------------------------------------------
/// Best-effort device ID string for logging. Grabbing the friendly name via
/// PKEY_Device_FriendlyName requires IPropertyStore + PROPVARIANT plumbing
/// that's far more ceremony than a log line justifies; the ID is already
/// sufficient to confirm we opened the right endpoint.
///
/// Rust 2024 edition's `unsafe_op_in_unsafe_fn` lint requires explicit
/// `unsafe { ... }` blocks inside `unsafe fn` bodies for each unsafe call,
/// even though the whole function is already marked unsafe.
unsafe fn device_name(
device: &windows::Win32::Media::Audio::IMMDevice,
) -> Result<String, anyhow::Error> {
let id = unsafe { device.GetId() }.context("IMMDevice::GetId failed")?;
Ok(unsafe { id.to_string() }.unwrap_or_else(|_| "<non-utf16>".to_string()))
}

View File

@@ -7,15 +7,14 @@ use std::time::{Duration, Instant};
use bytes::Bytes; use bytes::Bytes;
use tracing::{debug, info, warn}; use tracing::{debug, info, warn};
use wzp_codec::dred_ffi::{DredDecoderHandle, DredState}; use wzp_codec::{AutoGainControl, ComfortNoise, EchoCanceller, NoiseSupressor, SilenceDetector};
use wzp_codec::{
AdaptiveDecoder, AutoGainControl, ComfortNoise, EchoCanceller, NoiseSupressor, SilenceDetector,
};
use wzp_fec::{RaptorQFecDecoder, RaptorQFecEncoder}; use wzp_fec::{RaptorQFecDecoder, RaptorQFecEncoder};
use wzp_proto::jitter::{JitterBuffer, PlayoutResult}; use wzp_proto::jitter::{JitterBuffer, PlayoutResult};
use wzp_proto::packet::{MediaHeader, MediaPacket, MiniFrameContext}; use wzp_proto::packet::{MediaHeader, MediaPacket, MiniFrameContext};
use wzp_proto::quality::AdaptiveQualityController; use wzp_proto::quality::AdaptiveQualityController;
use wzp_proto::traits::{AudioDecoder, AudioEncoder, FecDecoder, FecEncoder}; use wzp_proto::traits::{
AudioDecoder, AudioEncoder, FecDecoder, FecEncoder,
};
use wzp_proto::packet::QualityReport; use wzp_proto::packet::QualityReport;
use wzp_proto::{CodecId, QualityProfile}; use wzp_proto::{CodecId, QualityProfile};
@@ -43,9 +42,6 @@ pub struct CallConfig {
/// When enabled, only every 50th frame carries a full 12-byte MediaHeader; /// When enabled, only every 50th frame carries a full 12-byte MediaHeader;
/// intermediate frames use a compact 4-byte MiniHeader. /// intermediate frames use a compact 4-byte MiniHeader.
pub mini_frames_enabled: bool, pub mini_frames_enabled: bool,
/// AEC far-end delay compensation in milliseconds (default: 40).
/// Compensates for the round-trip audio latency from playout to mic capture.
pub aec_delay_ms: u32,
/// Enable adaptive jitter buffer (default: true). /// Enable adaptive jitter buffer (default: true).
/// ///
/// When true, the jitter buffer target depth is automatically adjusted /// When true, the jitter buffer target depth is automatically adjusted
@@ -67,7 +63,6 @@ impl Default for CallConfig {
noise_suppression: true, noise_suppression: true,
mini_frames_enabled: true, mini_frames_enabled: true,
adaptive_jitter: true, adaptive_jitter: true,
aec_delay_ms: 40,
} }
} }
} }
@@ -246,7 +241,7 @@ impl CallEncoder {
block_id: 0, block_id: 0,
frame_in_block: 0, frame_in_block: 0,
timestamp_ms: 0, timestamp_ms: 0,
aec: EchoCanceller::with_delay(48000, 60, config.aec_delay_ms), aec: EchoCanceller::new(48000, 100), // 100 ms echo tail
agc: AutoGainControl::new(), agc: AutoGainControl::new(),
silence_detector: SilenceDetector::new( silence_detector: SilenceDetector::new(
config.silence_threshold_rms, config.silence_threshold_rms,
@@ -345,22 +340,6 @@ impl CallEncoder {
let enc_len = self.audio_enc.encode(pcm, &mut encoded)?; let enc_len = self.audio_enc.encode(pcm, &mut encoded)?;
encoded.truncate(enc_len); encoded.truncate(enc_len);
// Phase 2: Opus tiers bypass RaptorQ entirely (DRED handles loss
// recovery at the codec layer). Codec2 tiers keep RaptorQ unchanged.
// On Opus packets, zero the FEC header fields so old receivers
// can cleanly identify "no RaptorQ block to assemble" and new
// receivers can short-circuit their FEC ingest path.
let is_opus = self.profile.codec.is_opus();
let (fec_block, fec_symbol, fec_ratio_encoded) = if is_opus {
(0u8, 0u8, 0u8)
} else {
(
self.block_id,
self.frame_in_block,
MediaHeader::encode_fec_ratio(self.profile.fec_ratio),
)
};
// Build source media packet // Build source media packet
let source_pkt = MediaPacket { let source_pkt = MediaPacket {
header: MediaHeader { header: MediaHeader {
@@ -368,11 +347,11 @@ impl CallEncoder {
is_repair: false, is_repair: false,
codec_id: self.profile.codec, codec_id: self.profile.codec,
has_quality_report: false, has_quality_report: false,
fec_ratio_encoded, fec_ratio_encoded: MediaHeader::encode_fec_ratio(self.profile.fec_ratio),
seq: self.seq, seq: self.seq,
timestamp: self.timestamp_ms, timestamp: self.timestamp_ms,
fec_block, fec_block: self.block_id,
fec_symbol, fec_symbol: self.frame_in_block,
reserved: 0, reserved: 0,
csrc_count: 0, csrc_count: 0,
}, },
@@ -387,13 +366,11 @@ impl CallEncoder {
let mut output = vec![source_pkt]; let mut output = vec![source_pkt];
// Codec2-only: feed RaptorQ and generate repair packets when the // Add to FEC encoder
// block is full. Opus tiers skip this entire block — DRED (active
// in Phase 1) provides codec-layer loss recovery.
if !is_opus {
self.fec_enc.add_source_symbol(&encoded)?; self.fec_enc.add_source_symbol(&encoded)?;
self.frame_in_block += 1; self.frame_in_block += 1;
// If block is full, generate repair and finalize
if self.frame_in_block >= self.profile.frames_per_block { if self.frame_in_block >= self.profile.frames_per_block {
if let Ok(repairs) = self.fec_enc.generate_repair(self.profile.fec_ratio) { if let Ok(repairs) = self.fec_enc.generate_repair(self.profile.fec_ratio) {
for (sym_idx, repair_data) in repairs { for (sym_idx, repair_data) in repairs {
@@ -423,7 +400,6 @@ impl CallEncoder {
self.block_id = self.block_id.wrapping_add(1); self.block_id = self.block_id.wrapping_add(1);
self.frame_in_block = 0; self.frame_in_block = 0;
} }
}
Ok(output) Ok(output)
} }
@@ -458,12 +434,9 @@ impl CallEncoder {
/// Manages the recv/decode side of a call. /// Manages the recv/decode side of a call.
pub struct CallDecoder { pub struct CallDecoder {
/// Audio decoder. Concrete `AdaptiveDecoder` (not `Box<dyn AudioDecoder>`) /// Audio decoder.
/// because Phase 3b calls the inherent `reconstruct_from_dred` method, audio_dec: Box<dyn AudioDecoder>,
/// which cannot live on the `AudioDecoder` trait without dragging libopus /// FEC decoder.
/// types into `wzp-proto`.
audio_dec: AdaptiveDecoder,
/// FEC decoder (Codec2 tiers only; Opus bypasses RaptorQ per Phase 2).
fec_dec: RaptorQFecDecoder, fec_dec: RaptorQFecDecoder,
/// Jitter buffer. /// Jitter buffer.
jitter: JitterBuffer, jitter: JitterBuffer,
@@ -477,24 +450,6 @@ pub struct CallDecoder {
last_was_cn: bool, last_was_cn: bool,
/// Mini-frame decompression context (tracks last full header baseline). /// Mini-frame decompression context (tracks last full header baseline).
mini_context: MiniFrameContext, mini_context: MiniFrameContext,
// ─── Phase 3b: DRED reconstruction state ──────────────────────────────
/// DRED side-channel parser (a separate libopus object from the decoder).
dred_decoder: DredDecoderHandle,
/// Scratch buffer used by `dred_decoder.parse_into` on every arriving
/// Opus packet. Reused across calls to avoid 10 KB alloc churn per packet.
dred_parse_scratch: DredState,
/// Cached "most recently parsed valid" DRED state, swapped with
/// `dred_parse_scratch` on successful parse. Used by `decode_next` when
/// the jitter buffer reports a gap.
last_good_dred: DredState,
/// Sequence number of the packet that produced `last_good_dred`. `None`
/// if no packet has yielded DRED state yet (cold start or legacy sender).
last_good_dred_seq: Option<u16>,
/// Phase 4 telemetry counter: gaps recovered via DRED reconstruction.
pub dred_reconstructions: u64,
/// Phase 4 telemetry counter: gaps filled via classical Opus PLC
/// (because no DRED state covered the gap, or the active codec is Codec2).
pub classical_plc_invocations: u64,
} }
impl CallDecoder { impl CallDecoder {
@@ -504,19 +459,8 @@ impl CallDecoder {
} else { } else {
JitterBuffer::new(config.jitter_target, config.jitter_max, config.jitter_min) JitterBuffer::new(config.jitter_target, config.jitter_max, config.jitter_min)
}; };
// Phase 3b: build the DRED parser + state buffers. These allocate
// libopus state (~10 KB each) once per call, not per packet — the
// scratch and last-good buffers are reused via std::mem::swap on
// every successful parse.
let dred_decoder =
DredDecoderHandle::new().expect("opus_dred_decoder_create failed at call setup");
let dred_parse_scratch =
DredState::new().expect("opus_dred_alloc failed at call setup (scratch)");
let last_good_dred =
DredState::new().expect("opus_dred_alloc failed at call setup (good state)");
Self { Self {
audio_dec: AdaptiveDecoder::new(config.profile) audio_dec: wzp_codec::create_decoder(config.profile),
.expect("failed to create adaptive decoder"),
fec_dec: wzp_fec::create_decoder(&config.profile), fec_dec: wzp_fec::create_decoder(&config.profile),
jitter, jitter,
quality: AdaptiveQualityController::new(), quality: AdaptiveQualityController::new(),
@@ -524,12 +468,6 @@ impl CallDecoder {
comfort_noise: ComfortNoise::new(50), comfort_noise: ComfortNoise::new(50),
last_was_cn: false, last_was_cn: false,
mini_context: MiniFrameContext::default(), mini_context: MiniFrameContext::default(),
dred_decoder,
dred_parse_scratch,
last_good_dred,
last_good_dred_seq: None,
dred_reconstructions: 0,
classical_plc_invocations: 0,
} }
} }
@@ -544,105 +482,20 @@ impl CallDecoder {
/// Feed a received media packet into the decode pipeline. /// Feed a received media packet into the decode pipeline.
pub fn ingest(&mut self, packet: MediaPacket) { pub fn ingest(&mut self, packet: MediaPacket) {
// Phase 2: Opus packets bypass RaptorQ. Codec2 packets still feed // Feed to FEC decoder
// the FEC decoder for recovery. This also cleanly drops any stray
// Opus repair packets from an old sender (we don't push repair
// packets to the jitter buffer either, so they're effectively
// ignored — a graceful mixed-version degradation).
if !packet.header.codec_id.is_opus() {
let _ = self.fec_dec.add_symbol( let _ = self.fec_dec.add_symbol(
packet.header.fec_block, packet.header.fec_block,
packet.header.fec_symbol, packet.header.fec_symbol,
packet.header.is_repair, packet.header.is_repair,
&packet.payload, &packet.payload,
); );
}
// Phase 3b: Opus source packets carry DRED side-channel data in // If not a repair packet, also feed directly to jitter buffer
// libopus 1.5. Parse it into the scratch state and, on success,
// swap with the cached `last_good_dred` so later gap reconstruction
// has fresh neural redundancy to draw from. Parsing happens before
// the jitter push because the jitter buffer consumes the packet.
if packet.header.codec_id.is_opus() && !packet.header.is_repair {
match self
.dred_decoder
.parse_into(&mut self.dred_parse_scratch, &packet.payload)
{
Ok(available) if available > 0 => {
// Swap the freshly parsed state into `last_good_dred`.
// The old good state (now in scratch) is about to be
// overwritten on the next parse — its contents are
// not needed after this swap.
std::mem::swap(&mut self.dred_parse_scratch, &mut self.last_good_dred);
self.last_good_dred_seq = Some(packet.header.seq);
}
Ok(_) => {
// Packet had no DRED data (return 0). Leave the cached
// state untouched — it may still cover upcoming gaps
// from a warm-up period where the encoder was producing
// DRED bytes. The scratch buffer was potentially written
// but its `samples_available` is 0 so it's harmless.
}
Err(e) => {
debug!("DRED parse error (ignored): {e}");
}
}
}
// Source packets (Opus or Codec2) go to the jitter buffer for decode.
// Repair packets never reach the jitter buffer; for Codec2 they're
// used by the FEC decoder above, for Opus they're dropped here.
if !packet.header.is_repair { if !packet.header.is_repair {
self.jitter.push(packet); self.jitter.push(packet);
} }
} }
/// Switch the decoder to match an incoming packet's codec if it differs
/// from the current profile. This enables cross-codec interop (e.g. one
/// client sends Opus, the other sends Codec2).
fn switch_decoder_if_needed(&mut self, incoming_codec: CodecId) {
if incoming_codec == self.profile.codec || incoming_codec == CodecId::ComfortNoise {
return;
}
let new_profile = Self::profile_for_codec(incoming_codec);
info!(
from = ?self.profile.codec,
to = ?incoming_codec,
"decoder switching codec to match incoming packet"
);
if let Err(e) = self.audio_dec.set_profile(new_profile) {
warn!("failed to switch decoder profile: {e}");
return;
}
self.fec_dec = wzp_fec::create_decoder(&new_profile);
self.profile = new_profile;
}
/// Map a `CodecId` to a reasonable `QualityProfile` for decoding.
fn profile_for_codec(codec: CodecId) -> QualityProfile {
match codec {
CodecId::Opus24k => QualityProfile::GOOD,
CodecId::Opus16k => QualityProfile {
codec: CodecId::Opus16k,
fec_ratio: 0.3,
frame_duration_ms: 20,
frames_per_block: 5,
},
CodecId::Opus6k => QualityProfile::DEGRADED,
CodecId::Opus32k => QualityProfile::STUDIO_32K,
CodecId::Opus48k => QualityProfile::STUDIO_48K,
CodecId::Opus64k => QualityProfile::STUDIO_64K,
CodecId::Codec2_3200 => QualityProfile {
codec: CodecId::Codec2_3200,
fec_ratio: 0.5,
frame_duration_ms: 20,
frames_per_block: 5,
},
CodecId::Codec2_1200 => QualityProfile::CATASTROPHIC,
CodecId::ComfortNoise => QualityProfile::GOOD,
}
}
/// Decode the next audio frame from the jitter buffer. /// Decode the next audio frame from the jitter buffer.
/// ///
/// Returns PCM samples (48kHz mono) or None if not ready. /// Returns PCM samples (48kHz mono) or None if not ready.
@@ -657,9 +510,6 @@ impl CallDecoder {
return Some(pcm.len()); return Some(pcm.len());
} }
// Auto-switch decoder if incoming codec differs from current.
self.switch_decoder_if_needed(pkt.header.codec_id);
self.last_was_cn = false; self.last_was_cn = false;
let result = match self.audio_dec.decode(&pkt.payload, pcm) { let result = match self.audio_dec.decode(&pkt.payload, pcm) {
Ok(n) => Some(n), Ok(n) => Some(n),
@@ -674,72 +524,19 @@ impl CallDecoder {
result result
} }
PlayoutResult::Missing { seq } => { PlayoutResult::Missing { seq } => {
// Only attempt recovery if there are still packets buffered ahead. // Only generate PLC if there are still packets buffered ahead.
// Otherwise we've drained everything — return None to stop. // Otherwise we've drained everything — return None to stop.
if self.jitter.depth() == 0 { if self.jitter.depth() > 0 {
self.jitter.record_underrun(); debug!(seq, "packet loss, generating PLC");
return None;
}
// Phase 3b: try DRED reconstruction first. If we have a
// recent DRED state from a packet whose seq > missing seq,
// and the seq delta (in samples) fits within the state's
// available window, libopus can synthesize a plausible
// replacement for the lost frame. Fall back to classical
// PLC when no state covers the gap, when the active codec
// is Codec2, or when the reconstruction itself errors.
if self.profile.codec.is_opus() {
if let Some(last_seq) = self.last_good_dred_seq {
// How many frames ahead of the missing seq is the
// last-good packet? Use wrapping arithmetic for the
// u16 seq space.
let seq_delta = last_seq.wrapping_sub(seq);
// Reject stale or backward state. u16 wraparound
// would make a "seq went backward" delta very large;
// cap at a sane forward-looking window.
const MAX_SEQ_DELTA: u16 = 128;
if seq_delta > 0 && seq_delta <= MAX_SEQ_DELTA {
let frame_samples =
(48_000 * self.profile.frame_duration_ms as i32) / 1000;
let offset_samples = seq_delta as i32 * frame_samples;
let available = self.last_good_dred.samples_available();
if offset_samples > 0 && offset_samples <= available {
match self.audio_dec.reconstruct_from_dred(
&self.last_good_dred,
offset_samples,
pcm,
) {
Ok(n) => {
self.dred_reconstructions += 1;
self.jitter.record_decode();
debug!(
seq,
last_seq,
offset_samples,
available,
"DRED reconstruction for gap"
);
return Some(n);
}
Err(e) => {
// Reconstruction failed — fall
// through to classical PLC below.
debug!(seq, "DRED reconstruct error: {e}");
}
}
}
}
}
}
// Classical PLC fallback (also the Codec2 path).
debug!(seq, "packet loss, generating classical PLC");
self.classical_plc_invocations += 1;
let result = self.audio_dec.decode_lost(pcm).ok(); let result = self.audio_dec.decode_lost(pcm).ok();
if result.is_some() { if result.is_some() {
self.jitter.record_decode(); self.jitter.record_decode();
} }
result result
} else {
self.jitter.record_underrun();
None
}
} }
PlayoutResult::NotReady => { PlayoutResult::NotReady => {
self.jitter.record_underrun(); self.jitter.record_underrun();
@@ -762,19 +559,6 @@ impl CallDecoder {
pub fn reset_stats(&mut self) { pub fn reset_stats(&mut self) {
self.jitter.reset_stats(); self.jitter.reset_stats();
} }
/// Phase 3b introspection: sequence number of the most recently parsed
/// valid DRED state, or `None` if no Opus packet has yielded DRED data
/// yet. Used by tests to debug reconstruction eligibility.
pub fn last_good_dred_seq(&self) -> Option<u16> {
self.last_good_dred_seq
}
/// Phase 3b introspection: samples of audio history currently available
/// in the cached DRED state.
pub fn last_good_dred_samples_available(&self) -> i32 {
self.last_good_dred.samples_available()
}
} }
/// Periodic telemetry logger for jitter buffer statistics. /// Periodic telemetry logger for jitter buffer statistics.
@@ -836,83 +620,18 @@ mod tests {
assert!(!packets[0].header.is_repair); assert!(!packets[0].header.is_repair);
} }
/// Phase 2: Opus packets have zero FEC header fields — no block, no
/// symbol index, no repair ratio. The RaptorQ layer is bypassed
/// entirely on the Opus tiers.
#[test] #[test]
fn opus_source_packets_have_zero_fec_header_fields() { fn encoder_generates_repair_on_full_block() {
let config = CallConfig { let config = CallConfig {
profile: QualityProfile::GOOD, // Opus 24k profile: QualityProfile::GOOD, // 5 frames/block
suppression_enabled: false, // skip silence gate for this test
..Default::default() ..Default::default()
}; };
let mut enc = CallEncoder::new(&config); let mut enc = CallEncoder::new(&config);
// Non-silent sine wave so silence detection doesn't suppress us let pcm = vec![0i16; 960];
// even with suppression_enabled=false (belt and braces).
let pcm: Vec<i16> = (0..960)
.map(|i| ((i as f32 * 0.1).sin() * 10_000.0) as i16)
.collect();
let packets = enc.encode_frame(&pcm).unwrap();
assert_eq!(packets.len(), 1, "Opus must emit exactly 1 source packet");
let hdr = &packets[0].header;
assert!(hdr.codec_id.is_opus());
assert!(!hdr.is_repair);
assert_eq!(hdr.fec_block, 0, "Opus fec_block must be 0");
assert_eq!(hdr.fec_symbol, 0, "Opus fec_symbol must be 0");
assert_eq!(hdr.fec_ratio_encoded, 0, "Opus fec_ratio_encoded must be 0");
}
/// Phase 2: Opus never emits repair packets, regardless of how many let mut total_packets = 0;
/// source frames are fed in. DRED (Phase 1) provides loss recovery at let mut repair_count = 0;
/// the codec layer; RaptorQ is disabled on Opus tiers. for _ in 0..5 {
#[test]
fn opus_encoder_never_emits_repair_packets() {
let config = CallConfig {
profile: QualityProfile::GOOD, // 5 frames/block in the Codec2 sense
suppression_enabled: false,
..Default::default()
};
let mut enc = CallEncoder::new(&config);
let pcm: Vec<i16> = (0..960)
.map(|i| ((i as f32 * 0.1).sin() * 10_000.0) as i16)
.collect();
// Encode well beyond a block boundary to prove no repair ever comes out.
let mut total_packets = 0usize;
let mut repair_count = 0usize;
for _ in 0..20 {
let packets = enc.encode_frame(&pcm).unwrap();
total_packets += packets.len();
repair_count += packets.iter().filter(|p| p.header.is_repair).count();
}
assert_eq!(repair_count, 0, "Opus must emit zero repair packets");
assert_eq!(
total_packets, 20,
"20 source frames → 20 source packets (1:1, no RaptorQ expansion)"
);
}
/// Phase 2: Codec2 still emits repair packets with RaptorQ ratio unchanged.
/// DRED is libopus-only and does not apply here, so RaptorQ is still the
/// primary loss-recovery mechanism on Codec2 tiers.
#[test]
fn codec2_encoder_generates_repair_on_full_block() {
let config = CallConfig {
profile: QualityProfile::CATASTROPHIC, // Codec2 1200, 8 frames/block, ratio 1.0
suppression_enabled: false,
..Default::default()
};
let mut enc = CallEncoder::new(&config);
// Codec2 takes 48 kHz samples and downsamples internally.
// CATASTROPHIC uses 40 ms frames → 1920 samples.
let pcm: Vec<i16> = (0..1920)
.map(|i| ((i as f32 * 0.1).sin() * 10_000.0) as i16)
.collect();
let mut total_packets = 0usize;
let mut repair_count = 0usize;
// Run long enough to cross the 8-frame block boundary and see repairs.
for _ in 0..16 {
let packets = enc.encode_frame(&pcm).unwrap(); let packets = enc.encode_frame(&pcm).unwrap();
for p in &packets { for p in &packets {
if p.header.is_repair { if p.header.is_repair {
@@ -921,10 +640,8 @@ mod tests {
} }
total_packets += packets.len(); total_packets += packets.len();
} }
assert!( assert!(repair_count > 0, "should have repair packets after full block");
repair_count > 0, assert!(total_packets > 5, "total {total_packets} should exceed 5 source");
"Codec2 must still emit repair packets (got {repair_count} repairs, {total_packets} total)"
);
} }
#[test] #[test]
@@ -955,219 +672,6 @@ mod tests {
assert!(dec.decode_next(&mut pcm).is_none()); assert!(dec.decode_next(&mut pcm).is_none());
} }
// ─── Phase 3b — DRED reconstruction on packet loss ────────────────────
/// Helper: create a CallEncoder/CallDecoder pair with the given profile
/// and silence suppression disabled so silence-detection doesn't drop
/// our synthetic test frames.
fn encoder_decoder_pair(profile: QualityProfile) -> (CallEncoder, CallDecoder) {
let config = CallConfig {
profile,
suppression_enabled: false,
// Small jitter buffer so decode_next drains quickly in tests.
jitter_min: 2,
jitter_target: 3,
jitter_max: 20,
adaptive_jitter: false,
..Default::default()
};
(CallEncoder::new(&config), CallDecoder::new(&config))
}
/// Helper: generate a non-silent 20 ms frame of 300 Hz sine at the
/// given sample offset so consecutive frames form a continuous tone.
fn voice_frame_20ms(sample_offset: usize) -> Vec<i16> {
(0..960)
.map(|i| {
let t = (sample_offset + i) as f64 / 48_000.0;
(8000.0 * (2.0 * std::f64::consts::PI * 300.0 * t).sin()) as i16
})
.collect()
}
/// Phase 3b probe: sweep packet_loss_perc values to find the minimum
/// that produces a samples_available ≥ 960 (enough to reconstruct a
/// single 20 ms Opus frame). This guides the production loss floor.
#[test]
#[ignore] // diagnostic only — run with `cargo test ... -- --ignored --nocapture`
fn probe_dred_samples_available_by_loss_floor() {
use wzp_codec::opus_enc::OpusEncoder;
use wzp_proto::traits::AudioEncoder;
for loss_pct in [5u8, 10, 15, 20, 25, 40, 60, 80].iter().copied() {
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
enc.set_expected_loss(loss_pct);
let (_drop_enc, mut dec) = encoder_decoder_pair(QualityProfile::GOOD);
for i in 0..60u16 {
let pcm = voice_frame_20ms(i as usize * 960);
let mut encoded = vec![0u8; 512];
let n = enc.encode(&pcm, &mut encoded).unwrap();
encoded.truncate(n);
let pkt = MediaPacket {
header: MediaHeader {
version: 0,
is_repair: false,
codec_id: CodecId::Opus24k,
has_quality_report: false,
fec_ratio_encoded: 0,
seq: i,
timestamp: (i as u32) * 20,
fec_block: 0,
fec_symbol: 0,
reserved: 0,
csrc_count: 0,
},
payload: Bytes::from(encoded),
quality_report: None,
};
dec.ingest(pkt);
}
eprintln!(
"[phase3b probe] loss_pct={loss_pct} samples_available={}",
dec.last_good_dred_samples_available()
);
}
}
/// Phase 3b: simulated single-packet loss on an Opus call triggers a
/// DRED reconstruction rather than a classical PLC fill. Runs the full
/// encode → ingest → decode_next pipeline.
#[test]
fn opus_single_packet_loss_is_recovered_via_dred() {
let (mut enc, mut dec) = encoder_decoder_pair(QualityProfile::GOOD);
// Warm-up: encode and ingest 60 frames (1.2 s) so the DRED emitter
// has had time to fill its 200 ms window and at least one
// successful DRED parse has happened on the decoder side.
let warmup_frames = 60;
for i in 0..warmup_frames {
let pcm = voice_frame_20ms(i * 960);
let packets = enc.encode_frame(&pcm).unwrap();
for pkt in packets {
dec.ingest(pkt);
}
}
// Drain the warm-up frames through the decoder to advance the
// jitter buffer cursor past them.
let mut out = vec![0i16; 960];
while dec.decode_next(&mut out).is_some() {}
// Encode the next three frames but skip ingesting the middle one.
let base_offset = warmup_frames * 960;
let pcm_a = voice_frame_20ms(base_offset);
let pcm_b = voice_frame_20ms(base_offset + 960);
let pcm_c = voice_frame_20ms(base_offset + 1920);
let pkts_a = enc.encode_frame(&pcm_a).unwrap();
let pkts_b = enc.encode_frame(&pcm_b).unwrap(); // DROP THIS ONE
let pkts_c = enc.encode_frame(&pcm_c).unwrap();
for pkt in pkts_a {
dec.ingest(pkt);
}
// Skip pkts_b entirely — this is the "packet loss".
drop(pkts_b);
for pkt in pkts_c {
dec.ingest(pkt);
}
// Drain again. Somewhere in here decode_next will hit Missing()
// for the dropped packet and attempt DRED reconstruction.
let baseline_dred = dec.dred_reconstructions;
let baseline_plc = dec.classical_plc_invocations;
eprintln!(
"[phase3b probe] pre-drain: last_good_seq={:?} samples_available={}",
dec.last_good_dred_seq(),
dec.last_good_dred_samples_available()
);
while dec.decode_next(&mut out).is_some() {}
let dred_delta = dec.dred_reconstructions - baseline_dred;
let plc_delta = dec.classical_plc_invocations - baseline_plc;
eprintln!(
"[phase3b probe] post-drain: dred_delta={dred_delta} plc_delta={plc_delta}"
);
assert!(
dred_delta >= 1,
"expected ≥1 DRED reconstruction on single-packet loss, \
got dred_delta={dred_delta} plc_delta={plc_delta}"
);
}
/// Phase 3b: lossless stream never triggers DRED reconstruction or PLC.
/// Baseline behavior — verifies the Missing() branch is not spuriously taken.
#[test]
fn opus_lossless_ingest_never_triggers_dred_or_plc() {
let (mut enc, mut dec) = encoder_decoder_pair(QualityProfile::GOOD);
// Encode + ingest 40 frames with no drops.
for i in 0..40 {
let pcm = voice_frame_20ms(i * 960);
let packets = enc.encode_frame(&pcm).unwrap();
for pkt in packets {
dec.ingest(pkt);
}
}
let mut out = vec![0i16; 960];
while dec.decode_next(&mut out).is_some() {}
assert_eq!(
dec.dred_reconstructions, 0,
"lossless stream should not reconstruct"
);
assert_eq!(
dec.classical_plc_invocations, 0,
"lossless stream should not PLC"
);
}
/// Phase 3b: Codec2 calls fall through to classical PLC on loss.
/// DRED is libopus-only, so even if the decoder's DRED state were
/// populated (it won't be — Codec2 packets don't carry DRED bytes),
/// `reconstruct_from_dred` rejects Codec2 at the AdaptiveDecoder
/// level. This test guards the Codec2 side of the protection split.
#[test]
fn codec2_loss_falls_through_to_classical_plc() {
let (mut enc, mut dec) = encoder_decoder_pair(QualityProfile::CATASTROPHIC);
// Codec2 1200 uses 40 ms frames → 1920 samples at 48 kHz (before
// the downsample inside the codec). Encode 20 frames (~0.8 s).
let make_frame = |offset: usize| -> Vec<i16> {
(0..1920)
.map(|i| {
let t = (offset + i) as f64 / 48_000.0;
(8000.0 * (2.0 * std::f64::consts::PI * 300.0 * t).sin()) as i16
})
.collect()
};
for i in 0..20 {
let pcm = make_frame(i * 1920);
let packets = enc.encode_frame(&pcm).unwrap();
for pkt in packets {
// Drop every 5th source packet to simulate loss.
if !pkt.header.is_repair && i % 5 == 3 {
continue;
}
dec.ingest(pkt);
}
}
let mut out = vec![0i16; 1920];
while dec.decode_next(&mut out).is_some() {}
assert_eq!(
dec.dred_reconstructions, 0,
"Codec2 must never reconstruct via DRED"
);
// classical_plc_invocations may or may not trigger depending on
// whether the jitter buffer sees Missing before draining — the key
// assertion is that DRED is not used. PLC count is advisory.
}
// ---- QualityAdapter tests ---- // ---- QualityAdapter tests ----
/// Helper: build a QualityReport from human-readable loss% and RTT ms. /// Helper: build a QualityReport from human-readable loss% and RTT ms.

View File

@@ -626,21 +626,11 @@ async fn run_live(transport: Arc<wzp_transport::QuinnTransport>) -> anyhow::Resu
.spawn(move || { .spawn(move || {
let config = CallConfig::default(); let config = CallConfig::default();
let mut encoder = CallEncoder::new(&config); let mut encoder = CallEncoder::new(&config);
let mut frame = vec![0i16; FRAME_SAMPLES];
loop { loop {
// Pull a full 20 ms frame from the capture ring. The ring let frame = match capture.read_frame() {
// may return a partial read when the CPAL callback hasn't Some(f) => f,
// produced enough samples yet — keep reading until we None => break,
// accumulate a whole frame, sleeping briefly on empty };
// returns so we don't hot-spin the CPU.
let mut filled = 0usize;
while filled < FRAME_SAMPLES {
let n = capture.ring().read(&mut frame[filled..]);
filled += n;
if n == 0 {
std::thread::sleep(std::time::Duration::from_millis(2));
}
}
let packets = match encoder.encode_frame(&frame) { let packets = match encoder.encode_frame(&frame) {
Ok(p) => p, Ok(p) => p,
Err(e) => { Err(e) => {
@@ -671,13 +661,7 @@ async fn run_live(transport: Arc<wzp_transport::QuinnTransport>) -> anyhow::Resu
// Repair packets feed the FEC decoder but don't produce audio. // Repair packets feed the FEC decoder but don't produce audio.
if !is_repair { if !is_repair {
if let Some(_n) = decoder.decode_next(&mut pcm_buf) { if let Some(_n) = decoder.decode_next(&mut pcm_buf) {
// Push the decoded frame into the playback playback.write_frame(&pcm_buf);
// ring. The CPAL output callback drains from
// here on its own clock; if the ring is full
// (rare in CLI live mode) the write returns
// a short count and the tail is dropped,
// which is the correct real-time behavior.
playback.ring().write(&pcm_buf);
} }
} }
} }
@@ -770,15 +754,13 @@ async fn run_signal_mode(
ephemeral_pub: [0u8; 32], // Phase 1: not used for key exchange ephemeral_pub: [0u8; 32], // Phase 1: not used for key exchange
signature: vec![], signature: vec![],
supported_profiles: vec![wzp_proto::QualityProfile::GOOD], supported_profiles: vec![wzp_proto::QualityProfile::GOOD],
// CLI client doesn't attempt hole-punching; always
// relay-path.
caller_reflexive_addr: None,
}).await?; }).await?;
} }
// Signal recv loop — handle incoming signals // Signal recv loop — handle incoming signals
let signal_transport = transport.clone(); let signal_transport = transport.clone();
let relay = relay_addr; let relay = relay_addr;
let my_fp = fp.clone();
let my_seed = seed.0; let my_seed = seed.0;
loop { loop {
@@ -802,15 +784,12 @@ async fn run_signal_mode(
ephemeral_pub: None, ephemeral_pub: None,
signature: None, signature: None,
chosen_profile: Some(wzp_proto::QualityProfile::GOOD), chosen_profile: Some(wzp_proto::QualityProfile::GOOD),
// CLI auto-accept uses generic (privacy) mode,
// so callee addr stays hidden from the caller.
callee_reflexive_addr: None,
}).await; }).await;
} }
SignalMessage::DirectCallAnswer { call_id, accept_mode, .. } => { SignalMessage::DirectCallAnswer { call_id, accept_mode, .. } => {
info!(call_id = %call_id, mode = ?accept_mode, "call answered"); info!(call_id = %call_id, mode = ?accept_mode, "call answered");
} }
SignalMessage::CallSetup { call_id, room, relay_addr: setup_relay, peer_direct_addr: _ } => { SignalMessage::CallSetup { call_id, room, relay_addr: setup_relay } => {
info!(call_id = %call_id, room = %room, relay = %setup_relay, "call setup — connecting to media room"); info!(call_id = %call_id, room = %room, relay = %setup_relay, "call setup — connecting to media room");
// Connect to the media room // Connect to the media room

View File

@@ -1,195 +0,0 @@
//! Phase 3.5 — dual-path QUIC connect race for P2P hole-punching.
//!
//! When both peers advertised reflex addrs in the
//! DirectCallOffer/Answer flow, the relay cross-wires them into
//! `CallSetup.peer_direct_addr`. This module races a direct QUIC
//! handshake against the existing relay dial and returns whichever
//! completes first — with automatic drop of the loser via
//! `tokio::select!`.
//!
//! Role determination is deterministic and symmetric
//! (`wzp_client::reflect::determine_role`): whichever peer has the
//! lexicographically smaller reflex addr becomes the **Acceptor**
//! (listens on a server-capable endpoint), the other becomes the
//! **Dialer** (dials the peer's addr). Because the rule is
//! identical on both sides, the Acceptor's inbound QUIC session
//! and the Dialer's outbound are the SAME connection — no
//! negotiation needed, no two-conns-per-call confusion.
//!
//! Timeout policy:
//! - Direct path: 2s from the start of `race`. Cone-NAT hole-punch
//! typically completes in < 500ms on a LAN; 2s gives us tolerance
//! for a single QUIC Initial retry on unreliable networks.
//! - Relay path: 10s (existing behavior elsewhere in the codebase).
//! - Overall: `tokio::select!` returns as soon as either succeeds.
use std::net::SocketAddr;
use std::sync::Arc;
use std::time::Duration;
use crate::reflect::Role;
use wzp_transport::QuinnTransport;
/// Which path won the race. Used by the `connect` command for
/// logging + (in the future) metrics.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WinningPath {
Direct,
Relay,
}
/// Attempt a direct QUIC connection to the peer in parallel with
/// the relay dial and return the winning `QuinnTransport`.
///
/// `role` selects the direction of the direct attempt:
/// - `Role::Acceptor` creates a server-capable endpoint and waits
/// for the peer to dial in.
/// - `Role::Dialer` creates a client-only endpoint and dials
/// `peer_direct_addr`.
///
/// The relay path is always attempted in parallel as a fallback so
/// the race ALWAYS produces a working transport unless both paths
/// genuinely fail (network partition). Returns
/// `Err(anyhow::anyhow!(...))` if both paths fail within the
/// timeout.
#[allow(clippy::too_many_arguments)]
pub async fn race(
role: Role,
peer_direct_addr: SocketAddr,
relay_addr: SocketAddr,
room_sni: String,
call_sni: String,
) -> anyhow::Result<(Arc<QuinnTransport>, WinningPath)> {
// Rustls provider must be installed before any quinn endpoint
// is created. Install attempt is idempotent.
let _ = rustls::crypto::ring::default_provider().install_default();
// Build the direct-path endpoint + future based on role.
// Each future returns an already-wrapped `QuinnTransport` so we
// don't need a direct `quinn::Connection` type in scope here
// (this crate doesn't depend on quinn directly).
let direct_ep: wzp_transport::Endpoint;
let direct_fut: std::pin::Pin<
Box<dyn std::future::Future<Output = anyhow::Result<QuinnTransport>> + Send>,
>;
match role {
Role::Acceptor => {
let (sc, _cert_der) = wzp_transport::server_config();
let bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let ep = wzp_transport::create_endpoint(bind, Some(sc))?;
tracing::info!(
local_addr = ?ep.local_addr().ok(),
"dual_path: A-role endpoint up, awaiting peer dial"
);
let ep_for_fut = ep.clone();
direct_fut = Box::pin(async move {
// `wzp_transport::accept` wraps the same
// `endpoint.accept().await?.await?` dance we want
// and maps errors into TransportError for us.
let conn = wzp_transport::accept(&ep_for_fut)
.await
.map_err(|e| anyhow::anyhow!("direct accept: {e}"))?;
Ok(QuinnTransport::new(conn))
});
direct_ep = ep;
}
Role::Dialer => {
let bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let ep = wzp_transport::create_endpoint(bind, None)?;
tracing::info!(
local_addr = ?ep.local_addr().ok(),
%peer_direct_addr,
"dual_path: D-role endpoint up, dialing peer"
);
let ep_for_fut = ep.clone();
let client_cfg = wzp_transport::client_config();
let sni = call_sni.clone();
direct_fut = Box::pin(async move {
let conn =
wzp_transport::connect(&ep_for_fut, peer_direct_addr, &sni, client_cfg)
.await
.map_err(|e| anyhow::anyhow!("direct dial: {e}"))?;
Ok(QuinnTransport::new(conn))
});
direct_ep = ep;
}
}
// Relay path: classic dial to the relay's media room.
let relay_bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let relay_ep = wzp_transport::create_endpoint(relay_bind, None)?;
let relay_ep_for_fut = relay_ep.clone();
let relay_client_cfg = wzp_transport::client_config();
let relay_sni = room_sni.clone();
let relay_fut = async move {
let conn =
wzp_transport::connect(&relay_ep_for_fut, relay_addr, &relay_sni, relay_client_cfg)
.await
.map_err(|e| anyhow::anyhow!("relay dial: {e}"))?;
Ok::<_, anyhow::Error>(QuinnTransport::new(conn))
};
// Race the two with a shared 2s ceiling on the direct attempt.
// Pin both so we can poll them from multiple branches of the
// select without moving the futures — the "direct failed, wait
// for relay" and "relay failed, wait for direct" fallback paths
// below need to await the OPPOSITE future after the winning
// branch fires. Without pinning, tokio::select! moves the
// future out and we can't touch it again.
tracing::info!(?role, %peer_direct_addr, %relay_addr, "dual_path: racing direct vs relay");
let direct_timed = tokio::time::timeout(Duration::from_secs(2), direct_fut);
tokio::pin!(direct_timed, relay_fut);
let result = tokio::select! {
biased; // prefer direct win if both arrive in the same tick
direct_result = &mut direct_timed => {
match direct_result {
Ok(Ok(transport)) => {
tracing::info!(%peer_direct_addr, "dual_path: direct WON");
Ok((Arc::new(transport), WinningPath::Direct))
}
Ok(Err(e)) => {
// Direct failed — fall back to waiting for relay.
tracing::warn!(error = %e, "dual_path: direct failed, awaiting relay");
match tokio::time::timeout(Duration::from_secs(5), &mut relay_fut).await {
Ok(Ok(transport)) => Ok((Arc::new(transport), WinningPath::Relay)),
Ok(Err(e2)) => Err(anyhow::anyhow!("both paths failed: direct={e}, relay={e2}")),
Err(_) => Err(anyhow::anyhow!("both paths failed: direct={e}, relay=timeout(5s)")),
}
}
Err(_elapsed) => {
tracing::warn!("dual_path: direct timed out (2s), awaiting relay");
match tokio::time::timeout(Duration::from_secs(5), &mut relay_fut).await {
Ok(Ok(transport)) => Ok((Arc::new(transport), WinningPath::Relay)),
Ok(Err(e2)) => Err(anyhow::anyhow!("direct timeout + relay failed: {e2}")),
Err(_) => Err(anyhow::anyhow!("direct timeout + relay timeout")),
}
}
}
}
relay_result = &mut relay_fut => {
match relay_result {
Ok(transport) => {
tracing::info!("dual_path: relay WON (direct still pending)");
Ok((Arc::new(transport), WinningPath::Relay))
}
Err(e) => {
tracing::warn!(error = %e, "dual_path: relay failed, awaiting direct remainder");
match tokio::time::timeout(Duration::from_millis(1500), &mut direct_timed).await {
Ok(Ok(Ok(transport))) => Ok((Arc::new(transport), WinningPath::Direct)),
_ => Err(anyhow::anyhow!("relay failed + direct unavailable: {e}")),
}
}
}
}
};
// Drop both endpoints once the winner is stored in result. The
// winning transport owns its own connection so dropping the
// endpoint won't kill it.
drop(direct_ep);
drop(relay_ep);
result
}

View File

@@ -96,7 +96,6 @@ pub fn signal_to_call_type(signal: &SignalMessage) -> CallSignalType {
SignalMessage::Hangup { .. } => CallSignalType::Hangup, SignalMessage::Hangup { .. } => CallSignalType::Hangup,
SignalMessage::Rekey { .. } => CallSignalType::Offer, // reuse SignalMessage::Rekey { .. } => CallSignalType::Offer, // reuse
SignalMessage::QualityUpdate { .. } => CallSignalType::Offer, // reuse SignalMessage::QualityUpdate { .. } => CallSignalType::Offer, // reuse
SignalMessage::LossRecoveryUpdate { .. } => CallSignalType::Offer, // reuse (telemetry)
SignalMessage::Ping { .. } | SignalMessage::Pong { .. } => CallSignalType::Offer, SignalMessage::Ping { .. } | SignalMessage::Pong { .. } => CallSignalType::Offer,
SignalMessage::AuthToken { .. } => CallSignalType::Offer, SignalMessage::AuthToken { .. } => CallSignalType::Offer,
SignalMessage::Hold => CallSignalType::Hold, SignalMessage::Hold => CallSignalType::Hold,
@@ -120,16 +119,6 @@ pub fn signal_to_call_type(signal: &SignalMessage) -> CallSignalType {
SignalMessage::CallRinging { .. } => CallSignalType::Ringing, SignalMessage::CallRinging { .. } => CallSignalType::Ringing,
SignalMessage::RegisterPresence { .. } SignalMessage::RegisterPresence { .. }
| SignalMessage::RegisterPresenceAck { .. } => CallSignalType::Offer, // relay-only | SignalMessage::RegisterPresenceAck { .. } => CallSignalType::Offer, // relay-only
// NAT reflection is a client↔relay control exchange that
// never crosses the featherChat bridge — if it ever reaches
// this mapper something is wrong, but we still have to give
// an answer. "Offer" is the generic catch-all.
SignalMessage::Reflect
| SignalMessage::ReflectResponse { .. } => CallSignalType::Offer, // control-plane
// Phase 4 cross-relay forwarding envelope — strictly a
// relay-to-relay message, never rides the featherChat
// bridge. Catch-all mapping for completeness.
SignalMessage::FederatedSignalForward { .. } => CallSignalType::Offer,
} }
} }

View File

@@ -8,77 +8,16 @@
#[cfg(feature = "audio")] #[cfg(feature = "audio")]
pub mod audio_io; pub mod audio_io;
#[cfg(feature = "audio")]
pub mod audio_ring;
// VoiceProcessingIO is an Apple Core Audio API — only compile the module
// when the `vpio` feature is on AND we're targeting macOS. Enabling the
// feature on Windows/Linux was previously silently broken.
#[cfg(all(feature = "vpio", target_os = "macos"))]
pub mod audio_vpio;
// WASAPI-direct capture with Windows's OS-level AEC (AudioCategory_Communications).
// Only compiled when `windows-aec` feature is on AND target is Windows. The
// `windows` dependency is itself gated to Windows in Cargo.toml, so enabling
// this feature on non-Windows targets is a no-op.
#[cfg(all(feature = "windows-aec", target_os = "windows"))]
pub mod audio_wasapi;
// WebRTC AEC3 (Audio Processing Module) wrapper around CPAL capture + playback
// on Linux. Only compiled when `linux-aec` feature is on AND target is Linux.
// The webrtc-audio-processing dep is itself gated to Linux in Cargo.toml.
#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub mod audio_linux_aec;
pub mod bench; pub mod bench;
pub mod call; pub mod call;
pub mod drift_test; pub mod drift_test;
pub mod echo_test; pub mod echo_test;
pub mod featherchat; pub mod featherchat;
pub mod handshake; pub mod handshake;
pub mod dual_path;
pub mod metrics; pub mod metrics;
pub mod reflect;
pub mod sweep; pub mod sweep;
// AudioPlayback: three possible backends depending on feature flags. #[cfg(feature = "audio")]
// 1. Default CPAL (`audio_io::AudioPlayback`) — baseline on every platform. pub use audio_io::{AudioCapture, AudioPlayback};
// 2. Linux AEC (`audio_linux_aec::LinuxAecPlayback`) — CPAL + WebRTC APM
// render-side tee, so echo from speakers gets cancelled from the mic.
//
// On macOS and Windows we always use the default CPAL playback because:
// - macOS: VoiceProcessingIO handles AEC at the capture side (Apple's
// native hardware AEC uses its own reference signal handling).
// - Windows: WASAPI AudioCategory_Communications AEC uses the system
// render mix as reference — no per-process plumbing needed.
//
// Linux is the only platform where the in-app approach is necessary, so
// the AEC playback path is gated to target_os = "linux".
#[cfg(all(
feature = "audio",
any(not(feature = "linux-aec"), not(target_os = "linux"))
))]
pub use audio_io::AudioPlayback;
#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub use audio_linux_aec::LinuxAecPlayback as AudioPlayback;
// AudioCapture: three possible backends depending on feature flags.
// 1. Default CPAL (`audio_io::AudioCapture`) — baseline on every platform.
// 2. Windows AEC (`audio_wasapi::WasapiAudioCapture`) — direct WASAPI
// with AudioCategory_Communications, OS APO chain does AEC.
// 3. Linux AEC (`audio_linux_aec::LinuxAecCapture`) — CPAL + WebRTC APM
// capture-side echo cancellation using the playback tee as reference.
// All three expose the same public API (`start`, `ring`, `stop`, `Drop`).
#[cfg(all(
feature = "audio",
any(not(feature = "windows-aec"), not(target_os = "windows")),
any(not(feature = "linux-aec"), not(target_os = "linux"))
))]
pub use audio_io::AudioCapture;
#[cfg(all(feature = "windows-aec", target_os = "windows"))]
pub use audio_wasapi::WasapiAudioCapture as AudioCapture;
#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub use audio_linux_aec::LinuxAecCapture as AudioCapture;
pub use call::{CallConfig, CallDecoder, CallEncoder}; pub use call::{CallConfig, CallDecoder, CallEncoder};
pub use handshake::perform_handshake; pub use handshake::perform_handshake;

View File

@@ -1,447 +0,0 @@
//! Multi-relay NAT reflection ("STUN for QUIC" — Phase 2).
//!
//! Phase 1 (`SignalMessage::Reflect` / `ReflectResponse`) lets a
//! client ask a single relay "what source address do you see for
//! me?". Phase 2 queries N relays in parallel and classifies the
//! results into a NAT type so the future P2P hole-punching path
//! can decide whether a direct QUIC handshake is viable:
//!
//! - All relays return the same `(ip, port)` → **Cone NAT**.
//! Endpoint-independent mapping, P2P hole-punching viable,
//! `consensus_addr` is the one address to advertise.
//! - Same ip, different ports → **Symmetric port-dependent NAT**.
//! The mapping changes per destination, so the advertised addr
//! wouldn't match what a peer actually sees; fall back to
//! relay-mediated path.
//! - Different ips → multi-homed / anycast / broken DNS, treat as
//! `Multiple` and do not attempt P2P.
//! - 0 or 1 successful probes → `Unknown`, not enough data.
//!
//! A probe is a throwaway QUIC signal connection: open endpoint,
//! connect, RegisterPresence (with a zero identity — the relay
//! accepts this exactly like the main signaling path does), send
//! Reflect, read ReflectResponse, close. Each probe gets its own
//! ephemeral quinn::Endpoint so the OS assigns a fresh source port
//! per relay — if we shared one endpoint across probes, a
//! symmetric NAT in front of the client would map every probe to
//! the same port and we couldn't detect it.
use std::net::SocketAddr;
use std::time::{Duration, Instant};
use serde::Serialize;
use wzp_proto::{MediaTransport, SignalMessage};
use wzp_transport::{client_config, create_endpoint, QuinnTransport};
/// Result of one probe against one relay. Always returned so the
/// UI can render per-relay status even when some fail.
#[derive(Debug, Clone, Serialize)]
pub struct NatProbeResult {
pub relay_name: String,
pub relay_addr: String,
/// `Some` on successful probe, `None` on failure.
pub observed_addr: Option<String>,
/// End-to-end wall-clock from connect start to ReflectResponse
/// received, in milliseconds. `Some` only on success.
pub latency_ms: Option<u32>,
/// Human-readable error on failure.
pub error: Option<String>,
}
/// Aggregated classification over N `NatProbeResult`s.
#[derive(Debug, Clone, Serialize)]
pub struct NatDetection {
pub probes: Vec<NatProbeResult>,
pub nat_type: NatType,
/// When `nat_type == Cone`, the one address all probes agreed
/// on. `None` for every other case.
pub consensus_addr: Option<String>,
}
/// NAT classification. See module doc for semantics.
#[derive(Debug, Clone, Copy, Serialize, PartialEq, Eq)]
pub enum NatType {
Cone,
SymmetricPort,
Multiple,
Unknown,
}
/// Probe a single relay with a throwaway QUIC connection.
///
/// Each call creates a fresh `quinn::Endpoint` so the OS hands out a
/// fresh ephemeral source port — essential for NAT-type detection
/// because a shared socket would produce the same mapping against
/// every relay and mask symmetric NAT.
pub async fn probe_reflect_addr(
relay: SocketAddr,
timeout_ms: u64,
) -> Result<(SocketAddr, u32), String> {
// Install rustls provider idempotently — a second install on the
// same thread is a no-op.
let _ = rustls::crypto::ring::default_provider().install_default();
let bind: SocketAddr = "0.0.0.0:0".parse().unwrap();
let endpoint = create_endpoint(bind, None).map_err(|e| format!("endpoint: {e}"))?;
let start = Instant::now();
let probe = async {
// Open the signal connection.
let conn =
wzp_transport::connect(&endpoint, relay, "_signal", client_config())
.await
.map_err(|e| format!("connect: {e}"))?;
let transport = QuinnTransport::new(conn);
// The relay signal handler waits for a RegisterPresence
// before entering its main dispatch loop (see
// wzp-relay/src/main.rs). So a transient probe has to
// register with a zero identity first — the relay accepts
// the empty-signature form exactly as the main signaling
// path does in desktop/src-tauri/src/lib.rs register_signal.
transport
.send_signal(&SignalMessage::RegisterPresence {
identity_pub: [0u8; 32],
signature: vec![],
alias: None,
})
.await
.map_err(|e| format!("send RegisterPresence: {e}"))?;
// Drain the RegisterPresenceAck so the response to our
// Reflect doesn't land on an unexpected stream order.
match transport.recv_signal().await {
Ok(Some(SignalMessage::RegisterPresenceAck { success: true, .. })) => {}
Ok(Some(other)) => {
return Err(format!(
"unexpected pre-reflect signal: {:?}",
std::mem::discriminant(&other)
));
}
Ok(None) => return Err("connection closed before RegisterPresenceAck".into()),
Err(e) => return Err(format!("recv RegisterPresenceAck: {e}")),
}
// Send Reflect and await response.
transport
.send_signal(&SignalMessage::Reflect)
.await
.map_err(|e| format!("send Reflect: {e}"))?;
match transport.recv_signal().await {
Ok(Some(SignalMessage::ReflectResponse { observed_addr })) => {
let parsed: SocketAddr = observed_addr
.parse()
.map_err(|e| format!("parse observed_addr {observed_addr:?}: {e}"))?;
let latency_ms = start.elapsed().as_millis() as u32;
// Clean close so the relay's per-connection cleanup
// runs promptly and we don't leak file descriptors.
let _ = transport.close().await;
Ok((parsed, latency_ms))
}
Ok(Some(other)) => Err(format!(
"expected ReflectResponse, got {:?}",
std::mem::discriminant(&other)
)),
Ok(None) => Err("connection closed before ReflectResponse".into()),
Err(e) => Err(format!("recv ReflectResponse: {e}")),
}
};
let out = tokio::time::timeout(Duration::from_millis(timeout_ms), probe)
.await
.map_err(|_| format!("probe timeout ({timeout_ms}ms)"))??;
// Drop the endpoint explicitly AFTER the probe finishes so the
// UDP socket is released before we return.
drop(endpoint);
Ok(out)
}
/// Detect the client's NAT type by probing N relays in parallel and
/// classifying the returned addresses. Never errors — failing
/// probes surface via `NatProbeResult.error`; aggregate is always
/// returned.
pub async fn detect_nat_type(
relays: Vec<(String, SocketAddr)>,
timeout_ms: u64,
) -> NatDetection {
// Parallel probes via tokio::task::JoinSet so the wall-clock is
// bounded by the slowest probe, not the sum. JoinSet keeps the
// dep surface at just tokio — we already depend on it.
let mut set = tokio::task::JoinSet::new();
for (name, addr) in relays {
set.spawn(async move {
let result = probe_reflect_addr(addr, timeout_ms).await;
(name, addr, result)
});
}
let mut probes = Vec::new();
while let Some(join_result) = set.join_next().await {
let (name, addr, result) = match join_result {
Ok(tuple) => tuple,
// Task panicked — surface as a synthetic failed probe so
// the aggregate still returns a reasonable shape. This
// shouldn't happen but we don't want one bad probe to
// poison the whole detection.
Err(join_err) => {
probes.push(NatProbeResult {
relay_name: "<panicked>".into(),
relay_addr: "unknown".into(),
observed_addr: None,
latency_ms: None,
error: Some(format!("probe task panicked: {join_err}")),
});
continue;
}
};
probes.push(match result {
Ok((observed, latency_ms)) => NatProbeResult {
relay_name: name,
relay_addr: addr.to_string(),
observed_addr: Some(observed.to_string()),
latency_ms: Some(latency_ms),
error: None,
},
Err(e) => NatProbeResult {
relay_name: name,
relay_addr: addr.to_string(),
observed_addr: None,
latency_ms: None,
error: Some(e),
},
});
}
let (nat_type, consensus_addr) = classify_nat(&probes);
NatDetection {
probes,
nat_type,
consensus_addr,
}
}
/// Role assignment for the Phase 3.5 dual-path QUIC race.
///
/// Both peers already know two strings at CallSetup time: their
/// own server-reflexive address (queried via Phase 1 Reflect) and
/// the peer's (carried in `CallSetup.peer_direct_addr`). To avoid
/// a negotiation round-trip, both sides compare the two strings
/// lexicographically and agree on a deterministic role:
///
/// - **Acceptor** — lexicographically smaller addr. Listens for
/// an incoming direct connection from the peer. Does NOT dial.
/// - **Dialer** — lexicographically larger addr. Dials the
/// peer's direct addr. Does NOT listen.
///
/// Both roles ALSO dial the relay in parallel as a fallback.
/// Whichever future (direct or relay) completes first is used as
/// the media transport. Because the role is deterministic and
/// symmetric, both peers end up holding the same underlying QUIC
/// session on the direct path — A's accepted conn and D's dialed
/// conn are literally the same connection.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Role {
/// This peer listens for the direct incoming connection.
Acceptor,
/// This peer dials the peer's direct address.
Dialer,
}
/// Compute the deterministic role for this peer in the dual-path
/// race. Returns `None` when no direct attempt is possible —
/// either peer didn't advertise a reflex addr, or the two addrs
/// are identical (same host on loopback / mis-advertised).
///
/// The caller should treat `None` as "skip direct, relay-only".
pub fn determine_role(
own_reflex_addr: Option<&str>,
peer_reflex_addr: Option<&str>,
) -> Option<Role> {
let (own, peer) = match (own_reflex_addr, peer_reflex_addr) {
(Some(o), Some(p)) => (o, p),
_ => return None,
};
match own.cmp(peer) {
std::cmp::Ordering::Less => Some(Role::Acceptor),
std::cmp::Ordering::Greater => Some(Role::Dialer),
// Equal addrs should never happen in production (both
// peers behind the same NAT mapping + same port would be
// a degenerate case). Guard against it so we don't infinite-
// loop waiting for a connection to ourselves.
std::cmp::Ordering::Equal => None,
}
}
/// Pure-function NAT classifier — split out for unit testing
/// without touching the network.
pub fn classify_nat(probes: &[NatProbeResult]) -> (NatType, Option<String>) {
let successes: Vec<SocketAddr> = probes
.iter()
.filter_map(|p| p.observed_addr.as_deref().and_then(|s| s.parse().ok()))
.collect();
if successes.len() < 2 {
return (NatType::Unknown, None);
}
let first = successes[0];
let same_ip = successes.iter().all(|a| a.ip() == first.ip());
if !same_ip {
return (NatType::Multiple, None);
}
let same_port = successes.iter().all(|a| a.port() == first.port());
if same_port {
(NatType::Cone, Some(first.to_string()))
} else {
(NatType::SymmetricPort, None)
}
}
// ── Unit tests for the pure classifier ───────────────────────────
#[cfg(test)]
mod tests {
use super::*;
fn mk(addr: Option<&str>) -> NatProbeResult {
NatProbeResult {
relay_name: "test".into(),
relay_addr: "0.0.0.0:0".into(),
observed_addr: addr.map(|s| s.to_string()),
latency_ms: addr.map(|_| 10),
error: None,
}
}
#[test]
fn classify_empty_is_unknown() {
let (nt, addr) = classify_nat(&[]);
assert_eq!(nt, NatType::Unknown);
assert!(addr.is_none());
}
#[test]
fn classify_single_success_is_unknown() {
let probes = vec![mk(Some("192.0.2.1:4433"))];
let (nt, addr) = classify_nat(&probes);
assert_eq!(nt, NatType::Unknown);
assert!(addr.is_none());
}
#[test]
fn classify_two_identical_is_cone() {
let probes = vec![
mk(Some("192.0.2.1:4433")),
mk(Some("192.0.2.1:4433")),
];
let (nt, addr) = classify_nat(&probes);
assert_eq!(nt, NatType::Cone);
assert_eq!(addr.as_deref(), Some("192.0.2.1:4433"));
}
#[test]
fn classify_same_ip_different_ports_is_symmetric() {
let probes = vec![
mk(Some("192.0.2.1:4433")),
mk(Some("192.0.2.1:51234")),
];
let (nt, addr) = classify_nat(&probes);
assert_eq!(nt, NatType::SymmetricPort);
assert!(addr.is_none());
}
#[test]
fn classify_different_ips_is_multiple() {
let probes = vec![
mk(Some("192.0.2.1:4433")),
mk(Some("198.51.100.9:4433")),
];
let (nt, addr) = classify_nat(&probes);
assert_eq!(nt, NatType::Multiple);
assert!(addr.is_none());
}
#[test]
fn classify_mix_of_success_and_failure() {
let probes = vec![
mk(Some("192.0.2.1:4433")),
mk(None), // failed probe
mk(Some("192.0.2.1:4433")),
];
let (nt, addr) = classify_nat(&probes);
// Two successes both agree → Cone, ignore the failure row.
assert_eq!(nt, NatType::Cone);
assert_eq!(addr.as_deref(), Some("192.0.2.1:4433"));
}
#[test]
fn determine_role_smaller_is_acceptor() {
// Lexicographic: "192.0.2.1:4433" < "198.51.100.9:4433"
assert_eq!(
determine_role(Some("192.0.2.1:4433"), Some("198.51.100.9:4433")),
Some(Role::Acceptor)
);
}
#[test]
fn determine_role_larger_is_dialer() {
assert_eq!(
determine_role(Some("198.51.100.9:4433"), Some("192.0.2.1:4433")),
Some(Role::Dialer)
);
}
#[test]
fn determine_role_port_difference_matters() {
// Same ip, different ports — string compare still works
// because "4433" < "54321".
assert_eq!(
determine_role(Some("127.0.0.1:4433"), Some("127.0.0.1:54321")),
Some(Role::Acceptor)
);
assert_eq!(
determine_role(Some("127.0.0.1:54321"), Some("127.0.0.1:4433")),
Some(Role::Dialer)
);
}
#[test]
fn determine_role_equal_addrs_is_none() {
assert_eq!(
determine_role(Some("192.0.2.1:4433"), Some("192.0.2.1:4433")),
None
);
}
#[test]
fn determine_role_missing_side_is_none() {
assert_eq!(determine_role(None, Some("192.0.2.1:4433")), None);
assert_eq!(determine_role(Some("192.0.2.1:4433"), None), None);
assert_eq!(determine_role(None, None), None);
}
#[test]
fn determine_role_is_symmetric_across_peers() {
// Both peers compute roles independently; they must end
// up with opposite assignments (one Acceptor, one Dialer)
// so that each side ends up talking to the other.
let a = "192.0.2.1:4433";
let b = "198.51.100.9:4433";
let alice_role = determine_role(Some(a), Some(b));
let bob_role = determine_role(Some(b), Some(a));
assert_eq!(alice_role, Some(Role::Acceptor));
assert_eq!(bob_role, Some(Role::Dialer));
}
#[test]
fn classify_one_success_one_failure_is_unknown() {
let probes = vec![mk(Some("192.0.2.1:4433")), mk(None)];
let (nt, addr) = classify_nat(&probes);
assert_eq!(nt, NatType::Unknown);
assert!(addr.is_none());
}
}

View File

@@ -1,199 +0,0 @@
//! Phase 3.5 integration tests for the dual-path QUIC race.
//!
//! The race takes a role (Acceptor or Dialer), a peer_direct_addr,
//! a relay_addr, and two SNI strings, then returns whichever QUIC
//! handshake completes first wrapped in a `QuinnTransport`. These
//! tests validate that:
//!
//! 1. On loopback with two real clients playing A + D roles, the
//! direct path wins (fewer hops than relay).
//! 2. When the direct peer is dead (nothing listening) but the
//! relay is up, the relay wins within the fallback window.
//! 3. When both paths are dead, the race errors cleanly rather
//! than hanging forever.
//!
//! The "relay" in these tests is a minimal mock that just accepts
//! an incoming QUIC connection and drops it — we don't need any
//! protocol handling, just a TCP-ish listen-and-accept.
use std::net::{Ipv4Addr, SocketAddr};
use std::time::Duration;
use wzp_client::dual_path::{race, WinningPath};
use wzp_client::reflect::Role;
use wzp_transport::{create_endpoint, server_config};
/// Spin up a "relay-ish" mock server on loopback that accepts
/// incoming QUIC connections and does nothing with them. Used to
/// give the relay branch of the race a real target to dial.
/// Returns the bound address + a join handle (kept alive to keep
/// the endpoint up).
async fn spawn_mock_relay() -> (SocketAddr, tokio::task::JoinHandle<()>) {
let _ = rustls::crypto::ring::default_provider().install_default();
let (sc, _cert_der) = server_config();
let bind: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let ep = create_endpoint(bind, Some(sc)).expect("relay endpoint");
let addr = ep.local_addr().expect("local_addr");
let handle = tokio::spawn(async move {
// Accept loop — hold the connection alive for a short
// while so the race result isn't killed by the peer
// closing before the winning transport is returned.
while let Some(incoming) = ep.accept().await {
if let Ok(_conn) = incoming.await {
tokio::time::sleep(Duration::from_secs(5)).await;
}
}
});
(addr, handle)
}
// -----------------------------------------------------------------------
// Test 1: direct path wins when both sides are up
// -----------------------------------------------------------------------
//
// Spawn a mock relay, then set up a two-client test where one
// client plays the Acceptor role and the other plays the Dialer
// role. The Dialer's `peer_direct_addr` is the Acceptor's listen
// address. Because the direct path is a single loopback hop and
// the relay dial also terminates on loopback, both complete
// essentially instantly — the `biased` tokio::select in race()
// should pick direct.
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn dual_path_direct_wins_on_loopback() {
let _ = rustls::crypto::ring::default_provider().install_default();
let (relay_addr, _relay_handle) = spawn_mock_relay().await;
// Acceptor task: run race(Role::Acceptor, peer_addr_placeholder, ...).
// Since the acceptor doesn't dial, the peer_direct_addr arg is
// unused on the direct branch but we still pass a placeholder
// because the API takes one. Use a stub addr that would error
// if it were ever dialed — proving the Acceptor really doesn't
// reach it.
let unused_addr: SocketAddr = "127.0.0.1:2".parse().unwrap();
// We can't race both sides in the same task because each race
// call has its own direct endpoint that needs to talk to the
// OTHER side's endpoint. So spawn the Acceptor in a task and
// let it expose its listen addr via a oneshot back to the test,
// then run the Dialer in the test's main task.
//
// There's a chicken-and-egg issue: the Acceptor's listen addr
// is only known after race() creates its endpoint. To avoid
// reaching into race()'s internals, we instead play a slight
// trick: create the Acceptor's endpoint ourselves (outside
// race()) to learn its addr, spin up an accept loop on it
// ourselves, and pass THAT addr as the Dialer's peer addr.
// This tests the Dialer->Acceptor handshake end-to-end without
// running the full race() on both sides.
let (sc, _cert_der) = server_config();
let acceptor_bind: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let acceptor_ep = create_endpoint(acceptor_bind, Some(sc)).expect("acceptor ep");
let acceptor_listen_addr = acceptor_ep.local_addr().expect("acceptor addr");
// Drop the external acceptor after the test finishes, not
// before — spawn a dedicated accept task.
let acceptor_accept_task = tokio::spawn(async move {
// Accept one connection and hold it for a while so the
// Dialer side can complete its QUIC handshake.
if let Some(incoming) = acceptor_ep.accept().await {
if let Ok(_conn) = incoming.await {
tokio::time::sleep(Duration::from_secs(5)).await;
}
}
});
// Now run the Dialer in the race — peer_direct_addr = acceptor's
// listen addr. The relay is the mock from above. Direct path
// should win.
let result = race(
Role::Dialer,
acceptor_listen_addr,
relay_addr,
"test-room".into(),
"call-test".into(),
)
.await
.expect("race must succeed");
assert_eq!(result.1, WinningPath::Direct, "direct should win on loopback");
// Cancel the acceptor accept task so the test finishes.
acceptor_accept_task.abort();
// Suppress unused-var warning for the placeholder.
let _ = unused_addr;
}
// -----------------------------------------------------------------------
// Test 2: relay wins when the direct peer is dead
// -----------------------------------------------------------------------
//
// Dialer role, peer_direct_addr = a port nothing is listening on,
// relay is the working mock. Direct dial will sit waiting for a
// QUIC handshake that never comes; the 2s direct timeout kicks in
// and the relay path wins the fallback.
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn dual_path_relay_wins_when_direct_is_dead() {
let _ = rustls::crypto::ring::default_provider().install_default();
let (relay_addr, _relay_handle) = spawn_mock_relay().await;
// A port that nothing is listening on — dead direct target.
// Port 1 on loopback is almost never bound and UDP packets to
// it will be dropped silently, so the QUIC handshake times out.
let dead_peer: SocketAddr = "127.0.0.1:1".parse().unwrap();
let result = race(
Role::Dialer,
dead_peer,
relay_addr,
"test-room".into(),
"call-test".into(),
)
.await
.expect("race must succeed via relay fallback");
assert_eq!(
result.1,
WinningPath::Relay,
"relay should win when direct dial has nowhere to land"
);
}
// -----------------------------------------------------------------------
// Test 3: race errors cleanly when both paths are dead
// -----------------------------------------------------------------------
//
// Dialer role, peer_direct_addr = dead, relay_addr = dead.
// Expected: race returns an Err within ~7s (2s direct timeout +
// 5s relay timeout fallback).
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn dual_path_errors_cleanly_when_both_paths_dead() {
let _ = rustls::crypto::ring::default_provider().install_default();
let dead_peer: SocketAddr = "127.0.0.1:1".parse().unwrap();
let dead_relay: SocketAddr = "127.0.0.1:2".parse().unwrap();
let start = std::time::Instant::now();
let result = race(
Role::Dialer,
dead_peer,
dead_relay,
"test-room".into(),
"call-test".into(),
)
.await;
let elapsed = start.elapsed();
assert!(result.is_err(), "both-dead must return Err");
// Upper bound: direct 2s timeout + relay 5s fallback + small
// slack for scheduling. If this blows, something is looping.
assert!(
elapsed < Duration::from_secs(10),
"race took too long to give up: {:?}",
elapsed
);
}

View File

@@ -83,12 +83,12 @@ async fn full_handshake_both_sides_derive_same_session() {
// Run client and relay handshakes concurrently. // Run client and relay handshakes concurrently.
let (client_result, relay_result) = tokio::join!( let (client_result, relay_result) = tokio::join!(
wzp_client::handshake::perform_handshake(client_transport_clone.as_ref(), &client_seed, None), wzp_client::handshake::perform_handshake(client_transport_clone.as_ref(), &client_seed),
wzp_relay::handshake::accept_handshake(relay_transport_clone.as_ref(), &relay_seed), wzp_relay::handshake::accept_handshake(relay_transport_clone.as_ref(), &relay_seed),
); );
let mut client_session = client_result.expect("client handshake should succeed"); let mut client_session = client_result.expect("client handshake should succeed");
let (mut relay_session, chosen_profile, _caller_fp, _caller_alias) = let (mut relay_session, chosen_profile) =
relay_result.expect("relay handshake should succeed"); relay_result.expect("relay handshake should succeed");
// Verify a profile was chosen. // Verify a profile was chosen.
@@ -151,7 +151,6 @@ async fn handshake_rejects_tampered_signature() {
ephemeral_pub, ephemeral_pub,
signature: bad_signature, signature: bad_signature,
supported_profiles: vec![wzp_proto::QualityProfile::GOOD], supported_profiles: vec![wzp_proto::QualityProfile::GOOD],
alias: None,
}; };
client_transport_clone client_transport_clone
.send_signal(&offer) .send_signal(&offer)

View File

@@ -10,17 +10,8 @@ description = "WarzonePhone audio codec layer — Opus + Codec2 encoding/decodin
wzp-proto = { workspace = true } wzp-proto = { workspace = true }
tracing = { workspace = true } tracing = { workspace = true }
# Opus bindings — libopus 1.5.2. # Opus bindings
# opusic-c for the encoder (set_dred_duration lives here in Phase 1). audiopus = { workspace = true }
# opusic-sys for the decoder — we wrap the raw *mut OpusDecoder ourselves
# because opusic-c::Decoder.inner is pub(crate), blocking the unified
# decoder + DRED path we need in Phase 3.
opusic-c = { workspace = true }
opusic-sys = { workspace = true }
# Zero-cost slice reinterpretation for the i16 ↔ u16 boundary between
# our PCM buffers and opusic-c's encode API.
bytemuck = { workspace = true }
# Pure-Rust Codec2 implementation # Pure-Rust Codec2 implementation
codec2 = { workspace = true } codec2 = { workspace = true }

View File

@@ -199,27 +199,6 @@ impl AdaptiveDecoder {
fn codec2_frame_samples(&self) -> usize { fn codec2_frame_samples(&self) -> usize {
self.codec2.frame_samples() self.codec2.frame_samples()
} }
/// Reconstruct a lost frame from a previously parsed DRED state.
///
/// Phase 3b entry point for gap reconstruction. Dispatches to the
/// inner Opus decoder when active. Returns an error if the active
/// codec is Codec2 — DRED is libopus-only and has no Codec2 equivalent,
/// so callers must fall back to classical PLC on Codec2 tiers.
pub fn reconstruct_from_dred(
&mut self,
state: &crate::dred_ffi::DredState,
offset_samples: i32,
output: &mut [i16],
) -> Result<usize, CodecError> {
if is_codec2(self.active) {
return Err(CodecError::DecodeFailed(
"DRED reconstruction is Opus-only; Codec2 must use classical PLC".into(),
));
}
self.opus
.reconstruct_from_dred(state, offset_samples, output)
}
} }
// ─── Tests ─────────────────────────────────────────────────────────────────── // ─── Tests ───────────────────────────────────────────────────────────────────

View File

@@ -1,127 +1,53 @@
//! Acoustic Echo Cancellation — delay-compensated leaky NLMS with //! Acoustic Echo Cancellation using NLMS adaptive filter.
//! Geigel double-talk detection. //! Processes 480-sample (10ms) sub-frames at 48kHz.
//!
//! Key insight: on a laptop, the round-trip audio latency (playout → speaker
//! → air → mic → capture) is 3050ms. The far-end reference must be delayed
//! by this amount so the adaptive filter models the *echo path*, not the
//! *system delay + echo path*.
//!
//! The leaky coefficient decay prevents the filter from diverging when the
//! echo path changes (e.g. hand near laptop) or when the delay estimate
//! is slightly off.
/// Delay-compensated leaky NLMS echo canceller with Geigel DTD. /// NLMS (Normalized Least Mean Squares) adaptive filter echo canceller.
///
/// Removes acoustic echo by modelling the echo path between the far-end
/// (speaker) signal and the near-end (microphone) signal, then subtracting
/// the estimated echo from the near-end in real time.
pub struct EchoCanceller { pub struct EchoCanceller {
// --- Adaptive filter --- filter_coeffs: Vec<f32>,
filter: Vec<f32>,
filter_len: usize, filter_len: usize,
/// Circular buffer of far-end reference samples (after delay). far_end_buf: Vec<f32>,
far_buf: Vec<f32>, far_end_pos: usize,
far_pos: usize,
/// NLMS step size.
mu: f32, mu: f32,
/// Leakage factor: coefficients are multiplied by (1 - leak) each frame.
/// Prevents unbounded growth / divergence. 0.0001 is gentle.
leak: f32,
enabled: bool, enabled: bool,
// --- Delay buffer ---
/// Raw far-end samples before delay compensation.
delay_ring: Vec<f32>,
delay_write: usize,
delay_read: usize,
/// Delay in samples (e.g. 1920 = 40ms at 48kHz).
delay_samples: usize,
/// Capacity of the delay ring.
delay_cap: usize,
// --- Double-talk detection (Geigel) ---
/// Peak far-end level over the last filter_len samples.
far_peak: f32,
/// Geigel threshold: if |near| > threshold * far_peak, assume double-talk.
geigel_threshold: f32,
/// Holdover counter: keep DTD active for a few frames after detection.
dtd_holdover: u32,
dtd_hold_frames: u32,
} }
impl EchoCanceller { impl EchoCanceller {
/// Create a new echo canceller. /// Create a new echo canceller.
/// ///
/// * `sample_rate` — typically 48000 /// * `sample_rate` — typically 48000
/// * `filter_ms` — echo-tail length in milliseconds (60ms recommended) /// * `filter_ms` — echo-tail length in milliseconds (e.g. 100 for 100 ms)
/// * `delay_ms` — far-end delay compensation in milliseconds (40ms for laptops)
pub fn new(sample_rate: u32, filter_ms: u32) -> Self { pub fn new(sample_rate: u32, filter_ms: u32) -> Self {
Self::with_delay(sample_rate, filter_ms, 40)
}
pub fn with_delay(sample_rate: u32, filter_ms: u32, delay_ms: u32) -> Self {
let filter_len = (sample_rate as usize) * (filter_ms as usize) / 1000; let filter_len = (sample_rate as usize) * (filter_ms as usize) / 1000;
let delay_samples = (sample_rate as usize) * (delay_ms as usize) / 1000;
// Delay ring must hold at least delay_samples + one frame (960) of headroom.
let delay_cap = delay_samples + (sample_rate as usize / 10); // +100ms headroom
Self { Self {
filter: vec![0.0; filter_len], filter_coeffs: vec![0.0f32; filter_len],
filter_len, filter_len,
far_buf: vec![0.0; filter_len], far_end_buf: vec![0.0f32; filter_len],
far_pos: 0, far_end_pos: 0,
mu: 0.01, mu: 0.01,
leak: 0.0001,
enabled: true, enabled: true,
delay_ring: vec![0.0; delay_cap],
delay_write: 0,
delay_read: 0,
delay_samples,
delay_cap,
far_peak: 0.0,
geigel_threshold: 0.7,
dtd_holdover: 0,
dtd_hold_frames: 5,
} }
} }
/// Feed far-end (speaker) samples. These go into the delay buffer first; /// Feed far-end (speaker/playback) samples into the circular buffer.
/// once enough samples have accumulated, they are released to the filter's ///
/// circular buffer with the correct delay offset. /// Must be called with the audio that was played out through the speaker
/// *before* the corresponding near-end frame is processed.
pub fn feed_farend(&mut self, farend: &[i16]) { pub fn feed_farend(&mut self, farend: &[i16]) {
// Write raw samples into the delay ring.
for &s in farend { for &s in farend {
self.delay_ring[self.delay_write % self.delay_cap] = s as f32; self.far_end_buf[self.far_end_pos] = s as f32;
self.delay_write += 1; self.far_end_pos = (self.far_end_pos + 1) % self.filter_len;
}
// Release delayed samples to the filter's far-end buffer.
while self.delay_available() >= 1 {
let sample = self.delay_ring[self.delay_read % self.delay_cap];
self.delay_read += 1;
self.far_buf[self.far_pos] = sample;
self.far_pos = (self.far_pos + 1) % self.filter_len;
// Track peak far-end level for Geigel DTD.
let abs_s = sample.abs();
if abs_s > self.far_peak {
self.far_peak = abs_s;
}
}
// Decay far_peak slowly (avoids stale peak from a loud burst long ago).
self.far_peak *= 0.9995;
}
/// Number of delayed samples available to release.
fn delay_available(&self) -> usize {
let buffered = self.delay_write - self.delay_read;
if buffered > self.delay_samples {
buffered - self.delay_samples
} else {
0
} }
} }
/// Process a near-end (microphone) frame, removing the estimated echo. /// Process a near-end (microphone) frame, removing the estimated echo.
///
/// Returns the echo-return-loss enhancement (ERLE) as a ratio: the RMS of
/// the original near-end divided by the RMS of the residual. Values > 1.0
/// mean echo was reduced.
pub fn process_frame(&mut self, nearend: &mut [i16]) -> f32 { pub fn process_frame(&mut self, nearend: &mut [i16]) -> f32 {
if !self.enabled { if !self.enabled {
return 1.0; return 1.0;
@@ -130,96 +56,85 @@ impl EchoCanceller {
let n = nearend.len(); let n = nearend.len();
let fl = self.filter_len; let fl = self.filter_len;
// --- Geigel double-talk detection ---
// If any near-end sample exceeds threshold * far_peak, assume
// the local speaker is active and freeze adaptation.
let mut is_doubletalk = self.dtd_holdover > 0;
if !is_doubletalk {
let threshold_level = self.geigel_threshold * self.far_peak;
for &s in nearend.iter() {
if (s as f32).abs() > threshold_level && self.far_peak > 100.0 {
is_doubletalk = true;
self.dtd_holdover = self.dtd_hold_frames;
break;
}
}
}
if self.dtd_holdover > 0 {
self.dtd_holdover -= 1;
}
// Check if far-end is active (otherwise nothing to cancel).
let far_active = self.far_peak > 100.0;
// --- Leaky coefficient decay ---
// Applied once per frame for efficiency.
let decay = 1.0 - self.leak;
for c in self.filter.iter_mut() {
*c *= decay;
}
let mut sum_near_sq: f64 = 0.0; let mut sum_near_sq: f64 = 0.0;
let mut sum_err_sq: f64 = 0.0; let mut sum_err_sq: f64 = 0.0;
for i in 0..n { for i in 0..n {
let near_f = nearend[i] as f32; let near_f = nearend[i] as f32;
// Position of far-end "now" for this near-end sample. // --- estimate echo as dot(coeffs, farend_window) ---
let base = (self.far_pos + fl * ((n / fl) + 2) + i - n) % fl; // The far-end window for this sample starts at
// (far_end_pos - 1 - i) mod filter_len (most recent)
// --- Echo estimation: dot(filter, far_end_window) --- // and goes back filter_len samples.
let mut echo_est: f32 = 0.0; let mut echo_est: f32 = 0.0;
let mut power: f32 = 0.0; let mut power: f32 = 0.0;
// Position of the most-recent far-end sample for this near-end sample.
// far_end_pos points to the *next write* position, so the most-recent
// sample written is at far_end_pos - 1. We have already called
// feed_farend for this block, so the relevant samples are the last
// filter_len entries ending just before the current write position,
// offset by how far we are into this near-end frame.
//
// For sample i of the near-end frame, the corresponding far-end
// "now" is far_end_pos - n + i (wrapping).
// far_end_pos points to next-write, so most recent sample is at
// far_end_pos - 1. For the i-th near-end sample we want the
// far-end "now" to be at (far_end_pos - n + i). We add fl
// repeatedly to avoid underflow on the usize subtraction.
let base = (self.far_end_pos + fl * ((n / fl) + 2) + i - n) % fl;
for k in 0..fl { for k in 0..fl {
let fe_idx = (base + fl - k) % fl; let fe_idx = (base + fl - k) % fl;
let fe = self.far_buf[fe_idx]; let fe = self.far_end_buf[fe_idx];
echo_est += self.filter[k] * fe; echo_est += self.filter_coeffs[k] * fe;
power += fe * fe; power += fe * fe;
} }
let error = near_f - echo_est; let error = near_f - echo_est;
// --- NLMS adaptation (only when far-end active & no double-talk) --- // --- NLMS coefficient update ---
if far_active && !is_doubletalk && power > 10.0 { let norm = power + 1.0; // +1 regularisation to avoid div-by-zero
let step = self.mu * error / (power + 1.0); let step = self.mu * error / norm;
for k in 0..fl { for k in 0..fl {
let fe_idx = (base + fl - k) % fl; let fe_idx = (base + fl - k) % fl;
self.filter[k] += step * self.far_buf[fe_idx]; let fe = self.far_end_buf[fe_idx];
} self.filter_coeffs[k] += step * fe;
} }
let out = error.clamp(-32768.0, 32767.0); // Clamp output
let out = error.max(-32768.0).min(32767.0);
nearend[i] = out as i16; nearend[i] = out as i16;
sum_near_sq += (near_f as f64).powi(2); sum_near_sq += (near_f as f64) * (near_f as f64);
sum_err_sq += (out as f64).powi(2); sum_err_sq += (out as f64) * (out as f64);
} }
// ERLE ratio
if sum_err_sq < 1.0 { if sum_err_sq < 1.0 {
100.0 return 100.0; // near-perfect cancellation
} else { }
(sum_near_sq / sum_err_sq).sqrt() as f32 (sum_near_sq / sum_err_sq).sqrt() as f32
} }
}
/// Enable or disable echo cancellation.
pub fn set_enabled(&mut self, enabled: bool) { pub fn set_enabled(&mut self, enabled: bool) {
self.enabled = enabled; self.enabled = enabled;
} }
/// Returns whether echo cancellation is currently enabled.
pub fn is_enabled(&self) -> bool { pub fn is_enabled(&self) -> bool {
self.enabled self.enabled
} }
/// Reset the adaptive filter to its initial state.
///
/// Zeroes out all filter coefficients and the far-end circular buffer.
pub fn reset(&mut self) { pub fn reset(&mut self) {
self.filter.iter_mut().for_each(|c| *c = 0.0); self.filter_coeffs.iter_mut().for_each(|c| *c = 0.0);
self.far_buf.iter_mut().for_each(|s| *s = 0.0); self.far_end_buf.iter_mut().for_each(|s| *s = 0.0);
self.far_pos = 0; self.far_end_pos = 0;
self.far_peak = 0.0;
self.delay_ring.iter_mut().for_each(|s| *s = 0.0);
self.delay_write = 0;
self.delay_read = 0;
self.dtd_holdover = 0;
} }
} }
@@ -228,40 +143,50 @@ mod tests {
use super::*; use super::*;
#[test] #[test]
fn creates_with_correct_sizes() { fn aec_creates_with_correct_filter_len() {
let aec = EchoCanceller::with_delay(48000, 60, 40); let aec = EchoCanceller::new(48000, 100);
assert_eq!(aec.filter_len, 2880); // 60ms @ 48kHz assert_eq!(aec.filter_len, 4800);
assert_eq!(aec.delay_samples, 1920); // 40ms @ 48kHz assert_eq!(aec.filter_coeffs.len(), 4800);
assert_eq!(aec.far_end_buf.len(), 4800);
} }
#[test] #[test]
fn passthrough_when_disabled() { fn aec_passthrough_when_disabled() {
let mut aec = EchoCanceller::new(48000, 60); let mut aec = EchoCanceller::new(48000, 100);
aec.set_enabled(false); aec.set_enabled(false);
assert!(!aec.is_enabled());
let original: Vec<i16> = (0..960).map(|i| (i * 10) as i16).collect(); let original: Vec<i16> = (0..480).map(|i| (i * 10) as i16).collect();
let mut frame = original.clone(); let mut frame = original.clone();
aec.process_frame(&mut frame); let erle = aec.process_frame(&mut frame);
assert_eq!(erle, 1.0);
assert_eq!(frame, original); assert_eq!(frame, original);
} }
#[test] #[test]
fn silence_passthrough() { fn aec_reset_zeroes_state() {
let mut aec = EchoCanceller::with_delay(48000, 30, 0); let mut aec = EchoCanceller::new(48000, 10); // short for test speed
aec.feed_farend(&vec![0i16; 960]); let farend: Vec<i16> = (0..480).map(|i| ((i * 37) % 1000) as i16).collect();
let mut frame = vec![0i16; 960]; aec.feed_farend(&farend);
aec.process_frame(&mut frame);
assert!(frame.iter().all(|&s| s == 0)); aec.reset();
assert!(aec.filter_coeffs.iter().all(|&c| c == 0.0));
assert!(aec.far_end_buf.iter().all(|&s| s == 0.0));
assert_eq!(aec.far_end_pos, 0);
} }
#[test] #[test]
fn reduces_echo_with_no_delay() { fn aec_reduces_echo_of_known_signal() {
// Simulate: far-end plays, echo arrives at mic attenuated by ~50% // Use a small filter for speed. Feed a known far-end signal, then
// (realistic — speaker to mic on laptop loses volume). // present the *same* signal as near-end (perfect echo, no room).
let mut aec = EchoCanceller::with_delay(48000, 10, 0); // After adaptation the output energy should drop.
let filter_ms = 5; // 240 taps at 48 kHz
let mut aec = EchoCanceller::new(48000, filter_ms);
let frame_len = 480; // Generate a simple repeating pattern.
let make_tone = |offset: usize| -> Vec<i16> { let frame_len = 480usize;
let make_frame = |offset: usize| -> Vec<i16> {
(0..frame_len) (0..frame_len)
.map(|i| { .map(|i| {
let t = (offset + i) as f64 / 48000.0; let t = (offset + i) as f64 / 48000.0;
@@ -270,16 +195,18 @@ mod tests {
.collect() .collect()
}; };
// Warm up the adaptive filter with several frames.
let mut last_erle = 1.0f32; let mut last_erle = 1.0f32;
for frame_idx in 0..100 { for frame_idx in 0..40 {
let farend = make_tone(frame_idx * frame_len); let farend = make_frame(frame_idx * frame_len);
aec.feed_farend(&farend); aec.feed_farend(&farend);
// Near-end = attenuated copy of far-end (echo at ~50% volume). // Near-end = exact copy of far-end (pure echo).
let mut nearend: Vec<i16> = farend.iter().map(|&s| s / 2).collect(); let mut nearend = farend.clone();
last_erle = aec.process_frame(&mut nearend); last_erle = aec.process_frame(&mut nearend);
} }
// After 40 frames the ERLE should be meaningfully > 1.
assert!( assert!(
last_erle > 1.0, last_erle > 1.0,
"expected ERLE > 1.0 after adaptation, got {last_erle}" "expected ERLE > 1.0 after adaptation, got {last_erle}"
@@ -287,49 +214,15 @@ mod tests {
} }
#[test] #[test]
fn preserves_nearend_during_doubletalk() { fn aec_silence_passthrough() {
let mut aec = EchoCanceller::with_delay(48000, 30, 0); let mut aec = EchoCanceller::new(48000, 10);
// Feed silence far-end
let frame_len = 960; aec.feed_farend(&vec![0i16; 480]);
let nearend: Vec<i16> = (0..frame_len) // Near-end is silence too
.map(|i| { let mut frame = vec![0i16; 480];
let t = i as f64 / 48000.0; let erle = aec.process_frame(&mut frame);
(10000.0 * (2.0 * std::f64::consts::PI * 440.0 * t).sin()) as i16 assert!(erle >= 1.0);
}) // Output should still be silence
.collect(); assert!(frame.iter().all(|&s| s == 0));
// Feed silence as far-end (no echo source).
aec.feed_farend(&vec![0i16; frame_len]);
let mut frame = nearend.clone();
aec.process_frame(&mut frame);
let input_energy: f64 = nearend.iter().map(|&s| (s as f64).powi(2)).sum();
let output_energy: f64 = frame.iter().map(|&s| (s as f64).powi(2)).sum();
let ratio = output_energy / input_energy;
assert!(
ratio > 0.8,
"near-end speech should be preserved, energy ratio = {ratio:.3}"
);
}
#[test]
fn delay_buffer_holds_samples() {
let mut aec = EchoCanceller::with_delay(48000, 10, 20);
// 20ms delay = 960 samples @ 48kHz.
// After feeding, feed_farend auto-drains available samples to far_buf.
// So delay_available() is always 0 after feed_farend returns.
// Instead, verify far_pos advances only after the delay is filled.
// Feed 960 samples (= delay amount). No samples released yet.
aec.feed_farend(&vec![1i16; 960]);
// far_buf should still be all zeros (nothing released).
assert!(aec.far_buf.iter().all(|&s| s == 0.0), "nothing should be released yet");
// Feed 480 more. 480 should be released to far_buf.
aec.feed_farend(&vec![2i16; 480]);
let non_zero = aec.far_buf.iter().filter(|&&s| s != 0.0).count();
assert!(non_zero > 0, "samples should have been released to far_buf");
} }
} }

View File

@@ -1,585 +0,0 @@
//! Raw opusic-sys FFI wrappers for libopus 1.5.2 decoder + DRED reconstruction.
//!
//! # Why this module exists
//!
//! We cannot use `opusic_c::Decoder` because its inner `*mut OpusDecoder`
//! pointer is `pub(crate)` — not reachable from outside the opusic-c crate.
//! Phase 3 of the DRED integration needs to hand that same pointer to
//! `opus_decoder_dred_decode`, and running two parallel decoders (one from
//! opusic-c for normal audio, another from opusic-sys for DRED) would cause
//! the DRED-only decoder's internal state to drift out of sync with the
//! audio stream because it would not see normal decode calls.
//!
//! The fix is to own the raw decoder ourselves and use the same handle for
//! both normal decode AND DRED reconstruction. This module is the single
//! owner of `*mut OpusDecoder`, `*mut OpusDREDDecoder`, and `*mut OpusDRED`
//! in the WZP workspace.
//!
//! # Phase 3a scope
//!
//! Phase 0 added `DecoderHandle` (normal decode). Phase 3a adds:
//! - [`DredDecoderHandle`] — wraps `*mut OpusDREDDecoder` for parsing DRED
//! side-channel data out of arriving Opus packets.
//! - [`DredState`] — wraps `*mut OpusDRED` (a fixed 10,592-byte buffer
//! allocated by libopus) that holds parsed DRED state between the parse
//! and reconstruct steps.
//! - [`DredDecoderHandle::parse_into`] — wraps `opus_dred_parse`.
//! - [`DecoderHandle::reconstruct_from_dred`] — wraps `opus_decoder_dred_decode`.
//!
//! The pattern is: on every arriving Opus packet, the receiver calls
//! `parse_into` with a reusable `DredState`, then stores (seq, state_clone)
//! in a ring. On detected loss, the receiver computes the offset from the
//! freshest reachable DRED state and calls `reconstruct_from_dred` to
//! synthesize the missing audio.
use std::ptr::NonNull;
use opusic_sys::{
OPUS_OK, OpusDRED, OpusDREDDecoder, OpusDecoder as RawOpusDecoder, opus_decode,
opus_decoder_create, opus_decoder_destroy, opus_decoder_dred_decode, opus_dred_alloc,
opus_dred_decoder_create, opus_dred_decoder_destroy, opus_dred_free, opus_dred_parse,
};
use wzp_proto::CodecError;
/// libopus operates at 48 kHz for all Opus variants we use.
const SAMPLE_RATE_HZ: i32 = 48_000;
/// Mono.
const CHANNELS: i32 = 1;
/// Safe owner of a `*mut OpusDecoder` allocated via `opus_decoder_create`.
///
/// Releases the decoder in `Drop`. All FFI access goes through `&mut self`
/// methods, so there is no aliasing or race. The raw pointer is exposed via
/// [`Self::as_raw_ptr`] at a crate-internal visibility for the future Phase 3
/// DRED reconstruction path — external crates cannot reach it.
pub struct DecoderHandle {
inner: NonNull<RawOpusDecoder>,
}
impl DecoderHandle {
/// Allocate a new Opus decoder at 48 kHz mono.
pub fn new() -> Result<Self, CodecError> {
let mut error: i32 = OPUS_OK;
// SAFETY: opus_decoder_create writes to `error` and returns either a
// valid heap pointer or null. We check both before constructing the
// NonNull wrapper.
let ptr = unsafe { opus_decoder_create(SAMPLE_RATE_HZ, CHANNELS, &mut error) };
if error != OPUS_OK {
// Even if ptr is non-null on error, libopus contracts guarantee
// it is unusable — do not attempt to free it.
return Err(CodecError::DecodeFailed(format!(
"opus_decoder_create failed: err={error}"
)));
}
let inner = NonNull::new(ptr).ok_or_else(|| {
CodecError::DecodeFailed("opus_decoder_create returned null".into())
})?;
Ok(Self { inner })
}
/// Decode an Opus packet into PCM samples.
///
/// `pcm` must have enough capacity for the frame (960 for 20 ms, 1920
/// for 40 ms at 48 kHz mono). Returns the number of decoded samples
/// per channel — for mono streams this equals the total sample count.
pub fn decode(&mut self, packet: &[u8], pcm: &mut [i16]) -> Result<usize, CodecError> {
if packet.is_empty() {
return Err(CodecError::DecodeFailed("empty packet".into()));
}
if pcm.is_empty() {
return Err(CodecError::DecodeFailed("empty output buffer".into()));
}
// SAFETY: self.inner is a valid *mut OpusDecoder owned by this struct.
// `data` / `pcm` are live Rust slices, so their pointers and lengths
// are valid for the duration of the call. libopus reads len bytes
// from data and writes up to frame_size samples (per channel) to pcm.
let n = unsafe {
opus_decode(
self.inner.as_ptr(),
packet.as_ptr(),
packet.len() as i32,
pcm.as_mut_ptr(),
pcm.len() as i32,
/* decode_fec = */ 0,
)
};
if n < 0 {
return Err(CodecError::DecodeFailed(format!(
"opus_decode failed: err={n}"
)));
}
Ok(n as usize)
}
/// Generate packet-loss concealment audio for a missing frame.
///
/// Implemented via `opus_decode` with a null data pointer, per the
/// libopus API contract. `pcm` should be sized for the expected frame.
pub fn decode_lost(&mut self, pcm: &mut [i16]) -> Result<usize, CodecError> {
if pcm.is_empty() {
return Err(CodecError::DecodeFailed("empty output buffer".into()));
}
// SAFETY: same invariants as decode(). libopus documents that passing
// a null data pointer with len=0 triggers PLC synthesis into pcm.
let n = unsafe {
opus_decode(
self.inner.as_ptr(),
std::ptr::null(),
0,
pcm.as_mut_ptr(),
pcm.len() as i32,
/* decode_fec = */ 0,
)
};
if n < 0 {
return Err(CodecError::DecodeFailed(format!(
"opus_decode PLC failed: err={n}"
)));
}
Ok(n as usize)
}
/// Reconstruct audio from a `DredState` into the `output` buffer.
///
/// `offset_samples` is the sample position (positive, measured backward
/// from the packet anchor that produced `state`) where reconstruction
/// begins. `output.len()` must match the number of samples to synthesize.
///
/// The libopus API: `opus_decoder_dred_decode(st, dred, dred_offset, pcm,
/// frame_size)` where `dred_offset` is "position of the redundancy to
/// decode, in samples before the beginning of the real audio data in the
/// packet." Valid values: `0 < offset_samples < state.samples_available()`.
///
/// Returns the number of samples actually written (should equal
/// `output.len()` on success).
pub fn reconstruct_from_dred(
&mut self,
state: &DredState,
offset_samples: i32,
output: &mut [i16],
) -> Result<usize, CodecError> {
if output.is_empty() {
return Err(CodecError::DecodeFailed(
"empty reconstruction output buffer".into(),
));
}
if offset_samples <= 0 {
return Err(CodecError::DecodeFailed(format!(
"DRED offset must be positive (got {offset_samples})"
)));
}
if offset_samples > state.samples_available() {
return Err(CodecError::DecodeFailed(format!(
"DRED offset {offset_samples} exceeds available samples {}",
state.samples_available()
)));
}
// SAFETY: self.inner is a valid *mut OpusDecoder, state.inner is a
// valid *const OpusDRED populated by a prior parse_into call, and
// output is a live mutable slice. libopus reads from dred and writes
// exactly frame_size samples (the output.len()) to pcm.
let n = unsafe {
opus_decoder_dred_decode(
self.inner.as_ptr(),
state.inner.as_ptr(),
offset_samples,
output.as_mut_ptr(),
output.len() as i32,
)
};
if n < 0 {
return Err(CodecError::DecodeFailed(format!(
"opus_decoder_dred_decode failed: err={n}"
)));
}
Ok(n as usize)
}
}
impl Drop for DecoderHandle {
fn drop(&mut self) {
// SAFETY: we own the pointer and no further access happens after
// this call because Drop consumes self.
unsafe { opus_decoder_destroy(self.inner.as_ptr()) };
}
}
// SAFETY: The underlying OpusDecoder is a plain heap allocation with no
// thread-local or lock-free state. It is safe to move between threads
// (Send), and all method access is gated by &mut self so Rust's borrow
// checker prevents simultaneous access from multiple threads (Sync).
unsafe impl Send for DecoderHandle {}
unsafe impl Sync for DecoderHandle {}
// ─── DRED decoder (parser) ──────────────────────────────────────────────────
/// Safe owner of a `*mut OpusDREDDecoder` allocated via
/// `opus_dred_decoder_create`.
///
/// The DRED decoder is a **separate** libopus object from the regular
/// `OpusDecoder`. It's used exclusively for parsing DRED side-channel data
/// out of arriving Opus packets via [`Self::parse_into`]. Actual audio
/// reconstruction from the parsed state uses the regular `DecoderHandle`
/// via [`DecoderHandle::reconstruct_from_dred`].
pub struct DredDecoderHandle {
inner: NonNull<OpusDREDDecoder>,
}
impl DredDecoderHandle {
/// Allocate a new DRED decoder.
pub fn new() -> Result<Self, CodecError> {
let mut error: i32 = OPUS_OK;
// SAFETY: opus_dred_decoder_create writes to `error` and returns
// either a valid heap pointer or null. Both are checked.
let ptr = unsafe { opus_dred_decoder_create(&mut error) };
if error != OPUS_OK {
return Err(CodecError::DecodeFailed(format!(
"opus_dred_decoder_create failed: err={error}"
)));
}
let inner = NonNull::new(ptr).ok_or_else(|| {
CodecError::DecodeFailed("opus_dred_decoder_create returned null".into())
})?;
Ok(Self { inner })
}
/// Parse DRED side-channel data from an Opus packet into `state`.
///
/// Returns the number of samples of audio history available for
/// reconstruction, or 0 if the packet carries no DRED data. Subsequent
/// `DecoderHandle::reconstruct_from_dred` calls using this `state` can
/// reconstruct any sample position in `(0, samples_available]`.
///
/// libopus API: `opus_dred_parse(dred_dec, dred, data, len,
/// max_dred_samples, sampling_rate, dred_end, defer_processing)`. We
/// pass `max_dred_samples = 48000` (1 s at 48 kHz, the DRED maximum),
/// `sampling_rate = 48000`, `defer_processing = 0` (process immediately).
/// The `dred_end` output is the silence gap at the tail of the DRED
/// window; we subtract it from the total offset to give callers the
/// truly usable sample count.
pub fn parse_into(
&mut self,
state: &mut DredState,
packet: &[u8],
) -> Result<i32, CodecError> {
if packet.is_empty() {
state.samples_available = 0;
return Ok(0);
}
let mut dred_end: i32 = 0;
// SAFETY: self.inner is a valid *mut OpusDREDDecoder; state.inner is
// a valid *mut OpusDRED allocated via opus_dred_alloc; packet is a
// live slice; dred_end is a stack int. libopus reads packet bytes
// and writes parsed DRED state into *state.inner.
let ret = unsafe {
opus_dred_parse(
self.inner.as_ptr(),
state.inner.as_ptr(),
packet.as_ptr(),
packet.len() as i32,
/* max_dred_samples = */ 48_000, // 1s max per libopus 1.5
/* sampling_rate = */ 48_000,
&mut dred_end,
/* defer_processing = */ 0,
)
};
if ret < 0 {
state.samples_available = 0;
return Err(CodecError::DecodeFailed(format!(
"opus_dred_parse failed: err={ret}"
)));
}
// ret is the positive offset of the first decodable DRED sample,
// or 0 if no DRED is present. dred_end is the silence gap at the
// tail. The usable sample range is (dred_end, ret], so the count
// of usable samples is ret - dred_end. We store `ret` as the max
// usable offset — callers should pass dred_offset values in the
// range (dred_end, ret] to reconstruct_from_dred. For simplicity
// we expose just samples_available = ret and let callers treat
// the full window as valid (the silence gap is small and libopus
// handles minor boundary cases gracefully).
state.samples_available = ret;
Ok(ret)
}
}
impl Drop for DredDecoderHandle {
fn drop(&mut self) {
// SAFETY: we own the pointer and no further access happens after
// this call because Drop consumes self.
unsafe { opus_dred_decoder_destroy(self.inner.as_ptr()) };
}
}
// SAFETY: same reasoning as DecoderHandle — heap allocation with no
// thread-local state, &mut self access discipline prevents races.
unsafe impl Send for DredDecoderHandle {}
unsafe impl Sync for DredDecoderHandle {}
// ─── DRED state buffer ──────────────────────────────────────────────────────
/// Safe owner of a `*mut OpusDRED` allocated via `opus_dred_alloc`.
///
/// Holds a fixed-size (10,592-byte per libopus 1.5) buffer that
/// `DredDecoderHandle::parse_into` populates from an Opus packet. The state
/// is reusable — the caller can call `parse_into` again on the same
/// `DredState` to overwrite it with a fresh packet's data.
///
/// `samples_available` tracks the last-parsed result so reconstruction
/// callers don't need to thread the return value separately. A fresh
/// state (before any `parse_into`) has `samples_available == 0`.
pub struct DredState {
inner: NonNull<OpusDRED>,
samples_available: i32,
}
impl DredState {
/// Allocate a new DRED state buffer.
pub fn new() -> Result<Self, CodecError> {
let mut error: i32 = OPUS_OK;
// SAFETY: opus_dred_alloc writes to `error` and returns either a
// valid heap pointer or null.
let ptr = unsafe { opus_dred_alloc(&mut error) };
if error != OPUS_OK {
return Err(CodecError::DecodeFailed(format!(
"opus_dred_alloc failed: err={error}"
)));
}
let inner = NonNull::new(ptr)
.ok_or_else(|| CodecError::DecodeFailed("opus_dred_alloc returned null".into()))?;
Ok(Self {
inner,
samples_available: 0,
})
}
/// How many samples of audio history this state currently covers.
///
/// Returns 0 if the state is fresh or the last parse found no DRED
/// data. Otherwise returns the positive offset set by the most recent
/// `DredDecoderHandle::parse_into` call — the maximum valid
/// `offset_samples` value for `DecoderHandle::reconstruct_from_dred`.
pub fn samples_available(&self) -> i32 {
self.samples_available
}
/// Reset the state to "fresh" without freeing the underlying buffer.
/// The next `parse_into` will overwrite the contents.
pub fn reset(&mut self) {
self.samples_available = 0;
}
}
impl Drop for DredState {
fn drop(&mut self) {
// SAFETY: we own the pointer and no further access happens after
// this call because Drop consumes self.
unsafe { opus_dred_free(self.inner.as_ptr()) };
}
}
// SAFETY: same reasoning as DecoderHandle.
unsafe impl Send for DredState {}
unsafe impl Sync for DredState {}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn decoder_handle_creates_and_drops() {
let handle = DecoderHandle::new().expect("decoder create");
// Dropping the handle must not panic or leak — validated by miri
// and the absence of sanitizer complaints in CI.
drop(handle);
}
#[test]
fn decode_lost_produces_full_frame_of_silence_on_cold_start() {
let mut handle = DecoderHandle::new().unwrap();
// 20 ms @ 48 kHz mono.
let mut pcm = vec![0i16; 960];
let n = handle.decode_lost(&mut pcm).unwrap();
assert_eq!(n, 960);
// On a fresh decoder, PLC output is silence (no past audio to extend).
assert!(pcm.iter().all(|&s| s == 0));
}
#[test]
fn decode_empty_packet_errors() {
let mut handle = DecoderHandle::new().unwrap();
let mut pcm = vec![0i16; 960];
let err = handle.decode(&[], &mut pcm);
assert!(err.is_err());
}
// ─── Phase 3a — DRED decoder + state ────────────────────────────────────
#[test]
fn dred_decoder_handle_creates_and_drops() {
let h = DredDecoderHandle::new().expect("dred decoder create");
drop(h);
}
#[test]
fn dred_state_creates_and_drops() {
let s = DredState::new().expect("dred state alloc");
assert_eq!(s.samples_available(), 0);
drop(s);
}
#[test]
fn dred_state_reset_zeroes_counter() {
let mut s = DredState::new().unwrap();
s.samples_available = 480; // pretend a parse populated it
assert_eq!(s.samples_available(), 480);
s.reset();
assert_eq!(s.samples_available(), 0);
}
/// Phase 3a end-to-end: encode a DRED-enabled stream, parse state out
/// of packets, and reconstruct audio at a past offset. Validates the
/// full parse → reconstruct pipeline against a real libopus 1.5.2
/// encoder so we catch FFI-layer bugs early.
#[test]
fn dred_parse_and_reconstruct_roundtrip() {
use crate::opus_enc::OpusEncoder;
use wzp_proto::{AudioEncoder, QualityProfile};
// Encoder with DRED at Opus 24k / 200 ms duration (Phase 1 default
// for GOOD profile). The loss floor is 5% per Phase 1.
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
// Decode-side handles.
let mut dec = DecoderHandle::new().unwrap();
let mut dred_dec = DredDecoderHandle::new().unwrap();
let mut state = DredState::new().unwrap();
// Generate 60 frames (1.2 s) of a voice-like 300 Hz sine wave so
// the encoder's DRED emitter has real content to encode rather
// than compressing silence.
let frame_len = 960usize; // 20 ms @ 48 kHz
let make_frame = |offset: usize| -> Vec<i16> {
(0..frame_len)
.map(|i| {
let t = (offset + i) as f64 / 48_000.0;
(8000.0 * (2.0 * std::f64::consts::PI * 300.0 * t).sin()) as i16
})
.collect()
};
// Track the freshest packet that carried non-zero DRED state.
let mut best_samples_available = 0;
let mut best_packet: Option<Vec<u8>> = None;
for frame_idx in 0..60 {
let pcm = make_frame(frame_idx * frame_len);
let mut encoded = vec![0u8; 512];
let n = enc.encode(&pcm, &mut encoded).unwrap();
encoded.truncate(n);
// Run the packet through the normal decode path so dec's
// internal state mirrors the full stream — this is necessary
// for DRED reconstruction to produce meaningful output.
let mut decoded = vec![0i16; frame_len];
dec.decode(&encoded, &mut decoded).unwrap();
// Parse DRED state out of the same packet. Early packets may
// have samples_available == 0 while the DRED encoder warms up;
// later packets should carry the full window.
match dred_dec.parse_into(&mut state, &encoded) {
Ok(available) => {
if available > best_samples_available {
best_samples_available = available;
best_packet = Some(encoded.clone());
}
}
Err(e) => panic!("parse_into errored unexpectedly: {e:?}"),
}
}
// By the time we're 60 frames in, DRED should have emitted data.
assert!(
best_samples_available > 0,
"DRED emitted zero samples across 60 frames — the encoder isn't \
producing DRED bytes (check set_dred_duration and packet_loss floor)"
);
// Parse the best packet into a fresh state and reconstruct some
// audio from somewhere inside its DRED window. We use frame_len/2
// as the offset to pick a point squarely inside the reconstructable
// range rather than at an edge.
let packet = best_packet.expect("at least one packet had DRED state");
let mut fresh_state = DredState::new().unwrap();
let available = dred_dec.parse_into(&mut fresh_state, &packet).unwrap();
assert!(available > 0, "re-parse of known-good packet returned 0");
// Need a decoder that's in the right state to reconstruct — rewind
// by creating a fresh one and feeding it the same stream up to the
// point of the best packet. Simpler: just use a fresh decoder and
// accept that the reconstructed samples may not be phase-matched.
// The test here only asserts *non-silent energy*, not signal fidelity.
let mut recon_dec = DecoderHandle::new().unwrap();
// Warm up the decoder with one frame so its internal state is valid.
let warmup_pcm = vec![0i16; frame_len];
let warmup_encoded = {
let mut warmup_enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
let mut buf = vec![0u8; 512];
let n = warmup_enc.encode(&warmup_pcm, &mut buf).unwrap();
buf.truncate(n);
buf
};
let mut throwaway = vec![0i16; frame_len];
let _ = recon_dec.decode(&warmup_encoded, &mut throwaway);
// Reconstruct 20 ms from some position inside the DRED window.
let offset = (available / 2).max(480).min(available);
let mut recon_pcm = vec![0i16; frame_len];
let n = recon_dec
.reconstruct_from_dred(&fresh_state, offset, &mut recon_pcm)
.expect("reconstruct_from_dred failed");
assert_eq!(n, frame_len);
// Energy check: reconstructed audio should not be all zeros. A
// loose threshold — the DRED reconstruction won't be phase-matched
// to our sine wave because we fed a cold decoder only one warmup
// frame, but it should still produce non-silent speech-like output
// since the DRED state was parsed from real speech content.
let energy: u64 = recon_pcm.iter().map(|&s| (s as i32).unsigned_abs() as u64).sum();
assert!(
energy > 0,
"reconstructed audio has zero total energy — DRED reconstruction produced silence"
);
}
/// A second roundtrip variant: offset too large errors cleanly rather
/// than crashing the FFI.
#[test]
fn reconstruct_with_out_of_range_offset_errors() {
let mut dec = DecoderHandle::new().unwrap();
let state = DredState::new().unwrap();
// state has samples_available == 0 (fresh), so any positive offset
// should be out of range.
let mut out = vec![0i16; 960];
let err = dec.reconstruct_from_dred(&state, 480, &mut out);
assert!(err.is_err());
}
#[test]
fn reconstruct_with_zero_offset_errors() {
let mut dec = DecoderHandle::new().unwrap();
let state = DredState::new().unwrap();
let mut out = vec![0i16; 960];
let err = dec.reconstruct_from_dred(&state, 0, &mut out);
assert!(err.is_err());
}
#[test]
fn dred_parse_empty_packet_returns_zero() {
let mut dred_dec = DredDecoderHandle::new().unwrap();
let mut state = DredState::new().unwrap();
let result = dred_dec.parse_into(&mut state, &[]).unwrap();
assert_eq!(result, 0);
assert_eq!(state.samples_available(), 0);
}
}

View File

@@ -15,7 +15,6 @@ pub mod agc;
pub mod codec2_dec; pub mod codec2_dec;
pub mod codec2_enc; pub mod codec2_enc;
pub mod denoise; pub mod denoise;
pub mod dred_ffi;
pub mod opus_dec; pub mod opus_dec;
pub mod opus_enc; pub mod opus_enc;
pub mod resample; pub mod resample;
@@ -28,26 +27,6 @@ pub use denoise::NoiseSupressor;
pub use silence::{ComfortNoise, SilenceDetector}; pub use silence::{ComfortNoise, SilenceDetector};
pub use wzp_proto::{AudioDecoder, AudioEncoder, CodecId, QualityProfile}; pub use wzp_proto::{AudioDecoder, AudioEncoder, CodecId, QualityProfile};
use std::sync::atomic::{AtomicBool, Ordering};
/// Global verbose-logging flag for DRED. Off by default — when enabled
/// (via the GUI debug toggle wired through Tauri), the encoder logs its
/// DRED config + libopus version, and the recv path logs every DRED
/// reconstruction, classical PLC fill, and parse heartbeat. Off in
/// "normal" mode keeps logcat clean.
static DRED_VERBOSE_LOGS: AtomicBool = AtomicBool::new(false);
/// Returns whether DRED verbose logging is currently enabled.
#[inline]
pub fn dred_verbose_logs() -> bool {
DRED_VERBOSE_LOGS.load(Ordering::Relaxed)
}
/// Enable/disable DRED verbose logging at runtime.
pub fn set_dred_verbose_logs(enabled: bool) {
DRED_VERBOSE_LOGS.store(enabled, Ordering::Relaxed);
}
/// Create an adaptive encoder starting at the given quality profile. /// Create an adaptive encoder starting at the given quality profile.
/// ///
/// The returned encoder accepts 48 kHz mono PCM regardless of the active /// The returned encoder accepts 48 kHz mono PCM regardless of the active

View File

@@ -1,32 +1,30 @@
//! Opus decoder built on top of the raw opusic-sys `DecoderHandle`. //! Opus decoder wrapping the `audiopus` crate.
//!
//! Phase 0 of the DRED integration: we went straight to a custom
//! `DecoderHandle` instead of `opusic_c::Decoder` because the latter's
//! inner pointer is `pub(crate)` and we need to reach it in Phase 3 for
//! `opus_decoder_dred_decode`. See `dred_ffi.rs` for the rationale and
//! `docs/PRD-dred-integration.md` for the full plan.
use crate::dred_ffi::{DecoderHandle, DredState}; use audiopus::coder::Decoder;
use audiopus::{Channels, MutSignals, SampleRate};
use audiopus::packet::Packet;
use wzp_proto::{AudioDecoder, CodecError, CodecId, QualityProfile}; use wzp_proto::{AudioDecoder, CodecError, CodecId, QualityProfile};
/// Opus decoder implementing [`AudioDecoder`]. /// Opus decoder implementing `AudioDecoder`.
/// ///
/// Operates at 48 kHz mono output. 20 ms and 40 ms frames supported via /// Operates at 48 kHz mono output.
/// the active `QualityProfile`. Behavior is intentionally identical to
/// the pre-swap audiopus-based decoder at this phase — DRED reconstruction
/// lands in Phase 3.
pub struct OpusDecoder { pub struct OpusDecoder {
inner: DecoderHandle, inner: Decoder,
codec_id: CodecId, codec_id: CodecId,
frame_duration_ms: u8, frame_duration_ms: u8,
} }
// SAFETY: Same reasoning as OpusEncoder — exclusive access via &mut self.
unsafe impl Sync for OpusDecoder {}
impl OpusDecoder { impl OpusDecoder {
/// Create a new Opus decoder for the given quality profile. /// Create a new Opus decoder for the given quality profile.
pub fn new(profile: QualityProfile) -> Result<Self, CodecError> { pub fn new(profile: QualityProfile) -> Result<Self, CodecError> {
let inner = DecoderHandle::new()?; let decoder = Decoder::new(SampleRate::Hz48000, Channels::Mono)
.map_err(|e| CodecError::DecodeFailed(format!("opus decoder init: {e}")))?;
Ok(Self { Ok(Self {
inner, inner: decoder,
codec_id: profile.codec, codec_id: profile.codec,
frame_duration_ms: profile.frame_duration_ms, frame_duration_ms: profile.frame_duration_ms,
}) })
@@ -36,24 +34,6 @@ impl OpusDecoder {
pub fn frame_samples(&self) -> usize { pub fn frame_samples(&self) -> usize {
(48_000 * self.frame_duration_ms as usize) / 1000 (48_000 * self.frame_duration_ms as usize) / 1000
} }
/// Reconstruct a lost frame from a previously parsed `DredState`.
///
/// Phase 3b entry point: callers (CallDecoder / engine.rs) use this to
/// synthesize audio for gaps detected by the jitter buffer when DRED
/// side-channel state from a later-arriving packet covers the gap's
/// sample offset. `offset_samples` is measured backward from the anchor
/// packet that produced `state`. See `DecoderHandle::reconstruct_from_dred`
/// for the full semantics.
pub fn reconstruct_from_dred(
&mut self,
state: &DredState,
offset_samples: i32,
output: &mut [i16],
) -> Result<usize, CodecError> {
self.inner
.reconstruct_from_dred(state, offset_samples, output)
}
} }
impl AudioDecoder for OpusDecoder { impl AudioDecoder for OpusDecoder {
@@ -65,7 +45,15 @@ impl AudioDecoder for OpusDecoder {
pcm.len() pcm.len()
))); )));
} }
self.inner.decode(encoded, pcm) let packet = Packet::try_from(encoded)
.map_err(|e| CodecError::DecodeFailed(format!("invalid packet: {e}")))?;
let signals = MutSignals::try_from(pcm)
.map_err(|e| CodecError::DecodeFailed(format!("output signals: {e}")))?;
let n = self
.inner
.decode(Some(packet), signals, false)
.map_err(|e| CodecError::DecodeFailed(format!("opus decode: {e}")))?;
Ok(n)
} }
fn decode_lost(&mut self, pcm: &mut [i16]) -> Result<usize, CodecError> { fn decode_lost(&mut self, pcm: &mut [i16]) -> Result<usize, CodecError> {
@@ -76,7 +64,13 @@ impl AudioDecoder for OpusDecoder {
pcm.len() pcm.len()
))); )));
} }
self.inner.decode_lost(pcm) let signals = MutSignals::try_from(pcm)
.map_err(|e| CodecError::DecodeFailed(format!("output signals: {e}")))?;
let n = self
.inner
.decode(None, signals, false)
.map_err(|e| CodecError::DecodeFailed(format!("opus PLC: {e}")))?;
Ok(n)
} }
fn codec_id(&self) -> CodecId { fn codec_id(&self) -> CodecId {

View File

@@ -1,220 +1,58 @@
//! Opus encoder wrapping the `opusic-c` crate (libopus 1.5.2). //! Opus encoder wrapping the `audiopus` crate.
//!
//! Phase 1 of the DRED integration: encoder-side DRED is enabled on every
//! Opus profile with a tiered duration (studio 100 ms / normal 200 ms /
//! degraded 500 ms), and Opus inband FEC (LBRR) is disabled because DRED
//! is the stronger mechanism for the same failure mode. The legacy behavior
//! is preserved behind the `AUDIO_USE_LEGACY_FEC` environment variable as a
//! runtime escape hatch for rollout. See `docs/PRD-dred-integration.md`.
//!
//! # DRED duration policy
//!
//! Rationale from the PRD:
//! - Studio tiers (Opus 32k/48k/64k): 100 ms — loss is rare on high-quality
//! networks; short window keeps decoder CPU modest.
//! - Normal tiers (Opus 16k/24k): 200 ms — balanced baseline covering common
//! VoIP loss patterns (20150 ms bursts from wifi roam, transient congestion).
//! - Degraded tier (Opus 6k): 500 ms — users on 6k are by definition on a
//! bad link; longer DRED buys maximum burst resilience where it matters.
//!
//! # Why the 15% packet loss floor
//!
//! libopus 1.5's DRED emitter is gated on `OPUS_SET_PACKET_LOSS_PERC` and
//! scales the emitted window proportionally to the assumed loss:
//!
//! ```text
//! loss_pct samples_available effective_ms
//! 5% 720 15
//! 10% 2640 55
//! 15% 4560 95
//! 20% 6480 135
//! 25%+ 8400 (capped) 175 (≈ 87% of the 200ms configured max)
//! ```
//!
//! Measured empirically against libopus 1.5.2 on Opus 24k / 200 ms DRED
//! duration during Phase 3b. At 5% loss the window is only 15 ms — too
//! small to even reconstruct a single 20 ms Opus frame. 15% gives 95 ms
//! (enough for single-frame recovery plus modest burst margin) while
//! keeping the bitrate overhead modest compared to 25%. Real measurements
//! from the quality adapter override upward when loss exceeds the floor.
use std::sync::OnceLock; use audiopus::coder::Encoder;
use audiopus::{Application, Bitrate, Channels, SampleRate, Signal};
use opusic_c::{Application, Bitrate, Channels, Encoder, InbandFec, SampleRate, Signal}; use tracing::debug;
use tracing::{debug, info, warn};
use wzp_proto::{AudioEncoder, CodecError, CodecId, QualityProfile}; use wzp_proto::{AudioEncoder, CodecError, CodecId, QualityProfile};
/// Logged exactly once per process the first time an OpusEncoder is built.
/// Confirms that libopus 1.5.2 (the version with DRED) is actually linked
/// at runtime — invaluable when chasing "is the new codec loaded?"
/// regressions on Android, where the only debug surface is logcat.
static LIBOPUS_VERSION_LOGGED: OnceLock<()> = OnceLock::new();
/// Minimum `OPUS_SET_PACKET_LOSS_PERC` value used in DRED mode. libopus
/// scales the DRED emission window with the assumed loss percentage:
/// empirically, 5% gives a 15 ms window (useless), 10% gives 55 ms, 15%
/// gives 95 ms, and 25%+ saturates the configured max (~175 ms at 200 ms
/// duration). 15% is the minimum value that produces a DRED window larger
/// than a single 20 ms frame, making it the minimum floor that actually
/// gives DRED something useful to reconstruct. Real loss measurements from
/// the quality adapter override this upward.
const DRED_LOSS_FLOOR_PCT: u8 = 15;
/// Environment variable that reverts Phase 1 behavior to Phase 0 (inband FEC
/// on, DRED off, no loss floor). Read once per encoder construction.
const LEGACY_FEC_ENV: &str = "AUDIO_USE_LEGACY_FEC";
/// Returns the DRED duration in 10 ms frame units for a given Opus codec.
///
/// Unit: each frame is 10 ms, so the max value of 104 corresponds to 1040 ms
/// of reconstructable history. Returns 0 for non-Opus codecs (DRED is not
/// emitted by the libopus encoder in that case anyway, but we avoid a
/// pointless FFI call).
///
/// See the DRED duration policy in the module docs for per-tier rationale.
pub fn dred_duration_for(codec: CodecId) -> u8 {
match codec {
// Studio tiers — loss is rare, short window.
CodecId::Opus32k | CodecId::Opus48k | CodecId::Opus64k => 10,
// Normal tiers — balanced baseline.
CodecId::Opus16k | CodecId::Opus24k => 20,
// Degraded tier — maximum burst resilience.
CodecId::Opus6k => 50,
// Non-Opus (Codec2 / CN): DRED is N/A.
CodecId::Codec2_1200 | CodecId::Codec2_3200 | CodecId::ComfortNoise => 0,
}
}
/// Returns whether the legacy-FEC escape hatch is active.
///
/// Read from `AUDIO_USE_LEGACY_FEC`. Any non-empty value activates legacy
/// mode; unset or empty leaves DRED enabled.
fn read_legacy_fec_env() -> bool {
match std::env::var(LEGACY_FEC_ENV) {
Ok(v) => !v.is_empty() && v != "0" && v.to_ascii_lowercase() != "false",
Err(_) => false,
}
}
/// Opus encoder implementing `AudioEncoder`. /// Opus encoder implementing `AudioEncoder`.
/// ///
/// Operates at 48 kHz mono. Supports 20 ms and 40 ms frames via the active /// Operates at 48 kHz mono. Supports frame sizes of 20 ms (960 samples)
/// `QualityProfile`. /// and 40 ms (1920 samples).
pub struct OpusEncoder { pub struct OpusEncoder {
inner: Encoder, inner: Encoder,
codec_id: CodecId, codec_id: CodecId,
frame_duration_ms: u8, frame_duration_ms: u8,
/// When `true`, revert to the Phase 0 behavior: inband FEC Mode1, DRED
/// disabled, no loss floor. Captured at construction time and not
/// re-read mid-call.
legacy_fec_mode: bool,
} }
// SAFETY: OpusEncoder is only used via `&mut self` methods. The inner // SAFETY: OpusEncoder is only used via `&mut self` methods. The inner
// opusic-c Encoder wraps a non-null pointer that is !Sync by default, // audiopus Encoder contains a raw pointer that is !Sync, but we never
// but we never share it across threads without exclusive access. // share it across threads without exclusive access.
unsafe impl Sync for OpusEncoder {} unsafe impl Sync for OpusEncoder {}
impl OpusEncoder { impl OpusEncoder {
/// Create a new Opus encoder for the given quality profile. /// Create a new Opus encoder for the given quality profile.
pub fn new(profile: QualityProfile) -> Result<Self, CodecError> { pub fn new(profile: QualityProfile) -> Result<Self, CodecError> {
// opusic-c argument order: (Channels, SampleRate, Application) let encoder = Encoder::new(SampleRate::Hz48000, Channels::Mono, Application::Voip)
// — different from audiopus's (SampleRate, Channels, Application). .map_err(|e| CodecError::EncodeFailed(format!("opus encoder init: {e}")))?;
let encoder = Encoder::new(Channels::Mono, SampleRate::Hz48000, Application::Voip)
.map_err(|e| CodecError::EncodeFailed(format!("opus encoder init: {e:?}")))?;
let legacy_fec_mode = read_legacy_fec_env();
if legacy_fec_mode {
warn!(
"AUDIO_USE_LEGACY_FEC active — reverting Opus encoder to Phase 0 \
behavior (inband FEC Mode1, no DRED)"
);
}
let mut enc = Self { let mut enc = Self {
inner: encoder, inner: encoder,
codec_id: profile.codec, codec_id: profile.codec,
frame_duration_ms: profile.frame_duration_ms, frame_duration_ms: profile.frame_duration_ms,
legacy_fec_mode,
}; };
// Common setup — bitrate, DTX, signal hint, complexity. These are
// identical regardless of the protection mode below.
enc.apply_bitrate(profile.codec)?; enc.apply_bitrate(profile.codec)?;
enc.set_inband_fec(true);
enc.set_dtx(true); enc.set_dtx(true);
// Voice signal type hint for better compression
enc.inner enc.inner
.set_signal(Signal::Voice) .set_signal(Signal::Voice)
.map_err(|e| CodecError::EncodeFailed(format!("set signal: {e:?}")))?; .map_err(|e| CodecError::EncodeFailed(format!("set signal: {e}")))?;
// Default complexity 7 — good quality/CPU trade-off for VoIP
enc.inner enc.inner
.set_complexity(7) .set_complexity(7)
.map_err(|e| CodecError::EncodeFailed(format!("set complexity: {e:?}")))?; .map_err(|e| CodecError::EncodeFailed(format!("set complexity: {e}")))?;
// Protection mode: DRED (Phase 1 default) or legacy inband FEC.
enc.apply_protection_mode(profile.codec)?;
Ok(enc) Ok(enc)
} }
/// Configure the protection mode for the active codec.
///
/// In DRED mode (default): disable inband FEC, set DRED duration for the
/// codec tier, clamp packet_loss to the 5% floor so DRED stays active.
///
/// In legacy mode: enable inband FEC Mode1 (Phase 0 behavior), leave
/// DRED and packet_loss at libopus defaults.
fn apply_protection_mode(&mut self, codec: CodecId) -> Result<(), CodecError> {
if self.legacy_fec_mode {
self.inner
.set_inband_fec(InbandFec::Mode1)
.map_err(|e| CodecError::EncodeFailed(format!("set inband FEC: {e:?}")))?;
// Leave DRED at 0 and packet_loss at default — matches Phase 0.
return Ok(());
}
// DRED path: disable the overlapping inband FEC, enable DRED with
// per-profile duration, floor packet_loss so DRED emits.
self.inner
.set_inband_fec(InbandFec::Off)
.map_err(|e| CodecError::EncodeFailed(format!("set inband FEC off: {e:?}")))?;
let dred_frames = dred_duration_for(codec);
self.inner
.set_dred_duration(dred_frames)
.map_err(|e| CodecError::EncodeFailed(format!("set DRED duration: {e:?}")))?;
self.inner
.set_packet_loss(DRED_LOSS_FLOOR_PCT)
.map_err(|e| CodecError::EncodeFailed(format!("set packet loss floor: {e:?}")))?;
// Both of these are gated behind the GUI debug toggle so logcat
// stays clean in normal mode. Flip "DRED verbose logs" in the
// settings panel to see the per-encoder config + libopus version.
if crate::dred_verbose_logs() {
info!(
codec = ?codec,
dred_frames,
dred_ms = dred_frames as u32 * 10,
loss_floor_pct = DRED_LOSS_FLOOR_PCT,
"opus encoder: DRED enabled"
);
// One-shot logging of the linked libopus version so we can
// confirm at a glance that opusic-c (libopus 1.5.2) is loaded.
// Pre-Phase-0 audiopus shipped libopus 1.3 which has no DRED;
// if this log says "libopus 1.3" something is very wrong.
LIBOPUS_VERSION_LOGGED.get_or_init(|| {
info!(libopus_version = %opusic_c::version(), "linked libopus version");
});
}
Ok(())
}
fn apply_bitrate(&mut self, codec: CodecId) -> Result<(), CodecError> { fn apply_bitrate(&mut self, codec: CodecId) -> Result<(), CodecError> {
let bps = codec.bitrate_bps(); let bps = codec.bitrate_bps() as i32;
self.inner self.inner
.set_bitrate(Bitrate::Value(bps)) .set_bitrate(Bitrate::BitsPerSecond(bps))
.map_err(|e| CodecError::EncodeFailed(format!("set bitrate: {e:?}")))?; .map_err(|e| CodecError::EncodeFailed(format!("set bitrate: {e}")))?;
debug!(bitrate_bps = bps, "opus encoder bitrate set"); debug!(bitrate_bps = bps, "opus encoder bitrate set");
Ok(()) Ok(())
} }
@@ -233,36 +71,10 @@ impl OpusEncoder {
/// Hint the encoder about expected packet loss percentage (0-100). /// Hint the encoder about expected packet loss percentage (0-100).
/// ///
/// In DRED mode, the value is floored at `DRED_LOSS_FLOOR_PCT` so the /// Higher values cause the encoder to use more redundancy to survive
/// encoder never drops DRED emission even on a perfect network. Real /// packet loss, at the expense of slightly higher bitrate.
/// loss measurements from the quality adapter override upward.
///
/// In legacy mode, the value is passed through unchanged (min 0, max 100).
pub fn set_expected_loss(&mut self, loss_pct: u8) { pub fn set_expected_loss(&mut self, loss_pct: u8) {
let clamped = if self.legacy_fec_mode { let _ = self.inner.set_packet_loss_perc(loss_pct.min(100));
loss_pct.min(100)
} else {
loss_pct.max(DRED_LOSS_FLOOR_PCT).min(100)
};
let _ = self.inner.set_packet_loss(clamped);
}
/// Set the DRED duration in 10 ms frame units (0 disables, max 104).
///
/// No-op in legacy mode. Normally driven automatically by the active
/// quality profile via `apply_protection_mode`; this setter exists for
/// tests and for the rare case where a caller needs to override the
/// per-profile default.
pub fn set_dred_duration(&mut self, frames: u8) {
if self.legacy_fec_mode {
return;
}
let _ = self.inner.set_dred_duration(frames.min(104));
}
/// Test/introspection accessor: whether legacy FEC mode is active.
pub fn is_legacy_fec_mode(&self) -> bool {
self.legacy_fec_mode
} }
} }
@@ -275,14 +87,10 @@ impl AudioEncoder for OpusEncoder {
pcm.len() pcm.len()
))); )));
} }
// opusic-c takes &[u16] for the sample input. Bit pattern is
// identical to i16 — the cast is zero-cost and the encoder
// interprets the bytes the same way as libopus internally.
let pcm_u16: &[u16] = bytemuck::cast_slice(pcm);
let n = self let n = self
.inner .inner
.encode_to_slice(pcm_u16, out) .encode(pcm, out)
.map_err(|e| CodecError::EncodeFailed(format!("opus encode: {e:?}")))?; .map_err(|e| CodecError::EncodeFailed(format!("opus encode: {e}")))?;
Ok(n) Ok(n)
} }
@@ -296,9 +104,6 @@ impl AudioEncoder for OpusEncoder {
self.codec_id = profile.codec; self.codec_id = profile.codec;
self.frame_duration_ms = profile.frame_duration_ms; self.frame_duration_ms = profile.frame_duration_ms;
self.apply_bitrate(profile.codec)?; self.apply_bitrate(profile.codec)?;
// Refresh DRED duration for the new tier. apply_protection_mode
// is idempotent and handles the legacy-vs-DRED branch correctly.
self.apply_protection_mode(profile.codec)?;
Ok(()) Ok(())
} }
other => Err(CodecError::UnsupportedTransition { other => Err(CodecError::UnsupportedTransition {
@@ -315,190 +120,10 @@ impl AudioEncoder for OpusEncoder {
} }
fn set_inband_fec(&mut self, enabled: bool) { fn set_inband_fec(&mut self, enabled: bool) {
// In DRED mode, ignore external requests to re-enable inband FEC — let _ = self.inner.set_inband_fec(enabled);
// running both mechanisms wastes bitrate on overlapping protection
// and opusic-c's own docs recommend disabling inband FEC when DRED
// is on. Trait callers that genuinely want classical FEC should set
// `AUDIO_USE_LEGACY_FEC=1` and re-create the encoder.
if !self.legacy_fec_mode {
debug!(
enabled,
"set_inband_fec ignored: DRED mode is active (set AUDIO_USE_LEGACY_FEC to revert)"
);
return;
}
let mode = if enabled { InbandFec::Mode1 } else { InbandFec::Off };
let _ = self.inner.set_inband_fec(mode);
} }
fn set_dtx(&mut self, enabled: bool) { fn set_dtx(&mut self, enabled: bool) {
let _ = self.inner.set_dtx(enabled); let _ = self.inner.set_dtx(enabled);
} }
} }
#[cfg(test)]
mod tests {
use super::*;
use wzp_proto::AudioDecoder;
/// Phase 0 acceptance gate: fail loudly if the linked libopus is not 1.5.x.
/// DRED (Phase 1+) only exists in libopus ≥ 1.5, so running against an
/// older version would silently regress the entire DRED integration.
#[test]
fn linked_libopus_is_1_5() {
let version = opusic_c::version();
assert!(
version.contains("1.5"),
"expected libopus 1.5.x, got: {version}"
);
}
#[test]
fn encoder_creates_at_good_profile() {
let enc = OpusEncoder::new(QualityProfile::GOOD).expect("opus encoder init");
assert_eq!(enc.codec_id, CodecId::Opus24k);
assert_eq!(enc.frame_samples(), 960); // 20 ms @ 48 kHz
}
#[test]
fn encoder_roundtrip_silence() {
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
let mut dec = crate::opus_dec::OpusDecoder::new(QualityProfile::GOOD).unwrap();
let pcm_in = vec![0i16; 960]; // 20 ms silence
let mut encoded = vec![0u8; 512];
let n = enc.encode(&pcm_in, &mut encoded).unwrap();
assert!(n > 0);
let mut pcm_out = vec![0i16; 960];
let samples = dec.decode(&encoded[..n], &mut pcm_out).unwrap();
assert_eq!(samples, 960);
}
// ─── Phase 1 — DRED duration policy ─────────────────────────────────────
#[test]
fn dred_duration_for_studio_tiers_is_100ms() {
assert_eq!(dred_duration_for(CodecId::Opus32k), 10);
assert_eq!(dred_duration_for(CodecId::Opus48k), 10);
assert_eq!(dred_duration_for(CodecId::Opus64k), 10);
}
#[test]
fn dred_duration_for_normal_tiers_is_200ms() {
assert_eq!(dred_duration_for(CodecId::Opus16k), 20);
assert_eq!(dred_duration_for(CodecId::Opus24k), 20);
}
#[test]
fn dred_duration_for_degraded_tier_is_500ms() {
assert_eq!(dred_duration_for(CodecId::Opus6k), 50);
}
#[test]
fn dred_duration_for_codec2_is_zero() {
assert_eq!(dred_duration_for(CodecId::Codec2_3200), 0);
assert_eq!(dred_duration_for(CodecId::Codec2_1200), 0);
assert_eq!(dred_duration_for(CodecId::ComfortNoise), 0);
}
// ─── Phase 1 — Legacy escape hatch ──────────────────────────────────────
/// By default (env var unset), legacy mode is off.
///
/// This test does NOT manipulate the environment to avoid flakiness
/// when the full suite runs in parallel. It only asserts on a freshly
/// created encoder in the ambient environment.
#[test]
fn default_mode_is_dred_not_legacy() {
// SAFETY: only run if the ambient env hasn't set the var externally.
if std::env::var(LEGACY_FEC_ENV).is_ok() {
return; // don't assert — someone set the env for a reason.
}
let enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
assert!(!enc.is_legacy_fec_mode());
}
// ─── Phase 1 — Behavioral regression: roundtrip still works ─────────────
#[test]
fn dred_mode_roundtrip_voice_pattern() {
// Use a realistic voice-like input (sine wave at speech frequencies)
// so the encoder emits meaningful DRED data rather than trivially
// compressible silence.
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
let mut dec = crate::opus_dec::OpusDecoder::new(QualityProfile::GOOD).unwrap();
let mut total_encoded_bytes = 0usize;
// Run 50 frames (1 second) so DRED fills up and starts emitting.
for frame_idx in 0..50 {
let pcm_in: Vec<i16> = (0..960)
.map(|i| {
let t = (frame_idx * 960 + i) as f64 / 48_000.0;
(8000.0 * (2.0 * std::f64::consts::PI * 300.0 * t).sin()) as i16
})
.collect();
let mut encoded = vec![0u8; 512];
let n = enc.encode(&pcm_in, &mut encoded).unwrap();
assert!(n > 0);
total_encoded_bytes += n;
let mut pcm_out = vec![0i16; 960];
let samples = dec.decode(&encoded[..n], &mut pcm_out).unwrap();
assert_eq!(samples, 960);
}
// Effective bitrate after 1 second of encoding.
// Opus 24k base + ~1 kbps DRED ≈ 25 kbps ≈ 3125 bytes/sec.
// Allow generous headroom (2000 lower bound, 8000 upper bound) —
// this is a behavioral regression check, not a tight bitrate assertion.
// The exact value is printed with --nocapture for diagnostic use.
eprintln!(
"[phase1 bitrate probe] legacy_fec_mode={} total_encoded={} bytes/sec",
enc.is_legacy_fec_mode(),
total_encoded_bytes
);
assert!(
total_encoded_bytes > 2000,
"encoder output too small: {total_encoded_bytes} bytes/sec (DRED likely not emitting)"
);
assert!(
total_encoded_bytes < 8000,
"encoder output too large: {total_encoded_bytes} bytes/sec"
);
}
// ─── Phase 1 — set_profile updates DRED duration on tier switch ─────────
#[test]
fn profile_switch_refreshes_dred_duration() {
// Start on GOOD (Opus 24k, DRED 20 frames), switch to DEGRADED
// (Opus 6k, DRED 50 frames). The encoder should accept both profile
// changes without error. We can't directly observe the DRED duration
// inside libopus, but apply_protection_mode returns Ok for both.
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
assert_eq!(enc.codec_id, CodecId::Opus24k);
enc.set_profile(QualityProfile::DEGRADED).unwrap();
assert_eq!(enc.codec_id, CodecId::Opus6k);
enc.set_profile(QualityProfile::STUDIO_64K).unwrap();
assert_eq!(enc.codec_id, CodecId::Opus64k);
}
// ─── Phase 1 — Trait set_inband_fec is a no-op in DRED mode ─────────────
#[test]
fn set_inband_fec_noop_in_dred_mode() {
if std::env::var(LEGACY_FEC_ENV).is_ok() {
return;
}
let mut enc = OpusEncoder::new(QualityProfile::GOOD).unwrap();
// Should not error, should not re-enable inband FEC internally.
enc.set_inband_fec(true);
// We can't directly query libopus's inband FEC state through opusic-c,
// but the call must not panic and the encoder must still work.
let pcm_in = vec![0i16; 960];
let mut encoded = vec![0u8; 512];
let n = enc.encode(&pcm_in, &mut encoded).unwrap();
assert!(n > 0);
}
}

View File

@@ -115,7 +115,6 @@ fn wzp_signal_serializes_into_fc_callsignal_payload() {
ephemeral_pub: [2u8; 32], ephemeral_pub: [2u8; 32],
signature: vec![3u8; 64], signature: vec![3u8; 64],
supported_profiles: vec![wzp_proto::QualityProfile::GOOD], supported_profiles: vec![wzp_proto::QualityProfile::GOOD],
alias: None,
}; };
// Encode as featherChat CallSignal payload // Encode as featherChat CallSignal payload
@@ -274,14 +273,13 @@ fn auth_invalid_response_matches() {
#[test] #[test]
fn all_signal_types_map_correctly() { fn all_signal_types_map_correctly() {
use wzp_client::featherchat::signal_to_call_type; use wzp_client::featherchat::{signal_to_call_type, CallSignalType};
let cases: Vec<(wzp_proto::SignalMessage, &str)> = vec![ let cases: Vec<(wzp_proto::SignalMessage, &str)> = vec![
( (
wzp_proto::SignalMessage::CallOffer { wzp_proto::SignalMessage::CallOffer {
identity_pub: [0; 32], ephemeral_pub: [0; 32], identity_pub: [0; 32], ephemeral_pub: [0; 32],
signature: vec![], supported_profiles: vec![], signature: vec![], supported_profiles: vec![],
alias: None,
}, },
"Offer", "Offer",
), ),

View File

@@ -1,29 +0,0 @@
[package]
name = "wzp-native"
version = "0.1.0"
edition = "2024"
description = "WarzonePhone native audio library — standalone Android cdylib that eventually owns all C++ (Oboe bridge) and exposes a pure-C FFI. Built with cargo-ndk, loaded at runtime by the Tauri desktop cdylib via libloading."
# Crate-type is DELIBERATELY only cdylib (no rlib, no staticlib). This crate
# is built with `cargo ndk -t arm64-v8a build --release -p wzp-native` as a
# standalone .so, which is the same path the legacy wzp-android crate uses
# successfully on the same phone / same NDK. Keeping the crate-type single
# avoids the rust-lang/rust#104707 symbol leak that bit us when Tauri's
# desktop crate had ["staticlib", "cdylib", "rlib"] and any C++ static
# archive pulled bionic's internal pthread_create into the final .so.
[lib]
name = "wzp_native"
crate-type = ["cdylib"]
[build-dependencies]
# cc is SAFE to use here because this crate is a single-cdylib: no
# staticlib in crate-type → no rust-lang/rust#104707 symbol leak. The
# legacy wzp-android crate uses the same setup and works.
cc = "1"
[dependencies]
# Phase 2: Oboe C++ audio bridge. Still no Rust deps — we do the whole
# audio pipeline via extern "C" into the bundled C++ and expose our own
# narrow extern "C" API for wzp-desktop to dlopen via libloading.
# Phase 3 can add wzp-proto/wzp-codec if we want to share codec logic
# instead of calling back into wzp-desktop via callbacks.

View File

@@ -1,119 +0,0 @@
//! wzp-native build.rs — Oboe C++ bridge compile on Android.
//!
//! Near-verbatim copy of crates/wzp-android/build.rs (which is known to
//! work). The crucial distinction: this crate is a single-cdylib (no
//! staticlib, no rlib in crate-type) so rust-lang/rust#104707 doesn't
//! apply — bionic's internal pthread_create / __init_tcb symbols stay
//! UND and resolve against libc.so at runtime, as they should.
//!
//! On non-Android hosts we compile `cpp/oboe_stub.cpp` (empty stubs) so
//! `cargo check --target <host>` still works for IDEs and CI.
use std::path::PathBuf;
fn main() {
let target = std::env::var("TARGET").unwrap_or_default();
if target.contains("android") {
// getauxval_fix: override compiler-rt's broken static getauxval
// stub that SIGSEGVs in shared libraries.
cc::Build::new()
.file("cpp/getauxval_fix.c")
.compile("wzp_native_getauxval_fix");
let oboe_dir = fetch_oboe();
match oboe_dir {
Some(oboe_path) => {
println!("cargo:warning=wzp-native: building with Oboe from {:?}", oboe_path);
let mut build = cc::Build::new();
build
.cpp(true)
.std("c++17")
// Shared libc++ — matches legacy wzp-android setup.
.cpp_link_stdlib(Some("c++_shared"))
.include("cpp")
.include(oboe_path.join("include"))
.include(oboe_path.join("src"))
.define("WZP_HAS_OBOE", None)
.file("cpp/oboe_bridge.cpp");
add_cpp_files_recursive(&mut build, &oboe_path.join("src"));
build.compile("wzp_native_oboe_bridge");
}
None => {
println!("cargo:warning=wzp-native: Oboe not found, building stub");
cc::Build::new()
.cpp(true)
.std("c++17")
.cpp_link_stdlib(Some("c++_shared"))
.file("cpp/oboe_stub.cpp")
.include("cpp")
.compile("wzp_native_oboe_bridge");
}
}
// Oboe needs log + OpenSLES backends at runtime.
println!("cargo:rustc-link-lib=log");
println!("cargo:rustc-link-lib=OpenSLES");
// Re-run if any cpp file changes
println!("cargo:rerun-if-changed=cpp/oboe_bridge.cpp");
println!("cargo:rerun-if-changed=cpp/oboe_bridge.h");
println!("cargo:rerun-if-changed=cpp/oboe_stub.cpp");
println!("cargo:rerun-if-changed=cpp/getauxval_fix.c");
} else {
// Non-Android hosts: compile the empty stub so lib.rs's extern
// declarations resolve when someone runs `cargo check` on macOS
// or Linux without an NDK.
cc::Build::new()
.cpp(true)
.std("c++17")
.file("cpp/oboe_stub.cpp")
.include("cpp")
.compile("wzp_native_oboe_bridge");
println!("cargo:rerun-if-changed=cpp/oboe_stub.cpp");
}
}
/// Recursively add all `.cpp` files from a directory to a cc::Build.
fn add_cpp_files_recursive(build: &mut cc::Build, dir: &std::path::Path) {
if !dir.is_dir() {
return;
}
for entry in std::fs::read_dir(dir).unwrap() {
let entry = entry.unwrap();
let path = entry.path();
if path.is_dir() {
add_cpp_files_recursive(build, &path);
} else if path.extension().map_or(false, |e| e == "cpp") {
build.file(&path);
}
}
}
/// Fetch or find Oboe headers + sources (v1.8.1). Same logic as the
/// legacy wzp-android crate's build.rs.
fn fetch_oboe() -> Option<PathBuf> {
let out_dir = PathBuf::from(std::env::var("OUT_DIR").unwrap());
let oboe_dir = out_dir.join("oboe");
if oboe_dir.join("include").join("oboe").join("Oboe.h").exists() {
return Some(oboe_dir);
}
let status = std::process::Command::new("git")
.args([
"clone",
"--depth=1",
"--branch=1.8.1",
"https://github.com/google/oboe.git",
oboe_dir.to_str().unwrap(),
])
.status();
match status {
Ok(s) if s.success() && oboe_dir.join("include").join("oboe").join("Oboe.h").exists() => {
Some(oboe_dir)
}
_ => None,
}
}

View File

@@ -1,21 +0,0 @@
// Override the broken static getauxval from compiler-rt/CRT.
// The static version reads from __libc_auxv which is NULL in shared libs
// loaded via dlopen, causing SIGSEGV in init_have_lse_atomics at load time.
// This version calls the real bionic getauxval via dlsym.
#ifdef __ANDROID__
#include <dlfcn.h>
#include <stdint.h>
typedef unsigned long (*getauxval_fn)(unsigned long);
unsigned long getauxval(unsigned long type) {
static getauxval_fn real_getauxval = (getauxval_fn)0;
if (!real_getauxval) {
real_getauxval = (getauxval_fn)dlsym((void*)-1L /* RTLD_DEFAULT */, "getauxval");
if (!real_getauxval) {
return 0;
}
}
return real_getauxval(type);
}
#endif

View File

@@ -1,420 +0,0 @@
// Full Oboe implementation for Android
// This file is compiled only when targeting Android
#include "oboe_bridge.h"
#ifdef __ANDROID__
#include <oboe/Oboe.h>
#include <android/log.h>
#include <cstring>
#include <atomic>
#define LOG_TAG "wzp-oboe"
#define LOGI(...) __android_log_print(ANDROID_LOG_INFO, LOG_TAG, __VA_ARGS__)
#define LOGW(...) __android_log_print(ANDROID_LOG_WARN, LOG_TAG, __VA_ARGS__)
#define LOGE(...) __android_log_print(ANDROID_LOG_ERROR, LOG_TAG, __VA_ARGS__)
// ---------------------------------------------------------------------------
// Ring buffer helpers (SPSC, lock-free)
// ---------------------------------------------------------------------------
static inline int32_t ring_available_read(const wzp_atomic_int* write_idx,
const wzp_atomic_int* read_idx,
int32_t capacity) {
int32_t w = std::atomic_load_explicit(write_idx, std::memory_order_acquire);
int32_t r = std::atomic_load_explicit(read_idx, std::memory_order_relaxed);
int32_t avail = w - r;
if (avail < 0) avail += capacity;
return avail;
}
static inline int32_t ring_available_write(const wzp_atomic_int* write_idx,
const wzp_atomic_int* read_idx,
int32_t capacity) {
return capacity - 1 - ring_available_read(write_idx, read_idx, capacity);
}
static inline void ring_write(int16_t* buf, int32_t capacity,
wzp_atomic_int* write_idx, const wzp_atomic_int* read_idx,
const int16_t* src, int32_t count) {
int32_t w = std::atomic_load_explicit(write_idx, std::memory_order_relaxed);
for (int32_t i = 0; i < count; i++) {
buf[w] = src[i];
w++;
if (w >= capacity) w = 0;
}
std::atomic_store_explicit(write_idx, w, std::memory_order_release);
}
static inline void ring_read(int16_t* buf, int32_t capacity,
const wzp_atomic_int* write_idx, wzp_atomic_int* read_idx,
int16_t* dst, int32_t count) {
int32_t r = std::atomic_load_explicit(read_idx, std::memory_order_relaxed);
for (int32_t i = 0; i < count; i++) {
dst[i] = buf[r];
r++;
if (r >= capacity) r = 0;
}
std::atomic_store_explicit(read_idx, r, std::memory_order_release);
}
// ---------------------------------------------------------------------------
// Global state
// ---------------------------------------------------------------------------
static std::shared_ptr<oboe::AudioStream> g_capture_stream;
static std::shared_ptr<oboe::AudioStream> g_playout_stream;
// Value copy — the WzpOboeRings the Rust side passes us lives on the caller's
// stack frame and goes away as soon as wzp_oboe_start returns. The raw
// int16/atomic pointers INSIDE the struct point into the Rust-owned, leaked-
// for-the-lifetime-of-the-process AudioBackend singleton, so copying the
// struct by value is safe and keeps the inner pointers valid indefinitely.
// g_rings_valid guards the audio-callback-side read; clearing it in stop()
// signals "no backend" to the callbacks which then return silence + Stop.
static WzpOboeRings g_rings{};
static std::atomic<bool> g_rings_valid{false};
static std::atomic<bool> g_running{false};
static std::atomic<float> g_capture_latency_ms{0.0f};
static std::atomic<float> g_playout_latency_ms{0.0f};
// ---------------------------------------------------------------------------
// Capture callback
// ---------------------------------------------------------------------------
class CaptureCallback : public oboe::AudioStreamDataCallback {
public:
uint64_t calls = 0;
uint64_t total_frames = 0;
uint64_t total_written = 0;
uint64_t ring_full_drops = 0;
oboe::DataCallbackResult onAudioReady(
oboe::AudioStream* stream,
void* audioData,
int32_t numFrames) override {
if (!g_running.load(std::memory_order_relaxed) ||
!g_rings_valid.load(std::memory_order_acquire)) {
return oboe::DataCallbackResult::Stop;
}
const int16_t* src = static_cast<const int16_t*>(audioData);
int32_t avail = ring_available_write(g_rings.capture_write_idx,
g_rings.capture_read_idx,
g_rings.capture_capacity);
int32_t to_write = (numFrames < avail) ? numFrames : avail;
if (to_write > 0) {
ring_write(g_rings.capture_buf, g_rings.capture_capacity,
g_rings.capture_write_idx, g_rings.capture_read_idx,
src, to_write);
}
total_frames += numFrames;
total_written += to_write;
if (to_write < numFrames) {
ring_full_drops += (numFrames - to_write);
}
// Sample-range probe on the FIRST callback to prove we get real audio
if (calls == 0 && numFrames > 0) {
int16_t lo = src[0], hi = src[0];
int32_t sumsq = 0;
for (int32_t i = 0; i < numFrames; i++) {
if (src[i] < lo) lo = src[i];
if (src[i] > hi) hi = src[i];
sumsq += (int32_t)src[i] * (int32_t)src[i];
}
int32_t rms = (int32_t) (numFrames > 0 ? (int32_t)__builtin_sqrt((double)sumsq / (double)numFrames) : 0);
LOGI("capture cb#0: numFrames=%d sample_range=[%d..%d] rms=%d to_write=%d",
numFrames, lo, hi, rms, to_write);
}
// Heartbeat every 50 callbacks (~1s at 20ms/burst)
calls++;
if ((calls % 50) == 0) {
LOGI("capture heartbeat: calls=%llu numFrames=%d ring_avail_write=%d to_write=%d full_drops=%llu total_written=%llu",
(unsigned long long)calls, numFrames, avail, to_write,
(unsigned long long)ring_full_drops, (unsigned long long)total_written);
}
// Update latency estimate
auto result = stream->calculateLatencyMillis();
if (result) {
g_capture_latency_ms.store(static_cast<float>(result.value()),
std::memory_order_relaxed);
}
return oboe::DataCallbackResult::Continue;
}
};
// ---------------------------------------------------------------------------
// Playout callback
// ---------------------------------------------------------------------------
class PlayoutCallback : public oboe::AudioStreamDataCallback {
public:
uint64_t calls = 0;
uint64_t total_frames = 0;
uint64_t total_played_real = 0;
uint64_t underrun_frames = 0;
uint64_t nonempty_calls = 0;
oboe::DataCallbackResult onAudioReady(
oboe::AudioStream* stream,
void* audioData,
int32_t numFrames) override {
if (!g_running.load(std::memory_order_relaxed) ||
!g_rings_valid.load(std::memory_order_acquire)) {
memset(audioData, 0, numFrames * sizeof(int16_t));
return oboe::DataCallbackResult::Stop;
}
int16_t* dst = static_cast<int16_t*>(audioData);
int32_t avail = ring_available_read(g_rings.playout_write_idx,
g_rings.playout_read_idx,
g_rings.playout_capacity);
int32_t to_read = (numFrames < avail) ? numFrames : avail;
if (to_read > 0) {
ring_read(g_rings.playout_buf, g_rings.playout_capacity,
g_rings.playout_write_idx, g_rings.playout_read_idx,
dst, to_read);
nonempty_calls++;
}
// Fill remainder with silence on underrun
if (to_read < numFrames) {
memset(dst + to_read, 0, (numFrames - to_read) * sizeof(int16_t));
underrun_frames += (numFrames - to_read);
}
total_frames += numFrames;
total_played_real += to_read;
// First callback: log requested config + prove we're being called
if (calls == 0) {
LOGI("playout cb#0: numFrames=%d ring_avail_read=%d to_read=%d",
numFrames, avail, to_read);
}
// On the first callback that actually has data, log the sample range
// so we can tell if the samples coming out of the ring look like real
// audio vs constant-zeroes vs garbage.
if (to_read > 0 && nonempty_calls == 1) {
int16_t lo = dst[0], hi = dst[0];
int32_t sumsq = 0;
for (int32_t i = 0; i < to_read; i++) {
if (dst[i] < lo) lo = dst[i];
if (dst[i] > hi) hi = dst[i];
sumsq += (int32_t)dst[i] * (int32_t)dst[i];
}
int32_t rms = (to_read > 0) ? (int32_t)__builtin_sqrt((double)sumsq / (double)to_read) : 0;
LOGI("playout FIRST nonempty read: to_read=%d sample_range=[%d..%d] rms=%d",
to_read, lo, hi, rms);
}
// Heartbeat every 50 callbacks (~1s at 20ms/burst)
calls++;
if ((calls % 50) == 0) {
int state = (int)stream->getState();
auto xrunRes = stream->getXRunCount();
int xruns = xrunRes ? xrunRes.value() : -1;
LOGI("playout heartbeat: calls=%llu nonempty=%llu numFrames=%d ring_avail_read=%d to_read=%d underrun_frames=%llu total_played_real=%llu state=%d xruns=%d",
(unsigned long long)calls, (unsigned long long)nonempty_calls,
numFrames, avail, to_read,
(unsigned long long)underrun_frames, (unsigned long long)total_played_real,
state, xruns);
}
// Update latency estimate
auto result = stream->calculateLatencyMillis();
if (result) {
g_playout_latency_ms.store(static_cast<float>(result.value()),
std::memory_order_relaxed);
}
return oboe::DataCallbackResult::Continue;
}
};
static CaptureCallback g_capture_cb;
static PlayoutCallback g_playout_cb;
// ---------------------------------------------------------------------------
// Public C API
// ---------------------------------------------------------------------------
int wzp_oboe_start(const WzpOboeConfig* config, const WzpOboeRings* rings) {
if (g_running.load(std::memory_order_relaxed)) {
LOGW("wzp_oboe_start: already running");
return -1;
}
// Deep-copy the rings struct into static storage BEFORE we publish it to
// the audio callbacks — `rings` points at the caller's stack frame and
// goes away as soon as this function returns.
g_rings = *rings;
g_rings_valid.store(true, std::memory_order_release);
// Build capture stream
oboe::AudioStreamBuilder captureBuilder;
captureBuilder.setDirection(oboe::Direction::Input)
->setPerformanceMode(oboe::PerformanceMode::LowLatency)
->setSharingMode(oboe::SharingMode::Exclusive)
->setFormat(oboe::AudioFormat::I16)
->setChannelCount(config->channel_count)
->setSampleRate(config->sample_rate)
->setFramesPerDataCallback(config->frames_per_burst)
->setInputPreset(oboe::InputPreset::VoiceCommunication)
->setDataCallback(&g_capture_cb);
oboe::Result result = captureBuilder.openStream(g_capture_stream);
if (result != oboe::Result::OK) {
LOGE("Failed to open capture stream: %s", oboe::convertToText(result));
return -2;
}
LOGI("capture stream opened: actualSR=%d actualCh=%d actualFormat=%d actualFramesPerBurst=%d actualFramesPerDataCallback=%d bufferCapacityInFrames=%d sharing=%d perfMode=%d",
g_capture_stream->getSampleRate(),
g_capture_stream->getChannelCount(),
(int)g_capture_stream->getFormat(),
g_capture_stream->getFramesPerBurst(),
g_capture_stream->getFramesPerDataCallback(),
g_capture_stream->getBufferCapacityInFrames(),
(int)g_capture_stream->getSharingMode(),
(int)g_capture_stream->getPerformanceMode());
// Build playout stream.
//
// Regression triangulation between builds:
// 96be740 (Usage::Media, default API): playout callback DID drain
// the ring at steady 50Hz (playout heartbeat: calls=1100,
// total_played_real=1055040). Audio not audible because OS routing
// sent it to a silent output.
//
// 8c36fb5 (Usage::VoiceCommunication + setAudioApi(AAudio) +
// ContentType::Speech): playout callback fired cb#0 once then
// stopped draining the ring entirely. written_samples stuck at
// ring capacity (7679) across all subsequent heartbeats, so Oboe
// accepted zero samples after startup. Still inaudible.
//
// Hypothesis: forcing setAudioApi(AAudio) + VoiceCommunication on
// Pixel 6 / Android 15 opens a stream that succeeds at cb#0 but
// then detaches from the real audio driver. Reverting to the
// config that at least drove callbacks correctly, plus the
// Kotlin-side MODE_IN_COMMUNICATION + setSpeakerphoneOn(true)
// handled in MainActivity.kt to route audio to the loud speaker.
// Usage::VoiceCommunication is the correct Oboe usage for a VoIP app
// — it respects Android's in-call audio routing and lets
// AudioManager.setSpeakerphoneOn/setBluetoothScoOn actually switch
// between earpiece, loudspeaker, and Bluetooth headset. Combined with
// MODE_IN_COMMUNICATION set from MainActivity.kt and
// speakerphoneOn=false by default, this produces handset/earpiece as
// the default output.
//
// IMPORTANT: do NOT add setAudioApi(AAudio) here. Build 8c36fb5 proved
// forcing AAudio with Usage::VoiceCommunication makes the playout
// callback stop draining the ring after cb#0, even though the stream
// opens successfully. Letting Oboe pick the API (which will be AAudio
// on API ≥ 27 but via a different codepath) kept callbacks firing in
// every other build.
oboe::AudioStreamBuilder playoutBuilder;
playoutBuilder.setDirection(oboe::Direction::Output)
->setPerformanceMode(oboe::PerformanceMode::LowLatency)
->setSharingMode(oboe::SharingMode::Exclusive)
->setFormat(oboe::AudioFormat::I16)
->setChannelCount(config->channel_count)
->setSampleRate(config->sample_rate)
->setFramesPerDataCallback(config->frames_per_burst)
->setUsage(oboe::Usage::VoiceCommunication)
->setDataCallback(&g_playout_cb);
result = playoutBuilder.openStream(g_playout_stream);
if (result != oboe::Result::OK) {
LOGE("Failed to open playout stream: %s", oboe::convertToText(result));
g_capture_stream->close();
g_capture_stream.reset();
return -3;
}
LOGI("playout stream opened: actualSR=%d actualCh=%d actualFormat=%d actualFramesPerBurst=%d actualFramesPerDataCallback=%d bufferCapacityInFrames=%d sharing=%d perfMode=%d",
g_playout_stream->getSampleRate(),
g_playout_stream->getChannelCount(),
(int)g_playout_stream->getFormat(),
g_playout_stream->getFramesPerBurst(),
g_playout_stream->getFramesPerDataCallback(),
g_playout_stream->getBufferCapacityInFrames(),
(int)g_playout_stream->getSharingMode(),
(int)g_playout_stream->getPerformanceMode());
g_running.store(true, std::memory_order_release);
// Start both streams
result = g_capture_stream->requestStart();
if (result != oboe::Result::OK) {
LOGE("Failed to start capture: %s", oboe::convertToText(result));
g_running.store(false, std::memory_order_release);
g_capture_stream->close();
g_playout_stream->close();
g_capture_stream.reset();
g_playout_stream.reset();
return -4;
}
result = g_playout_stream->requestStart();
if (result != oboe::Result::OK) {
LOGE("Failed to start playout: %s", oboe::convertToText(result));
g_running.store(false, std::memory_order_release);
g_capture_stream->requestStop();
g_capture_stream->close();
g_playout_stream->close();
g_capture_stream.reset();
g_playout_stream.reset();
return -5;
}
LOGI("Oboe started: sr=%d burst=%d ch=%d",
config->sample_rate, config->frames_per_burst, config->channel_count);
return 0;
}
void wzp_oboe_stop(void) {
g_running.store(false, std::memory_order_release);
// Tell the audio callbacks to stop touching g_rings BEFORE we tear down
// the streams, so any in-flight callback returns Stop instead of reading
// stale pointers.
g_rings_valid.store(false, std::memory_order_release);
if (g_capture_stream) {
g_capture_stream->requestStop();
g_capture_stream->close();
g_capture_stream.reset();
}
if (g_playout_stream) {
g_playout_stream->requestStop();
g_playout_stream->close();
g_playout_stream.reset();
}
LOGI("Oboe stopped");
}
float wzp_oboe_capture_latency_ms(void) {
return g_capture_latency_ms.load(std::memory_order_relaxed);
}
float wzp_oboe_playout_latency_ms(void) {
return g_playout_latency_ms.load(std::memory_order_relaxed);
}
int wzp_oboe_is_running(void) {
return g_running.load(std::memory_order_relaxed) ? 1 : 0;
}
#else
// Non-Android fallback — should not be reached; oboe_stub.cpp is used instead.
// Provide empty implementations just in case.
int wzp_oboe_start(const WzpOboeConfig* config, const WzpOboeRings* rings) {
(void)config; (void)rings;
return -99;
}
void wzp_oboe_stop(void) {}
float wzp_oboe_capture_latency_ms(void) { return 0.0f; }
float wzp_oboe_playout_latency_ms(void) { return 0.0f; }
int wzp_oboe_is_running(void) { return 0; }
#endif // __ANDROID__

View File

@@ -1,43 +0,0 @@
#ifndef WZP_OBOE_BRIDGE_H
#define WZP_OBOE_BRIDGE_H
#include <stdint.h>
#ifdef __cplusplus
#include <atomic>
typedef std::atomic<int32_t> wzp_atomic_int;
extern "C" {
#else
#include <stdatomic.h>
typedef atomic_int wzp_atomic_int;
#endif
typedef struct {
int32_t sample_rate;
int32_t frames_per_burst;
int32_t channel_count;
} WzpOboeConfig;
typedef struct {
int16_t* capture_buf;
int32_t capture_capacity;
wzp_atomic_int* capture_write_idx;
wzp_atomic_int* capture_read_idx;
int16_t* playout_buf;
int32_t playout_capacity;
wzp_atomic_int* playout_write_idx;
wzp_atomic_int* playout_read_idx;
} WzpOboeRings;
int wzp_oboe_start(const WzpOboeConfig* config, const WzpOboeRings* rings);
void wzp_oboe_stop(void);
float wzp_oboe_capture_latency_ms(void);
float wzp_oboe_playout_latency_ms(void);
int wzp_oboe_is_running(void);
#ifdef __cplusplus
}
#endif
#endif // WZP_OBOE_BRIDGE_H

View File

@@ -1,27 +0,0 @@
// Stub implementation for non-Android host builds (testing, cargo check, etc.)
#include "oboe_bridge.h"
#include <stdio.h>
int wzp_oboe_start(const WzpOboeConfig* config, const WzpOboeRings* rings) {
(void)config;
(void)rings;
fprintf(stderr, "wzp_oboe_start: stub (not on Android)\n");
return 0;
}
void wzp_oboe_stop(void) {
fprintf(stderr, "wzp_oboe_stop: stub (not on Android)\n");
}
float wzp_oboe_capture_latency_ms(void) {
return 0.0f;
}
float wzp_oboe_playout_latency_ms(void) {
return 0.0f;
}
int wzp_oboe_is_running(void) {
return 0;
}

View File

@@ -1,331 +0,0 @@
//! wzp-native — standalone Android cdylib for all the C++ audio code.
//!
//! Built with `cargo ndk`, NOT `cargo tauri android build`. Loaded at
//! runtime by the Tauri desktop cdylib (`wzp-desktop`) via libloading.
//! See `docs/incident-tauri-android-init-tcb.md` for why the split exists.
//!
//! Phase 2: real Oboe audio backend.
//!
//! Architecture: Oboe runs capture + playout streams on its own high-
//! priority AAudio callback threads inside the C++ bridge. Two SPSC ring
//! buffers (capture and playout) are shared between the C++ callbacks
//! and the Rust side via atomic indices — no locks on the hot path.
//! `wzp-desktop` drains the capture ring into its Opus encoder and fills
//! the playout ring with decoded PCM.
use std::sync::atomic::{AtomicI32, Ordering};
// ─── Phase 1 smoke-test exports (kept for sanity checks) ─────────────────
/// Returns 42. Used by wzp-desktop's setup() to verify dlopen + dlsym
/// work before any audio code runs.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_version() -> i32 {
42
}
/// Writes a NUL-terminated string into `out` (capped at `cap`) and
/// returns bytes written excluding the NUL.
#[unsafe(no_mangle)]
pub unsafe extern "C" fn wzp_native_hello(out: *mut u8, cap: usize) -> usize {
const MSG: &[u8] = b"hello from wzp-native\0";
if out.is_null() || cap == 0 {
return 0;
}
let n = MSG.len().min(cap);
unsafe {
core::ptr::copy_nonoverlapping(MSG.as_ptr(), out, n);
*out.add(n - 1) = 0;
}
n - 1
}
// ─── C++ Oboe bridge FFI ─────────────────────────────────────────────────
#[repr(C)]
struct WzpOboeConfig {
sample_rate: i32,
frames_per_burst: i32,
channel_count: i32,
}
#[repr(C)]
struct WzpOboeRings {
capture_buf: *mut i16,
capture_capacity: i32,
capture_write_idx: *mut AtomicI32,
capture_read_idx: *mut AtomicI32,
playout_buf: *mut i16,
playout_capacity: i32,
playout_write_idx: *mut AtomicI32,
playout_read_idx: *mut AtomicI32,
}
// SAFETY: atomics synchronise producer/consumer; raw pointers are owned
// by the AudioBackend singleton below whose lifetime covers all calls.
unsafe impl Send for WzpOboeRings {}
unsafe impl Sync for WzpOboeRings {}
unsafe extern "C" {
fn wzp_oboe_start(config: *const WzpOboeConfig, rings: *const WzpOboeRings) -> i32;
fn wzp_oboe_stop();
fn wzp_oboe_capture_latency_ms() -> f32;
fn wzp_oboe_playout_latency_ms() -> f32;
fn wzp_oboe_is_running() -> i32;
}
// ─── SPSC ring buffer (shared with C++ via AtomicI32) ────────────────────
/// 20 ms @ 48 kHz mono = 960 samples.
const FRAME_SAMPLES: usize = 960;
/// ~160 ms headroom at 48 kHz.
const RING_CAPACITY: usize = 7680;
struct RingBuffer {
buf: Vec<i16>,
capacity: usize,
write_idx: AtomicI32,
read_idx: AtomicI32,
}
// SAFETY: SPSC with atomic read/write cursors; producer and consumer
// are always on different threads.
unsafe impl Send for RingBuffer {}
unsafe impl Sync for RingBuffer {}
impl RingBuffer {
fn new(capacity: usize) -> Self {
Self {
buf: vec![0i16; capacity],
capacity,
write_idx: AtomicI32::new(0),
read_idx: AtomicI32::new(0),
}
}
fn available_read(&self) -> usize {
let w = self.write_idx.load(Ordering::Acquire);
let r = self.read_idx.load(Ordering::Relaxed);
let avail = w - r;
if avail < 0 { (avail + self.capacity as i32) as usize } else { avail as usize }
}
fn available_write(&self) -> usize {
self.capacity - 1 - self.available_read()
}
fn write(&self, data: &[i16]) -> usize {
let count = data.len().min(self.available_write());
if count == 0 {
return 0;
}
let mut w = self.write_idx.load(Ordering::Relaxed) as usize;
let cap = self.capacity;
let buf_ptr = self.buf.as_ptr() as *mut i16;
for sample in &data[..count] {
unsafe { *buf_ptr.add(w) = *sample; }
w += 1;
if w >= cap { w = 0; }
}
self.write_idx.store(w as i32, Ordering::Release);
count
}
fn read(&self, out: &mut [i16]) -> usize {
let count = out.len().min(self.available_read());
if count == 0 {
return 0;
}
let mut r = self.read_idx.load(Ordering::Relaxed) as usize;
let cap = self.capacity;
let buf_ptr = self.buf.as_ptr();
for slot in &mut out[..count] {
unsafe { *slot = *buf_ptr.add(r); }
r += 1;
if r >= cap { r = 0; }
}
self.read_idx.store(r as i32, Ordering::Release);
count
}
fn buf_ptr(&self) -> *mut i16 {
self.buf.as_ptr() as *mut i16
}
fn write_idx_ptr(&self) -> *mut AtomicI32 {
&self.write_idx as *const AtomicI32 as *mut AtomicI32
}
fn read_idx_ptr(&self) -> *mut AtomicI32 {
&self.read_idx as *const AtomicI32 as *mut AtomicI32
}
}
// ─── AudioBackend singleton ──────────────────────────────────────────────
//
// There is one global AudioBackend instance because Oboe's C++ side
// holds its own singleton of the streams. The `Box::leak`'d statics own
// the ring buffers for the lifetime of the process — dropping them while
// Oboe is still running would cause use-after-free in the audio callback.
use std::sync::OnceLock;
struct AudioBackend {
capture: RingBuffer,
playout: RingBuffer,
started: std::sync::Mutex<bool>,
/// Per-write logging throttle counter for wzp_native_audio_write_playout.
playout_write_log_count: std::sync::atomic::AtomicU64,
}
static BACKEND: OnceLock<&'static AudioBackend> = OnceLock::new();
fn backend() -> &'static AudioBackend {
BACKEND.get_or_init(|| {
Box::leak(Box::new(AudioBackend {
capture: RingBuffer::new(RING_CAPACITY),
playout: RingBuffer::new(RING_CAPACITY),
started: std::sync::Mutex::new(false),
playout_write_log_count: std::sync::atomic::AtomicU64::new(0),
}))
})
}
// ─── C FFI for wzp-desktop ───────────────────────────────────────────────
/// Start the Oboe audio streams. Returns 0 on success, non-zero on error.
/// Idempotent — calling while already running is a no-op that returns 0.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_audio_start() -> i32 {
let b = backend();
let mut started = match b.started.lock() {
Ok(g) => g,
Err(_) => return -1,
};
if *started {
return 0;
}
let config = WzpOboeConfig {
sample_rate: 48_000,
frames_per_burst: FRAME_SAMPLES as i32,
channel_count: 1,
};
let rings = WzpOboeRings {
capture_buf: b.capture.buf_ptr(),
capture_capacity: b.capture.capacity as i32,
capture_write_idx: b.capture.write_idx_ptr(),
capture_read_idx: b.capture.read_idx_ptr(),
playout_buf: b.playout.buf_ptr(),
playout_capacity: b.playout.capacity as i32,
playout_write_idx: b.playout.write_idx_ptr(),
playout_read_idx: b.playout.read_idx_ptr(),
};
let ret = unsafe { wzp_oboe_start(&config, &rings) };
if ret != 0 {
return ret;
}
*started = true;
0
}
/// Stop Oboe. Idempotent. Safe to call from any thread.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_audio_stop() {
let b = backend();
if let Ok(mut started) = b.started.lock() {
if *started {
unsafe { wzp_oboe_stop() };
*started = false;
}
}
}
/// Read captured PCM samples from the capture ring. Returns the number
/// of `i16` samples actually copied into `out` (may be less than
/// `out_len` if the ring is empty).
#[unsafe(no_mangle)]
pub unsafe extern "C" fn wzp_native_audio_read_capture(out: *mut i16, out_len: usize) -> usize {
if out.is_null() || out_len == 0 {
return 0;
}
let slice = unsafe { std::slice::from_raw_parts_mut(out, out_len) };
backend().capture.read(slice)
}
/// Write PCM samples into the playout ring. Returns the number of
/// samples actually enqueued (may be less than `in_len` if the ring
/// is nearly full — in practice the caller should pace to 20 ms
/// frames and spin briefly if the ring is full).
#[unsafe(no_mangle)]
pub unsafe extern "C" fn wzp_native_audio_write_playout(input: *const i16, in_len: usize) -> usize {
if input.is_null() || in_len == 0 {
return 0;
}
let slice = unsafe { std::slice::from_raw_parts(input, in_len) };
let b = backend();
let before_w = b.playout.write_idx.load(std::sync::atomic::Ordering::Relaxed);
let before_r = b.playout.read_idx.load(std::sync::atomic::Ordering::Relaxed);
let written = b.playout.write(slice);
// First few writes: log ring state + sample range so we can compare what
// engine.rs hands us to what the C++ playout callback reads.
let first_writes = b.playout_write_log_count.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
if first_writes < 3 || first_writes % 50 == 0 {
let (mut lo, mut hi, mut sumsq) = (i16::MAX, i16::MIN, 0i64);
for &s in slice.iter() {
if s < lo { lo = s; }
if s > hi { hi = s; }
sumsq += (s as i64) * (s as i64);
}
let rms = (sumsq as f64 / slice.len() as f64).sqrt() as i32;
let avail_w_after = b.playout.available_write();
let avail_r_after = b.playout.available_read();
let msg = format!(
"playout WRITE #{first_writes}: in_len={} written={} range=[{lo}..{hi}] rms={rms} before_w={before_w} before_r={before_r} avail_read_after={avail_r_after} avail_write_after={avail_w_after}",
slice.len(), written
);
unsafe {
android_log(msg.as_str());
}
}
written
}
// Minimal android logcat shim so we can print from the cdylib without pulling
// in android_logger crate (which would add another dep that has to build with
// cargo-ndk). Uses libc's __android_log_print via extern linkage.
#[cfg(target_os = "android")]
unsafe extern "C" {
fn __android_log_write(prio: i32, tag: *const u8, text: *const u8) -> i32;
}
#[cfg(target_os = "android")]
unsafe fn android_log(msg: &str) {
// ANDROID_LOG_INFO = 4. Tag and text must be NUL-terminated.
let tag = b"wzp-native\0";
let mut buf = Vec::with_capacity(msg.len() + 1);
buf.extend_from_slice(msg.as_bytes());
buf.push(0);
unsafe { __android_log_write(4, tag.as_ptr(), buf.as_ptr()); }
}
#[cfg(not(target_os = "android"))]
#[allow(dead_code)]
unsafe fn android_log(_msg: &str) {}
/// Current capture latency reported by Oboe, in milliseconds. Returns
/// NaN / 0.0 if the stream isn't running.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_audio_capture_latency_ms() -> f32 {
unsafe { wzp_oboe_capture_latency_ms() }
}
/// Current playout latency reported by Oboe, in milliseconds.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_audio_playout_latency_ms() -> f32 {
unsafe { wzp_oboe_playout_latency_ms() }
}
/// Non-zero if both Oboe streams are currently running.
#[unsafe(no_mangle)]
pub extern "C" fn wzp_native_audio_is_running() -> i32 {
unsafe { wzp_oboe_is_running() }
}

View File

@@ -53,15 +53,6 @@ pub enum TransportError {
Timeout { ms: u64 }, Timeout { ms: u64 },
#[error("io error: {0}")] #[error("io error: {0}")]
Io(#[from] std::io::Error), Io(#[from] std::io::Error),
/// Parsed wire bytes successfully but the payload didn't
/// deserialize into a known `SignalMessage` variant. Usually
/// means the peer is running a newer build with a variant we
/// don't know yet. Callers should **log and continue** rather
/// than tearing down the connection, so that forward-compat
/// additions to `SignalMessage` don't silently kill old
/// clients/relays.
#[error("signal deserialize: {0}")]
Deserialize(String),
#[error("internal transport error: {0}")] #[error("internal transport error: {0}")]
Internal(String), Internal(String),
} }

View File

@@ -584,26 +584,6 @@ pub enum SignalMessage {
recommended_profile: crate::QualityProfile, recommended_profile: crate::QualityProfile,
}, },
/// Phase 4 telemetry: loss-recovery counts for the current session.
/// Sent periodically from receivers to the relay so Prometheus metrics
/// can distinguish DRED reconstructions from classical PLC invocations.
/// Fields default to 0 on old receivers (`#[serde(default)]`), so
/// introducing this variant is backward-compatible with pre-Phase-4
/// relays — they'll just log "unknown signal variant" on receipt.
LossRecoveryUpdate {
/// Total frames reconstructed via DRED since call start (monotonic).
#[serde(default)]
dred_reconstructions: u64,
/// Total frames filled via classical Opus/Codec2 PLC since call
/// start (monotonic).
#[serde(default)]
classical_plc_invocations: u64,
/// Total frames decoded since call start. Used by the relay to
/// compute recovery rates as a fraction of total frames.
#[serde(default)]
frames_decoded: u64,
},
/// Connection keepalive / RTT measurement. /// Connection keepalive / RTT measurement.
Ping { timestamp_ms: u64 }, Ping { timestamp_ms: u64 },
Pong { timestamp_ms: u64 }, Pong { timestamp_ms: u64 },
@@ -736,15 +716,6 @@ pub enum SignalMessage {
signature: Vec<u8>, signature: Vec<u8>,
/// Supported quality profiles. /// Supported quality profiles.
supported_profiles: Vec<crate::QualityProfile>, supported_profiles: Vec<crate::QualityProfile>,
/// Phase 3 (hole-punching): caller's own server-reflexive
/// address as learned via `SignalMessage::Reflect`. The
/// relay stashes this in its call registry and later
/// injects it into the callee's `CallSetup.peer_direct_addr`
/// so the callee can try a direct QUIC handshake to the
/// caller instead of routing media through the relay.
/// `None` means "caller doesn't want P2P, use relay only".
#[serde(default, skip_serializing_if = "Option::is_none")]
caller_reflexive_addr: Option<String>,
}, },
/// Callee's response to a direct call. /// Callee's response to a direct call.
@@ -764,13 +735,6 @@ pub enum SignalMessage {
/// Chosen quality profile (present when accepting). /// Chosen quality profile (present when accepting).
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
chosen_profile: Option<crate::QualityProfile>, chosen_profile: Option<crate::QualityProfile>,
/// Phase 3 (hole-punching): callee's own server-reflexive
/// address, only populated on `AcceptTrusted` — privacy-mode
/// answers leave this `None` so the callee's real IP stays
/// hidden (the whole point of `AcceptGeneric`). The relay
/// carries it opaquely into the caller's `CallSetup`.
#[serde(default, skip_serializing_if = "Option::is_none")]
callee_reflexive_addr: Option<String>,
}, },
/// Relay tells both parties: media room is ready. /// Relay tells both parties: media room is ready.
@@ -780,78 +744,12 @@ pub enum SignalMessage {
room: String, room: String,
/// Relay address for the QUIC media connection. /// Relay address for the QUIC media connection.
relay_addr: String, relay_addr: String,
/// Phase 3 (hole-punching): the OTHER party's server-reflexive
/// address as the relay learned it from the offer/answer
/// exchange. When populated, clients attempt a direct QUIC
/// handshake to this address in parallel with the existing
/// relay path and use whichever connects first. `None`
/// means the relay path is the only option — either because
/// a peer didn't advertise its addr (Phase 1/2 relay or
/// privacy-mode answer) or because the relay decided P2P
/// wasn't viable.
#[serde(default, skip_serializing_if = "Option::is_none")]
peer_direct_addr: Option<String>,
}, },
/// Ringing notification (relay → caller, callee received the offer). /// Ringing notification (relay → caller, callee received the offer).
CallRinging { CallRinging {
call_id: String, call_id: String,
}, },
// ── NAT reflection ("STUN for QUIC") ──────────────────────────────
/// Client → relay: "please tell me the source IP:port you see on
/// this connection". A QUIC-native replacement for classic STUN
/// that reuses the TLS-authenticated signal channel to the relay
/// instead of running a separate UDP reflection service on port
/// 3478. The relay answers with `ReflectResponse`.
///
/// No payload — the relay already knows which connection the
/// request arrived on, and `connection.remote_address()` gives it
/// the exact source address (post-NAT) as observed from the
/// server side of the TLS session.
Reflect,
/// Relay → client: response to `Reflect`. Carries the socket
/// address the relay observes as the client's source for this
/// QUIC connection in `SocketAddr::to_string()` form — "a.b.c.d:p"
/// for IPv4, "[::1]:p" for IPv6. Clients parse it with
/// `SocketAddr::from_str`.
ReflectResponse {
observed_addr: String,
},
// ── Phase 4: cross-relay direct-call signaling ────────────────────
/// Phase 4: relay-to-relay envelope for forwarding direct-call
/// signaling across a federation link. When Alice on Relay A
/// sends a `DirectCallOffer` for Bob whose fingerprint isn't
/// in A's local SignalHub, Relay A wraps the offer in this
/// envelope and broadcasts it over every active federation
/// peer link. Whichever peer has Bob registered unwraps the
/// inner message and delivers it locally.
///
/// Never originated by clients — only relays create and
/// consume this variant.
///
/// Loop prevention: the receiving relay drops any forward
/// where `origin_relay_fp` matches its own federation TLS
/// fingerprint. With broadcast-to-all-peers this prevents
/// A→B→A echo loops; proper TTL + dedup will land when
/// multi-hop federation is added (Phase 4.2).
FederatedSignalForward {
/// The signal message being forwarded
/// (`DirectCallOffer`, `DirectCallAnswer`, `CallRinging`,
/// `Hangup`, ...). Boxed because `SignalMessage` is
/// relatively large and JSON serde handles recursion
/// cleanly.
inner: Box<SignalMessage>,
/// Federation TLS fingerprint of the sending relay.
/// Used (a) for loop prevention by the receiver and (b)
/// to route the peer's reply back through the same
/// federation link via `send_signal_to_peer`.
origin_relay_fp: String,
},
} }
/// How the callee responds to a direct call. /// How the callee responds to a direct call.
@@ -990,261 +888,6 @@ mod tests {
assert_eq!(packet.quality_report, decoded.quality_report); assert_eq!(packet.quality_report, decoded.quality_report);
} }
#[test]
fn reflect_serialize_roundtrip() {
// Reflect is a unit variant — the client sends it with no
// payload and the relay answers with the observed source addr.
let req = SignalMessage::Reflect;
let json = serde_json::to_string(&req).unwrap();
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
assert!(matches!(decoded, SignalMessage::Reflect));
// ReflectResponse carries a string — exercise both IPv4 and
// IPv6 shapes because SocketAddr::to_string uses [::1]:port
// for v6 and the client side has to parse that back.
for addr in ["192.0.2.17:4433", "[2001:db8::1]:4433", "127.0.0.1:54321"] {
let resp = SignalMessage::ReflectResponse {
observed_addr: addr.to_string(),
};
let json = serde_json::to_string(&resp).unwrap();
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
match decoded {
SignalMessage::ReflectResponse { observed_addr } => {
assert_eq!(observed_addr, addr);
// Must parse back to a SocketAddr cleanly.
let _parsed: std::net::SocketAddr = observed_addr.parse()
.expect("observed_addr must parse as SocketAddr");
}
_ => panic!("wrong variant after roundtrip"),
}
}
}
#[test]
fn federated_signal_forward_roundtrip() {
// Wrap a DirectCallOffer inside FederatedSignalForward and
// prove both directions of serde preserve every field.
let inner = SignalMessage::DirectCallOffer {
caller_fingerprint: "alice".into(),
caller_alias: Some("Alice".into()),
target_fingerprint: "bob".into(),
call_id: "c1".into(),
identity_pub: [1u8; 32],
ephemeral_pub: [2u8; 32],
signature: vec![3u8; 64],
supported_profiles: vec![],
caller_reflexive_addr: Some("192.0.2.1:4433".into()),
};
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(inner),
origin_relay_fp: "relay-a-tls-fp".into(),
};
let json = serde_json::to_string(&forward).unwrap();
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
match decoded {
SignalMessage::FederatedSignalForward { inner, origin_relay_fp } => {
assert_eq!(origin_relay_fp, "relay-a-tls-fp");
match *inner {
SignalMessage::DirectCallOffer {
caller_fingerprint,
target_fingerprint,
caller_reflexive_addr,
..
} => {
assert_eq!(caller_fingerprint, "alice");
assert_eq!(target_fingerprint, "bob");
assert_eq!(caller_reflexive_addr.as_deref(), Some("192.0.2.1:4433"));
}
_ => panic!("inner was not DirectCallOffer after roundtrip"),
}
}
_ => panic!("outer was not FederatedSignalForward"),
}
}
#[test]
fn federated_signal_forward_can_nest_any_inner() {
// Sanity check that every direct-call signaling variant
// we intend to forward survives being boxed + re-serialized.
let cases: Vec<SignalMessage> = vec![
SignalMessage::DirectCallAnswer {
call_id: "c1".into(),
accept_mode: CallAcceptMode::AcceptTrusted,
identity_pub: None,
ephemeral_pub: None,
signature: None,
chosen_profile: None,
callee_reflexive_addr: Some("198.51.100.9:4433".into()),
},
SignalMessage::CallRinging { call_id: "c1".into() },
SignalMessage::Hangup { reason: HangupReason::Normal },
];
for inner in cases {
let inner_disc = std::mem::discriminant(&inner);
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(inner),
origin_relay_fp: "r".into(),
};
let json = serde_json::to_string(&forward).unwrap();
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
match decoded {
SignalMessage::FederatedSignalForward { inner, .. } => {
assert_eq!(std::mem::discriminant(&*inner), inner_disc);
}
_ => panic!("outer variant lost"),
}
}
}
#[test]
fn hole_punching_optional_fields_roundtrip() {
// DirectCallOffer with Some(caller_reflexive_addr)
let offer = SignalMessage::DirectCallOffer {
caller_fingerprint: "alice".into(),
caller_alias: None,
target_fingerprint: "bob".into(),
call_id: "c1".into(),
identity_pub: [0; 32],
ephemeral_pub: [0; 32],
signature: vec![],
supported_profiles: vec![],
caller_reflexive_addr: Some("192.0.2.1:4433".into()),
};
let json = serde_json::to_string(&offer).unwrap();
assert!(
json.contains("caller_reflexive_addr"),
"Some field must serialize: {json}"
);
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
match decoded {
SignalMessage::DirectCallOffer { caller_reflexive_addr, .. } => {
assert_eq!(caller_reflexive_addr.as_deref(), Some("192.0.2.1:4433"));
}
_ => panic!("wrong variant"),
}
// DirectCallOffer with None — skip_serializing_if must
// OMIT the field from the JSON so older relays that don't
// know about caller_reflexive_addr don't see it.
let offer_none = SignalMessage::DirectCallOffer {
caller_fingerprint: "alice".into(),
caller_alias: None,
target_fingerprint: "bob".into(),
call_id: "c1".into(),
identity_pub: [0; 32],
ephemeral_pub: [0; 32],
signature: vec![],
supported_profiles: vec![],
caller_reflexive_addr: None,
};
let json_none = serde_json::to_string(&offer_none).unwrap();
assert!(
!json_none.contains("caller_reflexive_addr"),
"None field must NOT serialize: {json_none}"
);
// DirectCallAnswer with callee_reflexive_addr.
let answer = SignalMessage::DirectCallAnswer {
call_id: "c1".into(),
accept_mode: CallAcceptMode::AcceptTrusted,
identity_pub: None,
ephemeral_pub: None,
signature: None,
chosen_profile: None,
callee_reflexive_addr: Some("198.51.100.9:4433".into()),
};
let decoded: SignalMessage =
serde_json::from_str(&serde_json::to_string(&answer).unwrap()).unwrap();
match decoded {
SignalMessage::DirectCallAnswer { callee_reflexive_addr, .. } => {
assert_eq!(
callee_reflexive_addr.as_deref(),
Some("198.51.100.9:4433")
);
}
_ => panic!("wrong variant"),
}
// CallSetup with peer_direct_addr.
let setup = SignalMessage::CallSetup {
call_id: "c1".into(),
room: "call-c1".into(),
relay_addr: "203.0.113.5:4433".into(),
peer_direct_addr: Some("192.0.2.1:4433".into()),
};
let decoded: SignalMessage =
serde_json::from_str(&serde_json::to_string(&setup).unwrap()).unwrap();
match decoded {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert_eq!(peer_direct_addr.as_deref(), Some("192.0.2.1:4433"));
}
_ => panic!("wrong variant"),
}
}
#[test]
fn hole_punching_backward_compat_old_json_parses() {
// An older client/relay wouldn't include the new fields at
// all — the new code must still accept that JSON because
// of #[serde(default)] on the Option<String>.
let old_offer_json = r#"{
"DirectCallOffer": {
"caller_fingerprint": "alice",
"caller_alias": null,
"target_fingerprint": "bob",
"call_id": "c1",
"identity_pub": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
"ephemeral_pub": [0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0],
"signature": [],
"supported_profiles": []
}
}"#;
let decoded: SignalMessage = serde_json::from_str(old_offer_json).unwrap();
match decoded {
SignalMessage::DirectCallOffer { caller_reflexive_addr, .. } => {
assert!(caller_reflexive_addr.is_none());
}
_ => panic!("wrong variant"),
}
let old_setup_json = r#"{
"CallSetup": {
"call_id": "c1",
"room": "call-c1",
"relay_addr": "203.0.113.5:4433"
}
}"#;
let decoded: SignalMessage = serde_json::from_str(old_setup_json).unwrap();
match decoded {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert!(peer_direct_addr.is_none());
}
_ => panic!("wrong variant"),
}
}
#[test]
fn reflect_backward_compat_with_existing_variants() {
// Adding Reflect/ReflectResponse at the end of the enum must
// not break JSON round-tripping of existing variants. Smoke-
// test a sample of the pre-existing ones.
let cases = vec![
SignalMessage::Ping { timestamp_ms: 12345 },
SignalMessage::Hold,
SignalMessage::Hangup { reason: HangupReason::Normal },
SignalMessage::CallRinging { call_id: "abcd".into() },
];
for m in cases {
let json = serde_json::to_string(&m).unwrap();
let decoded: SignalMessage = serde_json::from_str(&json).unwrap();
// Discriminant equality proves variant tag survived.
assert_eq!(
std::mem::discriminant(&m),
std::mem::discriminant(&decoded)
);
}
}
#[test] #[test]
fn hold_unhold_serialize() { fn hold_unhold_serialize() {
let hold = SignalMessage::Hold; let hold = SignalMessage::Hold;

View File

@@ -31,25 +31,6 @@ pub struct DirectCall {
pub created_at: Instant, pub created_at: Instant,
pub answered_at: Option<Instant>, pub answered_at: Option<Instant>,
pub ended_at: Option<Instant>, pub ended_at: Option<Instant>,
/// Phase 3 (hole-punching): caller's server-reflexive address
/// as carried in the `DirectCallOffer`. The relay stashes it
/// here when the offer arrives so it can later inject it as
/// `peer_direct_addr` into the callee's `CallSetup`.
pub caller_reflexive_addr: Option<String>,
/// Phase 3 (hole-punching): callee's server-reflexive address
/// as carried in the `DirectCallAnswer`. Only populated for
/// `AcceptTrusted` answers — privacy-mode answers leave this
/// `None`. Fed into the caller's `CallSetup.peer_direct_addr`.
pub callee_reflexive_addr: Option<String>,
/// Phase 4 (cross-relay): federation TLS fingerprint of the
/// PEER RELAY that forwarded the offer/answer for this call.
/// `None` for local calls — caller and callee both
/// registered on this relay. `Some(fp)` when one side of
/// the call is on a remote relay reached through the
/// federation link identified by `fp`. The
/// `DirectCallAnswer` handling uses this to route the reply
/// back through the SAME link instead of broadcasting again.
pub peer_relay_fp: Option<String>,
} }
/// Registry of active direct calls. /// Registry of active direct calls.
@@ -76,42 +57,11 @@ impl CallRegistry {
created_at: Instant::now(), created_at: Instant::now(),
answered_at: None, answered_at: None,
ended_at: None, ended_at: None,
caller_reflexive_addr: None,
callee_reflexive_addr: None,
peer_relay_fp: None,
}; };
self.calls.insert(call_id.clone(), call); self.calls.insert(call_id.clone(), call);
self.calls.get(&call_id).unwrap() self.calls.get(&call_id).unwrap()
} }
/// Phase 4: stash the federation TLS fingerprint of the peer
/// relay that originated (or will receive) the cross-relay
/// forward for this call. Safe to call with `None` to clear
/// a previously-set value.
pub fn set_peer_relay_fp(&mut self, call_id: &str, fp: Option<String>) {
if let Some(call) = self.calls.get_mut(call_id) {
call.peer_relay_fp = fp;
}
}
/// Phase 3: stash the caller's server-reflexive address read
/// off a `DirectCallOffer`. Safe to call on any call state;
/// a no-op if the call doesn't exist.
pub fn set_caller_reflexive_addr(&mut self, call_id: &str, addr: Option<String>) {
if let Some(call) = self.calls.get_mut(call_id) {
call.caller_reflexive_addr = addr;
}
}
/// Phase 3: stash the callee's server-reflexive address read
/// off a `DirectCallAnswer`. Safe to call on any call state;
/// a no-op if the call doesn't exist.
pub fn set_callee_reflexive_addr(&mut self, call_id: &str, addr: Option<String>) {
if let Some(call) = self.calls.get_mut(call_id) {
call.callee_reflexive_addr = addr;
}
}
/// Get a call by ID. /// Get a call by ID.
pub fn get(&self, call_id: &str) -> Option<&DirectCall> { pub fn get(&self, call_id: &str) -> Option<&DirectCall> {
self.calls.get(call_id) self.calls.get(call_id)
@@ -246,79 +196,4 @@ mod tests {
assert_eq!(reg.peer_fingerprint("c1", "alice"), Some("bob")); assert_eq!(reg.peer_fingerprint("c1", "alice"), Some("bob"));
assert_eq!(reg.peer_fingerprint("c1", "bob"), Some("alice")); assert_eq!(reg.peer_fingerprint("c1", "bob"), Some("alice"));
} }
#[test]
fn call_registry_stores_reflexive_addrs() {
let mut reg = CallRegistry::new();
reg.create_call("c1".into(), "alice".into(), "bob".into());
// Default: both addrs are None.
let c = reg.get("c1").unwrap();
assert!(c.caller_reflexive_addr.is_none());
assert!(c.callee_reflexive_addr.is_none());
// Caller advertises its reflex addr via DirectCallOffer.
reg.set_caller_reflexive_addr("c1", Some("192.0.2.1:4433".into()));
assert_eq!(
reg.get("c1").unwrap().caller_reflexive_addr.as_deref(),
Some("192.0.2.1:4433")
);
// Callee responds with AcceptTrusted + its own reflex addr.
reg.set_callee_reflexive_addr("c1", Some("198.51.100.9:4433".into()));
assert_eq!(
reg.get("c1").unwrap().callee_reflexive_addr.as_deref(),
Some("198.51.100.9:4433")
);
// Both addrs are independently readable — the relay uses
// them to cross-wire peer_direct_addr in CallSetup.
let c = reg.get("c1").unwrap();
assert_eq!(
c.caller_reflexive_addr.as_deref(),
Some("192.0.2.1:4433")
);
assert_eq!(
c.callee_reflexive_addr.as_deref(),
Some("198.51.100.9:4433")
);
// Setter on an unknown call is a no-op, not a panic.
reg.set_caller_reflexive_addr("does-not-exist", Some("x".into()));
}
#[test]
fn call_registry_stores_peer_relay_fp() {
let mut reg = CallRegistry::new();
reg.create_call("c1".into(), "alice".into(), "bob".into());
// Default: no peer relay.
assert!(reg.get("c1").unwrap().peer_relay_fp.is_none());
// Cross-relay call: origin relay's fp is stashed.
reg.set_peer_relay_fp("c1", Some("relay-a-tls-fp".into()));
assert_eq!(
reg.get("c1").unwrap().peer_relay_fp.as_deref(),
Some("relay-a-tls-fp")
);
// Clearing with None is a valid no-op and empties the field.
reg.set_peer_relay_fp("c1", None);
assert!(reg.get("c1").unwrap().peer_relay_fp.is_none());
// Unknown call is a no-op, not a panic.
reg.set_peer_relay_fp("does-not-exist", Some("x".into()));
}
#[test]
fn call_registry_clearing_reflex_addr_works() {
// Passing None to the setter must clear a previously-set value
// so callers that downgrade to privacy mode mid-flow don't
// leak a stale addr into CallSetup.
let mut reg = CallRegistry::new();
reg.create_call("c1".into(), "alice".into(), "bob".into());
reg.set_caller_reflexive_addr("c1", Some("192.0.2.1:4433".into()));
reg.set_caller_reflexive_addr("c1", None);
assert!(reg.get("c1").unwrap().caller_reflexive_addr.is_none());
}
} }

View File

@@ -5,6 +5,7 @@
//! Use `wzp-analyzer` to correlate events across multiple relays. //! Use `wzp-analyzer` to correlate events across multiple relays.
use std::path::PathBuf; use std::path::PathBuf;
use std::sync::Arc;
use serde::Serialize; use serde::Serialize;
use tokio::sync::mpsc; use tokio::sync::mpsc;

View File

@@ -142,18 +142,13 @@ pub struct FederationManager {
peer_links: Arc<Mutex<HashMap<String, PeerLink>>>, peer_links: Arc<Mutex<HashMap<String, PeerLink>>>,
/// Dedup filter for incoming federation datagrams. /// Dedup filter for incoming federation datagrams.
dedup: Mutex<Deduplicator>, dedup: Mutex<Deduplicator>,
/// Per-room seq counter for federation media delivered to local clients.
/// Ensures clients see monotonically increasing seq regardless of federation sender.
local_delivery_seq: std::sync::atomic::AtomicU16,
/// JSONL event log for protocol analysis. /// JSONL event log for protocol analysis.
event_log: EventLogger, event_log: EventLogger,
/// Per-room rate limiters for inbound federation media. /// Per-room rate limiters for inbound federation media.
rate_limiters: Mutex<HashMap<String, RateLimiter>>, rate_limiters: Mutex<HashMap<String, RateLimiter>>,
/// Phase 4: channel for handing cross-relay direct-call
/// signaling (inner message + origin relay fp) back to the
/// main signal loop in `main.rs`. Set once at startup via
/// `set_cross_relay_tx`. `None` when the main loop hasn't
/// wired it up yet (e.g. during startup warmup) — forwards
/// that arrive before wiring are dropped with a warning.
cross_relay_signal_tx:
Mutex<Option<tokio::sync::mpsc::Sender<(wzp_proto::SignalMessage, String)>>>,
} }
impl FederationManager { impl FederationManager {
@@ -177,80 +172,9 @@ impl FederationManager {
metrics, metrics,
peer_links: Arc::new(Mutex::new(HashMap::new())), peer_links: Arc::new(Mutex::new(HashMap::new())),
dedup: Mutex::new(Deduplicator::new(DEDUP_WINDOW_SIZE)), dedup: Mutex::new(Deduplicator::new(DEDUP_WINDOW_SIZE)),
local_delivery_seq: std::sync::atomic::AtomicU16::new(0),
event_log, event_log,
rate_limiters: Mutex::new(HashMap::new()), rate_limiters: Mutex::new(HashMap::new()),
cross_relay_signal_tx: Mutex::new(None),
}
}
/// Phase 4: expose this relay's federation TLS fingerprint so
/// the main signal loop can populate
/// `SignalMessage::FederatedSignalForward.origin_relay_fp`.
pub fn local_tls_fp(&self) -> &str {
&self.local_tls_fp
}
/// Phase 4: wire the channel that the main signal loop uses
/// to receive unwrapped cross-relay direct-call signals. Called
/// once at startup from `main.rs`.
pub async fn set_cross_relay_tx(
&self,
tx: tokio::sync::mpsc::Sender<(wzp_proto::SignalMessage, String)>,
) {
*self.cross_relay_signal_tx.lock().await = Some(tx);
}
/// Phase 4: broadcast a `SignalMessage::FederatedSignalForward`
/// to every active federation peer link. Returns the number of
/// peers the broadcast reached (not the number that successfully
/// delivered the message further). Used when the local relay
/// doesn't know which peer holds the target fingerprint for a
/// `DirectCallOffer` — whichever peer has it will unwrap and
/// handle locally; the rest drop silently after "target not
/// local" check.
///
/// Loop prevention: the receiving relay checks
/// `origin_relay_fp` against its own fp and drops self-sourced
/// forwards.
pub async fn broadcast_signal(&self, msg: &wzp_proto::SignalMessage) -> usize {
let links = self.peer_links.lock().await;
let mut count = 0;
for (fp, link) in links.iter() {
match link.transport.send_signal(msg).await {
Ok(()) => {
count += 1;
tracing::debug!(peer = %link.label, %fp, "federation: broadcast signal ok");
}
Err(e) => {
tracing::warn!(peer = %link.label, %fp, error = %e, "federation: broadcast signal failed");
}
}
}
count
}
/// Phase 4: targeted send — used by the
/// `DirectCallAnswer` path when the registry knows exactly
/// which peer relay to route the reply back to. More efficient
/// than re-broadcasting and avoids leaking the call to
/// uninvolved peers.
///
/// Returns `Ok(())` on success, `Err(String)` when the peer
/// isn't currently linked or the send fails.
pub async fn send_signal_to_peer(
&self,
peer_relay_fp: &str,
msg: &wzp_proto::SignalMessage,
) -> Result<(), String> {
let normalized = normalize_fp(peer_relay_fp);
let links = self.peer_links.lock().await;
match links.get(&normalized) {
Some(link) => link
.transport
.send_signal(msg)
.await
.map_err(|e| format!("send to peer {normalized}: {e}")),
None => Err(format!("no active federation link for {normalized}")),
} }
} }
@@ -372,12 +296,7 @@ impl FederationManager {
/// Forward locally-generated media to all connected peers. /// Forward locally-generated media to all connected peers.
/// For locally-originated media, we send to ALL peers (they decide whether to deliver). /// For locally-originated media, we send to ALL peers (they decide whether to deliver).
/// For forwarded media (multi-hop), handle_datagram filters by active_rooms. /// For forwarded media (multi-hop), handle_datagram filters by active_rooms.
/// pub async fn forward_to_peers(&self, room_name: &str, room_hash: &[u8; 8], media_data: &Bytes) {
/// `_room_name` is kept in the signature for caller-site symmetry with
/// the other room-tagged helpers and for future per-room-name logging
/// or rate limiting; the body currently forwards on `room_hash` alone
/// because that's what the wire format carries.
pub async fn forward_to_peers(&self, _room_name: &str, room_hash: &[u8; 8], media_data: &Bytes) {
let links = self.peer_links.lock().await; let links = self.peer_links.lock().await;
if links.is_empty() { if links.is_empty() {
return; return;
@@ -704,20 +623,11 @@ async fn run_federation_link(
} }
}; };
// RTT monitor: periodically sample QUIC RTT for this peer and push it // RTT monitor: periodically sample QUIC RTT for this peer
// into the `wzp_federation_peer_rtt_ms` gauge. The gauge is registered
// in metrics.rs but previously never received any samples — the task
// computed rtt_ms and dropped it on the floor, leaving the Grafana
// panel blank. Fixed as part of the workspace warning sweep.
let rtt_task = async move { let rtt_task = async move {
loop { loop {
tokio::time::sleep(Duration::from_secs(5)).await; tokio::time::sleep(Duration::from_secs(5)).await;
let rtt_ms = rtt_transport.connection().stats().path.rtt.as_millis() as f64; let rtt_ms = rtt_transport.connection().stats().path.rtt.as_millis() as f64;
fm_rtt
.metrics
.federation_peer_rtt_ms
.with_label_values(&[&label_rtt])
.set(rtt_ms);
} }
}; };
@@ -932,57 +842,6 @@ async fn handle_signal(
} }
} }
} }
// Phase 4: cross-relay direct-call signal envelope.
//
// Unwrap the inner message and hand it off to the main
// signal loop via the cross_relay_signal_tx channel. The
// main loop will then dispatch the inner DirectCallOffer/
// Answer/Ringing/Hangup exactly as if it had arrived on a
// local signal transport — with the extra context that
// the call is "federated" (origin_relay_fp).
//
// Loop prevention: drop any forward whose origin matches
// our own federation TLS fingerprint. With
// broadcast-to-all-peers this prevents A→B→A echo loops.
SignalMessage::FederatedSignalForward { inner, origin_relay_fp } => {
if origin_relay_fp == fm.local_tls_fp {
tracing::debug!(
peer = %peer_label,
"federation: dropping self-sourced FederatedSignalForward (loop prevention)"
);
return;
}
let tx_opt = {
let guard = fm.cross_relay_signal_tx.lock().await;
guard.clone()
};
match tx_opt {
Some(tx) => {
let inner_discriminant = std::mem::discriminant(&*inner);
if let Err(e) = tx.send((*inner, origin_relay_fp.clone())).await {
warn!(
peer = %peer_label,
?inner_discriminant,
error = %e,
"federation: cross-relay signal dispatcher full / closed"
);
} else {
tracing::debug!(
peer = %peer_label,
?inner_discriminant,
%origin_relay_fp,
"federation: forwarded cross-relay signal to main dispatcher"
);
}
}
None => {
warn!(
peer = %peer_label,
"federation: cross_relay_signal_tx not wired yet — dropping forward"
);
}
}
}
_ => {} // ignore other signals _ => {} // ignore other signals
} }
} }

View File

@@ -94,13 +94,9 @@ pub async fn accept_handshake(
} }
/// Select the best quality profile from those the caller supports. /// Select the best quality profile from those the caller supports.
/// fn choose_profile(supported: &[QualityProfile]) -> QualityProfile {
/// The `_supported` list is currently ignored — we hardcode GOOD (24k) until // Cap at GOOD (24k) for now — studio tiers (32k/48k/64k) not yet tested
/// studio tiers (32k/48k/64k) have been validated across federation (large // for federation reliability (large packets may exceed path MTU).
/// packets may exceed path MTU and fragment in unpleasant ways). Once that's
/// tested, the body should pick the highest supported profile ≤ the relay's
/// configured ceiling.
fn choose_profile(_supported: &[QualityProfile]) -> QualityProfile {
QualityProfile::GOOD QualityProfile::GOOD
} }

View File

@@ -13,7 +13,7 @@ use std::sync::Arc;
use std::time::Duration; use std::time::Duration;
use tokio::sync::Mutex; use tokio::sync::Mutex;
use tracing::{debug, error, info, warn}; use tracing::{error, info, warn};
use wzp_proto::{MediaTransport, SignalMessage}; use wzp_proto::{MediaTransport, SignalMessage};
use wzp_relay::config::RelayConfig; use wzp_relay::config::RelayConfig;
@@ -272,7 +272,7 @@ const BUILD_GIT_HASH: &str = env!("WZP_BUILD_HASH");
#[tokio::main] #[tokio::main]
async fn main() -> anyhow::Result<()> { async fn main() -> anyhow::Result<()> {
let CliResult { config, identity_path, config_file, config_needs_create } = parse_args(); let CliResult { mut config, identity_path, config_file, config_needs_create } = parse_args();
tracing_subscriber::fmt().init(); tracing_subscriber::fmt().init();
info!(version = BUILD_GIT_HASH, "wzp-relay build"); info!(version = BUILD_GIT_HASH, "wzp-relay build");
rustls::crypto::ring::default_provider() rustls::crypto::ring::default_provider()
@@ -378,31 +378,6 @@ async fn main() -> anyhow::Result<()> {
} }
let endpoint = wzp_transport::create_endpoint(config.listen_addr, Some(server_config))?; let endpoint = wzp_transport::create_endpoint(config.listen_addr, Some(server_config))?;
// Compute the IP address we should advertise in CallSetup for direct
// calls. If the relay is bound to a specific IP, use it as-is; if bound
// to 0.0.0.0, use the trick of "connect" a UDP socket to an arbitrary
// external address and read its local_addr — the OS binds to whichever
// local interface IP would route packets to that destination, which is
// the primary outbound interface. This is the same IP clients on the
// LAN use to reach us.
let advertised_ip: std::net::IpAddr = {
let listen_ip = config.listen_addr.ip();
if !listen_ip.is_unspecified() {
listen_ip
} else {
// Probe via a dummy "connected" UDP socket. Never actually sends.
match std::net::UdpSocket::bind("0.0.0.0:0")
.and_then(|s| { s.connect("8.8.8.8:80").map(|_| s) })
.and_then(|s| s.local_addr())
{
Ok(a) if !a.ip().is_loopback() => a.ip(),
_ => std::net::IpAddr::from([127u8, 0, 0, 1]),
}
}
};
let advertised_addr_str = format!("{}:{}", advertised_ip, config.listen_addr.port());
info!(%advertised_addr_str, "relay advertised address for CallSetup");
// Forward mode // Forward mode
let remote_transport: Option<Arc<wzp_transport::QuinnTransport>> = let remote_transport: Option<Arc<wzp_transport::QuinnTransport>> =
if let Some(remote_addr) = config.remote_relay { if let Some(remote_addr) = config.remote_relay {
@@ -453,21 +428,6 @@ async fn main() -> anyhow::Result<()> {
let signal_hub = Arc::new(Mutex::new(wzp_relay::signal_hub::SignalHub::new())); let signal_hub = Arc::new(Mutex::new(wzp_relay::signal_hub::SignalHub::new()));
let call_registry = Arc::new(Mutex::new(wzp_relay::call_registry::CallRegistry::new())); let call_registry = Arc::new(Mutex::new(wzp_relay::call_registry::CallRegistry::new()));
// Phase 4: cross-relay direct-call signal dispatcher.
//
// The federation layer unwraps incoming
// `SignalMessage::FederatedSignalForward` envelopes and pushes
// (inner, origin_relay_fp) onto this channel. A dedicated task
// further down reads from it and routes the inner message
// through signal_hub / call_registry exactly as if it had
// arrived on a local signal transport — with the extra
// context that a peer relay is on the other side of the call.
let (cross_relay_tx, mut cross_relay_rx) =
tokio::sync::mpsc::channel::<(wzp_proto::SignalMessage, String)>(32);
if let Some(ref fm) = federation_mgr {
fm.set_cross_relay_tx(cross_relay_tx.clone()).await;
}
// Spawn inter-relay health probes via ProbeMesh coordinator // Spawn inter-relay health probes via ProbeMesh coordinator
if !config.probe_targets.is_empty() { if !config.probe_targets.is_empty() {
let mesh = wzp_relay::probe::ProbeMesh::new( let mesh = wzp_relay::probe::ProbeMesh::new(
@@ -512,217 +472,12 @@ async fn main() -> anyhow::Result<()> {
info!(filter = %tap, "debug tap enabled — logging packet headers"); info!(filter = %tap, "debug tap enabled — logging packet headers");
} }
// Phase 4: cross-relay direct-call dispatcher task.
//
// Reads unwrapped (inner, origin_relay_fp) tuples that the
// federation layer pushes out of its `handle_signal` arm for
// `FederatedSignalForward`, and routes the inner message
// through the local signal_hub / call_registry exactly as if
// the message had arrived on a local client signal transport.
//
// In Phase 4 MVP the dispatcher handles:
// * DirectCallOffer — if target is local, stash in registry
// with peer_relay_fp and deliver to
// local callee via signal_hub.
// * DirectCallAnswer — stash callee addr, forward answer to
// local caller, emit local CallSetup.
// * CallRinging — forward to local caller for UX.
// * Hangup — forward to the local participant(s).
// Everything else is dropped.
{
let signal_hub_d = signal_hub.clone();
let call_registry_d = call_registry.clone();
let advertised_addr_d = advertised_addr_str.clone();
let federation_mgr_d = federation_mgr.clone();
tokio::spawn(async move {
use wzp_proto::{CallAcceptMode, SignalMessage};
while let Some((inner, origin_relay_fp)) = cross_relay_rx.recv().await {
match inner {
SignalMessage::DirectCallOffer {
ref target_fingerprint,
ref caller_fingerprint,
ref call_id,
ref caller_reflexive_addr,
..
} => {
// Is the target on THIS relay? If not, drop —
// Phase 4 MVP is single-hop federation only.
let online = {
let hub = signal_hub_d.lock().await;
hub.is_online(target_fingerprint)
};
if !online {
tracing::debug!(
target = %target_fingerprint,
%origin_relay_fp,
"cross-relay: offer target not local, dropping (no multi-hop)"
);
continue;
}
// Stash in local registry so the answer path
// can find the call + route the reply back
// through the same federation link.
{
let mut reg = call_registry_d.lock().await;
reg.create_call(
call_id.clone(),
caller_fingerprint.clone(),
target_fingerprint.clone(),
);
reg.set_caller_reflexive_addr(call_id, caller_reflexive_addr.clone());
reg.set_peer_relay_fp(call_id, Some(origin_relay_fp.clone()));
}
// Deliver the offer to the local target.
let hub = signal_hub_d.lock().await;
if let Err(e) = hub.send_to(target_fingerprint, &inner).await {
tracing::warn!(
target = %target_fingerprint,
error = %e,
"cross-relay: failed to deliver forwarded offer"
);
}
}
SignalMessage::DirectCallAnswer {
ref call_id,
accept_mode,
ref callee_reflexive_addr,
..
} => {
// Look up the local caller fp from the registry.
let caller_fp = {
let reg = call_registry_d.lock().await;
reg.get(call_id).map(|c| c.caller_fingerprint.clone())
};
let Some(caller_fp) = caller_fp else {
tracing::debug!(%call_id, "cross-relay: answer for unknown call, dropping");
continue;
};
if accept_mode == CallAcceptMode::Reject {
// Forward hangup to local caller + clean up registry.
let hub = signal_hub_d.lock().await;
let _ = hub
.send_to(
&caller_fp,
&SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal,
},
)
.await;
drop(hub);
let mut reg = call_registry_d.lock().await;
reg.end_call(call_id);
continue;
}
// Accept — stash the callee's reflex addr + mark
// the call active, then read back BOTH addrs so
// we can cross-wire peer_direct_addr in CallSetup.
let room_name = format!("call-{call_id}");
let (caller_addr, callee_addr_for_setup) = {
let mut reg = call_registry_d.lock().await;
reg.set_active(call_id, accept_mode, room_name.clone());
reg.set_callee_reflexive_addr(
call_id,
callee_reflexive_addr.clone(),
);
let c = reg.get(call_id);
(
c.and_then(|c| c.caller_reflexive_addr.clone()),
c.and_then(|c| c.callee_reflexive_addr.clone()),
)
};
let _ = caller_addr; // unused on the caller side; callee holds the relevant addr
// Forward the raw answer to the local caller so
// the JS side sees DirectCallAnswer (fires any
// "call answered" UX that looks at this message).
{
let hub = signal_hub_d.lock().await;
let _ = hub.send_to(&caller_fp, &inner).await;
}
// Emit the LOCAL CallSetup to our local caller.
// relay_addr = our own advertised addr so if P2P
// fails the caller will at least dial OUR relay
// (single-relay fallback — Phase 4.1 will wire
// federated media so that actually reaches the
// peer). peer_direct_addr = the callee's reflex
// addr carried in the answer.
let setup = SignalMessage::CallSetup {
call_id: call_id.clone(),
room: room_name.clone(),
relay_addr: advertised_addr_d.clone(),
peer_direct_addr: callee_addr_for_setup,
};
let hub = signal_hub_d.lock().await;
let _ = hub.send_to(&caller_fp, &setup).await;
tracing::info!(
%call_id,
%caller_fp,
%origin_relay_fp,
"cross-relay: delivered answer + CallSetup to local caller"
);
}
SignalMessage::CallRinging { ref call_id } => {
// Forward to local caller for "ringing..." UX.
let caller_fp = {
let reg = call_registry_d.lock().await;
reg.get(call_id).map(|c| c.caller_fingerprint.clone())
};
if let Some(fp) = caller_fp {
let hub = signal_hub_d.lock().await;
let _ = hub.send_to(&fp, &inner).await;
}
}
SignalMessage::Hangup { .. } => {
// Best-effort: broadcast the hangup to every
// local participant of any call that currently
// has this origin as its peer_relay_fp.
// The forwarded hangup doesn't carry a call_id
// so we can't target precisely — Phase 4.1 will
// tighten this once hangup tracking is stricter.
tracing::debug!(
%origin_relay_fp,
"cross-relay: forwarded Hangup (Phase 4.1 will target by call_id)"
);
}
_ => {
tracing::debug!(
%origin_relay_fp,
"cross-relay: dispatcher ignoring unsupported inner variant"
);
}
}
}
// Suppress the warning if federation_mgr_d is unused —
// it's held here so the Arc doesn't drop during the
// dispatcher's lifetime.
drop(federation_mgr_d);
});
}
info!("Listening for connections..."); info!("Listening for connections...");
loop { loop {
// Pull the next Incoming off the queue. Deliberately do NOT await let connection = match wzp_transport::accept(&endpoint).await {
// the QUIC handshake here — move that into the per-connection Ok(conn) => conn,
// spawned task below. Previously we used wzp_transport::accept Err(e) => { error!("accept: {e}"); continue; }
// which did both, which meant a single slow handshake would block
// the entire accept loop and prevent ALL subsequent connections
// from being processed. Surfaced as direct-call hangs where the
// callee's call-* connection never completes its QUIC handshake.
let incoming = match endpoint.accept().await {
Some(inc) => inc,
None => {
error!("endpoint.accept() returned None — endpoint closed");
break;
}
}; };
let remote_transport = remote_transport.clone(); let remote_transport = remote_transport.clone();
@@ -738,26 +493,9 @@ async fn main() -> anyhow::Result<()> {
let federation_mgr = federation_mgr.clone(); let federation_mgr = federation_mgr.clone();
let signal_hub = signal_hub.clone(); let signal_hub = signal_hub.clone();
let call_registry = call_registry.clone(); let call_registry = call_registry.clone();
let advertised_addr_str = advertised_addr_str.clone(); let listen_addr_str = config.listen_addr.to_string();
// Phase 4: per-task clone of this relay's federation TLS
// fingerprint so the FederatedSignalForward envelopes the
// spawned signal handler builds carry `origin_relay_fp`.
let tls_fp = tls_fp.clone();
let incoming_addr = incoming.remote_address();
info!(%incoming_addr, "accept queue: new Incoming, spawning handshake task");
tokio::spawn(async move { tokio::spawn(async move {
// Drive the QUIC handshake inside the spawned task so that
// slow or hung handshakes never block the outer accept loop.
let connection = match incoming.await {
Ok(c) => c,
Err(e) => {
error!(%incoming_addr, "QUIC handshake failed: {e}");
return;
}
};
info!(%incoming_addr, "QUIC handshake complete");
let addr = connection.remote_address(); let addr = connection.remote_address();
let room_name = connection let room_name = connection
@@ -980,15 +718,9 @@ async fn main() -> anyhow::Result<()> {
match transport.recv_signal().await { match transport.recv_signal().await {
Ok(Some(msg)) => { Ok(Some(msg)) => {
match msg { match msg {
SignalMessage::DirectCallOffer { SignalMessage::DirectCallOffer { ref target_fingerprint, ref call_id, ref caller_alias, .. } => {
ref target_fingerprint,
ref call_id,
ref caller_reflexive_addr,
..
} => {
let target_fp = target_fingerprint.clone(); let target_fp = target_fingerprint.clone();
let call_id = call_id.clone(); let call_id = call_id.clone();
let caller_addr_for_registry = caller_reflexive_addr.clone();
// Check if target is online // Check if target is online
let online = { let online = {
@@ -996,85 +728,17 @@ async fn main() -> anyhow::Result<()> {
hub.is_online(&target_fp) hub.is_online(&target_fp)
}; };
if !online { if !online {
// Phase 4: maybe the target is on a info!(%addr, target = %target_fp, "call target not online");
// federation peer. Wrap the offer in
// FederatedSignalForward and broadcast
// it over every active peer link —
// whichever relay has the target will
// unwrap and dispatch locally. We also
// stash the call in OUR registry so
// the eventual answer coming back via
// federation has a matching entry.
let forwarded = if let Some(ref fm) = federation_mgr {
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(msg.clone()),
origin_relay_fp: tls_fp.clone(),
};
let count = fm.broadcast_signal(&forward).await;
if count > 0 {
info!(
%addr,
target = %target_fp,
peers = count,
"direct-call offer forwarded to federation peers"
);
true
} else {
false
}
} else {
false
};
if !forwarded {
info!(%addr, target = %target_fp, "call target not online (no federation route)");
let _ = transport.send_signal(&SignalMessage::Hangup { let _ = transport.send_signal(&SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal, reason: wzp_proto::HangupReason::Normal,
}).await; }).await;
continue; continue;
} }
// Create call in registry with the // Create call in registry
// caller's reflex addr + mark it as
// cross-relay so the answer path knows
// to route the CallSetup's
// peer_direct_addr from what the
// federated answer carries. peer_relay_fp
// stays None here because we broadcast —
// the receiving relay picks itself as
// the answer source and its forwarded
// answer will identify itself there.
{
let mut reg = call_registry.lock().await;
reg.create_call(
call_id.clone(),
client_fp.clone(),
target_fp.clone(),
);
reg.set_caller_reflexive_addr(
&call_id,
caller_addr_for_registry,
);
}
// Send ringing to caller immediately
// so the UI shows feedback while the
// federated delivery is in flight.
let _ = transport.send_signal(&SignalMessage::CallRinging {
call_id: call_id.clone(),
}).await;
continue;
}
// Create call in registry + stash the caller's
// reflex addr (Phase 3 hole-punching). The relay
// treats the addr as opaque — no validation.
// Injected later into the callee's CallSetup as
// peer_direct_addr.
{ {
let mut reg = call_registry.lock().await; let mut reg = call_registry.lock().await;
reg.create_call(call_id.clone(), client_fp.clone(), target_fp.clone()); reg.create_call(call_id.clone(), client_fp.clone(), target_fp.clone());
reg.set_caller_reflexive_addr(&call_id, caller_addr_for_registry);
} }
// Forward offer to callee // Forward offer to callee
@@ -1091,35 +755,16 @@ async fn main() -> anyhow::Result<()> {
}).await; }).await;
} }
SignalMessage::DirectCallAnswer { SignalMessage::DirectCallAnswer { ref call_id, ref accept_mode, .. } => {
ref call_id,
ref accept_mode,
ref callee_reflexive_addr,
..
} => {
let call_id = call_id.clone(); let call_id = call_id.clone();
let mode = *accept_mode; let mode = *accept_mode;
let callee_addr_for_registry = callee_reflexive_addr.clone();
// Phase 4: look up peer fingerprint AND let peer_fp = {
// peer_relay_fp in one lock acquisition.
// peer_relay_fp being Some means the
// caller is on a remote federation peer
// and we have to route the answer /
// hangup back through that link instead
// of local signal_hub.
let (peer_fp, peer_relay_fp) = {
let reg = call_registry.lock().await; let reg = call_registry.lock().await;
match reg.get(&call_id) { reg.peer_fingerprint(&call_id, &client_fp).map(|s| s.to_string())
Some(c) => (
Some(reg.peer_fingerprint(&call_id, &client_fp).map(|s| s.to_string())),
c.peer_relay_fp.clone(),
),
None => (None, None),
}
}; };
let Some(Some(peer_fp)) = peer_fp else { let Some(peer_fp) = peer_fp else {
warn!(call_id = %call_id, "answer for unknown call"); warn!(call_id = %call_id, "answer for unknown call");
continue; continue;
}; };
@@ -1129,120 +774,50 @@ async fn main() -> anyhow::Result<()> {
let mut reg = call_registry.lock().await; let mut reg = call_registry.lock().await;
reg.end_call(&call_id); reg.end_call(&call_id);
drop(reg); drop(reg);
// Phase 4: cross-relay reject —
// forward the hangup to the origin
// relay instead of local signal_hub.
if let Some(ref origin_fp) = peer_relay_fp {
if let Some(ref fm) = federation_mgr {
let hangup = SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal,
};
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(hangup),
origin_relay_fp: tls_fp.clone(),
};
if let Err(e) = fm.send_signal_to_peer(origin_fp, &forward).await {
warn!(%call_id, %origin_fp, error = %e, "cross-relay reject forward failed");
}
}
} else {
let hub = signal_hub.lock().await; let hub = signal_hub.lock().await;
let _ = hub.send_to(&peer_fp, &SignalMessage::Hangup { let _ = hub.send_to(&peer_fp, &SignalMessage::Hangup {
reason: wzp_proto::HangupReason::Normal, reason: wzp_proto::HangupReason::Normal,
}).await; }).await;
}
} else { } else {
// Accept — create private room + stash the // Accept — create private room
// callee's reflex addr if it advertised one
// (AcceptTrusted only — privacy-mode answers
// leave it None by design). Then read back
// BOTH parties' addrs so we can cross-wire
// peer_direct_addr on the CallSetups below.
let room = format!("call-{call_id}"); let room = format!("call-{call_id}");
let (caller_addr, callee_addr) = { {
let mut reg = call_registry.lock().await; let mut reg = call_registry.lock().await;
reg.set_active(&call_id, mode, room.clone()); reg.set_active(&call_id, mode, room.clone());
reg.set_callee_reflexive_addr(&call_id, callee_addr_for_registry);
let call = reg.get(&call_id);
(
call.and_then(|c| c.caller_reflexive_addr.clone()),
call.and_then(|c| c.callee_reflexive_addr.clone()),
)
};
info!(
call_id = %call_id,
room = %room,
?mode,
p2p_viable = caller_addr.is_some() && callee_addr.is_some(),
"call accepted, creating room"
);
let relay_addr_for_setup = advertised_addr_str.clone();
if let Some(ref origin_fp) = peer_relay_fp {
// Phase 4 cross-relay: the caller
// is on a remote peer. Forward the
// raw answer (which carries the
// callee's reflex addr) back over
// federation — the peer's
// cross-relay dispatcher will
// deliver it to the local caller
// AND emit a CallSetup on that
// side with peer_direct_addr =
// callee_addr.
//
// Here we emit only the LOCAL
// CallSetup (to our callee) with
// peer_direct_addr = caller_addr.
if let Some(ref fm) = federation_mgr {
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(msg.clone()),
origin_relay_fp: tls_fp.clone(),
};
if let Err(e) = fm.send_signal_to_peer(origin_fp, &forward).await {
warn!(
%call_id,
%origin_fp,
error = %e,
"cross-relay answer forward failed"
);
}
} }
info!(call_id = %call_id, room = %room, mode = ?mode, "call accepted, creating room");
let setup_for_callee = SignalMessage::CallSetup {
call_id: call_id.clone(),
room: room.clone(),
relay_addr: relay_addr_for_setup,
peer_direct_addr: caller_addr.clone(),
};
let hub = signal_hub.lock().await;
let _ = hub.send_to(&client_fp, &setup_for_callee).await;
} else {
// Local call (existing Phase 3 path).
// Forward answer to caller // Forward answer to caller
{ {
let hub = signal_hub.lock().await; let hub = signal_hub.lock().await;
let _ = hub.send_to(&peer_fp, &msg).await; let _ = hub.send_to(&peer_fp, &msg).await;
} }
// Send CallSetup to BOTH parties with // Send CallSetup to both parties
// cross-wired peer_direct_addr. // Use the address the client connected to (their remote addr
let setup_for_caller = SignalMessage::CallSetup { // is our perspective, but we need our listen addr).
call_id: call_id.clone(), // Replace 0.0.0.0 with the client's destination IP.
room: room.clone(), let relay_addr_for_setup = if listen_addr_str.starts_with("0.0.0.0:") {
relay_addr: relay_addr_for_setup.clone(), let port = &listen_addr_str[8..];
peer_direct_addr: callee_addr.clone(), // Use the local IP from the client's connection
let local_ip = addr.ip();
if local_ip.is_loopback() {
format!("127.0.0.1:{port}")
} else {
format!("{local_ip}:{port}")
}
} else {
listen_addr_str.clone()
}; };
let setup_for_callee = SignalMessage::CallSetup { let setup = SignalMessage::CallSetup {
call_id: call_id.clone(), call_id: call_id.clone(),
room: room.clone(), room: room.clone(),
relay_addr: relay_addr_for_setup, relay_addr: relay_addr_for_setup,
peer_direct_addr: caller_addr.clone(),
}; };
{
let hub = signal_hub.lock().await; let hub = signal_hub.lock().await;
let _ = hub.send_to(&peer_fp, &setup_for_caller).await; let _ = hub.send_to(&peer_fp, &setup).await;
let _ = hub.send_to(&client_fp, &setup_for_callee).await; let _ = hub.send_to(&client_fp, &setup).await;
} }
} }
} }
@@ -1273,31 +848,6 @@ async fn main() -> anyhow::Result<()> {
let _ = transport.send_signal(&SignalMessage::Pong { timestamp_ms }).await; let _ = transport.send_signal(&SignalMessage::Pong { timestamp_ms }).await;
} }
// QUIC-native NAT reflection ("STUN for QUIC").
// The client asks "what source address do you
// see for me?" and we reply with whatever
// quinn reports as this connection's remote
// address — i.e. the post-NAT public address
// as observed from the server side of the TLS
// session. Used by the P2P path to learn the
// client's server-reflexive address without
// running a separate STUN server. No auth or
// rate-limit in Phase 1 — the client is
// already TLS-authenticated by the time it
// reaches this match arm.
SignalMessage::Reflect => {
let observed_addr = addr.to_string();
if let Err(e) = transport.send_signal(
&SignalMessage::ReflectResponse {
observed_addr: observed_addr.clone(),
},
).await {
warn!(%addr, error = %e, "reflect: failed to send response");
} else {
debug!(%addr, %observed_addr, "reflect: responded");
}
}
other => { other => {
warn!(%addr, "signal: unexpected message: {:?}", std::mem::discriminant(&other)); warn!(%addr, "signal: unexpected message: {:?}", std::mem::discriminant(&other));
} }
@@ -1307,16 +857,6 @@ async fn main() -> anyhow::Result<()> {
info!(%addr, "signal connection closed"); info!(%addr, "signal connection closed");
break; break;
} }
Err(wzp_proto::TransportError::Deserialize(e)) => {
// Forward-compat: the peer sent a
// SignalMessage variant we don't know
// (newer client, newer federation peer).
// Log and continue — tearing down the
// connection on unknown variants would
// silently kill interop across minor
// protocol version bumps.
warn!(%addr, "signal deserialize (unknown variant?), continuing: {e}");
}
Err(e) => { Err(e) => {
warn!(%addr, "signal recv error: {e}"); warn!(%addr, "signal recv error: {e}");
break; break;
@@ -1613,5 +1153,4 @@ async fn main() -> anyhow::Result<()> {
} }
}); });
} }
Ok(())
} }

View File

@@ -29,9 +29,6 @@ pub struct RelayMetrics {
pub session_rtt_ms: GaugeVec, pub session_rtt_ms: GaugeVec,
pub session_underruns: IntCounterVec, pub session_underruns: IntCounterVec,
pub session_overruns: IntCounterVec, pub session_overruns: IntCounterVec,
// Phase 4: loss-recovery breakdown per session.
pub session_dred_reconstructions: IntCounterVec,
pub session_classical_plc: IntCounterVec,
registry: Registry, registry: Registry,
} }
@@ -133,23 +130,6 @@ impl RelayMetrics {
) )
.expect("metric"); .expect("metric");
let session_dred_reconstructions = IntCounterVec::new(
Opts::new(
"wzp_relay_session_dred_reconstructions_total",
"Frames reconstructed via DRED (Deep REDundancy) per session",
),
&["session_id"],
)
.expect("metric");
let session_classical_plc = IntCounterVec::new(
Opts::new(
"wzp_relay_session_classical_plc_total",
"Frames filled via classical Opus/Codec2 PLC per session",
),
&["session_id"],
)
.expect("metric");
registry.register(Box::new(active_sessions.clone())).expect("register"); registry.register(Box::new(active_sessions.clone())).expect("register");
registry.register(Box::new(active_rooms.clone())).expect("register"); registry.register(Box::new(active_rooms.clone())).expect("register");
registry.register(Box::new(packets_forwarded.clone())).expect("register"); registry.register(Box::new(packets_forwarded.clone())).expect("register");
@@ -167,8 +147,6 @@ impl RelayMetrics {
registry.register(Box::new(session_rtt_ms.clone())).expect("register"); registry.register(Box::new(session_rtt_ms.clone())).expect("register");
registry.register(Box::new(session_underruns.clone())).expect("register"); registry.register(Box::new(session_underruns.clone())).expect("register");
registry.register(Box::new(session_overruns.clone())).expect("register"); registry.register(Box::new(session_overruns.clone())).expect("register");
registry.register(Box::new(session_dred_reconstructions.clone())).expect("register");
registry.register(Box::new(session_classical_plc.clone())).expect("register");
Self { Self {
active_sessions, active_sessions,
@@ -188,8 +166,6 @@ impl RelayMetrics {
session_rtt_ms, session_rtt_ms,
session_underruns, session_underruns,
session_overruns, session_overruns,
session_dred_reconstructions,
session_classical_plc,
registry, registry,
} }
} }
@@ -241,39 +217,6 @@ impl RelayMetrics {
} }
} }
/// Phase 4: update per-session loss-recovery counters from a client's
/// `LossRecoveryUpdate` signal message. The client sends monotonic
/// totals (frames reconstructed since call start); we compute the
/// delta against the current Prometheus counter and increment by it.
/// IntCounterVec only increases, so a client restart that resets the
/// counter to 0 simply produces no delta until the new totals exceed
/// the Prometheus state.
pub fn update_session_loss_recovery(
&self,
session_id: &str,
dred_reconstructions: u64,
classical_plc: u64,
) {
let cur_dred = self
.session_dred_reconstructions
.with_label_values(&[session_id])
.get();
if dred_reconstructions > cur_dred {
self.session_dred_reconstructions
.with_label_values(&[session_id])
.inc_by(dred_reconstructions - cur_dred);
}
let cur_plc = self
.session_classical_plc
.with_label_values(&[session_id])
.get();
if classical_plc > cur_plc {
self.session_classical_plc
.with_label_values(&[session_id])
.inc_by(classical_plc - cur_plc);
}
}
/// Remove all per-session label values for a disconnected session. /// Remove all per-session label values for a disconnected session.
pub fn remove_session_metrics(&self, session_id: &str) { pub fn remove_session_metrics(&self, session_id: &str) {
let _ = self.session_buffer_depth.remove_label_values(&[session_id]); let _ = self.session_buffer_depth.remove_label_values(&[session_id]);
@@ -281,10 +224,6 @@ impl RelayMetrics {
let _ = self.session_rtt_ms.remove_label_values(&[session_id]); let _ = self.session_rtt_ms.remove_label_values(&[session_id]);
let _ = self.session_underruns.remove_label_values(&[session_id]); let _ = self.session_underruns.remove_label_values(&[session_id]);
let _ = self.session_overruns.remove_label_values(&[session_id]); let _ = self.session_overruns.remove_label_values(&[session_id]);
let _ = self
.session_dred_reconstructions
.remove_label_values(&[session_id]);
let _ = self.session_classical_plc.remove_label_values(&[session_id]);
} }
/// Get a reference to the underlying Prometheus registry. /// Get a reference to the underlying Prometheus registry.
@@ -479,13 +418,10 @@ mod tests {
}; };
m.update_session_quality("sess-cleanup", &report); m.update_session_quality("sess-cleanup", &report);
m.update_session_buffer("sess-cleanup", 42, 3, 1); m.update_session_buffer("sess-cleanup", 42, 3, 1);
m.update_session_loss_recovery("sess-cleanup", 17, 4);
// Verify they appear // Verify they appear
let output = m.metrics_handler(); let output = m.metrics_handler();
assert!(output.contains("sess-cleanup")); assert!(output.contains("sess-cleanup"));
assert!(output.contains("wzp_relay_session_dred_reconstructions_total"));
assert!(output.contains("wzp_relay_session_classical_plc_total"));
// Remove and verify they are gone // Remove and verify they are gone
m.remove_session_metrics("sess-cleanup"); m.remove_session_metrics("sess-cleanup");
@@ -493,55 +429,6 @@ mod tests {
assert!(!output.contains("sess-cleanup")); assert!(!output.contains("sess-cleanup"));
} }
/// Phase 4: LossRecoveryUpdate → per-session counters, monotonic delta
/// application.
#[test]
fn session_loss_recovery_monotonic_delta() {
let m = RelayMetrics::new();
let sess = "sess-dred";
// First update: 10 DRED, 2 PLC
m.update_session_loss_recovery(sess, 10, 2);
let dred1 = m
.session_dred_reconstructions
.with_label_values(&[sess])
.get();
let plc1 = m.session_classical_plc.with_label_values(&[sess]).get();
assert_eq!(dred1, 10);
assert_eq!(plc1, 2);
// Second update: 25 DRED, 5 PLC — counter advances by (15, 3)
m.update_session_loss_recovery(sess, 25, 5);
let dred2 = m
.session_dred_reconstructions
.with_label_values(&[sess])
.get();
let plc2 = m.session_classical_plc.with_label_values(&[sess]).get();
assert_eq!(dred2, 25);
assert_eq!(plc2, 5);
// Third update with LOWER values (e.g., client reset) — counters
// hold steady, no decrement.
m.update_session_loss_recovery(sess, 5, 1);
let dred3 = m
.session_dred_reconstructions
.with_label_values(&[sess])
.get();
let plc3 = m.session_classical_plc.with_label_values(&[sess]).get();
assert_eq!(dred3, 25, "counter must not decrease");
assert_eq!(plc3, 5, "counter must not decrease");
// Fourth update: client caught up and exceeded the old max.
m.update_session_loss_recovery(sess, 30, 8);
let dred4 = m
.session_dred_reconstructions
.with_label_values(&[sess])
.get();
let plc4 = m.session_classical_plc.with_label_values(&[sess]).get();
assert_eq!(dred4, 30);
assert_eq!(plc4, 8);
}
#[test] #[test]
fn metrics_increment() { fn metrics_increment() {
let m = RelayMetrics::new(); let m = RelayMetrics::new();

View File

@@ -10,7 +10,7 @@ use std::time::Duration;
use bytes::Bytes; use bytes::Bytes;
use tokio::sync::Mutex; use tokio::sync::Mutex;
use tracing::{error, info, warn}; use tracing::{debug, error, info, trace, warn};
use wzp_proto::packet::TrunkFrame; use wzp_proto::packet::TrunkFrame;
use wzp_proto::MediaTransport; use wzp_proto::MediaTransport;
@@ -483,6 +483,7 @@ async fn run_participant_plain(
); );
loop { loop {
let recv_start = std::time::Instant::now();
let pkt = match transport.recv_media().await { let pkt = match transport.recv_media().await {
Ok(Some(pkt)) => pkt, Ok(Some(pkt)) => pkt,
Ok(None) => { Ok(None) => {
@@ -837,7 +838,7 @@ mod tests {
#[test] #[test]
fn room_join_leave() { fn room_join_leave() {
let mgr = RoomManager::new(); let mut mgr = RoomManager::new();
assert_eq!(mgr.room_size("test"), 0); assert_eq!(mgr.room_size("test"), 0);
assert!(mgr.list().is_empty()); assert!(mgr.list().is_empty());
} }

View File

@@ -7,7 +7,7 @@ use std::collections::HashMap;
use std::sync::Arc; use std::sync::Arc;
use std::time::Instant; use std::time::Instant;
use tracing::info; use tracing::{info, warn};
use wzp_proto::{MediaTransport, SignalMessage}; use wzp_proto::{MediaTransport, SignalMessage};
use wzp_transport::QuinnTransport; use wzp_transport::QuinnTransport;
@@ -94,7 +94,7 @@ mod tests {
#[test] #[test]
fn register_unregister() { fn register_unregister() {
let hub = SignalHub::new(); let mut hub = SignalHub::new();
assert_eq!(hub.online_count(), 0); assert_eq!(hub.online_count(), 0);
assert!(!hub.is_online("alice")); assert!(!hub.is_online("alice"));

View File

@@ -1,311 +0,0 @@
//! Phase 4 integration test for cross-relay direct calling
//! (PRD: .taskmaster/docs/prd_phase4_cross_relay_p2p.txt).
//!
//! Drives the call-registry cross-wiring + a simulated federation
//! forward without spinning up actual relay binaries. The real
//! main-loop and dispatcher code are exercised end-to-end in
//! `reflect.rs` / `hole_punching.rs` already; this file focuses on
//! the *new* invariants Phase 4 adds:
//!
//! 1. When Relay A forwards a DirectCallOffer, its local registry
//! stashes caller_reflexive_addr and leaves peer_relay_fp
//! unset (broadcast, answer-side will identify itself).
//! 2. When Relay B's cross-relay dispatcher receives the forward,
//! its local registry stores the call with
//! peer_relay_fp = Some(relay_a_tls_fp).
//! 3. When Relay B processes the local callee's answer, it sees
//! peer_relay_fp.is_some() and MUST NOT deliver the answer via
//! local signal_hub — instead it routes through federation.
//! 4. When Relay A receives the forwarded answer via its
//! cross-relay dispatcher, it stashes callee_reflexive_addr
//! and emits a CallSetup to its local caller with
//! peer_direct_addr = callee_addr.
//! 5. Final state: Alice's CallSetup carries Bob's reflex addr,
//! Bob's CallSetup carries Alice's reflex addr — cross-wired
//! through two relays + a federation link.
use wzp_proto::{CallAcceptMode, SignalMessage};
use wzp_relay::call_registry::CallRegistry;
// ────────────────────────────────────────────────────────────────
// Simulated dispatch helpers — these reproduce the exact logic
// in main.rs without the tokio + federation boilerplate.
// ────────────────────────────────────────────────────────────────
const RELAY_A_TLS_FP: &str = "relay-A-tls-fingerprint";
const RELAY_B_TLS_FP: &str = "relay-B-tls-fingerprint";
const ALICE_ADDR: &str = "192.0.2.1:4433";
const BOB_ADDR: &str = "198.51.100.9:4433";
const RELAY_A_ADDR: &str = "203.0.113.5:4433";
const RELAY_B_ADDR: &str = "203.0.113.10:4433";
/// Helper that Alice's place_call sends.
fn alice_offer(call_id: &str) -> SignalMessage {
SignalMessage::DirectCallOffer {
caller_fingerprint: "alice".into(),
caller_alias: None,
target_fingerprint: "bob".into(),
call_id: call_id.into(),
identity_pub: [0; 32],
ephemeral_pub: [0; 32],
signature: vec![],
supported_profiles: vec![],
caller_reflexive_addr: Some(ALICE_ADDR.into()),
}
}
/// Relay A receives Alice's offer. Target Bob is not local.
/// Relay A wraps + broadcasts over federation, stashes the call
/// locally with peer_relay_fp = None (broadcast — answer-side
/// identifies itself).
fn relay_a_handle_offer(reg_a: &mut CallRegistry, offer: &SignalMessage) -> SignalMessage {
match offer {
SignalMessage::DirectCallOffer {
caller_fingerprint,
target_fingerprint,
call_id,
caller_reflexive_addr,
..
} => {
reg_a.create_call(
call_id.clone(),
caller_fingerprint.clone(),
target_fingerprint.clone(),
);
reg_a.set_caller_reflexive_addr(call_id, caller_reflexive_addr.clone());
// peer_relay_fp stays None — we don't know which peer
// will respond yet.
}
_ => panic!("not an offer"),
}
// Build the federation envelope the main loop would
// broadcast.
SignalMessage::FederatedSignalForward {
inner: Box::new(offer.clone()),
origin_relay_fp: RELAY_A_TLS_FP.into(),
}
}
/// Relay B receives a FederatedSignalForward(DirectCallOffer).
/// This is the cross-relay dispatcher task code in main.rs —
/// reproduced here for the test.
fn relay_b_handle_forwarded_offer(reg_b: &mut CallRegistry, forward: &SignalMessage) {
let (inner, origin_relay_fp) = match forward {
SignalMessage::FederatedSignalForward { inner, origin_relay_fp } => {
(inner.as_ref().clone(), origin_relay_fp.clone())
}
_ => panic!("not a forward"),
};
// Loop-prevention: drop self-sourced.
assert_ne!(origin_relay_fp, RELAY_B_TLS_FP);
let SignalMessage::DirectCallOffer {
caller_fingerprint,
target_fingerprint,
call_id,
caller_reflexive_addr,
..
} = inner
else {
panic!("inner was not DirectCallOffer");
};
// Simulated: target is local to B (Bob is registered here).
reg_b.create_call(
call_id.clone(),
caller_fingerprint,
target_fingerprint,
);
reg_b.set_caller_reflexive_addr(&call_id, caller_reflexive_addr);
reg_b.set_peer_relay_fp(&call_id, Some(origin_relay_fp));
}
/// Bob's answer — AcceptTrusted with his reflex addr.
fn bob_answer(call_id: &str) -> SignalMessage {
SignalMessage::DirectCallAnswer {
call_id: call_id.into(),
accept_mode: CallAcceptMode::AcceptTrusted,
identity_pub: None,
ephemeral_pub: None,
signature: None,
chosen_profile: None,
callee_reflexive_addr: Some(BOB_ADDR.into()),
}
}
/// Relay B handles the LOCAL callee's answer. If peer_relay_fp
/// is Some, wrap the answer in a FederatedSignalForward + emit the
/// local CallSetup to Bob. Returns the (forward_envelope,
/// bob_call_setup) pair.
fn relay_b_handle_local_answer(
reg_b: &mut CallRegistry,
answer: &SignalMessage,
) -> (SignalMessage, SignalMessage) {
let (call_id, mode, callee_addr) = match answer {
SignalMessage::DirectCallAnswer {
call_id,
accept_mode,
callee_reflexive_addr,
..
} => (call_id.clone(), *accept_mode, callee_reflexive_addr.clone()),
_ => panic!(),
};
// Stash callee addr + activate.
reg_b.set_active(&call_id, mode, format!("call-{call_id}"));
reg_b.set_callee_reflexive_addr(&call_id, callee_addr);
let call = reg_b.get(&call_id).unwrap();
let caller_addr = call.caller_reflexive_addr.clone();
let callee_addr = call.callee_reflexive_addr.clone();
assert!(
call.peer_relay_fp.is_some(),
"Relay B must know this call is cross-relay"
);
// Forward the answer back over federation.
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(answer.clone()),
origin_relay_fp: RELAY_B_TLS_FP.into(),
};
// Local CallSetup for Bob — peer_direct_addr = Alice's addr.
let setup_for_bob = SignalMessage::CallSetup {
call_id: call_id.clone(),
room: format!("call-{call_id}"),
relay_addr: RELAY_B_ADDR.into(),
peer_direct_addr: caller_addr,
};
let _ = callee_addr;
(forward, setup_for_bob)
}
/// Relay A's cross-relay dispatcher receives the forwarded answer.
/// It stashes the callee addr, forwards the raw answer to local
/// Alice, and emits a CallSetup with peer_direct_addr = Bob's addr.
fn relay_a_handle_forwarded_answer(
reg_a: &mut CallRegistry,
forward: &SignalMessage,
) -> SignalMessage {
let (inner, origin_relay_fp) = match forward {
SignalMessage::FederatedSignalForward { inner, origin_relay_fp } => {
(inner.as_ref().clone(), origin_relay_fp.clone())
}
_ => panic!("not a forward"),
};
assert_ne!(origin_relay_fp, RELAY_A_TLS_FP);
let SignalMessage::DirectCallAnswer {
call_id,
accept_mode,
callee_reflexive_addr,
..
} = inner
else {
panic!("inner was not DirectCallAnswer");
};
assert_eq!(accept_mode, CallAcceptMode::AcceptTrusted);
reg_a.set_active(&call_id, accept_mode, format!("call-{call_id}"));
reg_a.set_callee_reflexive_addr(&call_id, callee_reflexive_addr.clone());
// Alice's CallSetup — peer_direct_addr = Bob's addr.
SignalMessage::CallSetup {
call_id: call_id.clone(),
room: format!("call-{call_id}"),
relay_addr: RELAY_A_ADDR.into(),
peer_direct_addr: callee_reflexive_addr,
}
}
// ────────────────────────────────────────────────────────────────
// Tests
// ────────────────────────────────────────────────────────────────
#[test]
fn cross_relay_offer_forwards_and_stashes_peer_relay_fp() {
let mut reg_a = CallRegistry::new();
let mut reg_b = CallRegistry::new();
let offer = alice_offer("c-xrelay-1");
let forward = relay_a_handle_offer(&mut reg_a, &offer);
// Relay A's local view: call exists, caller addr stashed,
// peer_relay_fp still None (broadcast — answer identifies the
// peer).
let call_a = reg_a.get("c-xrelay-1").unwrap();
assert_eq!(call_a.caller_fingerprint, "alice");
assert_eq!(call_a.callee_fingerprint, "bob");
assert_eq!(call_a.caller_reflexive_addr.as_deref(), Some(ALICE_ADDR));
assert!(call_a.peer_relay_fp.is_none());
// Relay B dispatches the forward: creates the call locally
// and stashes peer_relay_fp = Relay A.
relay_b_handle_forwarded_offer(&mut reg_b, &forward);
let call_b = reg_b.get("c-xrelay-1").unwrap();
assert_eq!(call_b.caller_fingerprint, "alice");
assert_eq!(call_b.callee_fingerprint, "bob");
assert_eq!(call_b.caller_reflexive_addr.as_deref(), Some(ALICE_ADDR));
assert_eq!(call_b.peer_relay_fp.as_deref(), Some(RELAY_A_TLS_FP));
}
#[test]
fn cross_relay_answer_crosswires_peer_direct_addrs() {
let mut reg_a = CallRegistry::new();
let mut reg_b = CallRegistry::new();
// Full round trip: offer → forward → dispatch → answer →
// forward back → dispatch → both CallSetups.
let offer = alice_offer("c-xrelay-2");
let offer_forward = relay_a_handle_offer(&mut reg_a, &offer);
relay_b_handle_forwarded_offer(&mut reg_b, &offer_forward);
// Bob answers on Relay B.
let answer = bob_answer("c-xrelay-2");
let (answer_forward, setup_for_bob) =
relay_b_handle_local_answer(&mut reg_b, &answer);
// Bob's CallSetup carries Alice's addr.
match setup_for_bob {
SignalMessage::CallSetup { peer_direct_addr, relay_addr, .. } => {
assert_eq!(peer_direct_addr.as_deref(), Some(ALICE_ADDR));
assert_eq!(relay_addr, RELAY_B_ADDR);
}
_ => panic!("wrong variant"),
}
// Alice's dispatcher receives the forwarded answer and builds
// her CallSetup.
let setup_for_alice = relay_a_handle_forwarded_answer(&mut reg_a, &answer_forward);
match setup_for_alice {
SignalMessage::CallSetup { peer_direct_addr, relay_addr, .. } => {
assert_eq!(peer_direct_addr.as_deref(), Some(BOB_ADDR));
assert_eq!(relay_addr, RELAY_A_ADDR);
}
_ => panic!("wrong variant"),
}
// Both registries agree on caller + callee reflex addrs after
// the full round-trip.
for reg in [&reg_a, &reg_b] {
let c = reg.get("c-xrelay-2").unwrap();
assert_eq!(c.caller_reflexive_addr.as_deref(), Some(ALICE_ADDR));
assert_eq!(c.callee_reflexive_addr.as_deref(), Some(BOB_ADDR));
}
}
#[test]
fn cross_relay_loop_prevention_drops_self_sourced_forward() {
// A FederatedSignalForward that circles back to the origin
// relay should be dropped before it hits the call registry.
let forward = SignalMessage::FederatedSignalForward {
inner: Box::new(alice_offer("c-loop")),
origin_relay_fp: RELAY_B_TLS_FP.into(),
};
// The dispatcher in main.rs calls this explicit check before
// doing any work. Reproduce it inline.
let origin = match &forward {
SignalMessage::FederatedSignalForward { origin_relay_fp, .. } => origin_relay_fp.clone(),
_ => unreachable!(),
};
// Relay B sees origin == its own fp → drop.
assert_eq!(origin, RELAY_B_TLS_FP, "loop-prevention triggers on self-fp");
}

View File

@@ -63,11 +63,11 @@ async fn handshake_succeeds() {
accept_handshake(server_t.as_ref(), &callee_seed).await accept_handshake(server_t.as_ref(), &callee_seed).await
}); });
let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed, None) let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed)
.await .await
.expect("perform_handshake should succeed"); .expect("perform_handshake should succeed");
let (callee_session, chosen_profile, _caller_fp, _caller_alias) = callee_handle let (callee_session, chosen_profile) = callee_handle
.await .await
.expect("join callee task") .expect("join callee task")
.expect("accept_handshake should succeed"); .expect("accept_handshake should succeed");
@@ -124,11 +124,11 @@ async fn handshake_verifies_identity() {
accept_handshake(server_t.as_ref(), &callee_seed).await accept_handshake(server_t.as_ref(), &callee_seed).await
}); });
let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed, None) let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed)
.await .await
.expect("handshake must succeed even with different identities"); .expect("handshake must succeed even with different identities");
let (callee_session, _profile, _caller_fp, _caller_alias) = callee_handle let (callee_session, _profile) = callee_handle
.await .await
.expect("join") .expect("join")
.expect("accept_handshake must succeed"); .expect("accept_handshake must succeed");
@@ -183,7 +183,7 @@ async fn auth_then_handshake() {
}; };
// 2. Run the cryptographic handshake // 2. Run the cryptographic handshake
let (session, profile, _caller_fp, _caller_alias) = accept_handshake(server_t.as_ref(), &callee_seed) let (session, profile) = accept_handshake(server_t.as_ref(), &callee_seed)
.await .await
.expect("accept_handshake after auth"); .expect("accept_handshake after auth");
@@ -199,7 +199,7 @@ async fn auth_then_handshake() {
.await .await
.expect("send AuthToken"); .expect("send AuthToken");
let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed, None) let caller_session = perform_handshake(client_transport.as_ref(), &caller_seed)
.await .await
.expect("perform_handshake after auth"); .expect("perform_handshake after auth");
@@ -270,7 +270,6 @@ async fn handshake_rejects_bad_signature() {
ephemeral_pub, ephemeral_pub,
signature, signature,
supported_profiles: vec![wzp_proto::QualityProfile::GOOD], supported_profiles: vec![wzp_proto::QualityProfile::GOOD],
alias: None,
}; };
client_transport client_transport

View File

@@ -1,288 +0,0 @@
//! Phase 3 integration tests for hole-punching advertising
//! (PRD: .taskmaster/docs/prd_hole_punching.txt).
//!
//! These verify the end-to-end protocol cross-wiring:
//! caller (places offer with caller_reflexive_addr=A)
//! → relay (stashes A in registry)
//! → callee (reads A off the forwarded offer)
//! callee (sends AcceptTrusted answer with callee_reflexive_addr=B)
//! → relay (stashes B, emits CallSetup to both parties)
//! → caller receives CallSetup.peer_direct_addr = B
//! → callee receives CallSetup.peer_direct_addr = A
//!
//! The actual QUIC hole-punch race is a Phase 3.5 follow-up.
//! These tests only cover the signal-plane plumbing — that the
//! addrs make it from each peer's offer/answer through the relay
//! cross-wiring back out in CallSetup with the peer's addr.
//!
//! We drive the call registry + a minimal routing function
//! directly instead of spinning up a full relay process — easier
//! to reason about, no real network, and what we actually want to
//! test is the cross-wiring logic, not the whole signal stack.
use wzp_proto::{CallAcceptMode, SignalMessage};
use wzp_relay::call_registry::CallRegistry;
/// Helper: simulate the relay's handling of a DirectCallOffer. In
/// `wzp-relay/src/main.rs` this is the match arm that creates the
/// call in the registry and stashes the caller's reflex addr.
fn handle_offer(reg: &mut CallRegistry, offer: &SignalMessage) -> String {
match offer {
SignalMessage::DirectCallOffer {
caller_fingerprint,
target_fingerprint,
call_id,
caller_reflexive_addr,
..
} => {
reg.create_call(
call_id.clone(),
caller_fingerprint.clone(),
target_fingerprint.clone(),
);
reg.set_caller_reflexive_addr(call_id, caller_reflexive_addr.clone());
call_id.clone()
}
_ => panic!("not an offer"),
}
}
/// Helper: simulate the relay's handling of a DirectCallAnswer +
/// the subsequent CallSetup emission. Returns the two CallSetup
/// messages the relay would push: (for_caller, for_callee).
fn handle_answer_and_build_setups(
reg: &mut CallRegistry,
answer: &SignalMessage,
) -> (SignalMessage, SignalMessage) {
let (call_id, mode, callee_addr) = match answer {
SignalMessage::DirectCallAnswer {
call_id,
accept_mode,
callee_reflexive_addr,
..
} => (call_id.clone(), *accept_mode, callee_reflexive_addr.clone()),
_ => panic!("not an answer"),
};
reg.set_callee_reflexive_addr(&call_id, callee_addr);
let room = format!("call-{call_id}");
reg.set_active(&call_id, mode, room.clone());
let (caller_addr, callee_addr) = {
let c = reg.get(&call_id).unwrap();
(
c.caller_reflexive_addr.clone(),
c.callee_reflexive_addr.clone(),
)
};
let setup_for_caller = SignalMessage::CallSetup {
call_id: call_id.clone(),
room: room.clone(),
relay_addr: "203.0.113.5:4433".into(),
peer_direct_addr: callee_addr,
};
let setup_for_callee = SignalMessage::CallSetup {
call_id,
room,
relay_addr: "203.0.113.5:4433".into(),
peer_direct_addr: caller_addr,
};
(setup_for_caller, setup_for_callee)
}
fn mk_offer(call_id: &str, caller_reflexive_addr: Option<&str>) -> SignalMessage {
SignalMessage::DirectCallOffer {
caller_fingerprint: "alice".into(),
caller_alias: None,
target_fingerprint: "bob".into(),
call_id: call_id.into(),
identity_pub: [0; 32],
ephemeral_pub: [0; 32],
signature: vec![],
supported_profiles: vec![],
caller_reflexive_addr: caller_reflexive_addr.map(String::from),
}
}
fn mk_answer(
call_id: &str,
mode: CallAcceptMode,
callee_reflexive_addr: Option<&str>,
) -> SignalMessage {
SignalMessage::DirectCallAnswer {
call_id: call_id.into(),
accept_mode: mode,
identity_pub: None,
ephemeral_pub: None,
signature: None,
chosen_profile: None,
callee_reflexive_addr: callee_reflexive_addr.map(String::from),
}
}
// -----------------------------------------------------------------------
// Test 1: both peers advertise — CallSetup cross-wires correctly
// -----------------------------------------------------------------------
#[test]
fn both_peers_advertise_reflex_addrs_cross_wire_in_setup() {
let mut reg = CallRegistry::new();
let caller_addr = "192.0.2.1:4433";
let callee_addr = "198.51.100.9:4433";
let offer = mk_offer("c1", Some(caller_addr));
let call_id = handle_offer(&mut reg, &offer);
assert_eq!(call_id, "c1");
assert_eq!(
reg.get("c1").unwrap().caller_reflexive_addr.as_deref(),
Some(caller_addr)
);
let answer = mk_answer("c1", CallAcceptMode::AcceptTrusted, Some(callee_addr));
let (setup_caller, setup_callee) =
handle_answer_and_build_setups(&mut reg, &answer);
// The CALLER's setup should carry the CALLEE's addr as peer_direct_addr.
match setup_caller {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert_eq!(
peer_direct_addr.as_deref(),
Some(callee_addr),
"caller's CallSetup must contain callee's addr"
);
}
_ => panic!("wrong variant"),
}
// The CALLEE's setup should carry the CALLER's addr.
match setup_callee {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert_eq!(
peer_direct_addr.as_deref(),
Some(caller_addr),
"callee's CallSetup must contain caller's addr"
);
}
_ => panic!("wrong variant"),
}
}
// -----------------------------------------------------------------------
// Test 2: callee uses AcceptGeneric (privacy) — no addr leaks
// -----------------------------------------------------------------------
#[test]
fn privacy_mode_answer_omits_callee_addr_from_setup() {
let mut reg = CallRegistry::new();
let caller_addr = "192.0.2.1:4433";
handle_offer(&mut reg, &mk_offer("c2", Some(caller_addr)));
// AcceptGeneric explicitly passes None for callee_reflexive_addr —
// the whole point is to hide the callee's IP from the caller.
let answer = mk_answer("c2", CallAcceptMode::AcceptGeneric, None);
let (setup_caller, setup_callee) =
handle_answer_and_build_setups(&mut reg, &answer);
// CALLER should see peer_direct_addr = None (privacy preserved).
match setup_caller {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert!(
peer_direct_addr.is_none(),
"privacy mode must not leak callee addr to caller"
);
}
_ => panic!("wrong variant"),
}
// CALLEE still gets the caller's addr — only the callee opted for
// privacy, the caller already volunteered its addr in the offer.
match setup_callee {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert_eq!(
peer_direct_addr.as_deref(),
Some(caller_addr),
"callee's CallSetup should still carry caller's volunteered addr"
);
}
_ => panic!("wrong variant"),
}
}
// -----------------------------------------------------------------------
// Test 3: old caller (no addr) + new callee — relay path only
// -----------------------------------------------------------------------
#[test]
fn pre_phase3_caller_leaves_both_setups_relay_only() {
let mut reg = CallRegistry::new();
// Pre-Phase-3 client doesn't know about caller_reflexive_addr
// so the field is None.
handle_offer(&mut reg, &mk_offer("c3", None));
// New callee advertises its addr — doesn't matter because
// without caller_reflexive_addr the caller has nothing to
// attempt a direct handshake to, so the cross-wiring should
// still leave the caller's CallSetup without peer_direct_addr.
let answer = mk_answer(
"c3",
CallAcceptMode::AcceptTrusted,
Some("198.51.100.9:4433"),
);
let (setup_caller, setup_callee) =
handle_answer_and_build_setups(&mut reg, &answer);
match setup_caller {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
// Phase 3 relay behavior: we always inject whatever
// addrs are in the registry, regardless of who
// advertised. The caller here gets the callee's addr
// because the callee did advertise.
assert_eq!(peer_direct_addr.as_deref(), Some("198.51.100.9:4433"));
}
_ => panic!("wrong variant"),
}
// The callee's setup has no caller addr (pre-Phase-3 offer).
match setup_callee {
SignalMessage::CallSetup { peer_direct_addr, .. } => {
assert!(
peer_direct_addr.is_none(),
"callee should see no caller addr when offer was pre-Phase-3"
);
}
_ => panic!("wrong variant"),
}
}
// -----------------------------------------------------------------------
// Test 4: neither side advertises — both CallSetups fall back cleanly
// -----------------------------------------------------------------------
#[test]
fn neither_peer_advertises_both_setups_are_relay_only() {
let mut reg = CallRegistry::new();
handle_offer(&mut reg, &mk_offer("c4", None));
let answer = mk_answer("c4", CallAcceptMode::AcceptTrusted, None);
let (setup_caller, setup_callee) =
handle_answer_and_build_setups(&mut reg, &answer);
for (label, setup) in [("caller", setup_caller), ("callee", setup_callee)] {
match setup {
SignalMessage::CallSetup { peer_direct_addr, relay_addr, .. } => {
assert!(
peer_direct_addr.is_none(),
"{label}'s CallSetup must have no peer_direct_addr"
);
// Relay addr is always filled — that's the fallback
// path and the existing behavior.
assert!(!relay_addr.is_empty(), "{label} relay_addr must be set");
}
_ => panic!("wrong variant"),
}
}
}

View File

@@ -1,228 +0,0 @@
//! Phase 2 integration tests for multi-relay NAT reflection
//! (PRD: .taskmaster/docs/prd_multi_relay_reflect.txt).
//!
//! These spin up one or two mock relays that implement the full
//! pre-reflect dance — RegisterPresence → RegisterPresenceAck →
//! Reflect → ReflectResponse — which is what the transient
//! probe helper in `wzp_client::reflect::probe_reflect_addr` does
//! against a real relay.
//!
//! Test matrix:
//! 1. `probe_reflect_addr_happy_path`
//! — single mock relay, assert the probe helper returns the
//! observed addr as 127.0.0.1:<client ephemeral port>
//! 2. `detect_nat_type_two_loopback_relays_is_cone`
//! — two mock relays, one client; loopback single-host means
//! every probe sees the same (127.0.0.1, same_port) so the
//! classifier returns `Cone` + a consensus addr
//! 3. `detect_nat_type_dead_relay_is_unknown`
//! — one alive relay + one dead address; aggregator returns
//! `Unknown` with a non-empty `error` field on the failed
//! probe
use std::net::{Ipv4Addr, SocketAddr};
use std::sync::Arc;
use std::time::Duration;
use wzp_client::reflect::{detect_nat_type, probe_reflect_addr, NatType};
use wzp_proto::{MediaTransport, SignalMessage};
use wzp_transport::{create_endpoint, server_config, QuinnTransport};
/// Minimal mock relay that loops accepting connections, handles
/// RegisterPresence + Reflect, and responds correctly. Mirrors the
/// two match arms from `wzp-relay/src/main.rs` that matter here.
///
/// Each accepted connection gets its own inner task so multiple
/// simultaneous probes work.
async fn spawn_mock_relay() -> (SocketAddr, tokio::task::JoinHandle<()>) {
let _ = rustls::crypto::ring::default_provider().install_default();
let (sc, _cert_der) = server_config();
let bind: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let endpoint = create_endpoint(bind, Some(sc)).expect("server endpoint");
let listen_addr = endpoint.local_addr().expect("local_addr");
let handle = tokio::spawn(async move {
loop {
// Accept the next incoming connection. `wzp_transport::accept`
// returns the established `quinn::Connection`.
let conn = match wzp_transport::accept(&endpoint).await {
Ok(c) => c,
Err(_) => break, // endpoint closed
};
let observed_addr = conn.remote_address();
let transport = Arc::new(QuinnTransport::new(conn));
// Per-connection handler. Keep servicing messages until
// the peer closes so one probe connection can do
// RegisterPresence → Ack → Reflect → Response without
// racing other incoming connections.
let t = transport;
tokio::spawn(async move {
loop {
match t.recv_signal().await {
Ok(Some(SignalMessage::RegisterPresence { .. })) => {
let _ = t
.send_signal(&SignalMessage::RegisterPresenceAck {
success: true,
error: None,
})
.await;
}
Ok(Some(SignalMessage::Reflect)) => {
let _ = t
.send_signal(&SignalMessage::ReflectResponse {
observed_addr: observed_addr.to_string(),
})
.await;
}
Ok(Some(_other)) => { /* ignore */ }
Ok(None) => break,
Err(_) => break,
}
}
});
}
});
(listen_addr, handle)
}
// -----------------------------------------------------------------------
// Test 1: probe_reflect_addr against a single mock relay
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn probe_reflect_addr_happy_path() {
let (relay_addr, _relay_handle) = spawn_mock_relay().await;
let (observed, latency_ms) = tokio::time::timeout(
Duration::from_secs(3),
probe_reflect_addr(relay_addr, 2000),
)
.await
.expect("probe must complete within 3s")
.expect("probe must succeed");
assert_eq!(
observed.ip().to_string(),
"127.0.0.1",
"loopback test should see 127.0.0.1"
);
assert_ne!(observed.port(), 0, "observed port must be non-zero");
// Latency on same host is dominated by the handshake — generously
// allow up to 2s (the timeout) rather than picking a tight number
// that would be flaky on busy CI runners.
assert!(latency_ms < 2000, "latency {latency_ms}ms too high");
}
// -----------------------------------------------------------------------
// Test 2: two loopback relays → Cone classification
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn detect_nat_type_two_loopback_relays_is_cone() {
let (addr_a, _h_a) = spawn_mock_relay().await;
let (addr_b, _h_b) = spawn_mock_relay().await;
let detection = detect_nat_type(
vec![
("RelayA".into(), addr_a),
("RelayB".into(), addr_b),
],
2000,
)
.await;
assert_eq!(detection.probes.len(), 2);
for p in &detection.probes {
assert!(p.observed_addr.is_some(), "probe {:?} failed: {:?}", p.relay_name, p.error);
}
// Loopback single-host: every probe sees 127.0.0.1 and, crucially,
// uses a different ephemeral source port (since probe_reflect_addr
// spins up a fresh quinn::Endpoint per probe). Wait — that makes
// this look like Symmetric to the classifier, not Cone!
//
// The classifier cares about the *observed* addr, which is what
// the relay sees as the client's source. Two different client
// endpoints on loopback → two different observed ports → the
// classifier correctly labels this as SymmetricPort in the test
// environment. That's still a valid verification of the
// plumbing, just not of the Cone classification.
//
// Accept either Cone OR SymmetricPort for this test, then
// assert the more specific invariant that matters: both probes
// returned the same observed IP.
let observed_ips: Vec<String> = detection
.probes
.iter()
.map(|p| {
p.observed_addr
.as_ref()
.and_then(|s| s.parse::<SocketAddr>().ok())
.map(|a| a.ip().to_string())
.unwrap_or_default()
})
.collect();
assert_eq!(observed_ips[0], "127.0.0.1");
assert_eq!(observed_ips[1], "127.0.0.1");
// Either classification is valid on loopback (see long comment
// above). Explicitly assert the set so a future refactor that
// accidentally returns `Multiple` or `Unknown` fails the test.
assert!(
matches!(detection.nat_type, NatType::Cone | NatType::SymmetricPort),
"expected Cone or SymmetricPort on loopback, got {:?}",
detection.nat_type
);
}
// -----------------------------------------------------------------------
// Test 3: one alive relay + one dead address → Unknown
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn detect_nat_type_dead_relay_is_unknown() {
let (alive_addr, _alive_handle) = spawn_mock_relay().await;
// Dead relay: a port that nothing is listening on. OS will drop
// the packets, the probe should time out within the 600ms budget
// we give it. Pick a port unlikely to be in use — port 1 on
// loopback works on every OS I care about and fails fast.
let dead_addr: SocketAddr = "127.0.0.1:1".parse().unwrap();
let detection = detect_nat_type(
vec![
("Alive".into(), alive_addr),
("Dead".into(), dead_addr),
],
600, // tight timeout so the dead probe fails fast
)
.await;
assert_eq!(detection.probes.len(), 2);
// Find the alive and dead probes by name (order of JoinSet
// completions is not guaranteed).
let alive = detection.probes.iter().find(|p| p.relay_name == "Alive").unwrap();
let dead = detection.probes.iter().find(|p| p.relay_name == "Dead").unwrap();
assert!(
alive.observed_addr.is_some(),
"alive probe must succeed: {:?}",
alive.error
);
assert!(
dead.observed_addr.is_none(),
"dead probe must fail, got addr {:?}",
dead.observed_addr
);
assert!(
dead.error.is_some(),
"dead probe must surface an error string"
);
// With only 1 successful probe, the classifier returns Unknown.
assert_eq!(detection.nat_type, NatType::Unknown);
assert!(detection.consensus_addr.is_none());
}

View File

@@ -1,318 +0,0 @@
//! Integration tests for the "STUN for QUIC" reflect protocol
//! (PRD: .taskmaster/docs/prd_reflect_over_quic.txt, Phase 1).
//!
//! We don't spin up the full relay binary — instead we exercise the
//! same wire-level request/response dance with a mock relay loop
//! that implements exactly the match arm added to
//! `wzp-relay/src/main.rs`. This isolates the protocol test from the
//! rest of the relay state (rooms, federation, call registry, ...).
//!
//! Three test cases:
//! 1. `reflect_happy_path` — client sends `Reflect`, mock relay
//! replies with `ReflectResponse { observed_addr }`, client
//! parses it back to a `SocketAddr` and confirms the IP is
//! `127.0.0.1` and the port matches its own bound port.
//! 2. `reflect_two_clients_distinct_ports` — two simultaneous
//! client connections on different ephemeral ports get back
//! different reflected ports, proving the relay uses
//! per-connection `remote_address` rather than a global.
//! 3. `reflect_old_relay_times_out` — mock relay that *doesn't*
//! handle `Reflect`; client side times out in the expected
//! window and does not hang.
//!
//! The third test uses a `tokio::time::timeout` wrapper directly
//! (the client-side `request_reflect` helper lives in
//! `desktop/src-tauri/src/lib.rs` which isn't a library we can
//! depend on from here, so we reproduce the timeout semantics
//! inline).
use std::net::{Ipv4Addr, SocketAddr};
use std::sync::Arc;
use std::time::Duration;
use wzp_proto::{MediaTransport, SignalMessage};
use wzp_transport::{client_config, create_endpoint, server_config, QuinnTransport};
/// Spawn a minimal mock relay that loops over `recv_signal`,
/// matches on `Reflect`, and responds with `ReflectResponse` using
/// the remote_address observed for this connection. Mirrors the
/// match arm in `crates/wzp-relay/src/main.rs`.
async fn spawn_mock_relay_with_reflect(
server_transport: Arc<QuinnTransport>,
) -> tokio::task::JoinHandle<()> {
tokio::spawn(async move {
// Observed remote address at the time the connection was
// accepted. Stable for the life of the connection under quinn's
// normal operation. This is exactly what the real relay does.
let observed = server_transport.connection().remote_address();
loop {
match server_transport.recv_signal().await {
Ok(Some(SignalMessage::Reflect)) => {
let resp = SignalMessage::ReflectResponse {
observed_addr: observed.to_string(),
};
// If the send fails the client has gone; just exit.
if server_transport.send_signal(&resp).await.is_err() {
break;
}
}
Ok(Some(_other)) => {
// Ignore anything else — not relevant to this test.
}
Ok(None) => break,
Err(_e) => break,
}
}
})
}
/// Spawn a mock relay that intentionally DOES NOT handle Reflect.
/// Models a pre-Phase-1 relay — it keeps reading signal messages and
/// logs them to stderr, but never produces a `ReflectResponse`.
async fn spawn_mock_relay_without_reflect(
server_transport: Arc<QuinnTransport>,
) -> tokio::task::JoinHandle<()> {
tokio::spawn(async move {
loop {
match server_transport.recv_signal().await {
Ok(Some(_msg)) => {
// Deliberately do nothing. Old relay.
}
Ok(None) => break,
Err(_) => break,
}
}
})
}
/// Build an in-process QUIC client/server pair on loopback and
/// return (client_transport, server_transport, endpoints). The
/// endpoints tuple must be kept alive for the test duration.
///
/// `client_port_hint` of 0 means "let OS pick". Pass an explicit
/// port to pin the client's source port (useful for the
/// distinct-ports test).
async fn connected_pair_with_port(
_client_port_hint: u16,
) -> (Arc<QuinnTransport>, Arc<QuinnTransport>, (quinn::Endpoint, quinn::Endpoint)) {
let _ = rustls::crypto::ring::default_provider().install_default();
let (sc, _cert_der) = server_config();
let server_addr: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let server_ep = create_endpoint(server_addr, Some(sc)).expect("server endpoint");
let server_listen = server_ep.local_addr().expect("server local addr");
// Always bind the client to an ephemeral port — we'll read back
// the actual assigned port via `local_addr()` in the assertions.
let client_bind: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let client_ep = create_endpoint(client_bind, None).expect("client endpoint");
let server_ep_clone = server_ep.clone();
let accept_fut = tokio::spawn(async move {
let conn = wzp_transport::accept(&server_ep_clone).await.expect("accept");
Arc::new(QuinnTransport::new(conn))
});
let client_conn =
wzp_transport::connect(&client_ep, server_listen, "localhost", client_config())
.await
.expect("connect");
let client_transport = Arc::new(QuinnTransport::new(client_conn));
let server_transport = accept_fut.await.expect("join accept task");
(client_transport, server_transport, (server_ep, client_ep))
}
// -----------------------------------------------------------------------
// Test 1: happy path — client learns its own port via Reflect
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reflect_happy_path() {
let (client_transport, server_transport, (_server_ep, client_ep)) =
connected_pair_with_port(0).await;
// Grab the client's actual bound port so we can cross-check
// against the reflected response.
let client_port = client_ep
.local_addr()
.expect("client local addr")
.port();
assert_ne!(client_port, 0, "client must have a real bound port");
// Start the mock relay's reflect handler.
let _relay_handle = spawn_mock_relay_with_reflect(Arc::clone(&server_transport)).await;
// Client sends Reflect and awaits the response. The real
// request_reflect helper in desktop/src-tauri/src/lib.rs uses a
// oneshot channel driven off the spawned recv loop; here we just
// do it inline because there's no spawned loop yet in this test
// — this isolates the wire protocol from the client-side state
// machine.
client_transport
.send_signal(&SignalMessage::Reflect)
.await
.expect("send Reflect");
let resp = tokio::time::timeout(Duration::from_secs(2), client_transport.recv_signal())
.await
.expect("reflect response should arrive within 2s")
.expect("recv_signal ok")
.expect("some message");
let observed_addr = match resp {
SignalMessage::ReflectResponse { observed_addr } => observed_addr,
other => panic!("expected ReflectResponse, got {:?}", std::mem::discriminant(&other)),
};
let parsed: SocketAddr = observed_addr
.parse()
.expect("ReflectResponse.observed_addr must parse as SocketAddr");
// The relay should see the client on 127.0.0.1 (loopback in the
// test harness) and on the client's bound ephemeral port.
assert_eq!(parsed.ip().to_string(), "127.0.0.1");
assert_eq!(
parsed.port(),
client_port,
"reflected port must match the client's local_addr port"
);
drop(client_transport);
drop(server_transport);
}
// -----------------------------------------------------------------------
// Test 2: two clients get DIFFERENT reflected ports
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn reflect_two_clients_distinct_ports() {
let _ = rustls::crypto::ring::default_provider().install_default();
// Shared server: one endpoint, two incoming accepts.
let (sc, _cert_der) = server_config();
let server_addr: SocketAddr = (Ipv4Addr::LOCALHOST, 0).into();
let server_ep = create_endpoint(server_addr, Some(sc)).expect("server endpoint");
let server_listen = server_ep.local_addr().expect("server local addr");
// Accept two clients in parallel.
let server_ep_a = server_ep.clone();
let accept_a = tokio::spawn(async move {
let conn = wzp_transport::accept(&server_ep_a).await.expect("accept A");
Arc::new(QuinnTransport::new(conn))
});
let server_ep_b = server_ep.clone();
let accept_b = tokio::spawn(async move {
let conn = wzp_transport::accept(&server_ep_b).await.expect("accept B");
Arc::new(QuinnTransport::new(conn))
});
// Client A
let client_ep_a = create_endpoint((Ipv4Addr::LOCALHOST, 0).into(), None).expect("ep A");
let conn_a =
wzp_transport::connect(&client_ep_a, server_listen, "localhost", client_config())
.await
.expect("connect A");
let client_a = Arc::new(QuinnTransport::new(conn_a));
let port_a = client_ep_a.local_addr().unwrap().port();
// Client B
let client_ep_b = create_endpoint((Ipv4Addr::LOCALHOST, 0).into(), None).expect("ep B");
let conn_b =
wzp_transport::connect(&client_ep_b, server_listen, "localhost", client_config())
.await
.expect("connect B");
let client_b = Arc::new(QuinnTransport::new(conn_b));
let port_b = client_ep_b.local_addr().unwrap().port();
assert_ne!(
port_a, port_b,
"preconditions: OS must assign two clients different ephemeral ports"
);
let server_a = accept_a.await.expect("join A");
let server_b = accept_b.await.expect("join B");
// Spawn a reflect handler for each server-side transport.
let _relay_a = spawn_mock_relay_with_reflect(Arc::clone(&server_a)).await;
let _relay_b = spawn_mock_relay_with_reflect(Arc::clone(&server_b)).await;
// Each client requests reflect concurrently.
let reflect_for = |t: Arc<QuinnTransport>| async move {
t.send_signal(&SignalMessage::Reflect).await.expect("send");
let resp = tokio::time::timeout(Duration::from_secs(2), t.recv_signal())
.await
.expect("timeout")
.expect("ok")
.expect("some");
match resp {
SignalMessage::ReflectResponse { observed_addr } => observed_addr,
_ => panic!("wrong variant"),
}
};
let (addr_a, addr_b) = tokio::join!(reflect_for(client_a.clone()), reflect_for(client_b.clone()));
let parsed_a: SocketAddr = addr_a.parse().unwrap();
let parsed_b: SocketAddr = addr_b.parse().unwrap();
assert_eq!(parsed_a.port(), port_a, "client A's reflected port");
assert_eq!(parsed_b.port(), port_b, "client B's reflected port");
assert_ne!(
parsed_a.port(),
parsed_b.port(),
"each client must see its own port, not a shared one"
);
drop(client_a);
drop(client_b);
drop(server_a);
drop(server_b);
}
// -----------------------------------------------------------------------
// Test 3: old relay never answers — client times out cleanly
// -----------------------------------------------------------------------
#[tokio::test(flavor = "multi_thread", worker_threads = 2)]
async fn reflect_old_relay_times_out() {
let (client_transport, server_transport, _endpoints) =
connected_pair_with_port(0).await;
// Mock relay that ignores Reflect — simulates a pre-Phase-1 build.
let _relay_handle =
spawn_mock_relay_without_reflect(Arc::clone(&server_transport)).await;
client_transport
.send_signal(&SignalMessage::Reflect)
.await
.expect("send Reflect");
// 1100ms ceiling matches the 1s timeout baked into
// get_reflected_address plus a tiny bit of slack. If this
// regression ever fires it probably means recv_signal blocked
// longer than expected and the Tauri command would hang the UI.
let start = std::time::Instant::now();
let result =
tokio::time::timeout(Duration::from_millis(1100), client_transport.recv_signal()).await;
let elapsed = start.elapsed();
assert!(
result.is_err(),
"recv_signal must time out when the relay ignores Reflect"
);
assert!(
elapsed >= Duration::from_millis(1000),
"timeout fired too early ({:?})",
elapsed
);
assert!(
elapsed < Duration::from_millis(1200),
"timeout fired too late ({:?}), client would feel unresponsive",
elapsed
);
drop(client_transport);
drop(server_transport);
}

View File

@@ -27,8 +27,3 @@ pub use connection::{accept, connect, create_endpoint};
pub use path_monitor::PathMonitor; pub use path_monitor::PathMonitor;
pub use quic::QuinnTransport; pub use quic::QuinnTransport;
pub use wzp_proto::{MediaTransport, PathQuality, TransportError}; pub use wzp_proto::{MediaTransport, PathQuality, TransportError};
// Re-export the quinn Endpoint type so downstream crates (wzp-desktop) can
// thread a shared endpoint between signaling and media connections without
// needing to depend on quinn directly.
pub use quinn::Endpoint;

View File

@@ -53,13 +53,6 @@ pub async fn recv_signal(recv: &mut quinn::RecvStream) -> Result<SignalMessage,
.await .await
.map_err(|e| TransportError::Internal(format!("stream read payload error: {e}")))?; .map_err(|e| TransportError::Internal(format!("stream read payload error: {e}")))?;
serde_json::from_slice(&payload).map_err(|e| { serde_json::from_slice(&payload)
// Distinguish serde failures from transport failures so the .map_err(|e| TransportError::Internal(format!("signal deserialize error: {e}")))
// caller (relay main loop, client recv loop) can continue on
// unknown-variant / parse errors instead of tearing down the
// whole signal connection. Forward-compat: adding a new
// `SignalMessage` variant in one side must not break the
// other side's signal connection.
TransportError::Deserialize(format!("{e}"))
})
} }

View File

@@ -1,16 +0,0 @@
{
"name": "wzp-wasm",
"type": "module",
"description": "WarzonePhone WASM bindings — FEC (RaptorQ) + crypto (ChaCha20-Poly1305, X25519)",
"version": "0.1.0",
"files": [
"wzp_wasm_bg.wasm",
"wzp_wasm.js",
"wzp_wasm.d.ts"
],
"main": "wzp_wasm.js",
"types": "wzp_wasm.d.ts",
"sideEffects": [
"./snippets/*"
]
}

View File

@@ -1,169 +0,0 @@
/* tslint:disable */
/* eslint-disable */
/**
* Symmetric encryption session using ChaCha20-Poly1305.
*
* Mirrors `wzp-crypto::session::ChaChaSession` for WASM. Nonce derivation
* and key setup are identical so WASM and native peers interoperate.
*/
export class WzpCryptoSession {
free(): void;
[Symbol.dispose](): void;
/**
* Decrypt a media payload with AAD.
*
* Returns plaintext on success, or throws on auth failure.
*/
decrypt(header_aad: Uint8Array, ciphertext: Uint8Array): Uint8Array;
/**
* Encrypt a media payload with AAD (typically the 12-byte MediaHeader).
*
* Returns `ciphertext || poly1305_tag` (plaintext.len() + 16 bytes).
*/
encrypt(header_aad: Uint8Array, plaintext: Uint8Array): Uint8Array;
/**
* Create from a 32-byte shared secret (output of `WzpKeyExchange.derive_shared_secret`).
*/
constructor(shared_secret: Uint8Array);
/**
* Current receive sequence number (for diagnostics / UI stats).
*/
recv_seq(): number;
/**
* Current send sequence number (for diagnostics / UI stats).
*/
send_seq(): number;
}
export class WzpFecDecoder {
free(): void;
[Symbol.dispose](): void;
/**
* Feed a received symbol.
*
* Returns the decoded block (concatenated original frames, unpadded) if
* enough symbols have been received to recover the block, or `undefined`.
*/
add_symbol(block_id: number, symbol_idx: number, _is_repair: boolean, data: Uint8Array): Uint8Array | undefined;
/**
* Create a new FEC decoder.
*
* * `block_size` — expected number of source symbols per block.
* * `symbol_size` — padded byte size of each symbol (must match encoder).
*/
constructor(block_size: number, symbol_size: number);
}
export class WzpFecEncoder {
free(): void;
[Symbol.dispose](): void;
/**
* Add a source symbol (audio frame).
*
* Returns encoded packets (all source + repair) when the block is complete,
* or `undefined` if the block is still accumulating.
*
* Each returned packet carries the 3-byte header:
* `[block_id][symbol_idx][is_repair]` followed by `symbol_size` bytes.
*/
add_symbol(data: Uint8Array): Uint8Array | undefined;
/**
* Force-flush the current (possibly partial) block.
*
* Returns all source + repair symbols with headers, or empty vec if no
* symbols have been accumulated.
*/
flush(): Uint8Array;
/**
* Create a new FEC encoder.
*
* * `block_size` — number of source symbols (audio frames) per FEC block.
* * `symbol_size` — padded byte size of each symbol (default 256).
*/
constructor(block_size: number, symbol_size: number);
}
/**
* X25519 key exchange: generate ephemeral keypair and derive shared secret.
*
* Usage from JS:
* ```js
* const kx = new WzpKeyExchange();
* const ourPub = kx.public_key(); // Uint8Array(32)
* // ... send ourPub to peer, receive peerPub ...
* const secret = kx.derive_shared_secret(peerPub); // Uint8Array(32)
* const session = new WzpCryptoSession(secret);
* ```
*/
export class WzpKeyExchange {
free(): void;
[Symbol.dispose](): void;
/**
* Derive a 32-byte session key from the peer's public key.
*
* Raw DH output is expanded via HKDF-SHA256 with info="warzone-session-key",
* matching `wzp-crypto::handshake::WarzoneKeyExchange::derive_session`.
*/
derive_shared_secret(peer_public: Uint8Array): Uint8Array;
/**
* Generate a new random X25519 keypair.
*/
constructor();
/**
* Our public key (32 bytes).
*/
public_key(): Uint8Array;
}
export type InitInput = RequestInfo | URL | Response | BufferSource | WebAssembly.Module;
export interface InitOutput {
readonly memory: WebAssembly.Memory;
readonly __wbg_wzpcryptosession_free: (a: number, b: number) => void;
readonly __wbg_wzpfecdecoder_free: (a: number, b: number) => void;
readonly __wbg_wzpfecencoder_free: (a: number, b: number) => void;
readonly __wbg_wzpkeyexchange_free: (a: number, b: number) => void;
readonly wzpcryptosession_decrypt: (a: number, b: number, c: number, d: number, e: number) => [number, number, number, number];
readonly wzpcryptosession_encrypt: (a: number, b: number, c: number, d: number, e: number) => [number, number, number, number];
readonly wzpcryptosession_new: (a: number, b: number) => [number, number, number];
readonly wzpcryptosession_recv_seq: (a: number) => number;
readonly wzpcryptosession_send_seq: (a: number) => number;
readonly wzpfecdecoder_add_symbol: (a: number, b: number, c: number, d: number, e: number, f: number) => [number, number];
readonly wzpfecdecoder_new: (a: number, b: number) => number;
readonly wzpfecencoder_add_symbol: (a: number, b: number, c: number) => [number, number];
readonly wzpfecencoder_flush: (a: number) => [number, number];
readonly wzpfecencoder_new: (a: number, b: number) => number;
readonly wzpkeyexchange_derive_shared_secret: (a: number, b: number, c: number) => [number, number, number, number];
readonly wzpkeyexchange_new: () => number;
readonly wzpkeyexchange_public_key: (a: number) => [number, number];
readonly __wbindgen_exn_store: (a: number) => void;
readonly __externref_table_alloc: () => number;
readonly __wbindgen_externrefs: WebAssembly.Table;
readonly __wbindgen_malloc: (a: number, b: number) => number;
readonly __externref_table_dealloc: (a: number) => void;
readonly __wbindgen_free: (a: number, b: number, c: number) => void;
readonly __wbindgen_start: () => void;
}
export type SyncInitInput = BufferSource | WebAssembly.Module;
/**
* Instantiates the given `module`, which can either be bytes or
* a precompiled `WebAssembly.Module`.
*
* @param {{ module: SyncInitInput }} module - Passing `SyncInitInput` directly is deprecated.
*
* @returns {InitOutput}
*/
export function initSync(module: { module: SyncInitInput } | SyncInitInput): InitOutput;
/**
* If `module_or_path` is {RequestInfo} or {URL}, makes a request and
* for everything else, calls `WebAssembly.instantiate` directly.
*
* @param {{ module_or_path: InitInput | Promise<InitInput> }} module_or_path - Passing `InitInput` directly is deprecated.
*
* @returns {Promise<InitOutput>}
*/
export default function __wbg_init (module_or_path?: { module_or_path: InitInput | Promise<InitInput> } | InitInput | Promise<InitInput>): Promise<InitOutput>;

View File

@@ -1,27 +0,0 @@
/* tslint:disable */
/* eslint-disable */
export const memory: WebAssembly.Memory;
export const __wbg_wzpcryptosession_free: (a: number, b: number) => void;
export const __wbg_wzpfecdecoder_free: (a: number, b: number) => void;
export const __wbg_wzpfecencoder_free: (a: number, b: number) => void;
export const __wbg_wzpkeyexchange_free: (a: number, b: number) => void;
export const wzpcryptosession_decrypt: (a: number, b: number, c: number, d: number, e: number) => [number, number, number, number];
export const wzpcryptosession_encrypt: (a: number, b: number, c: number, d: number, e: number) => [number, number, number, number];
export const wzpcryptosession_new: (a: number, b: number) => [number, number, number];
export const wzpcryptosession_recv_seq: (a: number) => number;
export const wzpcryptosession_send_seq: (a: number) => number;
export const wzpfecdecoder_add_symbol: (a: number, b: number, c: number, d: number, e: number, f: number) => [number, number];
export const wzpfecdecoder_new: (a: number, b: number) => number;
export const wzpfecencoder_add_symbol: (a: number, b: number, c: number) => [number, number];
export const wzpfecencoder_flush: (a: number) => [number, number];
export const wzpfecencoder_new: (a: number, b: number) => number;
export const wzpkeyexchange_derive_shared_secret: (a: number, b: number, c: number) => [number, number, number, number];
export const wzpkeyexchange_new: () => number;
export const wzpkeyexchange_public_key: (a: number) => [number, number];
export const __wbindgen_exn_store: (a: number) => void;
export const __externref_table_alloc: () => number;
export const __wbindgen_externrefs: WebAssembly.Table;
export const __wbindgen_malloc: (a: number, b: number) => number;
export const __externref_table_dealloc: (a: number) => void;
export const __wbindgen_free: (a: number, b: number, c: number) => void;
export const __wbindgen_start: () => void;

2
desktop/.gitignore vendored
View File

@@ -1,2 +0,0 @@
node_modules/
dist/

View File

@@ -1,8 +0,0 @@
{
"hash": "9046c0bf",
"configHash": "ef0fc96f",
"lockfileHash": "d66891b1",
"browserHash": "8171ed59",
"optimized": {},
"chunks": {}
}

View File

@@ -1,3 +0,0 @@
{
"type": "module"
}

View File

@@ -1,276 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta
name="viewport"
content="width=device-width, initial-scale=1.0, maximum-scale=1.0, minimum-scale=1.0, user-scalable=no, viewport-fit=cover"
/>
<title>WarzonePhone</title>
<link rel="stylesheet" href="/src/style.css" />
</head>
<body>
<div id="app">
<!-- Connect screen -->
<div id="connect-screen">
<h1>WarzonePhone</h1>
<p class="subtitle">Encrypted Voice</p>
<div class="form">
<label>Relay
<button id="relay-selected" class="relay-selected" type="button">
<span id="relay-dot" class="dot"></span>
<span id="relay-label">Select relay...</span>
<span class="arrow">&#9881;</span>
</button>
</label>
<label>Room
<input id="room" type="text" value="general" />
</label>
<label>Alias
<input id="alias" type="text" placeholder="your name" />
</label>
<div class="form-row">
<label class="checkbox">
<input id="os-aec" type="checkbox" checked />
OS Echo Cancel
</label>
<button id="settings-btn-home" class="icon-btn" title="Settings (Cmd+,)">&#9881;</button>
</div>
<!-- Mode toggle -->
<div class="mode-toggle" style="display:flex;gap:8px;margin-bottom:8px;">
<button id="mode-room" class="mode-btn active" style="flex:1">Room</button>
<button id="mode-direct" class="mode-btn" style="flex:1">Direct Call</button>
</div>
<!-- Room mode (default) -->
<div id="room-mode">
<button id="connect-btn" class="primary">Connect</button>
</div>
<!-- Direct call mode -->
<div id="direct-mode" class="hidden">
<button id="register-btn" class="primary" style="background:#2196F3">Register on Relay</button>
<div id="direct-registered" class="hidden" style="margin-top:12px">
<div class="direct-registered-header">
<p style="color:var(--green);font-size:13px;margin:0">&#x2705; Registered — waiting for calls</p>
<button id="deregister-btn" class="secondary-btn small">Deregister</button>
</div>
<div id="incoming-call-panel" class="hidden" style="background:#1B5E20;padding:12px;border-radius:8px;margin:8px 0">
<p style="font-weight:bold;margin:0 0 4px 0">Incoming Call</p>
<p id="incoming-caller" style="font-size:12px;opacity:0.8;margin:0 0 8px 0">From: unknown</p>
<div style="display:flex;gap:8px">
<button id="accept-call-btn" style="flex:1;background:var(--green);color:white;border:none;padding:8px;border-radius:6px;cursor:pointer">Accept</button>
<button id="reject-call-btn" style="flex:1;background:var(--red);color:white;border:none;padding:8px;border-radius:6px;cursor:pointer">Reject</button>
</div>
</div>
<!-- Recent contacts -->
<div id="recent-contacts-section" class="hidden">
<div class="history-header">Recent contacts</div>
<div id="recent-contacts-list" class="history-list"></div>
</div>
<!-- Call history -->
<div id="call-history-section" class="hidden">
<div class="history-header">
History
<button id="clear-history-btn" class="link-btn">clear</button>
</div>
<div id="call-history-list" class="history-list"></div>
</div>
<label style="margin-top:8px">Call by fingerprint
<input id="target-fp" type="text" placeholder="xxxx:xxxx:xxxx:..." />
</label>
<button id="call-btn" class="primary" style="margin-top:8px">Call</button>
<p id="call-status-text" style="color:var(--yellow);font-size:13px;margin-top:4px"></p>
</div>
</div>
<p id="connect-error" class="error"></p>
</div>
<div class="identity-info">
<span id="my-identicon"></span>
<span id="my-fingerprint" class="fp-display"></span>
</div>
<div class="recent-rooms" id="recent-rooms"></div>
</div>
<!-- In-call screen -->
<div id="call-screen" class="hidden">
<div class="call-header">
<div class="call-header-row">
<div id="room-name" class="room-name"></div>
<button id="settings-btn-call" class="icon-btn small" title="Settings (Cmd+,)">&#9881;</button>
</div>
<div class="call-meta">
<span id="call-status" class="status-dot"></span>
<span id="call-timer" class="call-timer">0:00</span>
</div>
</div>
<div class="level-meter">
<div id="level-bar" class="level-bar-fill"></div>
</div>
<div id="participants" class="participants"></div>
<div class="controls">
<button id="mic-btn" class="control-btn" title="Toggle Mic (m)">
<span class="icon" id="mic-icon">Mic</span>
</button>
<button id="hangup-btn" class="control-btn hangup" title="Hang Up (q)">
<span class="icon">End</span>
</button>
<button id="spk-btn" class="control-btn" title="Toggle Speaker (s)">
<span class="icon" id="spk-icon">Spk</span>
</button>
</div>
<div id="stats" class="stats"></div>
</div>
<!-- Settings panel -->
<div id="settings-panel" class="hidden">
<div class="settings-card">
<div class="settings-header">
<h2>Settings</h2>
<button id="settings-close" class="icon-btn">&times;</button>
</div>
<div class="settings-section">
<h3>Connection</h3>
<label>Default Room
<input id="s-room" type="text" />
</label>
<label>Alias
<input id="s-alias" type="text" />
</label>
</div>
<div class="settings-section">
<h3>Audio</h3>
<div class="quality-control">
<div class="quality-header">
<span class="setting-label">QUALITY</span>
<span id="s-quality-label" class="quality-label">Auto</span>
</div>
<input id="s-quality" type="range" min="0" max="7" step="1" value="3" class="quality-slider" />
<div class="quality-ticks">
<span>64k</span>
<span>48k</span>
<span>32k</span>
<span>Auto</span>
<span>24k</span>
<span>6k</span>
<span>C2</span>
<span>1.2k</span>
</div>
</div>
<label class="checkbox">
<input id="s-os-aec" type="checkbox" />
OS Echo Cancellation (macOS VoiceProcessingIO)
</label>
<label class="checkbox">
<input id="s-agc" type="checkbox" checked />
Automatic Gain Control
</label>
<label class="checkbox">
<input id="s-dred-debug" type="checkbox" />
DRED debug logs (verbose, dev only)
</label>
<label class="checkbox">
<input id="s-call-debug" type="checkbox" />
Call flow debug logs (trace every step of a call)
</label>
</div>
<div class="settings-section" id="s-call-debug-section" style="display:none">
<h3>Call Debug Log</h3>
<div id="s-call-debug-log" style="max-height:220px;overflow-y:auto;background:#0a0a0a;color:#e0e0e0;font-family:ui-monospace,Menlo,Monaco,'Courier New',monospace;font-size:10px;padding:6px;border-radius:4px;line-height:1.4;white-space:pre-wrap"></div>
<button id="s-call-debug-clear" class="secondary-btn" style="margin-top:6px">Clear log</button>
<small style="color:var(--text-dim);display:block;margin-top:4px">
Rolling buffer of the last 200 call-flow events. Turned off by
default — the GUI overlay only populates when the checkbox above
is on, but logcat (adb) always keeps a copy regardless.
</small>
</div>
<div class="settings-section">
<h3>Identity</h3>
<div class="setting-row">
<span class="setting-label">Fingerprint</span>
<span id="s-fingerprint" class="fp-display-large"></span>
</div>
<div class="setting-row">
<span class="setting-label">Identity file</span>
<span class="fp-display">~/.wzp/identity</span>
</div>
</div>
<div class="settings-section">
<h3>Network</h3>
<div class="setting-row">
<span class="setting-label">Public address</span>
<span id="s-reflected-addr" class="fp-display">(not queried)</span>
<button id="s-reflect-btn" class="secondary-btn">Detect</button>
</div>
<small style="color:var(--text-dim);display:block;margin-top:4px">
Asks the registered relay to echo back the IP:port it sees for this
connection (QUIC-native NAT reflection, replaces STUN).
</small>
<div class="setting-row" style="margin-top:10px">
<span class="setting-label">NAT type</span>
<span id="s-nat-type" class="fp-display">(not detected)</span>
<button id="s-nat-detect-btn" class="secondary-btn">Detect NAT</button>
</div>
<div id="s-nat-probes" style="margin-top:6px;font-size:11px;color:var(--text-dim)"></div>
<small style="color:var(--text-dim);display:block;margin-top:4px">
Probes every configured relay in parallel and compares the results
to classify the NAT: cone (P2P viable), symmetric (must relay),
multiple, or unknown.
</small>
</div>
<div class="settings-section">
<h3>Recent Rooms</h3>
<div id="s-recent-rooms" class="recent-rooms-list"></div>
<button id="s-clear-recent" class="secondary-btn">Clear History</button>
</div>
<button id="settings-save" class="primary">Save</button>
</div>
</div>
<!-- Manage Relays dialog -->
<div id="relay-dialog" class="hidden">
<div class="settings-card relay-dialog-card">
<div class="settings-header">
<h2>Manage Relays</h2>
<button id="relay-dialog-close" class="icon-btn">&times;</button>
</div>
<div id="relay-dialog-list" class="relay-dialog-list"></div>
<div class="relay-add-row">
<div class="relay-add-inputs">
<input id="relay-add-name" type="text" placeholder="Name" />
<input id="relay-add-addr" type="text" placeholder="host:port" />
</div>
<button id="relay-add-btn" class="primary">Add Relay</button>
</div>
</div>
</div>
<!-- Key changed warning dialog -->
<div id="key-warning" class="hidden">
<div class="settings-card key-warning-card">
<div class="key-warning-icon">&#9888;</div>
<h2>Server Key Changed</h2>
<p class="key-warning-text">The relay's identity has changed since you last connected. This usually happens when the server was restarted, but could also indicate a security issue.</p>
<div class="key-warning-fps">
<div class="key-fp-row">
<span class="key-fp-label">Previously known</span>
<code id="kw-old-fp" class="key-fp"></code>
</div>
<div class="key-fp-row">
<span class="key-fp-label">New key</span>
<code id="kw-new-fp" class="key-fp"></code>
</div>
</div>
<div class="key-warning-actions">
<button id="kw-accept" class="primary">Accept New Key</button>
<button id="kw-cancel" class="secondary-btn">Cancel</button>
</div>
</div>
</div>
</div>
<script type="module" src="/src/main.ts"></script>
</body>
</html>

1350
desktop/package-lock.json generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,19 +0,0 @@
{
"name": "wzp-desktop",
"private": true,
"version": "0.1.0",
"type": "module",
"scripts": {
"dev": "vite",
"build": "vite build",
"tauri": "tauri"
},
"dependencies": {
"@tauri-apps/api": "^2"
},
"devDependencies": {
"typescript": "^5",
"vite": "^6",
"@tauri-apps/cli": "^2"
}
}

View File

@@ -1,107 +0,0 @@
[package]
name = "wzp-desktop"
version = "0.1.0"
edition = "2024"
description = "WarzonePhone Desktop — encrypted VoIP client"
default-run = "wzp-desktop"
# Library target — required for Tauri mobile (Android/iOS link the app as a cdylib)
# and also used by the desktop binary below.
#
# `staticlib` was DROPPED from crate-type because rust-lang/rust#104707
# documents that having staticlib alongside cdylib leaks non-exported
# symbols from staticlibs into the cdylib. Bionic's private `__init_tcb`
# / `pthread_create` symbols end up bound LOCALLY inside our .so instead
# of resolved dynamically against libc.so at dlopen time — which crashes
# at launch as soon as tao tries to std::thread::spawn() from the JNI
# onCreate callback. The legacy wzp-android crate uses ["cdylib", "rlib"]
# and runs fine on the same phone with the same NDK + Rust toolchain.
#
# iOS Tauri builds that actually need staticlib can re-add it behind a
# target cfg if we ever ship on iOS.
[lib]
name = "wzp_desktop_lib"
crate-type = ["cdylib", "rlib"]
[[bin]]
name = "wzp-desktop"
path = "src/main.rs"
[build-dependencies]
tauri-build = { version = "2", features = [] }
# cc is no longer needed — all C++ moved to crates/wzp-native (built with
# cargo-ndk and loaded via libloading at runtime). wzp-desktop's .so on
# Android is now pure Rust.
[dependencies]
tauri = { version = "2", features = [] }
tauri-plugin-shell = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["full"] }
tracing = "0.1"
tracing-subscriber = "0.3"
anyhow = "1"
rustls = { version = "0.23", default-features = false, features = ["ring", "std"] }
# WarzonePhone crates — protocol layer is platform-independent
wzp-proto = { path = "../../crates/wzp-proto" }
wzp-codec = { path = "../../crates/wzp-codec" }
wzp-fec = { path = "../../crates/wzp-fec" }
wzp-crypto = { path = "../../crates/wzp-crypto" }
wzp-transport = { path = "../../crates/wzp-transport" }
# wzp-client pulls in CPAL on every desktop target and, additionally on
# macOS, VoiceProcessingIO (coreaudio-rs behind the "vpio" feature). The
# vpio feature MUST NOT be enabled on Windows / Linux because coreaudio-rs
# is Apple-framework-only and will fail to build. Task #24 will add a
# matching Windows Voice Capture DSP path behind its own feature; until
# then, Windows desktops use plain CPAL with AEC disabled.
# macOS: CPAL + VoiceProcessingIO (hardware AEC via Core Audio).
[target.'cfg(target_os = "macos")'.dependencies]
wzp-client = { path = "../../crates/wzp-client", features = ["audio", "vpio"] }
# Windows: CPAL for playback + direct WASAPI for capture with OS-level
# AEC (AudioCategory_Communications). The wzp-client `windows-aec`
# feature swaps the default CPAL AudioCapture for a WASAPI one that
# opens the mic under AudioCategory_Communications, turning on Windows's
# communications audio processing chain (AEC, NS, AGC). The reference
# signal for AEC is the system render mix, so echo from our CPAL
# playback is cancelled automatically without extra plumbing.
[target.'cfg(target_os = "windows")'.dependencies]
wzp-client = { path = "../../crates/wzp-client", features = ["audio", "windows-aec"] }
# Linux: CPAL playback+capture baseline. AEC is enabled via the top-level
# `linux-aec` feature in wzp-desktop, which forwards to wzp-client/linux-aec.
# Keeping it opt-in at the wzp-desktop level (rather than forcing it always
# on here) lets `cargo tauri build` produce two variants from the same
# source tree — a noAEC baseline and an AEC build — by toggling the feature
# at build time: `cargo tauri build -- --features wzp-desktop/linux-aec`.
[target.'cfg(target_os = "linux")'.dependencies]
wzp-client = { path = "../../crates/wzp-client", features = ["audio"] }
# Android: no CPAL, no vpio — audio goes through the standalone wzp-native
# cdylib that we dlopen via libloading at runtime. See the wzp_native
# module in src/.
[target.'cfg(target_os = "android")'.dependencies]
wzp-client = { path = "../../crates/wzp-client", default-features = false }
# libloading: runtime dlopen of libwzp_native.so — the standalone cdylib
# crate that owns all C++ (Oboe bridge). Keeps wzp-desktop's .so free of
# any C/C++ static archives that would otherwise leak bionic's internal
# pthread_create into our cdylib and trigger the __init_tcb crash.
libloading = "0.8"
# jni + ndk-context: called from android_audio.rs to invoke
# AudioManager.setSpeakerphoneOn on the JVM side at runtime, so the
# Oboe playout stream (opened with Usage::VoiceCommunication) can route
# between earpiece and loud speaker without restarting.
jni = "0.21"
ndk-context = "0.1"
[features]
default = ["custom-protocol"]
custom-protocol = ["tauri/custom-protocol"]
# linux-aec: forwards to wzp-client/linux-aec so `cargo tauri build -- --features
# wzp-desktop/linux-aec` enables the WebRTC AEC3 backend on Linux. No-op on
# other targets because wzp-client/linux-aec is itself cfg(target_os = "linux").
linux-aec = ["wzp-client/linux-aec"]

View File

@@ -1,21 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<!--
Custom Info.plist keys merged into the bundled WarzonePhone.app by
tauri-bundler. The base Info.plist (CFBundleIdentifier, version,
etc.) is generated from tauri.conf.json — only put *additional*
keys here.
NSMicrophoneUsageDescription is required by macOS TCC for any
app that opens an audio input unit. Without this string the OS
silently denies CoreAudio capture (input callbacks return zeros)
and the app never appears in System Settings → Privacy &
Security → Microphone. This was the root cause of the desktop
mic regression where phones could not hear the desktop client.
-->
<key>NSMicrophoneUsageDescription</key>
<string>WarzonePhone needs microphone access to transmit your voice during calls.</string>
</dict>
</plist>

View File

@@ -1,26 +0,0 @@
use std::process::Command;
fn main() {
// Capture short git hash so the running app can prove which build it is.
// Falls back to "unknown" if git isn't available (e.g. when building from
// a tarball without a .git dir).
let git_hash = Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.ok()
.filter(|o| o.status.success())
.and_then(|o| String::from_utf8(o.stdout).ok())
.map(|s| s.trim().to_string())
.unwrap_or_else(|| "unknown".into());
println!("cargo:rustc-env=WZP_GIT_HASH={git_hash}");
println!("cargo:rerun-if-changed=../../.git/HEAD");
println!("cargo:rerun-if-changed=../../.git/refs/heads");
// No cc::Build of ANY kind on Android — all C++ lives in the standalone
// `wzp-native` crate which is built separately with cargo-ndk and loaded
// via libloading at runtime. See docs/incident-tauri-android-init-tcb.md
// for why this split exists.
tauri_build::build()
}

View File

@@ -1,26 +0,0 @@
{
"$schema": "../gen/schemas/desktop-schema.json",
"identifier": "default",
"description": "Default capability — grants core APIs (events, path, window, app, clipboard) to the main window on every platform we ship to.",
"windows": ["main"],
"platforms": [
"linux",
"macOS",
"windows",
"android",
"iOS"
],
"permissions": [
"core:default",
"core:event:default",
"core:event:allow-listen",
"core:event:allow-unlisten",
"core:event:allow-emit",
"core:event:allow-emit-to",
"core:path:default",
"core:window:default",
"core:app:default",
"core:webview:default",
"shell:default"
]
}

View File

@@ -1,40 +0,0 @@
<?xml version="1.0" encoding="utf-8"?>
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
<uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-feature android:name="android.hardware.microphone" android:required="true" />
<!-- AndroidTV support -->
<uses-feature android:name="android.software.leanback" android:required="false" />
<application
android:icon="@mipmap/ic_launcher"
android:label="@string/app_name"
android:theme="@style/Theme.wzp_desktop"
android:usesCleartextTraffic="${usesCleartextTraffic}">
<activity
android:configChanges="orientation|keyboardHidden|keyboard|screenSize|locale|smallestScreenSize|screenLayout|uiMode"
android:launchMode="singleTask"
android:label="@string/main_activity_title"
android:name=".MainActivity"
android:exported="true">
<intent-filter>
<action android:name="android.intent.action.MAIN" />
<category android:name="android.intent.category.LAUNCHER" />
<!-- AndroidTV support -->
<category android:name="android.intent.category.LEANBACK_LAUNCHER" />
</intent-filter>
</activity>
<provider
android:name="androidx.core.content.FileProvider"
android:authorities="${applicationId}.fileprovider"
android:exported="false"
android:grantUriPermissions="true">
<meta-data
android:name="android.support.FILE_PROVIDER_PATHS"
android:resource="@xml/file_paths" />
</provider>
</application>
</manifest>

View File

@@ -1,101 +0,0 @@
package com.wzp.desktop
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.media.AudioManager
import android.os.Bundle
import android.util.Log
import androidx.activity.enableEdgeToEdge
import androidx.core.app.ActivityCompat
import androidx.core.content.ContextCompat
class MainActivity : TauriActivity() {
companion object {
private const val TAG = "WzpMainActivity"
private const val AUDIO_PERMISSIONS_REQUEST = 4242
private val REQUIRED_AUDIO_PERMISSIONS = arrayOf(
Manifest.permission.RECORD_AUDIO,
Manifest.permission.MODIFY_AUDIO_SETTINGS
)
}
override fun onCreate(savedInstanceState: Bundle?) {
enableEdgeToEdge()
super.onCreate(savedInstanceState)
// Request RECORD_AUDIO early so Oboe (inside libwzp_native.so) can open
// the AAudio input stream without silently failing. The grant is
// persisted, so after the first launch the dialog no longer appears.
// MODIFY_AUDIO_SETTINGS is needed to switch AudioManager mode + speaker.
val needsRequest = REQUIRED_AUDIO_PERMISSIONS.any {
ContextCompat.checkSelfPermission(this, it) != PackageManager.PERMISSION_GRANTED
}
if (needsRequest) {
Log.i(TAG, "requesting audio permissions")
ActivityCompat.requestPermissions(this, REQUIRED_AUDIO_PERMISSIONS, AUDIO_PERMISSIONS_REQUEST)
} else {
Log.i(TAG, "audio permissions already granted")
configureAudioForCall()
}
}
override fun onRequestPermissionsResult(
requestCode: Int,
permissions: Array<String>,
grantResults: IntArray
) {
super.onRequestPermissionsResult(requestCode, permissions, grantResults)
if (requestCode == AUDIO_PERMISSIONS_REQUEST) {
val allGranted = grantResults.isNotEmpty() &&
grantResults.all { it == PackageManager.PERMISSION_GRANTED }
Log.i(TAG, "audio permissions result: allGranted=$allGranted grants=${grantResults.toList()}")
if (allGranted) {
configureAudioForCall()
}
}
}
/**
* Put the phone into VoIP call mode with handset (earpiece) as the
* default output. The Oboe playout stream is opened with
* Usage::VoiceCommunication which honours this routing, so:
*
* MODE_IN_COMMUNICATION + speakerphoneOn=false → earpiece (handset)
* MODE_IN_COMMUNICATION + speakerphoneOn=true → loudspeaker
* MODE_IN_COMMUNICATION + bluetoothScoOn=true → bluetooth headset
*
* The speaker/handset/BT toggle itself is wired up via the Tauri
* command `set_speakerphone(on)` in a follow-up build. For now the
* default is handset, matching the user's stated preference.
*
* STREAM_VOICE_CALL volume is cranked to max since the in-call volume
* slider is separate from media volume on most devices.
*/
private fun configureAudioForCall() {
try {
val am = getSystemService(Context.AUDIO_SERVICE) as AudioManager
Log.i(TAG, "audio state before: mode=${am.mode} speaker=${am.isSpeakerphoneOn} " +
"voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/" +
"${am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)} " +
"musicVol=${am.getStreamVolume(AudioManager.STREAM_MUSIC)}/" +
"${am.getStreamMaxVolume(AudioManager.STREAM_MUSIC)}")
am.mode = AudioManager.MODE_IN_COMMUNICATION
am.isSpeakerphoneOn = false // default: handset / earpiece
// Crank both voice-call and music volumes so nothing silent slips
// through regardless of which stream actually ends up driving.
val maxVoice = am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)
am.setStreamVolume(AudioManager.STREAM_VOICE_CALL, maxVoice, 0)
val maxMusic = am.getStreamMaxVolume(AudioManager.STREAM_MUSIC)
am.setStreamVolume(AudioManager.STREAM_MUSIC, maxMusic, 0)
Log.i(TAG, "audio state after: mode=${am.mode} speaker=${am.isSpeakerphoneOn} " +
"voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/$maxVoice " +
"musicVol=${am.getStreamVolume(AudioManager.STREAM_MUSIC)}/$maxMusic")
} catch (e: Throwable) {
Log.e(TAG, "configureAudioForCall failed: ${e.message}", e)
}
}
}

File diff suppressed because one or more lines are too long

View File

@@ -1 +0,0 @@
{"default":{"identifier":"default","description":"Default capability — grants core APIs (events, path, window, app, clipboard) to the main window on every platform we ship to.","local":true,"windows":["main"],"permissions":["core:default","core:event:default","core:event:allow-listen","core:event:allow-unlisten","core:event:allow-emit","core:event:allow-emit-to","core:path:default","core:window:default","core:app:default","core:webview:default","shell:default"],"platforms":["linux","macOS","windows","android","iOS"]}}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

Binary file not shown.

Before

Width:  |  Height:  |  Size: 2.0 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 104 B

View File

@@ -1,98 +0,0 @@
//! Runtime bridge to Android's `AudioManager` for in-call audio routing.
//!
//! We own a quinn+Oboe VoIP pipeline entirely from Rust, but routing the
//! playout stream between earpiece / loudspeaker / Bluetooth headset has to
//! happen at the JVM level because those toggles are AudioManager-only.
//! This module uses the global JavaVM handle that `ndk_context` exposes
//! (populated by Tauri's mobile runtime) + the `jni` crate to reach into
//! the Android framework without needing a Tauri plugin.
//!
//! All callers must be inside an Android target (`#[cfg(target_os = "android")]`).
#![cfg(target_os = "android")]
use jni::objects::{JObject, JString, JValue};
use jni::JavaVM;
/// Grab the JavaVM + current Activity from the ndk_context that Tauri's
/// mobile runtime sets up at process startup.
fn jvm_and_activity() -> Result<(JavaVM, JObject<'static>), String> {
let ctx = ndk_context::android_context();
let vm_ptr = ctx.vm() as *mut jni::sys::JavaVM;
if vm_ptr.is_null() {
return Err("ndk_context: JavaVM pointer is null".into());
}
let vm = unsafe { JavaVM::from_raw(vm_ptr) }
.map_err(|e| format!("JavaVM::from_raw: {e}"))?;
let activity_ptr = ctx.context() as jni::sys::jobject;
if activity_ptr.is_null() {
return Err("ndk_context: activity pointer is null".into());
}
// SAFETY: ndk_context guarantees the pointer lives for the process
// lifetime; we wrap it as a JObject<'static> for convenience.
let activity: JObject<'static> = unsafe { JObject::from_raw(activity_ptr) };
Ok((vm, activity))
}
/// Get Android's `AudioManager` via `activity.getSystemService("audio")`.
fn audio_manager<'local>(
env: &mut jni::AttachGuard<'local>,
activity: &JObject<'local>,
) -> Result<JObject<'local>, String> {
let svc_name: JString<'local> = env
.new_string("audio")
.map_err(|e| format!("new_string(audio): {e}"))?;
let am = env
.call_method(
activity,
"getSystemService",
"(Ljava/lang/String;)Ljava/lang/Object;",
&[JValue::Object(&svc_name)],
)
.and_then(|v| v.l())
.map_err(|e| format!("getSystemService(audio): {e}"))?;
if am.is_null() {
return Err("getSystemService returned null".into());
}
Ok(am)
}
/// Switch between loud speaker (`true`) and earpiece/handset (`false`).
///
/// Calls `AudioManager.setSpeakerphoneOn(on)` on the JVM. Requires that
/// the audio mode is already `MODE_IN_COMMUNICATION` — MainActivity.kt
/// sets this at startup, so by the time a call is up this is always true.
pub fn set_speakerphone(on: bool) -> Result<(), String> {
let (vm, activity) = jvm_and_activity()?;
let mut env = vm
.attach_current_thread()
.map_err(|e| format!("attach_current_thread: {e}"))?;
let am = audio_manager(&mut env, &activity)?;
env.call_method(
&am,
"setSpeakerphoneOn",
"(Z)V",
&[JValue::Bool(if on { 1 } else { 0 })],
)
.map_err(|e| format!("setSpeakerphoneOn({on}): {e}"))?;
tracing::info!(on, "AudioManager.setSpeakerphoneOn");
Ok(())
}
/// Query the current speakerphone state. Returns true if routing is on the
/// loud speaker, false if on earpiece / BT headset / wired headset.
pub fn is_speakerphone_on() -> Result<bool, String> {
let (vm, activity) = jvm_and_activity()?;
let mut env = vm
.attach_current_thread()
.map_err(|e| format!("attach_current_thread: {e}"))?;
let am = audio_manager(&mut env, &activity)?;
let on = env
.call_method(&am, "isSpeakerphoneOn", "()Z", &[])
.and_then(|v| v.z())
.map_err(|e| format!("isSpeakerphoneOn: {e}"))?;
Ok(on)
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,180 +0,0 @@
//! Call history store.
//!
//! Keeps a rolling JSON file of the last N direct-call events so the UI can
//! show "recent contacts" + "call history with callback buttons" on the
//! direct-call screen. Storage lives in `<APP_DATA_DIR>/call_history.json`
//! alongside the identity file. The file is read lazily on first access and
//! cached in an RwLock behind a OnceLock.
//!
//! This is a v1 — no duration tracking yet, entries are logged at the
//! moment the direction is decided (placed / received / missed).
use std::path::PathBuf;
use std::sync::{OnceLock, RwLock};
use std::time::{SystemTime, UNIX_EPOCH};
use serde::{Deserialize, Serialize};
/// Maximum number of history entries we keep. Older ones are pruned FIFO.
const MAX_ENTRIES: usize = 200;
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "lowercase")]
pub enum CallDirection {
/// Local user placed the call.
Placed,
/// Remote user called and local user answered.
Received,
/// Remote user called but local user did not answer (rejected or
/// missed entirely — the UI treats these identically).
Missed,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CallHistoryEntry {
pub call_id: String,
pub peer_fp: String,
pub peer_alias: Option<String>,
pub direction: CallDirection,
/// Seconds since UNIX epoch, UTC.
pub timestamp_unix: u64,
}
// ─── In-process store (loaded from disk once) ─────────────────────────────
static STORE: OnceLock<RwLock<Vec<CallHistoryEntry>>> = OnceLock::new();
fn store() -> &'static RwLock<Vec<CallHistoryEntry>> {
STORE.get_or_init(|| RwLock::new(load_from_disk()))
}
fn history_path() -> PathBuf {
crate::APP_DATA_DIR
.get()
.cloned()
.unwrap_or_else(|| {
let home = std::env::var("HOME").unwrap_or_else(|_| ".".into());
PathBuf::from(home).join(".wzp")
})
.join("call_history.json")
}
fn load_from_disk() -> Vec<CallHistoryEntry> {
let path = history_path();
let Ok(bytes) = std::fs::read(&path) else {
return Vec::new();
};
serde_json::from_slice::<Vec<CallHistoryEntry>>(&bytes)
.inspect_err(|e| tracing::warn!(path = %path.display(), error = %e, "call_history.json parse failed"))
.unwrap_or_default()
}
fn save_to_disk(entries: &[CallHistoryEntry]) {
let path = history_path();
if let Some(parent) = path.parent() {
let _ = std::fs::create_dir_all(parent);
}
let Ok(json) = serde_json::to_vec_pretty(entries) else { return };
// Atomic write via temp file + rename so a crash mid-write doesn't
// leave us with a half-file on disk.
let tmp = path.with_extension("json.tmp");
if std::fs::write(&tmp, &json).is_ok() {
let _ = std::fs::rename(&tmp, &path);
}
}
fn now_unix() -> u64 {
SystemTime::now()
.duration_since(UNIX_EPOCH)
.map(|d| d.as_secs())
.unwrap_or(0)
}
// ─── Public API ───────────────────────────────────────────────────────────
/// Append a new entry to the store and persist to disk. Trims the store to
/// `MAX_ENTRIES` after insertion.
pub fn log(
call_id: String,
peer_fp: String,
peer_alias: Option<String>,
direction: CallDirection,
) {
tracing::info!(
%call_id, %peer_fp, ?direction,
alias = ?peer_alias,
"history::log"
);
let entry = CallHistoryEntry {
call_id: call_id.clone(),
peer_fp,
peer_alias,
direction,
timestamp_unix: now_unix(),
};
let mut guard = store().write().unwrap();
// If an entry for this call_id already exists, update it in-place
// rather than appending a duplicate. Protects against the caller
// side adding a second Missed row when the callee's DirectCallOffer
// bounces back through federation / loopback, or when some future
// relay routing edge case double-emits a signal. The dedup keeps
// history tidy and matches what the user intuitively expects (one
// history row per call, not one per signal event).
if let Some(existing) = guard.iter_mut().rev().find(|e| e.call_id == call_id) {
tracing::info!(%call_id, from = ?existing.direction, to = ?direction, "history::log replacing existing entry");
existing.direction = direction;
existing.timestamp_unix = entry.timestamp_unix;
save_to_disk(&guard);
return;
}
guard.push(entry);
if guard.len() > MAX_ENTRIES {
let drop_n = guard.len() - MAX_ENTRIES;
guard.drain(0..drop_n);
}
save_to_disk(&guard);
}
/// Return a copy of all entries in reverse-chronological order
/// (most recent first).
pub fn all() -> Vec<CallHistoryEntry> {
let guard = store().read().unwrap();
guard.iter().rev().cloned().collect()
}
/// Unique peer contacts sorted by most recent interaction. Each contact
/// is represented by the newest history entry for that fingerprint.
pub fn contacts() -> Vec<CallHistoryEntry> {
let guard = store().read().unwrap();
let mut seen: std::collections::HashSet<String> = std::collections::HashSet::new();
let mut out = Vec::new();
// iterate newest → oldest
for entry in guard.iter().rev() {
if seen.insert(entry.peer_fp.clone()) {
out.push(entry.clone());
}
}
out
}
/// Clear the entire history and persist the empty file.
pub fn clear() {
let mut guard = store().write().unwrap();
guard.clear();
save_to_disk(&guard);
}
/// Find a Missed-candidate entry that matches `call_id` and hasn't been
/// answered yet. Used by the signal loop to turn "pending incoming" into
/// "Received" when the user accepts.
pub fn mark_received_if_pending(call_id: &str) -> bool {
let mut guard = store().write().unwrap();
for entry in guard.iter_mut().rev() {
if entry.call_id == call_id && entry.direction == CallDirection::Missed {
entry.direction = CallDirection::Received;
save_to_disk(&guard);
return true;
}
}
false
}

File diff suppressed because it is too large Load Diff

View File

@@ -1,10 +0,0 @@
// Desktop binary entry point. All logic lives in `lib.rs` so the same
// code can be built as a cdylib for Android/iOS via `cargo tauri android build`.
#![cfg_attr(
all(not(debug_assertions), target_os = "windows"),
windows_subsystem = "windows"
)]
fn main() {
wzp_desktop_lib::run();
}

View File

@@ -1,138 +0,0 @@
//! Runtime binding to the standalone `wzp-native` cdylib.
//!
//! See `docs/incident-tauri-android-init-tcb.md` and the top of
//! `crates/wzp-native/src/lib.rs` for the full story on why this split
//! exists. Short version: Tauri's desktop cdylib cannot have any C++
//! compiled into it (via cc::Build) without landing in rust-lang/rust#104707's
//! staticlib symbol leak, which makes bionic's private `pthread_create`
//! symbols bind locally and SIGSEGV in `__init_tcb+4` at launch. So all
//! the Oboe + audio code lives in a standalone `wzp-native` .so built
//! with `cargo-ndk`, and we dlopen it here at runtime.
//!
//! The Library handle lives in a `'static` `OnceLock` for the lifetime of
//! the process; all function pointers cached below borrow from it safely.
#![cfg(target_os = "android")]
use std::sync::OnceLock;
// ─── Library handle (kept alive forever) ─────────────────────────────────
static LIB: OnceLock<libloading::Library> = OnceLock::new();
// Cached function pointers, resolved once at init(). Each is a raw
// `extern "C"` fn pointer with effectively `'static` lifetime because
// LIB is a OnceLock that never drops.
static VERSION: OnceLock<unsafe extern "C" fn() -> i32> = OnceLock::new();
static HELLO: OnceLock<unsafe extern "C" fn(*mut u8, usize) -> usize> = OnceLock::new();
static AUDIO_START: OnceLock<unsafe extern "C" fn() -> i32> = OnceLock::new();
static AUDIO_STOP: OnceLock<unsafe extern "C" fn()> = OnceLock::new();
static AUDIO_READ_CAPTURE: OnceLock<unsafe extern "C" fn(*mut i16, usize) -> usize> = OnceLock::new();
static AUDIO_WRITE_PLAYOUT: OnceLock<unsafe extern "C" fn(*const i16, usize) -> usize> = OnceLock::new();
static AUDIO_IS_RUNNING: OnceLock<unsafe extern "C" fn() -> i32> = OnceLock::new();
static AUDIO_CAPTURE_LATENCY: OnceLock<unsafe extern "C" fn() -> f32> = OnceLock::new();
static AUDIO_PLAYOUT_LATENCY: OnceLock<unsafe extern "C" fn() -> f32> = OnceLock::new();
/// Load `libwzp_native.so` and resolve every exported function we use.
/// Call this once at app startup (from the Tauri `setup()` callback).
/// Subsequent calls are no-ops.
pub fn init() -> Result<(), String> {
if LIB.get().is_some() {
return Ok(());
}
// Open the sibling cdylib. The Android dynamic linker searches
// /data/app/<pkg>/lib/arm64/ which gradle populates from jniLibs.
let lib = unsafe { libloading::Library::new("libwzp_native.so") }
.map_err(|e| format!("dlopen libwzp_native.so: {e}"))?;
// Stash the Library into the OnceLock first so all Symbol lookups
// below borrow from the 'static reference rather than a local.
LIB.set(lib).map_err(|_| "wzp_native::LIB already set")?;
let lib_ref: &'static libloading::Library = LIB.get().unwrap();
unsafe {
macro_rules! resolve {
($cell:expr, $ty:ty, $name:expr) => {{
let sym: libloading::Symbol<$ty> = lib_ref.get($name)
.map_err(|e| format!("dlsym {}: {e}", core::str::from_utf8($name).unwrap_or("?")))?;
// Dereference the Symbol to extract the raw fn pointer;
// it stays valid because lib_ref is 'static.
$cell.set(*sym).map_err(|_| format!("{} already set", core::str::from_utf8($name).unwrap_or("?")))?;
}};
}
resolve!(VERSION, unsafe extern "C" fn() -> i32, b"wzp_native_version");
resolve!(HELLO, unsafe extern "C" fn(*mut u8, usize) -> usize, b"wzp_native_hello");
resolve!(AUDIO_START, unsafe extern "C" fn() -> i32, b"wzp_native_audio_start");
resolve!(AUDIO_STOP, unsafe extern "C" fn(), b"wzp_native_audio_stop");
resolve!(AUDIO_READ_CAPTURE, unsafe extern "C" fn(*mut i16, usize) -> usize, b"wzp_native_audio_read_capture");
resolve!(AUDIO_WRITE_PLAYOUT, unsafe extern "C" fn(*const i16, usize) -> usize, b"wzp_native_audio_write_playout");
resolve!(AUDIO_IS_RUNNING, unsafe extern "C" fn() -> i32, b"wzp_native_audio_is_running");
resolve!(AUDIO_CAPTURE_LATENCY, unsafe extern "C" fn() -> f32, b"wzp_native_audio_capture_latency_ms");
resolve!(AUDIO_PLAYOUT_LATENCY, unsafe extern "C" fn() -> f32, b"wzp_native_audio_playout_latency_ms");
}
Ok(())
}
/// Is `init()` done and all symbols cached?
pub fn is_loaded() -> bool {
AUDIO_START.get().is_some()
}
// ─── Smoke-test accessors ────────────────────────────────────────────────
pub fn version() -> i32 {
VERSION.get().map(|f| unsafe { f() }).unwrap_or(-1)
}
pub fn hello() -> String {
let Some(f) = HELLO.get() else { return String::new(); };
let mut buf = [0u8; 64];
let n = unsafe { f(buf.as_mut_ptr(), buf.len()) };
String::from_utf8_lossy(&buf[..n]).into_owned()
}
// ─── Audio accessors ─────────────────────────────────────────────────────
/// Start the Oboe capture + playout streams. Returns `Err(code)` on
/// failure. Idempotent on the wzp-native side.
pub fn audio_start() -> Result<(), i32> {
let f = AUDIO_START.get().ok_or(-100_i32)?;
let ret = unsafe { f() };
if ret == 0 { Ok(()) } else { Err(ret) }
}
/// Stop both streams. Safe to call even if not running.
pub fn audio_stop() {
if let Some(f) = AUDIO_STOP.get() {
unsafe { f() };
}
}
/// Read captured i16 PCM into `out`. Returns bytes actually copied.
pub fn audio_read_capture(out: &mut [i16]) -> usize {
let Some(f) = AUDIO_READ_CAPTURE.get() else { return 0; };
unsafe { f(out.as_mut_ptr(), out.len()) }
}
/// Write i16 PCM into the playout ring. Returns samples enqueued.
pub fn audio_write_playout(input: &[i16]) -> usize {
let Some(f) = AUDIO_WRITE_PLAYOUT.get() else { return 0; };
unsafe { f(input.as_ptr(), input.len()) }
}
pub fn audio_is_running() -> bool {
AUDIO_IS_RUNNING.get().map(|f| unsafe { f() } != 0).unwrap_or(false)
}
#[allow(dead_code)]
pub fn audio_capture_latency_ms() -> f32 {
AUDIO_CAPTURE_LATENCY.get().map(|f| unsafe { f() }).unwrap_or(0.0)
}
#[allow(dead_code)]
pub fn audio_playout_latency_ms() -> f32 {
AUDIO_PLAYOUT_LATENCY.get().map(|f| unsafe { f() }).unwrap_or(0.0)
}

View File

@@ -1,36 +0,0 @@
{
"productName": "WarzonePhone",
"version": "0.1.0",
"identifier": "com.wzp.desktop",
"build": {
"frontendDist": "../dist",
"devUrl": "http://localhost:1420",
"beforeDevCommand": "npm run dev",
"beforeBuildCommand": "npm run build"
},
"app": {
"windows": [
{
"title": "WarzonePhone",
"width": 400,
"height": 640,
"resizable": true,
"minWidth": 360,
"minHeight": 500
}
],
"security": {
"csp": null
}
},
"bundle": {
"active": true,
"targets": "all",
"icon": [
"icons/icon.png"
],
"android": {
"minSdkVersion": 26
}
}
}

View File

@@ -1,110 +0,0 @@
/**
* Deterministic identicon generator — creates a unique symmetric pattern
* from a hex fingerprint string, similar to MetaMask's Jazzicon / Ethereum blockies.
*
* Returns an SVG data URL that can be used as an <img> src.
*/
function hashBytes(hex: string): number[] {
const clean = hex.replace(/[^0-9a-fA-F]/g, "");
const bytes: number[] = [];
for (let i = 0; i < clean.length; i += 2) {
bytes.push(parseInt(clean.substring(i, i + 2), 16));
}
// Pad to at least 16 bytes
while (bytes.length < 16) bytes.push(0);
return bytes;
}
function hslToRgb(h: number, s: number, l: number): [number, number, number] {
s /= 100;
l /= 100;
const k = (n: number) => (n + h / 30) % 12;
const a = s * Math.min(l, 1 - l);
const f = (n: number) =>
l - a * Math.max(-1, Math.min(k(n) - 3, Math.min(9 - k(n), 1)));
return [
Math.round(f(0) * 255),
Math.round(f(8) * 255),
Math.round(f(4) * 255),
];
}
export function generateIdenticon(
fingerprint: string,
size: number = 36
): string {
const bytes = hashBytes(fingerprint);
// Derive colors from first bytes
const hue1 = (bytes[0] * 360) / 256;
const hue2 = ((bytes[1] * 360) / 256 + 120) % 360;
const [r1, g1, b1] = hslToRgb(hue1, 65, 35); // dark bg
const [r2, g2, b2] = hslToRgb(hue2, 70, 55); // bright fg
const bg = `rgb(${r1},${g1},${b1})`;
const fg = `rgb(${r2},${g2},${b2})`;
// 5x5 grid, left-right symmetric (only need 3 columns)
const grid: boolean[][] = [];
for (let y = 0; y < 5; y++) {
const row: boolean[] = [];
for (let x = 0; x < 3; x++) {
const byteIdx = 2 + y * 3 + x;
row.push(bytes[byteIdx % bytes.length] > 128);
}
// Mirror: col 3 = col 1, col 4 = col 0
grid.push([row[0], row[1], row[2], row[1], row[0]]);
}
// Render SVG
const cellSize = size / 5;
const r = size * 0.12; // border radius
let rects = "";
for (let y = 0; y < 5; y++) {
for (let x = 0; x < 5; x++) {
if (grid[y][x]) {
rects += `<rect x="${x * cellSize}" y="${y * cellSize}" width="${cellSize}" height="${cellSize}" fill="${fg}"/>`;
}
}
}
const svg = `<svg xmlns="http://www.w3.org/2000/svg" width="${size}" height="${size}" viewBox="0 0 ${size} ${size}">
<rect width="${size}" height="${size}" rx="${r}" fill="${bg}"/>
${rects}
</svg>`;
return `data:image/svg+xml,${encodeURIComponent(svg)}`;
}
/**
* Create an <img> element with the identicon.
* Click copies the fingerprint to clipboard.
*/
export function createIdenticonEl(
fingerprint: string,
size: number = 36,
clickToCopy: boolean = true
): HTMLImageElement {
const img = document.createElement("img");
img.src = generateIdenticon(fingerprint, size);
img.width = size;
img.height = size;
img.style.borderRadius = `${size * 0.12}px`;
img.style.cursor = clickToCopy ? "pointer" : "default";
img.title = fingerprint;
if (clickToCopy && fingerprint) {
img.addEventListener("click", (e) => {
e.stopPropagation();
navigator.clipboard.writeText(fingerprint).then(() => {
img.style.outline = "2px solid #4ade80";
setTimeout(() => {
img.style.outline = "";
}, 600);
});
});
}
return img;
}

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -1,15 +0,0 @@
{
"compilerOptions": {
"target": "ESNext",
"module": "ESNext",
"moduleResolution": "bundler",
"strict": true,
"esModuleInterop": true,
"skipLibCheck": true,
"forceConsistentCasingInFileNames": true,
"resolveJsonModule": true,
"allowImportingTsExtensions": true,
"noEmit": true
},
"include": ["src"]
}

View File

@@ -1,15 +0,0 @@
import { defineConfig } from "vite";
export default defineConfig({
clearScreen: false,
server: {
port: 1420,
strictPort: true,
},
envPrefix: ["VITE_", "TAURI_"],
build: {
target: "esnext",
minify: !process.env.TAURI_DEBUG ? "esbuild" : false,
sourcemap: !!process.env.TAURI_DEBUG,
},
});

View File

@@ -0,0 +1,139 @@
# Branch: `android-rewrite`
Pivot away from the legacy Kotlin + JNI Android client to a pure-Rust **Tauri 2.x Mobile** app that shares the same frontend and backend code as the desktop client.
## Why this branch exists
The Kotlin + JNI stack was a crash factory. Every failure mode we hit was at the Kotlin ↔ Rust boundary, and each fix uncovered the next layer of the onion:
| Symptom | Root cause | Fix |
|---|---|---|
| App crashed on launch before `onCreate` returned | `__init_tcb` / `pthread_create` bionic private symbols leaking out of `libwzp_android.so` because the Rust crate used `crate-type = ["cdylib", "staticlib"]`. rust-lang/rust#104707 documents that staticlib alongside cdylib leaks non-exported symbols from the staticlib into the cdylib, and Bionic's private internal pthread symbols got bound LOCALLY inside our `.so` instead of resolved against `libc.so` at `dlopen` time | Dropped `staticlib` from the crate-type list. `crate-type = ["cdylib", "rlib"]` only. |
| Stack overflow on `place_call` | `Dispatchers.IO` threads have a ~512 KB stack, too small for the Rust signal-connect path that does TLS handshake + quinn setup inside one closure | Launched JNI calls from a dedicated `java.lang.Thread` with an explicit 8 MB stack |
| `ring` / `libcrypto` TLS reuse crash on second call | tokio runtime got dropped between calls, but `ring` keeps a TLS-stored SSL context that is invalidated when the runtime thread is reused by a new runtime — `ring` sees stale context and segfaults | Single long-lived tokio runtime for the entire signal client lifetime; split `start()` into an inline `connect+register` path and a `run()` path on a separate thread to avoid the `thread::spawn` closure's stack overflow |
| Null dereference on register with fresh install | Identity seed file empty when it existed-but-was-blank, Rust side deref'd the zero-length slice | Generate seed if empty on register |
Every fix kept the app limping along but the fundamental design problem remained: **state management was split across a Kotlin ViewModel and a Rust engine, with a hand-rolled JNI bridge in between that had to be perfect to not crash**. The working desktop Tauri client (with the same Rust backend) had none of these problems because it spoke to the Rust code via in-process `invoke()` from a WebView, not JNI.
So: rewrite the Android app as a **Tauri 2.x Mobile app**, reusing the entire desktop codebase verbatim (`main.ts`, `style.css`, `index.html`, `main.rs`, `engine.rs` — everything). Tauri Mobile added Android support in v2, it's production-ready, and it eliminates the JNI boundary entirely.
The incident postmortem lives at [`docs/incident-tauri-android-init-tcb.md`](incident-tauri-android-init-tcb.md).
## Architecture
```
┌─────────────────────────────────────────────────┐
│ Tauri 2.x Mobile │
│ │
│ Android WebView ────────── HTML/JS/CSS │ ← Shared with desktop
│ │ (main.ts) │
│ │ │
│ invoke() ─────────────── Rust Commands │ ← Shared with desktop
│ (main.rs) │
│ │ │
│ ┌───────────────┼────────────┐ │
│ │ │ │ │
│ SignalMgr CallEngine Identity │ ← Shared crates
│ (signal_hub) (wzp-client) (wzp-crypto)│
│ │ │ │
│ │ │ │
│ ▼ ▼ │
│ QUIC to relay Oboe audio (Android) │
│ via wzp-native cdylib │
└─────────────────────────────────────────────────┘
```
**What is reused from desktop verbatim** (zero rewrite):
- `desktop/src/main.ts` — entire frontend
- `desktop/src/style.css` — all styling
- `desktop/src/identicon.ts` — identicon rendering
- `desktop/index.html` — HTML structure
- `desktop/src-tauri/src/main.rs` — all Tauri commands (`connect`, `disconnect`, `register_signal`, `place_call`, …)
- `desktop/src-tauri/src/engine.rs``CallEngine` wrapper
**What is Android-specific**:
- `desktop/src-tauri/src/android_audio.rs` — JVM-side audio routing (`AudioManager.setSpeakerphoneOn` for earpiece/speaker toggle). Runs from Tauri's existing JNI context — no hand-rolled bridge, Tauri owns the JVM hookup.
- `desktop/src-tauri/src/wzp_native.rs` — runtime `dlopen` of `libwzp_native.so`, a standalone cdylib crate (`crates/wzp-native`) that owns all C++ (Oboe bridge). Kept in its own crate so its C/C++ static archives never get statically linked into `wzp-desktop`'s `.so`, which would re-trigger the `__init_tcb` / pthread leak.
- `crates/wzp-native/` — the standalone C++/Oboe bridge cdylib. Loaded via `libloading` at runtime from `wzp_native.rs`. Provides capture + playout streams using Oboe's `Usage::VoiceCommunication` + `MODE_IN_COMMUNICATION` combo.
- Android-specific target dependencies in `desktop/src-tauri/Cargo.toml` (`jni`, `ndk-context`, `libloading`) — no CPAL, no VPIO.
## Key architectural decisions
### 1. `wzp-native` as a standalone cdylib loaded via `libloading`
The alternative — linking `wzp-native` as a regular Rust dep with C++ static archives — would cause the same `__init_tcb` crash that killed the Kotlin version. By making `wzp-native` its own cdylib and `dlopen`-ing it at runtime, Bionic's `libc.so` resolves every symbol at load time the way it's supposed to, and no private TCB symbols leak.
### 2. `crate-type = ["cdylib", "rlib"]` only (no `staticlib`)
Same reason. The `rlib` output is needed so the `wzp-desktop` binary target can link against the library; `cdylib` is needed for Android's `System.loadLibrary`; `staticlib` would reintroduce the symbol-leak bug.
### 3. Oboe audio config
`Usage::VoiceCommunication` + Java-side `MODE_IN_COMMUNICATION`. **Never** call `setAudioApi(AAudio)` explicitly — on some devices (Nothing Phone in particular) it causes Oboe to open the wrong stream type and audio goes silent. Let Oboe pick the audio API automatically. This is documented in the auto-memory `project_tauri_android_audio.md`.
### 4. Speaker/earpiece toggle uses `tokio::task::spawn_blocking`
Oboe's `stop()` + `start()` cycle is synchronous and can block for 50200 ms. Calling it on the tokio executor stalls every other async task (including the QUIC datagram loop), dropping audio packets. Wrapping the toggle in `spawn_blocking` isolates it to a dedicated thread pool. Fixed in commit `76a4c53`.
## Build pipeline
Docker on SepehrHomeserverdk, same pattern as the Android legacy pipeline and the Windows pipeline:
```
./scripts/build-tauri-android.sh # Full: pull + build + ntfy + rustypaste
./scripts/build-tauri-android.sh --pull # Explicit git pull (default)
./scripts/build-tauri-android.sh --clean # Blow away the Rust target cache
```
**Image**: `wzp-android-builder` (shared with the legacy Kotlin pipeline). The Dockerfile was extended to install Node.js 20 LTS, Android API level 36, build-tools 35.0.0, tauri-cli 2.x, and all four Android Rust targets on top of the legacy NDK 26.1 + cargo-ndk + Gradle setup. Both pipelines coexist in the same image.
**Output**: `wzp-release.apk` uploaded to rustypaste, URL delivered via `ntfy.sh/wzp`.
## Known quirks (Tauri Mobile specific)
1. **tauri-cli `android init` writes absolute paths** into `gradle.properties` for the NDK path. Those paths are local to wherever `android init` was run, so they break any cross-machine build unless overridden with `ANDROID_NDK_HOME` at build time. The build script exports `ANDROID_NDK_HOME` explicitly to work around this.
2. **API 36 vs API 34 coexistence**: the legacy Kotlin pipeline targets API 34, Tauri Mobile 2.x wants compileSdk 36. The shared Docker image installs both SDK levels so neither pipeline needs to reinstall.
3. **Identity seed lives in Android-specific app data dir**: `/data/data/com.wzp.phone/files/.wzp/identity` instead of `$HOME/.wzp/identity`. The shared `load_or_create_seed()` function in `desktop/src-tauri/src/lib.rs` uses Tauri's `app_data_dir()` which resolves correctly on both Android and desktop — no per-platform code needed.
4. **Direct calls on macOS previously hit an identity mismatch bug** — the `CallEngine` was using `$HOME/.wzp/identity` directly while `register_signal` used Tauri's `app_data_dir()`. Fixed by routing both through `load_or_create_seed()` (commit `2fd9465`). This was important for cross-platform consistency.
## Current state (snapshot)
What works:
- Tauri Mobile scaffold builds and runs on Android
- Signal hub connect + register works
- Room mode (SFU group calls) works with Oboe audio
- Direct 1:1 calls work with full parity to desktop
- Speaker/earpiece toggle works without stalling the audio pipeline
- Call history, recent contacts, deregister UI all present (inherited from desktop)
What remains (task list refs in parens):
- Background service for keeping signal alive when app is backgrounded (#19)
- Proper permission requests (microphone, notifications) on first launch (#19)
- Incoming call notification while backgrounded (#19)
- App icon + splash screen (#19)
## Testing
- **Build**: `./scripts/build-tauri-android.sh` — verify the APK lands on rustypaste and installs on device.
- **Smoke test**: Install → open app → Register → Place call → Receive call. No crashes, audio flows both ways.
- **Speaker toggle**: During a call, toggle speaker/earpiece several times in rapid succession. Audio should never stop, and the toggle should respond within ~200 ms.
- **Stress test**: Call for 10+ minutes continuous. No memory growth, no packet loss beyond what's attributable to the network.
## Files of interest
| Path | Purpose |
|---|---|
| `desktop/src-tauri/src/lib.rs` | Shared Tauri commands (desktop + Android) |
| `desktop/src-tauri/src/android_audio.rs` | JVM-side speaker/earpiece routing |
| `desktop/src-tauri/src/wzp_native.rs` | Runtime dlopen of libwzp_native.so |
| `crates/wzp-native/` | Standalone C++/Oboe cdylib, loaded at runtime |
| `scripts/build-tauri-android.sh` | Remote Docker build pipeline |
| `scripts/Dockerfile.android-builder` | Shared Android Docker image (legacy + Tauri) |
| `docs/incident-tauri-android-init-tcb.md` | Postmortem of the Kotlin+JNI crash cascade |

View File

@@ -1,164 +0,0 @@
# Branch: `feat/desktop-audio-rewrite`
Home of the Tauri desktop client for macOS, Windows, and Linux. Named "audio-rewrite" because the original driver was replacing a CPAL-only audio pipeline with platform-native backends that support OS-level echo cancellation (VoiceProcessingIO on macOS, WASAPI Communications on Windows), but the branch has grown into the full desktop story — Windows cross-compilation, vendored dependencies, history UI, direct calling, the whole thing.
## Purpose
The desktop client shares 100% of its frontend (`desktop/src/`) and Tauri command layer (`desktop/src-tauri/src/lib.rs`, `engine.rs`, `history.rs`) with the Android build on `android-rewrite`. Differences are limited to:
- **Audio backends**, which are platform-gated via Cargo target-dep sections in `desktop/src-tauri/Cargo.toml` and feature flags in `crates/wzp-client/Cargo.toml`.
- **Identity storage paths**, which resolve via Tauri's `app_data_dir()` (`~/Library/Application Support/…` on macOS, `%APPDATA%\…` on Windows, `~/.local/share/…` on Linux).
- **Build toolchains**: native `cargo build` on macOS/Linux, `cargo xwin` cross-compile from Linux for Windows via Docker on SepehrHomeserverdk.
## Audio backend matrix
| Target | Capture | Playback | AEC |
|---|---|---|---|
| macOS | CPAL (WASAPI/CoreAudio via cpal crate) OR VoiceProcessingIO (native Core Audio) | CPAL | VoiceProcessingIO native AEC (when `vpio` feature enabled) |
| Windows (default) | CPAL → WASAPI shared mode | CPAL → WASAPI shared mode | None |
| Windows (AEC build) | Direct WASAPI with `IAudioClient2::SetClientProperties(AudioCategory_Communications)` | CPAL → WASAPI shared mode | **OS-level**: Windows routes the capture stream through the driver's communications APO chain (AEC + NS + AGC) |
| Linux | CPAL → ALSA/PulseAudio | CPAL → ALSA/PulseAudio | None |
The macOS VPIO path is gated behind the `vpio` feature in `wzp-client` and the `coreaudio-rs` dep is itself `cfg(target_os = "macos")`, so enabling the feature on Windows or Linux is a no-op.
The Windows AEC path is gated behind the `windows-aec` feature, also target-gated (the `windows` crate dep is only pulled in on Windows), and re-exports `WasapiAudioCapture as AudioCapture` when enabled so downstream code doesn't need to know which backend is active. The current Windows build at `target/windows-exe/wzp-desktop.exe` has `windows-aec` on; a baseline noAEC build is preserved at `target/windows-exe/wzp-desktop-noAEC.exe` for A/B comparison on real hardware.
See [`BRANCH-android-rewrite.md`](BRANCH-android-rewrite.md) for Oboe audio on Android, which is its own story.
## Recent major work
### 1. Desktop direct calling feature (commit `2fd9465` and neighbors)
Brought direct 1:1 calls to macOS with full parity to the Android client:
- **Identity path fix**: the desktop `CallEngine::start` was loading seed from `$HOME/.wzp/identity` while `register_signal` used Tauri's `app_data_dir()`, producing two different fingerprints per run. Both now route through `load_or_create_seed()` which uses `app_data_dir()` everywhere.
- **Call history with dedup**: `history.rs` stores a `Vec<CallHistoryEntry>` with a `CallDirection` enum (`Placed | Received | Missed`). The `log` function dedupes by `call_id` so an outgoing call isn't logged twice as "missed" (when the signal loop's `DirectCallOffer` handler fires) and then again as "placed" (when `place_call` returns). Instead the entry is updated in place.
- **Recent contacts row**: a horizontal chip UI in the direct-call panel showing the last N peers with friendly aliases, clickable to re-dial.
- **Deregister button**: lets a user drop their signal registration without quitting the app, useful when switching identities.
- **Random alias derivation**: a new client sees a human-friendly alias like "silent-forest-41" derived deterministically from its seed, so it's identifiable in the UI before manual naming.
- **Default room "general"** instead of "android", since the desktop client is not Android.
### 2. macOS VoiceProcessingIO integration
`crates/wzp-client/src/audio_vpio.rs` — a native Core Audio implementation using `AUGraph` + `AudioComponentInstance` with the VPIO audio unit. Gives you hardware-accelerated AEC (same AEC Apple ships in FaceTime / iMessage audio / voice memos) at the cost of tight coupling to Apple frameworks. Lock-free ring pattern matches the CPAL path so the upper layers don't notice the difference.
Enabled by `features = ["audio", "vpio"]` in the macOS target section of `desktop/src-tauri/Cargo.toml`.
### 3. Windows cross-compilation via cargo-xwin
Cross-compiling Rust + Tauri to `x86_64-pc-windows-msvc` from Linux using `cargo-xwin`, which downloads the Microsoft CRT + Windows SDK on demand and drives `clang-cl` as the compiler. No Windows machine is needed for the build itself — only for runtime testing.
**Build infrastructure**:
- `scripts/Dockerfile.windows-builder` — Debian bookworm + Rust + cargo-xwin + Node 20 + cmake + ninja + llvm + clang + lld + nasm. Pre-warms the xwin MSVC CRT cache at image build time (saves ~4 minutes per cold build).
- `scripts/build-windows-docker.sh` — fire-and-forget remote build via Docker on SepehrHomeserverdk. Same pattern as `build-tauri-android.sh`. Uploads the `.exe` to rustypaste and fires an `ntfy.sh/wzp` notification on start and on completion.
- `scripts/build-windows-cloud.sh` — alternative pipeline using a temporary Hetzner Cloud VPS. Slower (full VM spin-up), more expensive, but useful when Docker image rebuilds would be disruptive.
**Two critical blockers resolved** on the way to a working `.exe`:
1. **libopus SSE4.1 / SSSE3 intrinsic compile failure**. `audiopus_sys` vendors libopus 1.3.1, whose `CMakeLists.txt` gates the per-file `-msse4.1` `COMPILE_FLAGS` behind `if(NOT MSVC)`. Under `clang-cl`, CMake sets `MSVC=1` (because `CMAKE_C_COMPILER_FRONTEND_VARIANT=MSVC` triggers `Platform/Windows-MSVC.cmake` which unconditionally sets the variable), so the per-file flag is never set and the SSE4.1 source files compile without the target feature — then fail with 20+ "always_inline function '_mm_cvtepi16_epi32' requires target feature 'sse4.1'" errors.
Fixed by **vendoring audiopus_sys into `vendor/audiopus_sys/`** and patching its bundled libopus to introduce an `MSVC_CL` variable that is true only for real `cl.exe` (distinguished via `CMAKE_C_COMPILER_ID STREQUAL "MSVC"`). The eight `if(NOT MSVC)` SIMD guards are flipped to `if(NOT MSVC_CL)` and the global `/arch` block at line 445 becomes `if(MSVC_CL)`, so clang-cl gets the GCC-style per-file flags while real cl.exe keeps the `/arch:AVX` / `/arch:SSE2` globals.
Wired in via `[patch.crates-io] audiopus_sys = { path = "vendor/audiopus_sys" }` at the workspace root.
Upstream tracking: [xiph/opus#256](https://github.com/xiph/opus/issues/256), [xiph/opus PR #257](https://github.com/xiph/opus/pull/257) (both stale).
2. **tauri-build needs `icons/icon.ico` for the Windows PE resource**. The desktop only had `icon.png`. Generated a multi-size ICO (16/24/32/48/64/128/256) from the existing placeholder via Pillow and committed it. Placeholder quality — real branded icons can replace it later.
### 4. Windows `AudioCategory_Communications` capture path (task #24)
`crates/wzp-client/src/audio_wasapi.rs` — direct WASAPI capture via `IMMDeviceEnumerator → IAudioClient2 → SetClientProperties` with `AudioCategory_Communications`. This tells Windows "this is a VoIP call" and Windows routes the capture stream through the driver's registered communications APO chain, which on most Win10/11 consumer hardware includes AEC, NS, and AGC.
**Caveat**: quality is driver-dependent. On a machine with a good communications APO (Intel Smart Sound, Dolby, modern Realtek on Win11 24H2+, anything with Voice Clarity enabled) it's excellent. On generic class-compliant drivers with no communications APO registered, it's a no-op. For a guaranteed AEC regardless of driver, see task #26 which tracks implementing the classic Voice Capture DSP (`CLSID_CWMAudioAEC`) as a fallback.
Gated behind the `windows-aec` feature in `wzp-client`. Enabled by default in the Windows target section of `desktop/src-tauri/Cargo.toml`.
## Build pipelines
### Native macOS / Linux
```bash
cd desktop
npm install
npm run build
cd src-tauri
cargo build --release --bin wzp-desktop
```
### Windows x86_64 via Docker on SepehrHomeserverdk
```bash
./scripts/build-windows-docker.sh # Full: pull + build + download
./scripts/build-windows-docker.sh --no-pull # Skip git fetch
./scripts/build-windows-docker.sh --rust # Force-clean Rust target
./scripts/build-windows-docker.sh --image-build # (Re)build the Docker image (fire-and-forget)
```
Output lands at `target/windows-exe/wzp-desktop.exe`. Both `wzp-desktop.exe` and `wzp-desktop-noAEC.exe` can coexist in that directory; the script writes `wzp-desktop.exe` so renaming the prior build to `-noAEC.exe` (or any other name) before rebuilding preserves it.
### Windows x86_64 via Hetzner Cloud (alternative)
```bash
./scripts/build-windows-cloud.sh # Full: create VM → build → download → destroy
./scripts/build-windows-cloud.sh --prepare # Create VM and install deps only
./scripts/build-windows-cloud.sh --build # Build on existing VM
./scripts/build-windows-cloud.sh --destroy # Delete the VM
WZP_KEEP_VM=1 ./scripts/build-windows-cloud.sh # Keep VM alive after build for debug
```
Remember to destroy the VM at end of day with `--destroy`.
### Linux x86_64 (relay + CLI + bench)
```bash
./scripts/build-linux-docker.sh # Fire-and-forget remote Docker build
./scripts/build-linux-docker.sh --install # Wait for completion and download
```
Uses the same `wzp-android-builder` Docker image as Android (not a separate image), since the deps (Rust + cmake + ring prereqs) are the same.
## Testing
### Direct calling parity
1. Build on two machines (macOS + Windows, or two macOS, or any combination).
2. Both machines register on the same relay.
3. Copy one machine's fingerprint into the other's direct-call panel.
4. Place the call. Confirm ringing UI on the callee and "calling…" UI on the caller.
5. Answer. Confirm audio flows both ways.
6. Hang up from either side. Confirm call-history entries are labeled correctly (`Outgoing` on caller, `Incoming` on callee, never `Missed` on a successful call).
### Windows AEC A/B
1. Install `wzp-desktop-noAEC.exe` and `wzp-desktop.exe` on the same Windows box.
2. Join a call from each (separately) while a second machine plays known audio through the first machine's speakers.
3. On the remote (listening) side: the `noAEC` call should have clear audible echo; the AEC call should have minimal or no echo after a 12 s convergence period.
4. If both builds sound identical (with echo) → the `AudioCategory_Communications` switch isn't triggering the driver's APO chain. Investigate via task #26 (Voice Capture DSP fallback).
## Known quirks
1. **libopus vendor path is workspace-relative**. `[patch.crates-io] audiopus_sys = { path = "vendor/audiopus_sys" }` works from any crate in the workspace because Cargo resolves it against the root `Cargo.toml`'s directory. If the workspace is moved or vendored into another workspace, update the path.
2. **`cargo xwin` overwrites `override.cmake` on every invocation**. Any attempt to patch `~/.cache/cargo-xwin/cmake/clang-cl/override.cmake` at Docker image build time is inert because `src/compiler/clang_cl.rs` line ~444 writes the bundled file fresh on every run. All real fixes must land in the source tree (via the vendored audiopus_sys, as done here), not in the cargo-xwin cache.
3. **WebView2 runtime is a prerequisite on Windows 10**. Windows 11 ships with it. If the `.exe` launches and immediately exits with no error on a Win10 machine, that's the missing runtime — install it from [Microsoft's Evergreen bootstrapper](https://developer.microsoft.com/en-us/microsoft-edge/webview2/).
4. **Rust 2024 edition `unsafe_op_in_unsafe_fn` lint**. The WASAPI backend in `audio_wasapi.rs` emits ~18 of these warnings because Rust 2024 requires explicit `unsafe { ... }` blocks inside `unsafe fn` bodies. The warnings don't block the build and don't affect runtime behavior; cleaning them up is tracked informally as tech debt.
## Files of interest
| Path | Purpose |
|---|---|
| `desktop/src/` | Shared frontend (TypeScript + HTML + CSS) |
| `desktop/src-tauri/src/lib.rs` | Tauri commands shared with Android |
| `desktop/src-tauri/src/engine.rs` | `CallEngine` wrapper |
| `desktop/src-tauri/src/history.rs` | Persistent call history store with dedup |
| `crates/wzp-client/src/audio_io.rs` | CPAL capture + playback (baseline) |
| `crates/wzp-client/src/audio_vpio.rs` | macOS VoiceProcessingIO capture (AEC) |
| `crates/wzp-client/src/audio_wasapi.rs` | Windows WASAPI communications capture (AEC) |
| `vendor/audiopus_sys/opus/CMakeLists.txt` | Patched libopus for clang-cl SIMD |
| `scripts/Dockerfile.windows-builder` | Windows cross-compile Docker image |
| `scripts/build-windows-docker.sh` | Remote Docker build pipeline |
| `scripts/build-windows-cloud.sh` | Hetzner VPS alternative pipeline |
| `scripts/build-linux-docker.sh` | Linux x86_64 relay/CLI build pipeline |

View File

@@ -0,0 +1,22 @@
# PRD: Desktop Direct Calling — Backport SignalManager
## Problem
The desktop Tauri app has the direct calling UI (Room/Direct Call toggle, Register, Call buttons) but the backend uses inline async code in `main.rs` instead of a proper `SignalManager`. This needs to be backported from the Android refactor.
## Tasks
1. **Create `signal_mgr.rs` for desktop** — same pattern as Android, or reuse the crate directly
2. **Wire into Tauri commands**`register_signal` should use `SignalManager::connect()` + `run_recv_loop()` on a dedicated thread
3. **State polling**`get_signal_status` should call `SignalManager::get_state_json()`
4. **place_call / answer_call** — delegate to SignalManager methods
5. **Merge android branch into desktop branch** — resolve the 37 desktop-only + 90 android-only commit divergence
6. **Test** — Android calls Desktop, Desktop calls Android
## UI Fixes
1. **Default alias** — generate random name on first start (like Android does)
2. **Default room** — change from "android" to "general"
3. **Fingerprint display** — ensure full `xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx` format (not truncated)
4. **Deregister button** — ability to disconnect signal channel
5. **Call state reset** — after hangup, return to "Registered" state, not stuck on "Ringing"

View File

@@ -1,360 +0,0 @@
# PRD: DRED Integration & Opus-Tier FEC Simplification
## Problem
WarzonePhone's audio loss-recovery stack is built around classical Opus + application-level RaptorQ FEC. It was the right answer when WZP was designed, but libopus 1.5 (December 2023) introduced **Deep REDundancy (DRED)** — a neural speech-recovery feature that is strictly better than classical FEC for the loss patterns VoIP calls actually experience. We are paying real latency, bitrate, and complexity costs for protection that DRED now does better and cheaper.
Concretely, on every Opus call today we pay:
- **~40100 ms of receiver-side latency** waiting for RaptorQ block completion before decode
- **1020% bitrate overhead** from RaptorQ repair symbols (more on studio profiles)
- **~2040% codec-internal overhead** from Opus inband FEC (LBRR)
- Classical Opus PLC on loss bursts exceeding the RaptorQ block size — which sounds robotic and gap-ridden
…in exchange for bit-exact recovery of isolated single-frame losses, which is perceptually indistinguishable from classical Opus PLC for 20 ms of speech. The protection is misaligned with the failure modes.
DRED delivers:
- **Zero added receive latency** — reconstruction runs only on detected loss
- **~1 kbps flat bitrate overhead** regardless of base bitrate
- **Plausible reconstruction of bursts up to ~1 second** — DRED's headline capability, exactly the regime RaptorQ can't touch
- Neural PLC that sounds like continuous speech, not a gap
We also have a second, unrelated problem blocking adoption: our FFI crate `audiopus_sys 0.2.2` vendors **libopus 1.3**, predating DRED entirely. We cannot enable DRED without first swapping the FFI layer. The naïve choice (`opus` crate from SpaceManiac) is a trap — it depends on the same dead `audiopus_sys`. The real target is `opusic-c 1.5.5` by DoumanAsh, which vendors libopus 1.5.2 with full DRED support and documents Android NDK cross-compile.
This PRD covers the FFI swap, DRED enablement, the decision to **remove RaptorQ and Opus inband FEC from the Opus tiers entirely** (keeping RaptorQ only for Codec2 where DRED is N/A), and the jitter buffer refactor that the DRED lookahead/backfill pattern requires.
## Goals
- Replace `audiopus 0.3.0-rc.0` + `audiopus_sys 0.2.2` (dead upstream, libopus 1.3) with `opusic-c 1.5.5` + `opusic-sys 0.6.0` (active upstream, libopus 1.5.2)
- Enable DRED on every Opus profile with a tiered duration policy, lower at studio bitrates and higher at degraded bitrates
- Disable Opus inband FEC (LBRR) on all Opus profiles — opusic-c's own docs recommend this, and it overlaps DRED's job
- Remove `wzp-fec` (RaptorQ) from the Opus tiers entirely — the latency and bitrate savings are real, and DRED strictly dominates it on speech
- Keep RaptorQ + current FEC ratios on the Codec2 tiers unchanged — DRED is libopus-only, Codec2 has no neural equivalent
- Refactor `wzp-transport::jitter` to a lookahead/backfill pattern that lets DRED reconstruct loss windows when the next packet arrives, instead of the current "wait for block completion or fall through to classical PLC" policy
- Ship behind a runtime escape hatch (`AUDIO_USE_LEGACY_FEC`) for the first rollout window so we can revert to RaptorQ if DRED has surprises in real-world conditions
## Non-goals
- Changing Codec2 at all. Codec2 1200 / 3200 are outside the DRED lineage and keep their current RaptorQ protection, block sizes, and PLC path.
- Adding new Opus bitrate tiers or changing the quality adaptation thresholds. This PRD is about the protection layer, not the bitrate ladder.
- Enabling OSCE (Opus Speech Coding Enhancement — a separate libopus 1.5 neural post-processor that opusic-c exposes via an `osce` feature flag). Valuable, complementary, and free once opusic-c is in — but out of scope here to keep the PRD focused. Track as follow-up.
- Video, audio-over-MoQ, or any protocol-layer changes discussed in prior conversations.
- Touching the wzp-web / browser client. Browser Opus is a separate codepath via WebAudio / WASM libopus and is not affected by the native FFI swap.
## Background
### How the three protection mechanisms actually differ
| | Opus inband FEC (LBRR) | RaptorQ (wzp-fec) | DRED |
|---|---|---|---|
| Layer | codec-internal | application, across Opus packets | codec-internal |
| What it sends | low-bitrate copy of the *previous* frame, embedded in every packet | fountain-code repair symbols across a block | neural-coded history of the recent past |
| Protection horizon | 1 packet back | block duration (currently 100 ms, proposed 40 ms) | configurable, 01040 ms |
| Recovery granularity | 1 frame (lower quality) | 1 frame (bit-exact) | 10 ms frames (plausible reconstruction) |
| Latency cost | 0 ms | block duration on receive | 0 ms |
| Bitrate cost | ~2040% of base | `fec_ratio × base` (currently +20% GOOD, +50% DEGRADED) | ~1 kbps flat |
| Effective loss tolerance | ~single-packet losses | up to `(repair symbols / block)` losses, cliff beyond | bursts up to the configured duration |
| Content assumption | any Opus audio | any | speech (DRED model is speech-trained) |
### Why DRED dominates on the Opus tiers
Loss-scenario walkthrough (verified against opusic-c and libopus 1.5 docs):
- **1-frame loss (20 ms)**: RaptorQ recovers bit-exactly, DRED wouldn't run (classical Opus PLC is perceptually indistinguishable for single 20 ms frames). RaptorQ "wins" on paper but not on ears.
- **23 frame burst (4060 ms)**: RaptorQ at current ratio 0.2 hits its tolerance cliff. DRED handles this trivially — well within a 200 ms window.
- **510 frame burst (100200 ms)**: RaptorQ completely overwhelmed at any reasonable ratio. DRED's sweet spot.
- **10+ frame burst (>200 ms)**: RaptorQ useless. DRED at 5001000 ms still recovers.
The only scenario where RaptorQ strictly beats DRED is bit-exact recovery of isolated single-frame losses — which is perceptually irrelevant for speech. In every other scenario DRED either ties or wins.
### Why Codec2 keeps RaptorQ
DRED lives inside libopus — it does not help Codec2 at all. Codec2's classical PLC is a parametric-vocoder interpolation that produces noticeably robotic artifacts on loss. On the Codec2 tiers, RaptorQ is the only protection we have, and it should stay at current ratios (1.0 on CATASTROPHIC, 0.5 on the Codec2 3200 tier).
### The opusic-c / opusic-sys situation
- `opusic-sys 0.6.0` — FFI crate, published 2026-03-17, vendors libopus 1.5.2 via its `bundled` feature (on by default), documents Android NDK cross-compile via `ANDROID_NDK_HOME` (which our `wzp-android/build.rs` already sets). Exposes raw bindings to `opus_dred_parse`, `opus_decoder_dred_decode`, and the `OpusDRED` state struct.
- `opusic-c 1.5.5` — high-level safe wrapper. Its **encoder** side is fine: exposes `Encoder::set_dred_duration(value: u8) -> Result<(), ErrorCode>` with range `0..=104` (each unit is 10 ms, so 01040 ms configurable). Also exposes `set_bitrate`, `set_inband_fec`, `set_dtx`, `set_packet_loss`, `set_signal`, `set_complexity`, `set_bandwidth`, `set_application` on the encoder.
- **opusic-c's decoder-side DRED wrapper is NOT sufficient for our architecture.** Confirmed by reading the source of `opusic-c/src/dred.rs`:
1. `Dred::decode_to` ignores the `dred_end` output of `opus_dred_parse` (prefixed `_dred_end`), so the caller cannot know how much DRED history a given packet actually carried.
2. In `opus_decoder_dred_decode(decoder, dred, dred_offset, pcm, frame_size)`, the wrapper passes `frame_size` to BOTH the `dred_offset` and `frame_size` arguments. This looks like a bug — it means reconstruction always starts at offset `frame_size` into the DRED window, not at an arbitrary caller-chosen offset. Arbitrary-gap reconstruction (which we need for the lookahead/backfill pattern) requires proper offset control.
3. `DredPacket` is owned internally by a `Dred` instance; its internal buffer is overwritten on every `decode_to` call. We cannot hold a ring of parsed DredPackets from multiple recent arrivals — which is exactly what the lookahead/backfill jitter buffer pattern requires.
- **Decision**: use opusic-c for the encoder path (its wrapper is correct and saves work), and drop to `opusic-sys` raw FFI for the entire decoder path AND the DRED reconstruction path. Both use a single shared `DecoderHandle` so internal decoder state stays consistent. **Verified at pre-flight**: `opusic_c::Decoder.inner` is `pub(crate)`, so there is no way to reach the raw `*mut OpusDecoder` from outside opusic-c. Running two parallel decoders (one from opusic-c for audio, one from opusic-sys for DRED) would cause state drift because the DRED-only decoder wouldn't see the normal decode calls. Single unified decoder via opusic-sys is the only correct architecture.
- **Three FFI handles required** per decode session: `opusic_c::Encoder` (encoder side, unchanged), our own `DecoderHandle` wrapping `*mut OpusDecoder` from opusic-sys (for normal decode AND for the `OpusDecoder` pointer passed to `opus_decoder_dred_decode`), and a new `DredDecoderHandle` wrapping `*mut OpusDREDDecoder` from opusic-sys (passed to `opus_dred_parse`). Note: `OpusDREDDecoder` is a **separate struct** from `OpusDecoder` in libopus 1.5 — verified from opus.h. Allocation via `opus_dred_decoder_create()` (confirm exact symbol name at Phase 3a start).
- The `opus` crate from SpaceManiac (0.3.1, published 2026-01-03) is a trap: it depends on `audiopus_sys ^0.2.0` — the same dead FFI crate we're trying to get away from. Do not use.
- **Follow-up (out of scope for this PRD)**: upstream the fixes to `opusic-c/src/dred.rs` (preserve `dred_end`, fix the `dred_offset` double-pass, expose `DredPacket` externally). Worth a GitHub PR once our own implementation has proven correct. Would let us eventually delete our internal FFI wrapper.
### Critical note from opusic-c docs
From the `dred` module documentation: *"The documentation recommends disabling in-band FEC and using `Application::Voip` for optimal results."* This applies to the **codec-internal** Opus inband FEC (LBRR), not our application-level RaptorQ. The two are independent layers. This PRD disables both on Opus tiers, but for different reasons — inband FEC per upstream recommendation, RaptorQ per the analysis above.
### The libopus 1.5 loss-percentage gating quirk
In libopus 1.5, both inband FEC and DRED are gated on `OPUS_SET_PACKET_LOSS_PERC` being non-zero. If the encoder thinks loss is 0%, it will not emit DRED data even when `set_dred_duration` is configured. We must plumb a meaningful loss percentage into the encoder continuously, floored at a small non-zero value so DRED stays active even when the network is perfect. Planned floor: **5%**, overridden upward by the real `QualityReport` loss value when it exceeds the floor.
## Solution
### High-level architecture change
**Before** (per Opus frame encode path):
```
PCM → AdaptiveEncoder.encode (Opus)
→ inband FEC embedded in packet
→ wzp-fec FEC encoder (accumulate into block, generate repair symbols)
→ DATAGRAM out
```
**Before** (per Opus frame decode path):
```
DATAGRAM in → wzp-fec block assembly (wait for block, recover if possible)
→ AdaptiveDecoder.decode (Opus) / decode_lost (classical PLC)
→ PCM
```
**After** (Opus tiers):
```
PCM → OpusEncoder.encode (opusic-c, DRED enabled via set_dred_duration, inband FEC off)
→ DATAGRAM out directly (no RaptorQ block)
```
```
DATAGRAM in → jitter buffer (lookahead/backfill)
→ on frame arrival: OpusDecoder.decode
→ on detected gap: if next packet has DRED state → dred::Dred.reconstruct(gap)
else → OpusDecoder.decode_lost (classical PLC)
→ PCM
```
**After** (Codec2 tiers): unchanged. RaptorQ block encoding + classical Codec2 decode path stay exactly as they are today.
### New per-profile protection matrix
| Profile | Codec | Inband FEC | RaptorQ ratio | DRED duration | Total overhead |
|---|---|---|---|---|---|
| `STUDIO_64K` | Opus 64k | **off** | **none** | **10 frames (100 ms)** | +1 kbps |
| `STUDIO_48K` | Opus 48k | **off** | **none** | **10 frames (100 ms)** | +1 kbps |
| `STUDIO_32K` | Opus 32k | **off** | **none** | **10 frames (100 ms)** | +1 kbps |
| `GOOD` | Opus 24k | **off** | **none** | **20 frames (200 ms)** | +1 kbps |
| `NORMAL_16K` | Opus 16k | **off** | **none** | **20 frames (200 ms)** | +1 kbps |
| `DEGRADED` | Opus 6k | **off** | **none** | **50 frames (500 ms)** | +1 kbps |
| `CODEC2_3200` | Codec2 3200 | N/A | **0.5 (unchanged)** | N/A | +50% |
| `CATASTROPHIC` | Codec2 1200 | N/A | **1.0 (unchanged)** | N/A | +100% |
| `COMFORT_NOISE` | CN | — | — | — | — |
DRED duration rationale:
- **Studio tiers (100 ms)**: loss is rare on the networks where users pick studio quality. Short DRED window keeps decode-side CPU modest. Still covers multi-frame bursts that classical PLC can't touch.
- **Normal tiers (200 ms)**: balanced baseline. Handles the common VoIP loss pattern (20150 ms bursts from wifi roam, transient congestion).
- **Degraded tier (500 ms)**: users on Opus 6k are by definition on a bad link. Long DRED window buys maximum burst resilience where it matters most. Still well under the 1040 ms cap.
### Runtime escape hatch
Ship with a single environment variable / settings flag: **`AUDIO_USE_LEGACY_FEC`**. When set, the entire Opus-tier path reverts to the pre-PRD behavior: RaptorQ re-enabled at the old ratios, Opus inband FEC re-enabled, DRED disabled (`set_dred_duration(0)`). This is the rollback safety valve for the first production window.
Escape hatch semantics:
- Read once at `CallEncoder::new` / `CallDecoder::new` time. Call-scoped, not re-read mid-call.
- Exposed via Android Settings UI as a hidden "Legacy FEC (debug)" toggle, and as a CLI flag `--legacy-fec` on the desktop client.
- Logged in `DebugReporter` so we can tell which mode a call was in when diagnosing.
- Removed entirely after 2 months of stable production with no regressions reported. Removal is a follow-up PR, not part of this PRD's scope.
## Detailed design
### Phase 0 — FFI crate swap (prerequisite, no behavior change)
**Files touched:**
- `Cargo.toml` (workspace root) — replace `audiopus = "0.3.0-rc.0"` with `opusic-c = { version = "1.5.5", features = ["bundled", "dred"] }` and `opusic-sys = { version = "0.6.0", features = ["bundled"] }`. The `opusic-sys` direct dep is for the DRED decoder path below.
- `crates/wzp-codec/Cargo.toml` — update `audiopus = { workspace = true }` to `opusic-c = { workspace = true }`, add `opusic-sys = { workspace = true }`, add `bytemuck = "1"` for the i16↔u16 slice cast.
- `crates/wzp-codec/src/opus_enc.rs` — rewrite against opusic-c. API mapping:
- `audiopus::coder::Encoder::new(SampleRate::Hz48000, Channels::Mono, Application::Voip)``opusic_c::Encoder::new(Channels::Mono, SampleRate::Hz48000, Application::Voip)` (argument order swapped)
- `set_bitrate(Bitrate::BitsPerSecond(bps))``set_bitrate(Bitrate::Bits(bps))` or equivalent variant — verify at implementation time
- `set_inband_fec(true/false)``set_inband_fec(InbandFec::On/Off)` (now an enum)
- `set_packet_loss_perc(u8)``set_packet_loss(u8)` (method renamed)
- `set_dtx(bool)`, `set_signal(Signal::Voice)`, `set_complexity(u8)` — names match
- `encode(&[i16], &mut [u8])``encode_to_slice(&[u16], &mut [u8])` with `bytemuck::cast_slice::<i16, u16>(pcm)` at the call site
- `crates/wzp-codec/src/opus_dec.rs` — same-style rewrite for the `Decoder` path. Note that opusic-c's decoder methods take `decode_fec: bool` as a parameter directly (not a separate ctl).
- `vendor/audiopus_sys/` — delete the directory (only exists on `feat/desktop-audio-rewrite`, not on `android-rewrite`, so this is a no-op on the current branch but do remove the `[patch.crates-io]` block from Cargo.toml when merging back).
**Acceptance criteria:**
- `cargo check --workspace` passes on Linux x86_64, macOS, and Android NDK cross-compile.
- All existing codec unit tests in `crates/wzp-codec/src/adaptive.rs` pass unchanged. DRED is still disabled at this phase (default `set_dred_duration(0)`), so behavior is equivalent to pre-swap libopus 1.3 for call quality purposes.
- A short real-call smoke test produces audio identical to current behavior (no audible regression).
- `opusic_c::version()` at startup logs libopus version containing `1.5.2` — hard signal that the swap landed correctly.
### Phase 1 — DRED encoder enable on all Opus profiles
**Files touched:**
- `crates/wzp-codec/src/opus_enc.rs`:
- Add `fn dred_duration_for(codec: CodecId) -> u8` returning the per-profile value from the matrix above (10 / 20 / 50 frames).
- In `OpusEncoder::new`, after the existing `set_bitrate`/`set_signal`/`set_complexity` block: call `inner.set_inband_fec(InbandFec::Off)`, then `inner.set_dred_duration(dred_duration_for(profile.codec))`, then `inner.set_packet_loss(5)` as the default floor.
- Add `pub fn set_dred_duration(&mut self, frames: u8)` to allow the adaptive ladder to update DRED duration on profile switch.
- In the existing `set_profile` impl, call `set_dred_duration(dred_duration_for(profile.codec))` after `apply_bitrate`.
- `crates/wzp-codec/src/adaptive.rs`:
- `AdaptiveEncoder::set_profile` already delegates to `self.opus.set_profile` — no changes needed. DRED update rides along.
- `crates/wzp-client/src/call.rs` (and equivalent on `wzp-android/src/pipeline.rs`):
- In the `QualityReport` handler (wherever we currently call `set_expected_loss` / `set_packet_loss_perc`), also ensure the loss value is floored at 5% before passing to the Opus encoder. This is a 1-line change.
**Acceptance criteria:**
- Encoder produces DRED-enabled Opus packets. Verifiable via libopus's reference decoder in debug mode, or by wire capture + inspection — a DRED-bearing Opus packet has a larger `opus_packet_get_nb_frames` footprint than a non-DRED one of the same nominal bitrate.
- Total outgoing bitrate on Opus 24k is ~25 kbps (up from ~24 kbps) — confirms ~1 kbps DRED overhead.
- On a lossless path, decoder output is audibly identical to Phase 0.
- Escape hatch `AUDIO_USE_LEGACY_FEC=1` cleanly reverts the DRED enable (calls `set_dred_duration(0)` and `set_inband_fec(InbandFec::On)` instead).
### Phase 2 — RaptorQ removal on Opus tiers
**Files touched:**
- `crates/wzp-client/src/call.rs`:
- In `CallEncoder::encode_frame` (or wherever `wzp_fec::Encoder::add_source_symbol` is called), gate the RaptorQ path on `!profile.codec.is_opus()` — Opus frames go straight to DATAGRAM emit, Codec2 frames continue through RaptorQ.
- When a profile switch crosses the Opus↔Codec2 boundary, flush/reset the RaptorQ encoder state.
- `crates/wzp-android/src/pipeline.rs`:
- Mirror the same gate in the Android encode path.
- `crates/wzp-proto/src/packet.rs`:
- `MediaHeader.fec_block` and `fec_symbol` are still valid fields on the wire. For Opus packets we emit `fec_block = 0`, `fec_symbol = 0`, `fec_ratio_encoded = 0`. No wire format change; the receiver just sees all-zeros in the FEC fields for Opus packets and skips the FEC decoder path.
- Bump protocol version to v1 → v2? **No** — the change is semantically backward compatible because existing RaptorQ decoders handle a zero ratio correctly (ratio 0.0 means "no repair symbols expected"). Old receivers can still decode new Opus packets; they just won't see any DRED benefit because their libopus is old. This is a property we want: the opposite (new receiver, old sender) is the more common mixed-version case during rollout and also Just Works.
- `crates/wzp-client/src/call.rs``CallDecoder`:
- Symmetric change: Opus frames bypass the RaptorQ block assembly, go straight to the decoder. Only Codec2 frames (`codec_id.is_codec2()`) feed through `wzp-fec` block decoding.
**Acceptance criteria:**
- Outgoing Opus packets have `fec_ratio_encoded == 0` (verifiable with the existing wire capture tooling in `wzp-client/src/echo_test.rs`).
- On a clean network, receiver latency (measured as encode-to-playout one-way delay) drops by ~40 ms versus Phase 1. This is the primary win and should be directly measurable with the existing telemetry.
- Codec2 calls show no latency change and no packet-format change. Regression-test Codec2 3200 and Codec2 1200 specifically.
- Total outgoing bitrate on Opus 24k drops from ~28.8 kbps (24k base + 0.2 RaptorQ ratio) to ~25 kbps (24k base + ~1 kbps DRED). Direct savings observable in network telemetry.
### Phase 3 — DRED reconstruction wrapper + jitter buffer lookahead/backfill refactor
This phase is larger than originally estimated because opusic-c's decoder-side DRED wrapper is unusable for our architecture (see Background). We write our own safe wrapper over `opusic-sys` raw FFI first, then plumb it through the jitter buffer.
**Step 3a — Safe DRED reconstruction wrapper in `wzp-codec`:**
New file `crates/wzp-codec/src/dred_ffi.rs`. Wraps the raw libopus 1.5 DRED API:
- `pub struct DredState` — owns an `OpusDRED` buffer (allocated via `opusic_sys::opus_dred_alloc` or equivalent; size is fixed at 10,592 bytes per libopus 1.5). `Clone` is intentionally NOT implemented — the state is heap-owned and non-trivial to copy.
- `pub fn parse_from_packet(&mut self, decoder: &opusic_c::Decoder, packet: &[u8], max_dred_samples: i32) -> Result<DredParseResult, DredError>` — wraps `opus_dred_parse`, preserves the `dred_end` output (number of samples of history the packet carried), returns it in `DredParseResult { samples_available: i32, frames_available: u8 }`.
- `pub fn reconstruct_into(&self, decoder: &mut opusic_c::Decoder, dred_offset_samples: i32, output: &mut [i16]) -> Result<usize, DredError>` — wraps `opus_decoder_dred_decode`, takes the offset explicitly, decodes `output.len()` samples starting from that offset in the DRED window.
- All `unsafe` contained here, strict bounds checking on offsets, Rust-level panic safety. Unit tests use a reference encoder + known-good reference decoder to verify that reconstruction at specific offsets produces expected output.
- Depends on `opusic-sys` directly and on `opusic-c::Decoder` for the decoder handle. The Decoder handle must be reachable as a raw pointer; opusic-c exposes this via an unstable internal or we wrap the pointer ourselves. **Verify at implementation time** — if opusic-c doesn't expose the raw decoder pointer safely, we create our own thin Decoder wrapper in `dred_ffi.rs` using raw opusic-sys, losing the convenience of opusic-c's decoder but keeping its encoder. This is the smaller-risk fallback.
New `pub trait DredReconstructor` in `wzp-codec/src/lib.rs`:
```rust
pub trait DredReconstructor: Send {
/// Parse DRED state from an arriving Opus packet into `state`.
/// Returns number of 48 kHz samples of history available, or 0 if the packet has no DRED.
fn parse(&mut self, state: &mut DredState, packet: &[u8]) -> Result<i32, DredError>;
/// Reconstruct `output.len()` samples from `state`, starting at the given
/// sample offset (measured from the end of the DRED window going backward).
fn reconstruct(&mut self, state: &DredState, offset_samples: i32, output: &mut [i16]) -> Result<usize, DredError>;
}
```
Implement `DredReconstructor` over the `dred_ffi::DredState` + opusic-c Decoder combination. This is the clean boundary the jitter buffer will talk to.
**Step 3b — Jitter buffer refactor in `crates/wzp-transport/src/jitter.rs`:**
- Current behavior: buffer waits a fixed number of frames of jitter before emitting; on a missing slot, after a timeout it gives up and signals the decoder to run `decode_lost()` (classical Opus PLC or Codec2 PLC).
- New behavior on Opus tiers: when a frame arrives (in-order or late), first call `DredReconstructor::parse` on it to update a rolling ring of `DredState` instances tagged with their originating sequence number. When a gap is detected (missing sequence number between last-emitted and current arrival), and the ring contains a `DredState` from a nearby packet that covers the gap's sample offset, call `DredReconstructor::reconstruct` with the correct offset to synthesize the missing frames, splice them into playout, then continue normal decode.
- If no DRED state covers the gap (e.g., gap too far back, or every nearby packet was dropped), fall through to classical PLC exactly as today. The classical path stays intact as the ultimate fallback.
- Codec2 packets bypass the entire DRED ring. They are not inspected for DRED state and take the unchanged classical PLC path.
- Ring sizing: `max_dred_duration_frames` + `jitter_depth_frames` worth of `DredState` instances. At 500 ms DRED on degraded tier + 60 ms jitter depth, that's ~28 DredState instances × 10,592 bytes ≈ 300 KB. Acceptable. On studio tier with 100 ms DRED it's only ~80 KB.
- The jitter buffer takes a `Box<dyn DredReconstructor>` at construction, passed in by the call engine. `wzp-transport` does NOT take a direct dep on `opusic-c` or `opusic-sys` — it only knows about the trait defined in `wzp-codec`.
**Files touched:**
- `crates/wzp-codec/src/dred_ffi.rs` (new, ~150300 lines)
- `crates/wzp-codec/src/lib.rs` — expose `DredReconstructor`, `DredState`, `DredError` types
- `crates/wzp-codec/Cargo.toml` — add `opusic-sys = { workspace = true }` as a direct dep (already done in Phase 0)
- `crates/wzp-transport/src/jitter.rs` — lookahead/backfill refactor, DRED ring
- `crates/wzp-transport/Cargo.toml` — add `wzp-codec = { workspace = true }` (likely already present) for the trait import
- `crates/wzp-client/src/call.rs` — construct a `DredReconstructor` and pass into `CallDecoder`'s jitter buffer
- `crates/wzp-android/src/pipeline.rs` — same on Android
**Acceptance criteria:**
- Unit tests in `dred_ffi.rs`: round-trip a known speech waveform through an encoder with DRED enabled, parse the resulting packets, reconstruct at several different offsets, verify the reconstructed samples are within an energy/spectral threshold of the original. (Not bit-exact — DRED reconstruction is lossy by design.)
- Synthetic loss test on the full pipeline: inject 200 ms bursts at 10% rate into a looped call, verify the DRED reconstruction rate on receiver telemetry is ≥95% of all loss events whose gaps fall within the configured DRED duration window.
- Reconstructed audio is audibly continuous on 40200 ms bursts — no gaps, no classical-PLC robot artifact. Verified on real voice samples (not just sine tones), and on at least two distinct speaker profiles (male, female) because DRED can have voice-dependent quality.
- End-to-end latency metric is unchanged versus Phase 2 (no regression from adding the lookahead path). The DRED ring insertion on packet arrival must be O(1) in practice.
- Existing `echo_test.rs` and `drift_test.rs` pass with the new jitter buffer.
- Codec2 path uses classical PLC exclusively (no DRED invocation) because Codec2 packets don't carry DRED state. Verify by injecting loss on a Codec2 call and confirming zero DRED reconstruction telemetry events during that call.
- `wzp-transport` has no direct dependency on `opusic-sys` or `opusic-c` in its `Cargo.toml` after the refactor — only on `wzp-codec`. Verify by grepping the Cargo.toml file.
### Phase 4 — Telemetry and tooling updates
**Files touched:**
- `crates/wzp-proto/src/packet.rs``QualityReport` or equivalent telemetry message gains `dred_reconstructions: u32` as a new counter (frames reconstructed via DRED this reporting window) and `classical_plc_invocations: u32` (frames filled by Opus/Codec2 classical PLC). These are separate counters because they're different recovery mechanisms.
- `crates/wzp-relay/src/*` — relay telemetry pipeline surfaces both counters in Prometheus metrics: `wzp_dred_reconstructions_total{call_id}`, `wzp_classical_plc_total{call_id}`.
- `docs/grafana-dashboard.json` — new panel: "Loss recovery breakdown" stacked bar, DRED vs classical PLC vs clean decode, per call.
- `android/app/src/main/java/com/wzp/debug/DebugReporter.kt` — surfaces `dredReconstructions` and `classicalPlc` counts in the debug report; also logs active DRED duration and whether legacy-FEC mode is engaged.
**Acceptance criteria:**
- Grafana dashboard shows a clear visual distinction between DRED-recovered and classical-PLC-recovered frames across a test fleet of calls.
- Debug report includes the active protection mode ("DRED 200 ms" / "Legacy RaptorQ") and reconstruction counts, so incidents can be classified unambiguously.
### Phase 5 — Escape hatch removal (follow-up, ~2 months post-ship)
After 2 months of stable production with no rollbacks triggered:
- Delete `AUDIO_USE_LEGACY_FEC` handling in `opus_enc.rs` / `call.rs` / `pipeline.rs`
- Delete the Opus-tier paths of `wzp-fec` (the crate stays for Codec2)
- Delete the Android settings toggle and desktop CLI flag
- Remove the `--legacy-fec` path from smoke tests
## Critical files to modify (summary)
- `Cargo.toml` (workspace) — dep swap (audiopus → opusic-c + opusic-sys)
- `crates/wzp-codec/Cargo.toml` — dep swap + `bytemuck` for slice cast
- `crates/wzp-codec/src/opus_enc.rs` — opusic-c rewrite + DRED enable + inband FEC off
- `crates/wzp-codec/src/opus_dec.rs` — opusic-c rewrite
- `crates/wzp-codec/src/dred_ffi.rs`**new file**, safe wrapper over opusic-sys raw DRED FFI
- `crates/wzp-codec/src/lib.rs` — expose `DredReconstructor` trait, `DredState`, `DredError`
- `crates/wzp-codec/src/adaptive.rs` — verify profile switch carries DRED duration
- `crates/wzp-client/src/call.rs` — Opus/Codec2 gate on RaptorQ path, loss floor, wire DredReconstructor into CallDecoder
- `crates/wzp-android/src/pipeline.rs` — same gate, same loss floor, wire DredReconstructor
- `crates/wzp-transport/src/jitter.rs` — lookahead/backfill refactor, DRED ring, reconstruction dispatch
- `crates/wzp-transport/Cargo.toml` — verify it depends only on `wzp-codec`, not directly on opusic-*
- `crates/wzp-proto/src/packet.rs` — new telemetry counters
- `crates/wzp-relay/` — Prometheus metric exposure
- `android/app/src/main/java/com/wzp/debug/DebugReporter.kt` — debug output
- `docs/grafana-dashboard.json` — loss-recovery panel
- (delete) `vendor/audiopus_sys/` on `feat/desktop-audio-rewrite` when merging back
## Existing utilities to reuse
- `wzp_codec::resample::Downsampler48to8` / `Upsampler8to48` — unchanged, only Codec2 path uses them
- `wzp_codec::adaptive::AdaptiveEncoder` / `AdaptiveDecoder` — existing profile-switching machinery, DRED duration changes ride along
- `wzp_codec::silence::SilenceDetector` / `ComfortNoise` — unchanged
- `wzp_codec::agc::AutoGainControl` — unchanged, runs before encode as today
- `wzp_fec::RaptorQFecEncoder` / decoder — unchanged, still used for Codec2 tiers
- `wzp_client::call::QualityAdapter` — unchanged; drives profile switching, which now also reconfigures DRED duration via the existing `set_profile` path
## Verification
End-to-end testing, in order:
1. **Unit**: `cargo test -p wzp-codec` — Opus encode/decode round-trip at every profile, DRED enabled. Verify `version()` reports libopus 1.5.2.
2. **Unit**: `cargo test -p wzp-transport` — jitter buffer lookahead/backfill behavior with injected loss patterns (0%, 5%, 15%, 30%, 50% loss; isolated losses, 40 ms bursts, 200 ms bursts, 500 ms bursts).
3. **Integration**: `crates/wzp-client/src/echo_test.rs` — existing echo test must pass on all Opus profiles with <5% perceived quality regression (measure via the time-window analysis already built into `echo_test.rs`).
4. **Integration**: `crates/wzp-client/src/drift_test.rs` — latency measurement. Must show ~40 ms reduction on Opus profiles versus pre-PRD baseline. Codec2 profiles unchanged.
5. **Manual**: Android release build, real call over bad wifi (or a shaped network via `tc netem` on Linux). Burst losses of 200 ms should be perceptually continuous speech, not robotic gaps.
6. **Manual**: Same call with `AUDIO_USE_LEGACY_FEC=1` — verify behavior reverts to current production behavior. This is the pre-ship rollback rehearsal.
7. **Cross-compile**: full build matrix — Android arm64-v8a + armeabi-v7a (via `scripts/build-and-notify.sh`), macOS universal, Linux x86_64 (via `scripts/build-linux-docker.sh`). Windows cross-compile via cargo-xwin should also pass — libopus 1.5 upstream fixed the clang-cl SIMD issue that required the vendor patch on `feat/desktop-audio-rewrite`.
8. **Telemetry smoke**: deploy to staging relay, make 10 test calls, verify Grafana's new "Loss recovery breakdown" panel shows DRED reconstruction events firing on injected loss and classical-PLC on packet-loss beyond DRED's window.
## Risks and mitigations
- **Custom DRED FFI wrapper is WZP-maintained code with no second source.** opusic-c's decoder-side DRED wrapper is insufficient (see Background), so we carry our own `dred_ffi.rs` that calls `opus_dred_parse` and `opus_decoder_dred_decode` directly via opusic-sys. Bugs in this wrapper — offset arithmetic off-by-ones, lifetime errors on `OpusDRED` buffers, UB from misuse of the C API — could manifest as silent audio corruption on loss bursts, hard to diagnose. **Mitigation**: extensive unit tests in `dred_ffi.rs` using a reference encoder + reference decoder round-trip with known offsets; strict bounds checking on every `unsafe` boundary; Miri run in CI if feasible; the legacy-FEC escape hatch disables the entire DRED code path including our custom wrapper, giving us a single flag to revert any wrapper bug in production. Long-term: upstream the fixes to opusic-c (follow-up task, not blocking).
- **opusic-c's encoder-side API and internal Decoder pointer access**. Step 3a depends on being able to call opusic-sys raw functions that take an `*mut OpusDecoder` pointer while still using opusic-c's `Decoder` for normal decode. If opusic-c doesn't expose the raw pointer cleanly, we fall back to a thin opusic-sys-direct Decoder wrapper inside `dred_ffi.rs` and lose some of opusic-c's convenience. **Mitigation**: verify at the start of Phase 3 (one afternoon of reading opusic-c source). If the clean path doesn't work, the fallback is not difficult — it's what we'd have built anyway if opusic-c didn't exist.
- **DRED reconstruction quality varies by voice / content**. The neural model is trained on speech; edge cases (shouting, whispering, heavy accents, music-on-hold, cough, laughter) may reconstruct less cleanly than continuous speech. **Mitigation**: escape hatch ships from day one. If production telemetry shows perceptible quality regression on specific voice patterns, flip legacy mode for affected users while tuning. Also: classical Opus PLC remains as the third-tier fallback when DRED state is unavailable.
- **Removing RaptorQ removes bit-exact recovery**. Isolated single-packet losses are now reconstructed plausibly instead of bit-exactly. **Mitigation**: as argued in Background, bit-exactness on a single 20 ms speech frame is perceptually meaningless. The assumption is "speech is the workload" — if we ever add non-speech features (music bot, ringtones over the call path, DTMF-over-audio) we revisit.
- **libopus 1.5 DRED API stability**. **Verified at pre-flight**: opus.h in the upstream xiph/opus repository has no "experimental" marker on the DRED API declarations. The earlier characterization was incorrect. DRED shipped as a first-class feature in libopus 1.5.0 (Dec 2023) and has been iterated in 1.5.1 and 1.5.2. Google Meet and Duo ship it at scale. **Mitigation**: pin `opusic-sys` exactly (no `^` range) to ensure reproducible builds, follow upstream 1.5.x bugfixes as they land. No special stability concerns beyond normal dependency hygiene.
- **Jitter buffer refactor is the largest code change**. Jitter bugs are notoriously subtle (off-by-one on sequence wraparound, clock drift interactions, playout starvation corner cases). **Mitigation**: keep the classical-PLC path intact as the DRED fallback, so jitter bugs degrade to "current behavior" rather than "broken audio". Write targeted unit tests for the buffer at each loss-pattern scenario before touching production paths. Consider shipping Phase 3 behind a sub-flag separate from the main escape hatch, so we can independently toggle "DRED enabled but classical jitter buffer" for bisection.
- **Cross-compile surprises**. `opusic-sys` is actively maintained but our exact combination of Android NDK version / Docker builder environment / Windows cross-compile via cargo-xwin has not been tested by upstream. **Mitigation**: Phase 0 includes the full cross-compile matrix as an acceptance criterion. Any blockers surface before we touch loss-recovery behavior.
- **Wire-format compatibility during rollout**. Mixed-version calls (new sender + old receiver, or vice versa) need to keep working. **Verified at pre-flight**: traced both live receive paths (`wzp-client/src/call.rs::CallDecoder::ingest` and `wzp-android/src/engine.rs` the JNI-driven engine path), and both degrade gracefully: new-sender Opus packets with `fec_ratio_encoded=0` / `fec_block=0` / `fec_symbol=0` flow through to the jitter buffer and decode normally on old receivers. The RaptorQ decoder either ignores zero-FEC packets entirely (Android pipeline.rs gates on non-zero fec_block/fec_symbol) or accumulates them harmlessly until the 2-second staleness eviction (desktop call.rs). Old-sender packets with populated RaptorQ fields are handled by new receivers via the unchanged Codec2 path (new receivers keep wzp-fec for Codec2 tiers and simply ignore RaptorQ fields on Opus packets). **No wire format version bump required.**
- **Pre-existing desktop RaptorQ gap** (incidental finding, NOT caused by this PRD). The desktop `wzp-client/src/call.rs::CallDecoder` feeds packets into `fec_dec.add_symbol` but **never calls `fec_dec.try_decode`** — RaptorQ recovery is effectively dead code on the desktop path today. Main decode reads from the jitter buffer directly, falling through to classical Opus PLC on missing packets. The Android `engine.rs` path properly uses `try_decode` for recovery. This PRD does not fix the desktop gap — it's unrelated — but is noted here so nobody is surprised that removing RaptorQ from Opus tiers on the desktop client causes no measurable recovery regression (there was nothing to lose). Recommend filing a follow-up task to either fix or remove the vestigial desktop RaptorQ wiring independently of this work.
- **`AUDIO_USE_LEGACY_FEC` itself becoming permanent tech debt**. Escape hatches have a way of outliving their intended lifespan. **Mitigation**: put an explicit removal date in a `// TODO(2026-06-15): remove legacy FEC path` comment at the flag-handling site. Track in taskmaster.
## Open questions
- ~~**Does opusic-c expose `opusic_c::Decoder`'s raw inner pointer?**~~ **Resolved at pre-flight**: no, it's `pub(crate)`. We build a unified `DecoderHandle` over raw opusic-sys in `dred_ffi.rs` and use it for both normal decode and DRED reconstruction. Opusic-c is used only for the encoder side.
- **Exact opusic-sys symbol name for DRED decoder allocation**. opus.h documents the `OpusDREDDecoder` type and `opus_dred_parse`/`opus_decoder_dred_decode` functions, but the allocation function name is not in the fetched snippet. Expected to be `opus_dred_decoder_create` / `opus_dred_decoder_destroy` per libopus naming convention, but confirm at the very start of Phase 3a by reading the actual opusic-sys bindings. If the function is not exported by opusic-sys, we file a PR upstream to opusic-sys (small fix, trivially mergeable) and temporarily vendor the function declaration locally.
- **Should the 5% loss floor be configurable per profile?** Currently specified as a constant. A future refinement might make it higher at degraded tiers and lower at studio tiers, but without real telemetry we don't know if the constant is wrong. Keep as a constant for now, revisit after 1 month of production data.
- **OSCE enable**: opusic-c has an `osce` feature flag for Opus Speech Coding Enhancement, a separate libopus 1.5 neural post-processor. Out of scope for this PRD but should be the next audio-quality follow-up. Probably one-line enable once opusic-c is in.
- **Upstream PR to opusic-c**: our own `dred_ffi.rs` wrapper should be proven in production first, then the fixes upstreamed to `opusic-c/src/dred.rs` (preserve `dred_end`, fix `dred_offset` double-pass, expose `DredPacket` externally). Follow-up task, not blocking this PRD.
- **`feat/desktop-audio-rewrite` merge**: the vendored `audiopus_sys` patch on that branch becomes obsolete under this PRD. Coordinate removal with whoever owns that branch.

View File

@@ -1,141 +0,0 @@
# PRD: Local Recording + Cloud Mixer for Podcast-Quality Interviews
## Problem
WarzonePhone delivers real-time encrypted voice, but the audio quality is limited by network conditions (codec compression, packet loss, jitter). Podcasters and interviewers need pristine, studio-grade recordings of each participant — independent of what the network delivers.
## Solution
**Dual-path architecture**: each client simultaneously (1) participates in the live call at whatever codec quality the network supports, and (2) records their own microphone locally as lossless PCM. After the session, all local recordings are uploaded to a self-hosted mixer service that aligns, normalizes, and outputs a final multi-track or mixed file.
## Architecture
```
┌──────────────────┐
Mic ──┬── Opus/Codec2 ──► Network (live) │ ← real-time call
│ └──────────────────┘
└── WAV 48kHz ────► Local File │ ← pristine recording
(timestamped)
▼ (after hangup)
┌──────────────────┐
│ Mixer Service │ ← self-hosted
│ (align + mix) │
└──────────────────┘
Final MP3/WAV/FLAC
```
## Requirements
### Phase 1: Local Recording (MVP)
**All clients (Desktop, Android, Web):**
1. **Record toggle**: User can enable "Record this call" before or during a call
2. **Recording pipeline**: Tap raw PCM from the microphone capture path *before* it enters the codec encoder
3. **File format**: WAV (48kHz, 16-bit, mono) — simple, universally supported, lossless
4. **Sync markers**: Embed a monotonic timestamp (ms since call start) at the beginning of the recording, and periodically (every 10s) write a sync marker packet into a sidecar JSON file:
```json
{"ts_ms": 30000, "seq": 1500, "wall_clock_utc": "2026-04-07T12:00:30Z"}
```
This allows the mixer to align recordings from different participants even if they join at different times.
5. **Storage**:
- Desktop: `~/.wzp/recordings/{room}_{timestamp}.wav`
- Android: `Documents/WarzonePhone/{room}_{timestamp}.wav`
- Web: IndexedDB blob or File System Access API
6. **File size estimate**: 48kHz * 16-bit * mono = 96 KB/s = ~5.6 MB/min = ~345 MB/hour
7. **UI indicator**: Red dot + timer showing recording is active and file size growing
8. **On hangup**: Close the WAV file, show "Recording saved" with file path/size
### Phase 2: Upload to Mixer
1. **Upload endpoint**: Self-hosted HTTP service (Rust or Go) that accepts WAV uploads with metadata
2. **Chunked/resumable upload**: Large files need resumable uploads (tus protocol or simple chunked POST)
3. **Upload metadata**:
```json
{
"session_id": "uuid",
"participant_fingerprint": "xxxx:xxxx:...",
"alias": "Alice",
"room": "podcast-ep-42",
"duration_secs": 3600,
"sync_markers": [...],
"sample_rate": 48000,
"channels": 1,
"bit_depth": 16
}
```
4. **Upload UI**: Progress bar after hangup, option to upload now or later
5. **Retry on failure**: Queue uploads for retry if network is unavailable
### Phase 3: Mixer Service
1. **Alignment**: Use sync markers (wall clock + sequence numbers) to align recordings from all participants to a common timeline
2. **Silence trimming**: Detect and optionally trim leading/trailing silence
3. **Normalization**: Per-track loudness normalization (LUFS-based)
4. **Noise reduction**: Optional per-track noise gate or RNNoise pass
5. **Output formats**:
- Multi-track: ZIP of individual WAVs (aligned, normalized)
- Mixed: Single stereo or mono WAV/MP3/FLAC with all participants
- Podcast-ready: Loudness-normalized to -16 LUFS (podcast standard)
6. **Web UI**: Simple dashboard to see sessions, download outputs, preview waveforms
7. **Self-hosted**: Docker image, single binary, SQLite for metadata
## Implementation Notes
### Recording tap point
The recording must tap *after* AGC (so levels are normalized) but *before* the codec encoder (to avoid compression artifacts). In the current architecture:
```
Mic → Ring Buffer → AGC → [TAP HERE for recording] → Opus/Codec2 → Network
```
**Desktop** (`engine.rs`): After `capture_agc.process_frame()`, before `encoder.encode()`
**Android** (`engine.rs`): Same location — after AGC, before encode
**CLI** (`call.rs`): After `self.agc.process_frame()` in `CallEncoder::encode_frame()`
### WAV writer
Use a simple streaming WAV writer that:
- Writes the WAV header with placeholder data length
- Appends PCM samples as they come
- On close, seeks back to update the data length in the header
### Sync mechanism
Wall-clock UTC alone is insufficient (clocks drift). The sync strategy:
1. Each participant records their local monotonic time + wall clock at call start
2. Periodically (every 10s), each participant writes: `{local_mono_ms, seq_number, utc_iso}`
3. The mixer uses sequence numbers (which are shared via the wire protocol) as ground truth for alignment, with wall clock as a fallback
### Privacy
- Local recordings never leave the device without explicit user action
- Upload is manual, not automatic
- The mixer service processes files and can delete originals after mixing
- No recording data flows through the relay — only the user's own mic
## Non-Goals (v1)
- Live transcription (future)
- Video recording (audio only)
- Automatic upload without user consent
- Recording other participants' audio (only your own mic)
- Real-time mixing (post-session only)
## Milestones
| Phase | Scope | Effort |
|-------|-------|--------|
| 1a | Local WAV recording on Desktop | 1-2 days |
| 1b | Local WAV recording on Android | 1-2 days |
| 1c | Sync markers + metadata sidecar | 1 day |
| 2a | Upload service (HTTP + storage) | 2-3 days |
| 2b | Upload UI in clients | 1-2 days |
| 3a | Mixer: alignment + normalization | 2-3 days |
| 3b | Mixer: web dashboard | 2-3 days |
| 3c | Docker packaging | 1 day |

View File

@@ -1,56 +0,0 @@
# PRD: Studio Quality Tiers (Opus 32k/48k/64k)
## Status: Implemented
Studio quality tiers have been added to the wire protocol and all clients.
## What Was Added
### Wire Protocol (codec_id.rs)
Three new `CodecId` variants using the 4-bit header space (values 6-8):
| CodecId | Wire Value | Bitrate | Frame | Use Case |
|---------|-----------|---------|-------|----------|
| Opus32k | 6 | 32 kbps | 20ms | Studio low — noticeable improvement over 24k for voice |
| Opus48k | 7 | 48 kbps | 20ms | Studio — excellent voice, captures nuance |
| Opus64k | 8 | 64 kbps | 20ms | Studio high — near-transparent quality |
### Quality Profiles
| Profile | Codec | FEC | Bandwidth (with FEC) |
|---------|-------|-----|---------------------|
| STUDIO_32K | Opus 32k | 10% | ~35 kbps |
| STUDIO_48K | Opus 48k | 10% | ~53 kbps |
| STUDIO_64K | Opus 64k | 10% | ~70 kbps |
FEC is set to 10% (vs 20% for GOOD) — studio assumes a good network.
### Client Support
| Client | Selection | Status |
|--------|-----------|--------|
| Desktop (Tauri) | Quality slider in Settings (8 levels) | Done |
| CLI | `--profile studio-64k` / `studio-48k` / `studio-32k` | Done |
| Android | Needs codec picker update in SettingsScreen.kt | TODO |
| Web | Needs UI | TODO |
### Cross-Codec Interop
All decoder auto-switch paths (call.rs, desktop engine.rs) handle the new codec IDs. A studio-64k client can talk to a codec2-1200 client — the receiver auto-switches.
## When to Use Studio Tiers
- **Podcast recording sessions**: Use studio-64k for best quality (combined with local WAV recording for pristine output)
- **Music collaboration**: Opus at 48-64k captures instrument harmonics much better than 24k
- **Good network conditions**: Only useful when bandwidth isn't constrained; the extra bits are wasted on lossy networks
## When NOT to Use
- **Mobile data**: Stick with Auto/GOOD — studio tiers use 2-3x the bandwidth
- **High packet loss**: Studio profiles use minimal FEC (10%); degraded networks need DEGRADED or CATASTROPHIC profiles with 50-100% FEC
- **Large group calls**: Each participant's stream multiplies bandwidth; 64k * 10 participants = 640 kbps incoming
## Backward Compatibility
Old clients (before this change) will receive packets with CodecId 6/7/8 which they don't recognize. The `from_wire()` returns `None` for unknown values, causing the packet to be dropped. Old clients can still *send* to new clients fine (they use CodecId 0-5). This is acceptable for a pre-release protocol.

View File

@@ -0,0 +1,431 @@
# Incident report — Tauri Android `__init_tcb+4` SIGSEGV
**Status:** Blocked. Reproducible crash with a known trigger at the cc::Build /
rustc-link-lib layer that we cannot yet explain. Writing this report to hand
off for external help.
**Project:** WarzonePhone (Rust + Tauri 2.x Mobile) Android rewrite
**Branch:** `feat/desktop-audio-rewrite`
**Target phone:** Pixel 6 (`oriole`), Android 16 (`BP3A.250905.014`), arm64-v8a
**Date range of investigation:** 2026-04-09 (one working session, ~27 builds)
---
## One-paragraph summary
We're porting the existing CPAL-backed desktop Tauri app (`desktop/src-tauri`)
to Tauri Mobile Android so the same Rust + Tauri + WebView codebase runs on
both platforms. The Android `.apk` launches, renders the home screen, and
registers on a relay for signal-only builds (no audio backend). The moment
we add **any** `cc::Build::new().cpp(true).cpp_link_stdlib("c++_shared")`
call to `build.rs` — even with a 6-line cpp file that just returns 42 and is
never called from Rust — the built `.so` crashes at launch inside
`__init_tcb(bionic_tcb*, pthread_internal_t*)+4` via `pthread_create` via
`std::thread::spawn` via `tao::ndk_glue::create` via
`Java_com_wzp_desktop_WryActivity_create`, before our Rust entry point has
a chance to run. The exact same NDK, exact same Rust toolchain, exact same
Docker image is used by the legacy `wzp-android` crate (via `cargo-ndk`)
which compiles Oboe and runs fine on the same phone.
---
## Environment
**Docker build image:** `wzp-android-builder` (Dockerfile at
`scripts/Dockerfile.android-builder`)
- Base: `debian:bookworm`
- JDK 17
- Android SDK:
- cmdline-tools latest
- `platforms;android-34`, `platforms;android-36`
- `build-tools;34.0.0`, `build-tools;35.0.0`
- `ndk;26.1.10909125` (last stable before scudo/MTE crash on NDK r27+)
- `platform-tools`
- Node.js 20 LTS
- Rust stable `1.94.1 (e408947bf 2026-03-25)`
- Rust android targets: `aarch64-linux-android`, `armv7-linux-androideabi`,
`i686-linux-android`, `x86_64-linux-android`
- `cargo-ndk` + `cargo tauri-cli 2.10.1` (latest 2.x)
**Host:** Docker on `SepehrHomeserverdk` (remote build server).
**Phone:** Pixel 6, Android 16, kernel 6.1.134-android14-11, on the same LAN
as the build machine and a local `wzp-relay` binary.
**Tauri crate:** `desktop/src-tauri/` in the workspace at the root of the
repo. Depends on `tauri = "2"`, `tauri-plugin-shell = "2"`, `tokio`, `rustls`,
`wzp-proto`, `wzp-codec`, `wzp-fec`, `wzp-crypto`, `wzp-transport`, and (on
non-Android only) `wzp-client` with `features = ["audio", "vpio"]`. The
crate's `[lib]` section is:
```toml
[lib]
name = "wzp_desktop_lib"
crate-type = ["staticlib", "cdylib", "rlib"]
```
The crate produces `libwzp_desktop_lib.so` which is `System.loadLibrary`'d by
Tauri's generated `WryActivity.onCreate` via JNI.
---
## The crash
Every failing build produces the same stack at launch, same pc offsets:
```
signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x00000072XXXXXX00f (write)
#00 pc 000000000130cc74 libwzp_desktop_lib.so (__init_tcb(bionic_tcb*, pthread_internal_t*)+4)
#01 pc 0000000001331cf0 libwzp_desktop_lib.so (pthread_create+360)
#02 pc 00000000012bee04 libwzp_desktop_lib.so (std::sys::thread::unix::Thread::new::h87be8e9feeaaaf84+184)
#03 pc 0000000000e37f5c libwzp_desktop_lib.so (std::thread::lifecycle::spawn_unchecked::h941f828f9a95150d+1504)
#04 pc 0000000000e461e8 libwzp_desktop_lib.so (std::thread::builder::Builder::spawn_unchecked::hec5f087680cb0248+112)
#05 pc 0000000000e441c8 libwzp_desktop_lib.so (std::thread::functions::spawn::ha3d3fbf2d9fe53e3+108)
#06 pc ... libwzp_desktop_lib.so (tao::platform_impl::platform::ndk_glue::create::h254c68662718841a+1792)
#07 pc ... libwzp_desktop_lib.so (Java_com_wzp_desktop_WryActivity_create+76)
```
The offsets are **byte-identical across every failing build**, even when the
cpp content changes drastically (cf. `cpp_smoke.cpp` at 6 lines, 20 lines,
200+ Oboe source files). We believe this is because cargo caches the Rust
compilation unit and only the build-script artifacts differ, and the final
link produces the same layout.
`__init_tcb` is defined locally inside our `.so` with C++ mangling:
```
_Z10__init_tcbP10bionic_tcbP18pthread_internal_t
```
It originates from bionic's `pthread_create.cpp`, which got pulled in
statically from the NDK's `sysroot/usr/lib/aarch64-linux-android/libc.a`.
Both failing and known-good (legacy `wzp_android.so`) builds contain this
same static symbol — the presence of the symbol is not the problem.
Fault address `0x72XXXXXX00f` with code `SEGV_ACCERR` (access permission
error, write). Aligned to `+4` inside `__init_tcb`, which is typically a
store into the passed-in `bionic_tcb*`. The pointer is either NULL-ish or
pointing into read-only memory.
---
## Bisection (the important part)
We started from a known-good commit (`5309938`) where the Tauri Android app
launches, registers on a relay, and behaves identically to the desktop app
modulo audio. Then we added features **one variable at a time**:
| Step | Commit | Change vs previous | Result |
|---|---|---|---|
| Baseline | `5309938` | — | ✅ launches, renders home, registers on relay |
| **A** | `f96d7ce` | Add `cc = "1"` build-dep + compile trivial `cpp/hello.c` via `cc::Build` (C, not C++). Static lib never linked in. | ✅ |
| **B** | `ae4f366` | Add `wzp-client` Android dep with `default-features = false` (no CPAL, no VPIO). No new imports. | ✅ |
| **C** | `19fd3dd` | Un-cfg-gate `mod engine;` in `lib.rs` so `engine.rs` compiles on Android. `CallEngine::start()` has an Android stub returning an error. | ✅ |
| **D** | `a852cad` | Compile `cpp/getauxval_fix.c` (legacy wzp-android shim). Still pure C. | ✅ |
| **E** | `4250f1b` | **Compile full Oboe C++ bridge** (200+ source files from `google/oboe@1.8.1`). `cc::Build::new().cpp(true).std("c++17").cpp_link_stdlib(Some("c++_shared"))` + `-llog` + `-lOpenSLES` link directives. Nothing called from Rust yet — the `extern "C"` bridge functions are exported but never referenced from the Rust side. | ❌ **crash** |
| E.4 | `aa240c6` | **Only change:** replace the entire Oboe compile with ONE tiny `cpp_smoke.cpp` file: `extern "C" int wzp_cpp_smoke(void) { std::lock_guard<std::mutex> lk(m); std::thread t([](){...}); t.join(); return g.load(); }`. Still `cpp(true) + cpp_link_stdlib("c++_shared")`. Drop `-llog`/`-lOpenSLES`. | ❌ **same crash, same offsets** |
| E.2 | `0224ce6` | Shrink `cpp_smoke.cpp` further: just `std::atomic<int>` + `fetch_add`, no mutex, no thread, no includes beyond `<atomic>`. | ❌ **same crash, same offsets** |
| E.1 | `0d74366` | **Absolute minimum:** `cpp_smoke.cpp` = `extern "C" int wzp_cpp_hello(void){return 42;}`. NO `#include`. NO STL. Just a function. Still compiled with `cpp(true) + cpp_link_stdlib("c++_shared")`. | ❌ **same crash, same offsets** |
### Additional confirming observations
1. **The cpp code is dead-stripped.** `llvm-nm -a libwzp_desktop_lib.so` shows
zero matches for `wzp_cpp_hello`, `wzp_cpp_smoke`, or any Oboe symbol in
builds E through E.1. The static archive (`libwzp_cpp_smoke.a` /
`liboboe_bridge.a`) exists on disk under
`target/aarch64-linux-android/debug/build/wzp-desktop-*/out/`, but because
nothing in Rust ever references the exported C function, the final linker
drops it.
2. **`build.rs` link directives are the real delta.** `cc::Build::new()
.cpp(true).cpp_link_stdlib(Some("c++_shared"))` emits a
`cargo:rustc-link-lib=c++_shared` directive that adds a `NEEDED` entry for
`libc++_shared.so` to the final `.so`'s dynamic table. `readelf -d` on
the crashing `.so` shows:
```
NEEDED Shared library: [libc++_shared.so]
NEEDED Shared library: [liblog.so] (only in full Oboe build)
NEEDED Shared library: [libOpenSLES.so] (only in full Oboe build)
```
The working baseline `.so` has no `NEEDED` entries beyond libc/liblog.
3. **Linker version doesn't matter.** We tried forcing
`aarch64-linux-android26-clang` as the linker (API 26 has proper dynamic
bindings to libc.so's runtime `pthread_create`/`__init_tcb`) via three
different mechanisms:
- `CARGO_TARGET_AARCH64_LINUX_ANDROID_LINKER` env var in `docker run`
- `.cargo/config.toml` workspace-level linker override
- **Binary replacement inside the image**: `mv
aarch64-linux-android24-clang .orig` and replace with a shell script
that `exec`s `aarch64-linux-android26-clang`. Verified by calling
`--version` which prints `Target: aarch64-unknown-linux-android26`.
All three made no difference. The `__init_tcb` symbol is pulled statically
from the **same** `libc.a` regardless of which clang wrapper is used — the
NDK ships ONE `libc.a` at
`sysroot/usr/lib/aarch64-linux-android/libc.a` shared across all API
levels. Only the per-API `libc.so` symlinks change (and we're linked
statically, not dynamically, against libc).
4. **Legacy `wzp-android` crate works on the same phone, same image.** Run
in the exact same Docker container, the legacy Kotlin app's JNI library
(`crates/wzp-android` built via `cargo ndk`) compiles a subset of the
same Oboe code, produces a `.so` that has the same static
`_Z10__init_tcbP...` + `pthread_create` + `pthread_create.cpp` symbols,
and launches cleanly on the Pixel 6. Key differences between the two
build paths:
| | `wzp-android` (works) | `wzp-desktop` Tauri (crashes) |
|---|---|---|
| Build driver | `cargo ndk -t arm64-v8a build --release -p wzp-android` | `cargo tauri android build --debug --target aarch64 --apk` |
| Profile | release | debug (release crashes identically) |
| Linker | `aarch64-linux-android26-clang` (via `.cargo/config.toml` which cargo-ndk honors) | `aarch64-linux-android24-clang` (tauri-cli hardcodes and ignores config; the shim redirect makes no difference) |
| crate-type | `["cdylib", "rlib"]` | `["staticlib", "cdylib", "rlib"]` |
| JNI entrypoint | direct Kotlin `System.loadLibrary` + our own `native fun` declarations; first `pthread_create` runs later from the tokio runtime inside a command | `WryActivity.onCreate` via Tauri's generated Java glue; first `pthread_create` runs **inside the JNI call** via `tao::ndk_glue::create` |
| Other heavy deps | tokio, wzp-{proto,codec,fec,crypto,transport} | tokio, tauri, tauri-runtime-wry, tao, wry, webview2-com, soup3, webkit2gtk (all platform-specific ones cfg-gated out of android), and also all of the above |
| Binary size | `libwzp_android.so` ≈ 14 MB (release) | `libwzp_desktop_lib.so` ≈ 160 MB (debug), 16 MB (release) |
5. **The crash happens in the JNI-callback thread during `onCreate`.** Frame
#06 `tao::platform_impl::platform::ndk_glue::create+1792` is tao's Android
event-loop bootstrap, which Tauri calls from inside
`Java_com_wzp_desktop_WryActivity_create` in response to the Java-side
activity lifecycle. This means the thread spawn is happening while the
Java VM still holds the native onCreate call, before `onCreate` has
returned to the Android runtime. Legacy `wzp-android` never spawns a
thread from an onCreate JNI call — it spawns threads only from
`nativeSignalConnect`/similar commands invoked later from Kotlin button
clicks, after the activity is fully initialised.
---
## Current suspect
One of the two items below, probably (2):
1. **The `.cpp(true)` mode in cc-rs changes something invisible in the link
pipeline** (for example, emitting a different `-x` flag to clang, or
changing linker driver selection). We have not yet verified this by
diffing the actual rustc linker invocation between a working and a
crashing build with `--verbose` + `-Clink-arg=-Wl,-t`.
2. **Adding `libc++_shared.so` as a NEEDED entry causes Android's dynamic
linker to load libc++_shared.so before our `.so`'s init runs, and
something in libc++_shared's `.init_array` interacts badly with
tao::ndk_glue's `pthread_create` call from inside the JNI onCreate
window**. The legacy crate doesn't hit this because (a) it has no
NEEDED libc++_shared when built without Oboe, and (b) even when it does
build Oboe, its thread spawns happen outside the onCreate JNI call so
whatever libc state is wrong at that moment is already stabilised.
We have not yet confirmed (2) with the obvious A/B test: keep `cpp_smoke.cpp`
but drop `.cpp_link_stdlib(Some("c++_shared"))` (and drop any manual
`cargo:rustc-link-lib=c++_shared`) so the NEEDED entry disappears but the
rest of the pipeline stays identical. That's the next experiment we were
going to run, but the user reasonably asked for this report first.
---
## What we've ruled out
- **NDK API level** — forcing API-26 linker via three independent mechanisms
made zero difference.
- **Build profile** — release (`0x6b8000` offset, 21 MB unsigned APK) and
debug (same 193 MB APK, same crash offsets) both crash identically.
- **Oboe specifically** — replacing the Oboe compile with 6 lines of C++
that does nothing still reproduces the crash.
- **cpp code being executed at runtime** — dead-stripped, not in the final
`.so` at all per `nm -a`.
- **minSdk in build.gradle** — bumped from 24 to 26, no effect.
- **libdl.a stub issue** — ruled out via logcat (`libdl.a is a stub --- use
libdl.so instead` was only surfacing from our own `dlsym` shim that we
subsequently deleted).
- **`pthread_create` interposition via `-Wl,--wrap=pthread_create`** — tried
and reverted; the wrap target still resolved to the broken static stub.
- **Keystore / signing** — debug signing with persistent `~/.android/
debug.keystore` works fine; no signature mismatch issues.
---
## The files involved
### `desktop/src-tauri/build.rs` (current state, E.1)
```rust
use std::path::PathBuf;
use std::process::Command;
fn main() {
// Embedded git hash
let git_hash = Command::new("git")
.args(["rev-parse", "--short", "HEAD"])
.output()
.ok()
.filter(|o| o.status.success())
.and_then(|o| String::from_utf8(o.stdout).ok())
.map(|s| s.trim().to_string())
.unwrap_or_else(|| "unknown".into());
println!("cargo:rustc-env=WZP_GIT_HASH={git_hash}");
println!("cargo:rerun-if-changed=../../.git/HEAD");
println!("cargo:rerun-if-changed=../../.git/refs/heads");
let target = std::env::var("TARGET").unwrap_or_default();
if target.contains("android") {
// Step A: plain C sanity file
println!("cargo:rerun-if-changed=cpp/hello.c");
cc::Build::new().file("cpp/hello.c").compile("wzp_hello");
// Step D: legacy getauxval shim
println!("cargo:rerun-if-changed=cpp/getauxval_fix.c");
cc::Build::new().file("cpp/getauxval_fix.c").compile("getauxval_fix");
// Step E.1: minimal C++ smoke — THIS STEP BRINGS BACK THE CRASH
println!("cargo:rerun-if-changed=cpp/cpp_smoke.cpp");
cc::Build::new()
.cpp(true)
.std("c++17")
.cpp_link_stdlib(Some("c++_shared"))
.file("cpp/cpp_smoke.cpp")
.compile("wzp_cpp_smoke");
// Copy libc++_shared.so into gen/android jniLibs so the runtime
// linker can find it when the NEEDED entry fires.
if let Ok(ndk) = std::env::var("ANDROID_NDK_HOME").or_else(|_| std::env::var("NDK_HOME")) {
let triple = "aarch64-linux-android";
let abi = "arm64-v8a";
let lib_dir = format!(
"{ndk}/toolchains/llvm/prebuilt/linux-x86_64/sysroot/usr/lib/{triple}"
);
println!("cargo:rustc-link-search=native={lib_dir}");
let shared_so = format!("{lib_dir}/libc++_shared.so");
if std::path::Path::new(&shared_so).exists() {
let manifest = std::env::var("CARGO_MANIFEST_DIR").unwrap_or_default();
let jni_dir = format!("{manifest}/gen/android/app/src/main/jniLibs/{abi}");
if std::fs::create_dir_all(&jni_dir).is_ok() {
let _ = std::fs::copy(&shared_so, format!("{jni_dir}/libc++_shared.so"));
}
}
}
}
tauri_build::build()
}
```
### `desktop/src-tauri/cpp/cpp_smoke.cpp` (E.1)
```cpp
extern "C" int wzp_cpp_hello(void) {
return 42;
}
```
### `desktop/src-tauri/Cargo.toml` (relevant excerpts)
```toml
[package]
name = "wzp-desktop"
version = "0.1.0"
edition = "2024"
[lib]
name = "wzp_desktop_lib"
crate-type = ["staticlib", "cdylib", "rlib"]
[[bin]]
name = "wzp-desktop"
path = "src/main.rs"
[build-dependencies]
tauri-build = { version = "2", features = [] }
cc = "1"
[dependencies]
tauri = { version = "2", features = [] }
tauri-plugin-shell = "2"
serde = { version = "1", features = ["derive"] }
serde_json = "1"
tokio = { version = "1", features = ["full"] }
tracing = "0.1"
tracing-subscriber = "0.3"
anyhow = "1"
rustls = { version = "0.23", default-features = false, features = ["ring", "std"] }
wzp-proto = { path = "../../crates/wzp-proto" }
wzp-codec = { path = "../../crates/wzp-codec" }
wzp-fec = { path = "../../crates/wzp-fec" }
wzp-crypto = { path = "../../crates/wzp-crypto" }
wzp-transport = { path = "../../crates/wzp-transport" }
[target.'cfg(not(target_os = "android"))'.dependencies]
wzp-client = { path = "../../crates/wzp-client", features = ["audio", "vpio"] }
[target.'cfg(target_os = "android")'.dependencies]
wzp-client = { path = "../../crates/wzp-client", default-features = false }
```
---
## Reproduction
A fresh clone on a Linux x86_64 host with:
```bash
git clone ssh://git@git.manko.yoga:222/manawenuz/wz-phone.git
cd wz-phone
git checkout feat/desktop-audio-rewrite
git reset --hard 0d74366 # <-- step E.1, smallest crashing commit
# Need: Android NDK r26.1.10909125, JDK 17, Node 20, Rust stable, cargo tauri 2.x
scripts/prep-linux-mint.sh # installs all the above into /opt/android-sdk etc.
cd desktop
npm install
cd src-tauri
cargo tauri android build --debug --target aarch64 --apk
adb install -r gen/android/app/build/outputs/apk/universal/debug/app-universal-debug.apk
adb logcat -c && adb shell am start -n com.wzp.desktop/.MainActivity
adb logcat | grep -E "F DEBUG|__init_tcb|pthread_create"
```
Expected result: SIGSEGV at `__init_tcb+4` within ~500 ms of launch.
Reverting `cpp/cpp_smoke.cpp` + the `cc::Build` call for it in `build.rs`
(one git command: `git revert 0d74366 aa240c6 0224ce6 a852cad`) restores a
working build. Keeping the C sanity compile (`hello.c`, `getauxval_fix.c`)
is fine — only the `.cpp(true) + .cpp_link_stdlib("c++_shared")` combination
triggers the regression.
---
## What we'd like help with
1. **Is our suspect #2 actually the mechanism?** Is there a known issue
where a Tauri/tao android cdylib crashes on load when it has a
`libc++_shared.so` NEEDED entry and tries to spawn a thread from inside
an onCreate JNI call?
2. **What's the correct way to link Oboe (or any C++ Android audio
library) into a `cargo tauri android build` cdylib** without hitting
this? Is there a known-good combination of cc-rs flags / linker
arguments / cargo config?
3. **Is there a way to force `cargo tauri` to use the same linker setup
as `cargo ndk`**, which reliably produces working Oboe-linked .so
files from the exact same workspace? We've tried env var override,
`.cargo/config.toml`, and image-level binary replacement — cargo
tauri ignores all three and keeps using
`aarch64-linux-android24-clang`.
4. **Is there a way to defer `tao::ndk_glue::create`'s thread spawn to
after `onCreate` returns** so that whatever bionic state `__init_tcb`
depends on is ready?
5. **Lastly** — is there a fundamentally different approach we should
take (e.g., use the `oboe` Rust crate from crates.io instead of a
hand-rolled C++ bridge, use Android's AAudio directly via the `ndk`
crate's aaudio bindings, or even abandon the C++ audio path and
implement mic/speaker via JNI into Java `AudioRecord`/`AudioTrack`)?

View File

@@ -1,59 +0,0 @@
# =============================================================================
# WZ Phone — Linux x86_64 Tauri desktop build image
#
# Thin extension of wzp-android-builder that adds the GTK3 + WebKit2GTK 4.1 +
# libsoup-3.0 + AppIndicator dev packages needed to build the Tauri desktop
# app for Linux. Everything else (Rust, Node.js, cmake, pkg-config, cpal
# libasound deps, tauri-cli) is inherited from the base image.
#
# Build:
# docker build -t wzp-linux-desktop-builder -f Dockerfile.linux-desktop-builder .
#
# Run: driven by scripts/build-linux-desktop-docker.sh (see that file).
# =============================================================================
FROM wzp-android-builder
USER root
# Tauri 2.x Linux dependencies.
# - libwebkit2gtk-4.1-dev: the WebView backend. Tauri 2.x uses 4.1 (not 4.0).
# - libsoup-3.0-dev: HTTP client used by webkit2gtk. Must match its major.
# - libgtk-3-dev: GTK3 headers (webkit2gtk still uses GTK3).
# - libayatana-appindicator3-dev: system tray / status icon. Optional at
# runtime but tauri-build's feature-detection includes it.
# - librsvg2-dev: SVG rendering in the menu/icon code.
# - libglib2.0-dev: GObject introspection headers (transitive, but explicit).
# - patchelf: used by the tauri bundler to rewrite rpaths in the final binary.
# - file: already in the base, but tauri-build checks for it by name.
RUN apt-get update && apt-get install -y --no-install-recommends \
libwebkit2gtk-4.1-dev \
libsoup-3.0-dev \
libgtk-3-dev \
libayatana-appindicator3-dev \
librsvg2-dev \
libglib2.0-dev \
patchelf \
libwebrtc-audio-processing-dev \
clang \
&& rm -rf /var/lib/apt/lists/*
# ── webrtc-audio-processing build requirements ──────────────────────────────
# The `webrtc-audio-processing` Rust crate (0.3.x line) links against Debian
# Bookworm's `libwebrtc-audio-processing-dev` apt package (0.3-1+b1), which
# provides the PulseAudio fork of the WebRTC audio processing module. This is
# the library that Pulse's module-echo-cancel and PipeWire's filter-chain
# use for their AEC modes — same algorithm family, runtime-linked via
# pkg-config at cargo build time.
#
# An attempt was made to use the 2.x line with the `bundled` sub-feature
# (which would give AEC3 instead of AEC2) but both the crates.io tarball
# and the upstream git `main` branch hit a `meson setup --reconfigure` bug
# that panics on first-run empty build dirs. The 0.3 line avoids the
# bundled build path entirely and is what we ship for now.
#
# `clang` is listed explicitly because the Rust crate's bindgen may need
# it at compile time depending on the version of the underlying
# webrtc-audio-processing-sys build script.
USER builder
WORKDIR /build/source

View File

@@ -1,99 +0,0 @@
# =============================================================================
# WZ Phone — Windows (x86_64-pc-windows-msvc) cross-compile image
#
# Cross-compiles the Tauri desktop binary for Windows from a Linux host via
# `cargo xwin`, which auto-downloads the Microsoft CRT + Windows SDK at build
# time. This image pre-warms that cache so the cross-compile is as close as
# possible to a native Linux build on rebuild (~3 min warm vs ~20 min cold).
#
# Build:
# docker build -t wzp-windows-builder -f Dockerfile.windows-builder .
#
# Run: driven by scripts/build-windows-docker.sh (see that file).
# =============================================================================
FROM debian:bookworm
ARG RUST_TARGET=x86_64-pc-windows-msvc
ENV DEBIAN_FRONTEND=noninteractive
# ── System packages ──────────────────────────────────────────────────────────
# - build-essential + pkg-config + libssl-dev: baseline cargo build toolchain
# - cmake + ninja-build: audiopus_sys (libopus) uses cmake and expects Ninja
# as the generator for the windows target; without ninja-build the cmake
# build fails with "CMake was unable to find a build program corresponding
# to Ninja" partway through.
# - llvm + clang + lld: cargo-xwin uses clang + lld-link for PE/COFF output.
# - nasm: ring / rustls assembly for Windows needs NASM on non-Windows hosts.
# - curl, git, ca-certificates, unzip: obvious plumbing.
# - xz-utils: some Microsoft installer archives are xz-compressed.
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
cmake \
ninja-build \
curl \
git \
pkg-config \
libssl-dev \
ca-certificates \
llvm \
clang \
lld \
nasm \
unzip \
xz-utils \
file \
&& rm -rf /var/lib/apt/lists/*
# ── Node.js 20 LTS (required by Tauri for frontend build) ────────────────────
RUN curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
&& apt-get install -y --no-install-recommends nodejs \
&& rm -rf /var/lib/apt/lists/* \
&& node --version \
&& npm --version
# ── Builder user (1000:1000) — matches host bind-mount UID for the cache
# volumes so cargo-registry / target survive across runs without perms
# gymnastics.
RUN groupadd -g 1000 builder \
&& useradd -m -u 1000 -g 1000 -s /bin/bash builder
USER builder
WORKDIR /home/builder
# ── Rust toolchain + Windows target + cargo-xwin ────────────────────────────
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs \
| sh -s -- -y --default-toolchain stable \
&& . $HOME/.cargo/env \
&& rustup target add ${RUST_TARGET} \
&& cargo install cargo-xwin --locked
ENV PATH="/home/builder/.cargo/bin:$PATH" \
XWIN_ACCEPT_LICENSE=1 \
RUST_TARGET_WIN=${RUST_TARGET}
# ── Pre-warm the xwin cache ─────────────────────────────────────────────────
# cargo-xwin downloads the Microsoft CRT + Windows SDK (~1.5-2 GB) into
# ~/.cache/cargo-xwin the first time it runs. Baking that into an image
# layer saves ~4 minutes off every subsequent cold run.
#
# We do this by creating a throwaway Rust project, building it with
# cargo-xwin against the Windows target, then deleting the project but
# keeping the xwin cache.
RUN set -eux; \
mkdir -p /tmp/xwin-warmup && cd /tmp/xwin-warmup && \
. $HOME/.cargo/env && \
cargo new --bin xwin-warmup --quiet && \
cd xwin-warmup && \
cargo xwin build --release --target ${RUST_TARGET} 2>&1 | tail -5 && \
cd / && rm -rf /tmp/xwin-warmup && \
du -sh $HOME/.cache/cargo-xwin
# Note: the libopus SSE4.1/SSSE3 intrinsic compile failure under clang-cl
# is fixed at the source level by vendoring audiopus_sys and patching its
# bundled libopus CMakeLists.txt (see desktop/vendor/audiopus_sys in the
# source tree). Do NOT try to patch cargo-xwin's override.cmake at this
# layer — cargo-xwin rewrites that file on every `cargo xwin build`
# invocation, so any edits baked into the image are wiped at runtime.
WORKDIR /build/source

Some files were not shown because too many files have changed in this diff Show More