17 Commits

Author SHA1 Message Date
Siavash Sameni
264ef9c4d4 feat: relay ping with RTT, server TOFU, lock icons (Phase 2 backport)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Rust JNI:
- nativePingRelay: QUIC connect with 3s timeout, returns RTT + server
  certificate fingerprint as JSON. Static method, no engine needed.

Kotlin:
- WzpEngine.pingRelay() static wrapper
- SettingsRepository: TOFU fingerprint persistence (tofu_{address} keys)
- CallViewModel: pingAllServers() coroutine, lockStatus() helper,
  PingResult/LockStatus data types
- InCallScreen: server chips show lock icon + RTT color (green/yellow),
  "Ping All" button

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:43:53 +04:00
Siavash Sameni
a9adb5cfd7 feat: identicons, tap-to-copy fingerprint, recent rooms (Phase 1 backport)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m47s
Backport from desktop client to Android:

Identicons:
- New Identicon.kt composable: deterministic 5x5 symmetric Canvas pattern
  from fingerprint hash (same algorithm as desktop identicon.ts)
- Participant list shows identicon + name + tappable fingerprint
- Settings page shows identicon next to fingerprint

CopyableFingerprint:
- Tap any fingerprint text to copy to clipboard with Toast feedback
- Used in participant list and settings page

Recent rooms:
- SettingsRepository: persists last 5 (relay, room) pairs
- CallViewModel: saves on startCall, exposes as StateFlow
- InCallScreen: clickable chips that fill room + select matching server

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:37:46 +04:00
Siavash Sameni
a39b074d6e fix: DirectByteBuffer as class field — survives ART JIT OSR
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m58s
Previous attempt allocated DirectByteBuffer as local variables inside
runCapture/runPlayout. ART's JIT On-Stack Replacement nulled them
when recompiling the hot loop mid-execution.

Fix: allocate as class fields on AudioPipeline (captureDirectBuf,
playoutDirectBuf). Object fields live on the heap, immune to OSR
stack frame replacement.

Eliminates JNI array copies (GetShortArrayRegion/SetShortArrayRegion)
from the audio hot path, preventing ART GC SIGBUS crashes on
Android 16 with concurrent mark-compact GC.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:22:54 +04:00
Siavash Sameni
9cab6e2347 ci: skip build on CI-only file changes
Add paths-ignore for .gitea/** so build.yml doesn't waste runner time
when only workflow files are modified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:13:29 +04:00
Siavash Sameni
5e93cb74f2 fix: filter tracing to INFO for wzp crates, WARN for jni crate
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 4m7s
The jni crate emits VERBOSE logs for every JNI method lookup (~10 lines
per call, 100+ calls/sec on audio threads). This floods logcat, consumes
CPU, and triggers system kills. Filter to only show INFO+ for our crates
and WARN+ for everything else.

Also fix build script: clean full Rust target to ensure libc++_shared.so
is always copied by cargo-ndk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:37:29 +04:00
Siavash Sameni
b56b4a759c revert: use ShortArray audio path (DirectByteBuffer causes null ptr crash)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 35s
Build Release Binaries / build-amd64 (push) Failing after 3m58s
DirectByteBuffer.clear() crashes with null pointer in ART's JIT OSR
compiled code on Android 16. Revert AudioPipeline to use the original
ShortArray writeAudio/readAudio path.

The DirectByteBuffer JNI functions remain in WzpEngine.kt and
jni_bridge.rs for future use once the OSR issue is resolved.

The original SIGBUS from ART GC is rare (~1 crash per 8 min call)
and doesn't warrant the DirectByteBuffer approach until we can
allocate the buffer as a class field outside the hot loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:17:15 +04:00
Siavash Sameni
6f99841cc7 fix: cloud build script — filter by server name, rsync upload, cx33
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m57s
- Filter hcloud by SERVER_NAME to avoid touching other servers
- Use rsync instead of tar (handles submodules, no macOS xattr spam)
- Default server type cx33
- Release APK failure is non-fatal (debug APK still produced)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 20:00:10 +04:00
Siavash Sameni
3b0811ce2e ci: add GitHub mirror workflow
Automatically pushes branches and tags to github.com:manawenuz/wzp.git
on every push to Forgejo. Uses GH_SSH_KEY secret for authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:49:59 +04:00
Siavash Sameni
9eed94850d fix: DirectByteBuffer audio path — eliminate JNI array copies
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Adds nativeWriteAudioDirect / nativeReadAudioDirect JNI functions
that accept a DirectByteBuffer instead of ShortArray. The buffer's
native memory is accessed directly by Rust via pointer — no
GetShortArrayRegion / SetShortArrayRegion, no GC-managed array
copies on the audio hot path.

This fixes SIGBUS crashes on Android 16 where ART's concurrent
mark-compact GC crashes when flipping thread roots during JNI
array operations on MAX_PRIORITY audio threads.

Old ShortArray methods kept for backward compatibility.
AudioPipeline switched to use Direct variants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:29:08 +04:00
Siavash Sameni
5e9718aeb2 docs: incident report — SIGBUS in ART GC during audio JNI calls
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m37s
Android 16's concurrent mark-compact GC crashes when flipping
thread roots on our MAX_PRIORITY audio threads during JNI calls
(AudioRecord.read / AudioTrack.write). Not our code — all crash
frames are in libart.so.

Proposed fixes:
- Short term: DirectByteBuffer to reduce JNI transitions
- Long term: Oboe native audio from Rust (no JNI, no GC)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:21:32 +04:00
Siavash Sameni
3093933602 fix: build script works on Ubuntu 24.04 (cmake 3.28) too
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m48s
cmake 3.28 works when ANDROID_NDK is set (not just ANDROID_NDK_HOME).
Relaxed version check from <=3.26 to <=3.30.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:00:06 +04:00
Siavash Sameni
4c6c909732 feat: comprehensive Android build script for Debian 12
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m56s
Documents WHY each version is pinned:
- cmake 3.25: 3.27+ rewrote Android-Determine.cmake with bugs
- NDK 26.1: NDK 27 scudo crashes on MTE devices (Nothing A059)
- JDK 17: Gradle 8.5 + AGP 8.2.0 official support
- ANDROID_NDK: cmake checks this, not ANDROID_NDK_HOME

Idempotent, works from clone or existing tree.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 18:37:12 +04:00
Siavash Sameni
33fab9a049 fix: vec allocation for AudioRing, catch_unwind on tracing init, profiling
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m49s
- AudioRing: use vec![].into_boxed_slice() instead of Box::new([]) to
  avoid 32KB stack allocation that crashes scudo on Android
- JNI bridge: wrap tracing_subscriber init in catch_unwind to survive
  sharded_slab allocation failures on some devices
- Engine: per-step encode profiling (avg_agc_us, avg_opus_us, avg_fec_us,
  avg_send_us) logged every 5 seconds in send stats

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:41:46 +04:00
Siavash Sameni
31d2306915 feat: per-step encode profiling in send task stats
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Adds average microsecond timings for each encode step:
- avg_agc_us: AGC processing
- avg_opus_us: Opus encoding
- avg_fec_us: FEC encode + repair generation
- avg_send_us: QUIC send_media
- avg_total_us: sum of above

Logged every 5 seconds in send stats. Resets each interval.
Use to identify which step is bottlenecking the encode loop
on devices where fps drops below 50.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:18:33 +04:00
Siavash Sameni
4af7c5f94c fix: AudioRing cursor desync + capture thread use-after-free
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m56s
AudioRing (reader-detects-lap architecture):
- Writer NEVER touches read_pos — fixes SPSC invariant violation
- Reader self-corrects when lapped (snaps read_pos forward)
- Power-of-2 capacity (16384 = 341ms) with bitmask indexing
- Added overflow_count and underrun_count diagnostics
- Wired ring health into engine stats and periodic logging

Capture thread use-after-free (drain latch):
- Added CountDownLatch(2) to AudioPipeline
- Audio threads count down after exiting their loops
- teardown() awaits latch (200ms timeout) before destroy()
- Guarantees no in-flight JNI calls when native handle is freed
- stopAudio() no longer nulls pipeline (teardown handles it)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 13:28:34 +04:00
Claude
6597b5bd86 docs: incident report + fix spec for capture thread use-after-free crash
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3s
SIGSEGV on hangup: capture thread calls writeAudio() via JNI after
teardown() has freed the native engine handle. TOCTOU race between
the nativeHandle==0L check and destroy() on the ViewModel thread.

Fix: CountDownLatch(2) — audio threads count down after exiting loops,
teardown() awaits before destroy(). 2 Kotlin files, no Rust changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:21:35 +00:00
Claude
ae9d8526dd docs: implementation spec for AudioRing SPSC desync fix
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m51s
Complete spec for fixing the playout ring buffer cursor race that
causes 12-16s bidirectional silence mid-call. Includes exact code,
memory ordering rationale, unit tests, and verification steps.

Any agent can implement from this document alone.

See also: debug/INCIDENT-2026-04-06-playout-ring-desync.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:16:47 +00:00
20 changed files with 2202 additions and 74 deletions

View File

@@ -7,6 +7,8 @@ on:
- 'feat/*'
tags:
- 'v*'
paths-ignore:
- '.gitea/**'
workflow_dispatch:
env:

View File

@@ -0,0 +1,43 @@
name: Mirror to GitHub
on:
push:
branches:
- main
- 'feat/*'
- 'feature/*'
tags:
- '*'
jobs:
mirror:
runs-on: ubuntu-latest
container:
image: catthehacker/ubuntu:act-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Push to GitHub
env:
GH_SSH_KEY: ${{ secrets.GH_SSH_KEY }}
run: |
mkdir -p ~/.ssh
echo "${GH_SSH_KEY}" > ~/.ssh/id_ed25519
chmod 600 ~/.ssh/id_ed25519
ssh-keyscan github.com >> ~/.ssh/known_hosts 2>/dev/null
git remote add github git@github.com:manawenuz/wzp.git
# Push the current branch
BRANCH="${GITHUB_REF#refs/heads/}"
TAG="${GITHUB_REF#refs/tags/}"
if [ "${GITHUB_REF}" != "${GITHUB_REF#refs/tags/}" ]; then
echo "Pushing tag: ${TAG}"
git push github "refs/tags/${TAG}" --force
else
echo "Pushing branch: ${BRANCH}"
git push github "HEAD:refs/heads/${BRANCH}" --force
fi

View File

@@ -19,6 +19,8 @@ import java.io.FileOutputStream
import java.io.OutputStreamWriter
import java.nio.ByteBuffer
import java.nio.ByteOrder
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit
import kotlin.math.pow
import kotlin.math.sqrt
@@ -59,6 +61,19 @@ class AudioPipeline(private val context: Context) {
private var captureThread: Thread? = null
private var playoutThread: Thread? = null
// DirectByteBuffers for zero-copy JNI audio transfer.
// Allocated as class fields (NOT locals) because ART's JIT OSR
// can null local variables when it replaces the stack frame mid-loop.
// These survive OSR because they're on the heap.
private val captureDirectBuf: ByteBuffer =
ByteBuffer.allocateDirect(FRAME_SAMPLES * 2).order(ByteOrder.LITTLE_ENDIAN)
private val playoutDirectBuf: ByteBuffer =
ByteBuffer.allocateDirect(FRAME_SAMPLES * 2).order(ByteOrder.LITTLE_ENDIAN)
/** Latch counted down by each audio thread after exiting its loop.
* stop() does NOT wait on this — teardown waits via awaitDrain(). */
private var drainLatch: CountDownLatch? = null
private val debugDir: File by lazy {
File(context.cacheDir, "wzp_debug").also { it.mkdirs() }
}
@@ -66,9 +81,11 @@ class AudioPipeline(private val context: Context) {
fun start(engine: WzpEngine) {
if (running) return
running = true
drainLatch = CountDownLatch(2) // one for capture, one for playout
captureThread = Thread({
runCapture(engine)
drainLatch?.countDown() // signal: capture loop exited, no more JNI calls
// Park thread forever — exiting triggers a libcrypto TLS destructor
// crash (SIGSEGV in OPENSSL_free) on Android when a JNI-calling thread exits.
parkThread()
@@ -80,6 +97,7 @@ class AudioPipeline(private val context: Context) {
playoutThread = Thread({
runPlayout(engine)
drainLatch?.countDown() // signal: playout loop exited
parkThread()
}, "wzp-playout").apply {
isDaemon = true
@@ -92,10 +110,20 @@ class AudioPipeline(private val context: Context) {
fun stop() {
running = false
// Don't join threads are parked as daemons to avoid native TLS crash
// Don't join threads — they are parked as daemons to avoid native TLS crash.
// Don't null thread refs or drainLatch — teardown() needs awaitDrain().
Log.i(TAG, "audio pipeline stopped (running=false)")
}
/** Block until both audio threads have exited their loops (max 200ms).
* After this returns, no more JNI calls to the engine will be made. */
fun awaitDrain(): Boolean {
val ok = drainLatch?.await(200, TimeUnit.MILLISECONDS) ?: true
if (!ok) Log.w(TAG, "awaitDrain: audio threads did not drain in 200ms")
captureThread = null
playoutThread = null
Log.i(TAG, "audio pipeline stopped")
drainLatch = null
return ok
}
private fun applyGain(pcm: ShortArray, count: Int, db: Float) {
@@ -206,7 +234,10 @@ class AudioPipeline(private val context: Context) {
val read = recorder.read(pcm, 0, FRAME_SAMPLES)
if (read > 0) {
applyGain(pcm, read, captureGainDb)
engine.writeAudio(pcm)
// Zero-copy write via DirectByteBuffer (class field, survives JIT OSR)
captureDirectBuf.clear()
captureDirectBuf.asShortBuffer().put(pcm, 0, read)
engine.writeAudioDirect(captureDirectBuf, read)
// Debug: write raw PCM + RMS
if (pcmOut != null) {
@@ -285,8 +316,12 @@ class AudioPipeline(private val context: Context) {
}
try {
while (running) {
val read = engine.readAudio(pcm)
// Zero-copy read via DirectByteBuffer (class field, survives JIT OSR)
playoutDirectBuf.clear()
val read = engine.readAudioDirect(playoutDirectBuf, FRAME_SAMPLES)
if (read >= FRAME_SAMPLES) {
playoutDirectBuf.rewind()
playoutDirectBuf.asShortBuffer().get(pcm, 0, read)
applyGain(pcm, read, playoutGainDb)
track.write(pcm, 0, read)

View File

@@ -28,6 +28,8 @@ class SettingsRepository(context: Context) {
private const val KEY_PREFER_IPV6 = "prefer_ipv6"
private const val KEY_IDENTITY_SEED = "identity_seed_hex"
private const val KEY_AEC_ENABLED = "aec_enabled"
private const val KEY_RECENT_ROOMS = "recent_rooms"
private const val TOFU_PREFIX = "tofu_"
}
// --- Servers ---
@@ -138,4 +140,43 @@ class SettingsRepository(context: Context) {
fun saveSeedHex(hex: String) {
prefs.edit().putString(KEY_IDENTITY_SEED, hex).apply()
}
// --- Recent rooms ---
data class RecentRoom(val relay: String, val room: String)
fun addRecentRoom(relay: String, room: String) {
val rooms = loadRecentRooms().toMutableList()
rooms.removeAll { it.relay == relay && it.room == room }
rooms.add(0, RecentRoom(relay, room))
if (rooms.size > 5) rooms.subList(5, rooms.size).clear()
val arr = JSONArray()
rooms.forEach { arr.put(JSONObject().apply { put("relay", it.relay); put("room", it.room) }) }
prefs.edit().putString(KEY_RECENT_ROOMS, arr.toString()).apply()
}
fun loadRecentRooms(): List<RecentRoom> {
val json = prefs.getString(KEY_RECENT_ROOMS, null) ?: return emptyList()
return try {
val arr = JSONArray(json)
(0 until arr.length()).map { i ->
val o = arr.getJSONObject(i)
RecentRoom(o.getString("relay"), o.getString("room"))
}
} catch (_: Exception) { emptyList() }
}
fun clearRecentRooms() {
prefs.edit().remove(KEY_RECENT_ROOMS).apply()
}
// --- Server fingerprint TOFU ---
fun saveServerFingerprint(address: String, fingerprint: String) {
prefs.edit().putString("$TOFU_PREFIX$address", fingerprint).apply()
}
fun loadServerFingerprint(address: String): String? {
return prefs.getString("$TOFU_PREFIX$address", null)
}
}

View File

@@ -117,6 +117,26 @@ class WzpEngine(private val callback: WzpCallback) {
return nativeReadAudio(nativeHandle, pcm)
}
/**
* Write captured PCM from a DirectByteBuffer — zero JNI array copy.
* The buffer must be a direct ByteBuffer with native byte order containing i16 samples.
* Called from the AudioRecord capture thread.
*/
fun writeAudioDirect(buffer: java.nio.ByteBuffer, sampleCount: Int): Int {
if (nativeHandle == 0L) return 0
return nativeWriteAudioDirect(nativeHandle, buffer, sampleCount)
}
/**
* Read decoded PCM into a DirectByteBuffer — zero JNI array copy.
* The buffer must be a direct ByteBuffer with native byte order.
* Called from the AudioTrack playout thread.
*/
fun readAudioDirect(buffer: java.nio.ByteBuffer, maxSamples: Int): Int {
if (nativeHandle == 0L) return 0
return nativeReadAudioDirect(nativeHandle, buffer, maxSamples)
}
// -- JNI native methods --------------------------------------------------
private external fun nativeInit(): Long
@@ -130,12 +150,23 @@ class WzpEngine(private val callback: WzpCallback) {
private external fun nativeForceProfile(handle: Long, profile: Int)
private external fun nativeWriteAudio(handle: Long, pcm: ShortArray): Int
private external fun nativeReadAudio(handle: Long, pcm: ShortArray): Int
private external fun nativeWriteAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, sampleCount: Int): Int
private external fun nativeReadAudioDirect(handle: Long, buffer: java.nio.ByteBuffer, maxSamples: Int): Int
private external fun nativeDestroy(handle: Long)
companion object {
init {
System.loadLibrary("wzp_android")
}
/**
* Ping a relay server. Returns JSON `{"rtt_ms":N,"server_fingerprint":"hex"}`
* or null if unreachable. Does not require an engine instance.
*/
fun pingRelay(address: String): String? = nativePingRelay(address)
@JvmStatic
private external fun nativePingRelay(relay: String): String?
}
}

View File

@@ -12,6 +12,7 @@ import com.wzp.engine.CallStats
import com.wzp.service.CallService
import com.wzp.engine.WzpCallback
import com.wzp.engine.WzpEngine
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.Job
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
@@ -19,6 +20,8 @@ import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow
import kotlinx.coroutines.isActive
import kotlinx.coroutines.launch
import kotlinx.coroutines.withContext
import org.json.JSONObject
import java.io.File
import java.net.Inet4Address
import java.net.Inet6Address
@@ -26,6 +29,13 @@ import java.net.InetAddress
data class ServerEntry(val address: String, val label: String)
data class PingResult(
val rttMs: Int,
val serverFingerprint: String,
)
enum class LockStatus { UNKNOWN, OFFLINE, NEW, VERIFIED, CHANGED }
class CallViewModel : ViewModel(), WzpCallback {
private var engine: WzpEngine? = null
@@ -70,6 +80,16 @@ class CallViewModel : ViewModel(), WzpCallback {
private val _preferIPv6 = MutableStateFlow(false)
val preferIPv6: StateFlow<Boolean> = _preferIPv6.asStateFlow()
private val _recentRooms = MutableStateFlow<List<com.wzp.data.SettingsRepository.RecentRoom>>(emptyList())
val recentRooms: StateFlow<List<com.wzp.data.SettingsRepository.RecentRoom>> = _recentRooms.asStateFlow()
/** Ping results keyed by server address. */
private val _pingResults = MutableStateFlow<Map<String, PingResult>>(emptyMap())
val pingResults: StateFlow<Map<String, PingResult>> = _pingResults.asStateFlow()
/** Known server fingerprints (TOFU). */
private val _knownFingerprints = MutableStateFlow<Map<String, String>>(emptyMap())
private val _playoutGainDb = MutableStateFlow(0f)
val playoutGainDb: StateFlow<Float> = _playoutGainDb.asStateFlow()
@@ -139,6 +159,7 @@ class CallViewModel : ViewModel(), WzpCallback {
_captureGainDb.value = s.loadCaptureGain()
_seedHex.value = s.getOrCreateSeedHex()
_aecEnabled.value = s.loadAecEnabled()
_recentRooms.value = s.loadRecentRooms()
}
fun selectServer(index: Int) {
@@ -182,6 +203,51 @@ class CallViewModel : ViewModel(), WzpCallback {
settings?.saveSelectedServer(_selectedServer.value)
}
/** Ping all servers in background, update results. */
fun pingAllServers() {
viewModelScope.launch {
val results = mutableMapOf<String, PingResult>()
val known = mutableMapOf<String, String>()
_servers.value.forEach { server ->
val pr = withContext(Dispatchers.IO) {
try {
val json = WzpEngine.pingRelay(server.address) ?: return@withContext null
val obj = JSONObject(json)
PingResult(
rttMs = obj.getInt("rtt_ms"),
serverFingerprint = obj.optString("server_fingerprint", ""),
)
} catch (e: Exception) {
Log.w(TAG, "ping ${server.address} failed: ${e.message}")
null
}
}
if (pr != null) {
results[server.address] = pr
// TOFU: save fingerprint on first contact
if (pr.serverFingerprint.isNotEmpty()) {
val saved = settings?.loadServerFingerprint(server.address)
if (saved == null) {
settings?.saveServerFingerprint(server.address, pr.serverFingerprint)
}
known[server.address] = saved ?: pr.serverFingerprint
}
}
}
_pingResults.value = results
_knownFingerprints.value = known
}
}
/** Get lock status for a server. */
fun lockStatus(address: String): LockStatus {
val pr = _pingResults.value[address] ?: return LockStatus.UNKNOWN
val known = _knownFingerprints.value[address]
if (pr.serverFingerprint.isEmpty()) return LockStatus.NEW
if (known == null) return LockStatus.NEW
return if (pr.serverFingerprint == known) LockStatus.VERIFIED else LockStatus.CHANGED
}
fun setRoomName(name: String) {
_roomName.value = name
settings?.saveRoom(name)
@@ -254,8 +320,17 @@ class CallViewModel : ViewModel(), WzpCallback {
Log.i(TAG, "teardown: stopping audio, stopService=$stopService")
val hadCall = audioStarted
CallService.onStopFromNotification = null
stopAudio()
stopAudio() // sets running=false (non-blocking)
stopStatsPolling()
// Wait for audio threads to exit their loops before destroying the engine.
// This guarantees no in-flight JNI calls to writeAudio/readAudio.
val drained = audioPipeline?.awaitDrain() ?: true
if (!drained) {
Log.w(TAG, "teardown: audio threads did not drain in time")
}
audioPipeline = null
Log.i(TAG, "teardown: stopping engine")
try { engine?.stopCall() } catch (e: Exception) { Log.w(TAG, "stopCall err: $e") }
try { engine?.destroy() } catch (e: Exception) { Log.w(TAG, "destroy err: $e") }
@@ -278,6 +353,8 @@ class CallViewModel : ViewModel(), WzpCallback {
_debugReportAvailable.value = false
_debugReportStatus.value = null
lastCallServer = serverEntry.address
settings?.addRecentRoom(serverEntry.address, room)
_recentRooms.value = settings?.loadRecentRooms() ?: emptyList()
debugReporter?.prepareForCall()
try {
// Teardown previous call but don't stop the service (we're about to restart it)
@@ -399,8 +476,7 @@ class CallViewModel : ViewModel(), WzpCallback {
private fun stopAudio() {
if (!audioStarted) return
audioPipeline?.stop()
audioPipeline = null
audioPipeline?.stop() // sets running=false; DON'T null — teardown needs awaitDrain()
audioRouteManager?.unregister()
audioRouteManager?.setSpeaker(false)
_isSpeaker.value = false

View File

@@ -42,11 +42,13 @@ import androidx.compose.ui.Alignment
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.text.font.FontFamily
import androidx.compose.ui.text.font.FontWeight
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp
import androidx.compose.ui.unit.sp
import com.wzp.engine.CallStats
import com.wzp.ui.call.LockStatus
import kotlin.math.roundToInt
@OptIn(ExperimentalLayoutApi::class)
@@ -117,18 +119,31 @@ fun InCallScreen(
color = MaterialTheme.colorScheme.onSurfaceVariant
)
Spacer(modifier = Modifier.height(4.dp))
val pingResults by viewModel.pingResults.collectAsState()
FlowRow(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.Center
) {
servers.forEachIndexed { idx, entry ->
val isSelected = selectedServer == idx
val ping = pingResults[entry.address]
val lockStatus = viewModel.lockStatus(entry.address)
val lockIcon = when (lockStatus) {
LockStatus.VERIFIED -> "\uD83D\uDD12" // 🔒
LockStatus.NEW -> "\uD83D\uDD13" // 🔓
LockStatus.CHANGED -> "\uFE0F" // ⚠️
LockStatus.OFFLINE -> "\uD83D\uDD34" // 🔴
LockStatus.UNKNOWN -> ""
}
val rttText = ping?.let { "${it.rttMs}ms" } ?: ""
FilledTonalIconButton(
onClick = { viewModel.selectServer(idx) },
modifier = Modifier
.padding(2.dp)
.height(36.dp)
.width(140.dp),
.height(40.dp)
.width(160.dp),
shape = RoundedCornerShape(8.dp),
colors = if (isSelected) {
IconButtonDefaults.filledTonalIconButtonColors(
@@ -139,11 +154,28 @@ fun InCallScreen(
IconButtonDefaults.filledTonalIconButtonColors()
}
) {
Row(verticalAlignment = Alignment.CenterVertically) {
if (lockIcon.isNotEmpty()) {
Text(text = lockIcon, fontSize = 12.sp)
Spacer(modifier = Modifier.width(4.dp))
}
Text(
text = entry.label,
style = MaterialTheme.typography.labelSmall,
maxLines = 1
)
if (rttText.isNotEmpty()) {
Spacer(modifier = Modifier.width(4.dp))
Text(
text = rttText,
style = MaterialTheme.typography.labelSmall.copy(fontSize = 9.sp),
color = when {
(ping?.rttMs ?: 0) > 200 -> Color(0xFFFACC15) // yellow
else -> Color(0xFF4ADE80) // green
}
)
}
}
}
}
// + Add button
@@ -151,13 +183,18 @@ fun InCallScreen(
onClick = { showAddServerDialog = true },
modifier = Modifier
.padding(2.dp)
.height(36.dp),
.height(40.dp),
shape = RoundedCornerShape(8.dp)
) {
Text("+", style = MaterialTheme.typography.labelMedium)
}
}
// Ping button
TextButton(onClick = { viewModel.pingAllServers() }) {
Text("Ping All", style = MaterialTheme.typography.labelSmall)
}
// IPv4/IPv6 preference
Spacer(modifier = Modifier.height(8.dp))
Row(
@@ -200,6 +237,36 @@ fun InCallScreen(
modifier = Modifier.fillMaxWidth(0.6f)
)
// Recent rooms
val recentRooms by viewModel.recentRooms.collectAsState()
if (recentRooms.isNotEmpty()) {
Spacer(modifier = Modifier.height(8.dp))
FlowRow(
modifier = Modifier.fillMaxWidth(),
horizontalArrangement = Arrangement.Center
) {
recentRooms.forEach { recent ->
Surface(
onClick = {
viewModel.setRoomName(recent.room)
// Select matching server
val idx = servers.indexOfFirst { it.address == recent.relay }
if (idx >= 0) viewModel.selectServer(idx)
},
shape = RoundedCornerShape(16.dp),
color = MaterialTheme.colorScheme.surfaceVariant,
modifier = Modifier.padding(2.dp)
) {
Text(
text = recent.room,
style = MaterialTheme.typography.labelSmall,
modifier = Modifier.padding(horizontal = 12.dp, vertical = 4.dp)
)
}
}
}
}
Spacer(modifier = Modifier.height(24.dp))
Button(
@@ -262,11 +329,33 @@ fun InCallScreen(
color = MaterialTheme.colorScheme.onSurfaceVariant
)
unique.forEach { member ->
Row(
verticalAlignment = Alignment.CenterVertically,
modifier = Modifier.padding(vertical = 2.dp)
) {
com.wzp.ui.components.Identicon(
fingerprint = member.fingerprint.ifEmpty { member.displayName },
size = 28.dp,
)
Spacer(modifier = Modifier.width(8.dp))
Column {
Text(
text = member.displayName,
style = MaterialTheme.typography.labelSmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
if (member.fingerprint.isNotEmpty()) {
com.wzp.ui.components.CopyableFingerprint(
fingerprint = member.fingerprint.take(16),
style = MaterialTheme.typography.labelSmall.copy(
fontSize = 9.sp,
fontFamily = FontFamily.Monospace,
),
color = MaterialTheme.colorScheme.onSurfaceVariant.copy(alpha = 0.6f),
)
}
}
}
}
}

View File

@@ -0,0 +1,141 @@
package com.wzp.ui.components
import android.widget.Toast
import androidx.compose.foundation.Canvas
import androidx.compose.foundation.clickable
import androidx.compose.foundation.layout.size
import androidx.compose.foundation.shape.RoundedCornerShape
import androidx.compose.runtime.Composable
import androidx.compose.ui.Modifier
import androidx.compose.ui.draw.clip
import androidx.compose.ui.geometry.Offset
import androidx.compose.ui.geometry.Size
import androidx.compose.ui.graphics.Color
import androidx.compose.ui.platform.LocalClipboardManager
import androidx.compose.ui.platform.LocalContext
import androidx.compose.ui.text.AnnotatedString
import androidx.compose.ui.unit.Dp
import androidx.compose.ui.unit.dp
import kotlin.math.min
/**
* Deterministic identicon — generates a unique 5x5 symmetric pattern
* from a hex fingerprint string. Identical algorithm to the desktop
* TypeScript implementation in identicon.ts.
*/
@Composable
fun Identicon(
fingerprint: String,
size: Dp = 36.dp,
clickToCopy: Boolean = true,
modifier: Modifier = Modifier,
) {
val clipboard = LocalClipboardManager.current
val context = LocalContext.current
val bytes = hashBytes(fingerprint)
val (bg, fg) = deriveColors(bytes)
val grid = buildGrid(bytes)
Canvas(
modifier = modifier
.size(size)
.clip(RoundedCornerShape(size * 0.12f))
.then(
if (clickToCopy && fingerprint.isNotEmpty()) {
Modifier.clickable {
clipboard.setText(AnnotatedString(fingerprint))
Toast.makeText(context, "Copied", Toast.LENGTH_SHORT).show()
}
} else Modifier
)
) {
val cellW = this.size.width / 5f
val cellH = this.size.height / 5f
// Background
drawRect(color = bg, size = this.size)
// Foreground cells
for (y in 0 until 5) {
for (x in 0 until 5) {
if (grid[y][x]) {
drawRect(
color = fg,
topLeft = Offset(x * cellW, y * cellH),
size = Size(cellW, cellH),
)
}
}
}
}
}
/**
* Fingerprint text that copies to clipboard on tap.
*/
@Composable
fun CopyableFingerprint(
fingerprint: String,
modifier: Modifier = Modifier,
style: androidx.compose.ui.text.TextStyle = androidx.compose.material3.MaterialTheme.typography.bodySmall,
color: Color = Color.Unspecified,
) {
val clipboard = LocalClipboardManager.current
val context = LocalContext.current
androidx.compose.material3.Text(
text = fingerprint,
style = style,
color = color,
modifier = modifier.clickable {
if (fingerprint.isNotEmpty()) {
clipboard.setText(AnnotatedString(fingerprint))
Toast.makeText(context, "Fingerprint copied", Toast.LENGTH_SHORT).show()
}
}
)
}
// --- Internal helpers (matching desktop identicon.ts) ---
private fun hashBytes(hex: String): List<Int> {
val clean = hex.filter { it.isLetterOrDigit() }
val bytes = mutableListOf<Int>()
var i = 0
while (i + 1 < clean.length) {
val b = clean.substring(i, i + 2).toIntOrNull(16) ?: 0
bytes.add(b)
i += 2
}
// Pad to at least 16 bytes
while (bytes.size < 16) bytes.add(0)
return bytes
}
private fun deriveColors(bytes: List<Int>): Pair<Color, Color> {
val hue1 = bytes[0] * 360f / 256f
val hue2 = (bytes[1] * 360f / 256f + 120f) % 360f
val bg = hslToColor(hue1, 0.65f, 0.35f)
val fg = hslToColor(hue2, 0.70f, 0.55f)
return bg to fg
}
private fun buildGrid(bytes: List<Int>): List<List<Boolean>> {
return (0 until 5).map { y ->
val left = (0 until 3).map { x ->
val idx = 2 + y * 3 + x
bytes[idx % bytes.size] > 128
}
// Mirror: col3 = col1, col4 = col0
listOf(left[0], left[1], left[2], left[1], left[0])
}
}
private fun hslToColor(h: Float, s: Float, l: Float): Color {
val k = { n: Float -> (n + h / 30f) % 12f }
val a = s * min(l, 1f - l)
val f = { n: Float ->
l - a * maxOf(-1f, minOf(k(n) - 3f, minOf(9f - k(n), 1f)))
}
return Color(f(0f), f(8f), f(4f))
}

View File

@@ -158,20 +158,30 @@ fun SettingsScreen(
Spacer(modifier = Modifier.height(16.dp))
// Fingerprint display
// Fingerprint display with identicon
val fingerprint = if (draftSeedHex.length >= 16) draftSeedHex.take(16).uppercase() else "Not generated"
Text(
text = "Fingerprint",
style = MaterialTheme.typography.labelSmall,
color = MaterialTheme.colorScheme.onSurfaceVariant
)
Text(
text = fingerprint.chunked(4).joinToString(" "),
Row(
verticalAlignment = Alignment.CenterVertically,
modifier = Modifier.padding(vertical = 4.dp)
) {
com.wzp.ui.components.Identicon(
fingerprint = draftSeedHex,
size = 40.dp,
)
Spacer(modifier = Modifier.width(12.dp))
com.wzp.ui.components.CopyableFingerprint(
fingerprint = fingerprint.chunked(4).joinToString(" "),
style = MaterialTheme.typography.bodyMedium.copy(
fontFamily = FontFamily.Monospace
),
color = MaterialTheme.colorScheme.onSurface
color = MaterialTheme.colorScheme.onSurface,
)
}
Spacer(modifier = Modifier.height(12.dp))

View File

@@ -17,7 +17,7 @@ wzp-crypto = { workspace = true }
wzp-transport = { workspace = true }
tokio = { workspace = true }
tracing = { workspace = true }
tracing-subscriber = { workspace = true }
tracing-subscriber = { workspace = true, features = ["env-filter"] }
bytes = { workspace = true }
serde = { workspace = true }
serde_json = "1"

View File

@@ -1,91 +1,128 @@
//! Lock-free SPSC ring buffers for audio PCM transfer between
//! Kotlin AudioRecord/AudioTrack threads and the Rust engine.
//! Lock-free SPSC ring buffer — "Reader-Detects-Lap" architecture.
//!
//! These use a simple spin-free design: the producer writes and advances
//! a write cursor, the consumer reads and advances a read cursor.
//! Both cursors are atomic so no mutex is needed.
//! SPSC invariant: the producer ONLY writes `write_pos`, the consumer
//! ONLY writes `read_pos`. Neither thread touches the other's cursor.
//!
//! On overflow (writer laps the reader), the writer simply overwrites
//! old buffer data. The reader detects the lap via `available() >
//! RING_CAPACITY` and snaps its own `read_pos` forward.
//!
//! Capacity is a power of 2 for bitmask indexing (no modulo).
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::atomic::{AtomicU64, AtomicUsize, Ordering};
/// Ring buffer capacity in i16 samples.
/// 960 samples * 10 frames = ~200ms of audio at 48kHz mono.
const RING_CAPACITY: usize = 960 * 10;
/// Ring buffer capacity — power of 2 for bitmask indexing.
/// 16384 samples = 341.3ms at 48kHz mono. 70% more headroom
/// than the previous 9600 (200ms) for surviving Android GC pauses.
const RING_CAPACITY: usize = 16384; // 2^14
const RING_MASK: usize = RING_CAPACITY - 1;
/// Lock-free single-producer single-consumer ring buffer for i16 PCM samples.
pub struct AudioRing {
buf: Box<[i16; RING_CAPACITY]>,
buf: Box<[i16]>,
/// Monotonically increasing write cursor. ONLY written by producer.
write_pos: AtomicUsize,
/// Monotonically increasing read cursor. ONLY written by consumer.
read_pos: AtomicUsize,
/// Incremented by reader when it detects it was lapped (overflow).
overflow_count: AtomicU64,
/// Incremented by reader when ring is empty (underrun).
underrun_count: AtomicU64,
}
// SAFETY: AudioRing is designed for SPSC — one thread writes, one reads.
// The atomics ensure visibility. The buffer itself is never accessed
// from the same index by both threads simultaneously because the
// producer only writes to positions between write_pos and read_pos,
// and the consumer only reads from positions between read_pos and write_pos.
// SAFETY: AudioRing is SPSC — one thread writes (producer), one reads (consumer).
// The producer only writes write_pos. The consumer only writes read_pos.
// Neither thread writes the other's cursor. Buffer indices are derived from
// the owning thread's cursor, ensuring no concurrent access to the same index.
unsafe impl Send for AudioRing {}
unsafe impl Sync for AudioRing {}
impl AudioRing {
pub fn new() -> Self {
debug_assert!(RING_CAPACITY.is_power_of_two());
Self {
buf: Box::new([0i16; RING_CAPACITY]),
buf: vec![0i16; RING_CAPACITY].into_boxed_slice(),
write_pos: AtomicUsize::new(0),
read_pos: AtomicUsize::new(0),
overflow_count: AtomicU64::new(0),
underrun_count: AtomicU64::new(0),
}
}
/// Number of samples available to read.
/// Number of samples available to read (clamped to capacity).
pub fn available(&self) -> usize {
let w = self.write_pos.load(Ordering::Acquire);
let r = self.read_pos.load(Ordering::Acquire);
w.wrapping_sub(r)
let r = self.read_pos.load(Ordering::Relaxed);
w.wrapping_sub(r).min(RING_CAPACITY)
}
/// Number of samples that can be written without overwriting.
/// Number of samples that can be written without overwriting unread data.
pub fn free_space(&self) -> usize {
RING_CAPACITY - self.available()
RING_CAPACITY.saturating_sub(self.available())
}
/// Write samples into the ring. Returns number of samples written.
/// Drops oldest samples if the ring is full.
///
/// If the ring is full, old data is silently overwritten. The reader
/// will detect the lap and self-correct. The writer NEVER touches
/// `read_pos` — this is the key invariant that prevents cursor desync.
pub fn write(&self, samples: &[i16]) -> usize {
let w = self.write_pos.load(Ordering::Relaxed);
let count = samples.len().min(RING_CAPACITY);
let w = self.write_pos.load(Ordering::Relaxed);
for i in 0..count {
let idx = (w + i) % RING_CAPACITY;
// SAFETY: We're the only writer, and the reader won't read
// past read_pos which we haven't advanced past yet.
unsafe {
let ptr = self.buf.as_ptr() as *mut i16;
*ptr.add(idx) = samples[i];
*ptr.add((w + i) & RING_MASK) = samples[i];
}
}
self.write_pos.store(w.wrapping_add(count), Ordering::Release);
// If we overwrote unread data, advance read_pos
if self.available() > RING_CAPACITY {
let new_read = self.write_pos.load(Ordering::Relaxed).wrapping_sub(RING_CAPACITY);
self.read_pos.store(new_read, Ordering::Release);
}
count
}
/// Read samples from the ring into `out`. Returns number of samples read.
///
/// If the writer has lapped the reader (overflow), `read_pos` is snapped
/// forward to the oldest valid data. This is safe because only the
/// reader thread writes `read_pos`.
pub fn read(&self, out: &mut [i16]) -> usize {
let avail = self.available();
let count = out.len().min(avail);
let w = self.write_pos.load(Ordering::Acquire);
let mut r = self.read_pos.load(Ordering::Relaxed);
let mut avail = w.wrapping_sub(r);
// Lap detection: writer has overwritten our unread data.
// Snap read_pos forward to oldest valid data in the buffer.
if avail > RING_CAPACITY {
r = w.wrapping_sub(RING_CAPACITY);
avail = RING_CAPACITY;
self.overflow_count.fetch_add(1, Ordering::Relaxed);
}
let count = out.len().min(avail);
if count == 0 {
if w == r {
self.underrun_count.fetch_add(1, Ordering::Relaxed);
}
return 0;
}
let r = self.read_pos.load(Ordering::Relaxed);
for i in 0..count {
let idx = (r + i) % RING_CAPACITY;
out[i] = unsafe { *self.buf.as_ptr().add(idx) };
out[i] = unsafe { *self.buf.as_ptr().add((r + i) & RING_MASK) };
}
self.read_pos.store(r.wrapping_add(count), Ordering::Release);
count
}
/// Number of overflow events (reader was lapped by writer).
pub fn overflow_count(&self) -> u64 {
self.overflow_count.load(Ordering::Relaxed)
}
/// Number of underrun events (reader found empty buffer).
pub fn underrun_count(&self) -> u64 {
self.underrun_count.load(Ordering::Relaxed)
}
}

View File

@@ -183,6 +183,9 @@ impl WzpEngine {
stats.duration_secs = start.elapsed().as_secs_f64();
}
stats.audio_level = self.state.audio_level_rms.load(Ordering::Relaxed);
stats.playout_overflows = self.state.playout_ring.overflow_count();
stats.playout_underruns = self.state.playout_ring.underrun_count();
stats.capture_overflows = self.state.capture_ring.overflow_count();
stats
}
@@ -333,6 +336,12 @@ async fn run_call(
let mut last_stats_log = Instant::now();
let mut frames_sent: u64 = 0;
let mut frames_dropped: u64 = 0;
// Per-step timing accumulators (reset every stats log)
let mut t_agc_us: u64 = 0;
let mut t_opus_us: u64 = 0;
let mut t_fec_us: u64 = 0;
let mut t_send_us: u64 = 0;
let mut t_frames: u64 = 0;
loop {
if !state.running.load(Ordering::Relaxed) {
break;
@@ -356,9 +365,12 @@ async fn run_call(
}
// AGC: normalize capture volume before encoding
let t0 = Instant::now();
capture_agc.process_frame(&mut capture_buf);
t_agc_us += t0.elapsed().as_micros() as u64;
// Opus encode
let t0 = Instant::now();
let encoded_len = match encoder.encode(&capture_buf, &mut encode_buf) {
Ok(n) => n,
Err(e) => {
@@ -366,6 +378,7 @@ async fn run_call(
continue;
}
};
t_opus_us += t0.elapsed().as_micros() as u64;
let encoded = &encode_buf[..encoded_len];
// Build source packet
@@ -391,6 +404,7 @@ async fn run_call(
};
// Send source packet — drop on error, never break
let t0 = Instant::now();
if let Err(e) = transport.send_media(&source_pkt).await {
send_errors += 1;
frames_dropped += 1;
@@ -405,11 +419,14 @@ async fn run_call(
last_send_error_log = Instant::now();
}
// Don't feed to FEC either — the source is lost
t_send_us += t0.elapsed().as_micros() as u64;
continue;
}
t_send_us += t0.elapsed().as_micros() as u64;
frames_sent += 1;
// Feed encoded frame to FEC encoder
let t0 = Instant::now();
if let Err(e) = fec_enc.add_source_symbol(encoded) {
warn!("fec add_source error: {e}");
}
@@ -466,9 +483,12 @@ async fn run_call(
block_id = block_id.wrapping_add(1);
frame_in_block = 0;
}
t_fec_us += t0.elapsed().as_micros() as u64;
t_frames += 1;
// Periodic stats every 5 seconds
if last_stats_log.elapsed().as_secs() >= 5 {
let avg = |total: u64| if t_frames > 0 { total / t_frames } else { 0 };
info!(
seq = s,
block_id,
@@ -476,8 +496,15 @@ async fn run_call(
frames_dropped,
send_errors,
ring_avail = state.capture_ring.available(),
capture_overflows = state.capture_ring.overflow_count(),
avg_agc_us = avg(t_agc_us),
avg_opus_us = avg(t_opus_us),
avg_fec_us = avg(t_fec_us),
avg_send_us = avg(t_send_us),
avg_total_us = avg(t_agc_us + t_opus_us + t_fec_us + t_send_us),
"send stats"
);
t_agc_us = 0; t_opus_us = 0; t_fec_us = 0; t_send_us = 0; t_frames = 0;
last_stats_log = Instant::now();
}
}
@@ -578,6 +605,8 @@ async fn run_call(
recv_errors,
max_recv_gap_ms,
playout_avail = state.playout_ring.available(),
playout_overflows = state.playout_ring.overflow_count(),
playout_underruns = state.playout_ring.underrun_count(),
"recv stats"
);
max_recv_gap_ms = 0;

View File

@@ -35,12 +35,26 @@ static INIT_LOGGING: Once = Once::new();
/// Safe to call multiple times — only the first call takes effect.
fn init_logging() {
INIT_LOGGING.call_once(|| {
// Wrap in catch_unwind — sharded_slab allocation inside
// tracing_subscriber::registry() can crash on some Android
// devices if scudo malloc fails during early initialization.
let _ = std::panic::catch_unwind(|| {
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::util::SubscriberInitExt;
use tracing_subscriber::EnvFilter;
if let Ok(layer) = tracing_android::layer("wzp_android") {
let _ = tracing_subscriber::registry().with(layer).try_init();
// Filter: INFO for our crates, WARN for everything else.
// The jni crate emits VERBOSE logs for every method lookup
// (~10 lines per JNI call, 100+ calls/sec) which floods logcat
// and causes the system to kill the app.
let filter = EnvFilter::new("warn,wzp_android=info,wzp_proto=info,wzp_transport=info,wzp_codec=info,wzp_fec=info,wzp_crypto=info");
let _ = tracing_subscriber::registry()
.with(layer)
.with(filter)
.try_init();
}
});
});
}
#[unsafe(no_mangle)]
@@ -209,7 +223,6 @@ pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeWriteAudio(
return 0;
}
let mut buf = vec![0i16; len];
// GetShortArrayRegion copies Java array into our buffer
if env.get_short_array_region(&pcm, 0, &mut buf).is_err() {
return 0;
}
@@ -243,6 +256,56 @@ pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeReadAudio(
result.unwrap_or(0)
}
/// Write captured PCM from a DirectByteBuffer — zero JNI array copies.
/// The ByteBuffer must contain little-endian i16 samples.
/// Called from the AudioRecord capture thread.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeWriteAudioDirect(
env: JNIEnv,
_class: JClass,
handle: jlong,
buffer: jni::objects::JByteBuffer,
sample_count: jint,
) -> jint {
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
let h = unsafe { handle_ref(handle) };
let ptr = env.get_direct_buffer_address(&buffer).unwrap_or(std::ptr::null_mut());
if ptr.is_null() || sample_count <= 0 {
return 0;
}
let samples = unsafe {
std::slice::from_raw_parts(ptr as *const i16, sample_count as usize)
};
h.engine.write_audio(samples) as jint
}));
result.unwrap_or(0)
}
/// Read decoded PCM into a DirectByteBuffer — zero JNI array copies.
/// The ByteBuffer will be filled with little-endian i16 samples.
/// Called from the AudioTrack playout thread.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeReadAudioDirect(
env: JNIEnv,
_class: JClass,
handle: jlong,
buffer: jni::objects::JByteBuffer,
max_samples: jint,
) -> jint {
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
let h = unsafe { handle_ref(handle) };
let ptr = env.get_direct_buffer_address(&buffer).unwrap_or(std::ptr::null_mut());
if ptr.is_null() || max_samples <= 0 {
return 0;
}
let samples = unsafe {
std::slice::from_raw_parts_mut(ptr as *mut i16, max_samples as usize)
};
h.engine.read_audio(samples) as jint
}));
result.unwrap_or(0)
}
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeDestroy(
_env: JNIEnv,
@@ -254,3 +317,79 @@ pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativeDestroy(
drop(h);
}));
}
/// Ping a relay server — returns JSON `{"rtt_ms":N,"server_fingerprint":"hex"}` or null on failure.
/// Does NOT require an engine handle — creates a temporary QUIC connection.
#[unsafe(no_mangle)]
pub unsafe extern "system" fn Java_com_wzp_engine_WzpEngine_nativePingRelay<'a>(
mut env: JNIEnv<'a>,
_class: JClass,
relay_j: JString,
) -> jstring {
let result = panic::catch_unwind(panic::AssertUnwindSafe(|| {
let relay: String = env.get_string(&relay_j).map(|s| s.into()).unwrap_or_default();
let addr: std::net::SocketAddr = match relay.parse() {
Ok(a) => a,
Err(_) => return None,
};
let _ = rustls::crypto::ring::default_provider().install_default();
let rt = match tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
{
Ok(rt) => rt,
Err(_) => return None,
};
rt.block_on(async {
let bind: std::net::SocketAddr = "0.0.0.0:0".parse().unwrap();
let endpoint = match wzp_transport::create_endpoint(bind, None) {
Ok(e) => e,
Err(_) => return None,
};
let client_cfg = wzp_transport::client_config();
let start = std::time::Instant::now();
match tokio::time::timeout(
std::time::Duration::from_secs(3),
wzp_transport::connect(&endpoint, addr, "ping", client_cfg),
)
.await
{
Ok(Ok(conn)) => {
let rtt_ms = start.elapsed().as_millis() as u64;
let server_fp = conn
.peer_identity()
.and_then(|id| {
id.downcast::<Vec<rustls::pki_types::CertificateDer>>().ok()
})
.and_then(|certs| {
certs.first().map(|c| {
use std::hash::{Hash, Hasher};
let mut h = std::collections::hash_map::DefaultHasher::new();
c.as_ref().hash(&mut h);
format!("{:016x}", h.finish())
})
})
.unwrap_or_default();
conn.close(0u32.into(), b"ping");
Some(format!(
r#"{{"rtt_ms":{},"server_fingerprint":"{}"}}"#,
rtt_ms, server_fp
))
}
_ => None,
}
})
}));
let json = match result {
Ok(Some(s)) => s,
_ => return JObject::null().into_raw(),
};
env.new_string(&json)
.map(|s| s.into_raw())
.unwrap_or(JObject::null().into_raw())
}

View File

@@ -51,6 +51,12 @@ pub struct CallStats {
pub underruns: u64,
/// Frames recovered by FEC.
pub fec_recovered: u64,
/// Playout ring overflow count (reader was lapped by writer).
pub playout_overflows: u64,
/// Playout ring underrun count (reader found empty buffer).
pub playout_underruns: u64,
/// Capture ring overflow count.
pub capture_overflows: u64,
/// Current mic audio level (RMS of i16 samples, 0-32767).
pub audio_level: u32,
/// Number of participants in the room (from last RoomUpdate).

View File

@@ -0,0 +1,115 @@
# Incident Report: SIGBUS in ART GC During Audio Thread JNI Calls
**Date:** 2026-04-06
**Severity:** High — app crash (SIGBUS) mid-call
**Status:** Root-caused, fix proposed
**Affects:** Android 16 (API 36) devices with concurrent mark-compact GC
## Summary
The app crashes with SIGBUS (signal 7, BUS_ADRERR) during an active call. The crash occurs in ART's garbage collector or JIT compiler, NOT in our Rust native code or AudioRing buffer. Both `wzp-capture` and `wzp-playout` Kotlin threads are affected.
## Crash Details
### Crash 1: wzp-capture (18:42, after 476s of call)
```
Fatal signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0x720009be38
tid 19697 (wzp-capture), pid 17885 (com.wzp.phone)
```
**Backtrace:**
```
#00 art::StackVisitor::WalkStack
#01 art::Thread::VisitRoots
#02 art::gc::collector::MarkCompact::ThreadFlipVisitor::Run
#03 art::Thread::EnsureFlipFunctionStarted
#04 CheckJNI::ReleasePrimitiveArrayElements ← JNI boundary
#05 android_media_AudioRecord_readInArray ← AudioRecord.read()
#09 com.wzp.audio.AudioPipeline.runCapture
```
**Root cause:** ART's concurrent mark-compact GC (`MarkCompact::ThreadFlipVisitor`) is flipping thread roots while the capture thread is in the middle of a JNI call (`AudioRecord.read()`). The GC's `EnsureFlipFunctionStarted` triggers a stack walk that hits an invalid address.
### Crash 2: wzp-playout (19:17, mid-call)
```
Fatal signal 7 (SIGBUS), code 2 (BUS_ADRERR), fault addr 0x225eb98
tid 32574 (wzp-playout), pid 32479 (com.wzp.phone)
```
**Backtrace:**
```
#00 com.wzp.audio.AudioPipeline.runPlayout ← JIT-compiled code
#01 art_quick_osr_stub ← On-Stack Replacement
#02 art::jit::Jit::MaybeDoOnStackReplacement
#03-#04 art::interpreter::ExecuteSwitchImplCpp
```
**Root cause:** ART's JIT compiler performed On-Stack Replacement (OSR) on the hot playout loop. The OSR stub references a code address (`0x225eb98`) that is no longer valid — likely because the GC moved the compiled code in memory during concurrent compaction.
## Why This Happens
Android 16 introduced a new **concurrent mark-compact GC** (CMC) that moves objects in memory while other threads are running. This is safe for normal Java code because ART uses read barriers. But our audio threads have specific properties that stress this:
1. **`Thread.MAX_PRIORITY`** — audio threads run at the highest priority, starving the GC thread of CPU time. The GC may not complete its thread-flip before the audio thread resumes.
2. **Tight JNI loops**`runCapture()` and `runPlayout()` loop every 20ms calling `AudioRecord.read()` / `AudioTrack.write()` via JNI. Each JNI transition is a GC safepoint, but the thread spends most of its time in native code where the GC can't flip it.
3. **Long-running JIT-compiled code** — the hot loop gets JIT-compiled and may undergo OSR. If the GC compacts memory while OSR is in progress, the stub can reference stale addresses.
4. **Daemon threads that never exit** — our threads are parked with `Thread.sleep(Long.MAX_VALUE)` after the call ends (to avoid the libcrypto TLS destructor crash). These zombie threads accumulate GC root scan work.
## Evidence This Is Not Our Bug
| Component | Evidence |
|-----------|---------|
| **AudioRing** | Not in any backtrace. All crash frames are in `libart.so` (ART runtime) |
| **Rust native code** | `libwzp_android.so` not in any crash frame |
| **JNI bridge** | Crash happens during `ReleasePrimitiveArrayElements` (ART internal), not during our JNI calls |
| **Timing** | Crashes after 476s and mid-call — not during init or teardown |
## Proposed Fix
### Option A: Disable concurrent GC compaction for audio threads (recommended)
Use `dalvik.vm.gctype` or per-thread GC pinning to prevent the mark-compact collector from moving objects referenced by audio threads.
**Not directly controllable from app code.** But we can reduce GC pressure:
### Option B: Reduce JNI transitions in audio threads
Instead of calling `engine.writeAudio(pcm)` / `engine.readAudio(pcm)` via JNI on every 20ms frame, batch multiple frames or use `DirectByteBuffer` to share memory without JNI array copies.
**Implementation:**
- Allocate a `DirectByteBuffer` in Kotlin, share the pointer with Rust via JNI
- Audio threads write/read directly to the buffer (no JNI call per frame)
- Rust reads/writes from the same memory region
- Reduces JNI transitions from 100/sec to 0/sec per audio direction
### Option C: Use Android's Oboe (AAudio) natively from Rust
Skip the Kotlin AudioRecord/AudioTrack entirely. Use Oboe (which we already have as a dependency in `wzp-android/Cargo.toml`) to create native audio streams directly from Rust. The audio callbacks run in native code with no JNI, no GC interaction, no ART.
This is how the project was originally designed (see `audio_android.rs` with Oboe references) before switching to Kotlin AudioRecord for simplicity.
**Pros:** Eliminates the entire JNI audio path. No GC interaction. Lower latency.
**Cons:** Requires rewriting `AudioPipeline.kt` into Rust. Oboe setup is more complex.
### Option D: Pin audio thread objects to prevent GC movement
Use JNI `GetPrimitiveArrayCritical` instead of `GetShortArrayRegion` to pin the array in memory during the operation. This prevents the GC from moving the array while we're using it.
**Implementation:** Change `nativeWriteAudio` / `nativeReadAudio` JNI functions to use critical sections.
### Recommendation
**Short term: Option B** (DirectByteBuffer) — reduces JNI transitions without major refactoring.
**Long term: Option C** (Oboe from Rust) — eliminates the problem entirely. This is the architecturally correct solution and matches the original design intent.
## Data Files
- Logcat from Nothing A059 (Android 16, API 36)
- Two crashes in the same session: 18:42 (capture, after 476s) and 19:17 (playout)
- Both SIGBUS/BUS_ADRERR, both in ART internal frames

View File

@@ -0,0 +1,175 @@
# Incident Report: Native Crash in Capture Thread — Use-After-Free on Engine Handle
**Date:** 2026-04-06
**Severity:** Critical — app crash (SIGSEGV) on call hangup
**Status:** Root-caused, fix pending
**Affects:** Android client only
## Summary
The app crashes with a native SIGSEGV during or shortly after call hangup. The crash occurs in JIT-compiled code inside `AudioPipeline.runCapture()`. The root cause is a use-after-free: the capture thread calls `engine.writeAudio()` via JNI after the engine's native handle has been freed by `teardown()` on the ViewModel thread.
## Crash Stacktrace
```
04-06 13:05:42.707 F DEBUG: #09 pc 000000000250696c /memfd:jit-cache (deleted) (com.wzp.audio.AudioPipeline.runCapture+3228)
04-06 13:05:42.707 F DEBUG: #14 pc 0000000000005270 <anonymous:730900d000> (com.wzp.audio.AudioPipeline.start$lambda$0+0)
04-06 13:05:42.708 F DEBUG: #19 pc 00000000000044cc <anonymous:730900d000> (com.wzp.audio.AudioPipeline.$r8$lambda$0rYcivupwvyN4SgBXhsroKmTlo8+0)
04-06 13:05:42.708 F DEBUG: #24 pc 00000000000042e4 <anonymous:730900d000> (com.wzp.audio.AudioPipeline$$ExternalSyntheticLambda0.run+0)
```
This is a tombstone (signal crash), not a Java exception. The `F DEBUG` tag indicates a native crash handler (debuggerd) captured the signal.
## Root Cause
### The Race Condition
Two threads operate on the engine concurrently without synchronization:
**Thread 1: `wzp-capture` (AudioRecord thread, MAX_PRIORITY)**
```kotlin
// AudioPipeline.runCapture() — runs in a tight loop
while (running) {
val read = recorder.read(pcm, 0, FRAME_SAMPLES)
if (read > 0) {
engine.writeAudio(pcm) // <-- JNI call to native engine
}
}
```
**Thread 2: ViewModel/UI thread (normal priority)**
```kotlin
// CallViewModel.teardown()
stopAudio() // sets AudioPipeline.running = false
engine?.stopCall() // tells Rust to stop
engine?.destroy() // frees native memory, sets nativeHandle = 0L
engine = null
```
### The Kotlin Guard is Insufficient
`WzpEngine.writeAudio()` has a guard:
```kotlin
fun writeAudio(pcm: ShortArray): Int {
if (nativeHandle == 0L) return 0 // check
return nativeWriteAudio(nativeHandle, pcm) // use
}
```
This is a **TOCTOU (time-of-check/time-of-use) race**:
1. Capture thread checks `nativeHandle != 0L` → true
2. ViewModel thread calls `destroy()`, which calls `nativeDestroy(handle)` then sets `nativeHandle = 0L`
3. Capture thread calls `nativeWriteAudio(handle, pcm)` with the now-freed handle
4. The JNI function dereferences `handle` as a pointer → **SIGSEGV**
The same race exists for `readAudio()` on the `wzp-playout` thread.
### Why `stopAudio()` Doesn't Prevent This
`AudioPipeline.stop()` sets `running = false` but does **NOT join or wait** for the threads:
```kotlin
fun stop() {
running = false
// Don't join — threads are parked as daemons to avoid native TLS crash
captureThread = null
playoutThread = null
}
```
The threads are intentionally not joined because of a separate bug: exiting a JNI-calling thread triggers a `SIGSEGV in OPENSSL_free` due to libcrypto TLS destructors on Android. The threads instead "park" with `Thread.sleep(Long.MAX_VALUE)` after the loop exits.
But the problem is the **window between `running = false` and the thread actually checking it**. The capture thread may be blocked in `recorder.read()` (which blocks for 20ms per frame) or in the middle of `engine.writeAudio()` when `destroy()` is called.
### Timeline of the Crash
```
T=0ms ViewModel: stopAudio() → sets running=false
T=0ms ViewModel: stopStatsPolling()
T=0ms ViewModel: engine.stopCall() — Rust stops internal tasks
T=1ms ViewModel: engine.destroy() — frees native memory
↑ nativeHandle = 0L
T=0-20ms Capture thread: still in recorder.read() or writeAudio()
→ if in writeAudio(), the nativeHandle check passed BEFORE destroy()
→ JNI dereferences freed pointer → SIGSEGV
```
## Affected Code
### Files with the race
| File | Line(s) | Issue |
|------|---------|-------|
| `android/.../WzpEngine.kt` | 107-108, 116-117 | TOCTOU on `nativeHandle` in `writeAudio()` / `readAudio()` |
| `android/.../CallViewModel.kt` | 257-262 | `stopAudio()` + `destroy()` without waiting for audio threads to quiesce |
| `android/.../AudioPipeline.kt` | 80-82 | `stop()` doesn't synchronize with running threads |
### Files with the thread parking workaround
| File | Line(s) | Context |
|------|---------|---------|
| `android/.../AudioPipeline.kt` | 57-58, 69-70 | Threads parked after loop exit to avoid libcrypto TLS crash |
| `android/.../AudioPipeline.kt` | 96-101 | `parkThread()``Thread.sleep(Long.MAX_VALUE)` |
## Constraints for the Fix
1. **Cannot join audio threads** — joining triggers a separate SIGSEGV in `OPENSSL_free` when the thread's TLS destructors fire (documented in `AudioPipeline.kt` comments). The parking workaround must be preserved.
2. **Must guarantee no JNI calls after `destroy()`** — the native handle is a raw pointer; any dereference after free is undefined behavior.
3. **Must not add blocking waits on the UI thread**`teardown()` runs on the ViewModel thread which must remain responsive.
4. **The `@Volatile running` flag is necessary but not sufficient** — it prevents new loop iterations but doesn't help with in-flight JNI calls.
5. **Both `writeAudio` and `readAudio` have the same race** — the fix must cover both the capture and playout paths.
## Reproduction
The crash is timing-dependent. It's most likely to occur when:
- The capture thread is in the middle of a `writeAudio()` JNI call when `destroy()` is called
- More likely on slower devices or under CPU pressure (GC, thermal throttling)
- Can happen on every hangup, but only crashes ~10-30% of the time due to the timing window
## Analysis of Possible Fix Approaches
### Approach A: Add a synchronization gate in the JNI bridge
Use a `ReentrantReadWriteLock` or `AtomicBoolean` in `WzpEngine.kt`:
- Audio threads acquire a read lock / check the flag before JNI calls
- `destroy()` acquires a write lock / sets the flag and waits for in-flight calls to drain
**Pro:** Clean, solves the race directly.
**Con:** Adding a lock to the audio hot path (every 20ms). `ReentrantReadWriteLock` is not lock-free. However, the read-lock path is uncontended 99.99% of the time (write-lock only during destroy), so contention is negligible.
### Approach B: Defer `destroy()` until audio threads have stopped
Instead of calling `destroy()` in `teardown()`, set a flag and have the audio threads call `destroy()` after they exit the loop (before parking).
**Pro:** No locks on hot path.
**Con:** Complex lifecycle — which thread calls destroy? What if both threads race to destroy? Need a `CountDownLatch` or similar.
### Approach C: Make the JNI handle atomically invalidated
Use `AtomicLong` for `nativeHandle` and use `compareAndExchange` in `destroy()` + `getAndCheck` pattern in audio calls.
**Pro:** Lock-free.
**Con:** Still has a TOCTOU window — the thread can load the handle, then it gets CAS'd to 0, then the thread uses the stale handle. Doesn't fully solve the race without combining with a reference count or epoch.
### Approach D: Introduce a destroy latch
Add a `CountDownLatch(1)` that audio threads wait on before parking. `teardown()` sets `running=false`, then `await`s the latch (with timeout), then calls `destroy()`. Each audio thread counts down the latch after exiting the loop.
Actually this needs a `CountDownLatch(2)` — one for each thread (capture + playout).
**Pro:** Guarantees no in-flight JNI calls at destroy time. No locks on hot path.
**Con:** `teardown()` blocks for up to one frame duration (~20ms) waiting for threads to exit their loops. Acceptable for a hangup path.
### Recommendation
**Approach D (destroy latch)** is the cleanest. The 20ms worst-case wait is imperceptible on the hangup path, and it provides a hard guarantee that no JNI calls are in flight when `destroy()` runs. Combined with the existing `running` volatile flag, the audio threads exit their loops within one frame and count down the latch.
If the latch times out (e.g., AudioRecord.read() is stuck), `destroy()` proceeds anyway — the `panic::catch_unwind` in the JNI bridge will catch the invalid access as a panic rather than a SIGSEGV (though this is best-effort; a true SIGSEGV from freed memory is not catchable).
## Data Files
The crash was captured from the Nothing A059 device at 13:05:42 on 2026-04-06. The tombstone is in the device's `/data/tombstones/` directory. The logcat output shows the crash frames.

View File

@@ -0,0 +1,394 @@
# Fix: AudioRing SPSC Buffer Cursor Desync
## Problem
A critical bug causes 10-16 seconds of bidirectional audio silence mid-call (~25-30s in). Both participants go silent at the exact same moment. The QUIC transport, relay, Opus codec, and FEC are all healthy — the bug is in the lock-free ring buffer that transfers decoded PCM from the Rust recv task to the Kotlin AudioTrack playout thread.
**Root cause:** `AudioRing::write()` modifies `read_pos` from the producer thread during overflow handling (lines 68-72 of `audio_ring.rs`). This violates the SPSC invariant — only the consumer should own `read_pos`. When both threads write to `read_pos`, a race corrupts the cursor state, causing the reader to see an empty or stale buffer for 12-16 seconds.
**Full forensics:** `debug/INCIDENT-2026-04-06-playout-ring-desync.md`
---
## Solution: Reader-Detects-Lap Architecture
The writer NEVER touches `read_pos`. On overflow, the writer simply overwrites old buffer data and advances `write_pos`. The reader detects it was lapped and self-corrects by snapping its own `read_pos` forward.
---
## Implementation Steps
### Step 1: Rewrite `AudioRing`
**File:** `crates/wzp-android/src/audio_ring.rs`
Replace the entire implementation with:
**Constants:**
```rust
/// Ring buffer capacity — must be a power of 2 for bitmask indexing.
/// 16384 samples = 341.3ms at 48kHz mono. Provides 70% more headroom
/// than the previous 9600 (200ms) for surviving Android GC pauses.
const RING_CAPACITY: usize = 16384; // 2^14
const RING_MASK: usize = RING_CAPACITY - 1;
```
**Struct:**
```rust
pub struct AudioRing {
buf: Box<[i16; RING_CAPACITY]>,
write_pos: AtomicUsize, // monotonically increasing, ONLY written by producer
read_pos: AtomicUsize, // monotonically increasing, ONLY written by consumer
overflow_count: AtomicU64, // incremented by reader when it detects a lap
underrun_count: AtomicU64, // incremented by reader when ring is empty
}
```
**`write()` — producer. Does NOT touch `read_pos`:**
```rust
pub fn write(&self, samples: &[i16]) -> usize {
let count = samples.len().min(RING_CAPACITY);
let w = self.write_pos.load(Ordering::Relaxed);
for i in 0..count {
unsafe {
let ptr = self.buf.as_ptr() as *mut i16;
*ptr.add((w + i) & RING_MASK) = samples[i];
}
}
self.write_pos.store(w.wrapping_add(count), Ordering::Release);
count
}
```
**`read()` — consumer. Detects lap, self-corrects:**
```rust
pub fn read(&self, out: &mut [i16]) -> usize {
let w = self.write_pos.load(Ordering::Acquire);
let mut r = self.read_pos.load(Ordering::Relaxed);
let mut avail = w.wrapping_sub(r);
// Lap detection: writer has overwritten our unread data.
// Snap read_pos forward to oldest valid data in the buffer.
// Safe because we (the reader) are the sole owner of read_pos.
if avail > RING_CAPACITY {
r = w.wrapping_sub(RING_CAPACITY);
avail = RING_CAPACITY;
self.overflow_count.fetch_add(1, Ordering::Relaxed);
}
let count = out.len().min(avail);
if count == 0 {
if w == r {
self.underrun_count.fetch_add(1, Ordering::Relaxed);
}
return 0;
}
for i in 0..count {
out[i] = unsafe { *self.buf.as_ptr().add((r + i) & RING_MASK) };
}
self.read_pos.store(r.wrapping_add(count), Ordering::Release);
count
}
```
**`available()` — clamped for external callers:**
```rust
pub fn available(&self) -> usize {
let w = self.write_pos.load(Ordering::Acquire);
let r = self.read_pos.load(Ordering::Relaxed);
w.wrapping_sub(r).min(RING_CAPACITY)
}
```
**`free_space()` — keep for API compat:**
```rust
pub fn free_space(&self) -> usize {
RING_CAPACITY.saturating_sub(self.available())
}
```
**Diagnostic accessors:**
```rust
pub fn overflow_count(&self) -> u64 {
self.overflow_count.load(Ordering::Relaxed)
}
pub fn underrun_count(&self) -> u64 {
self.underrun_count.load(Ordering::Relaxed)
}
```
**Constructor:**
```rust
pub fn new() -> Self {
debug_assert!(RING_CAPACITY.is_power_of_two());
Self {
buf: Box::new([0i16; RING_CAPACITY]),
write_pos: AtomicUsize::new(0),
read_pos: AtomicUsize::new(0),
overflow_count: AtomicU64::new(0),
underrun_count: AtomicU64::new(0),
}
}
```
**Imports to add:** `use std::sync::atomic::AtomicU64;`
**Safety comment update:**
```rust
// SAFETY: AudioRing is SPSC — one thread writes (producer), one reads (consumer).
// The producer only writes write_pos. The consumer only writes read_pos.
// Neither thread writes the other's cursor. Buffer indices are derived from
// the owning thread's cursor, ensuring no concurrent access to the same index.
```
---
### Step 2: Add counter fields to `CallStats`
**File:** `crates/wzp-android/src/stats.rs`
Add three fields to the `CallStats` struct (after `fec_recovered`):
```rust
/// Playout ring overflow count (reader was lapped by writer).
pub playout_overflows: u64,
/// Playout ring underrun count (reader found empty buffer).
pub playout_underruns: u64,
/// Capture ring overflow count.
pub capture_overflows: u64,
```
These derive `Default` (= 0) automatically via the existing `#[derive(Default)]`.
---
### Step 3: Wire ring diagnostics into engine stats + logging
**File:** `crates/wzp-android/src/engine.rs`
**3a.** In `get_stats()` (~line 181), populate the new fields:
```rust
stats.playout_overflows = self.state.playout_ring.overflow_count();
stats.playout_underruns = self.state.playout_ring.underrun_count();
stats.capture_overflows = self.state.capture_ring.overflow_count();
```
**3b.** In the recv task periodic stats log, add ring health:
```rust
info!(
frames_decoded,
fec_recovered,
recv_errors,
max_recv_gap_ms,
playout_avail = state.playout_ring.available(),
playout_overflows = state.playout_ring.overflow_count(),
playout_underruns = state.playout_ring.underrun_count(),
"recv stats"
);
```
**3c.** In the send task periodic stats log, add capture ring health:
```rust
info!(
seq = s,
block_id,
frames_sent,
frames_dropped,
send_errors,
ring_avail = state.capture_ring.available(),
capture_overflows = state.capture_ring.overflow_count(),
"send stats"
);
```
---
### Step 4: Parse new stats in Kotlin
**File:** `android/app/src/main/java/com/wzp/engine/CallStats.kt`
Add fields to the data class:
```kotlin
val playoutOverflows: Long = 0,
val playoutUnderruns: Long = 0,
val captureOverflows: Long = 0,
```
Add parsing in `fromJson()`:
```kotlin
playoutOverflows = obj.optLong("playout_overflows", 0),
playoutUnderruns = obj.optLong("playout_underruns", 0),
captureOverflows = obj.optLong("capture_overflows", 0),
```
No UI changes needed — these fields will appear in debug report JSON automatically.
---
### Step 5: Unit tests
**File:** `crates/wzp-android/src/audio_ring.rs` — add `#[cfg(test)] mod tests`
```rust
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn capacity_is_power_of_two() {
assert!(RING_CAPACITY.is_power_of_two());
}
#[test]
fn basic_write_read() {
let ring = AudioRing::new();
let input: Vec<i16> = (0..960).map(|i| i as i16).collect();
ring.write(&input);
assert_eq!(ring.available(), 960);
let mut output = vec![0i16; 960];
let read = ring.read(&mut output);
assert_eq!(read, 960);
assert_eq!(output, input);
assert_eq!(ring.available(), 0);
}
#[test]
fn wraparound() {
let ring = AudioRing::new();
let frame = vec![42i16; 960];
// Write enough to wrap the buffer multiple times
for _ in 0..20 {
ring.write(&frame);
let mut out = vec![0i16; 960];
ring.read(&mut out);
assert!(out.iter().all(|&s| s == 42));
}
}
#[test]
fn overflow_detected_by_reader() {
let ring = AudioRing::new();
// Write more than RING_CAPACITY without reading
let big = vec![7i16; RING_CAPACITY + 960];
ring.write(&big[..RING_CAPACITY]);
ring.write(&big[RING_CAPACITY..]);
// Reader should detect lap
let mut out = vec![0i16; 960];
let read = ring.read(&mut out);
assert!(read > 0);
assert_eq!(ring.overflow_count(), 1);
// Data should be from the most recent writes
assert!(out.iter().all(|&s| s == 7));
}
#[test]
fn writer_never_modifies_read_pos() {
let ring = AudioRing::new();
// Read pos should stay at 0 until read() is called
let data = vec![1i16; RING_CAPACITY + 960];
ring.write(&data);
// read_pos is private, but we can check available() > CAPACITY
// which proves write() didn't advance read_pos
let w = ring.write_pos.load(std::sync::atomic::Ordering::Relaxed);
let r = ring.read_pos.load(std::sync::atomic::Ordering::Relaxed);
assert_eq!(r, 0, "write() must not modify read_pos");
assert!(w.wrapping_sub(r) > RING_CAPACITY);
}
#[test]
fn underrun_counted() {
let ring = AudioRing::new();
let mut out = vec![0i16; 960];
let read = ring.read(&mut out);
assert_eq!(read, 0);
assert_eq!(ring.underrun_count(), 1);
}
#[test]
fn overflow_recovery_reads_recent_data() {
let ring = AudioRing::new();
// Fill with old data
let old = vec![1i16; RING_CAPACITY];
ring.write(&old);
// Overwrite with new data (lapping the reader)
let new_data = vec![99i16; 960];
ring.write(&new_data);
// Reader should snap forward and get recent data
let mut out = vec![0i16; RING_CAPACITY];
let read = ring.read(&mut out);
assert_eq!(read, RING_CAPACITY);
// The last 960 samples should be 99
assert!(out[RING_CAPACITY - 960..].iter().all(|&s| s == 99));
assert_eq!(ring.overflow_count(), 1);
}
}
```
---
## Memory Ordering Reference
| Operation | Ordering | Rationale |
|-----------|----------|-----------|
| `write_pos.store` in `write()` | Release | Buffer writes visible before cursor advances |
| `write_pos.load` in `read()` | Acquire | Pairs with Release above — sees all buffer writes |
| `write_pos.load` in `write()` | Relaxed | Writer is sole owner of write_pos |
| `read_pos.load` in `read()` | Relaxed | Reader is sole owner of read_pos |
| `read_pos.store` in `read()` | Release | Makes available() consistent from any thread |
| `read_pos.load` in `available()` | Relaxed | Informational only, slight staleness OK |
| All counters | Relaxed | Diagnostic only |
---
## Capacity Tradeoff
| Capacity | Duration | Memory | Verdict |
|----------|----------|--------|---------|
| 8192 (2^13) | 170ms | 16KB | Less than current 200ms — risky |
| **16384 (2^14)** | **341ms** | **32KB** | **70% more headroom, bitmask indexing** |
| 32768 (2^15) | 682ms | 64KB | Excessive latency on overflow recovery |
---
## Verification
1. `cargo test -p wzp-android` — new unit tests pass
2. `cargo ndk -t arm64-v8a build --release -p wzp-android` — ARM cross-compile succeeds
3. Build APK, install on both test devices (Nothing A059 + Pixel 6)
4. 2+ minute call — verify no audio gaps
5. Check debug report JSON: `playout_overflows` should be 0 or very small
6. Check logcat `wzp_android` tag: send/recv stats show healthy ring state
7. Stress test: play music through one device speaker while on call — forces high ring throughput
---
## Files to Modify
| File | What changes |
|------|-------------|
| `crates/wzp-android/src/audio_ring.rs` | Complete rewrite — the core fix |
| `crates/wzp-android/src/stats.rs` | Add 3 counter fields |
| `crates/wzp-android/src/engine.rs` | Wire counters into get_stats() + periodic logs |
| `android/app/src/main/java/com/wzp/engine/CallStats.kt` | Parse 3 new JSON fields |
## What Does NOT Change
- `AudioPipeline.kt` — calls `readAudio()`/`writeAudio()` unchanged; ring fix is transparent
- `jni_bridge.rs` — JNI bridge passes through unchanged
- `audio_android.rs` — separate Oboe-based ring, currently unused, different design
- Relay code — relay is confirmed healthy
- Desktop client — uses `Mutex + mpsc`, not `AudioRing`

View File

@@ -0,0 +1,149 @@
# Fix: Capture/Playout Thread Use-After-Free on Hangup
## Problem
App crashes (SIGSEGV) when hanging up a call. The capture thread (`wzp-capture`) calls `engine.writeAudio()` via JNI after `teardown()` has freed the native engine handle. Same race exists for the playout thread's `readAudio()`.
**Root cause:** TOCTOU race between the `nativeHandle == 0L` check in `WzpEngine.writeAudio()`/`readAudio()` and `destroy()` freeing the native memory on the ViewModel thread. Audio threads can't be joined (libcrypto TLS destructor crash), so there's no synchronization between `stopAudio()` and `destroy()`.
**Full forensics:** `debug/INCIDENT-2026-04-06-capture-thread-use-after-free.md`
---
## Solution: Destroy Latch
Add a `CountDownLatch(2)` that both audio threads count down after exiting their loops. `teardown()` awaits the latch (with timeout) before calling `destroy()`, guaranteeing no in-flight JNI calls.
---
## Implementation Steps
### Step 1: Add a drain latch to `AudioPipeline`
**File:** `android/app/src/main/java/com/wzp/audio/AudioPipeline.kt`
Add a `CountDownLatch` field:
```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit
class AudioPipeline(private val context: Context) {
// ... existing fields ...
/** Latch counted down by each audio thread after exiting its loop.
* stop() does NOT wait on this — teardown waits via awaitDrain(). */
private var drainLatch: CountDownLatch? = null
```
In `start()`, create the latch before spawning threads:
```kotlin
fun start(engine: WzpEngine) {
if (running) return
running = true
drainLatch = CountDownLatch(2) // one for capture, one for playout
captureThread = Thread({
runCapture(engine)
drainLatch?.countDown() // signal: capture loop exited
parkThread()
}, "wzp-capture").apply { ... }
playoutThread = Thread({
runPlayout(engine)
drainLatch?.countDown() // signal: playout loop exited
parkThread()
}, "wzp-playout").apply { ... }
// ...
}
```
Add `awaitDrain()` — called by ViewModel before `destroy()`:
```kotlin
/** Block until both audio threads have exited their loops (max 200ms).
* After this returns, no more JNI calls to the engine will be made. */
fun awaitDrain(): Boolean {
return drainLatch?.await(200, TimeUnit.MILLISECONDS) ?: true
}
```
`stop()` remains unchanged (non-blocking, sets `running = false`).
### Step 2: Update `CallViewModel.teardown()` to await drain
**File:** `android/app/src/main/java/com/wzp/ui/call/CallViewModel.kt`
Change teardown to wait for audio threads before destroying:
```kotlin
private fun teardown(stopService: Boolean = true) {
Log.i(TAG, "teardown: stopping audio, stopService=$stopService")
val hadCall = audioStarted
CallService.onStopFromNotification = null
stopAudio() // sets running=false (non-blocking)
stopStatsPolling()
// Wait for audio threads to exit their loops before destroying the engine.
// This guarantees no in-flight JNI calls to writeAudio/readAudio.
val drained = audioPipeline?.awaitDrain() ?: true
if (!drained) {
Log.w(TAG, "teardown: audio threads did not drain in time")
}
audioPipeline = null
Log.i(TAG, "teardown: stopping engine")
try { engine?.stopCall() } catch (e: Exception) { Log.w(TAG, "stopCall err: $e") }
try { engine?.destroy() } catch (e: Exception) { Log.w(TAG, "destroy err: $e") }
engine = null
engineInitialized = false
// ... rest unchanged
}
```
**Key change:** `awaitDrain()` is called AFTER `stopAudio()` (which sets `running=false`) but BEFORE `engine?.destroy()`. The latch guarantees both threads have exited their `while(running)` loops and will never call `writeAudio`/`readAudio` again.
Also move `audioPipeline = null` to after `awaitDrain()` to keep the reference alive for the latch call.
### Step 3: Move `stopAudio()` pipeline nulling
**File:** `android/app/src/main/java/com/wzp/ui/call/CallViewModel.kt`
In `stopAudio()`, do NOT null out the pipeline — let `teardown()` handle it after drain:
```kotlin
private fun stopAudio() {
if (!audioStarted) return
audioPipeline?.stop() // sets running=false
// DON'T null audioPipeline here — teardown() needs it for awaitDrain()
audioRouteManager?.unregister()
audioRouteManager?.setSpeaker(false)
_isSpeaker.value = false
audioStarted = false
}
```
---
## Files to Modify
| File | What changes |
|------|-------------|
| `android/.../audio/AudioPipeline.kt` | Add `CountDownLatch`, `countDown()` in threads, `awaitDrain()` method |
| `android/.../ui/call/CallViewModel.kt` | `teardown()` calls `awaitDrain()` before `destroy()`; `stopAudio()` doesn't null pipeline |
## What Does NOT Change
- `WzpEngine.kt` — the `nativeHandle == 0L` guard stays as defense-in-depth
- `jni_bridge.rs``panic::catch_unwind` stays as last resort
- `AudioPipeline.stop()` — remains non-blocking
- Thread parking — still needed to avoid libcrypto TLS crash
## Verification
1. Build APK, install on test device
2. Make a call, hang up — verify no crash in logcat (`adb logcat -s AndroidRuntime:E DEBUG:F`)
3. Rapid call/hangup/call/hangup cycles — stress the teardown path
4. Check logcat for `teardown: audio threads did not drain in time` — should never appear under normal conditions
5. Verify debug report still works after hangup (latch doesn't interfere with report collection)

376
scripts/build-android-cloud.sh Executable file
View File

@@ -0,0 +1,376 @@
#!/usr/bin/env bash
set -euo pipefail
# Build WarzonePhone Android APK using a temporary Hetzner Cloud VPS.
# Creates a VM, builds both debug and release APKs, downloads them, destroys the VM.
#
# Prerequisites: hcloud CLI authenticated, SSH key "wz" registered.
#
# Usage:
# ./scripts/build-android-cloud.sh Full build (create → build → download → destroy)
# ./scripts/build-android-cloud.sh --prepare Create VM and install deps only
# ./scripts/build-android-cloud.sh --build Build on existing VM
# ./scripts/build-android-cloud.sh --transfer Download APKs from VM
# ./scripts/build-android-cloud.sh --destroy Delete the VM
# ./scripts/build-android-cloud.sh --all prepare + build + transfer (VM persists)
# ./scripts/build-android-cloud.sh --upload Re-upload source to existing VM
#
# Environment variables (all optional):
# WZP_BRANCH Branch to build (default: feat/android-voip-client)
# WZP_SERVER_TYPE Hetzner server type (default: cx32 — 4 vCPU, 8GB RAM)
# WZP_KEEP_VM Set to 1 to skip destroy on full build
SSH_KEY_NAME="wz"
SSH_KEY_PATH="/Users/manwe/CascadeProjects/wzp"
SERVER_TYPE="${WZP_SERVER_TYPE:-cx33}"
IMAGE="ubuntu-24.04"
SERVER_NAME="wzp-android-builder"
REMOTE_USER="root"
OUTPUT_DIR="target/android-apk"
PROJECT_DIR="$(cd "$(dirname "$0")/.." && pwd)"
BRANCH="${WZP_BRANCH:-feat/android-voip-client}"
KEEP_VM="${WZP_KEEP_VM:-0}"
SSH_OPTS="-o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o ConnectTimeout=10 -o LogLevel=ERROR"
# NDK 26.1 — NDK 27 crashes scudo on Android 16 MTE devices
NDK_VERSION="26.1.10909125"
ANDROID_API="34"
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
log() { echo -e "\n\033[1;36m>>> $*\033[0m"; }
err() { echo -e "\033[1;31mERROR: $*\033[0m" >&2; }
die() { err "$@"; do_destroy_quiet; exit 1; }
get_vm_ip() {
hcloud server list -o columns=name,ipv4 -o noheader 2>/dev/null | grep "$SERVER_NAME" | awk '{print $2}' | tr -d ' '
}
ssh_cmd() {
local ip
ip=$(get_vm_ip)
[ -n "$ip" ] || die "No VM found. Run --prepare first."
ssh $SSH_OPTS -A -i "$SSH_KEY_PATH" "$REMOTE_USER@$ip" "$@"
}
scp_down() {
local ip
ip=$(get_vm_ip)
[ -n "$ip" ] || die "No VM found."
scp $SSH_OPTS -i "$SSH_KEY_PATH" "$REMOTE_USER@$ip:$1" "$2"
}
do_destroy_quiet() {
local name
name=$(hcloud server list -o columns=name -o noheader 2>/dev/null | grep "$SERVER_NAME" | tr -d ' ' || true)
if [ -n "$name" ]; then
echo ""
err "Cleaning up — destroying VM $name"
hcloud server delete "$name" 2>/dev/null || true
fi
}
# ---------------------------------------------------------------------------
# --prepare: Create VM, install all build dependencies
# ---------------------------------------------------------------------------
do_prepare() {
# Check if VM already exists
local existing
existing=$(hcloud server list -o columns=name -o noheader 2>/dev/null | grep "$SERVER_NAME" | tr -d ' ' || true)
if [ -n "$existing" ]; then
log "VM already exists: $existing — reusing"
do_upload
return
fi
log "Creating Hetzner VM ($SERVER_TYPE, $IMAGE)..."
hcloud server create \
--name "$SERVER_NAME" \
--type "$SERVER_TYPE" \
--image "$IMAGE" \
--ssh-key "$SSH_KEY_NAME" \
--location fsn1 \
--quiet \
|| die "Failed to create VM"
local ip
ip=$(get_vm_ip)
[ -n "$ip" ] || die "VM created but no IP found"
echo " VM: $SERVER_NAME @ $ip"
# Wait for SSH
log "Waiting for SSH..."
local ok=0
for i in $(seq 1 30); do
if ssh $SSH_OPTS -i "$SSH_KEY_PATH" "$REMOTE_USER@$ip" "echo ok" &>/dev/null; then
ok=1
break
fi
sleep 2
done
[ "$ok" -eq 1 ] || die "SSH timeout after 60s"
# System packages
log "Installing system packages (cmake, JDK 17, build tools)..."
ssh_cmd "export DEBIAN_FRONTEND=noninteractive && \
apt-get update -qq && \
apt-get install -y -qq \
build-essential cmake curl git libssl-dev pkg-config \
unzip wget zip openjdk-17-jdk-headless \
> /dev/null 2>&1" \
|| die "Failed to install system packages"
# Verify cmake version (must be <= 3.30)
local cmake_ver
cmake_ver=$(ssh_cmd "cmake --version | head -1")
echo " cmake: $cmake_ver"
echo " java: $(ssh_cmd "java -version 2>&1 | head -1")"
# Rust
log "Installing Rust toolchain..."
ssh_cmd "curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable > /dev/null 2>&1" \
|| die "Failed to install Rust"
ssh_cmd "source \$HOME/.cargo/env && rustup target add aarch64-linux-android > /dev/null 2>&1"
ssh_cmd "source \$HOME/.cargo/env && cargo install cargo-ndk > /dev/null 2>&1" \
|| die "Failed to install cargo-ndk"
echo " rust: $(ssh_cmd "source \$HOME/.cargo/env && rustc --version")"
# Android SDK + NDK
log "Installing Android SDK + NDK $NDK_VERSION..."
ssh_cmd "export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 && \
mkdir -p \$HOME/android-sdk/cmdline-tools && \
cd /tmp && \
wget -q https://dl.google.com/android/repository/commandlinetools-linux-11076708_latest.zip -O cmdtools.zip && \
unzip -qo cmdtools.zip -d \$HOME/android-sdk/cmdline-tools && \
mv \$HOME/android-sdk/cmdline-tools/cmdline-tools \$HOME/android-sdk/cmdline-tools/latest 2>/dev/null; \
yes | \$HOME/android-sdk/cmdline-tools/latest/bin/sdkmanager --licenses > /dev/null 2>&1; \
\$HOME/android-sdk/cmdline-tools/latest/bin/sdkmanager --install \
'platforms;android-${ANDROID_API}' \
'build-tools;${ANDROID_API}.0.0' \
'ndk;${NDK_VERSION}' \
'platform-tools' \
2>&1 | grep -v '^\[' > /dev/null" \
|| die "Failed to install Android SDK/NDK"
ssh_cmd "[ -d \$HOME/android-sdk/ndk/$NDK_VERSION ]" \
|| die "NDK not found after install"
echo " NDK: $NDK_VERSION"
# Upload source
do_upload
log "VM ready!"
echo " IP: $ip"
echo " SSH: ssh -A -i $SSH_KEY_PATH root@$ip"
}
# ---------------------------------------------------------------------------
# --upload: Upload source code to VM
# ---------------------------------------------------------------------------
do_upload() {
log "Uploading source code (rsync)..."
local ip
ip=$(get_vm_ip)
[ -n "$ip" ] || die "No VM found."
rsync -az --delete \
--exclude='target' \
--exclude='.git' \
--exclude='.claude' \
--exclude='node_modules' \
--exclude='dist' \
--exclude='desktop/src-tauri/gen' \
-e "ssh $SSH_OPTS -i $SSH_KEY_PATH" \
"$PROJECT_DIR/" "$REMOTE_USER@$ip:/root/wzp-build/"
echo " Source uploaded."
}
# ---------------------------------------------------------------------------
# --build: Build native .so + debug & release APKs
# ---------------------------------------------------------------------------
do_build() {
log "Building Rust native library (arm64-v8a, release)..."
# Clean Rust release target to force full rebuild.
# cargo-ndk only copies libc++_shared.so when it actually links — a partial
# clean that skips relinking leaves libc++_shared.so missing from jniLibs.
ssh_cmd "rm -rf /root/wzp-build/target/aarch64-linux-android/release \
/root/wzp-build/android/app/src/main/jniLibs/arm64-v8a"
# ANDROID_NDK must be set (not just ANDROID_NDK_HOME) — cmake checks it
ssh_cmd "source \$HOME/.cargo/env && \
export ANDROID_HOME=\$HOME/android-sdk && \
export ANDROID_NDK_HOME=\$ANDROID_HOME/ndk/$NDK_VERSION && \
export ANDROID_NDK=\$ANDROID_NDK_HOME && \
cd /root/wzp-build && \
cargo ndk -t arm64-v8a \
-o android/app/src/main/jniLibs \
build --release -p wzp-android 2>&1" | tail -5 \
|| die "Rust native build failed"
ssh_cmd "[ -f /root/wzp-build/android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so ]" \
|| die "libwzp_android.so not found after build"
local so_size
so_size=$(ssh_cmd "du -h /root/wzp-build/android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so | cut -f1")
echo " .so: $so_size"
# Generate debug keystore if missing
ssh_cmd "[ -f /root/wzp-build/android/keystore/wzp-debug.jks ] || \
(mkdir -p /root/wzp-build/android/keystore && \
keytool -genkey -v \
-keystore /root/wzp-build/android/keystore/wzp-debug.jks \
-keyalg RSA -keysize 2048 -validity 10000 \
-alias wzp-debug -storepass android -keypass android \
-dname 'CN=WZP Debug' > /dev/null 2>&1)"
# Build debug APK
log "Building debug APK..."
ssh_cmd "export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 && \
export ANDROID_HOME=\$HOME/android-sdk && \
cd /root/wzp-build/android && \
chmod +x ./gradlew && \
./gradlew assembleDebug --no-daemon --warning-mode=none 2>&1" | tail -3 \
|| die "Debug APK build failed"
# Build release APK (uses debug keystore for now)
log "Building release APK..."
# Copy debug keystore as release keystore (same password in build.gradle)
ssh_cmd "cp /root/wzp-build/android/keystore/wzp-debug.jks /root/wzp-build/android/keystore/wzp-release.jks 2>/dev/null; true"
ssh_cmd "export JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64 && \
export ANDROID_HOME=\$HOME/android-sdk && \
cd /root/wzp-build/android && \
./gradlew assembleRelease --no-daemon --warning-mode=none 2>&1" | tail -3 \
|| echo " (release APK failed — debug APK still available)"
log "Build complete!"
ssh_cmd "find /root/wzp-build/android -name '*.apk' -path '*/outputs/apk/*' -exec ls -lh {} \;"
}
# ---------------------------------------------------------------------------
# --transfer: Download APKs to local machine
# ---------------------------------------------------------------------------
do_transfer() {
log "Downloading APKs..."
mkdir -p "$OUTPUT_DIR"
local ip
ip=$(get_vm_ip)
# Debug APK
local debug_apk
debug_apk=$(ssh_cmd "find /root/wzp-build/android -name 'app-debug*.apk' -path '*/outputs/apk/*' | head -1")
if [ -n "$debug_apk" ]; then
scp_down "$debug_apk" "$OUTPUT_DIR/wzp-debug.apk"
echo " debug: $OUTPUT_DIR/wzp-debug.apk ($(du -h "$OUTPUT_DIR/wzp-debug.apk" | cut -f1))"
fi
# Release APK
local release_apk
release_apk=$(ssh_cmd "find /root/wzp-build/android -name 'app-release*.apk' -path '*/outputs/apk/*' | head -1" || true)
if [ -n "$release_apk" ]; then
scp_down "$release_apk" "$OUTPUT_DIR/wzp-release.apk"
echo " release: $OUTPUT_DIR/wzp-release.apk ($(du -h "$OUTPUT_DIR/wzp-release.apk" | cut -f1))"
fi
# Also copy the .so for inspection
scp_down "/root/wzp-build/android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so" "$OUTPUT_DIR/libwzp_android.so"
echo " .so: $OUTPUT_DIR/libwzp_android.so"
log "Transfer complete!"
echo ""
echo " Install debug: adb install -r $OUTPUT_DIR/wzp-debug.apk"
[ -f "$OUTPUT_DIR/wzp-release.apk" ] && echo " Install release: adb install -r $OUTPUT_DIR/wzp-release.apk"
}
# ---------------------------------------------------------------------------
# --destroy: Delete the VM
# ---------------------------------------------------------------------------
do_destroy() {
local name
name=$(hcloud server list -o columns=name -o noheader 2>/dev/null | grep "$SERVER_NAME" | tr -d ' ' || true)
if [ -z "$name" ]; then
echo "No VM to destroy."
return
fi
log "Deleting VM: $name"
hcloud server delete "$name"
echo " Done."
}
# ---------------------------------------------------------------------------
# Full build: create → build → transfer → destroy
# ---------------------------------------------------------------------------
do_full() {
trap 'err "Build failed!"; do_destroy_quiet; exit 1' ERR
do_prepare
# Disable trap during build — release APK failure is non-fatal
trap - ERR
do_build
do_transfer
trap 'err "Build failed!"; do_destroy_quiet; exit 1' ERR
if [ "$KEEP_VM" = "1" ]; then
log "VM kept alive (WZP_KEEP_VM=1). Destroy with: $0 --destroy"
else
do_destroy
fi
log "All done!"
echo ""
echo " ┌──────────────────────────────────────────────────┐"
echo " │ Debug APK: $OUTPUT_DIR/wzp-debug.apk"
[ -f "$OUTPUT_DIR/wzp-release.apk" ] && \
echo " │ Release APK: $OUTPUT_DIR/wzp-release.apk"
echo " │"
echo " │ Install: adb install -r $OUTPUT_DIR/wzp-debug.apk"
echo " └──────────────────────────────────────────────────┘"
}
# ---------------------------------------------------------------------------
# Main
# ---------------------------------------------------------------------------
case "${1:-}" in
--prepare) do_prepare ;;
--build) do_build ;;
--transfer) do_transfer ;;
--destroy) do_destroy ;;
--upload) do_upload ;;
--all)
do_prepare
do_build
do_transfer
log "VM still running. Destroy with: $0 --destroy"
;;
"")
do_full
;;
*)
echo "Usage: $0 [--prepare|--build|--transfer|--destroy|--all|--upload]"
echo ""
echo " (no args) Full build: create VM → build → download → destroy VM"
echo " --prepare Create VM and install deps"
echo " --build Build on existing VM"
echo " --transfer Download APKs from VM"
echo " --destroy Delete the VM"
echo " --all prepare + build + transfer (VM persists)"
echo " --upload Re-upload source to existing VM"
echo ""
echo "Environment:"
echo " WZP_BRANCH=$BRANCH"
echo " WZP_SERVER_TYPE=$SERVER_TYPE"
echo " WZP_KEEP_VM=$KEEP_VM (set to 1 to skip auto-destroy)"
exit 1
;;
esac

240
scripts/build-android.sh Executable file
View File

@@ -0,0 +1,240 @@
#!/usr/bin/env bash
# =============================================================================
# WZ Phone — Android APK build script for Debian 12 (Bookworm)
#
# Sets up a complete build environment from scratch and produces a debug APK.
# Idempotent — safe to run multiple times (skips already-installed components).
#
# Tested on: Debian 12 x86_64, cross-compiling to aarch64-linux-android
#
# Why these specific versions:
#
# cmake 3.25-3.28 (system package from apt)
# cmake 3.25 (Debian 12) and 3.28 (Ubuntu 24.04) both work.
# cmake 3.31+ has armv7/aarch64 flag conflicts in Android-Determine.cmake.
# cmake 4.x drops cmake_minimum_required < 3.5.
# Do NOT use pip cmake — it bundles its own modules with different bugs.
# CRITICAL: must set ANDROID_NDK=$ANDROID_NDK_HOME (cmake checks ANDROID_NDK).
#
# NDK 26.1.10909125 (r26b)
# NDK 27+ ships a newer libc++_shared.so with different scudo allocator
# defaults. On Android 16 devices with MTE (Memory Tagging Extension)
# enabled (e.g. Nothing A059), NDK 27's scudo crashes during malloc/calloc.
# NDK 26.1 is the last stable version for these devices.
# Matches build.gradle.kts: ndkVersion = "26.1.10909125"
#
# JDK 17 (openjdk-17-jdk-headless)
# Gradle 8.5 + AGP 8.2.0 officially support JDK 17.
# JDK 21 works for compilation but has Gradle daemon compat issues.
#
# Rust stable (currently 1.94.1)
# Edition 2024, MSRV 1.85. Stable channel is fine.
#
# ANDROID_NDK=$ANDROID_NDK_HOME (BOTH must be set)
# cmake's Android platform module checks ANDROID_NDK (no _HOME suffix).
# cargo-ndk sets ANDROID_NDK_HOME. Both must point to the same path.
#
# Usage:
# chmod +x scripts/build-android.sh
# ./scripts/build-android.sh # build from current tree
# WZP_CLONE=1 ./scripts/build-android.sh # clone fresh from git
# WZP_COMMIT=2092245 ./scripts/build-android.sh # pin to specific commit
#
# Environment variables (all optional):
# WZP_CLONE Set to 1 to clone from git instead of using current dir
# WZP_REPO Git clone URL (default: ssh://git@git.manko.yoga:222/manawenuz/wz-phone)
# WZP_BRANCH Branch to checkout (default: feat/android-voip-client)
# WZP_COMMIT Commit to pin to (default: HEAD)
# WZP_WORKDIR Build directory (default: /tmp/wzp-build)
# ANDROID_API SDK platform level (default: 34)
# NDK_VERSION NDK version string (default: 26.1.10909125)
# =============================================================================
set -euo pipefail
# ---------------------------------------------------------------------------
# Configuration
# ---------------------------------------------------------------------------
CLONE="${WZP_CLONE:-0}"
REPO="${WZP_REPO:-ssh://git@git.manko.yoga:222/manawenuz/wz-phone}"
BRANCH="${WZP_BRANCH:-feat/android-voip-client}"
COMMIT="${WZP_COMMIT:-}"
WORKDIR="${WZP_WORKDIR:-/tmp/wzp-build}"
ANDROID_API="${ANDROID_API:-34}"
NDK_VERSION="${NDK_VERSION:-26.1.10909125}"
ANDROID_HOME="${ANDROID_HOME:-$HOME/android-sdk}"
ANDROID_NDK_HOME="$ANDROID_HOME/ndk/$NDK_VERSION"
# cmake checks ANDROID_NDK (not _HOME) — both must be set
ANDROID_NDK="$ANDROID_NDK_HOME"
JAVA_HOME="/usr/lib/jvm/java-17-openjdk-$(dpkg --print-architecture)"
CMDLINE_TOOLS_URL="https://dl.google.com/android/repository/commandlinetools-linux-11076708_latest.zip"
export ANDROID_HOME ANDROID_NDK_HOME ANDROID_NDK JAVA_HOME
export PATH="$JAVA_HOME/bin:$ANDROID_HOME/cmdline-tools/latest/bin:$ANDROID_HOME/platform-tools:$HOME/.cargo/bin:$PATH"
log() { echo -e "\n\033[1;36m>>> $*\033[0m"; }
err() { echo -e "\033[1;31mERROR: $*\033[0m" >&2; exit 1; }
# ---------------------------------------------------------------------------
# Step 1: System packages (cmake 3.25, JDK 17, make, git, etc.)
# ---------------------------------------------------------------------------
log "Installing system packages"
export DEBIAN_FRONTEND=noninteractive
apt-get update -qq
apt-get install -y -qq \
build-essential \
cmake \
curl \
git \
libssl-dev \
pkg-config \
unzip \
wget \
zip \
openjdk-17-jdk-headless \
2>/dev/null
# Verify critical versions
log "Verifying build environment"
echo " cmake: $(cmake --version | head -1)"
echo " java: $(java -version 2>&1 | head -1)"
echo " make: $(make --version | head -1)"
CMAKE_MAJOR=$(cmake --version | head -1 | grep -oP '\d+' | head -1)
CMAKE_MINOR=$(cmake --version | head -1 | grep -oP '\d+' | sed -n '2p')
if [ "$CMAKE_MAJOR" -gt 3 ] || { [ "$CMAKE_MAJOR" -eq 3 ] && [ "$CMAKE_MINOR" -gt 30 ]; }; then
err "cmake $(cmake --version | head -1) is too new! Need cmake <= 3.28.x. cmake 3.31+ has Android cross-compilation bugs."
fi
# ---------------------------------------------------------------------------
# Step 2: Rust toolchain
# ---------------------------------------------------------------------------
log "Setting up Rust toolchain"
if ! command -v rustup &>/dev/null; then
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
source "$HOME/.cargo/env"
fi
rustup default stable
rustup target add aarch64-linux-android
echo " rustc: $(rustc --version)"
echo " cargo: $(cargo --version)"
if ! command -v cargo-ndk &>/dev/null; then
log "Installing cargo-ndk"
cargo install cargo-ndk
fi
echo " ndk: $(cargo ndk --version)"
# ---------------------------------------------------------------------------
# Step 3: Android SDK + NDK 26.1
# ---------------------------------------------------------------------------
log "Setting up Android SDK + NDK $NDK_VERSION"
if [ ! -f "$ANDROID_HOME/cmdline-tools/latest/bin/sdkmanager" ]; then
log "Downloading Android command-line tools"
mkdir -p "$ANDROID_HOME/cmdline-tools"
TMPZIP=$(mktemp /tmp/cmdline-tools-XXXXX.zip)
wget -q -O "$TMPZIP" "$CMDLINE_TOOLS_URL"
unzip -qo "$TMPZIP" -d "$ANDROID_HOME/cmdline-tools"
mv "$ANDROID_HOME/cmdline-tools/cmdline-tools" "$ANDROID_HOME/cmdline-tools/latest" 2>/dev/null || true
rm -f "$TMPZIP"
fi
yes | sdkmanager --licenses >/dev/null 2>&1 || true
if [ ! -d "$ANDROID_NDK_HOME" ]; then
log "Installing NDK $NDK_VERSION (this takes a few minutes)"
sdkmanager --install \
"platforms;android-${ANDROID_API}" \
"build-tools;${ANDROID_API}.0.0" \
"ndk;${NDK_VERSION}" \
"platform-tools" \
2>&1 | grep -v "^\[" || true
fi
[ -d "$ANDROID_NDK_HOME" ] || err "NDK not found at $ANDROID_NDK_HOME"
echo " NDK: $ANDROID_NDK_HOME"
echo " SDK: $ANDROID_HOME"
# ---------------------------------------------------------------------------
# Step 4: Source code
# ---------------------------------------------------------------------------
if [ "$CLONE" = "1" ]; then
log "Cloning $REPO (branch: $BRANCH)"
if [ -d "$WORKDIR/.git" ]; then
cd "$WORKDIR"
git fetch origin
else
rm -rf "$WORKDIR"
git clone --branch "$BRANCH" --recurse-submodules "$REPO" "$WORKDIR"
cd "$WORKDIR"
fi
git checkout "$BRANCH"
git pull origin "$BRANCH" || true
git submodule update --init --recursive
if [ -n "$COMMIT" ]; then
log "Pinning to commit $COMMIT"
git checkout "$COMMIT"
fi
else
# Use current directory (assume we're in the repo root)
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
WORKDIR="$(cd "$SCRIPT_DIR/.." && pwd)"
cd "$WORKDIR"
[ -f "Cargo.toml" ] || err "Not in repo root. Run from repo root or set WZP_CLONE=1"
fi
echo " HEAD: $(git log --oneline -1)"
# ---------------------------------------------------------------------------
# Step 5: Build native Rust library (.so)
# ---------------------------------------------------------------------------
log "Building Rust native library (arm64-v8a, release)"
cargo ndk -t arm64-v8a \
-o "$WORKDIR/android/app/src/main/jniLibs" \
build --release -p wzp-android
SO="$WORKDIR/android/app/src/main/jniLibs/arm64-v8a/libwzp_android.so"
[ -f "$SO" ] || err ".so not found at $SO"
echo " Built: $SO ($(du -h "$SO" | cut -f1))"
# ---------------------------------------------------------------------------
# Step 6: Generate debug keystore (if missing)
# ---------------------------------------------------------------------------
KEYSTORE="$WORKDIR/android/keystore/wzp-debug.jks"
if [ ! -f "$KEYSTORE" ]; then
log "Generating debug keystore"
mkdir -p "$(dirname "$KEYSTORE")"
keytool -genkey -v \
-keystore "$KEYSTORE" \
-keyalg RSA -keysize 2048 -validity 10000 \
-alias wzp-debug \
-storepass android -keypass android \
-dname "CN=WZP Debug" 2>&1 | tail -1
fi
# ---------------------------------------------------------------------------
# Step 7: Build Android APK
# ---------------------------------------------------------------------------
log "Building APK (debug)"
cd "$WORKDIR/android"
chmod +x ./gradlew
./gradlew assembleDebug --no-daemon --warning-mode=none
APK=$(find . -name "app-debug*.apk" -path "*/outputs/apk/*" | head -1)
[ -n "$APK" ] || err "APK not found"
APK_ABS="$(cd "$(dirname "$APK")" && pwd)/$(basename "$APK")"
# ---------------------------------------------------------------------------
# Done
# ---------------------------------------------------------------------------
log "Build complete!"
echo ""
echo " ┌──────────────────────────────────────────────────────────┐"
echo " │ APK: $APK_ABS"
echo " │ Size: $(du -h "$APK_ABS" | cut -f1)"
echo " │ SHA256: $(sha256sum "$APK_ABS" | cut -d' ' -f1)"
echo " └──────────────────────────────────────────────────────────┘"
echo ""
echo " Install: adb install -r $APK_ABS"
echo ""