Last bisection step. cpp/cpp_smoke.cpp reduced to a single extern "C"
function that returns 42. No #include, no std::atomic, no std::mutex,
no std::thread. The only C++ pieces remaining are:
- cc::Build::new().cpp(true) in build.rs (C++ mode compile)
- cpp_link_stdlib("c++_shared") emitting -lc++_shared
If this still crashes with the same __init_tcb+4 / pthread_create
stack, we've conclusively proven the trigger is NOT any C++ code
that ends up in the final .so (everything gets dead-stripped
anyway because Rust never references wzp_cpp_hello). The trigger
must be either:
a) cargo:rustc-link-lib=c++_shared (adds NEEDED entry for
libc++_shared.so in the .so's dynamic table, causing the
dynamic linker to load libc++_shared.so at dlopen() time
alongside our .so), or
b) Some interaction between cpp(true) mode and the rest of the
build pipeline (toolchain flags, symbol visibility, etc.)
After this build we stop and write an incident report for the
WarzonePhone Tauri Android rewrite bisection so far.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Incremental bisection within Step E. E.4 (atomic + mutex + thread) still
crashed at __init_tcb. Drop mutex and thread, keep only std::atomic.
build.rs still emits cargo:rustc-link-lib=c++_shared via
cpp_link_stdlib("c++_shared"), so the NEEDED entry for libc++_shared.so
in the final .so stays identical. Goal: if this crashes, the issue is
purely the dynamic link against libc++_shared (not thread/mutex code).
If it passes, the issue is actually std::thread or std::mutex use.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bisection for the __init_tcb+4 crash that Step E introduced: drop the
full Oboe C++ build (200+ files, hundreds of KB of code) and replace
it with ONE tiny cpp/cpp_smoke.cpp that exercises the libc++ features
Oboe uses — std::atomic, std::mutex, std::thread — via an
extern "C" wzp_cpp_smoke() function that's exported but NEVER called
from Rust.
Still compiled with cpp_link_stdlib("c++_shared"), same as Oboe.
libc++_shared.so still copied into gen/android jniLibs. But no Oboe
headers, no Oboe source files, no -llog / -lOpenSLES links.
Hypothesis: if cpp_smoke.cpp alone reproduces the __init_tcb crash,
the trigger is "any libc++_shared link that references
std::thread/std::mutex" and Oboe is not the specific culprit. If it
launches cleanly, Oboe itself (its size, its static constructors, or
a specific header) is responsible — and we then bisect Oboe's
source tree.
fetch_oboe() and add_cpp_files_recursive() are retained in build.rs
with #[allow(dead_code)] so re-enabling the full Oboe compile is a
one-line edit once we've identified what's safe to include.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fifth incremental variable — and the first genuinely heavy one. Adds:
- cpp/oboe_bridge.{h,cpp} (copied verbatim from crates/wzp-android/cpp/)
- cpp/oboe_stub.cpp (fallback if Oboe can't be fetched)
- build.rs now clones google/oboe@1.8.1 into OUT_DIR and compiles
oboe_bridge.cpp + every .cpp file under oboe/src/ as a single
static library via cc::Build, using shared libc++. Same logic as
the legacy wzp-android build.rs.
- libc++_shared.so gets copied from the NDK sysroot into the Tauri
gen/android jniLibs directory so the runtime linker can find it.
- rustc-link-lib=log / OpenSLES emitted for Oboe's Android backends.
Deliberately NOT called from Rust yet — no extern "C" FFI declarations,
no oboe_audio.rs module, the `wzp_oboe_*` symbols from the static lib
are simply present but unreferenced.
Goal: isolate whether the Oboe C++ compile + static lib link alone
(with its libc++ dependency and log/OpenSLES bindings) regresses the
working baseline. If the build still launches and renders the home
screen, we know the C++ side is clean and the actual regression is
caused by calling into Oboe at runtime (next step).
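A minimal sketch of the "every .cpp file under oboe/src/" walk, using only
std. The name collect_cpp_files is hypothetical: the real helper is
add_cpp_files_recursive() and its signature is not shown in this log, so
treat this as an illustration of the idea rather than the code in build.rs.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every `.cpp` file under `dir` into `out`.
/// Hypothetical stand-in for add_cpp_files_recursive().
pub fn collect_cpp_files(dir: &Path, out: &mut Vec<PathBuf>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Descend into subdirectories (Oboe's sources are nested).
            collect_cpp_files(&path, out)?;
        } else if path.extension().and_then(|e| e.to_str()) == Some("cpp") {
            out.push(path);
        }
    }
    Ok(())
}
```

In a real build.rs the resulting paths would presumably be handed to
cc::Build one by one; that part is left out so the sketch has no
dependencies beyond std.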
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fourth incremental variable. Adds the getauxval_fix.c shim from the
legacy wzp-android crate (which has been shipping with it for months
without issue) to our cc::Build on Android. The file defines a single
getauxval() function that delegates to bionic's real runtime
implementation via dlsym — this is needed because rustc links
compiler-rt's broken static getauxval stub that SIGSEGVs in .so
libraries loaded via dlopen (reads __libc_auxv which is NULL).
Not imported from Rust. Goal: verify that adding a second C static
archive (and especially one that overrides a libc-ish symbol) doesn't
regress the working build.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
First incremental variable on the path back to Oboe integration. Changes
are deliberately minimal: add cc = "1" to [build-dependencies] (cargo
build-deps resolve against the host so the line is unconditional), and
on the Android target run a single cc::Build step that compiles
cpp/hello.c — a 6-line file that defines one function (`wzp_hello_stub`)
that is never called from Rust.
Goal: verify that merely introducing a C static library into the .so
via cc::Build does not regress the working build (#17, commit 5309938
= build #6 behaviour: launches, renders home screen, registers on
relay). If this build still works, we know cc::Build pipelines alone
are fine and can move to the next variable.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Spent 10+ builds chasing a __init_tcb+4 / pthread_create SIGSEGV after
adding the oboe audio backend. Every "fix" made things worse. Reverting
all Android-specific files to the state at 35642d1 (build #6), which
was the last commit where the Tauri Android app actually launched,
rendered the home screen, and successfully registered on a relay.
Reverted files (all back to their 35642d1 content):
- desktop/src-tauri/Cargo.toml (no build-dep cc, no tracing-android)
- desktop/src-tauri/build.rs (git hash only, no Oboe / cc build)
- desktop/src-tauri/src/lib.rs (engine cfg-gated on non-android)
- desktop/src-tauri/src/main.rs (two-line desktop entry)
- desktop/src-tauri/src/engine.rs (desktop-only audio setup)
- scripts/Dockerfile.android-builder (no android24→26 clang shim)
- scripts/build-tauri-android.sh (no linker env vars / manifest patch)
Deleted (were added between b314138 and e2e023d):
- desktop/src-tauri/cpp/getauxval_fix.c
- desktop/src-tauri/cpp/oboe_bridge.{h,cpp}
- desktop/src-tauri/cpp/oboe_stub.cpp
- desktop/src-tauri/src/oboe_audio.rs
Next: rebuild image on remote (to drop the baked-in clang shim), build
an APK, install on Pixel 6, verify the UI renders the same way build #6
did. From there we add features back ONE at a time so we can actually
bisect which one triggers the tao::ndk_glue crash. User's rule:
"if you want to change stack, change incrementally, so we can debug".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Once the Dockerfile rewrites every android24-clang to exec android26-clang,
the linker uses the API-26 NDK sysroot and libstd's pthread_create reference
resolves directly against libc.so's real runtime symbol — no interposition
needed.
The pthread_shim.c approach was actually fighting its own solution: our
shim's dlsym() call bound at link time to libdl.a's STUB dlsym (a
five-line function inside libdl_static.o that just returns NULL and sets
dlerror to "libdl.a is a stub --- use libdl.so instead"). NDK r19 and
glibc 2.34 both replaced libdl.a with empty stubs because dynamic loading
is now part of the main libc/bionic — so no amount of link-order
tinkering can make a static libdl.a dlsym actually work.
Remove pthread_shim.c, the cc::Build::new().file("cpp/pthread_shim.c")
step in build.rs, and the -Wl,--wrap=pthread_create rustc-link-arg. Keep
getauxval_fix.c because that one DOES work at link time (the symbol
override is for a function compiler-rt defines statically, not one that
would depend on the stub libdl.a/libc.a).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Build #11 linked cleanly with --wrap=pthread_create but crashed at launch
on tao::ndk_glue::create with a Rust .expect() panic — meaning the shim's
__wrap_pthread_create successfully intercepted the call but returned
non-zero, triggering std::thread::spawn's Result::expect panic.
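
For context on that panic: std::thread::spawn unwraps the io::Result that
std::thread::Builder::spawn returns, so a failing pthread_create surfaces
as an .expect() panic. A small sketch of the non-panicking form follows.
The function and thread names are hypothetical, and the failing spawn in
this crash is inside tao, not in code we can change this way.

```rust
use std::io;
use std::thread;

/// Spawn a worker and return the io::Error to the caller instead of
/// panicking, which is what thread::spawn would do on failure.
pub fn try_spawn_worker() -> io::Result<thread::JoinHandle<u32>> {
    thread::Builder::new()
        .name("wzp-worker".into()) // hypothetical thread name
        .spawn(|| 42)
}
```

A caller can then log the error (for example the OS error code) rather
than taking the whole process down.
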
Add __android_log_print tracing so logcat shows exactly which resolver
path fired (RTLD_DEFAULT vs dlopen fallback) and what dlerror reports
when they fail. Also try RTLD_DEFAULT first — it's the simplest and
should find libc.so's pthread_create in the process's global symbol
table without any namespace games.
Build #10 failed with:
    ld.lld: error: duplicate symbol: pthread_create
    >>> defined at pthread_shim.c:30
    >>> ... in archive libpthread_shim.a
(the other definition coming from libstd's bundled libc.a stub)
The raw-symbol-override approach was naive: when two static archives
both define the same symbol the linker refuses instead of picking one.
Switch to the linker's `--wrap=pthread_create` mechanism (a GNU ld option
that ld.lld also implements):
- All undefined `pthread_create` references get rewritten to `__wrap_pthread_create`
- Our shim now defines `__wrap_pthread_create` (no symbol clash)
- Inside the shim we `dlopen("libc.so")` + `dlsym("pthread_create")` to
get the real runtime symbol directly, bypassing BOTH the broken static
stub (libstd's libc.a copy) AND libstd's own pthread_create path
- `__real_pthread_create` is deliberately NOT used, because it would alias
the same broken stub the wrap exists to avoid
The wrap flag is emitted via `cargo:rustc-link-arg` in build.rs so it
only affects the Android target (the Android-branch of build.rs is the
only place that emits it).
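A rough sketch of how the Android-only gating can look in build.rs.
Because build.rs is compiled for the host, the target has to be read from
the CARGO_CFG_TARGET_OS environment variable that cargo sets for build
scripts, not from cfg!(target_os). The function names here are
hypothetical; only the directive string comes from this change.

```rust
use std::env;

/// Return the cargo directives to emit for the given target OS.
/// Kept separate from printing so the gating logic is easy to check.
pub fn android_link_directives(target_os: &str) -> Vec<String> {
    if target_os == "android" {
        vec!["cargo:rustc-link-arg=-Wl,--wrap=pthread_create".to_string()]
    } else {
        Vec::new()
    }
}

/// What build.rs's main() could call: look up the *target* OS and print
/// the directives on stdout, where cargo picks them up.
pub fn emit_link_directives() {
    let target_os = env::var("CARGO_CFG_TARGET_OS").unwrap_or_default();
    for directive in android_link_directives(&target_os) {
        println!("{directive}");
    }
}
```

On a desktop target the list is empty, so nothing is printed and the
desktop link line is unchanged.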
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Builds #7, #8 and #9 all crashed at launch with the same SIGSEGV inside
__init_tcb(bionic_tcb*, pthread_internal_t*)+4 called via pthread_create
from std::sys::thread::unix::Thread::new.
Digging further: the problem is NOT the final linker we pass to cargo.
It's that rustup ships a PRE-COMPILED libstd for aarch64-linux-android
which was built statically against an old NDK libc archive. That archive
has a pthread_create stub which calls a static __init_tcb stub that
assumes libc's static init path has set up the TCB — which never happens
in a .so loaded via dlopen. Bumping minSdk to 26 or forcing the
android26-clang linker (903a07c) doesn't rebuild libstd and therefore
doesn't fix the bundled broken stub.
The legacy wzp-android crate dodged this with a getauxval_fix.c shim that
interposes getauxval via RTLD_NEXT. The same trick works for pthread_create
here: define our own `int pthread_create(...)` in cpp/pthread_shim.c that
forwards to `dlsym(RTLD_NEXT, "pthread_create")` — the real, fully working
version exported from libc.so. The linker processes our static lib before
libstd.rlib, so libstd's unresolved pthread_create reference binds to our
symbol, and the broken libc.a stub inside libstd is never pulled in.
build.rs compiles cpp/pthread_shim.c right after cpp/getauxval_fix.c so
both symbol overrides are in place before any Rust code gets linked.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This is the big one — the Tauri Android app now has a real audio stack
capable of full-duplex VoIP, reusing the proven C++ Oboe bridge from the
legacy wzp-android crate.
Architecture:
- desktop/src-tauri/cpp/ — copies of oboe_bridge.{h,cpp}, oboe_stub.cpp,
and getauxval_fix.c from crates/wzp-android/cpp/. build.rs clones
google/oboe@1.8.1 into OUT_DIR and compiles the bridge + all Oboe
sources as the "oboe_bridge" static lib, linking against shared libc++
(static would pull in broken libc stubs that SIGSEGV in .so libraries).
- src/oboe_audio.rs — Rust side: an SPSC ring buffer matching the C++
bridge's AtomicI32 layout, plus OboeHandle::start() which returns
(capture_ring, playout_ring, owning_handle). The ring exposes the same
(available / read / write) methods as wzp_client::audio_ring::AudioRing
so CallEngine treats both backends interchangeably.
- src/engine.rs — compiled on every platform now. A cfg-switched type
alias picks wzp_client::audio_ring::AudioRing on desktop and
crate::oboe_audio::AudioRing on Android. The audio setup block has
three branches: VPIO/CPAL on macOS, CPAL on Linux/Windows, Oboe on
Android. Send/recv tasks are identical across platforms.
- src/lib.rs — removes all the "step 3 not done" Android stubs. The
engine module is no longer cfg-gated; connect / disconnect / toggle_mic
/ toggle_speaker / get_status are single implementations used by both
desktop and Android. Identity path resolves via app.path().app_data_dir()
from the Tauri setup() callback (already wired in step 1).
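
To make the ring contract concrete, here is a minimal SPSC ring sketch
with AtomicI32 indices and the available / read / write trio. It is an
assumption-laden illustration: the type name SketchRing, the method
signatures, the i16 sample type and the one-slot-empty index scheme are
all hypothetical, and the real oboe_audio ring must match the C++
bridge's memory layout, which this does not attempt.

```rust
use std::sync::atomic::{AtomicI16, AtomicI32, Ordering};

/// Single-producer / single-consumer ring of i16 samples.
/// One slot is always left empty so "full" and "empty" stay distinct.
pub struct SketchRing {
    buf: Vec<AtomicI16>,
    write_idx: AtomicI32, // advanced only by the producer
    read_idx: AtomicI32,  // advanced only by the consumer
}

impl SketchRing {
    pub fn new(capacity: usize) -> Self {
        assert!(capacity > 1 && capacity <= (i32::MAX / 2) as usize);
        Self {
            buf: (0..capacity).map(|_| AtomicI16::new(0)).collect(),
            write_idx: AtomicI32::new(0),
            read_idx: AtomicI32::new(0),
        }
    }

    /// Number of samples currently readable.
    pub fn available(&self) -> usize {
        let cap = self.buf.len() as i32;
        let w = self.write_idx.load(Ordering::Acquire);
        let r = self.read_idx.load(Ordering::Acquire);
        ((w - r + cap) % cap) as usize
    }

    /// Producer side: copy in as many samples as fit, return the count.
    pub fn write(&self, samples: &[i16]) -> usize {
        let len = self.buf.len();
        let cap = len as i32;
        let w = self.write_idx.load(Ordering::Relaxed);
        let r = self.read_idx.load(Ordering::Acquire);
        let free = ((r - w - 1 + cap) % cap) as usize;
        let n = samples.len().min(free);
        for (i, s) in samples[..n].iter().enumerate() {
            self.buf[(w as usize + i) % len].store(*s, Ordering::Relaxed);
        }
        // Release publishes the sample stores above to the consumer.
        self.write_idx
            .store(((w as usize + n) % len) as i32, Ordering::Release);
        n
    }

    /// Consumer side: copy out up to out.len() samples, return the count.
    pub fn read(&self, out: &mut [i16]) -> usize {
        let len = self.buf.len();
        let cap = len as i32;
        let r = self.read_idx.load(Ordering::Relaxed);
        let w = self.write_idx.load(Ordering::Acquire);
        let avail = ((w - r + cap) % cap) as usize;
        let n = out.len().min(avail);
        for (i, slot) in out[..n].iter_mut().enumerate() {
            *slot = self.buf[(r as usize + i) % len].load(Ordering::Relaxed);
        }
        // Release hands the consumed slots back to the producer.
        self.read_idx
            .store(((r as usize + n) % len) as i32, Ordering::Release);
        n
    }
}
```

Because both backends would expose this same trio, a cfg-switched type
alias is enough for CallEngine to stay backend-agnostic.
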
Runtime mic permission:
- scripts/build-tauri-android.sh now injects RECORD_AUDIO +
MODIFY_AUDIO_SETTINGS into
gen/android/app/src/main/AndroidManifest.xml after init,
and overwrites MainActivity.kt with a version that calls
ActivityCompat.requestPermissions in onCreate. This is idempotent:
every build re-applies the patches so tauri re-init can't regress them.
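
The idempotency property can be sketched as follows. This is a Rust
illustration only, not the shell logic in build-tauri-android.sh; the
function name, the insertion point (before the first <application tag)
and the exact tag formatting are assumptions.

```rust
const PERMISSIONS: [&str; 2] = [
    "android.permission.RECORD_AUDIO",
    "android.permission.MODIFY_AUDIO_SETTINGS",
];

/// Add any missing <uses-permission> tags to a manifest string.
/// Running it on its own output changes nothing, so it is safe to
/// re-apply on every build.
pub fn inject_permissions(manifest: &str) -> String {
    let mut out = manifest.to_string();
    for perm in PERMISSIONS {
        if out.contains(perm) {
            continue; // already present: skip, which is what makes this idempotent
        }
        if let Some(pos) = out.find("<application") {
            let tag = format!("<uses-permission android:name=\"{perm}\" />\n    ");
            out.insert_str(pos, &tag);
        }
    }
    out
}
```

Applying it twice yields the same text as applying it once, which is the
behaviour the build script relies on after a tauri re-init.
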
Cargo.toml:
- cc is now an unconditional build-dep (build.rs runs on the host, so
target-gating build-deps doesn't work).
- wzp-client is now a dep on every platform. On Android it gets default
features only (no "audio"/"vpio") so CPAL isn't dragged in — oboe_audio
provides the capture/playout rings instead.
- tracing-android is added on Android so tracing events flow into logcat.
build.rs also captures the git hash and embeds it as WZP_GIT_HASH, which
is shown under the fingerprint on the home screen. That was already
committed in 7639aaf and is reinstated here alongside the Oboe build logic.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>