Build #11 linked cleanly with --wrap=pthread_create but crashed at launch
on tao::ndk_glue::create with a Rust .expect() panic — meaning the shim's
__wrap_pthread_create successfully intercepted the call but returned
non-zero, triggering std::thread::spawn's Result::expect panic.
Add __android_log_print tracing so logcat shows exactly which resolver
path fired (RTLD_DEFAULT vs dlopen fallback) and what dlerror reports
when they fail. Also try RTLD_DEFAULT first — it's the simplest and
should find libc.so's pthread_create in the process's global symbol
table without any namespace games.
Build #10 failed with:
ld.lld: error: duplicate symbol: pthread_create
>>> defined at pthread_shim.c:30
>>> ... in archive libpthread_shim.a
(the other definition coming from libstd's bundled libc.a stub)
The raw-symbol-override approach was naive: when two static archives
both define the same symbol the linker refuses instead of picking one.
Switch to GNU-ld's `--wrap=pthread_create` mechanism:
- All `pthread_create` references get rewritten to `__wrap_pthread_create`
- Our shim now defines `__wrap_pthread_create` (no symbol clash)
- Inside the shim we `dlopen("libc.so")` + `dlsym("pthread_create")` to
get the real runtime symbol directly, bypassing BOTH the broken static
stub (libstd's libc.a copy) AND libstd's own pthread_create path
- `--real_pthread_create` is deliberately NOT used — it would alias the
same broken stub the wrap exists to avoid
The wrap flag is emitted via `cargo:rustc-link-arg` in build.rs so it
only affects the Android target (the Android-branch of build.rs is the
only place that emits it).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Builds #7, #8 and #9 all crashed at launch with the same SIGSEGV inside
__init_tcb(bionic_tcb*, pthread_internal_t*)+4 called via pthread_create
from std::sys::thread::unix::Thread::new.
Digging further: the problem is NOT the final linker we pass to cargo.
It's that rustup ships a PRE-COMPILED libstd for aarch64-linux-android
which was built statically against an old NDK libc archive. That archive
has a pthread_create stub which calls a static __init_tcb stub that
assumes libc's static init path has set up the TCB — which never happens
in a .so loaded via dlopen. Bumping minSdk to 26 or forcing the
android26-clang linker (903a07c) doesn't rebuild libstd and therefore
doesn't fix the bundled broken stub.
The legacy wzp-android crate dodged this with a getauxval_fix.c shim that
interposes getauxval via RTLD_NEXT. The same trick works for pthread_create
here: define our own `int pthread_create(...)` in cpp/pthread_shim.c that
forwards to `dlsym(RTLD_NEXT, "pthread_create")` — the real, fully working
version exported from libc.so. The linker processes our static lib before
libstd.rlib, so libstd's unresolved pthread_create reference binds to our
symbol, and the broken libc.a stub inside libstd is never pulled in.
build.rs compiles cpp/pthread_shim.c right after cpp/getauxval_fix.c so
both symbol overrides are in place before any Rust code gets linked.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>