Commit Graph

12 Commits

Author SHA1 Message Date
Siavash Sameni
0d74366592 step E.1: absolute minimum C++ file (no STL, no includes)
Some checks failed
Build Release Binaries / build-amd64 (push) Failing after 3m53s
Last bisection step. cpp/cpp_smoke.cpp reduced to a single extern 'C'
function that returns 42. No #include, no std::atomic, no std::mutex,
no std::thread. Only C++ things remaining are:
  - cc::Build::new().cpp(true) in build.rs (C++ mode compile)
  - cpp_link_stdlib('c++_shared') emitting -lc++_shared

If this still crashes with the same __init_tcb+4 / pthread_create
stack, we've conclusively proven the trigger is NOT any C++ code
that ends up in the final .so (everything gets dead-stripped
anyway because Rust never references wzp_cpp_hello). The trigger
must be either:
  a) cargo:rustc-link-lib=c++_shared (adds NEEDED entry for
     libc++_shared.so in the .so's dynamic table, causing the
     dynamic linker to load libc++_shared.so at dlopen() time
     alongside our .so), or
  b) Some interaction between cpp(true) mode and the rest of the
     build pipeline (toolchain flags, symbol visibility, etc.)

After this build we stop and write an incident report for the
WarzonePhone Tauri Android rewrite bisection so far.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:54:21 +04:00
Siavash Sameni
0224ce654c step E.2: shrink cpp_smoke to std::atomic only — no thread, no mutex
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m48s
Incremental bisection within Step E. E.4 (atomic + mutex + thread) still
crashed at __init_tcb. Drop mutex and thread, keep only std::atomic.
Build.rs still emits cargo:rustc-link-lib=c++_shared via
cpp_link_stdlib('c++_shared'), so the NEEDED entry for libc++_shared.so
in the final .so stays identical. Goal: if this crashes, the issue is
purely the dynamic link against libc++_shared (not thread/mutex code).
If it passes, the issue is actually std::thread or std::mutex use.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:47:30 +04:00
Siavash Sameni
aa240c6d83 step E.4(android): replace full Oboe compile with minimal C++ smoke file
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m44s
Bisection for the __init_tcb+4 crash that Step E introduced: drop the
full Oboe C++ build (200+ files, hundreds of KB of code) and replace
it with ONE tiny cpp/cpp_smoke.cpp that exercises the libc++ features
Oboe uses — std::atomic, std::mutex, std::thread — via an
extern "C" wzp_cpp_smoke() function that's exported but NEVER called
from Rust.

Still compiled with cpp_link_stdlib("c++_shared"), same as Oboe.
libc++_shared.so still copied into gen/android jniLibs. But no Oboe
headers, no Oboe source files, no -llog / -lOpenSLES links.

Hypothesis: if cpp_smoke.cpp alone reproduces the __init_tcb crash,
the trigger is "any libc++_shared link that references
std::thread/std::mutex" and Oboe is not the specific culprit. If it
launches cleanly, Oboe itself (its size, its static constructors, or
a specific header) is responsible — and we then bisect Oboe's
source tree.

fetch_oboe() and add_cpp_files_recursive() are retained in build.rs
with #[allow(dead_code)] so re-enabling the full Oboe compile is a
one-line edit once we've identified what's safe to include.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:39:30 +04:00
Siavash Sameni
4250f1b44a step E(android): compile full Oboe C++ bridge (not yet called from Rust)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Failing after 3m53s
Fifth incremental variable — and the first genuinely heavy one. Adds:
  - cpp/oboe_bridge.{h,cpp} (copied verbatim from crates/wzp-android/cpp/)
  - cpp/oboe_stub.cpp (fallback if Oboe can't be fetched)
  - build.rs now clones google/oboe@1.8.1 into OUT_DIR and compiles
    oboe_bridge.cpp + every .cpp file under oboe/src/ as a single
    static library via cc::Build, using shared libc++. Same logic as
    the legacy wzp-android build.rs.
  - libc++_shared.so gets copied from the NDK sysroot into the Tauri
    gen/android jniLibs directory so the runtime linker can find it.
  - rustc-link-lib=log / OpenSLES emitted for Oboe's Android backends.

Deliberately NOT called from Rust yet — no extern "C" FFI declarations,
no oboe_audio.rs module, the `wzp_oboe_*` symbols from the static lib
are simply present but unreferenced.

Goal: isolate whether the Oboe C++ compile + static lib link alone
(with its libc++ dependency and log/OpenSLES bindings) regresses the
working baseline. If the build still launches and renders the home
screen, we know the C++ side is clean and the actual regression is
caused by calling into Oboe at runtime (next step).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:09:16 +04:00
Siavash Sameni
a852cad15e step D(android): compile cpp/getauxval_fix.c alongside hello.c
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m55s
Fourth incremental variable. Adds the getauxval_fix.c shim from the
legacy wzp-android crate (which has been shipping with it for months
without issue) to our cc::Build on Android. The file defines a single
getauxval() function that delegates to bionic's real runtime
implementation via dlsym — this is needed because rustc links
compiler-rt's broken static getauxval stub that SIGSEGVs in .so
libraries loaded via dlopen (reads __libc_auxv which is NULL).

Not imported from Rust. Goal: verify that adding a second C static
archive (and especially one that overrides a libc-ish symbol) doesn't
regress the working build.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:03:37 +04:00
Siavash Sameni
f96d7ce3e1 step A(android): add cc=1 build-dep + compile single trivial hello.c
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m54s
First incremental variable on the path back to Oboe integration. Changes
are deliberately minimal: add cc = "1" to [build-dependencies] (cargo
build-deps resolve against the host so the line is unconditional), and
on the Android target run a single cc::Build step that compiles
cpp/hello.c — a 6-line file that defines one function (`wzp_hello_stub`)
that is never called from Rust.

Goal: verify that merely introducing a C static library into the .so
via cc::Build does not regress the working build (#17, commit 5309938
= build #6 behaviour: launches, renders home screen, registers on
relay). If this build still works, we know cc::Build pipelines alone
are fine and can move to the next variable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:45:24 +04:00
Siavash Sameni
530993854f revert(android): roll back to build #6 (35642d1) — pre-oboe known-good state
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m51s
Spent 10+ builds chasing a __init_tcb+4 / pthread_create SIGSEGV after
adding the oboe audio backend. Every "fix" made things worse. Reverting
all Android-specific files to the state at 35642d1 (build #6), which
was the last commit where the Tauri Android app actually launched,
rendered the home screen, and successfully registered on a relay.

Reverted files (all back to their 35642d1 content):
  - desktop/src-tauri/Cargo.toml        (no build-dep cc, no tracing-android)
  - desktop/src-tauri/build.rs          (git hash only, no Oboe / cc build)
  - desktop/src-tauri/src/lib.rs        (engine cfg-gated on non-android)
  - desktop/src-tauri/src/main.rs       (two-line desktop entry)
  - desktop/src-tauri/src/engine.rs     (desktop-only audio setup)
  - scripts/Dockerfile.android-builder  (no android24→26 clang shim)
  - scripts/build-tauri-android.sh      (no linker env vars / manifest patch)

Deleted (were added between b314138 and e2e023d):
  - desktop/src-tauri/cpp/getauxval_fix.c
  - desktop/src-tauri/cpp/oboe_bridge.{h,cpp}
  - desktop/src-tauri/cpp/oboe_stub.cpp
  - desktop/src-tauri/src/oboe_audio.rs

Next: rebuild image on remote (to drop the baked-in clang shim), build
an APK, install on Pixel 6, verify the UI renders the same way build #6
did. From there we add features back ONE at a time so we can actually
bisect which one triggers the tao::ndk_glue crash. User's rule:
"if you want to change stack, change incrementally, so we can debug".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:22:57 +04:00
Siavash Sameni
e2e023d2bc fix(android): drop pthread_shim — clang shim makes it unnecessary (and harmful)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 37s
Build Release Binaries / build-amd64 (push) Failing after 3m49s
Once the Dockerfile rewrites every android24-clang to exec android26-clang,
the linker uses the API-26 NDK sysroot and libstd's pthread_create reference
resolves directly against libc.so's real runtime symbol — no interposition
needed.

The pthread_shim.c approach was actually fighting its own solution: our
shim's dlsym() call bound at link time to libdl.a's STUB dlsym (a
five-line function inside libdl_static.o that just returns NULL and sets
dlerror to "libdl.a is a stub --- use libdl.so instead"). NDK r19 and
glibc 2.34 both replaced libdl.a with empty stubs because dynamic loading
is now part of the main libc/bionic — so no amount of link-order
tinkering can make a static libdl.a dlsym actually work.

Remove pthread_shim.c, the cc::Build::new().file("cpp/pthread_shim.c")
step in build.rs, and the -Wl,--wrap=pthread_create rustc-link-arg. Keep
getauxval_fix.c because that one DOES work at link time (the symbol
override is for a function compiler-rt defines statically, not one that
would depend on the stub libdl.a/libc.a).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:52:53 +04:00
Siavash Sameni
1a8288c95f debug(android): instrument pthread_shim with logcat tracing + try RTLD_DEFAULT first
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m43s
Build #11 linked cleanly with --wrap=pthread_create but crashed at launch
on tao::ndk_glue::create with a Rust .expect() panic — meaning the shim's
__wrap_pthread_create successfully intercepted the call but returned
non-zero, triggering std::thread::spawn's Result::expect panic.

Add __android_log_print tracing so logcat shows exactly which resolver
path fired (RTLD_DEFAULT vs dlopen fallback) and what dlerror reports
when they fail. Also try RTLD_DEFAULT first — it's the simplest and
should find libc.so's pthread_create in the process's global symbol
table without any namespace games.
2026-04-09 13:15:47 +04:00
Siavash Sameni
f015be63ec fix(android): use --wrap=pthread_create instead of raw symbol override
Some checks failed
Mirror to GitHub / mirror (push) Failing after 38s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
Build #10 failed with:
  ld.lld: error: duplicate symbol: pthread_create
    >>> defined at pthread_shim.c:30
    >>> ... in archive libpthread_shim.a
  (the other definition coming from libstd's bundled libc.a stub)

The raw-symbol-override approach was naive: when two static archives
both define the same symbol the linker refuses instead of picking one.

Switch to GNU-ld's `--wrap=pthread_create` mechanism:
  - All `pthread_create` references get rewritten to `__wrap_pthread_create`
  - Our shim now defines `__wrap_pthread_create` (no symbol clash)
  - Inside the shim we `dlopen("libc.so")` + `dlsym("pthread_create")` to
    get the real runtime symbol directly, bypassing BOTH the broken static
    stub (libstd's libc.a copy) AND libstd's own pthread_create path
  - `--real_pthread_create` is deliberately NOT used — it would alias the
    same broken stub the wrap exists to avoid

The wrap flag is emitted via `cargo:rustc-link-arg` in build.rs so it
only affects the Android target (the Android-branch of build.rs is the
only place that emits it).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:08:41 +04:00
Siavash Sameni
79e876126c fix(android): interpose pthread_create to bypass libstd's broken static stub
Some checks failed
Mirror to GitHub / mirror (push) Failing after 36s
Build Release Binaries / build-amd64 (push) Failing after 3m52s
Builds #7, #8 and #9 all crashed at launch with the same SIGSEGV inside
__init_tcb(bionic_tcb*, pthread_internal_t*)+4 called via pthread_create
from std::sys::thread::unix::Thread::new.

Digging further: the problem is NOT the final linker we pass to cargo.
It's that rustup ships a PRE-COMPILED libstd for aarch64-linux-android
which was built statically against an old NDK libc archive. That archive
has a pthread_create stub which calls a static __init_tcb stub that
assumes libc's static init path has set up the TCB — which never happens
in a .so loaded via dlopen. Bumping minSdk to 26 or forcing the
android26-clang linker (903a07c) doesn't rebuild libstd and therefore
doesn't fix the bundled broken stub.

The legacy wzp-android crate dodged this with a getauxval_fix.c shim that
interposes getauxval via RTLD_NEXT. The same trick works for pthread_create
here: define our own `int pthread_create(...)` in cpp/pthread_shim.c that
forwards to `dlsym(RTLD_NEXT, "pthread_create")` — the real, fully working
version exported from libc.so. The linker processes our static lib before
libstd.rlib, so libstd's unresolved pthread_create reference binds to our
symbol, and the broken libc.a stub inside libstd is never pulled in.

build.rs compiles cpp/pthread_shim.c right after cpp/getauxval_fix.c so
both symbol overrides are in place before any Rust code gets linked.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 13:04:18 +04:00
Siavash Sameni
b314138caf feat(android): oboe/AAudio audio backend + runtime mic permission (step 3)
Some checks failed
Mirror to GitHub / mirror (push) Failing after 39s
Build Release Binaries / build-amd64 (push) Failing after 3m39s
This is the big one — the Tauri Android app now has a real audio stack
capable of full-duplex VoIP, reusing the proven C++ Oboe bridge from the
legacy wzp-android crate.

Architecture:
- desktop/src-tauri/cpp/  — copies of oboe_bridge.{h,cpp}, oboe_stub.cpp,
  and getauxval_fix.c from crates/wzp-android/cpp/. build.rs clones
  google/oboe@1.8.1 into OUT_DIR and compiles the bridge + all Oboe
  sources as "oboe_bridge" static lib, linking against shared libc++
  (static would pull broken libc stubs that SIGSEGV in .so libraries).
- src/oboe_audio.rs  — Rust side: an SPSC ring buffer matching the C++
  bridge's AtomicI32 layout, plus OboeHandle::start() which returns
  (capture_ring, playout_ring, owning_handle). The ring exposes the same
  (available / read / write) methods as wzp_client::audio_ring::AudioRing
  so CallEngine treats both backends interchangeably.
- src/engine.rs  — compiled on every platform now. A cfg-switched type
  alias picks wzp_client::audio_ring::AudioRing on desktop and
  crate::oboe_audio::AudioRing on Android. The audio setup block has
  three branches: VPIO/CPAL on macOS, CPAL on Linux/Windows, Oboe on
  Android. Send/recv tasks are identical across platforms.
- src/lib.rs  — removes all the "step 3 not done" Android stubs. The
  engine module is no longer cfg-gated; connect / disconnect / toggle_mic
  / toggle_speaker / get_status are single implementations used by both
  desktop and Android. Identity path resolves via app.path().app_data_dir()
  from the Tauri setup() callback (already wired in step 1).

Runtime mic permission:
- scripts/build-tauri-android.sh now injects RECORD_AUDIO + MODIFY_AUDIO_
  SETTINGS into gen/android/app/src/main/AndroidManifest.xml after init,
  and overwrites MainActivity.kt with a version that calls
  ActivityCompat.requestPermissions in onCreate. This is idempotent:
  every build re-applies the patches so tauri re-init can't regress them.

Cargo.toml:
- cc is now an unconditional build-dep (build.rs runs on the host, so
  target-gating build-deps doesn't work).
- wzp-client is now a dep on every platform. On Android it gets default
  features only (no "audio"/"vpio") so CPAL isn't dragged in — oboe_audio
  provides the capture/playout rings instead.
- tracing-android is added on Android so tracing events flow into logcat.

build.rs also gained embedded git hash (WZP_GIT_HASH) capture, which is
shown under the fingerprint on the home screen — already committed in
7639aaf, reinstated here alongside the Oboe build logic.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 12:40:38 +04:00