wz-phone

Author	SHA1	Message	Date
Siavash Sameni	7949266e11	windows: docker + hcloud build scripts for cross-compile
Two parallel paths to build wzp-desktop.exe for x86_64-pc-windows-msvc: scripts/Dockerfile.windows-builder Debian 12 base, matches scripts/Dockerfile.android-builder's layout: - apt: build-essential, cmake, ninja-build, llvm, clang, lld, nasm, libssl-dev, node 20 LTS - rust stable + x86_64-pc-windows-msvc target - cargo-xwin pre-installed - Pre-warmed ~/.cache/cargo-xwin layer: creates a throwaway cargo project and runs `cargo xwin build` once during image build so the MSVC CRT + Windows SDK (~1.5 GB) is baked into an image layer. Saves ~4 minutes off every cold cross-compile run. - Builder user uid 1000 to match existing bind-mount perms on SepehrHomeserverdk. scripts/build-windows-docker.sh Same pattern as scripts/build-tauri-android.sh but for Windows: - Fires a remote build on SepehrHomeserverdk via ssh + heredoc - Mounts the shared cargo-registry + cargo-git cache + a target-windows dir (separate from the android target cache so different triples don't stomp each other) - Runs npm install + npm run build for the frontend dist, then cargo xwin build --release --target x86_64-pc-windows-msvc --bin wzp-desktop inside the container - Uploads the resulting .exe to rustypaste (via the .env token on the remote, same as android script) and fires ntfy.sh/wzp notifications at start + completion - scp's the .exe back to target/windows-exe/wzp-desktop.exe locally - --image-build flag triggers a fire-and-forget `docker build` of the Dockerfile.windows-builder on the remote (used once after the Dockerfile changes). 
The image is already built at the moment of this commit — sha256:f3895cb2fde7 scripts/build-windows-cloud.sh Kept as an alternative cross-compile path using a fresh Hetzner VM (cx33, 8 vCPU, 8 GB — bumped from cx23 after the smaller size OOM'd mid-rustc). The docker-on-SepehrHomeserverdk path is now the preferred fast path because the image has a pre-warmed xwin cache and a persistent cargo target volume, making warm builds ~3 minutes vs the cloud path's ~20 minutes cold each run. The cloud script stays around for when we want a truly isolated environment. Both scripts notify via ntfy.sh/wzp and upload to paste.dk.manko.yoga so the user can pick up the artefact + see status without polling.	2026-04-10 12:35:02 +04:00
Siavash Sameni	d774f5f8c5	feat(history): dedupe by call_id + explicit Incoming/Outgoing/Missed labels
User reported that outgoing direct calls from macOS show up in the history list as "missed" even when the call completes successfully. Adds three changes to fix / diagnose: 1. history::log now dedupes by call_id. If an entry for this call_id already exists in the store, it updates the existing row's direction + timestamp in place instead of appending a duplicate. Protects against double-emit (caller side adding Missed on top of Placed, or any future signal loop that fires twice). One row per call_id, which matches what the user intuitively expects. 2. history::log now logs every write with tracing::info — call_id, peer_fp, direction, alias. Plus an extra line when we replace an existing entry: "history::log replacing existing entry from=Placed to=Missed" etc. Makes it easy to see in the desktop stderr which side is writing what, so we can find the outgoing => missed regression immediately if it recurs. 3. main.ts now renders an explicit text label next to the direction arrow: "Outgoing", "Incoming", or "Missed" instead of just the ↗ ↙ ✗ icons. Removes any ambiguity about what the icon means so future users can't misread a Placed entry as Missed based on icon shape alone. Side fix for scripts/build-windows-cloud.sh: - die() and the do_full ERR trap now respect WZP_KEEP_VM=1 so a failed build doesn't auto-destroy the debug VM (previously the trap fired before the KEEP_VM check and tore down the VM on any error). - Bump default server type cx23 → cx33. 4GB RAM is not enough for a cold tauri + rustls + quinn + wzp-client cross-compile — the cx23 run got "Read from remote host ... Connection reset by peer" partway through rustc, which is the classic signature of an OOM kill on the SSH session. 
cx33 has 8GB RAM and 8 vCPU which should comfortably fit the build.	2026-04-10 12:34:19 +04:00
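The dedupe-by-call_id rule described in this commit can be sketched roughly as below. This is only an illustration under stated assumptions: the `CallHistoryEntry` fields, the `timestamp_ms` name, and the `log_entry` helper are hypothetical and are not taken from the actual `history.rs`; only the "update direction + timestamp in place, otherwise append" behaviour comes from the message.

```rust
// Hypothetical shapes: the type names follow the commit message, the fields are assumed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CallDirection {
    Placed,
    Received,
    Missed,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CallHistoryEntry {
    pub call_id: String,
    pub peer_fp: String,
    pub direction: CallDirection,
    pub timestamp_ms: u64,
}

/// Upsert keyed on `call_id`: if a row for this call already exists, overwrite
/// its direction and timestamp in place; otherwise append a new row.
/// Returns the previous direction when an existing row was replaced, which is
/// the information a "replacing existing entry from=.. to=.." log line needs.
pub fn log_entry(
    store: &mut Vec<CallHistoryEntry>,
    entry: CallHistoryEntry,
) -> Option<CallDirection> {
    if let Some(existing) = store.iter_mut().find(|e| e.call_id == entry.call_id) {
        let previous = existing.direction;
        existing.direction = entry.direction;
        existing.timestamp_ms = entry.timestamp_ms;
        return Some(previous);
    }
    store.push(entry);
    None
}
```

Logging a Placed entry and then a Missed entry for the same call_id leaves one row in the store, with the second call returning `Some(CallDirection::Placed)`.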
Siavash Sameni	2fd94651e4	fix(desktop): direct calls used wrong identity file — mac identity mismatch
The non-Android branch of CallEngine::start loaded the seed from $HOME/.wzp/identity directly, while register_signal in lib.rs goes through the shared load_or_create_seed() helper which resolves via APP_DATA_DIR → Tauri's app_data_dir(). On macOS those are two completely different files: register_signal → ~/Library/Application Support/com.wzp.desktop/.wzp/identity CallEngine::start (old) → ~/.wzp/identity On a fresh install they end up holding two different random seeds. Register and CallEngine then derive two different fingerprints from those seeds, and when a direct call comes in the relay routes it to "you" under the register_signal fingerprint, but once CallEngine tries to join the call-* room it advertises a DIFFERENT fingerprint — which fails the call_registry ACL check on the relay side (only the two authorised participants of a call can join its room). Silent hang, the call never completes. Android hit this bug earlier in the week and was fixed by switching its CallEngine::start branch to `crate::load_or_create_seed()`. Backport the same single-line change to the desktop branch so both platforms share one identity source of truth. Also bring the desktop branch up to parity with the android branch on diagnostic logging: - log CallEngine::start entry with relay/room/alias/quality/has_reuse - log endpoint.local_addr on reuse / create - log "QUIC connection established, performing handshake" between connect() and perform_handshake() so a hang at either step is immediately localisable - map_err all three potential failure points (create_endpoint, connect, perform_handshake) to an explicit error! trace	2026-04-10 12:15:23 +04:00
Siavash Sameni	da09fdb6e9	windows(desktop): gate coreaudio / VoiceProcessingIO to macOS-only targets
First step of the Windows x86_64 desktop build: stop pulling coreaudio-rs into the Windows dependency graph so the project can at least run `cargo check --target x86_64-pc-windows-msvc`. Software AEC is already disabled in engine.rs so there's nothing else to stub — the macOS-specific VPIO path is skipped via #[cfg(target_os = "macos")] on both sides and Windows falls through to the plain CPAL AudioCapture/AudioPlayback branch that already existed. crates/wzp-client/Cargo.toml - coreaudio-rs optional dep moved under [target.'cfg(target_os = "macos")'] - `vpio` feature now uses `dep:coreaudio-rs` syntax and the gated dep - Enabling `vpio` on Windows/Linux is a no-op at resolution time crates/wzp-client/src/lib.rs - `pub mod audio_vpio` is now #[cfg(all(feature = "vpio", target_os = "macos"))] - Previously `vpio` alone was enough to try to compile the Core Audio bindings, which would fail on non-Apple targets the moment the feature flag was flipped on desktop/src-tauri/Cargo.toml - [target.'cfg(not(target_os = "android"))'] removed — was leaking vpio into Windows/Linux via the catch-all. - macOS: wzp-client with features = ["audio", "vpio"] - Windows: wzp-client with features = ["audio"] - Linux: wzp-client with features = ["audio"] - Android: wzp-client with default-features = false (unchanged) - Dropped the unused direct coreaudio-rs = "0.11" dep on macOS — wzp-desktop's own sources never call Core Audio directly. Verified via `cargo tree --target x86_64-pc-windows-msvc -p wzp-desktop` that the Windows target now resolves wzp-client with cpal but without coreaudio-rs. macOS target still resolves with coreaudio (direct via vpio feature and transitively via cpal). macOS `cargo check` still builds cleanly. 
Cross-compile from macOS hit a cargo-xwin + llvm-lib setup issue in ring's build.rs, so the actual `cargo check --target x86_64-pc-windows-msvc` did not complete locally. Build verification belongs on the user's Windows x86_64 host where MSVC is present natively. See tasks #23 (this one), #24 (Voice Capture DSP / WASAPI Communications for OS-level AEC on Windows), and #25 (aarch64-pc-windows-msvc support).	2026-04-10 11:12:08 +04:00
Siavash Sameni	510eae2089	feat(direct-call): call history, recent contacts, deregister button
Persistent JSON-backed call history for the direct-call screen so users can see what they've placed / received / missed and dial back with one click. Also fixes two small latent UX issues reported alongside. Backend (Rust) - new crate/module desktop/src-tauri/src/history.rs: thread-safe in- process store (OnceLock<RwLock<Vec<CallHistoryEntry>>>) backed by <APP_DATA_DIR>/call_history.json. Atomic writes via temp+rename. Max 200 entries, FIFO pruning. CallDirection { Placed, Received, Missed }. - Log hooks in the signal loop + commands: * place_call → Placed entry (with target fingerprint) * DirectCallOffer → Missed entry up front; upgraded to Received inside answer_call when accept_mode != Reject via history::mark_received_if_pending(call_id). If user rejects or never answers, it stays Missed. - New Tauri commands: * get_call_history() → all entries, newest first * get_recent_contacts() → unique peers by fp, newest interaction first * clear_call_history() → wipes JSON + in-memory * deregister() → tears down signal transport + endpoint Backend emits `history-changed` events so the UI can live-refresh without polling. Frontend (main.ts + index.html + style.css) - Direct-call panel now has: * Recent contacts chip row (top 6 unique peers). Click a chip → dial. * Call history list (up to 50 rows). Direction icon (↗ placed, ↙ received, ✗ missed), peer alias/fp, relative timestamp, callback button. Both click handlers populate target-fp and fire place_call. * Deregister button in the "registered" header — calls the new deregister command, tears down the signal transport, returns the UI to the pre-register state. * Clear-history link in the history header. 
- Subscribes to `history-changed` events so the list updates the moment the backend logs a new entry. Also refreshed on register + after a clear. - Nothing is rendered until there is data — empty sections stay hidden. Tasks #20 + #21 (small UX items bundled in) - Default room "general" for new installations: the html input value attribute is now "general" and loadSettings() defaults match. Existing users' localStorage still wins. - Random alias on desktop: already latent but confirmed working — the startup IIFE at main.ts:374 calls get_app_info() and prefills the alias input from derive_alias(seed) when the input is empty. No code change needed, just verified it flows through the same path as the Android client. Known follow-ups (deferred to step 6 polish) - Call duration tracking (currently all entries have no duration field) - Hangup signal from an unanswered incoming should emit history-changed so the missed state is visible even when the user never tapped accept - Android UI layout fit-check on the smaller Nothing screen	2026-04-10 11:03:36 +04:00
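The "atomic writes via temp+rename" step mentioned for call_history.json might look something like the following. It is a minimal sketch using only the standard library; the `write_atomic` name and the `.tmp` suffix are assumptions for illustration, not the helper actually used in `history.rs`.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Write `contents` to `path` so that a reader sees either the old file or the
/// complete new file, never a half-written one. The sibling temp-file name
/// (`<path>.tmp`) is an assumption of this sketch.
pub fn write_atomic(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Build "<path>.tmp" next to the target so the rename stays on one filesystem.
    let mut tmp_name = path.as_os_str().to_owned();
    tmp_name.push(".tmp");
    let tmp_path = PathBuf::from(tmp_name);

    {
        let mut tmp = File::create(&tmp_path)?;
        tmp.write_all(contents)?;
        // Flush file data to disk before the rename makes it visible.
        tmp.sync_all()?;
    }

    // Replace the destination with the fully written temp file.
    fs::rename(&tmp_path, path)
}
```

If the process dies mid-write, only the temp file is left truncated; the previous call_history.json stays intact.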
Siavash Sameni	76a4c53e21	fix(android-audio): spawn_blocking for Oboe restart — unblock tokio executor
Build `4c6aac6` added a stop+sleep+start Oboe restart inside the set_speakerphone Tauri command, but calling wzp_native::audio_stop() and audio_start() synchronously from an async fn blocks the tokio executor thread — those FFI calls wait for AAudio to finalise the stream teardown/bringup, which takes ~400ms each on Nothing phone (Pixel is fast enough to hide the bug). Reproduced on Nothing: 7 rapid Speaker button clicks across ~30 seconds, each restarting Oboe. After the 5th click the engine send and recv tokio tasks froze for 22 seconds — decoded_frames stuck at 1159 across 9 heartbeats, send_drops growing from 148 to 1720 as encoded frames couldn't make it past `send_t.send_media(pkt).await`. At 08:40:48 the runtime finally caught up and processed a 911-frame burst at once (buffered QUIC datagrams flooding through). Classic "blocking sync call in async context" anti-pattern. Fix: run the stop + start sequence inside tokio::task::spawn_blocking so the Oboe teardown + reopen happens on a dedicated blocking thread, leaving the tokio runtime free to keep driving the send and recv tasks. AAudio's requestStop returns only after the stream is actually in Stopped state, so the explicit sleep that bridged stop and start is no longer needed and is dropped. Send and recv tasks still see a ~500ms window of empty reads / partial writes during the blocking restart, but they get SCHEDULED through it — network packets keep being received + decoded + dropped into the playout ring, and captured mic samples keep being encoded + sent through quinn. No more executor starvation, no more 22-second audio dropouts, no more send_drops burst. 
Pixel still worked before this fix only because its AAudio teardown is fast enough to not exceed the scheduler's cooperative yield interval — same bug was latent on both devices, Nothing just made it visible.	2026-04-10 08:45:54 +04:00
Siavash Sameni	4c6aac654a	fix(android-audio): restart Oboe on speakerphone toggle + unbreak button UI
Build `4f2ad65` wired the Speaker button to AudioManager.setSpeakerphoneOn but user testing found that flipping speakerphone on an active Oboe VoiceCommunication stream silently tears down the AAudio streams on Pixel-class devices — both capture and playout stop producing data. Only ending the call and rejoining brings audio back (because the fresh Oboe open runs with the new routing already applied). Also the earpiece state showed up red in the UI because the button was getting the `.muted` CSS class when speakerphoneOn=false. Earpiece is a valid routing state, not a muted one. Fix set_speakerphone Tauri command: 1. Flip AudioManager.setSpeakerphoneOn via JNI (as before). 2. If the Oboe backend is currently running, stop it, sleep 50 ms to let AAudio finalise the transition, then start it again. The Rust send/recv tokio tasks keep running across the gap — they just read zero samples and write into the preserved ring buffers for a few frames, which is acceptable. The AudioBackend singleton's ring state is preserved across stop+start because it's in a 'static OnceLock. 3. Debounce the UI click via speakerphoneBusy + spkBtn.disabled so users can't queue up multiple toggles during the restart window. Fix main.ts Speaker button: - Remove the `.muted` classList toggle (added `.speaker-on` for CSS). - Update label text to "🔊 Speaker" / "🔈 Earpiece" for clarity. - On showCallScreen(), invoke is_speakerphone_on to sync the label with the real AudioManager state, so it matches reality after a rejoin (which was another symptom the user hit — the button label desynced from the actual routing after ending and restarting a call). - Debounce click + disable button while the restart is in flight. 
Drops #[allow(dead_code)] from wzp_native::audio_is_running now that it is actually called from the set_speakerphone restart guard.	2026-04-10 07:35:12 +04:00
Siavash Sameni	4f2ad65418	fix(android_audio): add explicit pointer types for .cast() — was rejected by rustc E0282 on android target	2026-04-09 22:02:48 +04:00
Siavash Sameni	0178cbd91d	android(audio): Speaker button toggles earpiece↔speaker via JNI (WIP, untested)
Build `9e37201` confirmed on-device that Usage::VoiceCommunication + MODE_IN_COMMUNICATION + speakerphoneOn=false routes Oboe playout to the handset earpiece and the callback drains the ring correctly. Next step: let the user flip speakerphoneOn at runtime so the existing Speaker button actually switches audio routing instead of just gating writes. - Cargo.toml (android target): pull in `jni = 0.21` and `ndk-context = 0.1`. Both are already transitively in the lockfile via Tauri/Wry, so this just promotes them to direct deps. - desktop/src-tauri/src/android_audio.rs: new module. Grabs the JavaVM + current Activity from `ndk_context::android_context()`, attaches a JNI thread, calls `activity.getSystemService("audio")` to get the AudioManager, and exposes `set_speakerphone(bool)` + `is_speakerphone_on()` helpers that call the AudioManager method of the same name. All gated behind `#[cfg(target_os = "android")]`. - lib.rs: adds `mod android_audio;` (android only), two new Tauri commands `set_speakerphone(on)` and `is_speakerphone_on()` — desktop gets no-op stubs so the same frontend invoke() works everywhere. Both registered in the invoke_handler. - desktop/src/main.ts: the Speaker button (previously toggled the playout-write gate via `toggle_speaker`) now calls `set_speakerphone` and reads back the new routing state. Labels switched from "Spk" / "Spk Off" to "Earpiece" / "Speaker" so users can't be confused into thinking clicking turns audio off. pollStatus no longer clobbers the spkBtn label based on engine spk_muted, since the two concepts are now decoupled. WIP because this has NOT been built or tested yet — committing at night to save the work. 
Tomorrow: build #50 with this change, smoke-test the Handset↔Speaker toggle, then move on to call history + last-contacts UI and the Speaker-button mute bug on the other phone.	2026-04-09 22:00:34 +04:00
Siavash Sameni	9e37201198	android(audio): Usage::VoiceCommunication + MODE_IN_COMMUNICATION, default handset
With `da106bd` (Usage::Media + MODE_NORMAL) audio works but is always on the loudspeaker — we want handset as the default with a user-driven toggle for speaker (and later bluetooth). The right Oboe usage for a VoIP app is VoiceCommunication, which honours AudioManager.setSpeakerphoneOn / setBluetoothScoOn for routing. Bisection across previous builds showed that setAudioApi(AAudio) + Usage::VoiceCommunication made the playout callback stop draining the ring after cb#0 (build `8c36fb5` logs). Letting Oboe pick the AudioApi implicitly keeps the callback alive — 96be740's Media-usage callbacks fired at steady 50Hz without any explicit setAudioApi. So: keep the Usage change, DROP the explicit AAudio force. - oboe_bridge.cpp: Usage::VoiceCommunication, no setAudioApi, no ContentType override. - MainActivity.kt: setMode(MODE_IN_COMMUNICATION) + setSpeakerphoneOn(false) = handset default, plus max both STREAM_VOICE_CALL and STREAM_MUSIC volumes for belt-and-braces. Next build will add a JNI-based Tauri command to flip speakerphoneOn at runtime so the user can toggle handset↔speaker during a call.	2026-04-09 21:50:06 +04:00
Siavash Sameni	da106bd939	fix(android-audio): revert to 96be740's Oboe config — VoiceCommunication broke callback drain
Build `8c36fb5` logs showed a new regression: Oboe playout cb#0 fires once at startup then the callback STOPS DRAINING the ring entirely. written_samples sticks at 7679 (= RING_CAPACITY - 1) across every recv heartbeat in a 40-second test. Meanwhile the recv task decodes 1800+ real audio frames (sample range up to [-27920..31907], rms 12065) which all get dropped on the floor by audio_write_playout returning 0 because the ring is full. Bisection: `96be740` (Usage::Media, no setAudioApi, no ContentType, no MainActivity audio mode change) DID drive the playout callback at the expected 50Hz (playout heartbeat: calls=1100 total_played_real=1055040 over 22 seconds). User still heard nothing there because of OS routing, but at least Oboe accepted the PCM. `8c36fb5` added four changes on top of `96be740`: 1. Oboe Usage::Media → Usage::VoiceCommunication 2. Oboe setAudioApi(oboe::AudioApi::AAudio) explicit 3. Oboe setContentType(ContentType::Speech) 4. MainActivity setMode(MODE_IN_COMMUNICATION) + setSpeakerphoneOn(true) Every one of those could have killed the callback; combined they did. Revert to 96be740's exact Oboe config: Usage::Media, no setAudioApi, no ContentType. Keep the PCM recorder, heartbeat logging, and stream-open logging. Separately, MainActivity now maxes STREAM_MUSIC (the stream Usage::Media routes to) but leaves audio mode in MODE_NORMAL — no more speakerphone/call-mode combo that makes Oboe unhappy. In NORMAL mode a STREAM_MUSIC stream plays through the loud speaker by default. 
Proof that the Rust pipeline is perfect: decoded.pcm recorded in `8c36fb5` was pulled via `adb shell run-as com.wzp.desktop cat .wzp/decoded.pcm`, converted with ffmpeg, and played back on the Mac — user confirmed audible speech. So 100% of the remaining bug surface is Android audio routing, not anything in the Rust/C++ decode path.	2026-04-09 21:38:19 +04:00
Siavash Sameni	8c36fb5651	fix(wzp-native): Oboe ResultWithValue has no value_or, unfold explicitly
cc-rs build of oboe_bridge.cpp failed at `cfa9ff6` because the Oboe ResultWithValue<T> template returned by getXRunCount() does not have a .value_or(T) method — only .value(). Replace with an explicit bool-conversion + .value() guard that yields -1 on error.	2026-04-09 21:25:38 +04:00
Siavash Sameni	cfa9ff67cf	fix(android-audio): VoIP mode + speakerphone + debug PCM recorder
Build `96be740` logs proved the entire software pipeline is healthy: capture heartbeat: calls=1100 to_write=960 full_drops=0 total_written=1056000 recv heartbeat: decoded_frames=1035 last_written=960 decode_errs=0 recv decoded PCM: range=[-13564..9244] rms=8044 (real audio) playout WRITE: in_len=960 written=960 rms=2318 (real audio into the ring) playout heartbeat: calls=1100 nonempty=1099 total_played_real=1055040 1055040 samples / 48000 Hz = 22s — exactly matches wall-clock elapsed, meaning Oboe IS calling our playout callback at the expected rate and WE ARE handing it real PCM every 20ms. User still heard nothing. Ergo Oboe accepted the PCM and routed it to a silent output. Four changes: 1) MainActivity.kt: switch to MODE_IN_COMMUNICATION + speakerphone ON right after permissions are granted, and crank STREAM_VOICE_CALL to max. Without this, an Oboe Usage::VoiceCommunication stream gets opened, the OS creates a real AAudio pipeline, the callback fires on schedule — and audio goes to either the earpiece at muted volume or a "call not active" dead end. Logs the audio mode + volume levels before and after the switch so we can confirm the state change in logcat next run. 2) oboe_bridge.cpp: revert Usage::Media → VoiceCommunication (the mode that matches MODE_IN_COMMUNICATION), pin the audio API to AAudio explicitly instead of letting Oboe fall back to OpenSLES (which has its own silent-drop failure modes on some devices), and add getState + getXRunCount to the playout heartbeat so we'll see silent stream disconnects instead of reading zeros forever. 
3) engine.rs recv task: dump the first ~10s of post-AGC decoded PCM to `<app_data_dir>/decoded.pcm` as raw i16 LE so we can adb pull it and play it back locally: adb shell run-as com.wzp.desktop cat .wzp/decoded.pcm > decoded.pcm ffmpeg -f s16le -ar 48000 -ac 1 -i decoded.pcm decoded.wav This divorces "is our decoder actually producing audible audio" from "is Android's audio stack playing it". If the recorded WAV sounds correct when played on a laptop, the decoder is fine and 100% of the remaining bug surface is AudioManager / Oboe routing. 4) engine.rs: also log when spk_muted=true blocks the write. User reported the Speaker button in the UI has inconsistent semantics between desktop and android — adding this log rules out the accidental "first click muted playback" theory for good.	2026-04-09 21:24:26 +04:00
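The two numeric ideas in this commit, dumping PCM as raw i16 little-endian (the layout `ffmpeg -f s16le` expects) and checking the heartbeat counter against wall-clock time, can be sketched as follows. The function names are hypothetical; only the 48000 Hz mono format and the sample counts come from the message.

```rust
/// Sample rate quoted in the commit message.
pub const SAMPLE_RATE_HZ: u32 = 48_000;

/// Serialise mono PCM as raw signed 16-bit little-endian bytes, i.e. the
/// layout `ffmpeg -f s16le -ar 48000 -ac 1` reads. Hypothetical helper name.
pub fn pcm_to_s16le(samples: &[i16]) -> Vec<u8> {
    let mut out = Vec::with_capacity(samples.len() * 2);
    for s in samples {
        out.extend_from_slice(&s.to_le_bytes());
    }
    out
}

/// Convert a running sample counter (e.g. total_played_real) to seconds.
pub fn duration_secs(total_samples: u64, sample_rate_hz: u32) -> f64 {
    total_samples as f64 / sample_rate_hz as f64
}
```

With the logged values, `duration_secs(1_055_040, SAMPLE_RATE_HZ)` is 21.98 s, which is 1099 non-empty callbacks of 960 samples each, while the capture side's 1100 × 960 = 1056000 samples is exactly 22 s.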
Siavash Sameni	96be740fd9	diag(android-audio): aggressive logging across the whole Oboe pipeline
User confirmed: mac hears android, android does not hear mac. So Oboe capture works end-to-end but Oboe playout on Android silently drops audio even though QUIC forwards the packets. Archaeology on the legacy wzp-android crate also revealed that the "last known good" Android audio path NEVER used Oboe in production — it used Kotlin AudioRecord + AudioTrack via JNI, and cpp/oboe_bridge.cpp was dead code. So every time we've "tested" Oboe end-to-end this week was the first production use, and any of its config knobs could be the bug. Instrumenting every stage of the pipeline so one smoke-test log dump can isolate the layer at fault: C++ (oboe_bridge.cpp) - Log the ACTUAL stream parameters after openStream for both capture and playout (sample rate, channels, format, framesPerBurst, framesPerDataCallback, bufferCapacityInFrames, sharing, perf mode). Oboe may silently override values we requested — e.g. if we ask for 48kHz mono but the device gives us 44.1kHz stereo our 960-sample frames are the wrong duration and the pipeline drifts. - Capture callback: on cb#0 log sample range+RMS of the first frame to prove we get real mic data (not zeros). Every 50 callbacks (~1s at 20ms burst) log calls, numFrames, ring available_write, bytes actually written, ring_full_drops, total_written. - Playout callback: on cb#0 log numFrames + ring state. On the FIRST non-empty read log sample range+RMS so we can tell if the samples coming out of the ring are real audio or zeros. Every 50 callbacks log calls, nonempty count, numFrames, ring available_read, underrun_frames, total_played_real. 
Rust wzp-native (src/lib.rs) - wzp_native_audio_write_playout now logs the first 3 writes and then every 50th: in_len, written, sample range, RMS, ring write/read cursors before, available_read and available_write after. Reveals ring-overflow and whether the engine is actually handing us audio. - Minimal android logcat shim via __android_log_write extern — no new crate dependency. - AudioBackend grows a `playout_write_log_count` AtomicU64 to gate the write-side log throttle. Rust engine.rs (android branch) - Recv task: log sample range + RMS for the first 3 decoded PCM frames and then every 100th. Reveals whether decoder.decode is producing real audio or silent buffers. - Recv task: if audio_write_playout returns fewer samples than we handed it (partial write → ring nearly full) warn about it in the first 10 frames. - Recv heartbeat every 2s: recv_fr, decoded_frames, last_decode_n, last_written, written_samples, decode_errs, codec. Expected flow in a healthy log: capture cb#0: numFrames=960 range=[-1200..900] rms=180 ← mic OK capture stream opened: actualSR=48000 Ch=1 ... ← no override playout stream opened: actualSR=48000 Ch=1 ... CallEngine::start invoked ... → connected → audio started recv: first media packet received ... recv: decoded PCM sample range decoded_frames=1 range=[-300..250] rms=92 playout WRITE #0: in_len=960 written=960 range=[-300..250] rms=92 playout FIRST nonempty read: to_read=960 range=[-300..250] rms=92 playout heartbeat: calls=50 nonempty=50 underrun=0 ... recv heartbeat: decoded_frames=100 last_written=960 ... If any of those are missing/zero we know the exact stage to fix.	2026-04-09 21:13:29 +04:00
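The "sample range + RMS" figures that recur in these log lines can be computed along the lines below. This is a sketch, not the code in `engine.rs` or `oboe_bridge.cpp`: the function names are made up, and whether the real logs round or truncate the RMS to an integer is not stated in the message.

```rust
/// Minimum and maximum sample in a frame, or None for an empty frame.
pub fn sample_range(frame: &[i16]) -> Option<(i16, i16)> {
    let min = *frame.iter().min()?;
    let max = *frame.iter().max()?;
    Some((min, max))
}

/// Root-mean-square level of a frame. Squares are accumulated in f64 so a
/// 960-sample frame of full-scale values cannot overflow.
pub fn rms(frame: &[i16]) -> f64 {
    if frame.is_empty() {
        return 0.0;
    }
    let sum_sq: f64 = frame
        .iter()
        .map(|&s| {
            let v = s as f64;
            v * v
        })
        .sum();
    (sum_sq / frame.len() as f64).sqrt()
}
```

A frame of all zeros yields a range of (0, 0) and an RMS of 0, which is the signature of a silent buffer that the heartbeat logs are designed to expose.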
Siavash Sameni	8c4d640f89	fix(android): playout Usage::Media + relay CallSetup advertises real IP
Three real bugs, one smoke-test session's worth of progress. 1. RELAY: wrong advertised addr in CallSetup The direct-call CallSetup computed `relay_addr = addr.ip()` where `addr = connection.remote_address()` — i.e. the CLIENT'S IP, not the relay's. So the relay was telling both parties "the call room is at the answerer's IP:4433", which meant each client dialed either the other client (no server listening) or themselves. Both endpoint.connect calls hung forever and the call never happened. Fix: compute the relay's own advertised IP once at startup. If the listen addr is 0.0.0.0, probe the primary outbound interface via the classic UDP-bind-and-connect(8.8.8.8:80) trick to discover the LAN IP the OS would use to reach external hosts. Thread the resulting advertised_addr_str into the CallSetup sender for both parties. 2. RELAY: accept loop serialized QUIC handshakes Previously the main accept loop called `wzp_transport::accept` which did both `endpoint.accept().await` AND `incoming.await` (the server- side QUIC handshake). A single slow handshake therefore blocked every subsequent client from being accepted. Unroll the helper here and move `incoming.await` into the per-connection spawned task, so every handshake runs in parallel. Also log "accept queue: new Incoming", "QUIC handshake complete", and "QUIC handshake failed" so we can tell immediately whether a client's packets are reaching the relay at all. 3. ANDROID: playout was routed to the silent in-call stream The Oboe playout stream was configured with Usage::VoiceCommunication, which routes to the Android in-call earpiece stream. 
That stream is silent unless the Activity has called AudioManager.setMode( IN_COMMUNICATION) and, even then, only the earpiece/BT headset get audio (not the loud speaker). Result: android→mac calls worked because mac had a normal media output, but mac→android calls were silent even though packets flowed through the relay just fine. Switch to Usage::Media + ContentType::Speech so Oboe routes to the loud speaker and uses the media volume slider. A later polish step will wire setMode + setSpeakerphoneOn from MainActivity.kt so we can go back to VoiceCommunication for AEC and proximity-sensor routing. Plus: heartbeat tracing every 2s in the send/recv tasks — frames_sent, last_rms, last_pkt_bytes, short_reads on the send side; decoded_frames, last_decode_n, last_written, decode_errs on the recv side. Will make the next "no sound" regression trivial to localize.	2026-04-09 20:55:10 +04:00
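The "UDP-bind-and-connect(8.8.8.8:80)" probe from item 1 can be sketched with the standard library alone. The helper names below are hypothetical and the relay's real code may differ; the sketch relies on the fact that `connect` on a UDP socket sends no packet and only asks the OS to pick a route, after which `local_addr` reports the source IP that route would use.

```rust
use std::io;
use std::net::{IpAddr, SocketAddr, UdpSocket};

/// Ask the OS which local IP it would use to reach an external host.
/// No datagram is sent; connect() on UDP only fixes the peer and route.
pub fn probe_outbound_ip() -> io::Result<IpAddr> {
    let sock = UdpSocket::bind("0.0.0.0:0")?;
    sock.connect("8.8.8.8:80")?;
    Ok(sock.local_addr()?.ip())
}

/// Address to advertise to clients: the listen address itself when it is
/// concrete, otherwise the probed outbound IP with the listen port.
pub fn advertised_addr(listen: SocketAddr) -> io::Result<SocketAddr> {
    let ip = if listen.ip().is_unspecified() {
        probe_outbound_ip()?
    } else {
        listen.ip()
    };
    Ok(SocketAddr::new(ip, listen.port()))
}
```

On a host with no route to the probe target the `connect` call can fail, so a caller would presumably want a fallback (for example an explicitly configured address) rather than advertising 0.0.0.0.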
Siavash Sameni	49f101d785	fix(android): reuse signal endpoint for direct-call media connection
Direct-call accept hangs forever at the QUIC handshake on Android. Logs from `d7b37a5` showed: CallEngine::start (android) invoked relay=172.16.81.172:4433 room=call-… resolved relay addr identity loaded endpoint created, dialing relay ← reached ← nothing, 90s+, no error The "connect failed" and "QUIC connection established" log lines never fire, meaning endpoint.connect_with(…).await never makes progress. Repro is 100%: SFU room join (one endpoint) works perfectly; direct call (opens a SECOND quinn::Endpoint on top of the signal one) hangs in the QUIC handshake. Creating two quinn::Endpoints on Android's AAudio-adjacent UDP stack apparently causes the second one's datagrams to never reach the relay (the server never sees the Initial packet). Rather than fight the platform, quinn is happy to multiplex multiple Connections on a single Endpoint — so we reuse the signal endpoint for the media connection. - SignalState now stores the quinn::Endpoint alongside the QuinnTransport. register_signal populates both at the same time. - CallEngine::start (both android and desktop branches) takes an Option<wzp_transport::Endpoint>. Some → reuse (direct-call path, after register_signal). None → create fresh (SFU room join path). - The connect tauri command reads state.signal.endpoint and threads it through to CallEngine::start, so the direct-call auto-connect (fired by the "setup" signal-event in main.ts) lands on the existing UDP socket. - wzp_transport re-exports quinn::Endpoint so wzp-desktop doesn't need to depend on quinn directly. - Also wraps the android connect in tokio::time::timeout(10s) so future hangs become deterministic "connect TIMED OUT" errors in logcat instead of silent deadlock. 
Same fix applies verbatim to the desktop client — the user suspects direct call is broken there too and this was likely always the cause, just never surfaced because desktop was only tested via SFU rooms.	2026-04-09 20:29:51 +04:00
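The `Some` → reuse / `None` → create-fresh rule can be illustrated without quinn. In this sketch `Endpoint` is a stand-in wrapping a plain UDP socket (an assumption made for illustration; the real type is `quinn::Endpoint` re-exported by `wzp_transport`), and `endpoint_for_call` is a hypothetical helper name.

```rust
use std::io;
use std::net::UdpSocket;
use std::sync::Arc;

/// Stand-in for a QUIC endpoint: one bound UDP socket that several
/// connections can share. Not the real quinn type.
#[derive(Clone)]
struct Endpoint {
    socket: Arc<UdpSocket>,
}

impl Endpoint {
    /// Bind a brand-new socket on an OS-chosen loopback port.
    fn bind_fresh() -> io::Result<Self> {
        Ok(Self {
            socket: Arc::new(UdpSocket::bind("127.0.0.1:0")?),
        })
    }
}

/// Direct-call path passes Some(signal endpoint) and reuses its socket;
/// the SFU room-join path passes None and gets a fresh endpoint.
fn endpoint_for_call(existing: Option<Endpoint>) -> io::Result<Endpoint> {
    match existing {
        Some(ep) => Ok(ep),
        None => Endpoint::bind_fresh(),
    }
}
```

With this shape the media connection of a direct call leaves from the same local address as the signal connection, which is the property the fix relies on.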
Siavash Sameni	d7b37a5749	diag: tracing for direct-call signal loop + CallEngine::start stages Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m57s Details User reports tapping "answer" on an incoming direct call does nothing visible, and suspects the same may affect desktop. The signal recv loop had no tracing at all, so we can't tell whether CallSetup is being received, whether the recv loop died silently, or whether CallEngine::start is failing between "identity loaded" and "connected to relay, handshake complete". - register_signal recv loop now logs every message type with fields (CallRinging, DirectCallOffer, DirectCallAnswer, CallSetup, Hangup, unhandled), plus a warn! on recv errors and a final warn when the loop exits. - place_call / answer_call commands log entry + success / error. The answer_call error path logs the underlying send_signal error so we can see it in logcat instead of only in the JS error toast. - CallEngine::start android branch logs relay/room/alias on entry, logs "endpoint created, dialing relay" between create_endpoint and connect, "QUIC connection established, performing handshake" between connect and perform_handshake, and promotes all three potential failures to explicit error! logs so a silent hang / error becomes visible in logcat. No functional changes — pure diagnostics. Stacks on `b35a6b7` (the Oboe stack-pointer-escape fix) so build #43 carries both.	2026-04-09 19:17:03 +04:00
Siavash Sameni	b35a6b7d92	fix(wzp-native): copy WzpOboeRings by value, not by pointer Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 3m41s Details PlayoutCallback::onAudioReady crashed with SIGSEGV(SEGV_ACCERR) on the first AAudio callback because g_rings was a `const WzpOboeRings*` pointing at the caller's stack frame. wzp_native_audio_start() constructs the rings struct as a stack local in Rust, passes &rings to wzp_oboe_start (which stored the raw pointer), and returns, at which point the stack frame unwinds and g_rings becomes a dangling pointer. The first audio callback then read from freed memory and died. - g_rings is now a static WzpOboeRings value (was `const WzpOboeRings*`). The raw int16 buffer + atomic index pointers inside the struct still point into the Rust-owned AudioBackend singleton, which is leaked for the lifetime of the process, so copying the struct by value is safe and keeps the inner pointers valid forever. - g_rings_valid atomic bool gates the audio-callback reads: set to true after the value copy in wzp_oboe_start, cleared in wzp_oboe_stop BEFORE the streams are torn down so any in-flight callback sees "no backend" and returns Stop instead of racing on g_rings. - All g_rings->x accesses in the capture + playout callbacks switched to g_rings.x (member-of-value). Reproduced on Pixel 6 / Android 15 with build `0105b0f`: F libc: Fatal signal 11 (SIGSEGV), code 2 (SEGV_ACCERR), fault addr 0x71aa717eb0 in tid 11822 (AudioTrack) #00 PlayoutCallback::onAudioReady(oboe::AudioStream*, void*, int)+120 #01 oboe::AudioStream::fireDataCallback(void*, int)+136 ...	2026-04-09 19:11:16 +04:00
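The copy-by-value plus validity-flag pattern from this fix, transliterated to Rust for illustration. The actual change is in the C++ bridge; `Rings`, the two statics, and the function names below are hypothetical, and the real struct carries buffer and index pointers rather than a single capacity field.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

/// Hypothetical, trimmed-down ring descriptor.
struct Rings {
    playout_capacity: usize,
}

// Process-lifetime copy of the descriptor, plus the flag that gates reads.
static RINGS_PLAYOUT_CAPACITY: AtomicUsize = AtomicUsize::new(0);
static RINGS_VALID: AtomicBool = AtomicBool::new(false);

#[derive(Debug, PartialEq)]
enum CallbackResult {
    Continue,
    Stop,
}

/// Copy the fields out of the caller's (possibly stack-local) struct,
/// then publish them. No pointer to `rings` is kept.
fn start(rings: &Rings) {
    RINGS_PLAYOUT_CAPACITY.store(rings.playout_capacity, Ordering::Relaxed);
    RINGS_VALID.store(true, Ordering::Release);
}

/// Clear the flag first; stream teardown would follow after this point.
fn stop() {
    RINGS_VALID.store(false, Ordering::Release);
}

/// Audio callback: bail out if no backend is published.
fn on_audio_ready() -> CallbackResult {
    if !RINGS_VALID.load(Ordering::Acquire) {
        return CallbackResult::Stop;
    }
    let _capacity = RINGS_PLAYOUT_CAPACITY.load(Ordering::Relaxed);
    CallbackResult::Continue
}
```

Because `start` copies the value before returning, the caller's struct can go out of scope immediately without affecting later callbacks.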
Siavash Sameni	0105b0fbf3	phase 3(android): RECORD_AUDIO permission + runtime request in MainActivity Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 4m0s Details Oboe fails silently to open the AAudio input stream without android.permission.RECORD_AUDIO, so the call audio would never actually flow even after phase 3's engine wiring. - AndroidManifest.xml: declare RECORD_AUDIO and MODIFY_AUDIO_SETTINGS, and android.hardware.microphone as a required feature. These files are the cargo-tauri-generated scaffold — nothing in .gitignore excludes them, so the intended Tauri 2 mobile workflow is to commit them once populated. - MainActivity.kt: override onCreate to call ActivityCompat.requestPermissions for the audio perms on first launch. The dialog shows exactly once; the grant is persisted per-package. onRequestPermissionsResult logs the outcome so we can spot failures in logcat. A full native Tauri permission plugin integration is deferred to Step 6 (polish) together with notifications, icon, and background service.	2026-04-09 19:00:12 +04:00
Siavash Sameni	5beea7de40	phase 3(android): unify connect/disconnect/toggle_*/get_status commands Some checks failed Mirror to GitHub / mirror (push) Failing after 37s Details Build Release Binaries / build-amd64 (push) Failing after 3m49s Details Step 3 of the Tauri Android rewrite was still returning "audio backend not yet wired on Android (step 3)" because the cfg-gated Android stubs for connect/disconnect/toggle_mic/toggle_speaker/get_status were shadowing the real commands. Now that CallEngine::start() has a real Android body (phase 3, commit `fdbe502`), the gates are unnecessary. - Drop the #[cfg(not(target_os = "android"))] gates from all five engine-backed Tauri commands. - Delete the Android stub block (~50 LOC of "not connected" boilerplate). - Ungate `use engine::CallEngine;` and the AppState.engine field so both targets share the same Mutex<Option<CallEngine>>. - CallEngine::stop() now calls crate::wzp_native::audio_stop() on Android so the mic + speaker are released between calls, matching the desktop behaviour where dropping _audio_handle tears down CPAL. Direct-call flow on Android: peer sends DirectCallOffer → user accepts via answer_call → relay sends signal "setup" event → main.ts auto-invokes connect(relay, room) → CallEngine::start() runs the Android branch → wzp_native::audio_start() brings up Oboe → send/recv tasks stream PCM through the dlopen boundary.	2026-04-09 18:53:54 +04:00
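A rough sketch of how both targets might share one `Mutex<Option<CallEngine>>` once the stubs are gone. The `CallEngine` here is a placeholder, and the `connect`, `disconnect`, and `status` signatures are assumptions; the real Tauri commands take different arguments and are async.

```rust
use std::sync::Mutex;

/// Placeholder engine; the real one owns transport, codec and audio state.
struct CallEngine {
    room: String,
}

impl CallEngine {
    fn start(room: &str) -> Result<Self, String> {
        Ok(Self { room: room.to_string() })
    }

    /// On Android the real stop() also releases the mic and speaker.
    fn stop(self) {}
}

#[derive(Default)]
struct AppState {
    engine: Mutex<Option<CallEngine>>,
}

fn connect(state: &AppState, room: &str) -> Result<(), String> {
    let mut slot = state.engine.lock().map_err(|e| e.to_string())?;
    if slot.is_some() {
        return Err("already connected".to_string());
    }
    *slot = Some(CallEngine::start(room)?);
    Ok(())
}

/// Returns true if there was an engine to stop.
fn disconnect(state: &AppState) -> bool {
    let taken = match state.engine.lock() {
        Ok(mut slot) => slot.take(),
        Err(_) => None,
    };
    match taken {
        Some(engine) => {
            engine.stop();
            true
        }
        None => false,
    }
}

/// Room of the active call, if any.
fn status(state: &AppState) -> Option<String> {
    let slot = state.engine.lock().ok()?;
    slot.as_ref().map(|engine| engine.room.clone())
}
```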
Siavash Sameni	fdbe502524	phase 3(android): wire CallEngine::start to wzp-native audio FFI Some checks failed Mirror to GitHub / mirror (push) Failing after 39s Details Build Release Binaries / build-amd64 (push) Failing after 3m57s Details Replaces the Android-side CallEngine::start() stub with a real implementation that mirrors the desktop start() body but routes all PCM through the standalone wzp-native cdylib loaded at startup via libloading instead of using CPAL. - desktop/src-tauri/src/wzp_native.rs: new module with a static OnceLock<libloading::Library> + cached raw fn pointers for every symbol we need (version, hello, audio_start/stop, read_capture, write_playout, is_running, capture/playout_latency_ms). init() resolves everything once at startup; accessors return default values if init() never ran. - desktop/src-tauri/src/lib.rs: drop the inline dlopen smoke test, add `mod wzp_native;` behind target_os="android", and invoke wzp_native::init() from the Tauri setup() callback so the library is loaded + all symbols cached before any CallEngine can touch audio. - desktop/src-tauri/src/engine.rs: the Android #[cfg] branch of CallEngine::start() now does the full QUIC handshake + signal loop + Opus send/recv tasks, calling wzp_native::audio_start() / audio_read_capture() / audio_write_playout() instead of the desktop CPAL rings. SyncWrapper now holds a placeholder Box<()> on Android because the audio backend lives in a process-global singleton inside libwzp_native.so rather than being owned per-engine. Next step: build #39 on the remote docker builder and smoke-test on Pixel 6 that the Connect button in the UI successfully brings up Oboe and streams audio through the dlopen boundary.	2026-04-09 18:42:27 +04:00
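The "resolve once, fall back to defaults" behaviour of the new module can be sketched with `std::sync::OnceLock` alone. This is an illustration under assumptions: the real module caches symbols resolved through libloading, whereas here two local stub functions stand in for the library, and `NativeApi` and its fields are hypothetical names.

```rust
use std::sync::OnceLock;

/// Cached entry points. In the real module these would be raw fn
/// pointers looked up in libwzp_native.so.
struct NativeApi {
    version: extern "C" fn() -> i32,
    is_running: extern "C" fn() -> i32,
}

static API: OnceLock<NativeApi> = OnceLock::new();

// Stand-ins for the symbols exported by the native library.
extern "C" fn stub_version() -> i32 {
    42
}
extern "C" fn stub_is_running() -> i32 {
    0
}

/// Resolve everything once. Returns false if init already ran.
fn init() -> bool {
    API.set(NativeApi {
        version: stub_version,
        is_running: stub_is_running,
    })
    .is_ok()
}

/// Accessors return a default value when init() never ran.
fn version() -> i32 {
    API.get().map(|api| (api.version)()).unwrap_or(0)
}

fn is_running() -> bool {
    API.get().map(|api| (api.is_running)() != 0).unwrap_or(false)
}
```

Calling `init()` from the setup callback before any engine exists means later accessors never pay for symbol lookup and never see a half-initialised table.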
Siavash Sameni	c769a476a2	phase 2(android): port Oboe C++ bridge + audio FFI into wzp-native Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 3m56s Details Now that Phase 1 proved the split-cdylib pipeline (build #37 launched cleanly with 'wzp-native dlopen OK: version=42 msg=...' in logcat), this commit brings the real audio code into wzp-native without ever touching the Tauri crate: - cpp/oboe_bridge.{h,cpp}, oboe_stub.cpp, getauxval_fix.c copied verbatim from crates/wzp-android/cpp/ (same files that work in the legacy wzp-android .so on this phone) - build.rs near-identical to crates/wzp-android/build.rs: clones google/oboe@1.8.1 into OUT_DIR, compiles oboe_bridge.cpp + all oboe source files as a single static lib with c++_shared linkage, emits -llog + -lOpenSLES. On non-android hosts it compiles just oboe_stub.cpp so `cargo check` works locally without an NDK. - Cargo.toml gets cc = "1" in [build-dependencies]. This is SAFE because wzp-native is a single-cdylib crate: crate-type is only ["cdylib"], no staticlib, so rust-lang/rust#104707 does not apply. - src/lib.rs extends the FFI surface with the real audio API: wzp_native_audio_start() -> i32 wzp_native_audio_stop() wzp_native_audio_read_capture(*mut i16, usize) -> usize wzp_native_audio_write_playout(*const i16, usize) -> usize wzp_native_audio_capture_latency_ms() -> f32 wzp_native_audio_playout_latency_ms() -> f32 wzp_native_audio_is_running() -> i32 Plus a static AudioBackend singleton holding the two SPSC ring buffers (capture + playout) that are shared with the C++ Oboe callbacks via AtomicI32 cursors. The wzp_native_version() and wzp_native_hello() smoke tests from Phase 1 are preserved. Compiles cleanly on macOS host with the stub oboe .cpp. Next build will exercise the full cargo-ndk path inside docker to verify the whole Oboe compile still works standalone. 
Phase 3 (next commit): wzp-desktop engine.rs on Android calls wzp-native's audio FFI via the already-wired libloading handle, and the real CallEngine::start() is implemented for Android using the same codec/handshake/send/recv pipeline as desktop but with Oboe rings instead of CPAL rings. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 18:12:01 +04:00
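For readers unfamiliar with the SPSC rings mentioned in the Phase 2 message, here is a minimal single-producer/single-consumer ring of `i16` samples with atomic cursors. It is a sketch, not the AudioBackend code: `SpscRing` is a hypothetical name, it uses `AtomicUsize` cursors and `AtomicI16` slots to stay free of `unsafe`, while the commit describes AtomicI32 cursors over a raw buffer shared with C++.

```rust
use std::sync::atomic::{AtomicI16, AtomicUsize, Ordering};

/// One producer calls write(), one consumer calls read().
struct SpscRing {
    buf: Vec<AtomicI16>,
    write_pos: AtomicUsize, // total samples ever written
    read_pos: AtomicUsize,  // total samples ever read
}

impl SpscRing {
    fn new(capacity: usize) -> Self {
        assert!(capacity > 0);
        Self {
            buf: (0..capacity).map(|_| AtomicI16::new(0)).collect(),
            write_pos: AtomicUsize::new(0),
            read_pos: AtomicUsize::new(0),
        }
    }

    /// Producer side: copies as many samples as fit, returns the count.
    fn write(&self, samples: &[i16]) -> usize {
        let w = self.write_pos.load(Ordering::Relaxed);
        let r = self.read_pos.load(Ordering::Acquire);
        let free = self.buf.len() - (w - r);
        let n = samples.len().min(free);
        for (i, &s) in samples[..n].iter().enumerate() {
            self.buf[(w + i) % self.buf.len()].store(s, Ordering::Relaxed);
        }
        // Publish the new samples to the consumer.
        self.write_pos.store(w + n, Ordering::Release);
        n
    }

    /// Consumer side: fills `out` with what is available, returns the count.
    fn read(&self, out: &mut [i16]) -> usize {
        let r = self.read_pos.load(Ordering::Relaxed);
        let w = self.write_pos.load(Ordering::Acquire);
        let n = out.len().min(w - r);
        for (i, slot) in out[..n].iter_mut().enumerate() {
            *slot = self.buf[(r + i) % self.buf.len()].load(Ordering::Relaxed);
        }
        // Hand the consumed slots back to the producer.
        self.read_pos.store(r + n, Ordering::Release);
        n
    }
}
```

Neither side blocks: a full ring drops the excess on write and an empty ring returns zero on read, which is the behaviour an audio callback needs.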
Siavash Sameni	7cc53aedc7	refactor(android): split C++ into wzp-native cdylib, loaded at runtime Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m34s Details Phase 1 of the big refactor. Escape the Tauri Android __init_tcb+4 symbol leak (rust-lang/rust#104707) by making wzp-desktop's Android .so pure Rust — ZERO cc::Build, no cpp/ files, no C++ in the rustc link step. All future C++ (Oboe audio bridge) lives in a new standalone cdylib crate `wzp-native` which is built with cargo-ndk (the same path the legacy wzp-android crate uses successfully on the same phone + same NDK), copied into Tauri's gen/android/app/src/main/jniLibs at build time, and dlopened by wzp-desktop at runtime via libloading. Changes in this commit: - NEW crate crates/wzp-native/ with crate-type = ["cdylib"] only (no staticlib, no rlib — rust#104707 shows mixing staticlib with cdylib leaks non-exported symbols, which is the original bug source). Phase 1 scaffold has TWO extern "C" functions: wzp_native_version() -> i32 (returns 42) wzp_native_hello(buf, cap) -> usize (writes a string) So we can verify dlopen + dlsym + cross-.so FFI end-to-end before adding any real C++. - desktop/src-tauri/cpp/ directory DELETED (7 files gone). - desktop/src-tauri/build.rs reduced to just the git hash capture + tauri_build::build(). No more cc::Build of any kind. - desktop/src-tauri/Cargo.toml: drop cc from build-dependencies, add libloading = "0.8" as an Android-only runtime dep. - desktop/src-tauri/src/lib.rs Builder::setup() now (on Android only) dlopens libwzp_native.so, calls wzp_native_version() and wzp_native_hello(), and logs the result: "wzp-native dlopen OK: version=42 msg=\"hello from wzp-native\"" If this log appears in logcat when the app launches and the home screen still renders, the split-cdylib pipeline is validated and Phase 2 (port the Oboe bridge into wzp-native) can proceed. 
- scripts/build-tauri-android.sh: insert a `cargo ndk -t arm64-v8a build --release -p wzp-native` step before `cargo tauri android build`, with `-o desktop/src-tauri/gen/android/app/src/main/jniLibs` so the resulting libwzp_native.so lands in the place gradle will package into the final APK. - Workspace Cargo.toml: add crates/wzp-native to [workspace] members. Phase 2 (separate commit, only if Phase 1 works): - Copy cpp/oboe_bridge.{h,cpp} + getauxval_fix.c from the legacy wzp-android crate into crates/wzp-native/cpp/. - Add cc = "1" as a build-dependency on wzp-native (safe: it's a single-cdylib crate with no staticlib, so no symbol leak). - Add a build.rs that compiles the Oboe C++, and have the wzp-native Rust FFI expose the audio start/stop/read/write functions. - wzp-desktop::engine.rs dlopens wzp-native at CallEngine::start, uses its audio functions instead of CPAL on Android. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 18:02:53 +04:00
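The two Phase 1 scaffold functions might look roughly like this. The names, the return value 42, and the greeting text come from the commit message; the `*mut u8` buffer type and the lack of a NUL terminator are assumptions. In the real cdylib the functions would also need a no_mangle attribute so that dlsym can find them; it is left out here so the sketch compiles the same way under any edition.

```rust
/// Smoke-test symbol: returns a fixed version number.
pub extern "C" fn wzp_native_version() -> i32 {
    42
}

/// Copies a greeting into `buf`, writing at most `cap` bytes, and
/// returns the number of bytes written (no NUL terminator is added).
///
/// Safety: `buf` must be null or valid for writes of `cap` bytes.
pub unsafe extern "C" fn wzp_native_hello(buf: *mut u8, cap: usize) -> usize {
    const MSG: &[u8] = b"hello from wzp-native";
    if buf.is_null() || cap == 0 {
        return 0;
    }
    let n = MSG.len().min(cap);
    // The caller guarantees `buf` is writable for `cap >= n` bytes.
    unsafe {
        std::ptr::copy_nonoverlapping(MSG.as_ptr(), buf, n);
    }
    n
}
```

Passing the capacity alongside the pointer lets the callee truncate instead of overrunning, which keeps the cross-.so boundary free of shared allocator assumptions.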
Siavash Sameni	711137da96	fix(android): -Wl,--exclude-libs,ALL + --no-whole-archive to stop symbol leak Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 3m54s Details llvm-nm on the crashing .so confirmed the research's smoking gun theory: 000000000130c1f0 t _Z10__init_tcbP10bionic_tcbP18pthread_internal_t 0000000000000000 a pthread_create.cpp 0000000001331108 t pthread_create All lowercase 't' (= LOCAL text symbols), zero UND dynamic references for pthread_create. So rustc's link step is pulling bionic's own pthread_create.cpp compilation unit out of libc.a as a whole-archive inclusion and binding those symbols locally inside our .so, instead of letting them stay UND and resolved against libc.so at dlopen time. Rust's libstd thread::spawn then calls the LOCAL (broken) pthread_create which calls the LOCAL __init_tcb with arguments set up for bionic's static-executable layout — crashes at __init_tcb+4 with SEGV_ACCERR. `-Wl,--exclude-libs,ALL` tells the linker to make symbols from static archives NOT appear in the dynamic symbol table of the output .so. `-Wl,--no-whole-archive` tells it to only pull archive objects that satisfy undefined references, not include the whole archive blindly. If this works, the symbol table should show pthread_create as UND (or at least not locally bound) and the app should launch. If it doesn't, the remaining fallback is the research's action #3 — extract the C++ into its own upstream cdylib crate built with cargo-ndk, and dlopen it from the Tauri cdylib at runtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 17:45:35 +04:00
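One way the two linker flags could be passed from a build script is through Cargo's `cargo:rustc-link-arg` directive. How this commit actually wired them in is not stated, so treat the following as an assumption; `link_directives` is a hypothetical helper.

```rust
/// Lines a build script could print for the given target OS.
/// In build.rs the OS would come from the CARGO_CFG_TARGET_OS
/// environment variable and each returned line would be println!-ed.
fn link_directives(target_os: &str) -> Vec<String> {
    if target_os != "android" {
        return Vec::new();
    }
    ["-Wl,--exclude-libs,ALL", "-Wl,--no-whole-archive"]
        .iter()
        .map(|flag| format!("cargo:rustc-link-arg={flag}"))
        .collect()
}
```

Whether the flags helped can then be checked the same way the crash was diagnosed, by running llvm-nm on the output .so and looking at how pthread_create is bound.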
Siavash Sameni	6071eb1b02	fix(android): drop staticlib from crate-type — root cause of __init_tcb crash Some checks failed Mirror to GitHub / mirror (push) Failing after 42s Details Build Release Binaries / build-amd64 (push) Failing after 3m47s Details External research (per rust-lang/rust#104707) pointed at this as the highest-probability cause of our byte-identical __init_tcb+4 / pthread_create SIGSEGVs: > Having 'staticlib' alongside 'cdylib' in crate-type leaks non-exported > symbols from the staticlib into the cdylib's symbol table. For a > Tauri Android cdylib, that means bionic's private pthread_create / > __init_tcb code — which got pulled in statically from libc.a the > moment any cc::Build C++ file added C++-linkage overhead — ends up > bound LOCALLY inside our .so instead of being resolved dynamically > against libc.so at dlopen time. Symptoms that match the theory exactly: - llvm-nm on the crashing .so shows __init_tcb and pthread_create as LOCAL symbols with C++ name mangling (bionic's own pthread_create.cpp) - Adding any cc::Build cpp(true) step reliably triggers the crash, independent of which linker (android24-clang vs android26-clang) or which libc++ linkage (shared/static/none) - The legacy wzp-android crate (["cdylib", "rlib"]) works fine on the same phone with the same NDK + Rust toolchain + Oboe C++ code - tauri.conf.json bundle.android.minSdkVersion=26 propagates to gradle but the .so still crashes byte-identically Drop 'staticlib' from crate-type. If we ever need it for iOS, re-add behind a target.'cfg(target_os = "ios")' gate. The desktop binary still links against the rlib, so the bin target on macOS/Linux/Windows is unaffected. Source: https://github.com/rust-lang/rust/issues/104707 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 17:38:49 +04:00
Siavash Sameni	c9cd043657	test: tauri.conf.json bundle.android.minSdkVersion=26 + cpp_smoke.cpp c++_shared Some checks failed Mirror to GitHub / mirror (push) Failing after 1m26s Details Build Release Binaries / build-amd64 (push) Failing after 3m37s Details User theory: tauri-cli hardcodes minSdkVersion=24 into its rustc invocation regardless of gradle build.gradle.kts, .cargo/config.toml, or env var overrides — but DOES read from tauri.conf.json's bundle.android block. That would explain why every cc::Build C++ compile crashed with __init_tcb+4 via pthread_create: API-24 bionic's .init_array routines for the linked-in .init_array clash with the pthread_create state tao later expects. This commit applies the fix AND re-adds the smallest known crashing variant (E.1 with cpp_link_stdlib('c++_shared')) so the test has one clear failure mode to compare against: tauri.conf.json bundle: "android": { "minSdkVersion": 26 } build.rs (on android target): - hello.c (plain C, worked in Step A) - getauxval_fix.c (plain C, worked in Step D) - hello2.c (plain C, worked in Step D+1) - cpp_smoke.cpp (C++ via cc::Build .cpp(true), crashed in E.1) Also re-emits the libc++_shared.so copy into gen/android jniLibs so the runtime linker can resolve the NEEDED entry cc-rs added via cpp_link_stdlib('c++_shared'). If this launches → theory validated, proceed with Oboe integration. If this crashes → need to keep digging. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:58:37 +04:00
Siavash Sameni	6dd62c94c9	step D+1: add third trivial C static lib (hello2.c) Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m51s Details Step D (hello.c + getauxval_fix.c) launches cleanly. E.minus-1 (hello.c + getauxval_fix.c + cpp_smoke.c) crashes. All three are plain-C trivial single-function files. Theory: the regression is triggered by having 3 or more cc::Build static libs in a Tauri Android cdylib, regardless of what the libs contain. Test: clone hello.c as hello2.c (same content, different symbol) and add a third cc::Build step compiling it. If this crashes, the trigger is just the number of static libs. If it launches, there's something magical about cpp_smoke.c specifically (unlikely — it was near-identical content). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:51:50 +04:00
Siavash Sameni	4c998312aa	regression check: revert build.rs to exact Step D state Some checks failed Mirror to GitHub / mirror (push) Failing after 37s Details Build Release Binaries / build-amd64 (push) Failing after 3m37s Details Verify the Step D baseline still launches after the environment mutations we may have caused during the E bisection (docker image rebuild, tauri-cli version drift, etc). Build.rs is now byte-identical to commit `a852cad` (Step D) except for the git hash capture block that already existed at that point. If this launches cleanly → the cpp_smoke addition genuinely breaks something, bisection continues. If this crashes → the environment regressed between Step D and now, and we need to rebuild the docker image to an earlier snapshot.	2026-04-09 16:45:34 +04:00
Siavash Sameni	22701830c2	step E.minus-1: cpp_smoke renamed to .c and compiled as plain C Some checks failed Mirror to GitHub / mirror (push) Failing after 39s Details Build Release Binaries / build-amd64 (push) Failing after 3m53s Details c++_shared crashed, c++_static crashed, no stdlib crashed. The one remaining variable is cc::Build::new().cpp(true) itself, i.e. the C++ compile-mode invocation of clang++. Rename cpp_smoke.cpp → cpp_smoke.c and drop .cpp(true), leaving a plain-C cc::Build that compiles the exact same bytes (minus the 'extern "C"' linkage spec, which is C++-only syntax). This is structurally identical to Step A (hello.c), which worked. If THIS build launches, the diff between 'works' and 'crashes' is purely the .cpp(true) mode: something clang++ does differently at compile or link time when producing object files for a Tauri Android cdylib. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:38:29 +04:00
Siavash Sameni	47a037368c	step E.0: drop cpp_link_stdlib entirely (no libc++ linkage) Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 3m47s Details c++_shared crashed. c++_static also crashed. Both have libc++ code landing in the final .so — one as a NEEDED dynamic lib, the other bundled statically. So the trigger isn't the NEEDED entry specifically, it's libc++ being present in any form. cpp_smoke.cpp is just 'extern "C" int wzp_cpp_hello() { return 42; }' with zero C++ features used, so we can drop cpp_link_stdlib completely and the compile still succeeds. No libc++ .a or .so referenced at all. If this crashes: the trigger is cc::Build::new().cpp(true) switching rustc's final linker driver from clang to clang++ (which pulls in different default libraries). If this launches: the trigger is libc++'s own static initializers or the libc++ code itself doing something that breaks our .so at dlopen time, and we have a path forward — C++ code that doesn't need libc++ (e.g., a thin C++ bridge to Oboe that uses only POD types at the boundary, with all the STL stuff confined to Oboe's own compilation unit which would still need libc++...). More likely we still need a C-only audio interface like raw AAudio via the ndk Rust crate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:31:53 +04:00
Siavash Sameni	191e8761d5	step E.1 variant: cpp_link_stdlib c++_shared → c++_static Some checks failed Mirror to GitHub / mirror (push) Failing after 36s Details Build Release Binaries / build-amd64 (push) Failing after 3m42s Details Every E.x variant crashed identically when linked with c++_shared, even with a 3-line cpp file that's dead-stripped from the final .so. The crash offsets are byte-identical across E.1, E.2, E.4, and the original full-Oboe Step E. That points at a non-code link-time delta: the `cargo:rustc-link-lib=c++_shared` directive that adds a NEEDED entry for libc++_shared.so to the .so's dynamic table. Swap to c++_static — bundles libc++ directly into our .so so the NEEDED entry disappears. If this launches cleanly, we've conclusively proven the NEEDED libc++_shared.so is the root cause and we have a workable linkage for any C++ we want to add to the Tauri Android build (including the eventual Oboe audio backend). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:18:04 +04:00
Siavash Sameni	0d74366592	step E.1: absolute minimum C++ file (no STL, no includes) Some checks failed Build Release Binaries / build-amd64 (push) Failing after 3m53s Details Last bisection step. cpp/cpp_smoke.cpp reduced to a single extern 'C' function that returns 42. No #include, no std::atomic, no std::mutex, no std::thread. Only C++ things remaining are: - cc::Build::new().cpp(true) in build.rs (C++ mode compile) - cpp_link_stdlib('c++_shared') emitting -lc++_shared If this still crashes with the same __init_tcb+4 / pthread_create stack, we've conclusively proven the trigger is NOT any C++ code that ends up in the final .so (everything gets dead-stripped anyway because Rust never references wzp_cpp_hello). The trigger must be either: a) cargo:rustc-link-lib=c++_shared (adds NEEDED entry for libc++_shared.so in the .so's dynamic table, causing the dynamic linker to load libc++_shared.so at dlopen() time alongside our .so), or b) Some interaction between cpp(true) mode and the rest of the build pipeline (toolchain flags, symbol visibility, etc.) After this build we stop and write an incident report for the WarzonePhone Tauri Android rewrite bisection so far. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:54:21 +04:00
Siavash Sameni	0224ce654c	step E.2: shrink cpp_smoke to std::atomic only — no thread, no mutex Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m48s Details Incremental bisection within Step E. E.4 (atomic + mutex + thread) still crashed at __init_tcb. Drop mutex and thread, keep only std::atomic. Build.rs still emits cargo:rustc-link-lib=c++_shared via cpp_link_stdlib('c++_shared'), so the NEEDED entry for libc++_shared.so in the final .so stays identical. Goal: if this crashes, the issue is purely the dynamic link against libc++_shared (not thread/mutex code). If it passes, the issue is actually std::thread or std::mutex use. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:47:30 +04:00
Siavash Sameni	aa240c6d83	step E.4(android): replace full Oboe compile with minimal C++ smoke file Some checks failed Mirror to GitHub / mirror (push) Failing after 40s Details Build Release Binaries / build-amd64 (push) Failing after 3m44s Details Bisection for the __init_tcb+4 crash that Step E introduced: drop the full Oboe C++ build (200+ files, hundreds of KB of code) and replace it with ONE tiny cpp/cpp_smoke.cpp that exercises the libc++ features Oboe uses — std::atomic, std::mutex, std::thread — via an extern "C" wzp_cpp_smoke() function that's exported but NEVER called from Rust. Still compiled with cpp_link_stdlib("c++_shared"), same as Oboe. libc++_shared.so still copied into gen/android jniLibs. But no Oboe headers, no Oboe source files, no -llog / -lOpenSLES links. Hypothesis: if cpp_smoke.cpp alone reproduces the __init_tcb crash, the trigger is "any libc++_shared link that references std::thread/std::mutex" and Oboe is not the specific culprit. If it launches cleanly, Oboe itself (its size, its static constructors, or a specific header) is responsible — and we then bisect Oboe's source tree. fetch_oboe() and add_cpp_files_recursive() are retained in build.rs with #[allow(dead_code)] so re-enabling the full Oboe compile is a one-line edit once we've identified what's safe to include. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:39:30 +04:00
Siavash Sameni	d216dcc7a3	step E fix (Option 3): bake android24→26 clang shim into image Some checks failed Mirror to GitHub / mirror (push) Failing after 37s Details Build Release Binaries / build-amd64 (push) Failing after 3m37s Details Incremental Step E (commit `4250f1b`) proved that merely compiling the Oboe C++ bridge into libwzp_desktop_lib.so (with NO Rust-side FFI bindings, no function calls) resurrects the __init_tcb+4 / pthread_create SIGSEGV at WryActivity.onCreate. Bisection: build #17 (baseline) ✓ build #18 (Step A, hello.c) ✓ build #19 (Step B, wzp-client dep) ✓ build #21 (Step C, engine mod compiled) ✓ build #22 (Step D, getauxval_fix.c) ✓ build #23 (Step E, Oboe C++ compiled) ✗ (__init_tcb+4 crash) Root cause: tauri-cli hard-codes `aarch64-linux-android24-clang` as the Rust linker. Without any C++ code in the .so, libstd's pthread_create reference gets resolved against the dynamic libc.so. The moment we add a C++ static library that links against libc++_shared, the link-time resolution pulls in the API-24 libc.a static pthread_create stub, and Rust's libstd then also calls that stub instead of libc.so's real one. The stub calls __init_tcb which SIGSEGVs because bionic's TCB state only exists for static-libc main executables, not .so's loaded via dlopen. API-26 NDK has proper dynamic bindings that resolve correctly. Option 3 fix: at image build time, replace every NDK aarch64-linux-android24-clang (and armv7/x86_64/i686, clang/clang++) binary with a one-line shell script that exec()s the corresponding android26-clang. Since tauri-cli invokes the linker via absolute path, PATH and env var overrides fail, but replacing the binary on disk inside the image is guaranteed to take effect. The legacy wzp-android crate doesn't need this because cargo-ndk respects .cargo/config.toml where a crate-level linker override is set. Only changing the Dockerfile here. 
Next: rebuild the image no-cache, retry Step E, and if the baseline holds, proceed to Steps F/G. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:17:34 +04:00
Siavash Sameni	4250f1b44a	step E(android): compile full Oboe C++ bridge (not yet called from Rust) Some checks failed Mirror to GitHub / mirror (push) Failing after 40s Details Build Release Binaries / build-amd64 (push) Failing after 3m53s Details Fifth incremental variable — and the first genuinely heavy one. Adds: - cpp/oboe_bridge.{h,cpp} (copied verbatim from crates/wzp-android/cpp/) - cpp/oboe_stub.cpp (fallback if Oboe can't be fetched) - build.rs now clones google/oboe@1.8.1 into OUT_DIR and compiles oboe_bridge.cpp + every .cpp file under oboe/src/ as a single static library via cc::Build, using shared libc++. Same logic as the legacy wzp-android build.rs. - libc++_shared.so gets copied from the NDK sysroot into the Tauri gen/android jniLibs directory so the runtime linker can find it. - rustc-link-lib=log / OpenSLES emitted for Oboe's Android backends. Deliberately NOT called from Rust yet — no extern "C" FFI declarations, no oboe_audio.rs module, the `wzp_oboe_*` symbols from the static lib are simply present but unreferenced. Goal: isolate whether the Oboe C++ compile + static lib link alone (with its libc++ dependency and log/OpenSLES bindings) regresses the working baseline. If the build still launches and renders the home screen, we know the C++ side is clean and the actual regression is caused by calling into Oboe at runtime (next step). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:09:16 +04:00
Siavash Sameni	a852cad15e	step D(android): compile cpp/getauxval_fix.c alongside hello.c Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m55s Details Fourth incremental variable. Adds the getauxval_fix.c shim from the legacy wzp-android crate (which has been shipping with it for months without issue) to our cc::Build on Android. The file defines a single getauxval() function that delegates to bionic's real runtime implementation via dlsym — this is needed because rustc links compiler-rt's broken static getauxval stub that SIGSEGVs in .so libraries loaded via dlopen (reads __libc_auxv which is NULL). Not imported from Rust. Goal: verify that adding a second C static archive (and especially one that overrides a libc-ish symbol) doesn't regress the working build. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:03:37 +04:00
Siavash Sameni	19fd3dd9cc	step C fix: ungate wzp_proto imports used by resolve_quality() on Android Some checks failed Mirror to GitHub / mirror (push) Failing after 38s Details Build Release Binaries / build-amd64 (push) Failing after 3m48s Details Build #20 failed to compile on Android because I over-gated the wzp_proto imports to non-Android. resolve_quality() is compiled on every platform (it's outside the CallEngine impl) and references QualityProfile + CodecId — both platform-independent types from wzp_proto. Move those back to an unconditional import. tracing stays gated (only the desktop start() body logs; the Android stub is silent). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:59:00 +04:00
Siavash Sameni	c69195fe06	step C(android): compile engine.rs on Android with a stub CallEngine::start. Third incremental variable. Previously the engine module was cfg-gated out of the Android build entirely (`#[cfg(not(target_os = "android"))] mod engine;` in lib.rs). Now it's always compiled, so any link-time effect of having engine.rs in the compilation unit can be measured against the working baseline from build #19. Changes kept deliberately small: - lib.rs: drop the cfg gate on `mod engine;`. `use engine::CallEngine` stays gated because the Android-specific connect/disconnect/... stubs in lib.rs don't reference the type. - engine.rs: the `wzp_client::{audio_io, call}` imports + CodecId + QualityProfile are gated to non-Android (they require the `audio` feature on wzp-client which Android doesn't pull in). On Android we keep only the MediaTransport import for transport.close(). The impl block now has two `start()` methods: the full CPAL-backed one for desktop, and a 6-line Android stub that returns `Err("audio engine not yet wired on Android")` so attempts to `connect` from the UI fail cleanly. Goal: verify that linking in the compiled engine module (plus the types it references) on Android doesn't regress the working baseline. Home screen should still render and register_signal should still work. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:56:02 +04:00
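The two-`start()` arrangement described in this commit can be sketched roughly as below. This is a minimal illustration, not the project's engine.rs: the empty `CallEngine` struct, the `relay` parameter, and the `String` error type are assumptions, and the desktop branch is a placeholder for the real CPAL-backed setup. Only the Android error string is taken from the commit message.

```rust
/// Hypothetical stand-in for the real engine type in engine.rs.
pub struct CallEngine;

impl CallEngine {
    /// Desktop variant. In the real crate this would open the CPAL-backed
    /// audio devices; here it is a placeholder that always succeeds.
    #[cfg(not(target_os = "android"))]
    pub fn start(relay: &str) -> Result<CallEngine, String> {
        let _ = relay; // assumed parameter, unused in this sketch
        Ok(CallEngine)
    }

    /// Android stub. Exactly one of the two `start` definitions survives
    /// cfg evaluation, so callers compile unchanged on every platform and
    /// a `connect` attempt on Android fails with a clean error.
    #[cfg(target_os = "android")]
    pub fn start(_relay: &str) -> Result<CallEngine, String> {
        Err("audio engine not yet wired on Android".to_string())
    }
}
```

Because the gate sits on the methods rather than on `mod engine;`, the module is always part of the compilation unit, which is the variable this step is meant to isolate.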
Siavash Sameni	ae4f366b05	step B(android): depend on wzp-client with default-features=false. Second incremental variable on the path to Oboe. Adds a `[target.'cfg(target_os = "android")'.dependencies]` block that pulls in wzp-client with NO features enabled — no audio (no CPAL), no vpio (no VoiceProcessingIO). This gives the Android build access to wzp-client's platform-independent modules (call, handshake, audio_ring, codec wiring) without any system audio bindings. Deliberately no new imports in lib.rs or engine.rs. The only effect should be: cargo-tauri on Android now has to compile wzp-client and all its transitive crates (wzp-codec, wzp-fec, wzp-proto, wzp-crypto already pulled directly; now also audiopus, raptorq, etc.) and link them into libwzp_desktop_lib.so. Goal: verify that merely expanding the compiled code set to include wzp-client doesn't regress the previous working state. If it does, we know one of wzp-client's transitive deps is the problem — probably a C dep like audiopus or codec2. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:49:49 +04:00
Siavash Sameni	f96d7ce3e1	step A(android): add cc=1 build-dep + compile single trivial hello.c. First incremental variable on the path back to Oboe integration. Changes are deliberately minimal: add cc = "1" to [build-dependencies] (cargo build-deps resolve against the host so the line is unconditional), and on the Android target run a single cc::Build step that compiles cpp/hello.c — a 6-line file that defines one function (`wzp_hello_stub`) that is never called from Rust. Goal: verify that merely introducing a C static library into the .so via cc::Build does not regress the working build (#17, commit `5309938` = build #6 behaviour: launches, renders home screen, registers on relay). If this build still works, we know cc::Build pipelines alone are fine and can move to the next variable. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:45:24 +04:00
Siavash Sameni	530993854f	revert(android): roll back to build #6 (`35642d1`) — pre-oboe known-good state. Spent 10+ builds chasing a __init_tcb+4 / pthread_create SIGSEGV after adding the oboe audio backend. Every "fix" made things worse. Reverting all Android-specific files to the state at `35642d1` (build #6), which was the last commit where the Tauri Android app actually launched, rendered the home screen, and successfully registered on a relay. Reverted files (all back to their `35642d1` content): - desktop/src-tauri/Cargo.toml (no build-dep cc, no tracing-android) - desktop/src-tauri/build.rs (git hash only, no Oboe / cc build) - desktop/src-tauri/src/lib.rs (engine cfg-gated on non-android) - desktop/src-tauri/src/main.rs (two-line desktop entry) - desktop/src-tauri/src/engine.rs (desktop-only audio setup) - scripts/Dockerfile.android-builder (no android24→26 clang shim) - scripts/build-tauri-android.sh (no linker env vars / manifest patch) Deleted (were added between `b314138` and `e2e023d`): - desktop/src-tauri/cpp/getauxval_fix.c - desktop/src-tauri/cpp/oboe_bridge.{h,cpp} - desktop/src-tauri/cpp/oboe_stub.cpp - desktop/src-tauri/src/oboe_audio.rs Next: rebuild image on remote (to drop the baked-in clang shim), build an APK, install on Pixel 6, verify the UI renders the same way build #6 did. From there we add features back ONE at a time so we can actually bisect which one triggers the tao::ndk_glue crash. User's rule: "if you want to change stack, change incrementally, so we can debug". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:22:57 +04:00
Siavash Sameni	e2e023d2bc	fix(android): drop pthread_shim — clang shim makes it unnecessary (and harmful). Once the Dockerfile rewrites every android24-clang to exec android26-clang, the linker uses the API-26 NDK sysroot and libstd's pthread_create reference resolves directly against libc.so's real runtime symbol — no interposition needed. The pthread_shim.c approach was actually fighting its own solution: our shim's dlsym() call bound at link time to libdl.a's STUB dlsym (a five-line function inside libdl_static.o that just returns NULL and sets dlerror to "libdl.a is a stub --- use libdl.so instead"). NDK r19 and glibc 2.34 both replaced libdl.a with empty stubs because dynamic loading is now part of the main libc/bionic — so no amount of link-order tinkering can make a static libdl.a dlsym actually work. Remove pthread_shim.c, the cc::Build::new().file("cpp/pthread_shim.c") step in build.rs, and the -Wl,--wrap=pthread_create rustc-link-arg. Keep getauxval_fix.c because that one DOES work at link time (the symbol override is for a function compiler-rt defines statically, not one that would depend on the stub libdl.a/libc.a). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 13:52:53 +04:00
Siavash Sameni	5df9d418c9	fix(android): bake android24→26 clang shim into the docker image itself. Build #13's PATH wrapper trick failed because tauri-cli invokes the linker with an absolute path (/opt/android-sdk/ndk/.../bin/aarch64-linux-android24-clang), which bypasses $PATH entirely. The pthread_shim logs confirmed the broken API-24 stubs were still being linked: WZP_pthread_shim: dlsym(RTLD_DEFAULT, pthread_create) returned NULL: libdl.a is a stub --- use libdl.so instead Move the fix up a level — into the Dockerfile itself. On image build, for each of the four android ABIs × {clang, clang++}, rename `${abi}24-${suffix}` to `${abi}24-${suffix}.orig` and replace it with a shell wrapper that exec()s `${abi}26-${suffix}`. Any call to the API-24 wrapper — via PATH, absolute path, or otherwise — now transparently runs the API-26 wrapper, which uses the real libc.so/libdl.so bindings. The old bash-c /tmp/wrappers workaround in build-tauri-android.sh is removed now that the image handles it at the right layer. Also add `--shell` to build-tauri-android.sh: opens an interactive docker container on the remote with the same mounts/env as the build, so I can iterate on cargo tauri android build / manually patch files / etc. without the full git push → ssh → rebuild → install loop. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 13:33:10 +04:00
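The rename-and-wrap scheme in this commit can be illustrated with a small planner that lists, for each ABI and suffix, the API-24 file to rename, its `.orig` backup name, and the wrapper script that takes its place. This is a sketch under assumptions: the real change lives in the Dockerfile as shell, the function and constant names are hypothetical, the ABI prefixes follow the usual NDK clang naming (`armv7a-linux-androideabi` for armv7), and the wrapper body is one plausible way to exec the API-26 sibling.

```rust
/// NDK clang name prefixes for the four Android ABIs (assumed NDK naming).
pub const ABI_PREFIXES: [&str; 4] = [
    "aarch64-linux-android",
    "armv7a-linux-androideabi",
    "x86_64-linux-android",
    "i686-linux-android",
];

/// Both compiler driver suffixes that get wrapped.
pub const SUFFIXES: [&str; 2] = ["clang", "clang++"];

/// One planned replacement: the API-24 file name, the name it is moved
/// to, and the shell script written in its place.
pub struct WrapperPlan {
    pub api24_name: String,
    pub backup_name: String,
    pub script: String,
}

/// Build the 4 x 2 = 8 replacements. Each script execs the API-26 driver
/// sitting in the same directory, so it works whether the API-24 name is
/// found via PATH or called by absolute path.
pub fn wrapper_plans() -> Vec<WrapperPlan> {
    let mut plans = Vec::new();
    for prefix in ABI_PREFIXES {
        for suffix in SUFFIXES {
            let api24_name = format!("{prefix}24-{suffix}");
            plans.push(WrapperPlan {
                backup_name: format!("{api24_name}.orig"),
                script: format!(
                    "#!/bin/sh\nexec \"$(dirname \"$0\")/{prefix}26-{suffix}\" \"$@\"\n"
                ),
                api24_name,
            });
        }
    }
    plans
}
```

Replacing the file in place, rather than shadowing it on `$PATH`, is what makes the redirect hold for absolute-path invocations.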
Siavash Sameni	2718402e96	fix(android): PATH wrapper to redirect tauri-cli's android24-clang → android26. Build #12's instrumented pthread_shim gave us the definitive diagnosis: WZP_pthread_shim: dlsym(RTLD_DEFAULT, pthread_create) returned NULL: libdl.a is a stub --- use libdl.so instead Tauri-cli invokes `aarch64-linux-android24-clang` as the linker and the API-24 NDK sysroot ships stub libdl.a / libc.a: they compile fine but every symbol crashes if called, because they're meant to coexist with a separate dynamic .so that the dynamic linker provides at runtime. Rust's pre-built libstd.rlib has static calls into those stubs baked in, so no matter what we do at link time the broken code lands in the .so. Env-var overrides of CARGO_TARGET_AARCH64_LINUX_ANDROID_LINKER don't stick — tauri-cli resets them before invoking cargo. So instead of fighting the env, we put a wrapper on $PATH, literally named `aarch64-linux-android24-clang`, that exec()s the android26 version. When tauri-cli looks up android24-clang via PATH, it gets our wrapper, our wrapper runs android26-clang, and suddenly the whole build is using the API-26 NDK sysroot with real dynamic bindings to libc.so / libdl.so. Wrappers are installed for all four ABIs (aarch64, armv7, x86_64, i686) × both suffixes (clang, clang++) directly inside the docker bash -c preamble before any cargo invocation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 13:23:47 +04:00
Siavash Sameni	1a8288c95f	debug(android): instrument pthread_shim with logcat tracing + try RTLD_DEFAULT first. Build #11 linked cleanly with --wrap=pthread_create but crashed at launch on tao::ndk_glue::create with a Rust .expect() panic — meaning the shim's __wrap_pthread_create successfully intercepted the call but returned non-zero, triggering std::thread::spawn's Result::expect panic. Add __android_log_print tracing so logcat shows exactly which resolver path fired (RTLD_DEFAULT vs dlopen fallback) and what dlerror reports when they fail. Also try RTLD_DEFAULT first — it's the simplest and should find libc.so's pthread_create in the process's global symbol table without any namespace games.	2026-04-09 13:15:47 +04:00
Siavash Sameni	f015be63ec	fix(android): use --wrap=pthread_create instead of raw symbol override. Build #10 failed with: ld.lld: error: duplicate symbol: pthread_create >>> defined at pthread_shim.c:30 >>> ... in archive libpthread_shim.a (the other definition coming from libstd's bundled libc.a stub) The raw-symbol-override approach was naive: when two static archives both define the same symbol the linker refuses instead of picking one. Switch to the `--wrap=pthread_create` linker mechanism (a GNU ld option that ld.lld also implements): - All `pthread_create` references get rewritten to `__wrap_pthread_create` - Our shim now defines `__wrap_pthread_create` (no symbol clash) - Inside the shim we `dlopen("libc.so")` + `dlsym("pthread_create")` to get the real runtime symbol directly, bypassing BOTH the broken static stub (libstd's libc.a copy) AND libstd's own pthread_create path - `__real_pthread_create` is deliberately NOT used — it would alias the same broken stub the wrap exists to avoid The wrap flag is emitted via `cargo:rustc-link-arg` in build.rs so it only affects the Android target (the Android-branch of build.rs is the only place that emits it). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 13:08:41 +04:00
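How a build script might emit the wrap flag only for Android can be sketched as follows. This is an illustration under assumptions, not the project's build.rs: the helper name `wrap_link_args` is hypothetical, while `cargo:rustc-link-arg`, `CARGO_CFG_TARGET_OS`, and the `-Wl,--wrap=pthread_create` argument are standard Cargo and linker features named in this log.

```rust
/// Return the `cargo:` directives that ask the linker to wrap each symbol,
/// or nothing when the target is not Android. `target_os` would normally
/// be read from the `CARGO_CFG_TARGET_OS` variable that Cargo sets for
/// build scripts; it describes the target, not the host running build.rs.
pub fn wrap_link_args(target_os: &str, symbols: &[&str]) -> Vec<String> {
    if target_os != "android" {
        return Vec::new();
    }
    symbols
        .iter()
        // With --wrap=SYM, undefined references to SYM resolve to
        // __wrap_SYM, and references to __real_SYM resolve to the
        // original SYM.
        .map(|sym| format!("cargo:rustc-link-arg=-Wl,--wrap={sym}"))
        .collect()
}
```

In a real build.rs, `main` would read `std::env::var("CARGO_CFG_TARGET_OS")` and `println!` each returned line, since Cargo picks up directives from the script's standard output.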
Siavash Sameni	79e876126c	fix(android): interpose pthread_create to bypass libstd's broken static stub. Builds #7, #8 and #9 all crashed at launch with the same SIGSEGV inside __init_tcb(bionic_tcb, pthread_internal_t)+4 called via pthread_create from std::sys::thread::unix::Thread::new. Digging further: the problem is NOT the final linker we pass to cargo. It's that rustup ships a PRE-COMPILED libstd for aarch64-linux-android which was built statically against an old NDK libc archive. That archive has a pthread_create stub which calls a static __init_tcb stub that assumes libc's static init path has set up the TCB — which never happens in a .so loaded via dlopen. Bumping minSdk to 26 or forcing the android26-clang linker (`903a07c`) doesn't rebuild libstd and therefore doesn't fix the bundled broken stub. The legacy wzp-android crate dodged this with a getauxval_fix.c shim that interposes getauxval via RTLD_NEXT. The same trick works for pthread_create here: define our own `int pthread_create(...)` in cpp/pthread_shim.c that forwards to `dlsym(RTLD_NEXT, "pthread_create")` — the real, fully working version exported from libc.so. The linker processes our static lib before libstd.rlib, so libstd's unresolved pthread_create reference binds to our symbol, and the broken libc.a stub inside libstd is never pulled in. build.rs compiles cpp/pthread_shim.c right after cpp/getauxval_fix.c so both symbol overrides are in place before any Rust code gets linked. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 13:04:18 +04:00
Siavash Sameni	903a07c1d4	fix(android): force API-26 NDK linker via docker env vars. The previous commit bumped minSdk from 24 to 26 in build.gradle.kts hoping tauri-cli would pick it up and use the android26-clang linker, but the crash recurred at exactly the same frame (__init_tcb via pthread_create via std::thread::spawn). That means tauri-cli is ignoring the gradle minSdk value and sticking with its hardcoded aarch64-linux-android24-clang. The android24 linker resolves __init_tcb against the broken static stub in libc.a (API 24 does NOT export __init_tcb as a dynamic symbol from libc.so — it only exists in the static archive, and the stub expects the TCB to be initialised by a running static init path, which never happens in a dlopen-loaded .so). Override the linker env vars directly in the docker run invocation for all four ABIs. These take precedence over anything tauri-cli or .cargo/config.toml might set. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 12:55:11 +04:00
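The per-ABI linker overrides in this commit follow Cargo's `CARGO_TARGET_<TRIPLE>_LINKER` convention (triple uppercased, dashes turned into underscores). The sketch below derives the four variable names and formats them as `docker run -e` arguments. It is an illustration only: the function names, the `ndk_bin` directory parameter, and the exact linker paths are assumptions, and the armv7 row reflects the usual mismatch between the Rust triple and the NDK clang prefix.

```rust
/// (Rust target triple, NDK clang prefix) for the four Android ABIs.
pub const ANDROID_TARGETS: [(&str, &str); 4] = [
    ("aarch64-linux-android", "aarch64-linux-android"),
    ("armv7-linux-androideabi", "armv7a-linux-androideabi"),
    ("x86_64-linux-android", "x86_64-linux-android"),
    ("i686-linux-android", "i686-linux-android"),
];

/// Name of the environment variable Cargo consults for a target's linker.
pub fn linker_env_var(triple: &str) -> String {
    format!(
        "CARGO_TARGET_{}_LINKER",
        triple.to_uppercase().replace('-', "_")
    )
}

/// Arguments to splice into a `docker run` command line: one `-e VAR=PATH`
/// pair per ABI, pointing at the clang driver for the given API level.
/// `ndk_bin` is a hypothetical path to the NDK toolchain's bin directory.
pub fn docker_env_flags(ndk_bin: &str, api: u32) -> Vec<String> {
    let mut flags = Vec::new();
    for (triple, prefix) in ANDROID_TARGETS {
        flags.push("-e".to_string());
        flags.push(format!(
            "{}={}/{}{}-clang",
            linker_env_var(triple),
            ndk_bin,
            prefix,
            api
        ));
    }
    flags
}
```

A later commit in this log reports that tauri-cli resets these variables before invoking cargo, so the sketch shows the attempted mechanism rather than a fix that held.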
Siavash Sameni	af20fa418a	fix(android): bump minSdk 24 -> 26 to avoid broken __init_tcb in NDK 24 stub. Build #7 crashed at launch on the Pixel 6 with SIGSEGV in __init_tcb / pthread_create called from tao::ndk_glue::create in WryActivity.onCreate: #00 __init_tcb(bionic_tcb, pthread_internal_t)+4 #01 pthread_create+360 #02 std::sys::thread::unix::Thread::new #04 tao::platform_impl::platform::ndk_glue::create #05 Java_com_wzp_desktop_WryActivity_create Tauri scaffolds build.gradle.kts with `minSdk = 24`, which makes the tauri-cli invoke `aarch64-linux-android24-clang` as the Rust linker. That linker transitively pulls broken static stubs from libc.a for getauxval, __init_tcb and pthread_create — these stubs only work in statically- linked executables because they read bionic state (__libc_auxv, TCB) that only the libc init path sets up. In a .so loaded via dlopen they SIGSEGV the moment anything spawns a thread. API 26+ has the real runtime symbols and the NDK-26 linker resolves them against libc.so instead of the static fallback. This is also the minimum Oboe supports. Patch the generated build.gradle.kts post-init to swap `minSdk = 24` for `minSdk = 26` — the legacy wzp-android crate solved the same issue with a .cargo/config.toml linker override plus a getauxval_fix.c shim. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 12:47:36 +04:00
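The post-init patch of build.gradle.kts amounts to a literal text substitution. A minimal sketch, assuming the generated file contains the exact string `minSdk = 24` as quoted in the commit; the function name is hypothetical, and the real patch is applied by the build script rather than by Rust code.

```rust
/// Swap the scaffolded `minSdk = 24` for `minSdk = 26`.
/// Returns `None` when the expected text is absent, so the caller can
/// fail loudly instead of silently building with the wrong API level.
pub fn patch_min_sdk(gradle_kts: &str) -> Option<String> {
    const OLD: &str = "minSdk = 24";
    const NEW: &str = "minSdk = 26";
    if gradle_kts.contains(OLD) {
        Some(gradle_kts.replace(OLD, NEW))
    } else {
        None
    }
}
```

Returning `None` on a miss matters here because a scaffold whose formatting changed would otherwise leave `minSdk = 24` in place unnoticed.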

455 Commits