fix(android-audio): revert to 96be740's Oboe config — VoiceCommunication broke callback drain

Build 8c36fb5 logs showed a new regression: Oboe playout cb#0 fires once
at startup then the callback STOPS DRAINING the ring entirely.
written_samples sticks at 7679 (= RING_CAPACITY - 1) across every recv
heartbeat in a 40-second test. Meanwhile the recv task decodes 1800+ real
audio frames (sample range up to [-27920..31907], rms 12065) which all
get dropped on the floor by audio_write_playout returning 0 because the
ring is full.
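
A minimal sketch of the failure mode described above, assuming a
single-producer ring that keeps one slot empty (which would explain why
written_samples tops out at RING_CAPACITY - 1). PlayoutRing, its methods,
and the 7680 capacity are illustrative assumptions inferred from the log
values, not the project's actual audio_write_playout implementation:

```rust
/// ASSUMPTION: inferred from the log value 7679 (= RING_CAPACITY - 1).
const RING_CAPACITY: usize = 7680;

/// Hypothetical stand-in for the playout ring, not the project's real type.
struct PlayoutRing {
    buf: Vec<i16>,
    read_idx: usize,  // advanced only by the consumer (the playout callback)
    write_idx: usize, // advanced only by the producer (the recv task)
}

impl PlayoutRing {
    fn new() -> Self {
        Self { buf: vec![0; RING_CAPACITY], read_idx: 0, write_idx: 0 }
    }

    /// Number of samples currently queued for playout.
    fn queued(&self) -> usize {
        (self.write_idx + RING_CAPACITY - self.read_idx) % RING_CAPACITY
    }

    /// Copies as many samples as fit and returns how many were accepted.
    /// One slot always stays empty so that "full" and "empty" differ,
    /// which caps the queue at RING_CAPACITY - 1 samples.
    fn write(&mut self, samples: &[i16]) -> usize {
        let free = RING_CAPACITY - 1 - self.queued();
        let n = samples.len().min(free);
        for &s in &samples[..n] {
            self.buf[self.write_idx] = s;
            self.write_idx = (self.write_idx + 1) % RING_CAPACITY;
        }
        n
    }

    /// Consumer side: drains up to out.len() samples, returns the count.
    fn read(&mut self, out: &mut [i16]) -> usize {
        let n = out.len().min(self.queued());
        for slot in &mut out[..n] {
            *slot = self.buf[self.read_idx];
            self.read_idx = (self.read_idx + 1) % RING_CAPACITY;
        }
        n
    }
}
```

In this model, if nothing ever calls read, the queue stops growing at 7679
samples and every later write returns 0, which is the same shape as the
stuck counter in the logs. A single read frees space and writes are
accepted again.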

Bisection: 96be740 (Usage::Media, no setAudioApi, no ContentType, no
MainActivity audio mode change) DID drive the playout callback at the
expected 50Hz (playout heartbeat: calls=1100 total_played_real=1055040
over 22 seconds). User still heard nothing there because of OS routing,
but at least Oboe accepted the PCM.
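
The heartbeat figures can be sanity-checked with a little arithmetic. The
sketch below assumes a 960-sample callback (20 ms of mono audio at 48 kHz);
that frame size is an inference from the numbers, not something the logs
state, and the helper names are hypothetical:

```rust
/// Figures quoted from the 96be740 playout heartbeat.
const CALLS: u64 = 1100;
const TOTAL_PLAYED_REAL: u64 = 1_055_040;
const WINDOW_SECS: u64 = 22;

/// ASSUMPTION: 960 samples per callback (20 ms at 48 kHz, mono).
const ASSUMED_FRAME_SAMPLES: u64 = 960;

/// Whole callbacks per second over the heartbeat window.
fn callbacks_per_second(calls: u64, secs: u64) -> u64 {
    calls / secs
}

/// Splits a sample total into (full frames, leftover samples).
fn split_into_frames(total_samples: u64, frame_samples: u64) -> (u64, u64) {
    (total_samples / frame_samples, total_samples % frame_samples)
}
```

callbacks_per_second(CALLS, WINDOW_SECS) gives the 50Hz quoted above. Under
the 960-sample assumption, split_into_frames(TOTAL_PLAYED_REAL,
ASSUMED_FRAME_SAMPLES) comes out to exactly 1099 full frames with no
remainder, which would be consistent with all but one of the 1100 callbacks
carrying real audio.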

8c36fb5 added four changes on top of 96be740:
  1. Oboe Usage::Media → Usage::VoiceCommunication
  2. Oboe setAudioApi(oboe::AudioApi::AAudio) explicit
  3. Oboe setContentType(ContentType::Speech)
  4. MainActivity setMode(MODE_IN_COMMUNICATION) + setSpeakerphoneOn(true)
Every one of those could have killed the callback; combined they did.
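
If the culprit needs isolating later, one option is to re-apply the changes
one at a time on top of the 96be740 baseline. A rough sketch of that test
matrix follows; AudioConfig and its field names are hypothetical labels for
the changes listed above, not types from this codebase:

```rust
/// Hypothetical model of the toggles 8c36fb5 flipped (illustrative names).
#[derive(Clone, Copy, Debug, PartialEq)]
struct AudioConfig {
    voice_communication_usage: bool, // 1. Usage::VoiceCommunication, not Usage::Media
    explicit_aaudio: bool,           // 2. setAudioApi(oboe::AudioApi::AAudio)
    speech_content_type: bool,       // 3. setContentType(ContentType::Speech)
    in_communication_mode: bool,     // 4. MODE_IN_COMMUNICATION + speakerphone
}

/// The known-good 96be740 state: none of the changes applied.
const BASELINE_96BE740: AudioConfig = AudioConfig {
    voice_communication_usage: false,
    explicit_aaudio: false,
    speech_content_type: false,
    in_communication_mode: false,
};

impl AudioConfig {
    /// How many of the changes this config applies.
    fn changed_count(&self) -> usize {
        [
            self.voice_communication_usage,
            self.explicit_aaudio,
            self.speech_content_type,
            self.in_communication_mode,
        ]
        .iter()
        .filter(|&&on| on)
        .count()
    }
}

/// One candidate build per change, each differing from `base` in one field.
fn single_change_variants(base: AudioConfig) -> [AudioConfig; 4] {
    [
        AudioConfig { voice_communication_usage: true, ..base },
        AudioConfig { explicit_aaudio: true, ..base },
        AudioConfig { speech_content_type: true, ..base },
        AudioConfig { in_communication_mode: true, ..base },
    ]
}
```

Each variant would be a separate build whose playout heartbeat is checked
for the expected 50Hz callback rate.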

Revert to 96be740's exact Oboe config: Usage::Media, no setAudioApi, no
ContentType. Keep the PCM recorder, heartbeat logging, and stream-open
logging. Separately, MainActivity now maxes STREAM_MUSIC (the stream
Usage::Media routes to) but leaves audio mode in MODE_NORMAL — no more
speakerphone/call-mode combo that makes Oboe unhappy. In NORMAL mode a
STREAM_MUSIC stream plays through the loud speaker by default.

Evidence that the Rust pipeline is sound: decoded.pcm recorded in 8c36fb5
was pulled via `adb shell run-as com.wzp.desktop cat .wzp/decoded.pcm`,
converted with ffmpeg, and played back on the Mac, where the user confirmed
audible speech. So the remaining bug surface is entirely Android audio
routing, not anything in the Rust/C++ decode path.
Siavash Sameni
2026-04-09 21:38:19 +04:00
parent 8c36fb5651
commit da106bd939
2 changed files with 36 additions and 37 deletions

@@ -57,40 +57,35 @@ class MainActivity : TauriActivity() {
     }
     /**
-     * Put the phone into VoIP-call audio mode so that the Oboe playout stream
-     * (opened with Usage::VoiceCommunication) actually routes to the loud
-     * speaker and uses the in-call volume slider. Without this, the stream is
-     * accepted by AAudio, the callback is driven at realtime with valid PCM,
-     * and nothing is audible because the OS routes the stream to a muted or
-     * unavailable output. See build 96be740's logcat for the full proof:
-     * playout callback played 1055040 samples in 22s with RMS up to 2318 and
-     * still produced zero audible output, which was the smoking gun pointing
-     * at this AudioManager state rather than the Rust pipeline.
+     * Max out STREAM_MUSIC so the Oboe playout stream (opened with
+     * Usage::Media, which routes to STREAM_MUSIC) is actually audible.
      *
-     * This is a temporary "call mode always on" setup — fine for smoke tests
-     * and the current single-purpose VoIP app. A polished version should
-     * setMode(IN_COMMUNICATION) only while a call is active and restore
-     * MODE_NORMAL on hangup, with proper audio-focus requests.
+     * DELIBERATELY does NOT call setMode(IN_COMMUNICATION) or
+     * setSpeakerphoneOn: build 8c36fb5 confirmed that combining those with
+     * Usage::Media OR with Usage::VoiceCommunication (both tried) broke the
+     * Oboe playout callback entirely — the ring filled once at startup and
+     * Oboe stopped draining it. Keeping audio mode in MODE_NORMAL so the
+     * Media stream follows the normal speaker-output path, controlled by
+     * the media volume slider.
+     *
+     * A polished version of the app will setMode/setSpeakerphoneOn on a
+     * per-call basis once we've figured out the correct combo with AAudio.
      */
     private fun configureAudioForCall() {
         try {
             val am = getSystemService(Context.AUDIO_SERVICE) as AudioManager
-            Log.i(TAG, "audio mode before: ${am.mode} speaker=${am.isSpeakerphoneOn} " +
+            Log.i(TAG, "audio state before: mode=${am.mode} speaker=${am.isSpeakerphoneOn} " +
                 "voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/" +
                 "${am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)} " +
                 "musicVol=${am.getStreamVolume(AudioManager.STREAM_MUSIC)}/" +
                 "${am.getStreamMaxVolume(AudioManager.STREAM_MUSIC)}")
-            am.mode = AudioManager.MODE_IN_COMMUNICATION
-            am.isSpeakerphoneOn = true
+            // Crank media volume to max — STREAM_MUSIC is what Usage::Media
+            // plays through. User can adjust with hardware volume buttons.
+            val maxMusic = am.getStreamMaxVolume(AudioManager.STREAM_MUSIC)
+            am.setStreamVolume(AudioManager.STREAM_MUSIC, maxMusic, 0)
-            // Nudge volumes to max so the smoke test can actually hear something.
-            // Users can adjust with the hardware volume buttons afterwards.
-            val maxVoice = am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)
-            am.setStreamVolume(AudioManager.STREAM_VOICE_CALL, maxVoice, 0)
-            Log.i(TAG, "audio mode after: ${am.mode} speaker=${am.isSpeakerphoneOn} " +
-                "voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/$maxVoice")
+            Log.i(TAG, "audio state after: mode=${am.mode} musicVol=${am.getStreamVolume(AudioManager.STREAM_MUSIC)}/$maxMusic")
         } catch (e: Throwable) {
             Log.e(TAG, "configureAudioForCall failed: ${e.message}", e)
         }