fix(android-audio): VoIP mode + speakerphone + debug PCM recorder
Some checks failed
Mirror to GitHub / mirror (push) Failing after 40s
Build Release Binaries / build-amd64 (push) Has been cancelled

Build 96be740 logs proved the entire software pipeline is healthy:
  capture heartbeat:   calls=1100 to_write=960 full_drops=0 total_written=1056000
  recv heartbeat:      decoded_frames=1035 last_written=960 decode_errs=0
  recv decoded PCM:    range=[-13564..9244] rms=8044 (real audio)
  playout WRITE:       in_len=960 written=960 rms=2318 (real audio into the ring)
  playout heartbeat:   calls=1100 nonempty=1099 total_played_real=1055040

1055040 samples / 48000 Hz = 22s — exactly matches wall-clock elapsed,
meaning Oboe IS calling our playout callback at the expected rate and
WE ARE handing it real PCM every 20ms. User still heard nothing. Ergo
Oboe accepted the PCM and routed it to a silent output. Two fixes:

1) MainActivity.kt: switch to MODE_IN_COMMUNICATION + speakerphone ON
   right after permissions are granted, and crank STREAM_VOICE_CALL to
   max. Without this, an Oboe Usage::VoiceCommunication stream gets
   opened, the OS creates a real AAudio pipeline, the callback fires on
   schedule — and audio goes to either the earpiece at muted volume or
   a "call not active" dead end. Logs the audio mode + volume levels
   before and after the switch so we can confirm the state change in
   logcat next run.

2) oboe_bridge.cpp: revert Usage::Media → VoiceCommunication (the mode
   that matches MODE_IN_COMMUNICATION), pin the audio API to AAudio
   explicitly instead of letting Oboe fall back to OpenSLES (which has
   its own silent-drop failure modes on some devices), and add getState
   + getXRunCount to the playout heartbeat so we'll see silent stream
   disconnects instead of reading zeros forever.

3) engine.rs recv task: dump the first ~10s of post-AGC decoded PCM to
   `<app_data_dir>/decoded.pcm` as raw i16 LE so we can adb pull it and
   play it back locally:
       adb shell run-as com.wzp.desktop cat .wzp/decoded.pcm > decoded.pcm
       ffmpeg -f s16le -ar 48000 -ac 1 -i decoded.pcm decoded.wav
   This divorces "is our decoder actually producing audible audio" from
   "is Android's audio stack playing it". If the recorded WAV sounds
   correct when played on a laptop, the decoder is fine and 100% of the
   remaining bug surface is AudioManager / Oboe routing.

4) engine.rs: also log when spk_muted=true blocks the write. User
   reported the Speaker button in the UI has inconsistent semantics
   between desktop and android — adding this log rules out the accidental
   "first click muted playback" theory for good.
This commit is contained in:
Siavash Sameni
2026-04-09 21:24:26 +04:00
parent 96be740fd9
commit cfa9ff67cf
3 changed files with 124 additions and 15 deletions

View File

@@ -1,7 +1,9 @@
package com.wzp.desktop
import android.Manifest
import android.content.Context
import android.content.pm.PackageManager
import android.media.AudioManager
import android.os.Bundle
import android.util.Log
import androidx.activity.enableEdgeToEdge
@@ -25,8 +27,7 @@ class MainActivity : TauriActivity() {
// Request RECORD_AUDIO early so Oboe (inside libwzp_native.so) can open
// the AAudio input stream without silently failing. The grant is
// persisted, so after the first launch the dialog no longer appears.
// MODIFY_AUDIO_SETTINGS is requested alongside because Oboe toggles the
// audio mode to communication on some devices.
// MODIFY_AUDIO_SETTINGS is needed to switch AudioManager mode + speaker.
val needsRequest = REQUIRED_AUDIO_PERMISSIONS.any {
ContextCompat.checkSelfPermission(this, it) != PackageManager.PERMISSION_GRANTED
}
@@ -35,6 +36,7 @@ class MainActivity : TauriActivity() {
ActivityCompat.requestPermissions(this, REQUIRED_AUDIO_PERMISSIONS, AUDIO_PERMISSIONS_REQUEST)
} else {
Log.i(TAG, "audio permissions already granted")
configureAudioForCall()
}
}
@@ -48,6 +50,49 @@ class MainActivity : TauriActivity() {
val allGranted = grantResults.isNotEmpty() &&
grantResults.all { it == PackageManager.PERMISSION_GRANTED }
Log.i(TAG, "audio permissions result: allGranted=$allGranted grants=${grantResults.toList()}")
if (allGranted) {
configureAudioForCall()
}
}
}
/**
* Put the phone into VoIP-call audio mode so that the Oboe playout stream
* (opened with Usage::VoiceCommunication) actually routes to the loud
* speaker and uses the in-call volume slider. Without this, the stream is
* accepted by AAudio, the callback is driven at realtime with valid PCM,
* and nothing is audible because the OS routes the stream to a muted or
* unavailable output. See build 96be740's logcat for the full proof:
* playout callback played 1055040 samples in 22s with RMS up to 2318 and
* still produced zero audible output, which was the smoking gun pointing
* at this AudioManager state rather than the Rust pipeline.
*
* This is a temporary "call mode always on" setup — fine for smoke tests
* and the current single-purpose VoIP app. A polished version should
* setMode(IN_COMMUNICATION) only while a call is active and restore
* MODE_NORMAL on hangup, with proper audio-focus requests.
*/
private fun configureAudioForCall() {
try {
val am = getSystemService(Context.AUDIO_SERVICE) as AudioManager
Log.i(TAG, "audio mode before: ${am.mode} speaker=${am.isSpeakerphoneOn} " +
"voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/" +
"${am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)} " +
"musicVol=${am.getStreamVolume(AudioManager.STREAM_MUSIC)}/" +
"${am.getStreamMaxVolume(AudioManager.STREAM_MUSIC)}")
am.mode = AudioManager.MODE_IN_COMMUNICATION
am.isSpeakerphoneOn = true
// Nudge volumes to max so the smoke test can actually hear something.
// Users can adjust with the hardware volume buttons afterwards.
val maxVoice = am.getStreamMaxVolume(AudioManager.STREAM_VOICE_CALL)
am.setStreamVolume(AudioManager.STREAM_VOICE_CALL, maxVoice, 0)
Log.i(TAG, "audio mode after: ${am.mode} speaker=${am.isSpeakerphoneOn} " +
"voiceVol=${am.getStreamVolume(AudioManager.STREAM_VOICE_CALL)}/$maxVoice")
} catch (e: Throwable) {
Log.e(TAG, "configureAudioForCall failed: ${e.message}", e)
}
}
}