Adds gold-standard Linux echo cancellation: in-app WebRTC AEC3 (Audio
Processing Module) via the webrtc-audio-processing crate, using the same
algorithm as Chrome WebRTC, Zoom, Teams, and Jitsi. Runs entirely
in-process, so it works identically on ALSA / PulseAudio / PipeWire
systems — no dependency on user-configured echo-cancel modules.

Architecture:
- New crates/wzp-client/src/audio_linux_aec.rs module (~470 lines).
  Contains LinuxAecCapture and LinuxAecPlayback, both using CPAL under
  the hood but routing samples through a shared
  Arc<webrtc_audio_processing::Processor>. The playback path tees each
  20 ms frame into APM.process_render_frame as the echo reference BEFORE
  handing the samples to CPAL's output callback. The capture path runs
  APM.process_capture_frame on each mic frame in place before pushing to
  the audio ring buffer. This is the "tee the playback ring" approach
  that Zoom/Teams/Jitsi use.
- New `linux-aec` feature in wzp-client pulling in the
  webrtc-audio-processing crate at v2.x with the `bundled` sub-feature.
  Bundled means the vendored PulseAudio WebRTC C++ sources are statically
  compiled via meson+ninja at cargo build time — no runtime .so
  dependency, avoids Debian Bookworm's stale
  libwebrtc-audio-processing-dev 0.3 package (which predates AEC3). Dep
  is target-gated to Linux, so enabling the feature on non-Linux is a
  no-op.
- lib.rs re-exports LinuxAecCapture/LinuxAecPlayback as
  AudioCapture/AudioPlayback when `linux-aec` is on, otherwise falls back
  to the CPAL audio_io path. Shared public API (start/ring/stop/Drop)
  means downstream code is unchanged.
- New `linux-aec` feature in wzp-desktop forwards to
  wzp-client/linux-aec so
  `cargo tauri build -- --features wzp-desktop/linux-aec` builds the AEC
  variant.
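The render-side tee described in the architecture notes can be pictured with a small sketch. It is an illustration under loud assumptions, not the code in audio_linux_aec.rs: `ToyCanceller`, `SharedApm`, `play_frame`, and `capture_frame` are hypothetical names, the "cancellation" is a plain subtraction rather than AEC3, and the real webrtc-audio-processing API is not used.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the shared processor. It only remembers the
/// last render frame and subtracts it from the capture frame; real AEC3
/// models delay and room response, which this toy does not.
struct ToyCanceller {
    reference: Vec<f32>,
}

impl ToyCanceller {
    /// Render side: store the far-end frame as the echo reference.
    fn process_render(&mut self, frame: &[f32]) {
        self.reference.clear();
        self.reference.extend_from_slice(frame);
    }

    /// Capture side: remove the stored reference from the mic frame in place.
    fn process_capture(&mut self, frame: &mut [f32]) {
        for (sample, echo) in frame.iter_mut().zip(&self.reference) {
            *sample -= *echo;
        }
    }
}

/// One processor instance shared by the playback and capture paths.
type SharedApm = Arc<Mutex<ToyCanceller>>;

/// Playback path: tee the frame into the processor BEFORE it goes to the
/// output device (modelled here as a Vec).
fn play_frame(apm: &SharedApm, frame: &[f32], device_out: &mut Vec<f32>) {
    apm.lock().unwrap().process_render(frame);
    device_out.extend_from_slice(frame);
}

/// Capture path: clean the mic frame in place, then push it to the ring
/// buffer (also modelled as a Vec).
fn capture_frame(apm: &SharedApm, mic: &mut [f32], ring: &mut Vec<f32>) {
    apm.lock().unwrap().process_capture(mic);
    ring.extend_from_slice(mic);
}

fn main() {
    let apm: SharedApm = Arc::new(Mutex::new(ToyCanceller { reference: Vec::new() }));
    let playback_apm = Arc::clone(&apm);
    let capture_apm = Arc::clone(&apm);

    // Far-end audio goes to the speaker and, via the tee, to the processor.
    let far_end = vec![0.5f32; 4];
    let mut device_out = Vec::new();
    play_frame(&playback_apm, &far_end, &mut device_out);

    // The mic hears near-end voice (0.25) plus the speaker echo (0.5).
    let mut mic = vec![0.75f32; 4];
    let mut ring = Vec::new();
    capture_frame(&capture_apm, &mut mic, &mut ring);

    println!("{ring:?}"); // prints [0.25, 0.25, 0.25, 0.25]
}
```

The point of the sketch is the ordering: the reference reaches the processor before the samples reach the device, so the capture side always has something to cancel against.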
APM configuration:
- EchoCancellation: High suppression, delay-agnostic mode on, extended
  filter on, stream_delay_ms=60 initial hint
- NoiseSuppression: High
- HighPassFilter: on
- AGC: off (can fight Opus encoder's own gain staging + adaptive quality
  controller; add later if users report low mic level)

Frame size handling:
- Pipeline uses 20 ms frames (960 samples @ 48 kHz mono)
- APM requires strict 10 ms (480 samples) per call
- Each 20 ms frame is split into two 480-sample halves, APM called
  twice, halves stitched back
- Same pattern for render and capture sides
- Carry-buffer logic handles the case where CPAL delivers samples in
  arbitrary chunk sizes that don't divide 960

Build infrastructure:
- scripts/Dockerfile.linux-desktop-builder adds meson, ninja-build,
  python3, clang for the webrtc-audio-processing bundled build
- scripts/build-linux-desktop-docker.sh takes a new --aec flag that
  enables the linux-aec feature and renames the output artifacts with an
  `-aec` suffix so noAEC and AEC variants can coexist on disk

Task #30.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
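The frame-size handling listed in the commit message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the logic in audio_linux_aec.rs: `FrameChunker` and `process_20ms` are hypothetical names, and a closure stands in for the APM call instead of the real webrtc-audio-processing API.

```rust
/// Samples per 20 ms pipeline frame at 48 kHz mono (from the commit message).
const FRAME_SAMPLES: usize = 960;
/// Samples per 10 ms APM call at 48 kHz mono.
const APM_SAMPLES: usize = 480;

/// Hypothetical carry buffer: accepts arbitrary-sized chunks from the audio
/// callback and emits complete 960-sample frames.
struct FrameChunker {
    carry: Vec<f32>,
}

impl FrameChunker {
    fn new() -> Self {
        Self { carry: Vec::with_capacity(FRAME_SAMPLES * 2) }
    }

    /// Appends `chunk` and calls `on_frame` once per complete 20 ms frame.
    /// Leftover samples stay in `carry` until the next call.
    fn push(&mut self, chunk: &[f32], mut on_frame: impl FnMut(&mut [f32])) {
        self.carry.extend_from_slice(chunk);
        let mut consumed = 0;
        while self.carry.len() - consumed >= FRAME_SAMPLES {
            on_frame(&mut self.carry[consumed..consumed + FRAME_SAMPLES]);
            consumed += FRAME_SAMPLES;
        }
        self.carry.drain(..consumed);
    }
}

/// Splits one 20 ms frame into two 10 ms halves and runs `apm_10ms` on each
/// in place, so the caller gets back a single processed 960-sample frame.
fn process_20ms(frame: &mut [f32], mut apm_10ms: impl FnMut(&mut [f32])) {
    assert_eq!(frame.len(), FRAME_SAMPLES);
    for half in frame.chunks_exact_mut(APM_SAMPLES) {
        apm_10ms(half);
    }
}

fn main() {
    let mut chunker = FrameChunker::new();
    let mut apm_calls = 0usize;
    let mut frames = 0usize;

    // Simulate a device delivering 441-sample chunks, which do not divide 960.
    for _ in 0..10 {
        let chunk = vec![0.5f32; 441];
        chunker.push(&chunk, |frame| {
            process_20ms(frame, |half| {
                // Stand-in for the APM call: halve the amplitude.
                for sample in half.iter_mut() {
                    *sample *= 0.5;
                }
                apm_calls += 1;
            });
            frames += 1;
        });
    }

    // 4410 samples in: 4 full frames, 8 APM calls, 570 samples carried over.
    println!("frames={frames} apm_calls={apm_calls} carried={}", chunker.carry.len());
}
```

Because each half is processed in place inside the same 960-sample slice, no explicit stitching step is needed in this sketch; the same shape would apply to both the render and capture sides.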
83 lines
3.5 KiB
Rust
//! WarzonePhone Client Library
//!
//! End-to-end voice call pipeline:
//! - **Send**: mic → encode (Opus/Codec2) → FEC → encrypt → QUIC DATAGRAM
//! - **Recv**: QUIC DATAGRAM → decrypt → FEC decode → jitter buffer → decode → speaker
//!
//! Targets: Android (JNI), Windows desktop, macOS/Linux (testing)

#[cfg(feature = "audio")]
pub mod audio_io;
#[cfg(feature = "audio")]
pub mod audio_ring;
// VoiceProcessingIO is an Apple Core Audio API — only compile the module
// when the `vpio` feature is on AND we're targeting macOS. Enabling the
// feature on Windows/Linux was previously silently broken.
#[cfg(all(feature = "vpio", target_os = "macos"))]
pub mod audio_vpio;
// WASAPI-direct capture with Windows's OS-level AEC (AudioCategory_Communications).
// Only compiled when `windows-aec` feature is on AND target is Windows. The
// `windows` dependency is itself gated to Windows in Cargo.toml, so enabling
// this feature on non-Windows targets is a no-op.
#[cfg(all(feature = "windows-aec", target_os = "windows"))]
pub mod audio_wasapi;
// WebRTC AEC3 (Audio Processing Module) wrapper around CPAL capture + playback
// on Linux. Only compiled when `linux-aec` feature is on AND target is Linux.
// The webrtc-audio-processing dep is itself gated to Linux in Cargo.toml.
#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub mod audio_linux_aec;
pub mod bench;
pub mod call;
pub mod drift_test;
pub mod echo_test;
pub mod featherchat;
pub mod handshake;
pub mod metrics;
pub mod sweep;

// AudioPlayback: two possible backends depending on feature flags.
//   1. Default CPAL (`audio_io::AudioPlayback`) — baseline on every platform.
//   2. Linux AEC (`audio_linux_aec::LinuxAecPlayback`) — CPAL + WebRTC APM
//      render-side tee, so echo from speakers gets cancelled from the mic.
//
// On macOS and Windows we always use the default CPAL playback because:
//   - macOS: VoiceProcessingIO handles AEC at the capture side (Apple's
//     native hardware AEC uses its own reference signal handling).
//   - Windows: WASAPI AudioCategory_Communications AEC uses the system
//     render mix as reference — no per-process plumbing needed.
//
// Linux is the only platform where the in-app approach is necessary, so
// the AEC playback path is gated to target_os = "linux".

#[cfg(all(
    feature = "audio",
    any(not(feature = "linux-aec"), not(target_os = "linux"))
))]
pub use audio_io::AudioPlayback;

#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub use audio_linux_aec::LinuxAecPlayback as AudioPlayback;

// AudioCapture: three possible backends depending on feature flags.
//   1. Default CPAL (`audio_io::AudioCapture`) — baseline on every platform.
//   2. Windows AEC (`audio_wasapi::WasapiAudioCapture`) — direct WASAPI
//      with AudioCategory_Communications, OS APO chain does AEC.
//   3. Linux AEC (`audio_linux_aec::LinuxAecCapture`) — CPAL + WebRTC APM
//      capture-side echo cancellation using the playback tee as reference.
// All three expose the same public API (`start`, `ring`, `stop`, `Drop`).

#[cfg(all(
    feature = "audio",
    any(not(feature = "windows-aec"), not(target_os = "windows")),
    any(not(feature = "linux-aec"), not(target_os = "linux"))
))]
pub use audio_io::AudioCapture;

#[cfg(all(feature = "windows-aec", target_os = "windows"))]
pub use audio_wasapi::WasapiAudioCapture as AudioCapture;

#[cfg(all(feature = "linux-aec", target_os = "linux"))]
pub use audio_linux_aec::LinuxAecCapture as AudioCapture;
pub use call::{CallConfig, CallDecoder, CallEncoder};
pub use handshake::perform_handshake;