Adds a direct WASAPI microphone capture path for the Windows desktop
build that opens the default communications endpoint via
IMMDeviceEnumerator -> IAudioClient2 -> SetClientProperties with
AudioCategory_Communications, turning on Windows's communications
audio processing chain (AEC, noise suppression, automatic gain
control). The communications AEC operates at the OS level and uses
the system render mix as the reference signal, so echo from our
existing CPAL playback stream is cancelled automatically with no
per-process reference plumbing.
Architecture:
- New crates/wzp-client/src/audio_wasapi.rs module (~280 lines).
Event-driven capture loop on a dedicated thread; pushes PCM into
the same lock-free AudioRing used by the CPAL path. Same public
API as audio_io::AudioCapture so downstream code is unchanged.
- New `windows-aec` feature in wzp-client that pulls in the
`windows` crate (Microsoft's official Rust COM bindings) gated to
target_os = "windows" only. Enabling the feature on non-Windows
targets is a no-op since both the module and the dep are
cfg(target_os = "windows").
- lib.rs re-exports WasapiAudioCapture as AudioCapture when the
feature is on, otherwise falls back to the CPAL AudioCapture.
AudioPlayback is always the CPAL one — no reason to swap it.
- desktop/src-tauri/Cargo.toml Windows target enables the new
feature: `features = ["audio", "windows-aec"]`.
Implementation notes:
- Uses eCommunications role (not eConsole) for GetDefaultAudioEndpoint
— the user-configured "communications" device that Teams/Zoom
pick up, and the one Windows's AEC is tuned for.
- Requests 48 kHz mono i16 with AUDCLNT_STREAMFLAGS_AUTOCONVERTPCM +
SRC_DEFAULT_QUALITY so Windows handles any format conversion in
the audio engine instead of rejecting our format.
- Event-driven with SetEventHandle / WaitForSingleObject — no
polling, minimal CPU cost between packets.
- 200 ms wait timeout so the capture thread polls `running` often
enough for Drop to stop cleanly even if the audio engine stalls
(e.g. device unplug).
Task #24.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cargo-xwin drives the Windows MSVC cross-compile via clang-cl, under
which CMake sets MSVC=1 — causing libopus 1.3.1's `if(NOT MSVC)` guards
to skip the per-file `-msse4.1` / `-mssse3` COMPILE_FLAGS that its x86
SIMD source files need. Clang-cl (unlike real cl.exe) still honors
Clang's target-feature system, so those files then fail to compile
with "always_inline function '_mm_cvtepi16_epi32' requires target
feature 'sse4.1'" errors across silk/NSQ_sse4_1.c, NSQ_del_dec_sse4_1.c,
and VQ_WMat_EC_sse4_1.c.
Earlier attempts to fix this downstream (cargo-xwin toolchain file,
override.cmake CMAKE_C_COMPILE_OBJECT <FLAGS> replace, CFLAGS env vars)
all failed because cargo-xwin rewrites override.cmake from scratch on
every `cargo xwin build` invocation and cmake-rs's -DCMAKE_C_FLAGS=
assembly happens before toolchain FORCE sets propagate.
Fixing it upstream at the source: vendor audiopus_sys 0.2.2 into
vendor/audiopus_sys, patch its bundled opus/CMakeLists.txt to introduce
an MSVC_CL var (true only when CMAKE_C_COMPILER_ID == "MSVC", i.e. real
cl.exe), and flip the eight `if(NOT MSVC)` SIMD guards to
`if(NOT MSVC_CL)`. Clang-cl then gets the GCC-style per-file flags and
the SSE4.1 sources build cleanly. Also flip the `if(MSVC)` global /arch
block at line 445 to `if(MSVC_CL)` so only cl.exe applies /arch:AVX and
clang-cl relies purely on per-file flags (no global/per-file mixing).
Wire via [patch.crates-io] in the workspace root Cargo.toml; the patch
is resolved relative to the workspace root as `vendor/audiopus_sys`.
Upstream context: xiph/opus#256, xiph/opus PR #257 (both stale).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
First step of the Windows x86_64 desktop build: stop pulling
coreaudio-rs into the Windows dependency graph so the project can at
least run `cargo check --target x86_64-pc-windows-msvc`. Software AEC
is already disabled in engine.rs so there's nothing else to stub — the
macOS-specific VPIO path is skipped via #[cfg(target_os = "macos")] on
both sides and Windows falls through to the plain CPAL
AudioCapture/AudioPlayback branch that already existed.
crates/wzp-client/Cargo.toml
- coreaudio-rs optional dep moved under [target.'cfg(target_os = "macos")']
- `vpio` feature now uses `dep:coreaudio-rs` syntax and the gated dep
- Enabling `vpio` on Windows/Linux is a no-op at resolution time
crates/wzp-client/src/lib.rs
- `pub mod audio_vpio` is now #[cfg(all(feature = "vpio", target_os = "macos"))]
- Previously `vpio` alone was enough to try to compile the Core Audio
bindings, which would fail on non-Apple targets the moment the
feature flag was flipped on
desktop/src-tauri/Cargo.toml
- [target.'cfg(not(target_os = "android"))'] removed — was leaking
vpio into Windows/Linux via the catch-all.
- macOS: wzp-client with features = ["audio", "vpio"]
- Windows: wzp-client with features = ["audio"]
- Linux: wzp-client with features = ["audio"]
- Android: wzp-client with default-features = false (unchanged)
- Dropped the unused direct coreaudio-rs = "0.11" dep on macOS —
wzp-desktop's own sources never call Core Audio directly.
Verified via `cargo tree --target x86_64-pc-windows-msvc -p wzp-desktop`
that the Windows target now resolves wzp-client with cpal but without
coreaudio-rs. macOS target still resolves with coreaudio (direct via
vpio feature and transitively via cpal). macOS `cargo check` still
builds cleanly.
Cross-compile from macOS hit a cargo-xwin + llvm-lib setup issue in
ring's build.rs, so the actual `cargo check --target
x86_64-pc-windows-msvc` did not complete locally. Build verification
belongs on the user's Windows x86_64 host where MSVC is present
natively.
See tasks #23 (this one), #24 (Voice Capture DSP / WASAPI Communications
for OS-level AEC on Windows), and #25 (aarch64-pc-windows-msvc support).
Replaces the Android-side CallEngine::start() stub with a real implementation
that mirrors the desktop start() body but routes all PCM through the
standalone wzp-native cdylib loaded at startup via libloading instead of
using CPAL.
- desktop/src-tauri/src/wzp_native.rs: new module with a static
OnceLock<libloading::Library> + cached raw fn pointers for every symbol
we need (version, hello, audio_start/stop, read_capture, write_playout,
is_running, capture/playout_latency_ms). init() resolves everything once
at startup; accessors return default values if init() never ran.
- desktop/src-tauri/src/lib.rs: drop the inline dlopen smoke test, add
`mod wzp_native;` behind target_os="android", and invoke
wzp_native::init() from the Tauri setup() callback so the library is
loaded + all symbols cached before any CallEngine can touch audio.
- desktop/src-tauri/src/engine.rs: the Android #[cfg] branch of
CallEngine::start() now does the full QUIC handshake + signal loop +
Opus send/recv tasks, calling wzp_native::audio_start() /
audio_read_capture() / audio_write_playout() instead of the desktop
CPAL rings. SyncWrapper now holds a placeholder Box<()> on Android
because the audio backend lives in a process-global singleton inside
libwzp_native.so rather than being owned per-engine.
Next step: build #39 on the remote docker builder and smoke-test on
Pixel 6 that the Connect button in the UI successfully brings up Oboe
and streams audio through the dlopen boundary.
Wire AdaptiveQualityController into Android engine for auto codec
switching based on network quality reports. Add color-coded TX/RX
codec badges to the in-call screen showing active codecs and Auto mode.
- Recv task: ingest QualityReports, feed to controller, signal profile
changes via AtomicU8 to send task
- Send task: check for pending profile switch at frame boundaries,
update encoder/FEC/frame size
- Track peer codec from incoming packet headers
- Kotlin UI: codec badges (blue=studio, green=good, amber=degraded,
red=catastrophic) with Auto tag
- Add .taskmaster to .gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Relay identity:
- Stored in ~/.wzp/relay-identity (hex-encoded 32-byte seed)
- Generated on first run, reused on restart
- Fingerprint stays consistent across relay restarts
Linux build script (scripts/build-linux-notify.sh):
- Fire and forget: Hetzner VM → build all binaries → upload to rustypaste → ntfy notify → destroy VM
- Builds: wzp-relay, wzp-client, wzp-client-audio, wzp-web, wzp-bench
- Packages as tar.gz, uploads to rustypaste
- --keep flag to preserve VM
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ping was a static JNI method that loaded the .so before nativeInit,
crashing jemalloc. Now ping is an instance method on WzpEngine:
- Engine is created once (nativeInit), reused for both ping and call
- pingRelay() uses same tokio runtime pattern as startCall()
- Auto-pings all servers on app launch (after engine init)
- No process restart needed
- TOFU fingerprints saved on first successful ping
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- New wzp-android crate with Oboe C++ backend, lock-free SPSC ring buffers,
engine orchestrator, codec pipeline, and Android Gradle project structure
- AEC (NLMS adaptive filter), AGC (two-stage with fast attack/slow release),
windowed-sinc FIR resampler replacing linear interpolation (wzp-codec)
- Opus encoder tuning: complexity 7 default, set_expected_loss support
- Mobile jitter buffer: asymmetric EMA (fast up/slow down), handoff spike
detection with 2s cooldown, configurable safety margin
- Network-aware quality control: cellular-specific thresholds, faster
downgrade on cellular, proactive tier drop on WiFi→cellular handoff,
FEC ratio boost during network transitions
- Handoff detection in PathMonitor via RTT jitter spike analysis
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PresenceRegistry tracks who is connected where:
- register_local/unregister_local for directly connected users
- update_peer for fingerprints reported by peer relays
- lookup returns Local or Remote(addr)
- expire_stale removes entries older than timeout
Gossip via probe connections:
- New SignalMessage::PresenceUpdate { fingerprints, relay_addr }
- Probes send local fingerprints every 10s alongside Ping/Pong
- Receiving relay updates its remote presence table
HTTP API on metrics port:
- GET /presence — all known fingerprints + locations
- GET /presence/:fingerprint — single lookup
- GET /peers — peer relays + their connected users
Wired into relay main:
- Registry created at startup
- register_local after auth+handshake
- unregister_local on disconnect
- Passed to probe mesh and metrics server
Also marks FC-10 as DONE in integration tracker.
48 relay tests + 42 proto tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
T6 wiring: Trunking in relay hot path
- TrunkedForwarder wraps transport with TrunkBatcher
- run_participant uses 5ms flush timer when trunking enabled
- send_trunk/recv_trunk on QuinnTransport
- --trunking flag on relay config
- 2 new tests: forwarder batches, auto-flush on full
T7 wiring: Mini-frames in encoder/decoder
- MediaPacket::encode_compact/decode_compact with MiniFrameContext
- CallEncoder sends mini-headers for consecutive frames (full every 50th)
- CallDecoder auto-detects full vs mini on receive
- mini_frames_enabled in CallConfig (default true)
- 3 new tests: encode/decode sequence, periodic full, disabled mode
Noise suppression (nnnoiseless/RNNoise)
- NoiseSupressor in wzp-codec: pure Rust ML-based noise removal
- Processes 960-sample frames as two 480-sample halves
- Integrated in CallEncoder before silence detection
- noise_suppression in CallConfig (default true)
- 4 new tests: creation, processing, SNR improvement, passthrough
T1-S4: Adaptive playout delay
- AdaptivePlayoutDelay: EMA-based jitter tracking (NetEq-inspired)
- Computes target_delay from observed inter-arrival jitter
- JitterBuffer::new_adaptive() uses adaptive delay
- adaptive_jitter in CallConfig (default true)
- 5 new tests: stable, jitter increase, recovery, clamping, estimate
272 tests passing across all crates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WZP-S-4: Room access control
- hash_room_name() in wzp-crypto: SHA-256("featherchat-group:"+name)[:16]
- CLI --room flag hashes before SNI, web bridge does the same
- RoomManager gains ACL: with_acl(), allow(), is_authorized()
- join() returns Result, rejects unauthorized fingerprints
WZP-S-5: Crypto handshake wired into all live paths
- CLI: perform_handshake() after connect, before any mode
- Relay: accept_handshake() after auth, before room join
- Web bridge: perform_handshake() after auth, before audio
- Relay generates ephemeral identity at startup
WZP-S-6: Web bridge featherChat auth
- --auth-url flag: browsers send {"type":"auth","token":"..."} as first WS msg
- Validates against featherChat, passes token to relay
- --cert/--key flags for production TLS (replaces self-signed)
WZP-S-7: wzp-proto standalone
- Cargo.toml uses explicit versions (no workspace inheritance)
- FC can use as git dependency
WZP-S-9: All 6 hardcoded assumptions resolved
- Auth, hashed rooms, mandatory handshake, real TLS certs,
profile negotiation, token validation
CLI also gains --room and --token flags.
179 tests passing across all crates.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Identity (6 tests):
- Same seed → same Ed25519/X25519 keys, same fingerprint, same display
- Random seed, raw HKDF output verified
BIP39 Mnemonic (3 tests):
- Roundtrip both directions, identical strings
CallSignal Interop (4 tests):
- Offer/Answer/Hangup roundtrip through FC bincode serialization
- Signal type mapping verified
Auth Contract (2 tests):
- Request/response shapes match between WZP and FC
Uses warzone-protocol v0.0.21 as real dependency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>