34 Commits

Author SHA1 Message Date
Siavash Sameni
12020b019c fix(video): normalize VideoToolbox plane strides to tight I420
Android-encoded H.264 decoded cleanly with ffmpeg but showed diagonal
green/magenta banding on macOS. Root cause: shiguredo_video_toolbox's
I420Frame exposes y/u/v planes as bytes_per_row * height, including
CoreVideo's stride padding. VideoToolboxDecoder concatenated those
slices verbatim, then downstream code indexed the buffer as tight I420,
producing per-row drift that wrapped one full row every 16 chroma rows
(32 luma rows) at 960x540.

Add i420_frame_to_tight() helper that copies each plane row-by-row at
width / chroma_width using the plane's actual stride. All three macOS
decoders (H.264, HEVC, AV1) now call it. On first decode each logs the
real plane dimensions and strides at target wzp_video::videotoolbox so
future stride bugs are diagnosable from logs.
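
The row-by-row copy described above can be sketched as follows. This is only an illustration, not the actual i420_frame_to_tight() implementation: the function names, the (bytes, stride) tuple arguments, and the rounding of odd dimensions are assumptions, since the commit shows neither the helper's signature nor the shiguredo_video_toolbox accessors.

```rust
/// Append `height` rows of `width` bytes from a padded plane to `dst`,
/// skipping the `stride - width` padding bytes at the end of each row.
fn copy_plane_tight(src: &[u8], stride: usize, width: usize, height: usize, dst: &mut Vec<u8>) {
    assert!(stride >= width, "stride must cover a full row");
    for row in 0..height {
        let start = row * stride;
        dst.extend_from_slice(&src[start..start + width]);
    }
}

/// Hypothetical stand-in for the helper: each plane is passed as `(bytes, stride)`.
fn to_tight_i420(
    y: (&[u8], usize),
    u: (&[u8], usize),
    v: (&[u8], usize),
    width: usize,
    height: usize,
) -> Vec<u8> {
    // Assumption: chroma planes are half size, rounded up for odd dimensions.
    let (cw, ch) = ((width + 1) / 2, (height + 1) / 2);
    let mut out = Vec::with_capacity(width * height + 2 * cw * ch);
    copy_plane_tight(y.0, y.1, width, height, &mut out);
    copy_plane_tight(u.0, u.1, cw, ch, &mut out);
    copy_plane_tight(v.0, v.1, cw, ch, &mut out);
    out
}
```

The output length is always width * height + 2 * chroma_width * chroma_height, so downstream code can index it as tight I420 regardless of the source strides.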

Verified mathematically against the corrupted dump:
  band period = u_stride / (u_stride - chroma_width)
              = 512 / (512 - 480) = 16 chroma rows = 32 luma rows
which matches the measured spacing exactly. 640x360 was unaffected
because chroma_width 320 is already 64-aligned.
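
The arithmetic above can be checked with a few lines of Rust. The sketch assumes, as the message implies, that plane strides are rounded up to a multiple of 64 bytes; align_up and band_period_rows are illustrative names, not functions from the codebase.

```rust
/// Round `width` up to the next multiple of `align`.
fn align_up(width: usize, align: usize) -> usize {
    (width + align - 1) / align * align
}

/// Number of rows after which the accumulated per-row drift
/// (`stride - width` bytes per row) adds up to one full stride.
/// Returns `None` when the plane has no padding, i.e. no drift at all.
/// Integer division truncates when the padding does not divide the stride.
fn band_period_rows(stride: usize, width: usize) -> Option<usize> {
    let pad = stride.checked_sub(width)?;
    if pad == 0 {
        None
    } else {
        Some(stride / pad)
    }
}
```

For 960x540 the chroma width is 480, which rounds up to a 512-byte stride and gives a 16-row period; for 640x360 the chroma width of 320 needs no padding, so the function returns None.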

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 15:22:40 +04:00
Siavash Sameni
3ea25a0656 fix(android): use MediaCodec input layout for video encode
2026-05-26 11:35:24 +04:00
Siavash Sameni
112472609e fix(video): add frame metadata and Android encode diagnostics
2026-05-26 11:28:17 +04:00
Siavash Sameni
9a7745978b feat(video): add codec and resolution controls
2026-05-26 10:05:20 +04:00
Siavash Sameni
f85efb9576 fix(video): improve android stream smoothness
2026-05-26 09:57:10 +04:00
Siavash Sameni
31b2caa54d fix(video): request keyframes after packet loss
2026-05-26 09:23:08 +04:00
Siavash Sameni
079e21e174 fix(video): resync decoder after packet gaps
2026-05-26 09:16:02 +04:00
Siavash Sameni
e676641538 fix(android): suppress debuggable lint for diagnostic builds
2026-05-26 09:09:06 +04:00
Siavash Sameni
9713efc404 chore(android): add release debuggable build
2026-05-26 09:05:09 +04:00
Siavash Sameni
8415804a1a fix(video): vsync remote canvas draws
2026-05-26 08:46:11 +04:00
Siavash Sameni
f65b399a21 fix(build): preserve debuggable android APKs
2026-05-26 08:35:46 +04:00
Siavash Sameni
3437a6bd11 debug(video): add android frame dump pull helper
2026-05-26 08:34:36 +04:00
Siavash Sameni
15eb00ed5e debug(video): dump frames across capture and decode
2026-05-26 07:39:21 +04:00
Siavash Sameni
0c2297a2b7 fix(video): sync camera capture and float preview
2026-05-26 07:30:19 +04:00
Siavash Sameni
a08a37b5eb fix(video): stabilize relay streams and remote rendering
2026-05-26 07:18:22 +04:00
Siavash Sameni
f6ace54556 fix(call): enable direct video and shorten portmap probe
2026-05-26 06:35:31 +04:00
Siavash Sameni
47baa1a765 fix(video): reassemble out-of-order fragments
2026-05-26 06:16:53 +04:00
Siavash Sameni
ee654cd1ef fix(video): skip startup black frames
2026-05-25 21:35:00 +04:00
Siavash Sameni
d2046060b5 fix(video): request android sync frames via mediacodec
2026-05-25 21:28:59 +04:00
Siavash Sameni
0b7bf1b385 fix(video): feed android h264 encoder nv12
2026-05-25 21:20:01 +04:00
Siavash Sameni
e8f139588a chore(video): sample decoded frames periodically
2026-05-25 21:14:32 +04:00
Siavash Sameni
0115b11de7 chore(video): log compact video samples
2026-05-25 21:06:32 +04:00
Siavash Sameni
fa812a17d9 fix(video): normalize mediacodec buffers
2026-05-25 21:02:41 +04:00
Siavash Sameni
8d6b168f1b fix(video): normalize camera frames before encoding
2026-05-25 20:49:32 +04:00
Siavash Sameni
ca164ada5c fix(relay): forward legacy h264 room video stream
2026-05-25 20:46:41 +04:00
Siavash Sameni
2d58bae9ba chore(relay): log video forwarding decisions in debug tap
2026-05-25 20:42:24 +04:00
Siavash Sameni
e1ca6ca6e6 fix(video): use relay-default stream for room video
2026-05-25 20:39:25 +04:00
Siavash Sameni
06d28a9280 fix(video): preserve annex-b mediacodec output
2026-05-25 20:20:22 +04:00
Siavash Sameni
d57ebe3d2c fix(video): force h264 and trace frame pipeline
2026-05-25 20:03:11 +04:00
Siavash Sameni
7eca79846f fix(quality): use windowed loss instead of cumulative for codec adaptation
Quinn's cumulative loss_pct (lost / sent since connection start) was
biased forever by handshake-era losses. Even ~5 lost-out-of-100 early
packets pinned us at "Degraded" (5% threshold) and Codec2_1200 was just
a few more drops away. The metric only diluted as thousands more clean
packets accumulated — by which time the call was over.

LossWindow tracks prev (sent, lost) and reports delta loss per ~25-
packet window. The cumulative value is the fallback when the window
hasn't accumulated enough samples (< 20 packets).

All 6 sites converted (DRED tuner + QualityReport on both send tasks,
self-observation on both recv tasks).
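
The windowing described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the project's LossWindow: the field names, the update() signature, and the choice to keep the old baseline while a window is still filling are all assumptions. Only the 20-packet minimum and the cumulative fallback come from the message.

```rust
/// Illustrative windowed loss tracker fed with cumulative counters.
struct LossWindow {
    prev_sent: u64,
    prev_lost: u64,
    min_samples: u64, // assumption: 20, per the commit message
}

impl LossWindow {
    fn new(min_samples: u64) -> Self {
        Self { prev_sent: 0, prev_lost: 0, min_samples }
    }

    /// Takes cumulative `(sent, lost)` counters and returns a loss percentage.
    fn update(&mut self, sent: u64, lost: u64) -> f64 {
        let d_sent = sent.saturating_sub(self.prev_sent);
        let d_lost = lost.saturating_sub(self.prev_lost);
        if d_sent < self.min_samples {
            // Too few new packets: report the cumulative value and keep the
            // baseline so the current window keeps accumulating.
            return loss_pct(lost, sent);
        }
        self.prev_sent = sent;
        self.prev_lost = lost;
        loss_pct(d_lost, d_sent)
    }
}

/// Loss as a percentage; zero when nothing has been sent yet.
fn loss_pct(lost: u64, sent: u64) -> f64 {
    if sent == 0 {
        0.0
    } else {
        lost as f64 * 100.0 / sent as f64
    }
}
```

With a 20-packet minimum, five early losses in the first 100 packets still report 5% for that window, but the next clean window of 25 packets reports 0% instead of staying pinned near the handshake-era value.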

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 18:55:57 +04:00
Siavash Sameni
25b3278d31 feat(android): wire video send + recv in Android engine; add video:* debug events
Mirror the desktop video pipeline into the #[cfg(target_os="android")] start
function: capture _negotiated_video_codec from the handshake, spawn a video
send task that pulls VideoFrames from camera_tx, encodes/packetizes/sends.
Add video reassembly + decode + emit "video:frame" in the recv task before
the audio branch so Android can both send and receive video.

Instrumentation: emit video:first_send and video:first_recv on both desktop
and android paths so we can verify the pipeline end-to-end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 18:19:42 +04:00
Siavash Sameni
cbc3a8d37e feat(ui): full-screen video stage with PiP local preview
Move video out of the voice drawer into a fixed-position stage that
covers the lobby above the drawer. Remote canvas fills the stage with
object-fit: contain; local preview is a 200x112 PiP in the bottom-right.
Placeholder shows "Waiting for remote video" with a frame counter until
the first frame arrives. Counter logs first remote frame to console for
debugging.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 17:53:10 +04:00
Siavash Sameni
1329abbeba docs(prd): rewrite E2E PRD — prior approach broke multi-client voice
Document why wrapping QuinnTransport with EncryptingTransport using the
pairwise client↔relay key cannot work for an SFU (recipient has a different
key than sender). Propose two valid paths: MLS group keys (true E2E) or
hop-by-hop relay re-encryption (relay-trusted). Recommend hop-by-hop first.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 17:44:57 +04:00
Siavash Sameni
e8cab25eda fix: revert E2E AEAD wrapping (broke multi-client voice); add Android CAMERA
Voice regression: EncryptingTransport encrypts media with the pairwise
client↔relay session key, but the relay forwards bytes without re-encrypting
per recipient. Sender's key_A ≠ recipient's key_B → recipient cannot decrypt
→ silent audio between mac and android. Drop the wrapper; restore plaintext-
over-QUIC-TLS to the relay. Proper E2E needs MLS group keys or relay hop-by-
hop re-encryption (future PRD).

Android camera: add CAMERA manifest permission + runtime request via
MainActivity. NOTE: still not sufficient — Tauri/Wry's WebChromeClient does
not grant getUserMedia, so video on Android needs a Tauri plugin override
or native Camera2 path. Documented in MainActivity.kt.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 17:04:56 +04:00
26 changed files with 4768 additions and 506 deletions

.gitignore

@@ -12,6 +12,11 @@ npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
 dev-debug.log
+
+# Debug frame dump artifacts
+android-frame-dumps/
+wzp-frame-dumps.tar
+
 # Dependency directories
 node_modules/
 # Environment variables

View File

@@ -538,7 +538,7 @@ async fn run_call(
         alias: alias.map(|s| s.to_string()),
         protocol_version: 2,
         supported_versions: vec![2],
-        video_codecs: vec![],
+        video_codecs: vec![CodecId::H264Baseline],
     };
     transport.send_signal(&offer).await?;
     info!("CallOffer sent, waiting for CallAnswer...");

View File

@@ -15,7 +15,8 @@ use std::time::{Duration, Instant};
 use clap::Parser;
 use tracing::info;
-use wzp_proto::{CodecId, MediaPacket, MediaTransport, default_signal_version};
+use wzp_proto::{CodecId, MediaPacket, MediaTransport, MediaType, default_signal_version};
+use wzp_video::{VideoDecoder, create_video_decoder, transport::VideoReassembler};

 // ---------------------------------------------------------------------------
 // CLI
@@ -68,6 +69,14 @@ struct Args {
     // For now, header-only analysis provides loss%, jitter, codec stats.
     #[arg(long)]
     key: Option<String>,
+
+    /// Track video fragmentation, completed frames, keyframes, and decode health.
+    #[arg(long)]
+    video_probe: bool,
+
+    /// Decode completed video frames in --video-probe mode.
+    #[arg(long)]
+    video_decode: bool,
 }

 // ---------------------------------------------------------------------------
@@ -198,6 +207,305 @@ fn find_or_create_participant(
     id
 }
+
+// ---------------------------------------------------------------------------
+// Video probe
+// ---------------------------------------------------------------------------
+
+#[derive(Default, Clone)]
+struct PlaneSample {
+    min: u8,
+    max: u8,
+    mean: f64,
+}
+
+#[derive(Default, Clone)]
+struct I420Sample {
+    y: PlaneSample,
+    u: PlaneSample,
+    v: PlaneSample,
+    valid_i420: bool,
+}
+
+struct VideoStreamProbe {
+    id: usize,
+    codec: CodecId,
+    wire_stream_id: u8,
+    packets: u64,
+    lost: u64,
+    last_seq: u32,
+    seq_initialized: bool,
+    frames: u64,
+    keyframes: u64,
+    bytes: u64,
+    max_frame_bytes: usize,
+    first_seen: Instant,
+    last_seen: Instant,
+    last_frame: Option<Instant>,
+    reassembler: VideoReassembler,
+    decoder: Option<Box<dyn VideoDecoder>>,
+    decoder_key: Option<(CodecId, u32, u32)>,
+    decode_ok: u64,
+    decode_pending: u64,
+    decode_err: u64,
+    last_decode_debug: Option<String>,
+    last_i420_sample: Option<I420Sample>,
+}
+
+impl VideoStreamProbe {
+    fn new(id: usize, codec: CodecId, wire_stream_id: u8, decode: bool) -> Self {
+        let decoder = if decode {
+            create_video_decoder(codec, 1280, 720).ok()
+        } else {
+            None
+        };
+        let now = Instant::now();
+        Self {
+            id,
+            codec,
+            wire_stream_id,
+            packets: 0,
+            lost: 0,
+            last_seq: 0,
+            seq_initialized: false,
+            frames: 0,
+            keyframes: 0,
+            bytes: 0,
+            max_frame_bytes: 0,
+            first_seen: now,
+            last_seen: now,
+            last_frame: None,
+            reassembler: VideoReassembler::new(),
+            decoder,
+            decoder_key: decode.then_some((codec, 1280, 720)),
+            decode_ok: 0,
+            decode_pending: 0,
+            decode_err: 0,
+            last_decode_debug: None,
+            last_i420_sample: None,
+        }
+    }
+
+    fn ingest(&mut self, pkt: &MediaPacket, now: Instant) {
+        self.packets += 1;
+        self.last_seen = now;
+        if pkt.header.codec_id != self.codec {
+            self.codec = pkt.header.codec_id;
+            self.reassembler = VideoReassembler::new();
+            self.decoder = self
+                .decoder
+                .is_some()
+                .then(|| create_video_decoder(self.codec, 1280, 720).ok())
+                .flatten();
+            self.decoder_key = self.decoder.as_ref().map(|_| (self.codec, 1280, 720));
+        }
+        if self.seq_initialized {
+            let expected = self.last_seq.wrapping_add(1);
+            let gap = pkt.header.seq.wrapping_sub(expected);
+            if gap > 0 && gap < 100 {
+                self.lost += gap as u64;
+            }
+        }
+        self.last_seq = pkt.header.seq;
+        self.seq_initialized = true;
+        if let Some(frame) = self.reassembler.push(pkt) {
+            self.frames += 1;
+            self.bytes += frame.data.len() as u64;
+            self.max_frame_bytes = self.max_frame_bytes.max(frame.data.len());
+            self.last_frame = Some(now);
+            if frame.is_keyframe {
+                self.keyframes += 1;
+            }
+            if frame.codec_id != self.codec {
+                self.codec = frame.codec_id;
+            }
+            let frame_width = frame.width.unwrap_or(1280) as u32;
+            let frame_height = frame.height.unwrap_or(720) as u32;
+            let decoder_key = (self.codec, frame_width, frame_height);
+            if self.decoder.is_some() && self.decoder_key != Some(decoder_key) {
+                self.decoder = create_video_decoder(self.codec, frame_width, frame_height).ok();
+                self.decoder_key = self.decoder.as_ref().map(|_| decoder_key);
+            }
+            if let Some(decoder) = self.decoder.as_mut() {
+                match decoder.decode(&frame.data) {
+                    Ok(Some(decoded)) => {
+                        self.decode_ok += 1;
+                        self.last_decode_debug = decoder.debug_snapshot();
+                        self.last_i420_sample =
+                            Some(sample_i420(&decoded.data, decoded.width, decoded.height));
+                    }
+                    Ok(None) => {
+                        self.decode_pending += 1;
+                        self.last_decode_debug = decoder.debug_snapshot();
+                    }
+                    Err(err) => {
+                        self.decode_err += 1;
+                        self.last_decode_debug = Some(err.to_string());
+                    }
+                }
+            }
+        }
+    }
+
+    fn loss_percent(&self) -> f64 {
+        let total = self.packets + self.lost;
+        if total == 0 {
+            0.0
+        } else {
+            (self.lost as f64 / total as f64) * 100.0
+        }
+    }
+
+    fn avg_frame_bytes(&self) -> u64 {
+        if self.frames == 0 {
+            0
+        } else {
+            self.bytes / self.frames
+        }
+    }
+
+    fn fps(&self) -> f64 {
+        let secs = self.last_seen.duration_since(self.first_seen).as_secs_f64();
+        if secs <= 0.0 {
+            0.0
+        } else {
+            self.frames as f64 / secs
+        }
+    }
+}
+
+struct VideoProbe {
+    streams: Vec<VideoStreamProbe>,
+    decode: bool,
+}
+
+impl VideoProbe {
+    fn new(decode: bool) -> Self {
+        Self {
+            streams: Vec::new(),
+            decode,
+        }
+    }
+
+    fn ingest(&mut self, pkt: &MediaPacket, now: Instant) {
+        if pkt.header.media_type != MediaType::Video {
+            return;
+        }
+        let idx = self.find_or_create_stream(pkt);
+        self.streams[idx].ingest(pkt, now);
+    }
+
+    fn find_or_create_stream(&mut self, pkt: &MediaPacket) -> usize {
+        for (i, s) in self.streams.iter().enumerate() {
+            if s.seq_initialized
+                && s.wire_stream_id == pkt.header.stream_id
+                && s.codec == pkt.header.codec_id
+            {
+                let delta = pkt.header.seq.wrapping_sub(s.last_seq);
+                if delta > 0 && delta < 80 {
+                    return i;
+                }
+            }
+        }
+        let id = self.streams.len();
+        self.streams.push(VideoStreamProbe::new(
+            id,
+            pkt.header.codec_id,
+            pkt.header.stream_id,
+            self.decode,
+        ));
+        id
+    }
+
+    fn print(&self) {
+        if self.streams.is_empty() {
+            eprintln!(" video: no packets yet");
+            return;
+        }
+        for s in &self.streams {
+            let age_ms = s
+                .last_frame
+                .map(|t| t.elapsed().as_millis() as u64)
+                .unwrap_or(u64::MAX);
+            let mut line = format!(
+                " video#{} wire_stream={} {:?}: {} pkts {:.1}% loss | {} frames ({:.1} fps), {} key, avg={}B max={}B, last_frame={}ms",
+                s.id,
+                s.wire_stream_id,
+                s.codec,
+                s.packets,
+                s.loss_percent(),
+                s.frames,
+                s.fps(),
+                s.keyframes,
+                s.avg_frame_bytes(),
+                s.max_frame_bytes,
+                if age_ms == u64::MAX { 0 } else { age_ms },
+            );
+            if s.decoder.is_some() || s.decode_ok > 0 || s.decode_err > 0 {
+                line.push_str(&format!(
+                    " | dec ok={} pending={} err={}",
+                    s.decode_ok, s.decode_pending, s.decode_err
+                ));
+            }
+            if let Some(sample) = &s.last_i420_sample {
+                line.push_str(&format!(
+                    " | i420={} y={:.1}/{}/{} u={:.1}/{}/{} v={:.1}/{}/{}",
+                    sample.valid_i420,
+                    sample.y.mean,
+                    sample.y.min,
+                    sample.y.max,
+                    sample.u.mean,
+                    sample.u.min,
+                    sample.u.max,
+                    sample.v.mean,
+                    sample.v.min,
+                    sample.v.max,
+                ));
+            }
+            if let Some(debug) = &s.last_decode_debug {
+                line.push_str(&format!(" | {debug}"));
+            }
+            eprintln!("{line}");
+        }
+    }
+}
+
+fn sample_i420(data: &[u8], width: u32, height: u32) -> I420Sample {
+    let y_len = width as usize * height as usize;
+    let uv_len = y_len / 4;
+    if data.len() < y_len + uv_len * 2 {
+        return I420Sample {
+            valid_i420: false,
+            ..I420Sample::default()
+        };
+    }
+    I420Sample {
+        valid_i420: true,
+        y: sample_plane(&data[..y_len]),
+        u: sample_plane(&data[y_len..y_len + uv_len]),
+        v: sample_plane(&data[y_len + uv_len..y_len + uv_len * 2]),
+    }
+}
+
+fn sample_plane(data: &[u8]) -> PlaneSample {
+    if data.is_empty() {
+        return PlaneSample::default();
+    }
+    let mut min = u8::MAX;
+    let mut max = u8::MIN;
+    let mut sum: u64 = 0;
+    for &b in data {
+        min = min.min(b);
+        max = max.max(b);
+        sum += b as u64;
+    }
+    PlaneSample {
+        min,
+        max,
+        mean: sum as f64 / data.len() as f64,
+    }
+}
+
 // ---------------------------------------------------------------------------
 // Capture writer (binary packet log for later replay)
 // ---------------------------------------------------------------------------
@@ -580,6 +888,7 @@ async fn run_no_tui(
     total_packets: &mut u64,
     deadline: Option<Instant>,
     mut capture_writer: Option<&mut CaptureWriter>,
+    mut video_probe: Option<&mut VideoProbe>,
 ) -> anyhow::Result<()> {
     let mut print_timer = Instant::now();
     loop {
@@ -594,6 +903,9 @@ async fn run_no_tui(
                 let idx =
                     find_or_create_participant(participants, pkt.header.seq, pkt.header.codec_id);
                 participants[idx].ingest(&pkt, now);
+                if let Some(ref mut probe) = video_probe {
+                    probe.ingest(&pkt, now);
+                }
                 *total_packets += 1;
                 if let Some(ref mut w) = capture_writer {
                     w.write_packet(&pkt, now)?;
@@ -608,6 +920,9 @@ async fn run_no_tui(
         }
         if print_timer.elapsed() >= Duration::from_secs(2) {
             print_stats(participants, *total_packets);
+            if let Some(ref probe) = video_probe {
+                probe.print();
+            }
             print_timer = Instant::now();
         }
     }
@@ -616,7 +931,7 @@ async fn run_no_tui(
 fn print_stats(participants: &[ParticipantStats], total: u64) {
     eprintln!(
-        "--- {} participants | {} total packets ---",
+        "--- {} packet streams | {} total packets ---",
         participants.len(),
         total
     );
@@ -644,6 +959,7 @@ async fn run_tui(
     start_time: Instant,
     deadline: Option<Instant>,
     mut capture_writer: Option<&mut CaptureWriter>,
+    mut video_probe: Option<&mut VideoProbe>,
 ) -> anyhow::Result<()> {
     crossterm::terminal::enable_raw_mode()?;
     let mut stdout = std::io::stdout();
@@ -684,6 +1000,9 @@ async fn run_tui(
                             pkt.header.codec_id,
                         );
                         participants[idx].ingest(&pkt, now);
+                        if let Some(ref mut probe) = video_probe {
+                            probe.ingest(&pkt, now);
+                        }
                         *total_packets += 1;
                         if let Some(ref mut w) = capture_writer {
                             w.write_packet(&pkt, now)?;
@@ -941,6 +1260,17 @@ async fn main() -> anyhow::Result<()> {
     let mut participants: Vec<ParticipantStats> = Vec::new();
     let mut total_packets: u64 = 0;
     let start_time = Instant::now();
+    let mut video_probe = (args.video_probe || args.video_decode).then(|| {
+        eprintln!(
+            "Video probe enabled{}",
+            if args.video_decode {
+                " with decode"
+            } else {
+                ""
+            }
+        );
+        VideoProbe::new(args.video_decode)
+    });

     if args.no_tui {
         run_no_tui(
@@ -949,6 +1279,7 @@ async fn main() -> anyhow::Result<()> {
             &mut total_packets,
             deadline,
             capture_writer.as_mut(),
+            video_probe.as_mut(),
         )
         .await?;
     } else {
@@ -959,12 +1290,17 @@ async fn main() -> anyhow::Result<()> {
             start_time,
             deadline,
             capture_writer.as_mut(),
+            video_probe.as_mut(),
         )
         .await?;
     }

     // Print summary
     print_summary(&participants, total_packets, start_time.elapsed());
+    if let Some(probe) = &video_probe {
+        eprintln!("\n=== Video Probe Summary ===");
+        probe.print();
+    }

     // Clean close
     transport.close().await?;

View File

@@ -8,6 +8,8 @@ use wzp_proto::{
     CodecId, HangupReason, MediaTransport, QualityProfile, SignalMessage, default_signal_version,
 };

+const SUPPORTED_VIDEO_CODECS: &[CodecId] = &[CodecId::H264Baseline];
+
 /// Result of a successful client-side handshake.
 pub struct HandshakeResult {
     pub session: Box<dyn CryptoSession>,
@@ -71,6 +73,16 @@ pub async fn perform_handshake(
     transport: &dyn MediaTransport,
     seed: &[u8; 32],
     alias: Option<&str>,
+) -> Result<HandshakeResult, HandshakeError> {
+    perform_handshake_with_video_codecs(transport, seed, alias, SUPPORTED_VIDEO_CODECS.to_vec())
+        .await
+}
+
+pub async fn perform_handshake_with_video_codecs(
+    transport: &dyn MediaTransport,
+    seed: &[u8; 32],
+    alias: Option<&str>,
+    video_codecs: Vec<CodecId>,
 ) -> Result<HandshakeResult, HandshakeError> {
     // 1. Create key exchange from identity seed
     let mut kx = WarzoneKeyExchange::from_identity_seed(seed);
@@ -102,7 +114,7 @@ pub async fn perform_handshake(
         alias: alias.map(|s| s.to_string()),
         protocol_version: 2,
         supported_versions: vec![2],
-        video_codecs: vec![CodecId::Av1Main, CodecId::H264Baseline, CodecId::H265Main],
+        video_codecs,
     };
     transport
         .send_signal(&offer)
@@ -110,14 +122,11 @@ pub async fn perform_handshake(
         .map_err(HandshakeError::Transport)?;

     // 5. Wait for CallAnswer — 10s timeout guards against relay not responding.
-    let answer = tokio::time::timeout(
-        std::time::Duration::from_secs(10),
-        transport.recv_signal(),
-    )
-    .await
-    .map_err(|_| HandshakeError::Transport(wzp_proto::TransportError::Timeout { ms: 10_000 }))?
-    .map_err(HandshakeError::Transport)?
-    .ok_or(HandshakeError::ConnectionClosed)?;
+    let answer = tokio::time::timeout(std::time::Duration::from_secs(10), transport.recv_signal())
+        .await
+        .map_err(|_| HandshakeError::Transport(wzp_proto::TransportError::Timeout { ms: 10_000 }))?
+        .map_err(HandshakeError::Transport)?
+        .ok_or(HandshakeError::ConnectionClosed)?;

     let (callee_identity_pub, callee_ephemeral_pub, callee_signature, _chosen_profile, video_codec) =
         match answer {
@@ -128,7 +137,13 @@ pub async fn perform_handshake(
                 chosen_profile,
                 video_codec,
                 ..
-            } => (identity_pub, ephemeral_pub, signature, chosen_profile, video_codec),
+            } => (
+                identity_pub,
+                ephemeral_pub,
+                signature,
+                chosen_profile,
+                video_codec,
+            ),
             SignalMessage::Hangup {
                 reason: HangupReason::ProtocolVersionMismatch { server_supported },
                 ..
@@ -153,7 +168,10 @@ pub async fn perform_handshake(
         .derive_session(&callee_ephemeral_pub)
         .map_err(|e| HandshakeError::KeyDerivation(e.to_string()))?;

-    Ok(HandshakeResult { session, video_codec })
+    Ok(HandshakeResult {
+        session,
+        video_codec,
+    })
 }

 #[cfg(test)]
@@ -183,22 +201,26 @@ mod tests {
         let mut kx = WarzoneKeyExchange::from_identity_seed(&[0x55; 32]);
         kx.generate_ephemeral();
         let session = kx.derive_session(&[0u8; 32]).unwrap();
-        let hs = HandshakeResult { session, video_codec: None };
+        let hs = HandshakeResult {
+            session,
+            video_codec: None,
+        };
         assert!(hs.video_codec.is_none());

         let mut kx2 = WarzoneKeyExchange::from_identity_seed(&[0x66; 32]);
         kx2.generate_ephemeral();
         let session2 = kx2.derive_session(&[0u8; 32]).unwrap();
-        let hs2 = HandshakeResult { session: session2, video_codec: Some(CodecId::Av1Main) };
-        assert_eq!(hs2.video_codec, Some(CodecId::Av1Main));
+        let hs2 = HandshakeResult {
+            session: session2,
+            video_codec: Some(CodecId::H264Baseline),
+        };
+        assert_eq!(hs2.video_codec, Some(CodecId::H264Baseline));
     }

     #[test]
-    fn offer_contains_three_video_codecs() {
-        // The offer sent in perform_handshake always includes the three codecs
-        // declared in order: AV1 > H264 > H265. Verify via the const list.
-        let offered = vec![CodecId::Av1Main, CodecId::H264Baseline, CodecId::H265Main];
-        assert_eq!(offered.len(), 3);
-        assert_eq!(offered[0], CodecId::Av1Main, "AV1 must be preferred");
+    fn offer_contains_h264_only() {
+        // Keep room video on the common denominator until Android AV1/HEVC
+        // send paths are proven in-device.
+        assert_eq!(SUPPORTED_VIDEO_CODECS, &[CodecId::H264Baseline]);
     }
 }

View File

@@ -177,9 +177,9 @@ mod tests {
     #[test]
     fn video_codec_picks_first_offered() {
-        let codecs = vec![CodecId::Av1Main, CodecId::H264Baseline, CodecId::H265Main];
+        let codecs = vec![CodecId::H264Baseline];
         let chosen: Option<CodecId> = codecs.into_iter().next();
-        assert_eq!(chosen, Some(CodecId::Av1Main));
+        assert_eq!(chosen, Some(CodecId::H264Baseline));
     }

     #[test]

View File

@@ -2028,7 +2028,7 @@ async fn main() -> anyhow::Result<()> {
                 (None, None)
             };

-            let media_handle = tokio::spawn(room::run_participant(
+            let mut media_handle = tokio::spawn(room::run_participant(
                 room_mgr.clone(),
                 room_name.clone(),
                 participant_id,
@@ -2041,15 +2041,38 @@ async fn main() -> anyhow::Result<()> {
                 federation_room_hash,
                 authenticated_fp.is_some(),
             ));
-            let signal_handle = tokio::spawn(room::run_participant_signals(
+            let mut signal_handle = tokio::spawn(room::run_participant_signals(
                 room_mgr.clone(),
                 room_name.clone(),
                 participant_id,
                 transport.clone(),
             ));
             tokio::select! {
-                _ = media_handle => {},
-                _ = signal_handle => {},
+                _ = &mut media_handle => {
+                    signal_handle.abort();
+                    let _ = signal_handle.await;
+                },
+                _ = &mut signal_handle => {
+                    close_transport(&*transport, "signal-loop-ended").await;
+                    match tokio::time::timeout(Duration::from_secs(2), &mut media_handle).await {
+                        Ok(_) => {}
+                        Err(_) => {
+                            warn!(
+                                %addr,
+                                room = %room_name,
+                                participant = participant_id,
+                                "media loop did not exit after signal close; forcing room leave"
+                            );
+                            media_handle.abort();
+                            let _ = media_handle.await;
+                            if let Some((update, senders)) =
+                                room_mgr.leave(&room_name, participant_id)
+                            {
+                                room::broadcast_signal(&senders, &update).await;
+                            }
+                        }
+                    }
+                },
             }

             // Participant disconnected — clean up presence + per-session metrics


@@ -51,9 +51,13 @@ impl DebugTap {
dir = dir, dir = dir,
addr = %addr, addr = %addr,
seq = h.seq, seq = h.seq,
media = ?h.media_type,
codec = ?h.codec_id, codec = ?h.codec_id,
stream_id = h.stream_id,
ts = h.timestamp, ts = h.timestamp,
fec_block = h.fec_block, fec_block = h.fec_block,
keyframe = h.is_keyframe(),
frame_end = h.is_frame_end(),
repair = h.is_repair(), repair = h.is_repair(),
len = pkt.payload.len(), len = pkt.payload.len(),
fan_out, fan_out,
@@ -61,6 +65,35 @@ impl DebugTap {
); );
} }
pub fn log_video_route(
&self,
room: &str,
addr: &std::net::SocketAddr,
peer_id: ParticipantId,
pkt: &wzp_proto::MediaPacket,
selected_layer: u8,
forwarded: bool,
reason: &str,
) {
let h = &pkt.header;
info!(
target: "debug_tap",
room = %room,
addr = %addr,
peer_id,
seq = h.seq,
stream_id = h.stream_id,
selected_layer,
codec = ?h.codec_id,
keyframe = h.is_keyframe(),
frame_end = h.is_frame_end(),
len = pkt.payload.len(),
forwarded,
reason,
"TAP VIDEO ROUTE"
);
}
pub fn log_signal(&self, room: &str, signal: &wzp_proto::SignalMessage) { pub fn log_signal(&self, room: &str, signal: &wzp_proto::SignalMessage) {
match signal { match signal {
wzp_proto::SignalMessage::RoomUpdate { wzp_proto::SignalMessage::RoomUpdate {
@@ -295,6 +328,23 @@ impl ReceiverState {
} }
} }
fn video_route_reason(pkt: &wzp_proto::MediaPacket, selected_layer: u8) -> Option<&'static str> {
if pkt.header.stream_id == selected_layer {
return Some("selected_layer");
}
// Compatibility for the pre-simulcast single-layer H.264 room-video path.
// Older clients used video stream 1 while current clients use stream 0 so
// they pass through relay defaults. Forward both H.264 single-layer ids.
if pkt.header.codec_id == wzp_proto::CodecId::H264Baseline
&& (pkt.header.stream_id == 0 || pkt.header.stream_id == 1)
{
return Some("h264_single_layer_compat");
}
None
}
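The routing decision in `video_route_reason` can be read as a small truth table. The sketch below restates it with simplified stand-in types so it compiles on its own; `Header`, the two-variant `CodecId`, and `route_reason` are hypothetical names for illustration and are not the real `wzp_proto` definitions.

```rust
// Simplified stand-ins for the wzp_proto types (assumption: only the two
// fields that the routing decision reads are modelled here).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum CodecId {
    H264Baseline,
    Av1Main,
}

struct Header {
    codec_id: CodecId,
    stream_id: u8,
}

/// Returns the reason a video packet would be forwarded to a receiver,
/// or `None` when the packet is filtered out for that receiver.
fn route_reason(h: &Header, selected_layer: u8) -> Option<&'static str> {
    // Normal simulcast path: the packet is on the layer the receiver selected.
    if h.stream_id == selected_layer {
        return Some("selected_layer");
    }
    // Compatibility path: single-layer H.264 on stream 0 or 1 is forwarded
    // regardless of the selected layer.
    if h.codec_id == CodecId::H264Baseline && (h.stream_id == 0 || h.stream_id == 1) {
        return Some("h264_single_layer_compat");
    }
    None
}
```

Under these assumptions, an H.264 packet on stream 1 reaches a receiver whose selected layer is 0, while an AV1 packet on stream 1 is dropped for the same receiver.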
/// Unique participant ID within a room. /// Unique participant ID within a room.
pub type ParticipantId = u64; pub type ParticipantId = u64;
@@ -304,6 +354,24 @@ fn next_id() -> ParticipantId {
NEXT_PARTICIPANT_ID.fetch_add(1, Ordering::Relaxed) NEXT_PARTICIPANT_ID.fetch_add(1, Ordering::Relaxed)
} }
fn outbound_video_stream_id(participant_id: ParticipantId) -> u8 {
// Reserve stream 0 for the sender's local/simulcast layer id. Forwarded
// room video needs a sender-distinct stream id so receivers and analyzers
// do not merge independent H264 access-unit sequences.
((participant_id.saturating_sub(1) % 250) + 1) as u8
}
fn with_outbound_video_stream_id(
pkt: &wzp_proto::MediaPacket,
participant_id: ParticipantId,
) -> wzp_proto::MediaPacket {
let mut out = pkt.clone();
if out.header.media_type == wzp_proto::MediaType::Video {
out.header.stream_id = outbound_video_stream_id(participant_id);
}
out
}
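One property of the mapping above is worth spelling out: because it reduces the participant id modulo 250, outbound stream ids are only guaranteed distinct among participants whose ids differ by less than 250. The sketch below copies the arithmetic from the diff and adds a hypothetical helper, `stream_ids_collide`, purely to illustrate the wrap-around.

```rust
type ParticipantId = u64;

/// Same arithmetic as in the diff: maps a participant id into 1..=250,
/// never returning 0 (which stays reserved for the sender's local layer).
fn outbound_video_stream_id(participant_id: ParticipantId) -> u8 {
    ((participant_id.saturating_sub(1) % 250) + 1) as u8
}

/// Hypothetical helper (not in the source): true when two participants
/// would be forwarded under the same outbound video stream id.
fn stream_ids_collide(a: ParticipantId, b: ParticipantId) -> bool {
    outbound_video_stream_id(a) == outbound_video_stream_id(b)
}
```

Participants 1 and 251 share stream id 1, which matches the `outbound_video_stream_id(251) == 1` assertion in the unit test added by this commit. Whether that collision matters in practice depends on how long participant ids stay live in one room, which the diff does not state.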
/// Events emitted by RoomManager for federation to observe. /// Events emitted by RoomManager for federation to observe.
#[derive(Clone, Debug)] #[derive(Clone, Debug)]
pub enum RoomEvent { pub enum RoomEvent {
@@ -438,6 +506,25 @@ impl Room {
); );
} }
fn remove_by_fingerprint(&mut self, fingerprint: &str) -> Vec<ParticipantId> {
let mut removed = Vec::new();
self.participants.retain(|p| {
let matches = p.fingerprint.as_deref() == Some(fingerprint);
if matches {
removed.push(p.id);
}
!matches
});
for id in &removed {
self.qualities.remove(id);
}
removed
}
fn contains(&self, id: ParticipantId) -> bool {
self.participants.iter().any(|p| p.id == id)
}
fn others(&self, exclude_id: ParticipantId) -> Vec<ParticipantSender> { fn others(&self, exclude_id: ParticipantId) -> Vec<ParticipantSender> {
self.participants self.participants
.iter() .iter()
@@ -632,6 +719,18 @@ impl RoomManager {
.entry(room_name.to_string()) .entry(room_name.to_string())
.or_insert_with(|| Arc::new(RwLock::new(Room::new()))); .or_insert_with(|| Arc::new(RwLock::new(Room::new())));
let mut room = arc.write().unwrap(); let mut room = arc.write().unwrap();
if let Some(fp) = fingerprint {
let removed = room.remove_by_fingerprint(fp);
for old_id in removed {
warn!(
room = room_name,
participant = old_id,
fingerprint = fp,
"replacing existing participant with same fingerprint"
);
self.clear_participant_state(room_name, old_id);
}
}
let id = room.add( let id = room.add(
addr, addr,
sender, sender,
@@ -708,6 +807,7 @@ impl RoomManager {
let mut room = arc.write().unwrap(); let mut room = arc.write().unwrap();
room.qualities.remove(&participant_id); room.qualities.remove(&participant_id);
room.remove(participant_id); room.remove(participant_id);
self.clear_participant_state(room_name, participant_id);
if room.is_empty() { if room.is_empty() {
drop(room); // release room lock drop(room); // release room lock
drop(arc); // release DashMap guard drop(arc); // release DashMap guard
@@ -799,7 +899,14 @@ impl RoomManager {
self.keyframe_cache self.keyframe_cache
.iter() .iter()
.filter(|e| e.key().0 == room_name) .filter(|e| e.key().0 == room_name)
.map(|e| e.value().packets.clone()) .map(|e| {
let sender_id = e.key().1;
e.value()
.packets
.iter()
.map(|pkt| with_outbound_video_stream_id(pkt, sender_id))
.collect()
})
.collect() .collect()
} }
@@ -809,6 +916,27 @@ impl RoomManager {
self.keyframe_buffer.retain(|k, _| k.0 != room_name); self.keyframe_buffer.retain(|k, _| k.0 != room_name);
self.pli_state.retain(|k, _| k.0 != room_name); self.pli_state.retain(|k, _| k.0 != room_name);
self.stream_owner.retain(|k, _| k.0 != room_name); self.stream_owner.retain(|k, _| k.0 != room_name);
self.receiver_states.retain(|k, _| k.0 != room_name);
}
fn clear_participant_state(&self, room_name: &str, participant_id: ParticipantId) {
self.keyframe_cache
.retain(|k, _| !(k.0 == room_name && k.1 == participant_id));
self.keyframe_buffer
.retain(|k, _| !(k.0 == room_name && k.1 == participant_id));
self.pli_state
.retain(|k, _| !(k.0 == room_name && k.1 == participant_id));
self.stream_owner
.retain(|k, owner| !(k.0 == room_name && *owner == participant_id));
self.receiver_states
.retain(|k, _| !(k.0 == room_name && k.1 == participant_id));
}
pub fn contains_participant(&self, room_name: &str, participant_id: ParticipantId) -> bool {
self.rooms
.get(room_name)
.map(|arc| arc.read().unwrap().contains(participant_id))
.unwrap_or(false)
} }
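The per-participant cleanup filters most maps by key, but `stream_owner` is keyed by `(room, stream_id)` and has to be filtered by its value. A minimal sketch of that distinction follows, using `std::collections::HashMap` in place of the concurrent maps the relay appears to use; `RelayState` and `clear_participant` are hypothetical names, and only two of the five maps are modelled.

```rust
use std::collections::HashMap;

type ParticipantId = u64;

/// Hypothetical, reduced model of the relay's per-room state.
#[derive(Default)]
struct RelayState {
    /// Keyed by (room, participant): filter on the key.
    keyframe_cache: HashMap<(String, ParticipantId), Vec<u8>>,
    /// Keyed by (room, stream_id) with the owner as value: filter on the value.
    stream_owner: HashMap<(String, u8), ParticipantId>,
}

impl RelayState {
    fn clear_participant(&mut self, room: &str, id: ParticipantId) {
        // Drop entries whose key names this participant in this room.
        self.keyframe_cache
            .retain(|k, _| !(k.0 == room && k.1 == id));
        // Drop every stream in this room that the participant owned.
        self.stream_owner
            .retain(|k, owner| !(k.0 == room && *owner == id));
    }
}
```

State belonging to the same participant id in a different room is left untouched, since both predicates also compare the room name.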
/// PLI suppression window (PRD-video-v1 T4.7). /// PLI suppression window (PRD-video-v1 T4.7).
@@ -1142,6 +1270,7 @@ pub async fn run_participant(
transport, transport,
metrics, metrics,
session_id, session_id,
debug_tap,
is_authenticated, is_authenticated,
) )
.await; .await;
@@ -1225,6 +1354,16 @@ async fn run_participant_plain(
} }
}; };
if !room_mgr.contains_participant(&room_name, participant_id) {
info!(
room = %room_name,
participant = participant_id,
forwarded = packets_forwarded,
"stale participant loop stopped"
);
break;
}
// Cache keyframe packets for fast join-to-first-frame replay. // Cache keyframe packets for fast join-to-first-frame replay.
room_mgr.update_keyframe_cache(&room_name, participant_id, &pkt); room_mgr.update_keyframe_cache(&room_name, participant_id, &pkt);
// Register this participant as the owner of this stream for PLI routing. // Register this participant as the owner of this stream for PLI routing.
@@ -1232,6 +1371,12 @@ async fn run_participant_plain(
room_mgr room_mgr
.stream_owner .stream_owner
.insert((room_name.clone(), pkt.header.stream_id), participant_id); .insert((room_name.clone(), pkt.header.stream_id), participant_id);
if pkt.header.media_type == wzp_proto::MediaType::Video {
room_mgr.stream_owner.insert(
(room_name.clone(), outbound_video_stream_id(participant_id)),
participant_id,
);
}
} }
let recv_gap_ms = last_recv_instant.elapsed().as_millis() as u64; let recv_gap_ms = last_recv_instant.elapsed().as_millis() as u64;
@@ -1275,9 +1420,8 @@ async fn run_participant_plain(
room = %room_name, room = %room_name,
participant = participant_id, participant = participant_id,
seq = pkt.header.seq, seq = pkt.header.seq,
"VideoScorer: Abusive verdict — dropping packet" "VideoScorer: Abusive verdict — observe-only"
); );
continue;
} }
} }
@@ -1324,33 +1468,56 @@ async fn run_participant_plain(
broadcast_signal(&all_senders, &directive).await; broadcast_signal(&all_senders, &directive).await;
} }
// Debug tap: log packet metadata + record stats
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_packet(&room_name, "in", &addr, &pkt, others.len());
}
}
if let Some(ref mut ts) = tap_stats {
ts.record_in(&pkt, others.len());
}
// Forward to all others, applying simulcast layer selection for video. // Forward to all others, applying simulcast layer selection for video.
let fwd_start = std::time::Instant::now(); let fwd_start = std::time::Instant::now();
let pkt_bytes = pkt.payload.len() as u64; let pkt_bytes = pkt.payload.len() as u64;
let is_video = pkt.header.media_type == wzp_proto::MediaType::Video; let is_video = pkt.header.media_type == wzp_proto::MediaType::Video;
let mut actual_fan_out = 0usize;
for (other_id, other) in &others { for (other_id, other) in &others {
// Simulcast layer selection (T5.6): video packets are filtered // Simulcast layer selection (T5.6): video packets are filtered
// by the receiver's selected layer. Audio and non-simulcast // by the receiver's selected layer. Audio and non-simulcast
// traffic pass through unchanged. // traffic pass through unchanged.
if is_video { if is_video {
let selected = room_mgr.selected_layer(&room_name, *other_id); let selected = room_mgr.selected_layer(&room_name, *other_id);
if pkt.header.stream_id != selected { let route_reason = video_route_reason(&pkt, selected);
if route_reason.is_none() {
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_video_route(
&room_name,
&addr,
*other_id,
&pkt,
selected,
false,
"simulcast_layer_mismatch",
);
}
}
continue; continue;
} }
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_video_route(
&room_name,
&addr,
*other_id,
&pkt,
selected,
true,
route_reason.unwrap_or("selected_layer"),
);
}
}
} }
match other { match other {
ParticipantSender::Quic(t) => { ParticipantSender::Quic(t) => {
if let Err(e) = t.send_media(&pkt).await { let outbound_pkt = if is_video {
with_outbound_video_stream_id(&pkt, participant_id)
} else {
pkt.clone()
};
if let Err(e) = t.send_media(&outbound_pkt).await {
send_errors += 1; send_errors += 1;
if send_errors <= 5 || send_errors % 100 == 0 { if send_errors <= 5 || send_errors % 100 == 0 {
warn!( warn!(
@@ -1361,14 +1528,28 @@ async fn run_participant_plain(
"send_media error: {e}" "send_media error: {e}"
); );
} }
} else {
actual_fan_out += 1;
} }
} }
ParticipantSender::WebSocket(_) => { ParticipantSender::WebSocket(_) => {
let _ = other.send_raw(&pkt.payload).await; let _ = other.send_raw(&pkt.payload).await;
actual_fan_out += 1;
} }
} }
} }
// Debug tap: log packet metadata + record stats after forwarding so
// fan_out reflects actual sends after video layer filtering.
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_packet(&room_name, "in", &addr, &pkt, actual_fan_out);
}
}
if let Some(ref mut ts) = tap_stats {
ts.record_in(&pkt, actual_fan_out);
}
// Federation: forward to active peer relays via channel // Federation: forward to active peer relays via channel
if let Some(ref fed_tx) = federation_tx { if let Some(ref fed_tx) = federation_tx {
let data = pkt.to_bytes(); let data = pkt.to_bytes();
@@ -1394,7 +1575,7 @@ async fn run_participant_plain(
); );
} }
let fan_out = others.len() as u64; let fan_out = actual_fan_out as u64;
metrics.packets_forwarded.inc_by(fan_out); metrics.packets_forwarded.inc_by(fan_out);
metrics.bytes_forwarded.inc_by(pkt_bytes * fan_out); metrics.bytes_forwarded.inc_by(pkt_bytes * fan_out);
packets_forwarded += 1; packets_forwarded += 1;
@@ -1457,6 +1638,7 @@ async fn run_participant_trunked(
transport: Arc<wzp_transport::QuinnTransport>, transport: Arc<wzp_transport::QuinnTransport>,
metrics: Arc<RelayMetrics>, metrics: Arc<RelayMetrics>,
session_id: String, session_id: String,
debug_tap: Option<DebugTap>,
_is_authenticated: bool, _is_authenticated: bool,
) { ) {
use std::collections::HashMap; use std::collections::HashMap;
@@ -1472,6 +1654,11 @@ async fn run_participant_trunked(
ConformanceMeter::with_token_bucket(crate::conformance::TokenBucket::for_audio_session()); ConformanceMeter::with_token_bucket(crate::conformance::TokenBucket::for_audio_session());
let mut video_scorer_trunked = VideoScorer::new(); let mut video_scorer_trunked = VideoScorer::new();
let mut last_bwe_kbps_trunked: Option<u32> = None; let mut last_bwe_kbps_trunked: Option<u32> = None;
let mut tap_stats = if debug_tap.as_ref().map_or(false, |t| t.matches(&room_name)) {
Some(TapStats::new())
} else {
None
};
info!( info!(
room = %room_name, room = %room_name,
@@ -1510,6 +1697,16 @@ async fn run_participant_trunked(
} }
}; };
if !room_mgr.contains_participant(&room_name, participant_id) {
info!(
room = %room_name,
participant = participant_id,
forwarded = packets_forwarded,
"stale participant loop stopped (trunked)"
);
break;
}
// Cache keyframe packets for fast join-to-first-frame replay. // Cache keyframe packets for fast join-to-first-frame replay.
room_mgr.update_keyframe_cache(&room_name, participant_id, &pkt); room_mgr.update_keyframe_cache(&room_name, participant_id, &pkt);
// Register this participant as the owner of this stream for PLI routing. // Register this participant as the owner of this stream for PLI routing.
@@ -1518,6 +1715,15 @@ async fn run_participant_trunked(
(room_name.clone(), pkt.header.stream_id), (room_name.clone(), pkt.header.stream_id),
participant_id, participant_id,
); );
if pkt.header.media_type == wzp_proto::MediaType::Video {
room_mgr.stream_owner.insert(
(
room_name.clone(),
outbound_video_stream_id(participant_id),
),
participant_id,
);
}
} }
let recv_gap_ms = last_recv_instant.elapsed().as_millis() as u64; let recv_gap_ms = last_recv_instant.elapsed().as_millis() as u64;
@@ -1560,9 +1766,8 @@ async fn run_participant_trunked(
room = %room_name, room = %room_name,
participant = participant_id, participant = participant_id,
seq = pkt.header.seq, seq = pkt.header.seq,
"VideoScorer: Abusive verdict — dropping packet (trunked)" "VideoScorer: Abusive verdict — observe-only (trunked)"
); );
continue;
} }
} }
@@ -1605,12 +1810,40 @@ async fn run_participant_trunked(
let fwd_start = std::time::Instant::now(); let fwd_start = std::time::Instant::now();
let pkt_bytes = pkt.payload.len() as u64; let pkt_bytes = pkt.payload.len() as u64;
let is_video = pkt.header.media_type == wzp_proto::MediaType::Video; let is_video = pkt.header.media_type == wzp_proto::MediaType::Video;
let mut actual_fan_out = 0usize;
for (other_id, other) in &others { for (other_id, other) in &others {
if is_video { if is_video {
let selected = room_mgr.selected_layer(&room_name, *other_id); let selected = room_mgr.selected_layer(&room_name, *other_id);
if pkt.header.stream_id != selected { let route_reason = video_route_reason(&pkt, selected);
if route_reason.is_none() {
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_video_route(
&room_name,
&addr,
*other_id,
&pkt,
selected,
false,
"simulcast_layer_mismatch",
);
}
}
continue; continue;
} }
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_video_route(
&room_name,
&addr,
*other_id,
&pkt,
selected,
true,
route_reason.unwrap_or("selected_layer"),
);
}
}
} }
match other { match other {
ParticipantSender::Quic(t) => { ParticipantSender::Quic(t) => {
@@ -1618,7 +1851,12 @@ async fn run_participant_trunked(
let fwd = forwarders let fwd = forwarders
.entry(peer_addr) .entry(peer_addr)
.or_insert_with(|| TrunkedForwarder::new(t.clone(), sid_bytes)); .or_insert_with(|| TrunkedForwarder::new(t.clone(), sid_bytes));
if let Err(e) = fwd.send(&pkt).await { let outbound_pkt = if is_video {
with_outbound_video_stream_id(&pkt, participant_id)
} else {
pkt.clone()
};
if let Err(e) = fwd.send(&outbound_pkt).await {
send_errors += 1; send_errors += 1;
if send_errors <= 5 || send_errors % 100 == 0 { if send_errors <= 5 || send_errors % 100 == 0 {
warn!( warn!(
@@ -1629,13 +1867,24 @@ async fn run_participant_trunked(
"trunked send error: {e}" "trunked send error: {e}"
); );
} }
} else {
actual_fan_out += 1;
} }
} }
ParticipantSender::WebSocket(_) => { ParticipantSender::WebSocket(_) => {
let _ = other.send_raw(&pkt.payload).await; let _ = other.send_raw(&pkt.payload).await;
actual_fan_out += 1;
} }
} }
} }
if let Some(ref tap) = debug_tap {
if tap.matches(&room_name) {
tap.log_packet(&room_name, "in", &addr, &pkt, actual_fan_out);
}
}
if let Some(ref mut ts) = tap_stats {
ts.record_in(&pkt, actual_fan_out);
}
let fwd_ms = fwd_start.elapsed().as_millis() as u64; let fwd_ms = fwd_start.elapsed().as_millis() as u64;
if fwd_ms > max_forward_ms { if fwd_ms > max_forward_ms {
max_forward_ms = fwd_ms; max_forward_ms = fwd_ms;
@@ -1645,12 +1894,12 @@ async fn run_participant_trunked(
room = %room_name, room = %room_name,
participant = participant_id, participant = participant_id,
fwd_ms, fwd_ms,
fan_out = others.len(), fan_out = actual_fan_out,
"slow forward (trunked)" "slow forward (trunked)"
); );
} }
let fan_out = others.len() as u64; let fan_out = actual_fan_out as u64;
metrics.packets_forwarded.inc_by(fan_out); metrics.packets_forwarded.inc_by(fan_out);
metrics.bytes_forwarded.inc_by(pkt_bytes * fan_out); metrics.bytes_forwarded.inc_by(pkt_bytes * fan_out);
packets_forwarded += 1; packets_forwarded += 1;
@@ -1669,6 +1918,10 @@ async fn run_participant_trunked(
send_errors, send_errors,
"participant stats (trunked)" "participant stats (trunked)"
); );
if let (Some(tap), Some(ts)) = (&debug_tap, &mut tap_stats) {
tap.log_stats(&room_name, ts);
ts.reset_period();
}
max_recv_gap_ms = 0; max_recv_gap_ms = 0;
max_forward_ms = 0; max_forward_ms = 0;
last_log_instant = std::time::Instant::now(); last_log_instant = std::time::Instant::now();
@@ -1727,6 +1980,72 @@ mod tests {
assert!(mgr.list().is_empty()); assert!(mgr.list().is_empty());
} }
#[test]
fn join_replaces_existing_fingerprint_in_same_room() {
let mgr = RoomManager::new();
let addr: std::net::SocketAddr = "127.0.0.1:10000".parse().unwrap();
let (tx1, _rx1) = tokio::sync::mpsc::channel(1);
let (tx2, _rx2) = tokio::sync::mpsc::channel(1);
let (first_id, _, _, _) = mgr
.join(
"room",
addr,
ParticipantSender::WebSocket(tx1),
Some("fp-a"),
Some("old"),
)
.unwrap();
let (second_id, update, _, _) = mgr
.join(
"room",
addr,
ParticipantSender::WebSocket(tx2),
Some("fp-a"),
Some("new"),
)
.unwrap();
assert_ne!(first_id, second_id);
assert!(!mgr.contains_participant("room", first_id));
assert!(mgr.contains_participant("room", second_id));
assert_eq!(mgr.room_size("room"), 1);
if let wzp_proto::SignalMessage::RoomUpdate {
count,
participants,
..
} = update
{
assert_eq!(count, 1);
assert_eq!(participants[0].fingerprint, "fp-a");
assert_eq!(participants[0].alias.as_deref(), Some("new"));
} else {
panic!("expected RoomUpdate");
}
}
#[test]
fn outbound_video_stream_ids_are_sender_distinct_and_nonzero() {
assert_eq!(outbound_video_stream_id(1), 1);
assert_eq!(outbound_video_stream_id(2), 2);
assert_eq!(outbound_video_stream_id(250), 250);
assert_eq!(outbound_video_stream_id(251), 1);
}
#[test]
fn rewrite_only_changes_video_stream_id() {
let mut video = make_test_packet(b"video");
video.header.media_type = wzp_proto::MediaType::Video;
video.header.stream_id = 0;
let rewritten = with_outbound_video_stream_id(&video, 42);
assert_eq!(rewritten.header.stream_id, 42);
assert_eq!(video.header.stream_id, 0);
let audio = make_test_packet(b"audio");
let rewritten_audio = with_outbound_video_stream_id(&audio, 42);
assert_eq!(rewritten_audio.header.stream_id, audio.header.stream_id);
}
#[test] #[test]
fn acl_open_mode_allows_all() { fn acl_open_mode_allows_all() {
let mgr = RoomManager::new(); let mgr = RoomManager::new();


@@ -19,7 +19,7 @@ shiguredo_svt_av1 = "2026.1.0"
shiguredo_video_toolbox = "2026.1" shiguredo_video_toolbox = "2026.1"
[target.'cfg(target_os = "android")'.dependencies] [target.'cfg(target_os = "android")'.dependencies]
ndk = { version = "0.9", features = ["media"] } ndk = { version = "0.9", features = ["api-level-28", "media"] }
[dev-dependencies] [dev-dependencies]
rand = "0.8" rand = "0.8"


@@ -12,4 +12,9 @@ pub trait VideoDecoder: Send {
/// Returns `Ok(Some(frame))` when a frame is ready, `Ok(None)` if more /// Returns `Ok(Some(frame))` when a frame is ready, `Ok(None)` if more
/// data is needed (e.g., for reordering), or an error. /// data is needed (e.g., for reordering), or an error.
fn decode(&mut self, access_unit: &[u8]) -> Result<Option<VideoFrame>, VideoError>; fn decode(&mut self, access_unit: &[u8]) -> Result<Option<VideoFrame>, VideoError>;
/// Compact implementation-specific state useful for field diagnostics.
fn debug_snapshot(&self) -> Option<String> {
None
}
} }


@@ -49,6 +49,11 @@ pub trait VideoEncoder: Send {
/// ///
/// Default implementation is a no-op. /// Default implementation is a no-op.
fn set_mode(&mut self, _mode: crate::EncoderMode) {} fn set_mode(&mut self, _mode: crate::EncoderMode) {}
/// Optional platform-specific encoder state for debug logs.
fn debug_snapshot(&self) -> Option<String> {
None
}
} }
/// Raw video frame input for encoding. /// Raw video frame input for encoding.


@@ -27,6 +27,8 @@ pub struct MediaCodecEncoder {
width: u32, width: u32,
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
height: u32, height: u32,
#[cfg(target_os = "android")]
input_format_logged: bool,
force_keyframe: bool, force_keyframe: bool,
#[cfg(not(target_os = "android"))] #[cfg(not(target_os = "android"))]
_width: u32, _width: u32,
@@ -39,12 +41,18 @@ pub struct MediaCodecEncoder {
/// Android color format constant: YUV 4:2:0 planar (I420). /// Android color format constant: YUV 4:2:0 planar (I420).
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
const COLOR_FORMAT_YUV420_PLANAR: i32 = 19; const COLOR_FORMAT_YUV420_PLANAR: i32 = 19;
/// Android color format constant: YUV 4:2:0 semiplanar (usually NV12).
#[cfg(target_os = "android")]
const COLOR_FORMAT_YUV420_SEMIPLANAR: i32 = 21;
/// Android MediaCodec CBR bitrate mode (MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_CBR). /// Android MediaCodec CBR bitrate mode (MediaCodecInfo.EncoderCapabilities.BITRATE_MODE_CBR).
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
const BITRATE_MODE_CBR: i32 = 2; const BITRATE_MODE_CBR: i32 = 2;
/// AMediaCodec keyframe buffer flag. /// AMediaCodec keyframe buffer flag.
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
const AMEDIACODEC_BUFFER_FLAG_KEY_FRAME: u32 = 1; const AMEDIACODEC_BUFFER_FLAG_KEY_FRAME: u32 = 1;
/// MediaCodec encoder parameter key for forcing the next output frame to be a sync frame.
#[cfg(target_os = "android")]
const MEDIA_CODEC_REQUEST_SYNC_FRAME: &str = "request-sync";
// AMediaCodec is thread-safe; the NonNull inside MediaCodec suppresses auto-Send. // AMediaCodec is thread-safe; the NonNull inside MediaCodec suppresses auto-Send.
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
@@ -61,8 +69,8 @@ impl MediaCodecEncoder {
format.set_i32("height", height as i32); format.set_i32("height", height as i32);
format.set_i32("bitrate", bitrate_bps as i32); format.set_i32("bitrate", bitrate_bps as i32);
format.set_i32("frame-rate", 30); format.set_i32("frame-rate", 30);
format.set_i32("i-frame-interval", 1); format.set_i32("i-frame-interval", 4);
format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR); format.set_i32("color-format", COLOR_FORMAT_YUV420_SEMIPLANAR);
let codec = MediaCodec::from_encoder_type("video/avc").ok_or_else(|| { let codec = MediaCodec::from_encoder_type("video/avc").ok_or_else(|| {
VideoError::PlatformError("AMediaCodec_createEncoderByType failed".into()) VideoError::PlatformError("AMediaCodec_createEncoderByType failed".into())
@@ -80,6 +88,7 @@ impl MediaCodecEncoder {
codec, codec,
width, width,
height, height,
input_format_logged: false,
force_keyframe: false, force_keyframe: false,
}) })
} }
@@ -114,21 +123,52 @@ impl VideoEncoder for MediaCodecEncoder {
.dequeue_input_buffer(std::time::Duration::from_millis(10)) .dequeue_input_buffer(std::time::Duration::from_millis(10))
{ {
Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => { Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => {
let flags = if self.force_keyframe { if self.force_keyframe {
AMEDIACODEC_BUFFER_FLAG_KEY_FRAME self.request_sync_frame();
} else { }
0 let layout = encoder_input_layout(&self.codec, self.width, self.height);
}; if !self.input_format_logged {
self.input_format_logged = true;
log_media_codec_input_format("h264_encoder_input", &self.codec, &layout);
}
let input_capacity = { buffer.buffer_mut().len() };
let mut input = i420_to_encoder_input(
&frame.data,
self.width as usize,
self.height as usize,
&layout,
COLOR_FORMAT_YUV420_SEMIPLANAR,
)?;
if input.len() > input_capacity {
tracing::warn!(
target: "wzp_video::mediacodec",
padded_len = input.len(),
input_capacity,
"MediaCodec H.264 input buffer smaller than padded layout; falling back to tight NV12"
);
let tight_layout = EncoderInputLayout {
color_format: layout.color_format,
stride: self.width as usize,
slice_height: self.height as usize,
};
input = i420_to_encoder_input(
&frame.data,
self.width as usize,
self.height as usize,
&tight_layout,
COLOR_FORMAT_YUV420_SEMIPLANAR,
)?;
}
let to_copy = { let to_copy = {
let buf = buffer.buffer_mut(); let buf = buffer.buffer_mut();
let n = frame.data.len().min(buf.len()); let n = input.len().min(buf.len());
for (d, &s) in buf[..n].iter_mut().zip(frame.data[..n].iter()) { for (d, &s) in buf[..n].iter_mut().zip(input[..n].iter()) {
d.write(s); d.write(s);
} }
n n
}; };
self.codec self.codec
.queue_input_buffer(buffer, 0, to_copy, frame.timestamp_ms as u64 * 1000, flags) .queue_input_buffer(buffer, 0, to_copy, frame.timestamp_ms as u64 * 1000, 0)
.map_err(|e| { .map_err(|e| {
VideoError::PlatformError(format!("queue_input_buffer failed: {e}")) VideoError::PlatformError(format!("queue_input_buffer failed: {e}"))
})?; })?;
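The body of `i420_to_encoder_input` is not part of this diff, so the following is only a sketch of what a tight-I420 to padded-NV12 conversion could look like. It assumes the common semiplanar layout in which the Y plane occupies `stride * slice_height` bytes and the interleaved UV plane follows with the same row stride; `i420_to_nv12_strided` and its parameters are hypothetical, and odd dimensions are rejected for simplicity.

```rust
/// Converts tightly packed I420 into NV12 with the given row stride and
/// slice height. Returns None when the dimensions or buffer are inconsistent.
/// Hypothetical helper; layout assumptions are described in the lead-in.
fn i420_to_nv12_strided(
    i420: &[u8],
    width: usize,
    height: usize,
    stride: usize,
    slice_height: usize,
) -> Option<Vec<u8>> {
    if width == 0 || height == 0 || width % 2 != 0 || height % 2 != 0 {
        return None;
    }
    if stride < width || slice_height < height {
        return None;
    }
    let (cw, ch) = (width / 2, height / 2);
    let (y_len, c_len) = (width * height, cw * ch);
    if i420.len() < y_len + 2 * c_len {
        return None;
    }
    let (y, rest) = i420.split_at(y_len);
    let (u, v) = rest.split_at(c_len);

    // UV plane starts after the padded Y plane.
    let uv_offset = stride * slice_height;
    let mut out = vec![0u8; uv_offset + stride * ch];

    // Copy luma row by row so each destination row starts on a stride boundary.
    for row in 0..height {
        out[row * stride..row * stride + width]
            .copy_from_slice(&y[row * width..(row + 1) * width]);
    }
    // Interleave chroma as U0 V0 U1 V1 ... per row.
    for row in 0..ch {
        let base = uv_offset + row * stride;
        for col in 0..cw {
            out[base + 2 * col] = u[row * cw + col];
            out[base + 2 * col + 1] = v[row * cw + col];
        }
    }
    Some(out)
}
```

Passing `stride == width` and `slice_height == height` yields the tight NV12 case that the fallback path in the diff falls back to when the padded frame does not fit the codec's input buffer.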
@@ -160,13 +200,25 @@ impl VideoEncoder for MediaCodecEncoder {
if packet.is_empty() { if packet.is_empty() {
return false; return false;
} }
let nal_type = packet[0] & 0x1F; let nals = split_annex_b(packet);
nal_type == 5 if nals.is_empty() {
return (packet[0] & 0x1F) == 5;
}
nals.iter()
.any(|nal| !nal.is_empty() && (nal[0] & 0x1F) == 5)
} }
} }
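The old `is_keyframe` check looked only at the first byte of the packet, which is a start-code zero for Annex-B output and belongs to the SPS when parameter sets precede the IDR slice. The sketch below shows one way to scan every NAL unit instead. `split_annex_b` is called in the diff but its body is not shown, so `split_annex_b_sketch` here is an assumed stand-in, and `contains_idr` is a hypothetical name for the check.

```rust
/// Splits an Annex-B byte stream into NAL unit payloads with start codes
/// removed. Assumed stand-in for the `split_annex_b` helper used in the diff.
fn split_annex_b_sketch(data: &[u8]) -> Vec<&[u8]> {
    // Record (start_code_position, payload_start) for each 00 00 01 pattern.
    let mut starts: Vec<(usize, usize)> = Vec::new();
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            // A 4-byte start code has one extra leading zero byte.
            let sc = if i > 0 && data[i - 1] == 0 { i - 1 } else { i };
            starts.push((sc, i + 3));
            i += 3;
        } else {
            i += 1;
        }
    }
    let mut nals = Vec::new();
    for (k, &(_, begin)) in starts.iter().enumerate() {
        // A payload ends where the next start code begins, or at end of data.
        let end = starts.get(k + 1).map_or(data.len(), |&(sc, _)| sc);
        if begin < end {
            nals.push(&data[begin..end]);
        }
    }
    nals
}

/// True when any H.264 NAL unit in the packet is an IDR slice (type 5).
fn contains_idr(packet: &[u8]) -> bool {
    if packet.is_empty() {
        return false;
    }
    let nals = split_annex_b_sketch(packet);
    if nals.is_empty() {
        // No start code found: treat the packet as a single raw NAL unit.
        return (packet[0] & 0x1F) == 5;
    }
    nals.iter().any(|nal| (nal[0] & 0x1F) == 5)
}
```

For an access unit laid out as SPS, PPS, IDR, the first-byte check reports a non-keyframe, while the scan finds the IDR slice in the third NAL unit.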
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
impl MediaCodecEncoder { impl MediaCodecEncoder {
fn request_sync_frame(&self) {
let mut params = MediaFormat::new();
params.set_i32(MEDIA_CODEC_REQUEST_SYNC_FRAME, 0);
if let Err(e) = self.codec.set_parameters(params) {
tracing::warn!(error = %e, "AMediaCodec request sync frame failed");
}
}
/// Drain all available output buffers and convert from AVCC to Annex-B. /// Drain all available output buffers and convert from AVCC to Annex-B.
fn drain_output(&mut self) -> Result<Vec<u8>, VideoError> { fn drain_output(&mut self) -> Result<Vec<u8>, VideoError> {
let mut output = Vec::new(); let mut output = Vec::new();
@@ -181,7 +233,7 @@ impl MediaCodecEncoder {
if is_keyframe { if is_keyframe {
self.force_keyframe = false; self.force_keyframe = false;
} }
let data = buffer.buffer().to_vec(); let data = output_buffer_payload(&buffer)?;
output.extend_from_slice(&avcc_to_annexb(&data)); output.extend_from_slice(&avcc_to_annexb(&data));
self.codec self.codec
.release_output_buffer(buffer, false) .release_output_buffer(buffer, false)
@@ -191,7 +243,10 @@ impl MediaCodecEncoder {
} }
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => continue, ) => {
log_media_codec_format("h264_encoder_output", &self.codec);
continue;
}
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged,
) => continue, ) => continue,
@@ -266,6 +321,7 @@ impl VideoDecoder for MediaCodecDecoder {
format.set_str("mime", "video/avc"); format.set_str("mime", "video/avc");
format.set_i32("width", self.width as i32); format.set_i32("width", self.width as i32);
format.set_i32("height", self.height as i32); format.set_i32("height", self.height as i32);
format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR);
format.set_buffer("csd-0", &sps); format.set_buffer("csd-0", &sps);
format.set_buffer("csd-1", &pps); format.set_buffer("csd-1", &pps);
@@ -318,14 +374,12 @@ impl VideoDecoder for MediaCodecDecoder {
// Drain output. // Drain output.
match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) { match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) {
Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => { Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => {
let data = buffer.buffer().to_vec(); let data = decoded_i420_payload(codec, &buffer, self.width, self.height)?;
codec codec.release_output_buffer(buffer, false).map_err(|e| {
.release_output_buffer(buffer, false) VideoError::PlatformError(format!(
.map_err(|e| { "decoder release_output_buffer failed: {e}"
VideoError::PlatformError(format!( ))
"decoder release_output_buffer failed: {e}" })?;
))
})?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -333,6 +387,12 @@ impl VideoDecoder for MediaCodecDecoder {
timestamp_ms: 0, timestamp_ms: 0,
})) }))
} }
Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => {
log_media_codec_format("h264_decoder_output", codec);
Ok(None)
}
Ok(_) => Ok(None), Ok(_) => Ok(None),
Err(e) => Err(VideoError::PlatformError(format!( Err(e) => Err(VideoError::PlatformError(format!(
"decoder dequeue_output_buffer failed: {e}" "decoder dequeue_output_buffer failed: {e}"
@@ -345,6 +405,17 @@ impl VideoDecoder for MediaCodecDecoder {
Err(VideoError::NotInitialized) Err(VideoError::NotInitialized)
} }
} }
fn debug_snapshot(&self) -> Option<String> {
#[cfg(target_os = "android")]
{
media_codec_debug_snapshot(self.codec.as_ref())
}
#[cfg(not(target_os = "android"))]
{
None
}
}
} }
// ============================================================================ // ============================================================================
@@ -361,6 +432,8 @@ pub struct MediaCodecHevcEncoder {
width: u32, width: u32,
#[cfg(target_os = "android")] #[cfg(target_os = "android")]
height: u32, height: u32,
#[cfg(target_os = "android")]
input_format_logged: bool,
force_keyframe: bool, force_keyframe: bool,
#[cfg(not(target_os = "android"))] #[cfg(not(target_os = "android"))]
_width: u32, _width: u32,
@@ -383,7 +456,7 @@ impl MediaCodecHevcEncoder {
format.set_i32("height", height as i32); format.set_i32("height", height as i32);
format.set_i32("bitrate", bitrate_bps as i32); format.set_i32("bitrate", bitrate_bps as i32);
format.set_i32("frame-rate", 30); format.set_i32("frame-rate", 30);
format.set_i32("i-frame-interval", 1); format.set_i32("i-frame-interval", 4);
format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR); format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR);
let codec = MediaCodec::from_encoder_type("video/hevc").ok_or_else(|| { let codec = MediaCodec::from_encoder_type("video/hevc").ok_or_else(|| {
@@ -402,6 +475,7 @@ impl MediaCodecHevcEncoder {
codec, codec,
width, width,
height, height,
input_format_logged: false,
force_keyframe: false, force_keyframe: false,
}) })
} }
@@ -434,17 +508,60 @@ impl VideoEncoder for MediaCodecHevcEncoder {
.dequeue_input_buffer(std::time::Duration::from_millis(10)) .dequeue_input_buffer(std::time::Duration::from_millis(10))
{ {
Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => { Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => {
let flags = if self.force_keyframe { AMEDIACODEC_BUFFER_FLAG_KEY_FRAME } else { 0 }; let flags = if self.force_keyframe {
AMEDIACODEC_BUFFER_FLAG_KEY_FRAME
} else {
0
};
let layout = encoder_input_layout(&self.codec, self.width, self.height);
if !self.input_format_logged {
self.input_format_logged = true;
log_media_codec_input_format("hevc_encoder_input", &self.codec, &layout);
}
let input_capacity = { buffer.buffer_mut().len() };
let mut input = i420_to_encoder_input(
&frame.data,
self.width as usize,
self.height as usize,
&layout,
COLOR_FORMAT_YUV420_PLANAR,
)?;
if input.len() > input_capacity {
tracing::warn!(
target: "wzp_video::mediacodec",
padded_len = input.len(),
input_capacity,
"MediaCodec HEVC input buffer smaller than padded layout; falling back to tight I420"
);
let tight_layout = EncoderInputLayout {
color_format: layout.color_format,
stride: self.width as usize,
slice_height: self.height as usize,
};
input = i420_to_encoder_input(
&frame.data,
self.width as usize,
self.height as usize,
&tight_layout,
COLOR_FORMAT_YUV420_PLANAR,
)?;
}
let to_copy = { let to_copy = {
let buf = buffer.buffer_mut(); let buf = buffer.buffer_mut();
let n = frame.data.len().min(buf.len()); let n = input.len().min(buf.len());
for (d, &s) in buf[..n].iter_mut().zip(frame.data[..n].iter()) { for (d, &s) in buf[..n].iter_mut().zip(input[..n].iter()) {
d.write(s); d.write(s);
} }
n n
}; };
self.codec self.codec
.queue_input_buffer(buffer, 0, to_copy, frame.timestamp_ms as u64 * 1000, flags) .queue_input_buffer(
buffer,
0,
to_copy,
frame.timestamp_ms as u64 * 1000,
flags,
)
.map_err(|e| { .map_err(|e| {
VideoError::PlatformError(format!("queue_input_buffer failed: {e}")) VideoError::PlatformError(format!("queue_input_buffer failed: {e}"))
})?; })?;
@@ -472,11 +589,12 @@ impl VideoEncoder for MediaCodecHevcEncoder {
} }
fn is_keyframe(&self, packet: &[u8]) -> bool { fn is_keyframe(&self, packet: &[u8]) -> bool {
if packet.len() < 2 { let nals = split_annex_b(packet);
return false; if nals.is_empty() {
return packet.len() >= 2 && matches!((packet[0] >> 1) & 0x3F, 19 | 20);
} }
let nal_type = (packet[0] >> 1) & 0x3F; nals.iter()
nal_type == 19 || nal_type == 20 .any(|nal| nal.len() >= 2 && matches!((nal[0] >> 1) & 0x3F, 19 | 20))
} }
} }
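The `is_keyframe` change above reads the 6-bit NAL unit type out of the first header byte of each NAL and treats types 19 and 20 (the two IDR types) as keyframes. A minimal standalone sketch of that bit arithmetic follows; the helper names `hevc_nal_type` and `is_hevc_idr` are hypothetical and do not appear in this commit.

```rust
/// Extracts the 6-bit `nal_unit_type` from the first byte of an HEVC NAL
/// header. Bit 7 is the forbidden zero bit and bit 0 belongs to the layer id,
/// so the type sits in bits 6..1.
fn hevc_nal_type(first_byte: u8) -> u8 {
    (first_byte >> 1) & 0x3F
}

/// Returns true for the two IDR NAL types (19 = IDR_W_RADL, 20 = IDR_N_LP).
/// The HEVC NAL header is two bytes long, so shorter slices are rejected.
fn is_hevc_idr(nal: &[u8]) -> bool {
    nal.len() >= 2 && matches!(hevc_nal_type(nal[0]), 19 | 20)
}
```

For instance, a first byte of `0x26` yields type 19 (IDR), while `0x40` yields type 32 (VPS), which is why an access unit that begins with parameter sets has to be split into NALs before the check.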
@@ -556,7 +674,11 @@ impl VideoEncoder for MediaCodecAv1Encoder {
.dequeue_input_buffer(std::time::Duration::from_millis(0)) .dequeue_input_buffer(std::time::Duration::from_millis(0))
{ {
Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => { Ok(ndk::media::media_codec::DequeuedInputBufferResult::Buffer(mut buffer)) => {
let flags = if self.force_keyframe { AMEDIACODEC_BUFFER_FLAG_KEY_FRAME } else { 0 }; let flags = if self.force_keyframe {
AMEDIACODEC_BUFFER_FLAG_KEY_FRAME
} else {
0
};
let to_copy = { let to_copy = {
let buf = buffer.buffer_mut(); let buf = buffer.buffer_mut();
let n = frame.data.len().min(buf.len()); let n = frame.data.len().min(buf.len());
@@ -566,7 +688,13 @@ impl VideoEncoder for MediaCodecAv1Encoder {
n n
}; };
self.codec self.codec
.queue_input_buffer(buffer, 0, to_copy, frame.timestamp_ms as u64 * 1000, flags) .queue_input_buffer(
buffer,
0,
to_copy,
frame.timestamp_ms as u64 * 1000,
flags,
)
.map_err(|e| { .map_err(|e| {
VideoError::PlatformError(format!( VideoError::PlatformError(format!(
"AV1 encoder queue_input_buffer failed: {e}" "AV1 encoder queue_input_buffer failed: {e}"
@@ -615,7 +743,7 @@ impl MediaCodecHevcEncoder {
if is_keyframe { if is_keyframe {
self.force_keyframe = false; self.force_keyframe = false;
} }
let data = buffer.buffer().to_vec(); let data = output_buffer_payload(&buffer)?;
output.extend_from_slice(&avcc_to_annexb(&data)); output.extend_from_slice(&avcc_to_annexb(&data));
self.codec self.codec
.release_output_buffer(buffer, false) .release_output_buffer(buffer, false)
@@ -625,7 +753,10 @@ impl MediaCodecHevcEncoder {
} }
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => continue, ) => {
log_media_codec_format("hevc_encoder_output", &self.codec);
continue;
}
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged,
) => continue, ) => continue,
@@ -657,7 +788,7 @@ impl MediaCodecAv1Encoder {
self.force_keyframe = false; self.force_keyframe = false;
} }
// AV1 output from MediaCodec is already in OBU format. // AV1 output from MediaCodec is already in OBU format.
let data = buffer.buffer().to_vec(); let data = output_buffer_payload(&buffer)?;
output.extend_from_slice(&data); output.extend_from_slice(&data);
self.codec self.codec
.release_output_buffer(buffer, false) .release_output_buffer(buffer, false)
@@ -669,7 +800,10 @@ impl MediaCodecAv1Encoder {
} }
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => continue, ) => {
log_media_codec_format("av1_encoder_output", &self.codec);
continue;
}
Ok( Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged, ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputBuffersChanged,
) => continue, ) => continue,
@@ -742,6 +876,7 @@ impl VideoDecoder for MediaCodecHevcDecoder {
format.set_str("mime", "video/hevc"); format.set_str("mime", "video/hevc");
format.set_i32("width", self.width as i32); format.set_i32("width", self.width as i32);
format.set_i32("height", self.height as i32); format.set_i32("height", self.height as i32);
format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR);
format.set_buffer("csd-0", &vps); format.set_buffer("csd-0", &vps);
format.set_buffer("csd-1", &sps); format.set_buffer("csd-1", &sps);
format.set_buffer("csd-2", &pps); format.set_buffer("csd-2", &pps);
@@ -795,14 +930,12 @@ impl VideoDecoder for MediaCodecHevcDecoder {
match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) { match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) {
Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => { Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => {
let data = buffer.buffer().to_vec(); let data = decoded_i420_payload(codec, &buffer, self.width, self.height)?;
codec codec.release_output_buffer(buffer, false).map_err(|e| {
.release_output_buffer(buffer, false) VideoError::PlatformError(format!(
.map_err(|e| { "decoder release_output_buffer failed: {e}"
VideoError::PlatformError(format!( ))
"decoder release_output_buffer failed: {e}" })?;
))
})?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -810,6 +943,12 @@ impl VideoDecoder for MediaCodecHevcDecoder {
timestamp_ms: 0, timestamp_ms: 0,
})) }))
} }
Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => {
log_media_codec_format("hevc_decoder_output", codec);
Ok(None)
}
Ok(_) => Ok(None), Ok(_) => Ok(None),
Err(e) => Err(VideoError::PlatformError(format!( Err(e) => Err(VideoError::PlatformError(format!(
"decoder dequeue_output_buffer failed: {e}" "decoder dequeue_output_buffer failed: {e}"
@@ -822,6 +961,17 @@ impl VideoDecoder for MediaCodecHevcDecoder {
Err(VideoError::NotInitialized) Err(VideoError::NotInitialized)
} }
} }
fn debug_snapshot(&self) -> Option<String> {
#[cfg(target_os = "android")]
{
media_codec_debug_snapshot(self.codec.as_ref())
}
#[cfg(not(target_os = "android"))]
{
None
}
}
} }
/// Android MediaCodec AV1 decoder. /// Android MediaCodec AV1 decoder.
@@ -881,6 +1031,7 @@ impl VideoDecoder for MediaCodecAv1Decoder {
format.set_str("mime", "video/av01"); format.set_str("mime", "video/av01");
format.set_i32("width", self.width as i32); format.set_i32("width", self.width as i32);
format.set_i32("height", self.height as i32); format.set_i32("height", self.height as i32);
format.set_i32("color-format", COLOR_FORMAT_YUV420_PLANAR);
format.set_buffer("csd-0", &seq_header); format.set_buffer("csd-0", &seq_header);
let codec = MediaCodec::from_decoder_type("video/av01").ok_or_else(|| { let codec = MediaCodec::from_decoder_type("video/av01").ok_or_else(|| {
@@ -930,14 +1081,12 @@ impl VideoDecoder for MediaCodecAv1Decoder {
match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) { match codec.dequeue_output_buffer(std::time::Duration::from_millis(10)) {
Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => { Ok(ndk::media::media_codec::DequeuedOutputBufferInfoResult::Buffer(buffer)) => {
let data = buffer.buffer().to_vec(); let data = decoded_i420_payload(codec, &buffer, self.width, self.height)?;
codec codec.release_output_buffer(buffer, false).map_err(|e| {
.release_output_buffer(buffer, false) VideoError::PlatformError(format!(
.map_err(|e| { "AV1 decoder release_output_buffer failed: {e}"
VideoError::PlatformError(format!( ))
"AV1 decoder release_output_buffer failed: {e}" })?;
))
})?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -945,6 +1094,12 @@ impl VideoDecoder for MediaCodecAv1Decoder {
timestamp_ms: 0, timestamp_ms: 0,
})) }))
} }
Ok(
ndk::media::media_codec::DequeuedOutputBufferInfoResult::OutputFormatChanged,
) => {
log_media_codec_format("av1_decoder_output", codec);
Ok(None)
}
Ok(_) => Ok(None), Ok(_) => Ok(None),
Err(e) => Err(VideoError::PlatformError(format!( Err(e) => Err(VideoError::PlatformError(format!(
"AV1 decoder dequeue_output_buffer failed: {e}" "AV1 decoder dequeue_output_buffer failed: {e}"
@@ -957,6 +1112,453 @@ impl VideoDecoder for MediaCodecAv1Decoder {
Err(VideoError::NotInitialized) Err(VideoError::NotInitialized)
} }
} }
fn debug_snapshot(&self) -> Option<String> {
#[cfg(target_os = "android")]
{
media_codec_debug_snapshot(self.codec.as_ref())
}
#[cfg(not(target_os = "android"))]
{
None
}
}
}
#[cfg(target_os = "android")]
fn media_codec_debug_snapshot(codec: Option<&MediaCodec>) -> Option<String> {
let codec = codec?;
let format = codec.output_format();
Some(format!(
"color_format={:?} width={:?} height={:?} stride={:?} slice_height={:?} crop=({:?},{:?},{:?},{:?})",
format.i32("color-format"),
format.i32("width"),
format.i32("height"),
format.i32("stride"),
format.i32("slice-height"),
format.i32("crop-left"),
format.i32("crop-top"),
format.i32("crop-right"),
format.i32("crop-bottom"),
))
}
#[cfg(target_os = "android")]
fn output_buffer_payload(
buffer: &ndk::media::media_codec::OutputBuffer<'_>,
) -> Result<Vec<u8>, VideoError> {
let info = buffer.info();
let offset = usize::try_from(info.offset()).map_err(|_| {
VideoError::PlatformError(format!(
"negative MediaCodec output offset: {}",
info.offset()
))
})?;
let size = usize::try_from(info.size()).map_err(|_| {
VideoError::PlatformError(format!("negative MediaCodec output size: {}", info.size()))
})?;
let end = offset.checked_add(size).ok_or_else(|| {
VideoError::PlatformError(format!(
"MediaCodec output range overflow: offset={offset} size={size}"
))
})?;
let raw = buffer.buffer();
if end > raw.len() {
return Err(VideoError::PlatformError(format!(
"MediaCodec output range outside buffer: offset={offset} size={size} buffer_len={}",
raw.len()
)));
}
Ok(raw[offset..end].to_vec())
}
#[cfg(target_os = "android")]
fn decoded_i420_payload(
codec: &MediaCodec,
buffer: &ndk::media::media_codec::OutputBuffer<'_>,
width: u32,
height: u32,
) -> Result<Vec<u8>, VideoError> {
let payload = output_buffer_payload(buffer)?;
let format = codec.output_format();
let color_format = format
.i32("color-format")
.unwrap_or(COLOR_FORMAT_YUV420_PLANAR);
let stride = positive_format_usize(&format, "stride").unwrap_or(width as usize);
let slice_height = positive_format_usize(&format, "slice-height").unwrap_or(height as usize);
match color_format {
COLOR_FORMAT_YUV420_PLANAR => yuv420_planar_to_tight_i420(
&payload,
width as usize,
height as usize,
stride,
slice_height,
),
COLOR_FORMAT_YUV420_SEMIPLANAR => yuv420_semiplanar_to_tight_i420(
&payload,
width as usize,
height as usize,
stride,
slice_height,
),
_ => {
let expected = i420_len(width as usize, height as usize)?;
if payload.len() < expected {
return Err(VideoError::PlatformError(format!(
"unsupported MediaCodec color format {color_format} produced {} bytes, expected at least {expected}",
payload.len()
)));
}
let mut data = payload;
data.truncate(expected);
Ok(data)
}
}
}
#[cfg(target_os = "android")]
fn positive_format_usize(format: &MediaFormat, key: &str) -> Option<usize> {
let value = format.i32(key)?;
(value > 0).then_some(value as usize)
}
#[cfg(target_os = "android")]
#[derive(Clone, Copy, Debug)]
struct EncoderInputLayout {
color_format: Option<i32>,
stride: usize,
slice_height: usize,
}
#[cfg(target_os = "android")]
fn encoder_input_layout(codec: &MediaCodec, width: u32, height: u32) -> EncoderInputLayout {
let format = codec.input_format();
let width = width as usize;
let height = height as usize;
EncoderInputLayout {
color_format: format.i32("color-format"),
stride: positive_format_usize(&format, "stride")
.unwrap_or(width)
.max(width),
slice_height: positive_format_usize(&format, "slice-height")
.unwrap_or(height)
.max(height),
}
}
#[cfg(target_os = "android")]
fn log_media_codec_input_format(label: &str, codec: &MediaCodec, layout: &EncoderInputLayout) {
let input_format = codec.input_format();
let output_format = codec.output_format();
tracing::info!(
target: "wzp_video::mediacodec",
label,
input_color_format = input_format.i32("color-format"),
input_width = input_format.i32("width"),
input_height = input_format.i32("height"),
input_stride = input_format.i32("stride"),
input_slice_height = input_format.i32("slice-height"),
output_color_format = output_format.i32("color-format"),
output_width = output_format.i32("width"),
output_height = output_format.i32("height"),
output_stride = output_format.i32("stride"),
output_slice_height = output_format.i32("slice-height"),
effective_color_format = layout.color_format,
effective_stride = layout.stride,
effective_slice_height = layout.slice_height,
"MediaCodec input format"
);
}
#[cfg(target_os = "android")]
fn i420_to_encoder_input(
src: &[u8],
width: usize,
height: usize,
layout: &EncoderInputLayout,
default_color_format: i32,
) -> Result<Vec<u8>, VideoError> {
let color_format = layout.color_format.unwrap_or(default_color_format);
match color_format {
COLOR_FORMAT_YUV420_PLANAR => {
i420_to_padded_planar(src, width, height, layout.stride, layout.slice_height)
}
COLOR_FORMAT_YUV420_SEMIPLANAR => {
i420_to_padded_nv12(src, width, height, layout.stride, layout.slice_height)
}
other => {
tracing::warn!(
target: "wzp_video::mediacodec",
color_format = other,
default_color_format,
"unsupported MediaCodec encoder input color format; using requested default"
);
if default_color_format == COLOR_FORMAT_YUV420_PLANAR {
i420_to_padded_planar(src, width, height, layout.stride, layout.slice_height)
} else {
i420_to_padded_nv12(src, width, height, layout.stride, layout.slice_height)
}
}
}
}
#[cfg(target_os = "android")]
fn log_media_codec_format(label: &str, codec: &MediaCodec) {
let format = codec.output_format();
tracing::info!(
target: "wzp_video::mediacodec",
label,
color_format = format.i32("color-format"),
width = format.i32("width"),
height = format.i32("height"),
stride = format.i32("stride"),
slice_height = format.i32("slice-height"),
crop_left = format.i32("crop-left"),
crop_right = format.i32("crop-right"),
crop_top = format.i32("crop-top"),
crop_bottom = format.i32("crop-bottom"),
"MediaCodec output format changed"
);
}
#[cfg(target_os = "android")]
fn i420_len(width: usize, height: usize) -> Result<usize, VideoError> {
width
.checked_mul(height)
.and_then(|y| y.checked_add(y / 2))
.ok_or_else(|| {
VideoError::InvalidInput(format!("invalid I420 dimensions {width}x{height}"))
})
}
#[cfg(target_os = "android")]
fn i420_to_padded_nv12(
src: &[u8],
width: usize,
height: usize,
stride: usize,
slice_height: usize,
) -> Result<Vec<u8>, VideoError> {
let y_size = width.checked_mul(height).ok_or_else(|| {
VideoError::InvalidInput(format!("invalid frame dimensions {width}x{height}"))
})?;
let uv_size = y_size / 4;
let expected = y_size + uv_size * 2;
if src.len() < expected {
return Err(VideoError::InvalidInput(format!(
"I420 frame too small for NV12 conversion: {} bytes, expected {expected}",
src.len()
)));
}
if stride < width || slice_height < height {
return Err(VideoError::InvalidInput(format!(
"invalid encoder input layout {stride}x{slice_height} for {width}x{height}"
)));
}
let chroma_width = width / 2;
let chroma_height = height / 2;
let y_stride = stride;
let uv_stride = stride;
let y_slice_height = slice_height;
let uv_slice_height = (slice_height / 2).max(chroma_height);
let y_padded_size = y_stride.checked_mul(y_slice_height).ok_or_else(|| {
VideoError::InvalidInput(format!(
"invalid padded Y layout {y_stride}x{y_slice_height}"
))
})?;
let uv_padded_size = uv_stride.checked_mul(uv_slice_height).ok_or_else(|| {
VideoError::InvalidInput(format!(
"invalid padded UV layout {uv_stride}x{uv_slice_height}"
))
})?;
let total = y_padded_size
.checked_add(uv_padded_size)
.ok_or_else(|| VideoError::InvalidInput("padded NV12 size overflow".into()))?;
let mut out = vec![0u8; total];
out[y_padded_size..].fill(128);
for row in 0..height {
let src_off = row * width;
let dst_off = row * y_stride;
out[dst_off..dst_off + width].copy_from_slice(&src[src_off..src_off + width]);
}
let u = &src[y_size..y_size + uv_size];
let v = &src[y_size + uv_size..y_size + uv_size * 2];
for row in 0..chroma_height {
let src_row = row * chroma_width;
let dst_row = y_padded_size + row * uv_stride;
for col in 0..chroma_width {
out[dst_row + col * 2] = u[src_row + col];
out[dst_row + col * 2 + 1] = v[src_row + col];
}
}
Ok(out)
}
#[cfg(target_os = "android")]
fn i420_to_padded_planar(
src: &[u8],
width: usize,
height: usize,
stride: usize,
slice_height: usize,
) -> Result<Vec<u8>, VideoError> {
let y_size = width.checked_mul(height).ok_or_else(|| {
VideoError::InvalidInput(format!("invalid frame dimensions {width}x{height}"))
})?;
let uv_size = y_size / 4;
let expected = y_size + uv_size * 2;
if src.len() < expected {
return Err(VideoError::InvalidInput(format!(
"I420 frame too small for padded planar copy: {} bytes, expected {expected}",
src.len()
)));
}
if stride < width || slice_height < height {
return Err(VideoError::InvalidInput(format!(
"invalid encoder input layout {stride}x{slice_height} for {width}x{height}"
)));
}
let chroma_width = width / 2;
let chroma_height = height / 2;
let y_stride = stride;
let chroma_stride = (stride / 2).max(chroma_width);
let y_slice_height = slice_height;
let chroma_slice_height = (slice_height / 2).max(chroma_height);
let y_padded_size = y_stride.checked_mul(y_slice_height).ok_or_else(|| {
VideoError::InvalidInput(format!(
"invalid padded Y layout {y_stride}x{y_slice_height}"
))
})?;
let chroma_padded_size = chroma_stride
.checked_mul(chroma_slice_height)
.ok_or_else(|| {
VideoError::InvalidInput(format!(
"invalid padded chroma layout {chroma_stride}x{chroma_slice_height}"
))
})?;
let chroma_total = chroma_padded_size
.checked_mul(2)
.ok_or_else(|| VideoError::InvalidInput("padded I420 chroma size overflow".into()))?;
let total = y_padded_size
.checked_add(chroma_total)
.ok_or_else(|| VideoError::InvalidInput("padded I420 size overflow".into()))?;
let mut out = vec![0u8; total];
out[y_padded_size..].fill(128);
for row in 0..height {
let src_off = row * width;
let dst_off = row * y_stride;
out[dst_off..dst_off + width].copy_from_slice(&src[src_off..src_off + width]);
}
let src_u = y_size;
let src_v = y_size + uv_size;
let dst_u = y_padded_size;
let dst_v = y_padded_size + chroma_padded_size;
for row in 0..chroma_height {
let src_off = row * chroma_width;
let dst_off = row * chroma_stride;
out[dst_u + dst_off..dst_u + dst_off + chroma_width]
.copy_from_slice(&src[src_u + src_off..src_u + src_off + chroma_width]);
out[dst_v + dst_off..dst_v + dst_off + chroma_width]
.copy_from_slice(&src[src_v + src_off..src_v + src_off + chroma_width]);
}
Ok(out)
}
#[cfg(target_os = "android")]
fn yuv420_planar_to_tight_i420(
src: &[u8],
width: usize,
height: usize,
stride: usize,
slice_height: usize,
) -> Result<Vec<u8>, VideoError> {
let y_size = width * height;
let chroma_width = width / 2;
let chroma_height = height / 2;
let chroma_stride = stride / 2;
let chroma_slice_height = slice_height / 2;
let padded_y_size = stride * slice_height;
let padded_chroma_size = chroma_stride * chroma_slice_height;
let required = padded_y_size + padded_chroma_size * 2;
if src.len() < required {
return Err(VideoError::PlatformError(format!(
"planar YUV buffer too small: {} < {required} (stride={stride}, slice_height={slice_height})",
src.len()
)));
}
let mut out = vec![0u8; i420_len(width, height)?];
for row in 0..height {
let src_start = row * stride;
let dst_start = row * width;
out[dst_start..dst_start + width].copy_from_slice(&src[src_start..src_start + width]);
}
let src_u = padded_y_size;
let src_v = src_u + padded_chroma_size;
let dst_u = y_size;
let dst_v = dst_u + chroma_width * chroma_height;
for row in 0..chroma_height {
let src_row = row * chroma_stride;
let dst_row = row * chroma_width;
out[dst_u + dst_row..dst_u + dst_row + chroma_width]
.copy_from_slice(&src[src_u + src_row..src_u + src_row + chroma_width]);
out[dst_v + dst_row..dst_v + dst_row + chroma_width]
.copy_from_slice(&src[src_v + src_row..src_v + src_row + chroma_width]);
}
Ok(out)
}
#[cfg(target_os = "android")]
fn yuv420_semiplanar_to_tight_i420(
src: &[u8],
width: usize,
height: usize,
stride: usize,
slice_height: usize,
) -> Result<Vec<u8>, VideoError> {
let y_size = width * height;
let chroma_width = width / 2;
let chroma_height = height / 2;
let padded_y_size = stride * slice_height;
let required = padded_y_size + stride * chroma_height;
if src.len() < required {
return Err(VideoError::PlatformError(format!(
"semiplanar YUV buffer too small: {} < {required} (stride={stride}, slice_height={slice_height})",
src.len()
)));
}
let mut out = vec![0u8; i420_len(width, height)?];
for row in 0..height {
let src_start = row * stride;
let dst_start = row * width;
out[dst_start..dst_start + width].copy_from_slice(&src[src_start..src_start + width]);
}
let dst_u = y_size;
let dst_v = dst_u + chroma_width * chroma_height;
for row in 0..chroma_height {
let src_row = padded_y_size + row * stride;
let dst_row = row * chroma_width;
for col in 0..chroma_width {
let pair = src_row + col * 2;
out[dst_u + dst_row + col] = src[pair];
out[dst_v + dst_row + col] = src[pair + 1];
}
}
Ok(out)
} }
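The conversion helpers above all repeat one step: copy `width` bytes out of each row of a plane whose rows are `stride` bytes apart. The sketch below isolates that step for a single plane. `copy_plane_tight` is a hypothetical name, and unlike the functions in this diff it accepts a final row without trailing padding, which is an assumption rather than something MediaCodec guarantees.

```rust
/// Copies `rows` rows of `width` bytes from a plane whose rows start
/// `stride` bytes apart, returning a tightly packed plane.
/// Returns `None` when the layout is inconsistent or `src` is too short.
fn copy_plane_tight(src: &[u8], width: usize, rows: usize, stride: usize) -> Option<Vec<u8>> {
    if width == 0 || rows == 0 {
        return Some(Vec::new());
    }
    if stride < width {
        return None;
    }
    // Assumption: the last row only needs `width` bytes, not a full stride.
    let needed = stride.checked_mul(rows - 1)?.checked_add(width)?;
    if src.len() < needed {
        return None;
    }
    let mut out = Vec::with_capacity(width * rows);
    for row in src.chunks(stride).take(rows) {
        // Drop the padding bytes that follow the visible pixels of each row.
        out.extend_from_slice(&row[..width]);
    }
    Some(out)
}
```

Under these assumptions a planar frame would be three such calls: luma at `width` x `height` with `stride`, then each chroma plane at half the width and height with half the stride.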
/// Type alias for HEVC parameter-set triple returned by `extract_vps_sps_pps`. /// Type alias for HEVC parameter-set triple returned by `extract_vps_sps_pps`.
@@ -989,8 +1591,13 @@ fn extract_vps_sps_pps(annex_b: &[u8]) -> HevcParameterSets {
/// (4-byte start codes `0x00 0x00 0x00 0x01`). /// (4-byte start codes `0x00 0x00 0x00 0x01`).
#[allow(dead_code)] #[allow(dead_code)]
fn avcc_to_annexb(data: &[u8]) -> Vec<u8> { fn avcc_to_annexb(data: &[u8]) -> Vec<u8> {
if starts_with_annex_b_start_code(data) {
return data.to_vec();
}
let mut out = Vec::with_capacity(data.len() + data.len() / 4); let mut out = Vec::with_capacity(data.len() + data.len() / 4);
let mut offset = 0; let mut offset = 0;
let mut saw_nal = false;
while offset + 4 <= data.len() { while offset + 4 <= data.len() {
let nal_len = u32::from_be_bytes([ let nal_len = u32::from_be_bytes([
data[offset], data[offset],
@@ -1000,15 +1607,20 @@ fn avcc_to_annexb(data: &[u8]) -> Vec<u8> {
]) as usize; ]) as usize;
offset += 4; offset += 4;
if offset + nal_len > data.len() { if offset + nal_len > data.len() {
break; return if saw_nal { out } else { data.to_vec() };
} }
out.extend_from_slice(&[0x00, 0x00, 0x00, 0x01]); out.extend_from_slice(&[0x00, 0x00, 0x00, 0x01]);
out.extend_from_slice(&data[offset..offset + nal_len]); out.extend_from_slice(&data[offset..offset + nal_len]);
offset += nal_len; offset += nal_len;
saw_nal = true;
} }
out out
} }
fn starts_with_annex_b_start_code(data: &[u8]) -> bool {
data.starts_with(&[0x00, 0x00, 0x01]) || data.starts_with(&[0x00, 0x00, 0x00, 0x01])
}
/// Parse an Annex-B access unit and return the first SPS and PPS found. /// Parse an Annex-B access unit and return the first SPS and PPS found.
#[allow(dead_code)] #[allow(dead_code)]
fn extract_sps_pps(annex_b: &[u8]) -> (Option<Vec<u8>>, Option<Vec<u8>>) { fn extract_sps_pps(annex_b: &[u8]) -> (Option<Vec<u8>>, Option<Vec<u8>>) {
@@ -1064,7 +1676,7 @@ fn split_annex_b(data: &[u8]) -> Vec<&[u8]> {
/// Android MediaCodec `csd-0`. /// Android MediaCodec `csd-0`.
#[allow(dead_code)] #[allow(dead_code)]
fn extract_sequence_header_obu(data: &[u8]) -> Option<Vec<u8>> { fn extract_sequence_header_obu(data: &[u8]) -> Option<Vec<u8>> {
use crate::av1_obu::{ObuHeader, read_leb128}; use crate::av1_obu::{read_leb128, ObuHeader};
let mut i = 0usize; let mut i = 0usize;
while i < data.len() { while i < data.len() {
let header = ObuHeader::from_byte(data[i]); let header = ObuHeader::from_byte(data[i]);
@@ -1135,6 +1747,11 @@ mod tests {
}; };
assert!(enc.is_keyframe(&[0x65, 0x01])); assert!(enc.is_keyframe(&[0x65, 0x01]));
assert!(!enc.is_keyframe(&[0x41, 0x01])); assert!(!enc.is_keyframe(&[0x41, 0x01]));
assert!(enc.is_keyframe(&[
0x00, 0x00, 0x00, 0x01, 0x67, 0x01, // SPS
0x00, 0x00, 0x00, 0x01, 0x68, 0x02, // PPS
0x00, 0x00, 0x00, 0x01, 0x65, 0x03, // IDR
]));
} }
#[test] #[test]
@@ -1155,6 +1772,16 @@ mod tests {
assert_eq!(annex_b, expected); assert_eq!(annex_b, expected);
} }
#[test]
fn avcc_to_annexb_passes_through_annexb() {
let annex_b = vec![
0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xC0, 0x1E, 0x00, 0x00, 0x00, 0x01, 0x65, 0x88,
0x84, 0x21,
];
assert_eq!(avcc_to_annexb(&annex_b), annex_b);
}
#[test] #[test]
fn hevc_mediacodec_encoder_returns_not_initialized_on_non_android() { fn hevc_mediacodec_encoder_returns_not_initialized_on_non_android() {
let enc = MediaCodecHevcEncoder::new(1280, 720, 2_000_000); let enc = MediaCodecHevcEncoder::new(1280, 720, 2_000_000);

@@ -28,6 +28,24 @@ use wzp_proto::{CodecId, MediaHeaderV2, MediaPacket, MediaType};
/// 1200 (QUIC MTU) - 16 (MediaHeaderV2) - 16 (AEAD tag) = 1168. /// 1200 (QUIC MTU) - 16 (MediaHeaderV2) - 16 (AEAD tag) = 1168.
pub const VIDEO_MAX_PAYLOAD: usize = 1168; pub const VIDEO_MAX_PAYLOAD: usize = 1168;
const VIDEO_FRAME_META_MAGIC: [u8; 4] = *b"WZV1";
const VIDEO_FRAME_META_LEN: usize = 8;
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct VideoFrameMeta {
pub width: u16,
pub height: u16,
}
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ReassembledVideoFrame {
pub codec_id: CodecId,
pub is_keyframe: bool,
pub width: Option<u16>,
pub height: Option<u16>,
pub data: Vec<u8>,
}
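With the 8-byte metadata prefix introduced here, the number of fragments per frame follows from the payload budget stated in the `VIDEO_MAX_PAYLOAD` comment. A small worked sketch of that arithmetic, assuming the prefix is always prepended to a non-empty frame; `fragment_count` and the local constant names are hypothetical:

```rust
/// Payload budget per packet: 1200 (QUIC MTU) - 16 (header) - 16 (AEAD tag).
const MAX_PAYLOAD: usize = 1200 - 16 - 16;
/// Length of the metadata prefix: 4 magic bytes + 2 width bytes + 2 height bytes.
const META_LEN: usize = 8;

/// Number of fragments needed for an encoded frame of `encoded_len` bytes
/// once the metadata prefix is prepended. Empty frames produce no packets.
fn fragment_count(encoded_len: usize) -> usize {
    if encoded_len == 0 {
        return 0;
    }
    // Ceiling division of the framed length by the per-packet budget.
    (encoded_len + META_LEN + MAX_PAYLOAD - 1) / MAX_PAYLOAD
}
```

The prefix means a frame of 1161 bytes already spills into a second fragment, while 1160 bytes still fits in one.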
/// Fragments one encoded video frame into a sequence of [`MediaPacket`]s. /// Fragments one encoded video frame into a sequence of [`MediaPacket`]s.
/// ///
/// Pass each `MediaPacket` to `transport.send_media()`. /// Pass each `MediaPacket` to `transport.send_media()`.
@@ -37,12 +55,20 @@ pub fn packetize_video_frame(
is_keyframe: bool, is_keyframe: bool,
seq: &mut u32, seq: &mut u32,
timestamp_ms: u32, timestamp_ms: u32,
width: u32,
height: u32,
) -> Vec<MediaPacket> { ) -> Vec<MediaPacket> {
if frame.is_empty() { if frame.is_empty() {
return vec![]; return vec![];
} }
let chunks: Vec<&[u8]> = frame.chunks(VIDEO_MAX_PAYLOAD).collect(); let mut framed = Vec::with_capacity(VIDEO_FRAME_META_LEN + frame.len());
framed.extend_from_slice(&VIDEO_FRAME_META_MAGIC);
framed.extend_from_slice(&(width.min(u16::MAX as u32) as u16).to_be_bytes());
framed.extend_from_slice(&(height.min(u16::MAX as u32) as u16).to_be_bytes());
framed.extend_from_slice(frame);
let chunks: Vec<&[u8]> = framed.chunks(VIDEO_MAX_PAYLOAD).collect();
let total = chunks.len().min(255); let total = chunks.len().min(255);
let mut packets = Vec::with_capacity(total); let mut packets = Vec::with_capacity(total);
@@ -63,7 +89,11 @@ pub fn packetize_video_frame(
flags, flags,
media_type: MediaType::Video, media_type: MediaType::Video,
codec_id, codec_id,
stream_id: 1, // stream 0 = audio, 1 = video // Legacy relays default receivers to video layer 0. Use video stream
// 0 for the single-layer room-video path so packets are forwarded
// before any receiver quality state exists. Audio is separated by
// media_type, so stream_id 0 does not collide with audio packets.
stream_id: 0,
fec_ratio: 0, fec_ratio: 0,
seq: *seq, seq: *seq,
timestamp: timestamp_ms, timestamp: timestamp_ms,
@@ -91,6 +121,7 @@ struct PendingFrame {
fragments: HashMap<u8, Vec<u8>>, fragments: HashMap<u8, Vec<u8>>,
total_fragments: u8, total_fragments: u8,
is_keyframe: bool, is_keyframe: bool,
saw_frame_end: bool,
codec_id: Option<CodecId>, codec_id: Option<CodecId>,
} }
@@ -113,9 +144,8 @@ impl VideoReassembler {
/// Push one received video packet. /// Push one received video packet.
/// ///
/// Returns `Some((codec_id, is_keyframe, frame_bytes))` when a complete /// Returns `Some(frame)` when a complete frame is ready, `None` otherwise.
/// frame is ready, `None` otherwise. pub fn push(&mut self, pkt: &MediaPacket) -> Option<ReassembledVideoFrame> {
pub fn push(&mut self, pkt: &MediaPacket) -> Option<(CodecId, bool, Vec<u8>)> {
let hdr = &pkt.header; let hdr = &pkt.header;
let fragment_index = (hdr.fec_block >> 8) as u8; let fragment_index = (hdr.fec_block >> 8) as u8;
let fragment_count = (hdr.fec_block & 0xFF) as u8; let fragment_count = (hdr.fec_block & 0xFF) as u8;
@@ -131,10 +161,15 @@ impl VideoReassembler {
if is_keyframe { if is_keyframe {
entry.is_keyframe = true; entry.is_keyframe = true;
} }
if is_frame_end {
entry.saw_frame_end = true;
}
entry.codec_id = Some(hdr.codec_id); entry.codec_id = Some(hdr.codec_id);
// Only attempt reassembly once the last fragment has arrived. // Attempt reassembly once we know the frame end has arrived. The end
if !is_frame_end { // fragment can arrive before earlier fragments on QUIC/datagram paths,
// so retry on every later fragment instead of only on the end packet.
if !entry.saw_frame_end {
return None; return None;
} }
@@ -151,7 +186,14 @@ impl VideoReassembler {
for i in 0..total as u8 { for i in 0..total as u8 {
frame.extend_from_slice(pending.fragments.get(&i)?); frame.extend_from_slice(pending.fragments.get(&i)?);
} }
Some((codec_id, pending.is_keyframe, frame)) let (meta, data) = split_video_frame_payload(frame);
Some(ReassembledVideoFrame {
codec_id,
is_keyframe: pending.is_keyframe,
width: meta.map(|m| m.width),
height: meta.map(|m| m.height),
data,
})
} }
/// Evict stale pending frames older than `max_age_ms` milliseconds. /// Evict stale pending frames older than `max_age_ms` milliseconds.
@@ -159,12 +201,22 @@ impl VideoReassembler {
/// Call periodically (e.g. every 2s) to prevent accumulation of frames /// Call periodically (e.g. every 2s) to prevent accumulation of frames
/// whose first or middle fragments were lost. /// whose first or middle fragments were lost.
pub fn evict_stale(&mut self, current_timestamp_ms: u32, max_age_ms: u32) { pub fn evict_stale(&mut self, current_timestamp_ms: u32, max_age_ms: u32) {
self.pending.retain(|&ts, _| { self.pending
current_timestamp_ms.wrapping_sub(ts) <= max_age_ms .retain(|&ts, _| current_timestamp_ms.wrapping_sub(ts) <= max_age_ms);
});
} }
} }
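The `wrapping_sub` in `evict_stale` keeps the age comparison meaningful when the 32-bit millisecond timestamp rolls over. A minimal sketch of the same predicate in isolation, inverted to answer "should this entry be dropped"; `is_stale` is a hypothetical helper, not part of this commit:

```rust
/// Returns true when a pending frame stamped `frame_ms` is older than
/// `max_age_ms` relative to `current_ms`, using modular u32 arithmetic so
/// that a timestamp rollover between the two values does not matter.
fn is_stale(current_ms: u32, frame_ms: u32, max_age_ms: u32) -> bool {
    current_ms.wrapping_sub(frame_ms) > max_age_ms
}
```

One consequence of the modular arithmetic is that a frame stamped slightly ahead of `current_ms` appears almost 2^32 ms old and is treated as stale, so callers would probably want to pass the newest timestamp they have seen.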
fn split_video_frame_payload(mut frame: Vec<u8>) -> (Option<VideoFrameMeta>, Vec<u8>) {
if frame.len() < VIDEO_FRAME_META_LEN || frame[..4] != VIDEO_FRAME_META_MAGIC {
return (None, frame);
}
let width = u16::from_be_bytes([frame[4], frame[5]]);
let height = u16::from_be_bytes([frame[6], frame[7]]);
frame.drain(..VIDEO_FRAME_META_LEN);
(Some(VideoFrameMeta { width, height }), frame)
}
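The metadata prefix parsed above is eight bytes: the ASCII magic `WZV1`, then width and height as big-endian `u16` values. The sketch below shows an encode/decode pair for that layout on its own. The function names `encode_meta` and `decode_meta` are hypothetical, and the decoder borrows the payload instead of draining a `Vec` as the commit does.

```rust
const MAGIC: [u8; 4] = *b"WZV1";

/// Builds the 8-byte prefix: magic, big-endian width, big-endian height.
fn encode_meta(width: u16, height: u16) -> [u8; 8] {
    let mut out = [0u8; 8];
    out[..4].copy_from_slice(&MAGIC);
    out[4..6].copy_from_slice(&width.to_be_bytes());
    out[6..8].copy_from_slice(&height.to_be_bytes());
    out
}

/// Splits a reassembled payload into (width, height, encoded frame bytes).
/// Returns `None` when the prefix is absent, e.g. for a sender that does
/// not emit metadata.
fn decode_meta(payload: &[u8]) -> Option<(u16, u16, &[u8])> {
    if payload.len() < 8 || payload[..4] != MAGIC {
        return None;
    }
    let width = u16::from_be_bytes([payload[4], payload[5]]);
    let height = u16::from_be_bytes([payload[6], payload[7]]);
    Some((width, height, &payload[8..]))
}
```

For 960x540 the prefix bytes are `57 5A 56 31 03 C0 02 1C`. A payload that starts with an Annex-B start code fails the magic check and is passed through unchanged, which is how the commit stays compatible with frames that carry no metadata.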
impl Default for VideoReassembler { impl Default for VideoReassembler {
fn default() -> Self { fn default() -> Self {
Self::new() Self::new()
@@ -183,26 +235,37 @@ mod tests {
fn single_fragment_roundtrip() { fn single_fragment_roundtrip() {
let frame = make_frame(100); let frame = make_frame(100);
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&frame, CodecId::Av1Main, true, &mut seq, 1000); let pkts = packetize_video_frame(&frame, CodecId::Av1Main, true, &mut seq, 1000, 640, 480);
assert_eq!(pkts.len(), 1); assert_eq!(pkts.len(), 1);
assert!(pkts[0].header.is_keyframe()); assert!(pkts[0].header.is_keyframe());
assert!(pkts[0].header.is_frame_end()); assert!(pkts[0].header.is_frame_end());
assert_eq!(pkts[0].header.media_type, MediaType::Video); assert_eq!(pkts[0].header.media_type, MediaType::Video);
assert_eq!(pkts[0].header.stream_id, 0);
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
let result = reassembler.push(&pkts[0]); let result = reassembler.push(&pkts[0]);
assert!(result.is_some()); assert!(result.is_some());
let (codec, is_kf, data) = result.unwrap(); let result = result.unwrap();
assert_eq!(codec, CodecId::Av1Main); assert_eq!(result.codec_id, CodecId::Av1Main);
assert!(is_kf); assert!(result.is_keyframe);
assert_eq!(data, frame); assert_eq!(result.width, Some(640));
assert_eq!(result.height, Some(480));
assert_eq!(result.data, frame);
} }
#[test] #[test]
fn multi_fragment_roundtrip() { fn multi_fragment_roundtrip() {
let frame = make_frame(VIDEO_MAX_PAYLOAD * 3 + 50); let frame = make_frame(VIDEO_MAX_PAYLOAD * 3 + 50);
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&frame, CodecId::H264Baseline, false, &mut seq, 2000); let pkts = packetize_video_frame(
&frame,
CodecId::H264Baseline,
false,
&mut seq,
2000,
960,
540,
);
assert_eq!(pkts.len(), 4); assert_eq!(pkts.len(), 4);
assert!(!pkts[0].header.is_frame_end()); assert!(!pkts[0].header.is_frame_end());
assert!(pkts[3].header.is_frame_end()); assert!(pkts[3].header.is_frame_end());
@@ -213,34 +276,66 @@ mod tests {
for pkt in &pkts { for pkt in &pkts {
result = reassembler.push(pkt); result = reassembler.push(pkt);
} }
let (codec, is_kf, data) = result.unwrap(); let result = result.unwrap();
assert_eq!(codec, CodecId::H264Baseline); assert_eq!(result.codec_id, CodecId::H264Baseline);
assert!(!is_kf); assert!(!result.is_keyframe);
assert_eq!(data, frame); assert_eq!(result.width, Some(960));
assert_eq!(result.height, Some(540));
assert_eq!(result.data, frame);
} }
#[test] #[test]
fn out_of_order_delivery() { fn out_of_order_delivery() {
let frame = make_frame(VIDEO_MAX_PAYLOAD * 2 + 100); let frame = make_frame(VIDEO_MAX_PAYLOAD * 2 + 100);
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&frame, CodecId::Av1Main, false, &mut seq, 3000); let pkts = packetize_video_frame(&frame, CodecId::Av1Main, false, &mut seq, 3000, 320, 240);
assert_eq!(pkts.len(), 3); assert_eq!(pkts.len(), 3);
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
// Deliver out of order: 2, 0, 1 // Deliver out of order: 2, 0, 1
assert!(reassembler.push(&pkts[2]).is_none()); // last arrives first — no total_fragments yet assert!(reassembler.push(&pkts[2]).is_none()); // last arrives first — no total_fragments yet
assert!(reassembler.push(&pkts[0]).is_none()); assert!(reassembler.push(&pkts[0]).is_none());
let result = reassembler.push(&pkts[1]); let result = reassembler
// Fragment 2 arrived before total was known, so reassembly waits .push(&pkts[1])
// for frame_end again — result may be None here due to missing total. .expect("last missing fragment completes frame");
// This tests that we don't panic; correctness of OOO is best-effort. assert_eq!(result.codec_id, CodecId::Av1Main);
let _ = result; assert!(!result.is_keyframe);
assert_eq!(result.width, Some(320));
assert_eq!(result.height, Some(240));
assert_eq!(result.data, frame);
} }
#[test] #[test]
fn empty_frame_produces_no_packets() { fn empty_frame_produces_no_packets() {
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&[], CodecId::Av1Main, false, &mut seq, 0); let pkts = packetize_video_frame(&[], CodecId::Av1Main, false, &mut seq, 0, 640, 480);
assert!(pkts.is_empty()); assert!(pkts.is_empty());
} }
#[test]
fn old_payload_without_meta_still_reassembles() {
let payload = Bytes::copy_from_slice(&[0x00, 0x00, 0x00, 0x01, 0x65]);
let pkt = MediaPacket {
header: MediaHeaderV2 {
version: MediaHeaderV2::VERSION,
flags: MediaHeaderV2::FLAG_KEYFRAME | MediaHeaderV2::FLAG_FRAME_END,
media_type: MediaType::Video,
codec_id: CodecId::H264Baseline,
stream_id: 0,
fec_ratio: 0,
seq: 7,
timestamp: 123,
fec_block: 1,
},
payload: payload.clone(),
quality_report: None,
};
let mut reassembler = VideoReassembler::new();
let frame = reassembler.push(&pkt).unwrap();
assert_eq!(frame.codec_id, CodecId::H264Baseline);
assert_eq!(frame.width, None);
assert_eq!(frame.height, None);
assert_eq!(frame.data, payload.to_vec());
}
} }
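The metadata framing exercised by these tests can be sketched on its own. The layout below (4-byte magic, big-endian `u16` width, big-endian `u16` height) follows `split_video_frame_payload` as shown in the diff; the magic bytes `WZVM`, the length of 8, and the names `META_MAGIC`, `META_LEN`, `FrameMeta`, `prepend_meta` and `split_meta` are assumptions for illustration, since the real constants and the sender side are not visible here.

```rust
// Assumed values: the real VIDEO_FRAME_META_MAGIC / VIDEO_FRAME_META_LEN
// are not shown in the diff.
const META_MAGIC: [u8; 4] = *b"WZVM";
const META_LEN: usize = 8;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FrameMeta {
    width: u16,
    height: u16,
}

/// Sender side (hypothetical): magic + width + height, then the encoded frame.
fn prepend_meta(width: u16, height: u16, encoded: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(META_LEN + encoded.len());
    out.extend_from_slice(&META_MAGIC);
    out.extend_from_slice(&width.to_be_bytes());
    out.extend_from_slice(&height.to_be_bytes());
    out.extend_from_slice(encoded);
    out
}

/// Receiver side: strip the header if present, otherwise pass the
/// payload through untouched so older senders keep working.
fn split_meta(mut frame: Vec<u8>) -> (Option<FrameMeta>, Vec<u8>) {
    if frame.len() < META_LEN || frame[..4] != META_MAGIC {
        return (None, frame);
    }
    let width = u16::from_be_bytes([frame[4], frame[5]]);
    let height = u16::from_be_bytes([frame[6], frame[7]]);
    frame.drain(..META_LEN);
    (Some(FrameMeta { width, height }), frame)
}
```

Falling back to `None` when the magic is absent is what lets `old_payload_without_meta_still_reassembles` pass: a bare Annex B payload is returned unchanged with no dimensions attached.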

@@ -8,13 +8,110 @@ mod imp {
pub use shiguredo_video_toolbox::{ pub use shiguredo_video_toolbox::{
CodecConfig, DecodedFrame, Decoder, DecoderCodec, DecoderConfig, EncodeOptions, Encoder, CodecConfig, DecodedFrame, Decoder, DecoderCodec, DecoderConfig, EncodeOptions, Encoder,
EncoderConfig, FrameData, H264EncoderConfig, H264EntropyMode, H264Profile, EncoderConfig, FrameData, H264EncoderConfig, H264EntropyMode, H264Profile,
HevcEncoderConfig, HevcProfile, PixelFormat, HevcEncoderConfig, HevcProfile, I420Frame, PixelFormat,
}; };
} }
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
use imp::*; use imp::*;
/// Copy a VideoToolbox I420 CVPixelBuffer into a tightly-packed I420 byte vector
/// of `width * height + 2 * (width/2) * (height/2)` bytes.
///
/// The per-plane `bytes_per_row` (stride) reported by CoreVideo can be larger
/// than the visible plane width (typically aligned to 16/64 bytes). Concatenating
/// the raw plane slices without removing that stride padding produces a buffer
/// that downstream code — which indexes as tight I420 of `width x height` —
/// mis-interprets, producing horizontal green/magenta bands that drift one
/// chroma row each time the per-row stride excess accumulates to one full row.
///
/// `frame_label` is used for one-time tracing of the actual plane dimensions so
/// the first decoded frame of a session prints its real layout. The boolean
/// flag is flipped to true after the first log so the format string is emitted
/// at most once per decoder lifetime.
#[cfg(target_os = "macos")]
fn i420_frame_to_tight(
frame: &I420Frame<'_>,
width: u32,
height: u32,
frame_label: &'static str,
logged: &mut bool,
) -> Result<Vec<u8>, VideoError> {
let w = width as usize;
let h = height as usize;
if w == 0 || h == 0 {
return Err(VideoError::PlatformError(format!(
"decoder produced empty frame ({w}x{h})"
)));
}
let cw = w / 2;
let ch = h / 2;
let y = frame.y_plane();
let u = frame.u_plane();
let v = frame.v_plane();
let y_stride = frame.y_stride();
let u_stride = frame.u_stride();
let v_stride = frame.v_stride();
let fw = frame.width();
let fh = frame.height();
if !*logged {
*logged = true;
tracing::info!(
target: "wzp_video::videotoolbox",
label = frame_label,
configured_width = w,
configured_height = h,
frame_width = fw,
frame_height = fh,
y_stride,
u_stride,
v_stride,
y_len = y.len(),
u_len = u.len(),
v_len = v.len(),
"VideoToolbox decoder I420 plane layout"
);
}
if y_stride < w || u_stride < cw || v_stride < cw {
return Err(VideoError::PlatformError(format!(
"decoder plane stride smaller than width: y_stride={y_stride} u_stride={u_stride} v_stride={v_stride} for {w}x{h}"
)));
}
let needed_y = y_stride.checked_mul(h).ok_or_else(|| {
VideoError::PlatformError(format!("y plane size overflow {y_stride}x{h}"))
})?;
let needed_uv = u_stride.checked_mul(ch).ok_or_else(|| {
VideoError::PlatformError(format!("uv plane size overflow {u_stride}x{ch}"))
})?;
if y.len() < needed_y || u.len() < needed_uv || v.len() < v_stride * ch {
return Err(VideoError::PlatformError(format!(
"decoder plane buffer too small: y_len={} (need {needed_y}) u_len={} (need {needed_uv}) v_len={} (need {})",
y.len(),
u.len(),
v.len(),
v_stride * ch,
)));
}
let mut data = Vec::with_capacity(w * h + 2 * cw * ch);
for row in 0..h {
let off = row * y_stride;
data.extend_from_slice(&y[off..off + w]);
}
for row in 0..ch {
let off = row * u_stride;
data.extend_from_slice(&u[off..off + cw]);
}
for row in 0..ch {
let off = row * v_stride;
data.extend_from_slice(&v[off..off + cw]);
}
Ok(data)
}
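The stride handling in `i420_frame_to_tight` reduces to one per-plane operation: copy only the visible bytes of each row and skip the padding. The sketch below isolates that step and adds the band-period arithmetic from the commit message. `copy_plane_tight` and `band_period_rows` are hypothetical names, and the `String` error type stands in for `VideoError`.

```rust
/// Copy `rows` rows of `row_len` visible bytes from a plane whose rows
/// start `stride` bytes apart, appending them tightly packed to `out`.
fn copy_plane_tight(
    plane: &[u8],
    stride: usize,
    row_len: usize,
    rows: usize,
    out: &mut Vec<u8>,
) -> Result<(), String> {
    if stride < row_len {
        return Err(format!("stride {stride} smaller than row length {row_len}"));
    }
    if rows == 0 {
        return Ok(());
    }
    // The last row only needs its visible bytes, not a full stride.
    let needed = stride
        .checked_mul(rows - 1)
        .and_then(|n| n.checked_add(row_len))
        .ok_or_else(|| "plane size overflow".to_string())?;
    if plane.len() < needed {
        return Err(format!("plane too small: {} < {needed}", plane.len()));
    }
    for row in 0..rows {
        let off = row * stride;
        out.extend_from_slice(&plane[off..off + row_len]);
    }
    Ok(())
}

/// Rows after which the accumulated padding equals one full row.
/// Returns `None` when the plane is already tight (no drift).
fn band_period_rows(stride: usize, row_len: usize) -> Option<usize> {
    let pad = stride.checked_sub(row_len)?;
    if pad == 0 { None } else { Some(stride / pad) }
}
```

With the numbers from the commit message, a 960x540 frame has a chroma width of 480 and a chroma stride of 512, so `band_period_rows(512, 480)` gives 16 chroma rows; at 640x360 the chroma width of 320 already equals the stride and no drift occurs.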
/// macOS VideoToolbox H.264 encoder. /// macOS VideoToolbox H.264 encoder.
/// ///
/// Wraps `VTCompressionSession`. On non-macOS targets this is a compile-safe /// Wraps `VTCompressionSession`. On non-macOS targets this is a compile-safe
@@ -160,9 +257,12 @@ impl VideoEncoder for VideoToolboxEncoder {
if packet.is_empty() { if packet.is_empty() {
return false; return false;
} }
let nal_type = packet[0] & 0x1F; let nals = split_annex_b(packet);
// NAL type 5 = IDR slice (keyframe). if nals.is_empty() {
nal_type == 5 return (packet[0] & 0x1F) == 5;
}
nals.iter()
.any(|nal| !nal.is_empty() && (nal[0] & 0x1F) == 5)
} }
} }
@@ -261,6 +361,8 @@ pub struct VideoToolboxDecoder {
width: u32, width: u32,
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
height: u32, height: u32,
#[cfg(target_os = "macos")]
layout_logged: bool,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
_width: u32, _width: u32,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
@@ -279,6 +381,7 @@ impl VideoToolboxDecoder {
inner: None, inner: None,
width, width,
height, height,
layout_logged: false,
}) })
} }
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
@@ -357,13 +460,13 @@ impl VideoDecoder for VideoToolboxDecoder {
match decoded { match decoded {
Some(DecodedFrame::I420(frame)) => { Some(DecodedFrame::I420(frame)) => {
let y = frame.y_plane(); let data = i420_frame_to_tight(
let u = frame.u_plane(); &frame,
let v = frame.v_plane(); self.width,
let mut data = Vec::with_capacity(y.len() + u.len() + v.len()); self.height,
data.extend_from_slice(y); "h264_decoder",
data.extend_from_slice(u); &mut self.layout_logged,
data.extend_from_slice(v); )?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -520,12 +623,13 @@ impl VideoEncoder for VideoToolboxHevcEncoder {
} }
fn is_keyframe(&self, packet: &[u8]) -> bool { fn is_keyframe(&self, packet: &[u8]) -> bool {
if packet.len() < 2 { let nals = split_annex_b(packet);
return false; if nals.is_empty() {
return packet.len() >= 2 && matches!((packet[0] >> 1) & 0x3F, 19 | 20);
} }
let nal_type = (packet[0] >> 1) & 0x3F;
// NAL type 19 = IDR_W_RADL, 20 = IDR_N_LP. // NAL type 19 = IDR_W_RADL, 20 = IDR_N_LP.
nal_type == 19 || nal_type == 20 nals.iter()
.any(|nal| nal.len() >= 2 && matches!((nal[0] >> 1) & 0x3F, 19 | 20))
} }
} }
@@ -537,6 +641,8 @@ pub struct VideoToolboxHevcDecoder {
width: u32, width: u32,
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
height: u32, height: u32,
#[cfg(target_os = "macos")]
layout_logged: bool,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
_width: u32, _width: u32,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
@@ -551,6 +657,7 @@ impl VideoToolboxHevcDecoder {
inner: None, inner: None,
width, width,
height, height,
layout_logged: false,
}) })
} }
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
@@ -624,13 +731,13 @@ impl VideoDecoder for VideoToolboxHevcDecoder {
match decoded { match decoded {
Some(DecodedFrame::I420(frame)) => { Some(DecodedFrame::I420(frame)) => {
let y = frame.y_plane(); let data = i420_frame_to_tight(
let u = frame.u_plane(); &frame,
let v = frame.v_plane(); self.width,
let mut data = Vec::with_capacity(y.len() + u.len() + v.len()); self.height,
data.extend_from_slice(y); "hevc_decoder",
data.extend_from_slice(u); &mut self.layout_logged,
data.extend_from_slice(v); )?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -660,6 +767,8 @@ pub struct VideoToolboxAv1Decoder {
width: u32, width: u32,
#[cfg(target_os = "macos")] #[cfg(target_os = "macos")]
height: u32, height: u32,
#[cfg(target_os = "macos")]
layout_logged: bool,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
_width: u32, _width: u32,
#[cfg(not(target_os = "macos"))] #[cfg(not(target_os = "macos"))]
@@ -679,6 +788,7 @@ impl VideoToolboxAv1Decoder {
inner: Some(decoder), inner: Some(decoder),
width, width,
height, height,
layout_logged: false,
}), }),
Err(shiguredo_video_toolbox::Error::UnsupportedCodec { .. }) => { Err(shiguredo_video_toolbox::Error::UnsupportedCodec { .. }) => {
// AV1 decode not supported on this platform (e.g. M1/M2). // AV1 decode not supported on this platform (e.g. M1/M2).
@@ -686,6 +796,7 @@ impl VideoToolboxAv1Decoder {
inner: None, inner: None,
width, width,
height, height,
layout_logged: false,
}) })
} }
Err(e) => Err(VideoError::PlatformError(format!( Err(e) => Err(VideoError::PlatformError(format!(
@@ -717,13 +828,13 @@ impl VideoDecoder for VideoToolboxAv1Decoder {
.map_err(|e| VideoError::PlatformError(format!("decode failed: {e}")))?; .map_err(|e| VideoError::PlatformError(format!("decode failed: {e}")))?;
match decoded { match decoded {
Some(DecodedFrame::I420(frame)) => { Some(DecodedFrame::I420(frame)) => {
let y = frame.y_plane(); let data = i420_frame_to_tight(
let u = frame.u_plane(); &frame,
let v = frame.v_plane(); self.width,
let mut data = Vec::with_capacity(y.len() + u.len() + v.len()); self.height,
data.extend_from_slice(y); "av1_decoder",
data.extend_from_slice(u); &mut self.layout_logged,
data.extend_from_slice(v); )?;
Ok(Some(VideoFrame { Ok(Some(VideoFrame {
width: self.width, width: self.width,
height: self.height, height: self.height,
@@ -791,6 +902,11 @@ mod tests {
let enc = VideoToolboxEncoder::new(1280, 720, 2_000_000).unwrap(); let enc = VideoToolboxEncoder::new(1280, 720, 2_000_000).unwrap();
assert!(enc.is_keyframe(&[0x65, 0x01, 0x02])); assert!(enc.is_keyframe(&[0x65, 0x01, 0x02]));
assert!(!enc.is_keyframe(&[0x41, 0x01, 0x02])); assert!(!enc.is_keyframe(&[0x41, 0x01, 0x02]));
assert!(enc.is_keyframe(&[
0x00, 0x00, 0x00, 0x01, 0x67, 0x01, // SPS
0x00, 0x00, 0x00, 0x01, 0x68, 0x02, // PPS
0x00, 0x00, 0x00, 0x01, 0x65, 0x03, // IDR
]));
} }
#[test] #[test]
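The reworked `is_keyframe` scans every NAL unit in an Annex B access unit instead of inspecting only the first byte, so an IDR slice that follows SPS and PPS is still detected. The body of `split_annex_b` is not shown in this diff, so the splitter below is a hypothetical stand-in (`split_annex_b_sketch`) that handles both 3-byte and 4-byte start codes. `contains_h264_idr` mirrors the H.264 logic from the hunk above.

```rust
/// Hypothetical stand-in for `split_annex_b`: returns the NAL unit
/// payloads found between 00 00 01 / 00 00 00 01 start codes.
fn split_annex_b_sketch(data: &[u8]) -> Vec<&[u8]> {
    let mut nals = Vec::new();
    let mut nal_start: Option<usize> = None;
    let mut i = 0;
    while i + 3 <= data.len() {
        if data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1 {
            if let Some(start) = nal_start {
                // Drop zero bytes that belong to the next start code.
                let mut end = i;
                while end > start && data[end - 1] == 0 {
                    end -= 1;
                }
                if end > start {
                    nals.push(&data[start..end]);
                }
            }
            nal_start = Some(i + 3);
            i += 3;
        } else {
            i += 1;
        }
    }
    if let Some(start) = nal_start {
        if start < data.len() {
            nals.push(&data[start..]);
        }
    }
    nals
}

/// True if any NAL unit is an IDR slice (H.264 nal_unit_type 5).
/// Falls back to the first byte when no start code is present.
fn contains_h264_idr(packet: &[u8]) -> bool {
    if packet.is_empty() {
        return false;
    }
    let nals = split_annex_b_sketch(packet);
    if nals.is_empty() {
        return (packet[0] & 0x1F) == 5;
    }
    nals.iter().any(|nal| !nal.is_empty() && (nal[0] & 0x1F) == 5)
}
```

The first-byte fallback keeps the two older assertions in the test (`0x65` and `0x41` without start codes) behaving as before.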

@@ -16,9 +16,9 @@
use std::sync::Mutex; use std::sync::Mutex;
use wzp_proto::CodecId; use wzp_proto::CodecId;
use wzp_video::{ use wzp_video::{
VideoFrame,
factory::{create_video_decoder, create_video_encoder}, factory::{create_video_decoder, create_video_encoder},
transport::{VideoReassembler, packetize_video_frame}, transport::{packetize_video_frame, VideoReassembler},
VideoFrame,
}; };
/// VideoToolbox has global session registry state — serialise integration tests /// VideoToolbox has global session registry state — serialise integration tests
@@ -42,7 +42,12 @@ fn synthetic_i420(width: u32, height: u32, frame_idx: u32) -> VideoFrame {
data[y_size..y_size + uv_size].fill(128); data[y_size..y_size + uv_size].fill(128);
data[y_size + uv_size..].fill(128); data[y_size + uv_size..].fill(128);
VideoFrame { width, height, data, timestamp_ms: frame_idx as u64 * 33 } VideoFrame {
width,
height,
data,
timestamp_ms: frame_idx as u64 * 33,
}
} }
// ── tests ───────────────────────────────────────────────────────────────────── // ── tests ─────────────────────────────────────────────────────────────────────
@@ -53,10 +58,10 @@ fn h264_pipeline_roundtrip() {
let _g = VT_LOCK.lock().unwrap(); let _g = VT_LOCK.lock().unwrap();
let (w, h) = (640, 360); let (w, h) = (640, 360);
let mut encoder = create_video_encoder(CodecId::H264Baseline, w, h, 1_500_000) let mut encoder =
.expect("H264Baseline encoder"); create_video_encoder(CodecId::H264Baseline, w, h, 1_500_000).expect("H264Baseline encoder");
let mut decoder = create_video_decoder(CodecId::H264Baseline, w, h) let mut decoder =
.expect("H264Baseline decoder"); create_video_decoder(CodecId::H264Baseline, w, h).expect("H264Baseline decoder");
let mut seq = 0u32; let mut seq = 0u32;
let mut decoded_count = 0usize; let mut decoded_count = 0usize;
@@ -71,32 +76,60 @@ fn h264_pipeline_roundtrip() {
} }
let is_keyframe = encoder.is_keyframe(&encoded); let is_keyframe = encoder.is_keyframe(&encoded);
let pkts = packetize_video_frame(&encoded, CodecId::H264Baseline, is_keyframe, &mut seq, i * 33); let pkts = packetize_video_frame(
assert!(!pkts.is_empty(), "packetize must produce at least one packet"); &encoded,
CodecId::H264Baseline,
is_keyframe,
&mut seq,
i * 33,
w,
h,
);
assert!(
!pkts.is_empty(),
"packetize must produce at least one packet"
);
// All fragments for this frame share the same timestamp. // All fragments for this frame share the same timestamp.
let ts = pkts[0].header.timestamp; let ts = pkts[0].header.timestamp;
let total_frags = pkts.len(); let total_frags = pkts.len();
for (idx, pkt) in pkts.iter().enumerate() { for (idx, pkt) in pkts.iter().enumerate() {
assert_eq!(pkt.header.timestamp, ts, "all fragments of one frame share timestamp"); assert_eq!(
pkt.header.timestamp, ts,
"all fragments of one frame share timestamp"
);
let frag_idx = (pkt.header.fec_block >> 8) as usize; let frag_idx = (pkt.header.fec_block >> 8) as usize;
let frag_total = (pkt.header.fec_block & 0xFF) as usize; let frag_total = (pkt.header.fec_block & 0xFF) as usize;
assert_eq!(frag_idx, idx, "fragment index must match packet position"); assert_eq!(frag_idx, idx, "fragment index must match packet position");
assert_eq!(frag_total, total_frags, "all fragments carry the correct total count"); assert_eq!(
frag_total, total_frags,
"all fragments carry the correct total count"
);
} }
assert!(pkts.last().unwrap().header.is_frame_end(), "last packet must have FLAG_FRAME_END"); assert!(
pkts.last().unwrap().header.is_frame_end(),
"last packet must have FLAG_FRAME_END"
);
// Push through reassembler — only the last packet should yield a frame. // Push through reassembler — only the last packet should yield a frame.
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
for (j, pkt) in pkts.iter().enumerate() { for (j, pkt) in pkts.iter().enumerate() {
let result = reassembler.push(pkt); let result = reassembler.push(pkt);
if j + 1 < pkts.len() { if j + 1 < pkts.len() {
assert!(result.is_none(), "intermediate fragments must not yield a complete frame"); assert!(
result.is_none(),
"intermediate fragments must not yield a complete frame"
);
} else { } else {
let (codec, kf, data) = result.expect("last fragment must complete the frame"); let frame = result.expect("last fragment must complete the frame");
assert_eq!(codec, CodecId::H264Baseline); assert_eq!(frame.codec_id, CodecId::H264Baseline);
assert_eq!(kf, is_keyframe); assert_eq!(frame.is_keyframe, is_keyframe);
assert_eq!(data, encoded, "reassembled bytes must match original encoded bytes"); assert_eq!(frame.width, Some(w as u16));
assert_eq!(frame.height, Some(h as u16));
assert_eq!(
frame.data, encoded,
"reassembled bytes must match original encoded bytes"
);
} }
} }
@@ -118,7 +151,10 @@ fn h264_pipeline_roundtrip() {
} }
} }
assert!(decoded_count > 0, "at least one frame must have been decoded"); assert!(
decoded_count > 0,
"at least one frame must have been decoded"
);
} }
/// Fragmentation: a frame larger than VIDEO_MAX_PAYLOAD splits into multiple packets, /// Fragmentation: a frame larger than VIDEO_MAX_PAYLOAD splits into multiple packets,
@@ -134,13 +170,28 @@ fn large_frame_fragments_and_reassembles() {
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame( let pkts = packetize_video_frame(
&synthetic_encoded, CodecId::H264Baseline, true, &mut seq, 9000, &synthetic_encoded,
CodecId::H264Baseline,
true,
&mut seq,
9000,
1280,
720,
); );
assert!(pkts.len() >= 4, "large frame must produce ≥4 fragments"); assert!(pkts.len() >= 4, "large frame must produce ≥4 fragments");
assert!(pkts[0].header.is_keyframe(), "keyframe flag propagates to all fragments"); assert!(
assert!(!pkts[0].header.is_frame_end(), "first packet is not frame end"); pkts[0].header.is_keyframe(),
assert!(pkts.last().unwrap().header.is_frame_end(), "last packet is frame end"); "keyframe flag propagates to all fragments"
);
assert!(
!pkts[0].header.is_frame_end(),
"first packet is not frame end"
);
assert!(
pkts.last().unwrap().header.is_frame_end(),
"last packet is frame end"
);
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
let mut result = None; let mut result = None;
@@ -148,8 +199,13 @@ fn large_frame_fragments_and_reassembles() {
result = reassembler.push(pkt); result = reassembler.push(pkt);
} }
let (_, _, data) = result.expect("all fragments delivered → complete frame"); let frame = result.expect("all fragments delivered → complete frame");
assert_eq!(data, synthetic_encoded, "reassembled bytes must match input exactly"); assert_eq!(frame.width, Some(1280));
assert_eq!(frame.height, Some(720));
assert_eq!(
frame.data, synthetic_encoded,
"reassembled bytes must match input exactly"
);
} }
/// Packet loss: if the first fragment is missing, reassembly cannot complete. /// Packet loss: if the first fragment is missing, reassembly cannot complete.
@@ -159,7 +215,7 @@ fn missing_fragment_blocks_reassembly() {
let frame: Vec<u8> = vec![0xAB; VIDEO_MAX_PAYLOAD * 2 + 50]; let frame: Vec<u8> = vec![0xAB; VIDEO_MAX_PAYLOAD * 2 + 50];
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&frame, CodecId::Av1Main, false, &mut seq, 1234); let pkts = packetize_video_frame(&frame, CodecId::Av1Main, false, &mut seq, 1234, 640, 480);
assert!(pkts.len() >= 3); assert!(pkts.len() >= 3);
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
@@ -177,9 +233,9 @@ fn missing_fragment_blocks_reassembly() {
#[test] #[test]
fn video_codec_selection_semantics() { fn video_codec_selection_semantics() {
// The relay's selection rule is: first codec offered by the caller. // The relay's selection rule is: first codec offered by the caller.
let offered = vec![CodecId::Av1Main, CodecId::H264Baseline, CodecId::H265Main]; let offered = vec![CodecId::H264Baseline];
let chosen = offered.into_iter().next(); let chosen = offered.into_iter().next();
assert_eq!(chosen, Some(CodecId::Av1Main)); assert_eq!(chosen, Some(CodecId::H264Baseline));
// When no codecs are offered, video is audio-only. // When no codecs are offered, video is audio-only.
let empty: Vec<CodecId> = vec![]; let empty: Vec<CodecId> = vec![];
@@ -193,7 +249,15 @@ fn evict_stale_removes_aged_frames() {
let frame: Vec<u8> = vec![0x55; VIDEO_MAX_PAYLOAD * 2]; let frame: Vec<u8> = vec![0x55; VIDEO_MAX_PAYLOAD * 2];
let mut seq = 0u32; let mut seq = 0u32;
let pkts = packetize_video_frame(&frame, CodecId::H264Baseline, false, &mut seq, 500); let pkts = packetize_video_frame(
&frame,
CodecId::H264Baseline,
false,
&mut seq,
500,
640,
480,
);
let mut reassembler = VideoReassembler::new(); let mut reassembler = VideoReassembler::new();
// Push only first packet — frame is incomplete. // Push only first packet — frame is incomplete.
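The integration test above reads the fragment index from the high byte of `fec_block` and the fragment count from the low byte. A minimal sketch of that packing follows, assuming `fec_block` is a `u16`; the field type is not visible in this diff, and `pack_fragment_info` and `unpack_fragment_info` are hypothetical names.

```rust
/// Pack a fragment index (high byte) and fragment total (low byte)
/// into one 16-bit value, matching how the test decodes `fec_block`.
fn pack_fragment_info(index: u8, total: u8) -> u16 {
    ((index as u16) << 8) | total as u16
}

/// Inverse of `pack_fragment_info`: returns `(index, total)`.
fn unpack_fragment_info(fec_block: u16) -> (u8, u8) {
    ((fec_block >> 8) as u8, (fec_block & 0xFF) as u8)
}
```

Under this layout a single-fragment frame carries index 0 and total 1, which is the `fec_block: 1` value used in `old_payload_without_meta_still_reassembles`.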

@@ -105,11 +105,16 @@
</div> </div>
</div> </div>
<div id="vd-stats" class="vd-stats"></div> <div id="vd-stats" class="vd-stats"></div>
<!-- Video strip: remote (canvas) + local preview (video element) --> </div>
<div id="vd-video-strip" class="vd-video-strip hidden">
<canvas id="vd-remote-video" class="vd-video-tile" width="320" height="180"></canvas> <!-- ═════ Video stage — full-screen overlay above drawer ═════ -->
<video id="vd-local-video" class="vd-video-tile" autoplay muted playsinline></video> <div id="vd-video-strip" class="vd-video-stage hidden">
<canvas id="vd-remote-video" class="vd-remote-stage" width="1280" height="720"></canvas>
<div id="vd-remote-placeholder" class="vd-remote-placeholder">
<div class="vd-placeholder-text">Waiting for remote video…</div>
<div id="vd-remote-counter" class="vd-placeholder-sub">0 frames received</div>
</div> </div>
<video id="vd-local-video" class="vd-local-pip" autoplay muted playsinline></video>
</div> </div>
</div> </div>
@@ -169,6 +174,22 @@
OS Echo Cancellation OS Echo Cancellation
</label> </label>
</div> </div>
<div class="settings-section">
<h3>Video</h3>
<label>Codec
<select id="s-video-codec">
<option value="h264">H.264</option>
<option value="h265">H.265 / HEVC</option>
</select>
</label>
<label>Room Resolution
<select id="s-video-resolution">
<option value="640x360">640 x 360</option>
<option value="960x540">960 x 540</option>
<option value="1280x720">1280 x 720</option>
</select>
</label>
</div>
<div class="settings-section"> <div class="settings-section">
<h3>Relays</h3> <h3>Relays</h3>
<div id="s-relay-list"></div> <div id="s-relay-list"></div>

@@ -3,7 +3,9 @@
<uses-permission android:name="android.permission.INTERNET" /> <uses-permission android:name="android.permission.INTERNET" />
<uses-permission android:name="android.permission.RECORD_AUDIO" /> <uses-permission android:name="android.permission.RECORD_AUDIO" />
<uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" /> <uses-permission android:name="android.permission.MODIFY_AUDIO_SETTINGS" />
<uses-permission android:name="android.permission.CAMERA" />
<uses-feature android:name="android.hardware.microphone" android:required="true" /> <uses-feature android:name="android.hardware.microphone" android:required="true" />
<uses-feature android:name="android.hardware.camera" android:required="false" />
<!-- AndroidTV support --> <!-- AndroidTV support -->
<uses-feature android:name="android.software.leanback" android:required="false" /> <uses-feature android:name="android.software.leanback" android:required="false" />

@@ -16,10 +16,19 @@ class MainActivity : TauriActivity() {
private const val AUDIO_PERMISSIONS_REQUEST = 4242 private const val AUDIO_PERMISSIONS_REQUEST = 4242
private val REQUIRED_AUDIO_PERMISSIONS = arrayOf( private val REQUIRED_AUDIO_PERMISSIONS = arrayOf(
Manifest.permission.RECORD_AUDIO, Manifest.permission.RECORD_AUDIO,
Manifest.permission.MODIFY_AUDIO_SETTINGS Manifest.permission.MODIFY_AUDIO_SETTINGS,
Manifest.permission.CAMERA
) )
} }
// NOTE: granting CAMERA at the Android system layer is necessary but NOT
// sufficient for video on Android. Tauri/Wry's internal WebChromeClient
// does not currently grant `getUserMedia` permission requests, so the
// browser-layer getUserMedia call still fails even after the OS grants
// CAMERA. Fixing this needs either a Tauri plugin that overrides the
// WebChromeClient, or a native Camera2/CameraX capture path that bypasses
// the WebView. Tracked as a follow-up.
override fun onCreate(savedInstanceState: Bundle?) { override fun onCreate(savedInstanceState: Bundle?) {
enableEdgeToEdge() enableEdgeToEdge()
super.onCreate(savedInstanceState) super.onCreate(savedInstanceState)

File diff suppressed because it is too large

@@ -31,7 +31,7 @@ use engine::CallEngine;
use serde::Serialize; use serde::Serialize;
use std::path::PathBuf; use std::path::PathBuf;
use std::sync::atomic::{AtomicBool, Ordering}; use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::{Arc, OnceLock}; use std::sync::{Arc, OnceLock};
use tauri::{Emitter, Manager}; use tauri::{Emitter, Manager};
use tokio::sync::Mutex; use tokio::sync::Mutex;
@@ -49,6 +49,12 @@ use wzp_proto::{MediaTransport, default_signal_version};
// Mirrors the existing `wzp_codec::dred_verbose_logs` pattern. // Mirrors the existing `wzp_codec::dred_verbose_logs` pattern.
static CALL_DEBUG_LOGS: AtomicBool = AtomicBool::new(false); static CALL_DEBUG_LOGS: AtomicBool = AtomicBool::new(false);
static CAMERA_PUSH_FRAMES: AtomicU64 = AtomicU64::new(0);
static CAMERA_PUSH_DROPS: AtomicU64 = AtomicU64::new(0);
static CAMERA_PUSH_NO_ENGINE: AtomicU64 = AtomicU64::new(0);
static CAMERA_PUSH_NO_SENDER: AtomicU64 = AtomicU64::new(0);
static CAMERA_PUSH_DECODE_ERRORS: AtomicU64 = AtomicU64::new(0);
static FRAME_DUMP_WRITES: AtomicU64 = AtomicU64::new(0);
#[inline] #[inline]
fn call_debug_logs_enabled() -> bool { fn call_debug_logs_enabled() -> bool {
@@ -81,6 +87,18 @@ pub(crate) fn emit_call_debug(app: &tauri::AppHandle, step: &str, details: serde
let _ = app.emit("call-debug-log", payload); let _ = app.emit("call-debug-log", payload);
} }
#[tauri::command]
fn call_debug_log(app: tauri::AppHandle, step: String, details: serde_json::Value) {
if step == "camera:get_user_media_start" {
CAMERA_PUSH_FRAMES.store(0, Ordering::Relaxed);
CAMERA_PUSH_DROPS.store(0, Ordering::Relaxed);
CAMERA_PUSH_NO_ENGINE.store(0, Ordering::Relaxed);
CAMERA_PUSH_NO_SENDER.store(0, Ordering::Relaxed);
CAMERA_PUSH_DECODE_ERRORS.store(0, Ordering::Relaxed);
}
emit_call_debug(&app, &step, details);
}
/// Short git hash captured at compile time by build.rs. /// Short git hash captured at compile time by build.rs.
const GIT_HASH: &str = env!("WZP_GIT_HASH"); const GIT_HASH: &str = env!("WZP_GIT_HASH");
@@ -91,8 +109,15 @@ const GIT_HASH: &str = env!("WZP_GIT_HASH");
/// Returns `None` if the data is too short or encoding fails. /// Returns `None` if the data is too short or encoding fails.
/// Called from the video recv task in engine.rs to produce the `jpeg_b64` /// Called from the video recv task in engine.rs to produce the `jpeg_b64`
/// field of every `video:frame` Tauri event. /// field of every `video:frame` Tauri event.
#[cfg_attr(not(test), allow(dead_code))]
pub(crate) fn i420_to_jpeg_b64(data: &[u8], width: u32, height: u32) -> Option<String> { pub(crate) fn i420_to_jpeg_b64(data: &[u8], width: u32, height: u32) -> Option<String> {
use base64::Engine as _; use base64::Engine as _;
let bytes = i420_to_jpeg_bytes(data, width, height)?;
Some(base64::engine::general_purpose::STANDARD.encode(bytes))
}
pub(crate) fn i420_to_jpeg_bytes(data: &[u8], width: u32, height: u32) -> Option<Vec<u8>> {
use image::{DynamicImage, ImageBuffer, Rgb}; use image::{DynamicImage, ImageBuffer, Rgb};
let w = width as usize; let w = width as usize;
@@ -112,16 +137,127 @@ pub(crate) fn i420_to_jpeg_b64(data: &[u8], width: u32, height: u32) -> Option<S
let u = data[y_size + uv_idx] as f32 - 128.0; let u = data[y_size + uv_idx] as f32 - 128.0;
let v = data[y_size + uv_size + uv_idx] as f32 - 128.0; let v = data[y_size + uv_size + uv_idx] as f32 - 128.0;
let out = (row * w + col) * 3; let out = (row * w + col) * 3;
rgb[out] = (y + 1.402 * v).clamp(0.0, 255.0) as u8; rgb[out] = (y + 1.402 * v).clamp(0.0, 255.0) as u8;
rgb[out + 1] = (y - 0.344 * u - 0.714 * v).clamp(0.0, 255.0) as u8; rgb[out + 1] = (y - 0.344 * u - 0.714 * v).clamp(0.0, 255.0) as u8;
rgb[out + 2] = (y + 1.772 * u).clamp(0.0, 255.0) as u8; rgb[out + 2] = (y + 1.772 * u).clamp(0.0, 255.0) as u8;
} }
} }
let img = DynamicImage::ImageRgb8(ImageBuffer::<Rgb<u8>, Vec<u8>>::from_raw(width, height, rgb)?); let img = DynamicImage::ImageRgb8(ImageBuffer::<Rgb<u8>, Vec<u8>>::from_raw(
width, height, rgb,
)?);
let mut buf = std::io::Cursor::new(Vec::<u8>::new()); let mut buf = std::io::Cursor::new(Vec::<u8>::new());
img.write_to(&mut buf, image::ImageFormat::Jpeg).ok()?; img.write_to(&mut buf, image::ImageFormat::Jpeg).ok()?;
Some(base64::engine::general_purpose::STANDARD.encode(buf.into_inner())) Some(buf.into_inner())
}
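The conversion inside `i420_to_jpeg_bytes` assumes a tightly packed buffer in the layout `Y(w×h) | U(w/2×h/2) | V(w/2×h/2)` and applies the coefficients visible in the hunk above. The sketch below reproduces only the I420 to RGB24 step without the JPEG encoding. `yuv_to_rgb` and `i420_to_rgb24` are hypothetical names, and rejecting odd dimensions is a simplification made here, not something the diff confirms.

```rust
/// Convert one full-range YUV sample to RGB using the coefficients
/// from the diff (1.402, 0.344, 0.714, 1.772).
fn yuv_to_rgb(y: u8, u: u8, v: u8) -> [u8; 3] {
    let y = y as f32;
    let u = u as f32 - 128.0;
    let v = v as f32 - 128.0;
    [
        (y + 1.402 * v).clamp(0.0, 255.0) as u8,
        (y - 0.344 * u - 0.714 * v).clamp(0.0, 255.0) as u8,
        (y + 1.772 * u).clamp(0.0, 255.0) as u8,
    ]
}

/// Tight I420 -> RGB24. Each 2x2 block of luma samples shares one
/// U and one V sample. Returns `None` on bad dimensions or short input.
fn i420_to_rgb24(data: &[u8], w: usize, h: usize) -> Option<Vec<u8>> {
    if w == 0 || h == 0 || w % 2 != 0 || h % 2 != 0 {
        return None; // simplification: even dimensions only
    }
    let (cw, ch) = (w / 2, h / 2);
    let y_size = w.checked_mul(h)?;
    let uv_size = cw.checked_mul(ch)?;
    if data.len() < y_size + 2 * uv_size {
        return None;
    }
    let mut rgb = vec![0u8; y_size * 3];
    for row in 0..h {
        for col in 0..w {
            let uv_idx = (row / 2) * cw + col / 2;
            let px = yuv_to_rgb(
                data[row * w + col],
                data[y_size + uv_idx],
                data[y_size + uv_size + uv_idx],
            );
            let out = (row * w + col) * 3;
            rgb[out..out + 3].copy_from_slice(&px);
        }
    }
    Some(rgb)
}
```

Because the chroma index is computed from the configured width, any stride padding left in the buffer shifts every later chroma row, which is the drift that the tight-packing fix in the decoder removes.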
fn should_dump_frame(frame_no: u64) -> bool {
frame_no <= 5 || frame_no % 30 == 0
}
pub(crate) fn maybe_dump_video_jpeg(
app: &tauri::AppHandle,
stage: &str,
platform: &str,
frame_no: u64,
jpeg_bytes: &[u8],
width: u32,
height: u32,
) {
if !should_dump_frame(frame_no) {
return;
}
let seq = FRAME_DUMP_WRITES.fetch_add(1, Ordering::Relaxed) + 1;
let dir = identity_dir().join("frame-dumps");
let file_name = format!("{seq:06}_{platform}_{stage}_f{frame_no:06}_{width}x{height}.jpg");
let path = dir.join(file_name);
let result = std::fs::create_dir_all(&dir).and_then(|_| std::fs::write(&path, jpeg_bytes));
match result {
Ok(()) => emit_call_debug(
app,
"video:frame_dump",
serde_json::json!({
"stage": stage,
"platform": platform,
"frame_no": frame_no,
"width": width,
"height": height,
"jpeg_bytes": jpeg_bytes.len(),
"path": path,
}),
),
Err(e) => {
if seq <= 5 || seq % 30 == 0 {
emit_call_debug(
app,
"video:frame_dump_failed",
serde_json::json!({
"stage": stage,
"platform": platform,
"frame_no": frame_no,
"error": e.to_string(),
"path": path,
}),
);
}
}
}
}
pub(crate) fn maybe_dump_video_bytes(
app: &tauri::AppHandle,
stage: &str,
platform: &str,
frame_no: u64,
bytes: &[u8],
codec: wzp_proto::CodecId,
) {
if !should_dump_frame(frame_no) || bytes.is_empty() {
return;
}
let ext = match codec {
wzp_proto::CodecId::H265Main => "h265",
wzp_proto::CodecId::Av1Main => "obu",
_ => "h264",
};
let seq = FRAME_DUMP_WRITES.fetch_add(1, Ordering::Relaxed) + 1;
let dir = identity_dir().join("frame-dumps");
let file_name = format!("{seq:06}_{platform}_{stage}_f{frame_no:06}.{ext}");
let path = dir.join(file_name);
let result = std::fs::create_dir_all(&dir).and_then(|_| std::fs::write(&path, bytes));
match result {
Ok(()) => emit_call_debug(
app,
"video:byte_dump",
serde_json::json!({
"stage": stage,
"platform": platform,
"frame_no": frame_no,
"codec": format!("{:?}", codec),
"bytes": bytes.len(),
"path": path,
}),
),
Err(e) => {
if seq <= 5 || seq % 30 == 0 {
emit_call_debug(
app,
"video:byte_dump_failed",
serde_json::json!({
"stage": stage,
"platform": platform,
"frame_no": frame_no,
"codec": format!("{:?}", codec),
"error": e.to_string(),
"path": path,
}),
);
}
}
}
} }
/// RGB24 → I420 (planar 4:2:0). Layout: Y(w×h) | U(w/2×h/2) | V(w/2×h/2). /// RGB24 → I420 (planar 4:2:0). Layout: Y(w×h) | U(w/2×h/2) | V(w/2×h/2).
@@ -138,8 +274,10 @@ fn rgb_to_i420(rgb: &[u8], w: usize, h: usize) -> Vec<u8> {
out[row * w + col] = (0.299 * r + 0.587 * g + 0.114 * b).clamp(0.0, 255.0) as u8; out[row * w + col] = (0.299 * r + 0.587 * g + 0.114 * b).clamp(0.0, 255.0) as u8;
if row % 2 == 0 && col % 2 == 0 { if row % 2 == 0 && col % 2 == 0 {
let uv = (row / 2) * (w / 2) + col / 2; let uv = (row / 2) * (w / 2) + col / 2;
out[y_size + uv] = (-0.169 * r - 0.331 * g + 0.500 * b + 128.0).clamp(0.0, 255.0) as u8; out[y_size + uv] =
out[y_size + uv_size + uv] = (0.500 * r - 0.419 * g - 0.081 * b + 128.0).clamp(0.0, 255.0) as u8; (-0.169 * r - 0.331 * g + 0.500 * b + 128.0).clamp(0.0, 255.0) as u8;
out[y_size + uv_size + uv] =
(0.500 * r - 0.419 * g - 0.081 * b + 128.0).clamp(0.0, 255.0) as u8;
} }
} }
} }
@@ -152,20 +290,86 @@ fn rgb_to_i420(rgb: &[u8], w: usize, h: usize) -> Vec<u8> {
/// The frontend calls this at ~15 fps from a canvas.toDataURL() capture loop. /// The frontend calls this at ~15 fps from a canvas.toDataURL() capture loop.
#[tauri::command] #[tauri::command]
async fn push_camera_frame( async fn push_camera_frame(
app: tauri::AppHandle,
state: tauri::State<'_, Arc<AppState>>, state: tauri::State<'_, Arc<AppState>>,
jpeg_b64: String, jpeg_b64: String,
) -> Result<(), String> { ) -> Result<(), String> {
use base64::Engine as _; use base64::Engine as _;
let jpeg_bytes = base64::engine::general_purpose::STANDARD let jpeg_bytes = match base64::engine::general_purpose::STANDARD.decode(&jpeg_b64) {
.decode(&jpeg_b64) Ok(bytes) => bytes,
.map_err(|e| e.to_string())?; Err(e) => {
let errs = CAMERA_PUSH_DECODE_ERRORS.fetch_add(1, Ordering::Relaxed) + 1;
if errs == 1 || errs % 30 == 0 {
emit_call_debug(
&app,
"camera:jpeg_base64_decode_failed",
serde_json::json!({
"errors": errs,
"error": e.to_string(),
"b64_len": jpeg_b64.len(),
}),
);
}
return Err(e.to_string());
}
};
let dyn_img = image::load_from_memory_with_format(&jpeg_bytes, image::ImageFormat::Jpeg) let dyn_img = match image::load_from_memory_with_format(&jpeg_bytes, image::ImageFormat::Jpeg) {
.map_err(|e| e.to_string())?; Ok(img) => img,
Err(e) => {
let errs = CAMERA_PUSH_DECODE_ERRORS.fetch_add(1, Ordering::Relaxed) + 1;
if errs == 1 || errs % 30 == 0 {
emit_call_debug(
&app,
"camera:jpeg_decode_failed",
serde_json::json!({
"errors": errs,
"error": e.to_string(),
"jpeg_bytes": jpeg_bytes.len(),
}),
);
}
return Err(e.to_string());
}
};
let rgb_img = dyn_img.to_rgb8(); let rgb_img = dyn_img.to_rgb8();
let w = rgb_img.width() as usize; let w = rgb_img.width() as usize;
let h = rgb_img.height() as usize; let h = rgb_img.height() as usize;
let yuv = rgb_to_i420(rgb_img.as_raw(), w, h); let yuv = rgb_to_i420(rgb_img.as_raw(), w, h);
let frame_no = CAMERA_PUSH_FRAMES.fetch_add(1, Ordering::Relaxed) + 1;
maybe_dump_video_jpeg(
&app,
"camera_jpeg_in",
std::env::consts::OS,
frame_no,
&jpeg_bytes,
w as u32,
h as u32,
);
if let Some(converted_jpeg) = i420_to_jpeg_bytes(&yuv, w as u32, h as u32) {
maybe_dump_video_jpeg(
&app,
"camera_i420_roundtrip",
std::env::consts::OS,
frame_no,
&converted_jpeg,
w as u32,
h as u32,
);
}
if frame_no == 1 || frame_no % 150 == 0 {
emit_call_debug(
&app,
"camera:frame_received",
serde_json::json!({
"frame_no": frame_no,
"width": w,
"height": h,
"jpeg_bytes": jpeg_bytes.len(),
"yuv_bytes": yuv.len(),
}),
);
}
let ts = std::time::SystemTime::now() let ts = std::time::SystemTime::now()
.duration_since(std::time::UNIX_EPOCH) .duration_since(std::time::UNIX_EPOCH)
@@ -182,7 +386,52 @@ async fn push_camera_frame(
let engine = state.engine.lock().await; let engine = state.engine.lock().await;
if let Some(ref eng) = *engine { if let Some(ref eng) = *engine {
if let Some(ref tx) = eng.camera_tx { if let Some(ref tx) = eng.camera_tx {
let _ = tx.try_send(frame); // drop frame if send task is saturated match tx.try_send(frame) {
Ok(()) => {
if frame_no == 1 || frame_no % 150 == 0 {
emit_call_debug(
&app,
"camera:frame_queued",
serde_json::json!({ "frame_no": frame_no }),
);
}
}
Err(e) => {
let drops = CAMERA_PUSH_DROPS.fetch_add(1, Ordering::Relaxed) + 1;
if drops == 1 || drops % 30 == 0 {
emit_call_debug(
&app,
"camera:frame_drop",
serde_json::json!({
"frame_no": frame_no,
"drops": drops,
"reason": e.to_string(),
}),
);
}
}
}
} else {
let count = CAMERA_PUSH_NO_SENDER.fetch_add(1, Ordering::Relaxed) + 1;
if count == 1 || count % 150 == 0 {
emit_call_debug(
&app,
"camera:no_video_sender",
serde_json::json!({
"count": count,
"hint": "video was not negotiated or the encoder task failed before camera_tx was installed",
}),
);
}
}
} else {
let count = CAMERA_PUSH_NO_ENGINE.fetch_add(1, Ordering::Relaxed) + 1;
if count == 1 || count % 150 == 0 {
emit_call_debug(
&app,
"camera:no_call_engine",
serde_json::json!({ "count": count }),
);
} }
} }
Ok(()) Ok(())
@@ -197,7 +446,7 @@ mod video_tests {
fn solid_rgb_frame(w: usize, h: usize, r: u8, g: u8, b: u8) -> Vec<u8> { fn solid_rgb_frame(w: usize, h: usize, r: u8, g: u8, b: u8) -> Vec<u8> {
let mut rgb = vec![0u8; w * h * 3]; let mut rgb = vec![0u8; w * h * 3];
for i in 0..w * h { for i in 0..w * h {
rgb[i * 3] = r; rgb[i * 3] = r;
rgb[i * 3 + 1] = g; rgb[i * 3 + 1] = g;
rgb[i * 3 + 2] = b; rgb[i * 3 + 2] = b;
} }
@@ -249,8 +498,14 @@ mod video_tests {
let s = b64.unwrap(); let s = b64.unwrap();
assert!(!s.is_empty()); assert!(!s.is_empty());
// JPEG base64 starts with '/9j/' (FFD8FF marker). // JPEG base64 starts with '/9j/' (FFD8FF marker).
let decoded = base64::engine::general_purpose::STANDARD.decode(&s).unwrap(); let decoded = base64::engine::general_purpose::STANDARD
assert_eq!(&decoded[0..2], &[0xFF, 0xD8], "output must start with JPEG SOI marker"); .decode(&s)
.unwrap();
assert_eq!(
&decoded[0..2],
&[0xFF, 0xD8],
"output must start with JPEG SOI marker"
);
} }
#[test] #[test]
@@ -273,13 +528,18 @@ mod video_tests {
let yuv = rgb_to_i420(&rgb, 64, 64); let yuv = rgb_to_i420(&rgb, 64, 64);
let b64 = i420_to_jpeg_b64(&yuv, 64, 64).expect("should produce JPEG"); let b64 = i420_to_jpeg_b64(&yuv, 64, 64).expect("should produce JPEG");
let jpeg = base64::engine::general_purpose::STANDARD.decode(&b64).unwrap(); let jpeg = base64::engine::general_purpose::STANDARD
.decode(&b64)
.unwrap();
let img = image::load_from_memory_with_format(&jpeg, image::ImageFormat::Jpeg).unwrap(); let img = image::load_from_memory_with_format(&jpeg, image::ImageFormat::Jpeg).unwrap();
let rgb_img = img.to_rgb8(); let rgb_img = img.to_rgb8();
let px = rgb_img.get_pixel(32, 32); let px = rgb_img.get_pixel(32, 32);
let (r, g, b) = (px[0], px[1], px[2]); let (r, g, b) = (px[0], px[1], px[2]);
assert!(r > g && r > b, "red frame: expected R dominant, got R={r} G={g} B={b}"); assert!(
r > g && r > b,
"red frame: expected R dominant, got R={r} G={g} B={b}"
);
} }
#[test] #[test]
@@ -556,8 +816,14 @@ async fn connect(
// Enable birthday attack for hard NAT traversal. Adds ~3s to // Enable birthday attack for hard NAT traversal. Adds ~3s to
// call setup when peer has symmetric NAT. // call setup when peer has symmetric NAT.
birthday_attack: Option<bool>, birthday_attack: Option<bool>,
video_codec: Option<String>,
video_width: Option<u32>,
video_height: Option<u32>,
) -> Result<String, String> { ) -> Result<String, String> {
let force_direct = direct_only.unwrap_or(false); let force_direct = direct_only.unwrap_or(false);
let video_codec = video_codec.unwrap_or_else(|| "h264".to_string());
let video_width = video_width.unwrap_or(1280);
let video_height = video_height.unwrap_or(720);
let enable_birthday = birthday_attack.unwrap_or(false); let enable_birthday = birthday_attack.unwrap_or(false);
emit_call_debug( emit_call_debug(
&app, &app,
@@ -570,6 +836,9 @@ async fn connect(
"peer_mapped_addr": peer_mapped_addr, "peer_mapped_addr": peer_mapped_addr,
"direct_only": force_direct, "direct_only": force_direct,
"birthday_attack": enable_birthday, "birthday_attack": enable_birthday,
"video_codec": video_codec,
"video_width": video_width,
"video_height": video_height,
}), }),
); );
let mut engine_lock = state.engine.lock().await; let mut engine_lock = state.engine.lock().await;
@@ -1028,6 +1297,9 @@ async fn connect(
app_for_engine, app_for_engine,
active_quality, active_quality,
peer_max_quality, peer_max_quality,
video_codec,
video_width,
video_height,
move |event_kind, message| { move |event_kind, message| {
let _ = app_clone.emit( let _ = app_clone.emit(
"call-event", "call-event",
@@ -1939,7 +2211,9 @@ fn do_register_signal(
"peer_loss_pct": local_loss_pct, "peer_rtt_ms": local_rtt_ms, "peer_loss_pct": local_loss_pct, "peer_rtt_ms": local_rtt_ms,
}), }),
); );
if let Err(e) = handle_upgrade_proposal(&*transport, &call_id, &proposal_id).await { if let Err(e) =
handle_upgrade_proposal(&*transport, &call_id, &proposal_id).await
{
tracing::warn!("failed to send UpgradeResponse: {e}"); tracing::warn!("failed to send UpgradeResponse: {e}");
} }
} }
@@ -1960,8 +2234,14 @@ fn do_register_signal(
}), }),
); );
if let Err(e) = handle_upgrade_response( if let Err(e) = handle_upgrade_response(
&*transport, &signal_state, &call_id, &proposal_id, accepted, &*transport,
).await { &signal_state,
&call_id,
&proposal_id,
accepted,
)
.await
{
tracing::warn!("failed to handle UpgradeResponse: {e}"); tracing::warn!("failed to handle UpgradeResponse: {e}");
} }
} }
@@ -2354,8 +2634,13 @@ async fn place_call(
.map(|la| la.port()) .map(|la| la.port())
.unwrap_or(0); .unwrap_or(0);
if v4_port > 0 { if v4_port > 0 {
match wzp_client::portmap::acquire_port_mapping(v4_port, None).await { match tokio::time::timeout(
Ok(mapping) => { std::time::Duration::from_millis(750),
wzp_client::portmap::acquire_port_mapping(v4_port, None),
)
.await
{
Ok(Ok(mapping)) => {
let addr = mapping.external_addr.to_string(); let addr = mapping.external_addr.to_string();
tracing::info!(%addr, protocol = ?mapping.protocol, "place_call: port mapping acquired"); tracing::info!(%addr, protocol = ?mapping.protocol, "place_call: port mapping acquired");
emit_call_debug( emit_call_debug(
@@ -2367,10 +2652,19 @@ async fn place_call(
); );
Some(addr) Some(addr)
} }
Err(e) => { Ok(Err(e)) => {
tracing::debug!(error = %e, "place_call: port mapping unavailable (normal on most networks)"); tracing::debug!(error = %e, "place_call: port mapping unavailable (normal on most networks)");
None None
} }
Err(_) => {
tracing::debug!("place_call: port mapping quick probe timed out");
emit_call_debug(
&app,
"place_call:portmap_timeout",
serde_json::json!({ "timeout_ms": 750 }),
);
None
}
} }
} else { } else {
None None
@@ -2597,8 +2891,13 @@ async fn answer_call(
.map(|la| la.port()) .map(|la| la.port())
.unwrap_or(0); .unwrap_or(0);
if v4_port > 0 { if v4_port > 0 {
match wzp_client::portmap::acquire_port_mapping(v4_port, None).await { match tokio::time::timeout(
Ok(mapping) => { std::time::Duration::from_millis(750),
wzp_client::portmap::acquire_port_mapping(v4_port, None),
)
.await
{
Ok(Ok(mapping)) => {
tracing::info!( tracing::info!(
addr = %mapping.external_addr, addr = %mapping.external_addr,
protocol = ?mapping.protocol, protocol = ?mapping.protocol,
@@ -2606,10 +2905,19 @@ async fn answer_call(
); );
Some(mapping.external_addr.to_string()) Some(mapping.external_addr.to_string())
} }
Err(e) => { Ok(Err(e)) => {
tracing::debug!(error = %e, "answer_call: port mapping unavailable"); tracing::debug!(error = %e, "answer_call: port mapping unavailable");
None None
} }
Err(_) => {
tracing::debug!("answer_call: port mapping quick probe timed out");
emit_call_debug(
&app,
"answer_call:portmap_timeout",
serde_json::json!({ "timeout_ms": 750 }),
);
None
}
} }
} else { } else {
None None
@@ -3208,7 +3516,9 @@ mod signal_tests {
#[tokio::test] #[tokio::test]
async fn upgrade_proposal_auto_accepts() { async fn upgrade_proposal_auto_accepts() {
let transport = LoopbackTransport::new(); let transport = LoopbackTransport::new();
handle_upgrade_proposal(&*transport, "c1", "p1").await.unwrap(); handle_upgrade_proposal(&*transport, "c1", "p1")
.await
.unwrap();
let sent = transport.take_sent(); let sent = transport.take_sent();
assert_eq!(sent.len(), 1); assert_eq!(sent.len(), 1);
@@ -3235,8 +3545,11 @@ mod signal_tests {
let signal_state = empty_signal_state(); let signal_state = empty_signal_state();
{ {
let sig = signal_state.lock().await; let sig = signal_state.lock().await;
*sig.pending_upgrade.lock().unwrap() = *sig.pending_upgrade.lock().unwrap() = Some((
Some(("c1".into(), "p1".into(), wzp_proto::QualityProfile::STUDIO_48K)); "c1".into(),
"p1".into(),
wzp_proto::QualityProfile::STUDIO_48K,
));
} }
handle_upgrade_response(&*transport, &signal_state, "c1", "p1", true) handle_upgrade_response(&*transport, &signal_state, "c1", "p1", true)
@@ -3396,6 +3709,7 @@ pub fn run() {
get_dred_verbose_logs, get_dred_verbose_logs,
set_call_debug_logs, set_call_debug_logs,
get_call_debug_logs, get_call_debug_logs,
call_debug_log,
push_camera_frame, push_camera_frame,
]) ])
.run(tauri::generate_context!()) .run(tauri::generate_context!())

@@ -122,6 +122,8 @@ const sCallDebugCopyBtn = document.getElementById("s-call-debug-copy") as HTMLBu
const sCallDebugShareBtn = document.getElementById("s-call-debug-share") as HTMLButtonElement; const sCallDebugShareBtn = document.getElementById("s-call-debug-share") as HTMLButtonElement;
const sQuality = document.getElementById("s-quality") as HTMLInputElement; const sQuality = document.getElementById("s-quality") as HTMLInputElement;
const sQualityLabel = document.getElementById("s-quality-label")!; const sQualityLabel = document.getElementById("s-quality-label")!;
const sVideoCodec = document.getElementById("s-video-codec") as HTMLSelectElement;
const sVideoResolution = document.getElementById("s-video-resolution") as HTMLSelectElement;
const sFingerprint = document.getElementById("s-fingerprint")!; const sFingerprint = document.getElementById("s-fingerprint")!;
const sPublicAddr = document.getElementById("s-public-addr")!; const sPublicAddr = document.getElementById("s-public-addr")!;
const sReflectBtn = document.getElementById("s-reflect-btn")!; const sReflectBtn = document.getElementById("s-reflect-btn")!;
@@ -138,6 +140,8 @@ interface Settings {
alias: string; alias: string;
osAec: boolean; osAec: boolean;
quality: string; quality: string;
videoCodec: string;
videoResolution: string;
recentRooms: RecentRoom[]; recentRooms: RecentRoom[];
dredDebugLogs: boolean; dredDebugLogs: boolean;
callDebugLogs: boolean; callDebugLogs: boolean;
@@ -151,7 +155,7 @@ function loadSettings(): Settings {
{ name: "Default", address: "193.180.213.68:4433" }, { name: "Default", address: "193.180.213.68:4433" },
], ],
selectedRelay: 0, room: "general", alias: "", selectedRelay: 0, room: "general", alias: "",
osAec: true, quality: "auto", recentRooms: [], osAec: true, quality: "auto", videoCodec: "h264", videoResolution: "1280x720", recentRooms: [],
dredDebugLogs: false, callDebugLogs: false, dredDebugLogs: false, callDebugLogs: false,
directOnly: false, birthdayAttack: false, directOnly: false, birthdayAttack: false,
}; };
@@ -164,6 +168,25 @@ function loadSettings(): Settings {
function saveSettings(s: Settings) { function saveSettings(s: Settings) {
localStorage.setItem("wzp-settings", JSON.stringify(s)); localStorage.setItem("wzp-settings", JSON.stringify(s));
} }
function parseVideoResolution(value: string) {
const [wRaw, hRaw] = (value || "1280x720").split("x");
const width = Number.parseInt(wRaw, 10);
const height = Number.parseInt(hRaw, 10);
if (!Number.isFinite(width) || !Number.isFinite(height)) {
return { width: 1280, height: 720 };
}
return { width, height };
}
function videoConnectOptions(s: Settings) {
const { width, height } = parseVideoResolution(s.videoResolution);
return {
videoCodec: s.videoCodec || "h264",
videoWidth: width,
videoHeight: height,
};
}
function getRelay(): RelayServer | null { function getRelay(): RelayServer | null {
const s = loadSettings(); const s = loadSettings();
return s.relays[s.selectedRelay] || s.relays[0] || null; return s.relays[s.selectedRelay] || s.relays[0] || null;
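Note on `parseVideoResolution` above: it falls back to 1280x720 only when a dimension parses to `NaN`, so a stored value such as `"0x0"` or `"-5x10"` would be passed through to `connect`. The sketch below is a hypothetical stricter variant, not part of this commit; the names `VideoSize`, `defaultVideoSize`, and `parseVideoResolutionStrict` are assumptions made for illustration.

```typescript
interface VideoSize {
  width: number;
  height: number;
}

// Same default the diff uses when the stored setting is unusable.
function defaultVideoSize(): VideoSize {
  return { width: 1280, height: 720 };
}

// Hypothetical stricter parser: requires exactly two "x"-separated parts and
// positive integer dimensions, otherwise returns the default size.
function parseVideoResolutionStrict(value: string): VideoSize {
  const parts = (value || "").split("x");
  if (parts.length !== 2) return defaultVideoSize();
  const [wRaw = "", hRaw = ""] = parts;
  const width = Number.parseInt(wRaw, 10);
  const height = Number.parseInt(hRaw, 10);
  if (!Number.isFinite(width) || !Number.isFinite(height)) return defaultVideoSize();
  if (width <= 0 || height <= 0) return defaultVideoSize();
  return { width, height };
}
```

Whether the extra checks are worth having depends on whether `videoResolution` can ever hold anything other than the fixed `<select>` options; if it cannot, the version in the diff is sufficient.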
@@ -180,8 +203,85 @@ let pendingCallId: string | null = null;
let cameraActive = false; let cameraActive = false;
let cameraStream: MediaStream | null = null; let cameraStream: MediaStream | null = null;
let cameraFrameTimer: number | null = null; let cameraFrameTimer: number | null = null;
let cameraFrameCallbackHandle: number | null = null;
let cameraCaptureInFlight = false;
let lastCameraCaptureAtMs = 0;
let remoteVideoActive = false; let remoteVideoActive = false;
interface FrameCallbackVideoElement extends HTMLVideoElement {
requestVideoFrameCallback?: (callback: (now: DOMHighResTimeStamp, metadata: unknown) => void) => number;
cancelVideoFrameCallback?: (handle: number) => void;
}
// Keep the local preview out of the video stage stacking context so it can float
// above the call drawer and remain draggable on phones.
document.body.appendChild(vdLocalVideo);
vdLocalVideo.classList.add("hidden");
function clampNumber(value: number, min: number, max: number) {
return Math.min(Math.max(value, min), max);
}
function keepLocalPipInViewport() {
if (vdLocalVideo.classList.contains("hidden")) return;
const rect = vdLocalVideo.getBoundingClientRect();
if (!rect.width || !rect.height) return;
const margin = 12;
const maxLeft = Math.max(margin, window.innerWidth - rect.width - margin);
const maxTop = Math.max(margin, window.innerHeight - rect.height - margin);
const left = clampNumber(rect.left, margin, maxLeft);
const top = clampNumber(rect.top, margin, maxTop);
vdLocalVideo.style.left = `${left}px`;
vdLocalVideo.style.top = `${top}px`;
vdLocalVideo.style.right = "auto";
vdLocalVideo.style.bottom = "auto";
}
function initLocalPipDrag() {
let dragPointerId: number | null = null;
let dragOffsetX = 0;
let dragOffsetY = 0;
vdLocalVideo.addEventListener("pointerdown", (event) => {
if (vdLocalVideo.classList.contains("hidden")) return;
dragPointerId = event.pointerId;
const rect = vdLocalVideo.getBoundingClientRect();
dragOffsetX = event.clientX - rect.left;
dragOffsetY = event.clientY - rect.top;
vdLocalVideo.classList.add("dragging");
vdLocalVideo.setPointerCapture(event.pointerId);
event.preventDefault();
});
vdLocalVideo.addEventListener("pointermove", (event) => {
if (dragPointerId !== event.pointerId) return;
const rect = vdLocalVideo.getBoundingClientRect();
const margin = 12;
const maxLeft = Math.max(margin, window.innerWidth - rect.width - margin);
const maxTop = Math.max(margin, window.innerHeight - rect.height - margin);
const left = clampNumber(event.clientX - dragOffsetX, margin, maxLeft);
const top = clampNumber(event.clientY - dragOffsetY, margin, maxTop);
vdLocalVideo.style.left = `${left}px`;
vdLocalVideo.style.top = `${top}px`;
vdLocalVideo.style.right = "auto";
vdLocalVideo.style.bottom = "auto";
event.preventDefault();
});
function endDrag(event: PointerEvent) {
if (dragPointerId !== event.pointerId) return;
dragPointerId = null;
vdLocalVideo.classList.remove("dragging");
try { vdLocalVideo.releasePointerCapture(event.pointerId); } catch {}
}
vdLocalVideo.addEventListener("pointerup", endDrag);
vdLocalVideo.addEventListener("pointercancel", endDrag);
window.addEventListener("resize", keepLocalPipInViewport);
}
initLocalPipDrag();
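Note on the picture-in-picture clamping above: `keepLocalPipInViewport` and the `pointermove` handler repeat the same margin arithmetic. A minimal sketch of that arithmetic as a pure function, assuming a hypothetical helper named `clampPipPosition` (it does not exist in the commit), could look like this:

```typescript
function clampNumber(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

interface PipPosition {
  left: number;
  top: number;
}

// Hypothetical helper: keeps a box of pipWidth x pipHeight at least `margin`
// pixels inside the viewport. When the box is larger than the viewport, the
// upper bound collapses to `margin`, matching the Math.max(...) in the diff.
function clampPipPosition(
  desiredLeft: number,
  desiredTop: number,
  pipWidth: number,
  pipHeight: number,
  viewportWidth: number,
  viewportHeight: number,
  margin: number = 12,
): PipPosition {
  const maxLeft = Math.max(margin, viewportWidth - pipWidth - margin);
  const maxTop = Math.max(margin, viewportHeight - pipHeight - margin);
  return {
    left: clampNumber(desiredLeft, margin, maxLeft),
    top: clampNumber(desiredTop, margin, maxTop),
  };
}
```

Both call sites could then pass either the current rect position or the pointer position minus the drag offset, which would keep the two code paths from drifting apart.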
function showToast(msg: string, durationMs = 3500) { function showToast(msg: string, durationMs = 3500) {
let el = document.getElementById("wzp-toast"); let el = document.getElementById("wzp-toast");
if (!el) { if (!el) {
@@ -263,6 +363,10 @@ function renderCallDebugLog() {
sCallDebugLogEl.scrollTop = sCallDebugLogEl.scrollHeight; sCallDebugLogEl.scrollTop = sCallDebugLogEl.scrollHeight;
} }
function debugLog(step: string, details: any = {}) {
invoke("call_debug_log", { step, details }).catch(() => {});
}
// ── Quality slider ──────────────────────────────────────────────── // ── Quality slider ────────────────────────────────────────────────
const QUALITY_STEPS = ["studio-64k", "studio-48k", "studio-32k", "auto", "good", "degraded", "codec2-3200", "catastrophic"]; const QUALITY_STEPS = ["studio-64k", "studio-48k", "studio-32k", "auto", "good", "degraded", "codec2-3200", "catastrophic"];
const QUALITY_LABELS = ["Studio 64k", "Studio 48k", "Studio 32k", "Auto", "Opus 24k", "Opus 6k", "Codec2 3.2k", "Codec2 1.2k"]; const QUALITY_LABELS = ["Studio 64k", "Studio 48k", "Studio 32k", "Auto", "Opus 24k", "Opus 6k", "Codec2 3.2k", "Codec2 1.2k"];
@@ -385,6 +489,7 @@ joinVoiceBtn.addEventListener("click", async () => {
alias: s.alias || "", alias: s.alias || "",
osAec: s.osAec, osAec: s.osAec,
quality: s.quality || "auto", quality: s.quality || "auto",
...videoConnectOptions(s),
}); });
enterVoice(false); enterVoice(false);
} catch (e: any) { } catch (e: any) {
@@ -413,6 +518,7 @@ joinVideoBtn.addEventListener("click", async () => {
alias: s.alias || "", alias: s.alias || "",
osAec: s.osAec, osAec: s.osAec,
quality: s.quality || "auto", quality: s.quality || "auto",
...videoConnectOptions(s),
}); });
enterVoice(false); enterVoice(false);
startCamera(); startCamera();
@@ -465,6 +571,10 @@ function leaveVoice() {
if (statusInterval) { clearInterval(statusInterval); statusInterval = null; } if (statusInterval) { clearInterval(statusInterval); statusInterval = null; }
stopCamera(); stopCamera();
remoteVideoActive = false; remoteVideoActive = false;
remoteFrameCount = 0;
remoteFrameSerial++;
vdRemoteCounter.textContent = "0 frames received";
vdRemotePlaceholder.classList.remove("hidden");
vdVideoStrip.classList.add("hidden"); vdVideoStrip.classList.add("hidden");
remoteCtx.clearRect(0, 0, vdRemoteVideo.width, vdRemoteVideo.height); remoteCtx.clearRect(0, 0, vdRemoteVideo.width, vdRemoteVideo.height);
} }
@@ -485,44 +595,162 @@ vdSpkBtn.addEventListener("click", async () => {
// ── Camera (Blocker 4 + 5) ──────────────────────────────────────── // ── Camera (Blocker 4 + 5) ────────────────────────────────────────
const camCaptureCanvas = document.createElement("canvas"); const camCaptureCanvas = document.createElement("canvas");
const camCaptureCtx = camCaptureCanvas.getContext("2d")!; const camCaptureCtx = camCaptureCanvas.getContext("2d")!;
let cameraSendWidth = 1280;
let cameraSendHeight = 720;
let cameraCaptureFrameNo = 0;
let cameraPushFailures = 0;
const CAMERA_CAPTURE_INTERVAL_MS = 33; // ≈ 30 fps
const CAMERA_JPEG_QUALITY = 0.7;
function drawCameraFrameForSend() {
const vw = vdLocalVideo.videoWidth || camCaptureCanvas.width;
const vh = vdLocalVideo.videoHeight || camCaptureCanvas.height;
if (!vw || !vh) return;
const scale = Math.min(cameraSendWidth / vw, cameraSendHeight / vh);
const dw = vw * scale;
const dh = vh * scale;
const dx = (cameraSendWidth - dw) / 2;
const dy = (cameraSendHeight - dh) / 2;
camCaptureCtx.fillStyle = "#000";
camCaptureCtx.fillRect(0, 0, cameraSendWidth, cameraSendHeight);
camCaptureCtx.drawImage(vdLocalVideo, dx, dy, dw, dh);
}
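Note on `drawCameraFrameForSend` above: it letterboxes the camera frame into the negotiated send size, scaling by the smaller of the two axis ratios and centering the result on a black canvas. The sketch below isolates that arithmetic so it can be checked without a canvas; `letterboxRect` and `DrawRect` are hypothetical names used only for illustration.

```typescript
interface DrawRect {
  dx: number;
  dy: number;
  dw: number;
  dh: number;
}

// Fit a srcW x srcH image inside a dstW x dstH canvas while preserving the
// aspect ratio; the leftover space is split evenly on both sides.
function letterboxRect(srcW: number, srcH: number, dstW: number, dstH: number): DrawRect {
  const scale = Math.min(dstW / srcW, dstH / srcH);
  const dw = srcW * scale;
  const dh = srcH * scale;
  return {
    dx: (dstW - dw) / 2,
    dy: (dstH - dh) / 2,
    dw,
    dh,
  };
}

// A 4:3 camera (640x480) sent as 1280x720 is scaled by 1.5 and gets 160 px
// black bars on the left and right.
const fourByThree = letterboxRect(640, 480, 1280, 720);
```

A portrait source such as 480x640 is handled by the same formula: the height ratio becomes the limiting one and the bars grow accordingly.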
async function captureAndPushCameraFrame() {
if (!cameraActive || cameraCaptureInFlight) return;
cameraCaptureInFlight = true;
cameraCaptureFrameNo++;
try {
drawCameraFrameForSend();
const dataUrl = camCaptureCanvas.toDataURL("image/jpeg", CAMERA_JPEG_QUALITY);
const b64 = dataUrl.slice(dataUrl.indexOf(",") + 1);
if (cameraCaptureFrameNo === 1 || cameraCaptureFrameNo % 150 === 0) {
debugLog("camera:capture_frame", {
frame_no: cameraCaptureFrameNo,
width: camCaptureCanvas.width,
height: camCaptureCanvas.height,
source_width: vdLocalVideo.videoWidth || null,
source_height: vdLocalVideo.videoHeight || null,
jpeg_b64_len: b64.length,
capture_clock: getVideoFrameCallbackApi() ? "video_frame_callback" : "interval",
});
}
await invoke("push_camera_frame", { jpegB64: b64 });
} catch (e: any) {
cameraPushFailures++;
if (cameraPushFailures === 1 || cameraPushFailures % 30 === 0) {
debugLog("camera:push_failed", {
frame_no: cameraCaptureFrameNo,
failures: cameraPushFailures,
error: errorMessage(e),
});
}
} finally {
cameraCaptureInFlight = false;
}
}
function getVideoFrameCallbackApi() {
const video = vdLocalVideo as FrameCallbackVideoElement;
if (typeof video.requestVideoFrameCallback !== "function") return null;
return video;
}
function cancelCameraCaptureLoop() {
if (cameraFrameTimer != null) {
window.clearInterval(cameraFrameTimer);
cameraFrameTimer = null;
}
const video = getVideoFrameCallbackApi();
if (video && cameraFrameCallbackHandle != null && typeof video.cancelVideoFrameCallback === "function") {
video.cancelVideoFrameCallback(cameraFrameCallbackHandle);
}
cameraFrameCallbackHandle = null;
}
function scheduleCameraFrameCapture() {
cancelCameraCaptureLoop();
lastCameraCaptureAtMs = 0;
const video = getVideoFrameCallbackApi();
if (video) {
const onVideoFrame = (now: DOMHighResTimeStamp) => {
cameraFrameCallbackHandle = null;
if (!cameraActive) return;
if (lastCameraCaptureAtMs === 0 || now - lastCameraCaptureAtMs >= CAMERA_CAPTURE_INTERVAL_MS) {
lastCameraCaptureAtMs = now;
void captureAndPushCameraFrame();
}
cameraFrameCallbackHandle = video.requestVideoFrameCallback!(onVideoFrame);
};
cameraFrameCallbackHandle = video.requestVideoFrameCallback(onVideoFrame);
debugLog("camera:capture_clock", { mode: "video_frame_callback", interval_ms: CAMERA_CAPTURE_INTERVAL_MS });
return;
}
cameraFrameTimer = window.setInterval(() => {
void captureAndPushCameraFrame();
}, CAMERA_CAPTURE_INTERVAL_MS);
debugLog("camera:capture_clock", { mode: "interval", interval_ms: CAMERA_CAPTURE_INTERVAL_MS });
}
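Note on `scheduleCameraFrameCapture` above: in the `requestVideoFrameCallback` path, a frame is captured only when at least `CAMERA_CAPTURE_INTERVAL_MS` has passed since the last capture. The simulation below is a sketch under the assumption that callbacks arrive at perfectly even spacing; `countCaptures` is a hypothetical function, and it mirrors the diff's use of `0` as the "never captured" sentinel.

```typescript
// Count how many of `frames` evenly spaced video-frame callbacks would pass
// the "now - last >= threshold" check used in the diff.
function countCaptures(frameIntervalMs: number, frames: number, thresholdMs: number): number {
  let lastCaptureAtMs = 0; // 0 means "no capture yet", as in the diff
  let captures = 0;
  for (let k = 1; k <= frames; k++) {
    const now = k * frameIntervalMs;
    if (lastCaptureAtMs === 0 || now - lastCaptureAtMs >= thresholdMs) {
      lastCaptureAtMs = now;
      captures++;
    }
  }
  return captures;
}

const from60HzSource = countCaptures(1000 / 60, 60, 33); // every other frame passes
const from30HzSource = countCaptures(1000 / 30, 30, 33); // every frame passes
const from30HzWith34 = countCaptures(1000 / 30, 30, 34); // every other frame passes
```

Under this idealized timing, a 33 ms threshold lets every frame of a 30 fps camera through, while a threshold only slightly above the frame spacing would halve the rate. Real callbacks jitter, so measured rates may differ.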
async function startCamera() { async function startCamera() {
if (cameraActive) return; if (cameraActive) return;
const videoSize = parseVideoResolution(loadSettings().videoResolution);
cameraSendWidth = videoSize.width;
cameraSendHeight = videoSize.height;
const constraints = {
video: { width: { ideal: cameraSendWidth }, height: { ideal: cameraSendHeight }, facingMode: "user" },
audio: false,
};
debugLog("camera:get_user_media_start", { constraints });
try { try {
cameraStream = await navigator.mediaDevices.getUserMedia({ cameraStream = await navigator.mediaDevices.getUserMedia(constraints);
video: { width: { ideal: 1280 }, height: { ideal: 720 }, facingMode: "user" },
audio: false,
});
vdLocalVideo.srcObject = cameraStream; vdLocalVideo.srcObject = cameraStream;
vdVideoStrip.classList.remove("hidden"); vdVideoStrip.classList.remove("hidden");
const track = cameraStream.getVideoTracks()[0]; const track = cameraStream.getVideoTracks()[0];
const settings = track.getSettings(); const settings = track.getSettings();
camCaptureCanvas.width = settings.width ?? 640; camCaptureCanvas.width = cameraSendWidth;
camCaptureCanvas.height = settings.height ?? 360; camCaptureCanvas.height = cameraSendHeight;
debugLog("camera:get_user_media_ok", {
width: settings.width ?? null,
height: settings.height ?? null,
send_width: camCaptureCanvas.width,
send_height: camCaptureCanvas.height,
frameRate: settings.frameRate ?? null,
deviceId: settings.deviceId ? "present" : null,
facingMode: settings.facingMode ?? null,
});
cameraActive = true; cameraActive = true;
cameraCaptureFrameNo = 0;
cameraPushFailures = 0;
vdCamIcon.textContent = "Cam ✓"; vdCamIcon.textContent = "Cam ✓";
vdCamBtn.classList.add("active"); vdCamBtn.classList.add("active");
vdLocalVideo.classList.remove("hidden");
keepLocalPipInViewport();
// Capture loop at ~15 fps scheduleCameraFrameCapture();
cameraFrameTimer = window.setInterval(async () => { } catch (e: any) {
if (!cameraActive) return;
camCaptureCtx.drawImage(vdLocalVideo, 0, 0, camCaptureCanvas.width, camCaptureCanvas.height);
const dataUrl = camCaptureCanvas.toDataURL("image/jpeg", 0.75);
const b64 = dataUrl.slice(dataUrl.indexOf(",") + 1);
try { await invoke("push_camera_frame", { jpeg_b64: b64 }); } catch { /* call not active */ }
}, 67); // 67 ms ≈ 15 fps
} catch (e) {
console.warn("camera access denied or unavailable:", e); console.warn("camera access denied or unavailable:", e);
debugLog("camera:get_user_media_failed", {
name: e?.name ?? null,
message: e?.message ?? String(e),
});
} }
} }
function stopCamera() { function stopCamera() {
if (cameraActive) {
debugLog("camera:stopped", { frames: cameraCaptureFrameNo });
}
cameraActive = false; cameraActive = false;
if (cameraFrameTimer != null) { window.clearInterval(cameraFrameTimer); cameraFrameTimer = null; } cancelCameraCaptureLoop();
if (cameraStream) { cameraStream.getTracks().forEach(t => t.stop()); cameraStream = null; } if (cameraStream) { cameraStream.getTracks().forEach(t => t.stop()); cameraStream = null; }
vdLocalVideo.srcObject = null; vdLocalVideo.srcObject = null;
vdLocalVideo.classList.add("hidden");
vdCamIcon.textContent = "Cam"; vdCamIcon.textContent = "Cam";
vdCamBtn.classList.remove("active"); vdCamBtn.classList.remove("active");
// Hide strip only if remote video is also gone // Hide strip only if remote video is also gone
@@ -535,21 +763,74 @@ vdCamBtn.addEventListener("click", () => {
// ── Remote video display (Blocker 5) ───────────────────────────── // ── Remote video display (Blocker 5) ─────────────────────────────
const remoteCtx = vdRemoteVideo.getContext("2d")!; const remoteCtx = vdRemoteVideo.getContext("2d")!;
const vdRemotePlaceholder = document.getElementById("vd-remote-placeholder")!;
const vdRemoteCounter = document.getElementById("vd-remote-counter")!;
let remoteFrameCount = 0;
let remoteFrameSerial = 0;
let remoteDrawInFlight = false;
let remotePendingFrame: { serial: number; width: number; height: number; jpeg_b64: string } | null = null;
function nextAnimationFrame() {
return new Promise<void>(resolve => requestAnimationFrame(() => resolve()));
}
async function drawRemoteFrame(frame: { serial: number; width: number; height: number; jpeg_b64: string }) {
const img = new Image();
img.src = `data:image/jpeg;base64,${frame.jpeg_b64}`;
if ("decode" in img) {
await img.decode();
} else {
await new Promise<void>((resolve, reject) => {
img.onload = () => resolve();
img.onerror = () => reject(new Error("remote video image decode failed"));
});
}
if (frame.serial !== remoteFrameSerial) return;
await nextAnimationFrame();
if (frame.serial !== remoteFrameSerial) return;
if (vdRemoteVideo.width !== frame.width) vdRemoteVideo.width = frame.width;
if (vdRemoteVideo.height !== frame.height) vdRemoteVideo.height = frame.height;
remoteCtx.drawImage(img, 0, 0, vdRemoteVideo.width, vdRemoteVideo.height);
}
async function pumpRemoteVideoFrames() {
if (remoteDrawInFlight) return;
remoteDrawInFlight = true;
try {
while (remotePendingFrame) {
const frame = remotePendingFrame;
remotePendingFrame = null;
try {
await drawRemoteFrame(frame);
} catch (e) {
console.warn("remote video draw failed:", e);
}
}
} finally {
remoteDrawInFlight = false;
if (remotePendingFrame) void pumpRemoteVideoFrames();
}
}
listen("video:frame", (event: any) => { listen("video:frame", (event: any) => {
const { width, height, jpeg_b64 } = event.payload; const { width, height, jpeg_b64 } = event.payload;
if (!jpeg_b64) return; if (!jpeg_b64) return;
const frameSerial = ++remoteFrameSerial;
remoteVideoActive = true; remoteVideoActive = true;
vdVideoStrip.classList.remove("hidden"); vdVideoStrip.classList.remove("hidden");
vdRemoteVideo.width = width ?? vdRemoteVideo.width; vdRemotePlaceholder.classList.add("hidden");
vdRemoteVideo.height = height ?? vdRemoteVideo.height; remoteFrameCount++;
if (remoteFrameCount === 1) console.log("first remote video frame:", width, "x", height);
const img = new Image(); remotePendingFrame = {
img.onload = () => { serial: frameSerial,
remoteCtx.drawImage(img, 0, 0, vdRemoteVideo.width, vdRemoteVideo.height); width: width ?? vdRemoteVideo.width,
height: height ?? vdRemoteVideo.height,
jpeg_b64,
}; };
img.src = `data:image/jpeg;base64,${jpeg_b64}`; void pumpRemoteVideoFrames();
}); });
// ── Poll status ─────────────────────────────────────────────────── // ── Poll status ───────────────────────────────────────────────────
@@ -671,6 +952,7 @@ listen("signal-event", (event: any) => {
peerMappedAddr: data.peer_mapped_addr ?? null, peerMappedAddr: data.peer_mapped_addr ?? null,
directOnly: s.directOnly || false, directOnly: s.directOnly || false,
birthdayAttack: s.birthdayAttack || false, birthdayAttack: s.birthdayAttack || false,
...videoConnectOptions(s),
}); });
enterVoice(true); enterVoice(true);
} catch (e: any) { } catch (e: any) {
@@ -821,6 +1103,8 @@ function openSettings() {
sCallDebug.checked = !!s.callDebugLogs; sCallDebug.checked = !!s.callDebugLogs;
sDirectOnly.checked = !!s.directOnly; sDirectOnly.checked = !!s.directOnly;
sBirthdayAttack.checked = !!s.birthdayAttack; sBirthdayAttack.checked = !!s.birthdayAttack;
sVideoCodec.value = s.videoCodec || "h264";
sVideoResolution.value = s.videoResolution || "1280x720";
sCallDebugSection.style.display = s.callDebugLogs ? "" : "none"; sCallDebugSection.style.display = s.callDebugLogs ? "" : "none";
renderCallDebugLog(); renderCallDebugLog();
const qi = qualityToIndex(s.quality || "auto"); const qi = qualityToIndex(s.quality || "auto");
@@ -846,6 +1130,8 @@ settingsSave.addEventListener("click", () => {
s.callDebugLogs = sCallDebug.checked; s.callDebugLogs = sCallDebug.checked;
s.directOnly = sDirectOnly.checked; s.directOnly = sDirectOnly.checked;
s.birthdayAttack = sBirthdayAttack.checked; s.birthdayAttack = sBirthdayAttack.checked;
s.videoCodec = sVideoCodec.value || "h264";
s.videoResolution = sVideoResolution.value || "1280x720";
saveSettings(s); saveSettings(s);
invoke("set_dred_verbose_logs", { enabled: s.dredDebugLogs }).catch(() => {}); invoke("set_dred_verbose_logs", { enabled: s.dredDebugLogs }).catch(() => {});
invoke("set_call_debug_logs", { enabled: s.callDebugLogs }).catch(() => {}); invoke("set_call_debug_logs", { enabled: s.callDebugLogs }).catch(() => {});


@@ -258,7 +258,7 @@ body {
border-top: 1px solid var(--surface2); border-top: 1px solid var(--surface2);
padding: 0 16px; padding: 0 16px;
padding-bottom: env(safe-area-inset-bottom, 8px); padding-bottom: env(safe-area-inset-bottom, 8px);
z-index: 50; z-index: 70;
animation: drawerUp 0.25s ease-out; animation: drawerUp 0.25s ease-out;
box-shadow: 0 -4px 20px rgba(0,0,0,0.4); box-shadow: 0 -4px 20px rgba(0,0,0,0.4);
} }
@@ -316,20 +316,66 @@ body {
padding: 2px 0 4px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; padding: 2px 0 4px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis;
} }
/* Video strip in voice drawer */ /* Full-screen video stage — overlays lobby/main when video is active */
.vd-video-strip { .vd-video-stage {
display: flex; position: fixed;
gap: 4px; top: 0;
padding: 4px 0 2px; left: 0;
overflow-x: auto; right: 0;
} bottom: 96px; /* leave room for voice drawer */
.vd-video-tile {
width: 160px;
height: 90px;
border-radius: 6px;
background: #000; background: #000;
z-index: 40;
overflow: hidden;
}
.vd-remote-stage {
position: absolute;
inset: 0;
width: 100%;
height: 100%;
object-fit: contain;
background: #000;
}
.vd-remote-placeholder {
position: absolute;
inset: 0;
display: flex;
flex-direction: column;
align-items: center;
justify-content: center;
color: #888;
pointer-events: none;
z-index: 1;
}
.vd-remote-placeholder.hidden { display: none; }
.vd-placeholder-text { font-size: 18px; margin-bottom: 8px; }
.vd-placeholder-sub { font-size: 12px; opacity: 0.7; }
.vd-local-pip {
position: fixed;
right: 18px;
bottom: calc(176px + env(safe-area-inset-bottom, 0px));
width: min(34vw, 220px);
height: auto;
aspect-ratio: 16 / 9;
border-radius: 8px;
background: #111;
border: 2px solid rgba(255, 255, 255, 0.2);
object-fit: cover; object-fit: cover;
flex-shrink: 0; box-shadow: 0 4px 20px rgba(0, 0, 0, 0.5);
z-index: 90;
cursor: grab;
touch-action: none;
-webkit-user-drag: none;
}
.vd-local-pip.dragging {
cursor: grabbing;
box-shadow: 0 8px 28px rgba(0, 0, 0, 0.65);
}
@media (max-width: 520px) {
.vd-local-pip {
width: min(48vw, 190px);
right: 12px;
bottom: calc(188px + env(safe-area-inset-bottom, 0px));
}
} }
/* Incoming call banner */ /* Incoming call banner */


@@ -1,195 +1,98 @@
# PRD: E2E Media Encryption — Wire EncryptingTransport on Relay Path # PRD: E2E Media Encryption (rewrite)
> **Status:** proposed > **Status:** proposed (supersedes prior version)
> **Resolves:** Security gap — relay-path media travels in QUIC TLS only; WZP application-layer ChaCha20-Poly1305 is negotiated but never applied. > **Resolves:** Real end-to-end media encryption between call participants.
> **Depends on:** `wzp_client::encrypted_transport::EncryptingTransport` (already implemented). > **Replaces:** The prior version of this PRD described wrapping `QuinnTransport` in `EncryptingTransport` using the pairwise client↔relay session. That approach was implemented (commit `52a6f5e`) and **broke voice between any two clients** because the relay does not decrypt+re-encrypt — see "Why the prior fix failed" below. The wrapping was reverted in commit `e8cab25`.
## Problem ---
`CallEngine::start` (both the Android path and the desktop path) calls ## Why the prior fix failed
`wzp_client::handshake::perform_handshake`, which returns a `HandshakeResult`
containing a `session: Box<dyn CryptoSession>` (a keyed `ChaChaSession`).
Both call sites discard the session — only `hs.video_codec` is retained.
All subsequent `send_media` / `recv_media` calls go directly through `wzp_client::handshake::perform_handshake` performs ECDH **between the client and the relay**. Each client in a room ends up with a **different** pairwise session key (key_A for client A, key_B for client B, etc.).
`Arc<wzp_transport::QuinnTransport>`, which provides QUIC TLS (relay sees
plaintext application data after TLS termination at the relay). The WZP
application-level AEAD — ChaCha20-Poly1305, keyed per-call, relay-never-sees
— is never applied.
`wzp_client::encrypted_transport::EncryptingTransport` exists The relay is an SFU — it forwards `MediaPacket` bytes between participants in a room without inspecting their payloads. The relay does not run a decrypt-then-encrypt step keyed per-recipient.
(`crates/wzp-client/src/encrypted_transport.rs`) and is fully tested.
It wraps any `Arc<dyn MediaTransport>` and intercepts every `send_media` / Wrapping `QuinnTransport` in `EncryptingTransport` therefore produced:
`recv_media` call with `session.encrypt()` / `session.decrypt()`.
```
Client A: plaintext --[encrypt key_A]--> ciphertext --> Relay
Relay: forwards ciphertext (bytes) --> Client B
Client B: ciphertext --[decrypt key_B]--> garbage --> silent audio
```
Result: every recipient saw decryption failures, audio went silent.
This is **not a bug in `EncryptingTransport`** — the wrapper does exactly what it claims. The bug was thinking the pairwise client-relay session was usable for participant-to-participant media. It isn't.
## Goals ## Goals
- The relay-path `HandshakeResult::session` is used to construct an A future implementation must satisfy:
`EncryptingTransport` that wraps the raw `QuinnTransport`.
- All `send_media` and `recv_media` calls in the relay path go through the
wrapper, not the raw transport.
- The direct P2P path (`is_direct_p2p == true`) is left unchanged — QUIC TLS
is the encryption layer there.
- `cargo check --manifest-path desktop/src-tauri/Cargo.toml` passes.
- A `#[cfg(test)]` test verifies that the relay path uses `EncryptingTransport`.
## Non-goals - Two clients in a room can exchange media that the **other client** can decrypt.
- The **relay cannot decrypt** any media payload (true E2E), OR alternatively, the relay can decrypt+re-encrypt per recipient (hop-by-hop, sometimes called SFU-trusted).
- Joining and leaving the room mid-call rotates keys so departed members can't decrypt subsequent traffic (forward secrecy on membership change).
- Compatible with the existing `MediaPacket` wire format (header in plaintext, payload encrypted).
- Rekeying (`SignalMessage::Rekey`) — tracked separately. ## Two valid approaches
- Video transport encryption (same mechanism; apply after audio is confirmed working).
- Changes to the P2P path, the relay binary, or any crate outside `desktop/src-tauri`.
## Design ### Approach A — MLS group keys (true E2E)
### `EncryptingTransport` API (read `crates/wzp-client/src/encrypted_transport.rs`) Use the [MLS protocol](https://datatracker.ietf.org/doc/rfc9420/) (e.g. via the `openmls` crate) to derive a shared **group key** that all room members possess and the relay does not.
```rust - Relay acts as a **delivery service** for MLS Handshake messages (`Welcome`, `Commit`, `Proposal`) but never sees the group secret.
pub struct EncryptingTransport { ... } - Every media packet is AEAD-sealed with the current group epoch key.
- Group rekey is triggered by:
- Member join/leave (forward secrecy on membership)
- Periodic (every N seconds or N packets) for post-compromise security
- Each room maintains its own MLS group; the relay just stores opaque `mls_blob` payloads in `SignalMessage::MlsHandshake`.
impl EncryptingTransport { **Pros:** real E2E. Relay compromise does not leak media.
pub fn new(inner: Arc<dyn MediaTransport>, session: Box<dyn CryptoSession>) -> Self; **Cons:** Significant complexity (MLS state machine per room, persistent ratchet trees, key schedule). Adds `openmls` dependency (~30 KLOC). Federation across relays is harder.
}
// Implements MediaTransport: ### Approach B — Hop-by-hop re-encryption at the relay
// send_media → session.encrypt(header_bytes, payload) → inner.send_media
// recv_media → inner.recv_media → session.decrypt(header_bytes, ciphertext) The relay holds a `CryptoSession` per connected client (which it already does — see `_crypto_session` discarded in `crates/wzp-relay/src/main.rs:1817`). On forward:
// send_signal / recv_signal / path_quality / close → forwarded unchanged
```
Relay.recv_media(from A): decrypt with key_A → plaintext
Relay.send_media(to B, C, D): for each recipient X, encrypt with key_X
``` ```
`EncryptingTransport` is NOT `Arc`-wrapped by the constructor; wrap it in This is the same model as Matrix Megolm-without-Megolm — encrypted hop-by-hop but the relay sees plaintext briefly in between.
`Arc::new(...)` when storing as `Arc<dyn MediaTransport>`.
### Two call sites in `desktop/src-tauri/src/engine.rs` **Pros:** Reuses existing per-client `ChaChaSession`. Implementation is ~100 lines in the relay's room forwarding loop. Federation works the same way (each relay-relay hop has its own session).
**Cons:** Relay sees plaintext. A compromised relay can record and decrypt all media. This is **not E2E** — but it is strictly stronger than the current state (plaintext-over-QUIC-TLS exposes media to anyone with a TLS-terminating proxy on the relay).
**Call site 1 — Android path** (`CallEngine::start` around line 575): ## Recommendation
```rust **Ship Approach B first.** It's a small, well-scoped change that closes the relay-operator-can-see-plaintext-in-RAM gap without requiring an MLS rewrite. Then layer Approach A on top when the threat model demands relay-untrusted operation.
if !is_direct_p2p {
let _hs = match wzp_client::handshake::perform_handshake(...).await { Ok(hs) => hs, ... };
// hs.session is discarded here — fix this
}
```
Change: capture `hs`, then build a wrapped transport: ## Out of scope for this PRD
```rust - Federation gossip key exchange (separate PRD)
if !is_direct_p2p { - SAS (Short Authentication String) verification UX (separate PRD)
let hs = match wzp_client::handshake::perform_handshake(...).await { Ok(hs) => hs, ... }; - Rekey on session compromise (handled by the chosen approach's group/pairwise rekey)
info!(video_codec = ?hs.video_codec, "handshake complete");
let transport: Arc<dyn wzp_proto::MediaTransport> =
Arc::new(wzp_client::encrypted_transport::EncryptingTransport::new(
transport.clone(),
hs.session,
));
// use `transport` (the wrapped version) for all subsequent send_t / recv_t clones
}
```
The variable `transport` must shadow the raw `Arc<QuinnTransport>` so that ## Acceptance criteria (Approach B, first iteration)
every subsequent clone of `transport` picks up the encrypted wrapper.
**Call site 2 — Desktop path** (`CallEngine::start` around line 1551): 1. Relay's room forwarding loop (`crates/wzp-relay/src/room.rs:354` and `:1353`) calls `sender_session.decrypt()` then `recipient_session.encrypt()` per recipient before `send_media`.
2. Each `RoomMember` holds its `Box<dyn CryptoSession>` (currently discarded as `_crypto_session` in `main.rs:1817`).
3. Client-side: re-add the `EncryptingTransport` wrapping in `desktop/src-tauri/src/engine.rs` (the two sites reverted in `e8cab25`).
4. Integration test: two-client mock room exchanges media; verify each recipient gets the sender's plaintext back after the relay double-hop.
5. Existing 825 tests still pass.
```rust ## Verification
let _negotiated_video_codec = if !is_direct_p2p {
let hs = wzp_client::handshake::perform_handshake(...).await?;
info!(video_codec = ?hs.video_codec, "handshake complete");
hs.video_codec // session dropped here — fix this
} else { None };
```
Change: extract `session` before returning `video_codec`, then shadow `cargo test -p wzp-relay --test multi_client_relay_path` should pass with two simulated clients sending audio in both directions and decrypting each other's frames.
`transport` with the wrapped version. Because `transport` is used after this
block (cloned into `send_t`, `recv_t`, etc.), the shadow must happen inside
the same scope or immediately after:
```rust ## Files to touch
let (_negotiated_video_codec, transport): (_, Arc<dyn wzp_proto::MediaTransport>) =
if !is_direct_p2p {
let hs = wzp_client::handshake::perform_handshake(...).await?;
info!(video_codec = ?hs.video_codec, "handshake complete");
let enc = Arc::new(wzp_client::encrypted_transport::EncryptingTransport::new(
transport.clone(),
hs.session,
));
(hs.video_codec, enc)
} else {
info!("direct P2P — skipping relay handshake");
(None, transport.clone())
};
```
All subsequent `transport.clone()` calls then operate on the encrypted wrapper. - `crates/wzp-relay/src/main.rs` — keep `crypto_session` per-client (drop the `_` prefix)
- `crates/wzp-relay/src/room.rs` — add decrypt/re-encrypt to forward path
- `crates/wzp-relay/src/session_mgr.rs` — store sessions keyed by peer
- `desktop/src-tauri/src/engine.rs` — restore `EncryptingTransport` wrapping (~2 sites)
- `crates/wzp-relay/tests/multi_client_relay_path.rs` — new integration test
### Import ## Risk / rollback
Add to the top of `engine.rs` if not already present: If multi-client tests fail in CI, the change is contained to the relay forwarding loop and one engine.rs edit — straightforward revert.
```rust
use wzp_client::encrypted_transport::EncryptingTransport;
```
Or use the fully-qualified path inline (already shown above).
### Type compatibility
- `EncryptingTransport` implements `wzp_proto::MediaTransport` (confirmed in the source).
- The existing `send_t` / `recv_t` variables are already typed as
`Arc<dyn MediaTransport>` (or coerced on first use) — the shadow is a
drop-in replacement.
- The `vid_transport` for the video path (`line ~2090`) is also cloned from
`transport`; it will automatically use the encrypted wrapper if the shadow
is placed before those clones.
## Implementation steps
1. Read `desktop/src-tauri/src/engine.rs` lines 570-620 (Android path) and
1547-1570 (desktop path) to see the exact variable names in each branch.
2. **Android path fix** (line ~585): rename `_hs` to `hs`, extract
`hs.session`, wrap `transport` with `EncryptingTransport::new`, re-bind
`transport` as `Arc<dyn MediaTransport>`.
3. **Desktop path fix** (line ~1551): restructure the
`if !is_direct_p2p` block to return `(video_codec, wrapped_transport)`
and shadow `transport`.
4. Confirm that `vid_transport` (line ~2090) is cloned after the shadow — if
it is, no further changes are needed for video.
5. Run `cargo check --manifest-path desktop/src-tauri/Cargo.toml`. Fix any
type-mismatch errors (usually a missing `as Arc<dyn MediaTransport>` cast
or a moved value).
6. Add a `#[cfg(test)]` module to `engine.rs` (or to a new
`engine_tests.rs` included via `#[cfg(test)] mod engine_tests`) with a
test that constructs a `LoopbackTransport`, calls `perform_handshake`
against a mock relay fixture, and verifies that a received payload is
decrypted before returning from `recv_media`. A simpler alternative that
avoids a full handshake: assert `is::<EncryptingTransport>()` on the
`transport` variable at the test call site using `std::any::Any`.
## Files to read before implementing
- `desktop/src-tauri/src/engine.rs` lines 475-625 (Android path) and
1480-1570 (desktop path)
- `crates/wzp-client/src/encrypted_transport.rs` (full — for the exact
constructor signature and trait impl)
- `crates/wzp-client/src/handshake.rs` (for `HandshakeResult` struct
definition — confirm the `session` field name and type)
## Verify
```bash
cargo check --manifest-path desktop/src-tauri/Cargo.toml
```
Expected: 0 errors.
Manual smoke check: both `perform_handshake` call sites in `engine.rs` must
use `hs.session` (grep: `hs\.session` should appear twice, once per call site).
The string `_hs` must not remain on the relay path (only on the `_hs =` binding if the variable is intentionally unused before wrapping).
## Done when
- `cargo check --manifest-path desktop/src-tauri/Cargo.toml` exits 0.
- Both relay-path `perform_handshake` call sites build an `EncryptingTransport`
from `hs.session`.
- The direct-P2P branch (`is_direct_p2p == true`) is unchanged.
- A `#[cfg(test)]` test in `engine.rs` verifies that `EncryptingTransport`
is used on the relay path (construction proof or decrypt round-trip).


@@ -0,0 +1,415 @@
# BUG-003: Android to macOS Video Banding / Horizontal Lines
**Severity:** P0/P1 - Android camera video is visibly corrupted on macOS at common resolutions.
**Status:** Root cause identified 2026-05-26; candidate fix in `crates/wzp-video/src/videotoolbox.rs`. Awaiting on-device verification.
**Branch:** `main`.
**Latest build observed:** `3ea25a0` (`fix(android): use MediaCodec input layout for video encode`).
**Direction affected:** Android camera -> macOS desktop display.
**Direction mostly OK:** macOS camera -> Android display.
---
## Root Cause (2026-05-26)
The Android H.264 bitstream is **valid**: the locally-encoded `.h264` files and
the macOS-reassembled `.h264` files both decode cleanly with software ffmpeg.
SPS reports the expected `960x540`, `coded_height=544`, `yuv420p`, High profile,
level 3.1.
The corruption appears purely on the macOS receive side. The shiguredo
`I420Frame` wrapper around `CVPixelBuffer` exposes each plane as
`bytes_per_row * height` bytes — i.e. the raw plane buffer including the
per-row stride padding that CoreVideo adds for alignment. `VideoToolboxDecoder`
was concatenating those slices verbatim, then handing the buffer downstream
tagged as tight I420 of `width x height`. The JPEG-encoding consumer
(`i420_to_jpeg_bytes` in `desktop/src-tauri/src/lib.rs`) indexes the buffer
with tight strides `width` and `width/2`, so any plane where
`bytes_per_row > tight_stride` produces per-row drift in the consumer's reads.
Numerical confirmation from the corrupted dump
`000002_desktop_remote_decoded_f000001_960x540.jpg`:
- Banding period along the diagonal: exactly **32 luma rows** = 16 chroma rows.
- Per-column-slice peak offsets shift by ~5 rows per 230-column step, i.e. the
bands are a tilted diagonal, not horizontal — consistent with one chroma row
of drift accumulating per 16 chroma rows of consumer read.
- Solving `u_stride / (u_stride - chroma_width) = 16` with `chroma_width = 480`
yields `u_stride = 512`. That is exactly the 64-byte aligned chroma stride
CoreVideo emits for a 480-wide plane.
- Luma at 960 wide is already 64-aligned, so `y_stride = 960` and the luma
plane is unaffected. This matches the observation under Symptom that 640x360
looks fine (chroma_width 320 is also 64-aligned, so no padding is needed).
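The arithmetic in the list above can be checked with a short sketch. It assumes, as this document infers from the band spacing, that CoreVideo pads each plane row to a 64-byte multiple; the helper names `aligned_stride` and `band_period_rows` are hypothetical and do not exist in `wzp-video`.

```rust
/// Round `width` up to the next multiple of `align` bytes.
/// The 64-byte alignment used below is this document's inference, not a documented constant.
fn aligned_stride(width: usize, align: usize) -> usize {
    (width + align - 1) / align * align
}

/// Rows after which a reader that assumes tight rows has drifted by one full
/// padded row, using the formula from this document: stride / (stride - width).
/// Returns `None` when the plane has no padding, so no drift occurs.
fn band_period_rows(stride: usize, width: usize) -> Option<f64> {
    if stride <= width {
        return None;
    }
    Some(stride as f64 / (stride - width) as f64)
}
```

With these helpers, a 480-wide chroma plane gets a 512-byte stride and a 16-row band period, while 320-wide and 960-wide planes are already aligned and report no period.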
## Fix
`crates/wzp-video/src/videotoolbox.rs` now has an `i420_frame_to_tight` helper
that copies each plane row-by-row using its own `bytes_per_row`, producing a
genuine tight I420 buffer of `width * height + 2 * (width/2) * (height/2)`
bytes. All three decoders (H.264, HEVC, AV1) call the helper instead of
concatenating raw plane slices. On the first successful decode each decoder
logs the actual plane dimensions and strides (`tracing::info!` at target
`wzp_video::videotoolbox`) so future similar bugs are easier to diagnose
without re-deriving from band spacing.
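The row-by-row copy described above can be sketched as follows. This is a minimal illustration, not the actual `i420_frame_to_tight` implementation: the `PaddedPlane` type, the function signatures, and the use of plain byte slices instead of the shiguredo `I420Frame` accessors are all assumptions made for the example.

```rust
/// Hypothetical view of one plane: raw bytes plus the real bytes-per-row.
struct PaddedPlane<'a> {
    data: &'a [u8],
    stride: usize,
}

/// Append `height` rows of `width` visible bytes, skipping the stride padding.
fn copy_plane_tight(src: &[u8], stride: usize, width: usize, height: usize, dst: &mut Vec<u8>) {
    for row in 0..height {
        let start = row * stride;
        dst.extend_from_slice(&src[start..start + width]);
    }
}

/// Build a tight I420 buffer of width*height + 2*(width/2)*(height/2) bytes.
fn i420_to_tight(y: PaddedPlane, u: PaddedPlane, v: PaddedPlane, width: usize, height: usize) -> Vec<u8> {
    let (cw, ch) = (width / 2, height / 2);
    let mut out = Vec::with_capacity(width * height + 2 * cw * ch);
    copy_plane_tight(y.data, y.stride, width, height, &mut out);
    copy_plane_tight(u.data, u.stride, cw, ch, &mut out);
    copy_plane_tight(v.data, v.stride, cw, ch, &mut out);
    out
}
```

Copying each plane with its own stride means a padded chroma plane and an unpadded luma plane are handled by the same code path, which is the case observed at 960x540.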
---
## Symptom
When Android sends camera video to macOS, the macOS view shows repeated horizontal green/magenta line bands over the decoded picture. The lines cover the whole decoded frame, including black side bars added by the Android portrait-camera contain/crop fix.
The Android camera crop/zoom problem is fixed now: the Android front camera is no longer cover-cropped into an extreme zoom. The remaining bug is the line/banding corruption.
The issue is easy to see at H.264 960x540. At 640x360 it has been reported as visually good or much better. HEVC behaves differently: minimum resolution can look good, but 960x540 and 1280x720 tend to pause or deliver only bursts of frames.
---
## Current State
Recent commits relevant to this bug:
```text
3ea25a0 fix(android): use MediaCodec input layout for video encode
1124726 fix(video): add frame metadata and Android encode diagnostics
9a77459 feat(video): add codec and resolution controls
f85efb9 fix(video): improve android stream smoothness
31b2caa fix(video): request keyframes after packet loss
079e21e fix(video): resync decoder after packet gaps
e676641 fix(android): suppress debuggable lint for diagnostic builds
9713efc chore(android): add release debuggable build
```
Important behavior:
- Android source dumps are clean.
- Android I420 roundtrip dumps are clean.
- macOS decoded remote Android frames are corrupted.
- Android receiving macOS video is generally clean.
- Transport/reassembly is probably not the primary issue: early Android local encoded `.h264` files match the corresponding macOS remote reassembled `.h264` prefix/length.
- The bug is likely in Android MediaCodec encoder input layout/color handling, H.264 non-macroblock-aligned dimensions/cropping, or macOS VideoToolbox interpretation of Android-encoded H.264.
---
## Reproduction Build
Use the Tauri Android pipeline, not the legacy native Android Gradle app.
```bash
cd /Users/manwe/CascadeProjects/warzonePhone
git status --short
git log -1 --oneline
./scripts/android-build-async.sh --release-debuggable --wait
```
The APK lands here:
```bash
/Users/manwe/CascadeProjects/warzonePhone/target/tauri-android-apk/wzp-tauri-arm64.apk
```
Install it:
```bash
adb install -r /Users/manwe/CascadeProjects/warzonePhone/target/tauri-android-apk/wzp-tauri-arm64.apk
```
Use `--release-debuggable` for this bug. Plain debug builds can mask the issue because they run at much lower frame rate and look like a slideshow. Plain release builds are not usable for `run-as` frame-dump retrieval.
Critical build trap: `scripts/android-build-async.sh` runs `scripts/build-tauri-android.sh`, which SSHes to `SepehrHomeserverdk` and resets the remote source to `origin/$BRANCH`. Uncommitted local changes are ignored by the Android build. Commit and push before building, or the phone may run old code.
---
## macOS Build / Run
For local desktop repro:
```bash
cd /Users/manwe/CascadeProjects/warzonePhone/desktop
npm install
npm run tauri dev
```
Enable call debug logs in the app settings before starting the call. The in-app call log only keeps the last 200 entries; use the copy/share buttons if preserving textual logs matters.
---
## Repro Steps
1. Start the macOS desktop client.
2. Start the Android `--release-debuggable` APK.
3. Join the same room, usually `general`.
4. Use the same relay as the current manual tests, e.g. `172.16.81.135:4433`, unless testing relay-specific behavior.
5. Turn camera on for both clients.
6. Set both sides to H.264.
7. Set Android send resolution to 960x540. Mac can be 960x540 or higher.
8. Observe Android camera video on macOS.
Expected failure: macOS shows Android video with repeated horizontal green/magenta lines. Android camera source preview and Android frame dumps are clean.
Useful comparison tests:
| Codec / resolution | Observed result |
|---|---|
| H.264 960x540 | Lines/banding on macOS for Android video |
| H.264 640x360 | Reported good or much better; smoother |
| H.264 1280x720 | Lines/banding and/or worse smoothness |
| HEVC 1280x720 | Mac video smooth on Android; Android video on Mac pauses and can look zoomed/corrupt |
| HEVC 960x540 | Same pause pattern, shorter pauses |
| HEVC minimum resolution | Reported good on both devices |
---
## Artifact Collection
### Clear old dumps before a fresh run
macOS:
```bash
rm -rf "$HOME/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps"
```
Android:
```bash
adb shell run-as com.wzp.desktop rm -rf .wzp/frame-dumps
```
The Android clear command requires a debuggable build. If `run-as` fails, rebuild with `--release-debuggable`.
### Pull Android dumps
```bash
cd /Users/manwe/CascadeProjects/warzonePhone
./scripts/pull-android-frame-dumps.sh
```
Output directory:
```text
/Users/manwe/CascadeProjects/warzonePhone/android-frame-dumps/frame-dumps
```
The pull script packages files using:
```bash
adb exec-out "run-as com.wzp.desktop tar -C .wzp -cf - frame-dumps"
```
### macOS dump directory
```text
/Users/manwe/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps
```
### Important dump names
| Dump suffix | Meaning |
|---|---|
| `android_camera_jpeg_in_fXXXXXX_<WxH>.jpg` | Raw browser/camera JPEG entering Rust from Android WebView |
| `android_camera_i420_roundtrip_fXXXXXX_<WxH>.jpg` | Android camera frame after JS/canvas -> Rust I420 conversion, converted back to JPEG |
| `android_local_encoded_fXXXXXX.h264` / `.h265` | Encoded Android camera bitstream before packetization |
| `desktop_remote_encoded_reassembled_fXXXXXX.h264` / `.h265` | macOS reassembled encoded bitstream received from Android |
| `desktop_remote_decoded_fXXXXXX_<WxH>.jpg` | macOS decoded Android video frame, where the lines show |
| `android_remote_decoded_fXXXXXX_<WxH>.jpg` | Android decoded macOS video frame |
Known useful local examples from the latest sessions:
```text
Clean Android source:
/Users/manwe/CascadeProjects/warzonePhone/android-frame-dumps/frame-dumps/000407_android_camera_jpeg_in_f000150_960x540.jpg
/Users/manwe/CascadeProjects/warzonePhone/android-frame-dumps/frame-dumps/000408_android_camera_i420_roundtrip_f000150_960x540.jpg
Corrupt macOS decode:
/Users/manwe/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps/000236_desktop_remote_decoded_f000030_960x540.jpg
/Users/manwe/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps/000241_desktop_remote_decoded_f000060_960x540.jpg
/Users/manwe/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps/000244_desktop_remote_decoded_f000090_960x540.jpg
Encoded bitstream comparison:
/Users/manwe/CascadeProjects/warzonePhone/android-frame-dumps/frame-dumps/000005_android_local_encoded_f000001.h264
/Users/manwe/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps/000064_desktop_remote_encoded_reassembled_f000001.h264
```
These files are local artifacts, not committed test fixtures.
---
## Text Logs
### In-app call debug log
Enable `Call debug logs` in settings before joining. The UI buffer is limited to 200 entries. Use the in-app copy/share buttons immediately after the repro.
Useful events:
```text
camera:get_user_media_ok
camera:capture_clock
camera:capture_frame
video:first_camera_frame
video:camera_frame_sample
video:encoded_frame
video:first_send
video:first_recv
video:first_reassembled
video:reassembled_frame
video:decoder_init_start
video:first_decoded_frame
video:decoded_frame_sample
video:frame_dump
video:byte_dump
```
The crop fix is active when Android `camera:capture_frame` includes portrait source dimensions with a landscape send frame, for example:
```text
camera:capture_frame {"frame_no":150,"width":960,"height":540,"source_width":540,"source_height":960,...}
```
### Android logcat
Logcat can be noisy and may not always retain the in-app call debug entries. Still useful commands:
```bash
adb logcat -c
adb logcat -v time | rg 'camera:capture_frame|video:frame_dump|video:byte_dump|video:first_camera_frame|video:camera_frame_sample|video:encoded_frame|h264_encoder_input|hevc_encoder_input|MediaCodec input format|decoder_debug'
```
For post-run collection:
```bash
adb logcat -d -v time > /tmp/wzp-android-logcat.txt
rg 'camera:|video:|h264_encoder_input|hevc_encoder_input|MediaCodec|decoder_debug' /tmp/wzp-android-logcat.txt
```
If no `h264_encoder_input` / `hevc_encoder_input` entries appear, the current `tracing::info!` path in `crates/wzp-video/src/mediacodec.rs` may not be making it into Android logcat. Convert that diagnostic to `emit_call_debug` from the caller if the next step needs guaranteed visibility.
---
## What We Know
### The Android camera/canvas path is probably clean
The Android dumps for `android_camera_jpeg_in` and `android_camera_i420_roundtrip` at 960x540 are clean. They show the portrait front camera contained inside a landscape frame with black side bars. This means the former zoom/crop bug is fixed and the current bands are not introduced by CSS, canvas sizing, or the browser camera preview.
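For reference, the contain-fit geometry that produces the black side bars can be sketched like this. The function `contain_fit` is hypothetical, and the integer rounding is an assumption; the actual canvas code in `desktop/src/main.ts` may round differently.

```rust
/// Fit a source rectangle inside a destination frame without cropping.
/// Returns (draw_width, draw_height, x_offset, y_offset); the remainder is bars.
fn contain_fit(src_w: u32, src_h: u32, dst_w: u32, dst_h: u32) -> (u32, u32, u32, u32) {
    // Compare aspect ratios by cross-multiplying to stay in integer math.
    let (w, h) = if src_w * dst_h <= dst_w * src_h {
        // Source is relatively taller: height fills the frame, bars on the sides.
        (src_w * dst_h / src_h, dst_h)
    } else {
        // Source is relatively wider: width fills the frame, bars top and bottom.
        (dst_w, src_h * dst_w / src_w)
    };
    (w, h, (dst_w - w) / 2, (dst_h - h) / 2)
}
```

For the logged case of a 540x960 portrait source in a 960x540 send frame, the picture occupies roughly the middle third of the frame, so most of each row is black bar. That is why corruption crossing the bars is a useful signal.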
### The corruption appears after encode/decode
The corrupt lines are present in `desktop_remote_decoded_*`. They cover black bars as well as image content, which points to frame buffer / codec layout corruption rather than a real scene artifact.
### Transport is not the leading suspect
`android_local_encoded_f000001.h264` and `desktop_remote_encoded_reassembled_f000001.h264` have matching sizes/prefixes in the latest diagnostic run. That does not fully prove every later packet is perfect, but it makes relay/datagram/reassembly much less likely as the root cause.
Relays should not need changes for this bug unless the wire format changes. The relay forwards datagrams and does not inspect video frame internals.
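The prefix/size comparison between the local and reassembled bitstream dumps can be reproduced with a small helper. This is a sketch under the assumption that both dumps are read fully into memory (for example with `std::fs::read`); the function names are hypothetical.

```rust
/// Number of leading bytes that are identical in both buffers.
fn common_prefix_len(a: &[u8], b: &[u8]) -> usize {
    a.iter().zip(b.iter()).take_while(|(x, y)| x == y).count()
}

/// Offset of the first difference, or `None` when the buffers are byte-identical.
/// A length mismatch is reported at the end of the shorter buffer.
fn first_mismatch(a: &[u8], b: &[u8]) -> Option<usize> {
    let n = common_prefix_len(a, b);
    if n == a.len() && n == b.len() {
        None
    } else {
        Some(n)
    }
}
```

A `None` result for a pair such as `android_local_encoded_f000001.h264` and `desktop_remote_encoded_reassembled_f000001.h264` would confirm that frame survived transport intact; it says nothing about later frames.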
### Resolution alignment is suspicious
960x540 has a height that is not divisible by 16. H.264 macroblock encoders commonly encode 960x544 and signal cropping to 960x540. The horizontal line bands may be a crop/padding/chroma-plane issue. Testing 960x544 and/or 960x528 is a high-value next step.
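The macroblock arithmetic behind this suspicion is small enough to write down. The sketch below assumes 16x16 macroblocks and uses hypothetical helper names; it only computes dimensions and does not parse an SPS.

```rust
/// H.264 macroblock edge length in luma samples.
const MACROBLOCK: u32 = 16;

/// Dimension the encoder actually codes: the display size rounded up to a macroblock multiple.
fn coded_dimension(display: u32) -> u32 {
    (display + MACROBLOCK - 1) / MACROBLOCK * MACROBLOCK
}

/// Rows (or columns) that must be signalled as cropped to recover the display size.
fn crop_amount(display: u32) -> u32 {
    coded_dimension(display) - display
}
```

For a 540-row frame this gives a coded height of 544 with 4 cropped rows, matching the `coded_height=544` reported by the SPS, while 960, 544, and 528 need no cropping.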
---
## Code Areas
Primary suspects:
- `crates/wzp-video/src/mediacodec.rs` - Android MediaCodec H.264/HEVC encoder and decoder, color format, stride, slice height handling.
- `desktop/src-tauri/src/engine.rs` - packet send/receive, decode lifecycle, frame/byte dump calls.
- `desktop/src-tauri/src/lib.rs` - `maybe_dump_video_jpeg`, `maybe_dump_video_bytes`, app-data paths, call-debug event plumbing.
- `desktop/src/main.ts` - browser camera capture, canvas scaling, codec/resolution settings, UI debug log buffer.
- `crates/wzp-video/src/transport.rs` - video packetization/reassembly and `WZV1` metadata header.
The latest attempted fix in `mediacodec.rs` uses `codec.input_format()` on Android API 28+ to derive encoder input stride/slice/color layout. Since the lines persist, either those fields are not reliable for this encoder, the chosen color format conversion is wrong, or macOS decode/crop interpretation is involved.
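To make the stride/slice-height layout concrete, here is a minimal sketch of copying a tightly packed I420 frame into a padded planar buffer. It is not the code in `mediacodec.rs`; `pack_i420_padded` is a hypothetical name, and the chroma layout (chroma stride = `stride / 2`, chroma slice height = `slice_height / 2`) is an assumption that a given encoder may not follow.

```rust
/// Copies a tight I420 frame into a planar buffer whose luma rows are
/// `stride` bytes apart and whose luma plane occupies `slice_height` rows.
/// ASSUMPTION: chroma planes use stride / 2 and slice_height / 2.
/// Returns None when the dimensions or the input length are inconsistent.
fn pack_i420_padded(
    tight: &[u8],
    width: usize,
    height: usize,
    stride: usize,
    slice_height: usize,
) -> Option<Vec<u8>> {
    let all_even = [width, height, stride, slice_height].iter().all(|v| v % 2 == 0);
    if !all_even || stride < width || slice_height < height {
        return None;
    }
    let (cw, ch) = (width / 2, height / 2);
    if tight.len() != width * height + 2 * cw * ch {
        return None;
    }
    let (cstride, cslice) = (stride / 2, slice_height / 2);
    let y_size = stride * slice_height;
    let c_size = cstride * cslice;
    // Padding bytes stay zero; only the visible pixels are copied.
    let mut out = vec![0u8; y_size + 2 * c_size];

    // (source offset, destination offset, bytes per row, rows, destination stride)
    let planes = [
        (0, 0, width, height, stride),                          // Y
        (width * height, y_size, cw, ch, cstride),              // U
        (width * height + cw * ch, y_size + c_size, cw, ch, cstride), // V
    ];
    for (src_off, dst_off, row_bytes, rows, dst_stride) in planes {
        for row in 0..rows {
            let s = src_off + row * row_bytes;
            let d = dst_off + row * dst_stride;
            out[d..d + row_bytes].copy_from_slice(&tight[s..s + row_bytes]);
        }
    }
    Some(out)
}
```

If the consumer reads this buffer with a different stride than the one used to write it, every row starts slightly further off than the previous one, which shows up as regularly spaced bands. That makes a mismatch between the reported and the actual stride a plausible source of the artifact on either the encode or the decode side.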
---
## Recommended Next Debug Steps
1. Verify whether Android logs the encoder input format on the failing build.
```bash
adb logcat -d -v time | rg 'h264_encoder_input|hevc_encoder_input|input_color_format|effective_stride|effective_slice'
```
If absent, make this an app call-debug event instead of plain tracing so it appears in the copied call log.
2. Add Android loopback decode of `android_local_encoded_*` before network.
Decode each encoded frame locally on Android right after encoding and dump the result as a new `android_local_decoded_fXXXXXX_<WxH>.jpg`. If this local Android decode already has bands, the encoder output is bad. If Android local decode is clean but macOS decode is bad, focus on H.264 SPS cropping / VideoToolbox decode assumptions.
3. Test macroblock-aligned debug resolutions.
Add or force:
```text
960x544
960x528
640x368
640x352
```
If 960x544 fixes the lines, the bug is almost certainly H.264 crop/padding handling. If 960x528 fixes it but 960x544 does not, inspect bottom padding and crop signaling.
4. Offline-decode `android_local_encoded_*.h264` with a known-good decoder.
Example on a machine with working ffmpeg:
```bash
ffmpeg -f h264 -i android-frame-dumps/frame-dumps/000005_android_local_encoded_f000001.h264 -frames:v 1 /tmp/android-local-f1.png
ffmpeg -f h264 -i "$HOME/Library/Application Support/com.wzp.desktop/.wzp/frame-dumps/000064_desktop_remote_encoded_reassembled_f000001.h264" -frames:v 1 /tmp/macos-remote-f1.png
```
Note: Homebrew ffmpeg on this Mac was broken during debugging with a missing `libvpx.11.dylib`, so do not assume `/opt/homebrew/bin/ffmpeg` works until fixed.
5. Try explicit Android encoder input variants.
Test one variable at a time:
- Force planar color format `COLOR_FormatYUV420Planar` / value `19` and feed I420.
- Force semiplanar and try NV12 vs NV21/VU order.
- Use `COLOR_FormatYUV420Flexible` if accepted by this device.
- Use `stride = width`, `slice_height = align_up(height, 16)` only.
- Use `stride = align_up(width, 16)`, `slice_height = align_up(height, 16)`.
6. Parse SPS from Android H.264 output.
Confirm encoded dimensions and frame cropping offsets for 960x540. Compare Android output against macOS output. If SPS says 960x544 with crop to 540, test whether VideoToolbox applies the crop correctly.
7. Keep relay out of the first debugging loop.
The relay is unlikely to affect deterministic decoded line bands when local encoded and remote reassembled payloads match. Only redeploy relay if packet framing changes.
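For step 6, once the SPS fields have been extracted (for example with an existing bitstream analyzer), the coded and display sizes follow from the H.264 cropping rules. The sketch below covers only 4:2:0 content and assumes the fields are already parsed; the `SpsGeometry` struct is illustrative, though its field names follow the H.264 syntax element names.

```rust
/// Subset of H.264 SPS syntax elements needed to derive picture sizes.
/// Illustrative container; the values must come from a real SPS parser.
struct SpsGeometry {
    pic_width_in_mbs_minus1: u32,
    pic_height_in_map_units_minus1: u32,
    frame_mbs_only_flag: bool,
    frame_crop_left_offset: u32,
    frame_crop_right_offset: u32,
    frame_crop_top_offset: u32,
    frame_crop_bottom_offset: u32,
}

/// Returns (coded_width, coded_height, display_width, display_height).
/// ASSUMPTION: chroma_format_idc == 1 (4:2:0), so crop units are 2 luma
/// samples horizontally and 2 (frame) or 4 (field) luma samples vertically.
fn sps_dimensions(sps: &SpsGeometry) -> (u32, u32, u32, u32) {
    let field_factor = if sps.frame_mbs_only_flag { 1 } else { 2 };
    let coded_width = (sps.pic_width_in_mbs_minus1 + 1) * 16;
    let coded_height = field_factor * (sps.pic_height_in_map_units_minus1 + 1) * 16;

    let crop_unit_x = 2;
    let crop_unit_y = 2 * field_factor;
    let crop_x = crop_unit_x * (sps.frame_crop_left_offset + sps.frame_crop_right_offset);
    let crop_y = crop_unit_y * (sps.frame_crop_top_offset + sps.frame_crop_bottom_offset);

    // saturating_sub avoids a panic on malformed crop values.
    (
        coded_width,
        coded_height,
        coded_width.saturating_sub(crop_x),
        coded_height.saturating_sub(crop_y),
    )
}
```

A progressive 960x540 stream would be expected to carry `pic_width_in_mbs_minus1 = 59`, `pic_height_in_map_units_minus1 = 33`, and `frame_crop_bottom_offset = 2`, giving a coded size of 960x544 and a display size of 960x540. Other values in the Android SPS would be worth investigating before looking at VideoToolbox.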
---
## Verification Criteria For A Fix
A candidate fix is good when:
- Android `android_camera_jpeg_in` and `android_camera_i420_roundtrip` remain clean.
- Android `android_local_decoded`, if added, is clean.
- macOS `desktop_remote_decoded` is clean at H.264 960x540.
- 960x540 is smooth enough for normal calls, not a debug-build slideshow.
- H.264 1280x720 either works or fails in an understood performance-only way.
- HEVC behavior is not regressed from current minimum-resolution success.
Run at least:
```bash
cargo check -p wzp-video --target aarch64-linux-android
cargo check -p wzp-video -p wzp-client -p wzp-desktop
```
Then build Android with:
```bash
./scripts/android-build-async.sh --release-debuggable --wait
```
---
## Open Questions
- Does the failing Android device actually report encoder input `stride`, `slice-height`, and `color-format` after `start()`? The code asks for this, but recent logcat sampling did not show the `h264_encoder_input` tracing lines.
- Does Android local decode of its own encoded H.264 reproduce the same lines?
- Is 960x540 failing because H.264 encodes a 544-high macroblock frame and macOS crops or interprets chroma padding incorrectly?
- Are the green/magenta bands chroma-plane corruption, luma padding leakage, or a debug overlay from an encoder surface path? The current pipeline uses byte-buffer input, not surface input.
- Is HEVC's pause behavior a separate decoder buffering/keyframe issue or the same layout problem expressed differently?


@@ -12,6 +12,7 @@
 # ./scripts/android-build-async.sh --rust # force-clean Rust target cache
 # ./scripts/android-build-async.sh --no-pull # skip git fetch on remote
 # ./scripts/android-build-async.sh --debug # debug APK
+# ./scripts/android-build-async.sh --release-debuggable # release APK with run-as dumps
 # ./scripts/android-build-async.sh --wait # block until done, then tail status
 #
 # Progress / completion: ntfy.sh/wzp (handled by build-tauri-android.sh).


@@ -17,6 +17,7 @@ set -euo pipefail
 # Usage:
 # ./scripts/build-tauri-android.sh # full pipeline (release, arm64 only)
 # ./scripts/build-tauri-android.sh --debug # debug APK (faster, no optimisation)
+# ./scripts/build-tauri-android.sh --release-debuggable # release APK with android:debuggable=true
 # ./scripts/build-tauri-android.sh --no-pull # skip git fetch
 # ./scripts/build-tauri-android.sh --rust # force-clean rust target
 # ./scripts/build-tauri-android.sh --init # also run `cargo tauri android init`
@@ -39,6 +40,7 @@ REBUILD_RUST=0
 DO_PULL=1
 DO_INIT=0
 BUILD_RELEASE=1
+RELEASE_DEBUGGABLE=0
 BUILD_ARCH="arm64"
 NEXT_IS_ARCH=0
 for arg in "$@"; do
@@ -53,6 +55,7 @@ for arg in "$@"; do
 --no-pull) DO_PULL=0 ;;
 --init) DO_INIT=1 ;;
 --debug) BUILD_RELEASE=0 ;;
+--release-debuggable) RELEASE_DEBUGGABLE=1 ;;
 --arch) NEXT_IS_ARCH=1 ;;
 -h|--help)
 sed -n '3,32p' "$0"
@@ -93,6 +96,7 @@ REBUILD_RUST="${3:-0}"
 DO_INIT="${4:-0}"
 BUILD_RELEASE="${5:-0}"
 BUILD_ARCH="${6:-arm64}"
+RELEASE_DEBUGGABLE="${7:-0}"
 LOG_FILE=/tmp/wzp-tauri-build.log
 GIT_HASH="unknown" # populated after fetch
@@ -192,6 +196,7 @@ docker run --rm \
 -e DO_INIT="$DO_INIT" \
 -e PROFILE_FLAG="$PROFILE_FLAG" \
 -e BUILD_ARCH="$BUILD_ARCH" \
+-e RELEASE_DEBUGGABLE="$RELEASE_DEBUGGABLE" \
 -v "$BASE_DIR/data/source:/build/source" \
 -v "$BASE_DIR/data/cache/cargo-registry:/home/builder/.cargo/registry" \
 -v "$BASE_DIR/data/cache/cargo-git:/home/builder/.cargo/git" \
@@ -218,6 +223,29 @@ if [ "${DO_INIT}" = "1" ] || [ ! -x gen/android/gradlew ]; then
 cargo tauri android init 2>&1 | tail -20
 fi
+if [ "${RELEASE_DEBUGGABLE}" = "1" ]; then
+MANIFEST="gen/android/app/src/main/AndroidManifest.xml"
+if [ -f "$MANIFEST" ]; then
+echo ">>> Marking release APK debuggable for frame-dump run-as access"
+if ! grep -q "xmlns:tools=" "$MANIFEST"; then
+perl -0pi -e "s/<manifest\\b/<manifest xmlns:tools=\"http:\\/\\/schemas.android.com\\/tools\"/s" "$MANIFEST"
+fi
+if grep -q "android:debuggable=" "$MANIFEST"; then
+sed -i "s/android:debuggable=\"[^\"]*\"/android:debuggable=\"true\"/" "$MANIFEST"
+else
+perl -0pi -e "s/(<application\\b[^>]*)(>)/\$1\\n android:debuggable=\"true\"\$2/s" "$MANIFEST"
+fi
+if grep -q "tools:ignore=" "$MANIFEST"; then
+sed -i "s/tools:ignore=\"[^\"]*\"/tools:ignore=\"HardcodedDebugMode\"/" "$MANIFEST"
+else
+perl -0pi -e "s/(<application\\b[^>]*)(>)/\$1\\n tools:ignore=\"HardcodedDebugMode\"\$2/s" "$MANIFEST"
+fi
+grep -n "debuggable\\|<application" "$MANIFEST"
+else
+echo ">>> WARNING: AndroidManifest.xml not found; release APK will not be debuggable"
+fi
+fi
 # ─── Arch list from BUILD_ARCH env var ───────────────────────────────────
 case "${BUILD_ARCH}" in
 arm64) ARCHS="arm64" ;;
@@ -302,6 +330,7 @@ done
 APK_OUTPUT_DIR="/build/source/target/apk-output"
 mkdir -p "$APK_OUTPUT_DIR"
+rm -f "$APK_OUTPUT_DIR"/wzp-tauri-*.apk
 for ARCH in $ARCHS; do
 TARGET=$(tauri_target "$ARCH")
@@ -333,7 +362,9 @@ for ARCH in $ARCHS; do
 # Re-running Gradle is NOT used here because the Gradle Rust build
 # task (BuildTask.kt) calls `cargo tauri android android-studio-script`
 # which requires the full Tauri CLI environment and fails standalone.
-UNSIGNED_APK_PATH="gen/android/app/build/outputs/apk/universal/release/app-universal-release-unsigned.apk"
+BUILD_VARIANT="debug"
+[ -z "${PROFILE_FLAG}" ] && BUILD_VARIANT="release"
+UNSIGNED_APK_PATH="gen/android/app/build/outputs/apk/universal/${BUILD_VARIANT}/app-universal-${BUILD_VARIANT}-unsigned.apk"
 if [ -f "$UNSIGNED_APK_PATH" ] && ! unzip -l "$UNSIGNED_APK_PATH" 2>/dev/null | grep -q "assets/index.html"; then
 echo ">>> frontend assets missing from APK — patching unsigned APK directly"
 PATCH_DIR="/tmp/apk-frontend-patch-$$"
@@ -347,7 +378,7 @@ for ARCH in $ARCHS; do
 fi
 # Copy produced APK with arch suffix
-BUILT_APK=$(find gen/android -name "*.apk" -newer "$APK_OUTPUT_DIR" -type f 2>/dev/null | head -1)
+BUILT_APK=$(find "gen/android/app/build/outputs/apk" -path "*/${BUILD_VARIANT}/*.apk" -type f 2>/dev/null | sort | head -1)
 if [ -z "$BUILT_APK" ]; then
 BUILT_APK=$(find gen/android -name "*.apk" -type f 2>/dev/null | sort -t/ -k1 | tail -1)
 fi
@@ -359,6 +390,12 @@ for ARCH in $ARCHS; do
 # Release builds are unsigned by default. Sign with the release
 # keystore (checked into the repo at android/keystore/) so the
 # APK can be installed on real devices.
+if [ "${BUILD_VARIANT}" = "debug" ]; then
+echo ">>> Debug APK selected; preserving Gradle debug signing and android:debuggable=true"
+echo ">>> $ARCH APK: $(ls -lh "$OUT_APK" | awk "{print \$5}")"
+continue
+fi
 # Pick keystore + credentials (release preferred, debug fallback)
 KS_RELEASE="/build/source/android/keystore/wzp-release.jks"
 KS_DEBUG="/build/source/android/keystore/wzp-debug.jks"
@@ -452,11 +489,11 @@ REMOTE_SCRIPT
 ssh_cmd "chmod +x /tmp/wzp-tauri-build.sh"
-notify_local "WZP Tauri Android build dispatched (branch=$BRANCH, arch=$BUILD_ARCH, release=$BUILD_RELEASE)"
+notify_local "WZP Tauri Android build dispatched (branch=$BRANCH, arch=$BUILD_ARCH, release=$BUILD_RELEASE, release-debuggable=$RELEASE_DEBUGGABLE)"
 log "Triggering remote build (branch=$BRANCH, arch=$BUILD_ARCH)..."
 # Run; last lines are APK_REMOTE_PATH=... (one per arch)
-REMOTE_OUTPUT=$(ssh_cmd "/tmp/wzp-tauri-build.sh '$BRANCH' '$DO_PULL' '$REBUILD_RUST' '$DO_INIT' '$BUILD_RELEASE' '$BUILD_ARCH'" || true)
+REMOTE_OUTPUT=$(ssh_cmd "/tmp/wzp-tauri-build.sh '$BRANCH' '$DO_PULL' '$REBUILD_RUST' '$DO_INIT' '$BUILD_RELEASE' '$BUILD_ARCH' '$RELEASE_DEBUGGABLE'" || true)
 echo "$REMOTE_OUTPUT" | tail -60
 # Download all produced APKs


@@ -0,0 +1,25 @@
+#!/usr/bin/env bash
+set -euo pipefail
+PACKAGE="${1:-com.wzp.desktop}"
+OUT_DIR="${2:-android-frame-dumps}"
+LOCAL_TAR="wzp-frame-dumps.tar"
+APP_DUMP_DIR="${WZP_ANDROID_DUMP_ROOT:-.wzp}"
+trap 'rm -f "$LOCAL_TAR"' EXIT
+if [ "${1:-}" = "-h" ] || [ "${1:-}" = "--help" ]; then
+echo "Usage: $0 [package] [out-dir]"
+echo "Default package: com.wzp.desktop"
+echo "Default out-dir: android-frame-dumps"
+exit 0
+fi
+echo ">>> Packaging frame dumps from $PACKAGE..."
+adb exec-out "run-as $PACKAGE tar -C $APP_DUMP_DIR -cf - frame-dumps" > "$LOCAL_TAR"
+rm -rf "$OUT_DIR"
+mkdir -p "$OUT_DIR"
+tar -xf "$LOCAL_TAR" -C "$OUT_DIR"
+echo ">>> Pulled dumps:"
+find "$OUT_DIR" -type f | sort | sed 's#^# #'