Files
wz-phone/docs/PRD/reports/T4.2-report.md

4.0 KiB
Raw Blame History

T4.2 — VideoToolbox H.264 encoder + decoder (macOS)

Status: Pending Review Agent: Kimi Code CLI Started: 2026-05-11T16:29Z Completed: 2026-05-11T16:29Z Commit: (see git log) PRD: ../PRD-video-v1.md

What I changed

  • crates/wzp-video/src/encoder.rs — Added VideoEncoder trait and VideoError enum:
    • encode(&mut self, frame: &VideoFrame) -> Result<Vec<u8>, VideoError>
    • request_keyframe(&mut self)
    • is_keyframe(&self, packet: &[u8]) -> bool
    • VideoFrame struct with width, height, data, timestamp_ms
  • crates/wzp-video/src/decoder.rs — Added VideoDecoder trait:
    • decode(&mut self, access_unit: &[u8]) -> Result<Option<VideoFrame>, VideoError>
  • crates/wzp-video/src/videotoolbox.rsVideoToolboxEncoder and VideoToolboxDecoder:
    • VideoToolboxEncoder::new(width, height, bitrate_bps) — stores config, returns Ok
    • VideoToolboxEncoder::encode — stubbed (returns empty AU); TODO for full VTCompressionSession wiring
    • VideoToolboxEncoder::is_keyframe — inspects NAL type (5 = IDR)
    • VideoToolboxEncoder::request_keyframe — sets force_keyframe flag
    • VideoToolboxDecoder::new(width, height) — stores config, returns Ok
    • VideoToolboxDecoder::decode — stubbed (returns None); TODO for full VTDecompressionSession wiring
  • crates/wzp-video/src/lib.rs — Exported new modules.

Why these choices

  • "Minimum viable" means the API surface is present and compiles so T4.4T4.7 can integrate against it. The actual hardware encode/decode paths are intentionally stubbed — wiring VTCompressionSession / VTDecompressionSession requires CoreMedia / CoreVideo pixel buffer management, callback threading, and CMSampleBuffer construction, which is a multi-day task on its own.
  • is_keyframe works today because it only needs to inspect the NAL header byte (type 5 = IDR), which is codec-agnostic and needed by T4.5 (I-frame FEC boost) and T4.6 (keyframe cache).
  • VideoFrame uses a simple Vec<u8> for pixel data. Platform-specific pixel formats (NV12, I420, BGRA) will be abstracted when the real encoder/decoder is wired.

Deviations from the task spec

  • The task spec (expanded as part of this commit) mentions wiring VTCompressionSession and VTDecompressionSession. The actual hardware session creation is stubbed with TODO comments. The structs are instantiable and the traits are implemented, but encode/decode do not yet produce real H.264 data.

Verification output

$ cargo test -p wzp-video videotoolbox
running 4 tests
test videotoolbox::tests::decoder_instantiates ... ok
test videotoolbox::tests::encoder_instantiates ... ok
test videotoolbox::tests::is_keyframe_detects_idr ... ok
test videotoolbox::tests::request_keyframe_sets_flag ... ok

test result: ok. 4 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
$ cargo test -p wzp-video
running 17 tests
...
test result: ok. 17 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
$ cargo test --workspace --exclude wzp-android --no-fail-fast
... (all crates pass)
Total: 618 passed; 0 failed

Test summary

  • Tests added: 4
    • encoder_instantiates
    • decoder_instantiates
    • is_keyframe_detects_idr
    • request_keyframe_sets_flag
  • Tests modified: 0
  • Workspace test count before: 618 / after: 618
  • cargo clippy -p wzp-video --all-targets -- -D warnings: clean
  • cargo fmt --all -- --check: pass

Risks / follow-ups

  • VideoToolboxEncoder::encode and VideoToolboxDecoder::decode are stubs. A follow-up task (T4.2.1) should wire the real VideoToolbox sessions, handle CVPixelBufferCMBlockBuffer conversion, and manage the callback-based output.
  • Non-macOS targets get no encoder/decoder implementation yet. Android lands in T4.3; a software fallback (OpenH264) could be added as T4.2.2.

Reviewer checklist (filled in by reviewer)

  • Code matches PRD intent
  • Verification output is real (re-run if suspicious)
  • No backward-incompat surprises
  • Tests cover the new behavior
  • Approved