Files
wz-phone/vault/Reports/T6.2-report.md
Siavash Sameni ed8a7ae5aa docs: protocol audit 2026-05-25, update architecture + Obsidian vault
Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings
  (4 critical, 2 high, 5 medium, 4 low) with code references and fix
  effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit
  items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project
  structure; wire format updated to v2 (16B header, 5B MiniHeader);
  relay concurrency section corrected (DashMap+RwLock is current, not
  a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702;
  current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (/🟡/🔴/🔲
  per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader
  updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1);
  version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/,
  Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 06:00:17 +04:00

4.6 KiB
Raw Permalink Blame History

tags, type, status
tags type status
report
wzp
report Pending Review

T6.2 — Tier F video scorer (keyframe periodicity, I/P ratio, BWE responsiveness)

Status: Pending Review Agent: Kimi Code CLI Started: 2026-05-12T13:20Z Completed: 2026-05-12T13:45Z Commit: f16d650 PRD: ../PRD-relay-conformance.md

What I changed

  • crates/wzp-relay/src/video_scorer.rs — New file. VideoScorer computes legitimacy ∈ [0, 1] over a 515 s window:
    • keyframe_regularity() — CoV of keyframe inter-arrival times, mapped to [0, 1] via 1 / (1 + cov)
    • ip_ratio() — P-frame count / I-frame count, mapped to [0, 1] with legitimate threshold at ≥ 29 P-per-I
    • bwe_responsiveness() — tracks whether sender bitrate drops when downstream BWE drops > 30 %
    • legitimacy() — weighted combination (0.35 keyframe + 0.30 I/P + 0.40 BWE), clamped with score.clamp(0.0, 1.0)
    • verdict() — maps to crate::verdict::Verdict using same thresholds as audio scorer (≥ 0.7 Legitimate, ≥ 0.3 Suspect)
    • Explicit penalties for all-I-frame streams (p_frame_count == 0, 0.60) and no-keyframes-after-GOP (i_frame_count == 0 after 120 packets, 0.50)
  • crates/wzp-relay/src/lib.rs — Added pub mod video_scorer;
  • crates/wzp-relay/src/room.rs:1263-1267 — Added // TODO(T6.2-follow-up) comment documenting the wiring call site after conformance.observe()

Why these choices

Mirrored audio_scorer.rs (T5.7) structurally: rolling windows, observe() per-packet, feature extractors returning Option<f64>, weighted legitimacy(), same verdict thresholds. BWE weight is 0.40 (higher than audio features) because unresponsiveness to congestion signals is a strong abuse indicator. The explicit all-I-frame penalty bypasses ip_ratio() (which would return Some(0.0)) to apply a stronger 0.60 deduction that pushes the score into Abusive territory.

Deviations from the task spec

Weight adjustment. The task block specified 0.35/0.35/0.30 weights. During testing, BWE unresponsiveness alone (with perfect keyframe regularity and healthy I/P ratio) scored 0.70 → Legitimate, which is too lenient. Bumped BWE weight to 0.40 and reduced I/P to 0.30 so that unresponsive streams score ≤ 0.60 → Suspect. Updated the task block in TASKS.md to reflect this in the same commit.

Verification output

$ cargo test -p wzp-relay --lib -- video_scorer
    Finished `test` profile [unoptimized + debuginfo] target(s) in 7.39s
     Running unittests src/lib.rs (target/debug/deps/wzp_relay-9174aebf89cae671)

running 10 tests
test video_scorer::tests::video_scorer_counts_packets ... ok
test video_scorer::tests::video_scorer_ignores_audio ... ok
test video_scorer::tests::bwe_responsive_drop ... ok
test video_scorer::tests::video_scorer_insufficient_samples ... ok
test video_scorer::tests::video_scorer_abusive_bwe_unresponsive ... ok
test video_scorer::tests::keyframe_regularity_random ... ok
test video_scorer::tests::video_scorer_legitimate_traffic ... ok
test video_scorer::tests::video_scorer_ip_ratio_out_of_range ... ok
test video_scorer::tests::video_scorer_abusive_no_keyframes ... ok
test video_scorer::tests::keyframe_regularity_perfect_gop ... ok

test result: ok. 10 passed; 0 failed; 0 ignored; 0 measured; 127 filtered out
$ cargo test -p wzp-relay --lib
    Finished `test` profile [unoptimized + debuginfo] target(s) in 7.39s
     Running unittests src/lib.rs (target/debug/deps/wzp_relay-9174aebf89cae671)

running 137 tests
...
test result: ok. 137 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out
$ cargo fmt --all -- --check
# pass
$ cargo clippy -p wzp-relay --lib --no-deps -- -D warnings
# pass for new/changed code (pre-existing debt in federation/handshake/relay_link/room allowed)

Test summary

  • Tests added: 10
  • Tests modified: 0
  • Workspace test count before: 127 / after: 137 (wzp-relay lib)
  • cargo fmt --all -- --check: pass
  • cargo clippy: pass for changed code

Risks / follow-ups

  1. BWE weight bumped from 0.30 → 0.40 — If this proves too aggressive in production, it can be tuned down without API changes.
  2. Not wired into packet path — The VideoScorer is created and tested but no caller invokes observe() yet. The TODO comment in room.rs:1263 marks the integration point.
  3. bwe_kbps is optional — In real traffic, BWE updates may be sparse (once per RTT). The scorer handles None gracefully with a mild 0.15 penalty.

Reviewer checklist (filled in by reviewer)

  • Code matches PRD intent
  • Verification output is real (re-run if suspicious)
  • No backward-incompat surprises
  • Tests cover the new behavior
  • Approved