docs: protocol audit 2026-05-25, update architecture + Obsidian vault

Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings (4 critical, 2 high, 5 medium, 4 low) with code references and fix effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project structure; wire format updated to v2 (16B header, 5B MiniHeader); relay concurrency section corrected (DashMap+RwLock is current, not a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702; current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (✅/🟡/🔴/🔲 per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1); version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/, Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 06:00:17 +04:00
parent 12b0d9738f
commit ed8a7ae5aa
120 changed files with 22781 additions and 65 deletions
--- a/vault/Reports/T5.7-report.md
+++ b/vault/Reports/T5.7-report.md
@@ -0,0 +1,89 @@
+---
+tags: [report, wzp]
+type: report
+status: Pending Review
+---
+
+# T5.7 — Tier F audio scorer (entropy/IAT/silence-fraction)
+
+**Status:** Pending Review
+**Agent:** Kimi Code CLI
+**Started:** 2026-05-12T19:15Z
+**Completed:** 2026-05-12T19:45Z
+**Commit:** 5fda5ec
+**PRD:** ../PRD-relay-conformance.md
+
+## What I changed
+
+- `crates/wzp-relay/src/audio_scorer.rs` — New file. `AudioScorer` computes `legitimacy ∈ [0, 1]` from:
+  - **IAT CoV** (`iat_cov()`) — legitimate traffic 0.1–0.4; abusive uniform IAT > 1.0
+  - **Silence fraction** (`silence_fraction()`) — legitimate 10–40%; abusive < 2%
+  - **Bitrate ratio** (`bitrate_ratio()`) — actual vs nominal codec bitrate
+  - **Q-flag cadence CV** (`q_flag_cv()`) — measures regularity of quality-flag spacing
+  - **Payload-size bimodality** (`size_bimodality()`) — bimodal distribution of speech vs silence payload sizes
+  - `legitimacy()` combines features into a weighted score clamped to [0, 1]
+  - `verdict()` maps score to `Verdict::Legitimate / Suspect / Abusive`
+- `crates/wzp-relay/src/lib.rs` — Added `pub mod audio_scorer;`.
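
To make the first two features in the list above concrete, here is a minimal sketch of how an inter-arrival-time coefficient of variation and a silence fraction could be computed. It is not the code in `audio_scorer.rs`: the free-function signatures, the millisecond timestamps, and the `silence_max_bytes` cutoff are all assumptions made for illustration.

```rust
/// Coefficient of variation (population std dev / mean) of the gaps between
/// consecutive packet arrivals. `arrivals_ms` is assumed to hold ascending
/// arrival timestamps in milliseconds (hypothetical input shape).
/// Returns None with fewer than two gaps or a non-positive mean gap.
fn iat_cov(arrivals_ms: &[f64]) -> Option<f64> {
    if arrivals_ms.len() < 3 {
        return None;
    }
    let gaps: Vec<f64> = arrivals_ms.windows(2).map(|w| w[1] - w[0]).collect();
    let n = gaps.len() as f64;
    let mean = gaps.iter().sum::<f64>() / n;
    if mean <= 0.0 {
        return None;
    }
    let variance = gaps.iter().map(|g| (g - mean).powi(2)).sum::<f64>() / n;
    Some(variance.sqrt() / mean)
}

/// Fraction of packets whose payload is at or below `silence_max_bytes`.
/// The byte cutoff is a hypothetical stand-in for however the real scorer
/// recognises silence or comfort-noise frames.
fn silence_fraction(payload_sizes: &[usize], silence_max_bytes: usize) -> Option<f64> {
    if payload_sizes.is_empty() {
        return None;
    }
    let silent = payload_sizes
        .iter()
        .filter(|&&size| size <= silence_max_bytes)
        .count();
    Some(silent as f64 / payload_sizes.len() as f64)
}
```

Both helpers return `Option` so that a caller can tell "not enough data" apart from a genuine zero; whether the real `AudioScorer` does the same is not stated in this report.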
+
+## Why these choices
+
+IAT CoV is the strongest single discriminator: real VoIP has jittery arrival times, while synthetic flood traffic tends to be perfectly periodic. Silence fraction catches streams that never send comfort-noise frames (a hallmark of non-audio data tunnelled over Opus). Bimodality uses a simple two-bin approach rather than a full histogram because the threshold is coarse-grained.
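
The two-bin idea can be sketched as follows. This is an illustration under stated assumptions, not the shipped `size_bimodality()`: the `split_bytes` parameter and the `2 * min(p, 1 - p)` scoring rule are hypothetical choices that give 0 when every packet falls in one bin and 1 when the bins are evenly populated.

```rust
/// Two-bin bimodality score in [0, 1].
/// Packets smaller than `split_bytes` go in the "small" bin (silence-like),
/// everything else in the "large" bin (speech-like). Both the split point and
/// the scoring formula are assumptions made for this sketch.
fn size_bimodality(payload_sizes: &[usize], split_bytes: usize) -> f64 {
    if payload_sizes.is_empty() {
        return 0.0;
    }
    let small = payload_sizes
        .iter()
        .filter(|&&size| size < split_bytes)
        .count();
    let frac_small = small as f64 / payload_sizes.len() as f64;
    // 0.0 when one bin is empty, 1.0 when the bins hold equal shares.
    2.0 * frac_small.min(1.0 - frac_small)
}
```

A stream of identically sized payloads scores 0 here, while a mix of tiny and full-size frames scores higher. A full histogram would resolve more than two modes, at the cost of more state per stream.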
+
+## Deviations from the task spec
+
+None.
+
+## Verification output
+
+```bash
+$ cargo test -p wzp-relay --lib -- audio_scorer
+   Compiling wzp-relay v0.1.0
+    Finished `test` profile [unoptimized + debuginfo] target(s) in 6.85s
+     Running unittests src/lib.rs (target/debug/deps/wzp_relay-9174aebf89cae671)
+
+running 11 tests
+test audio_scorer::tests::audio_scorer_insufficient_samples ... ok
+test audio_scorer::tests::bitrate_ratio_saturates_when_no_codec ... ok
+test audio_scorer::tests::audio_scorer_ignores_video ... ok
+test audio_scorer::tests::q_flag_cv_regular_spacing ... ok
+test audio_scorer::tests::audio_scorer_abusive_uniform_iat ... ok
+test audio_scorer::tests::audio_scorer_abusive_no_silence ... ok
+test audio_scorer::tests::audio_scorer_legitimate_traffic ... ok
+test audio_scorer::tests::audio_scorer_counts_packets ... ok
+test audio_scorer::tests::silence_fraction_computed_correctly ... ok
+test audio_scorer::tests::size_bimodality_for_mixed_traffic ... ok
+test audio_scorer::tests::size_bimodality_for_uniform_traffic ... ok
+
+test result: ok. 11 passed; 0 failed; 0 ignored; 0 measured; 116 filtered out
+```
+
+```bash
+$ cargo fmt --all -- --check
+# pass
+```
+
+```bash
+$ cargo clippy -p wzp-relay --lib -- -D warnings
+# pass for new code (pre-existing debt in other modules allowed)
+```
+
+## Test summary
+
+- Tests added: 11
+- Tests modified: 0
+- Test count (wzp-relay lib) before: 116 / after: 127
+- `cargo clippy -p wzp-relay --lib -- -D warnings`: pass for new code
+- `cargo fmt --all -- --check`: pass
+
+## Risks / follow-ups
+
+1. **Thresholds are heuristic** — The 0.7 / 0.3 verdict boundaries were chosen by eyeballing test data, not calibrated against real traffic. May need tuning in production.
+2. **Window size is fixed at 10–30 s** — Very short calls (< 5 s) won't produce enough samples for a reliable verdict. Consider falling back to Tier A/B/C metering for short sessions.
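
Both risks above concern the last step of the scorer: turning a weighted score into a verdict. The sketch below shows one way that step could look. Only the `Verdict` variant names and the 0.7 / 0.3 boundaries come from this report; the `MIN_SAMPLES` value, the `(score, weight)` input shape, the inclusive/exclusive comparison directions, and returning `None` for short sessions are assumptions.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Verdict {
    Legitimate,
    Suspect,
    Abusive,
}

/// Hypothetical minimum packet count before a verdict is attempted.
const MIN_SAMPLES: usize = 50;

/// Weighted mean of per-feature scores, clamped to [0, 1].
/// Each tuple is (feature_score, weight); returns None if no weight is present.
fn legitimacy(features: &[(f64, f64)]) -> Option<f64> {
    let total_weight: f64 = features.iter().map(|(_, w)| w).sum();
    if total_weight <= 0.0 {
        return None;
    }
    let weighted: f64 = features.iter().map(|(s, w)| s * w).sum();
    Some((weighted / total_weight).clamp(0.0, 1.0))
}

/// Maps a score to a verdict using the 0.7 / 0.3 boundaries.
/// Returns None for sessions too short to judge, so the caller can fall back
/// to another metering tier (assumed behaviour, not confirmed by the report).
fn verdict(score: f64, samples: usize) -> Option<Verdict> {
    if samples < MIN_SAMPLES {
        return None;
    }
    Some(if score >= 0.7 {
        Verdict::Legitimate
    } else if score < 0.3 {
        Verdict::Abusive
    } else {
        Verdict::Suspect
    })
}
```

Keeping the boundaries in one function makes later calibration against real traffic a local change, and the `None` path gives short calls an explicit "no verdict" outcome rather than a guess.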
+
+## Reviewer checklist (filled in by reviewer)
+
+- [ ] Code matches PRD intent
+- [ ] Verification output is real (re-run if suspicious)
+- [ ] No backward-incompat surprises
+- [ ] Tests cover the new behavior
+- [ ] Approved