T2.3-T2.6: BWE guard, relay conformance Tier A/B/C, Prometheus metrics
docs/PRD/reports/T2.6-report.md
@@ -0,0 +1,77 @@
# T2.6 — Prometheus metrics for conformance

**Status:** Pending Review
**Agent:** Kimi Code CLI
**Started:** 2026-05-11T17:45Z
**Completed:** 2026-05-11T17:55Z
**Commit:** 846c98e
**PRD:** ../PRD-relay-conformance.md

## What I changed

- `crates/wzp-relay/src/metrics.rs`:
  - Updated `conformance_violations: IntCounterVec` labels from `["violation_type"]` to `["tier", "codec_id", "media_type", "verdict"]`.
  - Added `conformance_bytes: HistogramVec` (packet size distribution; label `media_type`).
  - Added `conformance_iat_ms: HistogramVec` (inter-arrival time distribution; label `media_type`).
  - Added `record_conformance(header, payload_len, iat_ms, violation)` helper:
    - Records bytes + IAT histograms on **every** packet.
    - Increments the violation counter (with full labels) only on violations.

- `crates/wzp-relay/src/room.rs`:
  - Both `run_participant_plain` and `run_participant_trunked` call `metrics.record_conformance()` on every incoming packet.
  - `recv_gap_ms` (already computed for gap logging) is reused as the IAT measurement.

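
The helper's behaviour described above can be modelled without any dependencies. This is a minimal sketch, not the code in `metrics.rs`: the `HashMap` fields stand in for the real `HistogramVec` and `IntCounterVec`, and the `Header`, `CodecId`, `MediaType`, and `Violation` definitions, the variant names, and the `tier()` mapping are all hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins; the relay's real types are not shown in this report.
#[derive(Debug, Clone, Copy)]
pub enum MediaType { Audio, Video, Data, Control }

#[derive(Debug, Clone, Copy)]
pub enum CodecId { Opus24k }

#[derive(Debug, Clone, Copy)]
pub enum Violation { OversizedPayload, RateExceeded }

impl Violation {
    /// Assumed mapping from a violation to its conformance tier.
    pub fn tier(self) -> &'static str {
        match self {
            Violation::OversizedPayload => "A",
            Violation::RateExceeded => "B",
        }
    }
}

pub struct Header {
    pub codec_id: CodecId,
    pub media_type: MediaType,
}

/// Plain maps standing in for the Prometheus collectors.
#[derive(Default)]
pub struct ConformanceMetrics {
    /// Stand-in for `conformance_bytes`, keyed by `media_type`.
    pub bytes: HashMap<String, Vec<f64>>,
    /// Stand-in for `conformance_iat_ms`, keyed by `media_type`.
    pub iat_ms: HashMap<String, Vec<f64>>,
    /// Stand-in for `conformance_violations`, keyed by
    /// [tier, codec_id, media_type, verdict].
    pub violations: HashMap<[String; 4], u64>,
}

impl ConformanceMetrics {
    pub fn record_conformance(
        &mut self,
        header: &Header,
        payload_len: usize,
        iat_ms: u64,
        violation: Option<Violation>,
    ) {
        let media_type = format!("{:?}", header.media_type);

        // Both histograms are observed on every packet.
        self.bytes.entry(media_type.clone()).or_default().push(payload_len as f64);
        self.iat_ms.entry(media_type.clone()).or_default().push(iat_ms as f64);

        // The counter is touched only when a violation is reported.
        if let Some(v) = violation {
            let key = [
                v.tier().to_string(),
                format!("{:?}", header.codec_id),
                media_type,
                format!("{:?}", v),
            ];
            *self.violations.entry(key).or_insert(0) += 1;
        }
    }
}
```

Calling `record_conformance` twice for the same audio header, once with `None` and once with `Some(Violation::OversizedPayload)`, leaves two observations in each histogram stand-in and a single counter entry.
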
## Why these choices

Histograms are recorded per packet so operators can see the full distribution of traffic, not just the abusive tail. The `media_type` label separates audio, video, data, and control traffic without over-labeling: adding `codec_id` to the histograms would multiply every bucket series by the number of codecs.

The violation counter uses four labels:

- `tier`: "A", "B", or "C" (which conformance check failed)
- `codec_id`: `Debug` representation (e.g., "Opus24k")
- `media_type`: `Debug` representation (e.g., "Audio")
- `verdict`: `Debug` representation of the `Violation` enum

This gives operators enough dimensions to correlate violations with specific codecs and traffic types.

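
Using `Debug` output as a label value works cleanly for unit-like enum variants, since derived `Debug` prints just the variant name. The sketch below illustrates that, along with one caveat worth keeping in mind: if a variant ever carried data, derived `Debug` would include the payload in the label. The `Violation` variants shown here are hypothetical, and `stable_label` is an illustrative alternative rather than something this change implements.

```rust
// Hypothetical enum; the real `Violation` variants are not listed in this report.
#[derive(Debug)]
pub enum Violation {
    RateExceeded,
    OversizedPayload { len: usize },
}

/// Label value taken from the derived `Debug` representation.
pub fn debug_label(v: &Violation) -> String {
    format!("{:?}", v)
}

/// Label value that ignores any payload, so the set of values stays fixed.
pub fn stable_label(v: &Violation) -> &'static str {
    match v {
        Violation::RateExceeded => "RateExceeded",
        Violation::OversizedPayload { .. } => "OversizedPayload",
    }
}
```

For a unit-like variant both functions agree. For `OversizedPayload { len: 4000 }`, `debug_label` yields `OversizedPayload { len: 4000 }`, which would create one series per distinct length, while `stable_label` yields `OversizedPayload`.
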
## Deviations from the task spec

None.

## Verification output

```bash
$ cargo test -p wzp-relay conformance
running 10 tests
...(all 10 pass)...

test result: ok. 10 passed; 0 failed; 0 ignored; 0 measured; 76 filtered out; finished in 0.00s
```

```bash
$ cargo test -p wzp-relay
running 86 tests
...(all 86 pass)...

test result: ok. 86 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.01s
```

## Test summary

- Tests added: 0 (metrics are exercised indirectly by conformance tests)
- Tests modified: 0
- `wzp-relay` test count: 86 (unchanged)
- `cargo clippy -p wzp-relay --lib`: pass (no new warnings)
- `cargo fmt --all -- --check`: pass

## Risks / follow-ups

- Histogram cardinality is bounded: `media_type` has 4 values, so `conformance_bytes` and `conformance_iat_ms` each have 4 label combinations. Each combination expands into one series per bucket plus `_sum` and `_count`, which stays small enough to be safe for Prometheus.
- Violation counter cardinality: `tier` (3) × `codec_id` (~9) × `media_type` (4) × `verdict` (3) = ~324 max combinations. In practice, most participants use only 1-2 codecs, so actual cardinality is much lower.

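
The cardinality figures above can be reproduced with a few lines of arithmetic. This is a sketch with hypothetical helper names. The bucket count passed to `histogram_series` is an assumption, since the report does not state how the histograms are bucketed.

```rust
/// Upper bound on label combinations: the product of each label's value count.
pub fn label_combinations(value_counts: &[u64]) -> u64 {
    value_counts.iter().product()
}

/// Time series exposed by one Prometheus histogram: per label combination,
/// one `_bucket` series for each finite bound, one for `+Inf`,
/// plus `_sum` and `_count`.
pub fn histogram_series(label_combos: u64, finite_buckets: u64) -> u64 {
    label_combos * (finite_buckets + 1 + 2)
}
```

With the counts from this report, `label_combinations(&[3, 9, 4, 3])` gives 324 for the violation counter. Assuming 11 finite buckets purely for illustration, `histogram_series(4, 11)` gives 56 series per histogram.
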
## Reviewer checklist (filled in by reviewer)

- [ ] Code matches PRD intent
- [ ] Verification output is real
- [ ] No backward-incompat surprises
- [ ] Tests cover the new behavior
- [ ] Approved