Files
wz-phone/docs/PRD/reports/T4.4-report.md

110 lines
5.9 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# T4.4 — `SignalMessage::Nack` variant + RTT-gated NACK loop
**Status:** Pending Review
**Agent:** Kimi Code CLI
**Started:** 2026-05-11T16:29Z
**Completed:** 2026-05-11T16:29Z
**Commit:** (see git log)
**PRD:** ../PRD-video-v1.md
## What I changed
- `crates/wzp-proto/src/packet.rs:11881213` — Added two new `SignalMessage` variants:
- `Nack { version, stream_id, seqs }` — negative acknowledgement requesting retransmission of specific packets.
- `PictureLossIndication { version, stream_id }` — decoder can't proceed, needs a fresh keyframe. Used when RTT is too high for NACK to help.
- `crates/wzp-video/src/nack.rs` — New module with sender/receiver state machines:
- `NackSender` — caches sent packets in a 500 ms ring buffer; `on_nack(seqs)` returns clones of still-cached packets.
- `NackReceiver` — detects gaps from sequence numbers, decides NACK vs PLI based on RTT, enforces backoff (1 NACK per seq per 2×RTT) and rate cap (50 NACKs/sec).
- `CachedPacket { seq, data, timestamp_ms }` and `NackAction { Nack { seqs }, PictureLossIndication }`.
- `crates/wzp-video/src/lib.rs` — Exported `nack` module and re-exported `CachedPacket`, `NackAction`, `NackReceiver`, `NackSender`.
- `crates/wzp-client/src/featherchat.rs` — Added new `SignalMessage` variants to `signal_to_call_type` mapping (catch-all → `CallSignalType::Offer`). Fixed unused `default_signal_version` import warning.
## Why these choices
- **Two signals instead of one:** The PRD explicitly describes both NACK (low-RTT retransmission) and PLI (high-RTT keyframe request) as a unified loss-recovery loop. Adding both to `SignalMessage` keeps the wire format complete so downstream tasks (T4.6, T4.7) don't need to touch `wzp-proto` again.
- **Packet-level state machines:** The NACK receiver works at the sequence-number level rather than integrating with the depacketizer. This decouples loss detection from frame assembly and makes the state machine testable without H.264 payloads.
- **Rate cap as batch truncation:** When a large gap exceeds the 50/sec budget, the receiver emits a NACK for the first `budget` packets and defers the rest to the next tick. This avoids a single burst consuming the entire second's budget.
## Deviations from the task spec
- The TASKS.md entry for T4.4 was a skeleton ("expand before claiming"). I fleshed it out based on the PRD-video-v1.md NACK-loop section and the existing `TransportFeedback` pattern in `packet.rs`.
- `PictureLossIndication` was not in the task title but is required by the PRD for the RTT-gated decision logic. Added it as a peer variant to keep the loop complete.
## Verification output
```bash
$ cargo test -p wzp-video nack
running 8 tests
test nack::tests::receiver_backoff_respects_2x_rtt ... ok
test nack::tests::receiver_detects_gap_and_nacks ... ok
test nack::tests::receiver_late_packet_fills_gap ... ok
test nack::tests::receiver_rate_cap_falls_back_to_pli ... ok
test nack::tests::receiver_uses_pli_when_rtt_is_high ... ok
test nack::tests::receiver_wraparound_ok ... ok
test nack::tests::sender_caches_and_retransmits ... ok
test nack::tests::sender_evicts_after_500ms ... ok
test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 20 filtered out; finished in 0.00s
```
```bash
$ cargo test -p wzp-proto nack
running 2 tests
test packet::tests::nack_default_version ... ok
test packet::tests::nack_roundtrip ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 123 filtered out; finished in 0.00s
```
```bash
$ cargo test -p wzp-proto picture_loss
running 2 tests
test packet::tests::picture_loss_indication_default_version ... ok
test packet::tests::picture_loss_indication_roundtrip ... ok
test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 123 filtered out; finished in 0.00s
```
```bash
$ cargo test --workspace --exclude wzp-android --no-fail-fast
... (all crates pass)
Total: 677 passed; 0 failed
```
```bash
$ cargo clippy -p wzp-video --all-targets -- -D warnings
Finished dev profile [unoptimized + debuginfo] target(s) in 0.73s
$ cargo clippy -p wzp-proto --all-targets -- -D warnings
Finished dev profile [unoptimized + debuginfo] target(s) in 1.68s
$ cargo fmt --all -- --check
# pass
```
## Test summary
- Tests added: 12
- wzp-proto: `nack_roundtrip`, `nack_default_version`, `picture_loss_indication_roundtrip`, `picture_loss_indication_default_version`
- wzp-video: `sender_caches_and_retransmits`, `sender_evicts_after_500ms`, `receiver_detects_gap_and_nacks`, `receiver_uses_pli_when_rtt_is_high`, `receiver_backoff_respects_2x_rtt`, `receiver_late_packet_fills_gap`, `receiver_rate_cap_falls_back_to_pli`, `receiver_wraparound_ok`
- Tests modified: 0
- Workspace test count before: 618 / after: 677 (difference is +59 from T4.4 + other accumulated changes; wzp-video now has 28 tests)
- `cargo clippy -p wzp-video --all-targets -- -D warnings`: clean
- `cargo clippy -p wzp-proto --all-targets -- -D warnings`: clean
- `cargo fmt --all -- --check`: pass
## Risks / follow-ups
- `NackSender` buffer is unbounded within the 500 ms TTL. Under very high packet rates it could grow large; a follow-up could add a hard byte-size cap and evict oldest-first when exceeded.
- `NackReceiver` uses a `BTreeMap` for missing seqs — fine for moderate loss but O(log n) per packet. If packet rates go very high (> 10 kpps) a ring buffer or bitmap would be faster. Not a concern for 720p30 (~60 packets/sec).
- The PLI → keyframe emission path (sender side) is not yet wired to the actual encoder. That integration happens in T4.6/T4.7 when the SFU keyframe cache lands.
- `wzp-client/src/featherchat.rs` maps both `Nack` and `PictureLossIndication` to `CallSignalType::Offer` as a catch-all. When featherChat bridge support for video loss recovery is needed, this mapping should be revisited.
## Reviewer checklist (filled in by reviewer)
- [ ] Code matches PRD intent
- [ ] Verification output is real (re-run if suspicious)
- [ ] No backward-incompat surprises
- [ ] Tests cover the new behavior
- [ ] Approved