wz-phone/vault/Reference/Handoff-2026-05-12.md
Siavash Sameni ed8a7ae5aa docs: protocol audit 2026-05-25, update architecture + Obsidian vault
Audit:
- docs/AUDIT-2026-05-25.md: full protocol audit covering 8 findings
  (4 critical, 2 high, 5 medium, 4 low) with code references and fix
  effort estimates
- vault/Audit/Tasks.md: Obsidian Tasks plugin file tracking all audit
  items with priorities, due dates, and per-step checklists

Architecture docs updated for Wire format v2 and Wave 5/6 features:
- ARCHITECTURE.md: adds wzp-video to dependency graph and project
  structure; wire format updated to v2 (16B header, 5B MiniHeader);
  relay concurrency section corrected (DashMap+RwLock is current, not
  a future optimization); test count 571→702; Android note
- PROGRESS.md: Wave 5 and Wave 6 sections appended; test count 372→702;
  current status and open blockers as of 2026-05-25
- ROAD-TO-VIDEO.md: implementation status table inserted (/🟡/🔴/🔲
  per phase); 6-step critical path to first video call
- WZP-SPEC.md: MediaHeader updated to v2 (16B byte-aligned); MiniHeader
  updated to 5B with seq_delta; codec IDs 9-12 added (H.264/H.265/AV1);
  version negotiation section added

Obsidian vault (vault/):
- 114 files across Architecture/, PRDs/, Reports/, Android/,
  Reference/, Audit/ with YAML frontmatter
- 00 - Home.md index note with wiki links
- .obsidian/app.json config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 06:00:17 +04:00

---
tags: [reference, wzp]
type: reference
---
# Handoff — 2026-05-12 EOD
## TL;DR
Wave 5 (Phase 5) and Wave 6 (Phase 6) implementation is complete and approved on the board. Stopping for the night with one open issue: `wzp-video` does not compile for the `aarch64-linux-android` target and needs a focused `ndk = "0.9"` API migration session (~1-2 h). Nothing live is blocked: Tauri Android does not yet consume `wzp-video`.
**Branch state:** local `experimental-ui` HEAD `f3e3ee5`, pushed to `github` only. **Not yet on `fj`** (deploy key was read-only). Build server (`manwe@manwehs`) is up to date via github fetch.
---
## What landed today
| Wave | Tasks approved | New crates / files | Test delta |
|---|---|---|---|
| 5 | T5.1, T5.1.1, T5.2, T5.3, T5.4, T5.5, T5.6, T5.7, T5.7.1, T5.8 | `crates/wzp-relay/src/audio_scorer.rs`, `response_policy.rs`, `verdict.rs`; `wzp-video/src/controller.rs`, `simulcast.rs`, `encoder_mode.rs`; H.265 path in VT + MediaCodec | wzp-relay 99→127, wzp-video 43→71 |
| 6 | T6.1 (+ rework), T6.1.2, T6.2 | `wzp-video/src/av1_obu.rs`, `dav1d.rs`, `svt_av1.rs`, `factory.rs`; VT AV1 decoder; MediaCodec AV1; `wzp-relay/src/video_scorer.rs` | wzp-video 76→88, wzp-relay 127→137 |
Total: ~30 task units approved across the two waves. Workspace tests at 702 passing (excluding `wzp-android`).
---
## Open / next-up
### Top of queue
- **T4.3.1.1 (deferred → in-progress, blocked)** — Android target-compile of `wzp-video`. We started this tonight and hit 31 errors in `crates/wzp-video/src/mediacodec.rs` against the actual `ndk = "0.9"` API. Error categories captured below; resume with one fix-per-category commit, then attempt device instrumentation.
- **T6.3 — federated reputation gossip.** Design exploration committed (`1e729e4`, `docs/PRD/PRD-relay-federation-gossip.md`). **Decision made: Approach 3 (Ban-List Distribution).** My answers to the 6 blocker questions are in the chat thread, awaiting conversion to a real Files/Steps/Verify/Done-when task spec for the agent. The user opted not to run the agent immediately; the task spec is a write-then-park.
- **T5.1.1 follow-ups** — none. T5.1.1 closed clean.
### Latent follow-ups from earlier waves
These pre-date wave 6 and are still open:
- **AEAD wired into prod send/recv path** (referenced in T1.5 / T1.6 reports). Encryption is implemented in `wzp-crypto` but not yet on every QUIC datagram path.
- **AEAD nonce derivation: switch to `MediaHeader::seq`** (cited in T1.5.x reports). Current scheme works but isn't tied to wire-level seq.
- **`wzp-codec` clippy debt sprint** — 9 errors documented as known debt in `docs/PROTOCOL-AUDIT.md`.
- **T6.1.2 — wire AV1 into actual call engine.** The factory + step tables landed (commit `086d0a4`); no caller invokes `create_video_encoder(Av1Main, …)` yet. Real video sender wiring (the originally-blocked task) is unstarted.
- **T6.2-follow-up — wire `VideoScorer::observe()` into the packet path.** TODO marker at `crates/wzp-relay/src/room.rs:1263`.
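
One way the nonce follow-up above could look, as a minimal sketch rather than the project's design: derive the 96-bit AEAD nonce by XORing a per-direction static IV with the packet sequence number, the same construction TLS 1.3 uses for per-record nonces. The function name, the 12-byte IV, and the `u64` sequence width are assumptions for illustration. If `MediaHeader::seq` is narrower on the wire, it would first need extending to a counter that never repeats under one key.

```rust
/// Hypothetical helper: builds a 96-bit AEAD nonce from a static IV and a
/// sequence number. The big-endian sequence number is left-padded with zeros
/// to 12 bytes and XORed into the IV, so distinct sequence numbers always
/// yield distinct nonces for a given IV.
pub fn nonce_from_seq(static_iv: &[u8; 12], seq: u64) -> [u8; 12] {
    let mut nonce = *static_iv;
    let seq_bytes = seq.to_be_bytes();
    // The top 4 bytes of the IV are untouched (they XOR with zero padding);
    // the low 8 bytes absorb the sequence number.
    for i in 0..8 {
        nonce[4 + i] ^= seq_bytes[i];
    }
    nonce
}
```

Tying the nonce to the wire-level sequence means the receiver can rebuild it from the header alone. The cost is that any sequence reuse under the same key becomes nonce reuse, so rekeying would have to happen before the counter wraps.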
### Permanently deferred
- **T6.1.1 — Android MediaCodec AV1 device validation.** Deferred indefinitely: the user does not own an AV1-encode-capable Android or iPhone, and AV1 hardware will not be widespread for years. Revisit when devices land.
---
## The T4.3.1.1 Android build situation
What we did tonight:
1. Pushed `experimental-ui` to `github` (deploy key on `fj` is read-only).
2. Added `github` as a remote on `manwe@manwehs:~/wzp-builder/data/source/` and checked out `experimental-ui`.
3. Ran `cargo build --target aarch64-linux-android -p wzp-video` inside the `wzp-android-builder:latest` docker image.
4. First failure: `shiguredo_dav1d` and `shiguredo_svt_av1` build scripts panic with `unsupported target: os=android, arch=aarch64`. Fixed in commit `f3e3ee5` (`fix(wzp-video): cfg-gate dav1d + svt-av1 off Android target`) — those crates now live under `[target.'cfg(not(target_os = "android"))'.dependencies]`, since Android uses MediaCodec for AV1 anyway.
5. Re-ran the build → 31 errors in `mediacodec.rs`. **Stopped here.**
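
The Cargo-level gate in step 4 usually needs a matching gate in the Rust source, so that nothing references the software codecs on Android. The following is a minimal sketch of that pattern. The `Av1Backend` enum and `av1_backend` function are hypothetical names for illustration, not identifiers from `wzp-video`.

```rust
/// Hypothetical enum naming the AV1 backend a given build can use.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Av1Backend {
    /// dav1d / SVT-AV1 software path (desktop targets).
    Software,
    /// Platform MediaCodec path (Android).
    MediaCodec,
}

/// Compiled on every target except Android, mirroring
/// `[target.'cfg(not(target_os = "android"))'.dependencies]` in Cargo.toml.
#[cfg(not(target_os = "android"))]
pub fn av1_backend() -> Av1Backend {
    Av1Backend::Software
}

/// Compiled only for Android, where the software codec crates are absent.
#[cfg(target_os = "android")]
pub fn av1_backend() -> Av1Backend {
    Av1Backend::MediaCodec
}
```

Only one of the two functions exists in any given build. Code that touches the software codec crates would need to sit behind the same `cfg` attribute, or it will fail to resolve on the Android target.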
### Error categories to fix tomorrow
Run the same docker invocation and tackle these one fix-commit per category:
| Error | Count | Root cause | Likely fix |
|---|---|---|---|
| `E0277` `NonNull<AMediaCodec>` not `Send` | ~3 | Raw pointer field on a struct held across `tokio::spawn`-able boundaries | Wrap in `struct SendMediaCodec(NonNull<…>); unsafe impl Send for SendMediaCodec {}` or use the `ndk` crate's owned `MediaCodec` type which already implements `Send` |
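| `E0308` `&[MaybeUninit<u8>]` vs `&[u8]` | many | `ndk 0.9` returns uninitialized buffer slices; agent wrote into them as if initialized | Write each byte with `MaybeUninit::write`, use `MaybeUninit::write_slice` if the pinned toolchain provides it (it has been nightly-only and renamed across Rust releases), or use the transmute pattern; the result should match what `InputBuffer::write` expects |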
| `E0425` missing `BITRATE_MODE_CBR` | 1+ | Constant moved/renamed in `ndk 0.9` | Search `ndk` crate docs for current constant name (likely under `MediaCodec::set_parameters` enum) |
| `E0433` `ndk_sys` not linked | several | Agent imported `ndk_sys` directly; it's not a dep, only `ndk = "0.9"` is | Replace direct `ndk_sys` calls with safe wrappers from the `ndk` crate, or add `ndk_sys` as an explicit dep |
| `E0599` `InputBuffer::index()` / `OutputBuffer::index()` private | 2 | Both are private fields in `ndk 0.9`; were public methods in older versions | Either use the buffer through its safe API (queue/dequeue by handle) or expose index via a different accessor — read the `ndk` source for current API |
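
The first two rows of the table can be sketched in isolation. This is a sketch under stated assumptions, not the actual `mediacodec.rs` fix: `OpaqueCodec` stands in for the FFI codec type, and `SendCodec` and `copy_into_uninit` are hypothetical names. The copy helper uses only the stable per-element `MaybeUninit::write`.

```rust
use std::mem::MaybeUninit;
use std::ptr::NonNull;

/// Stand-in for the opaque FFI codec type (assumption; the real type comes
/// from the NDK bindings).
#[repr(C)]
pub struct OpaqueCodec {
    _private: [u8; 0],
}

/// Newtype asserting that the codec handle may be moved to another thread.
pub struct SendCodec(NonNull<OpaqueCodec>);

// SAFETY (assumption to verify against the NDK documentation): this is only
// sound if the handle is never used from two threads at the same time and
// the underlying object has no thread affinity.
unsafe impl Send for SendCodec {}

impl SendCodec {
    pub fn new(ptr: NonNull<OpaqueCodec>) -> Self {
        Self(ptr)
    }

    pub fn as_ptr(&self) -> *mut OpaqueCodec {
        self.0.as_ptr()
    }
}

/// Copies as much of `src` as fits into the front of an uninitialized buffer
/// and returns the number of bytes written. Only `dst[..n]` is initialized
/// afterwards.
pub fn copy_into_uninit(dst: &mut [MaybeUninit<u8>], src: &[u8]) -> usize {
    let n = dst.len().min(src.len());
    for (slot, byte) in dst[..n].iter_mut().zip(&src[..n]) {
        slot.write(*byte);
    }
    n
}
```

The `unsafe impl Send` moves the thread-safety obligation onto the reviewer, so the comment above it should record why the claim holds. The copy helper returns the written length because the caller typically has to pass that size on when queueing the buffer.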
### Reproduce the build
```bash
ssh -i ~/CascadeProjects/wzp manwe@manwehs \
'cd ~/wzp-builder/data/source && \
docker run --rm \
-v ~/wzp-builder/data/source:/build/source \
-v ~/wzp-builder/data/cache/cargo-registry:/home/builder/.cargo/registry \
-v ~/wzp-builder/data/cache/cargo-git:/home/builder/.cargo/git \
-v ~/wzp-builder/data/cache/target:/build/source/target \
wzp-android-builder:latest \
bash -c "cd /build/source && cargo build --target aarch64-linux-android -p wzp-video 2>&1 | tail -100"'
```
After local fixes:
```bash
git push github experimental-ui && \
ssh -i ~/CascadeProjects/wzp manwe@manwehs \
'cd ~/wzp-builder/data/source && git fetch github && git reset --hard github/experimental-ui'
# then re-run the docker build
```
### Device instrumentation half (post-compile)
User has a physical Android device. Once `cargo build --target aarch64-linux-android -p wzp-video` is clean:
- Build a minimal test harness binary (probably under `wzp-video/examples/` or a new `wzp-android-test/` crate) that does encode → decode of a synthetic frame via MediaCodec.
- Use `adb push` to copy the binary to the device (for example under `/data/local/tmp`), then run it through `adb shell`.
- Compare output bytes against the dav1d/SVT-AV1 SW roundtrip from `crates/wzp-video/src/svt_av1.rs:101 svt_av1_dav1d_roundtrip_10_frames`.
Out of scope for tomorrow if the API migration eats the whole session.
---
## T6.3 — Approach 3 decision
User picked Approach 3 (Ban-List Distribution) from `docs/PRD/PRD-relay-federation-gossip.md`. My answers to the 6 open questions:
1. **Trust model:** Single admin key (user). Strongest Sybil resistance, lowest complexity.
2. **Key infra:** Reuse `wzp-crypto` Ed25519. Admin pubkey in relay config; relays verify list signatures.
3. **Fingerprint scope:** Ed25519 pubkey, not IP. Resistant to NAT rebind evasion.
4. **Privacy:** Publish `SHA-256(pubkey)` hashes, not raw pubkeys. Relays compute `H(observed)` and match. 256-bit space makes brute-force infeasible; loses some audit trail.
5. **TTL:** 30-day per-entry auto-expiry. Forces ops to actively re-publish persistent bans; prevents forever-by-mistake.
6. **Rate limiting:** N/A under Approach 3 (no gossip channel; relays poll a signed list at configurable interval, that interval is the rate limit).
Next step: turn these into a Files/Steps/Verify/Done-when task spec in `docs/PRD/TASKS.md` and move T6.3 from `Blocked` to `Open`, ready for the agent to claim. User did not want this kicked off tonight.
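
Answers 4 and 5 could combine on the relay side roughly as follows. This is a sketch with hypothetical names (`BanEntry`, `BanList`, `ENTRY_TTL_SECS`), not the eventual task spec. It assumes the list signature has already been verified against the admin key and that the caller has already computed `SHA-256(pubkey)` for the observed peer. Both steps need crates outside the standard library and are left out here.

```rust
/// 30-day per-entry expiry (answer 5), in seconds.
pub const ENTRY_TTL_SECS: u64 = 30 * 24 * 60 * 60;

/// One published ban: the hash of the banned pubkey plus its publish time
/// as Unix seconds. Raw pubkeys never appear in the list (answer 4).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct BanEntry {
    pub pubkey_hash: [u8; 32],
    pub published_at: u64,
}

#[derive(Debug, Default)]
pub struct BanList {
    entries: Vec<BanEntry>,
}

impl BanList {
    /// Assumes `entries` came from a list whose signature was already checked.
    pub fn from_verified(entries: Vec<BanEntry>) -> Self {
        Self { entries }
    }

    /// True if `observed_hash` matches an entry that has not yet expired.
    pub fn is_banned(&self, observed_hash: &[u8; 32], now: u64) -> bool {
        self.entries.iter().any(|e| {
            &e.pubkey_hash == observed_hash
                && now.saturating_sub(e.published_at) < ENTRY_TTL_SECS
        })
    }

    /// Drops entries whose TTL has elapsed.
    pub fn prune_expired(&mut self, now: u64) {
        self.entries
            .retain(|e| now.saturating_sub(e.published_at) < ENTRY_TTL_SECS);
    }

    pub fn len(&self) -> usize {
        self.entries.len()
    }
}
```

Checking expiry inside `is_banned` means a stale entry stops matching even if the relay has not polled or pruned recently, which fits the goal of preventing accidental permanent bans.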
---
## Build / sync state
| Location | Branch | HEAD |
|---|---|---|
| Local (Mac) | `experimental-ui` | `f3e3ee5 fix(wzp-video): cfg-gate dav1d + svt-av1 off Android target` |
| `github` remote | `experimental-ui` | `f3e3ee5` (pushed) |
| `fj` remote | `experimental-ui` | **not pushed** (deploy key read-only on `fj`) |
| `origin` (git.manko.yoga) | `experimental-ui` | **not pushed** |
| Build server `~/wzp-builder/data/source` | `experimental-ui` | `f3e3ee5` |
If you want everything on `fj` / `origin` too, get the deploy key write-privileged or push from a different identity.
`fj/main` and `github/main` have one commit (`9ae9441 fix(audio): check capture ring available...`) that doesn't exist on `experimental-ui` — a small audio fix from May 11. Cherry-pick or merge before merging `experimental-ui` back into `main`.
### Gitleaks allowlist
Added `.gitleaks.toml` in commit `f28f39d` to allowlist 4 pre-existing historical findings. Two are real tokens (paste.tbs.amn.gg and paste.dk.manko.yoga `Authorization` headers in `scripts/build*.sh`). **Rotate those tokens if those endpoints still authenticate** — the allowlist only silences the pre-push hook; the secrets are still in git history.
---
## Agent process notes for tomorrow
The Kimi Code CLI agent on this project has a **stable, well-documented fabrication tic** — one verifiable detail per report is wrong (SHA, "updated X in same commit", fmt/clippy passes, etc.). Pattern survived an explicit CR on T6.1.
**Updated policy** (in `memory/feedback_kimi_report_fabrication.md`):
1. **Always verify the SHA** in the report header against `git log`.
2. **Always run** `cargo fmt --check` and `cargo clippy -- -D warnings` yourself — don't trust the report's claims.
3. **Don't CR fabrications anymore** — the T6.1 CR didn't change the behavior. Reviewer-fix the detail, note on the board, move on. Reserve CRs for substance issues.
The substance of the code has been consistently good. Don't let the fabrication tic bias review of the code itself.
### Rebase tic
Agent has twice rewritten already-pushed commits to address CR feedback (T5.7.1 `d3b2da6` → `517d0eb`; T6.1 `0de9522` → `9334aa5`). Forward fix commits are the rule; rebasing wasn't asked for and breaks reviewer references. Mention this only if it happens a third time.
---
## Tomorrow's suggested checklist
1. **(20 min)** Read this doc, the `feedback_kimi_report_fabrication.md` memory, and the T6.1 / T6.2 / T6.1.2 board rows on `docs/PRD/TASKS.md` to reload context.
2. **(1-2 h)** Resume T4.3.1.1: ndk-0.9 API migration in `crates/wzp-video/src/mediacodec.rs`. One commit per error category.
3. **(30 min)** If migration lands clean, attempt the minimal device test on the user's Android phone.
4. **(20 min, optional)** Convert the T6.3 design answers into a task spec block in `TASKS.md`, leave it `Open` for the agent. Don't kick off the agent unless asked.
5. **(parking lot)** AEAD prod wiring + nonce switch + wzp-codec clippy sprint — none urgent.
---
*Generated 2026-05-12, end of Wave 6 push.*