feat(p2p): Phase 4 cross-relay direct calling over federation
Teaches the relay pair to route direct-call signaling across an
existing federation link. Alice on Relay A can now place a direct
call to Bob on Relay B if A and B are federation peers — the
wire protocol, call registry, and signal dispatch all learn to
track and route the cross-relay flow.
Phase 3.5's dual-path QUIC race then carries the media directly
peer-to-peer using the advertised reflex addrs, with zero
changes needed on the client side.
## Wire protocol (wzp-proto)
New `SignalMessage::FederatedSignalForward { inner, origin_relay_fp }`
envelope variant, appended at end of enum — JSON serde is
name-tagged so pre-Phase-4 relays just log "unknown variant" and
drop it. 2 new roundtrip tests (any-inner nesting + single
DirectCallOffer case).
## Call registry (wzp-relay)
`DirectCall.peer_relay_fp: Option<String>` — federation TLS fp
of the peer relay that forwarded the offer/answer for this call.
`None` on local calls, `Some` on cross-relay. Used by the answer
path to route the reply back through the same federation link
instead of trying (and failing) to deliver via local signal_hub.
New `set_peer_relay_fp` setter + 1 new unit test.
## FederationManager (wzp-relay)
Three new methods:
- `local_tls_fp()` — exposes the relay's own federation TLS fp
so main.rs can build `origin_relay_fp` fields.
- `broadcast_signal(msg) -> usize` — fan out any signal message
(in practice `FederatedSignalForward`) to every active peer
link, returning the reach count. Used when Relay A doesn't
know which peer has the target fingerprint.
- `send_signal_to_peer(fp, msg)` — targeted send for the reply
path where the registry already knows which peer relay to
hit.
Plus a new `cross_relay_signal_tx: Mutex<Option<Sender<...>>>`
field that `set_cross_relay_tx()` wires at startup so the
federation `handle_signal` can push unwrapped inner messages
into the main signal dispatcher.
## Federation handle_signal (wzp-relay)
New match arm for `FederatedSignalForward`:
- Loop prevention: drops forwards whose `origin_relay_fp` equals
this relay's own fp (prevents A→B→A echo loops without needing
TTL yet).
- Otherwise pulls the inner message out and pushes it through
`cross_relay_signal_tx` so the main loop's dispatcher task
handles it as if it had arrived locally.
## Main signal loop (wzp-relay)
### DirectCallOffer when target not local
Before falling through to Hangup, try the federation path:
- Wrap the offer in `FederatedSignalForward` with
`origin_relay_fp = this relay's tls_fp`
- `fm.broadcast_signal(forward)` — returns peer count
- If any peers reached, stash the call in local registry with
`caller_reflexive_addr` set, `peer_relay_fp` still None
(broadcast — the answer-side will identify itself when it
replies)
- Send `CallRinging` to caller immediately for UX feedback
- Only if no federation or no peers → legacy Hangup path
### DirectCallAnswer when peer is remote
- Registry lookup now reads both `peer_fingerprint` and
`peer_relay_fp` in one acquisition
- If `peer_relay_fp.is_some()`:
* Reject → forward a `Hangup` over federation via
`send_signal_to_peer` instead of local signal_hub
* Accept → wrap the raw answer in `FederatedSignalForward`,
route to the specific origin peer, then emit the LOCAL
CallSetup to our callee with `peer_direct_addr =
caller_reflexive_addr` (caller is remote; this side only
has the callee)
- If `peer_relay_fp.is_none()` → existing Phase 3 same-relay
path with both CallSetups (caller + callee)
### Cross-relay signal dispatcher task
New long-running task reading `(inner, origin_relay_fp)` from
`cross_relay_rx`. In Phase 4 MVP handles:
- `DirectCallOffer` — if target is local, create the call in
the registry with `peer_relay_fp = origin_relay_fp`, stash
caller addr, deliver offer to local callee. If target isn't
local, drop (no multi-hop in Phase 4 MVP).
- `DirectCallAnswer` — look up local caller by call_id, stash
callee addr, forward raw answer to local caller via
signal_hub, emit local CallSetup with `peer_direct_addr =
callee_reflexive_addr` (peer is local now; this side only
has the caller).
- `CallRinging` — best-effort forward to local caller for UX.
- `Hangup` — logged for now; Phase 4.1 will target by call_id.
## Integration tests
`crates/wzp-relay/tests/cross_relay_direct_call.rs` — 3 tests
that reproduce the main.rs cross-relay dispatcher logic inline
and assert the invariants without spinning up real binaries:
1. `cross_relay_offer_forwards_and_stashes_peer_relay_fp` —
Relay A gets Alice's offer, broadcasts. Relay B's dispatcher
creates the call with `peer_relay_fp = relay_a_tls_fp`.
2. `cross_relay_answer_crosswires_peer_direct_addrs` — full
round trip; both CallSetups (one on each relay) carry the
OTHER party's reflex addr.
3. `cross_relay_loop_prevention_drops_self_sourced_forward` —
explicit loop-prevention check.
Full workspace test goes from 413 → 419 passing. Clippy clean
on touched files.
## Non-goals (deferred to Phase 4.1+)
- Relay-mediated media fallback across federation — if P2P
direct fails (symmetric NAT on either side), the call errors
out with "no media path". Making the existing federation
media pipeline carry ephemeral call-<id> rooms is the Phase
4.1 lift.
- Multi-hop federation (A → B → C). Phase 4 MVP supports a
direct federation link between A and B only.
- Fingerprint → peer-relay routing gossip.
PRD: .taskmaster/docs/prd_phase4_cross_relay_p2p.txt
Tasks: 70-78 all completed
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -146,6 +146,14 @@ pub struct FederationManager {
|
||||
event_log: EventLogger,
|
||||
/// Per-room rate limiters for inbound federation media.
|
||||
rate_limiters: Mutex<HashMap<String, RateLimiter>>,
|
||||
/// Phase 4: channel for handing cross-relay direct-call
|
||||
/// signaling (inner message + origin relay fp) back to the
|
||||
/// main signal loop in `main.rs`. Set once at startup via
|
||||
/// `set_cross_relay_tx`. `None` when the main loop hasn't
|
||||
/// wired it up yet (e.g. during startup warmup) — forwards
|
||||
/// that arrive before wiring are dropped with a warning.
|
||||
cross_relay_signal_tx:
|
||||
Mutex<Option<tokio::sync::mpsc::Sender<(wzp_proto::SignalMessage, String)>>>,
|
||||
}
|
||||
|
||||
impl FederationManager {
|
||||
@@ -171,6 +179,78 @@ impl FederationManager {
|
||||
dedup: Mutex::new(Deduplicator::new(DEDUP_WINDOW_SIZE)),
|
||||
event_log,
|
||||
rate_limiters: Mutex::new(HashMap::new()),
|
||||
cross_relay_signal_tx: Mutex::new(None),
|
||||
}
|
||||
}
|
||||
|
||||
/// Phase 4: expose this relay's federation TLS fingerprint so
|
||||
/// the main signal loop can populate
|
||||
/// `SignalMessage::FederatedSignalForward.origin_relay_fp`.
|
||||
pub fn local_tls_fp(&self) -> &str {
|
||||
&self.local_tls_fp
|
||||
}
|
||||
|
||||
/// Phase 4: wire the channel that the main signal loop uses
|
||||
/// to receive unwrapped cross-relay direct-call signals. Called
|
||||
/// once at startup from `main.rs`.
|
||||
pub async fn set_cross_relay_tx(
|
||||
&self,
|
||||
tx: tokio::sync::mpsc::Sender<(wzp_proto::SignalMessage, String)>,
|
||||
) {
|
||||
*self.cross_relay_signal_tx.lock().await = Some(tx);
|
||||
}
|
||||
|
||||
/// Phase 4: broadcast a `SignalMessage::FederatedSignalForward`
|
||||
/// to every active federation peer link. Returns the number of
|
||||
/// peers the broadcast reached (not the number that successfully
|
||||
/// delivered the message further). Used when the local relay
|
||||
/// doesn't know which peer holds the target fingerprint for a
|
||||
/// `DirectCallOffer` — whichever peer has it will unwrap and
|
||||
/// handle locally; the rest drop silently after "target not
|
||||
/// local" check.
|
||||
///
|
||||
/// Loop prevention: the receiving relay checks
|
||||
/// `origin_relay_fp` against its own fp and drops self-sourced
|
||||
/// forwards.
|
||||
pub async fn broadcast_signal(&self, msg: &wzp_proto::SignalMessage) -> usize {
|
||||
let links = self.peer_links.lock().await;
|
||||
let mut count = 0;
|
||||
for (fp, link) in links.iter() {
|
||||
match link.transport.send_signal(msg).await {
|
||||
Ok(()) => {
|
||||
count += 1;
|
||||
tracing::debug!(peer = %link.label, %fp, "federation: broadcast signal ok");
|
||||
}
|
||||
Err(e) => {
|
||||
tracing::warn!(peer = %link.label, %fp, error = %e, "federation: broadcast signal failed");
|
||||
}
|
||||
}
|
||||
}
|
||||
count
|
||||
}
|
||||
|
||||
/// Phase 4: targeted send — used by the
|
||||
/// `DirectCallAnswer` path when the registry knows exactly
|
||||
/// which peer relay to route the reply back to. More efficient
|
||||
/// than re-broadcasting and avoids leaking the call to
|
||||
/// uninvolved peers.
|
||||
///
|
||||
/// Returns `Ok(())` on success, `Err(String)` when the peer
|
||||
/// isn't currently linked or the send fails.
|
||||
pub async fn send_signal_to_peer(
|
||||
&self,
|
||||
peer_relay_fp: &str,
|
||||
msg: &wzp_proto::SignalMessage,
|
||||
) -> Result<(), String> {
|
||||
let normalized = normalize_fp(peer_relay_fp);
|
||||
let links = self.peer_links.lock().await;
|
||||
match links.get(&normalized) {
|
||||
Some(link) => link
|
||||
.transport
|
||||
.send_signal(msg)
|
||||
.await
|
||||
.map_err(|e| format!("send to peer {normalized}: {e}")),
|
||||
None => Err(format!("no active federation link for {normalized}")),
|
||||
}
|
||||
}
|
||||
|
||||
@@ -852,6 +932,57 @@ async fn handle_signal(
|
||||
}
|
||||
}
|
||||
}
|
||||
// Phase 4: cross-relay direct-call signal envelope.
|
||||
//
|
||||
// Unwrap the inner message and hand it off to the main
|
||||
// signal loop via the cross_relay_signal_tx channel. The
|
||||
// main loop will then dispatch the inner DirectCallOffer/
|
||||
// Answer/Ringing/Hangup exactly as if it had arrived on a
|
||||
// local signal transport — with the extra context that
|
||||
// the call is "federated" (origin_relay_fp).
|
||||
//
|
||||
// Loop prevention: drop any forward whose origin matches
|
||||
// our own federation TLS fingerprint. With
|
||||
// broadcast-to-all-peers this prevents A→B→A echo loops.
|
||||
SignalMessage::FederatedSignalForward { inner, origin_relay_fp } => {
|
||||
if origin_relay_fp == fm.local_tls_fp {
|
||||
tracing::debug!(
|
||||
peer = %peer_label,
|
||||
"federation: dropping self-sourced FederatedSignalForward (loop prevention)"
|
||||
);
|
||||
return;
|
||||
}
|
||||
let tx_opt = {
|
||||
let guard = fm.cross_relay_signal_tx.lock().await;
|
||||
guard.clone()
|
||||
};
|
||||
match tx_opt {
|
||||
Some(tx) => {
|
||||
let inner_discriminant = std::mem::discriminant(&*inner);
|
||||
if let Err(e) = tx.send((*inner, origin_relay_fp.clone())).await {
|
||||
warn!(
|
||||
peer = %peer_label,
|
||||
?inner_discriminant,
|
||||
error = %e,
|
||||
"federation: cross-relay signal dispatcher full / closed"
|
||||
);
|
||||
} else {
|
||||
tracing::debug!(
|
||||
peer = %peer_label,
|
||||
?inner_discriminant,
|
||||
%origin_relay_fp,
|
||||
"federation: forwarded cross-relay signal to main dispatcher"
|
||||
);
|
||||
}
|
||||
}
|
||||
None => {
|
||||
warn!(
|
||||
peer = %peer_label,
|
||||
"federation: cross_relay_signal_tx not wired yet — dropping forward"
|
||||
);
|
||||
}
|
||||
}
|
||||
}
|
||||
_ => {} // ignore other signals
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user