# Benchmarks

This project uses [Criterion.rs](https://bheisler.github.io/criterion.rs/book/) for performance benchmarking and regression detection.

## Running Benchmarks

Run all benchmarks:
```bash
cargo bench
```

Run a specific benchmark suite:
```bash
cargo bench --bench protocol
cargo bench --bench bandwidth
cargo bench --bench tcp_rx_scan
cargo bench --bench ecsrp5
```

Run in "quick" mode (fewer iterations, useful for development):
```bash
cargo bench --bench tcp_rx_scan -- --quick
```

## Benchmark Suites

### `protocol` — Protocol Serialization
Measures the zero-allocation serialization/deserialization of `Command` (16 bytes) and `StatusMessage` (12 bytes) structs.
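
As a rough illustration of what "zero-allocation" means here, the sketch below serializes a 16-byte struct into a stack array and back. Only the 16-byte size comes from the description above; the field names, layout, and little-endian encoding are assumptions made for the example and may not match the real `Command`.

```rust
/// Hypothetical 16-byte command. The field layout is an assumption for
/// illustration; only the total size is taken from the suite description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Command {
    pub kind: u8,
    pub flags: u8,
    pub length: u16,
    pub arg0: u32,
    pub arg1: u32,
    pub arg2: u32,
}

impl Command {
    pub const SIZE: usize = 16;

    /// Serializes into a fixed-size stack array, so no heap allocation occurs.
    pub fn to_bytes(&self) -> [u8; Self::SIZE] {
        let mut buf = [0u8; Self::SIZE];
        buf[0] = self.kind;
        buf[1] = self.flags;
        buf[2..4].copy_from_slice(&self.length.to_le_bytes());
        buf[4..8].copy_from_slice(&self.arg0.to_le_bytes());
        buf[8..12].copy_from_slice(&self.arg1.to_le_bytes());
        buf[12..16].copy_from_slice(&self.arg2.to_le_bytes());
        buf
    }

    /// Deserializes from a fixed-size array; the array type makes a
    /// length check unnecessary.
    pub fn from_bytes(buf: &[u8; Self::SIZE]) -> Self {
        Self {
            kind: buf[0],
            flags: buf[1],
            length: u16::from_le_bytes([buf[2], buf[3]]),
            arg0: u32::from_le_bytes([buf[4], buf[5], buf[6], buf[7]]),
            arg1: u32::from_le_bytes([buf[8], buf[9], buf[10], buf[11]]),
            arg2: u32::from_le_bytes([buf[12], buf[13], buf[14], buf[15]]),
        }
    }
}
```

A round trip such as `Command::from_bytes(&cmd.to_bytes())` should yield the original value, which is the kind of operation this suite times.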

### `bandwidth` — Bandwidth State Atomics
Measures `BandwidthState` hot-path operations: `fetch_add`, `spend_budget`, `calc_send_interval`, `advance_next_send`, and `summary`.
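
To make the kind of operation being timed more concrete, here is a minimal sketch of lock-free counters in the style this suite exercises. The names `spend_budget` and `calc_send_interval` come from the list above, but the fields, signatures, and semantics shown are assumptions and not the project's actual `BandwidthState`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical stand-in for `BandwidthState`; fields are assumptions.
pub struct BandwidthState {
    total_bytes: AtomicU64,
    budget: AtomicU64,
}

impl BandwidthState {
    pub fn new(budget: u64) -> Self {
        Self {
            total_bytes: AtomicU64::new(0),
            budget: AtomicU64::new(budget),
        }
    }

    /// Adds `n` to the byte counter with a single atomic `fetch_add` and
    /// returns the new total. Relaxed ordering is enough for a statistic.
    pub fn record_bytes(&self, n: u64) -> u64 {
        self.total_bytes.fetch_add(n, Ordering::Relaxed) + n
    }

    /// Tries to take `n` bytes from the budget without a lock.
    /// Returns `false` and leaves the budget unchanged if it is too small.
    pub fn spend_budget(&self, n: u64) -> bool {
        self.budget
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |b| b.checked_sub(n))
            .is_ok()
    }

    pub fn remaining_budget(&self) -> u64 {
        self.budget.load(Ordering::Acquire)
    }
}

/// Time between packets needed to hit a target rate (assumed semantics).
/// A target of zero is treated as "no pacing".
pub fn calc_send_interval(packet_bytes: u64, target_bits_per_sec: u64) -> Duration {
    if target_bits_per_sec == 0 {
        return Duration::from_nanos(0);
    }
    // u128 intermediate avoids overflow for large packet sizes.
    let nanos = packet_bytes as u128 * 8 * 1_000_000_000 / target_bits_per_sec as u128;
    Duration::from_nanos(nanos as u64)
}
```

Under these assumptions, a 1500-byte packet at 100 Mbit/s is paced at one packet every 120 µs.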

### `tcp_rx_scan` — TCP RX Status Message Scan
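Compares the optimized `memchr`-based scan against the previous naive byte-by-byte loop on 256 KB buffers. Both scans are O(n); the `memchr` version is faster because it examines many bytes per step using SIMD. Key scenarios: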
- **All zeros** (common case — data packets contain no status)
- **Status at start**
- **Status at end** (worst case for the naive scan)
- **Split messages** (status spans two TCP reads)
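
The split-message scenario is the one with real bookkeeping, so a small sketch may help. It assumes a status message is 12 bytes (the `StatusMessage` size given earlier) and starts with a marker byte; the marker value `0x07`, the `StatusScanner` type, and its `feed` method are all hypothetical. The byte search uses only `std` here; per the description above the real scan uses the `memchr` crate, and `memchr::memchr(STATUS_MARKER, buf)` would slot into `find_marker` with the same return type.

```rust
/// Hypothetical marker byte that starts a status message (assumption).
pub const STATUS_MARKER: u8 = 0x07;
/// Status message size, taken from the `StatusMessage` description.
pub const STATUS_LEN: usize = 12;

/// Carries a partial status message from one TCP read to the next.
pub struct StatusScanner {
    pending: [u8; STATUS_LEN],
    pending_len: usize,
}

/// std-only stand-in for the memchr-based search.
fn find_marker(buf: &[u8]) -> Option<usize> {
    buf.iter().position(|&b| b == STATUS_MARKER)
}

impl StatusScanner {
    pub fn new() -> Self {
        Self { pending: [0u8; STATUS_LEN], pending_len: 0 }
    }

    /// Scans one read and returns the last complete status message, if any.
    pub fn feed(&mut self, mut buf: &[u8]) -> Option<[u8; STATUS_LEN]> {
        let mut latest = None;

        // First finish a message that was cut off by the previous read.
        if self.pending_len > 0 {
            let take = (STATUS_LEN - self.pending_len).min(buf.len());
            self.pending[self.pending_len..self.pending_len + take]
                .copy_from_slice(&buf[..take]);
            self.pending_len += take;
            buf = &buf[take..];
            if self.pending_len < STATUS_LEN {
                return None; // still incomplete, wait for more data
            }
            latest = Some(self.pending);
            self.pending_len = 0;
        }

        // Then look for further markers in the rest of the buffer.
        while let Some(pos) = find_marker(buf) {
            let rest = &buf[pos..];
            if rest.len() >= STATUS_LEN {
                let mut msg = [0u8; STATUS_LEN];
                msg.copy_from_slice(&rest[..STATUS_LEN]);
                latest = Some(msg);
                buf = &rest[STATUS_LEN..];
            } else {
                // Message spans into the next read: stash the prefix.
                self.pending[..rest.len()].copy_from_slice(rest);
                self.pending_len = rest.len();
                break;
            }
        }
        latest
    }
}
```

In the all-zeros case `find_marker` returns `None` after one pass over the buffer, which is why the speed of that single search dominates the benchmark.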

### `ecsrp5` — EC-SRP5 Curve Construction
Compares `WCurve::new()` (heavy `BigUint` modular arithmetic) against the cached `&*WCURVE` access to demonstrate the Sprint 1 optimization.
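
The caching pattern behind `&*WCURVE` can be sketched with `std::sync::LazyLock` (Rust 1.80 or newer). This is a toy, not the project's code: the real `WCurve` uses `BigUint`, and the project may use a different lazy-initialization mechanism. The struct fields, the small Mersenne prime, and the `construction_count` helper are assumptions added to make the example self-contained.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Toy stand-in for the real curve type; fields are placeholders.
pub struct WCurve {
    pub p: u128,
    pub inv_four: u128,
}

/// Counts how often the constructor runs (for demonstration only).
static CONSTRUCTIONS: AtomicUsize = AtomicUsize::new(0);

pub fn construction_count() -> usize {
    CONSTRUCTIONS.load(Ordering::Relaxed)
}

/// Square-and-multiply modular exponentiation.
/// Safe from overflow as long as `modulus` is below 2^64.
fn mod_pow(mut base: u128, mut exp: u128, modulus: u128) -> u128 {
    let mut acc = 1u128;
    base %= modulus;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % modulus;
        }
        base = base * base % modulus;
        exp >>= 1;
    }
    acc
}

impl WCurve {
    /// Stands in for the expensive setup: computes the inverse of 4
    /// modulo the prime 2^61 - 1 via Fermat's little theorem.
    pub fn new() -> Self {
        CONSTRUCTIONS.fetch_add(1, Ordering::Relaxed);
        let p: u128 = (1u128 << 61) - 1;
        let inv_four = mod_pow(4, p - 2, p);
        Self { p, inv_four }
    }
}

/// Built on first access, then shared for the life of the process.
pub static WCURVE: LazyLock<WCurve> = LazyLock::new(|| WCurve::new());
```

After the first access, `&*WCURVE` amounts to an initialization check plus a pointer dereference, whereas every call to `WCurve::new()` repeats the arithmetic.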

## Interpreting Results

Criterion generates HTML reports in `target/criterion/`. Open `target/criterion/report/index.html` after running benchmarks to view interactive charts.

Example results (Apple M3 Pro, Cargo's optimized `bench` profile, which inherits from `release`):

| Benchmark | Naive/Uncached | Optimized/Cached | Speedup |
|-----------|---------------|------------------|---------|
| TCP RX scan 256KB (status at end) | 251 µs | 4.5 µs | **~55×** |
| WCurve construction | 126 µs | 1.0 ns | **~123,000×** |
| Command serialize | — | 7.7 ns | — |
| Bandwidth `fetch_add` | — | ~1 ns | — |