---
title: Scanner Operations
tags: [operations, scanner, deployment]
created: 2026-05-30
---

# Scanner Operations

Runbook for deploying, configuring, monitoring, and troubleshooting the AMN Pay Scanner microservice.

---

## 1. Configuration reference

All configuration is done via environment variables. See `.env.example` in the scanner repo.

| Variable | Default | Required | Description |
|---|---|---|---|
| `PORT` | `8080` | no | HTTP listen port |
| `DB_PATH` | `./scanner.db` | no | SQLite database path |
| `CHAINS_JSON_PATH` | `./supported-chains.json` | no | Supported chains config |
| `TOKENS_JSON_PATH` | `./tokens.json` | no | Token registry |
| `SCANNER_API_KEY` | _(none)_ | **yes (prod)** | Bearer token for all non-health endpoints. Generate with `openssl rand -hex 32` |
| `POLL_INTERVAL_SEC` | `15` | no | Chain poll interval in seconds |
| `INTENT_TTL_HOURS` | `24` | no | Pending/confirming intents older than this are expired (0 = disabled) |
| `WEBHOOK_RETRY_HOURS` | `6` | no | Interval between automatic `webhook_failed` re-delivery passes (0 = disabled) |
| `TRONGRID_API_KEY` | _(none)_ | recommended | TronGrid API key; without it rate limits are very low |
| `TONCENTER_API_KEY` | _(none)_ | recommended | TonCenter API key |
| `RPC_BSC` | _(chain config)_ | no | Override BSC RPC URL (chain 56) |
| `RPC_ARB` | _(chain config)_ | no | Override Arbitrum RPC URL (chain 42161) |
| `RPC_ETH` | _(chain config)_ | no | Override Ethereum RPC URL (chain 1) |
| `RPC_POLYGON` | _(chain config)_ | no | Override Polygon RPC URL (chain 137) |
| `RPC_BASE` | _(chain config)_ | no | Override Base RPC URL (chain 8453) |

> [!warning]
> If `SCANNER_API_KEY` is not set, the scanner logs a warning and accepts all requests. Never run this way in production.

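Because an unset `SCANNER_API_KEY` only produces a warning, a deploy script can refuse to continue before the container starts. The following is a minimal sketch, not part of the scanner repo: the helper name `check_scanner_env` is hypothetical, and the 64-hex-character check assumes the key was generated with `openssl rand -hex 32` as suggested in the table. The scanner itself may accept keys of other shapes.

```bash
# Hypothetical pre-flight helper; not part of the scanner repo.
# Succeeds only when SCANNER_API_KEY looks like `openssl rand -hex 32` output.
check_scanner_env() {
  local key="${SCANNER_API_KEY:-}"
  if [ -z "$key" ]; then
    echo "SCANNER_API_KEY is unset: the scanner would accept all requests" >&2
    return 1
  fi
  # Reject anything that is not a hex digit.
  case "$key" in
    *[!0-9a-fA-F]*)
      echo "SCANNER_API_KEY contains non-hex characters" >&2
      return 1
      ;;
  esac
  # 32 random bytes, hex-encoded, give exactly 64 characters.
  if [ "${#key}" -ne 64 ]; then
    echo "SCANNER_API_KEY should be 64 hex characters, got ${#key}" >&2
    return 1
  fi
}
```

A deploy script could source this and run `check_scanner_env || exit 1` after loading `.env`.
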
---

## 2. Docker deployment

The scanner ships as a single Docker image. The Dockerfile uses a two-stage build (Go 1.25 builder → Alpine 3.21 runtime).

### Quick start (dev)

```bash
cd scanner/
cp .env.example .env
# edit .env: set SCANNER_API_KEY, RPC overrides, etc.

docker build -t amn-scanner:dev .
# Quote the bind mount so paths containing spaces still work.
docker run -d \
  --name amn-scanner \
  -p 8080:8080 \
  -v "$(pwd)/data:/data" \
  --env-file .env \
  amn-scanner:dev
```

### Production (via arcane-cli / Watchtower)

The scanner is deployed manually via `arcane-cli` (not gitops). Watchtower does NOT manage it automatically. After pushing a new image, redeploy with:

```bash
arcane-cli project redeploy --json <project-id>
```

The SQLite database is stored on a named Docker volume mounted at `/data`. Do not recreate the volume between deploys, because it holds the checkpoint and intent state.

---

## 3. Health check

```bash
curl http://localhost:8080/health
# {"status":"ok","time":"2026-05-30T12:00:00Z"}
```

Docker `HEALTHCHECK` is already configured in the Dockerfile (30 s interval, 5 s timeout, 3 retries).

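After a manual redeploy it can be convenient to block until the health endpoint answers instead of re-running `curl` by hand. The sketch below assumes the response body contains `"status":"ok"` exactly as in the example above; the helper name `wait_for_health` and its default attempt count and delay are hypothetical and should be tuned to your environment.

```bash
# Hypothetical helper: poll a health URL until it reports "status":"ok".
# Usage: wait_for_health URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_health() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}"
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; grep decides the final status.
    if curl -fsS --max-time 5 "$url" 2>/dev/null | grep -q '"status":"ok"'; then
      return 0
    fi
    # Only sleep when another attempt will follow.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "health check failed after $attempts attempts: $url" >&2
  return 1
}
```

For example, `wait_for_health http://localhost:8080/health 20 3` waits up to roughly a minute before giving up with a non-zero exit status.
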
---

## 4. Monitoring

### Scanner status endpoint

```bash
curl -H "Authorization: Bearer $SCANNER_API_KEY" \
  http://localhost:8080/scanner/status | jq .
```

Check:
- `lag`: should be near 0 for healthy chains (blocks behind for EVM, seconds for TON)
- `pendingIntents`: number of unresolved intents per chain
- `lastScannedBlock`: should advance each poll

### Logs

The scanner uses Go's `log/slog` structured logger with level prefixes. Key log patterns:

| Pattern | Meaning |
|---|---|
| `[scanner] worker started` | Worker goroutine began for this chain |
| `[evm] intent confirming` | EVM tx seen, waiting for confirmations |
| `[evm] intent confirmed` | EVM: N confirmations reached |
| `[tron] MATCH` / `[ton] MATCH` | Transfer matched, going to confirmed |
| `[webhook] delivered` | Webhook POST succeeded |
| `[webhook] non-2xx response` | Backend returned error (will retry) |
| `[webhook] all retries exhausted` | Intent moved to `webhook_failed` |
| `[scanner] reconciling confirmed intents` | Startup crash recovery in progress |
| `[evm] scanner lag` | Chain lag > 100 blocks (investigate RPC) |

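The patterns in the table that signal trouble can be tallied straight from the container logs. This is a rough sketch that assumes the patterns appear verbatim in each log line, as listed above; the helper name `scan_scanner_logs` is hypothetical, and any other fields on the line are ignored.

```bash
# Hypothetical helper: count warning-type log patterns read from stdin.
# Prints one "<count><TAB><pattern>" line per pattern.
scan_scanner_logs() {
  local input pattern count
  input="$(cat)"
  for pattern in \
    '[webhook] all retries exhausted' \
    '[webhook] non-2xx response' \
    '[evm] scanner lag'
  do
    # -F treats the pattern as a fixed string, so "[" is not a regex class.
    # grep -c exits non-zero on zero matches, hence the "|| true".
    count="$(printf '%s\n' "$input" | grep -cF -- "$pattern" || true)"
    printf '%s\t%s\n' "$count" "$pattern"
  done
}
```

Typical use would be `docker logs --since 1h amn-scanner 2>&1 | scan_scanner_logs`, using the container name from the quick start.
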
---

## 5. Adding / modifying chains

Edit `supported-chains.json`. Fields:

| Field | Notes |
|---|---|
| `chainId` | Numeric EIP-155 chain ID (arbitrary int for Tron/TON) |
| `chainType` | `"evm"` (default) / `"tron"` / `"ton"` |
| `rpcUrl` | Primary RPC endpoint |
| `publicRpcUrl` | Fallback RPC (EVM only) |
| `proxyAddress` | ERC20FeeProxy address (EVM); USDT contract (Tron); USDT Jetton master (TON) |
| `confirmationThreshold` | Blocks required (EVM); ignored for Tron/TON |
| `verified` | `true` to activate the worker; `false` to disable without deleting |

> [!important]
> Changing `proxyAddress` for an EVM chain only affects new scans. Existing pending intents will still be matched against the old address until they expire or are confirmed.

After editing, restart the scanner container to pick up the new config.

---

## 6. Adding tokens to the registry

Edit `tokens.json`. Each entry:

```json
{ "chainId": 56, "address": "0x...", "symbol": "USDC", "decimals": 18, "name": "USD Coin" }
```

The token registry is used only to populate `tokenSymbol` and `decimals` in the `checkoutBlock` response. Omitting a token does not break scanning; it just leaves those fields empty.

---

## 7. Manual webhook retry

Force immediate re-delivery of all `webhook_failed` intents:

```bash
curl -X POST -H "Authorization: Bearer $SCANNER_API_KEY" \
  http://localhost:8080/admin/webhooks/retry
# {"queued": N}
```

---

## 8. Database inspection

The SQLite database (`/data/scanner.db`) can be inspected with the `sqlite3` CLI inside the container:

```bash
# Open an interactive sqlite3 shell inside the container,
# then run the queries below at the sqlite> prompt.
docker exec -it amn-scanner sqlite3 /data/scanner.db

-- Check stuck intents
SELECT intent_id, chain_id, status, created_at, webhook_delivered_at
FROM intents
WHERE status NOT IN ('confirmed', 'expired')
ORDER BY created_at DESC;

-- Check chain checkpoints
SELECT chain_id, last_scanned_block, updated_at FROM checkpoints;

-- Count by status
SELECT status, count(*) FROM intents GROUP BY status;
```

---

## 9. Troubleshooting

### Intent stuck in `pending`

1. Check `/scanner/status`: is the chain worker running and is `lastScannedBlock` advancing? A `lag` that stays above 0 for a long time points to an RPC issue.
2. Check that `chainId` and `tokenAddress` match exactly what is in `supported-chains.json` and `tokens.json`.
3. For EVM: verify the `proxyAddress` matches the contract the buyer is calling.
4. For Tron: confirm the destination address is stored in EVM-hex (0x) format in the DB.
5. Check scanner logs for `REJECT` messages around the expected tx time.

### Webhook never received by backend

1. Check `webhook_delivered_at` in the DB. If it is not null, the scanner delivered successfully and the problem is on the backend side.
2. If it is null and the status is `webhook_failed`: check backend logs for the incoming POST and verify the `X-AMN-Signature` validation code.
3. If the status is `confirmed` but `webhook_delivered_at` is null: startup reconciliation may re-deliver on the next restart.
4. Use `POST /admin/webhooks/retry` to trigger an immediate retry.

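The webhook checklist above reduces to a small decision on two columns of the `intents` table. The sketch below only restates those steps as code: the helper name `diagnose_webhook` and its message strings are hypothetical, and any status outside the three listed cases is reported as not covered rather than guessed at.

```bash
# Hypothetical helper mirroring the checklist above.
# Usage: diagnose_webhook STATUS [WEBHOOK_DELIVERED_AT]
# Pass an empty second argument when webhook_delivered_at is NULL.
diagnose_webhook() {
  local status="$1" delivered_at="${2:-}"
  if [ -n "$delivered_at" ]; then
    # Step 1: the scanner delivered, so look at the backend.
    echo "delivered: investigate the backend"
  elif [ "$status" = "webhook_failed" ]; then
    # Step 2: delivery attempts failed.
    echo "failed: check backend logs and X-AMN-Signature validation, then retry"
  elif [ "$status" = "confirmed" ]; then
    # Step 3: confirmed but never delivered.
    echo "undelivered: restart for reconciliation or trigger a manual retry"
  else
    echo "other: not covered by the checklist"
  fi
}
```

For instance, `diagnose_webhook webhook_failed ""` prints the step 2 advice, after which `POST /admin/webhooks/retry` forces re-delivery.
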
### High lag on EVM chain

1. Check RPC endpoint availability and rate limits.
2. Consider setting an `RPC_*` env override to a premium RPC (Alchemy, Infura, QuickNode).
3. The scanner falls back to `publicRpcUrl` if the primary fails, but public nodes have lower rate limits.

### Intent confirmed but amount looks wrong

The scanner accepts any amount **>=** `intent.Amount`. Overpayments are not flagged. Underpayments result in the intent staying pending until TTL expiry.

---

## 10. CI/CD notes

- Woodpecker CI pipeline is in `.woodpecker/`.
- Telegram notify steps were removed (no TG secrets configured).
- Deploy step was removed; the scanner is deployed manually via `arcane-cli`.
- The CI pipeline builds and pushes the Docker image to the Gitea registry.
- Image tag format: `dev-<VERSION>` (from the `VERSION` file).

> [!tip]
> After CI completes, verify the image is in the registry before redeploying. Silent CI failures can leave a stale image tagged. Check the registry tag timestamp, not just the CI green light.

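When checking the registry it helps to know exactly which tag CI should have produced. A minimal sketch, assuming the `VERSION` file holds a single version string and nothing else; the helper name `image_tag` is hypothetical, and the real pipeline may derive the tag differently.

```bash
# Hypothetical helper: derive the expected image tag from a VERSION file.
# Usage: image_tag [PATH_TO_VERSION_FILE]   (defaults to ./VERSION)
image_tag() {
  local version_file="${1:-VERSION}" version
  if [ ! -r "$version_file" ]; then
    echo "cannot read $version_file" >&2
    return 1
  fi
  # Strip the trailing newline and any stray whitespace.
  version="$(tr -d '[:space:]' < "$version_file")"
  if [ -z "$version" ]; then
    echo "$version_file is empty" >&2
    return 1
  fi
  printf 'dev-%s\n' "$version"
}
```

Run from the scanner repo root, `image_tag` prints the tag to look for in the Gitea registry before calling `arcane-cli project redeploy`.
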