Initial commit: nick docs

moojttaba
2026-05-23 20:35:34 +03:30
commit 0da235ae27
90 changed files with 18268 additions and 0 deletions

@@ -0,0 +1,315 @@
---
title: Backup & Recovery
tags: [operations]
---
# Backup & Recovery
How to keep the marketplace recoverable from data loss. Covers MongoDB, Redis, the `uploads/` directory, and environment secrets, plus the disaster-recovery runbook.
---
## 1. RTO / RPO targets
| Asset | RPO (data loss tolerated) | RTO (downtime tolerated) | Backup cadence |
|-------|---------------------------|--------------------------|----------------|
| MongoDB | 1 hour | 1 hour | Hourly `mongodump` + nightly offsite |
| `uploads/` directory | 24 hours | 2 hours | Nightly `aws s3 sync` / `rclone sync` to offsite |
| Redis | 1 hour (regeneratable) | 0 minutes (app survives empty cache) | Nightly RDB snapshot |
| Production `.env` | n/a (manual) | 5 minutes | Stored in 1Password / Bitwarden vault |
| Container images | n/a (CI rebuilds) | 15 minutes | Tagged in registry by version |
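Adjust these targets when product SLAs change. The 1-hour MongoDB RPO holds only while the host's local dumps survive; after a full host loss, the achievable RPO is bounded by the nightly offsite copy (up to 24 hours).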
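One way to notice a missed target early is to compare the age of the newest backup against its RPO. The sketch below is illustrative and not part of the existing tooling: `backup_is_fresh` is a hypothetical helper, and it assumes GNU `stat` plus the dump directory and file pattern used in section 2.1.

```bash
#!/usr/bin/env bash
set -euo pipefail

# backup_is_fresh DIR PATTERN MAX_AGE_SECONDS  (hypothetical helper)
# Succeeds if at least one file matching PATTERN in DIR was modified
# within the last MAX_AGE_SECONDS.
backup_is_fresh() {
  local dir=$1 pattern=$2 max_age=$3
  local now newest=0 mtime f
  now=$(date +%s)
  while IFS= read -r f; do
    mtime=$(stat -c %Y "$f")   # GNU stat; on BSD/macOS use: stat -f %m
    if [ "$mtime" -gt "$newest" ]; then newest=$mtime; fi
  done < <(find "$dir" -maxdepth 1 -type f -name "$pattern")
  # newest stays 0 when nothing matched, which counts as "not fresh"
  [ "$newest" -gt 0 ] && [ $((now - newest)) -le "$max_age" ]
}
```

A possible use from cron, allowing one hour of slack over the hourly cadence: `backup_is_fresh /var/backups/mongo 'marketplace-*.gz' 7200 || echo "MongoDB backup is stale" >&2`.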
---
## 2. MongoDB
### 2.1 Dump
```bash
#!/usr/bin/env bash
# scripts/backup-mongo.sh — run hourly via cron
set -euo pipefail
STAMP=$(date -u +%FT%H%M%SZ)
DEST=/var/backups/mongo
mkdir -p "$DEST"
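# Write to a temp name first so a failed dump never leaves a partial marketplace-*.gz behind
docker exec nickapp-mongodb \
  mongodump --db=marketplace --archive --gzip \
  > "$DEST/.marketplace-$STAMP.gz.partial"
mv "$DEST/.marketplace-$STAMP.gz.partial" "$DEST/marketplace-$STAMP.gz"
# Retention: delete dumps older than 14 days (every hourly dump inside that window is kept)
find "$DEST" -name 'marketplace-*.gz' -mtime +14 -delete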
```
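If tiered retention is wanted (the newest 24 dumps, plus the newest remaining dump for each of up to 14 days) rather than a flat age cutoff, it could be sketched as below. `prune_tiered` is a hypothetical helper; it assumes the `marketplace-<UTC stamp>.gz` naming produced by the dump script and that nothing else in the directory matches that pattern.

```bash
#!/usr/bin/env bash
set -euo pipefail

# prune_tiered DIR [HOURLY] [DAILY]  (hypothetical helper)
# Keeps the newest HOURLY dumps, then the newest remaining dump of each of the
# next DAILY calendar days (UTC, read from the file name). Deletes the rest.
prune_tiered() {
  local dir=$1 hourly=${2:-24} daily=${3:-14}
  local n=0 days=0 last_day="" f base day
  # Names embed a sortable UTC stamp, so reverse lexical order is newest-first.
  while IFS= read -r f; do
    n=$((n + 1))
    if [ "$n" -le "$hourly" ]; then continue; fi      # hourly tier: keep
    base=${f##*/}
    day=${base#marketplace-}
    day=${day%%T*}                                     # e.g. 2026-05-20
    if [ "$day" != "$last_day" ]; then
      last_day=$day
      days=$((days + 1))
      if [ "$days" -le "$daily" ]; then continue; fi  # daily tier: keep newest of the day
    fi
    echo "deleting $f"
    rm -f -- "$f"
  done < <(find "$dir" -maxdepth 1 -type f -name 'marketplace-*.gz' | sort -r)
}
```

Selecting by file name rather than `-mtime` keeps the result stable even if a restore or copy resets modification times.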
Cron entry:
```
0 * * * * /usr/local/bin/backup-mongo.sh >> /var/log/backup-mongo.log 2>&1
```
### 2.2 Offsite
Push the local dumps to S3 (or Backblaze B2, or any provider via `rclone`) nightly. `aws s3 sync` uploads only the files the bucket does not already hold:
```bash
DEST=/var/backups/mongo
aws s3 sync "$DEST/" "s3://marketplace-backups/mongo/" \
  --exclude "*" --include "marketplace-*.gz" \
  --storage-class STANDARD_IA
```
Set a 90-day lifecycle policy on the bucket to age out old copies.
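When only the newest dump should be handled (for an upload or a restore), it can be picked by name. This is a sketch under the assumption that dumps follow the `marketplace-<UTC stamp>.gz` naming from section 2.1; `latest_dump` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# latest_dump DIR  (hypothetical helper)
# Prints the path of the newest marketplace dump in DIR. Relies on the
# sortable UTC stamp in the file name rather than on mtime.
latest_dump() {
  local dir=$1 newest
  newest=$(find "$dir" -maxdepth 1 -type f -name 'marketplace-*.gz' | sort | tail -n 1)
  [ -n "$newest" ] || { echo "no dumps found in $dir" >&2; return 1; }
  printf '%s\n' "$newest"
}

# Example (not executed here): upload just the newest dump.
# aws s3 cp "$(latest_dump /var/backups/mongo)" s3://marketplace-backups/mongo/
```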
### 2.3 Restore
> [!warning] Restoring is **destructive** to the current data. Always practise on a staging clone before doing it for real.
```bash
# Restore against an empty database (fresh container)
docker exec -i nickapp-mongodb \
mongorestore --archive --gzip --drop \
< /var/backups/mongo/marketplace-2026-05-20T030000Z.gz
# Verify (the database is passed as an argument; `use` is an interactive shell helper)
docker exec nickapp-mongodb mongosh marketplace --quiet \
--eval "db.users.countDocuments()"
```
For partial restore (single collection):
```bash
docker exec -i nickapp-mongodb \
mongorestore --archive --gzip --drop \
--nsInclude='marketplace.payments' \
< /var/backups/mongo/marketplace-2026-05-20T030000Z.gz
```
### 2.4 Validate backups
A monthly drill — restore the latest dump into a throwaway container and run smoke queries:
```bash
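# Overriding the command skips the image's mongod, so fork one before restoring
docker run --rm -v "$(pwd)/marketplace-latest.gz:/dump.gz:ro" mongo:8.2 sh -c \
  "mongod --fork --logpath /tmp/mongod.log --bind_ip 127.0.0.1 && mongorestore --archive=/dump.gz --gzip \
  && mongosh --quiet --eval 'db.getMongo().getDBNames()'"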
```
If validation fails, treat as a sev-2 incident (see [[Incident Response]]).
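A cheaper check can run after every dump, before the monthly drill: confirm the file is non-empty and that its gzip stream is intact. This is only a sketch; it assumes the dump is a single gzip stream (which is how `mongodump --archive --gzip` output is generally consumed, but worth confirming against your tool version), and `check_archive` is a hypothetical helper. It does not prove the archive restores cleanly.

```bash
#!/usr/bin/env bash
set -euo pipefail

# check_archive FILE  (hypothetical helper)
# Fails if FILE is missing, empty, or not a complete gzip stream.
check_archive() {
  local file=$1
  [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
  # Read via stdin so the check does not depend on the file's suffix
  gzip -t < "$file" 2>/dev/null || { echo "corrupt gzip stream: $file" >&2; return 1; }
}
```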
---
## 3. Redis
Redis data is regeneratable — losing it means logged-out users + cold caches, no business data lost. Still cheap to back up.
### 3.1 Snapshot
```bash
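# Trigger a background save, wait for it to finish, then copy the RDB out
mkdir -p /var/backups/redis
docker exec nickapp-redis redis-cli -a "$REDIS_PASSWORD" --no-auth-warning BGSAVE
# Poll instead of a fixed sleep: rdb_bgsave_in_progress drops to 0 when the save completes
while docker exec nickapp-redis redis-cli -a "$REDIS_PASSWORD" --no-auth-warning INFO persistence \
  | grep -q 'rdb_bgsave_in_progress:1'; do sleep 1; done
docker cp nickapp-redis:/data/dump.rdb "/var/backups/redis/redis-$(date -u +%FT%H%M%SZ).rdb"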
```
Daily cron is sufficient.
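A copied snapshot can be given a quick sanity check before it is trusted: RDB files start with the ASCII magic `REDIS`. The sketch below checks only that header, not the full file; `is_rdb` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# is_rdb FILE  (hypothetical helper)
# Succeeds if FILE begins with the RDB magic string "REDIS".
is_rdb() {
  local file=$1 magic
  magic=$(head -c 5 "$file" 2>/dev/null || true)
  [ "$magic" = "REDIS" ]
}
```

For a deeper check, Redis ships `redis-check-rdb`, which can be run against the copied file.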
### 3.2 Restore
```bash
# Stop redis, drop the RDB into the volume, start
docker compose -f docker-compose.production.yml stop redis
docker cp /var/backups/redis/redis-2026-05-20T030000Z.rdb nickapp-redis:/data/dump.rdb
docker compose -f docker-compose.production.yml start redis
```
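If AOF is enabled, Redis loads the AOF rather than the RDB at startup, so restore the AOF as well (`appendonly.aof`, or the `appendonlydir/` directory on Redis 7 and later). See [[Database Operations#persistence]].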
---
## 4. `uploads/` directory
Stored on the host at `/opt/backend/uploads/` and bind-mounted into both backend and nginx containers. This is where every user upload lives — losing it means broken images, missing dispute evidence, and unhappy users.
### 4.1 Nightly sync
```bash
# Mirror to S3 (--delete also removes objects whose local file is gone)
aws s3 sync /opt/backend/uploads/ \
  s3://marketplace-backups/uploads/ --delete
# Or rclone to any provider
rclone sync /opt/backend/uploads/ backblaze:marketplace-uploads --transfers 8
```
Cron:
```
30 3 * * * /usr/local/bin/backup-uploads.sh >> /var/log/backup-uploads.log 2>&1
```
### 4.2 Restore
```bash
aws s3 sync s3://marketplace-backups/uploads/ /opt/backend/uploads/
# fix ownership for the marketplace container (uid 1001)
chown -R 1001:1001 /opt/backend/uploads
```
Restart the backend container so any in-flight uploads find the right directory layout.
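To confirm that a restore brought back every upload intact, a checksum manifest can be written at backup time and verified after the restore. This is a sketch assuming GNU coreutils; `write_manifest` and `verify_manifest` are hypothetical helpers, and the manifest should be stored outside the `uploads/` tree so it does not list itself.

```bash
#!/usr/bin/env bash
set -euo pipefail

# write_manifest DIR MANIFEST  (hypothetical helper)
# Records a SHA-256 for every file under DIR, with paths relative to DIR.
write_manifest() {
  local dir=$1 manifest=$2
  ( cd "$dir" && find . -type f -print0 | sort -z | xargs -0 -r sha256sum ) > "$manifest"
}

# verify_manifest DIR MANIFEST  (hypothetical helper)
# Succeeds only if every recorded file exists under DIR with the same checksum.
# Files added after the manifest was written are not detected.
verify_manifest() {
  local dir=$1 manifest=$2
  ( cd "$dir" && sha256sum --quiet -c - ) < "$manifest"
}
```

Example: `write_manifest /opt/backend/uploads /var/backups/uploads.sha256` before the nightly sync, then `verify_manifest /opt/backend/uploads /var/backups/uploads.sha256` after a restore.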
---
## 5. Secrets & configuration
### 5.1 `.env` files
The production `.env` lives at `/opt/backend/.env`. It is **not** version-controlled and **not** in any standard backup. Source of truth: the team password manager (1Password / Bitwarden vault).
After any change:
1. Update the host file.
2. Update the vault entry with the new value, a one-line "why", and the date.
3. `docker compose -f docker-compose.production.yml up -d` to apply.
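Before applying a changed or restored `.env`, it can help to check that the keys the stack needs are present and non-empty. The sketch below is an assumption-laden example: `missing_env_keys` is a hypothetical helper, and apart from `REDIS_PASSWORD` the key names in the usage line are placeholders, not the project's real variable list.

```bash
#!/usr/bin/env bash
set -euo pipefail

# missing_env_keys FILE KEY...  (hypothetical helper)
# Prints each KEY that has no "KEY=<non-empty value>" line in FILE and
# returns non-zero if any key is missing. Commented-out lines do not count.
missing_env_keys() {
  local file=$1 key missing=0
  shift
  for key in "$@"; do
    if ! grep -Eq "^(export[[:space:]]+)?${key}=.+" "$file"; then
      echo "$key"
      missing=1
    fi
  done
  return "$missing"
}
```

Example with placeholder key names: `missing_env_keys /opt/backend/.env REDIS_PASSWORD MONGODB_URI JWT_SECRET`.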
### 5.2 SSL certs
If you run a host-level Caddy / Nginx with Let's Encrypt, certs auto-renew. Back up `/var/lib/caddy/.local/share/caddy/` (Caddy) or `/etc/letsencrypt/` (Certbot) — useful if you migrate hosts.
### 5.3 Container registry credentials
`/root/.docker/config.json` on the production host holds the `git.manko.yoga` login Watchtower uses. Recreate after a rebuild:
```bash
docker login git.manko.yoga -u manawenuz
```
---
## 6. Disaster recovery runbook
> Scenario: production host is unrecoverable (disk failure, cloud provider lost the VM, etc.).
### Phase 1 — Provision
1. Spin up a new VM matching the previous spec (≥ 4 vCPU, 8 GB RAM, 100 GB SSD).
2. Install Docker Engine + compose plugin.
3. Repoint DNS at the new host, or stand up a temporary subdomain (`recovery.amn.gg`).
### Phase 2 — Code
```bash
cd /opt
git clone ssh://git@git.manko.yoga:222/nick/backend.git
git clone ssh://git@git.manko.yoga:222/nick/frontend.git
cd backend && git checkout main
```
### Phase 3 — Config
```bash
cd /opt/backend   # the relative paths below assume the backend checkout
# Restore .env from the vault
nano /opt/backend/.env
# Restore nginx config
mkdir -p nginx/logs
# copy nginx.conf from the vault / repo / your laptop
```
### Phase 4 — Data
```bash
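# Mongo (LATEST is a placeholder; substitute the newest key, e.g. from:
#   aws s3 ls s3://marketplace-backups/mongo/ | sort | tail -n 1)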
mkdir -p /var/backups/mongo
aws s3 cp s3://marketplace-backups/mongo/marketplace-LATEST.gz /var/backups/mongo/
# Uploads
mkdir -p /opt/backend/uploads
aws s3 sync s3://marketplace-backups/uploads/ /opt/backend/uploads/
chown -R 1001:1001 /opt/backend/uploads
# Redis (optional — empty is fine)
mkdir -p /var/backups/redis
aws s3 cp s3://marketplace-backups/redis/redis-LATEST.rdb /var/backups/redis/
```
### Phase 5 — Start stack
```bash
cd /opt/backend
docker login git.manko.yoga -u manawenuz
docker compose -f docker-compose.production.yml up -d
# wait ~60s
docker compose -f docker-compose.production.yml ps
```
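The fixed wait can be replaced by polling until the stack answers. A minimal sketch, assuming the health endpoint used in Phase 7 responds once the backend is ready; `wait_until` is a hypothetical helper.

```bash
#!/usr/bin/env bash
set -euo pipefail

# wait_until TIMEOUT_SECONDS CMD...  (hypothetical helper)
# Retries CMD about once per second until it succeeds or the timeout elapses.
wait_until() {
  local timeout=$1 deadline
  shift
  deadline=$((SECONDS + timeout))
  until "$@" >/dev/null 2>&1; do
    if [ "$SECONDS" -ge "$deadline" ]; then return 1; fi
    sleep 1
  done
}
```

Example: `wait_until 120 curl -fsS http://localhost:8083/api/health || echo "backend did not come up" >&2`.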
### Phase 6 — Restore data into running containers
```bash
# Mongo
docker exec -i nickapp-mongodb \
mongorestore --archive --gzip --drop \
< /var/backups/mongo/marketplace-LATEST.gz
# Redis
docker compose -f docker-compose.production.yml stop redis
docker cp /var/backups/redis/redis-LATEST.rdb nickapp-redis:/data/dump.rdb
docker compose -f docker-compose.production.yml start redis
```
### Phase 7 — Verify
```bash
curl -fsS http://localhost:8083/api/health | jq .
docker exec nickapp-mongodb mongosh marketplace --quiet --eval "db.users.countDocuments()"
# docker logs takes the container name directly and needs no compose file
docker logs --tail=200 nickapp-backend 2>&1 | grep -E "✅|❌"
```
### Phase 8 — Restart Watchtower & cut over DNS
```bash
docker run -d --name watchtower --restart unless-stopped \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /root/.docker/config.json:/config.json \
-e WATCHTOWER_POLL_INTERVAL=300 \
-e WATCHTOWER_LABEL_ENABLE=true \
containrrr/watchtower
# Update DNS for amn.gg / dev.amn.gg to the new host's IP
```
### Phase 9 — Post-mortem
Write a post-mortem (template in [[Incident Response#postmortem-template]]) and update this runbook with anything that surprised you.
---
## 7. Quick-reference commands
```bash
# Mongo dump
docker exec nickapp-mongodb mongodump --db=marketplace --archive --gzip > backup.gz
# Mongo restore
docker exec -i nickapp-mongodb mongorestore --archive --gzip --drop < backup.gz
# Redis snapshot
docker exec nickapp-redis redis-cli -a "$REDIS_PASSWORD" BGSAVE
docker cp nickapp-redis:/data/dump.rdb redis.rdb
# Uploads to S3
rclone sync /opt/backend/uploads/ s3:marketplace-backups/uploads/
# Restore .env
# Pull from vault, paste into /opt/backend/.env, docker compose up -d
```
---
## 8. Testing the plan
> [!tip] Backups are not real until they've been restored. Drill quarterly:
>
> 1. Spin up a throwaway VM.
> 2. Walk Phases 2 to 7 of the DR runbook with the most recent backups.
> 3. Time it. If RTO is busted, fix the gap before the next drill.
> 4. Capture lessons in this file.
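
Timing the drill can be as simple as wrapping each phase's commands and logging the elapsed seconds. The sketch below is one possible approach, not existing tooling: `timed`, `drill_total_seconds`, and the `DRILL_LOG` variable are hypothetical names.

```bash
#!/usr/bin/env bash
set -euo pipefail

# timed LABEL CMD...  (hypothetical helper)
# Runs CMD and appends "LABEL<TAB>elapsed_seconds<TAB>exit_status" to $DRILL_LOG.
timed() {
  local label=$1 start status=0
  shift
  start=$SECONDS
  "$@" || status=$?
  printf '%s\t%s\t%s\n' "$label" "$((SECONDS - start))" "$status" \
    >> "${DRILL_LOG:-drill-times.tsv}"
  return "$status"
}

# drill_total_seconds  (hypothetical helper)
# Sums the elapsed-seconds column of the drill log.
drill_total_seconds() {
  awk -F'\t' '{ s += $2 } END { print s + 0 }' "${DRILL_LOG:-drill-times.tsv}"
}
```

Example: `timed phase-6-mongo sh -c 'docker exec -i nickapp-mongodb mongorestore --archive --gzip --drop < backup.gz'`, then compare `drill_total_seconds` against the 3600-second MongoDB RTO from section 1.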