Document telegram-native task 5 foundation
09 - Audits/Task 5.9 QA Rollout Analytics and Launch Runbooks.md
@@ -0,0 +1,156 @@
---
title: Task 5.9 QA, Rollout, Analytics, and Launch Runbooks
tags: [taskmaster, telegram, qa, rollout, analytics, runbook]
created: 2026-05-24
status: draft
---

# Task 5.9 QA, Rollout, Analytics, and Launch Runbooks

Source: `/.taskmaster/docs/prd-telegram-native-app-bot-wallet.md`

## 1) QA scope for launch readiness

### 1.1 Client matrix (required)

- Telegram iOS
- Telegram Android
- Telegram Desktop
- Telegram Web
- Light / dark themes
- Compact / fullscreen modes
- Normal and slow network
- Blocked bot scenario
- Expired / stale session scenario
- Payment cancellation and abort
- Unlinked user and re-link path

### 1.2 Functional QA checklist

1. Identity and linking
2. Request listing/detail in both bot and Mini App
3. Offer review flow
4. Payment initiation and cancel path
5. Delivery evidence upload
6. Dispute open/respond and status progression
7. Notification quiet/error state
8. Error and blocked-bot behavior
9. Support escalation handoff

### 1.3 Security/abuse QA

- forged/invalid `initData` rejection
- callback replayed twice: one success, one no-op
- deep-link tampering
- wallet proof mismatch
- callback processing under invalid provider secrets
- admin override behavior and audit event capture
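
For the forged `initData` case, a QA fixture could check the `hash` field the way Telegram documents Mini App validation: HMAC-SHA256 over the sorted `key=value` fields, keyed with a secret derived from the bot token. The sketch below is illustrative only; the function name `is_valid_init_data` is an assumption, and a real check would likely also reject an old `auth_date` to cover the stale-session scenario.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl


def is_valid_init_data(init_data: str, bot_token: str) -> bool:
    """Return True only if the `hash` field matches the HMAC of the other fields."""
    fields = dict(parse_qsl(init_data, keep_blank_values=True))
    received_hash = fields.pop("hash", None)
    if not received_hash:
        return False
    # Data-check string: remaining key=value pairs, sorted by key, newline-joined.
    data_check_string = "\n".join(f"{k}={v}" for k, v in sorted(fields.items()))
    # The secret key is HMAC-SHA256 of the bot token, keyed with "WebAppData".
    secret_key = hmac.new(b"WebAppData", bot_token.encode(), hashlib.sha256).digest()
    expected = hmac.new(secret_key, data_check_string.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks and non-ASCII errors.
    return hmac.compare_digest(expected.encode(), received_hash.encode())
```

A forged payload, a payload signed with another environment's token, and a payload with no `hash` at all should each return `False`.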

## 2) Environments and rollout

### 2.1 Environment separation

- `telegram-dev-bot` and `telegram-prod-bot` tokens and webhook endpoints must be distinct.
- No shared webhook secret between environments.
- QA and production payment fixtures remain isolated.
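
The separation rules above lend themselves to an automated pre-deploy check. A minimal sketch, assuming each environment's settings are loaded into a simple record; the `BotEnv` type and its field names are hypothetical and not taken from the PRD:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BotEnv:
    # Hypothetical shape of one environment's Telegram settings.
    name: str
    bot_token: str
    webhook_url: str
    webhook_secret: str


def shared_settings(a: BotEnv, b: BotEnv) -> list[str]:
    """List the settings that two environments wrongly share."""
    checked = ("bot_token", "webhook_url", "webhook_secret")
    return [field for field in checked if getattr(a, field) == getattr(b, field)]
```

An empty list means the two environments are separated on all three settings; any returned field name should block the deploy.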

### 2.2 Feature flag sequence

1. **Development flag off**: no surface exposed
2. **Internal allowlist**: selected users only (buyer/seller/admin)
3. **Beta cohort**: controlled percentage and fixed org list
4. **Production enablement**: after runbook and KPI thresholds pass
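
The four stages can be expressed as a single gating function. This is a sketch under stated assumptions: the `Stage` enum, the parameter names, and the hash-based percentage bucket are illustrative, and the beta stage is read here as "org on the fixed list, or user inside the percentage bucket", which the source does not spell out.

```python
import hashlib
from enum import Enum


class Stage(Enum):
    OFF = 1
    INTERNAL_ALLOWLIST = 2
    BETA_COHORT = 3
    PRODUCTION = 4


def surface_enabled(stage: Stage, user_id: str, org_id: str,
                    allowlist: set, beta_orgs: set, beta_percent: int) -> bool:
    """Decide whether the Telegram surface is exposed to this user."""
    if stage is Stage.OFF:
        return False
    if stage is Stage.PRODUCTION:
        return True
    if user_id in allowlist:
        return True  # allowlisted users keep access in both middle stages
    if stage is Stage.INTERNAL_ALLOWLIST:
        return False
    # Beta cohort (assumed semantics): fixed org list OR stable percentage bucket.
    if org_id in beta_orgs:
        return True
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < beta_percent
```

Hashing the user ID keeps each user in the same bucket across requests, so raising `beta_percent` only ever adds users to the cohort.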

### 2.3 Deployment safety

- If the new surface increases payment mismatches or callback failures, immediately switch off `TELEGRAM_SURFACE_ENABLED` and keep providers in read-only mode.
- Use the existing rollback flow from the incident operations and deployment runbooks.

## 3) Analytics and launch KPIs

Track these metrics daily for 14 days after stage advancement:

- activation rate (`activatedTelegramUsers / startedTelegramUsers`)
- link completion rate (`linkedUsers / startedLink`)
- request creation from Telegram (`telegramRequestsCreated`)
- offer response completion (`offerResponses / offersOpened`)
- payments started / completed / failed (`telegramPaymentStart`, `telegramPaymentComplete`, `telegramPaymentFail`)
- dispute activity (`disputesOpened`, `disputesResolvedInTelegram`)
- release approvals from Telegram context (`telegramReleaseApprovals`)
- notification opt-outs (`notificationsOptOutRate`)
- callback duplicate ratio (`callbackReplay / callbackTotal`)
- average context resume latency (min and p95)
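
The ratio metrics above can be derived from raw daily counters. The sketch below assumes the counters arrive as a dictionary keyed by the names in the list; the output keys, the helper names, and the nearest-rank percentile method are illustrative choices, not something the PRD specifies.

```python
def ratio(numerator: int, denominator: int):
    """Return numerator/denominator, or None while the denominator is still zero."""
    return None if denominator == 0 else numerator / denominator


def p95(samples: list):
    """Nearest-rank 95th percentile; None for an empty sample."""
    if not samples:
        return None
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def daily_kpis(counters: dict, resume_latencies_ms: list) -> dict:
    """Compute the ratio KPIs and the latency tail for one day."""
    return {
        "activationRate": ratio(counters["activatedTelegramUsers"],
                                counters["startedTelegramUsers"]),
        "linkCompletionRate": ratio(counters["linkedUsers"], counters["startedLink"]),
        "offerResponseCompletion": ratio(counters["offerResponses"],
                                         counters["offersOpened"]),
        "callbackDuplicateRatio": ratio(counters["callbackReplay"],
                                        counters["callbackTotal"]),
        "resumeLatencyP95Ms": p95(resume_latencies_ms),
    }
```

Returning `None` rather than `0` for an empty denominator keeps a day with no traffic from looking like a day with a 0% rate.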

### Reporting destinations

- Sentry for exception and failure spikes
- application logs for workflow events
- existing monitoring dashboards for rate/latency anomalies

## 4) Launch runbooks

All runbooks are mandatory for Stage-1 rollout and post-launch incidents.

### 4.1 Bot outage

1. Validate webhook endpoint response health.
2. Switch status to notification-only mode where possible.
3. Confirm bot token and webhook URL.
4. Re-route urgent flows to web fallback.
5. Restore Telegram webhook + replay backlog after recovery.

### 4.2 Telegram API outage

1. Confirm external Telegram API status.
2. Temporarily disable deep-link / in-app actions that require Telegram callbacks.
3. Notify users of delayed updates.
4. Keep pending payment states in read-only mode until callback channel is restored.

### 4.3 Payment provider outage

1. Identify affected provider via provider mode and provider health flags.
2. Switch to read-only or alternative provider mode where configured.
3. Run reconciliation before re-enabling full writes.
4. Track the age of stale pending payments and trigger the support-contact workflow.

### 4.4 Stuck payment

1. Check payment reconciliation queue and provider status.
2. Verify callback proof and on-chain confirmation.
3. Manually reconcile if allowed by protocol and policy.
4. Escalate if stale > 24h in funded or processing state.
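
The escalation rule in step 4 is simple enough to encode directly. A minimal sketch, assuming each payment record exposes its current state and the timestamp at which it entered that state; the function and constant names are hypothetical:

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=24)
ESCALATABLE_STATES = {"funded", "processing"}


def needs_escalation(state: str, state_entered_at: datetime, now: datetime) -> bool:
    """True when a payment has sat in funded/processing for more than 24h."""
    return state in ESCALATABLE_STATES and now - state_entered_at > STALE_AFTER
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the rule easy to test around the 24h boundary.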

### 4.5 Duplicate callback

1. Validate idempotency path executed correctly.
2. Confirm callback dedupe key retention window.
3. Compare event fingerprint for payload divergence.
4. Mark one path as duplicate no-op and keep audit trail.
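
The checks above map onto a small dedupe component: a key with a retention window plus a payload fingerprint. The following in-memory sketch is an illustration only; the class name, the return labels, and the SHA-256 fingerprint over canonical JSON are assumptions, and a production version would presumably use a shared store rather than a dictionary.

```python
import hashlib
import json


class CallbackDeduper:
    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self._seen = {}  # dedupe key -> (first_seen_at, fingerprint)

    @staticmethod
    def fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the fingerprint.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def classify(self, dedupe_key: str, payload: dict, now: float) -> str:
        # Drop entries that have aged out of the retention window.
        self._seen = {k: v for k, v in self._seen.items()
                      if now - v[0] < self.retention_seconds}
        fp = self.fingerprint(payload)
        previous = self._seen.get(dedupe_key)
        if previous is None:
            self._seen[dedupe_key] = (now, fp)
            return "process"
        # Same key and same payload is a harmless replay; a different payload needs review.
        return "duplicate_noop" if previous[1] == fp else "divergent_payload"
```

A replay that arrives after the retention window is processed again, which is why step 2 asks for the window to be confirmed during an incident.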

### 4.6 Suspicious wallet proof

1. Block automated release/refund for the request.
2. Flag payment and mark for manual ops review.
3. Verify recipient, amount, and tx hash against chain/provider data.
4. Resume only after explicit approval.
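
Step 3 amounts to a field-by-field comparison between what the proof claims and what the chain or provider reports. A hedged sketch, assuming both sides can be normalized into the same record; the `Transfer` type and its field names are hypothetical, and comparison is exact because address casing rules differ between chains.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Transfer:
    # Hypothetical normalized view of a payment, from the proof or from chain data.
    tx_hash: str
    recipient: str
    amount: Decimal


def proof_mismatches(claimed: Transfer, observed: Transfer) -> list[str]:
    """Name every field where the wallet proof disagrees with chain/provider data."""
    fields = ("tx_hash", "recipient", "amount")
    return [f for f in fields if getattr(claimed, f) != getattr(observed, f)]
```

Any non-empty result keeps the payment in manual review; `Decimal` avoids float rounding turning equal amounts into a false mismatch.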

### 4.7 Compromised bot token

1. Rotate bot token immediately.
2. Disable bot endpoints and clear webhook secret for 1 hour.
3. Validate callback signatures with new secret.
4. Resume in staged rollout mode with monitoring for 24h.
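
For step 3, note that when a webhook is registered with a `secret_token`, Telegram sends that value back in the `X-Telegram-Bot-Api-Secret-Token` header on every update. A minimal sketch of the receiving check, assuming headers arrive as a plain dictionary; the function name is hypothetical:

```python
import hmac

SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_trusted_webhook_request(headers: dict, expected_secret: str) -> bool:
    """Accept an update only if it carries the currently configured webhook secret."""
    if not expected_secret:
        return False  # a cleared secret means nothing is trusted
    normalized = {k.lower(): v for k, v in headers.items()}
    received = normalized.get(SECRET_HEADER.lower(), "")
    # Constant-time comparison so the secret cannot be probed byte by byte.
    return hmac.compare_digest(received.encode(), expected_secret.encode())
```

After rotation, requests still carrying the old secret should fail this check, which is a quick way to confirm the new secret is live.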

## 5) Stage exit criteria

- All required QA scenarios pass on iOS/Android/desktop/web.
- No critical webhook/payload mismatch regressions in 24h observation window.
- No unresolved stuck payments older than 24h after manual triage.
- Incident owners can execute all seven runbooks.
- Rollout metrics show non-degrading trend for the first two days.

## 6) Known rollout gaps

1. Fine-grained feature toggles for Telegram in existing observability dashboards are pending.
2. Admin analytics for Telegram-originated releases are schema-dependent and need implementation wiring.
3. Deep-link recovery behavior after prolonged Telegram link expiry still needs UX polishing.