---
title: Task 5.9 QA, Rollout, Analytics, and Launch Runbooks
tags: [taskmaster, telegram, qa, rollout, analytics, runbook]
created: 2026-05-24
status: draft
---

# Task 5.9 QA, Rollout, Analytics, and Launch Runbooks

Source: `/.taskmaster/docs/prd-telegram-native-app-bot-wallet.md`

## 1) QA scope for launch readiness

### 1.1 Client matrix (required)

- Telegram iOS
- Telegram Android
- Telegram Desktop
- Telegram Web
- Light / dark themes
- Compact / fullscreen modes
- Normal and slow network
- Blocked bot scenario
- Expired / stale session scenario
- Payment cancellation and abort
- Unlinked user and re-link path

### 1.2 Functional QA checklist

1. Identity and linking
2. Request listing/detail in both bot and Mini App
3. Offer review flow
4. Payment initiation and cancel path
5. Delivery evidence upload
6. Dispute open/respond and status progression
7. Notification quiet/error state
8. Error and blocked-bot behavior
9. Support escalation handoff

### 1.3 Security/abuse QA

- forged/invalid `initData` rejection
- callback replayed twice: one success, one no-op
- deep-link tampering
- wallet proof mismatch
- callback processing under invalid provider secrets
- admin override behavior and audit event capture

## 2) Environments and rollout

### 2.1 Environment separation

- `telegram-dev-bot` and `telegram-prod-bot` tokens and webhook endpoints must be distinct.
- No shared webhook secret between environments.
- QA and production payment fixtures remain isolated.

### 2.2 Feature flag sequence

1. **Development flag off**: no surface exposed
2. **Internal allowlist**: selected users only (buyer/seller/admin)
3. **Beta cohort**: controlled percentage and fixed org list
4. **Production enablement**: after runbook and KPI thresholds pass

### 2.3 Deployment safety

- If the new surface increases payment mismatches or callback failures, immediately pause `TELEGRAM_SURFACE_ENABLED` and keep providers in read-only mode.
- Use the existing rollback flow from the incident operations and deployment runbooks.

## 3) Analytics and launch KPIs

Track these metrics daily for 14 days after stage advancement:

- activation rate (`activatedTelegramUsers / startedTelegramUsers`)
- link completion rate (`linkedUsers / startedLink`)
- request creation from Telegram (`telegramRequestsCreated`)
- offer response completion (`offerResponses / offersOpened`)
- payment started / payment completed (`telegramPaymentStart`, `telegramPaymentComplete`, `telegramPaymentFail`)
- dispute activity (`disputesOpened`, `disputesResolvedInTelegram`)
- release approvals from Telegram context (`telegramReleaseApprovals`)
- notification opt-outs (`notificationsOptOutRate`)
- callback duplicate ratio (`callbackReplay / callbackTotal`)
- average context resume latency (min and p95)

### Reporting destinations

- Sentry for exception and failure spikes
- application logs for workflow events
- existing monitoring dashboards for rate/latency anomalies

## 4) Launch runbooks

All runbooks are mandatory for Stage-1 rollout and post-launch incidents.

### 4.1 Bot outage

1. Validate webhook endpoint response health.
2. Switch status to notification-only mode where possible.
3. Confirm bot token and webhook URL.
4. Re-route urgent flows to web fallback.
5. Restore the Telegram webhook and replay the backlog after recovery.

### 4.2 Telegram API outage

1. Confirm external Telegram API status.
2. Temporarily disable deep-link / in-app actions that require Telegram callbacks.
3. Notify users of delayed updates.
4. Keep pending payment states in read-only mode until the callback channel is restored.

### 4.3 Payment provider outage

1. Identify the affected provider via provider mode and provider health flags.
2. Switch to read-only or alternative provider mode where configured.
3. Run reconciliation before re-enabling full writes.
4. Track stale pending payment age and contact support workflow.

### 4.4 Stuck payment

1. Check payment reconciliation queue and provider status.
2. Verify callback proof and on-chain confirmation.
3. Manually reconcile if allowed by protocol and policy.
4. Escalate if stale > 24h in funded or processing state.

### 4.5 Duplicate callback

1. Validate that the idempotency path executed correctly.
2. Confirm the callback dedupe key retention window.
3. Compare event fingerprints for payload divergence.
4. Mark one path as a duplicate no-op and keep the audit trail.

### 4.6 Suspicious wallet proof

1. Block automated release/refund for the request.
2. Flag the payment and mark it for manual ops review.
3. Verify recipient, amount, and tx hash against chain/provider data.
4. Resume only after explicit approval.

### 4.7 Compromised bot token

1. Rotate bot token immediately.
2. Disable bot endpoints and clear webhook secret for 1 hour.
3. Validate callback signatures with the new secret.
4. Resume in staged rollout mode with monitoring for 24h.

## 5) Stage exit criteria

- All required QA scenarios pass on iOS/Android/desktop/web.
- No critical webhook/payload mismatch regressions in the 24h observation window.
- No unresolved stuck payments older than 24h after manual triage.
- Incident owners can execute all seven runbooks.
- Rollout metrics show a non-degrading trend for the first two days.

## 6) Known rollout gaps

1. Fine-grained feature toggles for Telegram in existing observability dashboards are pending.
2. Admin analytics for Telegram-originated releases are schema-dependent and need implementation wiring.
3. Deep-link recovery behavior after prolonged Telegram link expiry still needs UX polishing.
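
## Appendix: duplicate-callback handling sketch

The duplicate-callback expectations in 1.3 (one success, one no-op) and runbook 4.5 (dedupe key, retention window, fingerprint comparison, audit trail) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the PRD: the `CallbackDeduper` class, the outcome strings, the in-memory store, and the 24h default retention are assumptions chosen only to make the flow concrete. A real implementation would presumably use a durable store and the project's own event schema.

```python
import hashlib
import json
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SeenCallback:
    fingerprint: str
    first_seen: float


@dataclass
class CallbackDeduper:
    """In-memory stand-in for a durable dedupe store (hypothetical)."""

    retention_seconds: float = 24 * 60 * 60  # assumed window, not from the PRD
    seen: dict = field(default_factory=dict)
    audit_log: list = field(default_factory=list)

    @staticmethod
    def fingerprint(payload: dict) -> str:
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def handle(self, dedupe_key: str, payload: dict, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        # Forget keys that have aged out of the retention window.
        self.seen = {
            key: entry
            for key, entry in self.seen.items()
            if now - entry.first_seen < self.retention_seconds
        }
        fp = self.fingerprint(payload)
        previous = self.seen.get(dedupe_key)
        if previous is None:
            self.seen[dedupe_key] = SeenCallback(fp, now)
            outcome = "processed"
        elif previous.fingerprint == fp:
            outcome = "duplicate_noop"  # same key, same payload: safe no-op
        else:
            outcome = "payload_divergence"  # same key, different payload: review
        # Every delivery is recorded, including no-ops, to keep the audit trail.
        self.audit_log.append(
            {"key": dedupe_key, "fingerprint": fp, "outcome": outcome, "at": now}
        )
        return outcome


deduper = CallbackDeduper(retention_seconds=3600)
payload = {"payment_id": "p-1", "status": "paid"}
print(deduper.handle("cb-1", payload, now=0))         # processed
print(deduper.handle("cb-1", dict(payload), now=10))  # duplicate_noop
print(deduper.handle("cb-1", {"payment_id": "p-1", "status": "refunded"}, now=20))  # payload_divergence
```

In this sketch a replay that arrives after the retention window has elapsed is treated as a new callback, which is why step 2 of runbook 4.5 (confirming the retention window) matters when diagnosing a double-processed event.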