Operations & Deployment Runbook

Operational tasks consolidated from the legacy deployment checklists.

Before You Deploy

  1. Secrets – confirm Wrangler secrets exist for:
    • BASE44_API_URL, BASE44_JWT_SECRET (or app key).
    • DATAFORSEO_LOGIN, DATAFORSEO_PASSWORD.
    • CLICKHOUSE_HOST, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD, CLICKHOUSE_DATABASE.
    • CF_IMAGES_ACCOUNT_ID, CF_IMAGES_API_TOKEN, CF_IMAGES_ACCOUNT_HASH (for icon storage).
    • ZENROWS_API_KEY (for proxied iTunes API calls when blocked).
  2. Reference data – run once per environment:
    • npm run seed:business-types
    • npm run push:dataforseo-categories
  3. Queues & storage – ensure KV namespaces (DFS_*, DATAFORSEO_CATEGORIES), queues (dfs_run, harvest_keywords), and R2 bucket (dfs-raw-payloads) exist in Cloudflare.
  4. React alignment – confirm the React frontend handles the new metadata fields (sources, brand_flag, intents).
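
The secrets in step 1 can be checked mechanically. What follows is a minimal sketch, not part of the project's tooling: missing_secrets is a hypothetical helper that reads a listing on stdin and prints every required name that does not appear in it. It assumes the listing comes from wrangler secret list, and that the Base44 JWT secret is in use rather than the app key, whose name the checklist does not give.

```shell
# Required secret names, copied from step 1 of the checklist.
# Assumption: BASE44_JWT_SECRET is used; swap in the app key name if not.
REQUIRED_SECRETS="
BASE44_API_URL BASE44_JWT_SECRET
DATAFORSEO_LOGIN DATAFORSEO_PASSWORD
CLICKHOUSE_HOST CLICKHOUSE_USER CLICKHOUSE_PASSWORD CLICKHOUSE_DATABASE
CF_IMAGES_ACCOUNT_ID CF_IMAGES_API_TOKEN CF_IMAGES_ACCOUNT_HASH
ZENROWS_API_KEY
"

# Hypothetical helper: reads a secret listing on stdin and prints each
# required name that does not occur in it as a whole word.
missing_secrets() {
  local listing
  local name
  listing="$(cat)"
  for name in $REQUIRED_SECRETS; do
    if ! printf '%s\n' "$listing" | grep -qw -- "$name"; then
      printf '%s\n' "$name"
    fi
  done
}
```

Running wrangler secret list | missing_secrets should print nothing when every secret is present. Repeat it per environment if the worker is deployed to more than one.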

Deploying

wrangler deploy
  • Deploys worker + queue consumer from wrangler.toml.
  • Watch logs with wrangler tail while sending a sample POST /run request to catch runtime errors early.

Post-Deploy Verification

  1. Schema health
    curl https://<worker>/test/clickhouse
    Expect every *_exists flag to be true.
  2. Smoke test workflow
    curl -X POST https://<worker>/run -H 'Content-Type: application/json' -d '{"url":"https://slack.com","project_id":"test"}'
    • Poll GET /run/{id}/status until the status is awaiting_category_confirmation.
    • POST /run/{id}/confirm-categories.
    • Verify status reaches complete and ClickHouse receives inserts.
  3. Diagnostics
    • /admin/business-types responds with Base44 data.
    • /diagnostics/run/{id} shows enrichment + harvest payloads.
  4. Queues
    • Confirm harvest_keywords queue is draining (no stuck messages).
    • Check dead-letter queue if inserts fail.
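
The schema check and the status polling in steps 1 and 2 are easy to script. The helpers below are a sketch under stated assumptions, not the project's own tooling: schema_ok, extract_status and wait_for_status are hypothetical names, the /test/clickhouse body is assumed to carry JSON booleans named *_exists, and the status body is assumed to be JSON with a single "status" string field.

```shell
# Reads the /test/clickhouse body on stdin. Succeeds only if at least one
# *_exists flag is true and none is false (assumes JSON booleans).
schema_ok() {
  local body
  body="$(cat)"
  printf '%s\n' "$body" | grep -Eq '_exists"[[:space:]]*:[[:space:]]*true' || return 1
  if printf '%s\n' "$body" | grep -Eq '_exists"[[:space:]]*:[[:space:]]*false'; then
    return 1
  fi
  return 0
}

# Reads a JSON body on stdin and prints the value of its "status" field.
# Assumption: the body contains exactly one "status" string field.
extract_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# wait_for_status <wanted> <max_attempts> <delay_seconds> <fetch command...>
# Runs the fetch command repeatedly until its output reports the wanted
# status, sleeping between attempts. Returns 1 if attempts run out.
wait_for_status() {
  local wanted="$1"
  local attempts="$2"
  local delay="$3"
  local status=""
  local i=0
  shift 3
  while [ "$i" -lt "$attempts" ]; do
    status="$("$@" | extract_status)"
    if [ "$status" = "$wanted" ]; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "gave up after $attempts attempts; last status: ${status:-unknown}" >&2
  return 1
}
```

With a run id in RUN_ID, a call such as wait_for_status awaiting_category_confirmation 30 5 curl -fsS "https://<worker>/run/$RUN_ID/status" polls every five seconds for up to thirty attempts. The same call with complete as the first argument covers the final check after confirming categories. If jq is installed, it is a sturdier way to read the field than the sed expression used here.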

Monitoring & Troubleshooting

  • DataForSEO quotas – tune DATAFORSEO_LABS_LIMIT and DATAFORSEO_LABS_MAX_REQUESTS in Wrangler vars if you hit rate limits.
  • ClickHouse failures – inspect queue logs; use /test/clickhouse to verify connectivity; replay messages after resolving credentials.
  • Run errors – harvest.errors in GET /run/{id}/status surfaces per-run issues; /diagnostics/run/{id} includes stack traces.
  • Budget enforcement – KV key DFS_BUDGETS tracks quotas; reset via KV CLI if you need to unblock staging tests.
  • Apple rate limiting – if the iTunes API or the HTML scrape returns 403/429, the worker backs off automatically. Check logs for "rate limited" messages.
  • CF Images failures – if icons aren't uploading, verify CF_IMAGES_* secrets are set. Original URLs are used as fallback.
  • iTunes redirect loops – if itms-appss:// URLs appear in logs, ensure requests use desktop Safari user agents (not mobile).

Regular Maintenance

  • Nightly sync or cron to refresh ClickHouse reference tables from Base44 (if not already automated).
  • Periodic review of dfs-raw-payloads R2 bucket; consider retention policy.
  • Rotate secrets quarterly; update Wrangler secrets and redeploy.
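
The quarterly rotation can be wrapped in a small helper. This is a sketch only: rotate_secrets and its DRY_RUN switch are hypothetical and not part of the repository. It relies on wrangler secret put, which prompts for the new value of each named secret, followed by wrangler deploy.

```shell
# rotate_secrets <name>...
# Prompts for a new value for each named secret, then redeploys.
# With DRY_RUN=1 it only prints the commands it would run.
rotate_secrets() {
  local name
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "wrangler secret put $name"
    else
      # Stop before deploying if any secret fails to update.
      wrangler secret put "$name" || return 1
    fi
  done
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "wrangler deploy"
  else
    wrangler deploy
  fi
}
```

For example, DRY_RUN=1 rotate_secrets CLICKHOUSE_PASSWORD ZENROWS_API_KEY prints two wrangler secret put lines and a final wrangler deploy, which is a quick way to review the plan before rotating for real.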

Helpful Scripts

  • npm run seed:business-types – ensures Base44 has canonical business types.
  • npm run push:dataforseo-categories – loads taxonomy into KV.
  • npm run format / npm test (if added) – sanity checks prior to deploy.
  • Architecture overview: docs/architecture.md
  • Backend lifecycle: docs/backend.md
  • Schema & endpoints: docs/reference/schema.md, docs/reference/endpoints.md