
RankFabric Documentation

Welcome to RankFabric, a unified Cloudflare Worker platform that provides integrated SEO intelligence, app store analytics, and marketing data services.


Quick Reference

| I want to... | Go to |
| --- | --- |
| Get started quickly | Getting Started |
| Understand the architecture | Architecture Overview |
| Learn about classification | Classification Pipeline |
| Set up queues | Queue Infrastructure |
| Debug an issue | Troubleshooting |
| Check API endpoints | API Reference |
| Understand the database | Database Schema |
| Deploy to production | Operations Runbook |

System Architecture

Storage Responsibilities:

| Storage | Purpose | Examples |
| --- | --- | --- |
| D1 | Operational data, classifications | domains, urls, keywords, apps, rankings |
| ClickHouse | Analytics, time-series | keyword_snapshots, serp_runs, app_analytics |
| Base44 | Canonical entities | Projects, Categories, Business Types |
| KV | Run state, caching | DFS_RUNS, budgets, category mappings |
| R2 | Raw payloads | HTML snapshots, crawl data |
| Vectorize | ML embeddings | Classification similarity search |

Documentation by Audience

For Developers

New to the codebase? Start here:

| Document | Description |
| --- | --- |
| Architecture Overview | System design, project structure, component roles |
| Backend Lifecycle | Worker phases, queue consumers, request flow |
| API Endpoints | Complete API surface with examples |
| React Integration | Frontend workflows and API consumption |
| Client API Guide | How to build API clients |

For Operators

Running the platform? Essential guides:

| Document | Description |
| --- | --- |
| Operations Runbook | Deployment, monitoring, troubleshooting |
| Cost Tracking | Budget management and cost optimization |
| Crawl Management | Job scheduling, queue management |
| Diagnostics | Debug endpoints and health checks |

For Architects

Understanding the design? Deep dives:

| Document | Description |
| --- | --- |
| Data Architecture | Storage layer responsibilities, data flow |
| System Overview | High-level system design |
| Workflows | Durable Workflows orchestration |
| Architecture Diagrams | Visual system representations |

Core Products

1. Keyword Research

AI-powered website analysis, category detection, and keyword harvesting.

2. SERP Tracking

Daily and on-demand search ranking monitoring with location support.

3. App Store Crawler

Apple App Store and Google Play catalog discovery and rankings.

Marketing DNA analysis and backlink profile classification.

5. Domain Onboarding

Complete domain intelligence gathering with automated classification.


Classification Pipeline

RankFabric classifies domains, URLs, keywords, and backlinks through a multi-stage pipeline designed to balance cost and accuracy.

Classification Documentation

| Document | Description |
| --- | --- |
| Pipeline Master Plan | Implementation status, architecture diagrams |
| Domain Classification | 7-stage domain classification pipeline |
| URL Classification | URL page type classification |
| Keyword Classification | Multi-dimensional keyword intelligence |
| Backlink Classification | Backlink profile analysis |
| Classification Dimensions | Taxonomy and dimension reference |

Key Concepts

  • Early Exit: Stages exit at 70% confidence to minimize costs
  • Self-Learning: High-confidence results feed back to Vectorize
  • Domain-First: URLs wait for domain classification before processing
  • Negative Learning: Corrections improve future classifications
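The Early Exit idea can be sketched in a few lines. This is only an illustration under stated assumptions: the `Stage` and `StageResult` shapes, the stage names, and the sample confidences are hypothetical and not taken from the RankFabric codebase. Only the 70% threshold comes from the list above.

```typescript
// Hypothetical shapes for illustration; not RankFabric's actual types.
interface StageResult {
  label: string;
  confidence: number; // 0..1
}

interface Stage {
  name: string;
  // Returns null when the stage has no opinion about the input.
  classify: (input: string) => StageResult | null;
}

interface PipelineOutcome {
  result: StageResult | null;
  stagesRun: string[];
}

// The 70% early-exit threshold named in Key Concepts.
const EARLY_EXIT_THRESHOLD = 0.7;

// Run stages in order and stop as soon as one is confident enough,
// so the remaining stages are never invoked.
function runPipeline(stages: Stage[], input: string): PipelineOutcome {
  const stagesRun: string[] = [];
  let best: StageResult | null = null;
  for (const stage of stages) {
    stagesRun.push(stage.name);
    const result = stage.classify(input);
    if (result === null) continue;
    if (best === null || result.confidence > best.confidence) best = result;
    if (result.confidence >= EARLY_EXIT_THRESHOLD) break; // early exit
  }
  return { result: best, stagesRun };
}

// Made-up stages: a rule check, a similarity lookup, and a final verifier.
const stages: Stage[] = [
  {
    name: "rules",
    classify: (domain) =>
      domain.endsWith(".gov") ? { label: "government", confidence: 0.95 } : null,
  },
  { name: "similarity", classify: () => ({ label: "ecommerce", confidence: 0.6 }) },
  { name: "verifier", classify: () => ({ label: "ecommerce", confidence: 0.9 }) },
];

const gov = runPipeline(stages, "example.gov");  // exits after "rules"
const shop = runPipeline(stages, "example.com"); // falls through to "verifier"
```

In this sketch the `.gov` input never reaches the later stages, which is the cost saving the list describes.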

Queue Infrastructure

All heavy processing runs through Cloudflare Queues for reliability and observability.

Queue Overview

| Queue | Consumer | Purpose |
| --- | --- | --- |
| rankfabric-tasks | task-consumer | Main work queue (keyword harvest, SERP tracking) |
| domain-classify | domain-classify-consumer | Domain classification pipeline |
| url-classify | url-classify-consumer | URL/backlink classification |
| keyword-classify | keyword-classify-consumer | Keyword classification |
| clickhouse-ingestion | clickhouse-consumer | Batched analytics writes |
| app-details-fetch | app-details-consumer | App metadata enrichment |
| llm-verify | llm-verify-consumer | Low-confidence verification |
| rankfabric-dlq | - | Dead letter queue for failures |

Queue Flow Patterns

  1. User-initiated: HTTP -> KV state -> Queue -> Storage
  2. Scheduled: Cron -> Queue -> Storage
  3. Webhook: DataForSEO callback -> Queue -> Storage
  4. Cascading: Domain queue -> URL queue -> Keyword queue
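As a rough illustration of the user-initiated pattern (HTTP -> KV state -> Queue), the sketch below records run state and then enqueues the work. `KvLike` and `QueueLike` are minimal stand-ins that use only the `put`/`get` and `send` methods of the Workers KV and Queues bindings. The `run:` key prefix, the `RunMessage` shape, and the `startRun` helper are assumptions made for this example, not names from the RankFabric codebase.

```typescript
// Minimal structural stand-ins for the KV and Queue bindings.
interface KvLike {
  put(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | null>;
}

interface QueueLike<T> {
  send(message: T): Promise<void>;
}

// Hypothetical message shape for illustration.
interface RunMessage {
  runId: string;
  url: string;
  projectId: string;
}

// Record run state first, then hand the heavy work to the queue.
// A queue consumer (not shown) would write results to storage.
async function startRun(
  kv: KvLike,
  queue: QueueLike<RunMessage>,
  runId: string,
  url: string,
  projectId: string,
): Promise<RunMessage> {
  const message: RunMessage = { runId, url, projectId };
  await kv.put(`run:${runId}`, JSON.stringify({ status: "queued", url, projectId }));
  await queue.send(message);
  return message;
}

// In-memory fakes so the sketch runs outside the Workers runtime.
class MemoryKv implements KvLike {
  readonly store = new Map<string, string>();
  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
}

class MemoryQueue<T> implements QueueLike<T> {
  readonly messages: T[] = [];
  async send(message: T): Promise<void> {
    this.messages.push(message);
  }
}
```

Writing the state before sending means a status lookup can see the run as soon as the HTTP request returns, even if the consumer has not picked up the message yet.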

Workflow Orchestration

Cloudflare Durable Workflows provide visibility, state management, and automatic retries.

| Workflow | Purpose | Trigger |
| --- | --- | --- |
| AssetOnboardWorkflow | Master orchestrator for all assets | POST /api/assets |
| DomainOnboardWorkflow | Domain intelligence gathering | Asset onboard |
| UrlClassifyWorkflow | 4-stage URL classification | Backlink queue |
| KeywordClassifyWorkflow | 5-stage keyword classification | Keyword queue |
| SerpTrackingWorkflow | SERP position tracking | POST /api/keywords/track |
| AppDetailsWorkflow | App store metadata enrichment | Asset onboard |

See Workflows Documentation for detailed flow diagrams.


Vectorize ML System

RankFabric uses Cloudflare Vectorize for semantic similarity classification.

Vectorize Indexes

| Index | Purpose | Embedding Model |
| --- | --- | --- |
| domain-classifier | Domain type similarity | BGE-base |
| backlink-classifier | URL page type similarity | BGE-base |
| keyword-classifier | Keyword intent/funnel similarity | BGE-base |

How It Works

  1. Training: High-confidence classifications are embedded and stored
  2. Inference: New items are embedded and compared to known examples
  3. Feedback: Corrections improve the model over time
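The training and inference steps above can be approximated with an in-memory nearest-neighbor lookup. This sketch does not call Vectorize or an embedding model: `ExampleIndex`, the 0.9 storage threshold, and the three-dimensional toy vectors are all assumptions chosen to keep the example self-contained. It only shows the general idea of comparing a new embedding against stored high-confidence examples by cosine similarity.

```typescript
interface LabeledExample {
  label: string;
  vector: number[];
}

interface Match {
  label: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// In-memory stand-in for a similarity index (hypothetical, for illustration).
class ExampleIndex {
  private readonly examples: LabeledExample[] = [];
  private readonly minConfidence: number;

  constructor(minConfidence: number) {
    this.minConfidence = minConfidence;
  }

  // "Training": keep only classifications that were confident enough.
  learn(label: string, vector: number[], confidence: number): boolean {
    if (confidence < this.minConfidence) return false;
    this.examples.push({ label, vector });
    return true;
  }

  // "Inference": return the most similar stored example, if any.
  nearest(vector: number[]): Match | null {
    let best: Match | null = null;
    for (const example of this.examples) {
      const score = cosineSimilarity(vector, example.vector);
      if (best === null || score > best.score) best = { label: example.label, score };
    }
    return best;
  }

  get size(): number {
    return this.examples.length;
  }
}

const index = new ExampleIndex(0.9); // assumed threshold, not from the docs
index.learn("ecommerce", [1, 0, 0], 0.95);
index.learn("blog", [0, 1, 0], 0.97);
index.learn("forum", [0, 0, 1], 0.5); // below the threshold, so not stored
const match = index.nearest([0.9, 0.1, 0]);
```

Here `match` resolves to the `ecommerce` example because the query vector points almost entirely along that example's axis.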

See Classification Pipeline for implementation details.


Database Schema

D1 Tables (Operational)

| Table | Purpose |
| --- | --- |
| domains | Domain records with classification |
| urls | URL records with page type classification |
| keywords | Global keyword repository |
| apps | App metadata (Apple/Google) |
| app_category_rankings | App positions in charts |
| brands | Developer/company entities |
| jobs | Job queue tracking |

ClickHouse Tables (Analytics)

| Table | Purpose |
| --- | --- |
| keyword_snapshots | Historical keyword metrics |
| serp_runs | SERP tracking results |
| app_analytics | App ranking history |

See Database Schema Reference for complete documentation.


Integration Guides

| Integration | Description |
| --- | --- |
| React Client | Frontend workflows and API consumption |
| Base44 | Entity management and relationships |
| DataForSEO | API endpoints, limits, credentials |
| ClickHouse | Analytics storage, ingestion queue |

Troubleshooting

Common Issues

| Problem | Solution |
| --- | --- |
| DataForSEO quota exceeded | Check KV DFS_BUDGETS, adjust limits in wrangler.toml |
| ClickHouse connection failed | Verify secrets, check /test/clickhouse endpoint |
| Classification stuck | Check queue status, review DLQ for errors |
| Apple rate limited | Worker auto-backs off; wait and retry |
| Queue not draining | Check consumer logs, verify bindings |
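For callers that need to "wait and retry" on their own side, a capped exponential backoff is one common approach. The sketch below is not the Worker's internal backoff logic, which this page does not describe. The helper names, the one-second base delay, and the 30-second cap are assumptions for illustration.

```typescript
// Delay before each retry: base, 2x base, 4x base, ... capped at capMs.
function backoffDelays(maxRetries: number, baseMs = 1000, capMs = 30000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(capMs, baseMs * 2 ** attempt));
  }
  return delays;
}

// Try the operation once, then once more after each delay.
// The sleep function is injected so the helper can be exercised without real waiting.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  delays: number[],
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  for (let i = 0; ; i++) {
    try {
      return await operation();
    } catch (err) {
      if (i >= delays.length) throw err; // retries exhausted
      await sleep(delays[i]);
    }
  }
}

// A real caller would typically pass a timer-based sleep:
const realSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));
```

A production client would usually add random jitter to each delay so that many callers do not retry at the same moment.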

Debug Endpoints

| Endpoint | Purpose |
| --- | --- |
| /test/clickhouse | Test ClickHouse connectivity |
| /diagnostics/run/{id} | Inspect run state and errors |
| /api/admin/queues/status | Queue health and depths |
| /api/admin/classifier/stats | Classification statistics |
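A small script can poll the debug endpoints listed above in one pass. In this sketch the paths come from the table, while `debugEndpoints`, `checkAll`, and the `Fetcher` type are hypothetical helpers. Whether the admin endpoints require authentication is not stated here, so the sketch sends no credentials and only records HTTP status codes.

```typescript
// Build full URLs for the documented debug endpoints.
function debugEndpoints(baseUrl: string, runId: string): Record<string, string> {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    clickhouse: `${root}/test/clickhouse`,
    run: `${root}/diagnostics/run/${encodeURIComponent(runId)}`,
    queues: `${root}/api/admin/queues/status`,
    classifier: `${root}/api/admin/classifier/stats`,
  };
}

// Anything that resolves to an object with a numeric status works here,
// including the global fetch: (url) => fetch(url).
type Fetcher = (url: string) => Promise<{ status: number }>;

// Request each endpoint in turn and collect the HTTP status per name.
async function checkAll(
  urls: Record<string, string>,
  fetcher: Fetcher,
): Promise<Record<string, number>> {
  const statuses: Record<string, number> = {};
  for (const [name, url] of Object.entries(urls)) {
    const response = await fetcher(url);
    statuses[name] = response.status;
  }
  return statuses;
}

// Placeholder host and run ID for illustration.
const urls = debugEndpoints("https://your-worker.workers.dev/", "run-123");
```

Injecting the fetcher keeps the helper easy to exercise with a stub before pointing it at a deployed Worker.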

See Operations Runbook for detailed troubleshooting.


Internal Documentation

Development notes and implementation plans:

| Document | Description |
| --- | --- |
| D1 Subrequest Audit | Database query optimization |
| Domain Setup | Domain onboarding implementation |
| Workflow Implementation Plan | Workflow development notes |
| Session Notes | Development session notes |

Reference Documentation

| Document | Description |
| --- | --- |
| API Endpoints | Complete API surface |
| API Public | Public API documentation |
| API Internal | Internal/admin API docs |
| Database Schema | D1 and ClickHouse schemas |
| Data Flows | System data flow documentation |
| Diagnostics | Debug endpoints and health checks |

Additional Resources

  • Changelog - Version history and changes
  • Roadmap - Planned features and improvements

Getting Started

Prerequisites

  1. Cloudflare Account with Workers, D1, KV, R2, Queues, and Vectorize access
  2. DataForSEO Account for keyword and SERP data
  3. ClickHouse instance (Cloud recommended)
  4. Base44 for entity management (optional)

Quick Setup

# Clone the repository
git clone <repo-url>
cd rankfabric-edge-worker/packages/api

# Install dependencies
npm install

# Configure secrets
wrangler secret put DATAFORSEO_LOGIN
wrangler secret put DATAFORSEO_PASSWORD
wrangler secret put CLICKHOUSE_HOST
wrangler secret put CLICKHOUSE_USER
wrangler secret put CLICKHOUSE_PASSWORD

# Deploy
wrangler deploy

Verify Deployment

# Test ClickHouse connection
curl https://your-worker.workers.dev/test/clickhouse

# Run smoke test
curl -X POST https://your-worker.workers.dev/run \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","project_id":"test"}'

See Operations Runbook for complete deployment guide.


Getting Help