RankFabric Documentation

Welcome to RankFabric, a unified Cloudflare Worker platform that provides integrated SEO intelligence, app store analytics, and marketing data services.

Quick Reference

I want to...	Go to
Get started quickly	Getting Started
Understand the architecture	Architecture Overview
Learn about classification	Classification Pipeline
Set up queues	Queue Infrastructure
Debug an issue	Troubleshooting
Check API endpoints	API Reference
Understand the database	Database Schema
Deploy to production	Operations Runbook

System Architecture

Storage Responsibilities:

Storage	Purpose	Examples
D1	Operational data, classifications	domains, urls, keywords, apps, rankings
ClickHouse	Analytics, time-series	keyword_snapshots, serp_runs, app_analytics
Base44	Canonical entities	Projects, Categories, Business Types
KV	Run state, caching	DFS_RUNS, budgets, category mappings
R2	Raw payloads	HTML snapshots, crawl data
Vectorize	ML embeddings	Classification similarity search

Documentation by Audience

For Developers

New to the codebase? Start here:

Document	Description
Architecture Overview	System design, project structure, component roles
Backend Lifecycle	Worker phases, queue consumers, request flow
API Endpoints	Complete API surface with examples
React Integration	Frontend workflows and API consumption
Client API Guide	How to build API clients

For Operators

Running the platform? Essential guides:

Document	Description
Operations Runbook	Deployment, monitoring, troubleshooting
Cost Tracking	Budget management and cost optimization
Crawl Management	Job scheduling, queue management
Diagnostics	Debug endpoints and health checks

For Architects

Understanding the design? Deep dives:

Document	Description
Data Architecture	Storage layer responsibilities, data flow
System Overview	High-level system design
Workflows	Durable Workflows orchestration
Architecture Diagrams	Visual system representations

Core Products

1. Keyword Research

AI-powered website analysis, category detection, and keyword harvesting.

Product Guide
Keyword Classification - Multi-dimensional keyword intelligence

2. SERP Tracking

Daily and on-demand search ranking monitoring with location support.

Product Guide
Location-aware tracking with local pack support

3. App Store Crawler

Apple App Store and Google Play catalog discovery and rankings.

Product Guide
Multi-platform category rankings

4. Backlink Intelligence

Marketing DNA analysis and backlink profile classification.

5. Domain Onboarding

Complete domain intelligence gathering with automated classification.

Classification Pipeline

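RankFabric classifies domains, URLs, keywords, and backlinks with a multi-stage pipeline that balances cost against accuracy: a result is accepted as soon as a stage is confident enough, and low-confidence results are sent for LLM verification.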

Classification Documentation

Document	Description
Pipeline Master Plan	Implementation status, architecture diagrams
Domain Classification	7-stage domain classification pipeline
URL Classification	URL page type classification
Keyword Classification	Multi-dimensional keyword intelligence
Backlink Classification	Backlink profile analysis
Classification Dimensions	Taxonomy and dimension reference

Key Concepts

Early Exit: A pipeline stops at the first stage that reaches 70% confidence, so later stages run only when needed and costs stay low
Self-Learning: High-confidence results feed back to Vectorize
Domain-First: URLs wait for domain classification before processing
Negative Learning: Corrections improve future classifications
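The early-exit rule above can be sketched as follows. This is a minimal illustration under stated assumptions, not the platform's implementation: the `Stage` and `PipelineResult` types and the `classifyWithEarlyExit` helper are hypothetical names, and only the 70% threshold comes from the list above.

```typescript
// Hypothetical types for illustration; not RankFabric's actual API.
interface StageResult {
  label: string;
  confidence: number; // 0..1
}

interface Stage {
  name: string;
  // Real stages (rules, Vectorize lookups, LLM calls) would likely be async;
  // they are synchronous here only to keep the sketch short.
  run: (input: string) => StageResult | null;
}

interface PipelineResult extends StageResult {
  stage: string;      // stage that produced the returned result
  stagesRun: number;  // how many stages were executed in total
  earlyExit: boolean; // true if a stage reached the threshold
}

const EARLY_EXIT_CONFIDENCE = 0.7; // threshold from the Key Concepts list

function classifyWithEarlyExit(
  input: string,
  stages: Stage[],
  threshold: number = EARLY_EXIT_CONFIDENCE,
): PipelineResult | null {
  let best: PipelineResult | null = null;
  let stagesRun = 0;
  for (const stage of stages) {
    stagesRun += 1;
    const result = stage.run(input);
    if (result === null) continue; // stage had no opinion
    if (result.confidence >= threshold) {
      // Confident enough: skip all remaining stages.
      return { ...result, stage: stage.name, stagesRun, earlyExit: true };
    }
    if (best === null || result.confidence > best.confidence) {
      best = { ...result, stage: stage.name, stagesRun, earlyExit: false };
    }
  }
  // No stage reached the threshold: return the best low-confidence guess.
  return best === null ? null : { ...best, stagesRun };
}
```

A caller would pass stages ordered from cheapest to most expensive. A result with `earlyExit: false` is a natural candidate for a follow-up check such as the `llm-verify` queue, although the source does not say exactly how that hand-off is triggered.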

Queue Infrastructure

All heavy processing runs through Cloudflare Queues for reliability and observability.

Queue Overview

Queue	Consumer	Purpose
`rankfabric-tasks`	task-consumer	Main work queue (keyword harvest, SERP tracking)
`domain-classify`	domain-classify-consumer	Domain classification pipeline
`url-classify`	url-classify-consumer	URL/backlink classification
`keyword-classify`	keyword-classify-consumer	Keyword classification
`clickhouse-ingestion`	clickhouse-consumer	Batched analytics writes
`app-details-fetch`	app-details-consumer	App metadata enrichment
`llm-verify`	llm-verify-consumer	Low-confidence verification
`rankfabric-dlq`	-	Dead letter queue for failures
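As a rough sketch of how a consumer and the dead letter queue might interact: Cloudflare Queues messages expose `ack()` and `retry()`, and a message that keeps failing past the configured retry limit is moved to the consumer's dead letter queue. The interfaces below are structural stand-ins for the runtime types so the example runs outside a Worker. The `ClassifyTask` payload and `consumeBatch` helper are hypothetical, and it is an assumption that `rankfabric-dlq` is configured as the dead letter queue for each consumer.

```typescript
// Structural subset of the Cloudflare Queues message and batch types.
interface QueueMessage<T> {
  readonly id: string;
  readonly body: T;
  ack(): void;   // mark as processed
  retry(): void; // ask for redelivery
}

interface QueueBatch<T> {
  readonly queue: string;
  readonly messages: readonly QueueMessage<T>[];
}

// Hypothetical payload shape for a classification queue.
interface ClassifyTask {
  domain: string;
}

// Process each message independently so one failure does not
// force the whole batch to be redelivered.
async function consumeBatch<T>(
  batch: QueueBatch<T>,
  handle: (body: T) => Promise<void>,
): Promise<{ acked: number; retried: number }> {
  let acked = 0;
  let retried = 0;
  for (const message of batch.messages) {
    try {
      await handle(message.body);
      message.ack();
      acked += 1;
    } catch {
      // Redeliver; once the retry limit is exhausted the platform
      // routes the message to the configured dead letter queue.
      message.retry();
      retried += 1;
    }
  }
  return { acked, retried };
}
```

Acknowledging per message rather than per batch means a single bad payload ends up in the DLQ on its own instead of dragging healthy messages along with it.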

Queue Flow Patterns

User-initiated: HTTP -> KV state -> Queue -> Storage
Scheduled: Cron -> Queue -> Storage
Webhook: DataForSEO callback -> Queue -> Storage
Cascading: Domain queue -> URL queue -> Keyword queue
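The user-initiated pattern (HTTP -> KV state -> Queue -> Storage) can be sketched like this. Everything named here is an assumption for illustration: the `KvLike` and `QueueLike` interfaces are minimal stand-ins for the KV and Queue bindings, and the `RunTask` shape, the `run:` key prefix, and the status values are hypothetical.

```typescript
// Minimal stand-ins for the KV and Queue bindings so the sketch
// runs outside a Worker. Shapes and names are assumptions.
interface KvLike {
  put(key: string, value: string): Promise<void>;
  get(key: string): Promise<string | null>;
}

interface QueueLike<T> {
  send(body: T): Promise<void>;
}

interface RunTask {
  runId: string;
  url: string;
  projectId: string;
}

interface RunState {
  status: "queued" | "done";
  url: string;
  projectId: string;
}

// HTTP -> KV state -> Queue: record the run first, then enqueue the work,
// so a status lookup never sees an unknown run ID.
async function startRun(
  task: RunTask,
  runs: KvLike,
  tasks: QueueLike<RunTask>,
): Promise<string> {
  const state: RunState = { status: "queued", url: task.url, projectId: task.projectId };
  await runs.put(`run:${task.runId}`, JSON.stringify(state));
  await tasks.send(task);
  return task.runId;
}

// Queue -> Storage: the consumer persists the result, then updates run state.
async function completeRun(
  task: RunTask,
  runs: KvLike,
  store: (task: RunTask) => Promise<void>,
): Promise<void> {
  await store(task);
  const state: RunState = { status: "done", url: task.url, projectId: task.projectId };
  await runs.put(`run:${task.runId}`, JSON.stringify(state));
}

// In-memory fakes for local experimentation.
function memoryKv(): KvLike {
  const data = new Map<string, string>();
  return {
    put: async (key, value) => { data.set(key, value); },
    get: async (key) => data.get(key) ?? null,
  };
}

function memoryQueue<T>(): QueueLike<T> & { messages: T[] } {
  const messages: T[] = [];
  return { messages, send: async (body) => { messages.push(body); } };
}
```

Writing the state before enqueueing is a design choice of this sketch: an endpoint such as `/diagnostics/run/{id}` can then report a run as queued even before a consumer has picked it up.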

Workflow Orchestration

Cloudflare Workflows provide durable execution with visibility, state management, and automatic retries.

Workflow	Purpose	Trigger
AssetOnboardWorkflow	Master orchestrator for all assets	`POST /api/assets`
DomainOnboardWorkflow	Domain intelligence gathering	Asset onboard
UrlClassifyWorkflow	4-stage URL classification	Backlink queue
KeywordClassifyWorkflow	5-stage keyword classification	Keyword queue
SerpTrackingWorkflow	SERP position tracking	`POST /api/keywords/track`
AppDetailsWorkflow	App store metadata enrichment	Asset onboard

See Workflows Documentation for detailed flow diagrams.

Vectorize ML System

RankFabric uses Cloudflare Vectorize for semantic similarity classification.

Vectorize Indexes

Index	Purpose	Embedding Model
`domain-classifier`	Domain type similarity	BGE-base
`backlink-classifier`	URL page type similarity	BGE-base
`keyword-classifier`	Keyword intent/funnel similarity	BGE-base

How It Works

Training: High-confidence classifications are embedded and stored
Inference: New items are embedded and compared to known examples
Feedback: Corrections improve the model over time
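The training and inference steps above amount to a nearest-neighbour lookup over labelled embeddings. The sketch below shows the idea in memory with cosine similarity. It is an illustration only: in the platform the vectors would live in a Vectorize index and be produced by the embedding model, while the `LabeledVector` type, the helper names, and the `minConfidence` parameter are hypothetical.

```typescript
// Hypothetical in-memory stand-in for a similarity index.
interface LabeledVector {
  label: string;
  values: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Inference: return the label of the most similar known example.
function nearestLabel(
  query: number[],
  examples: LabeledVector[],
): { label: string; score: number } | null {
  let best: { label: string; score: number } | null = null;
  for (const example of examples) {
    const score = cosineSimilarity(query, example.values);
    if (best === null || score > best.score) {
      best = { label: example.label, score };
    }
  }
  return best;
}

// Training: keep only classifications at or above the caller's threshold.
// Returns true if the example was stored.
function learn(
  examples: LabeledVector[],
  candidate: LabeledVector,
  confidence: number,
  minConfidence: number,
): boolean {
  if (confidence < minConfidence) return false;
  examples.push(candidate);
  return true;
}
```

The similarity score returned by `nearestLabel` can double as a confidence value for the pipeline's early-exit check, though how the platform maps scores to confidence is not stated in the source.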

See Classification Pipeline for implementation details.

Database Schema

D1 Tables (Operational)

Table	Purpose
`domains`	Domain records with classification
`urls`	URL records with page type classification
`keywords`	Global keyword repository
`apps`	App metadata (Apple/Google)
`app_category_rankings`	App positions in charts
`brands`	Developer/company entities
`jobs`	Job queue tracking

ClickHouse Tables (Analytics)

Table	Purpose
`keyword_snapshots`	Historical keyword metrics
`serp_runs`	SERP tracking results
`app_analytics`	App ranking history

See Database Schema Reference for complete documentation.

Integration Guides

Integration	Description
React Client	Frontend workflows and API consumption
Base44	Entity management and relationships
DataForSEO	API endpoints, limits, credentials
ClickHouse	Analytics storage, ingestion queue

Troubleshooting

Common Issues

Problem	Solution
DataForSEO quota exceeded	Check KV `DFS_BUDGETS`, adjust limits in wrangler.toml
ClickHouse connection failed	Verify secrets, check `/test/clickhouse` endpoint
Classification stuck	Check queue status, review DLQ for errors
Apple rate limited	The Worker backs off automatically; wait and retry
Queue not draining	Check consumer logs, verify bindings
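For the rate-limiting case, the source does not describe the Worker's actual back-off policy. Exponential backoff with a cap is a common choice and is assumed in the sketch below; the function names, base delay, cap, and attempt count are all hypothetical.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Call `fn` until it succeeds or `maxAttempts` is reached,
// sleeping with exponentially growing delays between attempts.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // No sleep after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.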

Debug Endpoints

Endpoint	Purpose
`/test/clickhouse`	Test ClickHouse connectivity
`/diagnostics/run/{id}`	Inspect run state and errors
`/api/admin/queues/status`	Queue health and depths
`/api/admin/classifier/stats`	Classification statistics

See Operations Runbook for detailed troubleshooting.

Internal Documentation

Development notes and implementation plans:

Document	Description
D1 Subrequest Audit	Database query optimization
Domain Setup	Domain onboarding implementation
Workflow Implementation Plan	Workflow development notes
Session Notes	Development session notes

Reference Documentation

Document	Description
API Endpoints	Complete API surface
API Public	Public API documentation
API Internal	Internal/admin API docs
Database Schema	D1 and ClickHouse schemas
Data Flows	System data flow documentation
Diagnostics	Debug endpoints and health checks

Additional Resources

Changelog - Version history and changes
Roadmap - Planned features and improvements

Getting Started

Prerequisites

Cloudflare Account with Workers, D1, KV, R2, Queues, and Vectorize access
DataForSEO Account for keyword and SERP data
ClickHouse instance (Cloud recommended)
Base44 for entity management (optional)

Quick Setup

```bash
# Clone the repository
git clone <repo-url>
cd rankfabric-edge-worker/packages/api

# Install dependencies
npm install

# Configure secrets
wrangler secret put DATAFORSEO_LOGIN
wrangler secret put DATAFORSEO_PASSWORD
wrangler secret put CLICKHOUSE_HOST
wrangler secret put CLICKHOUSE_USER
wrangler secret put CLICKHOUSE_PASSWORD

# Deploy
wrangler deploy
```

Verify Deployment

```bash
# Test ClickHouse connection
curl https://your-worker.workers.dev/test/clickhouse

# Run smoke test
curl -X POST https://your-worker.workers.dev/run \
  -H "Content-Type: application/json" \
  -d '{"url":"https://example.com","project_id":"test"}'
```

See Operations Runbook for the complete deployment guide.

Getting Help

Operational issues? → Runbook
API questions? → API Reference
Data model questions? → Database Schema
Integration questions? → See integrations/ guides
Classification questions? → Pipeline Plan
