Base44 Integration

DEPRECATION NOTICE: Base44 integration is scheduled for removal. The functionality described below will be migrated to D1/ClickHouse.


What's Currently in the Codebase

The following Base44-specific code exists in packages/api/src/:

File | Size | Purpose
lib/base44-client.js | 64KB | Full SDK integration - keyword upserts, category management, bulk operations

Functions in use:

  • upsertKeywordToBase44() - Create/update individual keywords
  • bulkUpsertKeywords() - Batch upsert with retry logic
  • createCategoryIfNotExists() - Category creation
  • getCategoryKeywords() - Retrieve keywords by category
  • updateKeywordMetrics() - Update search volume, CPC, etc.
  • normalizeKeywordText() - Canonical text normalization
  • generateKeywordId() - Deterministic ID generation
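
Global deduplication depends on the last two helpers: the same keyword must always map to the same canonical text and the same ID. The sketch below only illustrates that idea. The function names, the whitespace-collapsing rule, the `kw_` prefix, and the FNV-1a hash are all assumptions made for illustration; they are not taken from lib/base44-client.js.

```javascript
// Hypothetical sketch: NOT the actual helpers in lib/base44-client.js.

// Canonicalize keyword text: trim, lowercase, collapse runs of whitespace (assumed rule).
function normalizeKeywordSketch(text) {
  return String(text).trim().toLowerCase().replace(/\s+/g, ' ');
}

// 32-bit FNV-1a over UTF-16 code units, returned as 8 hex characters.
function fnv1a32(str) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16).padStart(8, '0');
}

// Deterministic ID: the same normalized text always yields the same ID.
function keywordIdSketch(text) {
  return `kw_${fnv1a32(normalizeKeywordSketch(text))}`;
}
```

A 32-bit hash is used here only to keep the example short; at real keyword volumes it would collide, so a wider digest would be the safer choice.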

Environment variables:

  • BASE44_API_URL
  • BASE44_JWT_SECRET
  • BASE44_APP_API_KEY

Data stored in Base44:

  • Canonical keywords (deduplicated globally across all customers)
  • Customer categories (hierarchical, per-customer)
  • Keyword-category relationships (M:M with primary designation)
  • Business types

Legacy Documentation

The following documentation describes the current (soon to be deprecated) Base44 integration.


RankFabric uses Base44 as the canonical entity store for keywords, categories, projects, and their relationships.


Overview

Base44 Role: Centralized entity management and user-facing customization layer

ClickHouse Role: Analytics, time-series metrics, historical snapshots

Separation of Concerns:

  • Base44 = Source of truth for what entities exist and how users have customized them
  • ClickHouse = Source of truth for when/how metrics changed over time
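
To make the split concrete, the following sketch takes one harvested keyword and produces two records: an entity record holding only the latest values (for Base44) and a timestamped metric row (for ClickHouse). The function name and the ClickHouse row field names (`observed_at`, `search_volume`, `cpc`) are assumptions; only the `latest_*` entity fields come from the schema in this document.

```javascript
// Hypothetical sketch of the separation of concerns; ClickHouse column names are assumed.
function splitForStores(kw, observedAt) {
  // Base44: what the entity is, plus its most recent metrics only.
  const entity = {
    text: kw.normalized_keyword,
    normalized_text: kw.normalized_keyword,
    latest_search_volume: kw.search_volume,
    latest_cpc: kw.cpc,
  };
  // ClickHouse: one immutable observation per harvest, keyed by time.
  const metricRow = {
    keyword_text: kw.normalized_keyword,
    observed_at: observedAt,
    search_volume: kw.search_volume,
    cpc: kw.cpc,
  };
  return { entity, metricRow };
}
```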

Configuration

Secrets

wrangler secret put BASE44_API_URL
# Example: https://your-app.base44.app/api/apps/app_123

wrangler secret put BASE44_JWT_SECRET
# Service token or app API key

Client Initialization

import { Base44Client } from './src/lib/base44-client.js';

const client = new Base44Client({
  apiUrl: env.BASE44_API_URL,
  jwtSecret: env.BASE44_JWT_SECRET
});
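
A missing secret tends to surface later as an opaque 401 or a malformed URL, so it can help to check the configuration before constructing the client. The helper below is a hypothetical sketch, not part of the existing client; it assumes only the two settings used in the initialization above are required.

```javascript
// Hypothetical helper: returns the names of required settings that are missing or blank.
function missingBase44Config(env) {
  const required = ['BASE44_API_URL', 'BASE44_JWT_SECRET'];
  return required.filter(
    (name) => typeof env[name] !== 'string' || env[name].trim() === ''
  );
}

// Example usage inside a worker (sketch):
// const missing = missingBase44Config(env);
// if (missing.length > 0) throw new Error(`Missing Base44 config: ${missing.join(', ')}`);
```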

Entity Types

Keyword

Canonical keyword entity with latest metrics and metadata.

Schema:

{
  text: string;                        // Normalized lowercase
  normalized_text: string;             // Same as text
  original_keyword_text: string;       // Preserve original casing
  sources: string[];                   // ['domain_seed', 'harvest_ai']
  primary_intent: string;              // 'commercial' | 'informational' | 'transactional' | 'navigational'
  secondary_intents: string[];         // Other intents with confidence > 0.3
  brand_flag: boolean;                 // True if contains brand/app names
  dataforseo_category_paths: string[]; // ['/Business & Industrial/...']
  latest_search_volume: number;
  latest_competition: number;
  latest_cpc: number;
  latest_trend: string;                // 'up' | 'down' | 'stable'
  updated_at: string;                  // ISO timestamp
}
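
The constraints written as comments in the schema can be checked before a record is sent. The validator below is a minimal sketch under the assumption that the listed intent and trend values are exhaustive; the function name is hypothetical and nothing like it is confirmed to exist in the client.

```javascript
// Hypothetical validator based on the comments in the Keyword schema.
const INTENTS = ['commercial', 'informational', 'transactional', 'navigational'];
const TRENDS = ['up', 'down', 'stable'];

// Returns a list of human-readable problems; an empty list means the record looks valid.
function keywordRecordProblems(record) {
  const problems = [];
  if (typeof record.text !== 'string' || record.text.length === 0) {
    problems.push('text is required');
  } else if (record.text !== record.text.trim().toLowerCase()) {
    problems.push('text must be normalized (trimmed, lowercase)');
  }
  if (record.normalized_text !== undefined && record.normalized_text !== record.text) {
    problems.push('normalized_text must equal text');
  }
  if (record.primary_intent !== undefined && !INTENTS.includes(record.primary_intent)) {
    problems.push(`unknown primary_intent: ${record.primary_intent}`);
  }
  if (record.latest_trend !== undefined && !TRENDS.includes(record.latest_trend)) {
    problems.push(`unknown latest_trend: ${record.latest_trend}`);
  }
  return problems;
}
```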

Worker Usage:

await client.upsertKeyword({
  text: 'project management software',
  normalized_text: 'project management software',
  original_keyword_text: 'Project Management Software',
  sources: ['harvest_ai'],
  primary_intent: 'commercial',
  brand_flag: false,
  latest_search_volume: 12000,
  latest_competition: 0.87,
  latest_cpc: 8.45
});
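
The schema describes `secondary_intents` as the other intents with confidence above 0.3. One way to derive both intent fields from a set of confidence scores is sketched below. The input shape (a map from intent name to confidence) and the function name are assumptions; how the harvest pipeline actually produces intents is not documented here.

```javascript
// Hypothetical sketch: derive primary/secondary intents from confidence scores.
function pickIntents(scores, threshold = 0.3) {
  // Rank intents from most to least confident.
  const ranked = Object.entries(scores).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) {
    return { primary_intent: null, secondary_intents: [] };
  }
  const [primary, ...rest] = ranked;
  return {
    primary_intent: primary[0],
    // Keep only the remaining intents strictly above the threshold.
    secondary_intents: rest
      .filter(([, score]) => score > threshold)
      .map(([name]) => name),
  };
}
```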

Category

User-customizable categories (not DataForSEO taxonomy).

Schema:

{
  id: string;
  name: string;
  slug: string;
  description?: string;
  parent_category_id?: string;
  project_id?: string;             // If project-specific
  dataforseo_category_id?: string; // Link to DataForSEO taxonomy
  created_at: string;
  updated_at: string;
}

Worker Usage:

// Categories are read from Base44, not written by worker
const categories = await client.getCategories({ project_id: 'proj_123' });
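
Because categories are hierarchical through `parent_category_id`, a flat result usually has to be assembled into a tree before it is displayed or traversed. The sketch below assumes `getCategories()` resolves to a flat array of objects matching the Category schema; the function name and the `children` property are hypothetical.

```javascript
// Hypothetical sketch: turn a flat category list into a tree via parent_category_id.
function buildCategoryTree(categories) {
  // Copy each category and give it an empty children array.
  const nodes = new Map(categories.map((c) => [c.id, { ...c, children: [] }]));
  const roots = [];
  for (const node of nodes.values()) {
    const parent = node.parent_category_id
      ? nodes.get(node.parent_category_id)
      : undefined;
    if (parent) {
      parent.children.push(node);
    } else {
      // Top-level category, or its parent is not in the input list.
      roots.push(node);
    }
  }
  return roots;
}
```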

ProjectKeyword (Relationship)

Links keywords to projects with run metadata.

Schema:

{
  project_id: string;
  keyword_text: string;
  run_id: string;
  added_at: string;
  source: string; // 'domain_seed' | 'harvest_ai'
}

Worker Usage:

await client.upsertProjectKeyword({
  project_id: 'proj_123',
  keyword_text: 'project management',
  run_id: 'run_abc123',
  source: 'harvest_ai'
});

KeywordCategory (Relationship)

Links keywords to categories with confidence scores.

Schema:

{
  keyword_text: string;
  category_id: string;
  dataforseo_category_id?: string;
  assignment_confidence: number; // 0.0 - 1.0
  assigned_by: string;           // 'ai' | 'user' | 'system'
  assigned_at: string;
}

Worker Usage:

await client.upsertKeywordCategory({
  keyword_text: 'project management',
  category_id: 'cat_productivity',
  dataforseo_category_id: '12015',
  assignment_confidence: 0.92,
  assigned_by: 'ai'
});

BusinessType

Predefined business type taxonomy.

Schema:

{
  id: string;
  name: string;
  description: string;
  slug: string;
  created_at: string;
}

Seed Data:

  • SaaS Platform
  • E-commerce Store
  • Marketplace
  • Media & Publishing
  • Lead Generation
  • Mobile App
  • Local Business
  • Agency/Services
  • Non-Profit

Worker Usage:

// Business types are seeded once via script
// Worker only reads them
const businessTypes = await client.getBusinessTypes();

Bulk Operations

Bulk Upsert Keywords

await client.bulkUpsertKeywords([
  {
    text: 'keyword 1',
    normalized_text: 'keyword 1',
    latest_search_volume: 1000
  },
  {
    text: 'keyword 2',
    normalized_text: 'keyword 2',
    latest_search_volume: 500
  }
]);

Performance: Keywords are sent in batches (50 per request by default) to avoid overwhelming the Base44 API.
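
The batching and retry behaviour can be pictured roughly as follows. This is a sketch under stated assumptions, not the code in lib/base44-client.js: the function names, the `sendBatch` callback, the attempt limit, and the exponential backoff delays are all hypothetical. Only the default batch size of 50 comes from this document.

```javascript
// Hypothetical sketch of batching with retry; not the actual client implementation.

// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Send batches one at a time, retrying a failed batch with exponential backoff.
async function sendInBatches(
  items,
  sendBatch,
  { batchSize = 50, maxAttempts = 3, baseDelayMs = 200 } = {}
) {
  let sent = 0;
  for (const batch of chunk(items, batchSize)) {
    for (let attempt = 1; ; attempt++) {
      try {
        await sendBatch(batch);
        sent += batch.length;
        break;
      } catch (err) {
        // Give up on this batch (and the whole run) after the last attempt.
        if (attempt >= maxAttempts) throw err;
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  return sent;
}
```

Sending batches sequentially keeps at most one request in flight, which is the conservative choice when the upstream API's capacity is unknown.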


Bulk Create Relationships

await client.bulkCreateRelationships('ProjectKeyword', [
  {
    project_id: 'proj_123',
    keyword_text: 'keyword 1',
    run_id: 'run_abc',
    source: 'harvest_ai'
  },
  {
    project_id: 'proj_123',
    keyword_text: 'keyword 2',
    run_id: 'run_abc',
    source: 'domain_seed'
  }
]);
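
Whether `bulkCreateRelationships` deduplicates its input is not stated here, so a caller may want to collapse duplicates before sending. The helper below is a hypothetical sketch; it assumes a relationship is identified by a caller-chosen set of fields, for example `project_id` plus `keyword_text` for ProjectKeyword.

```javascript
// Hypothetical helper: collapse relationship records that share the same key fields.
function dedupeRelationships(records, keyFields) {
  const byKey = new Map();
  for (const record of records) {
    // JSON-encode the key values so that ['a', 'b c'] and ['a b', 'c'] stay distinct.
    const key = JSON.stringify(keyFields.map((field) => record[field]));
    byKey.set(key, record); // a later record replaces an earlier one with the same key
  }
  return [...byKey.values()];
}

// Example usage (sketch):
// const unique = dedupeRelationships(projectKeywords, ['project_id', 'keyword_text']);
```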

Harvest Pipeline Integration

The worker upserts keywords and relationships during harvest:

// 1. Prepare keywords with latest metrics
const keywordsToUpsert = mergedKeywords.map(kw => ({
  text: kw.normalized_keyword,
  normalized_text: kw.normalized_keyword,
  original_keyword_text: kw.original_keyword_text,
  sources: kw.sources,
  primary_intent: kw.primary_intent,
  secondary_intents: kw.secondary_intents,
  brand_flag: kw.brand_flag,
  dataforseo_category_paths: kw.dataforseo_category_paths,
  latest_search_volume: kw.search_volume,
  latest_competition: kw.competition,
  latest_cpc: kw.cpc,
  latest_trend: kw.trend
}));

// 2. Bulk upsert keywords
await client.bulkUpsertKeywords(keywordsToUpsert);

// 3. Create ProjectKeyword relationships
const projectKeywords = mergedKeywords.map(kw => ({
  project_id: confirmedRun.project_id,
  keyword_text: kw.normalized_keyword,
  run_id: runId,
  source: kw.sources[0]
}));
await client.bulkCreateRelationships('ProjectKeyword', projectKeywords);

// 4. Create KeywordCategory relationships
const keywordCategories = [];
for (const kw of mergedKeywords) {
  for (const catId of kw.category_ids) {
    keywordCategories.push({
      keyword_text: kw.normalized_keyword,
      category_id: catId,
      dataforseo_category_id: kw.dataforseo_category_id,
      // Use ?? rather than || so a legitimate confidence of 0 is not replaced by the default
      assignment_confidence: kw.confidence ?? 0.85,
      assigned_by: 'ai'
    });
  }
}
await client.bulkCreateRelationships('KeywordCategory', keywordCategories);

Error Handling

CORS Errors

Symptom: Access-Control-Allow-Origin errors in React app

Cause: Base44 API gateway configuration (not controllable by worker)

Fix: Contact Base44 support or configure allowed origins in Base44 app settings.


Authentication Failures

Symptom: 401 Unauthorized from Base44 API

Check:

  1. BASE44_JWT_SECRET is set correctly
  2. BASE44_API_URL includes correct app ID
  3. Service token has not expired

Test:

curl -H "Authorization: Bearer $TOKEN" "$BASE44_API_URL/entities/Keyword"
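
If the service token happens to be a JWT, its expiry can be inspected locally before suspecting anything else. The sketch below decodes the payload without verifying the signature, so it is suitable for diagnostics only. It assumes the token carries a standard `exp` claim (seconds since the Unix epoch); an opaque API key simply yields `null`. Both function names are hypothetical.

```javascript
// Hypothetical diagnostic: read the `exp` claim of a JWT without verifying it.
function jwtExpiry(token) {
  const parts = String(token).split('.');
  if (parts.length !== 3) return null; // not JWT-shaped, e.g. an opaque API key
  try {
    // Convert base64url to standard base64 and restore padding.
    const base64 = parts[1].replace(/-/g, '+').replace(/_/g, '/');
    const padded = base64 + '='.repeat((4 - (base64.length % 4)) % 4);
    const payload = JSON.parse(atob(padded));
    return typeof payload.exp === 'number' ? new Date(payload.exp * 1000) : null;
  } catch {
    return null; // payload was not valid base64 or not valid JSON
  }
}

// True only when an expiry is present and is not in the future.
function isExpired(token, now = new Date()) {
  const expiry = jwtExpiry(token);
  return expiry !== null && expiry <= now;
}
```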

Rate Limits

Base44 does not currently enforce rate limits, but bulk operations batch requests to avoid overwhelming the API.

Default batch size: 50 entities per request

Configurable in client:

const client = new Base44Client({
  apiUrl: env.BASE44_API_URL,
  jwtSecret: env.BASE44_JWT_SECRET,
  batchSize: 100 // Increase if Base44 can handle it
});

Schema Setup

Required Entities in Base44

Create these entity types in your Base44 app:

  1. Keyword

    • text (String, required, unique)
    • normalized_text (String)
    • original_keyword_text (String)
    • sources (Array of Strings)
    • primary_intent (String)
    • secondary_intents (Array of Strings)
    • brand_flag (Boolean)
    • dataforseo_category_paths (Array of Strings)
    • latest_search_volume (Number)
    • latest_competition (Number)
    • latest_cpc (Number)
    • latest_trend (String)
  2. Category

    • name (String, required)
    • slug (String, unique)
    • description (Text)
    • parent_category_id (Relationship to Category)
    • project_id (Relationship to Project)
    • dataforseo_category_id (String)
  3. BusinessType

    • name (String, required)
    • slug (String, unique)
    • description (Text)
  4. ProjectKeyword (Relationship)

    • project_id (Relationship to Project)
    • keyword_text (Relationship to Keyword via text field)
    • run_id (String)
    • added_at (DateTime)
    • source (String)
  5. KeywordCategory (Relationship)

    • keyword_text (Relationship to Keyword via text field)
    • category_id (Relationship to Category)
    • dataforseo_category_id (String)
    • assignment_confidence (Number)
    • assigned_by (String)
    • assigned_at (DateTime)

Seeding Business Types

Run once per Base44 app:

BASE44_API_URL="https://your-app.base44.app/api/apps/app_123" \
BASE44_JWT_SECRET="your-service-token" \
BASE44_BUSINESS_TYPE_ENTITY_SLUG="business_type" \
npm run seed:business-types

Script: scripts/seed-business-types.js

Creates 9 standard business types with descriptions.
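
The contents of scripts/seed-business-types.js are not shown here. As a rough illustration of what such a script has to produce, the sketch below turns the nine seed names listed earlier into records with a `name` and a `slug`. The `slugify` rule is an assumption and may not match the slugs the real script generates; descriptions are omitted because they are not documented on this page.

```javascript
// Hypothetical sketch: derive seed records from the documented business type names.
const SEED_NAMES = [
  'SaaS Platform',
  'E-commerce Store',
  'Marketplace',
  'Media & Publishing',
  'Lead Generation',
  'Mobile App',
  'Local Business',
  'Agency/Services',
  'Non-Profit',
];

// Assumed slug rule: lowercase, non-alphanumeric runs become '-', no leading/trailing '-'.
function slugify(name) {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')
    .replace(/^-+|-+$/g, '');
}

const seedRecords = SEED_NAMES.map((name) => ({ name, slug: slugify(name) }));
```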


Diagnostics

Test Connection

curl "https://your-worker.workers.dev/diagnostics/keyword?keyword=test"

This should return the keyword record from Base44, or null if the keyword does not exist.


Verify Keyword Upsert

curl -X POST https://your-worker.workers.dev/run \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com", "project_id": "test_proj"}'

# Wait for harvest to complete, then check Base44
curl -H "Authorization: Bearer $TOKEN" \
  "$BASE44_API_URL/entities/Keyword"

Best Practices

  1. Always normalize keyword text before upserting (lowercase, trim)
  2. Preserve original text in original_keyword_text for display
  3. Use bulk operations for harvest (50+ keywords) to reduce API calls
  4. Handle relationship failures gracefully - keyword upsert may succeed while relationships fail
  5. Don't delete keywords - mark as inactive or remove relationships instead
  6. Version confidence scores - store in relationship metadata, not keyword entity
  7. Sync latest metrics only - historical data goes to ClickHouse, not Base44
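
Practice 4 can be sketched with `Promise.allSettled`, which lets every relationship write finish and reports failures instead of aborting on the first one. The `writeRelationships` helper and the task shape (`name`, `run`) are hypothetical; the commented usage assumes the `client.bulkCreateRelationships` calls shown in the harvest pipeline.

```javascript
// Hypothetical sketch: run relationship writes independently and collect failures.
async function writeRelationships(tasks) {
  // The async wrapper turns synchronous throws into rejections as well.
  const results = await Promise.allSettled(tasks.map(async (task) => task.run()));
  const failed = [];
  results.forEach((result, i) => {
    if (result.status === 'rejected') {
      failed.push({ name: tasks[i].name, error: String(result.reason) });
    }
  });
  return { ok: failed.length === 0, failed };
}

// Example usage after the keyword upsert has succeeded (sketch):
// const outcome = await writeRelationships([
//   { name: 'ProjectKeyword', run: () => client.bulkCreateRelationships('ProjectKeyword', projectKeywords) },
//   { name: 'KeywordCategory', run: () => client.bulkCreateRelationships('KeywordCategory', keywordCategories) },
// ]);
// if (!outcome.ok) console.warn('Relationship writes failed:', outcome.failed);
```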

Webhook Integration (Optional)

If Base44 supports webhooks, they can be configured for real-time entity updates:

wrangler secret put BASE44_WEBHOOK_URL
wrangler secret put BASE44_WEBHOOK_SECRET

Webhook handling is not currently implemented, but an endpoint stub exists at POST /webhooks/base44.