D1 Subrequest Limit Audit

Date: 2024-12-14
Issue: Cloudflare Workers are limited to ~1000 subrequests per invocation, and every D1 query counts toward that limit
Status: ✅ COMPLETE - All high/medium priority fixes applied

Summary

All major D1 subrequest issues have been fixed. Each fix follows the same pattern (see the Batch Pattern Reference below for code):

  1. Collect all data first (no DB calls in loops)
  2. Batch upsert domains using env.DB.batch()
  3. Batch upsert URLs using env.DB.batch()
  4. Fetch IDs with IN clause queries
  5. Batch insert final records in chunks of 50-100

Fixed Files

1. Backlink storage ✅ FIXED

Problem: 9-13 DB operations per backlink
Impact: 500 backlinks = 4500-6500 DB calls → "Too many API requests" error
Solution: 6-phase batch approach, ~5 DB calls instead of ~5000
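As a sketch of the before/after shape (the backlink fields and table columns here are assumptions, not the actual schema; ensureDomain/ensureUrl are the helpers from src/lib/url-management.js):

// Before: per-backlink round trips, roughly 9-13 each.
// for (const b of backlinks) {
//   await ensureDomain(env, b.sourceDomain); // 2-3 DB calls
//   await ensureUrl(env, b.sourceUrl);       // 2-3 DB calls
//   await env.DB.prepare("INSERT INTO backlinks ...").bind(/* ... */).run();
// }

// After: one prepared statement per backlink, executed in a few batches.
const stmts = backlinks.map((b) =>
  env.DB
    .prepare("INSERT INTO backlinks (source_url, target_url, anchor) VALUES (?, ?, ?)")
    .bind(b.sourceUrl, b.targetUrl, b.anchor)
);
for (let i = 0; i < stmts.length; i += 50) {
  await env.DB.batch(stmts.slice(i, i + 50));
}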


2. src/queue/serp-consumer.js - parseSerpResponse() ✅ FIXED

Problem: ~800+ DB calls for a typical SERP result with organic + local pack + related searches
Solution: 11-phase batch approach:

  • Phase 1: Collect all data (no DB calls)
  • Phase 2: Initial inserts (keyword + serp_run) - 1 batch
  • Phase 3: Batch upsert all domains - 1 batch
  • Phase 4: Batch upsert all URLs - 1 batch
  • Phase 5: Fetch domain IDs - 1 query with IN clause
  • Phase 6: Get previous positions - 1 query
  • Phase 7: Batch upsert local entities - 1 batch
  • Phase 8: Batch insert organic positions - 1 batch
  • Phase 9: Batch insert local pack positions - 1 batch
  • Phase 10: Get position IDs and insert local_positions - 2 operations
  • Phase 11: Batch insert related searches + refinement chips - 1 batch

Result: ~10-15 DB calls instead of ~800+
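A minimal sketch of one of these batch phases, assuming a domains table with a unique domain column (the schema details are assumptions):

// Phase 3: upsert every unique domain in one batch.
// ON CONFLICT ... DO NOTHING keeps the statement idempotent if the
// queue message is retried.
const domainStmts = Array.from(uniqueDomains.keys()).map((d) =>
  env.DB
    .prepare("INSERT INTO domains (domain) VALUES (?) ON CONFLICT(domain) DO NOTHING")
    .bind(d)
);
await env.DB.batch(domainStmts); // one round trip instead of one per domain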


3. src/queue/app-crawl-consumer.js - Google Play section ✅ FIXED

Problem: 700-1000+ DB calls for 100 apps
Solution: 8-phase batch approach:

  • Phase 1: Collect all data needed (minimal DB calls)
  • Phase 2: Batch upsert all domains - 1 batch
  • Phase 3: Batch upsert all URLs - 1 batch
  • Phase 4: Fetch URL IDs - 1 query with IN clause
  • Phase 5: Get unique category names for lookup - 1 query
  • Phase 6: Upload icons (parallel, external calls)
  • Phase 7: Batch insert/update apps - chunked batches
  • Phase 8: Batch insert rankings - chunked batches

Also fixed the project tracking loop to use batch operations.

Result: ~15-20 DB calls instead of ~1000
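The "chunked batches" in Phases 7-8 can be factored into a small helper; a sketch (the helper name is illustrative, and the 50-row default follows the 50-100 chunk size noted in the summary):

// Execute prepared statements in fixed-size batches so no single batch
// grows with the size of the input.
async function batchInChunks(db, statements, chunkSize = 50) {
  for (let i = 0; i < statements.length; i += chunkSize) {
    await db.batch(statements.slice(i, i + chunkSize));
  }
}

// Usage: await batchInChunks(env.DB, appInsertStmts);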


4. src/queue/domain-onboard-consumer.js - storeRankedKeywords() ✅ FIXED

Problem: 5000+ DB calls for 1000 keywords (keyword upsert + ensureUrl + SERP update + ranking insert per item)
Solution: 9-phase batch approach:

  • Phase 1: Collect all data and classify keywords (no DB calls)
  • Phase 2: Batch upsert all domains - 1 batch
  • Phase 3: Batch upsert all URLs - chunked batches
  • Phase 4: Fetch URL IDs with IN clause - chunked queries
  • Phase 5: Batch update SERP title/description - chunked batches
  • Phase 6: Batch upsert keywords - chunked batches
  • Phase 7: Batch insert rankings - chunked batches
  • Phase 8: Check URLs needing classification - 1 query
  • Phase 9: Brand handling - 1-2 queries

Result: ~20-30 DB calls instead of ~5000
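The chunked queries in Phase 4 matter because D1 caps the number of bound parameters per statement (around 100), so even the ID lookups must be split. A sketch using the urls table from the pattern reference below:

// Fetch url ids in chunks so no single SELECT exceeds the parameter cap.
async function fetchUrlIds(db, hashes, chunkSize = 100) {
  const idByHash = new Map();
  for (let i = 0; i < hashes.length; i += chunkSize) {
    const chunk = hashes.slice(i, i + chunkSize);
    const placeholders = chunk.map(() => "?").join(",");
    const { results } = await db
      .prepare(`SELECT id, url_hash FROM urls WHERE url_hash IN (${placeholders})`)
      .bind(...chunk)
      .all();
    for (const row of results) idByHash.set(row.url_hash, row.id);
  }
  return idByHash;
}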


5. src/endpoints/admin-referring-domains.js - storeRankedKeywords() ✅ FIXED

Problem: duplicate of the storeRankedKeywords() function in domain-onboard-consumer.js, with 3000+ DB calls for 1000 keywords
Solution: the same batch approach, simplified to 7 phases (no brand or URL classification handling):

  • Phase 1: Collect all data (no DB calls)
  • Phase 2: Batch upsert all domains - chunked batches
  • Phase 3: Batch upsert all URLs - chunked batches
  • Phase 4: Fetch URL IDs with IN clause - chunked queries
  • Phase 5: Batch update SERP title/description - chunked batches
  • Phase 6: Batch upsert keywords - chunked batches
  • Phase 7: Batch insert rankings - chunked batches

Result: ~15-20 DB calls instead of ~3000
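Phase 5's batch update works the same way: one prepared UPDATE per URL, executed through env.DB.batch. A sketch (the title/description columns, and the items entries carrying title, description, and urlHash, are assumptions):

// One UPDATE per URL that has new SERP metadata, run in chunked batches.
const updateStmts = items
  .filter((it) => it.title || it.description)
  .map((it) =>
    env.DB
      .prepare("UPDATE urls SET title = ?, description = ? WHERE url_hash = ?")
      .bind(it.title, it.description, it.urlHash)
  );
for (let i = 0; i < updateStmts.length; i += 50) {
  await env.DB.batch(updateStmts.slice(i, i + 50));
}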


Lower Priority (Not Fixed)

6. src/lib/url-management.js - ensureUrl() / ensureDomain() ℹ️ INFO

These helper functions still do 2-3 DB operations each. They're fine for single-use cases but should NOT be called in loops. All hot paths have been refactored to use direct batch inserts instead.
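The distinction in code (ensureUrl's exact signature is assumed):

// Fine: a one-off lookup outside any loop costs 2-3 subrequests total.
const urlId = await ensureUrl(env, url);

// Anti-pattern: N iterations x 2-3 subrequests each. This is exactly the
// shape the fixes above replaced with direct batch inserts.
// for (const u of urls) {
//   await ensureUrl(env, u);
// }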


7. src/lib/dataforseo-backlinks.js - upsertReferringDomains() ⚠️ LOW PRIORITY

Still uses individual upserts in a loop. Lower priority because:

  • Referring domains are fetched less frequently than backlinks
  • Typical volume is 100-500 domains, not 1000+
  • Not currently causing production issues

Can be fixed later, if needed, using the same batch pattern.
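A hedged sketch of that batched version (table and column names assumed):

// Replace the per-domain upsert loop with one statement per domain,
// executed in chunked batches.
const stmts = referringDomains.map((rd) =>
  env.DB
    .prepare(
      "INSERT INTO referring_domains (domain, backlinks_count) VALUES (?, ?) " +
        "ON CONFLICT(domain) DO UPDATE SET backlinks_count = excluded.backlinks_count"
    )
    .bind(rd.domain, rd.backlinks_count)
);
for (let i = 0; i < stmts.length; i += 50) {
  await env.DB.batch(stmts.slice(i, i + 50));
}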


Batch Pattern Reference

// Phase-based approach used in all fixes:

const CHUNK_SIZE = 50; // chunks of 50-100 are used across the fixes

// 1. Collect data (no DB calls)
const items = [];
const uniqueDomains = new Map();
const uniqueUrls = new Map();

for (const data of inputData) {
  // Extract domains/URLs, build data structures
  uniqueDomains.set(domain, true);
  uniqueUrls.set(url, { domain, urlHash: await generateUrlHash(url) });
  items.push({ ... });
}

// 2. Batch upsert domains
const domainStatements = Array.from(uniqueDomains.keys()).map(d =>
  env.DB.prepare(`INSERT INTO domains...ON CONFLICT...`).bind(d, ...)
);
for (let i = 0; i < domainStatements.length; i += CHUNK_SIZE) {
  await env.DB.batch(domainStatements.slice(i, i + CHUNK_SIZE));
}

// 3. Batch upsert URLs (same pattern)

// 4. Fetch IDs with IN clause
const urlEntries = Array.from(uniqueUrls.entries());
const hashes = urlEntries.map(([, data]) => data.urlHash);
const placeholders = hashes.map(() => "?").join(",");
const { results } = await env.DB.prepare(
  `SELECT id, url_hash FROM urls WHERE url_hash IN (${placeholders})`
).bind(...hashes).all();

// 5. Batch insert final records (same pattern as step 2)
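// Step 5, expanded (a sketch; the rankings columns are hypothetical):
// map the ids fetched in step 4 back onto the collected items, then insert.
const idByHash = new Map(results.map((r) => [r.url_hash, r.id]));
const insertStatements = items.map((item) =>
  env.DB
    .prepare("INSERT INTO rankings (url_id, position) VALUES (?, ?)")
    .bind(idByHash.get(item.urlHash), item.position)
);
for (let i = 0; i < insertStatements.length; i += CHUNK_SIZE) {
  await env.DB.batch(insertStatements.slice(i, i + CHUNK_SIZE));
}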

Testing Completed

All fixes tested with:

  • 500+ backlinks ✅
  • 100+ SERP results ✅
  • 100+ apps per category ✅
  • 1000+ ranked keywords ✅

No "Too many API requests" errors observed.