GEO Migration Status
What shipped, what broke, and how to finish porting the legacy Firebase/Next.js GEO app to Baseline's FastAPI/PostgreSQL framework.
The GEO product was migrated from Firebase + Next.js (repo: ai-seo-two/geo-butler) to FastAPI + PostgreSQL on Baseline. The migration shipped search, mentions, keywords, reports, and brand reports, but several components arrived broken or incomplete: the DB method names were wrong across every API file (db.query/db.query_one instead of db.fetchall/db.fetchone), schema prefixes were missing on critical queries, report execution was never wired up, and the UI pages assumed a save context that nothing provides.
This report documents every legacy-to-Baseline change, what was intentionally streamlined, what was accidentally lost, and the work remaining to reach feature parity.
What we cleaned up during migration
These changes were deliberate improvements. They should not be ported back from the legacy code.
| Change | Legacy | Baseline |
|---|---|---|
| Auth | Firebase auth tokens, per-request verification | Depends(require_api_key) with X-API-Key header |
| Billing | Token wallet system with per-operation costs, deduct_tokens_for_search() | Removed entirely. Billing handled at subscription level in core/ |
| Database | Firestore (NoSQL, document-based, SERVER_TIMESTAMP) | PostgreSQL with schema-qualified tables (geo.*, shared.*) |
| Serialization | Manual convert_firestore_timestamps(), sanitize_for_firestore() | Native PostgreSQL datetime handling |
| Validation | Manual dict key checking | Pydantic models |
| Error messages | Custom get_user_friendly_error() + recovery suggestions | Standard HTTP exceptions + log_error() |
| CORS | @with_cors wrapper on every endpoint | FastAPI middleware (global) |
| Report v3 dual-write | Wrote to both query_status AND reports/{id}/queries subcollection | Single geo.report_query_status table |
| Async execution | Google Cloud Pub/Sub (production) + daemon threads (dev) | Daemon threads only (Pub/Sub dropped) |
| Template system | Mention profiles were ad-hoc | New brand_report_templates table — scans can be templated and repeated |
| SSE streaming | Client-side polling | New /api/geo/reports/{id}/stream — real-time progress via Server-Sent Events |
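The SSE endpoint's wire format can be sketched independently of the framework. A minimal framing helper (all names here are hypothetical; the real endpoint presumably wraps a generator like this in FastAPI's StreamingResponse with media_type="text/event-stream"):

```python
import json

def sse_frame(event: dict) -> str:
    """Format one progress update as a Server-Sent Events frame.

    SSE frames are plain text: one or more 'data:' lines terminated
    by a blank line. Browsers consume them via EventSource.
    """
    return f"data: {json.dumps(event)}\n\n"

def progress_stream(updates):
    """Yield SSE frames for a sequence of report progress updates,
    ending with a terminal 'completed' event."""
    for update in updates:
        yield sse_frame(update)
    yield sse_frame({"status": "completed"})
```

This replaces client-side polling: the browser holds one open connection and receives each progress update as it happens.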
Bugs introduced during migration
These issues cause runtime crashes. Most have been fixed as of this report; any remaining are noted.
**db.query() and db.query_one() — methods that don't exist on the DB class.** The correct methods are db.fetchall() and db.fetchone(). Every single endpoint would crash with AttributeError. Fixed across: search.py, reports.py, brand_reports.py, mentions.py, analytics.py, keywords.py, ai.py, email.py.
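A minimal sketch of the corrected call pattern, using sqlite3 as a stand-in for PostgreSQL (this DB class is hypothetical and only mirrors the two method names that actually exist on Baseline's helper):

```python
import sqlite3

class DB:
    """Stand-in for Baseline's DB helper. The methods that exist are
    fetchall() and fetchone() -- there is no query() or query_one(),
    which is why the migrated endpoints raised AttributeError."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.row_factory = sqlite3.Row

    def fetchall(self, sql, params=()):
        """Run a query and return every row as a dict."""
        return [dict(r) for r in self.conn.execute(sql, params).fetchall()]

    def fetchone(self, sql, params=()):
        """Run a query and return the first row as a dict, or None."""
        row = self.conn.execute(sql, params).fetchone()
        return dict(row) if row else None

db = DB()
db.conn.execute("CREATE TABLE geo_saves (id INTEGER PRIMARY KEY, name TEXT)")
db.conn.execute("INSERT INTO geo_saves (name) VALUES ('example.com')")

save = db.fetchone("SELECT * FROM geo_saves WHERE id = ?", (1,))
rows = db.fetchall("SELECT * FROM geo_saves")
```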
**db.insert() called with an unsupported returning kwarg.** create_save() in search.py called db.insert(..., returning='id'), but the DB.insert() method only accepts (table, data) and already does RETURNING id internally. This caused a TypeError on every project creation attempt.
**Missing geo. schema prefix on 4 queries in search.py.** The queries used FROM geo_saves instead of FROM geo.geo_saves, so Google, GPT, Perplexity, and Gemini search all crashed when trying to look up the save's primary_site. Only the initial save validation (line 81) had the correct prefix.
**db.update() calls in reports.py — method doesn't exist.** Multiple calls used db.update(), which isn't on the DB class. One of them also set completed_at to the string "NOW()" instead of the SQL function. Replaced with raw parameterized SQL.
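The NOW() bug is worth illustrating, since it fails silently rather than crashing. A sketch with sqlite3 standing in for PostgreSQL (CURRENT_TIMESTAMP substitutes for NOW(); the table shape is hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reports (id INTEGER PRIMARY KEY, status TEXT, completed_at TEXT)"
)
conn.execute("INSERT INTO reports (status) VALUES ('running')")

# Broken pattern: binding "NOW()" as a parameter stores the literal
# string, not a timestamp.
conn.execute("UPDATE reports SET completed_at = ? WHERE id = ?", ("NOW()", 1))
broken = conn.execute("SELECT completed_at FROM reports WHERE id = 1").fetchone()[0]

# Correct pattern: the SQL function belongs in the statement text;
# only real values are bound parameters.
conn.execute(
    "UPDATE reports SET status = ?, completed_at = CURRENT_TIMESTAMP WHERE id = ?",
    ("completed", 1),
)
fixed = conn.execute("SELECT status, completed_at FROM reports WHERE id = 1").fetchone()
```

Parameter binding always produces a value, never SQL; any function call like NOW() has to live in the statement itself.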
**Doubled /geo prefix in page slugs.** The reports pages used the slug "geo/reports/new", but the compiler already adds the /geo prefix, producing /geo/geo/reports/new. Brand report configs were missing slugs entirely, and GEO home used an empty-string slug instead of "home".
**Missing X-API-Key headers on brand report pages.** The other GEO pages send X-API-Key headers on all fetch calls; the brand report pages did not, so all their API calls would fail with 401. Fixed by adding apiKey from localStorage to all fetch headers.
**Files deployed outside version control.** Files were scp'd directly to the live server instead of being merged through the worktree, so brand_reports.py and the brand-reports/ pages appeared as untracked files on main. Two standalone .md files were also created in violation of project rules. Cleaned up and merged properly.
Features that exist in legacy but don't work here yet
**Report execution was never wired up.** reports.py creates report records and report_query_status rows, but nothing executes the searches. The legacy run_template_report_v2() orchestrator, which dispatched all query/engine combinations and called the search functions, was never migrated. Reports sit at "pending" forever. This is the single biggest gap.

Legacy file: report_functions_v2.py — the run_template_report_v2() function.

What to port: a background worker that iterates report_query_status rows, calls the search API per engine, and updates status. Can reuse the existing daemon thread pattern from mentions.
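Reduced to its core loop, the missing orchestrator might look like this (all names and signatures are hypothetical; the real worker would read geo.report_query_status and run inside a daemon thread):

```python
def run_report(report_id, pending_rows, search_fns, update_status):
    """Sketch of the missing run_template_report_v2() port: walk the
    pending query/engine rows, run each search, record the outcome.
    report_id would ultimately be used to mark the parent report done."""
    for row in pending_rows:
        update_status(row["id"], "running")
        try:
            search_fns[row["engine"]](row["query"])
            update_status(row["id"], "completed")
        except Exception:
            update_status(row["id"], "failed")

# Tiny in-memory demonstration with stubbed engines.
statuses = {}
rows = [
    {"id": 1, "engine": "google", "query": "best crm"},
    {"id": 2, "engine": "gpt", "query": "best crm"},
]
engines = {"google": lambda q: ["result"], "gpt": lambda q: ["result"]}
run_report(42, rows, engines, lambda rid, s: statuses.__setitem__(rid, s))
```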
**Brand report scans never start.** brand_reports.py creates the scan record and async operation tracker but never starts the background pipeline thread. Compare with mentions.py, which does threading.Thread(...).start(). The scan sits at "pending" forever.

What to port: the create_brand_scan() endpoint needs to create a mention_profiles record from the template data and start the pipeline thread, same as mentions.py:create_scan().

Options: Redis queue (already available) with a dedicated worker process, or Celery integration.
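The missing thread-start step, mirroring the mentions pattern (run_pipeline here is a hypothetical stand-in for the real 5-stage pipeline function):

```python
import threading

def create_brand_scan(scan_id, run_pipeline):
    """Sketch of the fix: after creating the scan record, actually
    start the pipeline thread -- the step brand_reports.py currently
    skips. Mirrors the working mentions.py:create_scan() pattern."""
    t = threading.Thread(target=run_pipeline, args=(scan_id,), daemon=True)
    t.start()
    return t

# Demonstration with a stub pipeline that just records completion.
results = {}
thread = create_brand_scan(7, lambda sid: results.__setitem__(sid, "completed"))
thread.join(timeout=5)
```

A daemon thread lets the endpoint return immediately while the scan runs; the tradeoff (as with mentions today) is that in-flight scans die if the process restarts, which is what the Redis/Celery options would address.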
**AI insight generation is missing.** The legacy schedule_delayed_ai_generation() triggered AI analysis when reports completed (exponential backoff, 2-256s, 9 attempts). No equivalent exists, so reports complete but have no AI-generated insights.
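A hypothetical reconstruction of that retry schedule; only the stated bounds (2-256s, 9 attempts) come from the source, so the exact legacy formula may differ:

```python
def backoff_delays(attempts: int = 9, base: int = 2, cap: int = 256) -> list[int]:
    """Exponential backoff schedule: delays double from 2s and are
    capped at 256s, over 9 attempts. Assumed reconstruction of the
    legacy schedule_delayed_ai_generation() timing."""
    return [min(base ** n, cap) for n in range(1, attempts + 1)]

delays = backoff_delays()
```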
**DataForSEO auth is broken.** The legacy code built a proper Authorization header (base64-encoded credentials for Basic auth); the migrated version passes the raw config value as the Authorization header. This likely causes 401 errors on DataForSEO calls from the keywords endpoint.
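The legacy behavior is standard HTTP Basic auth, which takes only a few lines to rebuild (function name and credential shape are assumptions):

```python
import base64

def dataforseo_auth_header(login: str, password: str) -> str:
    """Build a Basic auth header: 'Basic ' + base64(login:password).
    Passing a raw, un-encoded config value instead (as the migrated
    keywords code does) gets rejected with 401."""
    token = base64.b64encode(f"{login}:{password}".encode()).decode()
    return f"Basic {token}"

header = dataforseo_auth_header("user@example.com", "secret")
```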
**Duplicate gpt_search() definition.** search.py defines gpt_search() twice; the first (a TODO stub) is silently overwritten by the second. Dead code that should be removed.
Legacy endpoints to Baseline endpoints
| Legacy (Firebase) | Baseline (FastAPI) | Status |
|---|---|---|
| POST /search | POST /api/geo/search/search | Ported |
| POST /new_save | POST /api/geo/search/saves | Ported |
| GET /get_saves | GET /api/geo/search/saves | Ported |
| GET /get_queries | GET /api/geo/search/saves/{id}/queries | Ported |
| create_mention_scan() | POST /api/geo/mentions/scans | Ported |
| get_mention_scans() | GET /api/geo/mentions/scans | Ported |
| get_mention_scan() | GET /api/geo/mentions/scans/{id} | Ported |
| get_ranked_keywords() | POST /api/geo/keywords/ranked | Ported |
| create_template() | POST /api/geo/reports/templates | Ported |
| create_report_v2() | POST /api/geo/reports | Partial — creates records but no execution |
| run_template_report_v2() | Not migrated | Missing — the report orchestrator |
| schedule_delayed_ai_generation() | Not migrated | Missing |
| N/A | GET /api/geo/reports/{id}/stream | New — SSE streaming |
| N/A | POST /api/geo/brand-reports/templates | New — brand templates |
| keyword_functions.py (full) | keywords.py | Ported |
| analytics_functions.py | analytics.py | Partial — aggregation in Python not SQL |
| email_functions.py | email.py | Partial |
| ai_function_tools.py | ai.py | Partial |
| screenshot_functions.py | Not migrated | Missing |
Firestore collections to PostgreSQL tables
| Firestore Collection | PostgreSQL Table | Notes |
|---|---|---|
| saves | geo.geo_saves | Document IDs (string) → serial integer IDs |
| queries | geo.geo_queries | |
| rankings | geo.rankings | JSONB for keywords + metrics |
| mention_scans | geo.mention_scans | |
| mention_profiles | geo.mention_profiles | |
| mention_records (subcollection) | geo.mention_records | FK to mention_scans |
| mention_summaries (subcollection) | geo.mention_summaries | FK to mention_scans |
| reports | geo.reports | |
| query_status | geo.report_query_status | v3 dual-write eliminated |
| templates | geo.report_templates | |
| N/A (ad-hoc profiles) | geo.brand_report_templates | New — migration 020 |
| async_operations | shared.async_operations | |
| wallets | Removed | Billing handled in core |
| N/A | geo.mention_queries | New — tracks search queries per scan |
| N/A | geo.mention_raw_results | New — raw results before classification |
Legacy Next.js routes to Baseline pages
| Legacy Route | Baseline Route | Status |
|---|---|---|
| / (landing page) | /geo/home | Ported |
| /dashboard (save selector + recent) | /geo/dashboard | Ported |
| /options (feature hub) | Merged into /geo/dashboard | Ported |
| /analysis | Not migrated | Missing |
| /reports | /geo/reports/new + /geo/reports/view | Partial — UI exists, execution missing |
| /reports/weekly-check | Not migrated | Missing |
| /reports/rankings | Not migrated | Missing |
| /schemas | Handled by MarkupSchema product | N/A |
| /templates | Integrated into report creation flow | Ported |
Remaining work

1. Port run_template_report_v2() from legacy. This is a background worker that iterates pending report_query_status rows, calls the search API per engine, and updates status. Without this, visibility reports don't work at all. Reuse the daemon thread pattern from mentions.
2. Start the brand report pipeline. The create_brand_scan() endpoint needs to create a mention_profiles record from the template and start the 5-stage pipeline thread. Copy the pattern from mentions.py:create_scan().
3. Fix DataForSEO authentication. Reuse mention_scraper.py, which already handles DataForSEO correctly.
4. Port /analysis, /reports/weekly-check, and /reports/rankings from the legacy app. These are the remaining user-facing features that customers expect.