Baseline Labs April 2026 Internal Engineering

Codebase Health Audit

A sweep of anti-patterns, inefficiencies, and security gaps across the Baseline Labs platform. 34 findings across server infrastructure, product APIs, database migrations, and static assets.

Executive Summary

The platform is functional and shipping, but carries accumulated technical debt across four areas: silent error handling masks bugs in production, SQL construction patterns create injection surface area, duplicated utilities across products diverge in behaviour, and infrastructure configuration has race conditions and missing resource limits.

Most findings are fixable in isolation. The five quick wins at the bottom of this report can be done in under an hour and meaningfully reduce risk.

Severity	Findings	Scope
Critical	4	Security + correctness
High	9	Reliability + data integrity
Medium	15	Efficiency + maintainability
Low	6	Style + minor cleanup
Total	34

Critical

Findings that need immediate attention

Security vulnerabilities, runtime errors, and data integrity risks.

JWT secret falls back to weak default

Security core/shared/errors.py

os.environ.get("JWT_SECRET", "change-me-in-production") silently uses a guessable default if the environment variable is missing. Anyone who reads the source can forge admin tokens. Should raise an error on startup if unset.
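A minimal sketch of the fail-fast behaviour suggested above. The function name load_jwt_secret and the exception class are hypothetical; only the JWT_SECRET variable comes from the finding.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised at startup when a required secret is not configured."""


def load_jwt_secret(env=os.environ):
    # No fallback value: a missing or empty secret stops the process
    # instead of silently signing tokens with a guessable string.
    secret = env.get("JWT_SECRET", "")
    if not secret:
        raise MissingSecretError("JWT_SECRET must be set before the server starts")
    return secret
```

Calling this once during application startup, rather than lazily on the first request, would surface a missing variable at deploy time.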

SQL injection via f-string table names

Security core/api/admin.py

The _count_since() helper interpolates table names directly into SQL: f"SELECT COUNT(*) FROM {table}". While currently called with hardcoded values, the pattern invites future misuse. Should validate against a whitelist.
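One way the validation could look, as a sketch only. The allowlist contents, the created_at column, and the %s placeholder style are assumptions; the real _count_since() signature is not shown in this report.

```python
# Hypothetical allowlist; the tables _count_since() is actually called with
# are not listed in this report.
ALLOWED_TABLES = frozenset(
    {"shared.users", "shared.request_logs", "shared.error_logs"}
)


def count_since_sql(table):
    # Identifiers cannot be bound as query parameters, so the table name is
    # checked against a fixed set before it is interpolated.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"table not allowed: {table!r}")
    # The timestamp stays a bound parameter (assumed column: created_at).
    return f"SELECT COUNT(*) FROM {table} WHERE created_at >= %s"
```

Because the check is an exact membership test, any value outside the set is rejected before it reaches the query string.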

Undefined variable causes NameError

Bug core/api/admin.py

get_requests() and get_errors() reference a variable named product that is neither a parameter nor defined anywhere in scope. Both code paths will raise NameError when called.

Duplicate migration numbering

Data migrations/012_*.sql

Two files share the 012 prefix: 012_geo_tables.sql and 012_schema_separation.sql. Migration runners may skip one, leaving the schema in an inconsistent state.

High Priority

Reliability, performance, and data integrity

Issues that degrade the platform under load or cause data loss.

Bare except: pass across 6+ locations

Reliability

Silent exception swallowing in core/api/admin.py, markupschema/api/inference.py, and core/api/auth.py. Bugs in production become invisible. Every bare except should at minimum call log_error().
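As an illustration of the replacement pattern, the sketch below catches only the expected exception types and records them before falling back. The log_error() here is a stand-in, since the project helper's real signature is not shown in this report, and parse_limit() is a hypothetical example function.

```python
import logging

logger = logging.getLogger("baseline.audit_example")


def log_error(message, exc):
    # Stand-in for the project's log_error(); the real signature may differ.
    logger.error("%s: %s", message, exc, exc_info=exc)


def parse_limit(raw, default=50):
    try:
        return int(raw)
    except (TypeError, ValueError) as exc:
        # Catch only the failures that are expected here, record them,
        # then fall back to a safe default.
        log_error("invalid limit parameter", exc)
        return default
```

Naming the exception types also means an unexpected error (for example a NameError) propagates instead of being hidden.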

N+1 queries in admin profile

Performance core/api/admin.py

The user_profile() endpoint fires 7-8 separate database queries that could be consolidated into 1-2 using PostgreSQL CTEs.

HTTP client resource leak in scraper

Resource Leak core/scraper/worker.py

In flush_backlink_buffer(), if client.post() raises, client.aclose() is never called because it's not in a finally block. Connections leak in error scenarios.
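The fix can be sketched with a try/finally around the request. FakeClient below is a stand-in so the example runs without network access; the real client type used in worker.py is an assumption.

```python
import asyncio


class FakeClient:
    """Stand-in for an async HTTP client; only post() and aclose() matter."""

    def __init__(self, fail=False):
        self.fail = fail
        self.closed = False

    async def post(self, url, json):
        if self.fail:
            raise ConnectionError("simulated network failure")
        return 200

    async def aclose(self):
        self.closed = True


async def flush(client, url, rows):
    try:
        return await client.post(url, json=rows)
    finally:
        # Runs on success and on failure, so the client is always closed.
        await client.aclose()
```

If the client is an httpx.AsyncClient, wrapping its use in "async with httpx.AsyncClient() as client:" gives the same guarantee with less code.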

DB cleanup relies on __del__

Resource Leak core/shared/db.py

Connection return to pool happens in __del__, which is unreliable in Python. Under load this can exhaust the connection pool. Should enforce context manager usage.
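A sketch of context-manager-based cleanup, assuming a pool that exposes getconn() and putconn() (names borrowed from psycopg2's pool classes; the driver actually used in db.py is not stated here). FakePool is a stand-in so the example is self-contained.

```python
from contextlib import contextmanager


class FakePool:
    """Stand-in for a connection pool with getconn()/putconn()."""

    def __init__(self):
        self.checked_out = 0

    def getconn(self):
        self.checked_out += 1
        return object()

    def putconn(self, conn):
        self.checked_out -= 1


@contextmanager
def connection(pool):
    conn = pool.getconn()
    try:
        yield conn
    finally:
        # Returned deterministically when the with-block exits, even on
        # error, rather than whenever the garbage collector runs __del__.
        pool.putconn(conn)
```

Callers would then write "with connection(pool) as conn:" and could no longer hold a connection past the end of the block.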

Unauthenticated batch scrape endpoint

Security markupschema/api/scrape.py

POST /submit-batch has no Depends(require_auth) or Depends(require_api_key). Anyone can spam the scrape queue.

No resource limits on Docker containers

Infrastructure docker-compose.yml

No deploy.resources.limits on any service. A memory leak or runaway process can crash the entire host, taking all services down.

Foreign key references wrong schema

Data migrations/015_geo_remaining_apis.sql

References geo_queries(id) instead of geo.geo_queries(id). After schema separation this FK constraint will fail.

Missing ON DELETE CASCADE on FKs

Data migrations/001_initial_postgres.sql

Foreign keys on api_keys and usage_logs reference users(id) without cascade. Where the constraint is enforced, deleting a user with dependent rows fails; where it is not, the delete succeeds and leaves orphaned records.

TOCTOU race in shared lock

Concurrency scripts/shared-lock.sh

Lock acquisition checks [ -f "$LOCK" ] then writes. Between check and write, another agent can grab the same lock. Needs atomic file creation.
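The idea of atomic file creation, illustrated in Python rather than shell for consistency with the rest of the codebase. The function names are hypothetical; the technique is opening the file with O_CREAT and O_EXCL so the existence check and the creation are one operation.

```python
import os


def try_acquire(lock_path):
    # O_CREAT | O_EXCL makes "check that it does not exist" and "create it"
    # a single atomic step: exactly one caller can succeed.
    try:
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def release(lock_path):
    os.remove(lock_path)
```

In the shell script itself, a comparable effect is commonly obtained with "set -o noclobber" before the redirect, or by using mkdir as the lock, since both fail if the target already exists.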

Medium Priority

Efficiency and maintainability

Patterns that slow development, waste resources, or will become problems at scale.

4 different extract_domain() implementations

Duplication

scrape.py, schema_generator.py, inference.py, and mention_scraper.py each implement domain extraction differently. Edge cases around www. stripping, protocol handling, and error recovery diverge. Should be one function in core/shared/.
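A possible shape for the shared function, as a sketch. The specific choices made here (lowercase the host, strip one leading "www.", accept URLs without a scheme, return None on failure) are assumptions; the four existing implementations would need to be compared before settling on the behaviour.

```python
from urllib.parse import urlsplit


def extract_domain(url):
    """Return the lowercase host without a leading "www.", or None."""
    if not isinstance(url, str) or not url.strip():
        return None
    candidate = url.strip()
    # urlsplit only fills in the host when a scheme or "//" is present.
    if "://" not in candidate and not candidate.startswith("//"):
        candidate = "//" + candidate
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        return None
    if not host:
        return None
    return host[4:] if host.startswith("www.") else host
```

Returning None instead of raising keeps the error-recovery decision in one place; callers that need an exception can raise on None.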

Redis connections created inline everywhere

Duplication

schema_generator.py, scrape.py, geo/search.py, and server.py all create their own Redis connections. Some use singletons, some don't. Should be a shared utility in core/shared/cache.py.

Analytics aggregation done in Python

Performance geo/api/analytics.py

Fetches all events into memory then loops to aggregate. With thousands of events this is slow and memory-hungry. Should use SQL GROUP BY and COUNT.
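To illustrate the difference, the sketch below pushes the counting into the database. sqlite3 stands in for PostgreSQL so the example runs anywhere, and the events table with its columns is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (event_type TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("search", 1), ("search", 2), ("click", 1)],
)

# One aggregated row per event type comes back, instead of every raw event.
rows = conn.execute(
    "SELECT event_type, COUNT(*) FROM events "
    "GROUP BY event_type ORDER BY event_type"
).fetchall()
counts = dict(rows)
```

The result set size now depends on the number of distinct event types rather than the number of events.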

N+1 R2 downloads for schema results

Performance markupschema/api/schema_generator.py

Loops over URLs making individual R2 API calls per URL to check inference status. Should batch or parallelize.
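If the R2 client offers no batch call, the per-URL checks could at least run concurrently. The sketch below uses asyncio.gather with a semaphore to cap concurrency; check_status() is a placeholder for the real lookup, and the limit of 8 is arbitrary.

```python
import asyncio


async def check_status(url):
    # Placeholder for the per-URL R2 lookup; the real call is not shown here.
    await asyncio.sleep(0.01)
    return url, "done"


async def check_all(urls, max_concurrency=8):
    limit = asyncio.Semaphore(max_concurrency)

    async def bounded(url):
        async with limit:
            return await check_status(url)

    # gather() runs the lookups concurrently and preserves input order.
    return await asyncio.gather(*(bounded(u) for u in urls))
```

The semaphore keeps a large URL list from opening an unbounded number of simultaneous requests.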

New HTTP client per scraper fetch

Performance core/scraper/fetcher.py

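Creates a new httpx.AsyncClient per request. Connection setup overhead is higher than reusing a persistent pool. The comment claims it "avoids pool exhaustion", but a fresh client per request opens new connections each time and never reuses them, so it increases connection churn rather than reducing it.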

Duplicate function definition

Dead Code geo/api/search.py

gpt_search() is defined twice. The first definition (a TODO stub) is silently overwritten by the second. Dead code.

Hardcoded "first 3 users are admin"

Fragility core/api/auth.py

is_admin = user_count < 3 is checked on signup. If a user is deleted, the count drops and the next signup gets admin. Should use a role field.

Health check spawns Python process

Infrastructure docker-compose.yml

Docker health check runs python3 -c "import urllib.request; ..." every 5 seconds per container. Each invocation takes 500-800ms. Should use curl or wget.

sys.path pollution during API discovery

Correctness server.py

sys.path.insert(0, ...) adds paths during module loading and never cleans them up. In multi-worker environments this can cause import shadowing.
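One way to scope the path change to the import that needs it, sketched as a context manager. The name temporary_sys_path is hypothetical, and how server.py performs discovery is not shown in this report.

```python
import sys
from contextlib import contextmanager


@contextmanager
def temporary_sys_path(path):
    sys.path.insert(0, path)
    try:
        yield
    finally:
        # Remove the entry added above, even if the import inside failed.
        try:
            sys.path.remove(path)
        except ValueError:
            pass
```

Discovery code would then wrap each import in "with temporary_sys_path(api_dir):" so that sys.path is restored once the module is loaded.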

No graceful shutdown for background tasks

Reliability server.py

GPU watchdog and reaper tasks are cancelled without being awaited. Pending database writes or locks may be abandoned mid-operation.
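A sketch of cancel-then-await shutdown with asyncio. The watchdog() coroutine is a stand-in for the real background tasks; its finally block represents whatever cleanup they need to finish.

```python
import asyncio

cleaned_up = []


async def watchdog(name):
    try:
        while True:
            await asyncio.sleep(3600)
    finally:
        # Stand-in for releasing locks or finishing a pending write.
        cleaned_up.append(name)


async def shutdown(tasks):
    for task in tasks:
        task.cancel()
    # Awaiting lets each task run its cleanup before the process exits;
    # return_exceptions=True keeps CancelledError from propagating here.
    await asyncio.gather(*tasks, return_exceptions=True)


async def main():
    tasks = [asyncio.create_task(watchdog(n)) for n in ("gpu_watchdog", "reaper")]
    await asyncio.sleep(0)  # let both tasks start running
    await shutdown(tasks)
    return all(t.cancelled() for t in tasks)
```

Without the gather() call, the event loop may close before the finally blocks have run.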

Unbounded scraper buffers

Memory core/scraper/worker.py

_link_buffer and _backlink_buffer are module-level lists with no max size. If flush fails repeatedly, memory grows without bound in the long-running worker.

No pagination on admin list endpoints

Scalability core/api/admin.py

SELECT * FROM shared.users ORDER BY id with no LIMIT. Will return increasingly large result sets as the user base grows.
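A sketch of keyset pagination for the list endpoint. sqlite3 stands in for PostgreSQL, and the function name, the email column, and the page-size cap are assumptions.

```python
import sqlite3

MAX_PAGE_SIZE = 100  # hypothetical cap


def fetch_users(conn, after_id=0, limit=50):
    # Clamp the page size so a caller cannot request an unbounded result set.
    limit = max(1, min(limit, MAX_PAGE_SIZE))
    return conn.execute(
        "SELECT id, email FROM users WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    ).fetchall()
```

The client passes the last id it received as after_id to get the next page, which stays fast on large tables because it uses the primary key rather than an OFFSET scan.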

Missing indexes on log tables

Performance migrations/

Heavy filtering on shared.error_logs and shared.request_logs by timestamp, user_id, and service without indexes. Admin dashboard queries will degrade over time.

Hard-coded colours in CSS

Design System static/css/baseline.css

Multiple instances of color: #fff in .btn--primary, .notif-count, .chat-bubble--user, and .chat-send. Violates the design rule to always use var(--c-*) tokens.

Duplicate buffer flush logic

Duplication core/scraper/worker.py

flush_link_buffer() and flush_backlink_buffer() are near-identical implementations. Should be a single parameterized function.

Low Priority

Cleanup and consistency

Disabled migrations left in tree

migrations/016, 017

.sql.disabled files sit alongside their replacements (018, 019). Confusing for anyone reading the migration history.

Inconsistent API response shapes

Some endpoints return raw dicts, others use Pydantic models. Wrapper conventions differ ({"success": true} vs bare data). No standard error response format.

Duplicate CSS .section rule

static/css/baseline.css

.section defined at line ~680 and again at ~2470 with conflicting properties (the second adds animations that override the first).

!important usage indicates specificity issues

static/css/baseline.css

Three uses of !important in responsive breakpoints. Usually a sign of specificity conflicts that should be resolved structurally.

Hardcoded location codes

geo/api/mention_scraper.py, search.py

US location codes (2826, 2840) and language ("en") are hardcoded in function bodies instead of config.

Duplicate compilation in Dockerfile + Compose

Dockerfile, docker-compose.yml

python3 compile.py runs on every container start via both the Dockerfile CMD and the Compose command override. Two sources of truth for the entrypoint.

Quick Wins

High impact, low effort

These can be done in under an hour and meaningfully reduce risk.

Fix	Impact	Effort
Add auth to `/submit-batch`	Closes unauthenticated endpoint	1 line
Remove dead `gpt_search` duplicate	Eliminates confusion	Delete 3 lines
Extract shared `extract_domain()`	Fixes 4 inconsistent implementations	~30 min
Replace bare `except: pass` with logging	Makes production debugging possible	Find and replace
Rename migration 012 collision	Prevents migration runner ambiguity	Rename 1 file

Recommended Priorities

Security hardening

Fix the JWT default, add auth to batch endpoints, validate SQL table names. These are the highest-risk, lowest-effort items.

Error visibility

Replace every bare except: pass with log_error(). Until this is done, production issues are invisible.

Shared utilities extraction

Move extract_domain() and Redis singleton into core/shared/. Reduces duplication across 8+ files and prevents behavioural divergence.

Infrastructure guardrails

Add Docker resource limits, fix the lock race condition, add missing database indexes. Prevents cascading failures under load.
