My first query against 10 million rows in ClickHouse took 4,191 milliseconds. My last took 75. That 55x performance gain came from changing a single line of code, and that one line captures everything I learned about columnar databases.
Working at a company that builds open source OLAP tools means I hear about ClickHouse performance daily. Everyone talks about how much faster it is than traditional databases for analytics. But as a product manager, I've learned there's a difference between hearing about something and actually understanding it.
So I built this benchmark with a specific question in mind: How would an AI agent query these databases during a conversation? I wanted to understand not just that ClickHouse is faster for analytics, but when it matters, why it works, and what the code actually looks like when you're trying to make it perform.
To make the comparison fair, I set up PostgreSQL and ClickHouse on identical Docker containers—each with 4GB RAM and 2 CPUs on my M3 Pro. The dataset was 46 columns of aircraft tracking data, ranging from 10,000 to 10 million records. Every test ran 100 iterations with proper cache warmup to ensure I wasn't measuring random fluctuations.
The workload itself was designed to simulate how an AI agent might progressively explore unfamiliar data: first discovering the schema, then sampling records, followed by time-bucketed aggregations, and finally statistical calculations. This felt more realistic than running isolated queries—it's how conversations with data actually unfold.
You can find the full methodology and reproducible setup in BENCHMARK_METHODOLOGY.md if you want to run this yourself.
Here's what a typical AI-assisted data exploration looks like. Imagine asking Claude: "Help me understand this aircraft tracking dataset."
Claude would naturally progress through these queries:
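A sketch of that progression, written for ClickHouse (table and column names reflect my dataset; the exact benchmark SQL lives in the repo and may differ):

```sql
-- 1. Schema discovery: what columns exist?
--    (PostgreSQL would use information_schema.columns instead)
DESCRIBE TABLE aircraft_tracking;

-- 2. Sampling: what does the data actually look like?
SELECT * FROM aircraft_tracking LIMIT 10;

-- 3. Time-bucketed aggregation: how does volume change over time?
SELECT toStartOfHour(timestamp) AS hour, count() AS positions
FROM aircraft_tracking
GROUP BY hour
ORDER BY hour;

-- 4. Statistical calculation: what are the distributions?
SELECT avg(alt_baro) AS avg_altitude,
       min(alt_baro) AS min_altitude,
       max(alt_baro) AS max_altitude,
       uniq(hex)     AS unique_aircraft
FROM aircraft_tracking;
```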
This progression from schema discovery → sampling → aggregation → statistical analysis mirrors how developers actually explore unfamiliar data with AI assistance.
Note, this is based on the query pattern and hardware of my benchmark, but you can run the benchmark yourself to find your "crossover point".
PostgreSQL vs ClickHouse Crossover Diagram
| Dataset Size | ClickHouse | PostgreSQL | PG+Index | Winner |
|---|---|---|---|---|
| 10K | 27.9ms | 16.6ms | 12.9ms | PG+Idx (2.2x) |
| 50K | 75.5ms | 87.0ms | 60.5ms | ~Equal |
| 1M | 121.1ms | 974.1ms | 1,067ms | CH (8.8x) |
| 10M | 453ms | 12,201ms | 7,595ms | CH (16.8x) |
Times represent the full 4-query analytical workload: schema discovery, data sampling, aggregation, and statistical calculation.
Key insight: PostgreSQL is faster on small datasets (lower overhead). ClickHouse is faster once columnar benefits outweigh setup costs.
You can find the full benchmark results and reproducible data in RESULTS.md if you want to explore the numbers yourself.
This was my biggest "aha!" moment, and it perfectly illustrates why understanding database internals matters. (Turns out I should have RTFMed).
I approached ClickHouse like PostgreSQL. "Let's store data chronologically," I thought:
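My first schema looked something like this (a sketch with a handful of the 46 columns; the full definition is in the repo):

```sql
CREATE TABLE aircraft_tracking
(
    timestamp DateTime,
    hex String,                 -- aircraft ICAO identifier
    alt_baro Int32,             -- barometric altitude
    alt_baro_is_ground UInt8    -- 1 when the aircraft is on the ground
    -- ...plus ~42 more tracking columns
)
ENGINE = MergeTree
ORDER BY (timestamp, hex);      -- chronological, like an OLTP index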
When I ran my analytical query, the performance was nowhere near what I wanted:
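The query was a reconstruction of this shape: aggregate over airborne aircraft only, filtering on the ground flag (not the exact benchmark SQL, but the same pattern):

```sql
-- Positions and average altitude for airborne aircraft
SELECT hex, count() AS positions, avg(alt_baro) AS avg_altitude
FROM aircraft_tracking
WHERE alt_baro_is_ground = 0
GROUP BY hex;
```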
4,191 milliseconds. For 10 million rows, that's painful.
I asked Nico (my mate and our CTO—he's patient with my curiosity) about the slow query performance: "ClickHouse query taking 4+ seconds on 10M rows. Schema and query attached - what's wrong?"
"Your ORDER BY doesn't match your query pattern. You're filtering on whether the aircraft is on the ground but it's not first in your primary key. ClickHouse has to scan the entire dataset to find those planes."
The fix was embarrassingly simple:
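The sort key now leads with the filter column. (A MergeTree sort key can't be freely rewritten for existing key columns, so in practice the table is recreated and reloaded; this is a sketch, not the full 46-column DDL.)

```sql
CREATE TABLE aircraft_tracking
(
    timestamp DateTime,
    hex String,
    alt_baro Int32,
    alt_baro_is_ground UInt8
    -- ...plus the remaining tracking columns
)
ENGINE = MergeTree
ORDER BY (alt_baro_is_ground, hex, timestamp);  -- filter column first
```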
Same query, same data: 75ms. A 55x improvement from changing one line.
Here's the fundamental difference I missed:
PostgreSQL approach (what I did wrong):
ClickHouse reality (what I learned):
Bad ORDER BY (timestamp, hex):
Good ORDER BY (alt_baro_is_ground, hex, timestamp):
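One way to see the difference yourself: recent ClickHouse versions support `EXPLAIN indexes = 1`, which reports how many parts and granules the primary key lets the engine skip for a given filter.

```sql
EXPLAIN indexes = 1
SELECT count()
FROM aircraft_tracking
WHERE alt_baro_is_ground = 0;
```

With the bad key, the granules selected should be close to the total (a full scan); with the good key, only a fraction.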
This revealed a fundamental architectural difference:
PostgreSQL philosophy: "I don't know what queries you'll run, so I'll create multiple indexes to handle various patterns."
ClickHouse philosophy: "Tell me your primary query pattern upfront, and I'll organize everything for maximum speed."
When I tested this with reduced memory (2GB vs 4GB), the ORDER BY choice became even more critical. ClickHouse failed completely during bulk inserts at 2GB - it needs substantial memory to sort and compress data during loading. But once loaded, queries ran efficiently at any memory level.
This wasn't optimization; this was fixing a fundamentally misdesigned OLAP schema.
Nothing taught me more about database architecture than trying to load 10 million records efficiently. What started as a simple task became a deep dive into how these systems handle data differently.
Phase 1: Individual inserts
Projected time: 7.5 hours for 10M records
Phase 2: Basic batching (size: 100)
Result: 80 minutes
Phase 3: Parameter limit optimization
I asked Claude: "What's the optimal PostgreSQL batch size for 46 columns?"
"PostgreSQL's parameter limit is 65,535. You can batch ~1,400 records instead of 100."
Result: 30 minutes
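The arithmetic behind that answer is simple: a single PostgreSQL statement is capped at 65,535 bind parameters, and each row of my table consumes 46 of them.

```python
# PostgreSQL caps a single prepared statement at 65,535 bind parameters,
# so the widest possible multi-row INSERT is limit // columns.
PG_PARAM_LIMIT = 65_535
NUM_COLUMNS = 46

max_batch = PG_PARAM_LIMIT // NUM_COLUMNS
print(max_batch)  # 1424 rows per INSERT
```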
Phase 4: Parallel workers
Result: 18 minutes (25x improvement from Phase 1)
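A minimal sketch of the Phase 4 fan-out, with the actual database write stubbed out (in the real benchmark each worker would hold its own connection and run a multi-row INSERT through a driver such as psycopg2):

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(records, size):
    """Split records into consecutive batches of at most `size` rows."""
    return [records[i:i + size] for i in range(0, len(records), size)]

def insert_batch(batch):
    # Stand-in for a multi-row INSERT over a dedicated connection;
    # returns the number of rows "written".
    return len(batch)

def parallel_load(records, batch_size=1424, workers=4):
    # Fan batches out across workers; each batch stays under the
    # 65,535-parameter ceiling (1424 rows * 46 columns = 65,504).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(insert_batch, chunked(records, batch_size)))
```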
Here's where things got interesting. When I applied the same parallel worker approach to ClickHouse:
Why? ClickHouse's columnar engine is already internally optimized for bulk operations, so adding client-side parallelism created overhead rather than improvement. Again, a cool learning for me, but I probably should have read the docs first: "We recommend inserting data in batches of at least 1,000 rows, and ideally between 10,000–100,000 rows. Fewer, larger inserts reduce the number of parts written, minimize merge load, and lower overall system resource usage."
This experience revealed a fundamental difference in how these systems handle concurrency:
🟦 PostgreSQL (OLTP) Philosophy:
🟨 ClickHouse (OLAP) Philosophy:
The lesson: OLTP databases benefit from client-side parallelism because they're designed for concurrent operations. OLAP databases are already internally parallelized for the operations they're optimized for.
ClickHouse performance depends heavily on using precise types. PostgreSQL is more forgiving.
I started with generic types in both databases:
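The generic starting point looked roughly like this in ClickHouse: every field a String, Int64, or Float64 (illustrative subset of the 46 columns):

```sql
CREATE TABLE aircraft_tracking
(
    timestamp DateTime,
    hex String,
    flight String,
    aircraft_type String,
    alt_baro Int64,
    lat Float64,
    lon Float64,
    alt_baro_is_ground UInt8
    -- ...remaining columns, all String / Int64 / Float64
)
ENGINE = MergeTree
ORDER BY (alt_baro_is_ground, hex, timestamp);
```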
Usually, I wouldn't have gone further; optimizing the types wouldn't have been worth the labor cost. But now I can give Claude Code or Cursor (with almost any LLM) a code sample and a link to the ClickHouse type documentation, and ask it to regenerate the CREATE TABLE statement for efficiency: "How should I optimize ClickHouse types for this aircraft tracking schema?"
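The regenerated schema came back along these lines (illustrative; the exact choices are in the repo):

```sql
CREATE TABLE aircraft_tracking
(
    timestamp DateTime64(3),               -- millisecond precision
    hex FixedString(6),                    -- ICAO hex codes are 6 characters
    flight String,
    aircraft_type LowCardinality(String),  -- a few hundred distinct airframe types
    alt_baro Int32,                        -- altitudes fit comfortably in 32 bits
    lat Float64,
    lon Float64,
    alt_baro_is_ground UInt8
)
ENGINE = MergeTree
ORDER BY (alt_baro_is_ground, hex, timestamp);
```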
Testing with 5M records:
Storage:
Query performance:
ClickHouse columnar storage benefits:
LowCardinality optimization:
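LowCardinality can also be applied after the fact: it dictionary-encodes the column, so each distinct string is stored once and rows hold small integer references.

```sql
ALTER TABLE aircraft_tracking
    MODIFY COLUMN aircraft_type LowCardinality(String);
```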
Result: LLMs remove the expertise barrier for database-specific optimizations without sacrificing performance.
Database-specific functions often perform significantly better than generic SQL. Again, these weren't all ClickHouse functions that I knew beforehand, but they were optimizations that were easily made by a copilot, and explained clearly.
Testing unique aircraft count on 5M records:
PostgreSQL (exact count):
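```sql
SELECT COUNT(DISTINCT hex) AS unique_aircraft
FROM aircraft_tracking;
```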
Documentation: PostgreSQL Aggregate Functions
ClickHouse (approximate count):
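```sql
SELECT uniq(hex) AS unique_aircraft
FROM aircraft_tracking;
```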
Documentation: ClickHouse uniq() Function
PostgreSQL:
ClickHouse:
ClickHouse approximation algorithms:
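ClickHouse actually ships a family of these, trading memory for accuracy; a quick way to compare them on the same column:

```sql
SELECT
    uniqExact(hex)    AS exact_count,  -- exact, highest memory use
    uniq(hex)         AS adaptive,     -- adaptive sampling, small error
    uniqCombined(hex) AS combined,     -- array/hash table + HyperLogLog hybrid
    uniqHLL12(hex)    AS hll           -- pure HyperLogLog
FROM aircraft_tracking;
```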
Result: Native functions provide measurable performance gains, and LLMs make them discoverable without deep database expertise.
OLAP and OLTP databases handle missing data differently by design. I know this is a key philosophical / architectural difference, but I certainly didn't know that 6 days ago.
ClickHouse approach (performance-first):
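In ClickHouse, the idiomatic choice is non-Nullable columns with sentinel defaults, since `Nullable(T)` stores a separate null mask alongside each column. A sketch:

```sql
CREATE TABLE aircraft_tracking
(
    hex FixedString(6),
    alt_baro Int32 DEFAULT 0,    -- missing altitude stored as 0
    flight String DEFAULT ''     -- missing callsign stored as empty string
)
ENGINE = MergeTree
ORDER BY hex;
```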
PostgreSQL approach (semantics-first):
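PostgreSQL simply stores NULLs, and its aggregates skip them, so a missing altitude never drags an average toward zero the way a sentinel 0 would:

```sql
CREATE TABLE aircraft_tracking (
    hex      char(6),
    alt_baro integer,    -- NULL when no altitude report
    flight   text        -- NULL when no callsign
);

-- avg() ignores NULL rows; rows stored as DEFAULT 0 would be included
SELECT avg(alt_baro) FROM aircraft_tracking;
```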
Same dataset, different results:
Performance benefits:
Compression efficiency:
ClickHouse: Sacrifices semantic precision for performance and compression.
PostgreSQL: Preserves semantic meaning at the cost of storage and computational overhead.
Result: Neither approach is wrong - they optimize for different priorities.
Examples from this project:
Database selection traditionally required deep expertise to realize performance benefits. LLMs change this by making database-specific optimizations accessible through guided conversations rather than research projects.
**When to choose what:**

Note, this is based on the query pattern and hardware of my benchmark, but you can run the benchmark yourself to find your "crossover point".
The practical impact: Specialized databases like ClickHouse provide measurable performance advantages that become accessible through LLM-assisted development. The traditional expertise barrier is lowered, making OLAP systems viable for a broader range of projects.
Full code and git history: github.com/514-labs/LLM-query-test