Kafka to ClickHouse connector.
Open Source. Native. Fast. Reliable.
Open-source Kafka to ClickHouse connector with built-in deduplication, configurable batching, and native ClickHouse delivery. No Kafka Connect, no custom consumers. Set up in minutes.

Managed Connectors
The Kafka and ClickHouse connectors are built and maintained by the GlassFlow team.
High Performance
The connectors are built for high throughput and deliver data to ClickHouse natively.
Clean Data
You can deduplicate and join Kafka streams within GlassFlow before ingesting them into ClickHouse. Automatic retries keep your data up to date.
Comparison
See in detail how GlassFlow compares with alternative solutions.

Learn how to stream data from Kafka to ClickHouse using the Kafka table engine, ClickPipes, or Kafka Connect, and understand when to use each.
From Kafka to ClickHouse: Get all details.
How does it work?
Supports multiple Kafka topics and partitions
GlassFlow natively supports consuming from multiple Kafka topics and partitions in parallel, giving you high-throughput, scalable ingestion. It handles partition assignment, offset tracking, and rebalancing automatically behind the scenes. This allows you to build unified pipelines that process data from various sources without manual coordination.
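As a rough illustration of what partition assignment involves, the sketch below spreads topic partitions across a fixed number of workers in round-robin order. It is not GlassFlow's actual implementation. The function name assign_partitions, the topic names, and the worker count are hypothetical and chosen only for the example.

```python
from collections import defaultdict


def assign_partitions(topics, num_workers):
    """Spread (topic, partition) pairs across workers in round-robin order.

    `topics` maps a topic name to its partition count.
    Hypothetical helper for illustration, not part of GlassFlow.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    # Enumerate every partition of every topic in a stable order.
    pairs = [
        (topic, partition)
        for topic in sorted(topics)
        for partition in range(topics[topic])
    ]
    assignment = defaultdict(list)
    for index, pair in enumerate(pairs):
        assignment[index % num_workers].append(pair)
    return dict(assignment)


if __name__ == "__main__":
    plan = assign_partitions({"orders": 3, "payments": 2}, num_workers=2)
    for worker, owned in sorted(plan.items()):
        print(worker, owned)
```

In this simplified picture, a rebalance amounts to recomputing the mapping when a worker joins or leaves, then resuming each partition from its last tracked offset.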


Adjustable waiting times for optimal throughput
GlassFlow lets you configure wait times between batch reads from Kafka, allowing you to control how often data is flushed downstream. By adjusting this interval, you can optimize the trade-off between latency and throughput based on your workload. This flexibility helps maximize performance without overwhelming downstream systems like ClickHouse.
Configurable batch sizes
GlassFlow lets you configure the batch size used when reading and processing data from Kafka. This helps you balance processing efficiency against memory usage and adapt to different workload demands. By tuning batch sizes, you can optimize pipeline throughput and reduce latency based on your system's capacity and performance goals.
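The interplay between wait time and batch size can be sketched as a buffer that flushes when either limit is reached. This is a minimal illustration under assumed names, not GlassFlow's configuration or code: the Batcher class and its max_batch_size and max_wait_seconds parameters are hypothetical.

```python
import time


class Batcher:
    """Collect records and flush when the batch is full or has waited too long."""

    def __init__(self, max_batch_size, max_wait_seconds, sink, clock=time.monotonic):
        self.max_batch_size = max_batch_size
        self.max_wait_seconds = max_wait_seconds
        self.sink = sink      # callable that receives a list of records
        self.clock = clock    # injectable so the timing can be tested
        self.buffer = []
        self.first_record_at = None

    def add(self, record):
        if not self.buffer:
            # The wait timer starts with the first record of a batch.
            self.first_record_at = self.clock()
        self.buffer.append(record)
        self.flush_if_due()

    def flush_if_due(self):
        """Flush when either the size limit or the wait limit is reached."""
        if not self.buffer:
            return
        full = len(self.buffer) >= self.max_batch_size
        waited = self.clock() - self.first_record_at >= self.max_wait_seconds
        if full or waited:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(self.buffer)
            self.buffer = []
            self.first_record_at = None


if __name__ == "__main__":
    batches = []
    batcher = Batcher(max_batch_size=3, max_wait_seconds=5.0, sink=batches.append)
    for event in range(7):
        batcher.add(event)
    batcher.flush()  # deliver whatever is left at shutdown
    print(batches)
```

A larger size limit favors throughput, because each insert carries more rows, while a shorter wait limit favors latency. In a real pipeline flush_if_due would also need to run on a timer so that a quiet stream still flushes.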

Kafka to ClickHouse Performance
GlassFlow sustains throughput beyond 500K events per second in a single pipeline while performing real-time transformations and delivering optimized batches to ClickHouse. This is achieved without requiring additional stream processing frameworks or custom consumer services.
Key performance characteristics:
Batch delivery latency: under 0.12 ms per record, end to end
Backpressure handling: automatic. The ingestor pauses Kafka consumption when the internal buffer fills, resuming when ClickHouse catches up. No events are dropped or lost.
Insert efficiency: configurable batch sizes prevent the "too many parts" error that degrades ClickHouse performance under high-frequency small inserts.
Deduplication window: configurable, up to 7 days. Duplicates from Kafka retries and rebalances are dropped before they reach ClickHouse.
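The backpressure behavior in the list above can be pictured as a bounded buffer that signals the reader to pause when it fills and to resume once enough events have been delivered. The sketch below is an illustration only. The BackpressureBuffer class and its capacity and resume_below parameters are hypothetical and do not describe GlassFlow's internals.

```python
from collections import deque


class BackpressureBuffer:
    """Bounded buffer that tells the reader when to pause and resume."""

    def __init__(self, capacity, resume_below):
        if not 0 < resume_below <= capacity:
            raise ValueError("resume_below must be between 1 and capacity")
        self.capacity = capacity
        self.resume_below = resume_below
        self.items = deque()
        self.paused = False

    def offer(self, event):
        """Accept an event unless consumption is paused. Returns True if accepted."""
        if self.paused:
            return False
        self.items.append(event)
        if len(self.items) >= self.capacity:
            self.paused = True  # buffer is full: stop reading from the source
        return True

    def drain(self, max_events):
        """Remove up to max_events for delivery and resume once there is room."""
        count = min(max_events, len(self.items))
        batch = [self.items.popleft() for _ in range(count)]
        if self.paused and len(self.items) < self.resume_below:
            self.paused = False
        return batch


if __name__ == "__main__":
    buf = BackpressureBuffer(capacity=3, resume_below=2)
    for event in ("a", "b", "c"):
        buf.offer(event)
    print(buf.paused)        # the buffer is full, so the reader should pause
    print(buf.offer("d"))    # rejected while paused; the caller retries later
    print(buf.drain(2), buf.paused)
```

In this sketch a rejected event is not lost: the reader simply does not advance past it, so it is read again after consumption resumes.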
For a full breakdown of the scaling benchmarks, see how GlassFlow scales to 500K+ events/sec.
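The deduplication window mentioned above can be understood as a time-bounded memory of event keys: a key seen again inside the window is dropped, and keys older than the window are forgotten. The following is a minimal sketch under stated assumptions, not GlassFlow's implementation. The DedupWindow class is hypothetical, timestamps are plain seconds, and they are assumed to arrive in non-decreasing order.

```python
from collections import OrderedDict


class DedupWindow:
    """Remember event keys for a fixed time window and reject repeats."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.seen = OrderedDict()  # key -> time first seen, oldest first

    def _expire(self, now):
        # Drop keys whose age has reached the window length.
        while self.seen:
            key, seen_at = next(iter(self.seen.items()))
            if now - seen_at < self.window_seconds:
                break
            self.seen.popitem(last=False)

    def is_new(self, key, now):
        """Return True the first time a key appears within the window."""
        self._expire(now)
        if key in self.seen:
            return False
        self.seen[key] = now
        return True


if __name__ == "__main__":
    seven_days = 7 * 24 * 3600
    window = DedupWindow(window_seconds=seven_days)
    events = [("order-1", 0), ("order-2", 5), ("order-1", 10)]
    kept = [key for key, ts in events if window.is_new(key, ts)]
    print(kept)
```

A longer window catches duplicates that arrive further apart but keeps more keys in memory, which is the trade-off behind making the window configurable.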

Frequently asked questions
Feel free to contact us if you have any questions after reviewing our FAQs.
Do you have a demo?
Yes, visit demo.glassflow.dev to see a live Kafka to ClickHouse pipeline processing real events. You can also book a proof of concept session and we'll walk through your specific use case.
Which data types are supported?
GlassFlow supports JSON event streams out of the box, including nested JSON structures. Primitive types (strings, integers, floats, booleans, timestamps) are handled automatically. Complex nested objects can be flattened or mapped to ClickHouse column types during the transformation step.
Additional formats such as Avro and Protobuf are available in the Enterprise Edition of GlassFlow.
Can you handle nested JSON?
Yes. GlassFlow includes schema normalization that flattens nested JSON before delivery to ClickHouse. This is important because ClickHouse performs best with flat, typed schemas rather than raw JSON blobs.
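To make the idea of flattening concrete, the sketch below turns a nested JSON event into a flat dictionary whose keys could serve as ClickHouse column names. It is an illustration only and does not reflect GlassFlow's schema normalization. The flatten function, the underscore separator, and the sample event are assumptions made for the example, and lists are left untouched.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Flatten nested dictionaries into a single level of column-like keys."""
    flat = {}
    for key, value in record.items():
        column = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, prefixing keys with the parent path.
            flat.update(flatten(value, column, sep))
        else:
            flat[column] = value
    return flat


if __name__ == "__main__":
    raw = '{"id": 1, "user": {"name": "Ada", "geo": {"country": "DE"}}}'
    print(flatten(json.loads(raw)))
```

For the sample event this yields the keys id, user_name, and user_geo_country, each holding a primitive value that maps cleanly onto a typed column.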

