Built to make your ClickHouse stream ingestion seamless

Features to help you quickly move Kafka streams to ClickHouse and apply stateful processing.


Connect and integrate

Seamless integration with your development cycles and flexible connections to your Kafka and ClickHouse instances.

Auto-detect field mapping to ClickHouse

GlassFlow automatically detects and converts date formats between your Kafka topics and your ClickHouse tables. For example, it transforms MongoDB timestamps (via Kafka) into ClickHouse-compatible date formats. No manual mapping required.
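As an illustration of the kind of conversion involved (a generic sketch, not GlassFlow's actual implementation), a MongoDB extended-JSON timestamp can be normalized to ClickHouse's `DateTime` text format like this:

```python
from datetime import datetime, timezone

def mongo_ts_to_clickhouse(value):
    """Convert common MongoDB timestamp shapes to a ClickHouse-friendly
    'YYYY-MM-DD HH:MM:SS' string (the DateTime text format)."""
    if isinstance(value, dict) and "$date" in value:
        # extended-JSON wrapper, e.g. {"$date": 1700000000000}
        return mongo_ts_to_clickhouse(value["$date"])
    if isinstance(value, (int, float)):
        # epoch milliseconds
        dt = datetime.fromtimestamp(value / 1000, tz=timezone.utc)
    else:
        # ISO-8601 string, e.g. "2024-01-15T09:30:00Z"
        dt = datetime.fromisoformat(str(value).replace("Z", "+00:00"))
    return dt.strftime("%Y-%m-%d %H:%M:%S")
```

GlassFlow applies this class of detection and conversion automatically; the snippet just shows what "ClickHouse-compatible date format" means in practice.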

Python SDK for programmatic pipeline creation

With GlassFlow's Python SDK, data engineers can define, test and deploy transformations programmatically, making it easy to integrate GlassFlow into existing DevOps and data engineering workflows.
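As a hypothetical sketch of what a programmatic pipeline definition could look like (every field name below is an illustrative assumption, not the real GlassFlow SDK API):

```python
# Illustrative only: "source", "transforms", "sink" and their fields are
# hypothetical names, not taken from the actual GlassFlow SDK.
pipeline = {
    "source": {
        "type": "kafka",
        "brokers": ["broker-1:9092"],
        "topic": "orders",
    },
    "transforms": [
        {"type": "deduplicate", "key": "order_id", "window": "7d"},
        {"type": "flatten_json"},
    ],
    "sink": {
        "type": "clickhouse",
        "table": "orders_clean",
    },
}
```

Because the definition is plain code, it can be version-controlled, reviewed, and deployed through the same CI/CD workflows as the rest of your stack.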

Connect to all Kafka providers

Connect GlassFlow to any Kafka provider, including MSK, Redpanda, and Confluent. Supported connection protocols include SASL, SSL, and more. Our ClickHouse connector is built on the native protocol for the best possible performance.
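For reference, a SASL-over-SSL Kafka connection is typically configured with standard client settings like these (the librdkafka-style keys below are generic Kafka client configuration, not GlassFlow-specific; the broker address and credentials are placeholders):

```python
# Generic Kafka client settings for an authenticated, encrypted connection;
# the same keys work with MSK, Redpanda, and Confluent brokers.
kafka_conn = {
    "bootstrap.servers": "broker-1:9092",     # placeholder broker address
    "security.protocol": "SASL_SSL",          # SASL auth over TLS
    "sasl.mechanism": "SCRAM-SHA-256",
    "sasl.username": "pipeline-user",         # placeholder credentials
    "sasl.password": "changeme",
}
```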

No-code web UI

GlassFlow’s web UI offers a guided experience for building and deploying real-time data pipelines. Powerful enough for engineers and clear enough for analysts to use it too.

Process and transform

GlassFlow includes data transformations and stateful processing to make your use cases run smoothly and with low effort.

7-day deduplication

Duplicates are automatically detected within a 7-day window, keeping your data clean without exhausting storage. Deduplication can key on the first or the last event entering the pipeline.
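A minimal sketch of how windowed deduplication works in principle (illustrative, not GlassFlow's implementation): remember the first occurrence of each key, drop repeats seen inside the retention window, and evict expired keys so state stays bounded.

```python
import time

class Deduplicator:
    """Keep the first event per key; drop repeats within the window."""

    def __init__(self, window_seconds=7 * 24 * 3600):  # 7-day default window
        self.window = window_seconds
        self.seen = {}  # key -> timestamp of first occurrence

    def accept(self, key, now=None):
        now = time.time() if now is None else now
        # evict expired keys so state does not grow without bound
        self.seen = {k: t for k, t in self.seen.items() if now - t < self.window}
        if key in self.seen:
            return False  # duplicate within the window
        self.seen[key] = now
        return True
```

In a pipeline, only events for which `accept` returns `True` would be forwarded to ClickHouse.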

Stateful processing

A built-in lightweight state store enables low-latency, in-memory deduplication and joins, retaining context within the selected time window.

Joins, simplified

Define the fields of the Kafka streams you would like to join, and GlassFlow handles execution and state management automatically.
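Conceptually, a keyed stream join buffers events from each side and merges records that share the join key. The sketch below is illustrative only, with hypothetical field names; GlassFlow's own execution and state handling are more involved:

```python
from collections import defaultdict

def stream_join(left_events, right_events, key):
    """Sketch of a keyed join: index one side by the join key, then emit
    a merged record for every match on the other side."""
    right_index = defaultdict(list)
    for event in right_events:
        right_index[event[key]].append(event)
    joined = []
    for event in left_events:
        for match in right_index.get(event[key], []):
            joined.append({**match, **event})  # left side wins on clashes
    return joined
```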

Auto-format any JSON to a flattened table

Nested JSON structures are automatically flattened, ensuring seamless ingestion into ClickHouse tables without complex parsing logic.
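To illustrate the idea (the underscore separator is an assumption for this sketch, not GlassFlow's documented behavior), nested JSON can be flattened into single-level keys that map directly onto ClickHouse columns:

```python
def flatten(obj, prefix=""):
    """Recursively flatten nested JSON objects into a single-level dict
    whose keys can serve as ClickHouse column names."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))  # descend into nested object
        else:
            flat[name] = value
    return flat
```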

Operate and observe

Get full visibility and control over your running pipelines. You can monitor performance, track data flow in real time, and quickly identify bottlenecks or errors.

DLQ to keep your pipeline running

The dead-letter queue automatically captures and isolates problematic events without disrupting data flow, making debugging and recovery effortless. Simply re-run your events after adjustments.

Analyze each step of the pipeline

End-to-end visibility into data flows, latency, and throughput, complete with metrics, logs, and dashboards. Connect with Prometheus and Grafana to centralize your observability.

<12 ms processing per event

GlassFlow processes events in under 12ms per record, enabling real-time stream transformations at scale.

Cost-efficient footprint

GlassFlow is lightweight. No clusters to manage or infrastructure to provision. It minimizes operational overhead and costs while ensuring high performance and reliability for every pipeline.

Deploy and secure

Make your data pipelines production-ready with one-click deployment, role-based access control, and end-to-end encryption.

Controlled usage

GlassFlow integrates with standard authentication frameworks such as Kerberos, providing secure and familiar identity management.

Built to scale on your environment

GlassFlow runs natively on Kubernetes, leveraging its scaling, reliability, and orchestration capabilities, making it easy for you to self-host.

GlassFlow is secure

All data handled by GlassFlow is encrypted both at rest and in transit, ensuring end-to-end protection for sensitive information.

Frequently asked questions

Feel free to contact us if you have any questions after reviewing our FAQs.

Do you have a demo?

How is GlassFlow’s deduplication different from ClickHouse’s ReplacingMergeTree?

How does GlassFlow’s deduplication work?

Why do duplicates happen in Kafka pipelines at all?

What happens during failures? Can you lose or duplicate data?

What is the load that GlassFlow can handle?

Which features are coming next?

How do I self-host GlassFlow?

Transformed Kafka data for ClickHouse

Get query-ready data, lower ClickHouse load, and reliable pipelines at enterprise scale.
