Vector.dev is a log router.
GlassFlow is built for stateful data transformations to ClickHouse.
GlassFlow is an open source tool built for real-time and large-scale observability event processing - transform and ingest TBs of data with long-lasting state and enterprise support.


Is Vector suitable for stateful stream processing?

What is Vector?
Vector by Datadog is an open-source tool for observability data pipelines designed to collect, transform, and route observability data - logs and metrics - from various sources into ClickStack and ClickHouse, among other sinks.

The problem with Vector:
Most teams start with Vector because it makes shipping logs easy. But as volumes grow and complex transformation needs arise, Vector is often misused as a streaming engine.
When Should You Use Vector?
When Vector is the right choice
Vector is a suitable option for simple log collection, routing, and lightweight event processing. If your goal is to forward logs or apply simple, stateless transformations at the edge, Vector might be efficient and easy to deploy. It works best when state, long time windows, and production guarantees are not required.
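To make the stateless versus stateful distinction concrete, here is a minimal Python sketch. It is not Vector or GlassFlow code; the function name `redact_email`, the class `ErrorCounter`, and the event fields are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def redact_email(event: dict) -> dict:
    """Stateless: the output depends only on the single input event."""
    out = dict(event)
    if "email" in out:
        out["email"] = "<redacted>"
    return out


class ErrorCounter:
    """Stateful: the output depends on every earlier event for the same key."""

    def __init__(self):
        # Hypothetical in-memory state; it is lost if the process restarts.
        self.counts = defaultdict(int)

    def process(self, event: dict) -> dict:
        if event.get("level") == "error":
            self.counts[event["service"]] += 1
        # Attach the running error count for this event's service.
        return {**event, "errors_so_far": self.counts[event["service"]]}
```

The stateless function can run anywhere and restart freely. The stateful one has to keep its counts alive across restarts and long time windows, which is where durability requirements come in.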


When GlassFlow is the better choice
GlassFlow is built for real stream processing. If you're running Kafka data transformations, multi-day aggregations, or stateful workloads that require durability and observability, GlassFlow provides the architecture, dead letter queue handling, and enterprise SLAs needed for production systems.
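As a rough illustration of what dead letter queue handling means in general, the sketch below routes events that fail parsing to a separate list instead of dropping them or stopping the pipeline. It is an assumption-laden example, not GlassFlow's implementation: `parse`, `run`, and the plain Python lists standing in for the sink and the DLQ are all hypothetical.

```python
import json


def parse(raw: bytes) -> dict:
    """Decode one raw message and require an 'id' field (hypothetical rule)."""
    event = json.loads(raw)
    if "id" not in event:
        raise ValueError("missing id")
    return event


def run(batch, sink: list, dlq: list) -> None:
    """Send good events to the sink and failed ones to the dead letter queue."""
    for raw in batch:
        try:
            sink.append(parse(raw))
        except (ValueError, TypeError) as exc:
            # Keep the original payload and the reason so it can be inspected
            # and replayed later.
            dlq.append({"raw": raw, "error": str(exc)})
```

For example, running `run([b'{"id": 1}', b'not json'], sink, dlq)` with two empty lists leaves one parsed event in `sink` and one failed payload, with its error message, in `dlq`.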
To summarize, there are three problems with Vector:
Vector is built primarily for log collection and routing
Vector is not designed for stateful, long-running transformations
Vector does not offer production-grade SLAs
Scales efficiently
GlassFlow is proven in real-world scenarios

50 TB
Of data processed daily

414k
Records per second

<1
Second latency

~ $2.80
Per TB infrastructure cost

Multi-Pipeline
Horizontal scaling with multiple pipelines

KAFKA TO CLICKHOUSE: A PRACTICAL GUIDE
This ebook covers everything you need to know about building Kafka → ClickHouse pipelines.
GlassFlow VS Vector.dev comparison
GlassFlow for stateful stream processing: when log routing with Vector isn't enough
Feature
Stateless processing
Stateful processing
Late event handling
DLQ
SLAs
ClickHouse aligned ack
Pipeline Observability
Deployment Service
Frequently asked questions
Feel free to contact us if you have any questions after reviewing our FAQs.
Can Vector do stateful processing?
ClickHouse's merge process runs in the background and is controlled by ClickHouse itself. That makes deduplicating streaming data nearly impossible without overspending and slowing performance. ClickHouse also recommends reducing the use of JOINs, because they can slow the system down considerably. Kafka lacks native deduplication and JOIN capabilities; it just stores events. You need a processing layer in between that handles both deduplication and stateful JOINs before data hits ClickHouse. You can learn more about these challenges in our blog post.
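The idea of deduplicating in a processing layer before data reaches ClickHouse can be sketched as a time-windowed filter. This is a simplified illustration rather than GlassFlow's actual mechanism: the class `WindowedDeduplicator` is hypothetical, state lives only in memory, and timestamps are assumed to arrive in non-decreasing order.

```python
from collections import OrderedDict


class WindowedDeduplicator:
    """Accept a key only if it was not seen within the last `window_seconds`."""

    def __init__(self, window_seconds: float):
        self.window = window_seconds
        # key -> timestamp of first sighting, in insertion (time) order.
        self.first_seen = OrderedDict()

    def accept(self, key, ts: float) -> bool:
        # Evict entries older than the window; oldest entries come first
        # because timestamps are assumed to be non-decreasing.
        while self.first_seen:
            _, oldest_ts = next(iter(self.first_seen.items()))
            if ts - oldest_ts < self.window:
                break
            self.first_seen.popitem(last=False)
        if key in self.first_seen:
            return False  # duplicate inside the window: drop it
        self.first_seen[key] = ts
        return True  # first sighting: forward it downstream
```

Only events for which `accept` returns `True` would be written downstream. The longer the window, the more state has to be held, which is why multi-day windows call for durable state rather than an in-memory dictionary like this one.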
Does Vector support dead letter queues?
Yes! GlassFlow for ClickHouse is open-source under the Apache 2.0 license. You’re free to use, modify, and self-host it.
Is Vector suitable for Kafka transformations?
Currently, Kafka is the primary input. Support for additional streaming sources like Kinesis and Pub/Sub is on our roadmap. Reach out if you have specific needs via our contact form.
What is a Vector alternative for stream processing?
Because GlassFlow for ClickHouse runs entirely locally on your machine, we have no access to your data.

Transformed Kafka data for ClickHouse
Get query-ready data, lower ClickHouse load, and reliable pipelines at enterprise scale.
