Vector.dev is a log router.



GlassFlow is built for stateful data transformations to ClickHouse

GlassFlow is an open-source tool built for real-time, large-scale observability event processing: transform and ingest terabytes of data with long-lasting state and enterprise support.

Is Vector suitable for stateful stream processing?

What is Vector?

Vector by Datadog is an open-source tool for building observability data pipelines. It collects, transforms, and routes logs and metrics from various sources into ClickStack and ClickHouse, among other sinks.

The problem with Vector:

Most teams start with Vector because it makes shipping logs easy. But as volumes grow and transformations become more complex, Vector is often misused as a streaming engine.

When Should You Use Vector?

When Vector is the right choice

Vector is a suitable option for simple log collection, routing, and lightweight event processing. If your goal is to forward logs or apply simple, stateless transformations at the edge, Vector might be efficient and easy to deploy. It works best when state, long time windows, and production guarantees are not required.


When GlassFlow is the better choice

GlassFlow is built for real stream processing. If you're running Kafka data transformations, multi-day aggregations, or stateful workloads that require durability and observability, GlassFlow provides the architecture, dead letter queue handling, and enterprise SLAs needed for production systems.
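To make the idea of dead letter queue handling concrete, here is a minimal Python sketch. It is an illustration only, not GlassFlow's implementation: the function name `process_with_dlq`, the in-memory lists, and the sample event fields are all assumptions made for this example. Events that fail to parse or transform are set aside with the error that caused the failure, so one bad record does not stop the pipeline or disappear silently.

```python
import json
from typing import Any, Callable, Dict, Iterable, List, Tuple


def process_with_dlq(
    raw_events: Iterable[str],
    transform: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, str]]]:
    """Transform events; route failures to a dead letter list (hypothetical helper)."""
    delivered: List[Dict[str, Any]] = []
    dead_letters: List[Dict[str, str]] = []
    for raw in raw_events:
        try:
            # json.JSONDecodeError is a subclass of ValueError.
            delivered.append(transform(json.loads(raw)))
        except (ValueError, KeyError, TypeError) as exc:
            # Keep the original payload and the reason so it can be inspected or replayed.
            dead_letters.append({"raw": raw, "error": repr(exc)})
    return delivered, dead_letters


# Hypothetical input: one valid event, one malformed payload, one missing field.
raw_events = ['{"user_id": 1}', "not json", '{"other": 2}']
delivered, dead_letters = process_with_dlq(
    raw_events, lambda event: {"user": event["user_id"]}
)
```

In a real deployment the dead letter list would typically be a separate Kafka topic or table rather than a Python list.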

To summarize, there are three problems with Vector:

Vector is built primarily for log collection and routing

Vector is not designed for stateful, long-running transformations

Vector does not offer production-grade SLAs

Scales efficiently

GlassFlow is proven in real world scenarios

50 TB of data processed daily

414k records per second

<1 second latency

~$2.80 per TB infrastructure cost

Multi-pipeline: horizontal scaling with multiple pipelines

KAFKA TO CLICKHOUSE: A PRACTICAL GUIDE

This ebook covers everything you need to know about building Kafka → ClickHouse pipelines.

GlassFlow VS Vector.dev comparison

GlassFlow for stateful stream processing: when log routing with Vector isn't enough

Feature

Stateless processing

Stateful processing

Late event handling

DLQ

SLAs

ClickHouse aligned ack

Pipeline Observability

Deployment Service

CH Kafka Table Engine

Limited due to the in-memory store

Frequently asked questions

Feel free to contact us if you have any questions after reviewing our FAQs.

Can Vector do stateful processing?

ClickHouse merges data in the background on its own schedule, which makes deduplicating streaming data inside ClickHouse nearly impossible without overspending and slow performance. ClickHouse also recommends reducing the use of JOINs, as they can slow the system down considerably. Kafka lacks native deduplication and JOIN capabilities; it just stores events. You need a processing layer in between that handles both deduplication and stateful JOINs before data hits ClickHouse. You can learn more about these challenges in our blog post.
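As a rough illustration of what stateful deduplication in a processing layer means, the Python sketch below keeps the keys it has seen within a time window and drops repeats. It is not GlassFlow's implementation: the class name `WindowedDeduplicator`, the in-memory store, and the sample keys are assumptions, and the sketch assumes timestamps arrive in non-decreasing order.

```python
from collections import OrderedDict


class WindowedDeduplicator:
    """Drop events whose key was already seen within the last `window_seconds`.

    Hypothetical helper; assumes `now` never decreases between calls.
    """

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._seen: "OrderedDict[str, float]" = OrderedDict()  # key -> first-seen time

    def _evict(self, now: float) -> None:
        # Oldest keys sit at the front, so stop at the first one still in the window.
        while self._seen:
            _, first_seen = next(iter(self._seen.items()))
            if now - first_seen < self.window_seconds:
                break
            self._seen.popitem(last=False)

    def accept(self, key: str, now: float) -> bool:
        """Return True if the event should be forwarded, False if it is a duplicate."""
        self._evict(now)
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


# Hypothetical events as (key, timestamp in seconds) with a 60-second window.
dedup = WindowedDeduplicator(window_seconds=60)
events = [("order-1", 0.0), ("order-1", 30.0), ("order-2", 40.0), ("order-1", 61.0)]
kept = [key for key, ts in events if dedup.accept(key, ts)]
# kept == ["order-1", "order-2", "order-1"]
```

The state here lives in process memory, so it is lost on restart and grows with the window length. That is the practical reason long windows call for a durable state store.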

Does Vector support dead letter queues?

Yes! GlassFlow for ClickHouse is open-source under the Apache 2.0 license. You’re free to use, modify, and self-host it.

Is Vector suitable for Kafka transformations?

Currently, Kafka is the primary input. Support for additional streaming sources like Kinesis and Pub/Sub is on our roadmap. Reach out if you have specific needs via our contact form.

What is a Vector alternative for stream processing?

Because GlassFlow for ClickHouse runs entirely on your own machine, we have no access to your data.

Transformed Kafka data for ClickHouse

Get query-ready data, lower ClickHouse load, and reliable pipelines at enterprise scale.