ClickStack is the open-source observability platform from ClickHouse that unifies logs, traces, metrics, and session replays into a single backend. Built on ClickHouse, HyperDX, and OpenTelemetry, it positions itself as a self-hosted Datadog alternative that your team actually owns.
We set up ClickStack across three environments — local Docker, Helm on an existing cluster, and a full production Kubernetes deployment with external ClickHouse. The Docker quickstart took under 15 minutes. The Helm install took about the same. The production-grade Kubernetes setup with HA ClickHouse, S3 storage, and OTel daemonsets took a few hours but was straightforward once we understood the architecture.
This guide walks you through all three paths so you can pick the one that matches where you are today.
What Is ClickStack and Why Does It Matter
ClickStack bundles three components into a single observability platform:
- ClickHouse — the columnar OLAP database that handles storage and querying
- HyperDX — the purpose-built frontend for exploring logs, traces, metrics, and session replays
- OpenTelemetry Collector — the ingestion layer that receives telemetry data via OTLP
Instead of running separate backends for each signal type (Loki for logs, Tempo for traces, Prometheus for metrics), ClickStack stores everything in ClickHouse with optimized schemas per data type. This means you can correlate a slow trace with its corresponding logs and metrics using a single SQL join — no manual timestamp matching across tools.
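For example, here is the kind of correlation query this unlocks. It is a sketch that assumes the default OpenTelemetry exporter table and column names (`otel_traces`, `otel_logs`, `TraceId`, `Duration`); the `checkout` service name is a placeholder.

```sql
-- Logs emitted during the slowest checkout spans of the last hour,
-- correlated through the shared TraceId column
SELECT
    t.TraceId,
    t.SpanName,
    t.Duration,
    l.SeverityText,
    l.Body
FROM otel_traces AS t
INNER JOIN otel_logs AS l ON l.TraceId = t.TraceId
WHERE t.ServiceName = 'checkout'
  AND t.Timestamp > now() - INTERVAL 1 HOUR
ORDER BY t.Duration DESC
LIMIT 50;
```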
The platform supports both SQL and Lucene-style queries. Type `error` and hit Enter for a quick search, or write full SQL for analytical queries across billions of events. Production users include organisations processing over a billion events per second.
If you have been evaluating all-in-one observability platforms for cloud-native environments, ClickStack is one of the strongest open-source contenders in 2026.
ClickStack Architecture Overview
Before diving into setup, understanding the architecture helps you make better deployment decisions.
| Component | Role | Default Port |
|---|---|---|
| ClickHouse | Columnar database for all telemetry storage | 8123 (HTTP), 9000 (native) |
| HyperDX UI | Search, visualisation, dashboards, alerts | 8080 |
| OTel Collector | Receives OTLP data (gRPC and HTTP) | 4317 (gRPC), 4318 (HTTP) |
| MongoDB | Application state (dashboards, users, config) | 27017 |
Data Flow
Your Apps (OTel SDK) → OTel Collector (4317/4318)
↓
Batch Processor
↓
ClickHouse (storage)
↓
HyperDX UI (query & visualise)
The OTel Collector receives telemetry over OTLP, batches it for throughput, and exports to ClickHouse. HyperDX reads directly from ClickHouse using optimized schemas with compression codecs (Delta, ZSTD) and bloom filter indexes for fast full-text search.
Storage Schema Highlights
ClickStack uses purpose-built schemas for each signal type:
- Logs — Full-text indexing via `tokenbf_v1` on the log body, 14-day default TTL, trace correlation through `trace_id` and `span_id`
- Traces — Delta and ZSTD compression, nested event structures, 30-day TTL, partitioned by date
- Metrics — Native time-series support, 90-day retention, optimized for service and metric name cardinality
- Session Replays — Event-based structure capturing user interactions, 7-day TTL, linked to traces
Option 1: Docker Quickstart (Development and Testing)
This is the fastest path. One command gets you a fully functional ClickStack instance.
Prerequisites
- Docker installed (Docker Desktop or Docker Engine)
- Minimum 4 GB RAM and 2 CPU cores available
- Ports 8080, 4317, and 4318 available
Run ClickStack
docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 \
docker.hyperdx.io/hyperdx/hyperdx-all-in-one
This single container bundles ClickHouse, HyperDX, the OTel Collector, and MongoDB. Open http://localhost:8080 to access the UI.
Disable Telemetry (Optional)
To opt out of anonymised usage data collection:
docker run -p 8080:8080 -p 4317:4317 -p 4318:4318 \
-e USAGE_STATS_ENABLED=false \
docker.hyperdx.io/hyperdx/hyperdx-all-in-one
Send Test Data
Point any OpenTelemetry SDK at http://localhost:4318 (HTTP) or localhost:4317 (gRPC). Here is a quick Python example:
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
provider = TracerProvider()
processor = BatchSpanProcessor(
OTLPSpanExporter(endpoint="http://localhost:4318/v1/traces")
)
provider.add_span_processor(processor)
trace.set_tracer_provider(provider)
tracer = trace.get_tracer("my-service")
with tracer.start_as_current_span("test-span"):
    print("Trace sent to ClickStack")
Within seconds, the trace appears in the HyperDX UI with full span details, timing, and attributes.
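If you want to verify ingestion without wiring up an SDK, you can also post a raw OTLP/HTTP log record with curl. This is a minimal sketch using the standard OTLP JSON encoding; the service name and log body are placeholders.

```bash
# Current time in nanoseconds (GNU date syntax; on macOS use gdate or paste a value)
NOW_NS=$(date +%s%N)

curl -sS -X POST http://localhost:4318/v1/logs \
  -H 'Content-Type: application/json' \
  -d '{
    "resourceLogs": [{
      "resource": {
        "attributes": [
          { "key": "service.name", "value": { "stringValue": "curl-test" } }
        ]
      },
      "scopeLogs": [{
        "logRecords": [{
          "timeUnixNano": "'"$NOW_NS"'",
          "severityText": "INFO",
          "body": { "stringValue": "hello from curl" }
        }]
      }]
    }]
  }'
```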
When to Use Docker
- Local development and experimentation
- Quick proof-of-concept evaluations
- CI/CD pipeline testing
- Demos and training sessions
The all-in-one container is not designed for production. Storage is ephemeral, there is no HA, and all components share resources. For anything beyond testing, use Helm or the full Kubernetes setup.
Option 2: Helm Chart Installation (Recommended for Production)
The ClickStack Helm chart is the recommended method for production deployments. It provisions all core components as separate pods with configurable resources, persistence, and scaling.
Prerequisites
- A running Kubernetes cluster (EKS, AKS, GKE, or self-managed)
- Helm 3.x installed
- `kubectl` configured for your cluster
- Sufficient cluster resources (8 GB RAM minimum recommended)
If you are running Kubernetes on AWS, ensure your node groups have enough capacity before deploying.
Install ClickStack
# Add the Helm repository
helm repo add clickstack https://clickhouse.github.io/ClickStack-helm-charts
helm repo update
# Install with default configuration
helm install my-clickstack clickstack/clickstack
This deploys ClickHouse, HyperDX, the OTel Collector, and MongoDB as separate pods with persistent volumes.
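Before moving on, it is worth confirming that all pods reached Running and their volumes were bound. Pod names vary with your release name, so a broad listing is enough:

```bash
# Expect ClickHouse, HyperDX (app), OTel Collector, and MongoDB pods in Running state
kubectl get pods
kubectl get pvc
```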
Access the UI
kubectl port-forward svc/my-clickstack-app 8080:8080
Open http://localhost:8080 in your browser.
For production, configure an Ingress resource with TLS instead of port-forwarding:
# clickstack-ingress.yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: clickstack-ingress
annotations:
cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
tls:
- hosts:
- observability.yourdomain.com
secretName: clickstack-tls
rules:
- host: observability.yourdomain.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: my-clickstack-app
port:
number: 8080
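The manifest assumes cert-manager with a `letsencrypt-prod` ClusterIssuer is already installed; depending on your controller you may also need to set `spec.ingressClassName` (for example `nginx`). Apply it as usual:

```bash
kubectl apply -f clickstack-ingress.yaml
```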
Custom Configuration
Override defaults using --set flags or a values file:
# clickstack-values.yaml
clickhouse:
resources:
requests:
memory: "4Gi"
cpu: "2"
limits:
memory: "8Gi"
cpu: "4"
persistence:
size: 100Gi
storageClass: gp3
hyperdx:
resources:
requests:
memory: "1Gi"
cpu: "500m"
Install with custom values:
helm install my-clickstack clickstack/clickstack \
-f clickstack-values.yaml \
--namespace observability \
--create-namespace
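When you adjust the values file later, roll the change out with a standard Helm upgrade against the same release and namespace:

```bash
helm upgrade my-clickstack clickstack/clickstack \
  -f clickstack-values.yaml \
  --namespace observability
```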
Connect to ClickHouse Cloud
If you prefer managed ClickHouse instead of self-hosted, configure the Helm chart to use an external instance:
# clickstack-cloud-values.yaml
clickhouse:
enabled: false
hyperdx:
defaultConnections: |
[
{
"name": "ClickHouse Cloud",
"host": "https://your-instance.clickhouse.cloud",
"port": 8443,
"username": "default",
"password": "your-password"
}
]
This removes the self-hosted ClickHouse pod and points HyperDX directly at your ClickHouse Cloud instance.
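Before installing, it is worth confirming the credentials work from a machine that can reach the instance. A quick check against ClickHouse's HTTP interface looks like this (hostname and password are placeholders):

```bash
curl --user 'default:your-password' \
  'https://your-instance.clickhouse.cloud:8443/?query=SELECT%20version()'
```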
Migrate from Legacy Chart
If you are running the older hdx-oss-v2 chart (v0.8.4), migrate to the clickstack chart. The legacy chart is in maintenance mode and all new features land in clickstack/clickstack (v1.0.0+).
Option 3: Production Kubernetes with External ClickHouse
For large-scale production environments, deploy ClickHouse separately using the Altinity operator. This gives you full control over sharding, replication, and storage backends.
Architecture
┌─────────────────────────────────────────────────────┐
│ Kubernetes Cluster │
│ │
│ ┌──────────────┐ ┌──────────────────────────┐ │
│ │ OTel Collector│ │ ClickHouse Cluster │ │
│ │ (DaemonSet) │───▶│ 2 shards × 2 replicas │ │
│ └──────────────┘ │ S3 storage backend │ │
│ └──────────┬───────────────┘ │
│ │ │
│ ┌──────────────┐ ┌──────────▼───────────────┐ │
│ │ HyperDX UI │◀──│ ClickHouseKeeper │ │
│ │ (ClickStack) │ │ (3 replicas) │ │
│ └──────────────┘ └──────────────────────────┘ │
└─────────────────────────────────────────────────────┘
Step 1: Deploy the ClickHouse Operator
helm repo add altinity https://helm.altinity.com
helm install clickhouse-operator altinity/altinity-clickhouse-operator \
-n clickhouse --create-namespace \
-f altinity-values.yaml
Configure the operator to watch your target namespace:
# altinity-values.yaml
configs:
files:
config.yaml:
watch:
namespaces:
- logging
Step 2: Deploy ClickHouseKeeper and ClickHouse
Create a ClickHouseKeeper installation for cluster coordination:
apiVersion: clickhouse-keeper.altinity.com/v1
kind: ClickHouseKeeperInstallation
metadata:
name: ch-logging-keeper
namespace: logging
spec:
configuration:
clusters:
- name: ch-cluster
layout:
replicasCount: 3
Deploy the ClickHouse cluster with HA configuration and S3 storage:
apiVersion: clickhouse.altinity.com/v1
kind: ClickHouseInstallation
metadata:
name: ch-logging
namespace: logging
spec:
configuration:
zookeeper:
nodes:
- host: keeper-ch-logging-keeper.logging.svc.cluster.local
port: 2181
clusters:
- name: ch-cluster
layout:
shardsCount: 2
replicasCount: 2
settings:
storage_configuration/disks/logging/type: s3
storage_configuration/disks/logging/endpoint: >-
http://minio.minio.svc.cluster.local:9000/logging/clickhouse/{installation}/{replica}/
storage_configuration/disks/logging/access_key_id: YOUR_S3_ACCESS_KEY
storage_configuration/disks/logging/secret_access_key: YOUR_S3_SECRET_KEY
storage_configuration/policies/s3_main/volumes/main/disk: logging
users:
admin_user/password:
valueFrom:
secretKeyRef:
name: ch-logging-creds
key: admin_password
admin_user/networks/ip: 0.0.0.0/0
admin_user/profile: default
Store credentials in a Kubernetes Secret:
apiVersion: v1
kind: Secret
metadata:
name: ch-logging-creds
namespace: logging
type: Opaque
stringData:
admin_password: YOUR_SECURE_PASSWORD
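After applying the manifests, the operator reconciles them into StatefulSets. You can watch the rollout through the CRD resource types the Altinity operator registers:

```bash
kubectl get clickhousekeeperinstallations,clickhouseinstallations -n logging
kubectl get pods -n logging -w
```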
Step 3: Create the OTel Logs Table
Connect to ClickHouse and create the optimized logs schema:
CREATE TABLE otel_logs ON CLUSTER 'ch-cluster'
(
`Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
`TimestampTime` DateTime DEFAULT toDateTime(Timestamp),
`TraceId` String CODEC(ZSTD(1)),
`SpanId` String CODEC(ZSTD(1)),
`TraceFlags` UInt8,
`SeverityText` LowCardinality(String) CODEC(ZSTD(1)),
`SeverityNumber` UInt8,
`ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
`Body` String CODEC(ZSTD(1)),
`ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
`ScopeSchemaUrl` LowCardinality(String) CODEC(ZSTD(1)),
`ScopeName` String CODEC(ZSTD(1)),
`ScopeVersion` LowCardinality(String) CODEC(ZSTD(1)),
`ScopeAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
`LogAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
INDEX idx_res_attr_key mapKeys(ResourceAttributes)
TYPE bloom_filter(0.01) GRANULARITY 1,
INDEX idx_res_attr_value mapValues(ResourceAttributes)
TYPE bloom_filter(0.01) GRANULARITY 1,
INDEX idx_log_attr_key mapKeys(LogAttributes)
TYPE bloom_filter(0.01) GRANULARITY 1,
INDEX idx_log_attr_value mapValues(LogAttributes)
TYPE bloom_filter(0.01) GRANULARITY 1,
INDEX idx_body Body TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 8
)
ENGINE = ReplicatedMergeTree(
'/clickhouse/tables/{shard}/otel_logs', '{replica}'
)
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (ServiceName, TimestampTime)
ORDER BY (ServiceName, TimestampTime, Timestamp)
SETTINGS storage_policy = 's3_main';
Key design decisions in this schema:
- `LowCardinality` on `ServiceName` and `SeverityText` reduces memory usage for repeated values
- `tokenbf_v1` on `Body` enables fast full-text search on log content
- `bloom_filter` indexes on attributes allow efficient filtering without scanning
- S3 storage policy keeps costs low for high-volume log storage
- Partitioning by date makes TTL cleanup and time-range queries efficient
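A typical investigation query shows these choices working together: it narrows by the primary key first, then runs a token search on the body. This is a sketch; the service name and search term are placeholders.

```sql
-- Prunes parts via the (ServiceName, TimestampTime) primary key,
-- then uses the tokenbf_v1 index on Body through hasToken
SELECT Timestamp, SeverityText, Body
FROM otel_logs
WHERE ServiceName = 'payments'
  AND TimestampTime >= now() - INTERVAL 1 HOUR
  AND hasToken(Body, 'timeout')
ORDER BY Timestamp DESC
LIMIT 100;
```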
Step 4: Deploy the OTel Collector as a DaemonSet
The collector runs on every node to capture logs from all pods:
# otel-collector-values.yaml
mode: daemonset
presets:
logsCollection:
enabled: true
kubernetesAttributes:
enabled: true
config:
exporters:
clickhouse:
endpoint: http://clickhouse-ch-logging.logging.svc.cluster.local:8123
database: default
username: admin_user
password: YOUR_CLICKHOUSE_PASSWORD
logs_table_name: otel_logs
create_schema: false
service:
pipelines:
logs:
receivers: [filelog]
processors: [k8sattributes, batch]
exporters: [clickhouse]
Deploy with Helm:
helm repo add open-telemetry \
https://open-telemetry.github.io/opentelemetry-helm-charts
helm install otel-collector open-telemetry/opentelemetry-collector \
-n opentelemetry --create-namespace \
-f otel-collector-values.yaml
The k8sattributes processor automatically enriches every log with pod name, namespace, node, labels, and annotations. The batch processor groups logs before sending to ClickHouse for better throughput. One caveat: the ClickHouse exporter ships in the collector's contrib distribution, so make sure the chart is using an image that includes it (for example, by setting the image repository to `otel/opentelemetry-collector-contrib`).
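Once enriched logs land, the Kubernetes metadata is queryable straight from the attribute maps. The keys follow the OpenTelemetry Kubernetes semantic conventions; the namespace filter below is just an example.

```sql
-- Noisiest pods in the production namespace over the last 15 minutes
SELECT
    ResourceAttributes['k8s.pod.name'] AS pod,
    count() AS log_count
FROM otel_logs
WHERE ResourceAttributes['k8s.namespace.name'] = 'production'
  AND TimestampTime >= now() - INTERVAL 15 MINUTE
GROUP BY pod
ORDER BY log_count DESC
LIMIT 20;
```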
Step 5: Deploy HyperDX (ClickStack UI)
Point ClickStack at your external ClickHouse cluster:
# clickstack-values.yaml
hyperdx:
defaultConnections: |
[
{
"name": "ClickHouse Logging",
"host": "http://clickhouse-ch-logging.logging.svc.cluster.local:8123",
"port": 8123,
"username": "admin_user",
"password": "YOUR_CLICKHOUSE_PASSWORD"
}
]
otel:
enabled: false
clickhouse:
enabled: false
Install:
helm repo add clickstack https://hyperdxio.github.io/helm-charts
helm install clickstack clickstack/clickstack \
-n logging \
-f clickstack-values.yaml
Access the UI:
kubectl port-forward svc/clickstack-app 3000:3000 -n logging
Instrumenting Your Applications
With ClickStack running, instrument your services to send telemetry. ClickStack accepts standard OTLP data, so any OpenTelemetry SDK works.
Supported Languages
| Language | SDK Package |
|---|---|
| Python | opentelemetry-sdk |
| Node.js | @opentelemetry/sdk-node |
| Java | opentelemetry-java |
| Go | go.opentelemetry.io/otel |
| .NET | OpenTelemetry.Sdk |
| Ruby | opentelemetry-sdk |
| PHP | opentelemetry-php |
| Rust | opentelemetry |
| Elixir | opentelemetry_api |
Node.js Example
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { OTLPLogExporter } = require('@opentelemetry/exporter-logs-otlp-http');
const sdk = new NodeSDK({
traceExporter: new OTLPTraceExporter({
url: 'http://localhost:4318/v1/traces',
}),
logRecordExporter: new OTLPLogExporter({
url: 'http://localhost:4318/v1/logs',
}),
serviceName: 'my-api',
});
sdk.start();
Kubernetes Auto-Instrumentation
For Kubernetes workloads, use the OTel Operator to inject instrumentation automatically without code changes:
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
name: auto-instrumentation
spec:
exporter:
endpoint: http://otel-collector.opentelemetry:4317
propagators:
- tracecontext
- baggage
python:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-python:latest
nodejs:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-nodejs:latest
java:
image: ghcr.io/open-telemetry/opentelemetry-operator/autoinstrumentation-java:latest
Annotate the pod template of your deployments to enable auto-instrumentation:
metadata:
annotations:
instrumentation.opentelemetry.io/inject-python: "true"
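In a full Deployment, the annotation sits under spec.template.metadata so it lands on the pods rather than the Deployment object. A minimal sketch with placeholder names:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-api
  template:
    metadata:
      labels:
        app: my-api
      annotations:
        # Tells the OTel Operator's webhook to inject Python auto-instrumentation
        instrumentation.opentelemetry.io/inject-python: "true"
    spec:
      containers:
        - name: my-api
          image: my-registry/my-api:latest
```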
This approach works well for teams that need cloud-native monitoring without modifying application code.
ClickStack vs Other Observability Stacks
How does ClickStack compare to the alternatives? Here is a practical comparison based on our experience.
| Feature | ClickStack | Datadog | Grafana Stack (Loki + Tempo + Mimir) |
|---|---|---|---|
| Deployment | Self-hosted or ClickHouse Cloud | SaaS only | Self-hosted or Grafana Cloud |
| Unified backend | Single ClickHouse instance | Proprietary | Separate backends per signal |
| Query language | SQL + Lucene | Proprietary | LogQL, TraceQL, PromQL |
| Session replays | Built-in | Built-in (paid) | Not included |
| Cross-signal correlation | SQL joins across signals | Automatic | Manual linking |
| Cost model | Infrastructure only | Per-host + per-GB ingestion | Infrastructure or per-GB (Cloud) |
| Setup time | 15 minutes (Docker) | 5 minutes (SaaS) | Hours (multiple components) |
| OpenTelemetry native | Yes | Partial | Yes |
| License | MIT | Proprietary | AGPL / Apache 2.0 |
ClickStack’s biggest advantage is the unified storage layer. With the Grafana stack, you manage Loki, Tempo, and Mimir as separate systems with different query languages. With ClickStack, everything lives in ClickHouse and you query it with SQL. For teams already familiar with ClickHouse’s performance characteristics, this is a natural fit.
The trade-off is maturity. Datadog has years of polish in its UI, alerting, and integrations. The Grafana ecosystem has a massive community and plugin library. ClickStack is newer, but it is evolving rapidly and the ClickHouse foundation gives it a serious performance edge for high-cardinality data.
Performance Tuning and Best Practices
ClickHouse Resource Sizing
| Workload | Daily Ingestion | ClickHouse RAM | ClickHouse CPU | Storage |
|---|---|---|---|---|
| Small (dev/staging) | < 10 GB/day | 4 GB | 2 cores | 50 GB SSD |
| Medium (production) | 10–100 GB/day | 16 GB | 8 cores | 500 GB SSD + S3 |
| Large (high-scale) | 100+ GB/day | 64 GB+ | 16+ cores | S3 tiered storage |
OTel Collector Batch Settings
Tune the batch processor for your throughput:
processors:
batch:
send_batch_size: 10000
send_batch_max_size: 20000
timeout: 2s
Larger batches improve throughput but increase memory usage and latency. Start with these defaults and adjust based on your application monitoring requirements.
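If the collector itself runs out of memory under burst load, a common safeguard is to pair batch with the memory_limiter processor, placed first in the pipeline. This is a sketch extending the earlier otel-collector-values.yaml; the thresholds are starting points, not recommendations for every workload.

```yaml
processors:
  memory_limiter:
    check_interval: 1s
    limit_percentage: 80        # hard ceiling as a share of available memory
    spike_limit_percentage: 25  # headroom reserved for sudden bursts
  batch:
    send_batch_size: 10000
    send_batch_max_size: 20000
    timeout: 2s

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [memory_limiter, k8sattributes, batch]
      exporters: [clickhouse]
```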
Retention Policies
Configure TTL per signal type to manage storage costs:
| Signal | Recommended TTL | Rationale |
|---|---|---|
| Session replays | 7 days | High volume, short debugging window |
| Logs | 14–30 days | Most issues surface within two weeks |
| Traces | 30 days | Needed for performance trend analysis |
| Metrics | 90 days | Long-term capacity planning and SLOs |
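If you manage the schemas yourself, as in Option 3, these windows translate into ClickHouse TTL clauses. A sketch assuming the default table names (and, for traces, that an otel_traces table exists):

```sql
-- Logs: drop rows 30 days after their timestamp
ALTER TABLE otel_logs ON CLUSTER 'ch-cluster'
    MODIFY TTL TimestampTime + INTERVAL 30 DAY;

-- Traces: same idea, keyed on the span timestamp
ALTER TABLE otel_traces ON CLUSTER 'ch-cluster'
    MODIFY TTL toDateTime(Timestamp) + INTERVAL 30 DAY;
```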
Security Considerations
- Store ClickHouse credentials in Kubernetes Secrets, not in Helm values files
- Enable TLS between the OTel Collector and ClickHouse in production
- Use network policies to restrict access to ClickHouse pods (see the sketch below)
- Rotate credentials regularly using a secrets manager
- Enable ClickHouse audit logging for compliance
For organisations with strict compliance requirements, align ClickStack access controls with your broader Kubernetes security best practices.
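As a starting point for the network-policy item above, something like the following restricts ClickHouse to traffic from the collector namespace and the HyperDX pods. The pod and namespace labels are assumptions; match them to what your charts actually apply.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: clickhouse-allowlist
  namespace: logging
spec:
  podSelector:
    matchLabels:
      clickhouse.altinity.com/chi: ch-logging   # label the Altinity operator applies to ClickHouse pods
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: opentelemetry
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: clickstack   # adjust to the HyperDX pod labels in your release
      ports:
        - protocol: TCP
          port: 8123   # HTTP interface
        - protocol: TCP
          port: 9000   # native protocol
```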
Troubleshooting Common Issues
No Data Appearing in HyperDX
- Verify the OTel Collector is running: `kubectl get pods -n opentelemetry`
- Check collector logs for export errors: `kubectl logs -l app=otel-collector -n opentelemetry`
- Confirm ClickHouse connectivity: `kubectl exec -it clickhouse-pod -- clickhouse-client -q "SELECT count() FROM otel_logs"`
- Ensure your application is sending to the correct OTLP endpoint and port
ClickHouse Out of Memory
- Increase `max_memory_usage` in ClickHouse settings
- Add more replicas to distribute query load
- Move cold data to S3 with tiered storage policies
- Reduce batch sizes in the OTel Collector
Slow Queries in HyperDX
- Check if queries use the primary key (`ServiceName`, `TimestampTime`)
- Add materialized views for frequently queried patterns
- Use `EXPLAIN` to identify full-scan queries
- Increase ClickHouse `max_threads` for parallelism
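For the EXPLAIN step, ClickHouse can report which parts and data-skipping indexes a query actually touched. A quick sketch:

```sql
EXPLAIN indexes = 1
SELECT count()
FROM otel_logs
WHERE ServiceName = 'payments'
  AND TimestampTime >= now() - INTERVAL 1 HOUR;
```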
Helm Upgrade Fails
If upgrading from the legacy hdx-oss-v2 chart to clickstack/clickstack:
# Back up your data first
kubectl exec -it clickhouse-pod -- clickhouse-client \
-q "SELECT * FROM otel_logs FORMAT Native" > backup.native
# Uninstall old chart
helm uninstall old-clickstack
# Install new chart
helm install my-clickstack clickstack/clickstack -f values.yaml
Frequently Asked Questions
Can ClickStack replace Datadog?
For self-hosted observability, yes. ClickStack covers logs, traces, metrics, and session replays — the core Datadog features. You lose Datadog’s managed infrastructure, 700+ integrations, and polished alerting. You gain full data ownership, no per-host pricing, and the ability to run SQL against your telemetry.
Does ClickStack work with existing Prometheus metrics?
Yes. The OTel Collector can scrape Prometheus endpoints using the prometheus receiver and forward metrics to ClickHouse. If you are already running Prometheus on Kubernetes, you can run both in parallel during migration.
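A minimal collector fragment for that setup might look like the following. The scrape target and interval are placeholders, and the batch processor and clickhouse exporter are assumed to be configured as shown earlier.

```yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: my-app
          scrape_interval: 30s
          static_configs:
            - targets: ["my-app.default.svc.cluster.local:9090"]

service:
  pipelines:
    metrics:
      receivers: [prometheus]
      processors: [batch]
      exporters: [clickhouse]
```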
What is the difference between ClickStack and HyperDX?
HyperDX is the UI and query engine. ClickStack is the full bundled platform that includes HyperDX, ClickHouse, and the OTel Collector. Think of HyperDX as the frontend and ClickStack as the complete package.
How much does ClickStack cost to run?
ClickStack itself is free and MIT-licensed. Your cost is infrastructure — compute for ClickHouse and the OTel Collector, plus storage (local SSD or S3). For a medium production workload ingesting 50 GB/day, expect roughly $500–800/month on AWS, significantly less than equivalent Datadog pricing.
Can I use ClickStack with ClickHouse Cloud?
Yes. Disable the bundled ClickHouse in the Helm chart and point HyperDX at your ClickHouse Cloud instance. This gives you managed ClickHouse with automatic scaling while keeping the rest of the stack self-hosted.
Get Expert Help With Your Observability Stack
Setting up ClickStack is the easy part. Designing an observability strategy that scales with your infrastructure — retention policies, alert routing, cost management, and cross-team adoption — is where most teams need guidance.
Our team provides comprehensive monitoring and observability consulting to help you:
- Design and deploy production-grade observability platforms using ClickStack, Grafana, or Prometheus
- Migrate from expensive SaaS tools like Datadog to self-hosted alternatives without losing visibility
- Optimise ClickHouse performance for high-cardinality telemetry data at scale
We have deployed observability stacks processing billions of events across production Kubernetes clusters.