The debate about whether databases belong on Kubernetes is over. According to the Data on Kubernetes 2024 Report, 72% of organisations now use Kubernetes for database management, up from a minority just a few years ago. Meanwhile, 98% of respondents run data-intensive workloads on cloud-native platforms. The question has shifted from “should we?” to “how do we do it properly?”
At Tasrie IT Services, we have deployed and managed databases on over 200 Kubernetes clusters for clients across financial services, healthcare, e-commerce, and SaaS. We have seen what works in production and what causes 3am pages. This guide distils our hands-on experience into a practical reference covering StatefulSets, database operators, storage architecture, cost analysis, and a decision framework for choosing between managed services and Kubernetes-native databases.
StatefulSet Fundamentals
A StatefulSet is the Kubernetes workload controller purpose-built for stateful applications. Unlike Deployments, which treat pods as interchangeable cattle, StatefulSets treat each pod as a distinct entity with a stable identity that persists across restarts and rescheduling.
Stable Network Identities
Every pod in a StatefulSet receives a predictable name following the pattern $(statefulset-name)-$(ordinal). A PostgreSQL StatefulSet named pg-cluster produces pods named pg-cluster-0, pg-cluster-1, and pg-cluster-2. This predictability is critical for databases: your application can always connect to the primary at pg-cluster-0 and read replicas at pg-cluster-1 or pg-cluster-2.
StatefulSets require a Headless Service (clusterIP: None) that provides DNS entries for each pod. The resulting DNS pattern is:
$(pod-name).$(service-name).$(namespace).svc.cluster.local
This means pg-cluster-0.pg-service.databases.svc.cluster.local always resolves to the same logical pod, regardless of which node it runs on.
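For reference, a minimal headless Service matching this example might look as follows — a sketch in which the `databases` namespace, the `app: pg-cluster` selector, and the port are assumptions to adapt to your own labels:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: pg-service
  namespace: databases
spec:
  clusterIP: None        # headless: per-pod DNS records instead of a virtual IP
  selector:
    app: pg-cluster
  ports:
  - name: postgres
    port: 5432
```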
Ordered Deployment and Scaling
By default, StatefulSets use the OrderedReady pod management policy. Pods are created sequentially — pod-0 must be running and ready before pod-1 starts. Termination happens in reverse order. This matters for databases where the primary must initialise before replicas attempt to join the cluster.
For databases with built-in cluster membership protocols (CockroachDB, TiDB), you can use the Parallel policy instead, allowing all pods to start simultaneously and reducing scale-up time considerably.
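Switching policies is a one-line change in the StatefulSet spec:

```yaml
# StatefulSet spec fragment: create and delete all pods in parallel.
# Safe only for databases that manage their own cluster membership.
spec:
  podManagementPolicy: Parallel
```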
VolumeClaimTemplates
The defining storage feature of StatefulSets is volumeClaimTemplates. Rather than sharing a single PersistentVolumeClaim across all pods, each pod gets its own dedicated PVC:
```yaml
volumeClaimTemplates:
- metadata:
    name: data
  spec:
    accessModes: ["ReadWriteOnce"]
    storageClassName: gp3-encrypted
    resources:
      requests:
        storage: 100Gi
```
When pg-cluster-0 is created, Kubernetes provisions a PVC named data-pg-cluster-0. If the pod is deleted and recreated, it reattaches to the same PVC, preserving all data. This per-pod storage isolation is non-negotiable for databases where each replica maintains its own data directory.
One important caveat: Kubernetes does not allow you to modify volumeClaimTemplates after creation. If you need to resize volumes, the workaround is to manually patch each PVC’s spec.resources.requests.storage, delete the StatefulSet with --cascade=orphan (pods keep running with zero downtime), update the YAML with the new volume size, and reapply. It is awkward, but it works reliably in production.
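The workaround can be sketched as a short kubectl sequence. All names here (`pg-cluster`, the `data` claim, the 200Gi target, `pg-cluster.yaml`) are illustrative, and the StorageClass must have `allowVolumeExpansion: true`:

```shell
# 1. Grow each PVC in place (the CSI driver expands the underlying volume)
for i in 0 1 2; do
  kubectl patch pvc "data-pg-cluster-$i" \
    -p '{"spec":{"resources":{"requests":{"storage":"200Gi"}}}}'
done

# 2. Delete only the StatefulSet object; pods and PVCs keep running
kubectl delete statefulset pg-cluster --cascade=orphan

# 3. Re-apply the manifest with the updated volumeClaimTemplates size
kubectl apply -f pg-cluster.yaml
```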
StatefulSet vs Deployment: When to Use Which
We regularly encounter teams using Deployments for stateful workloads because they are more familiar. This table clarifies when each controller is appropriate:
| Feature | Deployment | StatefulSet |
|---|---|---|
| Pod naming | Random hash (e.g., app-7f8d9c) | Ordinal index (e.g., app-0, app-1) |
| Storage | Shared PVC across all pods | Dedicated PVC per pod |
| Scaling order | Simultaneous | Sequential (ordered) |
| Network identity | Ephemeral, changes on restart | Stable DNS per pod |
| Use case | Stateless APIs, web servers | Databases, message queues, consensus systems |
| Pod replacement | New pod, new identity | Same ordinal, same PVC, same DNS |
Use a Deployment when your application stores no local state, all pods are identical, and any pod can handle any request. Use a StatefulSet when pods need stable identities, ordered startup or shutdown, or per-pod persistent storage.
That said, for production database workloads, raw StatefulSets alone are rarely sufficient. That is where operators come in.
Why Operators Are Better Than Raw StatefulSets for Databases
A StatefulSet gives you stable identities and persistent storage. It does not give you automated failover, backup scheduling, point-in-time recovery, connection pooling, or rolling upgrades that respect replication lag. These are all essential for running databases in production.
Database operators extend the Kubernetes API with Custom Resource Definitions (CRDs) that encode operational knowledge. Instead of writing shell scripts to handle primary promotion when a pod fails, the operator watches the cluster state and acts autonomously. Here is what a mature database operator handles that raw StatefulSets cannot:
- Automated failover: Detects primary failure and promotes a replica within seconds
- Backup and PITR: Schedules base backups and continuous WAL archiving to object storage
- Replica management: Adds and removes read replicas declaratively
- Configuration management: Applies PostgreSQL/MySQL configuration changes safely with rolling restarts
- Version upgrades: Performs minor and major version upgrades with automated pre-flight checks
- Connection pooling: Integrates PgBouncer or ProxySQL as sidecar containers
- Monitoring integration: Exposes Prometheus metrics endpoints automatically
In our experience, teams that attempt to run production databases with raw StatefulSets inevitably build a bespoke operator over time — just one that is untested, undocumented, and maintained by a single engineer. Use an established operator from the start.
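To make this concrete, here is a sketch of how one of these concerns — connection pooling — is expressed declaratively with CloudNativePG's Pooler resource. The cluster name, instance count, and PgBouncer parameters are illustrative:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Pooler
metadata:
  name: pg-cluster-pooler-rw
spec:
  cluster:
    name: pg-cluster          # existing CloudNativePG Cluster to front
  instances: 2
  type: rw                    # route pooled connections to the primary
  pgbouncer:
    poolMode: transaction
    parameters:
      max_client_conn: "500"
```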
The Database Operator Landscape in 2026
The operator ecosystem has matured considerably. Here is our assessment of the leading options across database engines, based on what we deploy for clients.
PostgreSQL Operators
PostgreSQL has the richest operator ecosystem. Four operators dominate production deployments:
| Operator | Maintainer | HA Approach | Key Differentiator |
|---|---|---|---|
| CloudNativePG | EDB / CNCF | Built-in Instance Manager | CNCF Sandbox project, no Patroni dependency, fastest-growing community (4,300+ GitHub stars) |
| Zalando Postgres Operator | Zalando | Patroni-based | Battle-tested at scale since 2017, but community activity is declining |
| CrunchyData PGO | Crunchy Data | Patroni-based | Enterprise-focused, comprehensive monitoring integration |
| Percona Operator for PostgreSQL | Percona | Patroni-based | Multi-database vendor with unified tooling for PostgreSQL, MySQL, and MongoDB |
Our recommendation: For new deployments in 2026, we default to CloudNativePG. Its architecture eliminates the Patroni dependency, reducing moving parts. It is a CNCF project with strong community momentum, and its declarative backup configuration with native PITR support is production-ready. For organisations already invested in the Percona ecosystem across multiple database engines, the Percona Operator provides a consistent management experience.
MySQL Operators
| Operator | Key Features |
|---|---|
| Percona Operator for MySQL | Synchronous replication, PITR, zero-downtime upgrades, automated backups |
| MySQL Operator for Kubernetes (Oracle) | Official Oracle operator, InnoDB Cluster management |
| Vitess | CNCF Graduated project, horizontal sharding for MySQL, used by YouTube and Slack at massive scale |
For MySQL workloads that need horizontal sharding, Vitess remains the gold standard. For standard MySQL with replication, the Percona Operator offers the most comprehensive feature set.
MongoDB Operators
| Operator | Key Features |
|---|---|
| MongoDB Controllers for Kubernetes (MCK) | Unified replacement for Community + Enterprise operators (launched 2025), sharding support, integrated backups |
| Percona Operator for MongoDB | Incremental physical backups, hidden nodes, PITR, multi-storage support, automated user management |
Note that MongoDB archived its Community Operator in December 2025, replacing it with the unified MCK. If you are still using the legacy operator, plan your migration now.
Redis and Caching
For Redis-compatible workloads, Dragonfly Operator has emerged as a compelling option with automated replication and failover. Redis Enterprise Operator remains available for organisations with existing Redis Enterprise licences.
Distributed SQL (Kubernetes-Native)
Databases designed from the ground up for Kubernetes deserve special mention:
- CockroachDB: Distributed SQL with built-in replication, designed for horizontal scaling by adding pods
- TiDB: NewSQL, MySQL-compatible, proven at scale (Ninja Van case study)
- YugabyteDB: Distributed SQL, PostgreSQL-compatible, strong Kubernetes integration
These databases handle sharding, replication, and failover internally, so their operators can be far simpler — making them excellent fits for Kubernetes.
Storage Considerations for Databases on Kubernetes
Storage is where database-on-Kubernetes deployments succeed or fail. Getting this right is more important than choosing the right operator.
CSI Drivers and StorageClasses
Every major cloud provider offers Container Storage Interface (CSI) drivers for their block storage:
- AWS: EBS CSI Driver (gp3 for general purpose, io2 for high IOPS)
- Azure: Azure Disk CSI Driver (Premium SSD v2 for databases)
- GCP: GCE Persistent Disk CSI Driver (pd-ssd for production)
Define explicit StorageClasses rather than relying on the default. A production database StorageClass should specify the volume type, enable encryption, and allow volume expansion:
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted-expandable
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  iops: "6000"
  throughput: "250"
allowVolumeExpansion: true
reclaimPolicy: Retain
volumeBindingMode: WaitForFirstConsumer
```
Set reclaimPolicy: Retain for database volumes. The default Delete policy destroys the underlying volume when a PVC is removed, which is appropriate for ephemeral workloads but catastrophic for databases. Also note that StatefulSets do not delete PVCs on scale-down or deletion — stale PVCs accumulate silently and incur cost. Build automation to audit and clean up orphaned PVCs.
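One way to audit for orphans is to compare the PVCs in a namespace against the claims actually mounted by running pods. A sketch, assuming a `databases` namespace and requiring kubectl and jq:

```shell
NS=databases

# Claims referenced by any pod's volumes
MOUNTED=$(kubectl get pods -n "$NS" -o json \
  | jq -r '.items[].spec.volumes[]?
           | select(.persistentVolumeClaim)
           | .persistentVolumeClaim.claimName' | sort -u)

# Any PVC not in that list is a cleanup candidate
for pvc in $(kubectl get pvc -n "$NS" -o jsonpath='{.items[*].metadata.name}'); do
  echo "$MOUNTED" | grep -qx "$pvc" || echo "orphaned: $pvc"
done
```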
For organisations requiring third-party storage solutions, the landscape includes Portworx (enterprise-grade with application-aware snapshots), Longhorn (open-source simplicity from SUSE), Rook-Ceph (distributed storage for large-scale clusters), and OpenEBS (container-attached storage with the Mayastor engine for high-performance needs).
Backup Strategies and PITR
A persistent volume is not a backup. Volumes can be corrupted, accidentally deleted, or affected by storage-layer failures. Every database on Kubernetes needs a backup strategy that includes:
- Operator-native backups: CloudNativePG and Percona operators support scheduled base backups with continuous WAL/binlog archiving to S3, GCS, or Azure Blob Storage. This enables point-in-time recovery (PITR) to any second within your retention window.
- CSI volume snapshots: Fast, storage-native snapshots via CSI drivers. Useful for quick clones and pre-upgrade checkpoints, but not a substitute for application-consistent backups.
- Velero: Cluster-level backup of Kubernetes resources and PVC snapshots. Essential for disaster recovery at the cluster level, but database-aware operators provide finer-grained recovery.
- Application-consistent hooks: Use pre-snapshot and post-snapshot hooks to quiesce databases (flush writes, create consistent checkpoints) before taking volume snapshots.
We recommend a layered approach: operator-managed PITR as the primary recovery mechanism, CSI snapshots for quick rollbacks, and Velero for full cluster DR. Test your restores monthly — a backup you have never restored is not a backup.
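For the Velero layer, a cluster-level backup can itself be declared as a resource. A minimal sketch, in which the `databases` namespace, the 02:00/03:00-style schedule, and the 30-day TTL are assumptions:

```yaml
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: databases-nightly
  namespace: velero
spec:
  schedule: "0 3 * * *"        # standard five-field cron: daily at 03:00
  template:
    includedNamespaces:
    - databases
    snapshotVolumes: true      # also snapshot PVCs via the CSI/volume plugin
    ttl: 720h                  # retain each backup for 30 days
```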
Decision Framework: Managed Services vs Kubernetes-Native
Not every database belongs on Kubernetes. Here is the framework we use with clients to make this decision.
Choose Managed Services (RDS, Cloud SQL, Atlas) When
- Your team has limited Kubernetes or DBA expertise and cannot invest in building it
- You have simple, predictable database requirements with moderate scale
- You are an early-stage startup where operational simplicity outweighs cost
- Strict compliance requirements are easier to satisfy with managed service certifications
- You are committed to a single cloud provider with no portability needs
Choose Kubernetes-Native Databases When
- Cost optimisation is critical: Managed services typically cost 2x or more compared to equivalent Kubernetes deployments (Percona analysis)
- You run at scale with many database instances: Operator automation pays for itself when managing tens or hundreds of databases
- Multi-cloud or hybrid-cloud portability is a strategic requirement
- Your team has strong Kubernetes and DBA expertise (or is willing to invest in it)
- You need fine-grained control over configuration, upgrade timing, and data locality
- You are already running everything else on Kubernetes and want a unified platform
Databases Well-Suited for Kubernetes
Databases with built-in clustering, sharding, and replication work best: CockroachDB, TiDB, Vitess, MongoDB, and Cassandra handle node membership natively. Caching layers like Redis and Memcached are straightforward. PostgreSQL and MySQL are production-ready with mature operators like CloudNativePG and Percona.
Databases That Remain Challenging
Legacy single-node databases without clustering support, workloads with extreme IOPS requirements where cloud block storage becomes a bottleneck, and databases requiring exotic storage configurations (direct-attached NVMe with shared-nothing architectures) still warrant careful evaluation. For a deeper exploration of cloud-native database selection criteria, see our cloud native database guide.
Cost Comparison: Real Numbers
Cost is often the primary driver for moving databases to Kubernetes. Let us look at concrete figures.
RDS vs Self-Managed PostgreSQL on EKS
Consider a PostgreSQL deployment with 8 vCPUs, 32 GB RAM, 3 TB storage, and high availability:
| Component | AWS RDS (Multi-AZ) | Self-Managed on EKS |
|---|---|---|
| Compute | ~$1,200/month (db.r6g.2xlarge) | ~$550/month (r6g.2xlarge reserved) |
| Storage (3 TB gp3) | ~$900/month | ~$240/month (EBS gp3) |
| Backup storage | ~$200/month | ~$60/month (S3) |
| Operator/tooling | Included | $0 (open-source operators) |
| Monthly total | ~$2,500/month | ~$990/month |
| Annual total | ~$30,000/year | ~$11,880/year |
That is roughly a 60% cost reduction. Simplyblock’s analysis arrives at similar figures, and Percona’s cost calculator shows that clients typically achieve 50% savings within the first year.
However, these numbers do not include the human cost. Self-managed databases require engineering time for setup, monitoring, incident response, and upgrades. At smaller scale (one to three databases), the operational overhead can negate the infrastructure savings. The economics become compelling at scale — once you have the operator expertise and automation in place, adding the tenth or fiftieth database instance is nearly free from an operational perspective.
For organisations looking to optimise Kubernetes spending holistically, our Kubernetes cost optimisation guide covers strategies beyond database workloads.
StatefulSet Patterns and Anti-Patterns
Over hundreds of deployments, we have catalogued the practices that correlate with stable database operations and the mistakes that lead to incidents.
Patterns (Do This)
Set Pod Disruption Budgets (PDBs): Protect database quorum during node maintenance. For a three-node PostgreSQL cluster, set minAvailable: 2 to ensure Kubernetes never evicts more than one pod simultaneously during rolling updates or node drains.
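A minimal PDB for the three-node cluster described above might look like this — the `app: pg-cluster` label is an assumption; match whatever labels your operator applies to database pods:

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: pg-cluster-pdb
spec:
  minAvailable: 2              # never allow voluntary eviction below quorum
  selector:
    matchLabels:
      app: pg-cluster
```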
Use pod anti-affinity: Spread database replicas across nodes and availability zones. Without this, all three replicas can land on the same node, creating a single point of failure:
```yaml
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: pg-cluster
      topologyKey: kubernetes.io/hostname
```
Use topology spread constraints: For even distribution across failure domains, combine anti-affinity with topologySpreadConstraints targeting availability zones.
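A pod-template fragment sketching zone-level spreading for the same cluster (the `app: pg-cluster` label is again an assumption):

```yaml
topologySpreadConstraints:
- maxSkew: 1                                 # zones may differ by at most one replica
  topologyKey: topology.kubernetes.io/zone
  whenUnsatisfiable: DoNotSchedule
  labelSelector:
    matchLabels:
      app: pg-cluster
```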
Choose the right update strategy: For databases, prefer OnDelete update strategy, which requires you to manually delete pods to trigger updates. This gives you control to verify each replica after upgrade before proceeding. Alternatively, use RollingUpdate with partition set to N-1 for canary upgrades — only the highest-ordinal pod updates first.
Set generous terminationGracePeriodSeconds: Database processes need time to flush writes, complete WAL archiving, and shut down cleanly. The default 30 seconds is often insufficient. We typically set 300-600 seconds for production databases.
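A StatefulSet fragment sketching the two recommendations above — a partitioned canary rollout and a longer grace period (the values are illustrative, not prescriptive):

```yaml
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2             # with 3 replicas, only pod-2 updates first (canary)
  template:
    spec:
      terminationGracePeriodSeconds: 600   # time to flush writes and archive WAL
```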
Use init containers for initialisation: Schema setup, configuration rendering, and permission fixes belong in init containers, not in the main container’s entrypoint.
Anti-Patterns (Avoid This)
Using Deployments for databases: Pods lose their identity on restart, making it impossible to distinguish primary from replica. This is the most common mistake we encounter. If your team is making this error, our common Kubernetes mistakes guide covers this and other frequent pitfalls.
Storing data in pod-local storage: Without PVCs, all data is lost when a pod restarts. We have seen production databases running on emptyDir volumes — do not let this happen to you.
Ignoring PVC cleanup: StatefulSets do not delete PVCs on scale-down. Scale a cluster from five replicas to three, and you still pay for five volumes. Build automation to identify and remove orphaned PVCs.
Using RollingUpdate without partition: All replicas updating simultaneously can cause split-brain conditions or temporary loss of quorum. Always use partitioned rolling updates for databases.
Skipping backup configuration: PVC persistence is not a backup strategy. Volumes can be corrupted or accidentally deleted. Every database must have operator-managed backups with tested restore procedures.
Insufficient resource requests: Not setting CPU and memory requests (or setting them too low) invites noisy-neighbour problems. Database pods compete with other workloads for resources, leading to unpredictable latency. Always set both requests and limits based on actual workload profiling.
Day-2 Operations: Backup, Monitoring, and Upgrades
Deploying a database is day one. Keeping it running reliably for months and years is where the real work begins.
Backup and Recovery
Configure operator-native backups from day one. With CloudNativePG, a backup schedule looks like this:
```yaml
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
  name: pg-production
spec:
  instances: 3
  backup:
    barmanObjectStore:
      destinationPath: s3://pg-backups/production/
      s3Credentials:
        accessKeyID:
          name: s3-creds
          key: ACCESS_KEY_ID
        secretAccessKey:
          name: s3-creds
          key: SECRET_ACCESS_KEY
    retentionPolicy: "30d"
```
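The Cluster configuration defines where backups go; a companion ScheduledBackup resource triggers the base backups themselves. A sketch of a nightly schedule, assuming the `pg-production` cluster above:

```yaml
apiVersion: postgresql.cnpg.io/v1
kind: ScheduledBackup
metadata:
  name: pg-production-nightly
spec:
  schedule: "0 0 2 * * *"      # six-field cron (seconds first): daily at 02:00
  backupOwnerReference: self
  cluster:
    name: pg-production
```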
Run restore drills quarterly. Document the restore procedure, measure the time it takes, and ensure it meets your RTO and RPO targets. Our Kubernetes disaster recovery playbook provides a structured approach to DR testing.
Monitoring
Every database operator exposes Prometheus metrics. At minimum, monitor:
- Replication lag: Alert when replicas fall behind the primary
- Connection pool saturation: Alert before connections are exhausted
- Storage usage and growth rate: Alert at 80% capacity with projected time to full
- WAL archiving status: Alert on archiving failures (they indicate backup gaps)
- Query latency (p99): Alert on latency regressions
Integrate these with your existing Kubernetes monitoring stack rather than building separate dashboards for each database.
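As one example of wiring an alert into that stack, here is a sketch of a replication-lag rule using the Prometheus Operator's PrometheusRule CRD. It assumes CloudNativePG's metric naming, and the 30-second threshold is illustrative:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: pg-replication-alerts
spec:
  groups:
  - name: postgres
    rules:
    - alert: PostgresReplicationLagHigh
      # cnpg_pg_replication_lag is CloudNativePG's lag metric (seconds);
      # substitute your operator's equivalent metric name
      expr: cnpg_pg_replication_lag > 30
      for: 5m
      labels:
        severity: warning
      annotations:
        summary: "Replica lagging more than 30s behind the primary"
```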
Upgrades
Minor version upgrades (e.g., PostgreSQL 16.2 to 16.3) are typically automated by operators with rolling restarts. Major version upgrades (e.g., PostgreSQL 15 to 16) require more care:
- Take a full backup before starting
- Test the upgrade on a clone of the production database
- Use the operator's built-in upgrade mechanism (CloudNativePG supports in-place major upgrades)
- Monitor replication lag and application errors during the rollout
- Keep the previous backup available for rollback
Plan maintenance windows for major upgrades even if the operator promises zero downtime. We have seen edge cases where application connection handling does not cope gracefully with primary failover during upgrades.
Adoption Statistics: The Industry Has Moved
The CNCF survey data and DoK 2024 Report paint a clear picture of where the industry stands:
- 93% of organisations now use, pilot, or evaluate Kubernetes
- 80% have deployed Kubernetes in production (up from 66% in 2023)
- 72% use Kubernetes for database management
- 67% use Kubernetes for analytics workloads
- 54% run AI/ML workloads on Kubernetes
- 35% cite technical complexity as the top remaining barrier
The shift is not hypothetical. Major organisations — financial institutions, healthcare providers, and technology companies — are running production databases on Kubernetes today. The tooling has caught up with the ambition. CNCF-backed projects like CloudNativePG and Vitess, combined with mature commercial operators from Percona and Crunchy Data, have closed the operational gap that made databases on Kubernetes risky just a few years ago.
As The New Stack reported, Kubernetes has “finally solved its biggest problem” — managing databases. The combination of mature operators, reliable CSI storage, and battle-tested patterns means organisations that avoid running databases on Kubernetes are increasingly the outliers, not the norm.
Getting Started: A Pragmatic Path
If your organisation is considering databases on Kubernetes, here is the path we recommend:
- Start with non-production: Deploy a staging database using CloudNativePG or Percona Operator. Learn the operator's CRDs, backup configuration, and failover behaviour.
- Invest in storage: Define production StorageClasses with encryption, expansion, and appropriate reclaim policies. Benchmark IOPS to ensure your storage tier meets database requirements.
- Build observability first: Set up Prometheus metrics collection and Grafana dashboards for database-specific metrics before going to production.
- Run gameday exercises: Simulate node failures, pod evictions, and storage issues. Verify that the operator handles failover correctly and that your backups restore successfully.
- Migrate incrementally: Start with lower-risk databases (development tools, internal services) before migrating customer-facing production databases.
- Document everything: Runbooks for common operations (scaling, backup restore, major upgrades) should exist before production go-live.
Run Databases on Kubernetes With Confidence
Running databases on Kubernetes is no longer experimental — it is a proven approach used by 72% of organisations managing data workloads. But the difference between a successful deployment and an operational nightmare comes down to architecture decisions made on day one: choosing the right operator, configuring storage correctly, implementing backup and recovery from the start, and building the team expertise to manage it all.
Our team provides comprehensive Kubernetes consulting services to help you:
- Design database architectures on Kubernetes with the right operators, storage tiers, and high-availability configurations
- Migrate from managed services like RDS and Cloud SQL to self-managed Kubernetes-native databases, cutting infrastructure costs by 50% or more
- Implement day-2 operations including automated backups with PITR, monitoring integration, and tested disaster recovery playbooks
- Train your engineering team to operate databases on Kubernetes confidently and independently
We have deployed databases on Kubernetes across AWS EKS, Azure AKS, and Google GKE for organisations ranging from early-stage startups to enterprises managing petabytes of data.