With 82% of container users now running Kubernetes in production according to the 2025 CNCF Annual Survey, organisations face a critical architectural decision: how do you safely share clusters across teams, environments, and even external customers? Over the past three years, our team has designed and secured multi-tenant architectures for more than 20 shared Kubernetes clusters spanning financial services, healthcare, and SaaS platforms. This guide distils everything we learned into a single, actionable reference.
Multi-tenancy is not a single feature you toggle on. It is a layered discipline that touches namespaces, RBAC, networking, resource management, pod security, and cost allocation. Get any one layer wrong, and a noisy neighbour can starve a production workload, a compromised pod can pivot laterally, or your finance team can never answer the question “how much does Team X actually cost us?”
In this guide we walk through the tenancy models available today, the isolation layers you must implement, the open-source tools that make multi-tenancy practical, and the managed-service nuances for EKS, AKS, and GKE.
Why Multi-Tenancy Matters in 2026
Running a dedicated cluster for every team or environment sounds safe, but it introduces serious operational overhead. Each cluster carries its own control plane costs, its own upgrade cadence, its own monitoring stack, and its own set of credentials to rotate. On EKS alone, the management plane costs USD 0.10 per hour per cluster — roughly USD 73 per month before you run a single workload. Multiply that by dozens of teams and you are looking at significant spend just to keep the lights on.
Shared clusters, by contrast, consolidate resources, reduce the blast radius of operational toil, and give platform teams a single pane of glass for governance. When done correctly, multi-tenancy delivers three tangible outcomes:
- Cost efficiency — higher bin-packing ratios and shared infrastructure reduce per-tenant overhead.
- Operational simplicity — fewer clusters mean fewer upgrade windows, fewer certificate rotations, and fewer monitoring stacks.
- Developer self-service — tenants get isolated environments on demand without waiting for a new cluster to be provisioned.
The challenge is achieving these benefits without sacrificing isolation. That is what the rest of this guide addresses.
Three Tenancy Models
Not all multi-tenancy is the same. The official Kubernetes multi-tenancy documentation outlines a spectrum from soft isolation within a single cluster to fully separate clusters. We find it useful to think in terms of three models.
Namespaces-as-a-Service
This is the most common starting point. Each tenant receives one or more namespaces within a shared cluster. Isolation is enforced through Kubernetes-native primitives: RBAC, NetworkPolicies, ResourceQuotas, LimitRanges, and Pod Security Standards.
When to use it: Internal teams that trust each other at a basic level, development and staging environments, or organisations just beginning their multi-tenancy journey.
Limitations: Tenants share the API server, etcd, scheduler, and node kernel. A Kubernetes-level vulnerability or a misconfigured ClusterRole can cross namespace boundaries. Tenants cannot install their own CRDs or admission webhooks.
Control Planes-as-a-Service (Virtual Clusters)
Tools like vCluster create lightweight, fully functional Kubernetes clusters inside a host cluster namespace. Each virtual cluster has its own API server and its own control plane state, but workloads are scheduled onto the host cluster’s nodes. With over 40 million virtual clusters deployed by organisations including Adobe, CoreWeave, and NVIDIA, this model has proven itself at scale.
When to use it: Tenants that need their own CRDs, admission policies, or cluster-scoped resources. Platform teams building internal Kubernetes platforms (IKPs). Organisations that need stronger isolation than namespaces provide but do not want the overhead of separate clusters.
Limitations: Virtual clusters still share the host node kernel and, in the default “shared nodes” mode, the underlying network and storage. You must combine vCluster with node-level isolation (dedicated or private nodes) for hard multi-tenancy.
Clusters-as-a-Service
Each tenant gets a dedicated physical or managed cluster. A central platform team automates provisioning, upgrades, and policy enforcement across the fleet.
When to use it: Untrusted tenants such as external customers, workloads with strict regulatory requirements (PCI DSS, HIPAA), or use cases where kernel-level isolation is non-negotiable.
Limitations: Highest cost and operational complexity. Every cluster is an additional upgrade, monitoring, and security surface.
Decision Framework
| Factor | Namespaces | Virtual Clusters | Dedicated Clusters |
|---|---|---|---|
| Isolation strength | Soft | Medium-hard | Hard |
| Tenant CRD support | No | Yes | Yes |
| Control plane cost per tenant | Zero | Low (lightweight API server) | High (full control plane) |
| Operational overhead | Low | Medium | High |
| Best for | Internal teams, dev/staging | Platform engineering, SaaS back-ends | External customers, regulated workloads |
Most organisations we work with use a combination: virtual clusters for production tenants that need CRD freedom, namespaces for development environments, and dedicated clusters only for the most sensitive or regulated workloads. If you are evaluating these options across cloud providers, our comparison of EKS, AKS, and GKE covers how each platform handles multi-tenancy natively.
Soft vs Hard Multi-Tenancy
The Kubernetes documentation distinguishes between soft and hard multi-tenancy based on the trust model between tenants.
Soft multi-tenancy assumes tenants are cooperative. They may be different teams within the same organisation who do not intend to attack each other but could accidentally interfere through resource contention or misconfiguration. Namespace-level isolation with RBAC, quotas, and network policies is usually sufficient.
Hard multi-tenancy assumes tenants are mutually untrusted. This is the model you need when hosting workloads for external customers, when dealing with compliance frameworks that mandate workload separation, or when a breach in one tenant must never allow lateral movement to another. Hard multi-tenancy requires stronger boundaries: separate control planes (virtual or physical clusters), node isolation, and often kernel-level sandboxing with gVisor or Kata Containers.
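Kernel-level sandboxing is normally wired in through a RuntimeClass. The following is a minimal sketch for gVisor, assuming the `runsc` handler has already been configured in the container runtime on the nodes that host untrusted tenants; the pod name and image are placeholders:

```yaml
# RuntimeClass exposing the gVisor runtime to the scheduler
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc            # must match the handler configured in containerd/CRI-O
---
# A tenant pod opting into the sandbox
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-app     # hypothetical workload
  namespace: tenant-a
spec:
  runtimeClassName: gvisor
  containers:
    - name: app
      image: nginx:1.27   # placeholder image
```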
In practice, we see most organisations start with soft multi-tenancy and graduate workloads to harder isolation as trust requirements increase. The important thing is to be explicit about your trust model. Document it, map it to isolation controls, and enforce it through policy.
Five Layers of Tenant Isolation
Effective multi-tenancy requires defence in depth. No single Kubernetes primitive provides complete isolation. You need all five layers working together.
Layer 1: Namespace Boundaries
Namespaces are the foundational unit of tenancy in Kubernetes. Every tenant resource — Deployments, Services, ConfigMaps, Secrets — lives within a namespace, and most Kubernetes RBAC and policy mechanisms operate at the namespace level.
For a deeper understanding of how namespaces function and why they matter, see our guide to Kubernetes namespaces.
Best practices we enforce:
- One namespace per tenant per environment (e.g., `team-a-prod`, `team-a-staging`).
- Standardised labelling: `tenant`, `environment`, `cost-centre`.
- Automated namespace provisioning through GitOps or a self-service portal.
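In practice, the provisioning automation stamps out namespaces that already carry the standard labels. A minimal sketch, with a hypothetical cost-centre code:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a-prod
  labels:
    tenant: team-a
    environment: prod
    cost-centre: cc-1234   # hypothetical internal billing code
```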
Layer 2: RBAC (Role-Based Access Control)
RBAC determines what each tenant can do. The principle is straightforward: tenants should have the minimum permissions necessary to operate within their namespaces and zero visibility into other tenants’ resources.
Critical rules:
- Never grant `cluster-admin` to tenant users. This is one of the most common anti-patterns we encounter.
- Use `Role` and `RoleBinding` (namespace-scoped), not `ClusterRole` and `ClusterRoleBinding`, for tenant permissions.
- Restrict the ability to create or modify NetworkPolicies, ResourceQuotas, and LimitRanges to platform administrators.
- Audit RBAC bindings regularly. Tools like `kubectl auth can-i --list --as=tenant-user` help verify effective permissions.
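To make the namespace-scoped pattern concrete, here is a minimal sketch of a tenant Role and RoleBinding; the group name and verb list are illustrative and should be tightened to match your own tenant personas:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-developer
  namespace: team-a-prod
rules:
  - apiGroups: [""]
    resources: ["pods", "services", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
  - apiGroups: ["apps"]
    resources: ["deployments", "statefulsets"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-developer-binding
  namespace: team-a-prod
subjects:
  - kind: Group
    name: team-a-developers          # hypothetical group from your identity provider
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-developer
  apiGroup: rbac.authorization.k8s.io
```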
If you are not yet confident in your Kubernetes security posture, RBAC is the first place to look.
Layer 3: Network Policies
By default, Kubernetes allows all pod-to-pod communication across all namespaces. This is a significant risk in a multi-tenant environment. A compromised pod in one tenant’s namespace can freely communicate with pods in every other namespace unless you explicitly restrict it.
```yaml
# Allow ingress only from namespaces belonging to the same tenant,
# denying cross-namespace ingress from everyone else
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-cross-namespace-ingress
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              tenant: team-a
```
We apply a default-deny ingress policy to every tenant namespace at provisioning time, then layer on explicit allow rules for legitimate cross-namespace traffic such as shared ingress controllers or monitoring agents. We cover network isolation in greater depth in the dedicated section below.
Layer 4: Resource Quotas and LimitRanges
Without resource constraints, a single tenant can consume all available CPU, memory, or storage on shared nodes, starving other tenants. ResourceQuotas cap the total resources a namespace can consume. LimitRanges set default and maximum resource requests and limits for individual pods.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: tenant-a-quota
  namespace: tenant-a
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi
    limits.cpu: "16"
    limits.memory: 32Gi
    persistentvolumeclaims: "10"
    pods: "50"
```
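The quota caps the namespace as a whole; a companion LimitRange supplies per-container defaults and ceilings so that pods which omit requests still receive sensible values and no single container can claim an outsized share. A sketch with illustrative numbers:

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: tenant-a-limits
  namespace: tenant-a
spec:
  limits:
    - type: Container
      defaultRequest:     # applied when a container omits resource requests
        cpu: 100m
        memory: 128Mi
      default:            # applied when a container omits resource limits
        cpu: 500m
        memory: 512Mi
      max:                # hard per-container ceiling
        cpu: "2"
        memory: 4Gi
```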
Why this matters: We have seen incidents where a tenant’s runaway batch job allocated hundreds of pods, triggering node autoscaling that drove up costs and degraded performance for every other tenant on the cluster. Quotas prevent this.
Layer 5: Pod Security Standards
Pod Security Standards (PSS) define three profiles — Privileged, Baseline, and Restricted — that control what pods are allowed to do at the kernel and container runtime level.
For multi-tenant clusters, we enforce the Restricted profile on all tenant namespaces. This prevents pods from running as root, using host networking, mounting host paths, or escalating privileges. Platform namespaces (monitoring, ingress controllers) may use the Baseline or Privileged profile where justified, but tenant workloads have no business running privileged containers.
Pod Security Admission (PSA) is the built-in enforcement mechanism since Kubernetes 1.25, but we typically supplement it with a policy engine like Kyverno or OPA Gatekeeper for finer-grained control.
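Enforcement itself comes down to namespace labels. A minimal sketch of a tenant namespace pinned to the Restricted profile through PSA:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: tenant-a
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```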
Tools Compared
The Kubernetes ecosystem offers several tools purpose-built for multi-tenancy. Here is how we evaluate them.
vCluster
vCluster creates virtual Kubernetes clusters inside host cluster namespaces. Each virtual cluster runs its own API server (typically k3s or k0s) and stores its state independently, while workloads are synced down to the host cluster for scheduling.
Strengths: Tenants get full cluster-admin within their virtual cluster. CRDs, admission webhooks, and namespaces are fully isolated. Lightweight enough to spin up in seconds. The CNCF has highlighted vCluster as a leading solution for multi-tenancy challenges.
Considerations: Requires careful node isolation configuration for hard multi-tenancy. The Pro tier adds features like centralised management, sleep mode, and SSO.
Capsule
Capsule is a CNCF project that introduces a “Tenant” custom resource, grouping multiple namespaces under a single tenant entity. It automatically applies RBAC, quotas, network policies, and LimitRanges to all namespaces in a tenant.
Strengths: Lightweight, Kubernetes-native, no additional control planes. Excellent for namespace-as-a-service models. Integrates with Kyverno for policy enforcement. Prevents cross-tenant RBAC leakage by design.
Considerations: Does not provide CRD isolation. Tenants still share the same API server and etcd. Better suited for soft multi-tenancy than hard.
Hierarchical Namespaces (HNC)
The Hierarchical Namespace Controller was developed by the Kubernetes Multi-Tenancy Working Group. It allowed parent-child namespace relationships with policy inheritance.
Current status: The HNC repository was archived in April 2025 and is no longer maintained. We recommend migrating to Capsule or vCluster if you are currently using HNC. Its core limitation was that it did not provide stronger isolation than standard namespaces and struggled with quota inheritance across hierarchies.
Kyverno
Kyverno is a Kubernetes-native policy engine that uses YAML-based policies to validate, mutate, and generate resources. In a multi-tenant context, Kyverno can enforce tenant labelling, block privilege escalation, auto-generate NetworkPolicies for new namespaces, and ensure every pod has resource limits.
Strengths: YAML-native policies lower the learning curve compared to Rego. Deep Kubernetes integration. Can generate default resources (quotas, network policies) automatically when a namespace is created.
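As a hedged illustration of that generation capability, the sketch below (modelled on the add-network-policy pattern in Kyverno's policy library) stamps a default-deny NetworkPolicy into every newly created namespace:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-deny
spec:
  rules:
    - name: default-deny-for-new-namespaces
      match:
        any:
          - resources:
              kinds:
                - Namespace
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny-all
        namespace: "{{request.object.metadata.name}}"
        synchronize: true    # recreate the policy if a tenant deletes it
        data:
          spec:
            podSelector: {}
            policyTypes:
              - Ingress
              - Egress
```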
OPA Gatekeeper
OPA Gatekeeper uses the Open Policy Agent engine with Rego-based policies. It provides constraint templates that can be reused across clusters.
Strengths: Extremely flexible policy language. Large community and policy library. Well-suited for organisations with complex compliance requirements. Integrates with external data sources for context-aware policies.
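For comparison, a Gatekeeper constraint requiring a tenant label on every namespace might look like the sketch below; it assumes a `K8sRequiredLabels` ConstraintTemplate (as in the Gatekeeper getting-started documentation) is already installed:

```yaml
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: namespaces-must-carry-tenant-label
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["tenant"]   # parameter shape depends on your ConstraintTemplate
```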
Comparison Table
| Tool | Approach | Isolation Level | CRD Isolation | Learning Curve | Best For |
|---|---|---|---|---|---|
| vCluster | Virtual clusters | Medium-hard | Yes | Medium | Platform engineering, SaaS |
| Capsule | Tenant CRD + namespaces | Soft | No | Low | Internal teams, namespace-as-a-service |
| HNC (retired) | Hierarchical namespaces | Soft | No | Low | Legacy only — migrate away |
| Kyverno | Policy engine | Complements others | N/A | Low | Policy enforcement, automation |
| OPA Gatekeeper | Policy engine | Complements others | N/A | Medium-high | Complex compliance requirements |
In practice, these tools are complementary. A typical production setup we deploy combines vCluster or Capsule for tenancy boundaries with Kyverno or Gatekeeper for policy enforcement.
Network Isolation Deep-Dive
Network isolation is where multi-tenancy most commonly fails. The default Kubernetes networking model is “allow all” — every pod can reach every other pod across every namespace. This is fine for a single-team cluster but dangerous in a shared environment.
The Default Allow-All Problem
When you create a namespace, Kubernetes applies no NetworkPolicies by default. A compromised pod in tenant-b can probe services in tenant-a, exfiltrate data, or pivot to cluster-internal services like the Kubernetes API server’s metrics endpoints. This violates the principle of least privilege at the network layer.
Deny-All-First Pattern
We apply a deny-all ingress and egress policy to every tenant namespace from the moment it is created:
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
```
From this baseline, we layer on explicit allow rules:
- Ingress from the shared ingress controller namespace.
- Egress to DNS (kube-dns) on port 53.
- Egress to specific external services the tenant needs.
- Ingress from monitoring agents (Prometheus scrape).
This inverts the default model from “allow unless denied” to “deny unless allowed,” which is the only safe posture for multi-tenant clusters.
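As an example of one of those allow rules, the sketch below permits DNS egress to kube-system; it relies on the `kubernetes.io/metadata.name` label that recent Kubernetes versions apply to every namespace automatically:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: tenant-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```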
Cilium vs Calico for Multi-Tenant Clusters
Both Cilium and Calico support Kubernetes NetworkPolicies, but they differ in architecture and capabilities.
Cilium is built on eBPF and excels at identity-aware security. Rather than relying solely on IP addresses, Cilium policies can target workloads by Kubernetes labels, service accounts, and namespaces. It supports L7 filtering (e.g., allowing HTTP GET but blocking POST to a specific path) and achieves up to 100 Gbps throughput in direct routing mode with sub-millisecond latency.
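As a hedged illustration of that L7 capability, a CiliumNetworkPolicy might restrict an API workload to GET requests on a specific path; the labels, port, and path below are hypothetical:

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: api-allow-get-only
  namespace: tenant-a
spec:
  endpointSelector:
    matchLabels:
      app: api                # hypothetical backend workload
  ingress:
    - fromEndpoints:
        - matchLabels:
            app: frontend     # hypothetical caller
      toPorts:
        - ports:
            - port: "8080"
              protocol: TCP
          rules:
            http:
              - method: GET
                path: "/reports/.*"   # POST and other paths are dropped
```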
Calico uses a mature iptables or nftables dataplane (with an optional eBPF mode). Its standout multi-tenancy feature is GlobalNetworkPolicy, which lets platform teams define cluster-wide guardrail policies that individual tenants cannot override. Calico achieves 80-90 Gbps throughput with nftables.
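A guardrail of that kind might look like the following sketch, which blocks tenant egress to the cloud metadata endpoint and passes every other decision down to namespace-level policies. Applying it assumes the Calico API server or calicoctl is available, and the selector and order values are illustrative:

```yaml
apiVersion: projectcalico.org/v3
kind: GlobalNetworkPolicy
metadata:
  name: tenant-guardrails
spec:
  order: 10                  # evaluated before namespace-scoped policies
  selector: has(tenant)      # applies to every workload carrying a tenant label
  types:
    - Egress
  egress:
    - action: Deny
      destination:
        nets:
          - 169.254.169.254/32   # cloud instance metadata endpoint
    - action: Pass               # defer everything else to tenant/namespace policies
```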
Our recommendation: For new multi-tenant clusters, we default to Cilium for its identity-based policies and superior observability through Hubble. For organisations with existing Calico deployments, upgrading to Calico’s eBPF dataplane provides most of the performance benefits without a full CNI migration.
If you are concerned about the broader security posture of your clusters, network isolation is a critical piece of the puzzle alongside secrets management and admission control.
Cost Allocation per Tenant
Multi-tenancy without cost visibility is a recipe for unchecked spending. If you cannot attribute costs to individual tenants, you cannot hold teams accountable, justify shared infrastructure investments, or identify waste.
Showback vs Chargeback
Showback reports costs to tenants for awareness without actual billing. It is a good starting point that builds cost consciousness across teams. Chargeback actually bills tenants for their usage, typically through internal cost transfers. It requires more precise allocation but drives stronger accountability.
We recommend starting with showback and graduating to chargeback once your labelling standards and allocation accuracy are mature.
Kubecost
Kubecost (now part of IBM/Apptio) is the most widely adopted Kubernetes cost management platform. It provides real-time cost allocation by namespace, label, controller, and pod. It reconciles with actual cloud billing data, accounting for reserved instances, savings plans, and spot pricing.
For multi-tenancy, Kubecost’s key feature is the ability to map costs to organisational units — teams, departments, products, or tenants — using Kubernetes labels and annotations. Its allocation API can feed data directly into internal billing systems.
OpenCost
OpenCost is the CNCF project that Kubecost originally developed and open-sourced. It provides real-time cost monitoring for cloud-native environments and is ideal for single-cluster visibility.
Key difference: OpenCost is free and vendor-neutral but lacks Kubecost’s multi-cluster UI, long-term data retention, SSO, and discount reconciliation. For organisations with a single shared cluster, OpenCost may be sufficient. For multi-cluster multi-tenant platforms, Kubecost’s enterprise features justify the investment.
Labelling Is Everything
Neither tool works without consistent labelling. We enforce these labels on every tenant resource through Kyverno policies:
- `tenant` — the owning team or customer.
- `environment` — prod, staging, dev.
- `cost-centre` — the internal billing code.
Resources without these labels are rejected at admission time. This ensures 100% of workloads are attributable from day one rather than retroactively chasing unlabelled pods.
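A hedged sketch of such a policy, matching pods for brevity (in practice we extend the match to Deployments and other workload kinds):

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cost-allocation-labels
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-tenant-labels
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "tenant, environment, and cost-centre labels are required."
        pattern:
          metadata:
            labels:
              tenant: "?*"         # any non-empty value
              environment: "?*"
              cost-centre: "?*"
```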
For a deeper exploration of cost management tooling, our Kubernetes cost optimisation guide covers strategies beyond tenant allocation, including right-sizing, spot instances, and cluster autoscaler tuning.
Common Anti-Patterns
Across more than 20 multi-tenant engagements, we have seen the same mistakes repeatedly. Avoid these.
1. Cluster-per-Tenant Sprawl
Provisioning a dedicated cluster for every team feels safe but creates enormous operational overhead. We have seen organisations with 50+ clusters where each one has a different Kubernetes version, a different monitoring configuration, and a different set of security policies. Consolidating to fewer shared clusters with proper isolation reduces toil and improves consistency.
2. Namespace-Only Isolation
Creating namespaces without RBAC, network policies, resource quotas, or pod security standards provides the illusion of isolation without the reality. Namespaces are a scope mechanism, not a security boundary, unless you layer the other four isolation controls on top.
3. Ignoring the Noisy Neighbour Problem
Without ResourceQuotas and LimitRanges, a single tenant’s workload can consume all available node resources. We have seen batch jobs that autoscaled to hundreds of pods overnight, driving up costs and degrading latency for every other tenant. Always set quotas, and always set them conservatively at first — it is easier to increase limits than to recover from an outage.
4. No Cost Visibility
If you cannot answer “how much does Tenant X cost?” you have no feedback loop. Teams without cost visibility have no incentive to optimise. Deploy Kubecost or OpenCost before you onboard your second tenant.
5. Granting cluster-admin to Tenants
This is the single most dangerous anti-pattern. cluster-admin grants unrestricted access to every resource in every namespace, including the ability to read other tenants’ Secrets, modify NetworkPolicies, and delete critical system components. Never grant this role to tenants. If tenants need cluster-scoped permissions (e.g., for CRDs), use vCluster to give them cluster-admin within an isolated virtual cluster.
Multi-Tenancy on Managed Services
Each major cloud provider offers distinct multi-tenancy capabilities. Understanding these differences helps you choose the right platform and avoid reinventing features that already exist.
Amazon EKS
Amazon EKS provides a solid foundation but requires more manual assembly than its competitors. Key multi-tenancy features include:
- IAM Roles for Service Accounts (IRSA): Maps Kubernetes service accounts to AWS IAM roles, enabling fine-grained access control to AWS services per tenant.
- Security Groups for Pods: Assigns AWS security groups at the pod level for network isolation beyond Kubernetes NetworkPolicies.
- GuardDuty for EKS: Detects threats specific to EKS workloads, including privilege escalation and credential exfiltration.
EKS consideration: The control plane costs USD 0.10 per hour (approximately USD 73 per month). For clusters-as-a-service models, this per-cluster cost adds up quickly, making virtual clusters or namespace-based tenancy more cost-effective.
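To make the IRSA mapping concrete, here is a minimal sketch of a tenant service account annotated with an IAM role; the account ID and role name are placeholders, and it assumes the cluster's OIDC provider is registered with IAM and the role's trust policy allows this service account:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-app
  namespace: tenant-a
  annotations:
    # Hypothetical role ARN; the role's trust policy must reference this
    # namespace and service account via the cluster's OIDC provider.
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/tenant-a-app
```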
Azure AKS
Azure AKS has a significant advantage for organisations already invested in the Microsoft ecosystem:
- Microsoft Entra ID integration: Native integration with Microsoft Entra ID (formerly Azure Active Directory) for tenant identity and RBAC. Entra ID groups can map directly to Kubernetes RBAC roles.
- Azure Policy (Gatekeeper): Built-in OPA Gatekeeper integration lets you define and enforce multi-tenancy policies through Azure Policy.
- Microsoft Defender for Containers: Provides runtime threat detection and vulnerability assessment across tenant workloads.
AKS advantage: The control plane carries no management fee on the Free tier (the Standard tier, which adds a financially backed uptime SLA, is priced per cluster-hour), making AKS particularly attractive for multi-tenancy strategies that use cluster-per-environment or cluster-per-team patterns where you would otherwise pay for multiple control planes.
Google GKE
Google GKE offers the most built-in multi-tenancy features:
- GKE Autopilot: Removes node management entirely. Tenants consume resources without awareness of underlying infrastructure. Built-in workload isolation through GKE Sandbox (gVisor-based) provides kernel-level tenant separation.
- Binary Authorization: Ensures only signed, verified container images are deployed to the cluster, preventing tenants from running unapproved workloads.
- Workload Identity: The GKE equivalent of IRSA, mapping Kubernetes service accounts to Google Cloud IAM.
- Policy Controller: Google’s managed OPA Gatekeeper with pre-built policy bundles for multi-tenancy.
GKE advantage: Autopilot with GKE Sandbox provides the closest thing to hard multi-tenancy within a single managed cluster. For organisations that need strong isolation without the overhead of separate clusters, this combination is compelling.
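Workload Identity follows the same pattern as IRSA. A sketch with a placeholder Google service account, assuming Workload Identity is enabled on the cluster and an IAM binding granting roles/iam.workloadIdentityUser to this Kubernetes service account is in place:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tenant-a-app
  namespace: tenant-a
  annotations:
    # Hypothetical Google service account email
    iam.gke.io/gcp-service-account: tenant-a-app@my-project.iam.gserviceaccount.com
```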
Managed Service Comparison
| Feature | EKS | AKS | GKE |
|---|---|---|---|
| Control plane cost | USD 0.10/hr (~USD 73/mo) | Free (Free tier); USD 0.10/hr (Standard tier) | USD 0.10/hr (Standard and Autopilot; first zonal or Autopilot cluster covered by the free tier) |
| Native identity integration | IRSA | Microsoft Entra ID | Workload Identity |
| Built-in policy engine | None (install Gatekeeper/Kyverno) | Azure Policy (Gatekeeper) | Policy Controller (Gatekeeper) |
| Kernel-level sandboxing | Manual (Firecracker/gVisor) | None built-in | GKE Sandbox (gVisor) |
| Threat detection | GuardDuty for EKS | Defender for Containers | Security Command Center |
For a comprehensive breakdown of these platforms beyond multi-tenancy, our EKS vs AKS vs GKE comparison covers pricing, networking, and ecosystem differences in detail.
Putting It All Together: A Multi-Tenancy Checklist
Before declaring your cluster production-ready for multi-tenancy, verify every item:
- Tenancy model defined — document whether you are using namespaces, virtual clusters, dedicated clusters, or a hybrid.
- Trust model documented — explicitly state whether tenants are trusted (soft) or untrusted (hard) and map isolation controls accordingly.
- Namespace provisioning automated — new tenants get namespaces (or virtual clusters) with all isolation layers pre-applied.
- RBAC scoped to namespaces — no tenant has ClusterRoleBindings. Platform admins are the only cluster-admin users.
- Default-deny NetworkPolicies applied — every tenant namespace denies ingress and egress by default.
- ResourceQuotas and LimitRanges enforced — every tenant namespace has CPU, memory, and pod count limits.
- Pod Security Standards set to Restricted — tenant namespaces enforce the Restricted PSS profile.
- Policy engine deployed — Kyverno or OPA Gatekeeper enforces labelling, security, and governance policies.
- Cost allocation operational — Kubecost or OpenCost is deployed and every resource is labelled for tenant attribution.
- Monitoring is tenant-aware — dashboards and alerts are filterable by tenant without exposing cross-tenant data.
Secure Your Shared Clusters with Confidence
Multi-tenancy is one of the most impactful ways to reduce Kubernetes costs and operational complexity, but only when isolation is implemented correctly across every layer. A misconfigured RBAC binding or a missing NetworkPolicy can turn cost savings into a security incident.
Our team provides comprehensive Kubernetes consulting services to help you:
- Design multi-tenant architectures tailored to your trust model, compliance requirements, and organisational structure using vCluster, Capsule, or namespace-based isolation.
- Implement defence-in-depth isolation across RBAC, network policies, resource quotas, Pod Security Standards, and policy engines like Kyverno and OPA Gatekeeper.
- Establish tenant cost allocation with Kubecost or OpenCost so every pound of Kubernetes spend is attributed and actionable.
We have secured shared clusters for organisations across financial services, healthcare, and SaaS — and we bring that battle-tested experience to every engagement.