Bits, Bots & Business

The "No" Button: Stopping Bad Configs Before They Wake You Up at 3 AM

Mahmoud Rashed — Fri, 23 Jan 2026 15:23:11 GMT

You get a Slack message at 8 PM. A critical production service is down. After an hour of frantic debugging, you find the cause: a developer deployed a new version without resource limits, and the pod went rogue, triggering a cascading failure. We've all been there. Managing multiple Kubernetes clusters and development teams can feel like a constant battle against inconsistency and overlooked best practices.

In the previous two articles in this series, we covered the foundations of policy-as-code and explored the magic of mutation policies for automatically correcting configurations. Now, we turn our attention to the most powerful tool in the policy engine arsenal: Validation Policies. This article will focus on the most impactful and perhaps counter-intuitive uses of Validation Policies with Kyverno, presenting them not as a tool for enforcement, but for empowerment.

Guardrails, not Gates: Helping Devs Without Being a Blocker

The most effective policy strategy isn't about building gates that simply block developers when they do something wrong. It's about creating "guardrails" that gently guide them toward best practices, making the right way the easy way.

This philosophy is where Kyverno’s simplicity shines. Unlike tools that require learning a specialized language like Rego, Kyverno uses simple, declarative YAML for policy definitions. This approachability makes it easier for platform and development teams to collaborate on policy creation and maintenance. Instead of a complex, opaque language owned by a central team, policies become shared artifacts that everyone can understand and contribute to.

A great analogy for this is a validated drop-down list in a web form. The dropdown is a preventive control; it stops you from entering bad data from the very start. But it does so in a helpful, intuitive way by showing you the valid options. A good Kyverno policy works the same way. It prevents a misconfigured resource from ever entering the cluster, but when paired with clear error messages, it teaches the developer what needs to be fixed.

This turns policy management from a confrontational process into a collaborative one, improving both security posture and developer velocity. This isn't just about developer happiness; it's about shipping more secure code, faster, by shrinking the feedback loop from days (in a security review) to seconds (at kubectl apply).

The "Root" of all Evil: Why We Block Root Users

One of the most fundamental Kubernetes security best practices is to prevent containers from running as the root user. Allowing root access inside a container is a major security risk, opening the door to privilege escalation attacks if a vulnerability is exploited. Enforcing this rule manually across hundreds or thousands of workloads is impossible.

This is a perfect use case for a preventive validation policy. With a few lines of YAML, you can create a cluster-wide rule that automatically blocks any pod that does not declare it will run as a non-root user.

Here is a simple Kyverno ClusterPolicy that uses a Common Expression Language (CEL) expression to validate that pods are configured to run as a non-root user:

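apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-root-user
spec:
  validationFailureAction: Enforce
  rules:
    - name: validate-run-as-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
              operations:
                - CREATE
                - UPDATE
      validate:
        cel:
          # Kyverno expects a list under "expressions", each with an optional message
          expressions:
            # has() guards avoid an evaluation error when securityContext is unset
            - expression: >-
                has(object.spec.securityContext) &&
                has(object.spec.securityContext.runAsNonRoot) &&
                object.spec.securityContext.runAsNonRoot == true
              message: "Pods must set spec.securityContext.runAsNonRoot to true."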

This simple, declarative rule is incredibly powerful. It enforces a critical security standard across the entire cluster without any manual intervention, rejecting non-compliant pods at admission time, before a workload is even scheduled. Note that it inspects only the pod-level securityContext; container-level settings need their own check.

Audit Mode: How to Test Rules Without Causing a Mass Revolt

A common pitfall I see is platform teams rolling out a new, strict policy that immediately breaks existing workflows or blocks a critical production deployment. This is the moment where the platform team is seen as a blocker, causing friction and frustration. How do you enforce standards without causing a developer revolt?

This is where Kyverno’s distinction between preventive (Enforce) and detective (Audit) controls becomes a lifesaver.

• Enforce Mode: This is a preventive control. It actively blocks any resource that violates the policy before it can be created or updated in the cluster.

• Audit Mode: This is a detective control. It allows non-compliant resources to be created but flags them for review by generating policy reports.

This dual-mode capability allows for a safe, gradual rollout strategy. As Adevinta's engineering team noted in their tech blog on their transition to Kyverno:

"In audit mode, Kyverno will not reject any request but instead will produce a resource inside the cluster called ’admissionreport.kyverno.io’ and also ‘policyreport.kyverno.io’."

This enables a practical and low-stress workflow for introducing new policies:

1. Deploy in Audit Mode: Initially, deploy all new policies with validationFailureAction: Audit.

2. Monitor Reports: Observe the generated policy reports to see which existing workloads are non-compliant and understand the real-world impact of the new rule.

3. Collaborate and Remediate: Work with the relevant development teams to help them fix their configurations based on the audit data.

4. Switch to Enforce Mode: Once you’ve confirmed that all critical workloads are compliant, you can confidently switch the policy to validationFailureAction: Enforce, knowing it won't cause unexpected disruptions.
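As a minimal sketch of steps 1 and 4, the same non-root rule can start life in Audit mode. The policy below mirrors the earlier example; the comments mark the single field that changes between the two phases. Treat it as an illustration under those assumptions, not a drop-in policy.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-root-user
spec:
  # Step 1: record violations in policy reports without blocking requests.
  # Step 4: change this value to Enforce once the reports show that
  # critical workloads comply.
  validationFailureAction: Audit
  # Also evaluate resources that already exist, so reports cover running workloads.
  background: true
  rules:
    - name: validate-run-as-non-root
      match:
        any:
          - resources:
              kinds:
                - Pod
              operations:
                - CREATE
                - UPDATE
      validate:
        cel:
          expressions:
            - expression: >-
                has(object.spec.securityContext) &&
                has(object.spec.securityContext.runAsNonRoot) &&
                object.spec.securityContext.runAsNonRoot == true
              message: "Pods must set spec.securityContext.runAsNonRoot to true."
```

While the policy is in Audit mode, the resulting reports can be listed with kubectl get policyreport -A.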

This feature transforms policy implementation from a high-risk, all-or-nothing event into a gradual, data-driven process that builds trust between platform and development teams.

CEL Expressions: It’s Like Excel Formulas, but for Kubernetes Security

One of the game-changing features in modern Kubernetes policy enforcement is the Common Expression Language (CEL). CEL provides a standardized, declarative way to write validation logic directly into Kubernetes resources. This is a significant shift: for many common validation rules, you may no longer need a separate policy engine like Gatekeeper or even Kyverno's own pattern-matching engine; you can use the native capabilities that Kubernetes now provides. And because Kyverno offers first-class support for this standard, it gives you an incredibly powerful toolset.

Think of CEL expressions like writing formulas in a spreadsheet. You're given an object (the Kubernetes resource YAML being submitted) and you can write simple, powerful expressions to inspect its fields and validate its contents.

Here are a few common examples of what you can do with CEL:

• Checking for a specific label: has(object.metadata.labels) && has(object.metadata.labels.team) && object.metadata.labels.team == 'platform' This check ensures a resource is correctly attributed to the platform team. The has() guards keep the expression from failing with an evaluation error when the label is missing.

• Ensuring all containers have CPU limits: This prevents runaway processes from starving other workloads and causing cluster-wide instability.

• Requiring at least one annotation: has(object.metadata.annotations) && object.metadata.annotations.size() > 0 This is useful for ensuring resources have the necessary metadata for monitoring or automation tools.

• Validating an image registry using regex: object.spec.containers.all(container, container.image.matches('^my-trusted-registry\\.io/.*')) This is a critical supply-chain security control, ensuring that only vetted images from your organization's registry can run in the cluster. The dot is escaped because an unescaped . matches any character.
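For the CPU-limits check in the list above, one possible way to write the rule is sketched below. The policy and rule names are hypothetical, and the expression is an assumption about how the check could be phrased rather than the author's original; it covers regular containers only, not init containers.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-cpu-limits            # hypothetical name
spec:
  validationFailureAction: Audit
  rules:
    - name: containers-must-set-cpu-limits
      match:
        any:
          - resources:
              kinds:
                - Pod
              operations:
                - CREATE
                - UPDATE
      validate:
        cel:
          expressions:
            # Each has() guard checks one level, so a missing parent field
            # yields "false" instead of an evaluation error.
            - expression: >-
                object.spec.containers.all(c,
                  has(c.resources) &&
                  has(c.resources.limits) &&
                  has(c.resources.limits.cpu))
              message: "Every container must set resources.limits.cpu."
```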

With CEL, you get a simple yet incredibly powerful way to write complex, fine-grained validation rules directly in your Kyverno policies, making them more expressive and easier to maintain.

Real Talk: How to Gently Force People to Use Labels

A common problem in any large cloud or Kubernetes environment is inconsistent or missing labels. Without proper tagging, it becomes nearly impossible to answer basic questions like, "How much does this application cost?" or "Who owns this service?" This lack of metadata hygiene leads to operational blind spots and makes cost allocation a nightmare.

The solution is to define a tagging standard and then enforce it with a validation policy. First, decide on a set of required labels (e.g., app, owner, cost-center). Second, create a Kyverno policy to ensure every new resource has them.

This ClusterPolicy uses CEL to validate that all new Deployments contain the required labels:

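apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-standard-labels
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-for-standard-labels
      match:
        any:
          - resources:
              kinds:
                - Deployment
              operations:
                - CREATE
                - UPDATE
      validate:
        cel:
          # Kyverno expects a list under "expressions", each with an optional message
          expressions:
            - expression: "has(object.metadata.labels) && ['app', 'owner', 'cost-center'].all(key, key in object.metadata.labels)"
              message: "Deployments must carry the labels app, owner, and cost-center."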

A policy like this is the perfect candidate for the "Audit-then-Enforce" strategy. Start by deploying it in Audit mode to discover which teams need to update their deployment manifests. This is the "gentle" part of the enforcement: it prevents the developer revolt that often happens when governance teams try to enforce tagging standards retroactively, a common mistake seen in other ecosystems (like Azure). Once you've given teams time to comply, switch to Enforce mode.

The payoff is immense: operational blind spots are eliminated, and "who owns this?" becomes a solved problem. Finance teams can finally get the clear cost allocation reports they've been asking for. This isn't just about hygiene; it's about running Kubernetes like a business.

Conclusion

Kyverno's validation policies are more than just a security tool; they are a flexible, powerful, and developer-friendly way to bring order and sanity to your Kubernetes clusters. By acting as helpful guardrails, offering a safe audit mode for rollouts, leveraging the power of CEL, and enforcing operational best practices like tagging, you can build a more secure, reliable, and collaborative platform.

To leave you with a final thought: What is the one small, unenforced standard in your cluster that, if automated with a policy, would have the biggest positive impact?

💡

If you found this useful, subscribe to our newsletter for more deep dives into cloud-native technologies. In the next article in this series, we’ll explore Generation Policies and show how they can automatically create resources like NetworkPolicies and RBAC rules to further secure and standardize your workloads.

📚 Recommended Reading & Resources

The "Real World" Story: Scaling Policy Enforcement: Lessons from Adevinta’s Migration. The engineering blog post referenced in this article. A must-read for understanding the operational side of moving to Kyverno and the memory/performance benefits observed at scale.
The Sandbox: Kyverno Policy Playground. Don't test in production! Use this interactive playground to write, test, and debug your validation policies and CEL expressions directly in your browser before applying them to your cluster.
The Official Guide: Kyverno Validation Rules Documentation. The complete reference for writing validation policies, including deep dives on patterns, anchors, and failure actions.
The Deep Dive: Kubernetes Common Expression Language (CEL) Reference. Everything you need to know about the syntax and capabilities of CEL within Kubernetes. Bookmark this for when you need to write complex logic that goes beyond simple pattern matching.
The "Why": CEL-ebrating Simplicity: Mastering Kubernetes Policy Enforcement. A great overview from the CNCF blog on why the industry is shifting toward CEL and how it simplifies the policy landscape for platform engineers.

The Kubernetes Cleaning Fairy: Fixing Messy Manifests with Mutation

Mahmoud Rashed — Mon, 19 Jan 2026 22:08:28 GMT

In a previous article, we laid the foundations for governing Kubernetes clusters, focusing on how admission policies act as essential gatekeepers. They ensure that only compliant, secure, and well-formed resources make it into your environment. But what if we could go beyond simple rejection or validation? What if the platform could not only identify problems but also automatically fix them?

This article dives into a more proactive and powerful tool in the platform engineer's arsenal: mutation policies. We'll explore how mutation works not just as a gatekeeper, but as a helpful assistant that corrects and enhances resources before they are even created. This shift from "rejecting the bad" to "perfecting the good" is a game-changer that turns your platform from a gatekeeper into a collaborator, actively improving developer velocity and reducing rework.

Don't Reject, Correct: Being a helpful platform engineer.

The traditional approach to Kubernetes policy enforcement is strict validation: if a resource manifest (YAML) breaks the rules, the API server rejects it. The developer receives an error message and must return to their editor to fix the code. However, Mutation Policies offer a more collaborative alternative: proactive correction.

The Concept of Proactive Correction

Mutation policies act as a "preventive control": they stop a non-compliant resource from reaching the cluster in the first place. Instead of blocking a deployment with a "no," the platform automatically fixes common omissions or misconfigurations, such as adding missing labels or setting default resource limits. The platform becomes a partner in the development process rather than just a critic, which reduces developer friction, minimizes context switching, and ensures compliance by default.

The Admission Controller Order

The power to automatically correct resources lies in the Kubernetes Admission Controller order.

When a developer runs kubectl apply, the request traverses several steps. Mutating admission webhooks trigger first, even before schema validation. This allows the platform to patch the resource definition on the fly. This architecture enables a true "shift-left" approach to compliance, solving issues at the earliest possible moment: API admission time.

The "Oops, I Forgot Limits" Fixer-Upper.

Here’s a classic Kubernetes scenario: a developer focuses on application logic but forgets to define resource requests and limits in their deployment manifest. A strict validation policy would reject the deployment, forcing the developer to context-switch and edit their YAML. While secure, this creates friction.

A Kyverno mutation policy solves this by proactively fixing the manifest. Instead of rejecting the workload, the admission controller intercepts the request and automatically injects sensible default values for CPU and memory. This ensures that no pod runs without limits—crucial for cluster stability and preventing "noisy neighbor" issues—while maintaining a frictionless developer experience.

Example: Kyverno ClusterPolicy for Default Limits

The following ClusterPolicy checks any Pod; if resource limits are missing, it patches them in automatically:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-resources
spec:
  rules:
  - name: add-default-cpu-memory-limits
    match:
      any:
      - resources:
          kinds:
          - Pod
    mutate:
      patchStrategicMerge:
        spec:
          containers:
          - (name): "*"
            resources:
              limits:
                +(cpu): "1"
                +(memory): "1Gi"
              requests:
                +(cpu): "100m"
                +(memory): "256Mi"

Understanding the Syntax

This policy uses specific Kyverno features to ensure precise application:

patchStrategicMerge: A declarative method for modifying resources. It is ideal for adding fields to a known structure without overwriting existing data.
(name): "*": A conditional anchor that acts as a wildcard, ensuring the patch applies to all containers within the pod spec.
+(cpu) / +(memory): The + anchor is the key logic here. It instructs Kyverno to add the field only if it is not already present. If a developer has set a limit, this policy respects it and does nothing.
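To make the + anchor concrete, here is a hedged before-and-after sketch. The Pod name and image are hypothetical, and the second document shows what the add-default-resources policy above would be expected to produce for this input, assuming no other mutation touches the Pod.

```yaml
# Submitted by the developer: only a CPU limit is set.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app                          # hypothetical
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0 # hypothetical
      resources:
        limits:
          cpu: "500m"
---
# Expected result after mutation: the existing CPU limit is kept,
# and only the missing fields are filled with the policy defaults.
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
spec:
  containers:
    - name: web
      image: registry.example.com/web:1.0
      resources:
        limits:
          cpu: "500m"      # untouched: already present
          memory: "1Gi"    # added by +(memory)
        requests:
          cpu: "100m"      # added by +(cpu)
          memory: "256Mi"  # added by +(memory)
```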

Impact Analysis

This policy immediately improves Kubernetes governance. Every container ends up with explicit requests and limits, so the scheduler can place pods against realistic numbers and a single uncapped container can no longer exhaust a node's memory and destabilize its neighbors, all without requiring manual intervention from the development team.

Invisible Sidecars: Injecting Containers Like a Ninja

Automating Sidecar Injection with Mutation Policies

If you have used a service mesh like Istio or an observability tool like the OpenTelemetry Operator, you have already witnessed the power of mutation. These tools leverage Mutating Admission Webhooks to automatically inject "sidecar" containers into application pods.

Understanding the Sidecar Pattern in Platform Engineering

Sidecar injection is a core pattern in modern platform engineering. It allows platform teams to transparently add capabilities—such as logging, proxying, or security monitoring—to application pods without requiring developers to modify their deployment manifests. This ensures a clean separation of concerns: developers focus on business logic, while the platform handles infrastructure requirements.

Real-World Examples of Sidecar Injection

The Kubernetes Admission Controller enables several common automation scenarios:

Istio Service Mesh: Automatically adds an Envoy proxy sidecar to every pod to manage traffic, enforce mTLS, and gather telemetry.
OpenTelemetry (OTel): Injects a collector sidecar to scrape metrics and traces, or adds an Init Container to auto-instrument the application before it starts.

Implementing Injection with Kyverno and JSON Patch

While simple mutations can use overlay-style patches (patchStrategicMerge), complex injections often call for patchesJson6902. This method is based on the imperative JSON Patch standard (RFC 6902), making it well suited to structured modifications like appending items to a list.

Below is a Kyverno policy that injects a logging sidecar into any pod annotated with logging-enabled: "true":

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-logging-sidecar
spec:
  rules:
  - name: inject-logging-container
    match:
      any:
      - resources:
          kinds:
          - Pod
          annotations:
            logging-enabled: "true"
    mutate:
      # Kyverno expects patchesJson6902 as a multi-line string of
      # JSON Patch operations, not as a nested YAML list.
      patchesJson6902: |-
        - path: "/spec/containers/-"
          op: add
          value:
            name: logging-sidecar
            image: fluent/fluent-bit:latest
            args:
            - "tail"
            - "-f"
            - "/var/log/app.log"

Syntax Deep Dive: JSON Patch

The critical line here is path: "/spec/containers/-".

/spec/containers: Targets the list of containers in the Pod definition.
/-: This JSON Patch syntax tells Kyverno to append the new value to the end of the array, rather than targeting a specific index.

The Order of Chaos: Why Mutation Runs Before Validation

To build a truly robust platform, you must understand the Kubernetes admission control lifecycle. The order of operations is not accidental; it is what makes the symbiotic relationship between "correction" and "enforcement" possible.

The critical sequence for every API request is:

1. Mutation (Mutating Webhooks)
2. Schema Validation (API Server checks)
3. Validation (Validating Webhooks)

Why This Order Matters

This sequence is the secret sauce of auto-compliance. A resource is first modified by mutating webhooks. Only then is the final, corrected object passed to the schema checker and validating webhooks.

A Practical Workflow: "The Avengers" Label

Consider this narrative where auto-correction and compliance work seamlessly together:

1. The Trigger: A developer deploys a new application, but forgets the mandatory team-id label.
2. The Fix (Mutation): A Kyverno mutating policy intercepts the request before it is saved. Based on the namespace, it automatically injects team-id: "avengers".
3. The Check (Validation): The request, now carrying the new label, proceeds to the validation stage.
4. The Success: The validating policy confirms the team-id exists and approves the request.

The result? The developer's deployment succeeds on the first try. The application is compliant from the moment of creation, and the platform team has enforced standards without blocking the workflow.
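The workflow above could be expressed as a pair of Kyverno policies along the following lines. This is a sketch under assumptions: the policy names and the avengers namespace are hypothetical, and a real setup would more likely derive the label value from namespace metadata than hard-code it.

```yaml
# The Fix: add team-id to Deployments in the (hypothetical) avengers namespace.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-team-id-label
spec:
  rules:
  - name: add-team-id
    match:
      any:
      - resources:
          kinds:
          - Deployment
          namespaces:
          - avengers
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            # + anchor: only add the label when it is not already set
            +(team-id): "avengers"
---
# The Check: every Deployment must carry a non-empty team-id label.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-id-label
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-team-id
    match:
      any:
      - resources:
          kinds:
          - Deployment
    validate:
      message: "The label team-id is required."
      pattern:
        metadata:
          labels:
            # "?*" matches any value with at least one character
            team-id: "?*"
```

Because mutation runs first, a Deployment submitted to the avengers namespace without the label reaches the second policy already carrying it.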

When Magic Fails: Debugging mutations without pulling your hair out

While mutation policies can feel like magic, they are ultimately code—and code can have bugs. When a mutation policy fails, it can break deployments or slow down the API server. To avoid this, you need a robust strategy for testing, observability, and debugging.

1. Pre-Deployment Testing

Never deploy a policy blindly.

Unit Testing: Use the Kyverno CLI (kyverno test) to validate policies against mock resources locally before they ever touch a cluster.
End-to-End (E2E) Testing: For complex scenarios, use Chainsaw, a declarative testing framework tailored for Kubernetes. It lets you apply policies and resources to a test cluster (for example, a disposable local one) and verify the mutations in a realistic environment.
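As a rough sketch of what a kyverno test manifest for the add-default-resources policy might look like: the file names and the Pod name demo-app are hypothetical, and field spellings have shifted between CLI releases (older versions use patchedResource, for instance), so check the documentation for the version you run.

```yaml
# kyverno-test.yaml, placed next to the policy and resource files
apiVersion: cli.kyverno.io/v1alpha1
kind: Test
metadata:
  name: add-default-resources-test
policies:
  - add-default-resources.yaml      # the ClusterPolicy under test
resources:
  - pod-without-limits.yaml         # mock input Pod named demo-app
results:
  - policy: add-default-resources
    rule: add-default-cpu-memory-limits
    kind: Pod
    resources:
      - demo-app
    # Expected Pod after mutation; the test passes when the
    # mutated resource matches this file.
    patchedResources: pod-with-defaults.yaml
    result: pass
```

Running kyverno test . in that directory evaluates the policy locally and compares the outcome against the expected result.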

2. Monitoring Webhook Performance

Every admission webhook adds latency to API server requests. If your policy is slow, the entire cluster slows down. You must monitor specific Prometheus metrics exposed by the API server:

apiserver_admission_webhook_admission_duration_seconds_bucket: The most critical metric. It tracks exactly how much time your webhook adds to request processing.
apiserver_admission_webhook_fail_open_count: Tracks requests that were allowed only because the webhook failed (if failurePolicy: Ignore is set).
apiserver_admission_webhook_request_total: Useful for understanding the total load on your policy engine.
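One way to put the first metric to work is a Prometheus alerting rule on its 99th-percentile latency. The group name, alert name, 500 ms threshold, and 10-minute window below are illustrative assumptions, not recommendations from the Kyverno or Kubernetes projects.

```yaml
groups:
  - name: admission-webhook-latency        # hypothetical group name
    rules:
      - alert: AdmissionWebhookSlow        # hypothetical alert name
        # p99 latency per webhook, computed from the histogram buckets
        expr: |
          histogram_quantile(0.99,
            sum by (le, name) (
              rate(apiserver_admission_webhook_admission_duration_seconds_bucket[5m])
            )
          ) > 0.5
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Admission webhook {{ $labels.name }} p99 latency is above 500ms"
```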

3. Debugging with Audit Logs & Annotations

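Native Kubernetes policies offer a useful feature for this: ValidatingAdmissionPolicy supports auditAnnotations, which write values computed from the resource into the Kubernetes audit log during evaluation. Because validation runs after mutation, this shows the object as it looked once all mutations had been applied.

For example, to debug why a default CPU request isn't being applied, you can log the value that reaches the validation stage:

# Snippet from a ValidatingAdmissionPolicy
spec:
  # ... other fields
  auditAnnotations:
    - key: "cpu_request.my-company.com"
      # valueExpression must return a string; guard against unset fields
      valueExpression: >-
        has(object.spec.containers[0].resources) &&
        has(object.spec.containers[0].resources.requests) &&
        has(object.spec.containers[0].resources.requests.cpu)
        ? string(object.spec.containers[0].resources.requests.cpu) : 'unset'

This adds an annotation like <policy-name>/cpu_request.my-company.com: "250m" to the audit event, providing crystal-clear visibility into what the policy engine "saw."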

4. Safe Rollouts with "Audit Mode"

Policy engines like Kyverno allow you to set validationFailureAction: Audit. In this mode, requests are not blocked; instead, violations are recorded in PolicyReport CRDs.

⚠

Warning: While Audit mode is excellent for testing, these reports are stored as Kubernetes objects. In a busy cluster, they can accumulate rapidly, bloating the etcd database and degrading control plane performance. Always use a cleanup policy or TTL for these reports.

Conclusion

Mutation policies—especially when implemented with a robust engine like Kyverno—represent a significant evolution in Platform Engineering. They empower platform teams to shed the role of "config police" and become true enablers.

By building a secure, compliant, and developer-friendly "paved road" that automatically corrects common errors, you do more than just enforce rules. You codify operational excellence into the cluster itself, freeing developers to focus on what they do best: shipping great applications.

A Final Thought for Platform Teams

As you adopt these tools, consider the balance of power. While auto-correction reduces friction, it can also hide complexity.

The Challenge: How do you balance "invisible" compliance with developer awareness?
The Goal: Ensure developers know what changed in their manifest, so the "magic" doesn't become a mystery.

📚 Further Reading & Resources

Kyverno & Mutation Policies

Kyverno Mutation Docs: The official guide to writing mutation rules, including patchesJson6902 and patchStrategicMerge.
Kyverno Policy Library: A searchable collection of ready-to-use policies (great for finding examples to tweak).

Testing & Validation

Kyverno Chainsaw: The end-to-end testing tool mentioned in this post, designed specifically for Kubernetes controllers and policies.
Kyverno CLI: Learn how to run kyverno test locally to catch syntax errors before deployment.

Kubernetes Concepts

Admission Controllers Reference: The official Kubernetes documentation explaining the lifecycle of a request (Mutation → Validation).
JSON Patch (RFC 6902): A user-friendly guide to understanding the syntax used in patchesJson6902.

Platform Engineering

The "Paved Road" Concept: The original Netflix Tech Blog article that popularized the idea of building "Paved Roads" for developers.
CNCF Platform Engineering Whitepaper: An in-depth look at modern platform engineering principles.

💡

Enjoyed this deep dive into Kubernetes mutation? Subscribe to our newsletter for more platform engineering insights, and stay tuned for our next article, where we'll explore the world of advanced validation policies!

Stop Letting "YOLO Deployments" Break Your Cluster: Hello, Kyverno!

Mahmoud Rashed — Sat, 17 Jan 2026 10:28:10 GMT

Introduction: The Silent Guardian of the API Server

Maintaining a Kubernetes cluster often feels like a constant battle against configuration drift. As teams scale, the anxiety of "who deployed what and why" grows. Without a gatekeeper, your API server is essentially an open door to any configuration, regardless of how insecure or inefficient it might be.

This is where Admission Controllers step in. Think of them as the middleware of the Kubernetes world—an essential layer of logic that scrutinizes every request after it has passed authentication and authorization, but before the state is persisted to the etcd database. This specific timing makes them the ultimate arbiter of what actually enters your cluster’s "source of truth."

An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but only after the request is authenticated and authorized. In simpler terms, admission controllers can be thought of as middleware that can validate, mutate, or reject requests to the Kubernetes API.

The "Wild West" of K8s: Why we can't have nice things

At the scale of a large company, Kubernetes can quickly become a "Wild West." Without strict policy enforcement, clusters are prone to systemic risks. Some research also identifies misconfigurations, specifically overly permissive RBAC and unbounded service accounts, as a leading cause of security breaches.

Beyond security, there is the silent financial leak of "Resource Asymmetry." The kube-scheduler relies on a complex scoring algorithm to assign pods to nodes based on CPU and memory. However, because this scoring is highly variable, developers often "play it safe" by over-estimating resource requests to avoid performance issues. This leads to a scenario where "spending outstripping use" becomes the norm.

The danger of "Missing Requests and Limits" is particularly acute; without these guardrails, a single resource-hungry pod can trigger a denial-of-service for its neighbors. This evolving threat landscape, combined with the risk of insider threats, makes automated gatekeeping the architect's path to sanity.

Kyverno vs. OPA: Why I refuse to learn another language (Rego)

When choosing a policy engine, the industry generally gravitates toward two heavyweights: Kyverno and OPA Gatekeeper. However, the friction of learning a specialized query language like Rego (used by OPA) often slows down security adoption for YAML-native teams.

The tech team at Adevinta recently highlighted a "functional failure" in OPA Gatekeeper’s mutation capabilities that led them to migrate. Specifically, Gatekeeper's Assign feature failed because it could not modify fields based on contextual data—information residing outside the specific field being observed. While OPA requires a complex external data provider setup for this, Kyverno handles it natively within the same manifest.

It’s just YAML! (The moment you fall in love)

The Kyverno advantage is simple: it feels like a natural extension of the Kubernetes experience. Because policies are written in native YAML, they integrate seamlessly with kubectl, Helm, and GitOps workflows. It is a more intuitive way to handle governance than traditional programming-heavy engines.

We are also witnessing a major shift with the arrival of CEL (Common Expression Language). ValidatingAdmissionPolicy reached v1 (GA) in Kubernetes v1.30. This is a game-changer because it allows declarative validation directly in the vanilla API server without external HTTP webhooks. For the rules it covers, removing those callouts eliminates the network round trip and the webhook availability failure modes. Even OPA Gatekeeper is acknowledging this shift by adding CEL support, signaling a move toward standardized, high-performance logic.
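For a sense of what webhook-free validation looks like, here is a minimal sketch of a native ValidatingAdmissionPolicy and its binding. The resource names are hypothetical, and the rule (requiring runAsNonRoot at the pod level) is chosen only as an illustration.

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-run-as-non-root           # hypothetical name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["pods"]
  validations:
    # Evaluated inside the API server; no external webhook call is made.
    - expression: >-
        has(object.spec.securityContext) &&
        has(object.spec.securityContext.runAsNonRoot) &&
        object.spec.securityContext.runAsNonRoot == true
      message: "Pods must set spec.securityContext.runAsNonRoot to true."
---
# A policy has no effect until a binding activates it.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: require-run-as-non-root-binding   # hypothetical name
spec:
  policyName: require-run-as-non-root
  # Warn the client and record the failure in the audit log without rejecting;
  # switch to ["Deny"] to enforce.
  validationActions: ["Warn", "Audit"]
```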

3-Minute Install: From "Zero" to "Governance"

Getting started is faster than your coffee cools. You can go from an unmanaged cluster to a governed environment using the standard Helm workflow for Kyverno:

1. Add the Repo: helm repo add kyverno https://kyverno.github.io/kyverno/
2. Update: helm repo update
3. Install: helm install kyverno kyverno/kyverno --namespace kyverno --create-namespace

The Lifecycle of a Request

To master admission control, you must understand the technical sequence of a request:

1. Mutating Phase: The controller modifies the request (e.g., injecting sidecars or team labels).
2. Schema Validation: The API server performs a structural JSON check to ensure the resource is well-formed.
3. Validating Phase: The controller checks the request against security rules and finally accepts or rejects it.

💡

Pro Tip: Always start in Audit Mode. This allows you to assess the impact of policies without breaking existing developer workflows. Once you have sanitized the environment, flip the switch to Enforce Mode.

Your First Policy: Ban the default namespace

Isolating workloads is the first step toward maturity. Kubernetes ships with a static admission controller called NamespaceLifecycle that acts as a basic safeguard. It prevents the accidental deletion of three critical system-reserved namespaces: default, kube-system, and kube-public.
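Keeping workloads out of the default namespace, as the heading suggests, could be sketched with a pattern-based Kyverno rule like the one below. It starts in Audit mode, following the Pro Tip above; the policy name is an assumption, and only bare Pods are matched here, so controllers such as Deployments would need their own rule.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-default-namespace   # hypothetical name
spec:
  validationFailureAction: Audit     # switch to Enforce once reports are clean
  background: true
  rules:
  - name: validate-namespace
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Using the default namespace is not allowed."
      pattern:
        metadata:
          # "!" negates the match: any namespace except default is accepted
          namespace: "!default"
```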

However, true governance requires moving beyond defaults to enforce ResourceQuotas and custom labels. This prevents one team from monopolizing node resources. From there, follow the experimental insights from S&P 500 deployments to harden your securityContext:

Enforce Rootless Mode: Prohibit workloads where runAsNonRoot is false or unset.
Identity Mapping: Explicitly define runAsUser and runAsGroup (e.g., set to 1000) to ensure workloads never run under a root identity.
Immutable Filesystems: Enforce readOnlyRootFilesystem: true to make it harder for attackers to persist malicious scripts or tamper with binaries inside the container.
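Put together, a Pod that satisfies all three recommendations might look like the following sketch. The Pod name, namespace, and image are hypothetical; the emptyDir mount is included on the assumption that the application needs somewhere writable once its root filesystem is read-only.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: hardened-app                 # hypothetical
  namespace: payments                # hypothetical
spec:
  securityContext:
    runAsNonRoot: true               # Enforce Rootless Mode
    runAsUser: 1000                  # Identity Mapping
    runAsGroup: 1000
  containers:
    - name: app
      image: registry.example.com/app:1.0.0   # hypothetical
      securityContext:
        readOnlyRootFilesystem: true # Immutable Filesystems
      volumeMounts:
        # Writable scratch space, since the root filesystem is read-only
        - name: tmp
          mountPath: /tmp
  volumes:
    - name: tmp
      emptyDir: {}
```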

Conclusion: The Future of Sovereign Clusters

The evolution of Kubernetes is moving toward Zero Trust and "Sovereign Clusters"—environments where an enterprise maintains absolute sovereignty over its standards across multi-cluster platforms. By implementing these gatekeeping rules today, you lay the foundation for AI-driven threat detection and automated compliance.

If your API server disappeared tomorrow, would your policy engine know how to rebuild the trust?

💡

If you liked this article, follow for more. The rest of the series will cover each policy type in more detail.