The Power of Gatekeeper

After my talk at the Virtual Azure Community Day, I promised I’d deliver a series of articles about my adventures with Gatekeeper on Azure Kubernetes Service (AKS).

In Part 1, I told you all about Rego, the domain-specific language (DSL) that is used by Open Policy Agent (OPA).

Now, in Part 2, I’ll talk about the use of Gatekeeper to enforce policy on a cluster.

The examples in this post are not tailored to AKS specifically; you can apply them to any Kubernetes cluster you manage, whether on-premises, hosted, or fully managed by a vendor, as long as it runs Kubernetes 1.14 or later.

What is an Admission Controller?

An admission controller is part of the API server; it intercepts requests to the Kubernetes API after they have been authenticated and authorized, but before the object is persisted.

In other words, if you issue a request to the API server to create a new Deployment, and you’re an authenticated user (AuthN) who is allowed to perform that action (RBAC), an admission controller can still intercept the request and mutate it, validate it, or do both.

The following image shows a simplified schema of the steps a request takes:

Admission controller overview

Bringing OPA native to Kubernetes

While there are other options, like kube-mgmt, for integrating Open Policy Agent into your cluster, Gatekeeper offers the most complete feature set and the tightest Kubernetes integration. Its audit controller constantly monitors existing cluster objects to detect policy violations.

Its greatest value, however, comes from its ability to dynamically configure OPA policies using Gatekeeper’s Custom Resource Definitions (CRDs). Adding parameterized ConstraintTemplate objects to a cluster makes it simple to create tailored policies for specific resources and scopes.

Install Gatekeeper

There are two ways to install Gatekeeper on your cluster: with Helm or with the provided YAML file. In this post, I’m going to use the YAML file to install Gatekeeper.
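
If you prefer Helm, the Gatekeeper project also publishes an official chart. The sketch below follows the project documentation at the time of writing; the chart repository URL and release options may change between versions, so double-check them against the current docs before running this:

helm repo add gatekeeper https://open-policy-agent.github.io/gatekeeper/charts
helm repo update
helm install gatekeeper gatekeeper/gatekeeper --namespace gatekeeper-system --create-namespace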

You can find the YAML file at https://github.com/open-policy-agent/gatekeeper/blob/master/deploy/gatekeeper.yaml. It’s a good idea to always inspect a definition before you apply it to the cluster. This one sets up all the basics, like a namespace, the CRDs, and the required (cluster) roles.

Let’s set up everything by running kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/master/deploy/gatekeeper.yaml. When you do, you should see the following output:

namespace/gatekeeper-system created
customresourcedefinition.apiextensions.k8s.io/configs.config.gatekeeper.sh created
customresourcedefinition.apiextensions.k8s.io/constraintpodstatuses.status.gatekeeper.sh created
customresourcedefinition.apiextensions.k8s.io/constrainttemplatepodstatuses.status.gatekeeper.sh created
customresourcedefinition.apiextensions.k8s.io/constrainttemplates.templates.gatekeeper.sh created
serviceaccount/gatekeeper-admin created
role.rbac.authorization.k8s.io/gatekeeper-manager-role created
clusterrole.rbac.authorization.k8s.io/gatekeeper-manager-role created
rolebinding.rbac.authorization.k8s.io/gatekeeper-manager-rolebinding created
clusterrolebinding.rbac.authorization.k8s.io/gatekeeper-manager-rolebinding created
secret/gatekeeper-webhook-server-cert created
service/gatekeeper-webhook-service created
deployment.apps/gatekeeper-audit created
deployment.apps/gatekeeper-controller-manager created
validatingwebhookconfiguration.admissionregistration.k8s.io/gatekeeper-validating-webhook-configuration created

Everything is installed in the gatekeeper-system namespace. Let’s investigate what has been set up for you.

 ❯ kubectl get crd -n gatekeeper-system
NAME
configs.config.gatekeeper.sh
constraintpodstatuses.status.gatekeeper.sh
constrainttemplatepodstatuses.status.gatekeeper.sh
constrainttemplates.templates.gatekeeper.sh

These are the Custom Resource Definitions (CRDs) that Gatekeeper installs: one for its own configuration, two for tracking the status of constraints and templates, and the constrainttemplates CRD that you use to manage ConstraintTemplate objects.

 ❯ kubectl get svc -n gatekeeper-system
NAME                         TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)   AGE
gatekeeper-webhook-service   ClusterIP   10.96.3.152   <none>        443/TCP   12m

The validating webhook uses this Service to reach the controller-manager deployment.

 ❯ kubectl get deployments.apps -n gatekeeper-system
NAME                            READY   UP-TO-DATE   AVAILABLE   AGE
gatekeeper-audit                1/1     1            1           12m
gatekeeper-controller-manager   3/3     3            3           12m

The audit deployment is a single pod that continuously scans the cluster for policy violations. The controller manager, on the other hand, runs three pods for high availability and is the main process that validates every change submitted to the cluster.
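
If you want to watch the audit results as they are produced, you can tail the logs of the audit pod (the deployment name matches the output above):

 ❯ kubectl logs -n gatekeeper-system deployment/gatekeeper-audit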

 ❯ kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io -n gatekeeper-system
NAME
gatekeeper-validating-webhook-configuration

The Validating Webhook is a special admission controller that sends admission requests to external HTTP callbacks—the above webhook service in this case—and receives admission responses.
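
You can inspect the full configuration with kubectl get validatingwebhookconfigurations gatekeeper-validating-webhook-configuration -o yaml. The excerpt below is trimmed and illustrative (exact fields such as the webhook path and failure policy can differ between Gatekeeper releases); the essential part is the clientConfig that points to the gatekeeper-webhook-service you saw earlier:

webhooks:
- name: validation.gatekeeper.sh
  clientConfig:
    service:
      name: gatekeeper-webhook-service
      namespace: gatekeeper-system
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]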

Templates and Constraints

Now that you have all the basic setup in place, you are ready to enforce some policies. Before I show you some examples, let’s first take a moment to look at how Gatekeeper works.

To define a constraint, you’ll first need to define a ConstraintTemplate. The purpose of the ConstraintTemplate is to define both the Rego code that enforces the policy and the schema that constraints based on it must follow.

The schema is used to define the parameters that this CRD will accept. Think of it as the parameters that get passed to a function in programming languages.
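
As a quick schematic (abbreviated versions of the full examples that follow), the schema in the ConstraintTemplate plays the role of the function signature, while the parameters block of a Constraint is the call site that supplies the arguments:

# In the ConstraintTemplate: declare which parameters exist ("the signature")
validation:
  openAPIV3Schema:
    properties:
      labels:
        type: array
        items:
          type: string

# In the Constraint: supply concrete values ("the call")
parameters:
  labels: ["gatekeeper"]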

I’m going to show you two examples. The first validates that required labels are present. Once you understand how that works, I’ll show you how to restrict container images to a private registry.

Required labels

The following ConstraintTemplate is taken from the official docs; it ensures that all labels defined by the constraint are present:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
        listKind: K8sRequiredLabelsList
        plural: k8srequiredlabels
        singular: k8srequiredlabels
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels

        violation[{"msg": msg, "details": {"missing_labels": missing}}] {
          provided := {label | input.review.object.metadata.labels[label]}
          required := {label | label := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }

Let’s have a closer look at this template.

A lot of the data in this template relates to how the CRD will be created, like the apiVersion and the kind. In the metadata and the spec, you define the name of the template, including its singular and plural variations. The important thing to note is the openAPIV3Schema block under validation: here, we define the parameters that the template accepts, in this case the list of labels that a constraint may require. The actual Rego code that gets executed is defined in the targets section. Let’s take a closer look at it:

  1. violation[{"msg": msg, "details": {"missing_labels": missing}}]: This defines the message and the details that are returned to the user when the policy is violated. Every policy rule must be named violation. A rule is violated when its body, enclosed in curly braces {}, evaluates to true.
  2. provided := {label | input.review.object.metadata.labels[label]}: You define a variable called provided that holds the set of labels present on the resource. The expression after := reads as follows: iterate through the dictionary of label=value pairs supplied in the request and extract the label part (the key). In Rego, this is called a comprehension; Python has a similar concept with the same name, so if you have programmed in Python before, this code will look familiar (see the standalone snippet after this list).
  3. required := {label | label := input.parameters.labels[_]}: The required labels are supplied as an array rather than a dictionary (they come from the Constraint that you’ll create next). Here you define a variable called required that holds the set of labels the resource must have. The expression reads: iterate through all the items in the labels array and collect them into the required set.
  4. missing := required - provided: You now have two sets, one containing the required labels and one containing the provided labels. This expression computes the set difference: the labels that are required but not provided.
  5. count(missing) > 0: If the difference contains one or more items, at least one required label is missing, which violates the policy.
  6. msg := sprintf("you must provide labels: %v", [missing]): Since the policy is violated, we build the msg variable referenced in the head of the violation rule. The message is a custom phrase that is displayed to the client.
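
To make the comprehension and the set difference more tangible, here is a small standalone Rego snippet with hypothetical values that you can paste into the Rego Playground. It is not part of the template; it only illustrates steps 2 through 5:

package scratch

# Labels that a hypothetical object already carries (step 2 builds this set).
provided := {"app", "team"}

# Labels that a hypothetical constraint demands (step 3 builds this set).
required := {"app", "gatekeeper"}

# Set difference: required but not provided (step 4).
missing := required - provided

# missing evaluates to {"gatekeeper"}, so count(missing) > 0 holds and the
# violation rule fires (step 5).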

You can now apply this ConstraintTemplate through the use of the following command:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/ef78d95c816ebc426ed75fbc0e1ff96e512a2152/library/general/requiredlabels/template.yaml

You can have several ConstraintTemplates defined on your cluster. However, none of them will be used unless a Constraint that uses one of the templates is defined.

Let’s create the Constraint based on the following YAML:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-gk
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["gatekeeper"]

Note that the kind must match the kind value defined in the ConstraintTemplate.

In the kinds part, you define which Kubernetes resources this policy applies to. In this case, the kind Namespace in the core API group (the empty string "") is a match.

Finally, the parameters part expects an array of labels, whose presence on the namespace the policy verifies. In this case, we ensure that every namespace has the gatekeeper label attached to it.

Let’s apply this Constraint through the use of the following command:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/ef78d95c816ebc426ed75fbc0e1ff96e512a2152/library/general/requiredlabels/constraint.yaml
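
If you want to confirm that the Constraint has been registered, you can query it by the kind the template created (the names below assume the template and Constraint shown above):

 ❯ kubectl get k8srequiredlabels.constraints.gatekeeper.sh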

Now, let’s test this policy by creating a YAML file that deploys a new namespace called team-backend. The YAML file doesn’t provide any labels to the namespace. It should look like this:

apiVersion: v1
kind: Namespace
metadata:
  name: team-backend

Let’s try to apply this YAML file to our cluster:

❯ kubectl apply -f namespace.yaml
Error from server ([denied by ns-must-have-gk] you must provide labels: {"gatekeeper"}): error when creating "namespace.yaml": admission webhook "validation.gatekeeper.sh" denied the request: [denied by ns-must-have-gk] you must provide labels: {"gatekeeper"}

As you can see, we were unable to create the namespace, even though the YAML file is perfectly valid. This is because we didn’t abide by the policy constraint, which requires a label called gatekeeper on the namespace definition.

Let’s modify the YAML file as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: team-backend
  labels:
    gatekeeper: "true"

If we apply the updated YAML file, the namespace will be created successfully. Notice that the policy is looking for the gatekeeper label; however, it doesn’t care about its value. So, in our case, we added the gatekeeper label and set its value to true, but we could have set it to any other value, and the policy wouldn’t have been violated.
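
Because the audit deployment also evaluates objects that already exist in the cluster, you can check a Constraint for recorded violations. The exact fields vary between Gatekeeper versions, but the status section typically contains a totalViolations count and a violations list:

 ❯ kubectl get k8srequiredlabels ns-must-have-gk -o yaml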

Enforce a private container registry

It is good practice to pull your images from a private registry; otherwise, you might accidentally pull a compromised image, which can endanger the security of the whole cluster.

To enforce the use of a private registry, we create a new ConstraintTemplate that looks as follows:

apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        # Schema for the `parameters` field
        openAPIV3Schema:
          properties:
            repos:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sallowedrepos
        violation[{"msg": msg}] {
          container := input.review.object.spec.containers[_]
          satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
          not any(satisfied)
          msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
        }
        violation[{"msg": msg}] {
          container := input.review.object.spec.initContainers[_]
          satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
          not any(satisfied)
          msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
        }

Let’s dig into this code.

You will notice it looks a lot like the k8srequiredlabels ConstraintTemplate, with a few differences.

The name and kind have changed. Notice that the kind should always use CamelCase.

In the schema, we define a parameter called repos, an array of strings that lists the allowed registry prefixes.

Finally, we define our Rego code. If you would like to develop a better understanding of the Rego code we use here, please check out the complex example provided in my previous post.
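
If you want to trace the logic with concrete values, the following standalone snippet mimics the comprehension used above with a hypothetical image name and allowed prefix; it is only an illustration, not part of the template:

package scratch

# Hypothetical inputs: one allowed registry prefix and a disallowed image.
repos := ["ghcr.io/"]
image := "nginx"

# For every allowed prefix, record whether the image starts with it.
satisfied := [good | repo := repos[_]; good := startswith(image, repo)]

# satisfied evaluates to [false]; any(satisfied) is false, so the
# `not any(satisfied)` line in the template holds and a violation fires.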

We can now use the following command to apply this ConstraintTemplate:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/ef78d95c816ebc426ed75fbc0e1ff96e512a2152/library/general/allowedrepos/template.yaml

Let’s create the constraint based on the following YAML:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: images-must-be-from-github
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - "team-backend"
  parameters:
    repos:
      - "ghcr.io/"

We can now apply this Constraint through the use of the following command:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper-library/ef78d95c816ebc426ed75fbc0e1ff96e512a2152/library/general/allowedrepos/constraint.yaml

Now let’s try to run an image in the team-backend namespace.

 ❯ kubectl run -n team-backend  --restart=Never --image nginx toolbox
Error from server ([denied by images-must-be-from-github] container <toolbox> has an invalid image repo <nginx>, allowed repos are ["ghcr.io/"]): admission webhook "validation.gatekeeper.sh" denied the request: [denied by images-must-be-from-github] container <toolbox> has an invalid image repo <nginx>, allowed repos are ["ghcr.io/"]

If we use an image that is published on the GitHub Container Registry (ghcr.io), we can run the pod.

 ❯ kubectl run -n team-backend  --restart=Never --image ghcr.io/linuxcontainersio/nginx toolbox
pod/toolbox created

Wrap up

Hopefully, this post helps shed some light on Gatekeeper and its use cases. Here are the key takeaways:

  • Gatekeeper is a native implementation of OPA for Kubernetes.
  • Gatekeeper is a validating webhook that enforces any CRD-based policies executed by the Open Policy Agent.
  • Gatekeeper’s audit functionality allows administrators to see what resources are currently violating any given policy.
  • A ConstraintTemplate describes both the Rego that enforces the constraint and the schema of the constraint.
  • Constraints inform Gatekeeper how and where a ConstraintTemplate should be enforced.

For more information, check out the Gatekeeper project on GitHub.

This was Part 2 of this series, in which you learned how to install Gatekeeper on an AKS cluster and enforce your first policies. If you would like to learn more about Rego, check out Part 1 of this series.

In Part 3 of this series, we’ll use Azure Policy to take advantage of Microsoft’s automatic Gatekeeper installation and integration on Azure Kubernetes Service. Later, in Part 4, we will look at Conftest and shifting policy control left.