The Rego Language

Because I love to give myself more work, after my talk at the Virtual Azure Community Day, I promised I’d do a series of articles about my adventures with Gatekeeper on Azure Kubernetes Service (AKS).

In Part One of the series, I’ll begin by providing an overview of Rego, a domain-specific language (DSL) that allows us to query things.

One language to query them all

Rego is used by Open Policy Agent (OPA) to write declarative, easily extensible policy decisions. OPA is employed to policy-enable software across several domains. It is used for Kubernetes, Linux, Terraform, Istio, Linkerd, and CloudFoundry, to name a few. As long as we feed Rego JSON, it will be able to query it with the policy rules we have defined.

In this post, I describe how OPA’s policy language, Rego, is used to write policies for Kubernetes.

Rego lets us write policies to answer all kinds of questions. For example:

  • Are certain labels set for the container?
  • Where do we pull the container from?
  • Does the container have memory limits set?

The power of JSON

As mentioned above, Rego is a general-purpose policy language, meaning that it works for any layer of the stack and any domain. Rego just sees JSON data. So you can write a policy about any domain as long as the information you need to make a decision can be stuffed into JSON.
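
To illustrate that the input does not have to be Kubernetes at all, here is a minimal sketch of a policy over a made-up HTTP request document. The package name, the field names (method, path, user), and the values are assumptions chosen for illustration and do not come from a real API. The sketch uses the same pre-1.0 rule syntax as the rest of this post; OPA 1.0 and later expect the if keyword before the rule body.

```rego
package httpapi

# Hypothetical input document (any JSON would do):
# {
#   "method": "GET",
#   "path": ["reports", "2020"],
#   "user": "daniel"
# }

# Deny by default; allow only when every expression in the body holds.
default allow := false

allow {
    input.method == "GET"        # read-only requests only
    input.path[0] == "reports"   # first path segment must be "reports"
    input.user != ""             # a user must be present and non-empty
}
```

With the input shown in the comment, allow evaluates to true. If the method is changed to "DELETE", or the user field is missing, the body fails and allow falls back to false.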

Let’s convert the following Kubernetes deployment, which is a YAML file, to JSON.

If we are using VS Code, there is an extension available that makes this task easy: yaml2json.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kubernetes
  labels:
    app.kubernetes.io/name: hello
spec:
  template:
    metadata:
      labels:
        app: hello-kubernetes
    spec:
      containers:
        - name: hello-kubernetes
          image: paulustm/hello-kubernetes:1.5

This will convert into the following JSON file:

{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {
    "name": "hello-kubernetes",
    "labels": {
      "app.kubernetes.io/name": "hello"
    }
  },
  "spec": {
    "template": {
      "metadata": {
        "labels": {
          "app": "hello-kubernetes"
        }
      },
      "spec": {
        "containers": [{
          "name": "hello-kubernetes",
          "image": "paulustm/hello-kubernetes:1.5"
        }]
      }
    }
  }
}

This spec defines a Deployment that carries the label app.kubernetes.io/name with the value hello, and whose pod template carries the label app with the value hello-kubernetes. In the next step, we are going to validate whether this deployment spec adheres to our policies.

Actual Rego

To make a policy decision in Rego, we write logical tests on the data that comes in as input (such as the deployment data from the last section).

To help write Rego in VS Code, install the Open Policy Agent plugin. This plugin provides syntax checking, highlighting, and a bunch of other useful features.

We’ll use the following Rego code to validate the JSON file we created above:

package main

deny[msg] {
    input.kind == "Deployment" # true
    not input.spec.template.metadata.labels.app # false
    msg = "Containers must provide app label for pod selectors"
}

We can execute this example by using the Command Palette in VS Code and running OPA: Evaluate package. This will return the following JSON:

[
  [{
    "deny": [],
    "warn": []
  }]
]

As there are no messages in this output, all is good. The input JSON was validated with the Rego code, and no violations of our policies were detected.

Let’s walk through the code.

The input variable contains the JSON that we feed to Rego. On Line 4, we check whether the value of kind equals “Deployment”. If this requirement is met, we continue to the next line.

On Line 5, we check whether there is no value for spec.template.metadata.labels.app.

As the comment in the example shows, this expression is false because the label does have a value, hello-kubernetes. Keep in mind that, as soon as one expression in a rule is false, we exit that rule and move on to the next one without adding msg to the output. So, when we executed the Rego code, we did not get any messages back; the input complies with the policy.

If we now change the Rego code to require a label that contains a contact person for the application, we have the following:

package main

deny[msg] {
    input.kind == "Deployment" # true
    not input.metadata.labels.owner # true
    msg = "Containers must provide an owner"
}

As there is no value for metadata.labels.owner, the not expression evaluates to true, and we go on to set msg. If we execute this Rego code, it will return the following JSON:

[
  [{
    "deny": [
      "Containers must provide an owner"
    ],
    "warn": []
  }]
]

We can fix this by updating the input YAML file as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kubernetes
  labels:
    app.kubernetes.io/name: hello
    owner: daniel.paulus
spec:
  template:
    metadata:
      labels:
        app: hello-kubernetes
    spec:
      containers:
        - name: hello-kubernetes
          image: paulustm/hello-kubernetes:1.5

If we now convert this to JSON and run the Rego code, we’ll see that we comply with the policy because we have added an owner label with a value.
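
Both label checks can also live side by side in one package. The sketch below relies on the fact that rules sharing the name deny each contribute their message to the same set, so the output lists every violation at once. The first rule reads the app label from the pod template, which is where the YAML above defines it.

```rego
package main

# Fires when the pod template has no "app" label.
deny[msg] {
    input.kind == "Deployment"
    not input.spec.template.metadata.labels.app
    msg := "Containers must provide app label for pod selectors"
}

# Fires when the Deployment itself has no "owner" label.
deny[msg] {
    input.kind == "Deployment"
    not input.metadata.labels.owner
    msg := "Containers must provide an owner"
}
```

Against the first deployment, which has no owner label, only the second rule contributes a message. Against the updated YAML, neither rule fires and deny stays empty.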

A more complex example

It is not particularly complicated to use the Rego syntax to validate the existence of a label for a deployment. Let’s make things a bit more complicated by incorporating a check for our private repository.

We’d like to make sure all the images used in our clusters are mirrored to our image repository and not pulled from public repositories.

We have the following deployment YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: hello-kubernetes
spec:
  template:
    spec:
      containers:
        - name: hello-kubernetes
          image: paulustm/hello-kubernetes:1.5

Because the image name has no registry prefix, this deployment will pull its container from Docker Hub (hub.docker.com), which is against our policies. Let’s write the Rego code to detect this violation:

package main

deny[msg] {
    input.kind == "Deployment"
    image := input.spec.template.spec.containers[_].image
    not startswith(image, "danielpaulus.com/")
    msg := sprintf("image '%v' comes from untrusted registry", [image])
}

The containers array has an unknown number of elements. As such, to implement an image registry check, you need to iterate over them. We often don’t want to invent new variable names for iteration.

Rego provides the special anonymous variable _ for precisely that reason. So, on Line 5, image := input.spec.template.spec.containers[_].image finds all the images in the containers array and assigns each to the image variable one at a time.

On Line 6, the builtin startswith checks whether its first argument begins with its second. The builtin sprintf on Line 7 formats a string with arguments. Rego has 50+ builtins available.
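
The same iteration pattern can answer the earlier question of whether every container has a memory limit set. This is a minimal sketch, not part of the original policy set: the wording of the message is my own, and the field path resources.limits.memory is the standard location of that setting in a Kubernetes container spec.

```rego
package main

deny[msg] {
    input.kind == "Deployment"
    # Bind each container to a named variable so that more than one
    # of its fields can be used in the rule body.
    container := input.spec.template.spec.containers[_]
    # True when the path is undefined, i.e. no memory limit was declared.
    not container.resources.limits.memory
    msg := sprintf("container '%v' has no memory limit", [container.name])
}
```

Because the hello-kubernetes container above declares no resources block, this rule would add one message for it. Using a named variable instead of repeating containers[_] matters here: two separate uses of _ would iterate independently and could pair the name of one container with the limits of another.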

If we run this policy against the JSON from the above YAML file, it will return the following output:

[
  [{
    "deny": [
      "image 'paulustm/hello-kubernetes:1.5' comes from untrusted registry"
    ]
  }]
]

Now let’s update the image to come from the private repository and provide Rego with the following JSON:

{
  "apiVersion": "apps/v1",
  "kind": "Deployment",
  "metadata": {
    "name": "hello-kubernetes"
  },
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "name": "hello-kubernetes",
            "image": "danielpaulus.com/hello-kubernetes:1.5"
          }
        ]
      }
    }
  }
}

Everything will be fine again, and no deny messages will be returned.
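
One way to pin down both outcomes is a pair of OPA unit tests. The sketch below assumes it is saved in a hypothetical file such as policy_test.rego next to the registry policy, that only the registry rule above is loaded into package main, and that it is run with opa test. Rules whose names start with test_ are picked up as tests, and the with keyword replaces input for a single expression.

```rego
package main

# Expects exactly the message produced for the Docker Hub image.
test_public_image_denied {
    public := {
        "kind": "Deployment",
        "spec": {"template": {"spec": {"containers": [
            {"name": "hello-kubernetes", "image": "paulustm/hello-kubernetes:1.5"}
        ]}}}
    }
    deny["image 'paulustm/hello-kubernetes:1.5' comes from untrusted registry"] with input as public
}

# Expects no deny messages at all for the mirrored image.
test_private_image_allowed {
    private := {
        "kind": "Deployment",
        "spec": {"template": {"spec": {"containers": [
            {"name": "hello-kubernetes", "image": "danielpaulus.com/hello-kubernetes:1.5"}
        ]}}}
    }
    count(deny) == 0 with input as private
}
```

If other deny rules are loaded into the same package, the second test would need to check for the absence of the registry message specifically rather than for an empty set.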

Wrap up

Hopefully, this post helps shed some light on Rego. Here are the key takeaways:

  • Rego lets you write policies about any domain and any layer of the stack: Kubernetes, Linux, Terraform, Istio, Linkerd, and CloudFoundry, to name a few.
  • Rego makes you think about policy, not programming.
  • Rego operates over JSON data. You can supply JSON data as input to every decision.
  • Rego logic is all queries.

For more information, check out the Open Policy Agent project.

Next week, I will publish Part 2 of this series, in which we will learn how to install Gatekeeper on an AKS cluster and put this Rego knowledge to the test.

In Part 3, we’ll use Azure Policy to take advantage of Microsoft’s automatic Gatekeeper installation and integration.

In Part 4, we will look at conftest and at shifting policy control left.