Monitoring a Kubernetes Cluster 🖥️

Put simply, Kubernetes makes it easier to run distributed workloads. By design, these workloads typically run at scale across multiple server instances, and most real-world deployments even run several workloads on the cluster at the same time. Crawling through individual log files is therefore not a practical way to understand what is happening.

Instead, visualization, monitoring, and alerting based on metrics collected from the systems are the way to go. Prometheus & Grafana make a good team and are also recommended by the CNCF (Cloud Native Computing Foundation). This post explains how to set up Prometheus & Grafana and which steps are needed to visualize the metrics collected from Kubernetes.

Cloud Native Computing Foundation

The Cloud Native Computing Foundation was founded by the Linux Foundation in 2015 with the aim of advancing container technology. Kubernetes, an open source container cluster manager, was sponsored by Google in its initial version 1.0 and is now maintained by the CNCF. Over time, renowned companies like Netflix, Cisco, Twitter and SoundCloud joined and donated technologies or sponsored their development.

The CNCF publishes a trail map towards a cloud native approach as a recommendation, along with matching software technologies for each step. Each step builds on the previous one, starting with containerization and the setup of a microservices architecture, followed by a CI/CD pipeline, and leading up to orchestration with Kubernetes. At that point, monitoring, analysis, and logging should be addressed, with Prometheus being recommended by the CNCF as an open source monitoring solution.

Prometheus

Prometheus is a central infrastructure component that stores and processes time-series metrics as part of an overall monitoring system. Its initial development was sponsored by the Berlin-based company SoundCloud. In August 2018, Prometheus was designated a graduated project by the CNCF, which means it fulfills the criteria that prove its production readiness. The full list of criteria is published by the CNCF, but I would still like to highlight two that I think are important:

Have committers from at least two organizations
Have completed an independent and third party security audit with results published

In this post I am going to set up Prometheus and use Grafana to visualize the data collected from the Kubernetes cluster itself and from the applications running on it. Grafana is open source software for visualizing time-series data. It’s compatible with Prometheus, but can also be used with InfluxDB and Graphite.

Fun Fact: Prometheus is used by the Berlin-based company Jodel, whose app is popular among German students for anonymously sharing text messages with people in their neighborhood.

Kubernetes

For this post, a Kubernetes cluster needs to be up and running. It doesn’t matter whether it is hosted on DigitalOcean or runs locally, as long as kubectl is configured accordingly. kubectl is a command line interface for running commands against Kubernetes clusters. On macOS, for example, kubectl can be installed via Homebrew with:

brew install kubernetes-cli

To learn more about how to install and configure kubectl, head over to the documentation.

I use the local Kubernetes cluster that comes with Docker Desktop for macOS. Hence, I won’t run an in-depth test here, but simply check whether the pods are running:

kubectl get pods -n kube-system

coredns-584795fc57-8kmk9                1/1   Running   1   5m1s
coredns-584795fc57-vj7wh                1/1   Running   1   5m1s
etcd-docker-desktop                     1/1   Running   1   3m46s
kube-apiserver-docker-desktop           1/1   Running   1   4m7s
kube-controller-manager-docker-desktop  1/1   Running   1   3m50s
kube-proxy-9754t                        1/1   Running   1   5m1s
kube-scheduler-docker-desktop           1/1   Running   1   3m49s
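
For a scripted version of this check, the pod table can be piped through a small filter instead of being read by eye. The following is only a minimal sketch: the function name count_not_running is hypothetical, and it assumes the default column layout of kubectl get pods when called with --no-headers (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints how many pods report a STATUS (third column) other than "Running".
count_not_running() {
    awk 'NF >= 3 && $3 != "Running" { n++ } END { print n + 0 }'
}

# Usage against a live cluster (requires a configured kubectl):
#   kubectl get pods -n kube-system --no-headers | count_not_running
```

A result of 0 means every listed pod is in the Running state. Pods of finished jobs show up as Completed and would be counted as well, so the filter is best suited to namespaces like kube-system that only hold long-running pods.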

Helm

Helm is a package manager for Kubernetes. It behaves like a package manager for operating systems, as it helps in installing applications and keeping them up-to-date. Helm can be installed on macOS via Homebrew with:

brew install kubernetes-helm

To learn how to install Helm on your operating system, I recommend having a look at the official documentation. Helm is used later on to deploy Prometheus and Grafana to the Kubernetes cluster.

Monitoring namespace

I put the monitoring resources (Prometheus & Grafana) into a namespace of their own to keep them together. This also helps if the resource quotas of all monitoring services need to be adjusted later. To proceed, create a file namespace.yaml inside a folder monitoring and add the following content:

kind: Namespace
apiVersion: v1
metadata:
  name: monitoring

The next commands apply the namespace monitoring to the cluster and check whether it was created correctly:

kubectl apply -f monitoring/namespace.yaml
kubectl get namespaces | grep monitoring

monitoring    Active    8s
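
Instead of grepping the table, the phase of the namespace can also be read directly from its status field, which is easier to use in scripts. This is a small sketch under the assumption that kubectl is configured for the cluster; the function name namespace_phase is made up for illustration.

```shell
# Hypothetical helper: prints the phase of the given namespace,
# which Kubernetes reports as "Active" or "Terminating".
namespace_phase() {
    kubectl get namespace "$1" -o jsonpath='{.status.phase}'
}

# Usage:
#   [ "$(namespace_phase monitoring)" = "Active" ] && echo "namespace is ready"
```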

Setup Prometheus

Helm, as a package manager, makes it fairly easy to install Prometheus (or any other application) to the cluster. First, I update the local repository to ensure that the latest version of Prometheus gets deployed. Second, I install the prometheus package to the cluster, into the newly created monitoring namespace.

helm repo update
helm install stable/prometheus \
    --namespace monitoring \
    --name prometheus

It takes some time until Prometheus changes its status from Pending to Running. As soon as it is running, Prometheus, together with node-exporter, begins to scrape the cluster and collect metrics. To access a rudimentary dashboard, the port of Prometheus needs to be forwarded with the next commands:

export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 9090

Once the second command has succeeded, visit http://127.0.0.1:9090 to check out Prometheus in action.
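
Besides the web UI, the forwarded port also exposes the Prometheus HTTP API, which is handy for a quick check from the terminal. The sketch below is an illustration rather than part of the setup: prom_query and the PROM_URL variable are hypothetical names, and the default URL assumes the port-forward from above is still running in another terminal.

```shell
# Hypothetical helper: sends an instant PromQL query to the Prometheus HTTP API.
# PROM_URL is an assumed variable name; its default matches the port-forward above.
prom_query() {
    curl --silent --get \
        --data-urlencode "query=$1" \
        "${PROM_URL:-http://127.0.0.1:9090}/api/v1/query"
}

# Usage:
#   prom_query 'up'
```

The up metric is recorded by Prometheus for every scrape target, with a value of 1 if the last scrape succeeded and 0 otherwise, so it is a reasonable first query to confirm that metrics are actually arriving.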

Setup Grafana

As mentioned in the previous section, the built-in dashboard of Prometheus is rather simple. Grafana is a good complement, as it can use Prometheus as a datasource to create more powerful dashboards. Once installed to the cluster, Grafana must be configured to read the collected metrics from Prometheus. There are two ways to achieve this:

Configure the datasource manually via Grafana’s UI
Configure the datasource via YAML file

I choose the second option, because I want the steps to be reproducible and the manual effort to be as low as possible.

Configuration

Once Grafana has been installed to the cluster with Helm, it looks for any ConfigMap labeled with grafana_datasource. Thus, create a file monitoring/grafana/config.yaml and add the following content:

apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-grafana-datasource
  namespace: monitoring
  labels:
    grafana_datasource: "1" # Tells Grafana's provisioner that this is a datasource to inject
data:
  datasource.yaml: |- # Block scalar: the following lines are stored as plain text
    apiVersion: 1 # Provisioning configuration that Grafana reads
    datasources:
    - name: Prometheus
      type: prometheus
      access: proxy # Grafana's backend proxies the requests to Prometheus instead of the browser querying it directly
      orgId: 1
      url: http://prometheus-server.monitoring.svc.cluster.local # Cluster-internal DNS name of the Prometheus server service

The next step is to apply the configuration file and do a simple check:

kubectl apply -f monitoring/grafana/config.yaml
kubectl get configmaps -n monitoring

NAME                            DATA    AGE
prometheus-alertmanager         1       15h
prometheus-grafana-datasource   1       53s
prometheus-server               3       15h
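
Since the lookup is driven by the label rather than by the name of the ConfigMap, it can be worth checking the label itself. The following sketch lists only the ConfigMaps that carry the grafana_datasource label, using a label selector; the function name labeled_datasources is hypothetical, and the namespace defaults to monitoring as used in this post.

```shell
# Hypothetical helper: prints the names of all ConfigMaps in a namespace
# that carry the grafana_datasource label, whatever its value is.
labeled_datasources() {
    kubectl get configmaps \
        --namespace "${1:-monitoring}" \
        --selector grafana_datasource \
        --output name
}

# Usage:
#   labeled_datasources
```

If the label was applied correctly, the ConfigMap prometheus-grafana-datasource should appear in the result, while prometheus-alertmanager and prometheus-server should not.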

Override default values

When using a Helm chart, its default values can be overridden by creating a values.yaml file and passing it as an argument to helm install. A list of the values that can be overridden in Grafana’s Helm chart is available in the chart’s documentation.

Create a file monitoring/grafana/values.yaml and add the following content:

sidecar:
  image: xuxinkun/k8s-sidecar:0.0.7
  imagePullPolicy: IfNotPresent
  datasources:
    enabled: true
    label: grafana_datasource

This injects a sidecar that loads the datasource into Grafana when it gets provisioned. Finally, Grafana can be installed to the cluster with the overridden values by using:

helm install stable/grafana \
    -f monitoring/grafana/values.yaml \
    --namespace monitoring \
    --name grafana
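
As with Prometheus, the Grafana pod needs a moment to start. Rather than polling the pod list by hand, one could block until the pod reports ready. This is a hedged sketch: wait_for_grafana is a hypothetical function name, the 300s default timeout is an arbitrary choice, and the label selector assumes the pod labels app=grafana and release=grafana that this chart version sets.

```shell
# Hypothetical helper: blocks until the Grafana pod reports the Ready condition
# or the timeout (first argument, default 300s) expires.
wait_for_grafana() {
    kubectl wait pod \
        --namespace monitoring \
        --selector "app=grafana,release=grafana" \
        --for=condition=Ready \
        --timeout="${1:-300s}"
}

# Usage:
#   wait_for_grafana 120s
```

Note that kubectl wait reports an error if no pod matches the selector yet, so the call only makes sense after helm install has returned.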

Access Dashboard

By default, Grafana is deployed with a randomly generated password. The next command retrieves it and prints it to the console:

kubectl get secret \
    --namespace monitoring \
    grafana \
    -o jsonpath="{.data.admin-password}" \
    | base64 --decode ; echo
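
The base64 step is needed because Kubernetes stores the values in a Secret’s data field base64-encoded, which is an encoding and not encryption. The sketch below isolates that decoding step; decode_secret_value is a hypothetical helper, and the sample string is made up and not a real password.

```shell
# Hypothetical helper: decodes a single base64-encoded secret value.
decode_secret_value() {
    printf '%s' "$1" | base64 --decode
}

# "czNjcjN0" is a made-up sample value that decodes to "s3cr3t".
decode_secret_value 'czNjcjN0'; echo
```

The trailing echo only adds a newline, as the decoded value itself does not end with one.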

With the password at hand, the port of Grafana needs to be forwarded to make it accessible. Afterwards, one can log in at http://127.0.0.1:3000 with the username admin and the password just obtained.

export POD_NAME=$(kubectl get pods --namespace monitoring -l "app=grafana,release=grafana" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace monitoring port-forward $POD_NAME 3000

Lastly, a dashboard can be created in Grafana. Here it’s possible to directly choose Prometheus as the data source, as the connection details have been injected during the deployment. Besides building a custom dashboard from scratch, templates may also be used.