Monitoring Kubernetes with Prometheus

Kubernetes is one of the fastest-growing open-source projects. It is a portable, extensible, open-source platform for managing containerized workloads and services, and companies are widely adopting it to build and run their major products. Docker is often used to run Kubernetes locally for testing purposes.

It is essential for companies to monitor their Kubernetes clusters and the containers running in them. Tools such as Prometheus, Grafana, Fluentd, the ELK stack, and cAdvisor all help with Kubernetes monitoring. In this post, we will cover monitoring your Kubernetes cluster with Prometheus.

Prometheus is a well-known tool for monitoring Kubernetes applications. Its advantages include a multi-dimensional data model, service discovery, modular and highly available components, and a good fit with DevOps culture.

First, let's look at the challenges of monitoring Kubernetes, and then at the benefits of using Prometheus to address them.

The challenges of monitoring Kubernetes

To establish dependable monitoring, alerting, and graphing architecture for a Kubernetes cluster, we must overcome numerous problems.

Visibility Issue:

Containers are difficult to monitor because they are lightweight, immutable black boxes: you can observe their inputs and outputs, but not what happens inside them.

The Kubernetes API and kube-state-metrics (which natively exposes Prometheus metrics) expose Kubernetes internal data such as the number of desired replicas in a deployment, unschedulable nodes, and so on.

Because you only need to offer a metrics port and don't need to add too much complexity or run other services, Prometheus is a suitable fit for microservices. Often, the service will already have an HTTP interface, and the developer will only need to add a new path, such as /metrics.
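
As a rough illustration of how little is needed, the following Python sketch adds a /metrics path to a small HTTP service using only the standard library. The metric name app_requests_total, the handler, and port 8000 are assumptions made for this example; a real service would normally use an official Prometheus client library instead of formatting the output by hand.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory counter of handled requests, keyed by HTTP method.
REQUEST_COUNTS = {"GET": 0}


def render_metrics(counts):
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP app_requests_total Total HTTP requests handled.",
        "# TYPE app_requests_total counter",
    ]
    for method, value in sorted(counts.items()):
        lines.append(f'app_requests_total{{method="{method}"}} {value}')
    return "\n".join(lines) + "\n"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/metrics":
            # The extra path that Prometheus scrapes.
            body = render_metrics(REQUEST_COUNTS).encode("utf-8")
            content_type = "text/plain; version=0.0.4"
        else:
            # The service's normal work; count it so it shows up in /metrics.
            REQUEST_COUNTS["GET"] += 1
            body = b"hello\n"
            content_type = "text/plain"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the example service; blocks until interrupted."""
    HTTPServer(("", port), Handler).serve_forever()
```

Calling serve() and then opening localhost:8000/metrics in a browser should show the counter in plain text, which is the same thing Prometheus would read on each scrape.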

Dynamic Infrastructure: Instability

Ephemeral entities that can start or stop reporting at any time are a problem for traditional, more static monitoring systems. To deal with this, Prometheus offers several auto-discovery mechanisms, including Kubernetes and Consul service discovery, which appear again in the architecture section of this post.

New layers of Infrastructure

With multiple layers of infrastructure, things like physical hosts or service ports have lost much of their relevance. Monitoring now needs to be organized around microservice performance (with distinct pods dispersed over several nodes), namespaces, deployment versions, and so on.

You can easily adjust to these new scopes by using Prometheus' label-based data model in conjunction with PromQL.

Advantages of Using Prometheus for Kubernetes Monitoring

Prometheus is a modern solution for monitoring applications, and it works particularly well with Kubernetes. There are several benefits of using Prometheus to monitor your Kubernetes cluster, such as its multi-dimensional data model and its fit with DevOps culture. Some of them are listed below:

1. Multi-dimensional Data Model

In a multi-dimensional data model, each time series is identified by a metric name and a set of key-value pairs called labels. This allows flexible and accurate queries over time-series data and is what powers Prometheus' query language. Because every label acts as a dimension, you can view and model the same data from multiple dimensions, much like a data cube.

Dimensions and facts define multi-dimensional data. Dimensions are the objects or entities about which records are kept (in Prometheus, the labels), and the fact is the measured value around which the data is organized. This multi-dimensional data can be queried using PromQL. With PromQL, we can select and aggregate time-series data and then use the results for analysis, visualization, and future reference.

In some circumstances, the service cannot serve Prometheus metrics itself, and you cannot change its code to make it do so. In that situation, you need to deploy a Prometheus exporter alongside the service, usually as a sidecar container within the same pod.
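
To make selecting and aggregating by labels more concrete, here is a small Python sketch that mimics what a PromQL expression of the form sum by (namespace) (http_requests_total) does. The sample data, metric names, and helper functions are invented for illustration; this shows the shape of the data model, not how Prometheus actually evaluates queries.

```python
from collections import defaultdict

# Hypothetical samples as (labels, value) pairs.
# "__name__" is the label under which Prometheus keeps the metric name.
SAMPLES = [
    ({"__name__": "http_requests_total", "namespace": "shop", "pod": "web-1"}, 120.0),
    ({"__name__": "http_requests_total", "namespace": "shop", "pod": "web-2"}, 80.0),
    ({"__name__": "http_requests_total", "namespace": "blog", "pod": "web-1"}, 30.0),
    ({"__name__": "process_open_fds", "namespace": "shop", "pod": "web-1"}, 12.0),
]


def select(samples, matchers):
    """Keep the series whose labels equal every given matcher."""
    return [
        (labels, value)
        for labels, value in samples
        if all(labels.get(key) == want for key, want in matchers.items())
    ]


def sum_by(samples, label):
    """Sum values per distinct value of one label, dropping all other labels."""
    totals = defaultdict(float)
    for labels, value in samples:
        totals[labels.get(label, "")] += value
    return dict(totals)


requests = select(SAMPLES, {"__name__": "http_requests_total"})
per_namespace = sum_by(requests, "namespace")
```

The same samples could just as easily be grouped by pod, which is the practical benefit of keeping every dimension as a label instead of baking it into the metric name.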

2. DevOps Culture

DevOps is emerging among companies. Developers want to integrate development with operations because both are involved in continuous integration and deployment. This has dramatically increased the speed of deployment of the application.

In this culture, various components and tools like client libraries, application instrumentation code, exporters for converting metrics into Prometheus formats, alert managers, UI plugins, or third-party plugins exist in one ecosystem.  

3. Accessible Formats and Protocols

Prometheus uses tagged (labelled) metrics rather than the dot-separated hierarchical names common in older systems. Metrics are exposed in a human-readable text format and published over HTTP, so you can also view them in a web browser. Client libraries are available for languages such as Go, Java, Python, and Node.js, and they help you instrument your code and expose its metrics in this format.

4. Service Discovery

Prometheus pulls (scrapes) data from its targets, so the software and services do not need to push it. Prometheus has several methods for auto-discovering scrape targets, and you can use filters and matchers to select exactly the targets you want.
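
The filtering step can be sketched in a few lines of Python. This example assumes a list of discovered pods (invented here) and keeps only those carrying the prometheus.io/scrape annotation, a convention used by many Prometheus configurations for Kubernetes rather than something built into Prometheus itself. In a real setup this filtering is expressed as relabeling rules in the Prometheus configuration, not as Python code.

```python
# Annotation names follow a widely used convention; they are an assumption here.
SCRAPE_ANNOTATION = "prometheus.io/scrape"
PORT_ANNOTATION = "prometheus.io/port"


def scrape_targets(pods):
    """Return "ip:port" targets for pods that opted in to scraping."""
    targets = []
    for pod in pods:
        annotations = pod.get("annotations", {})
        if annotations.get(SCRAPE_ANNOTATION) != "true":
            continue  # pod did not opt in
        port = annotations.get(PORT_ANNOTATION)
        if port is None:
            continue  # no metrics port declared
        targets.append(f"{pod['ip']}:{port}")
    return targets


# Hypothetical discovery result.
DISCOVERED_PODS = [
    {"ip": "10.0.0.5", "annotations": {"prometheus.io/scrape": "true", "prometheus.io/port": "8080"}},
    {"ip": "10.0.0.6", "annotations": {}},
    {"ip": "10.0.0.7", "annotations": {"prometheus.io/scrape": "false", "prometheus.io/port": "8080"}},
]
```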

5. Time-Series Database

Time-series data is essential for companies because it records how a value changes at a particular time or over a time frame. It is simply a series of events tracked through monitoring and aggregated over time, such as application performance or server metrics. Because Prometheus is built around a time-series database, it copes well with large numbers of data points and data sources, which in turn allows broader monitoring and finer control.

6. Scalability

With Prometheus, you can run multiple Prometheus servers and combine them using federation. One server scrapes selected time series from the others through their “/federate” endpoint. This approach is handy if you have access to all the servers.
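
A federation scrape is an ordinary HTTP request to the /federate endpoint of another Prometheus server, with one or more match[] parameters selecting the series to pull. The Python sketch below only builds such a URL; the host name prom-a and the selectors are placeholders, and in practice you would put the same information into a scrape job in prometheus.yml.

```python
from urllib.parse import urlencode


def federate_url(base_url, matchers):
    """Build a /federate URL that selects series with match[] parameters."""
    # Each selector becomes its own match[] parameter, URL-encoded.
    query = urlencode([("match[]", matcher) for matcher in matchers])
    return f"{base_url.rstrip('/')}/federate?{query}"


# Hypothetical source server and series selectors.
EXAMPLE_URL = federate_url(
    "http://prom-a:9090",
    ['{job="kubernetes-pods"}', "up"],
)
```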

7. Whitebox and Blackbox Monitoring

Prometheus with Kubernetes offers various third-party libraries and exporters, which enable both whitebox and blackbox monitoring. Whitebox monitoring relies on data from inside the system, such as logs or HTTP handlers that expose internal statistics. Blackbox monitoring checks externally visible behavior that affects your users, such as a server being down, a page not working, or degraded site performance.

8. Pull-based Metrics

What if your services do not know where your monitoring system is? With pull-based monitoring they do not need to: simply expose the metrics of your servers on an HTTP endpoint, and Prometheus will scrape them from it. You just need to serve your metrics in the Prometheus format on a web port. If the metrics are not in the Prometheus format, you can use one of the many third-party exporters. Once your endpoint is ready, Prometheus will find it through its service discovery mechanisms and collect the metrics, which you can then filter and aggregate. Prometheus can do this thanks to its service discovery support for platforms like Kubernetes, OpenStack, GCE, AWS EC2, ZooKeeper Serversets, and many more.
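
The pull model can be sketched as a loop over known targets. The Python example below is a simplified illustration, not the Prometheus scraper: it fetches /metrics from each target and records an up value of 1 or 0 depending on whether the request succeeded, similar in spirit to the up series that Prometheus keeps per target. The target addresses are placeholders, and the fetch function is injectable so the logic can be tried without a network.

```python
import urllib.request


def http_fetch(url, timeout=5.0):
    """Fetch a URL and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def scrape_once(targets, fetch=http_fetch):
    """Pull /metrics from every target and note which ones responded."""
    results = {}
    for target in targets:
        try:
            body = fetch(f"http://{target}/metrics")
            results[target] = {"up": 1, "body": body}
        except OSError:
            # Connection errors and timeouts are OSError subclasses.
            results[target] = {"up": 0, "body": ""}
    return results
```

Note that the monitored services never need the address of the monitoring system; only the scraper needs to know the targets.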

Getting Started: Prometheus Installation and Configuration

Prometheus is an open-source program for monitoring and alerting based on metrics. It connects to your app, extracts real-time metrics, compresses them, and saves them in a time-series database. It has a robust data model and query language and the ability to deliver thorough and actionable information. Prometheus, like Kubernetes, has achieved a mature “graduated” stage with CNCF.

Before installing and configuring Prometheus, make sure that your system meets the prerequisites for the installation.

Now let us move on to the installation. There are various ways to install Prometheus, such as a precompiled binary, a Docker image, or a configuration management system like Ansible, Chef, or Puppet. In this article, we will first use the precompiled Prometheus binary.

Step 1

First, download the Prometheus release archive for your platform from the official download page. It contains the binary file that we will use for the installation.

Step 2

Extract the tar.gz file with this command:

tar -xvzf prometheus-2.11.1.linux-amd64.tar.gz

Step 3

After extraction, move into the prometheus-2.11.1.linux-amd64 directory and list its contents. It mainly contains three files:

  1. prometheus 
  2. prometheus.yml 
  3. promtool

Step 4

Now execute the binary present in the folder using the following command.

./prometheus

After successful execution, visit localhost:9090; you will see the Prometheus web UI running there. A data/ folder will also have been created inside the Prometheus folder. Prometheus saves all the collected metrics in this folder, and you can see Prometheus' own metrics at localhost:9090/metrics.

Step 5

You have successfully installed and run the Prometheus binary. Now it’s time to run Prometheus as a service. First of all, create a file:

/etc/systemd/system/prometheus.service

Paste the below code in the file:

[Unit]
Description=Prometheus Server
Documentation=https://prometheus.io/docs/introduction/overview/
After=network-online.target

[Service]
User=root
Restart=on-failure

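# Change the paths below if you downloaded Prometheus to a different location.
# systemd requires absolute paths here and does not expand "~".

ExecStart=/root/prometheus/prometheus --config.file=/root/prometheus/prometheus.yml --storage.tsdb.path=/var/lib/prometheus/data/ --web.external-url=http://myurl.com:9090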

[Install]
WantedBy=multi-user.target

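Now save and exit. Next, reload the systemd daemon so that it picks up the new unit file (afterwards, you can start the service with sudo systemctl start prometheus).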

sudo systemctl daemon-reload

Alternatively, to run the Prometheus binary directly from your host's command line, execute it from the extracted directory. The startup log will look similar to this:

prometheus-2.21.0.linux-amd64$ ./prometheus
level=info ts=2020-09-25T10:04:24.911Z caller=main.go:310 msg="No time or size retention was set so using the default time retention" duration=15d
[...]
level=info ts=2020-09-25T10:04:24.916Z caller=main.go:673 msg="Server is ready to receive web requests."

Kubernetes + Prometheus Monitoring Architecture

The Kubernetes-Prometheus architecture consists of various components that we will discuss in this section. cAdvisor runs as part of the kubelet binary in Kubernetes. It is an open-source agent that analyzes the resource usage and performance of running containers, and it supports Docker containers natively.

kube-state-metrics is a simple service that listens to the Kubernetes API and generates metrics about the state of objects such as deployments, nodes, and pods.

The whole Kubernetes and Prometheus architecture includes the following components:

The Prometheus servers need as much target auto-discovery as possible. To achieve this, Prometheus supports several service discovery (SD) mechanisms: Kubernetes SD, Consul SD, Azure SD for Azure VMs, GCE SD for GCE instances, EC2 SD for AWS VMs, and File SD. Prometheus servers act as the core of the whole system, much like a brain. The server stores the captured metrics as multi-dimensional time series.

Prometheus can also collect metrics related to Kubernetes services and orchestration status. kube-state-metrics provides orchestration and cluster-level metrics, and the Kubernetes control plane components (kubelet, etcd, DNS, scheduler, and so on) expose their own metrics. Alertmanager manages alert notifications, grouping, inhibition, and so on. In Prometheus, you configure alerting rules written in PromQL to trigger alerts. Grafana displays the scraped metrics in dashboards with a richer UI.
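
To illustrate how an alerting rule turns a series of evaluations into an alert, here is a small Python sketch. Prometheus alerting rules can specify a "for" duration, during which a matching alert is pending before it fires. The sketch expresses that duration as a number of evaluation intervals, which is a simplification, and the threshold and values are invented.

```python
def alert_states(values, threshold, for_evals):
    """Return the alert state after each evaluation of a "value > threshold" rule.

    for_evals is the "for" duration expressed in evaluation intervals
    (an assumption of this sketch; Prometheus uses a time duration).
    """
    states = []
    streak = 0  # consecutive evaluations where the condition held
    for value in values:
        if value > threshold:
            streak += 1
            # The first match starts the pending period; the alert fires
            # once the condition has held for for_evals further intervals.
            states.append("firing" if streak > for_evals else "pending")
        else:
            streak = 0
            states.append("inactive")
    return states


# Hypothetical error-rate readings, one per evaluation.
EXAMPLE_STATES = alert_states([1, 5, 5, 5, 1], threshold=3, for_evals=2)
```

The pending period is what keeps a single short spike from paging anyone; Alertmanager then takes over grouping and routing for alerts that do fire.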

How Does Prometheus Connect to Kubernetes?

There are various ways of installing Prometheus in your Kubernetes cluster, but the two most commonly used methods are:

You can follow the steps on the official Prometheus website to download and run the Prometheus binary on your host.

To deploy a Prometheus server inside a container, just run the following command, and the Prometheus server will be available on localhost:9090:

docker run -p 9090:9090 -v /tmp/prometheus.yml:/etc/prometheus/prometheus.yml \
      prom/prometheus

Now you can use this Docker image to create a Kubernetes Deployment object. Alternatively, you can use Helm to install Prometheus in Kubernetes. The Helm chart, maintained by the Prometheus community, makes it easy to install and configure Prometheus and the other applications that form its ecosystem.

Use the following steps for installation of Prometheus in Kubernetes:

First, add the Prometheus charts repository into your Helm configuration.

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add stable https://charts.helm.sh/stable
helm repo update

Now install Prometheus:

# Helm 3
helm install [RELEASE_NAME] prometheus-community/prometheus
# Helm 2
helm install --name [RELEASE_NAME] prometheus-community/prometheus

Now you have successfully installed Prometheus from the community Helm chart.

How to Monitor Kubernetes Using Prometheus

We use open-source tools like Prometheus in the Kubernetes ecosystem to probe services and discover whether they are responsive. However, monitoring a Kubernetes cluster still resembles monitoring legacy infrastructure in one area: the fundamental need to monitor disk space, CPU, and memory on individual nodes, and to ensure that the nodes are available, remains. What changes is everything that runs on top of those nodes.

You have learned how to install Prometheus using its binary file. You have also learned how to install Prometheus from its Helm chart. Prometheus is a handy monitoring tool that can also monitor many other kinds of software, such as Kafka producers, Kafka clients, and Cassandra clients. For Kubernetes in particular, Prometheus is widely recommended.

What Can I Monitor in Prometheus?

Using Prometheus, you can monitor different metric types: counters, gauges, histograms, and summaries. Prometheus also has client libraries for programming languages such as Java, Go, JavaScript, and Python. It also has a vast list of exporters. Exporters are tools that you use when you have to export metrics from a system and translate them into a format suitable for Prometheus.
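
The difference between the main metric types is easiest to see in code. The following Python sketch models a counter, a gauge, and a histogram with plain classes; these classes are written for this post and are not the API of any Prometheus client library. A counter only goes up, a gauge can move in both directions, and a histogram counts observations in cumulative buckets with an upper bound.

```python
import bisect


class Counter:
    """A value that only increases, such as total requests served."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("counters can only increase")
        self.value += amount


class Gauge:
    """A value that can go up and down, such as memory in use."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = float(value)


class Histogram:
    """Counts observations in buckets by upper bound, plus a running sum."""

    def __init__(self, bounds):
        self.bounds = sorted(bounds)
        self.counts = [0] * (len(self.bounds) + 1)  # last slot is +Inf
        self.sum = 0.0

    def observe(self, value):
        # bisect_left puts a value equal to a bound into that bound's bucket.
        self.counts[bisect.bisect_left(self.bounds, value)] += 1
        self.sum += value

    def cumulative(self):
        """Return (upper_bound, count of observations <= upper_bound) pairs."""
        total, out = 0, []
        for bound, count in zip(self.bounds + [float("inf")], self.counts):
            total += count
            out.append((bound, total))
        return out
```

For example, observing request durations of 0.05, 0.5, and 2.0 seconds in a histogram with bounds 0.1, 0.5, and 1.0 yields cumulative bucket counts of 1, 2, 2, and 3.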

Monitoring a Kubernetes Service using Prometheus

Prometheus metrics are exposed by services through HTTP(S) endpoints.

Many services are designed to expose Prometheus endpoints, so Prometheus can scrape their metrics directly. Many other services are not Prometheus-ready, which is why we need to use exporters for them.

Let us first see how we can monitor Kubernetes services whose microservices already offer a Prometheus endpoint.

As an example, we will use a reverse proxy called Traefik. It is mainly used as an Ingress controller, that is, the entry point and connector between the internet and the microservices inside your cluster.

To use Traefik, first, you have to install it using the Traefik installation guide or Kubernetes-essentials installation. For quick Traefik deployment with Prometheus support, use the following commands: 

helm repo add stable https://charts.helm.sh/stable
helm install traefik stable/traefik --set metrics.prometheus.enabled=true

Now you can check that Prometheus metrics are being exposed by the traefik-prometheus service by running curl against it:

$ curl 100.66.30.208:9100/metrics
# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 2.4895e-05
go_gc_duration_seconds{quantile="0.25"} 4.4988e-05
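
Output like the above is simple enough to read programmatically. The Python sketch below parses individual lines of the text format into a metric name, a label dictionary, and a value. It is a deliberately simplified parser written for this post: it skips comment lines, ignores optional timestamps, and does not unescape label values, so it should not be taken as a complete implementation of the format.

```python
import re

# metric_name, optional {labels}, then the sample value.
SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# label_name="label value" pairs inside the braces.
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_line(line):
    """Parse one exposition line into (name, labels, value), or None."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line, or a HELP/TYPE comment
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    name, raw_labels, value = match.groups()
    labels = dict(LABEL_RE.findall(raw_labels or ""))
    return name, labels, float(value)


# The sample lines from the curl output above.
EXAMPLE = [
    "# TYPE go_gc_duration_seconds summary",
    'go_gc_duration_seconds{quantile="0"} 2.4895e-05',
    'go_gc_duration_seconds{quantile="0.25"} 4.4988e-05',
]
PARSED = [sample for sample in map(parse_line, EXAMPLE) if sample is not None]
```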

Now let’s see how we can use Prometheus to monitor Kubernetes services that need exporters. Some server applications, like Nginx and PostgreSQL, are much older than Prometheus and do not expose Prometheus metrics natively, so we need exporters to bring their metrics into Prometheus. Many exporters are available, and you can use whichever serves your purpose. You can check the complete list of exporters on the official website.

Monitoring Kubernetes Cluster with Prometheus

Prometheus can also monitor the cluster itself.

For monitoring Kubernetes nodes, we can use node-exporter, hosted by the Prometheus project itself. It can be easily deployed as a DaemonSet and will automatically scale as you add or remove nodes in your cluster.

Cons of Using Prometheus

We have talked so much about the benefits of Prometheus and how to use it. Now let’s see some detailed downsides of Prometheus. The downsides are:

Prometheus does not support logs. You need both metrics and logs for total visibility into your applications, but log management is already handled by a number of open- and closed-source log aggregators.

Anomaly detection, automatic horizontal scalability, and user management are also not available with Prometheus. These functionalities are essential in most large-scale enterprise environments, according to our customer base.

Multi-tenancy is also an issue in Prometheus, which means it can scrape several targets but does not separate users, authorization, or keep metrics "separate" among users. Anyone can access the data using the endpoint or API (it does not contain any authorization-like token system). The same holds true for capacity isolation. If a single user or target transmits too many metrics, the Prometheus server may crash for everyone. All of these problems limit its scalability, making it difficult to use Prometheus in a corporate setting.

Prometheus is an excellent monitoring system. Its original design was simple and adaptable: it stores all of its compressed metrics on a single host, in its own disk-based time-series database. Prometheus therefore has a non-distributed storage layer and is not built for long-term storage, that is, keeping data for months or years, because all the data is on one machine. Prometheus is excellent for alerting and short-term trends, but not for longer historical data needs (such as capacity planning or invoicing, where historical data is critical and you want to preserve data for a long time).

Prometheus is not a dashboarding solution; it only has a basic UI for experimenting with PromQL queries. For dashboarding it relies on Grafana, which adds a layer of complexity to the setup.

Recap

Metrics are vital for any application, and you need to check them regularly if you want your software to grow successfully. Prometheus has made monitoring Kubernetes nodes and clusters very easy. It provides several metric types: counter, gauge, histogram, and summary.

You can monitor your Kubernetes services or cluster using Prometheus. While many microservices expose their metrics directly on a Prometheus endpoint, others that are older than Prometheus need exporters to make their metrics available to Prometheus.