Getting Started with Prometheus and Grafana
The world is moving towards a data-driven society, and businesses are gathering ever more data to leverage for profit. This data comes in structured forms like JSON, CSV, or XML, or unstructured forms like free-form notes and typed text. Regardless of the data's format, what matters is the information it contains, and various tools on the market extract data from different agents and platforms to improve your metric monitoring.
Prometheus is a tool used to monitor applications and show all essential metrics on one screen. Prometheus does not use protocols like SNMP or rely on an agent service. Instead, it scrapes metrics from a client (target) over HTTP and stores them in a local time-series database that you can query with its own query language, PromQL. It also uses exporters to convert metrics into the Prometheus format. After collecting and processing data, Prometheus can display it in a polished UI using Grafana, a third-party tool commonly used to visualize Prometheus metrics.
This article will discuss Prometheus, Grafana, and how Prometheus uses Grafana as a third-party visualization tool - use these links to navigate through the blog:
An Overview of Prometheus
Prometheus is an open-source tool used for monitoring and maintaining applications. It began at SoundCloud as a standalone, open-source project. In 2016, Prometheus joined the Cloud Native Computing Foundation as its second hosted project, after Kubernetes. Initially, SoundCloud used StatsD and Graphite, which were not that advanced. Taking inspiration from Borgmon (used internally by Google at the time), the developers added features like a multi-dimensional data model, operational simplicity, and scalable data collection, and integrated them all within Prometheus itself.
Some of the core features that make Prometheus so valuable are:
- The multi-dimensional data model in which data is identified by key-value pairs.
- PromQL, a flexible language for querying the multi-dimensional data model.
- Server-centered monitoring with no reliance on distributed storage; single server nodes are autonomous.
- Automatic pulling of metrics via an HTTP endpoint.
- Exporters for systems that do not expose metrics in a supported format.
- Multiple modes of graphing and dashboarding support.
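To make the pull model concrete, the metrics Prometheus scrapes over HTTP are plain text in its exposition format. A scrape of a target's /metrics endpoint returns lines like the following (the metric name and label values here are illustrative):

```text
# HELP http_requests_total Total number of HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="get",code="500"} 3
```

Each line is one sample: a metric name, a set of key-value labels, and the current value.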
How does Prometheus work?
Prometheus consists of several architectural components, such as the Alertmanager, exporters, the Pushgateway, and the Prometheus server itself. Prometheus first scrapes data from instrumented jobs, either directly via HTTP endpoints or with the help of the Pushgateway. Sometimes Prometheus uses exporters to convert metrics into a Prometheus-readable format before pulling them. After scraping, it processes the data and evaluates rules to record new time series from existing ones or to generate alerts. For data visualization, you can use Grafana and other third-party tools.
Prometheus is suitable for both machine-centric monitoring and dynamic service-oriented architectures. It works especially well for purely numeric time-series data, and its support for collecting and querying multi-dimensional data fits a world where every company is moving towards microservices. You can rely on Prometheus to quickly find and diagnose problems: each server is standalone, so monitoring keeps working even when other parts of your infrastructure are broken.
Like any other tool, Prometheus comprises both internal and external components that together make it a strong monitoring tool. Each component has its own specific job and its own requirements.
In total, Prometheus has 7 components.
The server is the brain of the system. It collects multi-dimensional time-series data, then analyzes and aggregates that data. The process of collecting metrics is called scraping. The Prometheus server determines the targets and scrapes them automatically. The language used for querying this data is called PromQL. After applying filters and adjustments, the data is saved as time series identified by key-value label pairs.
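As a flavor of what PromQL looks like, the queries below compute a request rate from a counter and aggregate it by label (the metric name http_requests_total is illustrative, not something the server exposes by default):

```text
# Per-second rate of successful requests over the last 5 minutes
rate(http_requests_total{code="200"}[5m])

# The same rate, summed across instances and grouped by job
sum by (job) (rate(http_requests_total[5m]))
```

Label matchers inside the braces filter the series; the range selector [5m] picks the window over which the rate is computed.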
Not all types of metrics can be scraped by the Prometheus server alone; some require extra mechanisms. The Pushgateway is an intermediary used for metrics from jobs that cannot be scraped by the usual methods. But you have to be cautious while using the Pushgateway, as there are certain drawbacks:
- When you funnel multiple instances through a single Pushgateway, it becomes both a potential bottleneck and a single point of failure.
- With the Pushgateway, you lose Prometheus's automatic health monitoring of the individual instances.
- The Pushgateway never forgets the series pushed to it and keeps exposing them to Prometheus until they are deleted manually through its API, because the lifecycle of data in the Pushgateway differs from that of the Prometheus server.
The Pushgateway is most useful when you want to capture the outcome of a service-level batch job.
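A minimal sketch of that batch-job use case: write a metric in the exposition format and push it to the Pushgateway with curl. The job name nightly_backup, the metric name, and the gateway address are all assumptions for illustration; the push itself is commented out since it needs a running Pushgateway.

```shell
# Hypothetical batch job recording when it last succeeded.
cat > /tmp/backup_metric.txt <<'EOF'
# TYPE backup_last_success_timestamp_seconds gauge
backup_last_success_timestamp_seconds 1713180000
EOF

# With a Pushgateway running on its default port 9091, the push would be:
# curl --data-binary @/tmp/backup_metric.txt \
#   http://localhost:9091/metrics/job/nightly_backup

cat /tmp/backup_metric.txt
```

Prometheus then scrapes the Pushgateway itself, picking up the pushed series on its next scrape.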
Alertmanager is responsible for managing alerts sent by clients such as the Prometheus server. It deduplicates and groups the alerts and routes them to the correct receiver, such as email, PagerDuty, or Opsgenie. The Alertmanager is also responsible for switching notifications on and off. Using configuration settings, you can group messages of one type and tell the Alertmanager to send them at once, and you can mute notifications for certain events. There are some terms related to the Alertmanager that you should know when dealing with Prometheus:
- Grouping: Much like a smartphone collapsing all notifications from one app into a single notification, grouping combines alerts of the same type into one. If hundreds of alerts fire at once, the system may be overwhelmed; grouping reduces that load.
- Inhibition: Inhibition suppresses notifications for certain alerts when related alerts are already firing. For example, say an alert fires indicating that a cluster is unreachable. Alertmanager can then mute all the other alerts related to that cluster, preventing hundreds of redundant notifications.
- Silences: Silences, as the name suggests, mute alerts for a period of time. Matchers, including regex matchers, define a silence: when an incoming alert matches the silence's expressions and properties, no notification is sent.
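The three concepts above map directly onto sections of the Alertmanager configuration file. A minimal sketch follows; the receiver name, email address, and alert/label names are assumptions for illustration:

```yaml
route:
  receiver: team-email
  group_by: ['alertname', 'cluster']   # grouping: one notification per alert/cluster pair
  group_wait: 30s                      # wait to batch alerts arriving close together
receivers:
  - name: team-email
    email_configs:
      - to: 'oncall@example.com'
inhibit_rules:
  # inhibition: while a cluster's "unreachable" alert fires,
  # mute warning-level alerts that share the same cluster label
  - source_matchers: ['alertname="ClusterUnreachable"']
    target_matchers: ['severity="warning"']
    equal: ['cluster']
```

Silences, by contrast, are not configured in this file; they are created at runtime through the Alertmanager UI or API.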
A Prometheus target represents a resource from which Prometheus extracts metrics. In many cases, the metrics are exposed by the services themselves, and Prometheus collects them directly. Where services do not expose metrics, Prometheus requires exporters: programs that extract data from a service and transform it into a Prometheus-compatible format. Common target categories include:
- Hardware: Node/system
- HTTP: HAProxy, NGINX, Apache.
- APIs: Github, Docker Hub.
- Other monitoring systems: Cloudwatch.
Client libraries provide client-specific instrumentation and metrics collection. There are various client libraries for Prometheus, some official and some unofficial; for all of the major programming languages, client libraries are generally available. Client libraries handle details like thread safety, bookkeeping, and producing the Prometheus text exposition format in response to HTTP requests. Because metrics-based monitoring doesn't track individual events, the more metrics you have, the more client library storage you'll need.
Client libraries are not restricted to outputting metrics in the Prometheus text format. Prometheus is an open ecosystem, and the same APIs used to produce the text format can be used to produce metrics in other formats or to feed them into other instrumentation systems.
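To show what a client library does under the hood, here is a sketch of an instrumented endpoint built only on Python's standard library. In a real application you would use the official prometheus_client library instead; the metric name and help text here are made up, and the server start is commented out so the sketch stays self-contained.

```python
# Sketch: serving a counter in the Prometheus text exposition format
# without a client library (illustrative only).
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUEST_COUNT = 0  # a simple in-process counter

def render_metrics() -> str:
    """Render current counters in the Prometheus text exposition format."""
    return (
        "# HELP app_requests_total Total HTTP requests handled.\n"
        "# TYPE app_requests_total counter\n"
        f"app_requests_total {REQUEST_COUNT}\n"
    )

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        global REQUEST_COUNT
        REQUEST_COUNT += 1  # instrument the request
        body = render_metrics().encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # To actually serve scrapes on http://localhost:8000/, uncomment:
    # HTTPServer(("", 8000), MetricsHandler).serve_forever()
    print(render_metrics())
```

A real client library adds exactly the things this sketch omits: thread-safe counters, label handling, and registries of metrics.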
As mentioned above, in many cases metrics are exposed by the service itself, and Prometheus collects them directly. In other cases, Prometheus needs an exporter to obtain the metrics.
Database exporters, hardware exporters, issue trackers, storage, HTTP, APIs, logging, and miscellaneous exporters are some of the Prometheus exporter categories. Aerospike, ClickHouse, Couchbase, CouchDB, Druid, Elasticsearch, and others have database exporters. Hardware exporters include those for Big-IP, Collins, Dell hardware, and others. Aside from those, there are exporters for Ansible Tower, Caddy, Doorman, etcd, Kubernetes, Midonet-Kubernetes, Xandikos, and so on.
Exporters for Prometheus are mostly hosted on GitHub, but you can also create your own to instrument your code. While creating your library, keep in mind the common instrumentation guidelines provided by Prometheus.
So far, we have discussed configuring targets manually using static config files. This works fine for simple setups, but what if you have to do this at scale, especially when instances are added or removed every single minute?
This is where service discovery comes in. Service discovery tells Prometheus what to scrape by querying the platform that hosts your targets. Consul, Amazon EC2, and Kubernetes are among Prometheus's built-in service discovery sources.
For unsupported sources, you can use the file-based service discovery mechanism. This can be done with your configuration management system, such as Ansible or Chef, by writing out a file containing the list of targets you want to scrape.
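A minimal sketch of that file-based setup in prometheus.yml, using file_sd_configs (the job name and file path are assumptions):

```yaml
scrape_configs:
  - job_name: 'node'
    file_sd_configs:
      - files:
          - /etc/prometheus/targets/*.json   # files written by Ansible/Chef
```

Each matched JSON file holds a list of target groups, for example [{"targets": ["10.0.0.5:9100"], "labels": {"env": "prod"}}]. Prometheus watches these files and picks up changes without a restart.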
Prometheus provides many third-party integrations, including exporters. These integrations add extra facilities to the application and enable users to do more. Notable categories include file service discovery, Alertmanager webhooks, and remote endpoint storage. In this section, we discuss third-party integrations within each category:
File Service Discovery
File Service Discovery integrations include: Kuma, LightSail, NetBox, and Packet.
Remote Endpoint and Storage
These integrations help Prometheus read and write remotely. They let you write to and read from external services transparently and are mostly meant for long-term storage. Some of the integrations are:
- AWS Timestream
- Azure Data Explorer
Alertmanager webhook service
These integrations are beneficial for sending alerts, one of Alertmanager's primary tasks. Some integrations in this category are:
- AWS SNS
- JIRA Alert
- Service Now
- Rocket Chat
These third-party integrations add management functionality on top of your existing services:
- Prometheus Operator
An Overview of Grafana
Prometheus is used for collecting and aggregating app metrics from different platforms. However, Prometheus is not very good at presenting those metrics in a dashboard, and the more data you have, the more confusing its built-in UI becomes. Grafana was developed to solve this. It is an open-source web application for metrics visualization. Grafana also helps you create alerts for unexpected app behavior and take the right steps in response. Over 1,000 developers have made over 23,000 contributions to Grafana. On GitHub, the project has over 32,000 stars and 6,000 forks, and Grafana's developers are quite active, with over 2,000 open issues and 100 active pull requests.
Grafana is used for making dashboards more intuitive and user-friendly. It helps you turn unordered metrics into a super simple UI that is easy to read.
Grafana.com keeps a repository of shared dashboards that may be downloaded and utilized with Grafana standalone instances. To view dashboards for the "Prometheus" data source exclusively, use the Grafana.com "Filter" option.
Components of Grafana
Although there are many components of Grafana, here we will be discussing some of the top-level components of Grafana. They are:
- Distributor: The distributor accepts spans in multiple formats, including Jaeger, OpenTelemetry, and Zipkin. It routes spans to ingesters by hashing the traceID and using a consistent hash ring.
- Ingester: The ingester batches traces into blocks, creates bloom filters and indexes, and then flushes everything to the backend.
- Query Frontend: This component is responsible for sharding the search space for an incoming query.
- Query Editor: The query editor lets you write custom queries against each data source. For example, you can use it to write queries that expose the metrics collected in the Prometheus database.
How Grafana Works
Grafana connects to any data source such as Graphite, Prometheus, Influx DB, ElasticSearch, MySQL, PostgreSQL, etc.
Grafana is an open-source solution and also allows us to write plug-ins for integration with various data sources from scratch.
The tool supports the study, analysis, and monitoring of data over a period of time, known as time-series analytics.
It helps track user behavior, application behavior, error frequency in production or pre-production environments, the types of errors that appear, and their context. In addition to the core open-source solution, the Grafana team offers two other services for businesses: Grafana Cloud and Grafana Enterprise.
Competitors of Grafana
Grafana's most well-known competitor is Kibana, which is part of the Elasticsearch ecosystem. Kibana and Grafana both aim to make it simple to visualize and alert on the data they have access to, and that access is Kibana's main limitation: Grafana has no restriction on the number of data sources it may use, whereas Kibana is tied to Elasticsearch.
Kibana has superior search capabilities to Grafana, which makes sense given that it is the tool Elastic employs in its commercial product. Only commercial offerings have the time and resources to provide comprehensive search and event correlation.
Before installing Grafana, check the system requirements; supported operating systems include:
- RPM-based Linux (CentOS, Fedora, OpenSUSE, Red Hat)
Grafana also has some hardware requirements, although it does not use much memory or CPU: the minimum is 255 MB of memory and 1 CPU. Some features, like server-side rendering, alerting, and the data source proxy, require more resources. The databases supported by Grafana are MySQL, PostgreSQL, and SQLite. The browsers supported by Grafana are Chrome, Firefox, Safari, Microsoft Edge, and, for Grafana versions prior to 6, IE 11.
Now that we have discussed Grafana's requirements, let us move on to installation. The steps differ by operating system; in this article, we will walk through installing Grafana on Windows. First, go to Grafana's official website to download Grafana. After a successful download, double-click the installer to run it.
Every version of Grafana also ships as a zip file, which contains the standalone Windows binaries Grafana will use. Before unzipping, open the file's properties, unblock it, and click 'OK'. After a successful extraction, run grafana-server.exe from the command line, and Grafana will be available on local port 3000. To run Grafana as a Windows service, you can use NSSM.
Before starting a Prometheus installation make sure you have proper internet and administrative permission on your computer.
Step 1: Update the yum package index and download the latest Prometheus binaries for your platform (Windows or Linux) from the official website.
sudo yum update -y
Step 2: Download the tar file, using curl or directly from the website, and rename it prometheus-files.
Step 3: After this, create a Prometheus user, create the required directories (under /etc and /var), and make the prometheus user the owner of those directories.
sudo useradd --no-create-home --shell /bin/false prometheus
sudo mkdir <directory-path>
sudo chown prometheus:prometheus <directory-path>  # give the prometheus user access to the directory
Step 4: To set up the Prometheus configuration, all your settings should live in the /etc/prometheus/prometheus.yml file. Create this file using the following command:
sudo vi /etc/prometheus/prometheus.yml
This command opens the file in the vi editor for you.
Then add the following content to the file. You can tweak scrape_interval and the targets according to your needs.
global:
  scrape_interval: 20s   # Indicates how frequently to scrape targets
scrape_configs:
  - job_name: 'prometheus metrics'
    scrape_interval: 5s
    static_configs:
      - targets: ['localhost:9090']
Change the ownership of the file to the prometheus user.
sudo chown prometheus:prometheus <yml file path>
Step 5: Create a Prometheus service file at the following location
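The original location was omitted here; on systemd-based distributions, unit files conventionally live under /etc/systemd/system/ (e.g. prometheus.service). A typical unit, in which the binary and data paths are assumptions matching the directories created earlier, might look like:

```ini
[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/usr/local/bin/prometheus \
    --config.file=/etc/prometheus/prometheus.yml \
    --storage.tsdb.path=/var/lib/prometheus/

[Install]
WantedBy=multi-user.target
```

After saving the unit, reload systemd and start the service with sudo systemctl daemon-reload and sudo systemctl start prometheus.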
To check the Prometheus service status, run the following command:
sudo systemctl status prometheus
Now you can also access the Prometheus web UI locally on port 9090.
Before creating dashboards for Prometheus, you have to add Prometheus as a data source in Grafana's settings. Simply select Prometheus as the data source type and enter the same port number used in the Prometheus config file. Then go to the settings and select Prometheus when importing the dashboard.
Now let us see how to create a new dashboard and various metrics related to them.
Click the plus icon on the left side and select New Dashboard. A screen then appears with blank panels where you can add rows and widgets to show your metrics. All the queries you run here use the Prometheus query language (PromQL).
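A few starter queries you might paste into a panel; the first two use series Prometheus exposes about itself when it scrapes its own endpoint, as configured above:

```text
# 1 if the target's last scrape succeeded, 0 otherwise
up

# Rate at which the server is ingesting samples, over 5 minutes
rate(prometheus_tsdb_head_samples_appended_total[5m])

# How long each scrape is taking, per target
scrape_duration_seconds
```

Each query becomes a time series graph in the panel, refreshed at the dashboard's interval.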
Get Grafana metrics into Prometheus
Grafana provides a metrics endpoint that exposes its own Prometheus metrics, so you can see real-time metrics in a dashboard and analyze the data. Go to the data source's edit page and pick the Dashboards tab to import the bundled dashboards. There is a Grafana dashboard and a Prometheus dashboard there. Import them and begin examining all of the metrics!
For more information about getting Grafana metrics, refer to the documentation on Grafana's internal metrics.
You can also configure data sources using Grafana config files. In previous versions, you could only use the API for provisioning and configuring data sources, which required setting up HTTP credentials, and the services had to be running before dashboards could be created. But from Grafana v5.0, you can provision data sources using config files, which makes version control easy and natural. Below is an example of provisioning a data source:
apiVersion: 2
datasources:
  - name: Prometheus at ScoutAPM
    type: prometheus
    # Access mode - proxy (server in the UI) or direct (browser in the UI).
    access: proxy
    url: http://localhost:9090
    jsonData:
      httpMethod: POST
      exemplarTraceIdDestinations:
        # Link of webhook pushing data to Grafana.
        # datasourceUid value can be anything, but it should be unique
        # across all defined data source uids.
        - datasourceUid: my_scout_uid
          name: traceID
Prometheus is a great tool for scraping metrics from your websites and other sources. It helps you analyze multi-dimensional data using its own query language, PromQL. Grafana enhances dashboard representation beyond the Prometheus UI: it presents metrics in a systematic way and helps you analyze them. You can integrate Grafana and Prometheus easily using the steps mentioned above.
Grafana is a useful tool for not only Prometheus but for other platforms as well.
ScoutAPM is also an application monitoring tool with many modern features. You can use it to smoothly monitor your application, and you can get started with a free sign-up on scoutapm.com. It is absolutely free for 14 days, and no card details are required.