Deploying Prometheus stack
We'll start by cloning the vfarcic/docker-flow-monitor repository from https://github.com/vfarcic/docker-flow-monitor. It contains all the scripts and Docker stacks we'll use throughout this chapter.
git clone \
https://github.com/vfarcic/docker-flow-monitor.git
cd docker-flow-monitor
Before we create a Prometheus service, we need a cluster. It will consist of three nodes created with Docker Machine.
Feel free to skip the commands that follow if you already have a working Swarm cluster.
chmod +x scripts/dm-swarm.sh

./scripts/dm-swarm.sh

eval $(docker-machine env swarm-1)
The dm-swarm.sh script created the nodes and joined them into a Swarm cluster.
Now we can create the first Prometheus service. We'll start small and slowly move toward a more robust solution.
We'll deploy the stack defined in stacks/prometheus.yml. It is as follows:
version: "3"

services:
  prometheus:
    image: prom/prometheus
    ports:
      - 9090:9090
As you can see, it is as simple as it can get. It specifies the image and the port that should be opened.
Let's deploy the stack.
docker stack deploy \
    -c stacks/prometheus.yml \
    monitor
Please wait a few moments until the image is pulled and deployed. You can monitor the status by executing the docker stack ps monitor command.
Let's confirm that the Prometheus service is indeed up and running.
open "http://$(docker-machine ip swarm-1):9090"
You should see the Prometheus graph screen.
Let's take a look at the configuration.
open "http://$(docker-machine ip swarm-1):9090/config"
You should see the default config, which does not define much more than intervals and internal scraping. In its current state, Prometheus is not very useful, so we'll have to spice it up a bit.
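For reference, the default configuration shipped with the image looks approximately like the snippet that follows. The exact values and layout may differ between Prometheus versions, so treat it as a sketch rather than a verbatim copy.

```yaml
# Approximate default Prometheus configuration (values vary by version)
global:
  scrape_interval: 15s     # how often targets are scraped
  evaluation_interval: 15s # how often rules are evaluated

scrape_configs:
  # Prometheus scrapes its own metrics endpoint
  - job_name: prometheus
    static_configs:
      - targets:
          - localhost:9090
```

The only target it knows about is itself, which explains why the out-of-the-box experience is so bare.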
We should start fine-tuning Prometheus. There are quite a few ways we can do that.
We can create a new Docker image that extends the one we used and adds our own configuration file. That solution has the distinct advantage of being immutable and, hence, very reliable. Since a Docker image cannot be changed, we can guarantee that the configuration is exactly as we want it to be, no matter where we deploy it. If the service fails, Swarm will reschedule it and, since the configuration is baked into the image, it'll be preserved. The problem with that approach is that it is not suitable for a microservices architecture. If Prometheus has to be reconfigured with every new service (or at least those that expose metrics), we would need to build the image quite often and tie that build to the CD processes executed for the services we're developing. This approach is suitable only for a relatively static cluster and monolithic applications. Discarded!
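For illustration, such an image could be built from a Dockerfile similar to the one below. The config file name is hypothetical; the destination path matches the location the official image expects.

```dockerfile
# Hypothetical image that bakes a custom configuration
# into the official Prometheus image
FROM prom/prometheus

# Overwrite the default configuration with our own
COPY prometheus.yml /etc/prometheus/prometheus.yml
```

Every change to the configuration would require rebuilding and redeploying this image, which is precisely why we discarded the approach.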
What would be the alternative approach?
We can enter a running Prometheus container, modify its configuration, and reload it. While this allows a higher level of dynamism, it is not fault-tolerant. If Prometheus fails, Swarm will reschedule it, and all the changes we made will be lost. Besides fault tolerance, modifying a config in a running container poses additional problems when running it as a service inside a cluster. We need to find out the node it is running on, SSH into it, figure out the ID of the container, and only then can we exec into it, modify the config, and send a reload request. While those steps are not overly complicated and can be scripted, they would add unnecessary operational complexity. Discarded!
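The steps described above would look approximately like the commands that follow. The node name is illustrative, and the reload mechanism (SIGHUP) may differ between Prometheus versions.

```shell
# Find out which node runs the service (e.g., swarm-2)
docker service ps monitor_prometheus

# SSH into that node
docker-machine ssh swarm-2

# Figure out the ID of the container and exec into it
CONTAINER_ID=$(docker ps -q -f name=monitor_prometheus)
docker exec -it $CONTAINER_ID sh

# Modify the config, then signal Prometheus to reload it
vi /etc/prometheus/prometheus.yml
kill -HUP 1
```

Even scripted, every one of those steps would have to be repeated after each rescheduling, since the container's filesystem starts fresh.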
Among other reasons, we discarded the previous solution because it is not fault-tolerant.
We could mount a network volume to the service. That would solve persistence, but it would still leave the problem created by the dynamic nature of a cluster. We would still, potentially, need to change the configuration and reload Prometheus every time a new service is deployed or updated.
From the operational perspective, this solution is simpler than the previous one. We do not need to find out the node the container is running on, SSH into it, figure out the ID of the container, exec into it, and modify the config. Instead, we can alter the file on the network drive and send a reload request to Prometheus. While a network drive simplifies the process, it does not make it as dynamic and independent from the services as it should be. We would still need to make sure that the deployment pipeline of each service has the steps required to reconfigure Prometheus. By doing that, we would break one of our objectives: that our services contain all the information about themselves. Instead, we'd need to adapt the pipeline of each service to specify the targets, alerts, and other information we might need before reconfiguring Prometheus. We'll discard this solution as well.
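A stack using a network volume could, hypothetically, look like the one below. The volume name and the driver are illustrative; in practice, the driver would be whatever networked storage is available in the cluster.

```yaml
version: "3"

services:
  prometheus:
    image: prom/prometheus
    ports:
      - 9090:9090
    volumes:
      # Hypothetical network volume holding the configuration
      - prom_conf:/etc/prometheus

volumes:
  prom_conf:
    driver: cloudstor # illustrative; any networked volume driver works
```

The configuration would survive rescheduling, but someone (or something) would still have to edit the file and trigger a reload whenever a service changes.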
What other options do we have? If we're looking for an out-of-the-box solution that uses the official Prometheus image, all our options are exhausted. But we are engineers. We are used to extending other people's solutions and adapting them to suit our needs. Let's not limit our options, and try to design a solution that would suit us well.