Recently I put together a post on using Prometheus to discover services within AWS, Azure and the Google Cloud Platform. Not long after publishing this post, I saw that service discovery for Digital Ocean is now available within Prometheus as well.

This feature is not yet in a published release (2.19.2 at the time of writing), so you will need to do one of the following: -

  • If you are using Docker, use the :master tag (e.g. docker pull prom/prometheus:master)
  • Build it from source

Because this feature is not in a general released version, the service discovery mechanism may change at a later date. If it does, I will update this post to reflect that.

Building from source

To build Prometheus from source, you will need a working Golang (v1.13 or above) environment, as well as NodeJS and Yarn installed. To prepare an Ubuntu 20.04 instance for this, do the following: -

# Install Golang
$ apt install golang

# Install NodeJS and NPM (Node Package Manager)
$ apt install nodejs npm

# Install Yarn
$ npm install -g yarn

# Set your GOPATH
$ mkdir ~/go
$ export GOPATH=~/go

After this, you can follow the instructions provided on the Prometheus GitHub README. These are: -

$ mkdir -p $GOPATH/src/github.com/prometheus
$ cd $GOPATH/src/github.com/prometheus
$ git clone https://github.com/prometheus/prometheus.git
$ cd prometheus
$ make build

This process generates all the web assets (using NodeJS and Yarn), as well as injecting them into the Golang build process. This makes the binary portable (i.e. the web assets are part of the binary, rather than in a static path). The output of the make build command is below: -

cd web/ui/react-app && yarn --frozen-lockfile
yarn install v1.22.4
[1/4] Resolving packages...
[2/4] Fetching packages...
info [email protected]: The platform "linux" is incompatible with this module.
info "[email protected]" is an optional dependency and failed compatibility check. Excluding it from installation.
info [email protected]: The platform "linux" is incompatible with this module.
info "[email protected]" is an optional dependency and failed compatibility check. Excluding it from installation.
[3/4] Linking dependencies...
warning " > [email protected]" has unmet peer dependency "[email protected]".
[4/4] Building fresh packages...
Done in 47.65s.
>> building React app
building React app
yarn run v1.22.4
$ react-scripts build
Creating an optimized production build...
Compiled successfully.

File sizes after gzip:

  260.61 KB  build/static/js/2.dfe05a07.chunk.js
  29.67 KB   build/static/js/main.3085a125.chunk.js
  23.27 KB   build/static/css/2.df42c974.chunk.css
  1.5 KB     build/static/css/main.0b010d50.chunk.css
  770 B      build/static/js/runtime-main.5db206b5.js

The project was built assuming it is hosted at ./.
You can control this with the homepage field in your package.json.

The build folder is ready to be deployed.

Find out more about deployment here:

  bit.ly/CRA-deploy

Done in 70.80s.
>> writing assets
# Un-setting GOOS and GOARCH here because the generated Go code is always the same,
# but the cached object code is incompatible between architectures and OSes (which
# breaks cross-building for different combinations on CI in the same container).
cd web/ui && GO111MODULE=on GOOS= GOARCH= go generate -x -v  -mod=vendor
doc.go
go run -mod=vendor assets_generate.go
writing assets_vfsdata.go
ui.go
curl -s -L https://github.com/prometheus/promu/releases/download/v0.5.0/promu-0.5.0.linux-amd64.tar.gz | tar -xvzf - -C /tmp/tmp.Zr9ksLsvhC
promu-0.5.0.linux-amd64/
promu-0.5.0.linux-amd64/promu
promu-0.5.0.linux-amd64/NOTICE
promu-0.5.0.linux-amd64/LICENSE
mkdir -p /home/$USER/go/bin
cp /tmp/tmp.Zr9ksLsvhC/promu-0.5.0.linux-amd64/promu /home/$USER/go/bin/promu
rm -r /tmp/tmp.Zr9ksLsvhC
>> building binaries
GO111MODULE=on /home/$USER/go/bin/promu build --prefix /home/$USER/go/src/github.com/prometheus/prometheus
 >   prometheus
 >   promtool
 >   tsdb

This will generate the prometheus binary, as well as promtool. You can use this prometheus binary in place of your existing one. For me, this would be /usr/local/bin/prometheus.

Do note that as this is an unreleased version, you may encounter bugs that are not in the official published versions.

Terraform

Now that we have a Prometheus binary that can discover services in DigitalOcean, we can define our Digital Ocean resources in Terraform. For information on how to install Terraform and the project structure, see here.

Digital Ocean

Digital Ocean offer a variety of services, including cloud instances (virtual machines), databases, object storage, load balancers and managed Kubernetes. While they are not the size of AWS, Azure or Google Cloud Platform, they still have a significant user base (myself being one of them!).

Cloud Instances within Digital Ocean are known as Droplets. The smallest Droplet comes with 1 vCPU and 1G of memory, with the largest at 32 vCPUs and 192G of memory.

You can sign up for a DigitalOcean account here. As a point of note, if you listen to podcasts in the Linux and Open Source community, you’ll probably hear offers to get $100 of credit for 60 days, so be sure to use one of these if you can.

Once you have signed up, you’ll be asked to create a project. This is a container for your resources, allowing you to group them together as you see fit: -

Digital Ocean Project

Create an API key

Terraform uses Digital Ocean API keys to authenticate and provision resources. You can generate a key in the API section of the Digital Ocean Cloud Console: -

Digital Ocean API Key

You can choose to either: -

  • Configure the API token as a variable in Terraform
  • Expose it as an environment variable

To configure it as a variable in Terraform, do something like the following (taken from the Terraform Digital Ocean Provider page): -

# Set the variable value in *.tfvars file
# or using -var="do_token=..." CLI option
variable "do_token" {
  type = string
  default = "$TOKEN_GOES_HERE"
}

# Configure the DigitalOcean Provider
provider "digitalocean" {
  token = var.do_token
}

Alternatively, Terraform will use the environment variables $DIGITALOCEAN_TOKEN or $DIGITALOCEAN_ACCESS_TOKEN. You can set this with export DIGITALOCEAN_TOKEN="###API-KEY###", or you can place the export line in your .bashrc or .zshrc so it is loaded when you open a terminal.
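For example (the token value here is just a placeholder, substitute your real key): -

```shell
# Set the token for the current shell session (placeholder value shown)
export DIGITALOCEAN_TOKEN="example-api-token"

# Confirm it is set before running Terraform
echo "$DIGITALOCEAN_TOKEN"
```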

Configure Terraform - Droplets

Now that we have an API key created and available, we can use Terraform with Digital Ocean.

In the directory where you define your infrastructure (I am using ~/terraform/basic-vms for this), create a providers.tf file that contains the following: -

# Digital Ocean Provider

provider "digitalocean" {
}

The above picks up the Digital Ocean API key from our environment variables ($DIGITALOCEAN_TOKEN in my case). No other details are required at this stage.

After this, run terraform init. This downloads the Digital Ocean Terraform provider binary: -

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "digitalocean" (terraform-providers/digitalocean) 1.20.0...
- Downloading plugin for provider "template" (hashicorp/template) 2.1.2...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.digitalocean: version = "~> 1.20"
* provider.template: version = "~> 2.1"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

We include the template provider because we use it for user-data (i.e. first-time boot configuration). This is covered in the AWS EC2s section of the previous post on Cloud service discovery.
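The Droplet definition in the next section references data.template_file.ubuntu. As a minimal sketch (the template path and file name here are assumptions, the full version is in the previous post), it looks something like: -

```hcl
# Minimal sketch of the user-data template data source (path assumed)
data "template_file" "ubuntu" {
  template = file("templates/ubuntu.tpl")
}
```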

Define the infrastructure - Droplets

You can now create the configuration files for your first Droplet. The below is from the file do.tf in the terraform/basic-vms directory: -

resource "digitalocean_droplet" "yetiops-prom-vm" {
  image              = "ubuntu-20-04-x64"
  name               = "yetiops-prom-vm"
  region             = "fra1"
  size               = "s-1vcpu-1gb"
  ssh_keys           = [digitalocean_ssh_key.yetiops-ssh-key.fingerprint]

  tags = [
    digitalocean_tag.prometheus.id,
    digitalocean_tag.node_exporter.id
  ]

  user_data          = data.template_file.ubuntu.template

}

resource "digitalocean_tag" "prometheus" {
  name = "prometheus"
}

resource "digitalocean_tag" "node_exporter" {
  name = "node_exporter"
}

resource "digitalocean_ssh_key" "yetiops-ssh-key" {
  name       = "SSH Key"
  public_key = file("~/.ssh/id_ed25519.pub")
}

resource "digitalocean_firewall" "yetiops-prom-vm" {
  name = "yetiops-prom-vm"

  droplet_ids = [digitalocean_droplet.yetiops-prom-vm.id]

  inbound_rule {
    protocol         = "tcp"
    port_range       = "22"
    source_addresses = ["$MY_PUBLIC_IP/32"]
  }

  inbound_rule {
    protocol         = "tcp"
    port_range       = "9100"
    source_addresses = ["$MY_PUBLIC_IP/32"]
  }

  inbound_rule {
    protocol         = "icmp"
    source_addresses = ["0.0.0.0/0", "::/0"]
  }

  outbound_rule {
    protocol              = "icmp"
    destination_addresses = ["0.0.0.0/0", "::/0"]
  }

  outbound_rule {
    protocol              = "tcp"
    port_range            = "1-65535"
    destination_addresses = ["0.0.0.0/0", "::/0"]
  }

  outbound_rule {
    protocol              = "udp"
    port_range            = "1-65535"
    destination_addresses = ["0.0.0.0/0", "::/0"]
  }
}

To summarize what we are doing here, we are: -

  • Creating a Digital Ocean Droplet, running Ubuntu 20.04, of size s-1vcpu-1gb (1 vCPU, 1G of memory)
  • Using the Ubuntu template file for user-data (which is a cloud-config file that installs the Prometheus Node Exporter and nothing more)
  • Creating two tags (prometheus and node_exporter) and attaching them to the Droplet
  • Adding a local SSH key to Digital Ocean so that we can SSH into the instance once it is provisioned
  • Creating a firewall that allows SSH and the Node Exporter port inbound from my public IP, any ICMP (i.e. Ping) traffic, and all outbound traffic

Unlike tags in AWS and Azure, and labels in Google Cloud Platform, Digital Ocean tags are not $KEY:$VALUE based (i.e. prometheus: true). This is something to be aware of when configuring Prometheus later.

Build the infrastructure - Droplets

We can now apply our configuration, and see if it builds a Digital Ocean Droplet: -

$ terraform apply
data.template_file.ubuntu: Refreshing state...
data.template_cloudinit_config.ubuntu: Refreshing state...

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # digitalocean_droplet.yetiops-prom-vm will be created
  + resource "digitalocean_droplet" "yetiops-prom-vm" {
      + backups              = false
      + created_at           = (known after apply)
      + disk                 = (known after apply)
      + id                   = (known after apply)
      + image                = "ubuntu-20-04-x64"
      + ipv4_address         = (known after apply)
      + ipv4_address_private = (known after apply)
      + ipv6                 = false
      + ipv6_address         = (known after apply)
      + ipv6_address_private = (known after apply)
      + locked               = (known after apply)
      + memory               = (known after apply)
      + monitoring           = false
      + name                 = "yetiops-prom-vm"
      + price_hourly         = (known after apply)
      + price_monthly        = (known after apply)
      + private_networking   = (known after apply)
      + region               = "fra1"
      + resize_disk          = true
      + size                 = "s-1vcpu-1gb"
      + ssh_keys             = (known after apply)
      + status               = (known after apply)
      + tags                 = (known after apply)
      + urn                  = (known after apply)
      + user_data            = "2169d8a3e100623d34bf1a7b2f6bd924a8997bfb"
      + vcpus                = (known after apply)
      + volume_ids           = (known after apply)
      + vpc_uuid             = (known after apply)
    }

  # digitalocean_firewall.yetiops-prom-vm will be created
  + resource "digitalocean_firewall" "yetiops-prom-vm" {
      + created_at      = (known after apply)
      + droplet_ids     = (known after apply)
      + id              = (known after apply)
      + name            = "yetiops-prom-vm"
      + pending_changes = (known after apply)
      + status          = (known after apply)

      + inbound_rule {
          + protocol                  = "icmp"
          + source_addresses          = [
              + "0.0.0.0/0",
              + "::/0",
            ]
          + source_droplet_ids        = []
          + source_load_balancer_uids = []
          + source_tags               = []
        }
      + inbound_rule {
          + port_range                = "22"
          + protocol                  = "tcp"
          + source_addresses          = [
              + "$MY_PUBLIC_IP/32",
            ]
          + source_droplet_ids        = []
          + source_load_balancer_uids = []
          + source_tags               = []
        }
      + inbound_rule {
          + port_range                = "9100"
          + protocol                  = "tcp"
          + source_addresses          = [
              + "$MY_PUBLIC_IP/32",
            ]
          + source_droplet_ids        = []
          + source_load_balancer_uids = []
          + source_tags               = []
        }

      + outbound_rule {
          + destination_addresses          = [
              + "0.0.0.0/0",
              + "::/0",
            ]
          + destination_droplet_ids        = []
          + destination_load_balancer_uids = []
          + destination_tags               = []
          + protocol                       = "icmp"
        }
      + outbound_rule {
          + destination_addresses          = [
              + "0.0.0.0/0",
              + "::/0",
            ]
          + destination_droplet_ids        = []
          + destination_load_balancer_uids = []
          + destination_tags               = []
          + port_range                     = "1-65535"
          + protocol                       = "tcp"
        }
      + outbound_rule {
          + destination_addresses          = [
              + "0.0.0.0/0",
              + "::/0",
            ]
          + destination_droplet_ids        = []
          + destination_load_balancer_uids = []
          + destination_tags               = []
          + port_range                     = "1-65535"
          + protocol                       = "udp"
        }
    }

  # digitalocean_ssh_key.yetiops-ssh-key will be created
  + resource "digitalocean_ssh_key" "yetiops-ssh-key" {
      + fingerprint = (known after apply)
      + id          = (known after apply)
      + name        = "SSH Key"
      + public_key  = "$SSH_PUBLIC_KEY_CONTENTS"
    }

  # digitalocean_tag.node_exporter will be created
  + resource "digitalocean_tag" "node_exporter" {
      + databases_count        = (known after apply)
      + droplets_count         = (known after apply)
      + id                     = (known after apply)
      + images_count           = (known after apply)
      + name                   = "node_exporter"
      + total_resource_count   = (known after apply)
      + volume_snapshots_count = (known after apply)
      + volumes_count          = (known after apply)
    }

  # digitalocean_tag.prometheus will be created
  + resource "digitalocean_tag" "prometheus" {
      + databases_count        = (known after apply)
      + droplets_count         = (known after apply)
      + id                     = (known after apply)
      + images_count           = (known after apply)
      + name                   = "prometheus"
      + total_resource_count   = (known after apply)
      + volume_snapshots_count = (known after apply)
      + volumes_count          = (known after apply)
    }

Plan: 5 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

digitalocean_tag.prometheus: Creating...
digitalocean_tag.node_exporter: Creating...
digitalocean_ssh_key.yetiops-ssh-key: Creating...
digitalocean_tag.prometheus: Creation complete after 1s [id=prometheus]
digitalocean_ssh_key.yetiops-ssh-key: Creation complete after 1s [id=27810953]
digitalocean_tag.node_exporter: Creation complete after 1s [id=node_exporter]
digitalocean_droplet.yetiops-prom-vm: Creating...
digitalocean_droplet.yetiops-prom-vm: Still creating... [10s elapsed]
digitalocean_droplet.yetiops-prom-vm: Still creating... [20s elapsed]
digitalocean_droplet.yetiops-prom-vm: Still creating... [30s elapsed]
digitalocean_droplet.yetiops-prom-vm: Creation complete after 35s [id=197944957]
digitalocean_firewall.yetiops-prom-vm: Creating...
digitalocean_firewall.yetiops-prom-vm: Creation complete after 0s [id=180b6bd2-ee0c-4649-bae5-9cb2d3db6473]

Apply complete! Resources: 5 added, 0 changed, 0 destroyed.

We can double-check that Terraform is now managing these resources with terraform state list: -

$ terraform state list
data.template_cloudinit_config.ubuntu
data.template_file.ubuntu
digitalocean_droplet.yetiops-prom-vm
digitalocean_firewall.yetiops-prom-vm
digitalocean_ssh_key.yetiops-ssh-key
digitalocean_tag.node_exporter
digitalocean_tag.prometheus

We can also check that the instance appears in the Digital Ocean Console: -

Digital Ocean Droplet

Now let's try SSH: -

$ ssh root@$DROPLET_PUBLIC_IP
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-29-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Mon Jun 29 06:48:28 UTC 2020

  System load:  0.13              Processes:             103
  Usage of /:   5.6% of 24.06GB   Users logged in:       0
  Memory usage: 19%               IPv4 address for eth0: $PUBLIC_IP 
  Swap usage:   0%                IPv4 address for eth0: 10.19.0.5

65 updates can be installed immediately.
29 of these updates are security updates.
To see these additional updates run: apt list --upgradable



The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

root@yetiops-prom-vm:~# ps aux | grep -i node
prometh+    2266  0.0  1.0 336924 10468 ?        Ssl  06:44   0:00 /usr/bin/prometheus-node-exporter
root       14312  0.0  0.0   8156   672 pts/0    S+   06:48   0:00 grep --color=auto -i node

Configure Terraform - Prometheus User

Digital Ocean have the option of creating applications with limited permissions, but it appears this is only for OAuth2-based applications. Standard access tokens do not get the same level of fine-grained permissions. With this being the case, we do not create a Prometheus user in Terraform. Instead, follow the steps above to create a separate API key for Prometheus.

Prometheus

Now that we have our Droplet in Digital Ocean, we can configure our Prometheus instance. I am using an Ubuntu 20.04 virtual machine in my lab for this.

Digital Ocean Service Discovery

To allow Prometheus to discover instances in Digital Ocean, use configuration like the below: -

global:
  scrape_interval:     15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
    - targets:
      - 'localhost:9090'
  - job_name: 'do-nodes'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
        port: 9100
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,node_exporter,.*
        action: keep

Replace $DIGITAL_OCEAN_PROMETHEUS_API_TOKEN with the token created above.

Notice that we have a single digitalocean_tags metadata field, rather than each tag having its own field. Here is the AWS EC2 relabelling configuration for comparison: -

    relabel_configs:
      - source_labels: [__meta_ec2_tag_prometheus]
        regex: true.*
        action: keep
      - source_labels: [__meta_ec2_tag_node_exporter]
        regex: true.*
        action: keep

Because of this, rather than looking for the value true, we are using a regular expression to filter the contents of the tags field itself (looking for prometheus and node_exporter). If you run another exporter, like the HAProxy Exporter, you could use something like the following: -

  - job_name: 'do-nodes-haproxy'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
        port: 9101
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,haproxy_exporter,.*
        action: keep

By default, the Digital Ocean Service Discovery targets the public IPv4 address of the Digital Ocean Droplet (unlike AWS, Azure and Google Cloud Platform, which use the private IPv4 address). Because of this, we do not need any relabelling to reach it from an external source. Digital Ocean do have VPCs, so if you decide to run your Prometheus instance within Digital Ocean, you will want to relabel the target address to use the private IPv4 address instead.
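If Prometheus does run inside Digital Ocean, a relabel rule along these lines (a sketch using the __meta_digitalocean_private_ipv4 label) would point the target at the private address instead: -

```yaml
    relabel_configs:
      - source_labels: [__meta_digitalocean_private_ipv4]
        target_label: __address__
        replacement: '$1:9100'
```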

We can now look at the labels that the Digital Ocean Service Discovery generates: -

Digital Ocean Droplet Service Discovery

Notice the tags field, which contains all of the tags we have assigned (and hence what we match on).
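As a quick sanity check, you can exercise the relabelling regexes from the configuration above against an example joined tags value (the exact value Prometheus generates is an assumption here) using grep: -

```shell
# Example of the comma-joined value (assumed) seen in __meta_digitalocean_tags
tags=",prometheus,node_exporter,"

# A target is kept only if every keep rule matches; emulate the two rules above
if echo "$tags" | grep -Eq '.*,prometheus,.*' && echo "$tags" | grep -Eq '.*,node_exporter,.*'; then
  echo "target kept"
fi
```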

Can we reach the node_exporter on the Droplet?

Digital Ocean Droplet Target

Yes we can, brilliant!

More efficient service discovery

Julien Pivotto (one of the contributors to Prometheus) made me aware that there is a more efficient way of using Prometheus Service Discovery with multiple targets.

The approach in this post looks something like the below: -

  - job_name: 'do-nodes'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
        port: 9100
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,node_exporter,.*
        action: keep
  - job_name: 'do-nodes-haproxy'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
        port: 9101
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,haproxy_exporter,.*
        action: keep

In the first job, we make the relevant API calls to the Digital Ocean API to retrieve the configured Droplets. In the second job, we make the same API calls again to retrieve the same Droplets. The more jobs you have, the more API calls Prometheus makes.

Instead, you can use something like: -

  - job_name: 'do-nodes'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,node_exporter,.*
        action: keep
      - source_labels: [__meta_digitalocean_public_ipv4]
        target_label: __address__
        replacement: '$1:9100'
  - job_name: 'do-nodes-haproxy'
    digitalocean_sd_configs:
      - bearer_token: '$DIGITAL_OCEAN_PROMETHEUS_API_TOKEN'
    relabel_configs:
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,prometheus,.*
        action: keep
      - source_labels: [__meta_digitalocean_tags]
        regex: .*,haproxy_exporter,.*
        action: keep
      - source_labels: [__meta_digitalocean_public_ipv4]
        target_label: __address__
        replacement: '$1:9101'

Because both jobs derive the __address__ label (i.e. the Prometheus target address) via relabelling, they can share the same Service Discovery API calls. All relabelling actions are performed locally, so they do not necessitate more API calls.
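The replacement behaves like a regular expression substitution on the source label's value; as a rough shell analogy (example IP assumed): -

```shell
# The source label value is captured by the default regex (.*),
# and $1 in the replacement refers to that capture group
public_ipv4="203.0.113.10"
echo "$public_ipv4" | sed -E 's/^(.*)$/\1:9100/'
# -> 203.0.113.10:9100
```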

In the near future I will put a post together to compare the two approaches, so we can see the difference it makes.

Grafana

With the above, we can use any Node Exporter dashboard in Grafana to view all of the discovered instances. The Node Exporter Full dashboard is always a good starting point: -

Digital Ocean Grafana

If we add more Droplets, they will appear in this dashboard too.

Summary

Prometheus is constantly being improved, with new features that make it more compelling with every release.

The inclusion of the Digital Ocean service discovery mechanism means that those who either do not require the complexity of AWS, Azure or Google Cloud Platform, or just prefer Digital Ocean as a provider, no longer need to run additional mechanisms (like Consul) to automatically discover their Droplets and services.