<aside> 💡 This guide helps you build a high-quality monitoring environment by running only the essential services, without going through the Kubernetes platform. Other detailed settings are worth studying on your own. The method is simple and easy, but it is knowledge that any competent node operator should have.
</aside>
Add Prometheus Account.
cd
sudo useradd --no-create-home --shell /usr/sbin/nologin prometheus
Install Prometheus package.
sudo apt-get install -y prometheus prometheus-node-exporter prometheus-pushgateway prometheus-alertmanager
Verify the processes are running.
ps -ef | grep prometheus
prometh+ 825 1 0 Jun11 ? 00:02:41 /usr/bin/prometheus-alertmanager
prometh+ 831 1 1 Jun11 ? 00:48:26 /usr/bin/prometheus-node-exporter
prometh+ 840 1 0 Jun11 ? 00:00:00 /usr/bin/prometheus-pushgateway
prometh+ 3028 1 0 Jun11 ? 00:14:54 /usr/bin/prometheus
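Reading the `ps` listing by eye works, but a small helper can report which of the four packages installed above has no running process. The following is a minimal sketch; the function name `missing_prometheus_procs` is hypothetical, and it only inspects text piped into it.

```shell
# Report which of the four expected Prometheus processes are absent.
# Reads `ps` output on stdin and prints one missing binary name per line.
missing_prometheus_procs() {
    ps_output=$(cat)
    for bin in prometheus prometheus-node-exporter prometheus-pushgateway prometheus-alertmanager; do
        # Require a space or end of line after the binary name, so that
        # "prometheus" does not match "prometheus-node-exporter".
        printf '%s\n' "$ps_output" | grep -Eq "/${bin}( |\$)" || printf '%s\n' "$bin"
    done
}

# Usage on the server (prints nothing when all four are running):
#   ps -ef | missing_prometheus_procs
```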
Install Grafana.
Add Grafana Repository.
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
<aside> ⚠️ If this is your first installation, you will probably get a NO_PUBKEY error, because the repository's public key is not on your system yet.
</aside>
Copy the key ID from the error message and open keyserver.ubuntu.com.
The key ID in the error message is, for example, 8C8C34C524098CB6, but you must search for the key block with the 0x prefix: *0x8C8C34C524098CB6*. Then click the pub key block link.
Copy the entire key block and save it to a file (e.g. ./my-key.txt).
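The step from the key ID in the error to the `0x`-prefixed search term can be automated. This is a sketch under the assumption that the apt output contains the text `NO_PUBKEY <hex id>`; the function name `pubkey_search_terms` is hypothetical.

```shell
# Print the keyserver search term (key ID with a 0x prefix) for every
# NO_PUBKEY entry found in the apt output read from stdin.
pubkey_search_terms() {
    grep -oE 'NO_PUBKEY [0-9A-Fa-f]+' | awk '{ print "0x" $2 }' | sort -u
}

# Usage on the server:
#   sudo apt-get update 2>&1 | pubkey_search_terms
```

Each printed value can be pasted directly into the search box on keyserver.ubuntu.com.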
Now add the saved key and run the add-repository command once again.
sudo apt-key add my-key.txt
sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
<aside> ⚠️ If the repository was added successfully, the screen output is similar to the following.
</aside>
Hit:1 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic InRelease
Hit:2 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic-updates InRelease
Hit:3 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic-backports InRelease
Hit:4 http://security.ubuntu.com/ubuntu bionic-security InRelease
Get:5 https://packages.grafana.com/oss/deb stable InRelease [12.1 kB]
Get:6 https://packages.grafana.com/oss/deb stable/main amd64 Packages [13.8 kB]
Fetched 13.8 kB in 1s (9670 B/s)
Reading package lists... Done
Download the GPG key for the Grafana repository.
curl https://packages.grafana.com/gpg.key | sudo apt-key add -
Install Grafana package.
sudo apt-get install grafana
<aside> ⚠️ If the installation succeeded, the last part of the screen output shows that “three processes are triggered”.
</aside>
Start Grafana.
sudo systemctl daemon-reload
sudo systemctl start grafana-server
sudo systemctl enable grafana-server
<aside> ⚠️ If the service was started and enabled successfully, the screen output is similar to the following.
</aside>
Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable grafana-server
Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
Verify the process is operational.
ps -ef | grep grafana
grafana 819 1 0 Jun11 ? 00:03:19 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/run/grafana/grafana-server.pid --packaging=deb cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning
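A running process does not yet mean Grafana answers requests, so it can help to poll its HTTP health endpoint for a short while after starting the service. The sketch below is a generic retry helper; the name `wait_until` is hypothetical, and the usage line assumes Grafana listens on its default port 3000.

```shell
# Retry a command once per second until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <command> [args...]
wait_until() {
    attempts=$1
    shift
    while [ "$attempts" -gt 0 ]; do
        # Stop as soon as the probed command exits with status 0.
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        attempts=$((attempts - 1))
        if [ "$attempts" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage on the server:
#   wait_until 30 curl -fsS http://localhost:3000/api/health && echo "Grafana is up"
```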
Modify Prometheus configuration.
Open prometheus.yml in the path below.
sudo nano /etc/prometheus/prometheus.yml
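Copy the contents below and add them to the end of the scrape_configs section of the prometheus.yml file. Keep the indentation exactly as shown, because YAML depends on it.
  - job_name: aptos
    static_configs:
      - targets: ['localhost:9101']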
Every target IP address in the file should be set to localhost.
<aside> ⚠️ If the file was edited correctly, it looks similar to the following.
</aside>
# Sample config for Prometheus.
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).
  # Attach these labels to any time series or alerts when communicating with
  # external systems (federation, remote storage, Alertmanager).
  external_labels:
    monitor: 'example'
# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['localhost:9093']
# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"
# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s
    scrape_timeout: 5s
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: exporter
    # If prometheus-node-exporter is installed, grab stats about the local
    # machine by default.
    static_configs:
      - targets: ['localhost:9100']
  - job_name: aptos
    static_configs:
      - targets: ['localhost:9101']
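Editing the file by hand makes it easy to paste the aptos job twice or with the wrong indentation. As an alternative, the append can be scripted so that it is safe to repeat. This is a minimal sketch that assumes scrape_configs is the last top-level section of the file, as in the sample config; the function name `add_aptos_job` is hypothetical.

```shell
# Append the aptos scrape job to a Prometheus config file, unless a job
# with that name already exists. Assumes scrape_configs is the last
# top-level section, so appended lines become part of it.
add_aptos_job() {
    config=$1
    if grep -Eq "^[[:space:]]*-[[:space:]]*job_name:[[:space:]]*'?aptos'?[[:space:]]*\$" "$config"; then
        return 0
    fi
    # Make sure the file ends with a newline before appending.
    if [ -n "$(tail -c 1 "$config")" ]; then
        echo >> "$config"
    fi
    cat >> "$config" <<'EOF'
  - job_name: aptos
    static_configs:
      - targets: ['localhost:9101']
EOF
}

# Usage on the server (work on a copy, then install it):
#   cp /etc/prometheus/prometheus.yml /tmp/prometheus.yml
#   add_aptos_job /tmp/prometheus.yml
#   promtool check config /tmp/prometheus.yml
#   sudo cp /tmp/prometheus.yml /etc/prometheus/prometheus.yml
```

Running `promtool check config` before installing the file catches indentation mistakes, provided the `promtool` binary was installed together with Prometheus.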
After changing the settings, reload the Prometheus service.
sudo systemctl reload prometheus.service
Configure firewall and port forwarding settings.
For remote monitoring, port 3000 (Grafana) must be accessible externally. Ports 9100 (Exporter) and 9090 (Prometheus) are optional and only need to be opened if you want to reach them from outside. You may need to reboot the server after modifying these settings.
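How the ports are opened depends on your firewall. As an illustration only, the sketch below assumes the host uses `ufw` and prints the commands for review instead of running them; the function name `print_ufw_rules` is hypothetical. A cloud provider's own firewall or a router's port forwarding has to be configured separately.

```shell
# Print (do not run) the ufw commands that would open the given TCP ports.
print_ufw_rules() {
    for port in "$@"; do
        printf 'sudo ufw allow %s/tcp\n' "$port"
    done
}

# Grafana is essential; add 9090 and 9100 only if you need them remotely.
print_ufw_rules 3000
```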
Access a monitoring web page.
Open “your server IP address:9090/targets” in a browser, and the Prometheus web page appears.
If everything is set up correctly, all three targets show metric data.
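The same check can be done from the command line through the Prometheus HTTP API, which lists the active targets and their health at `/api/v1/targets`. The following sketch counts the targets reported as up by matching text in the JSON response; the function name `count_up_targets` is hypothetical, and a proper JSON parser would be more robust.

```shell
# Count the targets whose health is "up" in a /api/v1/targets response
# read from stdin.
count_up_targets() {
    grep -oE '"health"[[:space:]]*:[[:space:]]*"up"' | wc -l | tr -d ' '
}

# Usage on the server (3 is expected with the three jobs in this guide):
#   curl -s http://localhost:9090/api/v1/targets | count_up_targets
```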
Open “your server IP address:3000” in a browser, and the Grafana login page appears.
Login ID: admin, Password: admin (you can change the password after logging in)
In the Grafana web page, select the Configuration > Data sources > Prometheus menu. Enter localhost with the Prometheus port number (http://localhost:9090) in the HTTP URL field and click the Save & test button.
If everything works, the test result shows the message "Data source is working".
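Instead of clicking through the web page, the data source can also be declared in a Grafana provisioning file, which Grafana reads at startup. This is a sketch, not part of the original procedure: the function name `write_prometheus_datasource` and the file name `prometheus.yaml` are hypothetical, and the target directory is assumed to be the `datasources` folder under the provisioning path shown in the process listing (/etc/grafana/provisioning).

```shell
# Write a Grafana data source provisioning file that points at the local
# Prometheus server. The directory is passed in so the result can be
# reviewed before it is installed.
write_prometheus_datasource() {
    dir=$1
    mkdir -p "$dir"
    cat > "$dir/prometheus.yaml" <<'EOF'
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9090
    isDefault: true
EOF
}

# Usage on the server:
#   write_prometheus_datasource /tmp/datasources
#   sudo cp /tmp/datasources/prometheus.yaml /etc/grafana/provisioning/datasources/
#   sudo systemctl restart grafana-server
```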
Import completed dashboards.
The Aptos Core team has developed detailed, well-designed dashboards. You can import them by opening the Aptos Core GitHub link below and copying the raw contents of the JSON files.
https://github.com/aptos-labs/aptos-core/tree/main/dashboards
In the Grafana web page, select the Create > Import menu. Paste the JSON content you copied into the Import via panel json field, then click Load > Import.
If you imported several dashboards, you can select and manage them in the Dashboards > Browse menu.
Create your own dashboard.
Select the Create > Dashboard > Add a new panel menu or click the icon, then browse and select metric data.
Select an appropriate visualization type for the selected metric data.
Edit the title and change the color scheme.
You can configure many other settings. When you are done, save the dashboard and adjust your panel layout.
I made a validator and fullnode pair monitoring dashboard for AIT2. If you want one, you can make it too.