1. Before we start…

<aside> 💡 This guide helps you build a solid monitoring environment by running only the essential services, without going through Kubernetes. Other detailed settings are worth researching on your own. This way of building a monitoring environment is simple, but it is knowledge that a competent node operator should have.

</aside>

  1. Install Prometheus.

Add a Prometheus account.

cd
sudo useradd --no-create-home --shell /usr/sbin/nologin prometheus

Install the Prometheus packages.

sudo apt-get install -y prometheus prometheus-node-exporter prometheus-pushgateway prometheus-alertmanager

Verify that the processes are running.

ps -ef | grep prometheus
prometh+     825       1  0 Jun11 ?        00:02:41 /usr/bin/prometheus-alertmanager
prometh+     831       1  1 Jun11 ?        00:48:26 /usr/bin/prometheus-node-exporter
prometh+     840       1  0 Jun11 ?        00:00:00 /usr/bin/prometheus-pushgateway
prometh+    3028       1  0 Jun11 ?        00:14:54 /usr/bin/prometheus
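
The check above can also be scripted instead of read by eye. The following is a minimal sketch, not part of the original guide: the function name `missing_processes` is hypothetical, and it assumes the binaries live under `/usr/bin` as in the `ps` output shown above.

```shell
# Print every expected binary that does not appear in the process list on stdin.
missing_processes() {
  listing=$(cat)
  for bin in "$@"; do
    # Require a space or end of line after the name, so that
    # "prometheus" is not satisfied by "prometheus-alertmanager".
    printf '%s\n' "$listing" | grep -qE "/usr/bin/${bin}( |\$)" || echo "$bin"
  done
}
```

To use it, run `ps -eo args | missing_processes prometheus prometheus-node-exporter prometheus-pushgateway prometheus-alertmanager`. No output means all four processes were found.
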
  1. Install Grafana.

    Add Grafana Repository.

    sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
    

    <aside> ⚠️ If this is your first install, you will probably get a NO_PUBKEY error like the one below, because you do not have the repository's public key yet.

    </aside>

    Screen_Shot_2022-06-11_at_5.19.27_AM (1).png

    Copy the key ID and open keyserver.ubuntu.com.

    In the message above, for example, the key ID is 8C8C34C524098CB6, but you should search for the key block with the 0x prefix: *0x8C8C34C524098CB6*. Then click the pub key block link.

    스크린샷_2022-05-19_18.12.11 (2).png

    Copy the entire key block, as in the image below, and save it to a file (e.g. ./my-key.txt).

    스크린샷_2022-05-19_18.04.11 (1).png

    Now add the saved key and run the add-repository command again.

    sudo apt-key add my-key.txt
    sudo add-apt-repository "deb https://packages.grafana.com/oss/deb stable main"
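
    The key ID lookup described above can be scripted so that the 0x prefix is never forgotten. This is a minimal sketch, not part of the original guide: the function name `keyserver_query` is hypothetical, and the sample message uses the usual apt wording, which is an assumption because the guide only shows a screenshot.

    ```shell
    # Extract the key ID that follows NO_PUBKEY and print it with the 0x prefix
    # used for the keyserver search.
    keyserver_query() {
      printf '%s\n' "$1" | sed -nE 's/.*NO_PUBKEY ([0-9A-Fa-f]+).*/0x\1/p'
    }

    # Sample message (assumed wording); paste your own error line instead.
    keyserver_query "The following signatures couldn't be verified because the public key is not available: NO_PUBKEY 8C8C34C524098CB6"
    # prints 0x8C8C34C524098CB6
    ```

    The printed value is the string to paste into the search box on keyserver.ubuntu.com.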
    

    <aside> ⚠️ If the repository was added successfully, the output looks similar to the following.

    </aside>

    Hit:1 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic InRelease
    Hit:2 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic-updates InRelease
    Hit:3 http://ap-northeast-1.ec2.archive.ubuntu.com/ubuntu bionic-backports InRelease
    Hit:4 http://security.ubuntu.com/ubuntu bionic-security InRelease
    Get:5 https://packages.grafana.com/oss/deb stable InRelease [12.1 kB]
    Get:6 https://packages.grafana.com/oss/deb stable/main amd64 Packages [13.8 kB]
    Fetched 13.8 kB in 1s (9670 B/s)
    Reading package lists... Done
    

    Download the GPG key for installing Grafana.

    curl https://packages.grafana.com/gpg.key | sudo apt-key add -
    

    Install the Grafana package.

    sudo apt-get install grafana
    

    <aside> ⚠️ If the installation succeeded, the last part of the output shows three triggers being processed, as below.

    </aside>

    Screen_Shot_2022-06-11_at_5.28.56_AM (1).png

    Start Grafana.

    sudo systemctl daemon-reload
    sudo systemctl start grafana-server
    sudo systemctl enable grafana-server
    

    <aside> ⚠️ If the service was started and enabled successfully, the output looks similar to the following.

    </aside>

    Synchronizing state of grafana-server.service with SysV service script with /lib/systemd/systemd-sysv-install.
    Executing: /lib/systemd/systemd-sysv-install enable grafana-server
    Created symlink /etc/systemd/system/multi-user.target.wants/grafana-server.service → /usr/lib/systemd/system/grafana-server.service.
    

    Verify that the process is running.

    ps -ef | grep grafana
    
    grafana      819       1  0 Jun11 ?        00:03:19 /usr/sbin/grafana-server --config=/etc/grafana/grafana.ini --pidfile=/run/grafana/grafana-server.pid --packaging=deb cfg:default.paths.logs=/var/log/grafana cfg:default.paths.data=/var/lib/grafana cfg:default.paths.plugins=/var/lib/grafana/plugins cfg:default.paths.provisioning=/etc/grafana/provisioning
    
  2. Modify Prometheus configuration.

    Open prometheus.yml at the path below.

    sudo nano /etc/prometheus/prometheus.yml
    

    Copy the job below and add it to the end of the scrape_configs section in prometheus.yml.

      - job_name: aptos
        static_configs:
          - targets: ['localhost:9101']
    

    Change every target IP address in the file to localhost.

    <aside> ⚠️ If the file was modified correctly, it looks similar to the following.

    </aside>

    # Sample config for Prometheus.
    
    global:
      scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
      evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
      # scrape_timeout is set to the global default (10s).
    
      # Attach these labels to any time series or alerts when communicating with
      # external systems (federation, remote storage, Alertmanager).
      external_labels:
          monitor: 'example'
    
    # Alertmanager configuration
    alerting:
      alertmanagers:
      - static_configs:
        - targets: ['localhost:9093']
    
    # Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
    rule_files:
      # - "first_rules.yml"
      # - "second_rules.yml"
    
    # A scrape configuration containing exactly one endpoint to scrape:
    # Here it's Prometheus itself.
    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'prometheus'
    
        # Override the global default and scrape targets from this job every 5 seconds.
        scrape_interval: 5s
        scrape_timeout: 5s
    
        # metrics_path defaults to '/metrics'
        # scheme defaults to 'http'.
    
        static_configs:
          - targets: ['localhost:9090']
    
      - job_name: exporter
        # If prometheus-node-exporter is installed, grab stats about the local
        # machine by default.
        static_configs:
          - targets: ['localhost:9100']
    
      - job_name: aptos
        static_configs:
          - targets: ['localhost:9101']
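
    If the file still lists IP addresses, the change to localhost can be sketched as a sed filter. This is a minimal sketch, not part of the original guide: the function name `localize_targets` is hypothetical, and it assumes each target is written as an IPv4 address plus port on a line containing `targets:`.

    ```shell
    # Read a Prometheus config on stdin and write it to stdout with every
    # IPv4 address on "targets:" lines replaced by localhost (ports are kept).
    localize_targets() {
      sed -E '/targets:/ s/[0-9]{1,3}(\.[0-9]{1,3}){3}(:[0-9]+)/localhost\2/g'
    }

    echo "      - targets: ['192.168.0.10:9100', '10.1.2.3:9101']" | localize_targets
    # prints       - targets: ['localhost:9100', 'localhost:9101']
    ```

    One way to use it is `localize_targets < /etc/prometheus/prometheus.yml > /tmp/prometheus.yml`, then review the copy before replacing the original. If promtool is installed (it normally ships with Prometheus), `promtool check config /tmp/prometheus.yml` can validate the result.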
    

    After changing the settings, reload the Prometheus service.

    sudo systemctl reload prometheus.service
    
  3. Configure firewall and port forwarding setting.

    For remote monitoring, three ports must be reachable from outside: 3000 (Grafana, essential), 9090 (Prometheus, optional), and 9100 (node exporter, optional). You may need to reboot the server after changing this setting.
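
    The guide does not say which firewall is in use. Assuming ufw, the default firewall front end on Ubuntu, the rules can be sketched as below. The function name `ufw_allow_commands` is hypothetical, and it only prints the commands so that they can be reviewed before anything changes.

    ```shell
    # Print (do not run) one "ufw allow" command per TCP port given as an argument.
    ufw_allow_commands() {
      for port in "$@"; do
        echo "sudo ufw allow ${port}/tcp"
      done
    }

    ufw_allow_commands 3000 9090 9100
    # prints:
    # sudo ufw allow 3000/tcp
    # sudo ufw allow 9090/tcp
    # sudo ufw allow 9100/tcp
    ```

    Once the list looks right, the commands can be run one by one. Leave out 9090 and 9100 if only Grafana needs to be reachable.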

  4. Access a monitoring web page.

    Open “your server IP address:9090/targets” in a browser, and the Prometheus web page appears as below.

    Screen Shot 2022-06-14 at 11.16.43 PM.png

    If everything is set up correctly, all three targets show metric data, as below.

    Screen Shot 2022-06-14 at 11.22.43 PM.png

    Open “your server IP address:3000” in a browser, and the Grafana login page appears as below.

    Login ID: admin, Password: admin (you can change the password after logging in)

    스크린샷_2022-05-20_11.44.28.png

    In Grafana, select Configuration > Data sources > Prometheus. Enter localhost with the Prometheus port number (9090), i.e. http://localhost:9090, in the HTTP URL field as below, and click Save & test.

    Untitled

    If everything works, you will see the message "Data source is working" in the test result.

    Screen Shot 2022-06-14 at 11.49.50 PM.png
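
    The same reachability checks can be run from a terminal. This is a minimal sketch, not part of the original guide: the function names `health_urls` and `check_stack` are hypothetical, and it assumes the default ports used above together with the `/-/healthy` endpoint of Prometheus and the `/api/health` endpoint of Grafana.

    ```shell
    # Print the health-check URLs for a host, using the ports from this guide.
    health_urls() {
      host="$1"
      echo "http://${host}:9090/-/healthy"   # Prometheus health endpoint
      echo "http://${host}:3000/api/health"  # Grafana health endpoint
    }

    # Probe each URL; curl -f turns HTTP error responses into a non-zero exit status.
    check_stack() {
      health_urls "$1" | while read -r url; do
        if curl -fsS --max-time 5 -o /dev/null "$url"; then
          echo "OK   $url"
        else
          echo "FAIL $url"
        fi
      done
    }
    ```

    Run `check_stack localhost` on the server itself, or pass the server IP address from another machine to confirm that the firewall settings let the requests through.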

  5. Import completed dashboards.

    The Aptos Core team has developed several detailed dashboards. You can import them by opening the Aptos Core GitHub link below and copying the raw contents of the JSON files.

    https://github.com/aptos-labs/aptos-core/tree/main/dashboards

    Screen Shot 2022-06-15 at 12.02.27 AM.png

    In Grafana, select Create > Import. Paste the JSON content you copied into the Import via panel json field, and then click Load > Import.

    Screen Shot 2022-06-15 at 12.08.43 AM.png

    If you imported several dashboards, you can select and manage them in the Dashboards > Browse menu.
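
    Instead of copying the raw contents by hand, the JSON can be fetched from the command line. This is a minimal sketch, not part of the original guide: the function name `raw_url` is hypothetical, and `example.json` is a placeholder rather than a real file name, so substitute a file that actually exists in the dashboards directory.

    ```shell
    # Convert a GitHub "blob" page URL into the matching raw-content URL.
    raw_url() {
      printf '%s\n' "$1" |
        sed -E 's#^https://github\.com/([^/]+)/([^/]+)/blob/#https://raw.githubusercontent.com/\1/\2/#'
    }

    # example.json is a placeholder file name.
    raw_url "https://github.com/aptos-labs/aptos-core/blob/main/dashboards/example.json"
    # prints https://raw.githubusercontent.com/aptos-labs/aptos-core/main/dashboards/example.json
    ```

    The resulting URL can be passed to `curl -fsSL -o dashboard.json`, and the downloaded file pasted or uploaded on the Import page.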

  6. Create your own dashboard.

    Select Create > Dashboard > Add a new panel, or click the icon, then browse and select the metric data.

    Screen Shot 2022-06-15 at 12.36.08 AM.png

    Screen_Shot_2022-06-10_at_1.51.55_AM.png

    Select an appropriate panel type for the selected metric data.

    Screen_Shot_2022-06-10_at_1.52.57_AM.png

    Edit the title and change the color scheme.

    Screen_Shot_2022-06-10_at_1.54.28_AM.png

    Screen_Shot_2022-06-10_at_1.55.54_AM.png

    You can configure many other settings. When you are done, save the dashboard and adjust your panel layout.

    Screen_Shot_2022-06-10_at_1.57.57_AM.png

    I made a monitoring dashboard for a validator and fullnode pair for AIT2. If you want one, you can build it too.

    Screen Shot 2022-07-03 at 1.34.09 AM.png