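Monitoring Servers with Prometheus

As the number of virtual machines on my ESXi host keeps growing, checking the status of each server one by one has become tedious, so it is time to deploy a monitoring system that records the state of all of them.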

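Background

A few years ago I used Ganglia to monitor servers. However, Ganglia has been essentially unmaintained since 2018, and when I tried installing ganglia-web it would not run properly under PHP 8, so I had to look for another open-source monitoring system.

With the rise of cloud native, Prometheus, an open-source monitoring and alerting system, has been drawing more and more attention. It was the second project to join the CNCF (Cloud Native Computing Foundation), after Kubernetes.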

Prometheus

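As a monitoring system, Prometheus stores the metrics it collects as time series data. Put simply, every value is recorded together with a timestamp, and each series is identified by a metric name plus a set of key-value pairs (labels).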
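To make the data model concrete, here is a minimal sketch in Python. The `Sample` class, its field names, and the example metric and labels are illustrative assumptions, not Prometheus's internal representation; the only point is that a series is identified by its name and label set, while the value and timestamp vary over time.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One point in a time series (hypothetical model for illustration)."""
    name: str                 # metric name, e.g. http_requests_total
    value: float              # the measured value
    timestamp_ms: int         # when the value was recorded
    labels: dict = field(default_factory=dict)  # key-value pairs (dimensions)

    def series_id(self) -> str:
        # Sort labels so the same label set always yields the same identifier.
        pairs = ",".join(f'{k}="{v}"' for k, v in sorted(self.labels.items()))
        return f"{self.name}{{{pairs}}}" if pairs else self.name


sample = Sample(
    name="http_requests_total",
    value=1027.0,
    timestamp_ms=1663400000000,
    labels={"method": "post", "handler": "/api"},
)
print(sample.series_id(), sample.value, sample.timestamp_ms)
```

Two samples with the same name but different label values belong to different series, which is what makes the model multi-dimensional.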

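Features

A multi-dimensional time series data model: a metric name plus associated key-value pairs;
PromQL, a query language;
No dependency on distributed storage; a single server node is enough;
Metrics are collected by actively pulling them over HTTP (pull mode);
Pushing data is supported through an intermediary gateway (push mode);
Monitoring targets can be configured statically or found through service discovery;
Graphing and dashboards are supported.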

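Components

The Prometheus server, which scrapes and stores the data;
Client libraries for instrumenting application code;
A push gateway (for push mode);
Exporters for various purposes;
Alertmanager, which handles alerts;
Other supporting tools.

Architecture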

Installation

First, download the prebuilt binaries from the official releases (v2.37.1 is used here):

$ wget https://github.com/prometheus/prometheus/releases/download/v2.37.1/prometheus-2.37.1.linux-amd64.tar.gz
$ tar -zxvf prometheus-2.37.1.linux-amd64.tar.gz

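The archive contains a sample configuration file, prometheus.yml, so you can enter the extracted directory and start the server right away:

$ cd prometheus-2.37.1.linux-amd64
$ ./prometheus --config.file=prometheus.yml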

We can also let supervisor manage the process. Its configuration file:

$ sudo vim /etc/supervisor/conf.d/prometheus.conf

[program:prometheus]
directory=/home/ubuntu/vhost/prometheus
command=/home/ubuntu/vhost/prometheus/prometheus --config.file=prometheus.yml
autostart=true
autorestart=true
user=ubuntu
redirect_stderr=true
stdout_logfile=/home/ubuntu/log/prometheus.log

Once started, Prometheus listens on port 9090 by default. Open http://127.0.0.1:9090/metrics in a browser to see the metrics it exposes about itself.
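The /metrics page is plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between. As a rough sketch of how that format can be read, the following Python parser handles the common case. It is a simplification written for illustration (it does not unescape label values, for example), and the sample text and its values are made up rather than copied from a real server.

```python
import re

# name, optional {labels}, value, optional timestamp
LINE_RE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+(-?\d+))?$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value, _timestamp = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


SAMPLE_TEXT = """\
# HELP prometheus_http_requests_total Counter of HTTP requests.
# TYPE prometheus_http_requests_total counter
prometheus_http_requests_total{code="200",handler="/metrics"} 42
go_goroutines 31
"""

for name, labels, value in parse_metrics(SAMPLE_TEXT):
    print(name, labels, value)
```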

You can also open http://127.0.0.1:9090/graph to query the data; the query box even offers autocompletion.
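The same queries can be sent to the HTTP API at /api/v1/query instead of the web page. The sketch below only builds the request URL and decodes an instant-query response; the helper names are my own, and the sample response is a hand-written example of the documented JSON shape, not output captured from a server.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_vector(body):
    """Turn an instant-vector response into a list of (labels, value) pairs."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"query failed: {payload.get('error')}")
    data = payload["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"unexpected result type: {data['resultType']}")
    # Each result carries its labels and a [timestamp, "value"] pair.
    return [(item["metric"], float(item["value"][1])) for item in data["result"]]


# Hand-written example of the response shape (values are illustrative).
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "up", "instance": "localhost:9100", "job": "node"},
                "value": [1663400000.0, "1"],
            }
        ],
    },
})

print(build_query_url("http://127.0.0.1:9090", "up"))
print(extract_vector(SAMPLE_RESPONSE))
```

To run it against a live server, the URL from `build_query_url` could be fetched with `urllib.request.urlopen` and the body passed to `extract_vector`.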

Exporter

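As the brief introduction above shows, Prometheus is only responsible for collecting and storing data; the metrics themselves have to be exposed by whatever is being monitored.

For common monitoring needs, though, both the Prometheus project and third parties provide exporters that work essentially out of the box.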
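To show what "exposing metrics" amounts to, here is a minimal exporter sketch that uses only the Python standard library. In practice you would normally reach for an official client library instead; the metric name `demo_uptime_seconds`, the helper names, and port 8000 are all hypothetical choices for this illustration.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

START_TIME = time.time()


def render_metrics():
    """Render one gauge in the Prometheus text exposition format."""
    uptime = time.time() - START_TIME
    return (
        "# HELP demo_uptime_seconds Seconds since this exporter started.\n"
        "# TYPE demo_uptime_seconds gauge\n"
        f"demo_uptime_seconds {uptime:.3f}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Prometheus scrapes /metrics by default; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port=8000):
    """Create the HTTP server; call .serve_forever() on the result to run it."""
    return HTTPServer(("127.0.0.1", port), MetricsHandler)


print(render_metrics(), end="")
# To actually serve the endpoint: make_server(8000).serve_forever()
```

With the server running, a job whose target is "localhost:8000" would let Prometheus pull this metric like any other.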

Node Exporter

To monitor the state of a Linux server, we can use the official Node Exporter directly.

For Windows, there is a separate Windows Exporter.

Download the binary from the official releases and start it directly:

$ wget https://github.com/prometheus/node_exporter/releases/download/v1.4.0/node_exporter-1.4.0.linux-amd64.tar.gz
# Extract
$ tar -zxvf node_exporter-1.4.0.linux-amd64.tar.gz
# Run
$ cd node_exporter-1.4.0.linux-amd64
$ ./node_exporter

Again, use supervisor to manage it:

$ sudo vim /etc/supervisor/conf.d/node_exporter.conf

[program:node_exporter]
directory=/home/ubuntu/vhost/prometheus
command=/home/ubuntu/vhost/prometheus/node_exporter
autostart=true
autorestart=true
user=ubuntu
redirect_stderr=true
stdout_logfile=/home/ubuntu/log/node_exporter.log

Node Exporter listens on port 9100 by default; check http://127.0.0.1:9100/metrics to confirm it is working.

Next, modify the Prometheus configuration so that it scrapes Node Exporter, by adding a new job:

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets:
        - "localhost:9100"

Finally, restart prometheus so that the new configuration takes effect:

$ supervisorctl restart prometheus
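With many virtual machines, listing every target by hand under static_configs gets tedious. One alternative is Prometheus's file-based service discovery, which reads target groups from a JSON file and picks up changes without a restart. The sketch below generates such a file; it assumes the "node" job uses `file_sd_configs` with a `files` entry pointing at the generated path, and the host names, label, and file name are hypothetical.

```python
import json


def build_file_sd(hosts, port=9100, labels=None):
    """Build one target group in the file_sd JSON format."""
    group = {"targets": [f"{host}:{port}" for host in hosts]}
    if labels:
        group["labels"] = dict(labels)
    return [group]


def write_file_sd(path, hosts, port=9100, labels=None):
    """Write the target groups to a file that Prometheus can watch."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_file_sd(hosts, port, labels), fh, indent=2)
        fh.write("\n")


# Hypothetical VM host names; replace with your own inventory.
groups = build_file_sd(["vm01", "vm02", "vm03"], labels={"env": "lab"})
print(json.dumps(groups, indent=2))
# write_file_sd("node_targets.json", ["vm01", "vm02", "vm03"], labels={"env": "lab"})
```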

HAProxy Exporter

For HAProxy there is also an official exporter; however, since HAProxy 2.0.0, HAProxy itself ships a built-in Prometheus exporter module:

$ haproxy -vv
HAProxy version 2.4.14-1ubuntu1 2022/02/28 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2026.
Known bugs: http://www.haproxy.org/bugs/bugs-2.4.14.html
Running on: Linux 5.15.0-37-generic #39-Ubuntu SMP Wed Jun 1 19:16:45 UTC 2022 x86_64
......
Built with Lua version : Lua 5.3.6
Built with the Prometheus exporter as a service
Built with network namespace support.
Built with libslz for stateless compression.

So we can enable the Prometheus exporter directly in the HAProxy configuration:

$ sudo vim /etc/haproxy/haproxy.cfg

frontend stats
        bind 0.0.0.0:8404
        option http-use-htx
        http-request use-service prometheus-exporter if { path /metrics }
        stats enable
        stats uri /stats
        stats refresh 10s

Open http://127.0.0.1:8404/stats to see HAProxy's runtime status.

http://127.0.0.1:8404/metrics, in turn, is the address Prometheus scrapes. Add the job to prometheus.yml:

scrape_configs:
  - job_name: "haproxy"
    static_configs:
      - targets: ["localhost:8404"]

Nginx VTS

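To monitor Nginx traffic, you can use nginx-module-vts. Because it is a third-party module, you have to compile it yourself; see my post on compiling Nginx dynamic modules.

Once nginx-module-vts is compiled and loaded (and assuming its status page is served at /status), the data is available at /status/format/prometheus. Add the job to prometheus.yml: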

scrape_configs:
  - job_name: "nginx"
    metrics_path: "/status/format/prometheus"
    static_configs:
      - targets: ["localhost"]
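The three jobs above differ only in target and path. As a small illustration of how those settings combine into the address that gets scraped, the function below joins them the way the configuration implies. It is my own simplification, not Prometheus's code; the defaults mirror the documented ones (scheme "http", path "/metrics").

```python
def scrape_url(target, metrics_path="/metrics", scheme="http"):
    """Combine scheme, target and metrics_path into the scraped URL."""
    return f"{scheme}://{target}{metrics_path}"


print(scrape_url("localhost:9100"))                          # node
print(scrape_url("localhost:8404"))                          # haproxy
print(scrape_url("localhost", "/status/format/prometheus"))  # nginx
```

This is why the nginx job needs an explicit metrics_path while the other two can rely on the default.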

Grafana

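In development or when debugging, Prometheus's built-in graph page is enough to inspect the data; for production, the official recommendation is Grafana. For installation and configuration, see the official documentation.

Grafana also offers many ready-made dashboards that can be imported directly. For Node Exporter, for example:

ID: 11074

ID: 1860

For HAProxy, ID: 12693

For Nginx VTS, ID: 9785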


Writing takes effort; if you repost this, please credit ChenJiehua's "Monitoring Servers with Prometheus" (Prometheus监控服务器).

2022-09-17 2022-09-27 nginx, linux, grafana, haproxy, prometheus
