
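Advanced label management
1. Label categories
Labels group and classify data, which makes it possible to filter and select series by label.
Common label management scenarios:
- 1. Dropping metrics that are not needed;
- 2. Removing sensitive or unwanted labels from metrics;
- 3. Adding, editing, or rewriting label values or label formats on metrics;
Label categories:
- Default labels:
Labels built into Prometheus itself, usually written in the form "__LABEL__".
As shown in the figure above, typical examples are:
- "__metrics_path__"
- "__address__"
- "__scheme__"
- "__scrape_interval__"
- "__scrape_timeout__"
- "__name__"
- "instance"
- "job"
- Application labels:
Labels supplied by the application or discovery mechanism itself, especially when monitoring a specific service; the format is generally "__LABEL".
As shown in the figure below, taking the consul service as an example, typical ones are:
- "__meta_consul_address"
- "__meta_consul_service"
- "__meta_consul_dc"
- ...
- Custom labels:
Labels defined by the user; they can be set when the targets are defined.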
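The categories above differ in what survives relabeling: label names that start with a double underscore are available while relabeling runs and are removed from the target afterwards, while ordinary labels are stored with every series. A minimal Python sketch of that rule, assuming a plain dict stands in for a target's label set (the helper name `stored_labels` and the `dc1` value are hypothetical):

```python
def stored_labels(labels):
    """Return the labels that remain on a target after relabeling.

    Names starting with "__" are only visible during relabeling
    and are dropped afterwards.
    """
    return {name: value for name, value in labels.items()
            if not name.startswith("__")}


target = {
    "__address__": "10.0.0.41:9100",
    "__scheme__": "http",
    "__metrics_path__": "/metrics",
    "__meta_consul_dc": "dc1",  # hypothetical discovery value
    "job": "node",
    "instance": "10.0.0.41:9100",
    "office": "https://www.violet.com",
}
print(stored_labels(target))
```

This is why a "__meta_*" value has to be copied to an ordinary label name during relabeling if it should be queryable later.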
Modifying target labels with relabel_configs
1. Attaching custom labels to targets
1.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet-linux-node-exporter-labels"
    static_configs:
      - targets: ["10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
        labels:
          auther: violet
          office: https://www.violet.com
...
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
1.2 Hot-reload the configuration
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
1.3 Query the data to verify
node_cpu_seconds_total{office="https://www.violet.com"}
2. Adding a label with target_label in relabel_configs
2.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet-linux-node-exporter-relabel_configs-target_label"
    static_configs:
      - targets: ["10.0.0.31:9100","10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
    relabel_configs:
      # Source label whose value is read
      - source_labels:
          - job
        # Write the matched value to this new label
        target_label: linux96_jobs
...
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
2.2 Verify
node_cpu_seconds_total{linux96_jobs="violet-linux-node-exporter-relabel_configs-target_label"}
3. Replacing labels with the replace action in relabel_configs
3.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet-linux-node-exporter-relabel_configs-regex-separator-replacement-action"
    static_configs:
      - targets: ["10.0.0.31:9100","10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
    relabel_configs:
      # Labels whose values are joined and then matched against regex.
      - source_labels:
          - __scheme__
          - __address__
          - __metrics_path__
        # Regular expression matched against the joined source label values.
        # It defines two capture groups, which replacement references as "${1}" and "${2}".
        regex: "(http|https)(.*)"
        # Separator used to join the source_labels values into one string; defaults to ";" when omitted.
        # Suppose the source data is:
        #   __address__="10.0.0.41:9100"
        #   __metrics_path__="/metrics"
        #   __scheme__="http"
        # The joined string is then: "http10.0.0.41:9100/metrics"
        separator: ""
        # Label that receives the result; the source labels themselves are left untouched.
        # A new label named "yinzhengjie_prometheus_ep" is added, holding the value produced by replacement.
        target_label: "yinzhengjie_prometheus_ep"
        # Value written to target_label; it may reference the capture groups. Defaults to "$1" when omitted.
        replacement: "${1}://${2}"
        # Action applied to labels or metrics; common actions include replace|keep|drop|labelmap|labeldrop. Defaults to replace.
        # Reference:
        # https://prometheus.io/docs/prometheus/2.53/configuration/configuration/#relabel_config
        action: replace
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
3.2 Verify
node_cpu_seconds_total{job="violet-linux-node-exporter-relabel_configs-regex-separator-replacement-action"}
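The replace rule in step 3.1 joins the source label values, matches the joined string against the regex, and writes the expanded replacement to the target label. The following Python sketch models those steps under simplifying assumptions: labels are a plain dict, the regex is matched against the whole joined string, and `relabel_replace` is a hypothetical helper, not part of Prometheus.

```python
import re


def relabel_replace(labels, source_labels, regex, target_label,
                    replacement="$1", separator=";"):
    """Simplified model of "action: replace" for one target's labels."""
    # Join the source label values in the order they are listed.
    joined = separator.join(labels.get(name, "") for name in source_labels)
    # The regex has to match the whole joined string.
    match = re.fullmatch(regex, joined)
    if match is None:
        return dict(labels)  # no match: labels stay unchanged
    # Translate ${1} / $1 references into Python's \g<1> syntax.
    template = re.sub(r"\$\{(\d+)\}|\$(\d+)",
                      lambda m: r"\g<%s>" % (m.group(1) or m.group(2)),
                      replacement)
    result = dict(labels)
    result[target_label] = match.expand(template)
    return result


labels = {
    "__scheme__": "http",
    "__address__": "10.0.0.41:9100",
    "__metrics_path__": "/metrics",
}
relabeled = relabel_replace(
    labels,
    source_labels=["__scheme__", "__address__", "__metrics_path__"],
    regex="(http|https)(.*)",
    target_label="yinzhengjie_prometheus_ep",
    replacement="${1}://${2}",
    separator="",
)
print(relabeled["yinzhengjie_prometheus_ep"])
```

The sketch also makes one detail easy to try: with the alternation written as (http|https), Python's `re` takes the shorter http branch first for an https target and leaves the "s" in the second group. The same is expected, though not verified here, for Prometheus's regex engine, so a pattern such as (https?) may be safer if https targets are added.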
4. Mapping labels to new names with the labelmap action in relabel_configs
4.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
  - job_name: "violet-linux-node-exporter-relabel_configs-labelmap"
    static_configs:
      - targets: ["10.0.0.31:9100","10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
    relabel_configs:
      - regex: "(job|app)"
        replacement: "${1}_yinzhengjie_labelmap_kubernetes"
        # labelmap generates new labels, typically from part of a matched label name; the original labels remain.
        # regex is matched against every label name, and the value of each matching label is copied to the label name given by replacement.
        action: labelmap
...
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
4.2 Verify
node_cpu_seconds_total{job="violet-linux-node-exporter-relabel_configs-labelmap"}
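Unlike replace, labelmap works on label names: every name that matches the regex yields a new label, named by the replacement, that carries the same value. A minimal Python sketch of that behaviour, assuming a dict of labels and a hypothetical helper `relabel_labelmap`:

```python
import re


def relabel_labelmap(labels, regex, replacement):
    """Simplified model of "action: labelmap"."""
    result = dict(labels)  # the original labels are kept
    for name, value in labels.items():
        match = re.fullmatch(regex, name)
        if match is None:
            continue
        # Translate ${1} / $1 references into Python's \g<1> syntax.
        template = re.sub(r"\$\{(\d+)\}|\$(\d+)",
                          lambda m: r"\g<%s>" % (m.group(1) or m.group(2)),
                          replacement)
        result[match.expand(template)] = value
    return result


before = {
    "job": "violet-linux-node-exporter-relabel_configs-labelmap",
    "instance": "10.0.0.41:9100",
}
after = relabel_labelmap(before, "(job|app)",
                         "${1}_yinzhengjie_labelmap_kubernetes")
print(after)
```

Because no "app" label exists on these targets, only "job" produces a mapped copy.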
5. Deleting labels with the labeldrop action in relabel_configs
5.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet-linux-node-exporter-relabel_configs-labelmap-labeldrop"
    static_configs:
      - targets: ["10.0.0.31:9100","10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
    relabel_configs:
      - regex: "(job|app)"
        replacement: "${1}_yinzhengjie_labelmap_kubernetes"
        action: labelmap
      - regex: "(job|app)"
        # Delete every label whose name matches regex
        action: labeldrop
...
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
5.2 Verify
node_cpu_seconds_total{job_yinzhengjie_labelmap_kubernetes="violet-linux-node-exporter-relabel_configs-labelmap-labeldrop"}
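The labeldrop step removes every label whose name matches the regex, which is why the verification query has to select on the mapped label instead of "job". A small Python sketch of that step, assuming the labelmap rule has already added the mapped label (`relabel_labeldrop` is a hypothetical helper):

```python
import re


def relabel_labeldrop(labels, regex):
    """Simplified model of "action: labeldrop": remove matching label names."""
    return {name: value for name, value in labels.items()
            if re.fullmatch(regex, name) is None}


job = "violet-linux-node-exporter-relabel_configs-labelmap-labeldrop"
after_labelmap = {
    "job": job,
    "job_yinzhengjie_labelmap_kubernetes": job,
    "instance": "10.0.0.41:9100",
}
print(relabel_labeldrop(after_labelmap, "(job|app)"))
```

The mapped label survives because the regex must match the whole label name, not just a prefix of it.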
6. Modifying metric labels with metric_relabel_configs
6.1 Test case
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet-linux-node-exporter-metric_relabel_configs-drop"
    static_configs:
      - targets: ["10.0.0.31:9100","10.0.0.41:9100","10.0.0.42:9100","10.0.0.43:9100"]
    metric_relabel_configs:
      # Drop every scraped sample whose metric name matches node_cpu_.*
      - source_labels:
          - __name__
        regex: "node_cpu_.*"
        action: drop
        #target_label: "xixi"
        #action: uppercase
      # Remove these labels from the samples that remain
      - regex: "(id|pretty_name|version_codename)"
        action: labeldrop
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# systemctl stop prometheus-server.service
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# rm -rf data/
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./prometheus
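6.2 Verify
For this job, the node_cpu_* series should be gone and node_os_info should no longer carry the id, pretty_name, or version_codename labels.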
node_cpu_seconds_total{job="violet-linux-node-exporter-metric_relabel_configs-drop"}
node_os_info{name="Ubuntu"}
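metric_relabel_configs runs on the scraped samples rather than on the targets, so it can filter by metric name through the "__name__" label. The Python sketch below models the two rules from step 6.1 on a list of label dicts; the sample values and the helper name `metric_relabel` are hypothetical.

```python
import re


def metric_relabel(samples, drop_name_regex, labeldrop_regex):
    """Apply a drop rule on __name__, then a labeldrop rule, to scraped samples."""
    kept = []
    for labels in samples:
        # action: drop removes the whole sample when the metric name matches.
        if re.fullmatch(drop_name_regex, labels["__name__"]):
            continue
        # action: labeldrop removes matching label names from what is left.
        kept.append({name: value for name, value in labels.items()
                     if re.fullmatch(labeldrop_regex, name) is None})
    return kept


scraped = [
    {"__name__": "node_cpu_seconds_total", "cpu": "0", "mode": "idle"},
    {"__name__": "node_os_info", "name": "Ubuntu", "id": "ubuntu",
     "pretty_name": "Ubuntu 22.04", "version_codename": "jammy"},
]
print(metric_relabel(scraped, "node_cpu_.*",
                     "(id|pretty_name|version_codename)"))
```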
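Deploying the pushgateway component
- 1. What is pushgateway
Pushgateway accepts metrics pushed by scripts and short-lived jobs and exposes them for Prometheus to scrape; put simply, it is a convenient way to build custom monitoring.
- 2. Download pushgateway
wget https://github.com/prometheus/pushgateway/releases/download/v1.11.0/pushgateway-1.11.0.linux-amd64.tar.gz
- 3. Unpack the package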
[root@node-exporter41 ~]# tar xf pushgateway-1.11.0.linux-amd64.tar.gz -C /usr/local/bin/ pushgateway-1.11.0.linux-amd64/pushgateway --strip-components=1
[root@node-exporter41 ~]#
[root@node-exporter41 ~]# ll /usr/local/bin/pushgateway
-rwxr-xr-x 1 1001 1002 20656129 Jan 9 22:36 /usr/local/bin/pushgateway*
[root@node-exporter41 ~]#
- 4. Run pushgateway
[root@node-exporter41 ~]# pushgateway --web.telemetry-path="/metrics" --web.listen-address=:9091 --persistence.file=/oldboyedu/data/pushgateway.data
- 5. Open the pushgateway WebUI
http://10.0.0.41:9091/#
Simulating a live-stream online viewer count
- 1. Push test data to pushgateway with curl
[root@elk93 dockerfile]# echo "student_online 35" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.93
[root@elk93 dockerfile]# echo "student_online $RANDOM" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.93
[root@elk93 dockerfile]#
- 2. Check the pushed data on pushgateway (in the WebUI or with curl)
[root@elk93 dockerfile]# curl -s http://10.0.0.41:9091/metrics | grep ^student_online
student_online{instance="10.0.0.93",job="violet_student"} 35
[root@elk93 dockerfile]#
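The curl commands above send a POST whose body is one "name value" line per metric, with the job and instance encoded in the URL path. The same request can be sketched in Python with only the standard library; the helper names `build_push_request` and `push` are hypothetical, and the example only builds the request, so it can be inspected without a running pushgateway.

```python
from urllib.request import Request, urlopen


def build_push_request(gateway, job, instance, metrics):
    """Build the POST request that pushes metrics for one job/instance group."""
    url = f"{gateway}/metrics/job/{job}/instance/{instance}"
    # One "name value" line per metric; the body must end with a newline.
    body = "".join(f"{name} {value}\n" for name, value in metrics.items())
    return Request(url, data=body.encode("utf-8"), method="POST")


def push(request, timeout=5):
    """Send the request; only call this when the pushgateway is reachable."""
    with urlopen(request, timeout=timeout) as response:
        return response.status


request = build_push_request("http://10.0.0.41:9091", "violet_student",
                             "10.0.0.93", {"student_online": 35})
print(request.get_method(), request.full_url)
print(request.data.decode())
```

Calling `push(request)` would then play the role of the curl invocation.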
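- 3. Edit the prometheus configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "prometheus-violet-pushgatway"
    # Decides what happens when labels coming from pushgateway conflict with labels attached by the prometheus server.
    # The default, false, keeps the server-side labels and renames the scraped ones with an "exported_" prefix.
    # Here it is set explicitly to true, so the pushed job and instance labels are kept as they are.
    honor_labels: true
    static_configs:
      - targets:
          - "10.0.0.41:9091"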
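The effect of honor_labels is easiest to see on a concrete series. The Python sketch below is a simplified model of the conflict handling, assuming dicts for the scraped labels and for the labels the server attaches to the target; `merge_labels` is a hypothetical helper.

```python
def merge_labels(scraped, target, honor_labels):
    """Combine labels scraped from pushgateway with the target's own labels."""
    result = dict(scraped)
    for name, value in target.items():
        if name in scraped:
            if honor_labels:
                continue  # keep the scraped (pushed) value
            # Otherwise keep the server's value and rename the scraped one.
            result["exported_" + name] = scraped[name]
        result[name] = value
    return result


scraped = {"__name__": "student_online",
           "job": "violet_student", "instance": "10.0.0.93"}
target = {"job": "prometheus-violet-pushgatway", "instance": "10.0.0.41:9091"}

print(merge_labels(scraped, target, honor_labels=False))
print(merge_labels(scraped, target, honor_labels=True))
```

With honor_labels off, a query on job="violet_student" finds nothing and the pushed values appear under exported_job and exported_instance instead.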
- 4. Reload the configuration
curl -X POST http://10.0.0.31:9090/-/reload
- 5. Verify that the configuration took effect
http://10.0.0.31:9090/config
http://10.0.0.31:9090/targets?search=
Search for the metric: "student_online"
- 6. Graph it in grafana
echo "student_online 35" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.71
echo "student_online 45" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.72
echo "student_online 55" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.73
echo "student_online 65" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.74
echo "student_online 75" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.75
echo "student_online 135" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.71
echo "student_online 5" | curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.71
Reporting the online viewer count periodically
[root@elk93 ~]# chmod +x /tmp/zhibo.sh
[root@elk93 ~]#
[root@elk93 ~]# cat /tmp/zhibo.sh
#!/bin/bash
# Filename: zhibo.sh
/usr/bin/echo "student_online $RANDOM" | /usr/bin/curl --data-binary @- http://10.0.0.41:9091/metrics/job/violet_student/instance/10.0.0.93
[root@elk93 ~]#
[root@elk93 ~]# echo '* * * * * /tmp/zhibo.sh' > /var/spool/cron/crontabs/root
[root@elk93 ~]#
[root@elk93 ~]# crontab -l
* * * * * /tmp/zhibo.sh
[root@elk93 ~]#
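Known issue:
This setup did not appear to push to pushgateway on schedule and needs further investigation. One likely cause is that the crontab file was written directly instead of being installed with the crontab command (for example "crontab -e"), in which case cron may not pick it up or may reject its permissions.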
Monitoring the 12 TCP states with Prometheus
- 1. Script that counts connections per TCP state (the list in the script names 11 states)
[root@elk93 ~]# cat /usr/local/bin/tcp_status2.sh
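#!/bin/bash
# Push the number of TCP connections in each state to pushgateway.
pushgateway_url="http://10.0.0.41:9091/metrics/job/tcp_status"
# State names to count, compared against the first column of "ss -tan"
state="SYN-SENT SYN-RECV FIN-WAIT-1 FIN-WAIT-2 TIME-WAIT CLOSE CLOSE-WAIT LAST-ACK LISTEN CLOSING ESTAB"
for i in $state
do
  # Compare the state column exactly, so that CLOSE does not also count CLOSE-WAIT
  t=$(ss -tan | awk -v s="$i" '$1 == s' | wc -l)
  echo "tcp_connections{state=\"$i\"} $t" >> /tmp/tcp.txt
done
cat /tmp/tcp.txt | curl --data-binary @- "$pushgateway_url"
rm -f /tmp/tcp.txt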
[root@elk93 ~]#
- 2. Run the script
[root@elk93 ~]# bash /usr/local/bin/tcp_status2.sh
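The counting step of the script can also be sketched in Python, which makes the exact-match requirement explicit. The sample text below is hypothetical output in the general shape of "ss -tan", and `tcp_state_metrics` is a hypothetical helper that returns the body that would be pushed to pushgateway.

```python
from collections import Counter

STATES = ["SYN-SENT", "SYN-RECV", "FIN-WAIT-1", "FIN-WAIT-2", "TIME-WAIT",
          "CLOSE", "CLOSE-WAIT", "LAST-ACK", "LISTEN", "CLOSING", "ESTAB"]


def tcp_state_metrics(ss_output):
    """Turn "ss -tan" style output into pushgateway text lines."""
    lines = ss_output.strip().splitlines()[1:]  # skip the header row
    # The first column is the state; count exact values only.
    counts = Counter(line.split()[0] for line in lines if line.strip())
    return "".join(
        f'tcp_connections{{state="{state}"}} {counts.get(state, 0)}\n'
        for state in STATES
    )


# Hypothetical sample in the shape of "ss -tan" output
sample = """State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 0.0.0.0:22 0.0.0.0:*
ESTAB 0 0 10.0.0.93:22 10.0.0.1:50000
ESTAB 0 0 10.0.0.93:22 10.0.0.1:50001
CLOSE-WAIT 0 0 10.0.0.93:8001 10.0.0.43:40000
"""
print(tcp_state_metrics(sample), end="")
```

Every state is reported, including those with a count of zero, so each series keeps receiving fresh values on every push.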
Building a custom exporter (SRE operations development)
- 1. Custom exporter example written in python
1.1 Install the pip3 package
[root@prometheus-node42 ~]# apt update
[root@prometheus-node42 ~]# apt install -y python3-pip
1.2 Install the required modules
[root@elk93 ~]# pip3 install flask prometheus_client -i https://mirrors.aliyun.com/pypi/simple
1.3 Write the code
[root@elk93 ~]# cat flask_metric.py
#!/usr/bin/python3
# auther: violet
# blog: https://www.vionletarchitect.top
from prometheus_client import start_http_server, Counter, Summary
from flask import Flask, jsonify
from wsgiref.simple_server import make_server

app = Flask(__name__)
# Create metrics to track time spent and requests made
REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request')
COUNTER_TIME = Counter("request_count", "Total request count of the host")


@app.route("/apps")
@REQUEST_TIME.time()
def requests_count():
    # Count every request to /apps; the decorator above records its duration
    COUNTER_TIME.inc()
    return jsonify({"office": "https://www.vionletarchitect.top"}, {"auther": "violet"})


if __name__ == "__main__":
    print("Starting violet-linux-python-exporter, app URL: http://0.0.0.0:8001/apps, metrics URL: http://0.0.0.0:8000")
    # Expose the Prometheus metrics on port 8000
    start_http_server(8000)
    # Serve the Flask application on port 8001
    httpd = make_server('0.0.0.0', 8001, app)
    httpd.serve_forever()
[root@elk93 ~]#
1.4 Start the python program
[root@elk93 ~]# python3 flask_metric.py
Starting violet-linux-python-exporter, app URL: http://0.0.0.0:8001/apps, metrics URL: http://0.0.0.0:8000
10.0.0.1 - - [30/Mar/2025 15:03:14] "GET /apps HTTP/1.1" 200 64
10.0.0.1 - - [30/Mar/2025 15:03:26] "GET /apps HTTP/1.1" 200 64
1.5 Client test
[root@node-exporter43 ~]# cat violet_curl_metrics.sh
#!/bin/bash
URL=http://10.0.0.93:8001/apps
while true; do
  # Send a random burst of 1 to 50 requests, then pause for 1 to 5 seconds
  curl_num=$(( RANDOM % 50 + 1 ))
  sleep_num=$(( RANDOM % 5 + 1 ))
  for c_num in $(seq "$curl_num"); do
    curl -s "$URL" &> /dev/null
  done
  sleep "$sleep_num"
done
[root@node-exporter43 ~]#
[root@node-exporter43 ~]# bash -x violet_curl_metrics.sh
- 2. Monitoring the custom python exporter with prometheus
2.1 Edit the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# vim prometheus.yml
...
  - job_name: "violet_python_custom_metrics"
    static_configs:
      - targets:
          - 10.0.0.93:8000
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
2.2 Check the configuration file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# ./promtool check config prometheus.yml
Checking prometheus.yml
SUCCESS: prometheus.yml is valid prometheus config file syntax
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
2.3 Reload the configuration file
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]# curl -X POST http://10.0.0.31:9090/-/reload
[root@prometheus-server31 prometheus-2.53.4.linux-amd64]#
2.4 Verify that prometheus is collecting the data
http://10.0.0.31:9090/targets
2.5 Graph it in grafana
request_count_total
Total number of requests to /apps.
increase(request_count_total{job="violet_python_custom_metrics"}[1m])
Number of requests over the last minute.
irate(request_count_total{job="violet_python_custom_metrics"}[1m])
Instantaneous per-second request rate (QPS), computed from the last two samples in the 1m window.
request_processing_seconds_sum{job="violet_python_custom_metrics"} / request_processing_seconds_count{job="violet_python_custom_metrics"}
Average request processing time since the exporter started.
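Dividing the summary's sum by its count gives an average over the whole lifetime of the process. For a recent-window average, the usual PromQL approach is to divide rate() of the sum by rate() of the count over the same range. The arithmetic behind both can be sketched in Python, using two hypothetical snapshots of the counters taken one minute apart:

```python
def lifetime_average(total_seconds, total_requests):
    """Average latency since the process started (sum / count)."""
    if total_requests == 0:
        return None
    return total_seconds / total_requests


def windowed_average(sum_start, count_start, sum_end, count_end):
    """Average latency of only the requests served between two snapshots."""
    requests = count_end - count_start
    if requests == 0:
        return None  # no requests in the window
    return (sum_end - sum_start) / requests


# Hypothetical snapshots: (request_processing_seconds_sum, ..._count)
start = (2.0, 100)
end = (5.0, 200)

print(lifetime_average(*end))                              # whole lifetime
print(windowed_average(start[0], start[1], end[0], end[1]))  # last window only
```

When recent requests are slower than earlier ones, the windowed value rises quickly while the lifetime average moves only slightly, which is why the windowed form is usually the more useful one on a dashboard.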