Performance Testing Tools: JMeter Real-Time Monitoring Dashboard Configuration (Grafana + InfluxDB)

Keith Mo · December 16, 2017 · Last reply by 孙高飞 on December 02, 2018 · 7266 hits
This post has been marked as a featured post!

Reposted from my blog:

http://keithmo.me/post/2017/12/2017-12-16-jmeter-grafana-dashboard/

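[Warning] These are configuration notes, and the screen is about to fill up with fiddly detail. If you don't plan to build a dashboard yourself, feel free to skip them. I have already uploaded the finished dashboard to the official Grafana site, and it can be imported straight into Grafana from: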

https://grafana.com/dashboards/4026

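Next I plan to set up dashboards for Gatling, server performance metrics, databases, MQ and so on, in each case taking one of the better dashboards from the official site and reworking it into a layout I find easy to read.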


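If you run JMeter load tests on a server, real-time monitoring of the tool itself is a must, because the command-line output provides too little information.

JMeter's Backend Listener supports Graphite and InfluxDB. I chose InfluxDB as the time-series database here; its biggest advantage is a SQL-like query syntax. It is also far easier to configure than Graphite in JMeter 3.2+. (The downside is that, as of this writing, the official documentation has not been updated, so you have to look up the stored fields yourself and guess what each one is for.)

Grafana can produce very good-looking monitoring dashboards, but the configuration process is painful, and without very detailed notes I forget everything within a few days. Hence this post.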


[Prerequisites]

  • Collector: JMeter 3.2+, with InfluxdbBackendListenerClient selected in the Backend Listener
  • Data source: InfluxDB 1.4+
  • Dashboard: Grafana 4.6+
    • Data source already added
    • Create a new dashboard and add 3 rows

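[Note]

The throughput and response-time charts count only successful requests (failed ones are usually meaningless for these metrics, and the number of timeout failures is visible in the tables), so the results may differ from what JMeter itself shows.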

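[Result]

Overall

Error counts

Individual transaction

The dashboard has been uploaded to the official Grafana site. You can download the JSON file from the address below, or import it directly by ID 4026: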

https://grafana.com/dashboards/4026


Reference configuration for the JMeter Backend Listener:

JMeter settings


Settings

  • General
    • Name: JMeter Dashboard
    • Description: Monitor your JMeter load test in real time with InfluxDB and Grafana.
    • Tags: load_test
  • Rows
    • Summary, Errors, Individual Transaction - $transaction

Templating

$data_source

  • Name: data_source
  • Type: Datasource
  • Data source options - Type: InfluxDB

$application

  • Name: application
  • Type: Query
  • Data source: $data_source
  • Refresh: On Dashboard Load
  • Query: SHOW TAG VALUES FROM "$measurement_name" WITH KEY = "application"

$transaction

  • Name: transaction
  • Type: Query
  • Data source: $data_source
  • Refresh: On Dashboard Load
  • Query: SHOW TAG VALUES FROM "$measurement_name" WITH KEY = "transaction" WHERE "application" =~ /^$application$/ AND "transaction" != 'internal' AND "transaction" != 'all'

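Unfortunately, templating does not support $timeFilter (a limitation of InfluxDB's SHOW TAG VALUES syntax), so over time the list fills up with all sorts of transaction names and gets messy.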

$measurement_name

  • Name: measurement_name
  • Label: Measurement name
  • Type: Constant
  • Hide: Variable
  • Value: jmeter (the JMeter Backend Listener default)

$send_interval

  • Name: send_interval
  • Label: Backend send interval
  • Type: Constant
  • Hide: Variable
  • Value: 5 (the JMeter InfluxdbBackendListenerClient default)
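
As a rough illustration of how these variables end up inside the panel queries, here is a minimal Python sketch that substitutes $name placeholders into a query string. It is not Grafana's implementation (Grafana also handles multi-value variables, regex escaping and built-ins such as $timeFilter). The function name expand_variables and the sample application value my_app are made up for the example.

```python
import re


def expand_variables(query, variables):
    """Replace $name placeholders with values; leave unknown names untouched."""
    def substitute(match):
        name = match.group(1)
        # Unknown names (e.g. $timeFilter) are kept as-is in this sketch.
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\$(\w+)", substitute, query)


# Hypothetical values: "jmeter" is the default measurement, "my_app" is invented.
variables = {"measurement_name": "jmeter", "application": "my_app"}
template = (
    'SHOW TAG VALUES FROM "$measurement_name" WITH KEY = "transaction" '
    'WHERE "application" =~ /^$application$/'
)
print(expand_variables(template, variables))
```

The trailing $ of the regex anchor is not followed by a word character, so it is left alone by the pattern.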

Annotations

Edit Annotations & Alerts (Built-in)

  • Name: Start/stop marker
  • Data source: $data_source
  • Query: select text from events where $timeFilter

Row 1

Line 1

Singlestat - Total Requests, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE ("application" =~ /^$application$/ AND "transaction" = 'all') AND $timeFilter GROUP BY time($__interval) fill(null)
  • Options
    • Value
      • Stat: Total
      • Postfix: Requests
      • Decimals: 0
    • Coloring: Value; change the middle color to a lighter yellow
  • Value Mappings: null -> 0

Singlestat - Failed Requests, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("countError") FROM "$measurement_name" WHERE ("transaction" = 'all' AND "application" =~ /^$application$/) AND $timeFilter GROUP BY time($__interval) fill(null)
  • Options
    • Value
      • Stat: Total
      • Postfix: Failed
      • Decimals: 0
    • Coloring: Value; change the middle color to red
  • Value Mappings: null -> 0

Singlestat - Error Rate %, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("error") / sum("all") FROM (SELECT sum("count") AS "all" FROM "$measurement_name" WHERE "transaction" = 'all' AND "application" =~ /^$application$/ AND $timeFilter GROUP BY time($__interval) fill(null)), (SELECT sum("countError") AS "error" FROM "$measurement_name" WHERE "transaction" = 'all' AND "application" =~ /^$application$/ AND $timeFilter GROUP BY time($__interval) fill(null))
  • Options
    • Value
      • Stat: Total
      • Unit: percent(0.0-1.0)
      • Decimals: 2
    • Coloring: Value, Thresholds: 0,0.01
    • Gauge: Show, Max: 1
  • Value Mappings: null -> 0
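
The nested query above divides the summed error count by the summed request count. The same arithmetic as a small Python sketch, assuming each list holds the per-interval values for one time range, with None standing for an empty interval (what fill(null) yields). error_rate is a hypothetical helper name, not part of JMeter or Grafana.

```python
def error_rate(total_counts, error_counts):
    """Return errors / total as a fraction in 0.0-1.0, treating None as 0."""
    total = sum(c or 0 for c in total_counts)
    errors = sum(c or 0 for c in error_counts)
    # Mirror the "null -> 0" value mapping when nothing has been recorded.
    return errors / total if total else 0.0


# Made-up sample data: three intervals, one of them empty.
print(error_rate([100, None, 100], [1, 1, None]))
```

Returning a fraction rather than a percentage matches the percent(0.0-1.0) unit chosen for the panel.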

Line 2

Graph - Total Throughput, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT mean("count") / $send_interval FROM "$measurement_name" WHERE ("transaction" = 'all' AND "application" =~ /^$application$/) AND $timeFilter GROUP BY time($__interval) fill(null)
    • alias: Req / sec
  • Legend
    • As Table, Min, Max, Avg, Decimals: 2
  • Display
    • Lines, Fill: 7, Null value: null
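
The throughput query divides the mean of "count" by $send_interval to turn a per-flush request count into requests per second. A minimal Python sketch of that conversion, assuming the listener writes one "count" value per send interval (5 seconds by default, as noted for $send_interval). requests_per_second is a hypothetical name and the sample numbers are invented.

```python
def requests_per_second(counts, send_interval=5):
    """Average the per-flush counts in one time bucket and convert to req/s."""
    values = [c for c in counts if c is not None]
    if not values:
        return None  # empty bucket, comparable to fill(null)
    return (sum(values) / len(values)) / send_interval


# Three flushes of 250, 300 and 350 requests, one every 5 seconds.
print(requests_per_second([250, 300, 350]))
```

If the send interval is changed in JMeter, the constant in the dashboard has to be changed to match, otherwise the rate is scaled incorrectly.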

Graph - Total Errors, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("countError") FROM "$measurement_name" WHERE ("transaction" = 'all' AND "application" =~ /^$application$/) AND $timeFilter GROUP BY time($__interval) fill(null)
    • alias: Num of Errors
  • Axes
    • Decimals: 0
  • Legend
    • As Table, Total, Decimals: 0
  • Display
    • Lines, Fill: 7, Null value: null

Graph - Active Threads, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT last("maxAT") FROM "$measurement_name" WHERE ("transaction" = 'internal' AND "application" =~ /^$application$/) AND $timeFilter GROUP BY time($__interval) fill(null)
    • alias: Threads
  • Axes
    • Decimals: 0
  • Legend
    • As Table, Current, Decimals: 0
  • Display
    • Lines, Fill: 7, Null value: null

Line 3

Graph - Transactions Response Times (95th pct), Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT mean("pct95.0") FROM "$measurement_name" WHERE ("statut" = 'ok' AND "application" =~ /^$application$/) AND $timeFilter GROUP BY "transaction", time($__interval) fill(null)
    • alias: $tag_transaction
  • Axes
    • Units: milliseconds(ms)
  • Legend
    • As Table, To the right, Max, Avg, Decimals: 2
  • Display
    • Lines, Null value: null
    • Thresholds
      • T1: lt 500, ok, Fill, Line
      • T2: gt 1500, warning, Line
      • T3: gt 5000, critical, Fill, Line

Row 2

Table - Errors per Transaction, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE ("application" =~ /^$application$/ AND "statut" = 'ko') AND $timeFilter GROUP BY "transaction"
    • format: Table
  • Column Styles
    • Time - Type: Hidden
    • /.*/ - Decimals: 0

Table - Error Info, Span: 8

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE ("application" =~ /^$application$/ AND "responseCode" !~ /^$/) AND $timeFilter GROUP BY "responseCode","responseMessage"
    • format: Table
  • Column Styles
    • Time - Type: Hidden
    • /.*/ - Decimals: 0

Row 3

Copy the charts from row 1 (except the thread graph), then adjust the queries and a few details.

Line 1

Singlestat - Total Requests - $transaction, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE ("application" =~ /^$application$/ AND "transaction" =~ /^$transaction$/ AND "statut" = 'all') AND $timeFilter GROUP BY time($__interval) fill(null)
  • Options
    • Value
      • Stat: Total
      • Postfix: Requests
      • Decimals: 0
    • Coloring: Value; change the middle color to a lighter yellow
  • Value Mappings: null -> 0

Singlestat - Failed Requests - $transaction, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE ("application" =~ /^$application$/ AND "transaction" =~ /^$transaction$/ AND "statut" = 'ko') AND $timeFilter GROUP BY time($__interval) fill(null)
  • Options
    • Value
      • Stat: Total
      • Postfix: Failed
      • Decimals: 0
    • Coloring: Value; change the middle color to red
  • Value Mappings: null -> 0

Singlestat - Error Rate % - $transaction, Span 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("error") / sum("all") FROM (SELECT sum("count") AS "all" FROM "$measurement_name" WHERE "transaction" =~ /^$transaction$/ AND "statut" = 'all' AND "application" =~ /^$application$/ AND $timeFilter GROUP BY time($__interval) fill(null)), (SELECT sum("count") AS "error" FROM "$measurement_name" WHERE "transaction" =~ /^$transaction$/ AND "statut" = 'ko' AND "application" =~ /^$application$/ AND $timeFilter GROUP BY time($__interval) fill(null))
  • Options
    • Value
      • Stat: Total
      • Unit: percent(0.0-1.0)
      • Decimals: 2
    • Coloring: Value, Thresholds: 0,0.01
    • Gauge: Show, Max: 1
  • Value Mappings: null -> 0

Line 2

Graph - Throughput - $transaction, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT last("count") / $send_interval FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval)
    • alias: Req / sec
  • Legend
    • As Table, Min, Max, Avg, Decimals: 2
  • Display
    • Lines, Fill: 7, Null value: null

Graph - Errors - $transaction, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT sum("count") FROM "$measurement_name" WHERE "application" =~ /^$application$/ AND "transaction" =~ /^$transaction$/ AND "statut" = 'ko' AND $timeFilter GROUP BY time($__interval) fill(null)
    • alias: Num of Errors
  • Axes
    • Decimals: 0
  • Legend
    • As Table, Total, Decimals: 0
  • Display
    • Lines, Fill: 7
    • Points, Point Radius: 1
    • Null value: null

Line 3

Graph - Response Times - $transaction, Span: 4

  • Metric
    • Data Source: $data_source
      • Options - Min time interval: [[send_interval]]s
    • SELECT last("avg") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval)
      • alias: Average
    • SELECT last("pct50.0") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval)
      • alias: Median
    • SELECT last("pct90.0") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval) fill(null)
      • alias: 90th Percentile
    • SELECT last("pct95.0") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval) fill(null)
      • alias: 95th Percentile
    • SELECT last("pct99.0") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval) fill(null)
      • alias: 99th Percentile
    • SELECT last("max") FROM "$measurement_name" WHERE ("transaction" =~ /^$transaction$/ AND "statut" = 'ok') AND $timeFilter GROUP BY time($__interval) fill(null)
      • alias: Max
  • Axes
    • Units: milliseconds(ms)
  • Legend
    • As Table, To the right
    • Max, Avg, Decimals: 2
    • Hide Series: With only nulls
  • Display
    • Lines, Null value: null
    • Thresholds
      • T1: lt 500, ok, Fill, Line
      • T2: gt 1500, warning, Line
      • T3: gt 5000, critical, Fill, Line

The exported JSON file contains no data source and cannot be imported directly. You need to edit the file by hand and add the following inside "__inputs": []:

{
  "name": "JMETER_DASHBOARD",
  "label": "DB name",
  "description": "",
  "type": "datasource",
  "pluginId": "influxdb",
  "pluginName": "InfluxDB"
},

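If you want to upload it to the official site, then for it to be categorized correctly you also need to add the following inside "__requires": []:

{
  "type": "datasource",
  "id": "influxdb",
  "name": "InfluxDB",
  "version": "1.4.0"
},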
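
The two manual edits can also be scripted. A minimal Python sketch, assuming the exported dashboard has already been loaded into a dict (for example with json.load). patch_dashboard is a hypothetical helper, and the two entries are copied from the snippets above.

```python
import copy
import json

# Entry for "__inputs", as given in the post.
DATASOURCE_INPUT = {
    "name": "JMETER_DASHBOARD",
    "label": "DB name",
    "description": "",
    "type": "datasource",
    "pluginId": "influxdb",
    "pluginName": "InfluxDB",
}

# Entry for "__requires", as given in the post.
INFLUXDB_REQUIREMENT = {
    "type": "datasource",
    "id": "influxdb",
    "name": "InfluxDB",
    "version": "1.4.0",
}


def patch_dashboard(dashboard):
    """Return a copy of the dashboard with both entries present exactly once."""
    patched = copy.deepcopy(dashboard)
    inputs = patched.setdefault("__inputs", [])
    if not any(e.get("name") == DATASOURCE_INPUT["name"] for e in inputs):
        inputs.insert(0, dict(DATASOURCE_INPUT))
    requires = patched.setdefault("__requires", [])
    if not any(
        e.get("type") == "datasource" and e.get("id") == "influxdb"
        for e in requires
    ):
        requires.insert(0, dict(INFLUXDB_REQUIREMENT))
    return patched


# Stand-in for an exported file; a real export has many more keys.
exported = {"title": "JMeter Dashboard", "__inputs": [], "__requires": []}
print(json.dumps(patch_dashboard(exported), indent=2))
```

Running the helper twice leaves the result unchanged, so it is safe to apply to a file that was already edited by hand.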

Reference:

https://grafana.com/dashboards/3351

19 replies in total

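Nice. JMeter and Grafana are a standard pairing, and most people don't make good use of it.

Great post, I'll try it out when I have time.

Looks good. Some of my colleagues use JMeter too; I wonder whether we could build something this polished.

Do I only need to change the DB name in the inputs?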

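In reply to 尹全旺

Sorry, I only just saw this. On import you should see three options: choose your own DB as the data source, then specify the measurement name and the send interval (the JMeter defaults are jmeter and 5 seconds; if you kept the defaults, you don't need to change these two).

If you find a bug or have suggestions for improvement, let me know any time. :)


I made all of the above up, and I really can't keep making things up any longer... 😆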

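In reply to wholegale39

Sorry, I rarely visit the forum and only just saw this. You have probably solved it long ago :P

If there are no errors, Error Info has no data. If a panel does not display properly, check the data in InfluxDB, for example whether there are any records with statut = 'ko'.

If the 95th-percentile chart has no data, check whether the DB has a column called pct95.0. JMeter's Backend Listener has a percentiles field, which I set to 50;90;95;99 (it is in the screenshot). For each value listed there, the DB gets a column named pctXX.X. If you remove 95, that column will not exist.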
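
To check which columns a given percentiles setting should produce, the naming rule described in this reply can be sketched in Python. This assumes the pctXX.X pattern holds exactly as described; percentile_fields is a hypothetical helper, and formatting through float() is my assumption for how non-integer values would be written.

```python
def percentile_fields(percentiles):
    """Map a 'percentiles' setting such as '50;90;95;99' to column names."""
    return [
        f"pct{float(value)}"  # 95 -> pct95.0, following the pctXX.X pattern
        for value in percentiles.split(";")
        if value.strip()
    ]


print(percentile_fields("50;90;95;99"))
```

Any column the dashboard queries, such as pct95.0, has to appear in this list, otherwise the corresponding series stays empty.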

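With InfluxDB 1.4+ I can't access the web page; since this is the latest version, the port isn't there. How did you solve this? Changing the conf file has no effect, emmm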

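In reply to 潘鹏

InfluxDB's web console was removed long ago. Just query it directly on the server 😀

If you install an older version and then reinstall a newer one, you can still get the console on Linux.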

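In reply to Keith Mo


Why does this show only one data item instead of all of the data? I downloaded the JSON file from the official site and imported it, so that should be right, shouldn't it?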

In reply to 潘鹏

Could you try it with a real API request? The logic for the Debug Sampler may be different.

Gatling plus Grafana looks great too.

思寒_seveniruby marked this post as featured, 21 May 23:11

While setting up Grafana, did you run into invitation/registration emails failing to send? SMTP is already enabled in the config file. Update: solved.

thomas · #18 · July 10, 2018

@keithmork
Installed InfluxDB 1.6.0 on Ubuntu 18.04 by adding the apt source

 sudo systemctl start influxdb



Problem encountered:

cmd@TR:~$ curl http://localhost:8083
curl: (7) Failed to connect to localhost port 8083: Connection refused


cmd@TR:~$ curl http://localhost:8086
404 page not found



cmd@TR:~$ sudo ufw allow 8086/tcp
[sudo] password for cmd:
Rules updated
Rules updated (v6)


cmd@TR:~$ sudo ufw allow 8083/tcp
Rules updated
Rules updated (v6)


The curl problem persists


cmd@TR:~$ sudo ufw disable
Firewall stopped and disabled on system startup


After rebooting and running sudo systemctl start influxdb, the curl problem persists


Starting influxd
sudo influxd -config /etc/influxdb/influxdb.conf


cmd@TR:~$ influx
Connected to http://localhost:8086 version 1.6.0
InfluxDB shell version: 1.6.0

> CREATE DATABASE jmeter

> SHOW DATABASES
name: databases
name
----
telegraf
_internal
jmeter
> use jmeter
Using database jmeter


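Afterwards, adding the data source in Grafana connected to the jmeter database just fine, so I'm ignoring this problem for now.

Why do I get the following?

cmd@TR:~$ curl http://localhost:8083
curl: (7) Failed to connect to localhost port 8083: Connection refused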


cmd@TR:~$ curl http://localhost:8086
404 page not found

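I have already googled this without finding a suitable answer, and tried a few changes, but the problem above remains.

Still, adding the data source in Grafana afterwards connected to the jmeter database without trouble.

Sorry, I found the answer: the official releases from 1.2 onward removed the web admin page.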
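
For context, the 404 on port 8086 is what the root path returns; the InfluxDB 1.x HTTP API answers on paths such as /ping and /query. A minimal Python sketch that builds a /query URL one could pass to curl, assuming the default port and the jmeter database created in the session above. influx_query_url is a hypothetical helper name.

```python
from urllib.parse import urlencode


def influx_query_url(query, db="jmeter", base="http://localhost:8086"):
    """Build a URL for the InfluxDB 1.x /query endpoint."""
    return f"{base}/query?{urlencode({'db': db, 'q': query})}"


print(influx_query_url("SHOW MEASUREMENTS"))
```

The sketch only builds the URL and does not contact the server, so it can be run without InfluxDB being up.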

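I've been doing performance testing lately and tried this out as soon as I saw the post. One question: if I set Grafana to refresh every 5 seconds, will that affect the performance of the system under test or of the client? Also, if I want to monitor server-side CPU and other resource usage in real time, is there a tool you would recommend? Thanks!

A question for the author: with this approach, does that mean there is no need to keep those huge result files locally? I currently save results as JTL/CSV and then convert them to HTML.

Nice, this is how we do it too. We also combined Prometheus with Grafana and put the monitoring dashboards up as well.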
