When a graphd service in the graph database cluster is down, the nebula-stats-exporter monitoring service fails with an error as well.

  • nebula version: 2.0.1
  • Deployment: distributed
  • Production environment: Y
  • Hardware
    • Disk: SSD

I am using the package downloaded in July by following this tutorial:
https://docs.nebula-graph.com.cn/master/nebula-dashboard/2.deploy-dashboard/

Downloaded file:

wget https://oss-cdn.nebula-graph.com.cn/nebula-graph-dashboard/nebula-graph-dashboard-beta.tar.gz

Problems:
1. When a graphd service in the cluster is down, nebula-stats-exporter fails at startup.
Start command:

./nebula-stats-exporter --bare-metal --bare-metal-config=./config.yaml

Output (IP addresses have been redacted):

I0819 10:35:50.410815   35431 exporter.go:311] Begin Describe Nebula Metrics
I0819 10:35:50.415812   35431 exporter.go:317] Describe Nebula Metrics Done
I0819 10:35:50.416585   35431 main.go:88] Providing metrics at :9100/metrics
I0819 10:35:50.807616   35431 exporter.go:321] Collect!
I0819 10:35:50.807838   35431 exporter.go:387] Collect METAD metad0:19559 Metrics 
I0819 10:35:50.809364   35431 exporter.go:387] Collect METAD metad1:19559 Metrics 
I0819 10:35:50.810381   35431 exporter.go:387] Collect METAD metad2:19559 Metrics 
I0819 10:35:50.811389   35431 exporter.go:387] Collect GRAPHD graphd0:19669 Metrics 
I0819 10:35:50.812399   35431 exporter.go:387] Collect GRAPHD graphd1:19669 Metrics 
I0819 10:35:50.813545   35431 exporter.go:387] Collect GRAPHD graphd2:19669 Metrics 
I0819 10:35:50.814488   35431 exporter.go:387] Collect GRAPHD graphd3:19669 Metrics 
I0819 10:35:50.814783   35431 exporter.go:390] metrics from ip4:19669 was empty
E0819 10:35:50.814811   35431 exporter.go:393] get query metrics from ip4:19669 failed: Get "http://ip4:19669/stats?stats=": dial tcp ip4:19669: connect: connection refused
E0819 10:35:50.814779   35431 exporter.go:401] get status metrics from ip4:19669 failed: Get "http://ip4:19669/status": dial tcp ip4:19669: connect: connection refused
panic: send on closed channel

goroutine 81 [running]:
github.com/vesoft-inc/nebula-stats-exporter/exporter.(*NebulaExporter).CollectMetrics(0xc000461500, 0xc00027f020, 0x7, 0xc00027f088, 0x6, 0x157b729, 0x6, 0xc0002e7b00, 0x80, 0x81, ...)
        /tmp/nebula-stats-exporter/exporter/exporter.go:363 +0x2b5
github.com/vesoft-inc/nebula-stats-exporter/exporter.(*NebulaExporter).CollectFromStaticConfig.func1(0xc0000400f0, 0xc00027f020, 0x7, 0xc00027f040, 0xc, 0x4cd5, 0xc00027f088, 0x6, 0xc00027f040, 0xc, ...)
        /tmp/nebula-stats-exporter/exporter/exporter.go:396 +0x3b3
created by github.com/vesoft-inc/nebula-stats-exporter/exporter.(*NebulaExporter).CollectFromStaticConfig
        /tmp/nebula-stats-exporter/exporter/exporter.go:385 +0x25e

Note: ip4 is the address of the fourth graphd service: x.x.x.x
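For readers unfamiliar with the last line of the log: "panic: send on closed channel" is what the Go runtime raises when a goroutine sends on a channel that has already been closed. The sketch below is not the exporter's code. It is a minimal illustration, with hypothetical function names (`sendAfterClose`, `collect`), of how that panic arises and of one common way to avoid it: close the channel only after every sending goroutine has returned.

```go
package main

import (
	"fmt"
	"sync"
)

// sendAfterClose reproduces the runtime panic in isolation and returns its
// message. Hypothetical helper, for illustration only.
func sendAfterClose() (msg string) {
	defer func() {
		if r := recover(); r != nil {
			msg = fmt.Sprint(r)
		}
	}()
	ch := make(chan int, 1)
	close(ch)
	ch <- 1 // sending on a closed channel panics
	return "no panic"
}

// collect simulates one scrape: every target gets its own goroutine that
// sends one result on out. Hypothetical stand-in for a real collector.
func collect(targets []string) []string {
	out := make(chan string)
	var wg sync.WaitGroup

	for _, t := range targets {
		wg.Add(1)
		go func(target string) {
			defer wg.Done()
			// A real collector would issue an HTTP request here. A failed
			// request should end in a normal return, never in a late send.
			out <- "scraped " + target
		}(t)
	}

	// Close only after all senders are done, so no send can hit a closed channel.
	go func() {
		wg.Wait()
		close(out)
	}()

	var results []string
	for r := range out {
		results = append(results, r)
	}
	return results
}

func main() {
	fmt.Println("runtime message:", sendAfterClose())
	results := collect([]string{"graphd0:19669", "graphd1:19669", "graphd3:19669"})
	fmt.Println(len(results), "results collected")
}
```

Whether the exporter's panic came from exactly this ordering is an assumption. The stack trace only shows that the send in CollectMetrics happened after the channel was closed.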

2. While nebula-stats-exporter is running normally, if a graphd service suddenly goes down, nebula-stats-exporter crashes along with it.

Solved: upgrading to the latest version of nebula-stats-exporter fixed the problem.
Latest version: 2.5.0
https://github.com/vesoft-inc/nebula-stats-exporter/releases/tag/2.5.0


If the problem is solved, please mark your own reply as the solution so that people who hit this later can find the fix. Thanks, JustDoIt
