Nebula Dashboard: how to change disk paths for multi-disk monitoring

Question template:

  • nebula version: 2.6.1
  • Deployment: distributed
  • Installation: built from source / Docker / RPM
  • Production environment: Y / N
  • Hardware
    • Disk (SSD recommended)
    • CPU and memory
  • Problem description: every NebulaGraph node has multiple disks mounted. How do I change the disk paths that the Dashboard monitors?
  • The disks in use are shown below
  • I only want to monitor data2, data10, data3, data1, data9, and data11. How should I change the configuration?
  • Here is the original code

Just add a filter condition; change the filter to {device=~"sdc|sdd"}.
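
For example, dropped into the disk I/O queries it would look something like this (a sketch; substitute whichever device names you actually want to keep):

  disk_readbytes: 'irate(node_disk_read_bytes_total{device=~"sdc|sdd"}[1m])',
  disk_writebytes: 'irate(node_disk_written_bytes_total{device=~"sdc|sdd"}[1m])',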

I tried that, but it didn't take effect; it still monitors the disk mounted at /.
My change is as follows (I added sdd):

  disk_used: 'node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}',
  disk_free: 'node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}',
  disk_readbytes: 'irate(node_disk_read_bytes_total{device=~"(sd|nvme|hd|sdd)[a-z0-9]*"}[1m])',
  disk_writebytes: 'irate(node_disk_written_bytes_total{device=~"(sd|nvme|hd|sdd)[a-z0-9]*"}[1m])',
  disk_readiops: 'irate(node_disk_reads_completed_total{device=~"(sd|nvme|hd|sdd)[a-z0-9]*"}[1m])',
  disk_writeiops: 'irate(node_disk_writes_completed_total{device=~"(sd|nvme|hd|sdd)[a-z0-9]*"}[1m])',
  inode_utilization: '(1 - (node_filesystem_files_free{mountpoint="/",fstype!="rootfs"}) / (node_filesystem_files{mountpoint="/",fstype!="rootfs"})) * 100'

Also, disk_used, disk_free, and disk_size all use mountpoint="/" in the code. Does that mean only the filesystem mounted at "/" is monitored, so /data2, /data3, and so on are not covered?

  disk_used: 'node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}',
  disk_free: 'node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}',
  disk_size: 'node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}',

The output shown on the web /metrics page is as follows:

# TYPE node_filesystem_files gauge
node_filesystem_files{device="/dev/mapper/VolGroup00-LogVol01",fstype="xfs",mountpoint="/tmp"} 1.048576e+06
node_filesystem_files{device="/dev/mapper/VolGroup00-LogVol02",fstype="xfs",mountpoint="/var"} 5.24288e+06
node_filesystem_files{device="/dev/mapper/VolGroup00-LogVol03",fstype="xfs",mountpoint="/"} 1.944584192e+09
node_filesystem_files{device="/dev/sda2",fstype="xfs",mountpoint="/boot"} 524288
node_filesystem_files{device="/dev/sdb",fstype="ext4",mountpoint="/data1"} 2.44195328e+08
node_filesystem_files{device="/dev/sdc",fstype="ext4",mountpoint="/data2"} 2.44195328e+08
node_filesystem_files{device="/dev/sdd",fstype="ext4",mountpoint="/data3"} 2.44195328e+08
node_filesystem_files{device="/dev/sde",fstype="ext4",mountpoint="/data4"} 2.44195328e+08
node_filesystem_files{device="/dev/sdf",fstype="ext4",mountpoint="/data5"} 2.44195328e+08
node_filesystem_files{device="/dev/sdg",fstype="ext4",mountpoint="/data6"} 2.44195328e+08
node_filesystem_files{device="/dev/sdh",fstype="ext4",mountpoint="/data7"} 2.44195328e+08
node_filesystem_files{device="/dev/sdi",fstype="ext4",mountpoint="/data8"} 2.44195328e+08
node_filesystem_files{device="/dev/sdj",fstype="ext4",mountpoint="/data9"} 2.44195328e+08
node_filesystem_files{device="/dev/sdk",fstype="ext4",mountpoint="/data11"} 2.44195328e+08
node_filesystem_files{device="/dev/sdl",fstype="ext4",mountpoint="/data10"} 2.44195328e+08
node_filesystem_files{device="rootfs",fstype="rootfs",mountpoint="/"} 1.944584192e+09
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 2.4741636e+07
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 2.4741636e+07
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/32106"} 2.4741636e+07
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/3533"} 2.4741636e+07
node_filesystem_files{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/61797"} 2.4741636e+07
# HELP node_filesystem_files_free Filesystem total free file nodes.
# TYPE node_filesystem_files_free gauge
node_filesystem_files_free{device="/dev/mapper/VolGroup00-LogVol01",fstype="xfs",mountpoint="/tmp"} 1.048551e+06
node_filesystem_files_free{device="/dev/mapper/VolGroup00-LogVol02",fstype="xfs",mountpoint="/var"} 5.23962e+06
node_filesystem_files_free{device="/dev/mapper/VolGroup00-LogVol03",fstype="xfs",mountpoint="/"} 1.944072e+09
node_filesystem_files_free{device="/dev/sda2",fstype="xfs",mountpoint="/boot"} 523957
node_filesystem_files_free{device="/dev/sdb",fstype="ext4",mountpoint="/data1"} 2.44195293e+08
node_filesystem_files_free{device="/dev/sdc",fstype="ext4",mountpoint="/data2"} 2.44195169e+08
node_filesystem_files_free{device="/dev/sdd",fstype="ext4",mountpoint="/data3"} 2.44195178e+08
node_filesystem_files_free{device="/dev/sde",fstype="ext4",mountpoint="/data4"} 2.44195179e+08
node_filesystem_files_free{device="/dev/sdf",fstype="ext4",mountpoint="/data5"} 2.44195179e+08
node_filesystem_files_free{device="/dev/sdg",fstype="ext4",mountpoint="/data6"} 2.44195178e+08
node_filesystem_files_free{device="/dev/sdh",fstype="ext4",mountpoint="/data7"} 2.44195238e+08
node_filesystem_files_free{device="/dev/sdi",fstype="ext4",mountpoint="/data8"} 2.44195239e+08
node_filesystem_files_free{device="/dev/sdj",fstype="ext4",mountpoint="/data9"} 2.44195239e+08
node_filesystem_files_free{device="/dev/sdk",fstype="ext4",mountpoint="/data11"} 2.44181319e+08
node_filesystem_files_free{device="/dev/sdl",fstype="ext4",mountpoint="/data10"} 2.4418892e+08
node_filesystem_files_free{device="rootfs",fstype="rootfs",mountpoint="/"} 1.944072e+09
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run"} 2.4740457e+07
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/0"} 2.4741635e+07
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/32106"} 2.4741635e+07
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/3533"} 2.4741635e+07
node_filesystem_files_free{device="tmpfs",fstype="tmpfs",mountpoint="/run/user/61797"} 2.4741635e+07

There's a problem with that change: device filters by device name, and the regex alternatives are ORed together, not ANDed. sd is already one of the alternatives and already matches sdd, so if you want to single out sdd you have to drop the leading sd first.

Also, mountpoint is the path label, and it supports filtering too; you can write it the same way as device.
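
For example, to cover only the six data disks in your metrics output (/data1, /data2, /data3, /data9, /data10, /data11, which sit on sdb, sdc, sdd, sdj, sdl, and sdk), something along these lines should work. This is a sketch, so double-check the regex against the labels in your own environment; note the =~ regex matcher on mountpoint in place of the plain = in the original config:

  disk_used: 'node_filesystem_size_bytes{mountpoint=~"/data(1|2|3|9|10|11)",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint=~"/data(1|2|3|9|10|11)",fstype!="rootfs"}',
  disk_free: 'node_filesystem_avail_bytes{mountpoint=~"/data(1|2|3|9|10|11)",fstype!="rootfs"}',
  disk_size: 'node_filesystem_size_bytes{mountpoint=~"/data(1|2|3|9|10|11)",fstype!="rootfs"}',
  disk_readbytes: 'irate(node_disk_read_bytes_total{device=~"sdb|sdc|sdd|sdj|sdk|sdl"}[1m])',
  disk_writebytes: 'irate(node_disk_written_bytes_total{device=~"sdb|sdc|sdd|sdj|sdk|sdl"}[1m])',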

1. Multiple mountpoints can be configured with the same syntax as device. But can the disk panel use a sliding window? For example, with 3 machines and 5 disks each, the disk module on the dashboard page shows 15 instances, and the whole disk page gets very long and looks unbalanced.
2. The disk_used and disk_size values don't look right. Below I configured /data2 (sdc) on its own; this is what Linux reports:

/dev/sdc                         3.6T  7.6G  3.4T   1% /data2

In the dashboard's disk module, disk_size shows 3.94 TB on all 3 machines, while the actual size is only 3.6 T; disk_used is nearly identical across the 3 machines and far from the actual usage.
Here is the adjusted part of the code:

  // disk relative:
  disk_used: 'node_filesystem_size_bytes{mountpoint="/data2",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint="/data2",fstype!="rootfs"}',
  disk_free: 'node_filesystem_avail_bytes{mountpoint="/data2",fstype!="rootfs"}',
  disk_readbytes: 'irate(node_disk_read_bytes_total{device=~"(sdc)[a-z0-9]*"}[1m])',
  disk_writebytes: 'irate(node_disk_written_bytes_total{device=~"(sdc)[a-z0-9]*"}[1m])',
  disk_readiops: 'irate(node_disk_reads_completed_total{device=~"(sdc)[a-z0-9]*"}[1m])',
  disk_writeiops: 'irate(node_disk_writes_completed_total{device=~"(sdc)[a-z0-9]*"}[1m])',
  inode_utilization: '(1 - (node_filesystem_files_free{mountpoint="/data2",fstype!="rootfs"}) / (node_filesystem_files{mountpoint="/data2",fstype!="rootfs"})) * 100',

  disk_size: 'node_filesystem_size_bytes{mountpoint="/data2",fstype!="rootfs"}',
1. The overview layout indeed can't fit all of the information. In the next version we have added a hint: when the panel overflows, we suggest going to the detail page to browse and analyze.
2. For the numeric discrepancy, run the query directly in Prometheus and check the returned data, to first rule out a collection problem, as shown below.
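
That is, paste the disk_used expression from your config into the Prometheus query box as-is:

node_filesystem_size_bytes{mountpoint="/data2",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint="/data2",fstype!="rootfs"}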

I tried it in Prometheus; the data it returns is the same as in the dashboard, so it looks like a data-collection problem. How should I handle this kind of issue? Restart the node-exporter service?

You can split the expression apart and run each piece to see where the problem lies. For example, disk_used = node_filesystem_size_bytes - node_filesystem_avail_bytes, so run those two metrics separately and compare them.
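
That is, using your /data2 filter as an example, run these two metrics separately in Prometheus and compare each one against df:

node_filesystem_size_bytes{mountpoint="/data2",fstype!="rootfs"}
node_filesystem_avail_bytes{mountpoint="/data2",fstype!="rootfs"}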

Both node_filesystem_size_bytes and node_filesystem_avail_bytes are off; each is larger than what df -h reports, so after subtraction disk_used comes out larger than the real value.

Check all three metrics: node_filesystem_size_bytes, node_filesystem_free_bytes, and node_filesystem_avail_bytes.

I think I've found the problem; it looks like a calculation issue. I did a fresh install from the official tar.gz, monitored the "/" path, and ran the following queries in Prometheus:

node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"} - node_filesystem_avail_bytes{mountpoint="/",fstype!="rootfs"}

Result: 55596539904 (base 1024: 51.77 GB; base 1000: 55.59 GB)

node_filesystem_size_bytes{mountpoint="/",fstype!="rootfs"}

Result: 3982482210816 (base 1024: 3708.98 GB ≈ 3.62 TB; base 1000: 3982.48 GB ≈ 3.98 TB)

Could you check whether that's what's going on? Also, where is the byte-conversion logic in the code?

nebula-dashboard/dashboard.ts at master · vesoft-inc/nebula-dashboard · GitHub has the conversion functions. You can change the base in that code and then check your local data again.

I modified these two places, changing 1000 to 1024 throughout. The disk numbers are correct now, but I'm not sure whether this has any other side effects:

// Convert a threshold baseline expressed in the given unit into bytes (1024-based).
export const getBaseLineByUnit = (baseLine, unit) => {
  switch (unit) {
    case 'KB':
    case 'KB/s':
      return 1024 * baseLine;
    case 'MB':
    case 'MB/s':
      return 1024 * 1024 * baseLine;
    case 'GB':
    case 'GB/s':
      return 1024 * 1024 * 1024 * baseLine;
    case 'TB':
    case 'TB/s':
      return 1024 * 1024 * 1024 * 1024 * baseLine;
    default:
      return baseLine;
  }
};
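
For instance, with the change applied, a hypothetical 3.5 TB baseline expands to 1024-based bytes:

  getBaseLineByUnit(3.5, 'TB'); // 3.5 * 1024^4 = 3848290697216 bytes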

// Pick a human-readable unit for a raw byte count, dividing by powers of 1024.
export const getProperByteDesc = bytes => {
  const kb = 1024;
  const mb = 1024 * 1024;
  const gb = mb * 1024;
  const tb = gb * 1024;
  const nt = bytes / tb;
  const ng = bytes / gb;
  const nm = bytes / mb;
  const nk = bytes / kb;
  let value = 0;
  let unit = '';

  if (nt >= 1) {
    value = Number(nt.toFixed(2));
    unit = 'TB';
  } else if (ng >= 1) {
    value = Number(ng.toFixed(2));
    unit = 'GB';
  } else if (nm >= 1) {
    value = Number(nm.toFixed(2));
    unit = 'MB';
  } else if (nk >= 1) {
    value = Number(nk.toFixed(2));
    unit = 'KB';
  } else {
    value = bytes;
    unit = 'Bytes';
  }

  return {
    value,
    unit,
    desc: value + unit,
  };
};
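
For example, feeding in the raw size value from the query above, the conversion is now 1024-based:

  // 3982482210816 / 1024^4 ≈ 3.62, where dividing by 1000^4 would give ≈ 3.98
  getProperByteDesc(3982482210816); // => { value: 3.62, unit: 'TB', desc: '3.62TB' }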

One more thing: when there are multiple disks, the disk module on the web page becomes very long. Will a future release change it to pagination, a sliding window, or some other layout? Is there a planned release date? Thanks.

The latest community edition is already in closed beta. Right now the focus is on fixing reported issues, replacing the Node environment, and streamlining deployment; you can follow the detailed changes later. :handshake: Visualization enhancements will go into the enterprise edition first; which features get synced to the community edition, and when, still needs internal discussion.

OK, thanks.
