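With the SST ingest command on 12 storage nodes, most nodes finish importing within a few minutes, but a few nodes import the SST data very slowly and appear to hang. As a result the cluster-wide SST ingest takes a long time: importing 80 GB of SST files took 4 hours in total.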

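Question template:

Nebula version: NebulaGraph 2.5.1
Deployment: distributed
Installation method: RPM
Production environment: Y
Hardware
Disk: 2 TB SSD
CPU and memory: 16 cores, 32 GB
Detailed description of the problem
Relevant meta / storage / graph info logs (preferably as text so they are searchable)
1) With the SST ingest command on 12 storage nodes, most nodes finish importing within a few minutes, but a few nodes import the SST data very slowly and appear to hang. As a result the cluster-wide SST ingest takes a long time: importing 80 GB of SST files took 4 hours in total.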

Downloading works now; please help look at the INGEST import problem.

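The logs on most storage nodes show that the ingest finished within a few minutes.

But one or two nodes kept running for 4 hours before finishing. What is the cause?

cc @darionyaphet, could you please take a look?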

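First, a few things need to be confirmed:

Is the amount of data downloaded on each node the same?

Did the nodes that ran for a long time trigger Compaction?

Is there any other information in the logs?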
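
For the first question, one way to compare nodes is to total the `.sst` files under each part's download directory and diff the result between hosts. The following is only a sketch: `sst_bytes_per_part` is a hypothetical helper, and the `download/<part>/<file>.sst` layout is assumed from the paths that appear in the storage logs of this thread.

```python
import os
import sys
from collections import defaultdict


def sst_bytes_per_part(download_dir):
    """Return {part_dir: (file_count, total_bytes)} for *.sst files.

    Assumes a layout like <download_dir>/<part>/<name>.sst, as seen in the
    log paths of this thread; part_dir is the path relative to download_dir.
    """
    totals = defaultdict(lambda: [0, 0])
    for root, _dirs, files in os.walk(download_dir):
        for name in files:
            if not name.endswith(".sst"):
                continue  # ignore anything that is not an SST file
            part = os.path.relpath(root, download_dir)
            totals[part][0] += 1
            totals[part][1] += os.path.getsize(os.path.join(root, name))
    return {part: (count, size) for part, (count, size) in totals.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python sst_sizes.py <path to the space's download directory>
    for part, (count, size) in sorted(sst_bytes_per_part(sys.argv[1]).items()):
        print(f"part {part}: {count} files, {size} bytes")
```

Running it on a fast node and on a slow node and comparing the two outputs shows whether both hold the same number of files and bytes per part.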

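1. I manually downloaded all the SST files to every storage node, so they are identical everywhere.
2. The logs on the slow nodes show no compaction.
3. The logs contain only SST import entries; I did not see anything else.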
I0119 10:55:31.247277 7237 SlowOpTracker.h:33] [Port: 9780, Space: 4874, Part: 61] total time:100ms, Total commit: 1
I0119 10:55:44.632822 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/11/11-532981.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000318.sst, the properties of the table: # data blocks=14380; # entries=111891; # deletions=0; # merge operands=0; # range deletions=0; raw key size=2685384; raw average key size=24.000000; raw value size=56256021; raw average value size=502.775210; data block size=27302230; index block size (user-key? 0, delta-value? 0)=401566; filter block size=0; (estimated) table size=27703796; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:55:44.632863 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/11/11-564.sst
I0119 10:55:51.249320 7238 SlowOpTracker.h:33] [Port: 9780, Space: 1330, Part: 73] total time:100ms, Total commit: 1
I0119 10:55:51.249326 7235 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 73] total time:100ms, Total commit: 1
I0119 10:55:51.249372 7240 SlowOpTracker.h:33] [Port: 9780, Space: 4864, Part: 25] total time:100ms, Total commit: 1
I0119 10:55:51.249399 7251 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 61] total time:73ms, Total commit: 1
I0119 10:55:51.249418 7240 SlowOpTracker.h:33] [Port: 9780, Space: 4874, Part: 13] total time:100ms, Total send logs: 2
I0119 10:55:51.249414 7248 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 13] total time:71ms, Total commit: 1
I0119 10:55:51.249413 7224 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 37] total time:100ms, Total commit: 1
I0119 10:55:51.249410 7243 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 1] total time:100ms, Total commit: 1
I0119 10:55:51.249421 7226 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 25] total time:100ms, Total commit: 1
I0119 10:55:51.249415 7229 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 85] total time:100ms, Total commit: 1
I0119 10:55:51.249418 7239 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 97] total time:100ms, Total commit: 1
I0119 10:55:51.249439 7237 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 49] total time:88ms, Total commit: 1
I0119 10:56:01.254261 7226 SlowOpTracker.h:33] [Port: 9780, Space: 4874, Part: 49] total time:99ms, Write WAL, total 2
I0119 10:56:01.254371 7240 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 97] total time:100ms, Total commit: 1
I0119 10:56:01.254380 7221 SlowOpTracker.h:33] [Port: 9780, Space: 4864, Part: 85] total time:100ms, Total commit: 1
I0119 10:56:07.882120 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/11/11-564.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000319.sst, the properties of the table: # data blocks=22617; # entries=1792560; # deletions=0; # merge operands=0; # range deletions=0; raw key size=73494960; raw average key size=41.000000; raw value size=32266080; raw average value size=18.000000; data block size=37499345; index block size (user-key? 0, delta-value? 0)=744134; filter block size=0; (estimated) table size=38243479; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:07.882164 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/11/11-532729.sst
I0119 10:56:09.873052 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/11/11-532729.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000320.sst, the properties of the table: # data blocks=3721; # entries=7384; # deletions=0; # merge operands=0; # range deletions=0; raw key size=177216; raw average key size=24.000000; raw value size=15238345; raw average value size=2063.697860; data block size=2452688; index block size (user-key? 0, delta-value? 0)=99278; filter block size=0; (estimated) table size=2551966; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:09.873260 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532429.sst
I0119 10:56:11.264294 7238 SlowOpTracker.h:33] [Port: 9780, Space: 4874, Part: 37] total time:86ms, Write WAL, total 2
I0119 10:56:11.264322 7223 SlowOpTracker.h:33] [Port: 9780, Space: 4864, Part: 37] total time:86ms, Total commit: 1
I0119 10:56:11.265573 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532429.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000321.sst, the properties of the table: # data blocks=2449; # entries=4514; # deletions=0; # merge operands=0; # range deletions=0; raw key size=108336; raw average key size=24.000000; raw value size=15656478; raw average value size=3468.426673; data block size=1820796; index block size (user-key? 0, delta-value? 0)=64599; filter block size=0; (estimated) table size=1885395; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:11.265614 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532745.sst
I0119 10:56:13.071242 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532745.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000322.sst, the properties of the table: # data blocks=3787; # entries=7512; # deletions=0; # merge operands=0; # range deletions=0; raw key size=180288; raw average key size=24.000000; raw value size=15483605; raw average value size=2061.182774; data block size=2480061; index block size (user-key? 0, delta-value? 0)=101032; filter block size=0; (estimated) table size=2581093; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:13.071280 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532988.sst
I0119 10:56:21.268271 7229 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 1] total time:100ms, Total commit: 1
I0119 10:56:21.268407 7234 SlowOpTracker.h:33] [Port: 9780, Space: 4864, Part: 73] total time:100ms, Total commit: 1
I0119 10:56:28.501834 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532988.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000323.sst, the properties of the table: # data blocks=14394; # entries=111942; # deletions=0; # merge operands=0; # range deletions=0; raw key size=2686608; raw average key size=24.000000; raw value size=56361470; raw average value size=503.488146; data block size=27348683; index block size (user-key? 0, delta-value? 0)=402009; filter block size=0; (estimated) table size=27750692; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:28.501876 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532740.sst
I0119 10:56:30.404793 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-532740.sst, the internal file path /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/data/000324.sst, the properties of the table: # data blocks=3743; # entries=7425; # deletions=0; # merge operands=0; # range deletions=0; raw key size=178200; raw average key size=24.000000; raw value size=15348841; raw average value size=2067.183973; data block size=2486221; index block size (user-key? 0, delta-value? 0)=99894; filter block size=0; (estimated) table size=2586115; filter policy name=N/A; prefix extractor name=nullptr; column family ID=N/A; column family name=N/A; comparator name=leveldb.BytewiseComparator; merge operator name=nullptr; property collectors names=[]; SST file compression algo=Snappy; SST file compression options=N/A; creation time=0; time stamp of earliest key=0; file creation time=0; DB identity=; DB session identity=;
I0119 10:56:30.404837 7344 NebulaStore.cpp:807] Ingesting extra file: /home/service/var/data/nebula-graph-2.5.1/storaged/nebula/5322/download/12/12-533066.sst
I0119 10:56:31.008916 7344 EventListener.h:96] Ingest external SST file: column family default, the external file path /home/service/var/data/nebul

Each "Ingesting extra file" step does not take long, but there are quite a few of them.
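
To put numbers on the per-file time, the log can be scanned for each `Ingesting extra file` line and the matching `Ingest external SST file` event for the same path. This is a rough sketch, not part of NebulaGraph: `ingest_durations` is a hypothetical helper, and it assumes the first line marks the start of a file's ingest and the second its completion, which is only inferred from the ordering in the log above.

```python
import re
import sys
from datetime import datetime

# glog prefix: I<MMDD> <HH:MM:SS.micros> <thread id> <file>:<line>]
PREFIX = r"^I(\d{4} \d{2}:\d{2}:\d{2}\.\d{6})\s+\d+\s+\S+\] "
START = re.compile(PREFIX + r"Ingesting extra file: (\S+)")
DONE = re.compile(PREFIX + r"Ingest external SST file: .*?the external file path (\S+?),")


def _ts(text):
    # glog omits the year; a fixed leap year is prepended so any date parses.
    # Only differences between timestamps are used.
    return datetime.strptime("2000" + text, "%Y%m%d %H:%M:%S.%f")


def ingest_durations(lines):
    """Yield (sst_path, seconds) for each start/completion pair found."""
    started = {}
    for line in lines:
        m = START.match(line)
        if m:
            started[m.group(2)] = _ts(m.group(1))
            continue
        m = DONE.match(line)
        if m and m.group(2) in started:
            begin = started.pop(m.group(2))
            yield m.group(2), (_ts(m.group(1)) - begin).total_seconds()


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python ingest_times.py <storaged INFO log>; slowest files first.
    with open(sys.argv[1], errors="replace") as fh:
        for path, secs in sorted(ingest_durations(fh), key=lambda r: -r[1]):
            print(f"{secs:10.3f}s  {path}")
```

Under that pairing assumption, the timestamps in the excerpt above give roughly 23 s for 11-564.sst and 15 s for 12-532988.sst, against about 1.4 to 2 s for the smaller files.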

I0119 10:55:51.249439 7237 SlowOpTracker.h:33] [Port: 9780, Space: 4879, Part: 49] total time:88ms, Total commit: 1

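What is going on here?

What does this mean? I don't know.

Does it mean that other graph operations affected the ingest import?

The gaps between the Ingesting entries are not very long, but there are slow-operation logs in between. I wonder whether some other operation is running.

I was not doing anything else during the SST import. Which logs should I check to find out what operation had an impact?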

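Are there still write operations going on, which would trigger a compaction after the ingest?

There were no manual operations; I am the only one testing. Could automatic compaction have been triggered? What operations trigger automatic compaction in NebulaGraph?
Also, I checked and it is turned off, and I saw no compaction logs. Before the SST import I set disable_auto_compactions: true.

Is your disk an SSD or an HDD?

The disk is an SSD.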

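I found that the node with the slow ingest has very little memory left; I am not sure whether that is the problem. Also, this cache stays occupied and is never released. How should I deal with it?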

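Is this the memory state during the ingest?

It is in this state all the time now: free is very low on this one node, and it is not released. How can I get it released?

I ran the ingest test several times and found that this low-memory machine is always the slowest, while the other nodes with enough memory finish right away. This is most likely the problem; how should I handle it?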
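
When comparing the slow node with the fast ones, it may help to separate memory that is truly free from memory held by the Linux page cache, since the kernel can normally reclaim cache when processes need it. The sketch below reads the standard `/proc/meminfo` fields; `parse_meminfo` and `summarize` are hypothetical helper names, and whether low available memory is really what slows the ingest here is not established by this thread.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo content into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def summarize(info):
    """Return free, cached and available memory as percentages of MemTotal.

    MemAvailable is the kernel's own estimate of memory usable without
    swapping; it is missing on very old kernels, in which case 0 is reported.
    """
    total = info["MemTotal"]
    return {
        "free_pct": 100.0 * info.get("MemFree", 0) / total,
        "cached_pct": 100.0 * info.get("Cached", 0) / total,
        "available_pct": 100.0 * info.get("MemAvailable", 0) / total,
    }


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        for name, pct in summarize(parse_meminfo(fh.read())).items():
            print(f"{name}: {pct:.1f}%")
```

A node where `free_pct` is low but `available_pct` is still high is mostly holding cache; a node where both are low is genuinely short of memory, which would be worth comparing between the slow host and the others.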