Nebula cluster CPU spikes abnormally at 12 o'clock (midnight) every day

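  • nebula version: 3.6.0
  • Deployment: distributed
  • Installation method: RPM
  • In production: Y
  • Hardware
    • Disk: ESSD, 8 TB
    • CPU and memory: 64C256G (64 cores, 256 GB)
  • Detailed description of the problem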

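Our company recently set up a new cluster and found that CPU usage rises by about 30% between 0:00 and 4:00 every day. There are no bursts of writes or queries during this period. Is there some internal mechanism in Nebula that would cause this?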

You can take a look at these two articles to learn how NebulaGraph's storage works. My initial suspicion is that your machine is doing compaction.

I checked the compaction log, and there was no significant compaction during that period either.

Compaction log:

le Read Latency Histogram By Level [default] **
2024/01/26-00:01:37.649528 140675236448000 [db/db_impl/db_impl.cc:1069] ------- DUMPING STATS -------
2024/01/26-00:01:37.649564 140675236448000 [db/db_impl/db_impl.cc:1071] 
** DB Stats **
Uptime(secs): 1149247.9 total, 600.2 interval
Cumulative writes: 95M writes, 194M keys, 94M commit groups, 1.0 writes per commit group, ingest: 39.68 GB, 0.04 MB/s
Cumulative WAL: 95M writes, 0 syncs, 95198862.00 writes per sync, written: 39.68 GB, 0.04 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 528 writes, 528 keys, 528 commit groups, 1.0 writes per commit group, ingest: 0.02 MB, 0.00 MB/s
Interval WAL: 528 writes, 0 syncs, 528.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.4      1.4       0.0   1.0      0.0     28.2     50.61             47.56       704    0.072       0      0       0.0       0.0
  L1      1/0    5.07 MB   0.0     29.1     1.4     27.7      28.0      0.3       0.0  20.1     70.5     67.9    422.22            950.99       176    2.399    896M    31M       0.0       0.0
  L2     10/0   307.14 MB   0.1      0.1     0.1      0.1       0.1      0.0       0.3   1.7     29.7     24.6      4.58              4.47         1    4.581   5506K   473K       0.0       0.0
 Sum     11/0   312.20 MB   0.0     29.2     1.5     27.8      29.5      1.8       0.3  21.2     62.7     63.3    477.41           1003.02       881    0.542    902M    31M       0.0       0.0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Low      0/0    0.00 KB   0.0     29.2     1.5     27.8      28.1      0.4       0.0   0.0     70.1     67.5    426.80            955.46       177    2.411    902M    31M       0.0       0.0
High      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.4      1.4       0.0   0.0      0.0     28.2     50.61             47.56       704    0.072       0      0       0.0       0.0

Blob file count: 0, total size: 0.0 GB, garbage size: 0.0 GB, space amp: 0.0

Uptime(secs): 1149247.9 total, 600.2 interval
Flush(GB): cumulative 1.392, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 29.52 GB write, 0.03 MB/s write, 29.21 GB read, 0.03 MB/s read, 477.4 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
Block cache LRUCache@0x7ff21de20430#46191 capacity: 32.00 GB collections: 2207 last_copies: 0 last_secs: 0.180593 secs_since: 0
Block cache entry stats(count,size,portion): DataBlock(1042846,31.90 GB,99.6961%) OtherBlock(1,24.08 KB,7.17584e-05%) Misc(1,0.00 KB,0%)

** File Read Latency Histogram By Level [default] **
2024/01/26-00:11:37.830252 140675236448000 [db/db_impl/db_impl.cc:1069] ------- DUMPING STATS -------
2024/01/26-00:11:37.830287 140675236448000 [db/db_impl/db_impl.cc:1071] 
** DB Stats **
Uptime(secs): 1149848.0 total, 600.2 interval
Cumulative writes: 95M writes, 194M keys, 94M commit groups, 1.0 writes per commit group, ingest: 39.68 GB, 0.04 MB/s
Cumulative WAL: 95M writes, 0 syncs, 95199388.00 writes per sync, written: 39.68 GB, 0.04 MB/s
Cumulative stall: 00:00:0.000 H:M:S, 0.0 percent
Interval writes: 526 writes, 526 keys, 526 commit groups, 1.0 writes per commit group, ingest: 0.02 MB, 0.00 MB/s
Interval WAL: 526 writes, 0 syncs, 526.00 writes per sync, written: 0.00 GB, 0.00 MB/s
Interval stall: 00:00:0.000 H:M:S, 0.0 percent

** Compaction Stats [default] **
Level    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  L0      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.4      1.4       0.0   1.0      0.0     28.2     50.61             47.56       704    0.072       0      0       0.0       0.0
  L1      1/0    5.07 MB   0.0     29.1     1.4     27.7      28.0      0.3       0.0  20.1     70.5     67.9    422.22            950.99       176    2.399    896M    31M       0.0       0.0
  L2     10/0   307.14 MB   0.1      0.1     0.1      0.1       0.1      0.0       0.3   1.7     29.7     24.6      4.58              4.47         1    4.581   5506K   473K       0.0       0.0
 Sum     11/0   312.20 MB   0.0     29.2     1.5     27.8      29.5      1.8       0.3  21.2     62.7     63.3    477.41           1003.02       881    0.542    902M    31M       0.0       0.0
 Int      0/0    0.00 KB   0.0      0.0     0.0      0.0       0.0      0.0       0.0   0.0      0.0      0.0      0.00              0.00         0    0.000       0      0       0.0       0.0

** Compaction Stats [default] **
Priority    Files   Size     Score Read(GB)  Rn(GB) Rnp1(GB) Write(GB) Wnew(GB) Moved(GB) W-Amp Rd(MB/s) Wr(MB/s) Comp(sec) CompMergeCPU(sec) Comp(cnt) Avg(sec) KeyIn KeyDrop Rblob(GB) Wblob(GB)
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Low      0/0    0.00 KB   0.0     29.2     1.5     27.8      28.1      0.4       0.0   0.0     70.1     67.5    426.80            955.46       177    2.411    902M    31M       0.0       0.0
High      0/0    0.00 KB   0.0      0.0     0.0      0.0       1.4      1.4       0.0   0.0      0.0     28.2     50.61             47.56       704    0.072       0      0       0.0       0.0

Blob file count: 0, total size: 0.0 GB, garbage size: 0.0 GB, space amp: 0.0

Uptime(secs): 1149848.0 total, 600.2 interval
Flush(GB): cumulative 1.392, interval 0.000
AddFile(GB): cumulative 0.000, interval 0.000
AddFile(Total Files): cumulative 0, interval 0
AddFile(L0 Files): cumulative 0, interval 0
AddFile(Keys): cumulative 0, interval 0
Cumulative compaction: 29.52 GB write, 0.03 MB/s write, 29.21 GB read, 0.03 MB/s read, 477.4 seconds
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
Stalls(count): 0 level0_slowdown, 0 level0_slowdown_with_compaction, 0 level0_numfiles, 0 level0_numfiles_with_compaction, 0 stop for pending_compaction_bytes, 0 slowdown for pending_compaction_bytes, 0 memtable_compaction, 0 memtable_slowdown, interval 0 total count
Block cache LRUCache@0x7ff21de20430#46191 capacity: 32.00 GB collections: 2208 last_copies: 0 last_secs: 0.180509 secs_since: 0
Block cache entry stats(count,size,portion): DataBlock(1042878,31.90 GB,99.6991%) OtherBlock(1,24.08 KB,7.17584e-05%) Misc(1,0.00 KB,0%)
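
To check the "no significant compaction" reading over a whole night of dumps rather than two, the `Interval compaction:` lines can be extracted mechanically. The following is a minimal sketch, assuming the RocksDB LOG uses the same line layout as the excerpt above; the function names and the `min_seconds` threshold are illustrative choices, not part of NebulaGraph or RocksDB.

```python
import re

# Timestamp on a "DUMPING STATS" header line, e.g. 2024/01/26-00:01:37.649528 ...
DUMP_RE = re.compile(
    r"^(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2})\.\d+ .*DUMPING STATS"
)
# Per-interval compaction summary line, as it appears in the excerpt above.
INTERVAL_RE = re.compile(
    r"^Interval compaction: ([\d.]+) GB write, [\d.]+ MB/s write, "
    r"([\d.]+) GB read, [\d.]+ MB/s read, ([\d.]+) seconds"
)


def interval_compactions(lines):
    """Return one (timestamp, gb_written, gb_read, seconds) tuple per dump."""
    results = []
    current_ts = None
    for line in lines:
        line = line.strip()
        header = DUMP_RE.match(line)
        if header:
            current_ts = header.group(1)
            continue
        stats = INTERVAL_RE.match(line)
        if stats:
            written, read, seconds = (float(g) for g in stats.groups())
            results.append((current_ts, written, read, seconds))
    return results


def busy_intervals(lines, min_seconds=1.0):
    """Keep only the dumps whose interval spent at least min_seconds compacting."""
    return [row for row in interval_compactions(lines) if row[3] >= min_seconds]


# Lines copied from the two dumps in this thread.
SAMPLE = """\
2024/01/26-00:01:37.649528 140675236448000 [db/db_impl/db_impl.cc:1069] ------- DUMPING STATS -------
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
2024/01/26-00:11:37.830252 140675236448000 [db/db_impl/db_impl.cc:1069] ------- DUMPING STATS -------
Interval compaction: 0.00 GB write, 0.00 MB/s write, 0.00 GB read, 0.00 MB/s read, 0.0 seconds
"""

if __name__ == "__main__":
    for row in interval_compactions(SAMPLE.splitlines()):
        print(row)
    print("busy intervals:", busy_intervals(SAMPLE.splitlines()))
```

For a real log, pass the opened LOG file of each storage node in place of `SAMPLE.splitlines()`. If `busy_intervals` stays empty across the 0:00 to 4:00 window, compaction is unlikely to explain the CPU rise.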

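NebulaGraph has no scheduled mechanism like that. Have you configured TTL or anything similar? Could you check which processes are responsible while the CPU usage is elevated?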
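
One way to act on that suggestion is to sample per-process CPU time during the 0:00 to 4:00 window and rank the processes by usage. The sketch below is a Linux-only illustration that reads `/proc/<pid>/stat` twice and compares the `utime + stime` counters; the helper names are hypothetical, and tools such as `top` report the same information interactively.

```python
import os
import time


def parse_stat(stat_text):
    """Parse /proc/<pid>/stat content into (comm, utime + stime in clock ticks)."""
    # comm is wrapped in parentheses and may contain spaces or ')',
    # so split on the LAST closing parenthesis.
    left, right = stat_text.rsplit(")", 1)
    comm = left.split("(", 1)[1]
    fields = right.split()
    # fields[0] is the state (field 3 in proc(5)), so utime (field 14)
    # and stime (field 15) sit at indexes 11 and 12.
    return comm, int(fields[11]) + int(fields[12])


def snapshot(proc_root="/proc"):
    """Map pid -> (comm, cpu_ticks) for every process currently visible."""
    ticks = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                ticks[int(entry)] = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the process exited or is unreadable
    return ticks


def top_consumers(before, after, interval, hz, limit=5):
    """Rank processes by CPU percent over the interval (100 = one full core)."""
    usage = []
    for pid, (comm, total) in after.items():
        if pid in before:
            delta = total - before[pid][1]
            usage.append((100.0 * delta / hz / interval, pid, comm))
    return sorted(usage, reverse=True)[:limit]


def sample_once(interval=5.0):
    """Take two snapshots `interval` seconds apart and print the top consumers."""
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second on this host
    before = snapshot()
    time.sleep(interval)
    after = snapshot()
    for percent, pid, comm in top_consumers(before, after, interval, hz):
        print(f"{percent:7.1f}%  pid={pid}  {comm}")
```

Calling `sample_once()` from a cron job shortly after midnight would show whether the extra CPU belongs to a Nebula service or to something else on the host, such as a backup or monitoring agent.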
