In some graph spaces, after submit job stats, show job xxx reports 1005:Key not existed!

Question template:

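  • nebula version: 3.0.1
  • Deployment: distributed
  • Installation method: RPM
  • Production environment: N
  • Hardware information
    • Disk
    • CPU and memory information
  • Detailed description of the problem: in some graph spaces, after submit job status, show job xxx reports 1005:Key not existed!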

:thinking: Are you sure you didn't mistype the command? Shouldn't it be submit job stats?

How embarrassing. I mistyped it when writing the post. What I actually ran was submit job stats, followed by show job xxx.

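It's unlikely, but I'd still like to confirm whether this is a Studio problem or a kernel problem. Could you run the statement in Nebula-Console (not the Console inside Studio) and see whether it still fails?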

image
Same error.

Do you have the meta logs? (If meta runs with multiple replicas, we need the meta leader's. (If you don't know which one is the meta leader, just post them all :joy:))

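The meta logs are all empty??? Is something wrong with the database deployment??? Only graph has logs, and the key information recorded there is the same as the error message.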

Is nebula-metad.INFO completely empty? Check what the log level (minloglevel) is set to.

Here is an excerpt of the INFO log:

I20220401 12:41:55.583463 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9669, role = GRAPH
I20220401 12:41:56.399236 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9669, role = GRAPH
I20220401 12:41:58.287043 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9669, role = GRAPH
I20220401 12:42:03.826640 15645 AdminJobProcessor.cpp:28] process() op = ADD, cmd = STATS, paras.size()=1 test_0921
I20220401 12:42:03.826678 15645 AdminJobProcessor.cpp:47] Job has already exists: 288
I20220401 12:42:04.736620 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE
I20220401 12:42:04.749502 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9779, role = STORAGE
I20220401 12:42:04.776947 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9779, role = STORAGE
I20220401 12:42:05.594892 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9669, role = GRAPH
I20220401 12:42:06.410789 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9669, role = GRAPH
I20220401 12:42:08.298684 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9669, role = GRAPH
I20220401 12:42:14.750511 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE
I20220401 12:42:14.762845 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9779, role = STORAGE
I20220401 12:42:14.790170 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9779, role = STORAGE
I20220401 12:42:15.602650 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9669, role = GRAPH
I20220401 12:42:16.422299 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9669, role = GRAPH
I20220401 12:42:18.310148 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9669, role = GRAPH
I20220401 12:42:21.823513 15645 AdminJobProcessor.cpp:28] process() op = SHOW, paras.size()=2 288 test_0921
I20220401 12:42:23.763962 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE
I20220401 12:42:23.776069 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9779, role = STORAGE
I20220401 12:42:23.803406 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9779, role = STORAGE
I20220401 12:42:25.616006 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9669, role = GRAPH
I20220401 12:42:26.433708 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9669, role = GRAPH
I20220401 12:42:28.321748 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9669, role = GRAPH
I20220401 12:42:34.777487 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE
I20220401 12:42:34.789362 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9779, role = STORAGE
I20220401 12:42:34.816625 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9779, role = STORAGE
I20220401 12:42:35.627251 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9669, role = GRAPH
I20220401 12:42:36.440585 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.23":9669, role = GRAPH
I20220401 12:42:38.333384 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.22":9669, role = GRAPH

The minimum log level is set to 0. Any guidance would be appreciated.

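Is this from the moment the error occurred? Ideally, capture the log during a failing run, then filter out the heartbeat lines with grep -v HBProcessor.cpp.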
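If grep is not at hand, the same filtering can be sketched in Python. This is only an illustration: the function name drop_heartbeats is made up here, and the sample lines are copied from the excerpt posted above.

```python
def drop_heartbeats(lines, needle="HBProcessor.cpp"):
    """Keep only the lines that do not contain `needle` (like `grep -v needle`)."""
    return [line for line in lines if needle not in line]


# Sample lines copied from the metad INFO excerpt in this thread.
sample = [
    'I20220401 12:42:03.826640 15645 AdminJobProcessor.cpp:28] process() op = ADD, cmd = STATS, paras.size()=1 test_0921',
    'I20220401 12:42:03.826678 15645 AdminJobProcessor.cpp:47] Job has already exists: 288',
    'I20220401 12:42:04.736620 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE',
    'I20220401 12:42:21.823513 15645 AdminJobProcessor.cpp:28] process() op = SHOW, paras.size()=2 288 test_0921',
]

for line in drop_heartbeats(sample):
    print(line)
```

To run it against a real file, read the log with open(path) and pass the file's lines to drop_heartbeats; the path depends on your deployment.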

The three AdminJobProcessor lines in the log above are the records from the failing run.
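To make those three lines easier to read side by side, here is a rough Python sketch that pulls the operation and the job ID out of AdminJobProcessor records. The regular expressions are assumptions based only on the line layout seen in this excerpt, and the names summarize, ADMIN_JOB, OP and EXISTING_JOB are hypothetical helpers, not part of Nebula.

```python
import re

# Assumed layout, taken from the excerpt in this thread:
#   I20220401 12:42:03.826640 15645 AdminJobProcessor.cpp:28] <message>
ADMIN_JOB = re.compile(
    r"^I(?P<date>\d{8}) (?P<time>[\d:.]+) \d+ AdminJobProcessor\.cpp:\d+\] (?P<msg>.*)$"
)
OP = re.compile(r"op = (\w+)")
EXISTING_JOB = re.compile(r"Job has already exists: (\d+)")


def summarize(lines):
    """Collect (time, kind, value) tuples from AdminJobProcessor log lines."""
    events = []
    for line in lines:
        m = ADMIN_JOB.match(line)
        if not m:
            continue  # heartbeat lines and anything else are skipped
        msg = m.group("msg")
        op = OP.search(msg)
        existing = EXISTING_JOB.search(msg)
        if op:
            events.append((m.group("time"), "op", op.group(1)))
        elif existing:
            events.append((m.group("time"), "existing_job", int(existing.group(1))))
    return events


# Sample lines copied from the metad INFO excerpt in this thread.
sample = [
    'I20220401 12:42:03.826640 15645 AdminJobProcessor.cpp:28] process() op = ADD, cmd = STATS, paras.size()=1 test_0921',
    'I20220401 12:42:03.826678 15645 AdminJobProcessor.cpp:47] Job has already exists: 288',
    'I20220401 12:42:04.736620 15645 HBProcessor.cpp:33] Receive heartbeat from "192.168.8.24":9779, role = STORAGE',
    'I20220401 12:42:21.823513 15645 AdminJobProcessor.cpp:28] process() op = SHOW, paras.size()=2 288 test_0921',
]

for event in summarize(sample):
    print(event)
```

On the excerpt this gives an ADD, then the "already exists" record for job 288, then a SHOW, which is the sequence worth comparing across repeated submit job stats runs.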

test_0921: why does the space name printed here look different from the one in your screenshot?

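The later log comes from a test on a newly created graph space, which reproduced the problem with test0921 from the screenshot I posted this morning.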

Was any space dropped at any point during this process?

No.

What does show jobs return?

show jobs does not list the job ID returned by submit job stats.

Does the returned job ID change each time you run submit job stats now?

Also, has this parameter been changed? FLAGS_job_expired_secs