Statement execution time exceeds expectations

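We have a problem: sometimes a statement is not well optimized or does not perform well, and the query simply hangs partway through. The machine hangs too, and there is no way to stop the statement from the front-end page.
Our scenario has 1.5 million Tag vertices and 3 million edges, and the queries need to traverse between 2 and 10 hops (these vertices and edges cannot be partitioned any further). After an initial exploratory query, the machine went down outright.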

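If the statement hangs, do you have the specific query? Could you add the data volume and the statement, and run it with PROFILE so our developers can take a look?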

Please also share your machine configuration.

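I just went to the server room to reboot the machine, haha.
Machine configuration: 4 cores, 2.40 GHz, 40 GB memory, about 450 GB SSD
Data volume: 130,000 vertices and 1.5 million edges
nebula: single-node deployment in Docker containers, one instance of each service
Cause of the failure: I entered a vid in Studio and ran a 6-hop graph exploration; the page froze and then the machine went down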
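
To give a feel for why a 6-hop exploration can overwhelm a 4-core machine, here is a rough back-of-the-envelope sketch in Python. It assumes every vertex has the average out-degree computed from the figures above (130,000 vertices, 1.5 million edges), which is a simplification: real graphs are usually skewed, and a single hub vertex can make the expansion far larger. The function name `estimate_paths` is made up for this illustration.

```python
def estimate_paths(num_vertices: int, num_edges: int, hops: int) -> float:
    """Rough estimate of the number of paths expanded after `hops` steps,
    assuming every vertex has the average out-degree (a simplification)."""
    avg_degree = num_edges / num_vertices
    return avg_degree ** hops


if __name__ == "__main__":
    # Figures reported in this thread: 130,000 vertices, 1,500,000 edges.
    for k in range(1, 7):
        print(f"{k} hop(s): ~{estimate_paths(130_000, 1_500_000, k):,.0f} paths")
```

The estimate grows geometrically with the hop count, so each extra hop multiplies the intermediate result by roughly the average degree. If the exploration follows edges in both directions, the effective degree is higher still.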

Could you grab the error logs and post them?

storage

I20220419 02:09:56.685847    35 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 21] total time:541ms, Write WAL, total 2
I20220419 02:09:58.441030    41 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 1] total time:1186ms, Write WAL, total 2
I20220419 02:09:58.441000    39 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 68] total time:1186ms, Write WAL, total 2
I20220419 02:09:59.229022    43 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 65] total time:401ms, Write WAL, total 2
I20220419 02:10:04.826126    38 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 86] total time:1362ms, Write WAL, total 2
I20220419 02:10:04.826122    35 SlowOpTracker.h:31] [Port: 9780, Space: 1, Part: 1] total time:2335ms, Total send logs: 2

graph

E20220419 02:15:48.885459    59 MetaClient.cpp:735] Send request to "metad0":9559, exceed retry limit
E20220419 02:16:17.542914    59 MetaClient.cpp:736] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:16:42.546355    59 GraphSessionManager.cpp:222] Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:17:00.522236    52 GraphSessionManager.cpp:246] Update sessions failed: Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:21:05.131572    42 MetaClient.cpp:735] Send request to "metad0":9559, exceed retry limit
E20220419 02:21:14.495889    42 MetaClient.cpp:736] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:21:28.106245    51 MetaClient.cpp:171] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:24:26.130075    43 MetaClient.cpp:735] Send request to "metad0":9559, exceed retry limit
E20220419 02:24:46.109652    43 MetaClient.cpp:736] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:25:10.510530    43 GraphSessionManager.cpp:222] Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:25:35.496979    52 GraphSessionManager.cpp:246] Update sessions failed: Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:27:53.872589    44 MetaClient.cpp:735] Send request to "metad0":9559, exceed retry limit
E20220419 02:28:12.221868    44 MetaClient.cpp:736] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:28:30.082262    51 MetaClient.cpp:171] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220419 02:32:54.326568    45 MetaClient.cpp:735] Send request to "metad0":9559, exceed retry limit

meta

I20220419 02:09:38.831596   123 HBProcessor.cpp:33] Receive heartbeat from "storaged0":9779, role = STORAGE
I20220419 02:09:45.977432   123 HBProcessor.cpp:33] Receive heartbeat from "graphd":9669, role = GRAPH
I20220419 02:09:47.476292    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:101ms, Total commit: 1
I20220419 02:09:48.866307   123 HBProcessor.cpp:33] Receive heartbeat from "storaged0":9779, role = STORAGE
I20220419 02:09:57.907231    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:729ms, Write WAL, total 2
I20220419 02:10:13.864542    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:3851ms, Total send logs: 2
I20220419 02:10:17.368753   123 HBProcessor.cpp:33] Receive heartbeat from "graphd":9669, role = GRAPH
I20220419 02:10:27.829978   122 HBProcessor.cpp:33] Receive heartbeat from "storaged0":9779, role = STORAGE
I20220419 02:10:29.152884    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:10731ms, Total commit: 1
I20220419 02:10:30.693095    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:319ms, Write WAL, total 2
I20220419 02:10:34.235693    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:661ms, Total send logs: 2
I20220419 02:10:40.078974    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:4431ms, Total commit: 1
I20220419 02:10:46.363931    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:139ms, Write WAL, total 2
I20220419 02:10:48.154943    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:714ms, Total send logs: 2
I20220419 02:10:51.888608    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:2111ms, Total commit: 1
I20220419 02:10:56.706192    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:806ms, Write WAL, total 2
I20220419 02:10:58.987819    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:442ms, Total send logs: 2
I20220419 02:11:03.965934    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:2467ms, Total commit: 1
I20220419 02:11:12.359587    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:299ms, Write WAL, total 2
I20220419 02:11:13.151219   123 HBProcessor.cpp:33] Receive heartbeat from "graphd":9669, role = GRAPH
I20220419 02:11:17.564379    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:1304ms, Total send logs: 2
I20220419 02:11:24.064764    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:5186ms, Total commit: 1
I20220419 02:11:25.734196    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:271ms, Write WAL, total 2
I20220419 02:11:29.816751    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:1474ms, Total send logs: 2
I20220419 02:11:38.079864    42 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:5505ms, Total commit: 1
I20220419 02:11:42.891696   121 HBProcessor.cpp:33] Receive heartbeat from "storaged0":9779, role = STORAGE
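
For anyone triaging logs like these, a small Python sketch can pull out the slowest operation of each kind from the `SlowOpTracker` lines. It only assumes the line layout visible in the logs above (`total time:<N>ms, <description>`); the function name `summarize_slow_ops` is made up for this illustration and is not part of NebulaGraph.

```python
import re
from collections import defaultdict

# Matches the tail of a SlowOpTracker line, e.g.
# "... total time:10731ms, Total commit: 1"
SLOW_OP = re.compile(r"total time:(\d+)ms, (.+)$")


def summarize_slow_ops(lines):
    """Return {operation description: worst observed time in ms}."""
    worst = defaultdict(int)
    for line in lines:
        match = SLOW_OP.search(line.rstrip())
        if match is None:
            continue  # heartbeat lines, error lines, and so on
        elapsed_ms = int(match.group(1))
        operation = match.group(2).strip()
        worst[operation] = max(worst[operation], elapsed_ms)
    return dict(worst)


if __name__ == "__main__":
    sample = [
        'I20220419 02:10:13.864542    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:3851ms, Total send logs: 2',
        'I20220419 02:10:29.152884    43 SlowOpTracker.h:31] [Port: 9560, Space: 0, Part: 0] total time:10731ms, Total commit: 1',
        'I20220419 02:10:17.368753   123 HBProcessor.cpp:33] Receive heartbeat from "graphd":9669, role = GRAPH',
    ]
    for operation, elapsed_ms in summarize_slow_ops(sample).items():
        print(f"{operation}: {elapsed_ms} ms")
```

In the meta log above, commits taking several seconds sit alongside the graphd heartbeat timeouts, which suggests the whole host was starved rather than a single service misbehaving.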

Did graph or storage hit OOM?

Which version?

3.0.2

Here is an example, a query I just ran; see the image:


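This query runs with no time limit and still has not returned. Earlier I used a count() aggregation, but it immediately reported insufficient system memory (I did put some restrictions in it).
Is this a problem with the statement? Does it need to be optimized? Here is the execution plan: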

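With limit 100 it returned, taking 11 s. In Docker I saw that, right from the start of the query, storaged memory usage went up to 50% with CPU above 100%, and then graphd was at CPU above 100% and 80% memory.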

Run explain and take a look.

There is an image above, isn't there? The one in dot format.

Not the dot format, please.
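
As a hedged illustration of asking for the plan in row format rather than dot, the Python helper below simply prefixes a statement with `EXPLAIN format="row"` or `PROFILE format="row"`. The helper name `with_plan` and the sample `GO` statement are hypothetical; the query actually run in this thread is not shown, so the statement is only a stand-in with a placeholder vid.

```python
def with_plan(statement: str, mode: str = "EXPLAIN", fmt: str = "row") -> str:
    """Prefix an nGQL statement with EXPLAIN or PROFILE and a plan format."""
    if mode not in ("EXPLAIN", "PROFILE"):
        raise ValueError("mode must be 'EXPLAIN' or 'PROFILE'")
    if fmt not in ("row", "dot"):
        raise ValueError("fmt must be 'row' or 'dot'")
    return f'{mode} format="{fmt}" {statement}'


if __name__ == "__main__":
    # Hypothetical statement with a placeholder vid, not the poster's query.
    query = 'GO 1 TO 6 STEPS FROM "some_vid" OVER * YIELD dst(edge) AS dst | LIMIT 100'
    print(with_plan(query))                  # plan only, statement is not run
    print(with_plan(query, mode="PROFILE"))  # runs the statement and reports timings
```

Because PROFILE actually executes the statement, EXPLAIN is the safer choice for a query that has already taken the machine down once.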

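There is not much that can be optimized in the plan. It looks like some unnecessary edge properties and the Filter operator could be removed, but that will have to wait for the next release.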
