1. Does the full table scan happen for queries that can't pick a suitable index?
It should be; no index was created or used. In theory, when the vid is specified, no index should be needed, right?
For example, this one?
match (v)-[e:edge_attack*1..6]->(v2) where id(v)== "172.16.33.1" return e limit 100
2. The memory issue
From what I can see so far, the full table scan drives memory usage up and then graphd's docker container restarts.
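As a side note, SHOW EDGE INDEXES lists the edge indexes that exist in the current space; judging from the IndexScan on IndexID 335 in the plan further down, an edge index on edge_attack does seem to exist.
// sketch: list the edge indexes defined in the current space
SHOW EDGE INDEXES;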
Another example:
match (v)-[e:edge_attack]->(v2) return e limit 10
On nebula the response takes 10s,
while on neo4j it only takes 2ms.
From nebula's profile you can see that a full table scan really does happen. Beyond that, there are many other cases where the limit of a match statement cannot achieve an early stop.
The one above should be the simplest example, so match statements are basically in an unusable state. Even if they are used, it is probably only for offline data analysis; for online scenarios, statements like this are completely unusable.
Got 10 rows (time spent 14.305845s/14.320751453s)
Execution Plan (optimize time 242 us)
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| id | name | dependencies | profiling data | operator info |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 11 | Project | 10 | { | outputVar: { |
| | | | "execTime": "2169(us)", | "colNames": [ |
| | | | "rows": 10, | "e" |
| | | | "totalTime": "2178(us)", | ], |
| | | | "version": 0 | "type": "DATASET", |
| | | | } | "name": "__Limit_8" |
| | | | | } |
| | | | | inputVar: __Limit_10 |
| | | | | columns: [ |
| | | | | "$-.e[0]" |
| | | | | ] |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 10 | Limit | 5 | { | outputVar: { |
| | | | "execTime": "170941(us)", | "colNames": [ |
| | | | "rows": 10, | "v", |
| | | | "totalTime": "170954(us)", | "e", |
| | | | "version": 0 | "v2" |
| | | | } | ], |
| | | | | "type": "DATASET", |
| | | | | "name": "__Limit_10" |
| | | | | } |
| | | | | inputVar: __AppendVertices_5 |
| | | | | offset: 0 |
| | | | | count: 10 |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 5 | AppendVertices | 4 | { | outputVar: { |
| | | | "execTime": "810858(us)", | "colNames": [ |
| | | | "resp[0]": { | "v", |
| | | | "exec": "14550(us)", | "e", |
| | | | "host": "storaged:9779", | "v2" |
| | | | "total": "15381(us)" | ], |
| | | | }, | "type": "DATASET", |
| | | | "rows": 736460, | "name": "__AppendVertices_5" |
| | | | "totalTime": "826505(us)", | } |
| | | | "total_rpc": "15933(us)", | inputVar: __Traverse_4 |
| | | | "version": 0 | space: 330 |
| | | | } | dedup: true |
| | | | | limit: -1 |
| | | | | filter: |
| | | | | orderBy: [] |
| | | | | src: none_direct_dst($-.e) |
| | | | | props: [ |
| | | | | { |
| | | | | "props": [ |
| | | | | "_tag" |
| | | | | ], |
| | | | | "tagId": 332 |
| | | | | }, |
| | | | | { |
| | | | | "props": [ |
| | | | | "_tag" |
| | | | | ], |
| | | | | "tagId": 331 |
| | | | | } |
| | | | | ] |
| | | | | exprs: |
| | | | | vertex_filter: |
| | | | | if_track_previous_path: true |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 4 | Traverse | 2 | { | outputVar: { |
| | | | "execTime": "4746657(us)", | "colNames": [ |
| | | | "rows": 736460, | "v", |
| | | | "step[1]": [ | "e" |
| | | | { | ], |
| | | | "exec": "4015017(us)", | "type": "DATASET", |
| | | | "host": "storaged:9779", | "name": "__Traverse_4" |
| | | | "storage_detail": { | } |
| | | | "GetNeighborsNode": "4013851(us)", | inputVar: __Dedup_2 |
| | | | "HashJoinNode": "4648(us)", | space: 330 |
| | | | "RelNode": "4013879(us)", | dedup: true |
| | | | "SingleEdgeNode": "4551(us)" | limit: -1 |
| | | | }, | filter: |
| | | | "total": "6667161(us)", | orderBy: [] |
| | | | "total_rpc_time": "6667544(us)", | src: $_vid |
| | | | "vertices": 86 | edgeTypes: [] |
| | | | } | edgeDirection: OUT_EDGE |
| | | | ], | vertexProps: |
| | | | "totalTime": "11414242(us)", | edgeProps: [ |
| | | | "version": 0 | { |
| | | | } | "props": [ |
| | | | | "_src", |
| | | | | "_type", |
| | | | | "_rank", |
| | | | | "_dst", |
| | | | | "name", |
| | | | | "proto", |
| | | | | "app_proto", |
| | | | | "tool", |
| | | | | "attacker_port", |
| | | | | "victim_port", |
| | | | | "phase", |
| | | | | "result", |
| | | | | "tag", |
| | | | | "severity", |
| | | | | "event_time", |
| | | | | "flow_id" |
| | | | | ], |
| | | | | "type": 334 |
| | | | | } |
| | | | | ] |
| | | | | statProps: |
| | | | | exprs: |
| | | | | random: false |
| | | | | steps: 1..1 |
| | | | | vertex filter: |
| | | | | edge filter: |
| | | | | if_track_previous_path: false |
| | | | | first step filter: |
| | | | | tag filter: |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 2 | Dedup | 1 | { | outputVar: { |
| | | | "execTime": "121272(us)", | "colNames": [ |
| | | | "rows": 86, | "_vid" |
| | | | "totalTime": "121284(us)", | ], |
| | | | "version": 0 | "type": "DATASET", |
| | | | } | "name": "__Dedup_2" |
| | | | | } |
| | | | | inputVar: __IndexScan_1 |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 1 | IndexScan | 3 | { | outputVar: { |
| | | | "execTime": "0(us)", | "colNames": [ |
| | | | "resp[0]": { | "_vid" |
| | | | "exec": "1505342(us)", | ], |
| | | | "host": "storaged:9779", | "type": "DATASET", |
| | | | "storage_detail": { | "name": "__IndexScan_1" |
| | | | "IndexEdgeScanNode(IndexID=335, Path=())": "1227079(us)", | } |
| | | | "IndexLimitNode(limit=9223372036854775807)": "1227075(us)", | inputVar: |
| | | | "IndexProjectionNode(projectColumn=[_src])": "1227078(us)" | space: 330 |
| | | | }, | dedup: false |
| | | | "total": "1690639(us)" | limit: 9223372036854775807 |
| | | | }, | filter: |
| | | | "rows": 736460, | orderBy: [] |
| | | | "totalTime": "1768582(us)", | schemaId: 334 |
| | | | "version": 0 | isEdge: true |
| | | | } | returnCols: [ |
| | | | | "_src" |
| | | | | ] |
| | | | | indexCtx: [ |
| | | | | { |
| | | | | "columnHints": [], |
| | | | | "filter": "", |
| | | | | "index_id": 335 |
| | | | | } |
| | | | | ] |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
| 3 | Start | | { | outputVar: { |
| | | | "execTime": "1(us)", | "colNames": [], |
| | | | "rows": 0, | "type": "DATASET", |
| | | | "totalTime": "48(us)", | "name": "__Start_3" |
| | | | "version": 0 | } |
| | | | } | |
-----+----------------+--------------+-------------------------------------------------------------------+---------------------------------
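If the goal is just to sample a handful of edges of one type, a LOOKUP over the edge index with a piped LIMIT might be a cheaper interim workaround (a sketch, assuming NebulaGraph 3.x LOOKUP YIELD syntax; whether the LIMIT gets pushed down to the storage-side index scan depends on the version, so it is worth verifying with PROFILE):
// sketch only: sample 10 edge_attack edges via the existing edge index
LOOKUP ON edge_attack YIELD edge AS e | LIMIT 10;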
Tue, 28 Mar 2023 07:58:23 UTC
Thanks for this feedback. nebula indeed still has a lot of optimization work to do. For this statement there are two optimizations that are not handled well yet: one is column pruning, the other is limit pushdown. I have filed issues for both and will add the corresponding rules to the optimizer.
The query below, however, should not involve a full table scan:
match (v)-[e:edge_attack*1..6]->(v2) where id(v)== "172.16.33.1" return e limit 100
If it's convenient, could you paste the corresponding plan again? Locally I see it expands directly from the given id.
Thanks again for the feedback.
The scenario here is that there are quite a few edges with different ranks between the two vertices, so 1..6 currently makes graphd's memory usage too high and the process restarts.
So I shortened the number of steps:
profile match (v)-[e:edge_attack*1..2]->(v2) where id(v)== "172.16.33.1" return e limit 100
Got 100 rows (time spent 28.081838s/28.107994682s)
Execution Plan (optimize time 222 us)
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| id | name | dependencies | profiling data | operator info |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 11 | Project | 10 | { | outputVar: { |
| | | | "execTime": "23118(us)", | "colNames": [ |
| | | | "rows": 100, | "e" |
| | | | "totalTime": "23124(us)", | ], |
| | | | "version": 0 | "type": "DATASET", |
| | | | } | "name": "__Limit_8" |
| | | | | } |
| | | | | inputVar: __Limit_10 |
| | | | | columns: [ |
| | | | | "[__VAR_1 IN $-.e WHERE is_edge($__VAR_1)]" |
| | | | | ] |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 10 | Limit | 5 | { | outputVar: { |
| | | | "execTime": "4363810(us)", | "colNames": [ |
| | | | "rows": 100, | "v", |
| | | | "totalTime": "4363820(us)", | "e", |
| | | | "version": 0 | "v2" |
| | | | } | ], |
| | | | | "type": "DATASET", |
| | | | | "name": "__Limit_10" |
| | | | | } |
| | | | | inputVar: __AppendVertices_5 |
| | | | | offset: 0 |
| | | | | count: 100 |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 5 | AppendVertices | 4 | { | outputVar: { |
| | | | "execTime": "11375379(us)", | "colNames": [ |
| | | | "resp[0]": { | "v", |
| | | | "exec": "30692(us)", | "e", |
| | | | "host": "storaged:9779", | "v2" |
| | | | "total": "32106(us)" | ], |
| | | | }, | "type": "DATASET", |
| | | | "rows": 12332977, | "name": "__AppendVertices_5" |
| | | | "totalTime": "11407795(us)", | } |
| | | | "total_rpc": "33668(us)", | inputVar: __Traverse_4 |
| | | | "version": 0 | space: 330 |
| | | | } | dedup: true |
| | | | | limit: -1 |
| | | | | filter: |
| | | | | orderBy: [] |
| | | | | src: none_direct_dst($-.e) |
| | | | | props: [ |
| | | | | { |
| | | | | "props": [ |
| | | | | "_tag" |
| | | | | ], |
| | | | | "tagId": 332 |
| | | | | }, |
| | | | | { |
| | | | | "props": [ |
| | | | | "_tag" |
| | | | | ], |
| | | | | "tagId": 331 |
| | | | | } |
| | | | | ] |
| | | | | exprs: |
| | | | | vertex_filter: |
| | | | | if_track_previous_path: true |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 4 | Traverse | 2 | { | outputVar: { |
| | | | "execTime": "12027550(us)", | "colNames": [ |
| | | | "rows": 12332977, | "v", |
| | | | "step[1]": [ | "e" |
| | | | { | ], |
| | | | "exec": "158854(us)", | "type": "DATASET", |
| | | | "host": "storaged:9779", | "name": "__Traverse_4" |
| | | | "storage_detail": { | } |
| | | | "GetNeighborsNode": "157079(us)", | inputVar: __Dedup_2 |
| | | | "HashJoinNode": "6590(us)", | space: 330 |
| | | | "RelNode": "157080(us)", | dedup: true |
| | | | "SingleEdgeNode": "6587(us)" | limit: -1 |
| | | | }, | filter: |
| | | | "total": "260731(us)", | orderBy: [] |
| | | | "total_rpc_time": "261870(us)", | src: $-._vid |
| | | | "vertices": 1 | edgeTypes: [] |
| | | | } | edgeDirection: OUT_EDGE |
| | | | ], | vertexProps: |
| | | | "step[2]": [ | edgeProps: [ |
| | | | { | { |
| | | | "exec": "7853(us)", | "props": [ |
| | | | "host": "storaged:9779", | "_src", |
| | | | "storage_detail": { | "_type", |
| | | | "GetNeighborsNode": "7322(us)", | "_rank", |
| | | | "HashJoinNode": "970(us)", | "_dst", |
| | | | "RelNode": "7323(us)", | "name", |
| | | | "SingleEdgeNode": "962(us)" | "proto", |
| | | | }, | "app_proto", |
| | | | "total": "12948(us)", | "tool", |
| | | | "total_rpc_time": "13017(us)", | "attacker_port", |
| | | | "vertices": 7 | "victim_port", |
| | | | } | "phase", |
| | | | ], | "result", |
| | | | "totalTime": "12284637(us)", | "tag", |
| | | | "version": 0 | "severity", |
| | | | } | "event_time", |
| | | | | "flow_id" |
| | | | | ], |
| | | | | "type": 334 |
| | | | | } |
| | | | | ] |
| | | | | statProps: |
| | | | | exprs: |
| | | | | random: false |
| | | | | steps: 1..2 |
| | | | | vertex filter: |
| | | | | edge filter: |
| | | | | if_track_previous_path: false |
| | | | | first step filter: |
| | | | | tag filter: |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 2 | Dedup | 1 | { | outputVar: { |
| | | | "execTime": "13(us)", | "colNames": [ |
| | | | "rows": 1, | "_vid" |
| | | | "totalTime": "14(us)", | ], |
| | | | "version": 0 | "type": "DATASET", |
| | | | } | "name": "__Dedup_2" |
| | | | | } |
| | | | | inputVar: __VAR_0 |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 1 | PassThrough | 3 | { | outputVar: { |
| | | | "execTime": "7(us)", | "colNames": [ |
| | | | "rows": 0, | "_vid" |
| | | | "totalTime": "10(us)", | ], |
| | | | "version": 0 | "type": "DATASET", |
| | | | } | "name": "__VAR_0" |
| | | | | } |
| | | | | inputVar: |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
| 3 | Start | | { | outputVar: { |
| | | | "execTime": "0(us)", | "colNames": [], |
| | | | "rows": 0, | "type": "DATASET", |
| | | | "totalTime": "22(us)", | "name": "__Start_3" |
| | | | "version": 0 | } |
| | | | } | |
-----+----------------+--------------+-------------------------------------------+------------------------------------------------
Tue, 28 Mar 2023 11:03:42 UTC
It feels like the match scan is breadth-first and the filtering only starts after all the edges have been matched, so maybe this is hard to optimize? I haven't tested neo4j's limit mechanism, so I don't know whether it works the same way.
That is also why I wanted to switch to a GO statement earlier, but GO's limit is in turn applied before the where condition.
Here are the statistics:
(root@nebula) [quanxi]> show stats
+---------+---------------+--------+
| Type    | Name          | Count  |
+---------+---------------+--------+
| "Tag"   | "tag_attack"  | 0      |
| "Tag"   | "tag_ip"      | 163    |
| "Edge"  | "edge_attack" | 736460 |
| "Edge"  | "edge_ref"    | 0      |
| "Space" | "vertices"    | 163    |
| "Space" | "edges"       | 736460 |
+---------+---------------+--------+
Have a look.
Yes, nebula's expansion is always done via breadth-first search. If the original GO is converted into a MATCH, there is actually an additional filter condition, similar to the statement below:
MATCH (v)-[e:edge_attack*1..6]-(v2)
WHERE id(v)=="172.16.33.1" AND all(i in e WHERE i.event_time<1679888603187 AND i.event_time>1679888501000)
RETURN DISTINCT e
If this optimization is merged later, I believe it will alleviate your problem.
As for the limit optimization, I have also filed an issue and will push it forward; the approach under consideration is switching to depth-first traversal.
Thanks for providing your scenario; we hope to polish nebula together and make it easier and better to use.
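As an aside, a rough sketch of the GO form referred to above (assuming NebulaGraph 3.x YIELD edge AS e syntax) could look like the following; note that the piped | LIMIT is applied only after the traversal and the WHERE filter, so it does not give the early-stop behaviour discussed in this thread:
// rough sketch of the GO equivalent of the MATCH above
GO 1 TO 6 STEPS FROM "172.16.33.1" OVER edge_attack
WHERE edge_attack.event_time < 1679888603187 AND edge_attack.event_time > 1679888501000
YIELD DISTINCT edge AS e
| LIMIT 100;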
Thanks for the reply. Also, is there a plan to improve GO's limit soon, i.e. the ordering issue between limit and the where condition?
Regarding the execution order of GO's limit and where: our internal conclusion is that we will not adjust it any further; as I explained before, the reason lies in the original design of GO's semantics.
Going forward, more of the optimization is expected to land in match, where there is more room for improvement. If your need is urgent right now, you can, following your original setup, use a somewhat larger limit for the time being to avoid truncating the results.
OK, understood. I still hope this can be optimized soon, since it is the difference between unusable and usable; otherwise match statements really become dispensable.
Also, setting a somewhat larger limit does not solve the problem, because the filter conditions are not fixed. So for now this feature is also, in my view, in an unusable state.