Edge rank filtering issue

Since rank is part of the edge key, why is filtering by rank still slow when traversing from a given vertex over a specified edge type? For example:

GO FROM 'a_1' OVER operate WHERE rank(edge) <= 1694422913 YIELD src(edge), dst(edge), rank(edge);

I profiled it, and the filter does not appear to be pushed down, so the filtering only happens after the edges have been fetched.

Just rephrase the condition: change it to WHERE operate._rank < 1694422913

before

profile GO FROM 'player100' OVER follow WHERE rank(edge) < 100 YIELD src(edge), dst(edge), rank(edge)

-----+-----------+--------------+--------------------------------------+------------------------------
|  3 | ExpandAll | 2            | {                                    | outputVar: {                |
|    |           |              |   "execTime": "117(us)",             |   "colNames": [             |
|    |           |              |   "graphExpandAllTime+2": "57(us)",  |     "EDGE",                 |
|    |           |              |   "resp[0]": {                       |     "__COL_0"               |
|    |           |              |     "exec": "584(us)",               |   ],                        |
|    |           |              |     "host": "storaged2:9779",        |   "type": "DATASET",        |
|    |           |              |     "storage_detail": {              |   "name": "__ExpandAll_3"   |
|    |           |              |       "GetNeighborsNode": "182(us)", | }                           |
|    |           |              |       "HashJoinNode": "160(us)",     | inputVar: __Expand_2        |
|    |           |              |       "RelNode": "183(us)",          | space: 1                    |
|    |           |              |       "SingleEdgeNode": "157(us)"    | dedup: 0                    |
|    |           |              |     },                               | limit: -1                   |
|    |           |              |     "total": "1596(us)"              | filter:                     | #<---- filter is empty: the rank(edge) expression is not recognized here, so nothing is pushed down
|    |           |              |   },                                 | orderBy: []                 |
|    |           |              |   "rows": 2,                         | sample: false               |
|    |           |              |   "totalTime": "1924(us)",           | joinInput: false            |
|    |           |              |   "version": 0                       | maxSteps: 1                 |
|    |           |              | }                                    | edgeProps: [                |
|    |           |              |                                      |   {                         |
|    |           |              |                                      |     "props": [              |
|    |           |              |                                      |       "_dst",               |
|    |           |              |                                      |       "_rank",              |
|    |           |              |                                      |       "_src",               |
|    |           |              |                                      |       "_type",              |
|    |           |              |                                      |       "degree"              |
|    |           |              |                                      |     ],                      |
|    |           |              |                                      |     "type": 5               |
|    |           |              |                                      |   }                         |
|    |           |              |                                      | ]                           |
|    |           |              |                                      | stepLimits: []              |
|    |           |              |                                      | minSteps: 1                 |
|    |           |              |                                      | vertexProps:                |
|    |           |              |                                      | vertexColumns: []           |
|    |           |              |                                      | edgeColumns: [              |
|    |           |              |                                      |   "EDGE AS EDGE",           |
|    |           |              |                                      |   "*._rank AS __COL_0"      |
|    |           |              |                                      | ]                           |
-----+-----------+--------------+--------------------------------------+------------------------------

after

profile GO FROM 'player100' OVER follow WHERE follow._rank < 100 YIELD src(edge), dst(edge), rank(edge)

-----+-----------+--------------+-------------------------------------+-------------------------------
|  6 | ExpandAll | 2            | {                                   | outputVar: {                 |
|    |           |              |   "execTime": "112(us)",            |   "colNames": [              |
|    |           |              |   "graphExpandAllTime+2": "62(us)", |     "__COL_0",               |
|    |           |              |   "resp[0]": {                      |     "EDGE",                  |
|    |           |              |     "exec": "371(us)",              |     "__COL_1"                |
|    |           |              |     "host": "storaged2:9779",       |   ],                         |
|    |           |              |     "storage_detail": {             |   "type": "DATASET",         |
|    |           |              |       "FilterNode": "68(us)",       |   "name": "__Filter_4"       |
|    |           |              |       "GetNeighborsNode": "85(us)", | }                            |
|    |           |              |       "HashJoinNode": "63(us)",     | inputVar: __Expand_2         |
|    |           |              |       "RelNode": "86(us)",          | space: 1                     |
|    |           |              |       "SingleEdgeNode": "60(us)"    | dedup: 0                     |
|    |           |              |     },                              | limit: -1                    |
|    |           |              |     "total": "1367(us)"             | filter: (follow._rank<100)   |  # the filter condition is now applied at the storage layer
|    |           |              |   },                                | orderBy: []                  |
|    |           |              |   "rows": 2,                        | sample: false                |
|    |           |              |   "totalTime": "1728(us)",          | joinInput: false             |
|    |           |              |   "version": 0                      | maxSteps: 1                  |
|    |           |              | }                                   | edgeProps: [                 |
|    |           |              |                                     |   {                          |
|    |           |              |                                     |     "props": [               |
|    |           |              |                                     |       "_dst",                |
|    |           |              |                                     |       "_rank",               |
|    |           |              |                                     |       "_src",                |
|    |           |              |                                     |       "_type",               |
|    |           |              |                                     |       "degree"               |
|    |           |              |                                     |     ],                       |
|    |           |              |                                     |     "type": 5                |
|    |           |              |                                     |   }                          |
|    |           |              |                                     | ]                            |
|    |           |              |                                     | stepLimits: []               |
|    |           |              |                                     | minSteps: 1                  |
|    |           |              |                                     | vertexProps:                 |
|    |           |              |                                     | vertexColumns: []            |
|    |           |              |                                     | edgeColumns: [               |
|    |           |              |                                     |   "follow._rank AS __COL_0", |
|    |           |              |                                     |   "EDGE AS EDGE",            |
|    |           |              |                                     |   "*._rank AS __COL_1"       |
|    |           |              |                                     | ]                            |
-----+-----------+--------------+-------------------------------------+-------------------------------

This is a typical optimization approach mentioned in https://www.siwei.io/ngql-execution-plan/
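To make the rewrite concrete for the query in the original question, here is a minimal Python sketch that swaps `rank(edge)` for `<edge_type>._rank` inside the WHERE clause only. The helper name `rewrite_rank_filter` is hypothetical and not part of any NebulaGraph client. It assumes a simple `GO ... WHERE ... YIELD ...` statement and does no real nGQL parsing, so a string literal containing WHERE or YIELD would confuse it.

```python
import re


def rewrite_rank_filter(query: str, edge_type: str) -> str:
    """Replace rank(edge) with <edge_type>._rank inside the WHERE clause only.

    Hypothetical helper: a naive regex rewrite, not an nGQL parser.
    """
    # Split into: text up to and including WHERE, the condition, and YIELD onwards.
    match = re.search(r"(?is)^(.*?\bWHERE\b)(.*?)(\bYIELD\b.*)$", query)
    if match is None:
        return query  # not a "... WHERE ... YIELD ..." statement; leave it untouched
    head, condition, tail = match.groups()
    # Only the condition is rewritten; rank(edge) in the YIELD clause stays as it is.
    condition = re.sub(r"(?i)\brank\s*\(\s*edge\s*\)", f"{edge_type}._rank", condition)
    return head + condition + tail


original = (
    "GO FROM 'a_1' OVER operate WHERE rank(edge) <= 1694422913 "
    "YIELD src(edge), dst(edge), rank(edge);"
)
rewritten = rewrite_rank_filter(original, "operate")
print(rewritten)
```

The YIELD clause is left alone on purpose: in the example above only the WHERE expression had to change for the filter to show up on the operator.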

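Thanks. One more question: once the rank condition is pushed down, does storaged have to scan all edges (all edges of the specified type from the specified source vertex) and then filter out the ones that satisfy the condition? Or can it use the rank condition to scan only the matching edges?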

As for the scan, it should be a full scan, but what gets returned to graphd is the result after the filter.
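A toy model of that behaviour, as a rough sketch only: the function `get_neighbors` below is a hypothetical stand-in written for illustration, not the real storaged code path, and the edge list is made-up data. It assumes, as described above, that every edge of the source vertex is scanned either way and that pushdown only changes how many rows travel back to graphd.

```python
from typing import Callable, List, Optional, Tuple

Edge = Tuple[str, str, int]  # (src, dst, rank)


def get_neighbors(
    edges: List[Edge],
    src: str,
    rank_filter: Optional[Callable[[int], bool]] = None,
) -> Tuple[int, List[Edge]]:
    """Toy storage-side scan: returns (rows scanned, rows sent back to graphd)."""
    scanned = 0
    returned: List[Edge] = []
    for edge in edges:
        if edge[0] != src:
            continue
        scanned += 1  # every edge of the source vertex is visited either way
        if rank_filter is None or rank_filter(edge[2]):
            returned.append(edge)
    return scanned, returned


# Made-up data: one vertex with 10,000 outgoing edges, ranks 0..9999.
edges = [("a_1", f"b_{i}", i) for i in range(10_000)]

# No pushdown: storage returns everything, the filter runs afterwards on the graphd side.
scanned_plain, rows_plain = get_neighbors(edges, "a_1")
kept_plain = [e for e in rows_plain if e[2] < 100]

# Pushdown: the same scan, but only matching rows are returned.
scanned_pushed, rows_pushed = get_neighbors(edges, "a_1", lambda rank: rank < 100)

print(scanned_plain, len(rows_plain), len(kept_plain))  # 10000 10000 100
print(scanned_pushed, len(rows_pushed))                 # 10000 100
```

In this model the scan count is identical in both cases; the saving is in the rows handed back, which is why it only becomes visible when a vertex has many outgoing edges.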

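Is there a noticeable difference before and after the change? The total time was 1924(us) before and 1728(us) after, which doesn't look like much of a difference.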

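That is just the example wey gave, not a comparison on zhangyw's data. The actual gap depends on zhangyw's data volume. wey's example is based on the official basketball dataset, which is small to begin with.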

OK, I'll try it on my own data. Thanks for your reply.

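You're welcome. If you end up with before/after numbers from this performance tuning, feel free to share them with us (the community is running an essay contest: the 2023 NebulaGraph Tech Community Annual Essay Contest).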

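Because this is not a full-graph scan but a typical graph exploration, the difference only shows up when a single vertex has, say, more than ten thousand outgoing edges; with a typical average out-degree of a few dozen there is hardly any gap. My example is mainly about the operator's filter information, which you can see by scrolling to the right.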
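If scrolling through wide PROFILE output is tedious, the `filter:` field can be pulled out of the pasted text. The sketch below is only an illustration: `pushed_down_filter` is a hypothetical helper, and it assumes the pipe-delimited table layout shown earlier in this thread, with the filter expression on a single line and no `|` character inside it.

```python
import re
from typing import Optional


def pushed_down_filter(operator_info: str) -> Optional[str]:
    """Return the expression after 'filter:' in pasted plan text, or None if it is empty or absent."""
    for line in operator_info.splitlines():
        # Capture everything between "filter:" and the next column separator (or end of line).
        match = re.search(r"\bfilter:\s*(.*?)\s*(?:\||$)", line)
        if match:
            return match.group(1) or None
    return None


# Two lines taken from the ExpandAll rows of the "before" and "after" plans above.
before_line = '|    |   |   |     "total": "1596(us)"   | filter:                     |'
after_line = '|    |   |   |     "total": "1367(us)"   | filter: (follow._rank<100)   |'

print(pushed_down_filter(before_line))  # None
print(pushed_down_filter(after_line))   # (follow._rank<100)
```

An empty result corresponds to the "before" plan, where the condition was not attached to the operator; a non-empty one corresponds to the "after" plan.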
