Performance optimization for MATCH/GO paginated queries with hop filter conditions

VincentSleepless · October 9, 2022, 03:55

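Environment

nebula version: 3.2.0
Deployment: distributed, 3 nodes
Installation: RPM
Production environment: no, test environment
Hardware
- Disk: Alibaba Cloud Ultra Disk, 300 GB × 1
- CPU and memory: 4 cores, 16 GB
Problem description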

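We are building a data asset management platform, modeled on the industry-benchmark open-source systems Apache Atlas and LinkedIn DataHub. Both use a single database as their underlying store: Apache Atlas uses JanusGraph and LinkedIn DataHub uses Neo4j. When searching the data asset map, any given asset type will need paginated queries. Take the following example.

A MaxCompute data source asset (a project space) contains table assets, and each table asset contains column assets.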

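Its model in NebulaGraph is:
TAG
MaxcomputeDsAsset (data source project space)
MaxcomputeTableAsset (table)
MaxcomputeColumnAsset (column)

Edge
maxcompute_ds_table (project space to table relationship)
maxcompute_table_cloumn (table to column relationship)

Indexes
MaxcomputeDsAsset: project_name field
MaxcomputeTableAsset: table_name field
MaxcomputeColumnAsset: column_name field

We generated some test data to simulate the queries (only the 1-hop pagination for now, columns not simulated):
MaxcomputeDsAsset: 1
MaxcomputeTableAsset: 100,000 (one project space with 100,000 tables)
MaxcomputeColumnAsset: 200,000 (2 columns per table)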

Cluster parameter tuning (based on the forum and the official configuration):
storaged configuration
--query_concurrently=true
--rocksdb_column_family_options={"write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456","disable_auto_compactions":false}
--rocksdb_enable_kv_separation=true
--enable_rocksdb_prefix_filtering=true
--enable_rocksdb_whole_key_filtering=false
--enable_partitioned_index_filter=true
--rocksdb_filtering_prefix_length=12
--rocksdb_block_cache=256

graphd configuration
--num_operator_threads=4

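When searching the data map, we first select the asset type, either table or column.

We have the following scenarios:

1. Asset type MaxComputeTable is selected, and the only filter is a table-name prefix.
This is clearly an index-based LOOKUP.

LOOKUP ON MaxcomputeTableAsset WHERE MaxcomputeTableAsset.name STARTS WITH 'table' YIELD properties(vertex) | limit 1, 10

The query is fast, at the millisecond level.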

2. Asset type MaxComputeTable is selected, a data source filter is added, and the table-name prefix filter stays the same.

Here it is no longer enough to filter on the properties of MaxComputeTable alone. We need a paginated query with 1-hop filtering that starts from a specific MaxcomputeDsAsset data source vertex.

Using a GO statement:
GO 1 STEPS FROM "project1001" OVER maxcompute_ds_table REVERSELY where properties($$).name STARTS WITH 'table' YIELD properties($$).name as name, properties($$).tableName as tableName, properties($$).createdTime as createdTime, properties($$).env as env,properties($$).isExternal as isExternal | LIMIT 1,20

This takes about 1.8 s.

Using a MATCH statement:
MATCH (v:MaxcomputeDsAsset)<--(v2:MaxcomputeTableAsset) WHERE id(v) == 'project1001' and v2.MaxcomputeTableAsset.name STARTS WITH 'table' RETURN v2.MaxcomputeTableAsset.name AS name, v2.MaxcomputeTableAsset.tableName as tableName, v2.MaxcomputeTableAsset.createdTime as createdTime, v2.MaxcomputeTableAsset.env as env, v2.MaxcomputeTableAsset.isExternal as isExternal skip 0 limit 10;

This takes about 1.8 s.

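We also ran a simulation based on real production data, where the largest project space has 55,000 tables:

- 150,000 out-edges, 1 hop: Execution Time 3.562981 (s)
- 100,000 out-edges, 1 hop: Execution Time 1.710542 (s)
- 50,000 out-edges, 1 hop: Execution Time 1.01257 (s)
- 25,000 out-edges, 1 hop: Execution Time 0.56089 (s)
- 10,000 out-edges, 1 hop: Execution Time 0.167283 (s)
- 5,000 out-edges, 1 hop: Execution Time 0.085381 (s)

The fewer the out-edges, the better the performance, but by around 100,000 out-edges the query has already slowed down noticeably (about 1.7 s).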
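
A quick check of the figures above suggests that latency grows roughly in proportion to the number of edges, which is what one would expect if every edge is read before the filter and LIMIT are applied. The sketch below only divides the measured times by the edge counts; the variable and function names are illustrative.

```python
# Measured 1-hop execution times from the list above: (edges, seconds).
measurements = [
    (150_000, 3.562981),
    (100_000, 1.710542),
    (50_000, 1.01257),
    (25_000, 0.56089),
    (10_000, 0.167283),
    (5_000, 0.085381),
]


def micros_per_edge(edges: int, seconds: float) -> float:
    """Average cost per traversed edge, in microseconds."""
    return seconds / edges * 1_000_000


per_edge = [micros_per_edge(n, t) for n, t in measurements]

for (n, t), cost in zip(measurements, per_edge):
    print(f"{n:>7} edges: {t:.3f} s total, {cost:.1f} us per edge")
```

The per-edge cost stays in the same range (roughly 17 to 24 microseconds) across a 30x spread of edge counts, so the total time appears to be driven by the number of edges touched rather than by the page size.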

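We ran a comparable simulated query on another team's Neo4j:

1-hop user resources, total data volume 140,000
[Screenshot: 企业微信截图_16652843381637]

The 1-hop paginated query over user resources takes tens of milliseconds.

I would like to ask whether there is any room for optimization.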

caton-hpg · October 9, 2022, 06:09

1) Is it an SSD? SSDs are recommended.
2) How many partitions are there?

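caton-hpg · October 9, 2022, 06:33

Here is a way to troubleshoot: add the PROFILE keyword in front of the nGQL statement.
1) Check whether the LIMIT was pushed down. If not, the statement may need adjusting.
2) Check which operator takes the most time.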

VincentSleepless · October 9, 2022, 07:01

No SSD at the moment. There are 100 partitions and only one disk for now; the test is just for validation.

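VincentSleepless · October 9, 2022, 07:02

The logic in my screenshot above is not complicated: it is a 1-hop filtered, paginated query from a specified vertex, but that vertex has 100,000 out-edges.
I did run PROFILE. My impression so far is that nebula filters only after everything has been fetched.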

VincentSleepless · October 9, 2022, 07:04

profile MATCH (v:MaxcomputeDsAsset)<--(v2:MaxcomputeTableAsset) WHERE id(v) == 'project1001' and v2.MaxcomputeTableAsset.tableName STARTS WITH 'table' RETURN v2.MaxcomputeTableAsset.name AS name, v2.MaxcomputeTableAsset.tableName as tableName, v2.MaxcomputeTableAsset.createdTime as createdTime, v2.MaxcomputeTableAsset.env as env, v2.MaxcomputeTableAsset.isExternal as isExternal skip 0 limit 10;

result (1).csv (3.4 KB)

VincentSleepless · October 9, 2022, 07:16

Is there still any room to optimize this query?

caton-hpg · October 9, 2022, 07:34

I recommend using SSDs; the performance difference is very large.

caton-hpg · October 9, 2022, 07:52

I suggest putting project1001 into the MATCH pattern, for example:
MATCH (v:MaxcomputeDsAsset{name:"xxx"})<--(v2:MaxcomputeTableAsset)

VincentSleepless · October 9, 2022, 07:54

How do I filter on the vertex ID inside the MATCH pattern here?

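caton-hpg · October 9, 2022, 08:07

The ID can probably only be filtered in the WHERE clause.

Also, if a single vertex in your graph has too many edges, fetching the edges takes too long.
1) One option is to restrict the edge properties in the MATCH.
2) Set max_edge_returned_per_vertex to truncate the edges; see "Storage service configuration" in the NebulaGraph Database manual.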

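VincentSleepless · October 9, 2022, 08:16

1. There is only one data source DSAsset, so pushdown on that side is not relevant for now; going forward we will still filter by the out-edge ID.
2. The system default for max_edge_returned_per_vertex is already very large. I have only 100,000 out-edges and I need all of them. The current query is simply too slow.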

VincentSleepless · October 9, 2022, 08:19

With roughly the same data, a 1-hop paginated query in Neo4j takes under 100 ms.

VincentSleepless · October 9, 2022, 08:32

I will request 3 SSDs and try again.

VincentSleepless · October 10, 2022, 08:02

I obtained 3 Alibaba Cloud PL1 SSDs (50,000 IOPS) and loaded the same 100,000 records for testing.

MATCH (v:MaxcomputeDsAsset)<-[e:maxcompute_ds_table]-(v2:MaxcomputeTableAsset) WHERE id(v) == 'project1001' and v2.MaxcomputeTableAsset.name STARTS WITH 'table' RETURN v2.MaxcomputeTableAsset.name AS name, v2.MaxcomputeTableAsset.tableName as tableName, v2.MaxcomputeTableAsset.createdTime as createdTime, v2.MaxcomputeTableAsset.env as env, v2.MaxcomputeTableAsset.isExternal as isExternal skip 0 limit 10;

MATCH statement, 1-hop paginated query: 1.96 s

GO 1 STEPS FROM "project1001" OVER maxcompute_ds_table REVERSELY where properties($$).tableName STARTS WITH 'table' YIELD properties($$).name as name, properties($$).tableName as tableName, properties($$).createdTime as createdTime, properties($$).env as env,properties($$).isExternal as isExternal | LIMIT 1,10

GO statement, 1-hop paginated query: 1.96 s

MATCH (v1:MaxcomputeDsAsset)<-[e1:maxcompute_ds_table]-(v2:MaxcomputeTableAsset)<-[e2:maxcompute_table_cloumn{relationType:"contain"}]-(v3:MaxcomputeColumnAsset) WHERE id(v1) == 'project1001' and v2.MaxcomputeTableAsset.name == 'table_prd_89999' RETURN v1.MaxcomputeDsAsset.name AS projectName, v2.MaxcomputeTableAsset.tableName as tableName, v3.MaxcomputeColumnAsset.name AS colname skip 0 limit 10;

MATCH statement, 2-hop query: 5.4 s

caton-hpg · October 11, 2022, 02:27

1) 100,000 out-edges on a single vertex is too many.
2) I suggest setting the truncation to 10,000 first and seeing how that works.

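wey · October 11, 2022, 03:14

My sense is that it is slow here because the GetNeighbors filter is not pushed down, which causes a full scan of the edges.

I looked at the current conditions for GetNeighbors filter pushdown:

- Only filters on $^. conditions and on edge conditions can be pushed down.
- The YIELD must not return destination-vertex data.

So in the current version, there are two optimizations you can make.

1. Change the graph model: store an extra copy of MaxcomputeTableAsset.name on the edge maxcompute_ds_table, for example as maxcompute_ds_table.TableAssetName, so that the filter can be applied during expansion.
2. In the GO, YIELD the vertex ID and then pipe the result into a FETCH, so that there is no full scan.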
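
Applied to the schema from the original post, the two suggestions above might be combined as in the following sketch. It is only a string builder, and it rests on several assumptions: the edge property `TableAssetName` is the hypothetical copy proposed in suggestion 1, the helper names are made up for illustration, and the escaping covers only backslashes and double quotes. Whether a `STARTS WITH` predicate on an edge property is actually pushed into GetNeighbors in 3.2.0 is not confirmed in this thread and should be checked with EXPLAIN or PROFILE.

```python
def ngql_string(value: str) -> str:
    """Quote a Python string as a double-quoted nGQL string literal."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_page_query(project_vid: str, prefix: str, page: int, page_size: int) -> str:
    """Build a GO | LIMIT | FETCH statement for one page of tables.

    Assumes a hypothetical edge property `TableAssetName` on
    `maxcompute_ds_table` that mirrors MaxcomputeTableAsset.name.
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size  # page 1 starts at offset 0
    return (
        f"GO 1 STEPS FROM {ngql_string(project_vid)} "
        "OVER maxcompute_ds_table REVERSELY "
        f"WHERE maxcompute_ds_table.TableAssetName STARTS WITH {ngql_string(prefix)} "
        "YIELD src(edge) AS src "
        f"| LIMIT {offset}, {page_size} "
        "| FETCH PROP ON MaxcomputeTableAsset $-.src "
        "YIELD properties(vertex).name AS name, "
        "properties(vertex).tableName AS tableName"
    )


query = build_page_query("project1001", "table", page=1, page_size=10)
print(query)
```

Computing the offset as `(page - 1) * page_size` keeps the first page at `LIMIT 0, 10`; the `LIMIT 1,10` used earlier in the thread skips the first matching row.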

Before optimization:

GO 1 STEPS FROM 'Nets' OVER serve REVERSELY \
  where properties($$).name STARTS WITH 'T' \
  YIELD properties($$).name AS name | LIMIT 1,10

After optimization:

GO 1 STEPS FROM 'Nets' OVER serve REVERSELY \
  where serve.start_year > 1000 YIELD src(edge) AS src | LIMIT 1,10 | \
  FETCH PROP ON player $-.src YIELD properties(vertex).name AS name

You can run EXPLAIN to see the final query plan:

(root@nebula) [basketballplayer]> explain GO 1 STEPS FROM 'Nets' OVER serve REVERSELY where serve.start_year > 1000 YIELD src(edge) AS src | LIMIT 1,10 | FETCH PROP ON player $-.src YIELD properties(vertex).name AS name
Execution succeeded (time spent 1246/13246 us)

Execution Plan (optimize time 553 us)

-----+--------------+--------------+----------------+--------------------------------------
| id | name         | dependencies | profiling data | operator info                       |
-----+--------------+--------------+----------------+--------------------------------------
|  6 | Project      | 10           |                | outputVar: {                        |
|    |              |              |                |   "colNames": [                     |
|    |              |              |                |     "name"                          |
|    |              |              |                |   ],                                |
|    |              |              |                |   "type": "DATASET",                |
|    |              |              |                |   "name": "__Project_6"             |
|    |              |              |                | }                                   |
|    |              |              |                | inputVar: __GetVertices_5           |
|    |              |              |                | columns: [                          |
|    |              |              |                |   "properties(VERTEX).name AS name" |
|    |              |              |                | ]                                   |
-----+--------------+--------------+----------------+--------------------------------------
| 10 | GetVertices  | 11           |                | outputVar: {                        |
|    |              |              |                |   "colNames": [],                   |
|    |              |              |                |   "type": "DATASET",                |
|    |              |              |                |   "name": "__GetVertices_5"         |
|    |              |              |                | }                                   |
|    |              |              |                | inputVar: __Limit_8                 |
|    |              |              |                | space: 49                           |
|    |              |              |                | dedup: false                        |
|    |              |              |                | limit: 9223372036854775807          |
|    |              |              |                | filter:                             |
|    |              |              |                | orderBy: []                         |
|    |              |              |                | src: src(EDGE)                      |
|    |              |              |                | props: [                            |
|    |              |              |                |   {                                 |
|    |              |              |                |     "props": [                      |
|    |              |              |                |       "_tag",                       |
|    |              |              |                |       "age",                        |
|    |              |              |                |       "name"                        |
|    |              |              |                |     ],                              |
|    |              |              |                |     "tagId": 50                     |
|    |              |              |                |   }                                 |
|    |              |              |                | ]                                   |
|    |              |              |                | exprs:                              |
-----+--------------+--------------+----------------+--------------------------------------
| 11 | Limit        | 12           |                | outputVar: {                        |
|    |              |              |                |   "colNames": [],                   |
|    |              |              |                |   "type": "DATASET",                |
|    |              |              |                |   "name": "__Limit_8"               |
|    |              |              |                | }                                   |
|    |              |              |                | inputVar: __GetNeighbors_12         |
|    |              |              |                | offset: 1                           |
|    |              |              |                | count: 10                           |
-----+--------------+--------------+----------------+--------------------------------------
| 12 | GetNeighbors | 0            |                | outputVar: {                        |
|    |              |              |                |   "colNames": [],                   |
|    |              |              |                |   "type": "DATASET",                |
|    |              |              |                |   "name": "__GetNeighbors_12"       |
|    |              |              |                | }                                   |
|    |              |              |                | inputVar: __VAR_0                   |
|    |              |              |                | space: 49                           |
|    |              |              |                | dedup: false                        |
|    |              |              |                | limit: 11                           |
|    |              |              |                | filter: (serve.start_year>1000)     |
|    |              |              |                | orderBy: []                         |
|    |              |              |                | src: COLUMN[0]                      |
|    |              |              |                | edgeTypes: []                       |
|    |              |              |                | edgeDirection: OUT_EDGE             |
|    |              |              |                | vertexProps:                        |
|    |              |              |                | edgeProps: [                        |
|    |              |              |                |   {                                 |
|    |              |              |                |     "props": [                      |
|    |              |              |                |       "_dst",                       |
|    |              |              |                |       "_rank",                      |
|    |              |              |                |       "_src",                       |
|    |              |              |                |       "_type",                      |
|    |              |              |                |       "end_year",                   |
|    |              |              |                |       "start_year"                  |
|    |              |              |                |     ],                              |
|    |              |              |                |     "type": -52                     |
|    |              |              |                |   }                                 |
|    |              |              |                | ]                                   |
|    |              |              |                | statProps:                          |
|    |              |              |                | exprs:                              |
|    |              |              |                | random: false                       |
-----+--------------+--------------+----------------+--------------------------------------
|  0 | Start        |              |                | outputVar: {                        |
|    |              |              |                |   "colNames": [],                   |
|    |              |              |                |   "type": "DATASET",                |
|    |              |              |                |   "name": "__Start_0"               |
|    |              |              |                | }                                   |
-----+--------------+--------------+----------------+--------------------------------------

filter: (serve.start_year>1000) has been embedded in the GetNeighbors operator. You can PROFILE it; the number of rows scanned should be different.
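
To make the difference concrete, here is a toy model of the two execution shapes. It is not how NebulaGraph's storage layer is implemented; the edge data and function names are invented, and the only thing it tracks is how many rows the storage side hands to the graph layer before the page is cut.

```python
from itertools import islice


def make_edges(n: int):
    """Toy edge list: (src_vertex_id, start_year) pairs."""
    return [(f"player{i}", 1990 + i % 30) for i in range(n)]


def without_pushdown(edges, predicate, offset, count):
    """Storage returns every edge; the graph layer filters and limits."""
    transferred = list(edges)  # every edge leaves storage
    matched = [e for e in transferred if predicate(e)]
    return matched[offset:offset + count], len(transferred)


def with_pushdown(edges, predicate, offset, count):
    """Storage applies the filter and stops after offset + count matches."""
    matching = (e for e in edges if predicate(e))
    transferred = list(islice(matching, offset + count))
    return transferred[offset:offset + count], len(transferred)


def after_1000(edge) -> bool:
    """Mirrors the edge filter serve.start_year > 1000."""
    return edge[1] > 1000


edges = make_edges(100_000)
page_a, rows_a = without_pushdown(edges, after_1000, offset=1, count=10)
page_b, rows_b = with_pushdown(edges, after_1000, offset=1, count=10)

print("rows handed over without pushdown:", rows_a)
print("rows handed over with pushdown:", rows_b)
```

Both variants return the same page, but the first hands over all 100,000 rows while the second stops at 11, which matches the `limit: 11` (offset 1 plus count 10) shown on the GetNeighbors operator in the plan above.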

For the reasoning behind this optimization, see this article: nGQL 简明教程，第二期 nGQL 执行计划详解与调优 - siwei.io

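VincentSleepless · November 3, 2022, 08:33

That is a good approach, but what happens when the name is updated? With out-edges on the order of 100,000, I might have to update the property on every one of them, so it suits models that are not updated often. We will consider this idea when we build the lineage model later. Thanks, wey!!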
