Slow MATCH query performance, looking for optimization suggestions

Basic information

  • nebula version: 3.8.0

  • Deployment:

| Machine | CPU | Memory | Disk | Network | metad | storaged | graphd |
|:--:|:--:|:--:|:--:|:--:|:--:|:--:|:--:|
| node1 | 16C | 64G | 1T SSD | 1000Mbps | 1 | 1 | 1 |
| node2 | 16C | 64G | 1T SSD | 1000Mbps | 1 | 1 | 1 |
| node3 | 16C | 64G | 1T SSD | 1000Mbps | 1 | 1 | 1 |
| node4 | 16C | 64G | 1T SSD | 1000Mbps |  | 1 | 1 |
| node5 | 16C | 64G | 1T SSD | 1000Mbps |  | 1 | 1 |
| node6 | 16C | 64G | 1T SSD | 1000Mbps |  | 1 | 1 |
| node7 | 16C | 64G | 1T SSD | 1000Mbps |  | 1 | 1 |
| node8 | 16C | 64G | 1T SSD | 1000Mbps |  | 1 | 1 |

  • Installation method: pre-built binaries

  • In production: N

Dataset information

  • Schema:



No indexes have been created yet (see the index sketch after this list).

  • Data volume:

  • Vertices: 140 million

  • Edges: 200 million
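For reference, a minimal sketch of what adding an index would look like later on; the name `company_index` is only an illustrative assumption, and a tag index would only matter for MATCH patterns that locate start vertices by tag or property, not for the id()-pinned lookups in the query below.

```
# Hypothetical index name; a tag index is only consulted when MATCH has to
# find start vertices by tag/property. id()-based lookups do not use it.
CREATE TAG INDEX IF NOT EXISTS company_index ON company();
REBUILD TAG INDEX company_index;
# Rebuilding runs as a background job; check its status before relying on the index.
SHOW TAG INDEX STATUS;
```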

Problem

The following query is executed:


```
MATCH p = (src:company)<-[e1:invest*1..3]-()-[e2:invest*1..3]->(dst:company)
WHERE id(src) == "1012425713"
  AND id(dst) == "1011975012"
  AND id(src) != id(dst)
WITH
  e1,
  e2,
  collect(dst) AS dstNodes,
  p,
  count(p) AS total,
  collect(p) AS paths
RETURN DISTINCT
  p,
  LENGTH(p) AS len
ORDER BY len ASC
SKIP 0
LIMIT 10000;
```

Query time: 10s+
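To see where the 10s goes, the statement can be run with the PROFILE prefix (the profile2.csv attached in the follow-up below was presumably produced this way); PROFILE executes the query and reports per-operator statistics, while EXPLAIN only prints the plan. A sketch using the simplified form of the query:

```
# PROFILE executes the statement and appends per-operator rows and execution
# time, showing whether the cost is in expanding the *1..3 traversals,
# fetching vertex/edge properties, or the final sort/limit.
PROFILE
MATCH p = (src:company)<-[e1:invest*1..3]-()-[e2:invest*1..3]->(dst:company)
WHERE id(src) == "1012425713" AND id(dst) == "1011975012"
RETURN p, LENGTH(p) AS len
ORDER BY len ASC
LIMIT 10000;
```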

Adjustments made based on the documentation

  • Configuration changes (a sketch for verifying they took effect follows this section):

  • graphd


```
########## performance optimization ##########
# The max job size in multi job mode
--max_job_size=32
# The min batch size for handling dataset in multi job mode, only enabled when max_job_size is greater than 1
--min_batch_size=81920
# if true, return directly without go through RPC
--optimize_appendvertices=false
# number of paths constructed by each thread
--path_batch_size=100000
--num_operator_threads=32
```

  • storaged

```
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=35840
# Network IO threads number
--num_io_threads=64
```

  • System configuration changes

  • Network


```
sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0
sysctl net.ipv4.tcp_slow_start_after_idle
sudo sysctl -w net.core.somaxconn=1024
sysctl net.core.somaxconn
sudo sysctl -w net.core.netdev_max_backlog=10000
sysctl net.core.netdev_max_backlog
```

  • Transparent huge pages

```
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
```
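A sketch for checking that the graphd/storaged flags above actually took effect after the services were restarted; this is an assumption about how to verify, not something confirmed in the thread. SHOW CONFIGS only lists flags registered with the meta service, so flags missing from its output have to be checked in the local .conf file or via the service's HTTP status port.

```
# Flags the meta service knows about, per service type.
SHOW CONFIGS GRAPH;
SHOW CONFIGS STORAGE;
# Flags registered as mutable can also be changed without a restart, for example:
# UPDATE CONFIGS storage:query_concurrently = true;
```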

Questions

  • With the configuration above, is the latency of this query within the expected range? Monitoring shows no system bottleneck (disk, network, and CPU load are all normal), so we suspect the cost comes from the number of edges scanned by the query (700K+) and from fetching their property data (a sketch for measuring the expansion size follows these questions).

  • If the business logic cannot be changed, are there any other ways to optimize?
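As a way to check the suspicion about the 700K+ scanned edges, the size of each variable-length expansion can be measured separately; this is only a sketch, and the row counts are approximations of the edges touched (GO deduplicates per step, so they are not exact scan counts).

```
# Expansion from src against edge direction (matches (src)<-[e1:invest*1..3]-()).
GO 1 TO 3 STEPS FROM "1012425713" OVER invest REVERSELY
  YIELD edge AS e | YIELD count(*) AS src_expansion;

# Expansion from dst against edge direction (matches ()-[e2:invest*1..3]->(dst)).
GO 1 TO 3 STEPS FROM "1011975012" OVER invest REVERSELY
  YIELD edge AS e | YIELD count(*) AS dst_expansion;
```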

Follow-up (a replier's questions, followed by my answers):

  1. Please confirm how many tags you have.
  2. Please confirm whether the storaged parameter query_concurrently is enabled:
  3. MATCH p = (src:company)<-[e1:invest*1..3]-()-[e2:invest*1..3]->(dst:company) WHERE id(src) == "1012425713" AND id(dst) == "1011975012" RETURN DISTINCT p, LENGTH(p) AS len ORDER BY len ASC SKIP 0 LIMIT 10000;
    The statement had some verbose parts, so I simplified it a bit.
  1. There is only one tag:
  2. It is already enabled:
  3. With your simplified statement, the query takes about 8s; the profile is as follows:
    profile2.csv (21.4 KB)

Additional information: the number of partitions is 120.
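With 120 partitions over 8 storaged hosts, one more thing worth ruling out is a skewed leader distribution, which would concentrate the read load for this query on a few machines. A minimal sketch of what to inspect (an assumption about where to look, not something the thread confirmed):

```
# Leader count per storaged host; roughly 15 leaders each would be balanced
# for 120 partitions over 8 hosts.
SHOW HOSTS;
# Per-partition leader/peer placement (run after USE <space>).
SHOW PARTS;
# If leaders are skewed, submit a leader-balance job.
SUBMIT JOB BALANCE LEADER;
```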