Basic information
- nebula version: 3.8.0
- Deployment:
|Machine|CPU|Memory|Disk|Network|metad|storaged|graphd|
|:-:|:-:|:-:|:-:|:-:|:-:|:-:|:-:|
|node1|16C|64G|1T ssd|1000Mbps|1|1|1|
|node2|16C|64G|1T ssd|1000Mbps|1|1|1|
|node3|16C|64G|1T ssd|1000Mbps|1|1|1|
|node4|16C|64G|1T ssd|1000Mbps||1|1|
|node5|16C|64G|1T ssd|1000Mbps||1|1|
|node6|16C|64G|1T ssd|1000Mbps||1|1|
|node7|16C|64G|1T ssd|1000Mbps||1|1|
|node8|16C|64G|1T ssd|1000Mbps||1|1|
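- Installation method: precompiled binaries
- In production: N
Dataset information
- Schema:
No indexes have been created yet.
- Data volume:
- Vertices: 140 million
- Edges: 200 million
Problem
I run the following query: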
MATCH p = (src:company)<-[e1:invest*1..3]-()-[e2:invest*1..3]->(dst:company)
WHERE id(src) == "1012425713"
AND id(dst) == "1011975012"
AND id(src) != id(dst)
WITH
e1,
e2,
collect(dst) AS dstNodes,
p,
count(p) AS total,
collect(p) AS paths
RETURN DISTINCT
p,
LENGTH(p) AS len
ORDER BY len ASC
SKIP 0
LIMIT 10000;
Query latency: 10s+
- PROFILE output:
profile.csv (22.3 KB)
Adjustments made based on the documentation
- Configuration changes:
- graphd
########## performance optimization ##########
# The max job size in multi job mode
--max_job_size=32
# The min batch size for handling dataset in multi job mode, only enabled when max_job_size is greater than 1
--min_batch_size=81920
# if true, return directly without go through RPC
--optimize_appendvertices=false
# number of paths constructed by each thread
--path_batch_size=100000
--num_operator_threads=32
- storaged
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=35840
# Network IO threads number
--num_io_threads=64
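To put the `--rocksdb_block_cache` value in context, here is a small sketch of the arithmetic. The helper name `cache_share_percent` is made up for illustration, and it assumes the "64G" in the deployment table means 64 * 1024 MB.

```shell
#!/bin/sh
# Sketch: share of host memory taken by the storaged block cache.
# cache_share_percent is an illustrative helper, not a NebulaGraph tool.
# Assumption: "64G" in the deployment table is treated as 64 * 1024 MB.
cache_share_percent() {
  cache_mb=$1
  host_gb=$2
  # Shell arithmetic is integer-only, so the result is rounded down.
  echo $(( cache_mb * 100 / (host_gb * 1024) ))
}

cache_share_percent 35840 64   # prints 54
```

So the block cache is 35 GB, a little over half of each machine's memory. The rest is shared with graphd on every node, and with metad as well on node1 to node3.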
- System configuration changes
- Network
sudo sysctl -w net.ipv4.tcp_slow_start_after_idle=0
sysctl net.ipv4.tcp_slow_start_after_idle
sudo sysctl -w net.core.somaxconn=1024
sysctl net.core.somaxconn
sudo sysctl -w net.core.netdev_max_backlog=10000
sysctl net.core.netdev_max_backlog
- Transparent huge pages
echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
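Because `sysctl -w` and the writes under `/sys` do not survive a reboot, it can help to re-check all eight machines in one pass. The sketch below is one way to do that. The function names (`check`, `thp_active`, `verify_all`) are made up for illustration, and the expected values are simply the ones listed above.

```shell
#!/bin/sh
# Sketch: re-check the kernel settings listed above in one pass.
# check, thp_active and verify_all are illustrative names, not part of any tool.

# Print OK/DIFF for one setting; return non-zero on a mismatch.
check() {
  name=$1
  expected=$2
  actual=$3
  if [ "$actual" = "$expected" ]; then
    echo "OK   $name=$actual"
  else
    echo "DIFF $name=$actual (expected $expected)"
    return 1
  fi
}

# THP files list every mode and bracket the active one,
# e.g. "always madvise [never]"; print only the bracketed word.
thp_active() {
  sed -n 's/.*\[\(.*\)\].*/\1/p' "$1"
}

# Run every check; a value that cannot be read shows up as empty (DIFF).
verify_all() {
  rc=0
  thp=/sys/kernel/mm/transparent_hugepage
  for pair in net.ipv4.tcp_slow_start_after_idle=0 \
              net.core.somaxconn=1024 \
              net.core.netdev_max_backlog=10000; do
    key=${pair%%=*}
    want=${pair#*=}
    check "$key" "$want" "$(sysctl -n "$key" 2>/dev/null)" || rc=1
  done
  check thp.enabled never "$(thp_active "$thp/enabled" 2>/dev/null)" || rc=1
  check thp.defrag never "$(thp_active "$thp/defrag" 2>/dev/null)" || rc=1
  return $rc
}
```

Calling `verify_all` on each node prints one line per setting and returns a non-zero status if any value differs from the ones applied above.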
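Questions
- With the configuration above, is the latency of this query within the expected range? Monitoring shows no system bottleneck (disk, network, and CPU load are all within normal ranges), so I suspect the time goes into scanning the edges the query touches (more than 700,000) and fetching their property data.
- If the business logic cannot be changed, are there any other ways to optimize this query?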