Nebula 2.5.1 bulk import problem

woaiwah · December 6, 2021, 03:48

Nebula version: 2.5.1
Deployment: distributed
Installation method: RPM
Production environment: Y
Hardware
- Disk: SSD

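When I bulk-import vertices and edges with Spark using 1 replica, the data imports normally with no problems at all. After changing the replica count to 3, the import fails with errors, but only when importing 600 million+ edges; imports of tens of millions of vertices and edges did not fail. The errors are as follows: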

21/12/06 02:39:20 INFO importer.NebulaV2EdgeBatchImporter$: error_code : -1005, error_msg: Storage Error: part: 63, error: E_RPC_FAILURE(-3).
21/12/06 02:39:20 INFO importer.NebulaV2EdgeBatchImporter$: error_code : -1005, error_msg: Storage Error: part: 121, error: E_RPC_FAILURE(-3).
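To see whether failures like these cluster on a few partitions or are spread across all of them, the importer log can be tallied by part number. Below is a minimal sketch in Python; the regular expression is derived only from the two log lines quoted above, and `tally_failed_parts` is a hypothetical helper name, not part of any Nebula tool.

```python
import re
from collections import Counter

# Pattern built from the two importer lines quoted above (an assumption:
# other importer versions may format the message differently).
LINE_RE = re.compile(
    r"error_code\s*:\s*(-?\d+),\s*error_msg:\s*Storage Error: "
    r"part: (\d+), error: (\w+)\((-?\d+)\)"
)

def tally_failed_parts(lines):
    """Count storage errors per (part id, error name) in importer log lines."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            part_id = int(match.group(2))
            error_name = match.group(3)
            counts[(part_id, error_name)] += 1
    return counts

sample = [
    "21/12/06 02:39:20 INFO importer.NebulaV2EdgeBatchImporter$: error_code : -1005, "
    "error_msg: Storage Error: part: 63, error: E_RPC_FAILURE(-3).",
    "21/12/06 02:39:20 INFO importer.NebulaV2EdgeBatchImporter$: error_code : -1005, "
    "error_msg: Storage Error: part: 121, error: E_RPC_FAILURE(-3).",
]

for (part_id, error_name), n in sorted(tally_failed_parts(sample).items()):
    print(f"part {part_id}: {error_name} x{n}")
```

If the counts are roughly even across parts, that points toward overall write throughput rather than a single unhealthy partition.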

graphd-stderr

The current job has 600 million+ records in total.

After changing the replica count to 3, the import became many times slower.

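liuyu85cn · December 6, 2021, 07:17

With 3 replicas, writes go through raft, so they will be slower than with 1 replica. You may have hit raft's maximum throughput.

You could try changing a storage parameter: FLAGS_max_batch_size. The default is 256; try changing it to 1024?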

woaiwah · December 6, 2021, 07:25

Is there detailed documentation for the options in these four configuration files: nebula-graphd.conf, nebula-metad.conf, nebula-storaged.conf, and nebula-storaged-listener.conf?

liuyu85cn · December 6, 2021, 07:29

https://docs.nebula-graph.com.cn/2.6.1/5.configurations-and-logs/1.configurations/4.storage-config/

That page is for storage; the others are nearby.

woaiwah · December 6, 2021, 07:32

That page doesn't describe the FLAGS_max_batch_size setting you mentioned.

liuyu85cn · December 6, 2021, 08:31

Right, some rarely used options aren't exposed; they are defined directly in the source file:

https://github.com/vesoft-inc/nebula/blob/master/src/kvstore/raftex/RaftPart.cpp

Line 30

woaiwah · December 6, 2021, 08:38

Is it enough to add --max_batch_size=1024 directly to storaged.conf? From the code, it looks like this is the maximum number of logs per batch.

liuyu85cn · December 6, 2021, 08:39

Yes, then restart. You also need to confirm that --local_config=true is set.
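For anyone applying the same change across several storage nodes, the edit can be scripted. Below is a minimal sketch in Python that sets gflags-style `--name=value` lines in the text of a config file, replacing an existing line or appending a new one. `set_flag` is a hypothetical helper, and the sample config contents are made up for illustration; only the two flag names and values come from this thread.

```python
def set_flag(conf_text, name, value):
    """Return conf_text with --name=value set.

    An existing uncommented --name=... line is replaced in place;
    otherwise the flag is appended at the end.
    """
    new_line = f"--{name}={value}"
    prefix = f"--{name}="
    out = []
    found = False
    for line in conf_text.splitlines():
        if line.strip().startswith(prefix):
            out.append(new_line)
            found = True
        else:
            out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"

# Illustrative config text, not a real nebula-storaged.conf.
conf = "# storaged settings (sample)\n--local_config=false\n"

conf = set_flag(conf, "local_config", "true")      # replaces the existing line
conf = set_flag(conf, "max_batch_size", 1024)      # appended, it was absent
print(conf)
```

The storaged process still has to be restarted on each node afterwards, as noted above.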

woaiwah · December 6, 2021, 08:40

The code describes it as the maximum number of logs per batch, though.

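liuyu85cn · December 6, 2021, 08:40

That should raise raft's throughput somewhat. If it still fails, a more detailed performance analysis may be needed. You could also consider slowing the importer down. It really depends on the specifics.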
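One generic way to slow a writer down is to retry failed batches with exponential backoff instead of resubmitting immediately. Below is a minimal sketch in Python of that idea. It is not the Spark importer's actual mechanism; `write_batch` is a hypothetical callable standing in for whatever submits one batch and reports success, and the retry count and delays are arbitrary.

```python
import time

def write_with_backoff(write_batch, batch, max_retries=5,
                       base_delay=0.5, sleep=time.sleep):
    """Submit one batch, backing off exponentially between failed attempts.

    write_batch(batch) must return True on success and False on failure
    (for example, when the storage layer reports E_RPC_FAILURE).
    Returns the number of retries that were needed.
    """
    for attempt in range(max_retries + 1):
        if write_batch(batch):
            return attempt
        if attempt < max_retries:
            # Wait 0.5s, 1s, 2s, ... so the storage side can drain its queue.
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"batch still failing after {max_retries} retries")

# Demo with a fake writer that fails twice and then succeeds.
state = {"calls": 0}

def fake_writer(batch):
    state["calls"] += 1
    return state["calls"] > 2

# sleep is stubbed out so the demo finishes instantly.
retries = write_with_backoff(fake_writer, ["edge-1", "edge-2"], sleep=lambda s: None)
print("retries needed:", retries)
```

Lowering the batch size or the number of concurrent writers has a similar effect; which one helps most depends on where the bottleneck is.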

liuyu85cn · December 6, 2021, 08:42

Yes. Usually the WAL buffer fills up, inserts queue up and may time out, and that produces this storage RPC error.

woaiwah · December 6, 2021, 08:43

OK, I'll try that first.