NebulaGraph 2.5.1: exceptions while importing data with Exchange and deleting data with Connector cause the import to fail and make other graph spaces unqueryable

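nebula version: 2.5.1
Deployment: distributed / rpm
Production environment: Y
Hardware information
Disk: SSD, 500 GB
CPU and memory: 16 cores, 32 GB memory
Detailed description of the problem
On NebulaGraph 2.5.1, exceptions occur while importing data with Exchange and deleting data with Connector, and the import fails.
The logs are as follows:
meta log --------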

W1119 20:47:03.270332 67505 RaftPart.cpp:648] [Port: 9780, Space: 4639, Part: 17] The appendLog buffer is full. Please slow down the log appending rate.replicatingLogs_ :1
W1119 21:56:44.786274 67510 RaftPart.cpp:365] [Port: 9780, Space: 4639, Part: 27] The partition is not a leader
W1119 21:56:44.786331 67510 RaftPart.cpp:702] [Port: 9780, Space: 4639, Part: 27] Cannot append logs, clean the buffer
W1119 22:04:54.801285 67518 RaftPart.cpp:702] [Port: 9780, Space: 4639, Part: 9] Cannot append logs, clean the buffer
W1119 22:04:54.801288 67530 RaftPart.cpp:365] [Port: 9780, Space: 4639, Part: 34] The partition is not a leader
W1119 22:07:31.397456 67510 RaftPart.cpp:648] [Port: 9780, Space: 4639, Part: 29] The appendLog buffer is full. Please slow down the log appending rate.replicatingLogs_ :100: 

storage log ---------------------

E1119 20:37:47.888574 334055 GetSpaceProcessor.cpp:19] Get space Failed, SpaceName knowledge_graph_v3 error: E_LEADER_CHANGED
E1119 20:37:47.888574 334035 GetSpaceProcessor.cpp:19] Get space Failed, SpaceName knowledge_graph_v3 error: E_LEADER_CHANGED

graph log ----------------

E1119 20:47:12.866794 365577 StorageAccessExecutor.h:42] DeleteEdgesExecutor failed, error E_CONSENSUS_ERROR, part 37
E1119 20:47:12.866808 365577 StorageAccessExecutor.h:42] DeleteEdgesExecutor failed, error E_CONSENSUS_ERROR, part 13
E1119 20:47:12.866822 365577 StorageAccessExecutor.h:122] Storage Error: part: 25, error: E_CONSENSUS_ERROR(-3001).
E1119 20:47:12.866346 365562 StorageAccessExecutor.h:42] DeleteEdgesExecutor failed, error E_CONSENSUS_ERROR, part 2


E1119 21:51:53.512437 365571 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 38
E1119 21:51:53.512450 365571 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 36
E1119 21:51:53.512463 365571 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 30
E1119 21:51:53.512476 365571 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 10
E1119 21:51:53.512490 365571 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 33
E1119 21:51:53.512496 365515 StorageClientBase.inl:163] Request to "10.37.80.68":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Timed Out
E1119 21:51:53.509222 365580 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 2
E1119 21:51:53.509336 365570 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 2
E1119 21:51:53.511932 365574 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_RPC_FAILURE, part 48
E1119 21:51:53.510509 365583 StorageAccessExecutor.h:122] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E1119 22:14:56.929430 365579 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_LEADER_CHANGED, part 7
E1119 22:14:56.929447 365579 StorageAccessExecutor.h:42] GetNeighborsExecutor failed, error E_LEADER_CHANGED, part 19
E1119 22:14:56.929486 365574 QueryInstance.cpp:110] Storage Error: The leader has changed. Try again later
E1119 22:14:56.929694 365585 QueryInstance.cpp:110] Storage Error: The leader has changed. Try again later
E1119 22:18:31.581185 365584 QueryInstance.cpp:110] SyntaxError: syntax error near `null'
E1119 22:19:13.641271 365513 GraphSessionManager.cpp:213] Update sessions failed: Session not existed!
E1119 22:19:13.641403 365545 GraphSessionManager.cpp:242] Update sessions failed: Update sessions failed: Session not existed!
E1119 22:40:01.263505 365517 GraphSessionManager.cpp:213] Update sessions failed: Session not existed!
E1119 22:40:01.263604 365545 GraphSessionManager.cpp:242] Update sessions failed: Update sessions failed: Session not existed!
E1119 22:50:12.029105 365541 GraphSessionManager.cpp:213] Update sessions failed: Session not existed!
E1119 22:50:12.029214 365545 GraphSessionManager.cpp:242] Update sessions failed: Update sessions failed: Session not existed!

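Which statement were you running when this error was reported? Could the statement have been too long? It looks similar to this thread: graphd进程崩溃(非oom) - #19, from jmq2020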

Just simple delete statements, delete id,xxxxx and delete edge id,xxx, with 256 IDs per statement, which is not many, plus insert statements.
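For reference, the batching described here (at most 256 IDs per delete statement) could look roughly like the sketch below. It only builds nGQL strings and does not talk to the cluster. It assumes string VIDs; the helper names and the `follow` edge type are hypothetical and are not taken from the poster's Connector job.

```python
from typing import Iterator, List, Sequence, Tuple


def chunked(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def quote(vid: str) -> str:
    """Wrap a string VID in double quotes, escaping backslashes and quotes."""
    return '"' + vid.replace("\\", "\\\\").replace('"', '\\"') + '"'


def build_delete_vertex_statements(vids: Sequence[str],
                                   batch_size: int = 256) -> List[str]:
    """One DELETE VERTEX statement per batch of VIDs."""
    return [
        "DELETE VERTEX " + ", ".join(quote(v) for v in batch)
        for batch in chunked(vids, batch_size)
    ]


def build_delete_edge_statements(edge_type: str,
                                 edges: Sequence[Tuple[str, str]],
                                 batch_size: int = 256) -> List[str]:
    """One DELETE EDGE statement per batch of (src, dst) pairs."""
    return [
        f"DELETE EDGE {edge_type} "
        + ", ".join(f"{quote(src)} -> {quote(dst)}" for src, dst in batch)
        for batch in chunked(edges, batch_size)
    ]


if __name__ == "__main__":
    # "follow" is a made-up edge type used only for illustration.
    for stmt in build_delete_edge_statements("follow", [("a", "b"), ("c", "d")]):
        print(stmt)
```

Keeping the batch size as a parameter makes it easy to lower it if the storage side reports that writes are arriving too fast.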

Exchange only imports data; it has no delete or neighbor-lookup functionality.

I deleted with Connector first, then imported with Exchange.

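The thread you linked does not say clearly how to change the statement size. Which parameter is it? Please send it to me.
One more question: I update and delete data in different spaces at the same time. Do leader changes in different spaces affect each other?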

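Could deleting 20 million edges be too much, so that automatic compaction had not finished when I started inserting data again, and that is what caused the leader changes?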

Updating and deleting data in different spaces at the same time does not make them affect each other.

I don't think there is such a parameter …

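I checked, and the deletion of 20 million edges succeeded. The statements are not large, just 256 IDs per delete statement. The problem seems to appear when I import new data right after the deletion finishes. What causes the logs below? After deleting, do I need to wait for compaction to finish before importing, or can I import right away without any impact?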
E1119 20:37:47.888574 334035 GetSpaceProcessor.cpp:19] Get space Failed, SpaceName knowledge_graph_v3 error: E_LEADER_CHANGED

W1119 20:47:03.270332 67505 RaftPart.cpp:648] [Port: 9780, Space: 4639, Part: 17] The appendLog buffer is full. Please slow down the log appending rate.replicatingLogs_ :1
W1119 21:56:44.786274 67510 RaftPart.cpp:365] [Port: 9780, Space: 4639, Part: 27] The partition is not a leader
W1119 21:56:44.786331 67510 RaftPart.cpp:702] [Port: 9780, Space: 4639, Part: 27] Cannot append logs, clean the buffer
W1119 22:04:54.801285 67518 RaftPart.cpp:702] [Port: 9780, Space: 4639, Part: 9] Cannot append logs, clean the buffer
W1119 22:04:54.801288 67530 RaftPart.cpp:365] [Port: 9780, Space: 4639, Part: 34] The partition is not a leader
W1119 22:07:31.397456 67510 RaftPart.cpp:648] [Port: 9780, Space: 4639, Part: 29] The appendLog buffer is full. Please slow down the log appending rate.replicatingLogs_ :100:
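The "Please slow down the log appending rate" warnings suggest the writer is sending requests faster than the raft log can be replicated. One client-side mitigation is to retry transient failures with exponential backoff, as in the sketch below. It is only an illustration: `execute` is a hypothetical callable that runs one statement and returns an error string (empty on success), and treating `E_LEADER_CHANGED`, `E_CONSENSUS_ERROR` and `E_RPC_FAILURE` as retryable is an assumption based on the logs in this thread, not on documented behaviour.

```python
import time
from typing import Callable

# Assumption: these error codes from the logs above are transient.
RETRYABLE = ("E_LEADER_CHANGED", "E_CONSENSUS_ERROR", "E_RPC_FAILURE")


def run_with_backoff(execute: Callable[[str], str],
                     statement: str,
                     max_attempts: int = 5,
                     base_delay: float = 0.5,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Run `statement`, retrying transient errors with doubling delays.

    Returns the number of attempts used. Raises RuntimeError on a
    non-retryable error or when the attempts are exhausted.
    """
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        error = execute(statement)
        if not error:
            return attempt
        retryable = any(code in error for code in RETRYABLE)
        if not retryable or attempt == max_attempts:
            raise RuntimeError(f"statement failed after {attempt} attempt(s): {error}")
        sleep(delay)   # back off so the storage side can catch up
        delay *= 2     # exponential backoff
    raise RuntimeError("max_attempts must be at least 1")
```

A syntax error, such as the `syntax error near null` entry in the graph log, is not retried because repeating the same statement would fail the same way.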

Wait until compaction has finished before importing the data.

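But this compaction is automatic. I did not trigger it manually, so I don't know how long it takes. Or should I trigger it manually once, check the job status, and import the new data after the job finishes?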

Generally, you should turn off auto compaction before importing data, and run a manual compaction once the import is done.
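The "manual compaction, then continue" step can be scripted by submitting the job and polling its status. The sketch below assumes two hypothetical callables supplied by the caller: `submit_compact`, which would wrap the nGQL statement `SUBMIT JOB COMPACT` and return the job ID, and `job_status`, which would wrap `SHOW JOB <job_id>` and return the status string. The status names used (`FINISHED`, `FAILED`, `STOPPED`) should be checked against the output of your own cluster.

```python
import time
from typing import Callable

FAILED_STATES = {"FAILED", "STOPPED"}


def compact_and_wait(submit_compact: Callable[[], int],
                     job_status: Callable[[int], str],
                     poll_interval: float = 10.0,
                     timeout: float = 3600.0,
                     sleep: Callable[[float], None] = time.sleep) -> int:
    """Submit a compaction job and block until it reports FINISHED.

    Returns the job ID. Raises RuntimeError if the job fails or is
    stopped, and TimeoutError if it does not finish within `timeout`.
    """
    job_id = submit_compact()
    deadline = time.monotonic() + timeout
    while True:
        status = job_status(job_id)
        if status == "FINISHED":
            return job_id
        if status in FAILED_STATES:
            raise RuntimeError(f"compaction job {job_id} ended as {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"compaction job {job_id} still {status}")
        sleep(poll_interval)
```

Only after this function returns would the next delete or import batch be started, which avoids guessing how long compaction takes.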

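Yes, I turned it off first and turned it back on after the import. I did not run a manual compaction. Isn't automatic compaction enough?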

It seems that automation is not supported at the moment.

A question: after the import finishes and I set disable_auto_compactions: false, does NebulaGraph start compacting immediately?

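Also, at that time queries on other graph spaces kept timing out and could not run, and I could not log in to Studio either. This is fairly serious, so please take a closer look.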

Are all the processes still running?

The processes are all fine; access just times out. It feels like something is blocked.

As for the Studio issue, let's ask 家瑞 to answer it. @steam
