"The leader has changed" error does not recover automatically

Background

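nebula version: v2.0.1
Deployment (distributed / standalone / Docker / DBaaS): distributed, 3 nodes, 3 replicas
Production environment: Y
Per-machine configuration: 32 cores + 128 GB memory + 2 × 2 TB SSD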

Problem description

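1. Commands run from nebula-console against one of the graphd instances fail (the same commands return results normally on another graphd).
The output is shown below; retrying after waiting a while does not help.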

(xxx) [relations]> show hosts
+----------------+-------+----------+--------------+-------------------------+--------------------------+
| Host           | Port  | Status   | Leader count | Leader distribution     | Partition distribution   |
+----------------+-------+----------+--------------+-------------------------+--------------------------+
| "XXXXXX.11" | 44500 | "ONLINE" | 33           | "relations:33" | "relations:96"  |
+----------------+-------+----------+--------------+-------------------------+--------------------------+
| "XXXXXX.12" | 44500 | "ONLINE" | 32           | "relations:32" | "relations:96"  |
+----------------+-------+----------+--------------+-------------------------+--------------------------+
| "XXXXXX.13" | 44500 | "ONLINE" | 33           | "relations:33" | "relations:96"  |
+----------------+-------+----------+--------------+-------------------------+--------------------------+
| "Total"        |       |          | 98           | "relations:98" | "relations:288" |
+----------------+-------+----------+--------------+-------------------------+--------------------------+
Got 4 rows (time spent 1082/6917 us)

Wed, 30 Jun 2021 11:11:59 CST

(xxx) [relations]> PROFILE MATCH (v)--(v2)--(v3) where id(v) == 'effd57f13d22eec8b5554d1b51a7059d' and v.attr!='1' and v2.attr!='1' and v3.attr!='1' RETURN v.key,v.tag_type,v2.key,v2.tag_type,v3.key,v3.tag_type;
[ERROR (-8)]: Storage Error: The leader has changed. Try again later
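
One quick sanity check on the `show hosts` output above is to compare the total leader count with the number of partitions, since each partition should have exactly one leader at a time. The following is a minimal Python sketch, not part of the original post. It assumes `relations` is the only space, that every partition has a replica on all three hosts, and that the table rows have the layout pasted above; the helper names `parse_rows` and `leader_summary` are made up for illustration.

```python
# Rows copied from the `show hosts` output above (separator lines omitted).
SHOW_HOSTS = '''
| "XXXXXX.11" | 44500 | "ONLINE" | 33           | "relations:33" | "relations:96"  |
| "XXXXXX.12" | 44500 | "ONLINE" | 32           | "relations:32" | "relations:96"  |
| "XXXXXX.13" | 44500 | "ONLINE" | 33           | "relations:33" | "relations:96"  |
| "Total"        |       |          | 98           | "relations:98" | "relations:288" |
'''


def parse_rows(text):
    """Split each '|'-delimited table row into stripped, unquoted cells."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("|"):
            continue
        cells = [c.strip().strip('"') for c in line.strip().strip("|").split("|")]
        rows.append(cells)
    return rows


def leader_summary(text, replica_factor=3):
    """Return (total leaders, partition count) from the per-host rows.

    Assumption: a single space, so the "Partition distribution" cell
    looks like "name:count" and every partition has `replica_factor` replicas.
    """
    hosts = [r for r in parse_rows(text) if r[0] != "Total"]
    leaders = sum(int(r[3]) for r in hosts)
    replicas = sum(int(r[5].split(":")[1]) for r in hosts)
    return leaders, replicas // replica_factor


leaders, partitions = leader_summary(SHOW_HOSTS)
print(f"leaders={leaders}, partitions={partitions}")
```

Under those assumptions, a leader total larger than the partition count would suggest that the leader information reported for some partitions is stale, which may be worth mentioning when asking for help.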

storaged error log

E0630 11:24:20.007037 17369 FileBasedWal.cpp:503] [Port: 44501, Space: 9, Part: 31] There is a gap in the log id. The last log id is 15359309, and the id being appended is 15359309
E0630 11:24:20.008132 17369 FileBasedWal.cpp:583] [Port: 44501, Space: 9, Part: 31] Failed to append log for logId 15359309
E0630 11:24:20.008476 17369 RaftPart.cpp:757] [Port: 44501, Space: 9, Part: 31] Failed to write into WAL
E0630 11:24:20.008602 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.008764 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.009063 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.009443 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.009670 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.009795 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.010216 17369 RaftPart.cpp:771] [Port: 44501, Space: 9, Part: 31] Failed append logs
E0630 11:24:20.010630 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.021241 17381 FileBasedWal.cpp:503] [Port: 44501, Space: 9, Part: 31] There is a gap in the log id. The last log id is 15359309, and the id being appended is 15359309
E0630 11:24:20.021368 17381 FileBasedWal.cpp:583] [Port: 44501, Space: 9, Part: 31] Failed to append log for logId 15359309
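
To see which partition is affected and how the rejected log id relates to the last one, log lines like the ones above can be summarized with a short script. This is a minimal Python sketch, not a Nebula tool: the regular expressions only assume the line layout visible in the excerpt, and the function names `errors_per_part` and `append_offsets` are hypothetical.

```python
import re
from collections import Counter

# A few lines copied verbatim from the storaged log above.
LOG = """\
E0630 11:24:20.008132 17369 FileBasedWal.cpp:583] [Port: 44501, Space: 9, Part: 31] Failed to append log for logId 15359309
E0630 11:24:20.008602 17369 Part.cpp:447] [Port: 44501, Space: 9, Part: 31] Consensus error -6
E0630 11:24:20.021241 17381 FileBasedWal.cpp:503] [Port: 44501, Space: 9, Part: 31] There is a gap in the log id. The last log id is 15359309, and the id being appended is 15359309
"""

# Matches the "[Port: ..., Space: ..., Part: ...]" prefix of each message.
LINE_RE = re.compile(r"\[Port: (\d+), Space: (\d+), Part: (\d+)\] (.*)")
# Matches the two ids reported in the "gap in the log id" message.
GAP_RE = re.compile(r"The last log id is (\d+), and the id being appended is (\d+)")


def errors_per_part(log):
    """Count log lines per (space id, part id)."""
    counts = Counter()
    for line in log.splitlines():
        m = LINE_RE.search(line)
        if m:
            counts[(int(m.group(2)), int(m.group(3)))] += 1
    return counts


def append_offsets(log):
    """For each 'gap' message, return (appended id - last id)."""
    return [int(appended) - int(last) for last, appended in GAP_RE.findall(log)]


print(errors_per_part(LOG))
print(append_offsets(LOG))
```

In the excerpt the two ids are equal, so the offset is 0. If the WAL expects each appended id to be the last id plus one (an assumption about its behavior, not something stated in this thread), the message would mean the part is being asked to append an id it already holds rather than one that is too far ahead.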

Could you post the INFO log for this part? What is the current state of the problem?

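After restarting the graph and storage processes, the cluster has recovered.
In the raft protocol used by the nebula storaged layer, how long does the election for one term take?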
