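It worked fine on 2.0.1. After upgrading to 2.5.1, I cleared all the data, and balance leader still doesn't work.
Please post a screenshot of the show hosts output.
Hmm, most likely your configuration is still wrong.
Is this still happening?
Yes, and some other errors have shown up as well: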
E1028 10:34:17.735702 12478 MetaClient.cpp:635] Send request to "172.19.143.225":9559, exceed retry limit
E1028 10:34:17.736052 12437 MetaClient.cpp:65] Heartbeat failed, status:RPC failure in MetaClient: N6apache6thrift9transport19TTransportExceptionE: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E1028 10:35:19.389958 12642 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 16] Receive response about askForVote from "172.19.143.225":9780, error code is E_TERM_OUT_OF_DATE
E1028 10:35:19.390007 12642 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 16] Receive response about askForVote from "172.19.143.226":9780, error code is E_TERM_OUT_OF_DATE
Try stopping everything, then for every storage back up data_path/nebula/spaceId/wal under its data_path (each space has its own), delete it, restart, and see whether that works.
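If it helps, the backup-then-delete step could be scripted roughly like this. It is only a sketch: it assumes the data_path/nebula/spaceId/wal layout mentioned above, that all services are already stopped, and the function name and backup location are made up for illustration. Moving the directory covers both the backup and the delete.

```python
import shutil
from pathlib import Path
from typing import List


def move_wal_dirs(data_path: str, backup_root: str) -> List[str]:
    """Move every <data_path>/nebula/<spaceId>/wal directory under backup_root.

    Hypothetical helper: run it only while all services are stopped.
    Returns the space IDs whose wal directory was moved.
    """
    moved = []
    nebula_dir = Path(data_path) / "nebula"
    for space_dir in sorted(nebula_dir.iterdir()):
        wal_dir = space_dir / "wal"
        if not wal_dir.is_dir():
            continue  # this space has no wal directory, nothing to do
        target = Path(backup_root) / space_dir.name / "wal"
        if target.exists():
            # Refuse to overwrite or merge into an earlier backup.
            raise FileExistsError(f"backup already exists: {target}")
        target.parent.mkdir(parents=True, exist_ok=True)
        # Each space is kept separate in the backup, as in the original layout.
        shutil.move(str(wal_dir), str(target))
        moved.append(space_dir.name)
    return moved
```

Run it once per storage host and per data_path, and keep the backup until the cluster is confirmed healthy.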
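That said, I'd still recommend checking the network first, because there is definitely a network problem. Even if you do the steps above, the problem will remain.
The Heartbeat failed log should not appear at all. Check with netstat or ss.
So you have to go through the configuration and the network. Something is definitely wrong: a port is misconfigured, duplicated, or already in use.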
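Besides netstat or ss, a direct TCP connect from the machine that logs Heartbeat failed shows whether the meta address is reachable at all. A minimal sketch, with a hypothetical helper name; errno 111 (Connection refused) in the log corresponds to the False branch here.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

For the log above, that would be can_connect("172.19.143.225", 9559), run on the storage host that reports the failure. If it returns False while netstat shows the port listening on the meta host, a firewall or a wrong address in the configuration is the more likely cause.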
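Could you list which hidden ports nebula uses? The ports in my configuration files all look fine.
storage occupies four ports in total: the configured port, port + 1, port - 1, and port - 2.
meta occupies the configured port and port + 1, so two, I think.
graph only occupies the configured port.
It should be in the documentation.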
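To check for overlaps, the port rules from the reply above can be written down and applied to your own config. This is a sketch of what was said in this thread, not an official list; the function names are made up, and 9559, 9669, and 9779 in the usage note are the default meta, graph, and storage ports.

```python
from collections import Counter
from typing import Dict, List

# Offsets relative to the configured port, as described in the reply above.
PORT_OFFSETS = {
    "storage": (-2, -1, 0, 1),
    "meta": (0, 1),
    "graph": (0,),
}


def occupied_ports(service: str, port: int) -> List[int]:
    """Ports a service takes, given the port set in its config file."""
    return [port + offset for offset in PORT_OFFSETS[service]]


def find_conflicts(configured: Dict[str, int]) -> List[int]:
    """Ports claimed more than once by services on the same host."""
    counts = Counter(
        p for service, port in configured.items()
        for p in occupied_ports(service, port)
    )
    return sorted(p for p, n in counts.items() if n > 1)
```

With the defaults, occupied_ports("storage", 9779) gives 9777 through 9780, which matches the 9780 seen in the raft logs, and find_conflicts({"meta": 9559, "graph": 9669, "storage": 9779}) comes back empty.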
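I upgraded directly from 2.0.1 to 2.5.1 and didn't change the configuration files at all. balance leader worked before the upgrade. I just checked ports 19559, 19669, 19779, and 19780, and they are fine too.
One storage machine has logs that differ from the other services: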
E1028 10:34:44.355659 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 5] Receive response about askForVote from "172.19.143.226":9780, error code is E_UNKNOWN_PART
E1028 10:34:44.429636 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 9] Receive response about askForVote from "172.19.143.226":9780, error code is E_UNKNOWN_PART
E1028 10:34:44.587910 24823 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 4] Receive response about askForVote from "172.19.143.226":9780, error code is E_UNKNOWN_PART
E1028 10:34:44.886487 24822 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 20] Receive response about askForVote from "172.19.143.226":9780, error code is E_UNKNOWN_PART
E1028 10:34:44.900859 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 1] Receive response about askForVote from "172.19.143.226":9780, error code is E_UNKNOWN_PART
E1028 10:35:05.969920 24823 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 15] Receive response about askForVote from "172.19.143.227":9780, error code is E_UNKNOWN_PART
E1028 10:35:06.330148 24817 Host.cpp:375] [Port: 9780, Space: 90, Part: 5] [Host: 172.19.143.227:9780] Failed to append logs to the host (Err: E_UNKNOWN_PART)
E1028 10:35:06.332677 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 19] Receive response about askForVote from "172.19.143.227":9780, error code is E_UNKNOWN_PART
E1028 10:35:06.446908 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 11] Receive response about askForVote from "172.19.143.227":9780, error code is E_UNKNOWN_PART
E1028 10:35:07.217571 24824 RaftPart.cpp:1118] [Port: 9780, Space: 90, Part: 15] Receive response about askForVote from "172.19.143.227":9780, error code is E_UNKNOWN_PART
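Can you still roll back to 2.0.1? Did post #6 say the configuration was changed? Find the machine that reports Heartbeat failed, then run netstat and grep for the corresponding meta port to see what state the connection is in. The log already says the connection was not established, so fix that first.
The configuration wasn't changed. The log does report errors, but on my side netstat shows everything in the ESTABLISHED state.
Please follow the steps critical 27 gave above and post screenshots of the results - -. Otherwise we're stuck at this point and can't get any further.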