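metad cannot complete leader election

  • Nebula version: 2.0.1

  • Deployment (distributed / standalone / Docker / DBaaS): distributed

  • Production environment: Y

  • Hardware information

    • Disk (SSD recommended): 1.8T
    • CPU and memory: 96C 128G
  • Problem: metad cannot complete leader election

  • With full metad logging enabled, the log shows the following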

I0604 16:43:34.474550 11528 NebulaStore.cpp:393] Add peer "192.168.100.74":45501
I0604 16:43:34.474571 11528 NebulaStore.cpp:393] Add peer "192.168.100.75":45501
I0604 16:43:34.474608 11528 RocksEngine.cpp:155] Get: ^D^@^@^@^A^@^@^@ Not Found
I0604 16:43:34.474620 11528 Part.cpp:50] [Port: 45501, Space: 0, Part: 0] Cannot fetch the last committed log id from the storage engine
I0604 16:43:34.474630 11528 RaftPart.cpp:295] [Port: 45501, Space: 0, Part: 0] There are 2 peer hosts, and total 3 copies. The quorum is 2, as learner 0, lastLogId 0, lastLogTerm 0, committedLogId 0, term 0
I0604 16:43:34.474642 11528 RaftPart.cpp:308] [Port: 45501, Space: 0, Part: 0] Add peer "192.168.100.74":45501
I0604 16:43:34.474660 11528 RaftPart.cpp:308] [Port: 45501, Space: 0, Part: 0] Add peer "192.168.100.75":45501
I0604 16:43:34.474778 11528 NebulaStore.cpp:357] Space 0, part 0 has been added, asLearner 0
I0604 16:43:34.474818 11528 NebulaStore.cpp:68] Register handler...
I0604 16:43:34.474828 11528 MetaDaemon.cpp:99] Waiting for the leader elected...
I0604 16:43:34.474839 11528 MetaDaemon.cpp:112] Leader has not been elected, sleep 1s
I0604 16:43:35.133103 11569 RaftPart.cpp:1043] [Port: 45501, Space: 0, Part: 0] Start leader election, reason: lastMsgDur 659, term 0
I0604 16:43:35.133157 11569 RaftPart.cpp:1165] [Port: 45501, Space: 0, Part: 0] Start leader election...
I0604 16:43:35.133179 11569 RaftPart.cpp:1193] [Port: 45501, Space: 0, Part: 0] Sending out an election request (space = 0, part = 0, term = 1, lastLogId = 0, lastLogTerm = 0, candidateIP = 192.168.100.73, candidatePort = 45501)
I0604 16:43:35.133213 11569 RaftPart.cpp:1213] [Port: 45501, Space: 0, Part: 0] Sending AskForVoteRequest to [Port: 45501, Space: 0, Part: 0] [Host: 192.168.100.74:45501]
I0604 16:43:35.133495 11569 RaftPart.cpp:1213] [Port: 45501, Space: 0, Part: 0] Sending AskForVoteRequest to [Port: 45501, Space: 0, Part: 0] [Host: 192.168.100.75:45501]
I0604 16:43:35.133527 11574 ThriftClientManager.inl:48] There is no existing client to "192.168.100.74":45501, trying to create one
I0604 16:43:35.133548 11569 RaftPart.cpp:1232] [Port: 45501, Space: 0, Part: 0] AskForVoteRequest has been sent to all peers, waiting for responses
I0604 16:43:35.133610 11574 ThriftClientManager.inl:69] Connecting to "192.168.100.74":45501 for 1 times
I0604 16:43:35.134526 11574 ThriftClientManager.inl:48] There is no existing client to "192.168.100.75":45501, trying to create one
I0604 16:43:35.134565 11574 ThriftClientManager.inl:69] Connecting to "192.168.100.75":45501 for 2 times
I0604 16:43:35.134907 11574 AsyncSocket.cpp:2229] AsyncSocket::handleConnect(this=0x7fd959200390, fd=118 host=192.168.100.74:45501) exception: AsyncSocketException: connect failed, type = Socket not open, errno = 113 (No route to host)
I0604 16:43:35.135071 11574 AsyncSocket.cpp:2229] AsyncSocket::handleConnect(this=0x7fd959200710, fd=119 host=192.168.100.75:45501) exception: AsyncSocketException: connect failed, type = Socket not open, errno = 113 (No route to host)
I0604 16:43:35.135136 11574 CollectNSucceeded.inl:63] Set Value [completed=2, total=2, Result list size=0]
I0604 16:43:35.135239 11569 RaftPart.cpp:1239] [Port: 45501, Space: 0, Part: 0] Got AskForVote response back
I0604 16:43:35.135293 11569 RaftPart.cpp:1275] [Port: 45501, Space: 0, Part: 0] No one is elected, continue the election
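
The log reports 3 copies, a quorum of 2, and `Result list size=0`, meaning neither peer answered the vote request. A minimal sketch of the generic Raft majority rule shows why the election cannot finish in that state. It is not Nebula's actual implementation, and the function names `quorum` and `election_succeeds` are hypothetical.

```python
def quorum(total_copies: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_copies // 2 + 1


def election_succeeds(total_copies: int, votes_from_peers: int) -> bool:
    """Assumes the candidate votes for itself, then adds the peer votes."""
    return 1 + votes_from_peers >= quorum(total_copies)


# Values taken from the log above: 3 copies, no peer reply received.
print(quorum(3))                # 2
print(election_succeeds(3, 0))  # False
print(election_succeeds(3, 1))  # True
```

With both peers unreachable the candidate holds only its own vote, so each round ends with "No one is elected" until at least one peer can be reached.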

Connections to the peer metad nodes fail.

Connecting from 192.168.100.73 to port 45501 on 192.168.100.75 fails.
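
To reproduce this check without extra tools, a small TCP probe can separate the common failure modes. This is only a sketch: the helper name `probe_tcp` and its result strings are hypothetical. On Linux, errno 113 (`EHOSTUNREACH`, "No route to host") is what the log shows, and it can come either from a real routing problem or from a firewall rule that rejects the packet.

```python
import errno
import socket


def probe_tcp(host: str, port: int, timeout: float = 3.0) -> str:
    """Try one TCP connection and describe the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer at all, for example packets silently dropped.
        return "timeout"
    except ConnectionRefusedError:
        # Host reachable, but nothing is listening on that port.
        return "refused"
    except OSError as exc:
        if exc.errno == errno.EHOSTUNREACH:
            # Routing problem, or a firewall reject rule.
            return "no route to host"
        return f"failed: {exc}"
```

Running `probe_tcp("192.168.100.75", 45501)` on 192.168.100.73 would be the equivalent of the manual check above. A "refused" result would point at metad not listening, while "no route to host" points at the network path or a firewall.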

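The port listing on 192.168.100.75 shows the following

Is the port bound only on IPv6? I tried several approaches and none of them worked. How should this situation be handled? On the same machine, sshd is bound on both the IPv4 and the IPv6 port.

ifconfig on 192.168.100.75 shows the following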

[root@localhost logs]# ifconfig
em1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.100.75  netmask 255.255.255.0  broadcast 192.168.100.255
        inet6 fe80::2eea:7fff:feed:fa8  prefixlen 64  scopeid 0x20<link>
        ether 2c:ea:7f:ed:0f:a8  txqueuelen 1000  (Ethernet)
        RX packets 521951  bytes 144353534 (137.6 MiB)
        RX errors 0  dropped 37591  overruns 0  frame 0
        TX packets 162825  bytes 16776423 (15.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 32

em2: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 2c:ea:7f:ed:0f:a9  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 34

em3: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 2c:ea:7f:ed:0f:aa  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 141

em4: flags=4099<UP,BROADCAST,MULTICAST>  mtu 1500
        ether 2c:ea:7f:ed:0f:ab  txqueuelen 1000  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
        device interrupt 142

lo: flags=73<UP,LOOPBACK,RUNNING>  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10<host>
        loop  txqueuelen 1000  (Local Loopback)
        RX packets 685  bytes 35394 (34.5 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 685  bytes 35394 (34.5 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0
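
On the IPv6-only binding question: on Linux, a listener that appears only as an IPv6 socket is often bound to the wildcard `::` with `IPV6_V6ONLY` turned off, and such a socket also accepts IPv4 connections. Whether metad binds this way is an assumption here, not something the thread confirms. The sketch below (hypothetical function name) only demonstrates the general socket behaviour on the local machine.

```python
import socket


def ipv6_wildcard_accepts_ipv4() -> bool:
    """Listen on [::] with IPV6_V6ONLY off, then connect to it over IPv4."""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as srv:
            # 0 means dual-stack: IPv4 clients appear as ::ffff:a.b.c.d
            srv.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
            srv.bind(("::", 0))  # port 0 lets the kernel pick a free port
            srv.listen(1)
            port = srv.getsockname()[1]
            with socket.create_connection(("127.0.0.1", port), timeout=2):
                return True
    except OSError:
        # IPv6 disabled, or dual-stack not permitted on this host.
        return False


print(ipv6_wildcard_accepts_ipv4())
```

If this prints `True` on the affected host, an IPv6-only entry in the port listing does not by itself explain why IPv4 peers cannot connect, and the network path or firewall is the next thing to check.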

Found the problem: it was a firewall configuration issue. It is best to simply turn the firewall off.
