nebula-console connection failure

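  • nebula version: 3.1.0
  • operator 1.2.0
  • k8s: QingCloud (I did not hit this problem when deploying on native k8s v1.19.13)
  • Deployment: distributed
  • Installation method: helm
  • Production environment: Y
  • Hardware information
    • Disk (SSD recommended)
    • CPU and memory information
  • Detailed description of the problem
    Manually tagging the experts @kevin.qiao @wey
    It looks like resolving the address through the service name is failing,
    but I started a separate centos7 pod and pinged the name from there, and it resolves fine.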
[root@centos7-v1-68cff9b955-t8wjc /]# ping ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local
PING ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140) 56(84) bytes of data.
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=1 ttl=62 time=0.544 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=2 ttl=62 time=0.235 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=3 ttl=62 time=0.316 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=4 ttl=62 time=0.303 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=5 ttl=62 time=0.414 ms
/ # nebula-console -addr nebula-test3-cluster-nebula-cluster-graphd-svc -port 9669 -u root -p nebula
2022/07/25 13:11:42 Fail to create a new session from connection pool, fail to authenticate, error: Create session failed: LeaderChanged: Leader changed!
panic: Fail to create a new session from connection pool, fail to authenticate, error: Create session failed: LeaderChanged: Leader changed!

goroutine 1 [running]:
log.Panicf(0x7fb106, 0x35, 0xc0000b3e58, 0x1, 0x1)
        /usr/local/go/src/log/log.go:345 +0xc0
main.main()
        /usr/src/main.go:541 +0x985

Additional information: meta0 node logs
ERROR

r-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:04.755223    72 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:04.777478    72 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:06.255128    73 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:06.263564    73 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:06.826195    74 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:06.839129    74 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:08.312213    75 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:08.318936    75 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:08.899307    60 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:08.907496    60 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:10.360368    61 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 01:33:10.374974    61 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2

INFO

I20220726 01:35:17.236651   125 HBProcessor.cpp:40] Machine "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779 is not registed
I20220726 01:35:31.367146   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 01:35:35.053288   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:15:11.444511   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:15:31.520273   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:16:26.933535   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:25:13.013173   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:25:36.381052   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:27:00.778218   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:27:06.821228   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:30:37.899597   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:30:41.608166   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:39:22.216208   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:41:33.884649   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:42:44.569519   125 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 02:42:44.571643   125 HBProcessor.cpp:40] Machine "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779 is not registed
I20220726 02:43:12.492338   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:53:14.881646   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:53:23.499811   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 03:00:43.565696   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 03:00:46.803411   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 03:20:04.885737   125 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED

Pinging from inside the node works:

[root@ai-plat-test-nebula-cluster-metad-0 nebula]# ping ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local
PING ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140) 56(84) bytes of data.
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=1 ttl=62 time=1.71 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=2 ttl=62 time=0.283 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=3 ttl=62 time=0.345 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=4 ttl=62 time=0.361 ms
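
The ping above only shows that the name resolves at the moment the command runs. The "Name or service not known (error=-2)" text in the metad log matches glibc's message for a getaddrinfo failure (EAI_NONAME), so a closer reproduction is to call getaddrinfo directly and repeat it over time. A minimal Python sketch, assuming Python is available in a debug pod; `check_peers` and its injectable `resolver` parameter are hypothetical names for illustration, not part of Nebula:

```python
import socket


def check_peers(hosts, port, resolver=socket.getaddrinfo):
    """Resolve each host once; map host -> sorted IP list, or the error text."""
    results = {}
    for host in hosts:
        try:
            infos = resolver(host, port, type=socket.SOCK_STREAM)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the resolved IP address.
            results[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror as exc:
            # Same failure class that the metad log reports as error=-2.
            results[host] = f"unresolved: {exc}"
    return results


# Example call (not executed here), using the names from the logs above:
#   suffix = ("ai-plat-test-nebula-cluster-metad-headless"
#             ".stg-ai-platform.svc.cluster.local")
#   peers = [f"ai-plat-test-nebula-cluster-metad-{i}.{suffix}" for i in range(3)]
#   print(check_peers(peers, 9560))
```

Running the check in a loop while a metad pod restarts would show whether the name briefly stops resolving, which a one-off ping cannot reveal.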

Additional information: meta1 node logs
ERROR

[root@ai-plat-test-nebula-cluster-metad-1 nebula]# tail -fn 2000 logs/nebula-metad.ERROR
Log file created at: 2022/07/26 02:41:16
Running on machine: ai-plat-test-nebula-cluster-metad-1
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
E20220726 02:41:16.079069     1 FileUtils.cpp:377] Failed to read the directory "/usr/local/nebula/data/meta/nebula" (2): No such file or directory

INFO

Log file created at: 2022/07/26 02:41:16
Running on machine: ai-plat-test-nebula-cluster-metad-1
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
I20220726 02:41:16.012902     1 MetaDaemon.cpp:135] localhost = "ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9559
I20220726 02:41:16.023653     1 NebulaStore.cpp:51] Start the raft service...
I20220726 02:41:16.025631     1 NebulaSnapshotManager.cpp:25] Send snapshot is rate limited to 10485760 for each part by default
I20220726 02:41:16.074375     1 RaftexService.cpp:46] Start raft service on 9560
I20220726 02:41:16.077728     1 NebulaStore.cpp:85] Scan the local path, and init the spaces_
E20220726 02:41:16.079069     1 FileUtils.cpp:377] Failed to read the directory "/usr/local/nebula/data/meta/nebula" (2): No such file or directory
I20220726 02:41:16.091737     1 NebulaStore.cpp:271] Init data from partManager for "ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9559
I20220726 02:41:16.091754     1 NebulaStore.cpp:387] Create data space 0
I20220726 02:41:16.215380     1 RocksEngine.cpp:97] open rocksdb on /usr/local/nebula/data/meta/nebula/0/data
I20220726 02:41:16.236122     1 NebulaStore.cpp:459] Space 0, part 0 has been added, asLearner 0
I20220726 02:41:16.236162     1 NebulaStore.cpp:78] Register handler...
I20220726 02:41:16.236167     1 MetaDaemonInit.cpp:101] Waiting for the leader elected...
I20220726 02:41:16.236172     1 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
I20220726 02:41:16.484977    60 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-0.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.101.40.32":9560
I20220726 02:41:16.488499    60 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.74.36.214":9560
I20220726 02:41:17.236260     1 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
I20220726 02:41:18.236408     1 KVBasedClusterIdMan.h:109] There is no clusterId existed in kvstore!
I20220726 02:41:18.236443     1 MetaDaemonInit.cpp:129] I am follower, wait for the leader's clusterId
I20220726 02:41:18.236446     1 MetaDaemonInit.cpp:131] Waiting for the leader's clusterId
I20220726 02:41:19.236593     1 KVBasedClusterIdMan.h:109] There is no clusterId existed in kvstore!
I20220726 02:41:19.237990     1 MetaDaemonInit.cpp:131] Waiting for the leader's clusterId
I20220726 02:41:20.238178     1 MetaDaemonInit.cpp:140] Get meta version is 3
I20220726 02:41:20.238209     1 MetaDaemonInit.cpp:157] Nebula store init succeeded, clusterId 5012000189760346409
I20220726 02:41:20.238214     1 MetaDaemon.cpp:148] Start http service
I20220726 02:41:20.238947     1 MetaDaemonInit.cpp:162] Starting Meta HTTP Service
I20220726 02:41:20.241266    99 WebService.cpp:124] Web service started on HTTP[19559]
I20220726 02:41:20.241307     1 JobManager.cpp:69] Not leader, skip reading remaining jobs
I20220726 02:41:20.241415     1 JobManager.cpp:56] JobManager initialized
I20220726 02:41:20.241426   105 JobManager.cpp:119] JobManager::scheduleThread enter
I20220726 02:41:20.244334     1 MetaDaemon.cpp:213] The meta daemon start on "ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9559
I20220726 02:43:02.105170   142 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:52:37.946771   142 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 02:55:20.444234   142 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 03:20:46.410261   142 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED

Additional information: meta2 node logs

ERROR

r-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:10.823745    64 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:11.591990    65 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:11.603370    66 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:13.645320    67 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:13.648716    68 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:14.370640    69 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:15.725541    71 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:15.729526    70 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:16.751642    72 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:17.748519    74 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:19.660256    75 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
E20220726 02:41:20.378620    61 ThriftClientManager-inl.h:70] Failed to resolve address for 'ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local': Name or service not known (error=-2): Unknown error -2
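
To tell whether these resolution failures are ongoing or confined to a short window, the ERROR lines can be grouped by peer, using the line format printed in the log header (`[IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg`). A minimal Python sketch; the function name `resolve_failure_windows` is hypothetical, and the regular expressions are written only against the lines shown in this thread:

```python
import re
from datetime import datetime

# glog line: level letter, yyyymmdd, hh:mm:ss.uuuuuu, thread id, file:line] message
LINE_RE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<thread>\d+) (?P<src>[^\]]+)\] (?P<msg>.*)$"
)
RESOLVE_RE = re.compile(r"Failed to resolve address for '(?P<host>[^']+)'")


def resolve_failure_windows(lines):
    """Return {host: (first_seen, last_seen, count)} for resolve failures."""
    windows = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip file headers and truncated lines
        hit = RESOLVE_RE.search(m.group("msg"))
        if not hit:
            continue  # a well-formed line, but not a resolve failure
        ts = datetime.strptime(
            m.group("date") + " " + m.group("time"), "%Y%m%d %H:%M:%S.%f"
        )
        host = hit.group("host")
        first, last, count = windows.get(host, (ts, ts, 0))
        windows[host] = (min(first, ts), max(last, ts), count + 1)
    return windows
```

In the meta2 excerpt above, the complete entries for metad-1 run from 02:41:10 to 02:41:20, and metad-1's own INFO log reports that it started at 02:41:16, so a short window like this may simply reflect a peer pod restarting rather than a persistent DNS fault. That reading is an inference from the excerpts, not something the logs state.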

INFO

I20220726 03:26:14.160804   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:14.608219   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:15.162343   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:15.609239   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:16.163241   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:16.610944   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:19.401046   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:20.403025   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:21.405977   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:22.144124   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:22.409160   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:23.158027   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:23.271517   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-0.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:24.163991   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:24.276914   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-0.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220726 03:26:25.165108   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220726 03:26:25.281967   142 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-0.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE

Pinging from inside the node works:

[root@ai-plat-test-nebula-cluster-metad-2 nebula]# ping ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local
PING ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140) 56(84) bytes of data.
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=1 ttl=62 time=0.836 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=2 ttl=62 time=0.268 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=3 ttl=62 time=0.391 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=4 ttl=62 time=0.427 ms
64 bytes from ai-plat-test-nebula-cluster-metad-1.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local (100.87.204.140): icmp_seq=5 ttl=62 time=0.310 ms

Additional information: the logs on the storage0, storage1, and storage2 nodes all look like this

E20220726 03:21:23.243811    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:21:36.695433    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:21:51.156972    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:22:06.634219    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:22:21.619824    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:22:35.781181    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:22:52.801788    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:23:06.061062    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:23:20.419384    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:23:35.361099    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:23:51.154027    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:24:06.662156    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:24:21.914187    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:24:40.798929    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:24:57.410149    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:25:12.742134    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:25:26.783438    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:25:44.087163    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:25:57.596655    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:26:13.247774    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:26:26.277366    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:26:39.296201    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:05.245218    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:20.795356    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:35.933255    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:53.742290    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:28:07.883606    66 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!

Additional information: the logs on the graphd0 and graphd1 nodes look like the following

E20220726 03:26:38.169318    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:26:59.774423    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:13.086678    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:26.399684    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:39.635113    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:27:55.847878    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:28:09.062281    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:28:30.510973    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:28:43.550163    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:29:00.206394    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:29:13.244976    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:29:26.260198    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:29:39.280460    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:29:52.295989    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:30:05.332660    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:30:20.650094    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:30:34.972697    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:30:51.766494    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:31:04.793565    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:31:19.249075    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:31:32.600176    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:31:46.725164    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220726 03:32:00.107298    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!

Could it be that the domain suffix of the FQDN was left unchanged? The default is cluster.local.

Have you changed kubernetesClusterDomain: "cluster.local" to your actual suffix?

The actual suffix is cluster.local.
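
One way to double-check the suffix from inside a pod is to compare the FQDN in the logs with the search domains in the pod's /etc/resolv.conf. With the default DNS policy these normally take the form `<namespace>.svc.<cluster-domain> svc.<cluster-domain> <cluster-domain>`. A minimal Python sketch under that assumption; both function names are hypothetical helpers, not part of Nebula or the operator:

```python
def pod_fqdn(pod, service, namespace, cluster_domain="cluster.local"):
    """FQDN of a StatefulSet pod behind a headless service."""
    return f"{pod}.{service}.{namespace}.svc.{cluster_domain}"


def cluster_domain_from_resolv_conf(text):
    """Return the cluster domain implied by the 'search' line, or None."""
    for line in text.splitlines():
        fields = line.split()
        if fields and fields[0] == "search":
            for domain in fields[1:]:
                # The entry of the form svc.<cluster-domain> carries the suffix.
                if domain.startswith("svc."):
                    return domain[len("svc."):].rstrip(".")
    return None
```

If the value derived from resolv.conf matches the suffix used in the metad peer addresses, a mismatched kubernetesClusterDomain is unlikely to be the cause.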

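I have added some more information above. The error says the name cannot be resolved, but when I ping from inside the node it resolves fine.

Oh, after reading your logs carefully, this is not a DNS problem; the cluster itself just keeps reporting leader changes. Is this a brand-new cluster? Can you recreate it?

Yes, it can be recreated, and it is brand new, but I have already tried several times. I wonder whether it is a cluster performance problem: startup is slow, so the services come up out of sync with each other.

@kevin.qiao Could you check whether there is anything special about QingCloud's k8s?

Could you run kubectl get pods -n stg-ai-platform and take a look? Are all the services up?

Yesterday I restarted the meta nodes and waited a bit longer, and then it worked, so I kept observing. This morning it is failing again: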

/ # nebula-console -addr ai-plat-test-nebula-cluster-graphd-svc -port 9669 -u root -p nebula
2022/07/27 01:23:54 Fail to create a new session from connection pool, fail to authenticate, error: Create session failed: LeaderChanged: Leader changed!
panic: Fail to create a new session from connection pool, fail to authenticate, error: Create session failed: LeaderChanged: Leader changed!

goroutine 1 [running]:
log.Panicf(0x7fb106, 0x35, 0xc00008be58, 0x1, 0x1)
        /usr/local/go/src/log/log.go:345 +0xc0
main.main()
        /usr/src/main.go:541 +0x985

@kevin.qiao

meta1 node logs from this morning

-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.74.36.213":9560
I20220726 20:52:46.600610   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:19:00.676297   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:19:03.741428   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:22:03.085753   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:26:34.531612   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:35:49.556167   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:51:15.421535   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:51:18.669797   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:05:47.076761   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:18:19.376209    74 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-0.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.68.231.4":9560
I20220726 22:18:21.676909    74 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.74.36.213":9560
I20220726 22:20:08.241590   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:20:30.300455   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:35:38.548537   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:49:47.832068   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:49:49.570137   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:50:46.168562   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:59:24.443636   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:01:18.091946    75 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-0.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.68.231.4":9560
I20220726 23:01:20.607277    75 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.74.36.213":9560
I20220726 23:21:19.161077   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:30:20.573021   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:35:30.437738   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:48:18.237913   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 00:22:26.261052   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 00:30:17.868191    60 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-0.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.68.231.4":9560
I20220727 00:30:20.649178    60 ThriftClientManager-inl.h:67] resolve "ai-plat-test-nebula-cluster-metad-2.ai-plat-test-nebula-cluster-metad-headless.stg-ai-platform.svc.cluster.local":9560 as "100.74.36.213":9560
I20220727 00:46:46.295497   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 00:46:48.916327   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:03:10.139436   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:05:15.894284   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED

meta2 node logs

I20220726 17:45:07.137455   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 17:49:59.061681   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 17:57:15.257186   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 17:57:18.203634   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 18:06:28.157634   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 18:20:55.564607   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 18:55:45.346731   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:06:20.623407   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:22:48.581584   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:25:00.351994   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:36:11.657853   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:36:14.653427   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:47:19.388726   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 19:53:45.073022   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:06:06.080439   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:23:30.257988   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:36:00.586833   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:36:03.573624   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:52:48.752642   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 20:55:38.052726   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:05:55.116792   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:05:58.056350   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:22:00.263043   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:35:52.425391   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:49:34.462785   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 21:57:31.046119   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:35:41.510743   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 22:50:41.971894   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:05:33.059928   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:21:15.755441   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220726 23:51:52.296730   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 00:05:24.991441   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:16:00.457886   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:16:03.258361   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:23:30.857805   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED
I20220727 01:23:33.545468   143 ListHostsProcessor.cpp:343] List Hosts Failed, error E_LEADER_CHANGED

meta0 node log

lat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:50.159091   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-0.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:51.057163   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:52.059600   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:52.100401   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:52.680024   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:53.061601   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:53.064487   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:53.102466   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:53.682037   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:54.063131   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-1.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:54.066514   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:54.104439   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:54.683459   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:55.068984   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAGE
I20220727 01:28:55.105429   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-1.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:55.685506   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-graphd-0.ai-plat-test-nebula-cluster-graphd-svc.stg-ai-platform.svc.cluster.local":9669, role = GRAPH
I20220727 01:28:56.071555   121 HBProcessor.cpp:33] Receive heartbeat from "ai-plat-test-nebula-cluster-storaged-2.ai-plat-test-nebula-cluster-storaged-headless.stg-ai-platform.svc.cluster.local":9779, role = STORAG

graph node log

E20220727 01:26:05.890628    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
I20220727 01:26:14.672483    30 GraphService.cpp:68] Authenticating user root from 100.74.36.193:60098
E20220727 01:26:17.680052    30 GraphSessionManager.cpp:133] Create session failed:LeaderChanged: Leader changed!
E20220727 01:26:17.680143    30 GraphService.cpp:97] Create session for userName: root, ip: 100.74.36.193 failed: Create session failed: LeaderChanged: Leader changed!
E20220727 01:26:18.902870    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:26:31.919932    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:26:44.936885    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:26:57.953989    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:27:10.970829    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:27:23.988610    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:27:37.017784    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:27:50.046165    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:28:03.060753    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:28:16.074229    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:28:29.091440    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:28:42.106809    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:28:55.123025    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:29:08.137410    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:29:21.152701    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:29:34.168135    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:29:47.184506    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:30:00.194479    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
I20220727 01:30:11.635766    35 GraphService.cpp:68] Authenticating user root from 100.74.36.193:37314
E20220727 01:30:13.213289    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:30:14.641866    35 GraphSessionManager.cpp:133] Create session failed:LeaderChanged: Leader changed!
E20220727 01:30:14.642002    35 GraphService.cpp:97] Create session for userName: root, ip: 100.74.36.193 failed: Create session failed: LeaderChanged: Leader changed!
E20220727 01:30:26.229909    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:30:39.241982    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:30:52.261052    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!
E20220727 01:31:05.271106    73 MetaClient.cpp:178] Heartbeat failed, status:LeaderChanged: Leader changed!

It now looks like meta0 has become the leader and is receiving heartbeats, but the graph node still reports "Heartbeat failed".
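To make the comparison between the meta and graph logs less manual, here is a minimal sketch of a script that parses these log lines and tallies them. It assumes the standard glog line layout seen in the excerpts above (severity letter, date, time, thread id, `file:line]`, message). The function names `parse_glog_line`, `summarize`, and `heartbeat_failure_gaps` are hypothetical helpers written for this post, not part of any Nebula tooling.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed glog layout: <severity><YYYYMMDD> <HH:MM:SS.ffffff> <tid> <file>:<line>] <message>
GLOG_RE = re.compile(
    r"^([IWEF])(\d{8} \d{2}:\d{2}:\d{2}\.\d{6})\s+(\d+)\s+([^\s:]+):(\d+)\]\s?(.*)$"
)


def parse_glog_line(line):
    """Return a dict for a well-formed glog line, or None for truncated/other lines."""
    m = GLOG_RE.match(line.strip())
    if m is None:
        return None
    severity, stamp, tid, source, lineno, message = m.groups()
    return {
        "severity": severity,
        "time": datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f"),
        "thread": int(tid),
        "source": source,
        "line": int(lineno),
        "message": message,
    }


def summarize(lines):
    """Count parsed lines per (severity, source file); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        rec = parse_glog_line(line)
        if rec is not None:
            counts[(rec["severity"], rec["source"])] += 1
    return counts


def heartbeat_failure_gaps(lines):
    """Seconds between consecutive 'Heartbeat failed' messages, in input order."""
    times = []
    for line in lines:
        rec = parse_glog_line(line)
        if rec is not None and "Heartbeat failed" in rec["message"]:
            times.append(rec["time"])
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]
```

Feeding it the graph node log (for example `summarize(open(path))`, where `path` is wherever the log was saved) gives a count per source file, and `heartbeat_failure_gaps` shows how regularly the failures repeat. Running the same `summarize` on each metad log makes it easier to see which meta node is logging `HBProcessor.cpp` heartbeats and which ones only log `E_LEADER_CHANGED`.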

@kevin.qiao Is this similar to the problem fixed by this PR https://github.com/vesoft-inc/nebula-operator/pull/137 ?