Service automatically becomes exited after starting

  • nebula version: 2.6.1
  • Deployment: single machine
  • Installation method: RPM
  • Production environment: Y

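While connecting with the Python client, I suddenly started getting connection refused errors. After ruling out many possibilities, I finally found that the service was switching to exited on its own after starting.
As shown below, start all launches the services without trouble, but status afterwards shows the metad service as Exited, and after a while graphd becomes Exited as well.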

[root@izhp34ek2t7a6mzihbrgtsz ~]# sudo /usr/local/nebula/scripts/nebula.service -v start all
[INFO] Starting nebula-metad...
[INFO] Done
[ERROR] nebula-graphd already running: 16722
[ERROR] nebula-storaged already running: 6821
[root@izhp34ek2t7a6mzihbrgtsz ~]# sudo /usr/local/nebula/scripts/nebula.service -v start all
[INFO] Starting nebula-metad...
[INFO] Done
[ERROR] nebula-graphd already running: 16722
[ERROR] nebula-storaged already running: 6821
[root@izhp34ek2t7a6mzihbrgtsz ~]# sudo /usr/local/nebula/scripts/nebula.service -v status all
[INFO] nebula-metad(de03025): Exited
[INFO] nebula-graphd(de03025): Running as 16722, Listening on 9669
[INFO] nebula-storaged(de03025): Running as 6821, Listening on 9779
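
A check like the one above can be scripted so that an Exited service is noticed before the client starts failing. The following is only a minimal sketch: it assumes the status lines keep the format shown in this post, and the helper name `exited_services` is hypothetical, not part of NebulaGraph.

```python
import re
from typing import List

# Matches lines such as "[INFO] nebula-metad(de03025): Exited".
# The line format is taken from the status output in this post.
STATUS_RE = re.compile(r"^\[INFO\]\s+(nebula-\w+)\([0-9a-f]+\):\s+(\w+)")


def exited_services(status_output: str) -> List[str]:
    """Return the names of services whose status line reports Exited."""
    exited = []
    for line in status_output.splitlines():
        match = STATUS_RE.match(line.strip())
        if match and match.group(2) == "Exited":
            exited.append(match.group(1))
    return exited


sample = """\
[INFO] nebula-metad(de03025): Exited
[INFO] nebula-graphd(de03025): Running as 16722, Listening on 9669
[INFO] nebula-storaged(de03025): Running as 6821, Listening on 9779
"""
print(exited_services(sample))
```

In practice the text would come from running `nebula.service status all` (for example through `subprocess.run`) instead of the hard-coded sample.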

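Later, after I killed all three processes listed by lsof -i:9559 (one nebula-metad and two nebula-storaged) and started the services again, everything was back to normal. Still, I find it strange that this happened at all, and I am a little worried it will come back.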

[root@izhp34ek2t7a6mzihbrgtsz ~]# lsof -i:9559
COMMAND    PID USER   FD   TYPE DEVICE SIZE/OFF NODE NAME
nebula-me 3907 root  188u  IPv4  33712      0t0  TCP *:9559 (LISTEN)
nebula-me 3907 root  191u  IPv4  34492      0t0  TCP localhost:9559->localhost:59346 (ESTABLISHED)
nebula-me 3907 root  192u  IPv4  35155      0t0  TCP localhost:9559->localhost:59356 (ESTABLISHED)
nebula-me 3907 root  193u  IPv4  34494      0t0  TCP localhost:9559->localhost:59348 (ESTABLISHED)
nebula-me 3907 root  194u  IPv4  34631      0t0  TCP localhost:9559->localhost:59350 (ESTABLISHED)
nebula-me 3907 root  195u  IPv4  35086      0t0  TCP localhost:9559->localhost:59352 (ESTABLISHED)
nebula-me 3907 root  196u  IPv4  35190      0t0  TCP localhost:9559->localhost:59358 (ESTABLISHED)
nebula-me 3907 root  197u  IPv4  35192      0t0  TCP localhost:9559->localhost:59360 (ESTABLISHED)
nebula-me 3907 root  198u  IPv4  35949      0t0  TCP localhost:9559->localhost:59362 (ESTABLISHED)
nebula-me 3907 root  199u  IPv4  35196      0t0  TCP localhost:9559->localhost:59364 (ESTABLISHED)
nebula-me 3907 root  200u  IPv4  35199      0t0  TCP localhost:9559->localhost:59366 (ESTABLISHED)
nebula-me 3907 root  201u  IPv4  35204      0t0  TCP localhost:9559->localhost:59370 (ESTABLISHED)
nebula-me 3907 root  202u  IPv4  35466      0t0  TCP localhost:9559->localhost:59380 (ESTABLISHED)
nebula-me 3907 root  203u  IPv4  35460      0t0  TCP localhost:9559->localhost:59374 (ESTABLISHED)
nebula-me 3907 root  204u  IPv4  35462      0t0  TCP localhost:9559->localhost:59376 (ESTABLISHED)
nebula-me 3907 root  205u  IPv4  35464      0t0  TCP localhost:9559->localhost:59378 (ESTABLISHED)
nebula-me 3907 root  206u  IPv4  36285      0t0  TCP localhost:9559->localhost:59382 (ESTABLISHED)
nebula-me 3907 root  207u  IPv4  47136      0t0  TCP localhost:9559->localhost:59990 (ESTABLISHED)
nebula-me 3907 root  208u  IPv4  46964      0t0  TCP localhost:9559->localhost:60004 (ESTABLISHED)
nebula-me 3907 root  209u  IPv4  46290      0t0  TCP localhost:9559->localhost:59982 (ESTABLISHED)
nebula-me 3907 root  210u  IPv4  46292      0t0  TCP localhost:9559->localhost:59984 (ESTABLISHED)
nebula-me 3907 root  211u  IPv4  46075      0t0  TCP localhost:9559->localhost:59986 (ESTABLISHED)
nebula-me 3907 root  212u  IPv4  47131      0t0  TCP localhost:9559->localhost:59988 (ESTABLISHED)
nebula-me 3907 root  213u  IPv4  47008      0t0  TCP localhost:9559->localhost:60006 (ESTABLISHED)
nebula-me 3907 root  214u  IPv4  47296      0t0  TCP localhost:9559->localhost:59996 (ESTABLISHED)
nebula-me 3907 root  215u  IPv4  46729      0t0  TCP localhost:9559->localhost:60000 (ESTABLISHED)
nebula-me 3907 root  216u  IPv4  46844      0t0  TCP localhost:9559->localhost:60002 (ESTABLISHED)
nebula-me 3907 root  217u  IPv4  47550      0t0  TCP localhost:9559->localhost:60008 (ESTABLISHED)
nebula-me 3907 root  218u  IPv4  47554      0t0  TCP localhost:9559->localhost:60012 (ESTABLISHED)
nebula-me 3907 root  219u  IPv4  47030      0t0  TCP localhost:9559->localhost:60014 (ESTABLISHED)
nebula-me 3907 root  220u  IPv4  47032      0t0  TCP localhost:9559->localhost:60016 (ESTABLISHED)
nebula-me 3907 root  221u  IPv4  47034      0t0  TCP localhost:9559->localhost:60018 (ESTABLISHED)
nebula-me 3907 root  222u  IPv4  47603      0t0  TCP localhost:9559->localhost:60020 (ESTABLISHED)
nebula-st 4171 root    9u  IPv4  34890      0t0  TCP localhost:59346->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   14u  IPv4  34892      0t0  TCP localhost:59348->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   21u  IPv4  35085      0t0  TCP localhost:59350->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   26u  IPv4  34634      0t0  TCP localhost:59352->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   31u  IPv4  34723      0t0  TCP localhost:59356->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   36u  IPv4  35946      0t0  TCP localhost:59358->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   41u  IPv4  35948      0t0  TCP localhost:59360->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   46u  IPv4  35195      0t0  TCP localhost:59362->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   51u  IPv4  35952      0t0  TCP localhost:59364->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   56u  IPv4  35954      0t0  TCP localhost:59366->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   61u  IPv4  35956      0t0  TCP localhost:59370->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   66u  IPv4  36277      0t0  TCP localhost:59374->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   71u  IPv4  36279      0t0  TCP localhost:59376->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   76u  IPv4  36281      0t0  TCP localhost:59378->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   81u  IPv4  36284      0t0  TCP localhost:59380->localhost:9559 (ESTABLISHED)
nebula-st 4171 root   86u  IPv4  35469      0t0  TCP localhost:59382->localhost:9559 (ESTABLISHED)
nebula-st 6821 root    9u  IPv4  45954      0t0  TCP localhost:59982->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   15u  IPv4  45956      0t0  TCP localhost:59984->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   21u  IPv4  46380      0t0  TCP localhost:59986->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   26u  IPv4  46497      0t0  TCP localhost:59988->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   31u  IPv4  46499      0t0  TCP localhost:59990->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   36u  IPv4  46617      0t0  TCP localhost:59996->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   41u  IPv4  47380      0t0  TCP localhost:60000->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   46u  IPv4  47453      0t0  TCP localhost:60002->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   51u  IPv4  47534      0t0  TCP localhost:60004->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   56u  IPv4  47549      0t0  TCP localhost:60006->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   61u  IPv4  47011      0t0  TCP localhost:60008->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   66u  IPv4  47014      0t0  TCP localhost:60012->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   71u  IPv4  47557      0t0  TCP localhost:60014->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   76u  IPv4  47559      0t0  TCP localhost:60016->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   81u  IPv4  47561      0t0  TCP localhost:60018->localhost:9559 (ESTABLISHED)
nebula-st 6821 root   86u  IPv4  47057      0t0  TCP localhost:60020->localhost:9559 (ESTABLISHED)
[root@izhp34ek2t7a6mzihbrgtsz etc]# kill -9 3907
[root@izhp34ek2t7a6mzihbrgtsz etc]# kill -9 4171
[root@izhp34ek2t7a6mzihbrgtsz etc]# kill -9 6821
[root@izhp34ek2t7a6mzihbrgtsz etc]# lsof -i:9559

After starting the services again, the problem was gone.

[root@izhp34ek2t7a6mzihbrgtsz etc]# sudo /usr/local/nebula/scripts/nebula.service -v start all
[INFO] Starting nebula-metad...
[INFO] Done
[INFO] Starting nebula-graphd...
[INFO] Done
[INFO] Starting nebula-storaged...
[INFO] Done
[root@izhp34ek2t7a6mzihbrgtsz etc]# sudo /usr/local/nebula/scripts/nebula.service -v start all
[ERROR] nebula-metad already running: 17478
[ERROR] nebula-graphd already running: 17551
[ERROR] nebula-storaged already running: 17585

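Although the problem is solved, I would really like to know why it happened. Was it caused by a session that was never released, or a connection that was never closed, after something unexpected in the client? I hope someone can clear this up.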

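This shows that the meta service is unhealthy and did not start successfully. graph's heartbeat to meta failed, so it did not start successfully either, which is why the client cannot connect. It has nothing to do with the client itself.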

Please post the service logs.

Log file created at: 2022/08/14 10:42:51
Running on machine: izhp34ek2t7a6mzihbrgtsz
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0814 10:42:51.147168 16910 ThriftServer.cpp:385] Got an exception while setting up the server: failed to bind to async server socket: 0.0.0.0:9560: Address already in use
E0814 10:42:51.147423 16910 RaftexService.cpp:85] Setup the Raftex Service failed, error: failed to bind to async server socket: 0.0.0.0:9560: Address already in use
E0814 10:42:51.155354 16861 NebulaStore.cpp:60] Start the raft service failed
E0814 10:42:51.155366 16861 MetaDaemon.cpp:94] Nebula store init failed
E0814 10:42:51.156683 16861 MetaDaemon.cpp:271] Init kv failed!

This is the error: failed to bind to async server socket: 0.0.0.0:9560: Address already in use
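
To pull this kind of failure out of a longer log, something like the following could be used. It is a rough sketch that assumes the message wording matches the excerpt above; `bind_conflicts` is a hypothetical helper name.

```python
import re

# Wording copied from the metad log excerpt in this thread.
BIND_RE = re.compile(
    r"failed to bind to async server socket: (\S+):(\d+): Address already in use"
)


def bind_conflicts(log_text):
    """Return the sorted, de-duplicated (host, port) pairs that could not be bound."""
    return sorted({(m.group(1), int(m.group(2))) for m in BIND_RE.finditer(log_text)})


log_sample = (
    "E0814 10:42:51.147168 16910 ThriftServer.cpp:385] Got an exception while "
    "setting up the server: failed to bind to async server socket: "
    "0.0.0.0:9560: Address already in use\n"
    "E0814 10:42:51.147423 16910 RaftexService.cpp:85] Setup the Raftex Service "
    "failed, error: failed to bind to async server socket: 0.0.0.0:9560: "
    "Address already in use\n"
)
print(bind_conflicts(log_sample))
```

Both log lines report the same address, so the helper returns it only once.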

The port is already in use. Change the port in the configuration file and restart the service.
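
Before restarting, it can help to confirm whether the ports are actually free. The sketch below simply tries to bind each port with a plain TCP socket; 9559 and 9560 are the ports that appear in this thread, and `port_is_free` is a hypothetical helper, not a NebulaGraph tool. A successful bind here does not guarantee that metad will start, it only rules out the "Address already in use" case at that moment.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically EADDRINUSE: another process still holds the port.
            return False
        return True


# Ports taken from this thread: 9559 from lsof, 9560 from the metad log.
for port in (9559, 9560):
    state = "free" if port_is_free(port) else "in use"
    print(f"port {port}: {state}")
```

If a port is reported as in use, `lsof -i:<port>` (as used earlier in the thread) shows which process is holding it.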

It does look that way. I overcomplicated a simple problem. Thanks for the answer.
