Fails to start normally (Leader has not been elected, sleep 1s)

  • nebula version: nebula-graph-3.1.0.ubuntu2004

  • Deployment method: K8S cluster

  • Installation method: tar

  • Production deployment: Y

  • Hardware info

    • Disk (SSD recommended)
    • CPU and memory info
  • Problem description

    The services fail to start normally: only one port (9560) is opened, and the other ports never come up.

  • Relevant meta / storage / graph logs (text form preferred, for easier searching)

    meta logs

    I20220616 05:44:01.366055    12 MetaDaemonInit.cpp:101] Waiting for the leader elected...
    I20220616 05:44:01.366060    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:02.246410    64 ThriftClientManager-inl.h:67] resolve "nebula-metad-0.nebula-metad-headless":9560 as "100.121.9.99":9560
    E20220616 05:44:02.280918    64 ThriftClientManager-inl.h:70] Failed to resolve address for 'nebula-metad-1.nebula-metad-headless': Name or service not known (error=-2): Unknown error -2
    E20220616 05:44:02.294188    64 ThriftClientManager-inl.h:70] Failed to resolve address for 'nebula-metad-2.nebula-metad-headless': Name or service not known (error=-2): Unknown error -2
    I20220616 05:44:02.366214    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:31.372649    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    E20220616 05:44:31.897184    72 ThriftClientManager-inl.h:70] Failed to resolve address for 'nebula-metad-1.nebula-metad-headless': Name or service not known (error=-2): Unknown error -2
    E20220616 05:44:31.900964    72 ThriftClientManager-inl.h:70] Failed to resolve address for 'nebula-metad-2.nebula-metad-headless': Name or service not known (error=-2): Unknown error -2
    I20220616 05:44:32.372843    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:33.373090    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:33.788409    73 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9560 as "100.121.9.68":9560
    I20220616 05:44:33.790158    73 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9560 as "100.84.112.13":9560
    I20220616 05:44:34.373303    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:34.508486    74 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9560 as "100.121.9.68":9560
    I20220616 05:44:34.509632    74 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9560 as "100.84.112.13":9560
    I20220616 05:44:48.036476    72 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9560 as "100.121.9.68":9560
    I20220616 05:44:48.037859    72 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9560 as "100.84.112.13":9560
    I20220616 05:44:48.376334    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:49.376627    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:50.376904    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:51.377097    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:52.377275    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:53.377452    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:54.377732    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
    I20220616 05:44:55.377936    12 MetaDaemonInit.cpp:113] Leader has not been elected, sleep 1s
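
    The "Name or service not known" errors above mean the headless-Service DNS records for the peer pods do not exist yet. A minimal way to check this directly; a sketch, assuming the pod and Service names from the manifests below (nebula-metad-0..2, nebula-metad-headless, namespace temp):

    ```bash
    # Resolve each metad peer from inside metad pod 0. Early in a StatefulSet
    # rollout the peer pods may not exist yet, so transient failures are normal.
    for i in 0 1 2; do
      kubectl -n temp exec nebula-metad-0 -- \
        getent hosts nebula-metad-$i.nebula-metad-headless \
        || echo "nebula-metad-$i not resolvable yet"
    done
    ```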
    

    graphd logs

Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
I20220616 05:36:22.591248    13 GraphDaemon.cpp:130] Starting Graph HTTP Service
I20220616 05:36:22.596262    15 WebService.cpp:124] Web service started on HTTP[19669]
I20220616 05:36:22.596390    13 GraphDaemon.cpp:144] Number of networking IO threads: 8
I20220616 05:36:22.596415    13 GraphDaemon.cpp:153] Number of worker threads: 8
I20220616 05:36:22.602818    13 MetaClient.cpp:80] Create meta client to "nebula-metad-2.nebula-metad-headless":9559
I20220616 05:36:22.602854    13 MetaClient.cpp:81] root path: /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64, data path size: 0
I20220616 05:36:22.615861    43 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
I20220616 05:36:23.620822    43 ThriftClientManager-inl.h:67] resolve "nebula-metad-0.nebula-metad-headless":9559 as "100.121.9.122":9559
I20220616 05:36:24.624138    43 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9559 as "100.121.9.120":9559
I20220616 05:36:25.628290    43 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9559 as "100.121.9.120":9559
E20220616 05:36:25.628726    43 MetaClient.cpp:744] Send request to "nebula-metad-1.nebula-metad-headless":9559, exceed retry limit
E20220616 05:36:25.629374    43 MetaClient.cpp:745] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20220616 05:36:25.629544    13 MetaClient.cpp:98] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
I20220616 05:36:25.629611    13 MetaClient.cpp:123] Waiting for the metad to be ready!
I20220616 05:36:35.632445    44 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
I20220616 05:36:36.637044    44 ThriftClientManager-inl.h:67] resolve "nebula-metad-0.nebula-metad-headless":9559 as "100.121.9.122":9559
I20220616 05:36:37.645635    44 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9559 as "100.121.9.120":9559
I20220616 05:36:38.648883    44 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
E20220616 05:36:38.649602    44 MetaClient.cpp:744] Send request to "nebula-metad-2.nebula-metad-headless":9559, exceed retry limit
E20220616 05:36:38.649667    44 MetaClient.cpp:745] RpcResponse exception: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20220616 05:36:38.649818    13 MetaClient.cpp:98] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
I20220616 05:36:38.649895    13 MetaClient.cpp:123] Waiting for the metad to be ready!

storage logs

Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
I20220616 05:38:02.190927    12 StorageDaemon.cpp:129] localhost = "100.121.9.113":9779
I20220616 05:38:02.191428    12 StorageDaemon.cpp:144] data path= /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64/data/storage
I20220616 05:38:02.210676    12 MetaClient.cpp:80] Create meta client to "nebula-metad-1.nebula-metad-headless":9559
I20220616 05:38:02.210716    12 MetaClient.cpp:81] root path: /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64, data path size: 1
W20220616 05:38:02.210760    12 FileBasedClusterIdMan.cpp:43] Open file failed, error No such file or directory
I20220616 05:38:02.223978    55 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9559 as "100.121.9.120":9559
I20220616 05:38:03.229784    55 ThriftClientManager-inl.h:67] resolve "nebula-metad-1.nebula-metad-headless":9559 as "100.121.9.120":9559
I20220616 05:38:04.233721    55 ThriftClientManager-inl.h:67] resolve "nebula-metad-0.nebula-metad-headless":9559 as "100.121.9.122":9559
I20220616 05:38:05.237344    55 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
E20220616 05:38:05.237897    55 MetaClient.cpp:744] Send request to "nebula-metad-2.nebula-metad-headless":9559, exceed retry limit
E20220616 05:38:05.238159    55 MetaClient.cpp:745] RpcResponse exception: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20220616 05:38:05.238399    12 MetaClient.cpp:98] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
I20220616 05:38:05.238458    12 MetaClient.cpp:123] Waiting for the metad to be ready!
W20220616 05:38:15.238667    12 FileBasedClusterIdMan.cpp:43] Open file failed, error No such file or directory
I20220616 05:38:15.240800    61 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
I20220616 05:38:16.244614    61 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
I20220616 05:38:17.248571    61 ThriftClientManager-inl.h:67] resolve "nebula-metad-2.nebula-metad-headless":9559 as "100.84.112.12":9559
I20220616 05:38:18.252589    61 ThriftClientManager-inl.h:67] resolve "nebula-metad-0.nebula-metad-headless":9559 as "100.121.9.122":9559
E20220616 05:38:18.253121    61 MetaClient.cpp:744] Send request to "nebula-metad-0.nebula-metad-headless":9559, exceed retry limit
E20220616 05:38:18.253191    61 MetaClient.cpp:745] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20220616 05:38:18.253284    12 MetaClient.cpp:98] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
I20220616 05:38:18.253307    12 MetaClient.cpp:123] Waiting for the metad to be ready!
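
graphd and storaged both fail with errno 111 (Connection refused): the names now resolve, but nothing accepts connections on 9559, which matches "only port 9560 is open" in the description. A quick hedged check, with pod names assumed from the StatefulSets below:

```bash
# Which ports is the metad pod actually listening on?
# (net-tools is installed in the image, so netstat is available)
kubectl -n temp exec nebula-metad-0 -- netstat -tnlp

# Is metad's Thrift port reachable from the graphd pod? (bash /dev/tcp probe)
kubectl -n temp exec nebula-graphd-0 -- \
  bash -c 'echo > /dev/tcp/nebula-metad-0.nebula-metad-headless/9559 && echo "9559 open"'
```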


K8S Version :

[root@test-master01 nebula-graph]# kubectl version
Client Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T18:03:20Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"22", GitVersion:"v1.22.0", GitCommit:"c2b5237ccd9c0f1d600d3072634ca66cefdf272f", GitTreeState:"clean", BuildDate:"2021-08-04T17:57:25Z", GoVersion:"go1.16.6", Compiler:"gc", Platform:"linux/amd64"}

Dockerfile

########################################################################################################################
####### image describe  : ubuntu version 22.10                                                                   #######
####### Build Shell     : docker build -t nebula-graph:3.1.0 .                                                   #######
########################################################################################################################
FROM ubuntu:22.10
ENV TZ=Asia/Shanghai
ENV LANG en_US.utf8

## default start script
#ENV START_SCRIPT /home/nebula/nebula-graph/bin/nebula-metad
## default config file path
#ENV CONFIG_FILE /home/nebula/nebula-graph/etc/nebula-metad.conf
## default log file path
#ENV LOG_FILE /home/nebula/nebula-graph/logs/nebula-metad.INFO

## apt update and install net tools
RUN apt-get update && apt-get install -y vim wget curl telnet iputils-ping net-tools

## create user and user group
RUN groupadd -g 530 nebula
RUN useradd -g 530 -u 530 -m -d /home/nebula -s /bin/bash nebula
# set user password
RUN echo "nebula:nebula" |chpasswd

## Download and unzip the installation package
RUN wget -P /home/nebula https://oss-cdn.nebula-graph.com.cn/package/3.1.0/nebula-graph-3.1.0.ubuntu2004.amd64.tar.gz
RUN tar xvf /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64.tar.gz -C /home/nebula && \
    ln -s /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64 /home/nebula/nebula-graph && \
    rm -rf /home/nebula/nebula-graph-3.1.0.ubuntu2004.amd64.tar.gz 

RUN chmod -R 777 /home/nebula

WORKDIR /home/nebula/nebula-graph
#USER nebula

ENTRYPOINT cp ${CONFIG_FILE:=/home/nebula/nebula-graph/etc/nebula-metad.conf.default} /home/nebula/nebula-graph/etc/nebula.conf && \
	echo "--local_ip=$(awk 'END {print $1}' /etc/hosts)" >> /home/nebula/nebula-graph/etc/nebula.conf && \
	echo "--ws_ip=$(awk 'END {print $1}' /etc/hosts)" >> /home/nebula/nebula-graph/etc/nebula.conf && \
	chmod 777 /home/nebula/nebula-graph/etc/nebula.conf && \
	/home/nebula/nebula-graph/bin/${NEBULA_MODEL:=nebula-metad} --flagfile /home/nebula/nebula-graph/etc/nebula.conf ${NEBULA_POTS:=""} && \
	sleep 3 && \
	tail -f ${LOG_FILE:=/home/nebula/nebula-graph/logs/nebula-*.INFO}
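
The two echo lines derive the pod IP from /etc/hosts: in a Kubernetes pod, the pod's own hostname entry is the last line of that file, and `awk 'END {print $1}'` prints the first field of that last line. A small illustration; the hosts content is a made-up example:

```bash
# Hypothetical /etc/hosts inside a pod with IP 100.121.9.99:
#   127.0.0.1    localhost
#   ...
#   100.121.9.99 nebula-metad-0.nebula-metad-headless.temp.svc.cluster.local nebula-metad-0
awk 'END {print $1}' /etc/hosts   # prints 100.121.9.99
```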

K8S deployment YAML

---
### Application Config MAP
apiVersion: v1
kind: ConfigMap
metadata:
  name: nebula-graph-config
  namespace: temp
data:
  container-post-start.sh: |
    #!/bin/bash
    SRC_FILE_PATH=$1
    NEW_FILE_PATH=$2
    if [ ! -f "$SRC_FILE_PATH" ]; then
      echo 'original file not found'
      exit 0
    fi
    
    if [ -f "$NEW_FILE_PATH" ]; then
      rm -rf $NEW_FILE_PATH
    fi
    
    echo "running as user: $(whoami)"
    cp $SRC_FILE_PATH $NEW_FILE_PATH
    echo "## last update time by:$(date '+%Y-%m-%d %H:%M:%S')" >> $NEW_FILE_PATH
    echo "--local_ip=$(awk 'END {print $1}' /etc/hosts)" >> $NEW_FILE_PATH


  nebula-metad.conf: |
    ########## basics ##########
    # Whether to run as a daemon process
    --daemonize=true
    # The file to host the process id
    --pid_file=pids/nebula-metad.pid
    
    ########## logging ##########
    # The directory to host logging files
    --log_dir=logs
    # Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
    --minloglevel=0
    # Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
    --v=0
    # Maximum seconds to buffer the log messages
    --logbufsecs=0
    # Whether to redirect stdout and stderr to separate output files
    --redirect_stdout=false
    # Destination filename of stdout and stderr, which will also reside in log_dir.
    --stdout_log_file=metad-stdout.log
    --stderr_log_file=metad-stderr.log
    # Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
    --stderrthreshold=2
    # Whether logging files' names contain a timestamp. If using logrotate to rotate logging files, then set this to true.
    --timestamp_in_logfile_name=true
    
    ########## networking ##########
    # Comma separated Meta Server addresses
    --meta_server_addrs=nebula-metad-0.nebula-metad-headless:9559,nebula-metad-1.nebula-metad-headless:9559,nebula-metad-2.nebula-metad-headless:9559
    # Meta daemon listening port
    --port=9559
    # HTTP service ip
    #--ws_ip=0.0.0.0
    # HTTP service port
    --ws_http_port=19559
    # Port to listen on Storage with HTTP protocol, it corresponds to ws_http_port in storage's configuration file
    --ws_storage_http_port=19779
    
    ########## storage ##########
    # Root data path, here should be only single path for metad
    --data_path=/home/nebula/nebula-graph/nebula-metad/data
    
    ########## Misc #########
    # The default number of parts when a space is created
    --default_parts_num=100
    # The default replica factor when a space is created
    --default_replica_factor=1
    
    --heartbeat_interval_secs=10
    --agent_heartbeat_interval_secs=60

  nebula-graphd.conf: |
    ########## basics ##########
    # Whether to run as a daemon process
    --daemonize=true
    # The file to host the process id
    --pid_file=pids/nebula-graphd.pid
    # Whether to enable optimizer
    --enable_optimizer=true
    # The default charset when a space is created
    --default_charset=utf8
    # The default collate when a space is created
    --default_collate=utf8_bin
    # Whether to use the configuration obtained from the configuration file
    --local_config=true
    
    ########## logging ##########
    # The directory to host logging files
    --log_dir=logs
    # Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
    --minloglevel=0
    # Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
    --v=0
    # Maximum seconds to buffer the log messages
    --logbufsecs=0
    # Whether to redirect stdout and stderr to separate output files
    --redirect_stdout=true
    # Destination filename of stdout and stderr, which will also reside in log_dir.
    --stdout_log_file=graphd-stdout.log
    --stderr_log_file=graphd-stderr.log
    # Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
    --stderrthreshold=2
    # Whether logging files' names contain a timestamp.
    --timestamp_in_logfile_name=true
    ########## query ##########
    # Whether to treat partial success as an error.
    # This flag is only used for Read-only access, and Modify access always treats partial success as an error.
    --accept_partial_success=false
    # Maximum sentence length, unit byte
    --max_allowed_query_size=4194304
    
    ########## networking ##########
    # Comma separated Meta Server Addresses
    --meta_server_addrs=nebula-metad-0.nebula-metad-headless:9559,nebula-metad-1.nebula-metad-headless:9559,nebula-metad-2.nebula-metad-headless:9559
    # Local IP used to identify the nebula-graphd process.
    # Change it to an address other than loopback if the service is distributed or
    # will be accessed remotely.
    #--local_ip=127.0.0.1
    # Network device to listen on
    --listen_netdev=any
    # Port to listen on
    --port=9669
    # To turn on SO_REUSEPORT or not
    --reuse_port=false
    # Backlog of the listen socket, adjust this together with net.core.somaxconn
    --listen_backlog=1024
    # The number of seconds Nebula service waits before closing the idle connections
    --client_idle_timeout_secs=28800
    # The number of seconds before idle sessions expire
    # The range should be in [1, 604800]
    --session_idle_timeout_secs=28800
    # The number of threads to accept incoming connections
    --num_accept_threads=1
    # The number of networking IO threads, 0 for # of CPU cores
    --num_netio_threads=0
    # The number of threads to execute user queries, 0 for # of CPU cores
    --num_worker_threads=0
    # HTTP service ip
    #--ws_ip=0.0.0.0
    # HTTP service port
    --ws_http_port=19669
    # storage client timeout
    --storage_client_timeout_ms=60000
    # Port to listen on Meta with HTTP protocol, it corresponds to ws_http_port in metad's configuration file
    --ws_meta_http_port=19559
    
    ########## authentication ##########
    # Enable authorization
    --enable_authorize=false
    # User login authentication type, password for nebula authentication, ldap for ldap authentication, cloud for cloud authentication
    --auth_type=password
    
    ########## memory ##########
    # System memory high watermark ratio, cancel the memory checking when the ratio greater than 1.0
    --system_memory_high_watermark_ratio=0.8
    
    ########## metrics ##########
    --enable_space_level_metrics=false
    
    ########## experimental feature ##########
    # if use experimental features
    --enable_experimental_feature=false



  nebula-storaged.conf: |
    ########## basics ##########
    # Whether to run as a daemon process
    --daemonize=true
    # The file to host the process id
    --pid_file=pids/nebula-storaged.pid
    # Whether to use the configuration obtained from the configuration file
    --local_config=true
    
    ########## logging ##########
    # The directory to host logging files
    --log_dir=logs
    # Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
    --minloglevel=0
    # Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
    --v=0
    # Maximum seconds to buffer the log messages
    --logbufsecs=0
    # Whether to redirect stdout and stderr to separate output files
    --redirect_stdout=true
    # Destination filename of stdout and stderr, which will also reside in log_dir.
    --stdout_log_file=storaged-stdout.log
    --stderr_log_file=storaged-stderr.log
    # Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
    --stderrthreshold=2
    # Whether logging files' names contain a timestamp.
    --timestamp_in_logfile_name=true
    
    ########## networking ##########
    # Comma separated Meta server addresses
    --meta_server_addrs=nebula-metad-0.nebula-metad-headless:9559,nebula-metad-1.nebula-metad-headless:9559,nebula-metad-2.nebula-metad-headless:9559
    # Local IP used to identify the nebula-storaged process.
    # Change it to an address other than loopback if the service is distributed or
    # will be accessed remotely.
    #--local_ip=127.0.0.1
    # Storage daemon listening port
    --port=9779
    # HTTP service ip
    #--ws_ip=0.0.0.0
    # HTTP service port
    --ws_http_port=19779
    # heartbeat with meta service
    --heartbeat_interval_secs=10
    
    ######### Raft #########
    # Raft election timeout
    --raft_heartbeat_interval_secs=30
    # RPC timeout for raft client (ms)
    --raft_rpc_timeout_ms=500
    ## recycle Raft WAL
    --wal_ttl=14400
    
    ########## Disk ##########
    # Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
    # One path per Rocksdb instance.
    --data_path=data/storage
    
    # Minimum reserved bytes of each data path
    --minimum_reserved_bytes=268435456
    
    # The default reserved bytes for one batch operation
    --rocksdb_batch_size=4096
    # The default block cache size used in BlockBasedTable.
    # The unit is MB.
    --rocksdb_block_cache=4
    # The type of storage engine, `rocksdb', `memory', etc.
    --engine_type=rocksdb
    
    # Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
    # For the sake of binary compatibility, the default value is snappy.
    # Recommend to use:
    #   * lz4 to gain more CPU performance, with the same compression ratio with snappy
    #   * zstd to occupy less disk space
    #   * lz4hc for the read-heavy write-light scenario
    --rocksdb_compression=lz4
    
    # Set different compressions for different levels
    # For example, if --rocksdb_compression is snappy,
    # "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
    # In order to disable compression for level 0/1, set it to "no:no"
    --rocksdb_compression_per_level=
    
    # Whether or not to enable rocksdb's statistics, disabled by default
    --enable_rocksdb_statistics=false
    
    # Statslevel used by rocksdb to collection statistics, optional values are
    #   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
    #   * kExceptTimers, Skip timer stats
    #   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
    #   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
    #   * kAll, Collect all stats
    --rocksdb_stats_level=kExceptHistogramOrTimers
    
    # Whether or not to enable rocksdb's prefix bloom filter, enabled by default.
    --enable_rocksdb_prefix_filtering=true
    # Whether or not to enable rocksdb's whole key bloom filter, disabled by default.
    --enable_rocksdb_whole_key_filtering=false
    
    ############## Key-Value separation ##############
    # Whether or not to enable BlobDB (RocksDB key-value separation support)
    --rocksdb_enable_kv_separation=false
    # RocksDB key value separation threshold in bytes. Values at or above this threshold will be written to blob files during flush or compaction.
    --rocksdb_kv_separation_threshold=100
    # Compression algorithm for blobs, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
    --rocksdb_blob_compression=lz4
    # Whether to garbage collect blobs during compaction
    --rocksdb_enable_blob_garbage_collection=true
    
    ############## rocksdb Options ##############
    # rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
    --rocksdb_db_options={}
    # rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
    --rocksdb_column_family_options={"write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}
    # rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
    --rocksdb_block_based_table_options={"block_size":"8192"}

---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nebula-graph-pdb
  namespace: temp
spec:
  minAvailable: 3
  selector:
    matchLabels:
      app: nebula-graph



#---
### Externally exposed port
#apiVersion: v1
#kind: Service
#metadata:
#  name: nebula-graph
#  namespace: temp
#spec:
#  selector:
#    app: nebula-graph
#  type: NodePort
#  ports:
#    - name: nebula-graph
#      protocol: TCP
#      port: 9669
#      nodePort: 9669
#      targetPort: 9669

---
### Internal communication ports
apiVersion: v1
kind: Service
metadata:
  name: nebula-metad-headless
  namespace: temp
  labels:
    app: nebula-metad
spec:
  ports:
    - port: 9559
      name: metad-meta-server
    - port: 9560
      name: metad-meta-server-leder
    - port: 19559
      name: metad-meta-server-leader
    - port: 19560
      name: metad-meta-server-ws-http
    - port: 9669
      name: metad-graphd-server
    - port: 19669
      name: metad-graphd-server-ws
    - port: 19670
      name: metad-graphd-server-ws-http
    - port: 9779
      name: metad-strong-server
    - port: 19779
      name: metad-strong-server-ws
    - port: 19780
      name: metad-strong-server-ws-storage
  clusterIP: None
  selector:
    app: nebula-metad
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nebula-metad
  namespace: temp
spec:
  selector:
    matchLabels:
      app: nebula-metad # has to match .spec.template.metadata.labels
  serviceName: "nebula-metad-headless"
  replicas: 3 # by default is 1
  template:
    metadata:
      labels:
        app: nebula-metad # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      ## Anti-affinity: spread the pods across different nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: nebula-graph
                namespaces:
                  - temp
                topologyKey: kubernetes.io/hostname
              weight: 100
      containers:
        - name: nebula-metad
          image: dockerhub.clinbrain.com/nebula-graph:3.1.009
          imagePullPolicy: IfNotPresent
          env:
            - name: TZ
              value: Asia/Shanghai
            - name: START_SCRIPT
              value: /home/nebula/nebula-graph/bin/nebula-metad
            - name: CONFIG_FILE
              value: /home/nebula/nebula-graph/etc/nebula-metad.conf.template
          ports:
            - containerPort: 9559
              name: metad-meta
            - containerPort: 19559
              name: metad-strong
            - containerPort: 9560
              name: metad-leader
            - containerPort: 19560
              name: metad-leaderg
          volumeMounts:
            - name: storage-volume
              mountPath: /home/nebula/nebula-graph/data
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-metad.conf.template
              subPath: nebula-metad.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-graphd.template
              subPath: nebula-graphd.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-storaged.template
              subPath: nebula-storaged.conf
            - name: config-volume
              mountPath: /script/container-post-start.sh
              subPath: container-post-start.sh
      dnsConfig:
        searches:
          - temp.svc.cluster.local
      dnsPolicy: ClusterFirst
      volumes:
        - name: config-volume
          configMap:
            defaultMode: 0777
            name: nebula-graph-config
  volumeClaimTemplates:
    - metadata:
        name: storage-volume
        annotations:
          volume.beta.kubernetes.io/storage-class: nebula-graph-storage-class
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi

---
### Internal communication ports
apiVersion: v1
kind: Service
metadata:
  name: nebula-graphd-headless
  namespace: temp
  labels:
    app: nebula-graphd
spec:
  ports:
    - port: 9559
      name: metad-meta-server
    - port: 9560
      name: metad-meta-server-leder
    - port: 19559
      name: metad-meta-server-leader
    - port: 19560
      name: metad-meta-server-ws-http
    - port: 9669
      name: metad-graphd-server
    - port: 19669
      name: metad-graphd-server-ws
    - port: 19670
      name: metad-graphd-server-ws-http
    - port: 9779
      name: metad-strong-server
    - port: 19779
      name: metad-strong-server-ws
    - port: 19780
      name: metad-strong-server-ws-storage
  clusterIP: None
  selector:
    app: nebula-graphd
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nebula-graphd
  namespace: temp
spec:
  selector:
    matchLabels:
      app: nebula-graphd # has to match .spec.template.metadata.labels
  serviceName: "nebula-graphd-headless"
  replicas: 1 # by default is 1
  template:
    metadata:
      labels:
        app: nebula-graphd # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      ## Anti-affinity: spread the pods across different nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: nebula-graph
                namespaces:
                  - temp
                topologyKey: kubernetes.io/hostname
              weight: 100
      containers:
        - name: nebula-graphd
          image: dockerhub.clinbrain.com/nebula-graph:3.1.009
          imagePullPolicy: IfNotPresent
          env:
            - name: TZ
              value: Asia/Shanghai
            - name: NEBULA_MODEL
              value: nebula-graphd
            - name: CONFIG_FILE
              value: /home/nebula/nebula-graph/etc/nebula-graphd.template
          ports:
            - containerPort: 9669
              name: graphd-meta
            - containerPort: 19669
              name: graphd-graphd
            - containerPort: 19670
              name: graphd-strong
          volumeMounts:
            - name: storage-volume
              mountPath: /home/nebula/nebula-graph/data
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-metad.conf.template
              subPath: nebula-metad.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-graphd.template
              subPath: nebula-graphd.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-storaged.template
              subPath: nebula-storaged.conf
            - name: config-volume
              mountPath: /script/container-post-start.sh
              subPath: container-post-start.sh
      dnsConfig:
        searches:
          - temp.svc.cluster.local
      dnsPolicy: ClusterFirst
      volumes:
        - name: config-volume
          configMap:
            defaultMode: 0744
            name: nebula-graph-config
  volumeClaimTemplates:
    - metadata:
        name: storage-volume
        annotations:
          volume.beta.kubernetes.io/storage-class: nebula-graph-storage-class
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
---
### Internal communication ports
apiVersion: v1
kind: Service
metadata:
  name: nebula-storaged-headless
  namespace: temp
  labels:
    app: nebula-storaged
spec:
  ports:
    - port: 9559
      name: metad-meta-server
    - port: 9560
      name: metad-meta-server-leder
    - port: 19559
      name: metad-meta-server-leader
    - port: 19560
      name: metad-meta-server-ws-http
    - port: 9669
      name: metad-graphd-server
    - port: 19669
      name: metad-graphd-server-ws
    - port: 19670
      name: metad-graphd-server-ws-http
    - port: 9779
      name: metad-strong-server
    - port: 19779
      name: metad-strong-server-ws
    - port: 19780
      name: metad-strong-server-ws-storage
  clusterIP: None
  selector:
    app: nebula-storaged
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nebula-storaged
  namespace: temp
spec:
  selector:
    matchLabels:
      app: nebula-storaged # has to match .spec.template.metadata.labels
  serviceName: "nebula-storaged-headless"
  replicas: 1 # by default is 1
  template:
    metadata:
      labels:
        app: nebula-storaged # has to match .spec.selector.matchLabels
    spec:
      terminationGracePeriodSeconds: 10
      ## Anti-affinity: spread the pods across different nodes
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: nebula-graph
                namespaces:
                  - temp
                topologyKey: kubernetes.io/hostname
              weight: 100
      containers:
        - name: nebula-storaged
          image: dockerhub.clinbrain.com/nebula-graph:3.1.009
          imagePullPolicy: IfNotPresent
          env:
            - name: TZ
              value: Asia/Shanghai
            - name: NEBULA_MODEL
              value: nebula-storaged
            - name: CONFIG_FILE
              value: /home/nebula/nebula-graph/etc/nebula-storaged.template
          ports:
            - containerPort: 9779
              name: storaged-meta
            - containerPort: 19779
              name: storaged-graphd
            - containerPort: 19780
              name: storaged-strong
          volumeMounts:
            - name: storage-volume
              mountPath: /home/nebula/nebula-graph/data
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-metad.conf.template
              subPath: nebula-metad.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-graphd.template
              subPath: nebula-graphd.conf
            - name: config-volume
              mountPath: /home/nebula/nebula-graph/etc/nebula-storaged.template
              subPath: nebula-storaged.conf
            - name: config-volume
              mountPath: /script/container-post-start.sh
              subPath: container-post-start.sh
      dnsConfig:
        searches:
          - temp.svc.cluster.local
      dnsPolicy: ClusterFirst
      volumes:
        - name: config-volume
          configMap:
            defaultMode: 0744
            name: nebula-graph-config
  volumeClaimTemplates:
    - metadata:
        name: storage-volume
        annotations:
          volume.beta.kubernetes.io/storage-class: nebula-graph-storage-class
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi

StorageClass

---
apiVersion: v1
kind: PersistentVolume
metadata:
  finalizers:
    - kubernetes.io/pv-protection
  name: nfs-pv-nebula-graph-storage-class
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1024Gi
  mountOptions: []
  nfs:
    path: /home/nfs/kubernetes/plugins/nebula-graph
    server: nfs-server.clinbrain.com
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-storageclass-provisioner
  volumeMode: Filesystem

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  finalizers:
    - kubernetes.io/pvc-protection
  name: nfs-pvc-nebula-graph-storage-class
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 100Gi
  storageClassName: nfs-storageclass-provisioner
  volumeMode: Filesystem
  volumeName: nfs-pv-nebula-graph-storage-class

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: eip-nfs-client-provisioner
  namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: eip-nfs-client-provisioner-runner
rules:
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get","list","watch"]
  - apiGroups: [""]
    resources: ["persistentvolumes"]
    verbs: ["get", "list", "watch", "create", "delete"]
  - apiGroups: [""]
    resources: ["persistentvolumeclaims"]
    verbs: ["get", "list", "watch", "update"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "update", "patch"]

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: eip-run-nfs-client-provisioner

roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: eip-nfs-client-provisioner-runner
subjects:
  - kind: ServiceAccount
    name: eip-nfs-client-provisioner
    namespace: kube-system

---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: eip-leader-locking-nfs-client-provisioner
  namespace: kube-system
rules:
  - apiGroups: [""]
    resources: ["endpoints"]
    verbs: ["get", "list", "watch", "create", "update", "patch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: eip-leader-locking-nfs-client-provisioner
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: eip-leader-locking-nfs-client-provisioner
subjects:
  - kind: ServiceAccount
    name: eip-nfs-client-provisioner
    namespace: kube-system

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nfs-nebula-graph-storage-class
  name: nfs-nebula-graph-storage-class
  namespace: kube-system
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nfs-nebula-graph-storage-class
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nfs-nebula-graph-storage-class
    spec:
      containers:
        - env:
            - name: PROVISIONER_NAME
              value: nfs-nebula-graph-storage-class
            - name: NFS_SERVER
              value: nfs-server.clinbrain.com
            - name: NFS_PATH
              value: /home/nfs/kubernetes/plugins/nebula-graph
          image: swr.cn-east-2.myhuaweicloud.com/kuboard-dependency/nfs-subdir-external-provisioner:v4.0.2
          name: nfs-client-provisioner
          volumeMounts:
            - mountPath: /persistentvolumes
              name: nfs-client-root
      serviceAccountName: eip-nfs-client-provisioner
      volumes:
        - name: nfs-client-root
          persistentVolumeClaim:
            claimName: nfs-pvc-nebula-graph-storage-class

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: nebula-graph-storage-class
mountOptions: []
parameters:
  archiveOnDelete: 'false'
provisioner: nfs-nebula-graph-storage-class
reclaimPolicy: Retain
volumeBindingMode: Immediate

Could you capture a snippet of the metad startup logs? Check whether a line like localhost = "" gets printed, and also whether local_ip shows up as configured in the startup command log; I couldn't find that setting in your config files.

They are all configured, and localhost prints the local IP.

Do you have the logs of the other two metad instances? They may be unable to connect to each other, or the Raft logs on the three machines may be bad, so no leader can be elected :thinking:


All three instances have the relevant configuration.

We need the logs of all three metad instances. Also, try raising v in the config to 3 :thinking:
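
For reference, v here is the glog verbosity flag that appears as --v=0 in the configs above; assuming the generated flag-file path from the Dockerfile ENTRYPOINT, a minimal in-pod way to raise it:

```bash
# Bump glog verbosity from 0 to 3 in the generated flag file (path taken
# from the ENTRYPOINT above), then restart the daemon to pick it up.
sed -i 's/^--v=0$/--v=3/' /home/nebula/nebula-graph/etc/nebula.conf
```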

Chrome.zip (141.8 KB)
The logs and config files are all in there.

You need to turn the log level up higher.

Your config files use domain names everywhere, but local_ip is an IP address, so the two sides are inconsistent: the address registered in meta is a domain name, so local_ip needs to be a domain name as well.

Raise all of them to 4?

The domain names do resolve to IPs. Do you mean it has to be either all domain names or all IPs?

Yes, the address registered in meta is authoritative. Also, if you want to use IPs, be careful about pod restarts in a container environment: once the IP address can't be kept stable, that causes problems too, so domain-name addresses are recommended.
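
A sketch of what a consistent domain-name setup could look like with this thread's names; the ENTRYPOINT change below is an assumption, not a confirmed fix, and for graphd/storaged the headless Service name would differ accordingly:

```bash
# Instead of appending the pod IP, append the pod's stable per-pod DNS name,
# <pod>.<headless-service>, which matches the form used in meta_server_addrs:
echo "--local_ip=$(hostname).nebula-metad-headless" >> /home/nebula/nebula-graph/etc/nebula.conf
echo "--ws_ip=$(hostname).nebula-metad-headless"    >> /home/nebula/nebula-graph/etc/nebula.conf
```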

2022161601.zip (94.6 KB)
Using domain names everywhere still doesn't work.

The domain addresses have the form nebula-metad-0.nebula-metad-headless, but local_ip is showing up as localhost = "nebula-metad-1":9559, which is missing the '.nebula-metad-headless' part.

Adding the suffix made no difference.
2022161602.zip (24.3 KB)

How about opening a remote session so I can help take a look?

Could you share your deployment files? I'll try to reproduce it locally.


2022061701.zip (7.3 KB)

Got it. I'll run it with your configuration and see.

Thanks a lot for your help!