Connecting to NebulaGraph fails with: error msg: Create session failed


  • NebulaGraph version: 3.6.0

  • Deployment: single-node Kubernetes (k8s)

  • Installation method: RPM

  • In production: Y

  • Detailed description of the problem

kubectl run -ti --image vesoft/nebula-console:v3.5.0 --restart=Never -- nebula-console-01 --addr nebula-service -port 9669 -u root -p vesoft
If you don't see a command prompt, try pressing enter.
2024/01/30 06:26:49 Fail to create a new session from connection pool, failed to authenticate, error code: -1002, error msg: Create session failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not ope
panic: Fail to create a new session from connection pool, failed to authenticate, error code: -1002, error msg: Create session failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not ope

goroutine 1 [running]:
log.Panicf(0x81c3a5, 0x35, 0xc0000d5e58, 0x1, 0x1)
	/usr/local/go/src/log/log.go:345 +0xc0
	/usr/src/main.go:538 +0x985
  • Relevant meta / storage / graph info logs
tail -f nebula-graphd.ERROR   
E20240130 06:32:44.027412   222 GraphService.cpp:113] Create session for userName: root, ip: failed: Create session failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not ope
E20240130 06:32:56.283411   478 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:32:56.283663   478 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:32:56.283706   425 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:33:09.296993   479 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:33:09.297209   479 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:33:09.297251   425 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:33:22.310073   480 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:33:22.310290   480 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:33:22.310324   425 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:33:33.763198   481 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:33:33.763422   481 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:33:33.763459   426 GraphSessionManager.cpp:290] Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:33:35.325246   482 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:33:35.325286   482 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:33:35.325320   425 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect


tail -f nebula-metad.INFO    
I20240130 06:26:12.061431   270 JobManager.cpp:150] JobManager::scheduleThread enter
I20240130 06:26:12.257335   339 HBProcessor.cpp:33] Receive heartbeat from "":9669, role = GRAPH
I20240130 06:26:12.257390   339 HBProcessor.cpp:89] Update host "":9669 dir info, root path: /usr/local/nebula, data path size: 0
I20240130 06:26:12.280733   339 SessionManagerProcessor.cpp:136] resp session size: 1
I20240130 06:26:12.297084   339 ListHostsProcessor.cpp:249] skip inactive host: "":9779
I20240130 06:26:12.327262   339 SessionManagerProcessor.cpp:136] resp session size: 1
I20240130 06:26:12.373484   339 HBProcessor.cpp:33] Receive heartbeat from "":9779, role = STORAGE
I20240130 06:26:12.373574   339 HBProcessor.cpp:52] Set clusterId for new host "":9779!
I20240130 06:26:12.373584   339 HBProcessor.cpp:89] Update host "":9779 dir info, root path: /usr/local/nebula, data path size: 1
I20240130 06:26:12.383105   339 SessionManagerProcessor.cpp:136] resp session size: 1


Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:35:32.274577   466 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:35:32.274760   466 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:35:32.274794   489 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:35:45.290772   473 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:35:45.291550   473 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:35:45.291589   489 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:35:58.306519   488 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:35:58.327594   488 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:35:58.327644   489 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:36:11.341065   457 MetaClient.cpp:772] Send request to "":9559, exceed retry limit
E20240130 06:36:11.341297   457 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20240130 06:36:11.341332   489 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20240130 06:36:24.356230   462 MetaClient.cpp:772] Send request to "":9559, exceed retry limit

This is a communication error. Please post your meta and graph configuration files.
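
Side note, not from the thread: `errno = 111 (Connection refused)` in the graphd log means that nothing accepted the TCP connection on port 9559 at the address graphd tried. A minimal sketch for checking this from inside the pod, assuming Python 3 is available there; the function name `can_connect` and the address in the usage comment are illustrative only.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused (errno 111), timeouts, and DNS failures.
        return False


# Usage (hypothetical address; use the metad address graphd is configured with):
#   can_connect("127.0.0.1", 9559)
```

If this returns False for the metad address while metad is reported as running, metad is probably listening on a different address than the one graphd is using.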


cat nebula-metad.conf
########## basics ##########
# Whether to run as a daemon process
# The file to host the process id

########## logging ##########
# The directory to host logging files
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
# Maximum seconds to buffer the log messages
# Whether to redirect stdout and stderr to separate output files
# Destination filename of stdout and stderr, which will also reside in log_dir.
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
# Whether log file names contain a timestamp. If you use logrotate to rotate log files, set this to true.

########## networking ##########
# Comma separated Meta Server addresses
# Local IP used to identify the nebula-metad process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
# Meta daemon listening port
# HTTP service ip
# HTTP service port
# Port to listen on Storage with HTTP protocol, it corresponds to ws_http_port in storage's configuration file

########## storage ##########
# Root data path, here should be only single path for metad

########## Misc #########
# The default number of parts when a space is created
# The default replica factor when a space is created



cat nebula-graphd.conf
########## basics ##########
# Whether to run as a daemon process
# The file to host the process id
# Whether to enable optimizer
# The default charset when a space is created
# The default collate when a space is created
# Whether to use the configuration obtained from the configuration file

########## logging ##########
# The directory to host logging files
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
# Maximum seconds to buffer the log messages
# Whether to redirect stdout and stderr to separate output files
# Destination filename of stdout and stderr, which will also reside in log_dir.
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
# Whether log file names contain a timestamp.
########## query ##########
# Whether to treat partial success as an error.
# This flag is only used for Read-only access, and Modify access always treats partial success as an error.
# Maximum sentence length, unit byte

########## networking ##########
# Comma separated Meta Server Addresses
# Local IP used to identify the nebula-graphd process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
# Network device to listen on
# Port to listen on
# To turn on SO_REUSEPORT or not
# Backlog of the listen socket, adjust this together with net.core.somaxconn
# The number of seconds Nebula service waits before closing the idle connections
# The number of seconds before idle sessions expire
# The range should be in [1, 604800]
# The number of threads to accept incoming connections
# The number of networking IO threads, 0 for # of CPU cores
# Max active connections for all networking threads. 0 means no limit.
# Max connections for each networking thread = num_max_connections / num_netio_threads
# The number of threads to execute user queries, 0 for # of CPU cores
# HTTP service ip
# HTTP service port
# storage client timeout
# slow query threshold in us
# Port to listen on Meta with HTTP protocol, it corresponds to ws_http_port in metad's configuration file

########## authentication ##########
# Enable authorization
# User login authentication type, password for nebula authentication, ldap for ldap authentication, cloud for cloud authentication

########## memory ##########
# System memory high watermark ratio, cancel the memory checking when the ratio greater than 1.0

########## metrics ##########

########## experimental feature ##########
# if use experimental features

# if use balance data feature, only work if enable_experimental_feature is true

# enable udf, written in c++ only for now

# set the directory where the .so files of udf are stored, when enable_udf is true

########## session ##########
# Maximum number of sessions that can be created per IP and per user

########## memory tracker ##########
# trackable memory ratio (trackable_memory / (total_memory - untracked_reserved_memory) )
# untracked reserved memory in Mib

# enable log memory tracker stats periodically
# log memory tracker stats interval in milliseconds

# enable memory background purge (if jemalloc is used)
# memory background purge interval in seconds

########## performance optimization ##########
# The max job size in multi job mode
# The min batch size for handling dataset in multi job mode, only enabled when max_job_size is greater than 1
# if true, return directly without go through RPC
# number of paths constructed by each thread


cat nebula-storaged.conf
########## basics ##########
# Whether to run as a daemon process
# The file to host the process id
# Whether to use the configuration obtained from the configuration file

########## logging ##########
# The directory to host logging files
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
# Maximum seconds to buffer the log messages
# Whether to redirect stdout and stderr to separate output files
# Destination filename of stdout and stderr, which will also reside in log_dir.
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
# Whether log file names contain a timestamp.

########## networking ##########
# Comma separated Meta server addresses
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
# Storage daemon listening port
# HTTP service ip
# HTTP service port
# heartbeat with meta service

######### Raft #########
# Raft election timeout
# RPC timeout for raft client (ms)
## recycle Raft WAL

########## Disk ##########
# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
# One path per Rocksdb instance.

# Minimum reserved bytes of each data path

# The default reserved bytes for one batch operation
# The default block cache size used in BlockBasedTable.
# The unit is MB.
# The type of storage engine, `rocksdb', `memory', etc.

# Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
# For the sake of binary compatibility, the default value is snappy.
# Recommend to use:
#   * lz4 to gain more CPU performance, with the same compression ratio with snappy
#   * zstd to occupy less disk space
#   * lz4hc for the read-heavy write-light scenario

# Set different compressions for different levels
# For example, if --rocksdb_compression is snappy,
# "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
# In order to disable compression for level 0/1, set it to "no:no"

# Whether or not to enable rocksdb's statistics, disabled by default

# Statslevel used by rocksdb to collection statistics, optional values are
#   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
#   * kExceptTimers, Skip timer stats
#   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
#   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
#   * kAll, Collect all stats

# Whether or not to enable rocksdb's prefix bloom filter, enabled by default.
# Whether or not to enable rocksdb's whole key bloom filter, disabled by default.

############## rocksdb Options ##############
# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma

############### misc ####################
# Whether turn on query in multiple thread
# Whether remove outdated space data
# Network IO threads number
# Max active connections for all networking threads. 0 means no limit.
# Max connections for each networking thread = num_max_connections / num_netio_threads
# Worker threads number to handle request
# Maximum subtasks to run admin jobs concurrently
# The rate limit in bytes when leader synchronizes snapshot data
# The amount of data sent in each batch when leader synchronizes snapshot data
# The rate limit in bytes when leader synchronizes rebuilding index
# The amount of data sent in each batch when leader synchronizes rebuilding index

########## memory tracker ##########
# trackable memory ratio (trackable_memory / (total_memory - untracked_reserved_memory) )
# untracked reserved memory in Mib

# enable log memory tracker stats periodically
# log memory tracker stats interval in milliseconds

# enable memory background purge (if jemalloc is used)
# memory background purge interval in seconds

Is this IP your local IP? I suggest changing all of the local addresses to real IP addresses.


In each configuration file, change this setting to the real IP followed by :9559.
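
Side note, not from the thread: the flag lines were not included in the config files pasted above, so the following is only a sketch for spotting loopback addresses in a config file. It assumes the standard NebulaGraph flags `--meta_server_addrs` and `--local_ip`; the helper name `find_loopback_flags` is made up for illustration.

```python
import re

# Matches 127.x.x.x and "localhost".
LOOPBACK = re.compile(r"^(127\.\d+\.\d+\.\d+|localhost)$")


def find_loopback_flags(conf_text, flags=("meta_server_addrs", "local_ip")):
    """Return (flag, address) pairs whose host part is a loopback address."""
    hits = []
    for line in conf_text.splitlines():
        line = line.strip()
        if not line.startswith("--") or "=" not in line:
            continue  # skip comments, section headers, and blank lines
        name, value = line[2:].split("=", 1)
        if name not in flags:
            continue
        for addr in value.split(","):
            addr = addr.strip()
            # Drop the ":port" suffix if there is one.
            host = addr.rsplit(":", 1)[0] if ":" in addr else addr
            if LOOPBACK.match(host):
                hits.append((name, addr))
    return hits
```

Running it over the text of nebula-metad.conf, nebula-graphd.conf, and nebula-storaged.conf lists the entries that would still need a real IP.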


Check whether it succeeds after you change this. The graphd service IP is what goes here.

Running show hosts then returns:

show hosts;
[ERROR (-1005)]: RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect

= = I'm not very familiar with the K8s side of things, so let me call someone in. @wey, this is a K8s deployment and connecting with the console fails.


./nebula.service status all
[INFO] nebula-metad(de9b3ed): Running as 24, Listening on 9559 
[INFO] nebula-graphd(de9b3ed): Running as 48, Listening on 9669 
[INFO] nebula-storaged(de9b3ed): Running as 83, Listening on 9779 

But after I enter the container and run the command to add the Storage host:



./nebula.service status all
[INFO] nebula-metad(de9b3ed): Exited
[INFO] nebula-graphd(de9b3ed): Running as 48, Listening on 9669 
[INFO] nebula-storaged(de9b3ed): Running as 83, Listening on 9779 

Did you start a k8s pod and then install the RPM package by hand inside it? I have so many questions...

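I strongly recommend against this. Even if (for what reason?) you are not using the nebula k8s operator, you should at least use the container images. I recall there are projects that convert a docker compose file into k8s resource YAML; you could adapt the output of one of those. GitHub - vesoft-inc/nebula-docker-compose: Docker compose for Nebula Graph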

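It looks like the processes in this pod cannot bind to that address. You could change the configuration to use this pod's hostname instead, but that is not recommended either.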
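
Side note, not from the thread: a rough sketch of the hostname workaround, which prints candidate flag lines built from the pod's hostname. It assumes the standard `--meta_server_addrs` and `--local_ip` flags and that all three services run in the same pod, as described above; `flags_for_host` is a made-up helper, and the output would still have to be copied into each config file by hand.

```python
import socket


def flags_for_host(host, meta_port=9559):
    """Build the two address flags for a single-pod setup from one hostname."""
    return [
        "--meta_server_addrs={}:{}".format(host, meta_port),
        "--local_ip={}".format(host),
    ]


# Inside the pod, socket.gethostname() normally returns the pod name.
print("\n".join(flags_for_host(socket.gethostname())))
```

The services need a restart after the change, and the hostname has to be resolvable from every process that connects to metad.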
