Nebula 3.0 memory usage

lopn · September 21, 2022, 02:05

The memory used by nebula-storaged keeps growing over time, and eventually it reports this error:
Used memory hits the high watermark(0.800000) of total system memory.

How can this be fixed?
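
For context, the message says that used memory, as a fraction of total system memory, crossed 0.8. Below is a minimal Python sketch of that kind of check. It assumes the ratio is derived from `MemTotal` and `MemAvailable` in `/proc/meminfo`; how nebula-storaged itself computes used memory is not shown in this thread, so the formula and the function names (`parse_meminfo`, `used_ratio`, `hits_watermark`) are illustrative assumptions.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def used_ratio(info):
    """Fraction of memory in use, assumed here to be 1 - MemAvailable / MemTotal."""
    return 1.0 - info["MemAvailable"] / info["MemTotal"]


def hits_watermark(info, watermark=0.8):
    """True when the used fraction exceeds the watermark (0.8 in the error message)."""
    return used_ratio(info) > watermark
```

On a Linux host, passing the contents of `/proc/meminfo` to `parse_meminfo` and then calling `hits_watermark` shows whether the whole machine, not only nebula-storaged, is over the 0.8 threshold.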

Lisa · September 21, 2022, 02:05

Please share your machine configuration.

lopn · September 21, 2022, 03:55

Nebula version: 3.0.0
Deployment: standalone (single machine)
Production environment: Y
CPU and memory: 8 cores, 16 GB

lopn · September 21, 2022, 04:48

Storage configuration:

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-storaged.pid
# Whether to use the configuration obtained from the configuration file
--local_config=true

########## logging ##########
# The directory to host logging files
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0
# Whether to redirect stdout and stderr to separate output files
--redirect_stdout=true
# Destination filename of stdout and stderr, which will also reside in log_dir.
--stdout_log_file=storaged-stdout.log
--stderr_log_file=storaged-stderr.log
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
--stderrthreshold=2
# Whether log file names contain a timestamp.
--timestamp_in_logfile_name=true

########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=127.0.0.1:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=127.0.0.1
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10

######### Raft #########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=500
## recycle Raft WAL
--wal_ttl=14400

########## Disk ##########
# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
# One path per Rocksdb instance.
--data_path=/data/nebula_data/storage

# Minimum reserved bytes of each data path
--minimum_reserved_bytes=268435456

# The default reserved bytes for one batch operation
--rocksdb_batch_size=4096
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=4
# The type of storage engine, `rocksdb', `memory', etc.
--engine_type=rocksdb

# Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
# For the sake of binary compatibility, the default value is snappy.
# Recommend to use:
#   * lz4 to gain more CPU performance, with the same compression ratio with snappy
#   * zstd to occupy less disk space
#   * lz4hc for the read-heavy write-light scenario
--rocksdb_compression=lz4

# Set different compressions for different levels
# For example, if --rocksdb_compression is snappy,
# "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
# In order to disable compression for level 0/1, set it to "no:no"
--rocksdb_compression_per_level=

# Whether or not to enable rocksdb's statistics, disabled by default
--enable_rocksdb_statistics=false

# Stats level used by rocksdb to collect statistics, optional values are
#   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
#   * kExceptTimers, Skip timer stats
#   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
#   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
#   * kAll, Collect all stats
--rocksdb_stats_level=kExceptHistogramOrTimers

# Whether or not to enable rocksdb's prefix bloom filter, enabled by default.
--enable_rocksdb_prefix_filtering=true
# Whether or not to enable rocksdb's whole key bloom filter, disabled by default.
--enable_rocksdb_whole_key_filtering=false

############## Key-Value separation ##############
# Whether or not to enable BlobDB (RocksDB key-value separation support)
--rocksdb_enable_kv_separation=false
# RocksDB key value separation threshold. Values at or above this threshold will be written to blob files during flush or compaction.
--rocksdb_kv_separation_threshold=0
# Compression algorithm for blobs, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
--rocksdb_blob_compression=lz4
# Whether to garbage collect blobs during compaction
--rocksdb_enable_blob_garbage_collection=true

############## rocksdb Options ##############
--rocksdb_db_options={"max_open_files":"50000"}
--rocksdb_block_based_table_options={"block_size":"32768","cache_index_and_filter_blocks":"true"}


# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
#--rocksdb_db_options={}
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_column_family_options={"write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
#--rocksdb_block_based_table_options={"block_size":"8192"}
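
As a rough reading of the RocksDB settings above: the memtable ceiling for one column family is `write_buffer_size` times `max_write_buffer_number`, and `--rocksdb_block_cache` adds the block cache on top. The Python sketch below works that out from the posted values. It is a back-of-the-envelope estimate only: it ignores table readers, iterators, and all memory outside RocksDB, and the config does not say how many RocksDB instances nebula-storaged opens, so the result is per instance. The function names are illustrative.

```python
import json

# Values copied from the configuration above.
COLUMN_FAMILY_OPTIONS = (
    '{"write_buffer_size":"67108864",'
    '"max_write_buffer_number":"4",'
    '"max_bytes_for_level_base":"268435456"}'
)
BLOCK_CACHE_MB = 4  # --rocksdb_block_cache, unit is MB

MIB = 1024 * 1024


def memtable_budget_bytes(cf_options_json):
    """Upper bound on memtable memory for one column family."""
    opts = json.loads(cf_options_json)
    return int(opts["write_buffer_size"]) * int(opts["max_write_buffer_number"])


def instance_budget_mib(cf_options_json, block_cache_mb):
    """Memtable ceiling plus block cache, in MiB, for a single RocksDB instance."""
    return memtable_budget_bytes(cf_options_json) / MIB + block_cache_mb


print(instance_budget_mib(COLUMN_FAMILY_OPTIONS, BLOCK_CACHE_MB))  # 260.0
```

With these settings the memtables can reach 256 MiB per instance while the block cache stays at 4 MB, so the write buffers account for far more memory than the cache does.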

lopn · September 21, 2022, 04:49

I previously changed some of the rocksdb_db settings after reading a forum post, but it made no difference.

lopn · September 21, 2022, 12:06

@Lisa Could you take a look?

lopn · September 23, 2022, 08:38

@Lisa

5434 root      20   0 2763560   1.4g   3620 S   2.3  9.1 123:13.45 nebula-storaged

Sept. 23: 1.4 GB

liwenhui · September 23, 2022, 10:12

What is block_cache set to? It may be related to that.

lopn · September 26, 2022, 03:58

@Lisa

 5434 root      20   0 3713832   2.1g   3900 S  14.4 13.7 179:51.52 nebula-storaged

Sept. 26: 2.1 GB

lopn · September 26, 2022, 04:01

@Lisa @liwenhui

# The default reserved bytes for one batch operation
--rocksdb_batch_size=4096
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=4

Is this the setting you mean? It is at the default value.

lopn · September 26, 2022, 04:02

@Lisa @liwenhui

############## rocksdb Options ##############
--rocksdb_db_options={"max_open_files":"50000"}
--rocksdb_block_based_table_options={"block_size":"32768","cache_index_and_filter_blocks":"true"}


# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
#--rocksdb_db_options={}
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_column_family_options={"write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
#--rocksdb_block_based_table_options={"block_size":"8192"}

liwenhui · September 27, 2022, 07:00

See whether this helps: 记一次 nebula-storaged 内存占用高解决的过程 (a write-up on resolving high nebula-storaged memory usage) - #4, by liuqian1990

lopn · September 28, 2022, 03:39

@liwenhui @Lisa
Did you look at the configuration I posted? I already changed it following the link above, and it made no difference.

lopn · September 28, 2022, 03:43

@Lisa @liwenhui

5434 root      20   0 4082472   2.5g   4448 S   0.7 16.3 204:38.37 nebula-storaged

Sept. 23: 1.4 GB
Sept. 26: 2.1 GB
Sept. 28: 2.5 GB
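
To put the three readings in perspective, here is a small Python sketch that turns them into an average growth rate and a naive linear projection. The readings are the RES values from the `top` output in this thread; the linear-growth assumption and the comparison against 0.8 of 16 GB are simplifications, since the watermark applies to total system memory use, not to nebula-storaged alone.

```python
from datetime import date

# RES readings of nebula-storaged taken from the `top` output in this thread.
SAMPLES = [
    (date(2022, 9, 23), 1.4),
    (date(2022, 9, 26), 2.1),
    (date(2022, 9, 28), 2.5),
]


def growth_per_day(samples):
    """Average growth between the first and last reading, in GB per day."""
    (d0, m0), (d1, m1) = samples[0], samples[-1]
    return (m1 - m0) / (d1 - d0).days


def days_until(samples, limit_gb):
    """Days after the last reading until limit_gb is reached, assuming linear growth."""
    return (limit_gb - samples[-1][1]) / growth_per_day(samples)


rate = growth_per_day(SAMPLES)
print(f"growth: {rate:.2f} GB/day")
print(f"days until 0.8 * 16 GB: {days_until(SAMPLES, 0.8 * 16):.0f}")
```

The readings work out to about 0.22 GB per day over five days. If that rate held, the process alone would take well over a month to reach 12.8 GB, so the watermark error probably reflects other memory use on the host as well.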

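xjc · September 28, 2022, 04:20

How large is the data set? Is data written every day? How many spaces are there in total, and how many partitions does each space have?
Please share this information.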

lopn · September 29, 2022, 09:44

@xjc Data is written every day: about 1 million vertices are inserted, using INSERT VERTEX and INSERT VERTEX IF NOT EXISTS.

There are 5 spaces in total.

lopn · September 29, 2022, 10:11

(root@nebula) [(none)]> SHOW HOSTS GRAPH;
+-------------+------+----------+---------+--------------+---------+
| Host        | Port | Status   | Role    | Git Info Sha | Version |
+-------------+------+----------+---------+--------------+---------+
| "127.0.0.1" | 9669 | "ONLINE" | "GRAPH" | "02b2091"    | "3.0.0" |
+-------------+------+----------+---------+--------------+---------+
Got 1 rows (time spent 555/1273 us)

Thu, 29 Sep 2022 18:09:30 CST

(root@nebula) [(none)]> SHOW HOSTS GRAPH;
+-------------+------+----------+---------+--------------+---------+
| Host        | Port | Status   | Role    | Git Info Sha | Version |
+-------------+------+----------+---------+--------------+---------+
| "127.0.0.1" | 9669 | "ONLINE" | "GRAPH" | "02b2091"    | "3.0.0" |
+-------------+------+----------+---------+--------------+---------+
Got 1 rows (time spent 509/1062 us)

Thu, 29 Sep 2022 18:09:36 CST

(root@nebula) [(none)]> SHOW HOSTS STORAGE;
+-------------+------+-----------+-----------+--------------+---------+
| Host        | Port | Status    | Role      | Git Info Sha | Version |
+-------------+------+-----------+-----------+--------------+---------+
| "127.0.0.1" | 9779 | "OFFLINE" | "STORAGE" | ""           |         |
+-------------+------+-----------+-----------+--------------+---------+
Got 1 rows (time spent 522/1169 us)

Thu, 29 Sep 2022 18:09:38 CST

(root@nebula) [(none)]> SHOW HOSTS META;
+-------------+------+----------+--------+--------------+---------+
| Host        | Port | Status   | Role   | Git Info Sha | Version |
+-------------+------+----------+--------+--------------+---------+
| "127.0.0.1" | 9559 | "ONLINE" | "META" | "02b2091"    | "3.0.0" |
+-------------+------+----------+--------+--------------+---------+
Got 1 rows (time spent 467/985 us)

Thu, 29 Sep 2022 18:09:47 CST

(root@nebula) [(none)]> show spaces;
+---------------------+
| Name                |
+---------------------+
| "space_2OD5XBFqPCX" |
| "space_8G7d2TDqHwc" |
| "space_ASAASdbMg"   |
| "space_PbUHvAP3khw" |
| "space_rNDIFizDj4U" |
+---------------------+
Got 5 rows (time spent 455/1007 us)

xjc · September 29, 2022, 10:11

What about the partition count of each space? The cause here may be the WAL log cache used while writing data.

lopn · September 29, 2022, 10:12

One more question: storage shows OFFLINE, but the service is working normally. What is going on?

lopn · September 29, 2022, 10:20

I checked: the partition count was never set, so the default value of 100 is used.
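
To show why the partition count matters for the WAL cache theory, here is a rough Python sketch of how per-partition buffers add up on one node. The 5 spaces and 100 partitions per space come from this thread. The per-partition buffer sizes are hypothetical placeholders, not NebulaGraph defaults; the sketch also assumes every partition lives on this single storaged and keeps its own in-memory WAL buffer.

```python
SPACES = 5                  # from the SHOW SPACES output in this thread
PARTITIONS_PER_SPACE = 100  # default partition count reported in this thread


def total_partitions(spaces, partitions_per_space):
    """Partitions hosted by the single storaged in a standalone deployment."""
    return spaces * partitions_per_space


def wal_buffer_estimate_mib(partitions, buffer_mib_per_partition):
    """Total buffer memory if each partition holds buffer_mib_per_partition MiB."""
    return partitions * buffer_mib_per_partition


parts = total_partitions(SPACES, PARTITIONS_PER_SPACE)
for per_partition_mib in (1, 4, 8):  # hypothetical buffer sizes, in MiB
    total = wal_buffer_estimate_mib(parts, per_partition_mib)
    print(f"{parts} partitions x {per_partition_mib} MiB = {total} MiB")
```

With 500 partitions, even a few MiB of buffered WAL per partition adds up to gigabytes, which is why a smaller partition count is usually worth considering on a single 16 GB machine.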