Severe space amplification: I tried to optimize by controlling the WAL lifecycle and data compaction, but the tuning had no effect at all.

Question template:

  • Nebula version: 2.5.0
  • Deployment (distributed / standalone / Docker / DBaaS): distributed
  • Production environment: Y
  • Hardware
    • Disk: ESSD 1 TB
    • CPU / memory: 32 cores, 64 GB
  • Detailed description of the problem
  • Relevant meta / storage / graph logs (plain text preferred, for searchability)
########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-storaged.pid
# Whether to use the configuration obtained from the configuration file
--local_config=true

########## logging ##########
# The directory to host logging files
--log_dir=/mnt/logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0
# Whether to redirect stdout and stderr to separate output files
--redirect_stdout=true
# Destination filename of stdout and stderr, which will also reside in log_dir.
--stdout_log_file=storaged-stdout.log
--stderr_log_file=storaged-stderr.log
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
--stderrthreshold=2

########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=<redacted>
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=<redacted>
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10

######### Raft #########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=500
## recycle Raft WAL
--wal_ttl=14400
--clean_wal_interval_secs=60
########## Disk ##########
# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
# One path per Rocksdb instance.
--data_path=/mnt/data/storage

# Minimum reserved bytes of each data path
--minimum_reserved_bytes=268435456

# The default reserved bytes for one batch operation
--rocksdb_batch_size=4096
# The default block cache size used in BlockBasedTable.
# The unit is MB.
--rocksdb_block_cache=4096
# The type of storage engine, `rocksdb', `memory', etc.
--engine_type=rocksdb

# Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
# For the sake of binary compatibility, the default value is snappy.
# Recommend to use:
#   * lz4 to gain more CPU performance, with the same compression ratio with snappy
#   * zstd to occupy less disk space
#   * lz4hc for the read-heavy write-light scenario
--rocksdb_compression=lz4

# Set different compressions for different levels
# For example, if --rocksdb_compression is snappy,
# "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
# In order to disable compression for level 0/1, set it to "no:no"
--rocksdb_compression_per_level=

# Whether or not to enable rocksdb's statistics, disabled by default
--enable_rocksdb_statistics=false

# Statslevel used by rocksdb to collection statistics, optional values are
#   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
#   * kExceptTimers, Skip timer stats
#   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
#   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
#   * kAll, Collect all stats
--rocksdb_stats_level=kExceptHistogramOrTimers

# Whether or not to enable rocksdb's prefix bloom filter, disabled by default.
--enable_rocksdb_prefix_filtering=false

############## rocksdb Options ##############
--rocksdb_disable_wal=true
# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
# RocksDB's L0-to-L1 multithreaded compaction also works by having one thread
# prepare the parameters, multiple threads run the subcompactions in parallel,
# and one thread finalize, which raises compaction parallelism
# Statistics are dumped to the log file every stats_dump_period_sec seconds;
# the default of 600 means a dump every 10 minutes
--rocksdb_db_options={"max_subcompactions": "10", "max_background_jobs" :"10","stats_dump_period_sec":"200", "write_thread_max_yield_usec":"600"}
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
# Compaction has two kinds of triggers: file count and file size. For level 0
# the trigger is the number of SST files, controlled by level0_file_num_compaction_trigger
# For level 1..N the trigger is total SST size per level, bounded by
# max_bytes_for_level_base and max_bytes_for_level_multiplier
# max_write_buffer_number caps the total number of memtables; if writes are fast
# but compaction is slow, the memtable count exceeds the threshold and causes a
# write stall. A related parameter is min_write_buffer_number_to_merge, which
# controls how many immutable memtables must accumulate before a flush is
# triggered (default 1). The basic flush flow: pick the immutable memtables,
# merge them, write them out as a new level-0 SST file, then release the memory.
# max_write_buffer_number_to_maintain defaults to 0; 0 means write-buffer memory
# is released immediately after flush, in which case transaction write-conflict
# checks may have to read SST files, hurting performance.
--rocksdb_column_family_options={"disable_auto_compactions":"false","write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456","level0_file_num_compaction_trigger":"10","min_write_buffer_number_to_merge":"2","max_write_buffer_number_to_maintain":"1"}
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_block_based_table_options={"block_size":"8192"}

Background: this is our company's risk-control system. It generates about 400 million records per day, which are batch-loaded offline into the graph database.
Most of the writes are update operations.

Expected behavior: the bulk load runs daily between 06:00 and 07:00; WAL and data grow during the load, we trigger a manual compaction right after the load finishes, disk usage falls back to normal, and WAL logs are deleted within 4 hours.
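For reference, the "manual compaction" step above can be triggered per space from nebula-console; the statements below are standard 2.5 nGQL, with the space name as a placeholder:

```ngql
USE your_space_name;   // placeholder: the space to compact
SUBMIT JOB COMPACT;    // returns a job id
SHOW JOBS;             // poll until the COMPACT job shows FINISHED
```

Note that the reclaimed disk space only shows up after the compaction job actually finishes, so it is worth polling SHOW JOBS rather than assuming the submit call is synchronous.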

What actually happened: at 08:00 on October 1 the disk filled up and the service became unavailable, so we force-restarted the service in the middle of a load. Afterwards we manually dropped the graph space and ran the manual compaction command, trying to release the disk space, but as of October 8 the space still had not been released.

Current approach: add the following to nebula-storaged.conf:

--wal_ttl=14400
--clean_wal_interval_secs=60

hoping that, every 60 seconds, the expired WAL data would be actively deleted, but nothing changed at all.
A further question: can I manually delete everything under the data directory to free the disk? The graph space itself has already been emptied.

You have to change wal_ttl as well... changing only clean_wal_interval_secs does nothing.

Every clean_wal_interval_secs seconds, obsolete WAL data older than wal_ttl is deleted.
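To sanity-check that mechanism, you can preview which Raft WAL segment files have outlived wal_ttl without deleting anything. This is only a sketch: the layout `<data_path>/nebula/<space_id>/wal/<part_id>/` is an assumption based on a default 2.x install, so verify the path on your own cluster first.

```shell
# list_stale_wal <wal_root> <ttl_secs>: print WAL files older than ttl_secs.
# Purely read-only: it previews what the cleaner should be eligible to remove.
list_stale_wal() {
    root="$1"   # storage data root, e.g. /mnt/data/storage/nebula (assumed)
    ttl="$2"    # seconds, mirroring --wal_ttl
    find "$root" -path '*/wal/*' -type f -mmin "+$((ttl / 60))" 2>/dev/null
}

# Example invocation (path is an assumption; harmless if it does not exist):
list_stale_wal /mnt/data/storage/nebula 14400 || true
```

If this prints old segments long after wal_ttl has elapsed, the cleaner is not running as configured; if it prints nothing while the wal directories are still large, the space is being held by recent segments, not expired ones.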

Question: what is a reasonable wal_ttl value for production? And before a WAL file is deleted, is its data guaranteed to have been fully flushed to disk?

I did change wal_ttl=14400; it is set in the config file.

The graph space has already been dropped? Then the corresponding data and wal directories can be deleted.

So all the files under storage/nebula can be deleted, and nothing under meta needs touching, right?

You can only delete the space-id directories of spaces that no longer exist; check the other way around which space IDs still exist.
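One way to do that cross-check is to list the numeric space-id directories left on disk and compare them against the ID column that `DESCRIBE SPACE <name>` prints for every space still returned by `SHOW SPACES`. A minimal sketch, assuming the default layout under `--data_path=/mnt/data/storage`:

```shell
# list_space_dirs <data_root>: print the space-id directory names under the
# storage data root. Any ID printed here that no live space claims belongs to
# a dropped space and is a candidate for manual removal.
list_space_dirs() {
    for d in "$1"/*/; do
        [ -d "$d" ] && basename "$d"
    done
    return 0
}

# Example invocation (path is an assumption; harmless if it does not exist):
list_space_dirs /mnt/data/storage/nebula
```

Only remove a directory after confirming its ID is absent from every `DESCRIBE SPACE` result, and leave the meta data path alone.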

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.