The xxx.sst files under the storaged data directory keep increasing

  • nebula version: 2.0 beta
  • Deployment (distributed / standalone / Docker / DBaaS): Docker
  • Hardware
    • Disk (must be SSD; HDD is not supported): SSD
    • CPU and memory: 40C 126G

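  • Problem description
    After ingesting a certain amount of data I stopped the ingestion; at that point the storage data directory was 5.1G.
    Afterwards, however, I found that the data directory kept growing. A closer look showed that the number of sst files keeps increasing:
    [image]
    After about two hours the data directory had already grown to 36G and is still growing, even though I have not ingested any data since I stopped.
    [image]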
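To put numbers on the growth described above, a small script can count the sst files and their total size at two points in time. This is only a minimal sketch: the function names `sst_snapshot` and `growth` are made up for illustration, and the directory passed in is whatever data path the deployment actually uses.

```python
import os


def sst_snapshot(data_dir):
    """Return (number of .sst files, total size in bytes) under data_dir."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith(".sst"):
                continue
            path = os.path.join(root, name)
            try:
                total += os.path.getsize(path)
            except OSError:
                # The file may disappear between listing and stat; skip it.
                continue
            count += 1
    return count, total


def growth(before, after):
    """Difference between two snapshots: (new files, new bytes)."""
    return after[0] - before[0], after[1] - before[1]
```

Taking one snapshot, waiting a few minutes with no writes, and taking another shows whether files are still being created and how fast.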

My nebula configuration is below; the main changes are that the WAL and auto compaction are disabled.

Next time, feel free to delete the parts of the question template that you don't need~

Which information counts as not needed?

:joy: The part I deleted

To locate and resolve problems faster, please follow the template below when asking ^ ^

Question template:

@critical27

This problem is rather strange; could the nebula experts please help track down the cause?

Hello, after one night the nebula data directory has grown to 220G, and the data directory of every space keeps growing. Please take a look at this when you have time.

Could you check the storage logs?

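I went through the storage error, warning, and info logs. There are only a few messages, all of them about our own ingestion.
The sst files in the rocksdb data directory under every space keep increasing. The rocksdb log looks quite large, and it keeps repeating the message below: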

Was any data written during this period?

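No writes. Yesterday I restarted nebula and deleted all previous data, then ingested data only once. The data directory was 5.1G when ingestion finished and has kept growing ever since. The storage process's CPU usage has also stayed fairly high.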

How much data did you ingest? It may have been sitting in memory and is now being flushed gradually.

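The ingested volume is small: the raw data is only 11.16G. We split the raw data into vertices and edges for ingestion, and the split data is smaller than the source data.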

Please paste the data/storage/nebula/907/data/LOG file so we can check whether compaction hit some strange error. I haven't run into this problem before...

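This file is too large: 52G
[image]
I looked through it, and it just keeps printing Manual flush. Below is the beginning of the LOG, with the part that prints the configuration removed: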

2020/12/15-09:53:04.421147 7fe17ddfe700 [db_impl/db_impl_write.cc:1666] [default] New memtable created with log file: #3. Immutable memtables: 0.
2020/12/15-09:53:04.421307 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.421276) [db_impl/db_impl_compaction_flush.cc:2188] Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots available 1, flush slots scheduled 1, compaction slots scheduled 0
2020/12/15-09:53:04.421317 7fe1555fe700 [flush_job.cc:318] [default] [JOB 2] Flushing memtable with next log file: 3
2020/12/15-09:53:04.421353 7fe1555fe700 EVENT_LOG_v1 {"time_micros": 1608025984421341, "job": 2, "event": "flush_started", "num_memtables": 1, "num_entries": 1, "num_deletes": 0, "total_data_size": 18, "memory_usage": 768, "flush_reason": "Manual Flush"}
2020/12/15-09:53:04.421362 7fe1555fe700 [flush_job.cc:347] [default] [JOB 2] Level-0 flush table #6: started
2020/12/15-09:53:04.422800 7fe1555fe700 EVENT_LOG_v1 {"time_micros": 1608025984422729, "cf_name": "default", "job": 2, "event": "table_file_creation", "file_number": 6, "file_size": 970, "table_properties": {"data_size": 32, "index_size": 34, "index_partitions": 0, "top_level_index_size": 0, "index_key_is_user_key": 0, "index_value_is_delta_encoded": 0, "filter_size": 69, "raw_key_size": 16, "raw_average_key_size": 16, "raw_value_size": 0, "raw_average_value_size": 0, "num_data_blocks": 1, "num_entries": 1, "num_deletions": 0, "num_merge_operands": 0, "num_range_deletions": 0, "format_version": 0, "fixed_key_len": 0, "filter_policy": "rocksdb.BuiltinBloomFilter", "column_family_name": "default", "column_family_id": 0, "comparator": "leveldb.BytewiseComparator", "merge_operator": "nullptr", "prefix_extractor_name": "nullptr", "property_collectors": "", "compression": "Snappy", "compression_options": "window_bits=-14; level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; ", "creation_time": 1608025984, "oldest_key_time": 1608025984, "file_creation_time": 1608025984}}
2020/12/15-09:53:04.422854 7fe1555fe700 [flush_job.cc:389] [default] [JOB 2] Level-0 flush table #6: 970 bytes OK
2020/12/15-09:53:04.422911 7fe1555fe700 [version_set.cc:3814] Creating manifest 7
2020/12/15-09:53:04.425621 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.422886) [memtable_list.cc:445] [default] Level-0 commit table #6 started
2020/12/15-09:53:04.425628 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.425554) [memtable_list.cc:501] [default] Level-0 commit table #6: memtable #1 done
2020/12/15-09:53:04.425632 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.425574) EVENT_LOG_v1 {"time_micros": 1608025984425565, "job": 2, "event": "flush_finished", "output_compression": "Snappy", "lsm_state": [1, 0, 0, 0, 0, 0, 0], "immutable_memtables": 0}
2020/12/15-09:53:04.425636 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.425595) [db_impl/db_impl_compaction_flush.cc:203] [default] Level summary: files[1 0 0 0 0 0 0] max score 0.25
2020/12/15-09:53:04.425787 7fe17ddfe700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.428398 7fe17b5fe700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.428444 7fe17b5fe700 [db_impl/db_impl_write.cc:1666] [default] New memtable created with log file: #3. Immutable memtables: 0.
2020/12/15-09:53:04.428517 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.428494) [db_impl/db_impl_compaction_flush.cc:2188] Calling FlushMemTableToOutputFile with column family [default], flush slots available 1, compaction slots available 1, flush slots scheduled 1, compaction slots scheduled 0
2020/12/15-09:53:04.428526 7fe1555fe700 [flush_job.cc:318] [default] [JOB 3] Flushing memtable with next log file: 3
2020/12/15-09:53:04.428545 7fe1555fe700 EVENT_LOG_v1 {"time_micros": 1608025984428536, "job": 3, "event": "flush_started", "num_memtables": 1, "num_entries": 9, "num_deletes": 0, "total_data_size": 162, "memory_usage": 1072, "flush_reason": "Manual Flush"}
2020/12/15-09:53:04.428552 7fe1555fe700 [flush_job.cc:347] [default] [JOB 3] Level-0 flush table #8: started
2020/12/15-09:53:04.429853 7fe1555fe700 EVENT_LOG_v1 {"time_micros": 1608025984429783, "cf_name": "default", "job": 3, "event": "table_file_creation", "file_number": 8, "file_size": 1046, "table_properties": {"data_size": 107, "index_size": 34, "index_partitions": 0, "top_level_index_size": 0, "index_key_is_user_key": 0, "index_value_is_delta_encoded": 0, "filter_size": 69, "raw_key_size": 144, "raw_average_key_size": 16, "raw_value_size": 0, "raw_average_value_size": 0, "num_data_blocks": 1, "num_entries": 9, "num_deletions": 0, "num_merge_operands": 0, "num_range_deletions": 0, "format_version": 0, "fixed_key_len": 0, "filter_policy": "rocksdb.BuiltinBloomFilter", "column_family_name": "default", "column_family_id": 0, "comparator": "leveldb.BytewiseComparator", "merge_operator": "nullptr", "prefix_extractor_name": "nullptr", "property_collectors": "", "compression": "Snappy", "compression_options": "window_bits=-14; level=32767; strategy=0; max_dict_bytes=0; zstd_max_train_bytes=0; enabled=0; ", "creation_time": 1608025984, "oldest_key_time": 1608025984, "file_creation_time": 1608025984}}
2020/12/15-09:53:04.429911 7fe1555fe700 [flush_job.cc:389] [default] [JOB 3] Level-0 flush table #8: 1046 bytes OK
2020/12/15-09:53:04.430604 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.429942) [memtable_list.cc:445] [default] Level-0 commit table #8 started
2020/12/15-09:53:04.430613 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.430514) [memtable_list.cc:501] [default] Level-0 commit table #8: memtable #1 done
2020/12/15-09:53:04.430620 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.430542) EVENT_LOG_v1 {"time_micros": 1608025984430532, "job": 3, "event": "flush_finished", "output_compression": "Snappy", "lsm_state": [2, 0, 0, 0, 0, 0, 0], "immutable_memtables": 0}
2020/12/15-09:53:04.430627 7fe1555fe700 (Original Log Time 2020/12/15-09:53:04.430570) [db_impl/db_impl_compaction_flush.cc:203] [default] Level summary: files[2 0 0 0 0 0 0] max score 0.50
2020/12/15-09:53:04.430677 7fe17b5fe700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.433302 7fe17ddfe700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.433308 7fe17ddfe700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.433385 7fe17cdff700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.433396 7fe17cdff700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.433496 7fe17bfff700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.433514 7fe17bfff700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.433627 7fe17b5fe700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.433643 7fe17b5fe700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
2020/12/15-09:53:04.433771 7fe17ddfe700 [db_impl/db_impl_compaction_flush.cc:1306] [default] Manual flush start.
2020/12/15-09:53:04.433777 7fe17ddfe700 [db_impl/db_impl_compaction_flush.cc:1316] [default] Manual flush finished, status: OK
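
With a LOG of this size, reading it by eye is impractical. The EVENT_LOG_v1 lines carry a JSON payload, so a short script can tally flush reasons and the bytes written by each flush. The sketch below relies only on the fields visible in the excerpt above (`event`, `flush_reason`, `file_size`); the function names are illustrative, and it normalizes typographic quotes in case the log text was copied through a web page.

```python
import json
from collections import Counter

MARKER = "EVENT_LOG_v1 "


def parse_events(lines):
    """Yield the JSON payload of every EVENT_LOG_v1 line."""
    for line in lines:
        pos = line.find(MARKER)
        if pos < 0:
            continue
        payload = line[pos + len(MARKER):].strip()
        # Pasted logs sometimes carry curly quotes; a raw LOG file does not.
        payload = payload.replace("\u201c", '"').replace("\u201d", '"')
        try:
            yield json.loads(payload)
        except json.JSONDecodeError:
            continue  # truncated or otherwise unparsable line


def summarize(lines):
    """Return (flush reasons counter, sst files created, bytes written)."""
    reasons = Counter()
    files = 0
    bytes_written = 0
    for event in parse_events(lines):
        kind = event.get("event")
        if kind == "flush_started":
            reasons[event.get("flush_reason", "unknown")] += 1
        elif kind == "table_file_creation":
            files += 1
            bytes_written += event.get("file_size", 0)
    return reasons, files, bytes_written
```

Passing an open file handle streams the LOG line by line, for example `summarize(open(path, errors="replace"))`, which avoids loading tens of gigabytes into memory.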

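  1. How did this log get so large? What did you do to it? This LOG file is actually not very useful; it does not store data.
  2. One flush every 10 minutes should be normal. Check what the frequency is now. There isn't some odd crontab job triggering flushes, is there?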
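
The flush frequency asked about in point 2 can be estimated from the timestamps of the "Manual flush start" lines in the LOG. A minimal sketch, assuming the timestamp layout seen in the excerpt (`YYYY/MM/DD-HH:MM:SS.ffffff` as the first field); the function names are illustrative.

```python
from datetime import datetime

STAMP_FORMAT = "%Y/%m/%d-%H:%M:%S.%f"


def flush_start_times(lines):
    """Collect the timestamps of all 'Manual flush start' lines."""
    times = []
    for line in lines:
        if "Manual flush start" not in line:
            continue
        stamp = line.split(" ", 1)[0]
        try:
            times.append(datetime.strptime(stamp, STAMP_FORMAT))
        except ValueError:
            continue  # line does not begin with a timestamp
    return times


def mean_interval_seconds(times):
    """Average gap between consecutive flush starts, or None if < 2 samples."""
    if len(times) < 2:
        return None
    span = (times[-1] - times[0]).total_seconds()
    return span / (len(times) - 1)
```

A result near 600 would match the ten-minute expectation; a result in the millisecond range would mean something is requesting flushes continuously.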

I didn't do anything. This LOG keeps growing, flushes are extremely frequent, and the sst files keep increasing too.


There are no scheduled jobs in crontab
[image]

@critical27 Maybe the behavior of some parameter has changed?

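Why were clean_wal_interval_secs and wal_ttl changed to 0? And why does the wal size show as 0 in your screenshot below? These parameters should not be changed casually. What exactly did you change to disable the wal?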

Disabling the wal just means setting both of these parameters to 0. Our workload has strict limits on disk usage, so we chose not to generate wal logs.
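
Since the two parameters set to 0 are the current suspects, it can help to check every storaged config file for them before changing anything else. The sketch below assumes the config is a gflags-style flag file with one `--name=value` per line; the flag names come from this thread, while the function names and the idea of scanning for literal `0` are only illustrative. It does not say what the correct values are, only which of the watched flags are zeroed.

```python
WATCHED = ("wal_ttl", "clean_wal_interval_secs")


def read_flags(text):
    """Parse '--name=value' lines into a dict; other lines are ignored."""
    flags = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("--") or "=" not in line:
            continue
        name, value = line[2:].split("=", 1)
        flags[name.strip()] = value.strip()
    return flags


def zeroed_wal_flags(text):
    """Return the watched WAL-related flags that are explicitly set to 0."""
    flags = read_flags(text)
    return [name for name in WATCHED if flags.get(name) == "0"]
```

Running `zeroed_wal_flags(open(conf_path).read())` on each storage node's config (where `conf_path` is wherever that deployment keeps it) gives a quick list to compare against the values shipped with the release.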