Nebula Graphd (3.5.0) crashes

Do you mean you'd like both the startup logs and the query logs collected for review?

Yes, some of the logs from startup.

graphd log at verbosity level 2:
nebula-graphd.data-cdh6-test02.root.log.INFO.20230630-134129.8263 (30.9 KB)

graphd log at verbosity level 4:
nebula-graphd.data-cdh6-test02.root.log.INFO.20230630-171538.961 (120.6 KB)

Configuration file:
nebula-graphd.conf (5.0 KB)

Flags of the running process, dumped with:
curl http://172.20.221.58:19669/flags

check_plan_killed_frequency=8
cluster_id_path="cluster.id"
expired_time_factor=5
failed_login_attempts=0
heartbeat_interval_secs=10
meta_client_retry_interval_secs=1
meta_client_retry_times=3
meta_client_timeout_ms=60000
password_lock_time_in_secs=0
storage_client_retry_interval_ms=1000
storage_client_timeout_ms=60000
enable_udf=0
udf_path="/alidata2/nebula-graph3/udf/"
log_disk_check_interval_secs=10
log_min_reserved_bytes_to_error=67108864
log_min_reserved_bytes_to_fatal=4194304
log_min_reserved_bytes_to_warn=268435456
cgroup_v1_memory_current_path="/sys/fs/cgroup/memory/memory.usage_in_bytes"
cgroup_v1_memory_max_path="/sys/fs/cgroup/memory/memory.limit_in_bytes"
cgroup_v1_memory_stat_path="/sys/fs/cgroup/memory/memory.stat"
cgroup_v2_controllers="/sys/fs/cgroup/cgroup.controllers"
cgroup_v2_memory_current_path="/sys/fs/cgroup/memory.current"
cgroup_v2_memory_max_path="/sys/fs/cgroup/memory.max"
cgroup_v2_memory_stat_path="/sys/fs/cgroup/memory.stat"
containerized=0
memory_purge_enabled=1
memory_purge_interval_seconds=10
memory_tracker_available_ratio=0.8
memory_tracker_detail_log=1
memory_tracker_detail_log_interval_ms=3000
memory_tracker_limit_ratio=0.4
memory_tracker_untracked_reserved_memory_mb=1024
system_memory_high_watermark_ratio=0.5
gflags_mode_json="share/resources/gflags.json"
ca_path=""
cert_path=""
enable_graph_ssl=0
enable_meta_ssl=0
enable_ssl=0
key_path=""
password_path=""
conn_timeout_ms=1000
timezone_file="share/resources/date_time_zonespec.csv"
timezone_name="UTC+00:00:00"
redirect_stdout=1
stderr_log_file="graphd-stderr.log"
stdout_log_file="graphd-stdout.log"
enable_lifetime_optimize=1
path_batch_size=10000
path_threshold_ratio=2
path_threshold_size=100
traverse_parallel_threshold_rows=150000
enable_optimizer_property_pruner_rule=1
max_plan_depth=512
enable_optimizer_collapse_project_rule=1
accept_partial_success=0
auth_type="password"
client_idle_timeout_secs=28800
client_white_list="3.5.0:3.0.0"
cloud_http_url=""
daemonize=1
default_charset="utf8"
default_collate="utf8_bin"
disable_octal_escape_char=0
enable_async_gc=0
enable_authorize=0
enable_client_white_list=1
enable_data_balance=1
enable_experimental_feature=1
enable_optimizer=1
ft_request_retry_times=3
gc_worker_size=0
graph_use_vertex_key=0
listen_backlog=1024
listen_netdev="any"
local_config=1
local_ip="172.20.221.58"
max_allowed_connections=9223372036854775807
max_allowed_query_size=4194304
max_allowed_statements=512
max_job_size=1
max_sessions_per_ip_per_user=500
meta_server_addrs="172.20.221.57:9559,172.20.221.58:9559,172.20.221.59:9559"
min_batch_size=8192
num_accept_threads=1
num_max_connections=0
num_netio_threads=2
num_operator_threads=2
num_path_thread=10
num_rows_to_check_memory=1024
num_worker_threads=2
optimize_appendvertices=0
pid_file="pids/nebula-graphd.pid"
port=9669
reuse_port=0
session_idle_timeout_secs=28800
session_reclaim_interval_secs=60
check_memory_interval_in_secs=1
enable_space_level_metrics=0
slow_query_threshold_us=200000
max_expression_depth=512
ws_http_port=19669
ws_ip="172.20.221.58"
ws_threads=4
codel_enabled=0
thrift_cpp2_protocol_reader_container_limit=0
thrift_cpp2_protocol_reader_string_limit=0
thrift_server_request_debug_log_entries_max=10000
service_identity=""
thrift_abort_if_exceeds_shutdown_deadline=1
thrift_ssl_policy="disabled"
folly_memory_idler_purge_arenas=1
dynamic_cputhreadpoolexecutor=1
codel_interval=100
codel_target_delay=5
dynamic_iothreadpoolexecutor=1
threadtimeout_ms=60000
observer_manager_pool_size=4
logging=""
folly_hazptr_use_executor=1
flagfile="/alidata1/opt/nebula-graph-3.5.0/etc/nebula-graphd.conf"
fromenv=""
tryfromenv=""
undefok=""
tab_completion_columns=80
tab_completion_word=""
help=0
helpfull=0
helpmatch=""
helpon=""
helppackage=0
helpshort=0
helpxml=0
version=0
alsologtoemail=""
alsologtostderr=0
colorlogtostderr=0
drop_log_memory=1
log_backtrace_at=""
log_dir="/alidata2/nebula-graph3/log"
log_link=""
log_prefix=1
log_utc_time=0
logbuflevel=0
logbufsecs=0
logemaillevel=999
logfile_mode=436
logmailer=""
logtostderr=0
max_log_size=1800
minloglevel=0
stderrthreshold=3
stop_logging_if_full_disk=0
timestamp_in_logfile_name=1
symbolize_stacktrace=1
v=4
vmodule=""
zlib_compressor_buffer_growth=2024
s2shape_index_cell_size_to_long_edge_ratio=1
s2shape_index_default_max_edges_per_cell=10
s2shape_index_tmp_memory_budget_mb=100
s2cell_union_decode_max_num_cells=1000000
s2debug=0
s2loop_lazy_indexing=1
s2polygon_decode_max_num_vertices=50000000
s2polygon_decode_max_num_loops=10000000
s2polygon_lazy_indexing=1
dcache_unit_test=0

Here is another interesting observation.
The machine has 2 cores and 16 GB of RAM (2C16G).
Watermark settings used for the test:
--system_memory_high_watermark_ratio=0.5
--memory_tracker_limit_ratio=0.5
--memory_tracker_untracked_reserved_memory_mb=50
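
For reference: assuming the memory tracker computes its limit as (available memory - untracked reserved memory) * memory_tracker_limit_ratio, which is the formula described in the NebulaGraph docs (treat it as an assumption here, not something confirmed in this thread), these settings work out to roughly (16 GB - 50 MB) * 0.5 ≈ 8 GB. That matches the ~8 GB point at which NGQL 2 below gets intercepted.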

NGQL 1
MATCH p=(src)-[e:ScheduleTaskRelationship*10]-(dst) where id(src) in [1113869279948709888] RETURN DISTINCT p LIMIT 300;

Run as-is, this query blows up memory: 15 GB+ is still not enough and the process gets OOM-killed (the memory tracker does not intercept it).
Execution log:
nebula-graphd.data-cdh6-test02.root.log.INFO.20230630-181510.9310 (64.8 KB)

NGQL 2
MATCH p=(src)-[e:ScheduleTaskRelationship*5]-(dst) where id(src) in [1113869279948709888,1113869416506859520,1113869280837902336,1113869280984702976,1113869281156669440,1113869281303470080,1113869281446076416,1113869281584488448,1113869282473680896,1113869282616287232,1113869282746310656,1113869282876334080,1113869283023134720,1113869286454075392,1113869286584098816,1113869286718316544,1113869286844145664,1113869286961586176,1113869287175495680,1113869287309713408] RETURN DISTINCT p LIMIT 300

This one executes successfully: with the watermark set to 0.9 it consumes roughly 12.x GB of memory and runs to completion.

However, after setting the watermark to 0.5 and restarting the service, this query does get intercepted once it reaches the expected 8 GB+, and the Java client reports error code -2600.
Interception log:

nebula-graphd.data-cdh6-test02.root.log.INFO.20230630-181007.8635 (121.3 KB)
Client log:
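
For context, this is roughly how that error surfaces on the Java side. A minimal sketch using the nebula-java 3.x client; the credentials, the placeholder space name, the 5-hop query, and the assumption that the memory-tracker rejection arrives as error code -2600 (E_GRAPH_MEMORY_EXCEEDED) on the ResultSet are illustrative, not taken from the actual client code in this thread:

import java.util.Collections;
import com.vesoft.nebula.client.graph.NebulaPoolConfig;
import com.vesoft.nebula.client.graph.data.HostAddress;
import com.vesoft.nebula.client.graph.data.ResultSet;
import com.vesoft.nebula.client.graph.net.NebulaPool;
import com.vesoft.nebula.client.graph.net.Session;

public class MemoryExceededDemo {
    public static void main(String[] args) throws Exception {
        NebulaPool pool = new NebulaPool();
        pool.init(Collections.singletonList(new HostAddress("172.20.221.58", 9669)),
                new NebulaPoolConfig());
        Session session = pool.getSession("root", "nebula", false);
        try {
            // the graph space name is a placeholder; the thread does not mention it
            session.execute("USE your_space");
            ResultSet rs = session.execute(
                    "MATCH p=(src)-[e:ScheduleTaskRelationship*5]-(dst) "
                    + "WHERE id(src) IN [1113869279948709888] RETURN DISTINCT p LIMIT 300");
            if (!rs.isSucceeded()) {
                // When the memory tracker rejects the query, the expectation (per the docs)
                // is errorCode == -2600, i.e. E_GRAPH_MEMORY_EXCEEDED.
                System.out.println("error " + rs.getErrorCode() + ": " + rs.getErrorMessage());
            }
        } finally {
            session.release();
            pool.close();
        }
    }
}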

OK, we will try to reproduce this with the 10-hop query on our side; it does look like something is wrong.

Hi, as a first step, please try changing max_job_size in the config file from its default of 1 to a larger value, e.g. 2, and check whether the interception then takes effect:
--max_job_size=2

We will fix the max_job_size=1 case.
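
For anyone following along, the change can be applied in the config file and then verified through the same flags endpoint used earlier in this thread; a sketch assuming the paths and ports shown above:

# in /alidata1/opt/nebula-graph-3.5.0/etc/nebula-graphd.conf
--max_job_size=2

# restart nebula-graphd, then confirm the running value via the HTTP flags endpoint
curl -s http://172.20.221.58:19669/flags | grep max_job_size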

After changing it to 2, I verified it: the query is now intercepted correctly! Thanks for the quick support, much appreciated!
