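- NebulaGraph version: 3.8.0
- Deployment: distributed
- Installation method: RPM
- In production: N
- Hardware information
- Disk: HDD
- CPU and memory: 48 cores, 256 GB
- Number of servers: 3
- Detailed description of the problem
graphd and storaged generate core files at irregular intervals.
I am using three completely idle servers, each with 256 GB of memory and a 48-core CPU, and the database holds only a few records, yet graphd and storaged produce many core files. Below is the gdb analysis of the core files: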
graphd:
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/nebula/nebula-graph-3.8.0.el7/bin/nebula-graphd --flagfile /home/nebula/n'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000002282b5d in folly::EventBase::runImmediatelyOrRunInEventBaseThreadAndWait(folly::Function<void ()>) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-317.el7.x86_64
(gdb) bt
#0 0x0000000002282b5d in folly::EventBase::runImmediatelyOrRunInEventBaseThreadAndWait(folly::Function<void ()>) ()
#1 0x000000000178a792 in ?? ()
#2 0x00000000018f4b4c in std::_Hashtable<std::pair<nebula::HostAddr, folly::EventBase*>, std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > >, std::__detail::_Select1st, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() ()
#3 0x00000000018f4bde in void folly::threadlocal_detail::ElementWrapper::set<std::unordered_map<std::pair<nebula::HostAddr, folly::EventBase*>, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient>, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > > >*>(std::unordered_map<std::pair<nebula::HostAddr, folly::EventBase*>, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient>, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > > >*)::{lambda(void*, folly::TLPDestructionMode)#2}::_FUN(void*, folly::TLPDestructionMode) ()
#4 0x00000000021fa082 in folly::threadlocal_detail::StaticMetaBase::onThreadExit(void*) ()
#5 0x00007f7482bf1ca2 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#6 0x00007f7482bf1eb3 in start_thread () from /lib64/libpthread.so.0
#7 0x00007f748291a96d in clone () from /lib64/libc.so.6
storaged:
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `/home/nebula/nebula-graph-3.8.0.el7/bin/nebula-storaged --flagfile /home/nebula'.
Program terminated with signal 11, Segmentation fault.
#0 0x00000000026a6afd in folly::EventBase::runImmediatelyOrRunInEventBaseThreadAndWait(folly::Function<void ()>) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.17-317.el7.x86_64
#0 0x00000000026a6afd in folly::EventBase::runImmediatelyOrRunInEventBaseThreadAndWait(folly::Function<void ()>) ()
#1 0x00000000013fcef2 in ?? ()
#2 0x000000000156b2ec in std::_Hashtable<std::pair<nebula::HostAddr, folly::EventBase*>, std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > >, std::__detail::_Select1st, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::~_Hashtable() ()
#3 0x000000000156b37e in void folly::threadlocal_detail::ElementWrapper::set<std::unordered_map<std::pair<nebula::HostAddr, folly::EventBase*>, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient>, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > > >*>(std::unordered_map<std::pair<nebula::HostAddr, folly::EventBase*>, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient>, std::hash<std::pair<nebula::HostAddr, folly::EventBase*> >, std::equal_to<std::pair<nebula::HostAddr, folly::EventBase*> >, std::allocator<std::pair<std::pair<nebula::HostAddr, folly::EventBase*> const, std::shared_ptr<nebula::meta::cpp2::MetaServiceAsyncClient> > > >*)::{lambda(void*, folly::TLPDestructionMode)#2}::_FUN(void*, folly::TLPDestructionMode) ()
#4 0x0000000002614142 in folly::threadlocal_detail::StaticMetaBase::onThreadExit(void*) ()
#5 0x00007f44d5e64ca2 in __nptl_deallocate_tsd () from /lib64/libpthread.so.0
#6 0x00007f44d5e64eb3 in start_thread () from /lib64/libpthread.so.0
#7 0x00007f44d5b8d96d in clone () from /lib64/libc.so.6
The system log also shows errors:
dmesg | grep nebula
[761397.518518] graph-netio3[33493]: segfault at 7fc687a02310 ip 0000000002282b5d sp 00007fc6887f5410 error 4 in nebula-graphd[f4b000+1913000]
[766000.751482] graph-netio2[42458]: segfault at 7fbc4f402310 ip 0000000002282b5d sp 00007fbc501f5410 error 4 in nebula-graphd[f4b000+1913000]
[766053.127185] graph-netio1[4170]: segfault at 7f260e602310 ip 0000000002282b5d sp 00007f260f3f5410 error 4 in nebula-graphd[f4b000+1913000]
[776420.065589] graph-netio7[8275]: segfault at 7ff4ea002310 ip 0000000002282b5d sp 00007ff4eadf5410 error 4 in nebula-graphd[f4b000+1913000]
[777792.449798] IOThreadPool1[46250]: segfault at 7f9c3ec05090 ip 00000000026a6afd sp 00007f9c3f9f5390 error 4 in nebula-storaged[1002000+1c8f000]
[779193.289671] graph-netio3[27141]: segfault at 7f9463e02310 ip 0000000002282b5d sp 00007f9464bf5410 error 4 in nebula-graphd[f4b000+1913000]
[779910.045721] graph-netio4[24777]: segfault at 7f932a202310 ip 0000000002282b5d sp 00007f932aff5410 error 4 in nebula-graphd[f4b000+1913000]
[779978.574505] graph-netio8[27523]: segfault at 7f66ad202310 ip 0000000002282b5d sp 00007f66adff5410 error 4 in nebula-graphd[f4b000+1913000]
[785030.694424] graph-netio2[46409]: segfault at 7fe3acc02310 ip 0000000002282b5d sp 00007fe3ad9f5410 error 4 in nebula-graphd[f4b000+1913000]
[850662.697566] graph-netio1[45942]: segfault at 7fb5de602310 ip 0000000002282b5d sp 00007fb5df3f5410 error 4 in nebula-graphd[f4b000+1913000]
[850665.186527] IOThreadPool332[30184]: segfault at 7fd7d5f5d090 ip 00000000026a6afd sp 00007fd6e7a46390 error 4 in nebula-storaged[1002000+1c8f000]
[862849.480717] graph-netio4[33843]: segfault at 7f582de02310 ip 0000000002282b5d sp 00007f582ebf5410 error 4 in nebula-graphd[f4b000+1913000]
[933167.660636] graph-netio2[36429]: segfault at 7f3d1a202310 ip 0000000002282b5d sp 00007f3d1aff5410 error 4 in nebula-graphd[f4b000+1913000]
[933276.125462] graph-netio1[1591]: segfault at 7f2772802310 ip 0000000002282b5d sp 00007f27735f5410 error 4 in nebula-graphd[f4b000+1913000]
[933538.312948] graph-netio1[6091]: segfault at 7ff3d1c02310 ip 0000000002282b5d sp 00007ff3d29f5410 error 4 in nebula-graphd[f4b000+1913000]
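
To make the pattern in the dmesg output easier to see, here is a rough sketch of a script that groups the segfault lines by binary, faulting instruction pointer, and error code. The function names (`parse_segfaults`, `summarize`, `decode_error`) are made up for illustration, the regular expression is only fitted to the line format shown above, and `SAMPLE` is just the first two graphd lines and the first storaged line copied from this post. The error-code decoding assumes the usual x86 page-fault bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode).

```python
import re
from collections import Counter

# Fitted to the kernel segfault lines shown in the dmesg output above.
SEGFAULT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at "
    r"(?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<binary>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_error(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }


def parse_segfaults(text):
    """Return one dict per segfault line; lines that do not match are skipped."""
    records = []
    for line in text.splitlines():
        m = SEGFAULT_RE.search(line)
        if m:
            rec = m.groupdict()
            rec["err"] = int(rec["err"])
            rec["ip"] = int(rec["ip"], 16)
            records.append(rec)
    return records


def summarize(records):
    """Count crashes per (binary, instruction pointer, error code)."""
    return Counter((r["binary"], hex(r["ip"]), r["err"]) for r in records)


# Three lines copied from the dmesg output in this post.
SAMPLE = """\
[761397.518518] graph-netio3[33493]: segfault at 7fc687a02310 ip 0000000002282b5d sp 00007fc6887f5410 error 4 in nebula-graphd[f4b000+1913000]
[766000.751482] graph-netio2[42458]: segfault at 7fbc4f402310 ip 0000000002282b5d sp 00007fbc501f5410 error 4 in nebula-graphd[f4b000+1913000]
[777792.449798] IOThreadPool1[46250]: segfault at 7f9c3ec05090 ip 00000000026a6afd sp 00007f9c3f9f5390 error 4 in nebula-storaged[1002000+1c8f000]
"""

if __name__ == "__main__":
    for (binary, ip, err), count in summarize(parse_segfaults(SAMPLE)).items():
        print(binary, ip, count, decode_error(err))
```

Fed the full output of `dmesg | grep nebula`, this should show that every graphd crash has the same ip (0x2282b5d, frame #0 in the graphd backtrace) and every storaged crash has the same ip (0x26a6afd, frame #0 in the storaged backtrace), all with error 4, which under the assumption above is a user-mode read of an unmapped page.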