- Nebula version: 1.2.0
- Deployment: distributed
- Production environment: Y
- Hardware information
  - Disk: SSD
  - CPU and memory: 2C*12core, 384 GB
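- Problem description
The graph database had been running for 3 months without any issues when it suddenly became unreachable. Investigation showed that the graphd process had crashed and produced a coredump file. The problem keeps recurring: once queries or writes start, the graphd process crashes again after a short while. The coredump from the first crash was over 90 GB; the coredumps from later crashes are mostly around 700 MB.
Queries and writes work fine against a newly created graph space with freshly inserted data, but any operation on the old graph space causes the graphd process to crash.
- stderr log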
RowReader.cpp:166] Check failed: ver == schema->getVersion() (1 vs. 0)
*** Check failure stack trace: ***
@ 0x19919dc google::LogMessage::Fail()
@ 0x199654d google::LogMessage::SendToLog()
@ 0x19916ad google::LogMessage::Flush()
@ 0x1991f08 google::LogMessageFatal::~LogMessageFatal()
@ 0x1578567 nebula::RowReader::getRowReader()
@ 0xfea2f3 nebula::graph::FetchVerticesExecutor::processResult()
@ 0xfebb7f _ZN5folly6detail8function14FunctionTraitsIFvONS_3TryIN6nebula7storage18StorageRpcResponseINS5_4cpp213QueryResponseEEEEEEE9callSmallIZNS_7futures6detail10FutureBaseIS9_E18thenImplementationIZNOS_6FutureIS9_E9thenValueIRZNS4_5graph21FetchVerticesExecutor13fetchVerticesEvEUlOS9_E_EENSK_INSG_19valueCallableResultIS9_T_E10value_typeEEEOST_EUlSB_E_NSG_14callableResultIS9_SY_EELb1EJSB_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENS12_6ReturnEE4typeESX_NSG_9argResultIXT1_EST_JDpT2_EEEEUlSB_E_EEvRNS1_4DataESB_
@ 0xf26481 _ZN5folly6detail8function14FunctionTraitsIFvvEE9callSmallIZNS_7futures6detail4CoreIN6nebula7storage18StorageRpcResponseINSA_4cpp213QueryResponseEEEE10doCallbackEvEUlvE0_EEvRNS1_4DataE
@ 0x10aaee9 apache::thrift::concurrency::FunctionRunner::run()
@ 0x16a3c37 apache::thrift::concurrency::ThreadManager::Task::run()
@ 0x16e8b9d apache::thrift::concurrency::ThreadManager::ImplT<>::Worker<>::run()
@ 0x17b9e1b apache::thrift::concurrency::PthreadThread::threadMain()
@ 0x2af05fe74e24 start_thread
@ 0x2af06018134c __clone
- Coredump debugging information
(gdb) backtrace
#0 0x00002b11df8db1f7 in raise () from /lib64/libc.so.6
#1 0x00002b11df8dc8e8 in abort () from /lib64/libc.so.6
#2 0x0000000000eb1bf6 in ?? ()
#3 0x00000000019919dd in google::LogMessage::Fail() ()
#4 0x000000000199654e in google::LogMessage::SendToLog() ()
#5 0x00000000019916ae in google::LogMessage::Flush() ()
#6 0x0000000001991f09 in google::LogMessageFatal::~LogMessageFatal() ()
#7 0x0000000001578568 in nebula::RowReader::getRowReader(folly::Range<char const*>, std::shared_ptr<nebula::meta::SchemaProviderIf const>) ()
#8 0x0000000000fea2f4 in nebula::graph::FetchVerticesExecutor::processResult(nebula::storage::StorageRpcResponse<nebula::storage::cpp2::QueryResponse>&&) ()
#9 0x0000000000febb80 in _ZN5folly6detail8function14FunctionTraitsIFvONS_3TryIN6nebula7storage18StorageRpcResponseINS5_4cpp213QueryResponseEEEEEEE9callSmallIZNS_7futures6detail10FutureBaseIS9_E18thenImplementationIZNOS_6FutureIS9_E9thenValueIRZNS4_5graph21FetchVerticesExecutor13fetchVerticesEvEUlOS9_E_EENSK_INSG_19valueCallableResultIS9_T_E10value_typeEEEOST_EUlSB_E_NSG_14callableResultIS9_SY_EELb1EJSB_EEENSt9enable_ifIXntsrNT0_13ReturnsFutureE5valueENS12_6ReturnEE4typeESX_NSG_9argResultIXT1_EST_JDpT2_EEEEUlSB_E_EEvRNS1_4DataESB_ ()
#10 0x0000000000f26482 in void folly::detail::function::FunctionTraits<void ()>::callSmall<folly::futures::detail::Core<nebula::storage::StorageRpcResponse<nebula::storage::cpp2::QueryResponse> >::doCallback()::{lambda()#2}>(folly::detail::function::Data&) ()
#11 0x00000000010aaeea in apache::thrift::concurrency::FunctionRunner::run() ()
#12 0x00000000016a3c38 in apache::thrift::concurrency::ThreadManager::Task::run() ()
#13 0x00000000016e8b9e in apache::thrift::concurrency::ThreadManager::ImplT<folly::LifoSemImpl<std::atomic, folly::SaturatingSemaphore<true, std::atomic> > >::Worker<folly::LifoSemImpl<std::atomic, folly::SaturatingSemaphore<true, std::atomic> > >::run() ()
#14 0x00000000017b9e1c in apache::thrift::concurrency::PthreadThread::threadMain(void*) ()
#15 0x00002b11df691e25 in start_thread () from /lib64/libpthread.so.0
#16 0x00002b11df99e34d in clone () from /lib64/libc.so.6