Timeout while creating a schema in nebula

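  • nebula version: v3.2.0
  • Deployment: single machine
  • Installation: compiled from source
  • Production environment: N
  • Hardware information
    • Disk (SSD recommended)
    • CPU and memory information
  • Detailed description of the problem
    I connected to nebula with nebula-console to create a schema, and partway through the run it suddenly timed out, so some TAGs were not created. I have deployed this many times before, and this is the first time a timeout has occurred during creation.
    nebula-console log: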
    Mon, 26 Sep 2022 23:32:06 CST
    [ERROR (-1005)]: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out

nebula-metad log:
E20220926 23:32:06.624686 15193 Serializer.h:43] Thrift serialization is only defined for structs and unions, not containers thereof. Attemping to deserialize a value of type nebula::Value.

nebula-graphd log:
E20220926 23:31:37.465178 15076 FileUtils.cpp:377] Failed to read the directory "/usr/local/nebula/data/storage/nebula" (2): No such file or directory
E20220926 23:36:11.015511 15466 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:36:11.015605 15466 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:36:11.015694 17128 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:24.060457 15119 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:40:24.060535 15119 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:24.060612 17128 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:44:37.096607 15512 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:44:37.096765 15512 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:44:37.096901 17128 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:48:50.138955 15476 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:48:50.139086 15476 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:48:50.139174 17128 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out

nebula-graphd log:
E20220926 23:32:06.624053 15027 Serializer.h:43] Thrift serialization is only defined for structs and unions, not containers thereof. Attemping to serialize a value of type nebula::Value.
E20220926 23:36:09.663136 15207 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:36:09.663250 15207 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:36:09.663482 15024 QueryInstance.cpp:137] RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:36:13.829311 15209 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:36:13.829394 15209 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:36:13.829491 15215 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:38:22.130633 15206 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:38:22.130765 15206 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:38:22.130820 15206 GraphSessionManager.cpp:260] Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:38:22.130903 15216 GraphSessionManager.cpp:284] Update sessions failed: Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:12.694370 15207 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:40:12.694460 15207 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:12.694646 15027 QueryInstance.cpp:137] RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:26.867775 15208 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:40:26.867859 15208 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:40:26.867954 15215 MetaClient.cpp:192] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:42:35.172431 15209 MetaClient.cpp:758] Send request to "127.0.0.1":9559, exceed retry limit
E20220926 23:42:35.172556 15209 MetaClient.cpp:759] RpcResponse exception: apache::thrift::transport::TTransportException: TTransportException: Timed out
E20220926 23:42:35.172622 15209 GraphSessionManager.cpp:260] Update sessions failed: RPC failure in MetaClient: apache::thrift::transport::TTransportException: TTransportException: Timed out

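I suspect metad crashed suddenly, but nebula has already been restarted, so I can't tell whether metad was really the cause. After the restart, recreating the schema worked normally. What could cause a timeout partway through creating a schema?

Could you share more of the nebula metad log?

There are no more meta logs.

I see that a core file was generated, but it is large, about 700 MB.

Could you print the stack trace with bt?

Could you post the exact statements used to create the schema?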

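Also, the original timeout error usually shows up when metad cannot be reached because of a network or firewall problem, so please double-check that the ports are configured correctly.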
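A quick way to rule out the network or firewall explanation is to test whether the metad port accepts TCP connections from the graphd host. The sketch below is only an illustration: `can_connect` is a made-up helper name, and 127.0.0.1:9559 is the metad address that appears in the graphd log above, so adjust it to match your own configuration.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


if __name__ == "__main__":
    # Assumption: metad listens on 127.0.0.1:9559, as shown in the graphd log.
    print("metad reachable:", can_connect("127.0.0.1", 9559))
```

A successful connection only shows that the port is open at that moment; it does not prove metad was healthy when the timeout happened.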

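Also, metad should have a stderr log. Could you post it? It may contain an error stack trace or similar.

The statements are fine. This is not the first time I have run them and they never failed before; after restarting nebula they ran OK again.

I don't think that's the cause. I put the statements in a file and executed it with nebula-console, and it suddenly timed out partway through. The statements are: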
CREATE TAG IF NOT EXISTS Process (pid int DEFAULT -1, ttl int) ttl_duration = 1296000, ttl_col = "ttl";
CREATE TAG IF NOT EXISTS File (fname string DEFAULT "", ttl int) ttl_duration = 1296000, ttl_col = "ttl";
CREATE TAG IF NOT EXISTS Agent (agent_id string DEFAULT "", ttl int) ttl_duration = 1296000, ttl_col = "ttl";

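The first two statements succeeded; the third one failed.

There is a log, but it is not from the time of the error. It was printed around the restart the next day.

There is not much error information, so this is hard to pin down.

The similar cases we have seen before were all caused by the network being down or the host being unreachable. You could keep it running for a while longer and see: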

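  1. If the problem happens again, remember to preserve all the logs from the scene.
  2. Check whether metad was unreachable at that moment, or whether the machine was under heavy load.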
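To have something to look at if the timeout comes back, one option is to record machine load and metad reachability at regular intervals. This is a minimal sketch, assuming a Linux host (`os.getloadavg` is Unix-only); `snapshot` is a hypothetical helper, and 127.0.0.1:9559 is the metad address taken from the logs in this thread.

```python
import os
import socket
import time


def snapshot(host: str = "127.0.0.1", port: int = 9559, timeout: float = 3.0) -> dict:
    """Record the current time, system load averages, and whether host:port accepts TCP."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        reachable = False
    load1, load5, load15 = os.getloadavg()  # 1, 5, and 15 minute load averages
    return {
        "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        "metad_reachable": reachable,
        "load1": load1,
        "load5": load5,
        "load15": load15,
    }
```

Calling `snapshot()` every few seconds and appending the result to a file gives a timeline that can be compared with the metad and graphd logs after the next failure.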
