Data disappeared after scaling out

  • Deployment: distributed
  • Installation method: RPM
  • In production: N
  • Hardware
    • Disk (SSD recommended)
    • CPU and memory
  • Problem description

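I scaled a single-node 3.5 deployment out to three nodes. Following the documentation, I changed the meta, storage, and graph IPs on all three machines. After the services started, the original data was gone.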

storage:
E20230720 18:19:10.702764 316258 MetaClient.cpp:772] Send request to "172.16.0.107":9559, exceed retry limit
E20230720 18:19:10.702965 316258 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:10.703018 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:23.708588 316267 MetaClient.cpp:772] Send request to "172.16.0.59":9559, exceed retry limit
E20230720 18:19:23.708632 316267 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:23.708696 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:36.714881 316269 MetaClient.cpp:772] Send request to "172.16.0.107":9559, exceed retry limit
E20230720 18:19:36.714921 316269 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:36.714987 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Failed to write to remote endpoint. Wrote 0 bytes. AsyncSocketException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused)
E20230720 18:19:49.726441 316271 MetaClient.cpp:772] Send request to "172.16.0.164":9559, exceed retry limit
E20230720 18:19:49.726496 316271 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20230720 18:19:49.726555 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20230720 18:20:02.731951 316342 MetaClient.cpp:772] Send request to "172.16.0.164":9559, exceed retry limit
E20230720 18:20:02.732009 316342 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20230720 18:20:02.732112 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20230720 18:20:15.737222 316344 MetaClient.cpp:772] Send request to "172.16.0.164":9559, exceed retry limit
E20230720 18:20:15.737280 316344 MetaClient.cpp:773] RpcResponse exception: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connection refused): Connection refused
E20230720 18:20:15.737357 316216 MetaClient.cpp:112] Heartbeat failed, status:RPC failure in MetaClient: apache::thrift::transport::TTransportException: Dropping unsent request. Connection closed after: apache::thrift::transport::TTransportException: AsyncSocketException: connect failed, type = Socket not open, errno = 111 (Connect
E20230720 18:20:25.753047 316216 MetaClient.cpp:2014] Space 1226 not found!
E20230720 18:20:25.753067 316216 MetaClient.cpp:2014] Space 1226 not found!
E20230720 18:20:25.837155 316216 MetaClient.cpp:2014] Space 3629 not found!
E20230720 18:20:25.837173 316216 MetaClient.cpp:2014] Space 3629 not found!
E20230720 18:20:26.910533 316216 MetaClient.cpp:2014] Space 134 not found!
E20230720 18:20:26.910552 316216 MetaClient.cpp:2014] Space 134 not found!
E20230720 18:20:26.999480 316216 MetaClient.cpp:2014] Space 242 not found!
E20230720 18:20:26.999495 316216 MetaClient.cpp:2014] Space 242 not found!
E20230720 18:20:27.117458 316216 MetaClient.cpp:2014] Space 174 not found!
E20230720 18:20:27.117473 316216 MetaClient.cpp:2014] Space 174 not found!
E20230720 18:20:27.553972 316216 MetaClient.cpp:2014] Space 1792 not found!
E20230720 18:20:27.553989 316216 MetaClient.cpp:2014] Space 1792 not found!
E20230720 18:20:27.686390 316216 MetaClient.cpp:2014] Space 164 not found!
E20230720 18:20:27.686406 316216 MetaClient.cpp:2014] Space 164 not found!
E20230720 18:20:27.762923 316216 MetaClient.cpp:2014] Space 131 not found!
E20230720 18:20:27.762936 316216 MetaClient.cpp:2014] Space 131 not found!
E20230720 18:20:27.816511 316216 MetaClient.cpp:2014] Space 114 not found!
E20230720 18:20:27.816524 316216 MetaClient.cpp:2014] Space 114 not found!
E20230720 18:20:27.930830 316216 MetaClient.cpp:2014] Space 136 not found!
E20230720 18:20:27.930855 316216 MetaClient.cpp:2014] Space 136 not found!
E20230720 18:20:28.186867 316216 MetaClient.cpp:2014] Space 1229 not found!
E20230720 18:20:28.186887 316216 MetaClient.cpp:2014] Space 1229 not found!
E20230720 18:20:28.334877 316216 MetaClient.cpp:2014] Space 1225 not found!
E20230720 18:20:28.334895 316216 MetaClient.cpp:2014] Space 1225 not found!
E20230720 18:20:28.416689 316216 MetaClient.cpp:2014] Space 165 not found!
E20230720 18:20:28.416715 316216 MetaClient.cpp:2014] Space 165 not found!
E20230720 18:20:28.496606 316216 MetaClient.cpp:2014] Space 125 not found!
E20230720 18:20:28.496621 316216 MetaClient.cpp:2014] Space 125 not found!
E20230720 18:20:28.558516 316216 MetaClient.cpp:2014] Space 1 not found!
E20230720 18:20:28.558529 316216 MetaClient.cpp:2014] Space 1 not found!
E20230720 18:20:29.103117 316216 MetaClient.cpp:2014] Space 168 not found!
E20230720 18:20:29.103137 316216 MetaClient.cpp:2014] Space 168 not found!
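
A log like the one above is easier to read once it is condensed. The following is a minimal sketch, not part of the original post, that counts which metad endpoints refused connections and which space IDs were reported as not found. The regular expressions are assumptions derived only from the two message shapes visible in this log, and `summarize` is a hypothetical helper name.

```python
import re
from collections import Counter

# Patterns derived from the two message shapes in the storaged log above.
REFUSED = re.compile(r'Send request to "([\d.]+)":(\d+), exceed retry limit')
MISSING = re.compile(r"Space (\d+) not found!")


def summarize(lines):
    """Count unreachable metad endpoints and collect the space IDs reported missing."""
    refused = Counter()
    missing = set()
    for line in lines:
        match = REFUSED.search(line)
        if match:
            refused[f"{match.group(1)}:{match.group(2)}"] += 1
            continue
        match = MISSING.search(line)
        if match:
            missing.add(int(match.group(1)))
    return refused, sorted(missing)


# A few lines copied from the log in this thread.
sample = [
    'E20230720 18:19:10.702764 316258 MetaClient.cpp:772] Send request to "172.16.0.107":9559, exceed retry limit',
    'E20230720 18:20:15.737222 316344 MetaClient.cpp:772] Send request to "172.16.0.164":9559, exceed retry limit',
    "E20230720 18:20:25.753047 316216 MetaClient.cpp:2014] Space 1226 not found!",
    "E20230720 18:20:25.753067 316216 MetaClient.cpp:2014] Space 1226 not found!",
]

refused, missing = summarize(sample)
print(dict(refused))
print(missing)
```

Run against the full log, this separates the two phases: storaged first fails to reach metad on port 9559 at each of the three addresses, and then, once a connection succeeds, reports that the space IDs it holds on disk are unknown to metad.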

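Did you back up the data beforehand? The documentation should state that scaling out from a single replica or from two replicas is not supported.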

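Data migration failed
Hello, please take a look at that thread; it describes what happened before.
The requirement is to migrate the data from an old single-node 2.6.2 server to a new cluster of three servers, preferably running 3.5. How should I do this? Should I copy the data replicas first and then scale out? The clone statement has no replica_number parameter, though.

Or is there another way to do it?
So far, the single-node migration and the database upgrade have both succeeded.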

Under normal operation, meta cannot be scaled out. Also, data cannot be copied directly from 2.x to 3.x; the files need an upgrade conversion.

Normally you can scale storaged out to three first and, once that succeeds, upgrade to 3.5.0.

I'd like to migrate the old data to a single new node first, upgrade it to 3.5, and then scale out. Would that work?

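As I understand it, I can only add storage services. How much of a performance gain would that give in my case? The three new servers are modest machines with mechanical hard disks, while the single old server has an SSD.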

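Sorry, graphd is stateless and has always been freely scalable. Only metad does not support scaling out by default. If reloading the data is not an option, you are indeed limited to a single metad with multiple graphd and storaged instances.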

If the new cluster runs on HDDs, in some scenarios it may indeed perform worse than the single machine with its SSD. NVMe SSDs are recommended.

If you have to use HDDs, configure multiple data paths (multiple disks).

Also, the replica factor of an existing space cannot be changed. Can't the data be re-imported? That would be the most convenient approach.

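That procedure works, but keep the version and the machine unchanged, then scale storaged/graphd out to the other two machines, and upgrade last. metad cannot be scaled out.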
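
To illustrate the "single metad, several graphd/storaged" layout described in this reply, here is a small sketch that prints the two address flags each graphd or storaged instance would carry in its configuration file. The three IPs are taken from the storaged log earlier in the thread; which of them hosts the single metad is an assumption made purely for illustration, and `service_flags` is a hypothetical helper, not a NebulaGraph tool.

```python
# Hypothetical layout: the IPs come from the storaged log in this thread.
# Which host runs the single metad is an assumption, not stated in the thread.
META_HOST = "172.16.0.107"
META_PORT = 9559  # metad port as seen in the log
NODES = ["172.16.0.107", "172.16.0.59", "172.16.0.164"]


def service_flags(local_ip, meta_host=META_HOST, meta_port=META_PORT):
    """Return the address flags for one graphd or storaged instance."""
    return [
        # Only the one existing metad is listed, because metad is not scaled out.
        f"--meta_server_addrs={meta_host}:{meta_port}",
        # Each instance advertises the address of the machine it runs on.
        f"--local_ip={local_ip}",
    ]


for ip in NODES:
    print(ip, " ".join(service_flags(ip)))
```

The point of the sketch is that every instance on every machine keeps pointing at the same single metad address; listing the two new machines as metad peers is the step this reply says is not supported.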

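Importing the data would be a lot of work. The data volume is too large, and with importer I would have to write out the properties of every tag and edge_type in the yaml file.
Purely in terms of speed, would the old server be better than the three new ones?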

It depends. For example, with three machines, graphd and storaged can have more resources.

But in terms of IOPS, the HDDs across three machines may still fall short of one machine's SSD.

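Also, spaces created in the future can have redundancy and HA across the three machines. And even with lower IOPS, for workloads that can be served from RAM, three machines may still perform better.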
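
The RAM-versus-IOPS trade-off above can be made concrete with a back-of-the-envelope model. This is only a sketch: the formula is the usual weighted average of hit and miss latency, and every number passed in below is an illustrative assumption, not a measurement from this cluster or from NebulaGraph.

```python
def avg_read_latency_ms(hit_ratio, ram_ms, disk_ms):
    """Expected read latency when a fraction hit_ratio of reads is served from RAM."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * ram_ms + (1.0 - hit_ratio) * disk_ms


# Illustrative assumptions only: RAM about 0.001 ms, SSD about 0.1 ms, HDD about 10 ms.
single_ssd = avg_read_latency_ms(hit_ratio=0.50, ram_ms=0.001, disk_ms=0.1)
three_hdd_warm = avg_read_latency_ms(hit_ratio=0.999, ram_ms=0.001, disk_ms=10.0)
three_hdd_cold = avg_read_latency_ms(hit_ratio=0.90, ram_ms=0.001, disk_ms=10.0)

print(single_ssd, three_hdd_warm, three_hdd_cold)
```

Under these assumed numbers, the HDD cluster only comes out ahead when almost every read is served from memory; as soon as a noticeable share of reads goes to disk, the single SSD machine wins, which matches the "it depends" answer above.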

For Importer, you can use the import feature in Studio, which lets you write the yaml through a graphical interface.

https://docs.nebula-graph.com.cn/3.5.0/1.introduction/3.nebula-graph-architecture/2.meta-service/
Judging from the documentation, not scaling out the meta service shouldn't affect query speed much, right?

No impact. Even with the usual three metad instances, only the leader is accessed; you just lose redundancy.

Also, a correction: upgrading from 2.6 directly to 3.5.0 does not require upgrading the underlying data. Thanks to @MuYi for the reminder!

Upgrading to 3.1.0 does involve that step, but a later release rolled back a change to the underlying data, so the step is no longer needed.

OK, thanks for the reply.