Migrating metad

bupt_guojun · April 6, 2021, 03:22

Are there any specific steps for migrating metad?

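yee · April 7, 2021, 01:45

Could you describe the requirement in more detail?

Do you want to migrate all of the metad instances, or only some of the nodes?
Do you want to migrate online or offline?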

cc @critical27 @HarrisChu

bupt_guojun · April 7, 2021, 02:04

All of the metad instances need to be migrated.
The migration needs to be done online.

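critical27 · April 8, 2021, 03:14

What is the requirement, a multi-datacenter deployment? We have not implemented meta migration as such, but something similar can be achieved. The rough procedure is to copy the meta data from cluster A to cluster B and then modify the machine information stored in that data. Other users have done it this way.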

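ThomasWang · April 19, 2021, 02:17

Hello, could you tell me how to modify the machine information in the data?

After the migration, graphd keeps sending data to the old storaged nodes, but the old nodes have been shut down, so it keeps reporting errors.
I am new to Nebula Graph. I could not find any option in the graphd configuration for setting the storaged nodes, only one for the metad nodes. My guess is therefore that graphd looks up the storaged nodes through metad, but it always gets the IP of the old storaged nodes and never the IP of the new ones.

I have already tried adding --local_config=true at the top of the configuration file, but it had no effect.

How should I modify the storaged node information stored in metad?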

HarrisChu · April 19, 2021, 02:59

See "How to migrate data from one machine to another" (post #17 by dingding):

curl -Gs "http://192.168.8.5:19559/replace?from=192.168.8.5&to=192.168.8.6"

19559 is metad's ws_http_port.
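
As a rough sketch of the call above, the URL can be assembled by a small shell helper and then passed to curl in quotes (an unquoted `&` would be treated by the shell as a control operator). The function name `replace_url` is hypothetical; only the `/replace?from=...&to=...` endpoint and the example addresses come from this thread.

```shell
#!/bin/sh
# Hypothetical helper: build the metad /replace URL from its parts.
# $1 = metad host, $2 = metad ws_http_port, $3 = old host, $4 = new host
replace_url() {
    printf 'http://%s:%s/replace?from=%s&to=%s\n' "$1" "$2" "$3" "$4"
}

url=$(replace_url 192.168.8.5 19559 192.168.8.5 192.168.8.6)
echo "$url"
# prints: http://192.168.8.5:19559/replace?from=192.168.8.5&to=192.168.8.6

# To actually send the request, keep the URL quoted:
# curl -Gs "$url"
```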

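ThomasWang · April 19, 2021, 07:45

Thank you very much. With this method I succeeded in the test environment, but it failed in production.
The reason it failed in production is that the test and production environments run different versions.
Test environment version: Nebula Graph (Version 1.2.0)
Production environment version: Nebula Graph (Version 2020.04.01-nightly)

In the test environment, calling the replace endpoint of the metad service and then restarting the nebula services solved the problem. The output was as follows: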

$ curl -Gs "http://172.16.43.103:11000/replace?from=172.16.43.198&to=172.16.43.103"
Replace Host successfully
$ ~/server/nebula/nebula/scripts/nebula.service restart all
[INFO] Stopping nebula-metad...
[INFO] Done
[INFO] Starting nebula-metad...
[INFO] Done
[INFO] Stopping nebula-graphd...
[INFO] Done
[INFO] Starting nebula-graphd...
[INFO] Done
[INFO] Stopping nebula-storaged...
[INFO] Done
[INFO] Starting nebula-storaged...
[INFO] Done

In production, however, calling the replace endpoint of the metad service produced an error. The output was as follows:

$ curl -G "http://172.16.0.40:45500/replace?from=172.16.0.21&to=172.16.0.40"
curl: (52) Empty reply from server

At the same time, the metad service wrote the following error log when the replace endpoint was called:

E0419 15:17:20.581399 21685 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0419 15:17:20.581521 21685 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0419 15:17:20.581599 21685 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1718775661 (hex 0x66726f6d, ascii 'from') (transport N6apache6thrift5async12TAsyncSocketE, address 172.16.0.40, port 48484)

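How can I resolve this error? Or is there another way to modify the machine information held by the metad service in Nebula Graph (Version 2020.04.01-nightly)?

HarrisChu · April 19, 2021, 07:57

Please post the metad configuration file from your 2020.04.01-nightly deployment. The service ports changed between 1.x and 2.x, so let's check whether the wrong port was used.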
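
The metad log quoted above is consistent with a wrong-port diagnosis: the values it reports as message or frame sizes look like pieces of the HTTP request text read as 32-bit integers, which is what one would expect if curl reached the RPC port instead of the HTTP port. The log itself already decodes 0x66726f6d as 'from'. A minimal sketch that decodes both numbers (the helper name `int_to_ascii` is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: interpret a 32-bit integer as four big-endian ASCII bytes.
int_to_ascii() {
    n=$1
    out=""
    for s in 24 16 8 0; do
        byte=$(( (n >> s) & 255 ))
        # Render the byte in octal, then let the outer printf emit it as a character.
        out="$out$(printf "\\$(printf '%03o' "$byte")")"
    done
    printf '%s\n' "$out"
}

int_to_ascii 1195725856   # the "sz=" value in the first log line; prints "GET " (with a trailing space)
int_to_ascii 1718775661   # the frame size in the third log line; prints "from"
```

"GET " is the start of an HTTP request line and "from" is the first query parameter, so the server was most likely parsing the curl request as a binary RPC message.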

Also, why is the production environment running a nightly build? Could you use v2.0.1 instead? Release Nebula Graph v2.0.1 · vesoft-inc/nebula-graph · GitHub

ThomasWang · April 19, 2021, 08:02

The configuration file of the metad service (nebula-metad.conf) for Nebula Graph (Version 2020.04.01-nightly) is as follows:

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-metad.pid

########## logging ##########
# The directory to host logging files, which must already exists
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4, the higher of the level, the more verbose of the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0

########## networking ##########
# Meta Server Address
--meta_server_addrs=172.16.0.40:45500
# Local ip
--local_ip=172.16.0.40
# Meta daemon listening port
--port=45500
# HTTP service ip
--ws_ip=172.16.0.40
# HTTP service port
--ws_http_port=11000
# HTTP2 service port
--ws_h2_port=11002

########## storage ##########
# Root data path, here should be only single path for metad
--data_path=data/meta

########## Misc #########
# The default number of parts when a space is created
--default_parts_num=100
# The default replica factor when a space is created
--default_replica_factor=1

As for why a nightly version is in use, I am not sure of the reason either.

HarrisChu · April 19, 2021, 08:04

--ws_http_port=11000

Use port 11000 with curl, not 45500.
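
One way to avoid mixing up the two ports is to read the HTTP address straight from the configuration file instead of typing it by hand. The sketch below assumes the `--flag=value` line format shown in the posted nebula-metad.conf; the helper name `conf_value` and the temporary demo file are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the value of a --flag=value line in a config file.
# $1 = config file, $2 = flag name without the leading dashes
conf_value() {
    sed -n "s/^--$2=//p" "$1" | tail -n 1
}

# Demo file holding the relevant lines from the config posted above.
conf=$(mktemp)
cat > "$conf" <<'EOF'
--port=45500
--ws_ip=172.16.0.40
--ws_http_port=11000
EOF

host=$(conf_value "$conf" ws_ip)
port=$(conf_value "$conf" ws_http_port)
echo "http://$host:$port/replace?from=172.16.0.21&to=172.16.0.40"
# prints: http://172.16.0.40:11000/replace?from=172.16.0.21&to=172.16.0.40
rm -f "$conf"
```

Because the pattern is anchored at `^--`, asking for `ws_http_port` cannot accidentally pick up the `--port` line, and vice versa.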

ThomasWang · April 19, 2021, 08:12

Thank you, the problem is solved.
After switching to port 11000, the command ran successfully.