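nebula-bench reports "unsupported client version" when importing data

图说天下 · August 30, 2023, 05:55

nebula version: 3.6.0
nebula-importer version: 4.0 (3.1.0)
Deployment: distributed
Installation method: binary package
Running in production: N
Problem description
I followed the steps at the link below to set up a NebulaGraph Bench test environment, but running the importer failed with the error "unsupported client version".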

NebulaGraph Bench test environment setup

nebula-contrib/NebulaGraph-Bench/blob/master/README_cn.md

# NebulaGraph Bench

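`NebulaGraph Bench` measures baseline performance data for NebulaGraph, using the standard LDBC v0.3.3 dataset.

It currently supports NebulaGraph v2.0 and later only.

Main features:

* Generate the LDBC dataset and import it into NebulaGraph.
* Run stress tests with k6.

## Tool dependencies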

|   NebulaGraph Bench    |     NebulaGraph     | NebulaGraph Importer |   K6 Plugin  |   ldbc_snb_datagen  |   NebulaGraph go    |
|:-----------------:|:-------------:|:---------------:|:------------:|:-------------------:|:--------------:|
|       v0.2        |    v2.0.1     |     v2.0.0-ga   |    v0.0.6    |       v0.3.3        |     v2.0.0-ga  |
|       v1.0.0      |    v2.5.0     |     v2.5.0      |    v0.0.7    |       v0.3.3        |     v2.5.0     |
|       v1.1.0      |    v3.0.0     |     v3.0.0      |    v0.0.9    |       v0.3.3        |     v3.0.0     |
|       v1.2.0      |    v3.1.0     |     v3.1.0      |    v1.0.0    |       v0.3.3        |     NONE       |
|       master      |    nightly    |     master      |    master    |       v0.3.3        |     NONE       |


Diagnosis process

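I looked at the thread 'nebula-importer reports "unsupported client version" when importing data' (post #5, by veezhang) and found that rewriting config.yaml by hand is too much trouble. With so many parameters, it is very unfriendly to beginners.

I would like the script to accept a parameter when generating the config file, for example v2 or v3, so that it produces the YAML file for the matching version in one step and the importer can simply be pointed at the right config file.

I also tried downgrading the importer (from 4.0 to 3.1), and it failed with a different error:
cannot unmarshal !!str None into int
Looking at the YAML file, the cause is that the index in each vid definition defaults to None, for example:
vid:
  index: None
  type: int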
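
The message `cannot unmarshal !!str None into int` is consistent with how YAML treats the bare word `None`: it is a plain string, not a null value (YAML spells null as `null` or `~`). A minimal Python sketch for locating such entries in a generated config might look like this. The function name `find_none_indexes` and the sample text are illustrative only and are not part of bench or the importer.

```python
import re

# Matches a YAML mapping entry such as "index: None" at any indentation.
_NONE_INDEX = re.compile(r"^\s*index:\s*None\s*$")


def find_none_indexes(config_text):
    """Return the 1-based line numbers where `index` is the bare word None."""
    return [
        lineno
        for lineno, line in enumerate(config_text.splitlines(), start=1)
        if _NONE_INDEX.match(line)
    ]


# Illustrative sample, shaped like the fragment quoted in this post.
sample = """\
vid:
  index: None
  type: int
"""
print(find_none_indexes(sample))  # -> [2]
```

Any line this reports points at a vertex or edge whose ID column was never resolved, so the corresponding CSV is the one worth inspecting.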

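One last note: import_config.yaml is generated automatically by the following command:
python3 run.py nebula importer --dry-run

At present this command can only generate a v2-format YAML file; it does not support v3 or later.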

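xtcyclist · August 30, 2023, 05:57

Good suggestion, thank you. Too many parameters really does make it hard to use, and having so many versions that don't match is a headache too.

suu · August 30, 2023, 06:05

The bench and importer versions have to match. If your importer is v3.1.0, is your bench v1.2.0?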
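
The pairing can be read off the dependency table in the README quoted in the first post. As a rough sketch, the lookup could be written in Python as below. The dictionary values are copied from that table, while the function names are made up for illustration.

```python
# Bench release -> importer release, copied from the README dependency table.
BENCH_TO_IMPORTER = {
    "v0.2": "v2.0.0-ga",
    "v1.0.0": "v2.5.0",
    "v1.1.0": "v3.0.0",
    "v1.2.0": "v3.1.0",
    "master": "master",
}


def expected_importer(bench_version):
    """Return the importer release the table pairs with a bench release, or None."""
    return BENCH_TO_IMPORTER.get(bench_version)


def is_matching_pair(bench_version, importer_version):
    """True when the importer release is the one listed for this bench release."""
    return expected_importer(bench_version) == importer_version


print(is_matching_pair("v1.2.0", "v3.1.0"))  # -> True
print(is_matching_pair("v1.2.0", "v4.0.0"))  # -> False
```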

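图说天下 · August 30, 2023, 06:06

The bench version is 1.2.

suu · August 30, 2023, 06:11

You could post a few screenshots for everyone to look at; there may be a hint in there that you missed.

图说天下 · August 30, 2023, 06:18

The screenshot below shows the error after downgrading the importer to 3.1 and running the script.

The screenshot below shows the error after switching to importer 4.0 and running the script.

The config file generated by the script is in the v2 format.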

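HarrisChu · August 30, 2023, 07:20

Open the importer config file and pick any vertex whose index is None; there is a CSV file path just above it.

For example xxxx/xxxx/tag_hasType_tagclass.csv
Then run head xxxx/xxxx/tag_hasType_tagclass_header.csv and see what it contains.

In bench, index is first initialized to None. Bench then looks for the header file and walks through the first line of the header; if there is a column with .id, it assigns index. It looks like the header file is not right.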
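
The lookup described here can be sketched roughly as follows. This is not the actual bench code: the function name, the `|` delimiter default, the rule that a bare `id` column also counts, and the sample header strings are all assumptions made for illustration.

```python
def find_id_index(header_line, delimiter="|"):
    """Return the position of the first ID column in a header line, else None."""
    index = None  # stays None when no ID column is found
    for position, column in enumerate(header_line.strip().split(delimiter)):
        # Assumption: an ID column is named "id" or ends with ".id".
        if column == "id" or column.endswith(".id"):
            index = position
            break
    return index


# Hypothetical header line for an edge file: the ID column is found.
print(find_id_index("Tag.id|TagClass.id"))  # -> 0
# A data row where the header should be: nothing matches, index stays None.
print(find_id_index("1|2"))  # -> None
```

If the header file holds a data row instead of column names, the index is never assigned, which would explain a `None` ending up in the generated config.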

Was your LDBC CSV data generated with bench?

suu · August 30, 2023, 08:33

I had bench 1.2.0 with importer 4.0.0 installed and hit the same "unsupported client version" error as you. Switching to 3.1.0 fixed it.

图说天下 · August 31, 2023, 08:58

head -1 /nebula_test/test/perf_test/NebulaGraph-Bench-1.2.0/target/data/test_data/social_network/dynamic/comment_header.csv

1236950581249|2011-08-17T14:26:59|92.39.58.88|Chrome|yes|3

First line of the CSV without the header:
head -1 /nebula_test/test/perf_test/NebulaGraph-Bench-1.2.0/target/data/test_data/social_network/dynamic/comment.csv

1236950581250|2011-08-17T11:10:21|213.55.127.9|Internet Explorer|thanks|6

They look about the same. Could the generated data be the problem?

HarrisChu · August 31, 2023, 09:36

> head -1 comment_header.csv
id|creationDate|locationIP|browserUsed|content|length

It looks like the header line is missing from your header file, so the importer config generated from it is wrong.
You can use nebula-bench to generate sf0.1 data again and copy the headers over.

Or regenerate all of the data.
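
To find every affected file in one pass, a small check along these lines may help. It is only a heuristic sketch: it assumes header files are named `*_header.csv`, use `|` as the delimiter, and that a real header never starts with a purely numeric field. The function name is made up.

```python
from pathlib import Path


def suspicious_headers(data_dir):
    """List *_header.csv files whose first field is purely numeric."""
    flagged = []
    for path in sorted(Path(data_dir).rglob("*_header.csv")):
        with path.open(encoding="utf-8") as handle:
            first_field = handle.readline().strip().split("|")[0]
        # A real header starts with a column name such as "id", not a number.
        if first_field.isdigit():
            flagged.append(path)
    return flagged
```

Calling `suspicious_headers("target/data/test_data/social_network")` from the bench directory would list the header files that appear to contain a data row instead of column names.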

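图说天下 · September 1, 2023, 01:56

Strange: after regenerating the test data I could not find the CSV files with headers. Did I use the wrong command?
python3 run.py data -s 10

There is another command:
python3 run.py data -os

Running this one does generate the CSV files with headers.

HarrisChu · September 1, 2023, 01:58

python3 run.py data -s 10 already generates the files with headers; python3 run.py data -os is not needed.
python3 run.py data -os is for data you generated yourself with LDBC; you can use it to split that data.

You can use bench master + importer v3.2.0.
I will update the docs in the repo.

图说天下 · September 1, 2023, 02:06

I rechecked the generated files, and the CSV files with headers are indeed there.

图说天下 · September 1, 2023, 03:13

There is still a problem.

With bench master + importer 3.1, the run still fails: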

2023/09/01 11:11:46 --- START OF NEBULA IMPORTER ---
2023/09/01 11:11:47 Client(0) fails to execute commands (CREATE SPACE IF NOT EXISTS stress_test_0901(PARTITION_NUM = 24, REPLICA_FACTOR = 3, vid_type = int64);
USE stress_test_0901;
CREATE TAG IF NOT EXISTS `Post`(`imageFile` string,`creationDate` string,`locationIP` string,`browserUsed` string,`language` string,`content` string,`length` int);
CREATE TAG IF NOT EXISTS `Tagclass`(`name` string,`url` string);
CREATE TAG IF NOT EXISTS `Organisation`(`type` string,`name` string,`url` string);
CREATE TAG IF NOT EXISTS `Tag`(`name` string,`url` string);
CREATE TAG IF NOT EXISTS `Forum`(`title` string,`creationDate` string);
CREATE TAG IF NOT EXISTS `Place`(`name` string,`url` string,`type` string);
CREATE TAG IF NOT EXISTS `Comment`(`creationDate` string,`locationIP` string,`browserUsed` string,`content` string,`length` int);
CREATE TAG IF NOT EXISTS `Person`(`firstName` string,`lastName` string,`gender` string,`birthday` string,`creationDate` string,`locationIP` string,`browserUsed` string);
CREATE EDGE IF NOT EXISTS `CONTAINER_OF`();
CREATE EDGE IF NOT EXISTS `HAS_CREATOR`();
CREATE EDGE IF NOT EXISTS `STUDY_AT`(`classYear` int);
CREATE EDGE IF NOT EXISTS `IS_LOCATED_IN`();
CREATE EDGE IF NOT EXISTS `REPLY_OF`();
CREATE EDGE IF NOT EXISTS `KNOWS`(`creationDate` string);
CREATE EDGE IF NOT EXISTS `WORK_AT`(`workFrom` int);
CREATE EDGE IF NOT EXISTS `HAS_MEMBER`(`joinDate` string);
CREATE EDGE IF NOT EXISTS `HAS_TYPE`();
CREATE EDGE IF NOT EXISTS `HAS_INTEREST`();
CREATE EDGE IF NOT EXISTS `IS_PART_OF`();
CREATE EDGE IF NOT EXISTS `IS_SUBCLASS_OF`();
CREATE EDGE IF NOT EXISTS `HAS_TAG`();
CREATE EDGE IF NOT EXISTS `HAS_MODERATOR`();
CREATE EDGE IF NOT EXISTS `LIKES`(`creationDate` string);

CREATE TAG INDEX IF NOT EXISTS `person_first_name_idx` on `Person`(firstName(10));
CREATE EDGE INDEX IF NOT EXISTS `like_creationDate_idx` on `LIKES`(creationDate);
), response error code: -1005, message: Invalid param!
2023/09/01 11:11:48 --- END OF NEBULA IMPORTER ---

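图说天下 · September 1, 2023, 03:16

To save time, I packed the target directory of bench 1.2 into a tgz file and extracted it directly in the bench master directory. Do any other files need to be copied as well?

图说天下 · September 1, 2023, 03:22

After switching to importer 3.2, it still fails: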

2023/09/01 11:20:37 --- START OF NEBULA IMPORTER ---
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Comment.csv/comment.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Forum.csv/forum.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/forum.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Person.csv/person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Post.csv/post.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/post.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Organisation.csv/organisation.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/organisation.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Place.csv/place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Tag.csv/tag.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/tag.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/Tagclass.csv/tagclass.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/tagclass.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_LOCATED_IN.csv/person_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/KNOWS.csv/person_knows_person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_knows_person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/LIKES.csv/person_likes_comment.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_likes_comment.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/LIKES.csv/person_likes_post.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_likes_post.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_CREATOR.csv/comment_hasCreator_person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_hasCreator_person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_TAG.csv/comment_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_LOCATED_IN.csv/comment_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/REPLY_OF.csv/comment_replyOf_comment.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_replyOf_comment.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/REPLY_OF.csv/comment_replyOf_post.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_replyOf_post.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/CONTAINER_OF.csv/forum_containerOf_post.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/forum_containerOf_post.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_MEMBER.csv/forum_hasMember_person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/forum_hasMember_person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_MODERATOR.csv/forum_hasModerator_person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/forum_hasModerator_person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_TAG.csv/forum_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/forum_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_INTEREST.csv/person_hasInterest_tag.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_hasInterest_tag.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/STUDY_AT.csv/person_studyAt_organisation.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_studyAt_organisation.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/WORK_AT.csv/person_workAt_organisation.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/person_workAt_organisation.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_CREATOR.csv/post_hasCreator_person.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/post_hasCreator_person.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_TAG.csv/post_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/post_hasTag_tag.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_LOCATED_IN.csv/post_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/post_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_LOCATED_IN.csv/organisation_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/organisation_isLocatedIn_place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_PART_OF.csv/place_isPartOf_place.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/place_isPartOf_place.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/IS_SUBCLASS_OF.csv/tagclass_isSubclassOf_tagclass.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/tagclass_isSubclassOf_tagclass.csv
2023/09/01 11:20:37 [INFO] config.go:393: Failed data path: err/data/HAS_TYPE.csv/tag_hasType_tagclass.csv
2023/09/01 11:20:37 [INFO] config.go:399: find file: /nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/static/tag_hasType_tagclass.csv
2023/09/01 11:20:38 Client(0) fails to execute commands (CREATE SPACE IF NOT EXISTS stress_test_0901(PARTITION_NUM = 24, REPLICA_FACTOR = 3, vid_type = int64);
USE stress_test_0901;
CREATE TAG IF NOT EXISTS `Tag`(`name` string,`url` string);
CREATE TAG IF NOT EXISTS `Person`(`firstName` string,`lastName` string,`gender` string,`birthday` string,`creationDate` string,`locationIP` string,`browserUsed` string);
CREATE TAG IF NOT EXISTS `Post`(`imageFile` string,`creationDate` string,`locationIP` string,`browserUsed` string,`language` string,`content` string,`length` int);
CREATE TAG IF NOT EXISTS `Comment`(`creationDate` string,`locationIP` string,`browserUsed` string,`content` string,`length` int);
CREATE TAG IF NOT EXISTS `Organisation`(`type` string,`name` string,`url` string);
CREATE TAG IF NOT EXISTS `Place`(`name` string,`url` string,`type` string);
CREATE TAG IF NOT EXISTS `Forum`(`title` string,`creationDate` string);
CREATE TAG IF NOT EXISTS `Tagclass`(`name` string,`url` string);
CREATE EDGE IF NOT EXISTS `HAS_MODERATOR`();
CREATE EDGE IF NOT EXISTS `STUDY_AT`(`classYear` int);
CREATE EDGE IF NOT EXISTS `IS_PART_OF`();
CREATE EDGE IF NOT EXISTS `IS_SUBCLASS_OF`();
CREATE EDGE IF NOT EXISTS `LIKES`(`creationDate` string);
CREATE EDGE IF NOT EXISTS `HAS_TYPE`();
CREATE EDGE IF NOT EXISTS `HAS_MEMBER`(`joinDate` string);
CREATE EDGE IF NOT EXISTS `WORK_AT`(`workFrom` int);
CREATE EDGE IF NOT EXISTS `KNOWS`(`creationDate` string);
CREATE EDGE IF NOT EXISTS `REPLY_OF`();
CREATE EDGE IF NOT EXISTS `CONTAINER_OF`();
CREATE EDGE IF NOT EXISTS `HAS_INTEREST`();
CREATE EDGE IF NOT EXISTS `HAS_CREATOR`();
CREATE EDGE IF NOT EXISTS `IS_LOCATED_IN`();
CREATE EDGE IF NOT EXISTS `HAS_TAG`();

CREATE TAG INDEX IF NOT EXISTS `person_first_name_idx` on `Person`(firstName(10));
CREATE EDGE INDEX IF NOT EXISTS `like_creationDate_idx` on `LIKES`(creationDate);
), response error code: -1005, message: Invalid param!
2023/09/01 11:20:39 --- END OF NEBULA IMPORTER ---

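图说天下 · September 5, 2023, 06:41

After regenerating the test data, the import works. It feels slow, though. I suggest adding a progress bar or some progress information to the output, for example: how many records are to be imported in total, how many have been imported so far, and the estimated time remaining.

The console currently shows only how many records have been imported and the network latency. Users would probably rather see how many records are imported per second, a performance metric along the lines of tpmC.

......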
2023/09/05 14:37:31 [INFO] statsmgr.go:89: Tick: Time(5275.00s), Finished(139859610), Failed(0), Read Failed(0), Latency AVG(107436us), Batches Req AVG(112005us), Rows AVG(26513.67/s)

2023/09/05 14:37:32 [INFO] statsmgr.go:89: Done(/nebula_test/test/perf_test/NebulaGraph-Bench-master/target/data/test_data/social_network/dynamic/comment_replyOf_post.csv): Time(5276.01s), Finished(139916314), Failed(0), Read Failed(0), Latency AVG(107412us), Batches Req AVG(111981us), Rows AVG(26519.36/s)
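
Until the importer prints this itself, a rough estimate can be derived from the stats lines above. The following Python sketch parses the `Time(...)` and `Finished(...)` fields and projects the remaining time at the average rate so far. The total row count is not printed by the importer, so `total_rows` is a value the user would have to supply; the function names are illustrative.

```python
import re

# Matches the "Time(5275.00s), Finished(139859610)" part of a stats line.
TICK = re.compile(r"Time\((?P<time>[\d.]+)s\), Finished\((?P<finished>\d+)\)")


def parse_progress(log_line):
    """Extract elapsed seconds and finished rows from an importer stats line."""
    match = TICK.search(log_line)
    if match is None:
        return None
    return float(match.group("time")), int(match.group("finished"))


def estimate_remaining_seconds(log_line, total_rows):
    """Estimate seconds left, assuming the average rate so far continues."""
    parsed = parse_progress(log_line)
    if parsed is None:
        return None
    elapsed, finished = parsed
    if finished == 0 or elapsed == 0:
        return None
    rate = finished / elapsed  # rows per second so far
    return max(total_rows - finished, 0) / rate
```

For the Tick line above, `finished / elapsed` works out to about 26513.67 rows per second, which agrees with the `Rows AVG` value the importer already reports.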

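steam · September 5, 2023, 06:46

If convenient, you could open a feature request in the importer issue tracker to record this: https://github.com/vesoft-inc/nebula-importer/issues

图说天下 · September 5, 2023, 06:56

Two more questions:
1) Disk check: before importing the test dataset, could the tool check how much disk space is needed? If the underlying storage is RocksDB, write amplification will occur and take up even more disk space.

2) Resumable import: say there are 20 CSV files and 10 are done, and the import is interrupted while loading the 11th because the disk is full. Could the next run start from the 11th file instead of re-importing the files already completed?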
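
To make the second request concrete, here is one way file-level resume could work, sketched in Python. This is purely a hypothetical illustration of the idea, not an existing importer feature: the manifest file, its JSON format, and all function names are assumptions.

```python
import json
from pathlib import Path


def load_done(manifest_path):
    """Return the set of file names recorded as fully imported."""
    path = Path(manifest_path)
    if not path.exists():
        return set()
    return set(json.loads(path.read_text(encoding="utf-8")))


def mark_done(manifest_path, file_name):
    """Record one more completed file in the manifest."""
    done = load_done(manifest_path)
    done.add(file_name)
    Path(manifest_path).write_text(json.dumps(sorted(done)), encoding="utf-8")


def pending_files(all_files, manifest_path):
    """Keep the original order, skipping files already marked as done."""
    done = load_done(manifest_path)
    return [name for name in all_files if name not in done]
```

A wrapper would call `mark_done` after each file finishes and pass only `pending_files(...)` to the next run. A file interrupted halfway is not marked, so it would be imported again from its start.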

steam · September 5, 2023, 07:23

Resumable import is a feature for the importer. The disk check probably belongs in the core repository, but you can file it with the importer as well.