sean
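We recently set out to do a detailed performance evaluation of a newer version of Nebula (v2), but ran into a series of problems along the way. We list them here and hope the Nebula team can give us some guidance:
1 Purpose of the experiment
Obtain baseline performance data for Nebula and evaluate how it performs on different hardware platforms (CPU, SSD, etc.).
2 Experiment environment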
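Question 1:
During import we hit errors such as "ErrMsg: Storage Error: part: 48, error: E_RPC_FAILURE(-3)., ErrCode: -8". The post "Importing csv data shows ErrMsg: Storage Error: part: 48, error: E_RPC_FAILURE(-3)., ErrCode: -8" already raised this problem, but it never received a clear answer. We found that nebula-importer produces this error when the yaml file lists both vertex and edge entries, so we imported vertices and edges separately. What causes this error? Is it a bug in parallel processing inside nebula-importer?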
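The workaround described above (importing vertices and edges in separate runs) can be sketched as follows. This is only an illustration: it assumes the importer config has already been loaded into a Python dict, for example with a YAML library (not shown), and the function name `split_by_schema_type` is hypothetical, not part of nebula-importer.

```python
import copy


def split_by_schema_type(config: dict) -> dict:
    """Split one importer config into a vertex-only and an edge-only config.

    Assumes the layout used in this thread: a top-level "files" list whose
    entries each carry schema.type equal to "vertex" or "edge".
    """
    result = {}
    for kind in ("vertex", "edge"):
        # Copy every top-level setting except the file list.
        part = {k: copy.deepcopy(v) for k, v in config.items() if k != "files"}
        # Keep only the file entries of the requested kind.
        part["files"] = [
            copy.deepcopy(entry)
            for entry in config.get("files", [])
            if entry["schema"]["type"] == kind
        ]
        result[kind] = part
    return result
```

Each of the two resulting dicts can then be written to its own yaml file, and the vertex file can be imported before the edge file.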
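Question 2:
Following Nebula's recommendation, we run the performance tests with Nebula-bench and the LDBC dataset. However, importing the LDBC dataset with nebula-importer produces some errors. According to this article, the ldbc data has to be processed before it can be imported.
Does Nebula have an official recommendation for how to process it? This information is missing from the Nebula-bench documentation.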
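Question 3:
nebula-bench has a merger step for processing the ldbc data. We observed that it only modifies the updateStream.csv file, yet the yaml configuration does not seem to use this csv file at all.
run ldbc/scripts/csv-merger.sh to merge distribute files
Does this step affect anything later on?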
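Question 4:
Config ldbc configs:
Is this a single config file, or do we need to write a separate yaml file for each data file? The documentation gives very little detail on how to configure this.
These steps have left us rather confused. Could you provide a detailed, step-by-step document for performance testing? That way, people who are less familiar with Nebula could focus on performance evaluation and analysis instead of spending their effort on configuration problems. We hope the Nebula team can give us some guidance.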
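For question 1: I don't know your data size, your importer config file, or whether compact was enabled during the import, so I can't tell what caused the problem. During the import, check whether storage restarts, whether io reads and writes show long waits, and whether the graph and storage logs contain anything useful.
Questions 2 to 4 are all nebula-bench issues. nebula-bench has not been updated yet, so the simplest approach is to generate the data with ldbc v0.3.3, remove the first line of each csv, and then use the config file below as a reference.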
version: v2
description: ldbc
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 30 # number of graph clients
  channelBufferSize: 128
  space: sf30
  connection:
    user: root
    password: nebula
    address: 192.168.8.147:9669
  postStart:
    commands: |
      UPDATE CONFIGS heartbeat_interval_secs=1;
      CREATE SPACE IF NOT EXISTS sf30(PARTITION_NUM = 24, REPLICA_FACTOR = 3, vid_type = int64);
      USE sf30;
      CREATE TAG IF NOT EXISTS `Person`(`firstName` string,`lastName` string,`gender` string,`birthday` string,`creationDate` string,`locationIP` string,`browserUsed` string);
      CREATE TAG IF NOT EXISTS `Forum`(`title` string,`creationDate` string);
      CREATE TAG IF NOT EXISTS `Place`(`name` string,`url` string,`type` string);
      CREATE TAG IF NOT EXISTS `Tag`(`name` string,`url` string);
      CREATE TAG IF NOT EXISTS `Organisation`(`type` string,`name` string,`url` string);
      CREATE TAG IF NOT EXISTS `Tagclass`(`name` string,`url` string);
      CREATE TAG IF NOT EXISTS `Post`(`imageFile` string,`creationDate` string,`locationIP` string,`browserUsed` string,`language` string,`content` string,`length` int);
      CREATE TAG IF NOT EXISTS `Comment`(`creationDate` string,`locationIP` string,`browserUsed` string,`content` string,`length` int);
      CREATE EDGE IF NOT EXISTS `IS_LOCATED_IN`();
      CREATE EDGE IF NOT EXISTS `HAS_MODERATOR`();
      CREATE EDGE IF NOT EXISTS `KNOWS`(`creationDate` string);
      CREATE EDGE IF NOT EXISTS `LIKES`(`creationDate` string);
      CREATE EDGE IF NOT EXISTS `IS_SUBCLASS_OF`();
      CREATE EDGE IF NOT EXISTS `STUDY_AT`(`classYear` int);
      CREATE EDGE IF NOT EXISTS `CONTAINER_OF`();
      CREATE EDGE IF NOT EXISTS `HAS_CREATOR`();
      CREATE EDGE IF NOT EXISTS `WORK_AT`(`workFrom` int);
      CREATE EDGE IF NOT EXISTS `HAS_TYPE`();
      CREATE EDGE IF NOT EXISTS `HAS_TAG`();
      CREATE EDGE IF NOT EXISTS `HAS_INTEREST`();
      CREATE EDGE IF NOT EXISTS `HAS_MEMBER`(`joinDate` string);
      CREATE EDGE IF NOT EXISTS `IS_PART_OF`();
      CREATE EDGE IF NOT EXISTS `REPLY_OF`();
    afterPeriod: 8s
logPath: ./err/test.log
files:
  - path: /home/vesoft/sf30/social_network/dynamic/post.csv
    failDataPath: ./err/data/Post.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Post
            props:
              - name: imageFile
                type: string
                index: 1
              - name: creationDate
                type: string
                index: 2
              - name: locationIP
                type: string
                index: 3
              - name: browserUsed
                type: string
                index: 4
              - name: language
                type: string
                index: 5
              - name: content
                type: string
                index: 6
              - name: length
                type: int
                index: 7
  - path: /home/vesoft/sf30/social_network/dynamic/comment.csv
    failDataPath: ./err/data/Comment.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Comment
            props:
              - name: creationDate
                type: string
                index: 1
              - name: locationIP
                type: string
                index: 2
              - name: browserUsed
                type: string
                index: 3
              - name: content
                type: string
                index: 4
              - name: length
                type: int
                index: 5
  - path: /home/vesoft/sf30/social_network/dynamic/forum.csv
    failDataPath: ./err/data/Forum.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Forum
            props:
              - name: title
                type: string
                index: 1
              - name: creationDate
                type: string
                index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person.csv
    failDataPath: ./err/data/Person.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Person
            props:
              - name: firstName
                type: string
                index: 1
              - name: lastName
                type: string
                index: 2
              - name: gender
                type: string
                index: 3
              - name: birthday
                type: string
                index: 4
              - name: creationDate
                type: string
                index: 5
              - name: locationIP
                type: string
                index: 6
              - name: browserUsed
                type: string
                index: 7
  - path: /home/vesoft/sf30/social_network/static/organisation.csv
    failDataPath: ./err/data/Organisation.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Organisation
            props:
              - name: type
                type: string
                index: 1
              - name: name
                type: string
                index: 2
              - name: url
                type: string
                index: 3
  - path: /home/vesoft/sf30/social_network/static/place.csv
    failDataPath: ./err/data/Place.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Place
            props:
              - name: name
                type: string
                index: 1
              - name: url
                type: string
                index: 2
              - name: type
                type: string
                index: 3
  - path: /home/vesoft/sf30/social_network/static/tag.csv
    failDataPath: ./err/data/Tag.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Tag
            props:
              - name: name
                type: string
                index: 1
              - name: url
                type: string
                index: 2
  - path: /home/vesoft/sf30/social_network/static/tagclass.csv
    failDataPath: ./err/data/Tagclass.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: int
        tags:
          - name: Tagclass
            props:
              - name: name
                type: string
                index: 1
              - name: url
                type: string
                index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person_isLocatedIn_place.csv
    failDataPath: ./err/data/IS_LOCATED_IN.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_LOCATED_IN
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/post_hasCreator_person.csv
    failDataPath: ./err/data/HAS_CREATOR.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_CREATOR
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/post_hasTag_tag.csv
    failDataPath: ./err/data/HAS_TAG.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_TAG
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/post_isLocatedIn_place.csv
    failDataPath: ./err/data/IS_LOCATED_IN.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_LOCATED_IN
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/comment_hasCreator_person.csv
    failDataPath: ./err/data/HAS_CREATOR.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_CREATOR
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/comment_hasTag_tag.csv
    failDataPath: ./err/data/HAS_TAG.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_TAG
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/comment_isLocatedIn_place.csv
    failDataPath: ./err/data/IS_LOCATED_IN.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_LOCATED_IN
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/comment_replyOf_comment.csv
    failDataPath: ./err/data/REPLY_OF.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: REPLY_OF
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/comment_replyOf_post.csv
    failDataPath: ./err/data/REPLY_OF.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: REPLY_OF
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/forum_containerOf_post.csv
    failDataPath: ./err/data/CONTAINER_OF.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: CONTAINER_OF
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/forum_hasMember_person.csv
    failDataPath: ./err/data/HAS_MEMBER.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_MEMBER
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: joinDate
            type: string
            index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/forum_hasModerator_person.csv
    failDataPath: ./err/data/HAS_MODERATOR.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_MODERATOR
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/forum_hasTag_tag.csv
    failDataPath: ./err/data/HAS_TAG.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_TAG
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/person_hasInterest_tag.csv
    failDataPath: ./err/data/HAS_INTEREST.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_INTEREST
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/dynamic/person_knows_person.csv
    failDataPath: ./err/data/KNOWS.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: KNOWS
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: creationDate
            type: string
            index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person_likes_comment.csv
    failDataPath: ./err/data/LIKES.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: LIKES
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: creationDate
            type: string
            index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person_likes_post.csv
    failDataPath: ./err/data/LIKES.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: LIKES
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: creationDate
            type: string
            index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person_studyAt_organisation.csv
    failDataPath: ./err/data/STUDY_AT.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: STUDY_AT
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: classYear
            type: int
            index: 2
  - path: /home/vesoft/sf30/social_network/dynamic/person_workAt_organisation.csv
    failDataPath: ./err/data/WORK_AT.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: WORK_AT
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
          - name: workFrom
            type: int
            index: 2
  - path: /home/vesoft/sf30/social_network/static/organisation_isLocatedIn_place.csv
    failDataPath: ./err/data/IS_LOCATED_IN.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_LOCATED_IN
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/static/place_isPartOf_place.csv
    failDataPath: ./err/data/IS_PART_OF.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_PART_OF
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/static/tagclass_isSubclassOf_tagclass.csv
    failDataPath: ./err/data/IS_SUBCLASS_OF.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: IS_SUBCLASS_OF
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
  - path: /home/vesoft/sf30/social_network/static/tag_hasType_tagclass.csv
    failDataPath: ./err/data/HAS_TYPE.csv
    batchSize: 100
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: edge
      edge:
        name: HAS_TYPE
        withRanking: false
        srcVID:
          index: 0
          type: int
        dstVID:
          index: 1
          type: int
        props:
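The "remove the first line of each csv" step mentioned before the config could be done with a small script along these lines. This is a minimal sketch, not part of nebula-bench or nebula-importer; the function name `strip_header` is made up for illustration, and it assumes every generated csv starts with exactly one header line.

```python
from pathlib import Path


def strip_header(src: Path, dst: Path) -> int:
    """Copy src to dst without its first line; return the number of data lines written."""
    written = 0
    with src.open("r", encoding="utf-8", newline="") as fin, \
            dst.open("w", encoding="utf-8", newline="") as fout:
        next(fin, None)  # skip the header line (no error if the file is empty)
        for line in fin:
            fout.write(line)
            written += 1
    return written
```

Writing to a separate output file keeps the generated data intact, so the step can be rerun if something goes wrong. The config above then matches the result, since it sets withHeader: false for every file.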
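The biggest problem with the ldbc data is that vertex ids are not unique across tags. For example, suppose the person and comment tags both contain an id 12345 (a made-up example; this vertex does not actually exist). These are really two different vertices, but after import they are treated as a single vertex carrying two tags.
My approach is to add a prefix to each ID: person's 12345 becomes p12345, and comment's 12345 becomes c12345.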
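The prefixing described above could be sketched as follows. This is an illustration under assumptions, not the poster's actual script: `prefix_ids` is a hypothetical helper, the files are assumed to be "|"-delimited with no delimiter inside any field, and the header line is assumed to have been removed already. Note also that prefixed ids are strings, so the space would presumably need a string vid type (FIXED_STRING) and the importer vid type string, rather than the int64 and int used in the config earlier in this thread.

```python
def prefix_ids(src: str, dst: str, prefixes: dict, delimiter: str = "|") -> None:
    """Prepend a tag-specific prefix to the id columns of a delimited file.

    prefixes maps a column index to the prefix for that column, e.g.
    {0: "p"} for person.csv or {0: "p", 1: "c"} for person_likes_comment.csv.
    Assumes no field contains the delimiter itself.
    """
    with open(src, encoding="utf-8", newline="") as fin, \
            open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fields = line.rstrip("\r\n").split(delimiter)
            for col, prefix in prefixes.items():
                fields[col] = prefix + fields[col]
            fout.write(delimiter.join(fields) + "\n")
```

The same prefix has to be used for a tag everywhere it appears, in the vertex file and in the source or destination column of every edge file that references it, otherwise the edges will point at vertices that do not exist.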