Steps:
Build nebula-importer from source
git clone the nebula-importer repository (master branch)
git checkout v1.0.0 && make build
Sample rows from tag_order_info.csv
00000060,USER00000048,PRDNO_03,5,1598924701
00000061,USER00000055,PRDNO_00,3,1598924701
00000062,USER00000012,PRDNO_04,0,1598924701
00000063,USER00000005,PRDNO_05,2,1598924701
00000064,USER00000048,PRDNO_05,1,1598924701
00000065,USER00000015,PRDNO_04,4,1598924701
00000066,USER00000020,PRDNO_05,0,1598924701
00000067,USER00000025,PRDNO_01,2,1598924701
00000068,USER00000051,PRDNO_02,4,1598924701
00000069,USER00000047,PRDNO_00,5,1598924701
00000070,USER00000038,PRDNO_03,1,1598924701
00000071,USER00000052,PRDNO_03,4,1598924701
00000072,USER00000003,PRDNO_05,0,1598924701
00000073,USER00000011,PRDNO_04,1,1598924701
00000074,USER00000031,PRDNO_00,4,1598924701
00000075,USER00000004,PRDNO_02,1,1598924701
00000076,USER00000017,PRDNO_01,4,1598924701
00000077,USER00000023,PRDNO_00,2,1598924701
00000078,USER00000021,PRDNO_04,4,1598924701
00000079,USER00000011,PRDNO_04,1,1598924701
00000080,USER00000027,PRDNO_00,3,1598924701
00000081,USER00000045,PRDNO_03,2,1598924701
00000082,USER00000018,PRDNO_05,0,1598924701
00000083,USER00000041,PRDNO_02,1,1598924701
00000084,USER00000039,PRDNO_00,2,1598924701
00000085,USER00000003,PRDNO_01,2,1598924701
00000086,USER00000013,PRDNO_01,4,1598924701
00000087,USER00000022,PRDNO_04,3,1598924701
00000088,USER00000031,PRDNO_01,0,1598924701
00000089,USER00000048,PRDNO_05,0,1598924701
00000090,USER00000025,PRDNO_00,4,1598924701
00000091,USER00000047,PRDNO_04,0,1598924701
00000092,USER00000000,PRDNO_01,0,1598924701
00000093,USER00000005,PRDNO_02,3,1598924701
00000094,USER00000048,PRDNO_05,4,1598924701
00000095,USER00000026,PRDNO_04,5,1598924701
00000096,USER00000005,PRDNO_05,0,1598924701
00000097,USER00000030,PRDNO_00,0,1598924701
00000098,USER00000026,PRDNO_02,5,1598924701
00000099,USER00000023,PRDNO_02,3,1598924701
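As a reading aid, the rows above can be split into the fields that the importer reports for this file in its log (`:VID`, `cust_id`, `product_type`, `order_status`, `create_time`). The following is a minimal Python sketch and not part of nebula-importer. `OrderRow` and `parse_order_rows` are hypothetical names, and reading the first column as an integer VID is an assumption based on the FETCH queries later in this post.

```python
import csv
import io
from datetime import datetime, timezone
from typing import List, NamedTuple


class OrderRow(NamedTuple):
    """One line of tag_order_info.csv (hypothetical helper type)."""
    vid: int
    cust_id: str
    product_type: str
    order_status: int
    create_time: datetime


def parse_order_rows(text: str) -> List[OrderRow]:
    """Parse header-less CSV text into OrderRow tuples."""
    rows = []
    for rec in csv.reader(io.StringIO(text)):
        if not rec:
            continue  # skip blank lines
        vid, cust_id, product_type, status, ts = rec
        rows.append(
            OrderRow(
                vid=int(vid),  # assumption: leading zeros are dropped
                cust_id=cust_id,
                product_type=product_type,
                order_status=int(status),
                # the last column is a Unix timestamp in seconds
                create_time=datetime.fromtimestamp(int(ts), tz=timezone.utc),
            )
        )
    return rows


# Two rows copied from the sample above.
sample = (
    "00000069,USER00000047,PRDNO_00,5,1598924701\n"
    "00000070,USER00000038,PRDNO_03,1,1598924701\n"
)
for row in parse_order_rows(sample):
    print(row.vid, row.cust_id, row.product_type,
          row.order_status, row.create_time.isoformat())
```

Under that assumption, the row starting with `00000069` corresponds to vertex ID 69 and the row starting with `00000070` to vertex ID 70.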
YAML configuration
version: v1
description: example
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 2 # number of graph clients
  channelBufferSize: 1
  space: test100
  connection:
    user: root
    password: nebula
    address: 192.168.15.232:3699,192.168.15.233:3699
  postStart:
    commands: |
      UPDATE CONFIGS storage:wal_ttl=3600;
      UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = true };
      DROP SPACE IF EXISTS test100;
      CREATE SPACE IF NOT EXISTS test100(partition_num=5, replica_factor=1);
      USE test100;
      CREATE TAG tag_order_info(cust_id string, product_type string, order_status int, create_time timestamp);
      CREATE TAG tag_mobile(mobile string);
      CREATE TAG tag_bank_card(bank_card string);
      CREATE TAG tag_cert_no(cert_no string);
      CREATE EDGE edge_mobile();
      CREATE EDGE edge_cert_no();
      CREATE EDGE edge_bank_card_no();
      #CREATE TAG course(name string, credits int);
      #CREATE TAG building(name string);
      #CREATE TAG student(name string, age int, gender string);
      #CREATE EDGE follow(likeness double);
      #CREATE EDGE choose(grade int);
      #CREATE TAG course_no_props();
      #CREATE TAG building_no_props();
      #CREATE EDGE follow_no_props();
    afterPeriod: 30s
  preStop:
    commands: |
      UPDATE CONFIGS storage:rocksdb_column_family_options = { disable_auto_compactions = false };
      UPDATE CONFIGS storage:wal_ttl=86400;
logPath: ./err/test100.log
files:
  - path: ./edge_bank_card_no.csv
    batchSize: 2
    inOrder: false
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: edge
      edge:
        name: edge_bank_card_no
        withRanking: false
        #props:
        #  - name: grade
        #    type: int
  - path: ./tag_card_info.csv
    failDataPath: ./err/tag_card_info.csv
    batchSize: 2
    inOrder: false
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        tags:
          - name: tag_bank_card
            props:
              - name: bank_card
                type: string
              #- name: credits
              #  type: int
          #- name: building
          #  props:
          #    - name: name
          #      type: string
  - path: ./tag_idcert_info.csv
    failDataPath: ./err/tag_idcert_info.csv
    batchSize: 2
    inOrder: false
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        tags:
          - name: tag_cert_no
            props:
              - name: cert_no
                type: string
  - path: ./edge_cert_no.csv
    failDataPath: ./err/edge_cert_no.csv
    batchSize: 2
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: edge
      edge:
        name: edge_cert_no
        withRanking: false
        #props:
        #  - name: likeness
        #    type: double
  - path: ./edge_mobile.csv
    failDataPath: ./err/edge_mobile.csv
    batchSize: 2
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: edge
      edge:
        name: edge_mobile
        withRanking: true
  - path: ./tag_order_info.csv
    failDataPath: ./err/tag_order_info.csv
    batchSize: 2
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        tags:
          - name: tag_order_info
            props:
              - name: cust_id
                type: string
              - name: product_type
                type: string
              - name: order_status
                type: int
              - name: create_time
                type: timestamp
  - path: ./tag_phone_info.csv
    failDataPath: ./err/tag_phone_info.csv
    batchSize: 2
    type: csv
    csv:
      withHeader: false
      withLabel: false
    schema:
      type: vertex
      vertex:
        #vid:
        #  index: 1
        #  function: hash
        tags:
          - name: tag_mobile
            props:
              - name: mobile
                type: string
Run the CSV import
nebula-importer --config path/to/data.yaml
Log output
2020/09/01 18:19:14 [INFO] clientmgr.go:28: Create 4 Nebula Graph clients
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(1): /root/testing_data/100/tag_card_info.csv, schema: < :VID,tag_bank_card.bank_card:string >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(6): /root/testing_data/100/tag_phone_info.csv, schema: < :VID,tag_mobile.mobile:string >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(4): /root/testing_data/100/edge_mobile.csv, schema: < :SRC_VID,:DST_VID,:RANK >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(3): /root/testing_data/100/edge_cert_no.csv, schema: < :SRC_VID,:DST_VID >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(0): /root/testing_data/100/edge_bank_card_no.csv, schema: < :SRC_VID,:DST_VID >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(2): /root/testing_data/100/tag_idcert_info.csv, schema: < :VID,tag_cert_no.cert_no:string >
2020/09/01 18:19:14 [INFO] reader.go:64: Start to read file(5): /root/testing_data/100/tag_order_info.csv, schema: < :VID,tag_order_info.cust_id:string,tag_order_info.product_type:string,tag_order_info.order_status:int,tag_order_info.create_time:timestamp >
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/tag_idcert_info.csv) is: 30, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/tag_idcert_info.csv): Time(30.09s), Finished(263), Failed(0), Latency AVG(986us), Batches Req AVG(1270us), Rows AVG(8.74/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/tag_card_info.csv) is: 35, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/tag_card_info.csv): Time(30.09s), Finished(275), Failed(0), Latency AVG(975us), Batches Req AVG(1259us), Rows AVG(9.14/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/tag_phone_info.csv) is: 59, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/tag_phone_info.csv): Time(30.10s), Finished(348), Failed(0), Latency AVG(962us), Batches Req AVG(1249us), Rows AVG(11.56/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/edge_bank_card_no.csv) is: 100, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/edge_bank_card_no.csv): Time(30.13s), Finished(520), Failed(0), Latency AVG(939us), Batches Req AVG(1233us), Rows AVG(17.26/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/edge_cert_no.csv) is: 100, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/edge_cert_no.csv): Time(30.14s), Finished(534), Failed(0), Latency AVG(938us), Batches Req AVG(1233us), Rows AVG(17.72/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/tag_order_info.csv) is: 100, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/tag_order_info.csv): Time(30.14s), Finished(546), Failed(0), Latency AVG(935us), Batches Req AVG(1231us), Rows AVG(18.12/s)
2020/09/01 18:19:44 [INFO] reader.go:180: Total lines of file(/root/testing_data/100/edge_mobile.csv) is: 300, error lines: 0
2020/09/01 18:19:44 [INFO] statsmgr.go:61: Done(/root/testing_data/100/edge_mobile.csv): Time(30.17s), Finished(724), Failed(0), Latency AVG(931us), Batches Req AVG(1226us), Rows AVG(24.00/s)
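The per-file line counts and the final Finished counter in the log above can be cross-checked with a short script. This is a minimal Python sketch that assumes the log format is exactly as shown here. It also assumes the Finished value is a running total across all files, which is how the numbers in this log read (they only grow, from 263 up to 724). `summarize` is a hypothetical helper, not part of nebula-importer.

```python
import re
from pathlib import Path

# Patterns written against the two log line shapes shown in this post.
TOTAL_RE = re.compile(
    r"Total lines of file\((?P<path>[^)]+)\) is: (?P<lines>\d+), "
    r"error lines: (?P<errors>\d+)"
)
DONE_RE = re.compile(
    r"Done\((?P<path>[^)]+)\): Time\([^)]*\), "
    r"Finished\((?P<finished>\d+)\), Failed\((?P<failed>\d+)\)"
)


def summarize(log_text):
    """Sum the per-file line counts and keep the largest Finished/Failed seen."""
    total_lines = 0
    error_lines = 0
    finished = 0
    failed = 0
    for line in log_text.splitlines():
        m = TOTAL_RE.search(line)
        if m:
            total_lines += int(m.group("lines"))
            error_lines += int(m.group("errors"))
            continue
        m = DONE_RE.search(line)
        if m:
            # assumption: the counters are cumulative, so keep the maximum
            finished = max(finished, int(m.group("finished")))
            failed = max(failed, int(m.group("failed")))
    return {
        "total_lines": total_lines,
        "error_lines": error_lines,
        "finished": finished,
        "failed": failed,
    }


# logPath from the configuration above; skipped if the file is not present.
log_path = Path("./err/test100.log")
if log_path.exists():
    print(summarize(log_path.read_text()))
```

If `total_lines` and `finished` agree while `failed` and `error_lines` are both zero, the importer itself considers every row written, so any missing vertex has to be investigated on the query or storage side.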
FETCH queries
fetch prop on tag_order_info 69; returns a record
fetch prop on tag_order_info 70; no record
fetch prop on tag_order_info 71; no record
…
fetch prop on tag_order_info 77; no record
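To check more than a handful of IDs by hand, the FETCH statements can be generated from the first column of the CSV. This is a minimal Python sketch. `fetch_statements` is a hypothetical helper, and it assumes the vertex ID used in the queries above is the first CSV column read as an integer (so `00000069` becomes `69`). It only builds statements to paste into the console and does not call any Nebula client API.

```python
import csv
import io
from typing import List


def fetch_statements(csv_text: str, tag: str = "tag_order_info") -> List[str]:
    """Build one FETCH PROP statement per CSV row, using column 0 as the VID."""
    statements = []
    for rec in csv.reader(io.StringIO(csv_text)):
        if not rec:
            continue  # skip blank lines
        vid = int(rec[0])  # assumption: the VID is the integer value of column 0
        statements.append(f"FETCH PROP ON {tag} {vid};")
    return statements


# Two rows copied from the tag_order_info.csv sample in this post.
sample = (
    "00000069,USER00000047,PRDNO_00,5,1598924701\n"
    "00000070,USER00000038,PRDNO_03,1,1598924701\n"
)
for stmt in fetch_statements(sample):
    print(stmt)
```

Running each generated statement and noting which ones come back empty gives the full list of missing vertices instead of a spot check.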
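Analysis
The CSV files contain 724 lines in total, and the log reports Finished(724), Failed(0). In theory the Finished count should equal the number of CSV lines, and here it does. In practice, however, some data cannot be found when queried; the same happened in a test with 10 million orders. The importer log shows no errors, and the CSV files generated under the err folder are all empty.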
[root@nebula233 err]# ll
total 4
-rw-r--r-- 1 root root 0 Sep 1 18:19 edge_bank_card_no.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 edge_cert_no.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 edge_mobile.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 tag_card_info.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 tag_idcert_info.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 tag_order_info.csv
-rw-r--r-- 1 root root 0 Sep 1 18:19 tag_phone_info.csv
-rw-r--r-- 1 root root 3461 Sep 1 18:19 test100.log
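For larger runs, the same check on the err folder can be scripted rather than read off a directory listing. This is a minimal Python sketch. `non_empty_fail_files` is a hypothetical helper, and the `./err` path is taken from the `failDataPath` entries in the configuration above.

```python
from pathlib import Path
from typing import List


def non_empty_fail_files(err_dir) -> List[str]:
    """Return the names of *.csv files in err_dir whose size is greater than zero."""
    return sorted(
        p.name
        for p in Path(err_dir).glob("*.csv")
        if p.is_file() and p.stat().st_size > 0
    )


# Directory used by failDataPath in this post; skipped if it does not exist.
err_dir = Path("./err")
if err_dir.is_dir():
    print(non_empty_fail_files(err_dir))
```

An empty result matches the listing above: the importer did not record any failed rows for any of the input files.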