Importer error: only part of the data is imported

Nebula version 2.6.1, importer 2.6

wangli@hp288prog3:/u03/nebula/2.6.1/nebula-importer$ ./nebula-importer -config tyc_import_data/import.yaml 
2021/11/23 07:47:43 --- START OF NEBULA IMPORTER ---
2021/11/23 07:47:51 [INFO] clientmgr.go:28: Create 10 Nebula Graph clients
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e88-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(0): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e88-4c31-11ec-8d3f-4166f59a1df4, schema: < :VID(string),address.name:string >
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e88-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e89-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(1): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e89-4c31-11ec-8d3f-4166f59a1df4, schema: < :VID(string),human.name:string,human.card_id:string,human.profile:string,human.sex:string,human.education:string,tyc_human_scrap_type.if_fullscrap:string,tyc_human_scrap_type.full_update_timestamp:string >
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8a-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(2): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8a-4c31-11ec-8d3f-4166f59a1df4, schema: < :SRC_VID(string),:DST_VID(string),:RANK >
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e89-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8a-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8b-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(3): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8b-4c31-11ec-8d3f-4166f59a1df4, schema: < :VID(string),email.name:string >
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8c-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(4): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8c-4c31-11ec-8d3f-4166f59a1df4, schema: < :SRC_VID(string),:DST_VID(string),:RANK >
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8b-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8c-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:51 [INFO] reader.go:26: The delimiter of /u03/nebula/2.6.1/nebula-importer/tyc_import_data/98ab6f0e-4c31-11ec-8d3f-4166f59a1df4 is U+002C ','
2021/11/23 07:47:51 [INFO] reader.go:64: Start to read file(5): /u03/nebula/2.6.1/nebula-importer/tyc_import_data/98ab6f0e-4c31-11ec-8d3f-4166f59a1df4, schema: < :VID(string),company.name:string,company.usic:string,company.ic_no:string,company.reg_date:string,company.reg_capital:float,company.tax_no:string,company.real_capital:string,company.business_scope:string,company.tax_level:string,company.check_date:string,company.org_type:string,company.industry:string,company.reg_bureau:string,company.english_name:string,company.business_term:string,company.update_date:string,company.cancel_date:string,company.cancel_reason:string,company.revoke_date:string,company.revoke_reason:string,company.summary:string,company.reg_capital_currency:string,company.real_capital_currency:string,company.staff_num:string,company.social_security_num:string,company.status:string,company.site:string,company.city:string,company.district:string,tyc_company_scrap_type.if_fullscrap:string,tyc_company_scrap_type.full_update_timestamp:string >
2021/11/23 07:47:51 [INFO] reader.go:180: Total lines of file(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/98ab6f0e-4c31-11ec-8d3f-4166f59a1df4) is: 10, error lines: 0
2021/11/23 07:47:53 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e88-4c31-11ec-8d3f-4166f59a1df4): Time(10.03s), Finished(10), Failed(0), Read Failed(0), Latency AVG(803us), Batches Req AVG(2002986us), Rows AVG(1.00/s)
2021/11/23 07:47:53 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e89-4c31-11ec-8d3f-4166f59a1df4): Time(10.03s), Finished(25), Failed(0), Read Failed(0), Latency AVG(820us), Batches Req AVG(802085us), Rows AVG(2.49/s)
2021/11/23 07:47:53 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8a-4c31-11ec-8d3f-4166f59a1df4): Time(10.03s), Finished(37), Failed(0), Read Failed(0), Latency AVG(814us), Batches Req AVG(542379us), Rows AVG(3.69/s)
2021/11/23 07:47:53 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8b-4c31-11ec-8d3f-4166f59a1df4): Time(10.03s), Finished(46), Failed(0), Read Failed(0), Latency AVG(724us), Batches Req AVG(436397us), Rows AVG(4.59/s)
2021/11/23 07:47:53 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/97e17e8c-4c31-11ec-8d3f-4166f59a1df4): Time(10.03s), Finished(54), Failed(0), Read Failed(0), Latency AVG(662us), Batches Req AVG(371841us), Rows AVG(5.38/s)
2021/11/23 07:47:56 [ERROR] handler.go:63: Client 0 fail to execute: INSERT VERTEX `company`(`name`,`usic`,`ic_no`,`reg_date`,`reg_capital`,`tax_no`,`real_capital`,`business_scope`,`tax_level`,`check_date`,`org_type`,`industry`,`reg_bureau`,`english_name`,`business_term`,`update_date`,`cancel_date`,`cancel_reason`,`revoke_date`,`revoke_reason`,`summary`,`reg_capital_currency`,`real_capital_currency`,`staff_num`,`social_security_num`,`status`,`site`,`city`,`district`),`tyc_company_scrap_type`(`if_fullscrap`,`full_update_timestamp`) VALUES  "vertex_company_4131677626": ("重庆忽米产业互联网有限公司","","","",,"","","","","","","","","","","","","","","","","","","","","","","","","否","");, ErrMsg: SemanticError: Column count doesn't match value count., ErrCode: -1009
2021/11/23 07:47:56 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/98ab6f0e-4c31-11ec-8d3f-4166f59a1df4): Time(13.03s), Finished(60), Failed(1), Read Failed(0), Latency AVG(618us), Batches Req AVG(334700us), Rows AVG(4.60/s)
2021/11/23 07:47:56 [ERROR] clientpool.go:108: Client(0) fails to execute commands (USE tyc_2021;
CREATE TAG INDEX idx_address_name ON address(name(255));
REBUILD TAG INDEX idx_address_name;
CREATE TAG INDEX idx_human_name ON human(name(255));
REBUILD TAG INDEX idx_human_name;
CREATE TAG INDEX idx_tyc_human_scrap_type_name ON tyc_human_scrap_type(if_fullscrap(255));
REBUILD TAG INDEX idx_tyc_human_scrap_type_name;
CREATE TAG INDEX idx_email_name ON email(name(255));
REBUILD TAG INDEX idx_email_name;
CREATE TAG INDEX idx_company_name ON company(name(255));
REBUILD TAG INDEX idx_company_name;
CREATE TAG INDEX idx_tyc_company_scrap_type_name ON tyc_company_scrap_type(if_fullscrap(255));
REBUILD TAG INDEX idx_tyc_company_scrap_type_name;
), response error code: -1009, message: SemanticError: Index idx_address_name not found in space tyc_2021
2021/11/23 07:47:56 Total 1 lines fail to insert into nebula graph database
2021/11/23 07:47:57 --- END OF NEBULA IMPORTER ---

Only part of the company data was imported.
According to the log:

2021/11/23 07:47:56 [ERROR] handler.go:63: Client 0 fail to execute: INSERT VERTEX `company`(`name`,`usic`,`ic_no`,`reg_date`,`reg_capital`,`tax_no`,`real_capital`,`business_scope`,`tax_level`,`check_date`,`org_type`,`industry`,`reg_bureau`,`english_name`,`business_term`,`update_date`,`cancel_date`,`cancel_reason`,`revoke_date`,`revoke_reason`,`summary`,`reg_capital_currency`,`real_capital_currency`,`staff_num`,`social_security_num`,`status`,`site`,`city`,`district`),`tyc_company_scrap_type`(`if_fullscrap`,`full_update_timestamp`) VALUES  "vertex_company_4131677626": ("重庆忽米产业互联网有限公司","","","",,"","","","","","","","","","","","","","","","","","","","","","","","","否","");, ErrMsg: SemanticError: Column count doesn't match value count., ErrCode: -1009
2021/11/23 07:47:56 [INFO] statsmgr.go:62: Done(/u03/nebula/2.6.1/nebula-importer/tyc_import_data/98ab6f0e-4c31-11ec-8d3f-4166f59a1df4): Time(13.03s), Finished(60), Failed(1), Read Failed(0), Latency AVG(618us), Batches Req AVG(334700us), Rows AVG(4.60/s)
2021/11/23 07:47:56 [ERROR] clientpool.go:108: Client(0) fails to execute commands (USE tyc_2021;

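The import stopped as soon as the column mismatch appeared. But the CSV is written by a Python script, so the number of columns should not be wrong. Most columns of this company row are empty, and the row has 32 columns in total.
Could an expert please take a look?
Both the import YAML and the CSV data are generated with Python, and I used them with version 2.0 before. The last time I hit an error, the hint was that one config setting is no longer used.
Please check whether some new setting in 2.6 is causing this problem.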

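The values you pass do not line up with the parameters in the statement. The parameter list has several fields in the middle that are grouped in their own parentheses.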

Oh, I added a new tag, so there is more data now.

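Adjust that and try again. If there is still a problem, come back and update the thread. In general you can go by the error message: just look at the ERROR lines.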

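I checked. For this row I pass two tags: one is company and the other is tyc_company_scrap_type. The first has 29 properties (30 CSV columns counting the VID) and the second has 2.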

`company`(`name`,`usic`,`ic_no`,`reg_date`,`reg_capital`,`tax_no`,`real_capital`,`business_scope`,`tax_level`,`check_date`,`org_type`,`industry`,`reg_bureau`,`english_name`,`business_term`,`update_date`,`cancel_date`,`cancel_reason`,`revoke_date`,`revoke_reason`,`summary`,`reg_capital_currency`,`real_capital_currency`,`staff_num`,`social_security_num`,`status`,`site`,`city`,`district`)
----
`tyc_company_scrap_type`(`if_fullscrap`,`full_update_timestamp`) 
----
VALUES 

 "vertex_company_4131677626": ("重庆忽米产业互联网有限公司","","","",,"","","","","","","","","","","","","","","","","","","","","","","","","否","");, ErrMsg: SemanticError: Column count doesn't match value count., ErrCode: -1009

The YAML file looks like this:

schema:
    type: vertex
    vertex:
      tags:
      - name: company
        props:
        - index: 1
          name: name
          type: string
        - index: 2
          name: usic
          type: string
        - index: 3
          name: ic_no
          type: string
        - index: 4
          name: reg_date
          type: string
        - index: 5
          name: reg_capital
          type: float
        - index: 6
          name: tax_no
          type: string
        - index: 7
          name: real_capital
          type: string
        - index: 8
          name: business_scope
          type: string
        - index: 9
          name: tax_level
          type: string
        - index: 10
          name: check_date
          type: string
        - index: 11
          name: org_type
          type: string
        - index: 12
          name: industry
          type: string
        - index: 13
          name: reg_bureau
          type: string
        - index: 14
          name: english_name
          type: string
        - index: 15
          name: business_term
          type: string
        - index: 16
          name: update_date
          type: string
        - index: 17
          name: cancel_date
          type: string
        - index: 18
          name: cancel_reason
          type: string
        - index: 19
          name: revoke_date
          type: string
        - index: 20
          name: revoke_reason
          type: string
        - index: 21
          name: summary
          type: string
        - index: 22
          name: reg_capital_currency
          type: string
        - index: 23
          name: real_capital_currency
          type: string
        - index: 24
          name: staff_num
          type: string
        - index: 25
          name: social_security_num
          type: string
        - index: 26
          name: status
          type: string
        - index: 27
          name: site
          type: string
        - index: 28
          name: city
          type: string
        - index: 29
          name: district
          type: string
      - name: tyc_company_scrap_type
        props:
        - index: 30
          name: if_fullscrap
          type: string
        - index: 31
          name: full_update_timestamp
          type: string
      vid:
        index: 0
        type: string
  type: csv

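The first column is the VID.
I still have not found the cause. My point is that the current YAML file uses indexes 0-31 and each CSV row has 32 columns. Please help take a look.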
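
The 32-column claim can be checked mechanically before importing. The following is a minimal sketch, assuming the layout from the YAML above (VID at index 0, `reg_capital` as the only non-string column at index 5); the names `find_problem_rows`, `EXPECTED_COLUMNS` and `NON_STRING_COLUMNS` are made up for illustration. It also flags empty cells in non-string columns, which turns out to matter later in this thread.

```python
import csv
import io

EXPECTED_COLUMNS = 32                      # VID at index 0 plus props at indexes 1-31
NON_STRING_COLUMNS = {5: "reg_capital"}    # float column per the YAML above (assumption)

def find_problem_rows(lines):
    """Yield (row_number, message) for rows that may break the generated INSERT."""
    for row_number, row in enumerate(csv.reader(lines), start=1):
        if len(row) != EXPECTED_COLUMNS:
            yield row_number, f"expected {EXPECTED_COLUMNS} columns, got {len(row)}"
            continue
        for index, name in NON_STRING_COLUMNS.items():
            if row[index].strip() == "":
                yield row_number, f"empty value in non-string column {name} (index {index})"

# Two synthetic rows: the first has a float value, the second leaves it empty.
good = ",".join(["v1", "name"] + [""] * 3 + ["1.0"] + [""] * 26)
bad = ",".join(["v2", "name"] + [""] * 30)
problems = list(find_problem_rows(io.StringIO(good + "\n" + bad + "\n")))
print(problems)
```

For a real file, pass a file object opened with `newline=""` and the file's encoding instead of the `io.StringIO` sample.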

vertex_company_4131677626,重庆忽米产业互联网有限公司,,,,,,,,,,,,,,,,,,,,,,,,,,,,,否,

This is how the row looks in the CSV file.

vertex_company_91500240MA60AXRH3D,石柱土家族自治县一琼装饰工程有限公司,91500240MA60AXRH3D,500240011567325,2019-04-02,1000000.0,91500240MA60AXRH3D,,外装饰设计;建筑工程;建筑装饰工程;绿化景观工程;机电设备安装工程;土石方工程;钢结构工程;水电安装工程设计;施工;整改及维修;建筑材料;装饰材料;环保设备;金属材料;木材及制品;防水材料;机电设备;花卉盆景;水泵阀门;流体控制成套设备及配件销售;凭营业执照依法自主开展经营活动,,,有限责任公司   有限责任公司(自然人独资),,,,,,,,,,,CNY,,,,存续,,重庆市,石柱土家族自治县,否,

This is a row that was imported successfully. It was generated by the same program.

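The cause: the reg_capital column is a float, while all the other columns are strings. The failing row has many empty columns. When the importer handles an empty column, it turns a string column into "", but it does nothing for other types. So in the INSERT statement the importer finally assembles, the reg_capital value is simply missing. The server ignores such an empty slot, so the statement ends up one value short, which is why it reports "Column count doesn't match value count".
For example, the values in the failed statement above begin with ("重庆忽米产业互联网有限公司","","","",,"", and the empty slot between the two consecutive commas is where reg_capital should be.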
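
The mechanism can be mimicked in a few lines of Python. This is only an illustrative sketch of the behaviour described in this reply, not the importer's actual code; `render_value`, `build_values` and `count_non_empty_slots` are hypothetical helper names, and only the first seven company properties are used to keep it short.

```python
def render_value(cell, prop_type):
    """Mimic the described behaviour: quote strings, pass other types through unchanged."""
    if prop_type == "string":
        return '"' + cell + '"'
    return cell  # an empty float cell becomes an empty slot

def build_values(cells, prop_types):
    """Join the rendered cells the way a VALUES clause would list them."""
    return ",".join(render_value(c, t) for c, t in zip(cells, prop_types))

def count_non_empty_slots(values_clause):
    # Naive split on commas; fine here because the sample cells contain no commas.
    return sum(1 for slot in values_clause.split(",") if slot != "")

# name, usic, ic_no, reg_date are strings; reg_capital is a float; then two more strings.
prop_types = ["string"] * 4 + ["float"] + ["string"] * 2
cells = ["重庆忽米产业互联网有限公司", "", "", "", "", "", ""]

clause = build_values(cells, prop_types)
print(clause)
print(len(prop_types), "columns declared,", count_non_empty_slots(clause), "values present")
```

Seven columns are declared but only six values survive, which is the same kind of mismatch the server reports.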

Solution:
If a column has no value, write a default value for that column when generating the CSV.
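
One way to apply this in the Python script that writes the CSV is sketched below. It assumes `reg_capital` (CSV index 5) is the only non-string column, and the default `0.0` is a placeholder picked for illustration; choose a value that makes sense for your data. `NON_STRING_DEFAULTS`, `fill_defaults` and `write_rows` are hypothetical names, not part of any importer API.

```python
import csv
import io

# Hypothetical defaults for non-string columns, keyed by CSV column index.
# Index 5 is reg_capital (float) in the YAML above; "0.0" is an assumed placeholder.
NON_STRING_DEFAULTS = {5: "0.0"}

def fill_defaults(row, defaults=NON_STRING_DEFAULTS):
    """Return a copy of row with empty non-string cells replaced by their defaults."""
    filled = list(row)
    for index, default in defaults.items():
        if index < len(filled) and filled[index].strip() == "":
            filled[index] = default
    return filled

def write_rows(rows, out):
    """Write rows as CSV, filling defaults so no non-string cell is left empty."""
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        writer.writerow(fill_defaults(row))

# A 32-column row shaped like the failing one: VID, name, 28 empty cells, then the two tag props.
row = ["vertex_company_4131677626", "重庆忽米产业互联网有限公司"] + [""] * 28 + ["否", ""]
buf = io.StringIO()
write_rows([row], buf)
print(buf.getvalue())
```

String columns are left empty on purpose, since the importer already turns those into "".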


Thanks!
