Unable to import data with nebula-importer

version: v3
description: yaml file for snb data
removeTempFiles: false
clientSettings:
  retry: 0
  concurrency: 10 # number of graph clients
  channelBufferSize: 128
  space: snb
  connection:
    user: root
    password: nebula
    address: xxx.xxx.xxx.xxx:9669,xxx.xxx.xxx.xxx:9669,xxx.xxx.xxx.xxx:9669
  postStart:
    commands: |
      DROP SPACE IF EXISTS snb;
      CREATE SPACE IF NOT EXISTS snb(partition_num=15, replica_factor=1, vid_type=FIXED_STRING(20));
      USE snb;
      CREATE TAG comment(creationDate DateTime, locationIP String, browserUsed String, content String, length INT);
      CREATE TAG forum(title String, creationDate DateTime);
      CREATE TAG person(firstName String, lastName String, gender String, birthday Date, creationDate DateTime, locationIP String, browserUsed String, language String, email String);
      CREATE TAG post(imageFile String, creationDate DateTime, locationIP String, browserUsed String, language String, content String, length INT);
      CREATE TAG organisation(type String, name String, url String);
      CREATE TAG place(name String, url String, type String);
      CREATE TAG `tag`(name String, url String);
      CREATE TAG tagclass(name String, url String);
      CREATE EDGE IF NOT EXISTS comment_hasCreator_person();
      CREATE EDGE IF NOT EXISTS comment_hasTag_tag();
      CREATE EDGE IF NOT EXISTS comment_isLocatedIn_place();
      CREATE EDGE IF NOT EXISTS comment_replyOf_comment();
      CREATE EDGE IF NOT EXISTS comment_replyOf_post();
      CREATE EDGE IF NOT EXISTS forum_containerOf_post();
      CREATE EDGE IF NOT EXISTS forum_hasMember_person(joinDate DateTime);
      CREATE EDGE IF NOT EXISTS forum_hasModerator_person();
      CREATE EDGE IF NOT EXISTS forum_hasTag_tag();
      CREATE EDGE IF NOT EXISTS person_hasInterest_tag();
      CREATE EDGE IF NOT EXISTS person_isLocatedIn_place();
      CREATE EDGE IF NOT EXISTS person_knows_person(creationDate DateTime);
      CREATE EDGE IF NOT EXISTS person_likes_comment(creationDate DateTime);
      CREATE EDGE IF NOT EXISTS person_likes_post(creationDate DateTime);
      CREATE EDGE IF NOT EXISTS person_studyAt_organisation(classYear int);
      CREATE EDGE IF NOT EXISTS person_workAt_organisation(workFrom int);
      CREATE EDGE IF NOT EXISTS post_hasCreator_person();
      CREATE EDGE IF NOT EXISTS post_hasTag_tag();
      CREATE EDGE IF NOT EXISTS post_isLocatedIn_place();
      CREATE EDGE IF NOT EXISTS organisation_isLocatedIn_place();
      CREATE EDGE IF NOT EXISTS place_isPartOf_place();
      CREATE EDGE IF NOT EXISTS tag_hasType_tagclass();
      CREATE EDGE IF NOT EXISTS tagclass_isSubclassOf_tagclass();
    afterPeriod: 8s
  preStop:
    commands: |
logPath: ./err/test.log
files:
  - path: ./test-data/comment_0_0.csv
    failDataPath: ./error/comment_error.csv
    batchSize: 10
    limit: 20
    inOrder: false
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: "|"
    schema:
      type: vertex
      vertex: 
         vid: 
            index: 0
            type: string
        tags: 
          - name: comment
            props: 
               - name: creationDate
                 type: DateTime
                 index: 1
               - name: locationIP
                 type: string
                 index: 2
               - name: browserUsed
                 type: string
                 index: 3
               - name: content
                 type: string
                 index: 4
               - name: length
                 type: int
                 index: 5

The data being imported has the following format:

id|creationDate|locationIP|browserUsed|content|length
1236950581249|2011-08-17T14:26:59.961+0000|92.39.58.88|Chrome|yes|3
1236950581250|2011-08-17T11:10:21.570+0000|213.55.127.9|Internet Explorer|thanks|6
2061584302085|2012-07-20T05:22:51.283+0000|213.55.127.9|Internet Explorer|LOL|3
2061584302086|2012-07-20T16:55:45.373+0000|213.55.127.9|Internet Explorer|I see|5
2061584302087|2012-07-20T04:34:17.500+0000|213.55.127.9|Internet Explorer|fine|4
2061584302088|2012-07-20T17:35:11.096+0000|92.39.58.88|Chrome|right|5

Note: the header row is shown here only for illustration; the actual file does not contain it.
After finishing the configuration, I ran the following command:

./importer --config /opt/nebula-importer/snb.yaml
The error output is as follows:
2022/12/14 21:07:52 [INFO] reader.go:184: Total lines of file(/opt/nebula-importer/test-data/comment_0_0.csv) is: 20, error lines: 0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6e8939]

goroutine 35 [running]:
github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).startWorker(0xc0001c0200, 0xa)
	/opt/nebula-importer/pkg/client/clientpool.go:187 +0x299
github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).Init.func1(0x0?)
	/opt/nebula-importer/pkg/client/clientpool.go:149 +0x25
created by github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).Init
	/opt/nebula-importer/pkg/client/clientpool.go:148 +0x152

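I have checked everything, but it always reports a nil pointer or invalid memory address error. I really can't figure out where the problem is. Any advice would be much appreciated.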

Here tag is a keyword and has to be wrapped in backticks (`). Could you also check whether any errors are reported elsewhere?

This is a problem on our side: the error reporting here is far too unfriendly. Sorry about that.

https://github.com/vesoft-inc/nebula-importer/issues/251

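Sorry, the configuration file I posted was not the latest one. In the latest one I had already wrapped tag in backticks, but the nil pointer / invalid memory address error still occurs, and I can't find any more specific logs elsewhere.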

Two issues:

Here content is misspelled as cotent.

Remove the header row.

@veezhang could you help here? After making these fixes I still get an error:

2022/12/15 01:59:21 [INFO] reader.go:63: Start to read file(0): /root/test.csv, schema: < :VID(string),comment.creationDate:datetime,comment.locationIP:string,comment.browserUsed:string,comment.cotent:string,comment.length:int >
2022/12/15 01:59:21 [INFO] reader.go:179: Total lines of file(/root/test.csv) is: 1, error lines: 0
2022/12/15 01:59:21 [INFO] runner.go:79: Finish to read /root/test.csv
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x73d2d9]

Could the datetime be causing the problem?
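
One way to rule the datetime column in or out is to check that every timestamp parses. The sketch below uses Go's standard `time` package with a layout matching the sample values (for example `2011-08-17T14:26:59.961+0000`). The helper name `normalizeTimestamp` and the output layout are assumptions for illustration; whether the importer accepts the `+0000` suffix as-is is not established in this thread.

```go
package main

import (
	"fmt"
	"time"
)

// snbLayout matches timestamps such as 2011-08-17T14:26:59.961+0000.
const snbLayout = "2006-01-02T15:04:05.000-0700"

// normalizeTimestamp (hypothetical helper) parses an SNB-style timestamp and
// re-renders it in UTC without the zone offset. The output layout is an
// illustrative choice, not a format required by the importer.
func normalizeTimestamp(s string) (string, error) {
	t, err := time.Parse(snbLayout, s)
	if err != nil {
		return "", err
	}
	return t.UTC().Format("2006-01-02T15:04:05.000"), nil
}

func main() {
	out, err := normalizeTimestamp("2011-08-17T14:26:59.961+0000")
	fmt.Println(out, err) // prints: 2011-08-17T14:26:59.961 <nil>
}
```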

I fixed the cotent typo later, but the same error is still reported when loading the configuration file.

Could you explain in more detail? I'll try modifying it on my side and see what happens.

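I just tried changing the creationDate property of the comment tag to string, and the same error is still reported after loading the configuration file. I think I should go and look at the startWorker part of the source code.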

Thanks. I don't have time right now; if you find the problem, you are welcome to open a PR or issue with the details.

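@DDV123 Hello, the YAML indentation in the configuration you posted looks off: vid and tags do not appear to be aligned. Could you check that?
Also, which version of importer are you using?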

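Hello, what I posted in this thread is an outdated version; that had already been fixed in what I actually ran, but loading the configuration file still reports the problem above. I have forgotten the version, but when I run the following command,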

./importer -version

the only output is:

 
Git Hash:  
Tag: 

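I reproduced it this morning. After fixing all the small problems it is still the pointer problem. Then I got busy with other things; I did spend some time on it but still could not find the cause o(╥﹏╥)o.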


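@DDV123 Hello, could you try downloading a newer version of nebula-importer?
Judging from your error message, it is the line below that panics. We have not run into this before, and that line is unlikely to fail.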

I also read the source code and could not find the problem. I will download the latest version of importer and recompile it.


@veezhang Hello, I ran the following command to get the source code:

git clone https://xxxx@github.com/vesoft-inc/nebula-importer.git

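This should be the latest source code. I deployed nebula-importer using the offline build method and then loaded the configuration file, but it still reports the nil pointer / invalid memory address error.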

2022/12/15 20:18:54 [INFO] config.go:410: Failed data path: error/comment_error.csv/comment_0_0.csv
2022/12/15 20:18:54 [INFO] config.go:416: find file: /opt/nebula-importer/test-data/comment_0_0.csv
2022/12/15 20:19:02 [INFO] clientmgr.go:31: Create 9 Nebula Graph clients
2022/12/15 20:19:02 [INFO] runner.go:75: Start to read /opt/nebula-importer/test-data/comment_0_0.csv
2022/12/15 20:19:02 [INFO] runner.go:96: Waiting for stats manager done
2022/12/15 20:19:02 [INFO] reader.go:49: The delimiter of /opt/nebula-importer/test-data/comment_0_0.csv is U+007C '|'
2022/12/15 20:19:02 [INFO] reader.go:63: Start to read file(0): /opt/nebula-importer/test-data/comment_0_0.csv, schema: < :VID(string),comment.creationDate:datetime,comment.locationIP:string,comment.browserUsed:string,comment.content:string,comment.length:int >
2022/12/15 20:19:02 [INFO] reader.go:179: Total lines of file(/opt/nebula-importer/test-data/comment_0_0.csv) is: 20, error lines: 0
2022/12/15 20:19:02 [INFO] runner.go:79: Finish to read /opt/nebula-importer/test-data/comment_0_0.csv
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x6ec039]

goroutine 34 [running]:
github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).startWorker(0xc0001a2100, 0x7)
	/opt/nebula-importer-new/nebula-importer/pkg/client/clientpool.go:187 +0x299
github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).Init.func1(0x0?)
	/opt/nebula-importer-new/nebula-importer/pkg/client/clientpool.go:149 +0x25
created by github.com/vesoft-inc/nebula-importer/pkg/client.(*ClientPool).Init
	/opt/nebula-importer-new/nebula-importer/pkg/client/clientpool.go:148 +0x152

Oh, found the cause. Could you try setting retry to 1?


I'm curious what the cause was.

It works now, thank you. There are still errors elsewhere, but the data can be loaded. Many thanks.


https://github.com/vesoft-inc/nebula-importer/blob/v3.2.0/pkg/client/clientpool.go#L175-L190

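In the code linked above:
With retry = 0, line 177 is never executed, so err and resp are both nil.
The check on retry is also somewhat wrong: retry should be the number of retries, not the number of executions.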
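
The behaviour described here can be reproduced with a small standalone sketch. It is not the actual clientpool.go code: the `result` type and the functions `executeBuggy` and `executeFixed` are hypothetical stand-ins that only mirror the loop shape described above, where retry is used as the total number of attempts.

```go
package main

import "fmt"

// result is a hypothetical stand-in for the response type.
type result struct{ ok bool }

// IsSucceed reads a field of r, so calling it on a nil *result panics.
func (r *result) IsSucceed() bool { return r.ok }

// executeBuggy treats retry as the total number of attempts, so with
// retry == 0 it never calls exec and returns (nil, nil).
func executeBuggy(exec func() (*result, error), retry int) (*result, error) {
	var resp *result
	var err error
	for n := retry; n > 0; n-- {
		resp, err = exec()
		if err == nil && resp.IsSucceed() {
			break
		}
	}
	return resp, err
}

// executeFixed always makes one attempt, then up to retry further attempts.
func executeFixed(exec func() (*result, error), retry int) (*result, error) {
	if retry < 0 {
		retry = 0
	}
	var resp *result
	var err error
	for attempt := 0; attempt <= retry; attempt++ {
		resp, err = exec()
		if err == nil && resp.IsSucceed() {
			break
		}
	}
	return resp, err
}

func main() {
	exec := func() (*result, error) { return &result{ok: true}, nil }

	resp, err := executeBuggy(exec, 0)
	fmt.Println(resp == nil, err == nil) // prints: true true

	// Calling resp.IsSucceed() at this point would panic with a nil pointer
	// dereference, which matches the symptom reported in this thread.

	resp, err = executeFixed(exec, 0)
	fmt.Println(resp.IsSucceed(), err) // prints: true <nil>
}
```

Until the loop is changed upstream, setting retry to 1 or higher in the YAML config is the workaround confirmed in this thread.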
