nebula-importer prints only the START OF and END OF lines during import, then nothing else happens

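  • NebulaGraph version: nebula-graph-3.4.0
  • Deployment: distributed, three servers
  • Installation method: RPM
  • Hardware:
    • CPU: Intel(R) Xeon(R) Gold 5320 CPU @ 2.20GHz
    • Disk: Disk /dev/sda: 1999.8 GB, 1999844147200 bytes, 3905945600 sectors
  • Problem description:
    I am importing CSV files with nebula-importer. The relevant part of the config file is: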
version: v3
description: example
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 50 # number of graph clients
  channelBufferSize: 1000
  space: newTest
  connection:
    user: root
    password: FUZAduowei.111
    address: 192.168.0.186:9669,192.168.0.187:9669,192.168.0.188:9669
  postStart:
    commands: |
      CREATE SPACE IF NOT EXISTS `newTest`(PARTITION_NUM = 30, REPLICA_FACTOR = 1, vid_type = FIXED_STRING(50));
      USE newTest;
      CREATE TAG IF NOT EXISTS `node`(`ip_num` string,`node_id` string,`country_code` string,`point` geography,`addr` string,`node_type` string, `node_index` string);
      CREATE EDGE IF NOT EXISTS `follow`(`edgeType` string,`edgeIndex` string, `source_longitude` double,`source_latitude` double,`source_node_type` string,`source_node_index` string,`target_longitude` double,`target_latitude` double,`target_node_type` string,`target_node_index` string);
      CREATE TAG INDEX IF NOT EXISTS `dtype_index` on `node`(`node_type`(10));
      CREATE TAG INDEX IF NOT EXISTS `geo_index` on `node`(`point`);
      CREATE EDGE INDEX IF NOT EXISTS `follow_index_type` on `follow`(`edgeType`(10));
    afterPeriod: 10s
logPath: ./test.log
files:
  - path: ./node.csv
    failDataPath: ./err/pointerr.csv
    batchSize: 256
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: ','
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: string
        tags:
          - name: node
            props:
                - name: ip_num
                  type: string
                  index: 1
                - name: node_id
                  type: string
                  index: 2
                - name: country_code
                  type: string
                  index: 3 
                - name: point
                  type: geography
                  index: 4
                - name: addr
                  type: string
                  index: 5
                - name: node_type
                  type: string
                  index: 6
                - name: node_index
                  type: string
                  index: 7

  - path: ./link.csv
    failDataPath: ./err/lineerr.csv
    batchSize: 256
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: ','
    schema:
      type: edge
      edge:
        name: follow
        withRanking: false
        srcVID:
          index: 0
          type: string
        dstVID:
          index: 1
          type: string
        props:
          - name: edgeType
            type: string
            index: 2
          - name: edgeIndex
            type: string
            index: 3
          - name: source_longitude
            type: double
            index: 4
          - name: source_latitude
            type: double
            index: 5
          - name: source_node_type
            type: string
            index: 6
          - name: source_node_index
            type: string
            index: 7
          - name: target_longitude
            type: double
            index: 8
          - name: target_latitude
            type: double
            index: 9
          - name: target_node_type
            type: string
            index: 10
          - name: target_node_index
            type: string
          - index: 11

After I run the import command, only the following output is shown, and nothing is written to the log file:

2023/11/15 15:38:44 --- START OF NEBULA IMPORTER ---
2023/11/15 15:38:45 --- END OF NEBULA IMPORTER ---

Can you check whether the files are actually at the paths you specified? :thinking: From this output, it looks like no data was processed at all.

I checked, and they are. I also tried absolute paths, but the output is still the same two lines with no other information: START, then END.

Which importer version are you using? Let's first rule out the most basic problem, a version mismatch.

version: v3
description: example
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 128 # number of graph clients
  channelBufferSize: 5000
  space: list
  connection:
    user: root
    password: FUZAduowei.111
    address: 192.168.0.186:9669,192.168.0.187:9669,192.168.0.188:9669
  postStart:
    commands: |
      CREATE SPACE IF NOT EXISTS `list`(PARTITION_NUM = 60, REPLICA_FACTOR = 2, vid_type = FIXED_STRING(50));
      USE list;
      CREATE TAG IF NOT EXISTS `node`();
      CREATE EDGE IF NOT EXISTS `follow`();
    afterPeriod: 20s
logPath: ./test.log
files:
  - path: ./user_point43_7.csv
    failDataPath: ./err/pointerr.csv
    batchSize: 2000
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: ','
    schema:
      type: vertex
      vertex:
        vid:
          index: 0
          type: string
        tags:
          - name: node

  - path: ./user_line43_7.csv
    failDataPath: ./err/lineerr.csv
    batchSize: 2000
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: ','
    schema:
      type: edge
      edge:
        name: follow
        withRanking: false
        srcVID:
          index: 0
          type: string
        dstVID:
          index: 1
          type: string

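With the files in the same directory, this config runs successfully and produces log output, so it is not a version problem. Imports with it have always worked.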

:thinking: If it's convenient, could you upgrade the importer? Try the 4.1 importer and see whether the problem persists; it is compatible with the 3.x core.

I switched to the latest version. After running the import, it shows this:
{"level":"info","ts":"2023-11-16T17:36:13+08:00","caller":"manager/manager.go:211","msg":"manager: add import source successfully","source":"local /home/importTest/file/graph-1699666196683-123456-node.csv"}
edge(follow): no prop name
I checked the link CSV file, and its columns match the properties in the config file.

Please paste your config file so we can take a look.

client:
  version: v3
  address: "127.0.0.1:9669"
  user: root
  password: 111
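  # number of concurrent connections per address
  concurrencyPerAddress: 10
  # initial interval between reconnect attempts
  reconnectInitialInterval: 1s
  # number of times a failed statement is retried
  retry: 3
  # initial interval between retries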
  retryInitialInterval: 1s

manager:
  spaceName: newTest
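  # number of records per executed batch
  batch: 128
  # number of concurrent readers for the data sources
  readerConcurrency: 50
  # number of concurrent workers generating the statements to execute
  importerConcurrency: 512
  # interval at which statistics are printed
  statsInterval: 10s
  hooks:
    # commands to run before the import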
    before:
      - statements:
        - |
          CREATE SPACE IF NOT EXISTS `newTest`(PARTITION_NUM = 30, REPLICA_FACTOR = 1, vid_type = FIXED_STRING(50));
          USE newTest;
          CREATE TAG IF NOT EXISTS `node`(`ip_num` string,`node_id` string,`country_code` string,`point` geography,`addr` string,`node_type` string, `node_index` string);
          CREATE EDGE IF NOT EXISTS `follow`(`edgeType` string,`edgeIndex` string,`source_longitude` double,`source_latitude` double,`source_node_type` string,`source_node_index` string,`target_longitude` double,`target_latitude` double,`target_node_type` string,`target_node_index` string);
          CREATE TAG INDEX IF NOT EXISTS `dtype_index` on `node`(`node_type`(10));
          CREATE TAG INDEX IF NOT EXISTS `geo_index` on `node`(`point`);
          CREATE EDGE INDEX IF NOT EXISTS `follow_index_type` on `follow`(`edgeType`(10));
        wait: 10s

log:
  level: INFO
  console: true
  files:
   - /home/importTest/nebula-importer.log

sources:
  - path: /home/importTest/file/graph-1699666196683-123456-node.csv
    batch: 256
    csv:
      # delimiter used in the CSV file
      delimiter: ","
    tags:
    - name: node
      id:
        type: "STRING"
        index: 0
      props:
              - name: ip_num
                type: string
                index: 1
              - name: node_id
                type: string
                index: 2
              - name: country_code
                type: string
                index: 3 
              - name: point
                type: geography
                index: 4
              - name: addr
                type: string
                index: 5
              - name: node_type
                type: string
                index: 6
              - name: node_index
                type: string
                index: 7

  - path: /home/importTest/file/graph-1699666196683-123456-link.csv
    batch: 256
    edges:
    - name: follow # person_knows_person
      src:
        id:
          type: "STRING"
          index: 0
      dst:
        id:
          type: "STRING"
          index: 1
      props:
          - name: edgeType
            type: string
            index: 2
          - name: edgeIndex
            type: string
            index: 3
          - name: source_longitude
            type: double
            index: 4
          - name: source_latitude
            type: double
            index: 5
          - name: source_node_type
            type: string
            index: 6
          - name: source_node_index
            type: string
            index: 7
          - name: target_longitude
            type: double
            index: 8
          - name: target_latitude
            type: double
            index: 9
          - name: target_node_type
            type: string
            index: 10
          - name: target_node_index
            type: string
          - index: 11
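
One detail in the config as posted may be worth checking. This is a reading of the YAML, not a confirmed diagnosis. The last entry under the `follow` props puts `index: 11` on its own `-` line, so YAML parses it as a separate property with an index but no `name`. That would be consistent with `edge(follow): no prop name`. The v3 config in the first post ends the same way. A minimal sketch of the end of the props list with the index folded back into `target_node_index` (earlier entries unchanged):

```yaml
props:
  - name: target_node_type
    type: string
    index: 10
  - name: target_node_index
    type: string
    index: 11  # part of the same list item as the name; no leading "-"
```

With this shape the list has ten named entries, matching the ten properties declared in `CREATE EDGE ... follow`.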


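Setting aside the import failure for now: if you create indexes before importing the data, the import becomes very slow. Creating indexes first and importing afterwards is generally not recommended.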
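
As a sketch of that advice, the three `CREATE ... INDEX` statements could move out of `hooks.before` and run after the data is loaded. This assumes the 4.x importer's `manager.hooks.after` section accepts the same `statements` / `wait` shape as `hooks.before`, and that nGQL's `REBUILD TAG INDEX` and `REBUILD EDGE INDEX` are used to index the rows that already exist. The `wait` value is illustrative, not taken from the thread.

```yaml
manager:
  spaceName: newTest
  hooks:
    after:
      # Create the indexes only after the import has finished.
      - statements:
        - |
          USE newTest;
          CREATE TAG INDEX IF NOT EXISTS `dtype_index` on `node`(`node_type`(10));
          CREATE TAG INDEX IF NOT EXISTS `geo_index` on `node`(`point`);
          CREATE EDGE INDEX IF NOT EXISTS `follow_index_type` on `follow`(`edgeType`(10));
        # Assumed pause so the new index definitions can take effect.
        wait: 20s
      # Build index entries for the data that was just imported.
      - statements:
        - |
          USE newTest;
          REBUILD TAG INDEX `dtype_index`, `geo_index`;
          REBUILD EDGE INDEX `follow_index_type`;
```

The `hooks.before` block would then keep only the `CREATE SPACE`, `CREATE TAG`, and `CREATE EDGE` statements.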
