SST import fails with ERROR processor.VerticesProcessor: java.lang.NumberFormatException

taoshilei · 2021-12-27 09:47

Nebula version: 2.6.0
Deployment: distributed
Installation method: RPM
Production environment: Y
Hardware
Disk: SSD
CPU/memory info: Linux chj-prd-nebula01 3.10.0-1160.45.1.el7.x86_64 #1 SMP Wed Oct 13 17:20:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
Exchange jar version: 2.6.1

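Error message:
When importing data with SST, I get ERROR processor.VerticesProcessor: java.lang.NumberFormatException: For input string: "56.04"

The source is a Hive table, and all of its columns are of type string.

The sink is an HDFS path. All properties of the Nebula tag are of type string, and the VID is also a string.

Why would a type conversion error be reported? Any help is appreciated.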
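Editor's note: the text `For input string: "56.04"` is the message the JDK's integer parsers produce when handed a decimal string. The minimal sketch below reproduces only that message. Where (or whether) Exchange performs such a conversion for this tag is not established in this thread, and the class and method names are hypothetical.

```java
public class ParseDemo {
    // Returns the exception message produced when `raw` is parsed as a long,
    // or null when parsing succeeds.
    static String parseFailure(String raw) {
        try {
            Long.parseLong(raw);
            return null;
        } catch (NumberFormatException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A decimal string is rejected by an integer parser.
        System.out.println(parseFailure("56.04"));
        // An integer string parses cleanly, so this prints null.
        System.out.println(parseFailure("56"));
    }
}
```

If the failing code path is an integer parse like this one, some column is being treated as an integer even though every declared type is string.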

steam · 2021-12-27 09:50

Which version of Exchange are you using?

taoshilei · 2021-12-27 10:46

2.6.1

nicole · 2021-12-27 10:50

Please post the output of desc tag, and your Exchange configuration file as well.

taoshilei · 2021-12-27 10:54

desc tag:

{
  spark: {
    app: {
      name: Nebula Exchange 2.0
    }
    driver: {
      cores: 2
      maxResultSize: 5G
    }
    cores {
      max: 16
    }
  }

  nebula: {
    address:{
      graph:["10.xx:9669","10.xx:9669","10.xx:9669","10.xx:9669","10.xx:9669"]
      meta:["10.xx:9559","10.xx:9559","10.xx:9559"]
    }
    user: root
    pswd: "redacted"
    path : {
       local:"/tmp/sst"
       remote:"/tmp/sst"
       hdfs.namenode: "hdfs://nameservice1:8020"
    }
    space: s_vcredit
    connection {
      timeout: 60000
      retry: 3
    }
    execution {
      retry: 3
    }
    error: {
      max: 32
      output: /tmp/errors/t_idcardno
    }
    rate: {
      limit: 1024
      timeout: 10000
    }
  }
  tags: [

    {
      name: t_idcardno
      type: {
        source: hive
        sink: sst
      }
      exec: "select identity_no,real_name,sex,age,folk,edu_degree,marry_status,blacklist,recognize_addr,prov,city,district,town,street_addr,current_timestamp() as update_time from dwd.dwd_vcredit_graph_idcardno_node_df where dt='vardate'"
      fields: [real_name,sex,age,folk,edu_degree,marry_status,blacklist,recognize_addr,prov,city,district,town,street_addr,update_time]
      nebula.fields: [real_name,sex,age,folk,edu_degree,marry_status,blacklist,recognize_addr,prov,city,district,town,street_addr,update_time]
      vertex:{
        field:identity_no
      }
      batch: 256
      partition: 32
    }

  ]


}

nicole · 2021-12-27 10:59

Your configuration file doesn't match the tag information you posted: the tag you posted has 10 properties, but the configuration file lists 14.

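taoshilei · 2021-12-27 11:00

I just tried changing the source to orc and reading the files directly, and the type conversion problem did not occur.

But another problem came up:
When I submit the Spark job with master set to yarn or yarn-cluster, it fails with: Caused by: java.nio.file.NoSuchFileException: /tmp/sst/1-168.sst
If I switch to local, the SST files are generated.

Question: can Exchange only generate SST files in local mode? The example in the official docs also uses local mode.

Here is the command I use to submit the job: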

# Runtime environment
export JAVA_HOME=/usr/java/jdk1.8.0_181-cloudera
spark_submit="/opt/spark-2.4.8-bin-hadoop2.7/bin/spark-submit"
# Submit the job
${spark_submit} \
--master yarn-cluster \
--files t_idcardno_try_sst_2.conf \
--queue bigdatad \
--name tsl_try_sst_20211227 \
--conf spark.sql.shuffle.partitions=200 \
--conf spark.default.parallelism=100 \
--conf spark.driver.extraJavaOptions=-Dfile.encoding=utf-8 \
--conf spark.executor.extraJavaOptions=-Dfile.encoding=utf-8 \
--driver-memory 4g  \
--executor-memory 5g \
--executor-cores 2 \
--num-executors 5  \
--class com.vesoft.nebula.exchange.Exchange /opt/nebula/nebula-exchange-2.6.1.jar  -c t_idcardno_try_sst_2.conf -h

taoshilei · 2021-12-27 11:02

Sorry, I didn't notice the output was paginated; this is the second page.

nicole · 2021-12-27 11:02

Local, standalone, yarn-client, and yarn-cluster modes are all supported.
In your case the problem is most likely with writing the local files in cluster mode.

nicole · 2021-12-27 11:03

Do you know which property the failing value 56.04 belongs to?

taoshilei · 2021-12-27 11:08

I can't find it by searching, which is really strange.

taoshilei · 2021-12-27 11:12

error99.txt (343.6 KB)
The attached file is the log from the run on YARN. The only errors I found in it look like this:

21/12/27 18:29:05 ERROR processor.VerticesProcessor: org.rocksdb.RocksDBException: While open a file for appending: /tmp/sst/1-146.sst: No such file or directory
21/12/27 18:29:05 ERROR executor.Executor: Exception in task 5.1 in stage 2.0 (TID 146)
java.nio.file.NoSuchFileException: /tmp/sst/1-146.sst

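nicole · 2021-12-28 02:12

Did you delete anything under the local /tmp directory while the job was running? SST files are first written to the local path and then uploaded to HDFS. Here RocksDB could not find the file while writing the SST file locally.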

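taoshilei · 2021-12-28 02:13

No, I didn't delete anything.

I thought it was a permissions problem, but running as the root user still fails with the same file-not-found error.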

taoshilei · 2021-12-28 03:37

Solved. The user I run the job as (a Kerberos user) had no permission on /tmp on the DataNode nodes.
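Editor's note: since the SST files are written to the local path on whichever node runs the task, one way to catch this kind of permission problem early is to probe that directory as the job user on each worker node. The following is a minimal sketch under that assumption. `LocalPathCheck` and `probe` are hypothetical names and not part of Exchange, and `/tmp/sst` is simply the local path from the configuration above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class LocalPathCheck {
    // Tries to create `dir` (if absent) and to write a throwaway file into it.
    // Returns null on success, otherwise a short description of the failure.
    static String probe(Path dir) {
        try {
            Files.createDirectories(dir);
            Path tmp = Files.createTempFile(dir, "probe-", ".sst");
            Files.delete(tmp);
            return null;
        } catch (IOException | SecurityException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Defaults to the local path used in the configuration in this thread.
        Path dir = Paths.get(args.length > 0 ? args[0] : "/tmp/sst");
        String failure = probe(dir);
        System.out.println(failure == null
                ? dir + " is writable"
                : dir + " is NOT writable (" + failure + ")");
    }
}
```

Run it as the same user that the YARN containers run as; a check made as a different user on the submitting machine says nothing about the worker nodes.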

nicole · 2021-12-28 07:52

OK. What about the NumberFormatException, though? Is it caused by the data?

taoshilei · 2021-12-28 09:08

I haven't found the cause of that yet. I searched the data, and that value really isn't there.
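Editor's note: when a plain text search for the literal value turns up nothing, another angle is to check each column of a sample of rows for values that look like decimals. The sketch below is a hypothetical helper, not Exchange code, and it assumes the rows have already been read into string arrays; the column names in the usage note are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class DecimalColumnFinder {
    // Returns the names of columns whose value in `row` parses as a double
    // but not as a long, for example "56.04".
    static List<String> decimalColumns(String[] columns, String[] row) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < columns.length && i < row.length; i++) {
            String value = row[i];
            if (value == null) {
                continue;
            }
            try {
                Long.parseLong(value.trim());
                continue; // a clean integer, not what we are looking for
            } catch (NumberFormatException ignored) {
                // fall through and test whether it is a decimal
            }
            try {
                Double.parseDouble(value.trim());
                hits.add(columns[i]);
            } catch (NumberFormatException ignored) {
                // ordinary text, not numeric at all
            }
        }
        return hits;
    }
}
```

For example, `decimalColumns(new String[]{"c1", "c2"}, new String[]{"abc", "56.04"})` returns a list containing only `c2`. Note that `Double.parseDouble` also accepts strings such as `NaN` or `1e3`, so hits still need a manual look.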