Importing from Neo4j into Nebula with Exchange fails because some vertices under a label have null property values

While importing from Neo4j into Nebula with Exchange, I found that some vertices under a Neo4j label have properties whose value is null.



In the configuration file, the Neo4j query lists every property explicitly, and all of them are imported into Nebula:

exec: "match (n:user) where id(n) <= 50000000 return n.vid as vid, n.lastModifyDate as lastModifyDate, n.batchRiskFlag as batchRiskFlag, n.emailRanFlag as emailRanFlag, n.isAgent as isAgent, n.aftsaleFlag as aftsaleFlag, n.whiteAccFlag as whiteAccFlag, n.amountTotal as amountTotal, n.jsRiskFlag as jsRiskFlag, n.RmsPnsFlag as RmsPnsFlag, n.firstAmount as firstAmount, n.userModelFlag as userModelFlag, n.ilrgtHdFlag as ilrgtHdFlag, n.regTime as regTime, n.blackAccFlag as blackAccFlag, n.userName as userName, n.TotalRiskFlag as TotalRiskFlag, n.orderTotal as orderTotal order by id(n)"

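The import fails, and Spark reports a null type. I have no way to pick out which vertices under the label are missing some of the properties, so I can only import with the full property list. How should I handle this? Is there a configuration option that allows null property values?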


@nicole, calling in the expert for help :flushed:

spark-submit version: [image]

Nebula 1.0 does not support null property values, so rows that contain null values cannot be inserted.

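Exchange does not apply any uniform handling to null values during import, but you can customize it to convert each null into a default value for the corresponding type.
For example, if a Nebula property is of type String, you can convert null to "".
If a Nebula property is of type int, you can convert null to -999.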
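As an alternative that is not proposed in this thread and does not require changing Exchange, the same defaults could in principle be applied on the Neo4j side with Cypher's `coalesce()` function, so that the query never returns null. The sketch below only builds such a query string. The helper names (`literal`, `returnClause`) are hypothetical, and the property types chosen for `userName`, `orderTotal` and `isAgent` are assumptions, since the thread does not state them.

```scala
object CoalesceQuery {
  // Hypothetical helper: render a Scala default value as a Cypher literal.
  // Strings are single-quoted and escaped; other values use toString.
  def literal(v: Any): String = v match {
    case s: String => "'" + s.replace("\\", "\\\\").replace("'", "\\'") + "'"
    case other     => other.toString
  }

  // Build "coalesce(n.<prop>, <default>) as <prop>" for every property.
  def returnClause(defaults: Seq[(String, Any)]): String =
    defaults
      .map { case (prop, d) => s"coalesce(n.$prop, ${literal(d)}) as $prop" }
      .mkString(", ")

  def main(args: Array[String]): Unit = {
    // Assumed property types: String, integer, boolean.
    val defaults: Seq[(String, Any)] =
      Seq("userName" -> "", "orderTotal" -> -999L, "isAgent" -> false)
    val query =
      "match (n:user) where id(n) <= 50000000 return n.vid as vid, " +
        returnClause(defaults) + " order by id(n)"
    println(query)
  }
}
```

The resulting string would go into the `exec` setting in place of the plain `n.prop as prop` list. Whether this removes the Spark null-type error depends on how Exchange infers the schema, so treat it as something to try. The change to Exchange described next is independent of it.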

Where to make the change: com.vesoft.nebula.tools.importer.processor.Processor#extraValue
Add the following type conversion to that method:

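row.schema.fields(index).dataType match {
  // Add support for NullType: return a default based on the target property type
  case NullType =>
    fieldTypeMap(field) match {
      case StringType  => ""
      case LongType    => -999L
      case DoubleType  => -0.00001
      case BooleanType => false
    }
  // the existing cases for the other data types stay unchanged
}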
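For reference, the default-value mapping above can also be written as a small standalone helper. This is a minimal sketch, assuming spark-sql is on the classpath (as it is for Exchange). `NullDefaults`, `defaultFor` and `valueOrDefault` are hypothetical names, and `targetTypes` stands in for the `fieldTypeMap` used above. Note one deliberate variation: it checks `row.isNullAt(index)` on each row instead of matching on `NullType`, so it also covers null cells in columns that do have a concrete type.

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

object NullDefaults {
  // Default to write when the source value is null, chosen by the target
  // property type. The values are the ones suggested in this thread.
  def defaultFor(target: DataType): Any = target match {
    case StringType  => ""
    case LongType    => -999L
    case DoubleType  => -0.00001
    case BooleanType => false
    case other =>
      // Fail loudly rather than guess a default for an unlisted type.
      throw new IllegalArgumentException(s"no null default defined for $other")
  }

  // Hypothetical helper: read a field from a row that carries a schema,
  // falling back to the type-specific default when the cell is null.
  def valueOrDefault(row: Row, field: String, targetTypes: Map[String, DataType]): Any = {
    val index = row.schema.fieldIndex(field)
    if (row.isNullAt(index)) defaultFor(targetTypes(field)) else row.get(index)
  }
}
```

Sentinel values such as -999 end up as real data in Nebula, so they should be chosen so that they cannot collide with legitimate values of the property.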

:+1: