Question template:
- nebula version: v3.6.0
- Deployment: cloud
- Installation method: RPM
- In production: N
Question:
When writing edges with pyspark through the Spark connector, is rankField mandatory?
Current code:
df.write.format("com.vesoft.nebula.connector.NebulaDataSource")\
.option("srcPolicy", "hash")\
.option("dstPolicy", "hash")\
.option("metaAddress", "metaip:9559")\
.option("graphAddress", "graphip:9669")\
.option("user", "root")\
.option("passwd", "passwd")\
.option("type", "edge")\
.option("spaceName", "config_server")\
.option("label", "request")\
.option("srcVertexField", "src_ip")\
.option("dstVertexField", "app_id")\
.option("batch", 100)\
.option("writeMode", "insert")\
.option("randkField", "")\
.option("operateType","write").save()
Both keeping .option("randkField", "")\ and removing the randkField option entirely raise a NullPointerException. What is the correct way to write this?
Error log:
: java.lang.NullPointerException
at com.vesoft.nebula.connector.NebulaDataSource.createWriter(NebulaDataSource.scala:96)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:275)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
at py4j.Gateway.invoke(Gateway.java:282)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:238)
at java.lang.Thread.run(Thread.java:748)
Traced it to this code in the connector:
} else {
val srcVertexFiled = nebulaOptions.srcVertexField
val dstVertexField = nebulaOptions.dstVertexField
val rankExist = !nebulaOptions.rankField.isEmpty
val edgeFieldsIndex = {
var srcIndex: Int = -1
var dstIndex: Int = -1
The problem appears to be the line !nebulaOptions.rankField.isEmpty: if rankField was never set it is null, and calling .isEmpty on null throws the NullPointerException.
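To illustrate the failure mode, here is a minimal Python sketch (not the connector's actual code) of the difference between an option that is absent (null) and one set to the empty string — calling a method on the absent value is the Python analog of the NPE, while a null-safe check treats both cases as "no rank field":

```python
# Hypothetical sketch mirroring the connector's rank check.
# options is a stand-in for the .option(...) key/value pairs passed to the writer.

def rank_exists(options: dict) -> bool:
    rank_field = options.get("rankField")  # None when the option was never set
    # Calling rank_field.strip() here would raise AttributeError when None,
    # just as `rankField.isEmpty` raises NullPointerException on a null String.
    # Null-safe version: missing key and "" both mean "no rank field".
    return bool(rank_field)

print(rank_exists({}))                          # missing option -> False, no crash
print(rank_exists({"rankField": ""}))           # empty string   -> False
print(rank_exists({"rankField": "rank_col"}))   # set            -> True
```

This is only a sketch of the null-vs-empty distinction; whether the connector expects the key to be spelled "rankField" or "randkField" should be confirmed against the nebula-spark-connector documentation for your connector version.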