How can I write string data (such as "王一博") into a vertex?

  • Nebula version: 2.6, Spark Connector: 2.6, Spark: 2.4.0 (switching to 2.4.4 or 2.4.3 still throws the same error when reading edges)
  • Deployment: single host
  • Installation: built from source
  • Production environment: N

When writing vertices, if the vertex ID being written is a string, the following error is thrown:

java.lang.AssertionError: assertion failed
	at scala.Predef$.assert(Predef.scala:156)
	at com.vesoft.nebula.connector.writer.NebulaExecutor$.extraID(NebulaExecutor.scala:60)
	at com.vesoft.nebula.connector.writer.NebulaEdgeWriter.write(NebulaEdgeWriter.scala:56)
	at com.vesoft.nebula.connector.writer.NebulaEdgeWriter.write(NebulaEdgeWriter.scala:17)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
	at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
	at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:121)
	at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:402)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:408)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)

So again: how can string data (such as "关键词") be written into a vertex? Numeric data can be written as a vertex ID without problems, but Chinese strings cannot.

Please post how you are writing the data. Above there is only the error message; we need to see what you actually did.

Hi, here is the code I use to write the data. Thanks!

import com.vesoft.nebula.connector.connector.NebulaDataFrameWriter
import com.vesoft.nebula.connector.WriteNebulaVertexConfig
import org.apache.spark.sql.SparkSession

def writeVertex(spark: SparkSession): Unit = {
  LOG.info("start to write nebula vertices")

  // read the vertex data from CSV; the header row provides the column names
  val df = spark.read.option("header", true).csv("E:\\data\\vertex_topic_info")
  df.show(10)

  // getNebulaConnectionConfig() is defined elsewhere in this class
  val config = getNebulaConnectionConfig()
  val nebulaWriteVertexConfig: WriteNebulaVertexConfig = WriteNebulaVertexConfig
    .builder()
    .withSpace("Test_Interest_preference")
    .withTag("center_vertex")
    .withVidField("topic")     // the "topic" column is used as the vertex ID
    .withVidAsProp(false)
    // .withVidPolicy("hash")  // would hash string VIDs into INT64 values
    .withBatch(1000)
    .build()
  println(df.schema)
  df.write.nebula(config, nebulaWriteVertexConfig).writeVertices()
}
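
For context on why a string ID trips the assertion: under the hood the connector batches the rows into nGQL INSERT VERTEX statements. A minimal sketch of what the write above corresponds to, with a hypothetical property name; in a space whose vid_type is INT64, a quoted string like "王一博" is not a legal vertex ID, which is what the connector's assertion guards against:

# hypothetical property; withVidAsProp(false) means "topic" itself is not stored as a property
INSERT VERTEX center_vertex(name) VALUES "王一博":("example");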

Please post the result of desc space Test_Interest_preference.

Here it is, please take a look.

nicole called it: your vid type is INT, so you can't pass string values. If the space used FIXED_STRING, even 肖战 would write in fine.
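
For anyone hitting the same assertion, the space's VID type can be confirmed with the statement below; in its output, the Vid Type column shows INT64 for integer IDs or FIXED_STRING(N) for string IDs:

DESC SPACE Test_Interest_preference;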


How do I specify the VID type as FIXED_STRING? Thanks~

The only option is... to recreate the schema. The VID type is fixed once a space is created.

So I specify it when creating the graph space, right?


Exactly.
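
For reference, a minimal sketch of creating a replacement space with a string VID type; the new space name, the FIXED_STRING length of 32, and the tag property are assumptions, so adjust them to your data:

CREATE SPACE Test_Interest_preference_v2 (vid_type = FIXED_STRING(32));
# wait for the new space to sync across the cluster (a few heartbeat cycles), then:
USE Test_Interest_preference_v2;
CREATE TAG center_vertex(name string);  # hypothetical property list

Then point withSpace(...) at the new space, and string VIDs like "王一博" should be accepted.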

Great, thanks!

:joy: If the problem is solved, you can mark nicole's reply or mine as the solution, and this topic will be counted as resolved.