I switched approaches and no longer call the nebula-spark connector API. Instead, I use the nebula-java client: Spark pulls the data into a Dataset, and I override call() in a ForeachPartitionFunction. The main logic is to obtain a Nebula session inside each partition and execute the write statements (a rough sketch of my call() implementation is shown after the stack trace below). This works fine in development and testing, but in production, where the data volume is large, the job fails with the following exception:
Caused by: com.vesoft.nebula.client.graph.exception.NotValidConnectionException: No extra connection: All servers are broken.
at com.vesoft.nebula.client.graph.net.NebulaPool.getConnection(NebulaPool.java:215)
at com.vesoft.nebula.client.graph.net.NebulaPool.getSession(NebulaPool.java:137)
at graph.write.NebulaForeachPartition.call(NebulaForeachPartition.java:60)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$2.apply(Dataset.scala:2691)
at org.apache.spark.sql.Dataset$$anonfun$foreachPartition$2.apply(Dataset.scala:2691)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.rdd.RDD$$anonfun$foreachPartition$1$$anonfun$apply$29.apply(RDD.scala:935)
at org.apache.spark.SparkContext$$anonfun$runJob$5.apply(SparkContext.scala:2074)
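For reference, here is a simplified sketch of what my NebulaForeachPartition does. The host address, credentials, space name, pool size, and the buildInsertStatement helper are placeholders for illustration, not my exact production code:

```java
import java.util.Arrays;
import java.util.Iterator;

import org.apache.spark.api.java.function.ForeachPartitionFunction;
import org.apache.spark.sql.Row;

import com.vesoft.nebula.client.graph.NebulaPoolConfig;
import com.vesoft.nebula.client.graph.data.HostAddress;
import com.vesoft.nebula.client.graph.data.ResultSet;
import com.vesoft.nebula.client.graph.net.NebulaPool;
import com.vesoft.nebula.client.graph.net.Session;

public class NebulaForeachPartition implements ForeachPartitionFunction<Row> {

    @Override
    public void call(Iterator<Row> rows) throws Exception {
        NebulaPoolConfig poolConfig = new NebulaPoolConfig();
        poolConfig.setMaxConnSize(10); // placeholder pool size

        // Each partition builds its own pool and session, since the pool is not serializable.
        NebulaPool pool = new NebulaPool();
        pool.init(Arrays.asList(new HostAddress("10.227.111.101", 9669)), poolConfig);

        Session session = pool.getSession("root", "nebula", false);
        try {
            session.execute("USE my_space;"); // placeholder space name
            while (rows.hasNext()) {
                Row row = rows.next();
                // Build and run the INSERT statement for this row.
                ResultSet resp = session.execute(buildInsertStatement(row));
                if (!resp.isSucceeded()) {
                    throw new RuntimeException("Insert failed: " + resp.getErrorMessage());
                }
            }
        } finally {
            session.release();
            pool.close();
        }
    }

    // Hypothetical helper: turns a Spark Row into an nGQL INSERT statement (details elided).
    private String buildInsertStatement(Row row) {
        return "INSERT VERTEX ...";
    }
}
```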
Checking the Nebula Graph logs:
I20220331 17:53:00.622897 19428 GraphService.cpp:67] Authenticating user root from [::ffff:10.227.111.121]:44961
I20220331 17:53:00.622998 19425 GraphService.cpp:67] Authenticating user root from [::ffff:10.227.111.121]:44960
I20220331 17:53:02.821435 19428 GraphService.cpp:67] Authenticating user root from [::ffff:10.227.111.125]:42508
I20220331 17:53:02.821532 19426 GraphService.cpp:67] Authenticating user root from [::ffff:10.227.111.125]:42504
E20220331 18:44:26.330394 19548 HeaderServerChannel.cpp:100] Received invalid request from client: apache::thrift::transport::TTransportException: Header transport frame is too large: 4294246397 (hex 0xfff4fffd) (transport apache::thrift::PreReceivedDataAsyncTransportWrapper, address ::ffff:10.227.111.101, port 40594)
E20220331 18:44:26.339790 19548 PeekingManager.h:262] peekSuccess failed, dropping connection: apache::thrift::transport::TTransportException: Channel is !good()
E20220331 18:46:04.687847 19486 HeaderServerChannel.cpp:100] Received invalid request from client: apache::thrift::transport::TTransportException: Header transport frame is too large: 4294246397 (hex 0xfff4fffd) (transport apache::thrift::PreReceivedDataAsyncTransportWrapper, address ::ffff:10.227.111.131, port 39136)
E20220331 18:46:04.688063 19486 PeekingManager.h:262] peekSuccess failed, dropping connection: apache::thrift::transport::TTransportException: Channel is !good()