Nebula Graph 2.0.0 and Spark Exchange 2.0.0 report an error when importing Hive data, causing the import to fail

Nebula version: v2.0.1
Exchange version: 2.0.0

Configuration

{
  spark: {
    app: { name: Spark Writer }
    driver: { cores: 16 maxResultSize: 16G }
    cores { max: 16 }
  }
  nebula: {
    address: {
      graph: ["10.133.98.252:9669", "10.133.98.251:9669", "10.131.54.188:9669"]
      meta: ["10.133.98.252:9559", "10.133.98.251:9559", "10.131.54.188:9559"]
    }
    user: user
    pswd: password
    space: knowledge_graph_v3
    connection { timeout: 10000 retry: 3 }
    execution { retry: 3 }
    error: { max: 32 output: /tmp/errors }
    rate: { limit: 1024 timeout: 1000 }
  }
  {
    name: Thing
    type: { source: hive sink: client }
    exec: "select thing_id, thing_name, thing_title, thing_namech, thing_nameen, thing_abbreviation, thing_tag, thing_alias, thing_abstract, thing_image, thing_video, thing_audio, thing_gmtcreated, thing_gmtmodified, thing_popularity, thing_prior, thing_datasource, thing_urls, thing_class, thing_imagejson from oppo_kg_dw.dwd_kg_update_spo_thing_di_v1_59 where ds = '20210507'"
    fields: [thing_name, thing_title, thing_namech, thing_nameen, thing_abbreviation, thing_tag, thing_alias, thing_abstract, thing_image, thing_video, thing_audio, thing_gmtcreated, thing_gmtmodified, thing_popularity, thing_prior, thing_datasource, thing_urls, thing_class, thing_imagejson]
    nebula.fields: [Thing_name, Thing_title, Thing_nameCh, Thing_nameEn, Thing_abbreviation, Thing_tag, Thing_alias, Thing_abstract, Thing_image, Thing_video, Thing_audio, Thing_gmtCreated, Thing_gmtModified, Thing_popularity, Thing_prior, Thing_dataSource, Thing_urls, Thing_class, Thing_imageJson]
    vertex: thing_id
    isImplicit: true
    batch: 128
    partition: 48
  }

Error message

21/05/08 18:01:00 INFO AuthRpdClient: paths is ArrayBuffer(/hive/oppo_kg_dw.db/dwd_kg_update_spo_organisation_di_v1_59)
21/05/08 18:01:00 INFO AuthRpdClient: data is List(AuthorizeResultEntity{rpd='hive://net_kg:user@china1/hive/oppo_kg_dw.db/dwd_kg_update_spo_organisation_di_v1_59?option=select', result=false, msg='失败', hitCache=false}); adminData is ArrayBuffer(AuthorizeResultEntity{rpd='hive://net_kg:user@china1/hive/oppo_kg_dw.db/dwd_kg_update_spo_organisation_di_v1_59?option=admin', result=true, msg='成功', hitCache=false}); succeedPaths is ArrayBuffer(/hive/oppo_kg_dw.db/dwd_kg_update_spo_organisation_di_v1_59)
21/05/08 18:01:00 INFO AuthRpdClient: data indices is Range(0); authorizeResultEntity is AuthorizeResultEntity{rpd='hive://net_kg:user@china1/hive/oppo_kg_dw.db/dwd_kg_update_spo_organisation_di_v1_59?option=select', result=false, msg='失败', hitCache=false}
21/05/08 18:01:01 ERROR ApplicationMaster: User class threw exception: com.facebook.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
com.facebook.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
	at com.facebook.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
	at com.facebook.thrift.transport.TTransport.readAll(TTransport.java:84)
	at com.facebook.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:609)
	at com.facebook.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:473)
	at com.vesoft.nebula.meta.MetaService$Client.recv_getSpace(MetaService.java:515)
	at com.vesoft.nebula.meta.MetaService$Client.getSpace(MetaService.java:492)
	at com.vesoft.nebula.client.meta.MetaClient.getSpace(MetaClient.java:131)
	at com.vesoft.nebula.client.meta.MetaClient.getTags(MetaClient.java:144)
	at com.vesoft.nebula.exchange.MetaProvider.getLabelType(MetaProvider.scala:69)
	at com.vesoft.nebula.exchange.utils.NebulaUtils$.getDataSourceFieldType(NebulaUtils.scala:30)
	at com.vesoft.nebula.exchange.processor.VerticesProcessor.process(VerticesProcessor.scala:110)
	at com.vesoft.nebula.exchange.Exchange$$anonfun$main$2.apply(Exchange.scala:145)
	at com.vesoft.nebula.exchange.Exchange$$anonfun$main$2.apply(Exchange.scala:122)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at com.vesoft.nebula.exchange.Exchange$.main(Exchange.scala:122)
	at com.vesoft.nebula.exchange.Exchange.main(Exchange.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at com.facebook.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	... 20 more
21/05/08 18:01:01 INFO ApplicationMaster: Final app status: FAILED, exitCode: 15, (reason: User class threw exception: com.facebook.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
	at com.facebook.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
	at com.facebook.thrift.transport.TTransport.readAll(TTransport.java:84)
	at com.facebook.thrift.protocol.TCompactProtocol.readByte(TCompactProtocol.java:609)
	at com.facebook.thrift.protocol.TCompactProtocol.readMessageBegin(TCompactProtocol.java:473)
	at com.vesoft.nebula.meta.MetaService$Client.recv_getSpace(MetaService.java:515)
	at com.vesoft.nebula.meta.MetaService$Client.getSpace(MetaService.java:492)
	at com.vesoft.nebula.client.meta.MetaClient.getSpace(MetaClient.java:131)
	at com.vesoft.nebula.client.meta.MetaClient.getTags(MetaClient.java:144)
	at com.vesoft.nebula.exchange.MetaProvider.getLabelType(MetaProvider.scala:69)
	at com.vesoft.nebula.exchange.utils.NebulaUtils$.getDataSourceFieldType(NebulaUtils.scala:30)
	at com.vesoft.nebula.exchange.processor.VerticesProcessor.process(VerticesProcessor.scala:110)
	at com.vesoft.nebula.exchange.Exchange$$anonfun$main$2.apply(Exchange.scala:145)
	at com.vesoft.nebula.exchange.Exchange$$anonfun$main$2.apply(Exchange.scala:122)
	at scala.collection.immutable.List.foreach(List.scala:392)
	at com.vesoft.nebula.exchange.Exchange$.main(Exchange.scala:122)
	at com.vesoft.nebula.exchange.Exchange.main(Exchange.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:678)
Caused by: java.net.SocketTimeoutException: Read timed out
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
	at java.net.SocketInputStream.read(SocketInputStream.java:171)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
	at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
	at com.facebook.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
	... 20 more
)

The import ran for over an hour, then the error above appeared and the import finally failed. I can see a Hive data-return failure being reported, as well as the Nebula Graph error "User class threw exception: com.facebook.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out". But as far as I can tell the Nebula services are all running normally, and Hive queries still return data.

The operation timed out; the default timeout is 1 s. Were you performing any other operations during the import?
Each of your rows has 20 fields, so you can reduce the batch and partition values somewhat to ease the load on the server.
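For example, a tag block with a lighter write load might look roughly like the sketch below; the values 64 and 32 are purely illustrative, with the rest of the block assumed unchanged:

  {
    name: Thing
    type: { source: hive sink: client }
    # fewer rows per INSERT statement sent to the graph service
    batch: 64
    # fewer Spark partitions writing to Nebula concurrently
    partition: 32
  }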

I reduced batch and partition, but it still reports the error above. Here is the log file; could you help take a look at what the problem is?

The MetaClient request timed out again; the MetaClient's default timeout is 1000 ms. In principle, merely fetching the space information from the meta service shouldn't time out that easily. How many spaces do you have in the cluster? Please post the following information:

  1. The result of show spaces
  2. The Nebula meta service logs.

Only 4 spaces.

Check the INFO log of the Nebula meta service.

You can first try this PR and increase the value of nebula.connection.timeout in the config file:
https://github.com/vesoft-inc/nebula-spark-utils/pull/79
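Assuming that PR lets the connection timeout from the config file apply to the MetaClient as well, the relevant section would look roughly like this (100000 ms is only an example value):

  nebula: {
    connection {
      # Thrift read timeout in ms; without this the MetaClient falls back
      # to the 1000 ms default mentioned above
      timeout: 100000
      retry: 3
    }
  }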

# Nebula Graph space name

space: knowledge_graph_test2

# Thrift timeout (ms) and number of retries
# If not set, the defaults are 3000 and 3 respectively
connection {
  timeout: 100000
  retry: 10
}

# Number of retries for nGQL queries
# If not set, the default is 3
execution {
  retry: 10
}
error: {
  max: 32
  output: /tmp/errors
}
rate: {
  limit: 512
  timeout: 100000
}

}

I increased all of them, and also tried making them smaller, but it still reports the same error as above.

It still doesn't work after setting it.

It looks like your disk can't keep up with the writes... even the Raft log writes are stalling.
Is it an HDD?

No. This is currently the test environment, which uses ordinary disks; it hasn't gone to production yet, and production uses SSDs. So you're saying the current failure is caused by disk I/O being too high? Then I'll test in the production environment, since it has SSDs.

OK, give it a try with SSD.