nebula version: 2.0-rc1
Deployment mode: distributed
Hardware:
Disk: 2 TB SSD
CPU / memory: 12 cores, 128 GB RAM
Exchange version: 2.0:
nebula-java-rc1
nebula-exchange-2.0.0.jar
Problem description: after the Spark job is submitted it runs to completion and no errors appear in the logs, but no data ends up in the graph database.
Submit command:
spark-submit --name ${task_name} \
--num-executors ${executors} \
--executor-memory ${memory} \
--executor-cores ${cores} \
--master yarn --deploy-mode cluster \
--queue default \
--files ${config_path} \
--class com.vesoft.nebula.exchange.Exchange nebula-exchange-2.0.0.jar -c ${config_file_name} -h
Tag configuration:
{
name: domain
type: {
source: hive
sink: client
}
exec: "SELECT key, md5(key) as vid, found_time, update_time, data_source as source, str1 as attr, str2 as label from csc.ods_csc_vertex where graph_type='domain' and day='{day}' and hour='{hour}'"
fields: [key, found_time, update_time, source, attr, label]
nebula.fields: [key, found_time, update_time, source, attr, label]
vertex: {
field: vid
}
batch: 256
partition: 16
}
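For reference, with `sink: client` the statements Exchange issues should look roughly like the sketch below. This is a minimal Python illustration of the expected statement shape, not Exchange's actual code; the vid and property values are made up, and it assumes all properties are strings. Running such a statement by hand in the console can help separate "statement generation is wrong" from "data never reached the sink".

```python
# Hypothetical sketch of the nGQL an Exchange client sink would build for one
# batch of the `domain` tag above. Not Exchange source code; sample values are
# invented, and every property is assumed to be a string type.
def insert_stmt(tag, nebula_fields, rows):
    """rows: list of (vid, tuple_of_property_values)."""
    props = ", ".join(nebula_fields)
    values = ", ".join(
        '"%s":(%s)' % (vid, ", ".join('"%s"' % v for v in vals))
        for vid, vals in rows
    )
    return "INSERT VERTEX %s(%s) VALUES %s" % (tag, props, values)

stmt = insert_stmt(
    "domain",
    ["key", "found_time", "update_time", "source", "attr", "label"],
    [("9e107d9d", ("example.com", "2021-03-23 17:00:00",
                   "2021-03-23 17:00:00", "hive", "a1", "l1"))],
)
print(stmt)
```

If a hand-run statement of this shape succeeds in the console while the Exchange run still writes nothing, the issue is more likely on the data/connection side than in the schema.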
Relevant logs:
21/03/26 09:53:00 INFO Exchange$: Processing Tag domain
21/03/26 09:53:00 INFO Exchange$: field keys: key, found_time, update_time, source, attr, label
21/03/26 09:53:00 INFO Exchange$: nebula keys: key, found_time, update_time, source, attr, label
21/03/26 09:53:00 INFO Exchange$: Loading from Hive and exec SELECT key, md5(key) as vid, found_time, update_time, data_source as source, str1 as attr, str2 as label from csc.dwb_csc_vertex where graph_type='domain' and day='23' and hour='17'
21/03/26 09:53:00 INFO SharedState: loading hive config file: file:/xxx/abdi/disks/data2/yarn/local/usercache/yunnaosec/filecache/6481/__spark_conf__.zip/__hadoop_conf__/hive-site.xml
21/03/26 09:53:00 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/warehouse/tablespace/managed/hive').
21/03/26 09:53:00 INFO SharedState: Warehouse path is '/warehouse/tablespace/managed/hive'.
21/03/26 09:53:00 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL.
21/03/26 09:53:00 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1c9e27c0{/SQL,null,AVAILABLE,@Spark}
21/03/26 09:53:00 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/json.
21/03/26 09:53:00 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6af39b4e{/SQL/json,null,AVAILABLE,@Spark}
21/03/26 09:53:00 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution.
21/03/26 09:53:00 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@21f6af31{/SQL/execution,null,AVAILABLE,@Spark}
21/03/26 09:53:00 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
21/03/26 09:53:00 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4755b56{/SQL/execution/json,null,AVAILABLE,@Spark}
21/03/26 09:53:00 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
21/03/26 09:53:00 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2595e781{/static/sql,null,AVAILABLE,@Spark}
21/03/26 09:53:02 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
21/03/26 09:53:04 INFO HiveUtils: Initializing HiveMetastoreConnection version 3.0 using file:/usr/hdp/current/spark2-client/standalone-metastore/standalone-metastore-1.21.2.3.1.0.0-78-hive3.jar:file:/usr/hdp/current/spark2-client/standalone-metastore/hive-ud-2.2.0.jar:file:/xxx/abdi/disks/data1/yarn/local/usercache/yunnaosec/appcache/application_1616364960915_0390/container_e217_1616364960915_0390_01_000001/__hive_libs__/standalone-metastore-1.21.2.3.1.0.0-78-hive3.jar:file:/usr/hdp/current/spark2-client/standalone-metastore/standalone-metastore-1.21.2.3.1.0.0-78-hive3.jar:file:/usr/hdp/current/spark2-client/standalone-metastore/hive-ud-2.2.0.jar
21/03/26 09:53:04 INFO HiveConf: Found configuration file file:/usr/hdp/current/spark2-client/conf/hive-site.xml
Hive Session ID = 599d0e97-68a2-4e52-8a22-90736eb8e58f
21/03/26 09:53:05 INFO SessionState: Hive Session ID = 599d0e97-68a2-4e52-8a22-90736eb8e58f
21/03/26 09:53:05 INFO SessionState: Created local directory: /xxx/abdi/disks/data1/yarn/local/usercache/yunnaosec/appcache/application_1616364960915_0390/container_e217_1616364960915_0390_01_000001/tmp/yunnaosec
21/03/26 09:53:05 INFO SessionState: Created HDFS directory: /tmp/spark/yunnaosec/599d0e97-68a2-4e52-8a22-90736eb8e58f
21/03/26 09:53:05 INFO SessionState: Created local directory: /xx/abdi/disks/data1/yarn/local/usercache/yunnaosec/appcache/application_1616364960915_0390/container_e217_1616364960915_0390_01_000001/tmp/yunnaosec/599d0e97-68a2-4e52-8a22-90736eb8e58f
21/03/26 09:53:05 INFO SessionState: Created HDFS directory: /tmp/spark/yunnaosec/599d0e97-68a2-4e52-8a22-90736eb8e58f/_tmp_space.db
21/03/26 09:53:05 INFO HiveClientImpl: Warehouse location for Hive client (version 3.0.0) is /warehouse/tablespace/managed/hive
21/03/26 09:53:07 INFO HiveMetaStoreClient
21/03/26 09:53:48 INFO TaskSetManager: Finished task 9.0 in stage 1.0 (TID 9) in 15070 ms on xxxhadoop-datanode-1.novalocal (executor 3) (14/16)
21/03/26 09:53:48 INFO TaskSetManager: Finished task 6.0 in stage 1.0 (TID 6) in 15313 ms on xxxhadoop-datanode-1.novalocal (executor 3) (15/16)
21/03/26 09:53:48 INFO TaskSetManager: Finished task 3.0 in stage 1.0 (TID 3) in 15325 ms on xxxhadoop-datanode-1.novalocal (executor 3) (16/16)
21/03/26 09:53:48 INFO YarnClusterScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
21/03/26 09:53:48 INFO DAGScheduler: ResultStage 1 (foreachPartition at VerticesProcessor.scala:243) finished in 16.087 s
21/03/26 09:53:49 INFO DAGScheduler: Job 0 finished: foreachPartition at VerticesProcessor.scala:243, took 16.670279 s
21/03/26 09:53:49 INFO Exchange$: batchSuccess.domain: 0
21/03/26 09:53:49 INFO Exchange$: batchFailure.domain: 0
21/03/26 09:53:49 INFO Exchange$: Processing Tag url
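Note that the run ends with both `batchSuccess.domain: 0` and `batchFailure.domain: 0`, i.e. zero batches were even attempted. Besides checking the executor logs (the driver log above only covers the driver side), it is worth confirming on the server what was actually written. A query sketch for nebula-console follows; the space name `csc` is an assumption:

```
-- space name is assumed; replace with the actual space
USE csc;
SUBMIT JOB STATS;   -- collect per-tag/per-edge counts
SHOW STATS;         -- should list a vertex count for the tag `domain`
```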