Nebula Exchange 2.0 shows no errors, but data is not imported from Hive into the graph database

  • Nebula version: 2.0
  • Deployment mode: distributed
  • Hardware:
    • Disk: 2 TB SSD
    • CPU / memory: 12 cores, 128 GB RAM
  • Problem description: after the Spark job is submitted it runs normally and no errors appear in the log, but no data is written into the graph database.

Job submission command: spark-submit --class com.vesoft.nebula.exchange.Exchange --master yarn --deploy-mode client nebula-exchange-2.0.0.jar -c application-hive-vertexs.conf -h

Configuration:

  # Processing tags
  # There are tag config examples for different dataSources.
  tags: [
    {
      name: ip
      type: {
        source: hive
        sink: client
      }
      exec: "SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09"
      fields: [sip]
      nebula.fields: [ip]
      vertex: {
        field: sip
        # policy: "hash"
      }
      batch: 256
      partition: 32
    }

    {
      name: ip
      type: {
        source: hive
        sink: client
      }
      exec: "SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09"
      fields: [dip]
      nebula.fields: [ip]
      vertex: {
        field: dip
        # policy: "hash"
      }
      batch: 256
      partition: 32
    }
  ]

  # Processing edges
  # There are edge config examples for different dataSources.
  edges: [

  ]

Confirmed in the Hive table: both the sip and dip fields are of type string.

Excerpt from the logs:

21/03/04 14:53:52 INFO Configs$: DataBase Config com.vesoft.nebula.exchange.config.DataBaseConfigEntry@5e054d22
21/03/04 14:53:52 INFO Configs$: User Config com.vesoft.nebula.exchange.config.UserConfigEntry@3161b833
21/03/04 14:53:52 INFO Configs$: Connection Config Some(Config(SimpleConfigObject({"retry":3,"timeout":3000})))
21/03/04 14:53:52 INFO Configs$: Execution Config com.vesoft.nebula.exchange.config.ExecutionConfigEntry@7f9c3944
21/03/04 14:53:52 INFO Configs$: Source Config Hive source exec: SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:53:52 INFO Configs$: Sink Config Hive source exec: SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:53:52 INFO Configs$: name ip  batch 256
21/03/04 14:53:52 INFO Configs$: Tag Config: Tag name: ip, source: Hive source exec: SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09, sink: Nebula sink addresses: [xxx.81:9669, xxx.82:9669, xxx.83:9669], vertex field: sip, vertex policy: None, batch: 256, partition: 32.
21/03/04 14:53:52 INFO Configs$: Source Config Hive source exec: SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:53:52 INFO Configs$: Sink Config Hive source exec: SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:53:52 INFO Configs$: name ip  batch 256
21/03/04 14:53:52 INFO Configs$: Tag Config: Tag name: ip, source: Hive source exec: SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09, sink: Nebula sink addresses: [xxx.81:9669, xxx.82:9669, xxx.83:9669], vertex field: dip, vertex policy: None, batch: 256, partition: 32.
21/03/04 14:53:52 INFO Exchange$: Config Configs(com.vesoft.nebula.exchange.config.DataBaseConfigEntry@5e054d22,com.vesoft.nebula.exchange.config.UserConfigEntry@3161b833,com.vesoft.nebula.exchange.config.ConnectionConfigEntry@c419f174,com.vesoft.nebula.exchange.config.ExecutionConfigEntry@7f9c3944,com.vesoft.nebula.exchange.config.ErrorConfigEntry@55508fa6,com.vesoft.nebula.exchange.config.RateConfigEntry@fc4543af,,List(Tag name: ip, source: Hive source exec: SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09, sink: Nebula sink addresses: [xxx.81:9669, xxx.82:9669, xxx.83:9669], vertex field: sip, vertex policy: None, batch: 256, partition: 32., Tag name: ip, source: Hive source exec: SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09, sink: Nebula sink addresses: [xxx.81:9669, xxx.82:9669, xxx.83:9669], vertex field: dip, vertex policy: None, batch: 256, partition: 32.),List(),Some(HiveConfigEntry:{waredir=hdfs://mycluster/apps/hive/warehouse, connectionURL=jdbc:hive2://sz-abdi1-zookeeper-1.novalocal:2181,sz-abdi1-zookeeper-2.novalocal:2181,sz-abdi1-zookeeper-3.novalocal:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;principal=hive/sz-abdi1-hive-1.novalocal@EXAMPLE.COM, connectionDriverName=com.mysql.jdbc.Driver, connectionUserName=hive, connectionPassWord=hive}))
21/03/04 14:53:53 INFO SparkContext: Running Spark version 2.3.2.3.1.0.0-78
21/03/04 14:53:53 INFO SparkContext: Submitted application: Nebula Exchange 2.0
21/03/04 14:53:53 INFO SecurityManager: Changing view acls to: root,ambari-qa
21/03/04 14:53:53 INFO SecurityManager: Changing modify acls to: root,ambari-qa
21/03/04 14:53:53 INFO SecurityManager: Changing view acls groups to:
21/03/04 14:53:53 INFO SecurityManager: Changing modify acls groups to:
21/03/04 14:53:53 INFO SecurityManager: SecurityManager: authentication disabled; ui acls enabled; users  with view permissions: Set(root, ambari-qa); groups with view permissions: Set(); users  with modify permissions: Set(root, ambari-qa); groups with modify permissions: Set()
21/03/04 14:53:54 INFO Utils: Successfully started service 'sparkDriver' on port 36788.

21/03/04 14:54:27 INFO YarnClientSchedulerBackend: Application application_1613965514658_0077 has started running.
21/03/04 14:54:27 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33776.
21/03/04 14:54:27 INFO NettyBlockTransferService: Server created on sz-abdi1-hadoop-datanode-1.novalocal:33776
21/03/04 14:54:27 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/03/04 14:54:27 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, sz-abdi1-hadoop-datanode-1.novalocal, 33776, None)
21/03/04 14:54:27 INFO BlockManagerMasterEndpoint: Registering block manager sz-abdi1-hadoop-datanode-1.novalocal:33776 with 366.3 MB RAM, BlockManagerId(driver, sz-abdi1-hadoop-datanode-1.novalocal, 33776, None)
21/03/04 14:54:27 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, sz-abdi1-hadoop-datanode-1.novalocal, 33776, None)
21/03/04 14:54:27 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, sz-abdi1-hadoop-datanode-1.novalocal, 33776, None)
21/03/04 14:54:28 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /metrics/json.
21/03/04 14:54:28 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5d601832{/metrics/json,null,AVAILABLE,@Spark}
21/03/04 14:54:28 INFO EventLoggingListener: Logging events to hdfs:/spark2-history/application_1613965514658_0077
21/03/04 14:54:28 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000(ms)
21/03/04 14:54:28 INFO Exchange$: Processing Tag ip
21/03/04 14:54:28 INFO Exchange$: field keys: sip
21/03/04 14:54:28 INFO Exchange$: nebula keys: ip
21/03/04 14:54:28 INFO Exchange$: Loading from Hive and exec SELECT sip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:54:28 INFO SharedState: loading hive config file: file:/etc/spark2/3.1.0.0-78/0/hive-site.xml
21/03/04 14:54:28 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('/warehouse/tablespace/managed/hive').


21/03/04 14:57:15 INFO YarnScheduler: Removed TaskSet 1.0, whose tasks have all completed, from pool
21/03/04 14:57:15 INFO DAGScheduler: ResultStage 1 (foreachPartition at VerticesProcessor.scala:243) finished in 131.451 s
21/03/04 14:57:16 INFO DAGScheduler: Job 0 finished: foreachPartition at VerticesProcessor.scala:243, took 146.968890 s
21/03/04 14:57:16 INFO Exchange$: batchSuccess.ip: 0
21/03/04 14:57:16 INFO Exchange$: batchFailure.ip: 1376
21/03/04 14:57:16 INFO Exchange$: Processing Tag ip
21/03/04 14:57:16 INFO Exchange$: field keys: dip
21/03/04 14:57:16 INFO Exchange$: nebula keys: ip
21/03/04 14:57:16 INFO Exchange$: Loading from Hive and exec SELECT dip from sip.ods_sip_dns where  day=20210127 and hour=09
21/03/04 14:57:16 INFO HiveMetaStoreClient: Closed a connection to metastore, current connections: 1
21/03/04 14:57:18 INFO FileSourceStrategy: Pruning directories with: isnotnull(day#176),isnotnull(hour#177),(cast(day#176 as int) = 20210127),(cast(hour#177 as int) = 9)
21/03/04 14:57:18 INFO FileSourceStrategy: Post-Scan Filters:
21/03/04 14:57:18 INFO FileSourceStrategy: Output Data Schema: struct<dip: string>
21/03/04 14:57:18 INFO FileSourceScanExec: Pushed Filters:
21/03/04 14:57:18 INFO PrunedInMemoryFileIndex: Selected 2 partitions out of 2, pruned 0.0% partitions.
21/03/04 14:57:18 INFO MemoryStore: Block broadcast_3 stored as values in memory (estimated size 412.6 KB, free 365.4 MB)
21/03/04 14:57:18 INFO MemoryStore: Block broadcast_3_piece0 stored as bytes in memory (estimated size 35.7 KB, free 365.4 MB)
21/03/04 14:57:18 INFO BlockManagerInfo: Added broadcast_3_piece0 in memory on sz-abdi1-hadoop-datanode-1.novalocal:33776 (size: 35.7 KB, free: 366.2 MB)
21/03/04 14:57:19 INFO SparkContext: Created broadcast 3 from foreachPartition at VerticesProcessor.scala:243
21/03/04 14:57:19 INFO FileSourceScanExec: Planning scan with bin packing, max size: 91534365 bytes, open cost is considered as scanning 4194304 bytes.
21/03/04 14:57:19 INFO SparkContext: Starting job: foreachPartition at VerticesProcessor.scala:243
21/03/04 14:57:19 INFO DAGScheduler: Registering RDD 11 (foreachPartition at VerticesProcessor.scala:243)
21/03/04 14:57:19 INFO DAGScheduler: Got job 1 (foreachPartition at VerticesProcessor.scala:243) with 32 output partitions
21/03/04 14:57:19 INFO DAGScheduler: Final stage: ResultStage 3 (foreachPartition at VerticesProcessor.scala:243)
21/03/04 14:57:19 INFO DAGScheduler: Parents of final stage: List(ShuffleMapStage 2)
21/03/04 14:57:19 INFO DAGScheduler: Missing parents: List(ShuffleMapStage 2)
21/03/04 14:57:19 INFO DAGScheduler: Submitting ShuffleMapStage 2 (MapPartitionsRDD[11] at foreachPartition at VerticesProcessor.scala:243), which has no missing parents
21/03/04 14:57:19 INFO MemoryStore: Block broadcast_4 stored as values in memory (estimated size 12.6 KB, free 365.4 MB)
21/03/04 14:57:19 INFO MemoryStore: Block broadcast_4_piece0 stored as bytes in memory (estimated size 5.7 KB, free 365.4 MB)
21/03/04 14:57:19 INFO BlockManagerInfo: Added broadcast_4_piece0 in memory on sz-abdi1-hadoop-datanode-1.novalocal:33776 (size: 5.7 KB, free: 366.2 MB)
21/03/04 14:57:19 INFO SparkContext: Created broadcast 4 from broadcast at DAGScheduler.scala:1039
21/03/04 14:57:19 INFO DAGScheduler: Submitting 2 missing tasks from ShuffleMapStage 2 (MapPartitionsRDD[11] at foreachPartition at VerticesProcessor.scala:243) (first 15 tasks are for partitions Vector(0, 1))

So all of the imports failed.
You submitted in YARN mode, so the log you pasted is from the driver. The import operations themselves are logged on the executors, which is why you don't see any error messages here. Check the executor logs in the YARN web UI on port 18088 (YARN's default port).

Also check whether there are any files under the /tmp/errors directory; those files contain the statements that failed to import. Please paste them here.
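For reference, the files under that directory hold the nGQL statements that Exchange failed to execute, so you can inspect or re-run them directly. A hypothetical entry for this tag (the VID value below is made up purely to show the shape of such a statement) would look roughly like:

  INSERT VERTEX ip(ip) VALUES "192.168.0.1":("192.168.0.1");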

OK, I've found the cause.
One issue was that I didn't know how to use YARN, so I never looked at those logs.
I then checked the graphd logs and found that the vertex VID length is limited; the limit has to be specified when the space is created. The 2.0 English documentation explains this in more detail.
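For anyone who runs into the same thing: the VID length is fixed when the space is created. A minimal sketch, assuming a space named ip_graph and a 64-character VID limit (the name, partition_num, replica_factor, and length are all placeholders; choose values that fit your own data):

  CREATE SPACE ip_graph (partition_num = 10, replica_factor = 1, vid_type = FIXED_STRING(64));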

Could someone tell me how to view the executor logs?

yarn logs -applicationId your_applicationId
