spark-submit yarn-cluster mode: importing Hive data reports that the config file does not exist

Same problem as this post.

To read a config file directly from inside the jar, the full package path has to be written out; my screenshot above gives an example :rofl:
The file path should be written like this:
"nebula-exchange/src/main/resources/application.conf"

That way you can only use the plain data-import feature and cannot import Hive data; you would need to comment out the related settings in the config.

But I still need to import from Hive. Is there another way?

The file is on the driver machine; why can't it be read?

Is your problem really identical to the one in the post you mentioned?
However:

  1. You are not actually using yarn-cluster mode, so normal operation should work.
  2. The spark-submit command is missing --master.
  3. If you want to import data rather than just test whether the source data is readable, don't use the -D parameter; this is explained in the docs: Import command parameters - Nebula Graph Database manual

1. I'm not using it, so why can't the file be read? I saw the earlier post say client mode works; I changed it and tested, but it still doesn't work.

  1. How should --master be written so the file can be read: --master yarn-client or --master yarn-cluster?
  2. That's just for testing; it will be removed before going live.
  1. In the screenshot you posted above, the submit command is missing the --master option. There are two ways to submit to YARN with spark-submit:
    ./bin/spark-submit --master yarn --deploy-mode client
    or
    ./bin/spark-submit --master yarn-client

  2. Compare your situation with the one in the other thread: that poster could not reach all the machines in the YARN cluster. In your case:
    (1) With yarn-client mode, you only need to place the config file on the machine that submits the Spark job; -c application.conf will then pick it up. If it still cannot be read, double-check the path.
    (2) If every machine in your YARN cluster is reachable, yarn-cluster mode works too, provided application.conf is placed at the same location on every YARN node. For the reason, see: Running Spark on YARN — Spark 2.2.x documentation (2.2.1)
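As an alternative to copying the file onto every node, Spark's --files option ships a local file into each container's working directory. A minimal sketch, assuming the Exchange jar and main class that appear later in this thread's logs (the local config path is a placeholder):

```shell
# Sketch: submit Exchange in yarn-cluster mode, staging the local
# application.conf into every container's working directory via --files,
# so a bare relative name can be resolved there.
./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files /path/to/application.conf \
  --class com.vesoft.nebula.tools.importer.Exchange \
  exchange-1.1.0.jar \
  -c application.conf
```

This is not a verified command for Exchange 1.1.0, just the general Spark file-staging pattern; adjust the jar name and paths to your environment.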

Still not found...


The `ls` command shows the file is there.

Error:

application.conf
app.log
app.sql
curl_result
exchange-1.1.0.jar
exchange-1-1.1.0.jar
exchange.jar
hive_application.conf
info
main
main_origin
preCommands.txt
Application Id: application_1608793700198_22265362, Tracking URL: http://bigdata-nmg-hdprm00.nmg01.bigdata.intra.xiaojukeji.com:8088/proxy/application_1608793700198_22265362/
Notice: There is a serious problem  of syntax errors and exceptions in your code. Please check your code and dependencies!
DriverContainer URL:  http://bigdata-nmg-hdp5396.nmg01.diditaxi.com.bigdata.intra.xiaojukeji.com:8042/node/containerlogs/container_e20_1608793700198_22265362_01_000002/xx

21/01/29 12:24:04 ERROR Client: Application diagnostics message: Application application_1608793700198_22265362 failed 2 times due to AM Container for appattempt_1608793700198_22265362_000002 exited with  exitCode: 13
For more detailed output, check application tracking page:http://bigdata-nmg-hdprm00.nmg01.bigdata.intra.xiaojukeji.com:8088/proxy/application_1608793700198_22265362/Then, click on links to logs of each attempt.
Diagnostics: Exception from container-launch.
Container id: container_e20_1608793700198_22265362_02_000001
Exit code: 13
Stack trace: ExitCodeException exitCode=13: 
	at org.apache.hadoop.util.Shell.runCommand(Shell.java:545)
	at org.apache.hadoop.util.Shell.run(Shell.java:456)
	at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:722)
	at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.launchContainer(LinuxContainerExecutor.java:372)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:310)
	at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:85)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

Shell output: main : command provided 1
main : user is yarn
main : requested yarn user is xxx


Container exited with a non-zero exit code 13. Last 4096 bytes of stderr :
ster: Preparing Local resources
21/01/29 12:23:59 INFO ApplicationMaster: ApplicationAttemptId: appattempt_1608793700198_22265362_000002
21/01/29 12:24:00 INFO ApplicationMaster: Starting the user application in a separate Thread
21/01/29 12:24:00 INFO ApplicationMaster: Waiting for spark context initialization...
21/01/29 12:24:00 ERROR ApplicationMaster: User class threw exception: java.lang.IllegalArgumentException: ./application.conf not exist
java.lang.IllegalArgumentException: ./application.conf not exist
	at com.vesoft.nebula.tools.importer.config.Configs$.parse(Configs.scala:182)
	at com.vesoft.nebula.tools.importer.Exchange$.main(Exchange.scala:76)
	at com.vesoft.nebula.tools.importer.Exchange.main(Exchange.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:735)
21/01/29 12:24:00 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: User class threw exception: java.lang.IllegalArgumentException: ./application.conf not exist
	at com.vesoft.nebula.tools.importer.config.Configs$.parse(Configs.scala:182)
	at com.vesoft.nebula.tools.importer.Exchange$.main(Exchange.scala:76)
	at com.vesoft.nebula.tools.importer.Exchange.main(Exchange.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:735)
)
21/01/29 12:24:00 ERROR ApplicationMaster: Uncaught exception: 
org.apache.spark.SparkException: Exception thrown in awaitResult: 
	at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:480)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:308)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply$mcV$sp(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$1.apply(ApplicationMaster.scala:245)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:830)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1923)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:829)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:244)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:854)
	at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: java.lang.IllegalArgumentException: ./application.conf not exist
	at com.vesoft.nebula.tools.importer.config.Configs$.parse(Configs.scala:182)
	at com.vesoft.nebula.tools.importer.Exchange$.main(Exchange.scala:76)
	at com.vesoft.nebula.tools.importer.Exchange.main(Exchange.scala)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:735)
21/01/29 12:24:00 INFO ApplicationMaster: Deleting staging directory hdfs://difed/user/xxx/.sparkStaging/application_1608793700198_22265362


Failing this attempt. Failing the application.
Exception in thread "main" org.apache.spark.SparkException: Application application_1608793700198_22265362 finished with failed status
	at org.apache.spark.deploy.yarn.Client.run(Client.scala:1281)
	at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1678)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:853)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:167)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:195)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:931)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:940)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

[2021-01-29 12:24:04] *************** Run failed [EXIT CODE: 1] ***************

Please post more of the yarn application log so we can see what was actually submitted to YARN.
PS: I can read the file normally under yarn-client. Try submitting with local and see whether the local file can be read.
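The local-mode check suggested above could look like this (a sketch; the jar and main class names are taken from the error logs in this thread, and the config path is a placeholder):

```shell
# Sketch: run Exchange with a local master so that -c resolves purely
# against the local filesystem of the submitting machine, taking YARN
# out of the picture entirely.
./bin/spark-submit \
  --master local[*] \
  --class com.vesoft.nebula.tools.importer.Exchange \
  exchange-1.1.0.jar \
  -c /absolute/path/to/application.conf
```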

Still not working, going crazy... Can I change the source code so it reads the config file directly while still supporting Hive import? How would I change it? I'm using version 1.1.0.

The company cluster has no local mode, so I can't submit that way. Those are all the logs; there are no others.

I don't understand why you would need to change the source code.
What you need to solve is that the file on your own machine cannot be read. Write application.conf as an absolute path and see whether it can be read.
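One way to make the absolute-path check above concrete (a sketch; the path is a placeholder, and the jar and class names come from the logs in this thread):

```shell
# Sketch: first confirm the file really exists at the absolute path on
# the machine running spark-submit, then pass that same path to -c in
# yarn-client mode, where the driver runs on the submitting machine.
CONF=/absolute/path/to/application.conf
ls -l "$CONF"   # if this fails, fix the path before touching Spark
./bin/spark-submit \
  --master yarn --deploy-mode client \
  --class com.vesoft.nebula.tools.importer.Exchange \
  exchange-1.1.0.jar \
  -c "$CONF"
```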

Hello, has your problem been solved? I also hit the conf-file-not-found problem in yarn-cluster mode; both absolute and relative paths report that the file cannot be found.

May I ask whether this person's problem was ever solved? I've run into it too.

See the docs; this is explained there:
https://docs.nebula-graph.com.cn/2.0.1/nebula-exchange/parameter-reference/ex-ug-para-import-command/

OK, thank you. Still, I'd suggest making this part of the documentation more detailed; it feels a bit brief. In production, jobs are generally submitted with yarn-cluster, so more detail would make for a better experience. Just a personal suggestion.

Which part do you mean by "more detailed"? The docs give the command to use. Do you want a detailed explanation of the command itself? The command itself is really Spark/YARN territory; we could add a link in the docs to an introduction to "submitting Spark jobs in yarn-cluster mode". @lzy https://spark.apache.org/docs/2.3.0/running-on-yarn.html

Hello, I definitely don't mean the Spark and YARN configuration parameters; specifically, it's what's shown in the screenshot below:


I don't quite understand these parameters and would like to ask you about them.

Of the options circled above, only -c is Exchange's own parameter, and it is documented.

The other options are all Spark parameters:

--files

--conf spark.driver.extraClassPath and spark.executor.extraClassPath

https://spark.apache.org/docs/latest/configuration.html#runtime-environment
Besides the options mentioned above, that page covers many more configuration options worth knowing.
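Putting those options together, a yarn-cluster submission might look like this (a sketch, not a verified command; the jar name and relative paths are assumptions, adjust to your environment):

```shell
# Sketch: --files stages application.conf alongside the driver and
# executors; extraClassPath of "./" adds each container's working
# directory (where staged files land) to the classpath, so the bare
# file name passed to -c can be found there in yarn-cluster mode.
./bin/spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --files application.conf \
  --conf spark.driver.extraClassPath=./ \
  --conf spark.executor.extraClassPath=./ \
  --class com.vesoft.nebula.tools.importer.Exchange \
  exchange-1.1.0.jar \
  -c application.conf
```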


OK, I'll give it a try shortly. Thanks!

Great, thanks :sunglasses: :grimacing:

When reading CSV, client mode also errored for me with "Expected protocol id ffffff82 but got 0", and yarn-cluster mode reported the same error. I specified an absolute path too, but the file still couldn't be found. Later I put the config file into the corresponding directory on the cluster machines and it ran normally. Could you explain why that is?