- Nebula version: 2.6.0
- Deployment: single host
- Installation: Docker Compose
- Production environment: N
- Hardware: 96C200G30T (96 cores / 200 GB memory / 30 TB disk)
- Spark version: 2.4.0
- Scala version: 2.11.12
- Problem description: importing CSV files on HDFS with Exchange fails with an error
- Error message:
- Import command:

```bash
spark2-submit \
  --name nebula_exchange \
  --deploy-mode cluster \
  --queue v_wdfxb_d \
  --conf spark.yarn.maxAppAttempts=1 \
  --files hive_application.conf \
  --class com.vesoft.nebula.exchange.Exchange \
  nebula-exchange-2.6.0.jar -c hive_application.conf
```
```
{
  spark: {
    app: {
      name: Nebula Exchange 2.6.0
    }
    driver: {
      cores: 1
      maxResultSize: 1G
    }
    executor: {
      memory: 1G
    }
    cores: {
      max: 16
    }
  }

  nebula: {
    address: {
      graph: ["172.18.0.9:9669", "172.18.0.8:9669", "172.18.0.10:9669"]
      meta: ["172.18.0.3:9559", "172.18.0.2:9559", "172.18.0.4:9559"]
    }
    user: root
    pswd: nebula
    space: basketballplayer
    connection: {
      timeout: 30000
    }
    error: {
      max: 1
      output: /tmp/wangjinchao/nebula/exchange/errors
    }
    rate: {
      limit: 1024
      timeout: 1000
    }
  }

  tags: [
    {
      name: player
      type: {
        source: csv
        sink: client
      }
      path: "hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/vertex_player.csv"
      fields: [_c1, _c2]
      nebula.fields: [age, name]
      vertex: {
        field: _c0
      }
      separator: ","
      header: false
      batch: 128
      partition: 32
    }
    {
      name: team
      type: {
        source: csv
        sink: client
      }
      path: "hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/vertex_team.csv"
      fields: [_c1]
      nebula.fields: [name]
      vertex: {
        field: _c0
      }
      separator: ","
      header: false
      batch: 128
      partition: 32
    }
  ]

  edges: [
    {
      name: follow
      type: {
        source: csv
        sink: client
      }
      path: "hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/edge_follow.csv"
      fields: [_c2]
      nebula.fields: [degree]
      source: {
        field: _c0
      }
      target: {
        field: _c1
      }
      separator: ","
      header: false
      batch: 128
      partition: 32
    }
    {
      name: serve
      type: {
        source: csv
        sink: client
      }
      path: "hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/edge_serve.csv"
      fields: [_c2, _c3]
      nebula.fields: [start_year, end_year]
      source: {
        field: _c0
      }
      target: {
        field: _c1
      }
      separator: ","
      header: false
      batch: 128
      partition: 32
    }
  ]
}
```
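Since `header: false` makes Spark assign the default column names (`_c0`, `_c1`, ...), a quick way to sanity-check that `fields` and `vertex.field` line up with the actual files is to read one of them directly. A minimal sketch, assuming the submitting host can reach HDFS and that `spark2-shell` is available alongside `spark2-submit`:

```bash
spark2-shell --master yarn <<'EOF'
// Read one source file exactly the way Exchange will: no header, comma separator.
// The default column names (_c0, _c1, _c2) must match fields/vertex.field above.
val df = spark.read.option("header", "false").option("sep", ",")
  .csv("hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/vertex_player.csv")
df.printSchema()
df.show(3)
EOF
```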
Nebula was set up with docker-compose on a single machine; the three graph addresses should be because there are three graphd containers.
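(A quick way to confirm that is to list the containers; a sketch, with container names depending on your compose project:)

```bash
# Three graphd containers should show up, each publishing port 9669.
docker-compose ps | grep graphd
docker ps --format '{{.Names}}\t{{.Ports}}' | grep graphd
```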
steam (#4)
Back to the question: is your error message maybe incomplete? It looks like an HDFS read problem. I did a quick search on the HDFS error: https://blog.csdn.net/u012037852/article/details/71708925 — see whether the approach in that post works for you.
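A minimal first check, run as the same user that runs `spark2-submit` (paths copied from the config above):

```bash
# Can this user list and read the source files at all?
hdfs dfs -ls hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/
hdfs dfs -cat hdfs://10.234.12.14:8020/tmp/wangjinchao/nebula/vertex_player.csv | head -n 3
```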
Thanks, but that doesn't quite seem to work. After I changed the jar's -c argument to an HDFS path, the job wouldn't even start.
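(Context on that: in cluster mode, YARN localizes whatever is listed in `--files` into each container's working directory, so the usual pattern is to keep `-c` as the bare filename rather than an HDFS URI:)

```bash
# hive_application.conf is shipped by --files and lands in the YARN
# container's working directory, so the bare name resolves there.
spark2-submit ... --files hive_application.conf ... -c hive_application.conf
```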
wey (#6)
The source in my article is also HDFS; you could follow it and set up an HDFS inside your own container to try?

From the error, it points straight to an HDFS login/authentication failure.

Do you mean Kerberos authentication? I had already authenticated before running the submit command.
wey (#10)
Not sure whether this will help you; right now it's Spark failing authentication when accessing HDFS.
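If it is Kerberos, one common way to rule out ticket problems in cluster mode is to pass the principal and keytab to `spark-submit` itself, so the driver running inside YARN can authenticate on its own rather than relying on a ticket obtained on the gateway host. A sketch with placeholder principal/keytab values:

```bash
# First confirm the keytab itself works on the gateway.
kinit -kt /path/to/user.keytab user@EXAMPLE.COM
klist

# Then let Spark carry the credentials to the cluster
# (--principal/--keytab are standard spark-submit options on YARN).
spark2-submit \
  --name nebula_exchange \
  --deploy-mode cluster \
  --queue v_wdfxb_d \
  --conf spark.yarn.maxAppAttempts=1 \
  --principal user@EXAMPLE.COM \
  --keytab /path/to/user.keytab \
  --files hive_application.conf \
  --class com.vesoft.nebula.exchange.Exchange \
  nebula-exchange-2.6.0.jar -c hive_application.conf
```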
system (#11, closed)
This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.