Version: Nebula Exchange 2.0, Nebula Graph 2.0.1, Spark 2.4
OS: Ubuntu 18.04
I wrote neo4j_application.conf following the documentation and ran the command:
${SPARK_HOME}/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange /root/nebula-spark-utils/nebula-exchange/target/nebula-exchange-2.0.0.jar -c /root/nebula-spark-utils/nebula-exchange/target/classes/neo4j_application.conf
It failed with the following error:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.neo4j.driver.exceptions.ClientException: Database name parameter for selecting database is not supported in Bolt Protocol Version 3. Database name: 'graph.db'
I went back to the documentation, which says that Exchange uses Neo4j Driver 4.0.1 to read data from Neo4j. I could not find a "Neo4j Driver" on the official Neo4j site, only neo4j-spark-connector. Could you confirm whether that is the right one? If not, please give the download location of Neo4j Driver 4.0.1.
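For context, the "Neo4j Driver 4.0.1" in the documentation refers to the Neo4j Java Driver rather than neo4j-spark-connector; its Maven coordinates are presumably org.neo4j.driver:neo4j-java-driver:4.0.1. The sketch below is only an illustration (not Exchange's actual code) of how a 4.x driver selects a database when it opens a session; a Neo4j 3.x server only speaks Bolt protocol v3 and rejects that parameter with exactly the ClientException shown above. The connection values mirror the config further down in this thread and are assumptions.

import org.neo4j.driver.{AuthTokens, GraphDatabase, SessionConfig}

object Neo4jReadSketch {
  def main(args: Array[String]): Unit = {
    // Connection details copied from the neo4j_application.conf below (assumed values).
    val driver = GraphDatabase.driver("bolt://127.0.0.1:7687",
      AuthTokens.basic("neo4j", "111111"))
    // Selecting a named database requires Bolt protocol v4, i.e. a Neo4j 4.x server.
    // Against a Neo4j 3.x server (Bolt v3) the driver throws the ClientException
    // "Database name parameter for selecting database is not supported in Bolt Protocol Version 3."
    val session = driver.session(SessionConfig.forDatabase("graph.db"))
    try {
      val result = session.run("MATCH (n:Disease) RETURN count(n) AS cnt")
      println(result.single().get("cnt").asLong())
    } finally {
      session.close()
      driver.close()
    }
  }
}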
wey
Jun 22, 2021, 03:28
#3
{
  name: Disease
  type: {
    source: neo4j
    sink: client
  }
  server: "bolt://127.0.0.1:7687"
  user: neo4j
  password: 111111
  database: graph.db
  exec: "match (n:Disease) with id(n) as _id, n.prevent as prevent, n.cureWay as cure_way, n.name as name, n.cure_lasttime as cure_last_time, n.cure_prob as cure_prob, n.cause as cause, n.cureDepartment as cure_department,n.desc as desc, n.easy_get as easy_get return _id, prevent, cure_way, name, cure_last_time, cure_prob, cause, cure_department, desc, easy_get order by _id"
  fields: [prevent,cureWay,name,cure_lasttime,cure_prob,cause,cureDepartment,desc,easy_get]
  nebula.fields: [prevent,cure_way,name,cure_last_time,cure_prob,cause,cure_department,desc,easy_get]
  vertex: {
    field: _id
  }
  partition: 10
  batch: 1000
  check_point_path: /tmp/medical
}
...
dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=:7687
sataliulan:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.neo4j.driver.exceptions.ClientException: Database name parameter for selecting database is not supported in Bolt Protocol Version 3. Database name: 'graph.db'
Judging from this error message, if the connection and the database itself are fine, it looks like a version incompatibility between the Neo4j server and the client.
I have only installed the Neo4j server; is the current Nebula Exchange incompatible with it?
The Neo4j client version used by Exchange is 4.0.1; the incompatibility is between this 4.0.1 client and the server you installed.
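As a quick way to confirm the mismatch (an assumption on my part, not something from this thread): print the server version on the Neo4j host; a 3.x server only speaks Bolt protocol v3, which matches the error above.

# Run on the Neo4j host; the install path is an assumption and may differ.
$NEO4J_HOME/bin/neo4j version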
Hi, I have now successfully imported the data into Nebula Graph and created the corresponding indexes, but when I run a query from the console it prints: Storage Error: part: 10, error: E_RPC_FAILURE(-3).
I don't understand what this means; please help.
PS: my query is: match p=(a:BasePrescription)-[r]->(b:Disease) return p limit 1
nicole
Jun 22, 2021, 11:18
#10
cc @yee, please take a look at the E_RPC_FAILURE issue.
@sataliulan, you may need to paste the graphd and storaged logs.
Here are the latest logs. 192.168.10.17 is the address I used to replace the original 127.0.0.1 in the config files.
(base) hcl@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs$ tail graphd-stderr.log
E0622 17:48:59.516737 4853 QueryInstance.cpp:103] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
E0623 08:54:53.574343 4753 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 08:54:53.574565 4753 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 08:54:53.574651 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 10
E0623 08:54:53.574672 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 8
E0623 08:54:53.574688 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 6
E0623 08:54:53.574698 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 4
E0623 08:54:53.574707 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 2
E0623 08:54:53.574734 4853 StorageAccessExecutor.h:112] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
E0623 08:54:53.574774 4853 QueryInstance.cpp:103] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
(base) hcl@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs$
nebula-graphd.conf configuration:
########## networking ##########
# Comma separated Meta Server Addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-graphd process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=127.0.0.1
# Network device to listen on
--listen_netdev=any
# Port to listen on
--port=9669
# To turn on SO_REUSEPORT or not
--reuse_port=false
# Backlog of the listen socket, adjust this together with net.core.somaxconn
--listen_backlog=1024
# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=0
# Seconds before the idle sessions are expired, 0 for no expiration
--session_idle_timeout_secs=60000
Below is the nebula-storaged.conf configuration:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.l0.17
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10
Below is the nebula-metad.conf configuration:
########## networking ##########
# Comma separated Meta Server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-metad process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.17
# Meta daemon listening port
--port=9559
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19559
# HTTP2 service port
--ws_h2_port=19560
Below is the graphd.conf configuration on the other node:
########## networking ##########
# Comma separated Meta Server Addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-graphd process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Network device to listen on
--listen_netdev=any
# Port to listen on
--port=9669
# To turn on SO_REUSEPORT or not
--reuse_port=false
# Backlog of the listen socket, adjust this together with net.core.somaxconn
--listen_backlog=1024
# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=0
# Seconds before the idle sessions are expired, 0 for no expiration
--session_idle_timeout_secs=0
# The number of threads to accept incoming connections
--num_accept_threads=1
# The number of networking IO threads, 0 for # of CPU cores
--num_netio_threads=0
# The number of threads to execute user queries, 0 for # of CPU cores
--num_worker_threads=0
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19669
# HTTP2 service port
--ws_h2_port=19670
Below is the nebula-storaged.conf configuration on the other node:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10
Below is the nebula-metad.conf configuration on the other node:
########## networking ##########
# Comma separated Meta Server addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-metad process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Meta daemon listening port
--port=9559
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19559
# HTTP2 service port
--ws_h2_port=19560
wey
Jun 23, 2021, 09:17
#13
The direct cause of this error code is that the RPC connection from the storage client on graphd was broken. For a query that is close to a full scan, there are two likely causes:
1. The connection timed out.
2. storaged could not keep the connection alive (crash, OOM).
For 1, you can increase storage_client_timeout_ms (see the sketch after the reference link below).
For 2, if you deployed with containers, check the storaged container's events and uptime (docker ps) to see whether it crashed; for a binary deployment, check dmesg | grep nebula on the OS where storaged runs.
If it is case 2 and you cannot scale up the hardware, try to optimize the query. If the statement cannot be changed (to save memory), post it here and we will take a look; that statement (e.g. a match) may have room for optimization in nebula core, and you can open an issue (we are actively optimizing match queries).
Reference documentation:
https://docs.nebula-graph.com.cn/2.0.1/2.quick-start/0.FAQ/#storage_error_e_rpc_failure
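A minimal sketch of the first remedy, assuming a binary installation under /usr/local/nebula (the path and the timeout value are only illustrative):

# In nebula-graphd.conf, raise the storage client RPC timeout (milliseconds):
--storage_client_timeout_ms=120000
# Restart graphd so the new flag takes effect:
/usr/local/nebula/scripts/nebula.service restart graphd
# For possibility 2 on a binary deployment, look for crashes/OOM kills of storaged:
dmesg | grep nebula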
I found in the logs that when configuring the IP address I had accidentally typed 192.168.10.17 as 192.168.l0.17. I have corrected it, but after restarting both nodes the same query still produces the same error in the log, even though I have double-checked all the config files and they are now correct. Is there a way to clear the old cache so the database picks up the new IP address?
The query I ran: fetch prop on Disease 1121 yield Disease.name
The error log is as follows:
E0623 18:20:06.761965 28924 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 18:20:06.762097 28924 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 18:20:06.762162 28945 StorageAccessExecutor.h:35] GetVerticesExecutor failed, error E_RPC_FAILURE, part 2
E0623 18:20:06.762178 28945 StorageAccessExecutor.h:112] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:06.762197 28945 QueryInstance.cpp:103] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:07.173538 28925 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 18:20:07.173787 28925 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 18:20:07.173828 28943 StorageAccessExecutor.h:35] GetVerticesExecutor failed, error E_RPC_FAILURE, part 2
E0623 18:20:07.173848 28943 StorageAccessExecutor.h:112] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:07.173868 28944 QueryInstance.cpp:103] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
root@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs#
Note: I have already changed 192.168.l0.17 to 192.168.10.17 in the config files and restarted the servers, but it still does not work.
This is the updated nebula-storaged configuration:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.17
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10
########## Raft ##########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=50000
@wey I added --local_config=true at the top of the config file and started it with ./scripts/nebula.service -c etc/nebula-storaged.conf start storaged, but it had no effect; the graphd service log still shows the old, incorrect configuration.
yee
Jun 24, 2021, 03:17
#17
After adding --local_config=true to the config file, you no longer need to specify the conf file in the start script:
./scripts/nebula.service start storaged
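For completeness, a hedged sketch of the full sequence, assuming the default install path and that --local_config=true has been added to the graphd, storaged, and metad conf files on both nodes so that each daemon reads its local configuration file at startup:

cd /usr/local/nebula
./scripts/nebula.service stop all
./scripts/nebula.service start all
./scripts/nebula.service status all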