Driver problem when importing Neo4j data into Nebula Graph

Versions: Nebula Exchange 2.0, Nebula Graph 2.0.1, Spark 2.4
OS: Ubuntu 18.04
I wrote neo4j_application.conf following the documentation and ran:

${SPARK_HOME}/bin/spark-submit --master "local" --class com.vesoft.nebula.exchange.Exchange /root/nebula-spark-utils/nebula-exchange/target/nebula-exchange-2.0.0.jar -c /root/nebula-spark-utils/nebula-exchange/target/classes/neo4j_application.conf

It fails with the following error:

Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.neo4j.driver.exceptions.ClientException: Database name parameter for selecting database is not supported in Bolt Protocol Version 3. Database name: 'graph.db'

Going back to the documentation, it says: Exchange uses Neo4j Driver 4.0.1 to read data from Neo4j. I searched the Neo4j official site but could not find anything called "Neo4j Driver"; I did find neo4j-spark-connector. Is that what is meant? Please confirm; if not, please provide a link to Neo4j Driver 4.0.1.

  1. The driver mentioned refers to the neo4j-spark-connector.
  2. Please confirm whether the current dbms.active_database really is graph.db.
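For reference, in Neo4j 3.x the active database is set in neo4j.conf (the path below assumes a default package install; Neo4j 4.x dropped this setting in favor of multi-database support):

```
# /etc/neo4j/neo4j.conf (path assumed)
# Neo4j 3.x only: the database the server serves by default
dbms.active_database=graph.db
```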

More information from the original poster:

    {
      name: Disease
      type: {
        source: neo4j
        sink: client
      }
      server: "bolt://127.0.0.1:7687"
      user: neo4j
      password:111111
      database:graph.db
      exec: "match (n:Disease) with id(n) as _id, n.prevent as prevent, n.cureWay as cure_way, n.name as name, n.cure_lasttime as cure_last_time, n.cure_prob as cure_prob, n.cause as cause, n.cureDepartment as cure_department,n.desc as desc, n.easy_get as easy_get return _id, prevent, cure_way, name, cure_last_time, cure_prob, cause, cure_department, desc, easy_get order by _id"
      fields: [prevent,cureWay,name,cure_lasttime,cure_prob,cause,cureDepartment,desc,easy_get]
      nebula.fields: [prevent,cure_way,name,cure_last_time,cure_prob,cause,cure_department,desc,easy_get]
      vertex: {
        field:_id
      }
      partition: 10
      batch: 1000
      check_point_path: /tmp/medical
   }


...

dbms.connector.bolt.enabled=true
#dbms.connector.bolt.tls_level=OPTIONAL
dbms.connector.bolt.listen_address=:7687

This is the current Neo4j connection status:
(screenshot of the Neo4j connection omitted)

Judging from the error message, if the connection and the database are fine, this looks like a version incompatibility between the Neo4j server and client.

I have only installed the Neo4j server. Is the current Nebula Exchange incompatible with it?

The Neo4j client version used by Exchange is 4.0.1; "incompatible" refers to this 4.0.1 client versus the server you installed.
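Since the error says Bolt Protocol Version 3 cannot select a database by name, a possible workaround on a Neo4j 3.x server, assuming Exchange treats `database` as optional, is to omit that line and let the driver use the server's default database:

```
      server: "bolt://127.0.0.1:7687"
      user: neo4j
      password:111111
      # database:graph.db  # omit on Neo4j 3.x; Bolt v3 cannot select a database by name
```

This is a sketch, not a confirmed fix; upgrading the Neo4j server to 4.x would also remove the protocol mismatch.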

Oh, I see.

Hi, I have now successfully imported the data into Nebula Graph and created the corresponding indexes, but a query from the console returns: Storage Error: part: 10, error: E_RPC_FAILURE(-3).
I don't understand this; please help.
PS: my query is: match p=(a:BasePrescription)-[r]->(b:Disease) return p limit 1

cc @yee, please take a look at the E_RPC_FAILURE issue.
@sataliulan you may need to post the graphd and storaged logs.

Here is the latest log. 192.168.10.17 is the address I changed to from the original 127.0.0.1 in the files:
(base) hcl@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs$ tail graphd-stderr.log
E0622 17:48:59.516737 4853 QueryInstance.cpp:103] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
E0623 08:54:53.574343 4753 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 08:54:53.574565 4753 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 08:54:53.574651 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 10
E0623 08:54:53.574672 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 8
E0623 08:54:53.574688 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 6
E0623 08:54:53.574698 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 4
E0623 08:54:53.574707 4853 StorageAccessExecutor.h:35] IndexScanExecutor failed, error E_RPC_FAILURE, part 2
E0623 08:54:53.574734 4853 StorageAccessExecutor.h:112] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
E0623 08:54:53.574774 4853 QueryInstance.cpp:103] Storage Error: part: 10, error: E_RPC_FAILURE(-3).
(base) hcl@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs$

Configuration of nebula-graphd.conf:
########## networking ##########
# Comma separated Meta Server Addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-graphd process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=127.0.0.1
# Network device to listen on
--listen_netdev=any
# Port to listen on
--port=9669
# To turn on SO_REUSEPORT or not
--reuse_port=false
# Backlog of the listen socket, adjust this together with net.core.somaxconn
--listen_backlog=1024
# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=0
# Seconds before the idle sessions are expired, 0 for no expiration
--session_idle_timeout_secs=60000
Below is the nebula-storaged.conf configuration:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.l0.17
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10

Below is the nebula-metad.conf configuration:
########## networking ##########
# Comma separated Meta Server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-metad process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.17
# Meta daemon listening port
--port=9559
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19559
# HTTP2 service port
--ws_h2_port=19560
Below is the graphd.conf configuration of the other node:
########## networking ##########
# Comma separated Meta Server Addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-graphd process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Network device to listen on
--listen_netdev=any
# Port to listen on
--port=9669
# To turn on SO_REUSEPORT or not
--reuse_port=false
# Backlog of the listen socket, adjust this together with net.core.somaxconn
--listen_backlog=1024
# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=0
# Seconds before the idle sessions are expired, 0 for no expiration
--session_idle_timeout_secs=0
# The number of threads to accept incoming connections
--num_accept_threads=1
# The number of networking IO threads, 0 for # of CPU cores
--num_netio_threads=0
# The number of threads to execute user queries, 0 for # of CPU cores
--num_worker_threads=0
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19669
# HTTP2 service port
--ws_h2_port=19670
Below is the nebula-storaged.conf configuration of the other node:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10

Below is the nebula-metad.conf configuration of the other node:
########## networking ##########
# Comma separated Meta Server addresses
--meta_server_addrs=192.168.10.17:9559
# Local IP used to identify the nebula-metad process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.211
# Meta daemon listening port
--port=9559
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19559
# HTTP2 service port
--ws_h2_port=19560

The direct cause of this error code is that the RPC connection of the storage client on graphd was broken. For a query that is close to a full scan, there are two likely causes:

  1. The connection timed out
  2. storaged could not keep the connection alive (crash, OOM)

For 1, you can increase storage_client_timeout_ms.
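For reference, a sketch of raising that timeout in nebula-graphd.conf (the flag takes milliseconds; the value below is illustrative, and graphd must be restarted for it to take effect):

```
# nebula-graphd.conf -- illustrative value, tune to your workload
--storage_client_timeout_ms=120000
```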

For 2: if deployed in containers, check the storaged container's events and uptime (docker ps) to see whether it has crashed; if deployed from binaries, run dmesg | grep nebula on the OS where storaged lives.

If it is 2 and you cannot scale up the hardware, try optimizing the query. If the statement cannot be changed (to save memory), post it here; there may be room for optimization in nebula core for that statement (e.g. match), and you can file an issue (we are actively optimizing match queries).

Reference documentation:
https://docs.nebula-graph.com.cn/2.0.1/2.quick-start/0.FAQ/#storage_error_e_rpc_failure

In the logs I found that when configuring the IP address I accidentally typed 192.168.10.17 as 192.168.l0.17. It has been corrected, but after restarting both nodes the same query still produces the same error in the log, even though I have re-checked every configuration file and they are all correct now. Is there a way to clear the old cache so that the database recognizes the new IP address?

Query executed: fetch prop on Disease 1121 yield Disease.name
Error log:
E0623 18:20:06.761965 28924 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 18:20:06.762097 28924 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 18:20:06.762162 28945 StorageAccessExecutor.h:35] GetVerticesExecutor failed, error E_RPC_FAILURE, part 2
E0623 18:20:06.762178 28945 StorageAccessExecutor.h:112] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:06.762197 28945 QueryInstance.cpp:103] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:07.173538 28925 ThriftClientManager.inl:65] Failed to resolve address for '192.168.l0.17': Name or service not known (error=-2): Unknown error -2
E0623 18:20:07.173787 28925 StorageClientBase.inl:214] Request to "192.168.l0.17":9779 failed: N6apache6thrift9transport19TTransportExceptionE: Channel is !good()
E0623 18:20:07.173828 28943 StorageAccessExecutor.h:35] GetVerticesExecutor failed, error E_RPC_FAILURE, part 2
E0623 18:20:07.173848 28943 StorageAccessExecutor.h:112] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
E0623 18:20:07.173868 28944 QueryInstance.cpp:103] Storage Error: part: 2, error: E_RPC_FAILURE(-3).
root@hcl-OMEN-by-HP-Desktop-873-0xxx:/usr/local/nebula/logs#
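A side note on the root cause visible in the resolver lines above: 192.168.l0.17 (a lowercase letter l instead of the digit 1) is not a valid numeric IP, so the Thrift client treats it as a hostname and attempts a DNS lookup, which fails with "Name or service not known". A minimal Python sketch of the difference:

```python
import socket

def resolvable(host: str, port: int = 9779) -> bool:
    """Return True if `host` is a numeric address or resolves via DNS."""
    try:
        socket.getaddrinfo(host, port)
        return True
    except socket.gaierror:  # "Name or service not known", as in the graphd log
        return False

print(resolvable("192.168.10.17"))  # True: a numeric dotted quad needs no DNS lookup
print(resolvable("192.168.l0.17"))  # the lowercase 'l' makes this a hostname, so resolution fails
```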

PS: I have changed 192.168.l0.17 to 192.168.10.17 in the configuration files and restarted the servers, but it still does not work.
This is the updated nebula-storaged configuration:
########## networking ##########
# Comma separated Meta server addresses
--meta_server_addrs=192.168.10.17:9559,192.168.10.211:9559
# Local IP used to identify the nebula-storaged process.
# Change it to an address other than loopback if the service is distributed or
# will be accessed remotely.
--local_ip=192.168.10.17
# Storage daemon listening port
--port=9779
# HTTP service ip
--ws_ip=0.0.0.0
# HTTP service port
--ws_http_port=19779
# HTTP2 service port
--ws_h2_port=19780
# heartbeat with meta service
--heartbeat_interval_secs=10

######### Raft #########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=50000

@wey I added --local_config=true at the top of the configuration file and started with ./scripts/nebula.service -c etc/nebula-storaged.conf start storaged,
but it has no effect; the graphd service log still shows the old, incorrect address.

After adding --local_config=true to the configuration file, you no longer need to specify the conf in the startup script:

./scripts/nebula.service start storaged
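One more thing worth checking: in Nebula 2.x, the address each storaged registers under is kept by the meta service (reported via heartbeat), which is why a stale address can outlive a config fix. From the console you can inspect what meta currently has recorded:

```
nebula> SHOW HOSTS;
```

If the bad address still appears there, the storaged that registered it has not yet re-registered under the corrected local_ip.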

That didn't work either. I have now uninstalled everything and am reinstalling.