Which Nebula version supports SST file import?

Our use case requires importing a large amount of data. We tried Exchange, but the import was too slow. Which version supports SST file import? I've checked the docs and the forum, but couldn't find clear operating steps or a specific nebula version.

Which NebulaGraph version are you on? 2.0 has not been fully tested, so we don't recommend it yet. 1.2 does support SST import.

We were using 2.0 GA before, and the download command reported an error.

Now we're considering 1.2.0. How does this work? Do we need to write code to implement it?

Or would this document work? 离线数据加载 - 加载 .sst 文件 - 《Nebula Graph v1.0.0-rc2 图数据库文档》 - 书栈网 · BookStack

Did the commands above execute successfully on the storage nodes?

1.2.0 works. The required steps and configuration are as follows (a minimal sketch follows the list):

  1. First generate the SST files with Exchange.
  2. Make sure hadoop is available in the environment where the nebula services run.
  3. Add --ws_storage_http_port to the meta config file; its value must equal the --ws_http_port=xxx value in the storage config file.
  4. Add --ws_meta_http_port to the graph config file; its value must equal the --ws_http_port=xxx value in the meta config file.
  5. In the console, run download hdfs "hdfs://ip:port/path" and then ingest.
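
To make steps 3 to 5 concrete, here is a minimal sketch. The ports (11000 for meta, 12000 for storage), the space name, and the HDFS address are placeholder assumptions; substitute your own values:

    # nebula-metad.conf: value must equal --ws_http_port in nebula-storaged.conf
    --ws_storage_http_port=12000

    # nebula-graphd.conf: value must equal --ws_http_port in nebula-metad.conf
    --ws_meta_http_port=11000

    # in the nebula console, with the target space selected:
    nebula> USE your_space;
    nebula> DOWNLOAD HDFS "hdfs://namenode_ip:9000/path/to/sst";
    nebula> INGEST;

DOWNLOAD fetches the SST files from HDFS onto the storage nodes, and INGEST then loads them into RocksDB, after which the data becomes queryable.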

@RandomJoe The docs need an update.

On nebula-1.2.0, the download command has never succeeded. Is something wrong with my configuration?


[screenshot: nebula-metad.WARNING log]
[screenshot: HADOOP_HOME environment variable]

nebula-meta.conf

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-metad.pid

########## logging ##########
# The directory to host logging files, which must already exist
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4; the higher the level, the more verbose the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0

########## networking ##########
# Meta Server Address
--meta_server_addrs=10.48.20.222:45500
# Local ip
--local_ip=10.48.20.222
# Meta daemon listening port
--port=45500
# HTTP service ip
--ws_ip=10.48.20.222
# HTTP service port
--ws_http_port=11000
# HTTP2 service port
--ws_h2_port=11002

--heartbeat_interval_secs=10

########## storage ##########
# Root data path, here should be only single path for metad
--data_path=data/meta
--ws_storage_http_port=12000

nebula-graph.conf

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-graphd.pid

########## logging ##########
# The directory to host logging files, which must already exist
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4; the higher the level, the more verbose the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0
# Whether to redirect stdout and stderr to separate output files
--redirect_stdout=true
# Destination filename of stdout and stderr, which will also reside in log_dir.
--stdout_log_file=stdout.log
--stderr_log_file=stderr.log
# Copy log messages at or above this level to stderr in addition to logfiles. The numbers of severity levels INFO, WARNING, ERROR, and FATAL are 0, 1, 2, and 3, respectively.
--stderrthreshold=2

########## networking ##########
# Meta Server Address
--meta_server_addrs=10.48.20.222:45500
# Local ip
--local_ip=10.48.20.222
# Network device to listen on
--listen_netdev=any
# Port to listen on
--port=3699
# To turn on SO_REUSEPORT or not
--reuse_port=false
# Backlog of the listen socket, adjust this together with net.core.somaxconn
--listen_backlog=1024
# Seconds before the idle connections are closed, 0 for never closed
--client_idle_timeout_secs=0
# Seconds before the idle sessions are expired, 0 for no expiration
--session_idle_timeout_secs=0
# The number of threads to accept incoming connections
--num_accept_threads=1
# The number of networking IO threads, 0 for # of CPU cores
--num_netio_threads=0
# The number of threads to execute user queries, 0 for # of CPU cores
--num_worker_threads=0
# HTTP service ip
--ws_ip=10.48.20.222
# HTTP service port
--ws_http_port=13000
--ws_graph_http_port=13000
# HTTP2 service port
--ws_h2_port=13002
# The default charset when a space is created
--default_charset=utf8
# The default collate when a space is created
--default_collate=utf8_bin

########## authorization ##########
# Enable authorization
--enable_authorize=false

########## Authentication ##########
# User login authentication type, password for nebula authentication, ldap for ldap authentication, cloud for cloud authentication
--auth_type=password

nebula-storage.conf

########## basics ##########
# Whether to run as a daemon process
--daemonize=true
# The file to host the process id
--pid_file=pids/nebula-storaged.pid

########## logging ##########
# The directory to host logging files, which must already exist
--log_dir=logs
# Log level, 0, 1, 2, 3 for INFO, WARNING, ERROR, FATAL respectively
--minloglevel=0
# Verbose log level, 1, 2, 3, 4; the higher the level, the more verbose the logging
--v=0
# Maximum seconds to buffer the log messages
--logbufsecs=0

########## networking ##########
# Meta server address
--meta_server_addrs=10.48.20.222:45500
# Local ip
--local_ip=10.48.20.222
# Storage daemon listening port
--port=44500
# HTTP service ip
--ws_ip=10.48.20.222
# HTTP service port
--ws_http_port=12000
# HTTP2 service port
--ws_h2_port=12002
# heartbeat with meta service
--heartbeat_interval_secs=10

######### Raft #########
# Raft election timeout
--raft_heartbeat_interval_secs=30
# RPC timeout for raft client (ms)
--raft_rpc_timeout_ms=500
## recycle Raft WAL
--wal_ttl=3600

########## Disk ##########
# Root data path. Split by comma. e.g. --data_path=/disk1/path1/,/disk2/path2/
# One path per Rocksdb instance.
--data_path=data/storage

############## Rocksdb Options ##############
# The default reserved bytes for one batch operation
--rocksdb_batch_size=4096

# The default block cache size used in BlockBasedTable. (MB)
# recommend: 1/3 of all memory
--rocksdb_block_cache=4

# Compression algorithm, options: no,snappy,lz4,lz4hc,zlib,bzip2,zstd
# For the sake of binary compatibility, the default value is snappy.
# Recommend to use:
#   * lz4 to gain more CPU performance, with the same compression ratio with snappy
#   * zstd to occupy less disk space
#   * lz4hc for the read-heavy write-light scenario
--rocksdb_compression=snappy

# Set different compressions for different levels
# For example, if --rocksdb_compression is snappy,
# "no:no:lz4:lz4::zstd" is identical to "no:no:lz4:lz4:snappy:zstd:snappy"
# In order to disable compression for level 0/1, set it to "no:no"
--rocksdb_compression_per_level=

# Whether or not to enable rocksdb's statistics, disabled by default
--enable_rocksdb_statistics=false

# Stats level used by rocksdb to collect statistics, optional values are
#   * kExceptHistogramOrTimers, disable timer stats, and skip histogram stats
#   * kExceptTimers, Skip timer stats
#   * kExceptDetailedTimers, Collect all stats except time inside mutex lock AND time spent on compression.
#   * kExceptTimeForMutex, Collect all stats except the counters requiring to get time inside the mutex lock.
#   * kAll, Collect all stats
--rocksdb_stats_level=kExceptHistogramOrTimers

# Whether or not to enable rocksdb's prefix bloom filter, disabled by default.
--enable_rocksdb_prefix_filtering=false
# Whether or not to enable the whole key filtering.
--enable_rocksdb_whole_key_filtering=true
# The prefix length for each key to use as the filter value.
# It can be 12 bytes (PartitionId + VertexID) or 16 bytes (PartitionId + VertexID + TagID/EdgeType).
--rocksdb_filtering_prefix_length=12

############## rocksdb Options ##############
# rocksdb DBOptions in json, each name and value of option is a string, given as "option_name":"option_value" separated by comma
--rocksdb_db_options={"max_subcompactions":"1","max_background_jobs":"1"}
# rocksdb ColumnFamilyOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_column_family_options={"disable_auto_compactions":"false","write_buffer_size":"67108864","max_write_buffer_number":"4","max_bytes_for_level_base":"268435456"}
# rocksdb BlockBasedTableOptions in json, each name and value of option is string, given as "option_name":"option_value" separated by comma
--rocksdb_block_based_table_options={"block_size":"8192"}

############# edge samplings ##############
# --enable_reservoir_sampling=false
# --max_edge_returned_per_vertex=2147483647

Your error message says the hadoop command cannot be found. Did you set HADOOP_HOME?
P.S. I've updated the Nebula configuration notes above: the graph config file needs --ws_meta_http_port=11000.
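
A quick sanity check, run from the shell of the account that launches nebula (the expected results are indicative only):

    echo $HADOOP_HOME    # should print the hadoop install directory
    which hadoop         # should resolve to $HADOOP_HOME/bin/hadoop
    hadoop version       # should run without a "command not found" error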

Both HADOOP_HOME and PATH were set. It turned out to be a sudo problem: starting nebula with sudo resets the environment variables. Starting as the root user directly does not have this problem.
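
For reference, sudo strips most environment variables by default (the env_reset option in sudoers), which is why HADOOP_HOME disappeared. A sketch of how to observe and work around this; the nebula.service script path is assumed from a standard 1.x install:

    sudo env | grep HADOOP_HOME                   # usually prints nothing: the variable was stripped
    sudo -E scripts/nebula.service start all      # -E preserves the caller's environment, if sudoers permits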

Got it.