Spark import fails with "Read timed out"

  • NebulaGraph version: 3.4.0
  • Deployment: distributed cluster
  • Installation: built from source
  • In production: N
  • Hardware
    • Disk: HDD
  • Description of the problem
  • Relevant meta / storage / graph logs:
storaged-stderr:
Could not create logging file: Too many open files
COULD NOT CREATE A LOGGINGFILE 20230508-162211.22063!Could not create logging file: Too many open files
(the same "COULD NOT CREATE A LOGGINGFILE … Too many open files" pair repeats at 16:22:27 and 16:22:43)
COULD NOT CREATE A LOGGINGFILE 20230508-162300.22063![warn] epoll_create: Too many open files
[warn] evutil_make_internal_pipe_: pipe: Too many open files
[err] evsig_init_: socketpair: Too many open files
terminate called without an active exception
ExceptionHandler::GenerateDump sys_pipe failed:Too many open files
ExceptionHandler::SendContinueSignalToChild sys_write failed:Bad file descriptor
ExceptionHandler::WaitForContinueSignal sys_read failed:Bad file descriptor
*** Aborted at 1683534186 (Unix time, try 'date -d @1683534186') ***
*** Signal 6 (SIGABRT) (0x562f) received by PID 22063 (pthread TID 0x7f25dc7fb700) (linux TID 27768) (maybe from PID 22063, UID 0) (code: -6), stack trace: ***
(error retrieving stack trace)
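The stderr above shows storaged dying of file-descriptor exhaustion: it cannot open log files, pipes, or epoll instances, and finally aborts. A quick way to confirm and fix the limit on Linux (a sketch; the process name `nebula-storaged` and the limit value 130000 are assumptions to adapt to your deployment):

```shell
# Soft and hard open-file limits for the current shell.
soft=$(ulimit -Sn)
hard=$(ulimit -Hn)
echo "shell fd limits: soft=$soft hard=$hard"

# For the running daemon, check its effective limit and current usage:
#   pid=$(pgrep -f nebula-storaged)
#   grep 'Max open files' /proc/$pid/limits
#   ls /proc/$pid/fd | wc -l     # descriptors currently open
#
# A persistent raise goes in /etc/security/limits.conf, e.g.:
#   *  soft  nofile  130000
#   *  hard  nofile  130000
# then re-login and restart the storaged service so it picks up the new limit.
```

If the per-process count from `/proc/$pid/fd` is close to the "Max open files" value during the Spark import, raising `nofile` (and restarting storaged) is the first thing to try.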

storaged:
Log file created at: 2023/05/08 16:19:56
Running on machine: IIE240
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
I20230508 16:19:56.877272 22063 StorageDaemon.cpp:132] localhost = "nebula05":29779
I20230508 16:19:56.877770 22063 StorageDaemon.cpp:147] data path= /usr/local/nebula-graph/data/storage
I20230508 16:19:56.889794 22063 MetaClient.cpp:80] Create meta client to "nebula02":29559
I20230508 16:19:56.889833 22063 MetaClient.cpp:81] root path: /usr/local/nebula-graph, data path size: 1
I20230508 16:19:56.889883 22063 FileBasedClusterIdMan.cpp:53] Get clusterId: 540877272555852365
I20230508 16:19:56.901106 22105 ThriftClientManager-inl.h:67] resolve "nebula01":29559 as "10.26.24.59":29559
I20230508 16:19:57.906033 22105 ThriftClientManager-inl.h:67] resolve "nebula02":29559 as "10.26.24.60":29559
(the same resolve of "nebula02":29559 as "10.26.24.60":29559 repeats on 15 more client threads, 22130–22144)
I20230508 16:19:57.943451 22063 MetaClient.cpp:3259] Load leader of "nebula01":29779 in 3 space
I20230508 16:19:57.943495 22063 MetaClient.cpp:3259] Load leader of "nebula02":29779 in 3 space
I20230508 16:19:57.943507 22063 MetaClient.cpp:3259] Load leader of "nebula03":29779 in 3 space
I20230508 16:19:57.943529 22063 MetaClient.cpp:3259] Load leader of "nebula04":29779 in 3 space
I20230508 16:19:57.943539 22063 MetaClient.cpp:3259] Load leader of "nebula05":29779 in 3 space
I20230508 16:19:57.943548 22063 MetaClient.cpp:3265] Load leader ok
I20230508 16:19:57.946210 22063 MetaClient.cpp:162] Register time task for heartbeat!
I20230508 16:19:57.946269 22063 StorageServer.cpp:244] Init schema manager
I20230508 16:19:57.946282 22063 StorageServer.cpp:247] Init index manager
I20230508 16:19:57.946291 22063 StorageServer.cpp:250] Init kvstore
I20230508 16:19:57.946355 22063 NebulaStore.cpp:52] Start the raft service...
I20230508 16:19:57.947340 22063 NebulaSnapshotManager.cpp:25] Send snapshot is rate limited to 10485760 for each part by default
I20230508 16:19:57.961162 22063 RaftexService.cpp:46] Start raft service on 29780
I20230508 16:19:57.961284 22063 NebulaStore.cpp:86] Scan the local path, and init the spaces_
I20230508 16:19:57.961408 22063 NebulaStore.cpp:93] Scan path "/usr/local/nebula-graph/data/storage/nebula/0"
I20230508 16:19:57.961428 22063 NebulaStore.cpp:93] Scan path "/usr/local/nebula-graph/data/storage/nebula/48"
I20230508 16:19:57.961889 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_bytes_for_level_base=268435456
I20230508 16:19:57.961926 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_write_buffer_number=4
I20230508 16:19:57.961937 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option write_buffer_size=67108864
I20230508 16:19:57.962373 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option block_size=8192
I20230508 16:19:58.461879 22063 RocksEngine.cpp:107] open rocksdb on /home/Aiqiang/nebula-graph/data/storage/nebula/48/data
I20230508 16:19:58.461967 22063 NebulaStore.cpp:197] Load space 48 from disk
I20230508 16:19:58.461985 22063 NebulaStore.cpp:206] Need to open 2 parts of space 48
I20230508 16:19:58.570394 22146 NebulaStore.cpp:229] Load part 48, 3 from disk
I20230508 16:19:58.570777 22147 NebulaStore.cpp:229] Load part 48, 8 from disk
I20230508 16:19:58.570840 22063 NebulaStore.cpp:263] Load space 48 complete
I20230508 16:19:58.570870 22063 NebulaStore.cpp:93] Scan path "/usr/local/nebula-graph/data/storage/nebula/5"
I20230508 16:19:58.570977 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_bytes_for_level_base=268435456
I20230508 16:19:58.570993 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_write_buffer_number=4
I20230508 16:19:58.571002 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option write_buffer_size=67108864
I20230508 16:19:58.571214 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option block_size=8192
I20230508 16:19:59.033828 22063 RocksEngine.cpp:107] open rocksdb on /usr/local/nebula-graph/data/storage/nebula/5/data
I20230508 16:19:59.033946 22063 NebulaStore.cpp:197] Load space 5 from disk
I20230508 16:19:59.033964 22063 NebulaStore.cpp:206] Need to open 2 parts of space 5
I20230508 16:19:59.036461 22147 NebulaStore.cpp:229] Load part 5, 8 from disk
I20230508 16:19:59.140286 22146 NebulaStore.cpp:229] Load part 5, 3 from disk
I20230508 16:19:59.140388 22063 NebulaStore.cpp:263] Load space 5 complete
I20230508 16:19:59.140421 22063 NebulaStore.cpp:93] Scan path "/usr/local/nebula-graph/data/storage/nebula/57"
I20230508 16:19:59.140547 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_bytes_for_level_base=268435456
I20230508 16:19:59.140563 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_write_buffer_number=4
I20230508 16:19:59.140573 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option write_buffer_size=67108864
I20230508 16:19:59.140791 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option block_size=8192
I20230508 16:19:59.396637 22063 RocksEngine.cpp:107] open rocksdb on /usr/local/nebula-graph/data/storage/nebula/57/data
I20230508 16:19:59.396706 22063 NebulaStore.cpp:197] Load space 57 from disk
I20230508 16:19:59.396723 22063 NebulaStore.cpp:206] Need to open 2 parts of space 57
I20230508 16:19:59.397532 22149 NebulaStore.cpp:229] Load part 57, 3 from disk
I20230508 16:19:59.397608 22146 NebulaStore.cpp:229] Load part 57, 8 from disk
I20230508 16:19:59.397689 22063 NebulaStore.cpp:263] Load space 57 complete
I20230508 16:19:59.397734 22063 NebulaStore.cpp:272] Init data from partManager for "nebula05":29779
I20230508 16:19:59.397768 22063 NebulaStore.cpp:370] Data space 5 has existed!
I20230508 16:19:59.397800 22063 NebulaStore.cpp:422] [Space: 5, Part: 3] has existed!
I20230508 16:19:59.397819 22063 NebulaStore.cpp:422] [Space: 5, Part: 8] has existed!
I20230508 16:19:59.397830 22063 NebulaStore.cpp:370] Data space 48 has existed!
I20230508 16:19:59.397841 22063 NebulaStore.cpp:422] [Space: 48, Part: 3] has existed!
I20230508 16:19:59.397850 22063 NebulaStore.cpp:422] [Space: 48, Part: 8] has existed!
I20230508 16:19:59.397859 22063 NebulaStore.cpp:370] Data space 57 has existed!
I20230508 16:19:59.397871 22063 NebulaStore.cpp:422] [Space: 57, Part: 3] has existed!
I20230508 16:19:59.397881 22063 NebulaStore.cpp:422] [Space: 57, Part: 8] has existed!
I20230508 16:19:59.397908 22063 NebulaStore.cpp:79] Register handler...
I20230508 16:19:59.397920 22063 StorageServer.cpp:253] Init LogMonitor
I20230508 16:19:59.398008 22063 StorageServer.cpp:120] Starting Storage HTTP Service
I20230508 16:19:59.398212 22063 StorageServer.cpp:124] Http Thread Pool started
I20230508 16:19:59.401068 22223 WebService.cpp:124] Web service started on HTTP[19779]
I20230508 16:19:59.401239 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_bytes_for_level_base=268435456
I20230508 16:19:59.401274 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option max_write_buffer_number=4
I20230508 16:19:59.401285 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option write_buffer_size=67108864
I20230508 16:19:59.401528 22063 RocksEngineConfig.cpp:371] Emplace rocksdb option block_size=8192
I20230508 16:19:59.574129 22152 EventListener.h:21] Rocksdb start compaction column family: default because of LevelL0FilesNum, status: OK, compacted 5 files into 0, base level is 0, output level is 1
I20230508 16:19:59.574173 22063 RocksEngine.cpp:107] open rocksdb on /usr/local/nebula-graph/data/storage/nebula/0/data
I20230508 16:19:59.574342 22063 AdminTaskManager.cpp:22] max concurrent subtasks: 10
I20230508 16:19:59.574474 22063 AdminTaskManager.cpp:40] exit AdminTaskManager::init()
I20230508 16:19:59.574535 22244 AdminTaskManager.cpp:224] waiting for incoming task
I20230508 16:19:59.574752 22245 AdminTaskManager.cpp:92] reportTaskFinish(), job=63, task=2, rc=E_TASK_EXECUTION_FAILED
I20230508 16:19:59.577837 22245 AdminTaskManager.cpp:134] reportTaskFinish(), job=63, task=2, rc=SUCCEEDED
I20230508 16:19:59.585227 22063 MemoryUtils.cpp:171] MemoryTracker set static ratio: 0.8
I20230508 16:19:59.690968 22152 EventListener.h:35] Rocksdb compaction completed column family: default because of LevelL0FilesNum, status: OK, compacted 5 files into 1, base level is 0, output level is 1
I20230508 16:20:10.325309 22100 AdminTask.cpp:21] createAdminTask (64, 2)
I20230508 16:20:10.325495 22100 AdminTaskManager.cpp:155] enqueue task(64, 2)
I20230508 16:20:10.325584 22244 AdminTaskManager.cpp:236] dequeue task(64, 2)
I20230508 16:20:10.325759 22244 AdminTaskManager.cpp:279] run task(64, 2), 2 subtasks in 2 thread
I20230508 16:20:10.326395 22719 StatsTask.cpp:110] Start stats task
I20230508 16:20:10.326712 22719 StatsTask.cpp:307] Stats task finished
I20230508 16:20:10.326750 22719 AdminTaskManager.cpp:315] subtask of task(64, 2) finished, unfinished task 1
I20230508 16:20:10.326828 22244 AdminTaskManager.cpp:224] waiting for incoming task
I20230508 16:20:10.326884 22720 StatsTask.cpp:110] Start stats task
I20230508 16:20:10.331351 22720 StatsTask.cpp:307] Stats task finished
I20230508 16:20:10.331390 22720 AdminTaskManager.cpp:315] subtask of task(64, 2) finished, unfinished task 0
I20230508 16:20:10.331406 22720 StatsTask.cpp:312] task(64, 2) finished, rc=[SUCCEEDED]
I20230508 16:20:10.331638 22245 AdminTaskManager.cpp:92] reportTaskFinish(), job=64, task=2, rc=SUCCEEDED
I20230508 16:20:10.333593 22245 AdminTaskManager.cpp:134] reportTaskFinish(), job=64, task=2, rc=SUCCEEDED
I20230508 16:20:10.337368 22719 AdminTaskManager.cpp:326] task(64, 2) runSubTask() exit
I20230508 16:20:13.852545 22100 AdminTask.cpp:21] createAdminTask (65, 2)
I20230508 16:20:13.852650 22100 AdminTaskManager.cpp:155] enqueue task(65, 2)
I20230508 16:20:13.852660 22244 AdminTaskManager.cpp:236] dequeue task(65, 2)
I20230508 16:20:13.852762 22244 AdminTaskManager.cpp:279] run task(65, 2), 2 subtasks in 2 thread
I20230508 16:20:13.853390 22721 StatsTask.cpp:110] Start stats task
I20230508 16:20:13.853572 22721 StatsTask.cpp:307] Stats task finished
I20230508 16:20:13.853601 22721 AdminTaskManager.cpp:315] subtask of task(65, 2) finished, unfinished task 1
I20230508 16:20:13.853683 22244 AdminTaskManager.cpp:224] waiting for incoming task
I20230508 16:20:13.853789 22830 StatsTask.cpp:110] Start stats task
I20230508 16:20:13.856384 22830 StatsTask.cpp:307] Stats task finished
I20230508 16:20:13.856418 22830 AdminTaskManager.cpp:315] subtask of task(65, 2) finished, unfinished task 0
I20230508 16:20:13.856434 22830 StatsTask.cpp:312] task(65, 2) finished, rc=[SUCCEEDED]
I20230508 16:20:13.856639 22245 AdminTaskManager.cpp:92] reportTaskFinish(), job=65, task=2, rc=SUCCEEDED
I20230508 16:20:13.859304 22245 AdminTaskManager.cpp:134] reportTaskFinish(), job=65, task=2, rc=SUCCEEDED
I20230508 16:20:13.864252 22721 AdminTaskManager.cpp:326] task(65, 2) runSubTask() exit
I20230508 16:20:18.118888 22145 MetaClient.cpp:3259] Load leader of "nebula01":29779 in 3 space
I20230508 16:20:18.118947 22145 MetaClient.cpp:3259] Load leader of "nebula02":29779 in 3 space
I20230508 16:20:18.118973 22145 MetaClient.cpp:3259] Load leader of "nebula03":29779 in 3 space
I20230508 16:20:18.118993 22145 MetaClient.cpp:3259] Load leader of "nebula04":29779 in 3 space
I20230508 16:20:18.119019 22145 MetaClient.cpp:3259] Load leader of "nebula05":29779 in 3 space
I20230508 16:20:18.119031 22145 MetaClient.cpp:3265] Load leader ok
E20230508 16:22:11.948338 22286 AsyncServerSocket.cpp:1027] accept failed: out of file descriptors; entering accept back-off state
E20230508 16:22:11.948632 22144 Acceptor.cpp:455] error accepting on acceptor socket: accept() failed
(the same pair of lines — "accept failed: out of file descriptors; entering accept back-off state" followed by "error accepting on acceptor socket: accept() failed" — repeats once per second from 16:22:12 through 16:23:05)
I20230508 16:23:05.940542 22100 AdminTask.cpp:21] createAdminTask (66, 2)
I20230508 16:23:05.940682 22100 AdminTaskManager.cpp:155] enqueue task(66, 2)
I20230508 16:23:05.940691 22244 AdminTaskManager.cpp:236] dequeue task(66, 2)
I20230508 16:23:05.940812 22244 AdminTaskManager.cpp:279] run task(66, 2), 2 subtasks in 2 thread
E20230508 16:23:06.007522 22286 AsyncServerSocket.cpp:1027] accept failed: out of file descriptors; entering accept back-off state
E20230508 16:23:06.007772 22139 Acceptor.cpp:455] error accepting on acceptor socket: accept() failed
graphd:
I20230508 15:48:48.916617  3013 SwitchSpaceExecutor.cpp:45] Graph switched to `graphv1', space id: 57
I20230508 15:52:31.103021  3014 GraphService.cpp:77] Authenticating user root from [::ffff:10.26.24.64]:44980
I20230508 15:52:31.109373  3011 SwitchSpaceExecutor.cpp:45] Graph switched to `graphv1', space id: 57
I20230508 15:52:31.146656  3091 ThriftClientManager-inl.h:67] resolve "nebula05":29779 as "10.26.24.63":29779
(the same resolve of "nebula05":29779 as "10.26.24.63":29779 repeats on 20 more client threads, 3092 and 3025–3043)
E20230508 15:53:25.183429  3043 StorageClientBase-inl.h:227] Request to "nebula05":29779 failed: AsyncSocketException: recv() failed (peer=10.26.24.63:29779, local=10.26.24.59:53556), type = Internal error, errno = 104 (Connection reset by peer): Connection reset by peer
E20230508 15:53:25.183840  3014 StorageClientBase-inl.h:143] There some RPC errors: RPC failure in StorageClient: AsyncSocketException: recv() failed (peer=10.26.24.63:29779, local=10.26.24.59:53556), type = Internal error, errno = 104 (Connection reset by peer): Connection reset by peer
E20230508 15:53:25.184360  3014 StorageAccessExecutor.h:40] InsertVerticesExecutor failed, error E_RPC_FAILURE, part 8
E20230508 15:53:25.184455  3014 QueryInstance.cpp:151] Storage Error: RPC failure, probably timeout., query: INSERT vertex `email`(sensitive information omitted)
spark:
23/05/08 16:10:44 INFO NebulaVertexWriter: batch write succeed
23/05/08 16:10:50 ERROR Utils: Aborting task
com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
23/05/08 16:10:50 ERROR DataWritingSparkTask: Aborting commit for partition 0 (task 38, attempt 0, stage 2.0)
23/05/08 16:10:50 ERROR NebulaVertexWriter: insert vertex task abort.
23/05/08 16:10:50 ERROR DataWritingSparkTask: Aborted commit for partition 0 (task 38, attempt 0, stage 2.0)
23/05/08 16:10:50 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 38)
com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
23/05/08 16:10:50 INFO TaskSetManager: Starting task 1.0 in stage 2.0 (TID 39, localhost, executor driver, partition 1, PROCESS_LOCAL, 8311 bytes)
23/05/08 16:10:50 INFO Executor: Running task 1.0 in stage 2.0 (TID 39)
23/05/08 16:10:50 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 38, localhost, executor driver): com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

23/05/08 16:10:50 ERROR TaskSetManager: Task 0 in stage 2.0 failed 1 times; aborting job
23/05/08 16:10:50 INFO TaskSchedulerImpl: Cancelling stage 2
23/05/08 16:10:50 INFO TaskSchedulerImpl: Killing all running tasks in stage 2: Stage cancelled
23/05/08 16:10:50 INFO FileScanRDD: Reading File path: file:///data/nebula-data/entity_email/part-00007-1bde8b48-445e-4544-ba21-bf2f0be0414c-c000.json, range: 134217728-268435456, partition values: [empty row]
23/05/08 16:10:50 INFO Executor: Executor is trying to kill task 1.0 in stage 2.0 (TID 39), reason: Stage cancelled
23/05/08 16:10:50 INFO TaskSchedulerImpl: Stage 2 was cancelled
23/05/08 16:10:50 INFO DAGScheduler: ResultStage 2 (save at package.scala:287) failed in 13.325 s due to Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 38, localhost, executor driver): com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
23/05/08 16:10:50 WARN BlockManager: Putting block rdd_10_1 failed due to exception org.apache.spark.TaskKilledException.
23/05/08 16:10:50 INFO DAGScheduler: Job 2 failed: save at package.scala:287, took 13.333143 s
23/05/08 16:10:50 WARN BlockManager: Block rdd_10_1 could not be removed as it was not found on disk or in memory
23/05/08 16:10:50 ERROR WriteToDataSourceV2Exec: Data source writer com.vesoft.nebula.connector.writer.NebulaDataSourceVertexWriter@43fbc2bf is aborting.
23/05/08 16:10:50 ERROR NebulaDataSourceVertexWriter: NebulaDataSourceVertexWriter abort
23/05/08 16:10:50 ERROR WriteToDataSourceV2Exec: Data source writer com.vesoft.nebula.connector.writer.NebulaDataSourceVertexWriter@43fbc2bf aborted.
23/05/08 16:10:50 INFO Executor: Executor killed task 1.0 in stage 2.0 (TID 39), reason: Stage cancelled
Exception in thread "main" org.apache.spark.SparkException: Writing job aborted.
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:92)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:136)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:132)
        at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:160)
        at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:157)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:132)
        at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
        at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
        at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
        at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
        at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
        at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
        at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:280)
        at com.vesoft.nebula.connector.connector.package$NebulaDataFrameWriter.writeVertices(package.scala:287)
        at com.aiqiang.WriteVertex.writeData(WriteVertex.java:98)
        at com.aiqiang.WriteVertex.main(WriteVertex.java:43)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:855)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:930)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:939)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 2.0 failed 1 times, most recent failure: Lost task 0.0 in stage 2.0 (TID 38, localhost, executor driver): com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Driver stacktrace:
        at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1925)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1913)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1912)
        at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
        at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1912)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:948)
        at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:948)
        at scala.Option.foreach(Option.scala:257)
        at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:948)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2146)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2095)
        at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2084)
        at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
        at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:759)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:2067)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec.doExecute(WriteToDataSourceV2Exec.scala:64)
        ... 30 more
Caused by: com.vesoft.nebula.client.graph.exception.IOErrorException: java.net.SocketTimeoutException: Read timed out
        at com.vesoft.nebula.client.graph.net.SyncConnection.executeWithParameter(SyncConnection.java:189)
        at com.vesoft.nebula.client.graph.net.Session.executeWithParameter(Session.java:113)
        at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:78)
        at com.vesoft.nebula.connector.nebula.GraphProvider.submit(GraphProvider.scala:107)
        at com.vesoft.nebula.connector.writer.NebulaWriter.submit(NebulaWriter.scala:49)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.execute(NebulaVertexWriter.scala:73)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:55)
        at com.vesoft.nebula.connector.writer.NebulaVertexWriter.write(NebulaVertexWriter.scala:17)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:118)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$$anonfun$run$3.apply(WriteToDataSourceV2Exec.scala:116)
        at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1394)
        at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:146)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:67)
        at org.apache.spark.sql.execution.datasources.v2.WriteToDataSourceV2Exec$$anonfun$doExecute$2.apply(WriteToDataSourceV2Exec.scala:66)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
        at org.apache.spark.scheduler.Task.run(Task.scala:123)
        at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:411)
        at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
23/05/08 16:10:50 WARN TaskSetManager: Lost task 1.0 in stage 2.0 (TID 39, localhost, executor driver): TaskKilled (Stage cancelled)
23/05/08 16:10:50 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have all completed, from pool 
23/05/08 16:10:50 INFO SparkContext: Invoking stop() from shutdown hook
23/05/08 16:10:50 INFO SparkUI: Stopped Spark web UI at http://nebula05:4040
23/05/08 16:10:50 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
23/05/08 16:10:50 INFO MemoryStore: MemoryStore cleared
23/05/08 16:10:50 INFO BlockManager: BlockManager stopped
23/05/08 16:10:50 INFO BlockManagerMaster: BlockManagerMaster stopped
23/05/08 16:10:50 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
23/05/08 16:10:50 INFO SparkContext: Successfully stopped SparkContext
23/05/08 16:10:50 INFO ShutdownHookManager: Shutdown hook called
23/05/08 16:10:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-be87f99c-caf5-4f63-8970-f02b083ce731
23/05/08 16:10:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-e25fcde3-69f1-41db-a582-088687a35f93
  1. For the timeout in the Spark job, increase the timeout setting in the code, reduce the batch size, and lower the Spark write concurrency.
  2. For the timeout in the graphd logs, increase the timeout settings in the NebulaGraph graphd configuration file.
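As a sketch of point 1 above, the timeout, batch size, and concurrency can be set through the nebula-spark-connector config builders. All addresses, space/tag/field names, and values below are placeholders, and the builder method names should be verified against the connector version actually in use:

```scala
// Sketch only -- addresses, space/tag/vid-field names and values are
// placeholders; verify builder methods against your connector version.
import com.vesoft.nebula.connector.{NebulaConnectionConfig, WriteNebulaVertexConfig}
import com.vesoft.nebula.connector.connector._

val connConfig = NebulaConnectionConfig
  .builder()
  .withMetaAddress("nebula05:9559")
  .withGraphAddress("nebula05:9669")
  .withTimeout(60000)        // raise the client read timeout (ms)
  .withConenctionRetry(2)    // note: spelled this way in the connector API
  .build()

val writeConfig = WriteNebulaVertexConfig
  .builder()
  .withSpace("your_space")   // placeholder space name
  .withTag("email")
  .withVidField("id")        // placeholder vid field
  .withBatch(256)            // smaller batch per INSERT statement
  .build()

df.coalesce(2)               // lower the write concurrency
  .write
  .nebula(connConfig, writeConfig)
  .writeVertices()
```

For point 2, the graphd side exposes timeout-related flags in `nebula-graphd.conf` such as `--client_idle_timeout_secs`, `--session_idle_timeout_secs`, and `--storage_client_timeout_ms` (the last one governs the graphd-to-storaged RPC that produced the `E_RPC_FAILURE` above); check the flag names against the docs for your NebulaGraph version before changing them.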

This topic was automatically closed 30 days after the last reply. New replies are no longer allowed.