Ian
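- nebula version: 3.8.0
- Deployment: distributed (3 instances: meta*3, graph*3, storage*3)
- Installation method: RPM
- In production: Y
- Hardware information
- Detailed description of the problem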
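1. One instance in the cluster restarted, and the nebula-java write job retried, but it never received an exception for exceeding the retry limit. I configured a 100 ms interval with 3 retries. A service restart does not finish that quickly, so the retries should have been exhausted. Since my program received no exception, can I assume the retries succeeded?
2. Some of the data written during the restart window turned out to be missing. Is this an internal synchronization problem between the instances?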
- Relevant meta / storage / graph info logs (text form preferred, for easier searching)
execute error, code: -1005, message: Storage Error: RPC failure, probably timeout., retry: 1
execute error, code: -1005, message: Storage Error: RPC failure, probably timeout., retry: 2
execute error, code: -1005, message: Storage Error: RPC failure, probably timeout., retry: 3
execute error, code: -1005, message: RPC failure in StorageClient with timeout: TTransportException: Timed out, retry: 1
execute failed for IOErrorException, message: java.net.ConnectException: Connection timed out (Connection timed out), retry: 1
execute failed for IOErrorException, message: java.net.ConnectException: Connection refused (Connection refused), retry: 1
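As a rough way to compare the configured retry settings with the length of a restart, the sketch below estimates how long a single statement could keep retrying. It assumes one initial attempt plus `retryTimes` retries, a sleep of `intervalTime` between them, and a per-attempt blocking time that the post does not state. `RetryBudget`, `worstCaseMillis`, and `perAttemptMillis` are hypothetical names for illustration, not part of nebula-java.

```java
class RetryBudget {
    /**
     * Estimated worst-case time (ms) one statement spends before giving up.
     * Assumption: 1 initial attempt + retryTimes retries, with a sleep of
     * intervalMillis before each retry. perAttemptMillis is how long a single
     * attempt may block (for example a connect or RPC timeout); it is not
     * given in the post and must be supplied by the reader.
     */
    static long worstCaseMillis(int retryTimes, long intervalMillis, long perAttemptMillis) {
        long attempts = retryTimes + 1L;
        return attempts * perAttemptMillis + retryTimes * intervalMillis;
    }

    public static void main(String[] args) {
        // Attempts fail immediately (e.g. "Connection refused"): only the sleeps count.
        System.out.println(worstCaseMillis(3, 100, 0));    // 300
        // Each attempt blocks for an assumed 1000 ms before timing out.
        System.out.println(worstCaseMillis(3, 100, 1000)); // 4300
    }
}
```

If the retry counter is kept per statement, which the repeated `retry: 1` lines would be consistent with, a restart that lasts longer than this window could exhaust the retries for some statements while statements issued later succeed.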
# My code is as follows:
SessionPoolConfig sessionPoolConfig = new SessionPoolConfig(addresses, spaceName, username, password);
sessionPoolConfig.setMinSessionSize(10)
        .setMaxSessionSize(100)
        .setRetryConnectTimes(3)
        .setWaitTime(100)
        .setRetryTimes(3)        // retry a failed statement 3 times
        .setIntervalTime(100);   // 100 ms between retries
sessionPool = new SessionPool(sessionPoolConfig);
try {
    StopWatch stopWatch = StopWatch.createStarted();
    result = sessionPool.execute(ngql);
    stopWatch.stop();
    if (result != null && result.isSucceeded()) {
        // Only log statements that took longer than 50 ms.
        if (stopWatch.getTime() > 50) {
            log.info("ngql:{} execute success, spend time:{}", ngql, stopWatch.getTime());
        }
        break; // leaves the enclosing loop (not shown in this snippet)
    } else {
        // Wrap the failed statement and its error details in an exception.
        JSONObject failedMsg = new JSONObject();
        failedMsg.put("nebula_ngql", ngql);
        failedMsg.put("exec_code", result != null ? result.getErrorCode() : 0);
        failedMsg.put("exec_msg", result != null ? result.getErrorMessage() : "system error");
        throw new Exception(failedMsg.toJSONString());
    }
} catch (Exception e) {
    log.error("nebula execute ngql:{} error", ngql, e);
}
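As far as the posted snippet shows, the `Exception` thrown in the `else` branch is caught by the `catch` in the same block and only logged, so nothing propagates to the caller and the failed statement is not recorded anywhere. A minimal sketch of one way to keep track of such statements so they can be replayed later follows. `FailedWriteTracker`, `StatementExecutor`, and `runAll` are hypothetical names, and the executor is a stand-in for a call like `sessionPool.execute(ngql)` followed by `isSucceeded()`, not the nebula-java API itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FailedWriteTracker {
    /** Hypothetical stand-in for executing one statement; returns true on success. */
    interface StatementExecutor {
        boolean execute(String ngql) throws Exception;
    }

    /** Runs every statement and returns those that did not succeed, in input order. */
    static List<String> runAll(List<String> statements, StatementExecutor executor) {
        List<String> failed = new ArrayList<>();
        for (String ngql : statements) {
            try {
                if (!executor.execute(ngql)) {
                    failed.add(ngql); // a failed result came back, no exception
                }
            } catch (Exception e) {
                failed.add(ngql);     // the client threw; record it instead of only logging
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Simulated executor: "v2" returns a failed result, "v3" throws.
        StatementExecutor fake = ngql -> {
            if (ngql.contains("v3")) {
                throw new java.io.IOException("Connection refused");
            }
            return !ngql.contains("v2");
        };
        List<String> failed = runAll(Arrays.asList("INSERT v1", "INSERT v2", "INSERT v3"), fake);
        System.out.println(failed); // [INSERT v2, INSERT v3]
    }
}
```

Returning the failed statements, rather than swallowing the error, lets the caller decide whether to replay them once the restarted instance is back or to raise an alert.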