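Execution error: Cannot write to null outputStream

  • Nebula version: v2.0.1
  • Deployment (distributed): K8s
  • Production deployment: Y
  • Hardware
  • metad (1 instance)
    • CPU: 2 cores
    • Memory: 5 GB
  • graphd (1 instance)
    • CPU: 2 cores
    • Memory: 5 GB
  • storaged (3 instances)
    • CPU: 4 cores
    • Memory: 10 GB
  • Problem description
    The client intermittently fails when executing subgraph query statements.
  • Relevant meta / storage / graph log output (text preferred so it is searchable)
    The error occurred continuously between 2021-07-16 16:09:15 and 2021-07-16 16:11:15. Logs from that period:
    graphd log: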
E0716 08:09:07.943351    21 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:07.943655    21 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:07.943869    21 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 48610)
E0716 08:09:07.974197    22 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:09.527520    23 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:09:09.597036    23 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:09.597295    23 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1634625381 (hex 0x616e6765, ascii 'ange') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 54612)
E0716 08:09:11.920446    24 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:12.368623    30 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:12.696934    25 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:13.227754    27 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:13.857040    31 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:13.857415    31 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:14.632042    32 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:14.647719    32 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:14.816620    74 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:15.527568    75 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:17.226562    77 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:17.227741    77 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:17.228035    77 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1932473120 (hex 0x732f3320, ascii 's/3 ') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.207.0, port 46954)
E0716 08:09:17.413259    78 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:23.903486    79 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:24.725790    29 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:09:25.257401    28 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:25.404464    28 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:30.030542    81 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:09:30.034698    81 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:35.060921    82 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:09:35.098987    82 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:09:35.099308    82 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1634625381 (hex 0x616e6765, ascii 'ange') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 47458)
E0716 08:09:35.477921    83 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:03.459411    85 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:03.500231    85 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:03.500679    85 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176963 (hex 0x69643d43, ascii 'id=C') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.207.0, port 34320)
E0716 08:10:03.548921    86 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:03.631084    87 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:03.633495    87 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:03.633842    87 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176943 (hex 0x69643d2f, ascii 'id=/') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.207.0, port 34336)
E0716 08:10:03.664474    88 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:05.729260    90 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:05.738552    90 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:17.232810    76 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:17.237093    76 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:25.911410    19 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:34.099907    18 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:34.100581    18 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:37.292320    20 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:38.756657    21 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:10:38.757097    21 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:38.757272    21 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 45046)
E0716 08:10:38.790674    22 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:42.254846    23 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:42.266980    23 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:42.270568    23 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1932473120 (hex 0x732f3320, ascii 's/3 ') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 39282)
E0716 08:10:42.424357    24 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:44.144896    30 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:45.690286    25 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:45.720700    25 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:45.720924    25 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176963 (hex 0x69643d43, ascii 'id=C') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 43714)
E0716 08:10:45.755589    27 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:45.804450    31 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:10:45.804643    31 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:45.804793    31 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176943 (hex 0x69643d2f, ascii 'id=/') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 43764)
E0716 08:10:45.838879    32 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:52.458904    75 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:10:52.471563    75 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:52.471776    75 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.207.0, port 55482)
E0716 08:10:52.506103    77 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:10:53.477829    78 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:10:53.480506    78 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:10:53.480777    78 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1414541105 (hex 0x54502f31, ascii 'TP/1') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.207.0, port 40756)
E0716 08:10:53.514266    79 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:11:02.296533    80 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:11:02.307471    80 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:02.307756    80 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176963 (hex 0x69643d43, ascii 'id=C') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 60636)
E0716 08:11:02.372690    29 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:11:02.401355    28 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:11:02.401700    28 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:02.401859    28 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1768176943 (hex 0x69643d2f, ascii 'id=/') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 60684)
E0716 08:11:02.489511    81 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:11:25.619226    84 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:11:25.634616    84 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:25.635402    84 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 57028)
E0716 08:11:25.723289    85 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:11:27.596276    86 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:11:27.604432    86 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:27.604773    86 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1414541105 (hex 0x54502f31, ascii 'TP/1') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 37766)
E0716 08:11:27.638404    87 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:11:46.622710    88 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1195725856
E0716 08:11:46.623181    88 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:51.136795    89 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:11:51.160430    89 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:11:51.178658    89 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1414541105 (hex 0x54502f31, ascii 'TP/1') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 47358)
E0716 08:11:51.227057    90 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:12:06.663426    26 PeekingManager.h:100] Received SSL connection on non SSL port
E0716 08:12:45.375663    18 GeneratedCodeHelper.cpp:116] received invalid message from client: No version identifier... old protocol client in strict mode? sz=1347375956
E0716 08:12:45.382144    18 GeneratedCodeHelper.cpp:73] invalid message from client in function process
E0716 08:12:45.382391    18 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.197.0, port 40776)
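
A note on reading the graphd log above: the `sz=` values and the "frame is too large" values are what you get when four bytes of text are read as a big-endian 32-bit length. The log already prints the ASCII form for some of them ('P/1.', 'TP/1'). The sketch below decodes the `sz` values the same way. `FrameSizeDecoder` is a hypothetical helper name used only for illustration; it is not part of Nebula or Thrift. The decoded strings look like the start of HTTP requests, which suggests (as an assumption, not something the log proves) that non-Thrift traffic such as probes is reaching the graphd port.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: turns a 32-bit "size" from the log back into the
// four ASCII bytes that were misread as a frame length.
final class FrameSizeDecoder {

    static String decode(int size) {
        // ByteBuffer is big-endian by default, matching network byte order.
        byte[] bytes = ByteBuffer.allocate(4).putInt(size).array();
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        int[] sizes = {1195725856, 1347375956, 1345270062, 1414541105};
        for (int size : sizes) {
            System.out.println(size + " -> '" + decode(size) + "'");
        }
        // 1195725856 -> 'GET '
        // 1347375956 -> 'POST'
        // 1345270062 -> 'P/1.'
        // 1414541105 -> 'TP/1'
    }
}
```

These entries appear to be independent of the client-side `Cannot write to null outputStream` exception, since a Nebula client would not send HTTP text.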

Error log:

查询异常! traceId:null, tenantId:32008055, spaceName:cdp, statement:GET SUBGRAPH 4 STEPS FROM 1628204739544268, elapsed:0ms. e:{}

com.vesoft.nebula.client.graph.exception.IOErrorException: Cannot write to null outputStream
	at com.vesoft.nebula.client.graph.net.SyncConnection.execute(SyncConnection.java:74) ~[client-2.0.0.jar!/:na]
	at com.vesoft.nebula.client.graph.net.Session.execute(Session.java:46) ~[client-2.0.0.jar!/:na]
	at com.baidu.bizcrm.cdpidentity.infrastructure.persistence.impl.NebulaPersistenceImpl.executeQuery(NebulaPersistenceImpl.java:51) ~[cdp-identity-infrastructure-1.0.17-SNAPSHOT.jar!/:1.0.17-SNAPSHOT]

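It looks like you closed the pool and then kept using a session.

Do you mean the NebulaPool? I never close the NebulaPool explicitly. Is there a setting that closes it automatically? Below is the code that initializes the pool; could you check it?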

private NebulaPool getNebulaPool() throws UnknownHostException, GraphDatabaseException {
    if (nebulaPool == null) {
        synchronized (this) {
            if (nebulaPool == null) {
                nebulaPool = new NebulaPool();
                NebulaPoolConfig nebulaPoolConfig = new NebulaPoolConfig();
                nebulaPoolConfig.setMinConnSize(this.nebulaConfig.getPoolInitNum());
                nebulaPoolConfig.setIdleTime(this.nebulaConfig.getExecuteTimeout() * 10);
                nebulaPoolConfig.setTimeout(this.nebulaConfig.getExecuteTimeout());
                nebulaPoolConfig.setMaxConnSize(this.nebulaConfig.getPoolMaxNum());
                nebulaPool.init(this.nebulaConfig.getHostList(), nebulaPoolConfig);
                log.info("初始化连接池:{}", this.nebulaConfig.getHosts());
            }
        }
    }
    return nebulaPool;
}

Please show the code that calls the pool.

We pre-create a batch of sessions from the pool:

private Session getOrCreateSession() throws UnknownHostException, NotValidConnectionException, IOErrorException,
        AuthFailedException, GraphDatabaseException {
    Session session = getNebulaPool().getSession(this.nebulaConfig.getUser(),
            this.nebulaConfig.getPassword(), true);
    return session;
}

public NebulaPoolFactory(NebulaConfig nebulaConfig) throws GraphDatabaseException, AuthFailedException,
        IOErrorException, NotValidConnectionException, UnknownHostException, UnsupportedEncodingException {
    this.nebulaConfig = nebulaConfig;
    String[] spaces = nebulaConfig.getSpaceName().split(",");
    log.info("init sessions, spaceNames={}", SPACE_NAMES);
    for (int i = 0; i < spaces.length; i++) {
        String spaceName = spaces[i];
        int initialCapacity = nebulaConfig.getPoolMaxNum();
        Map<Session, Boolean> sessionMap = new ConcurrentHashMap<>(initialCapacity);
        for (int j = 0; j < initialCapacity; j++) {
            Session session = getOrCreateSession();
            ResultSet resultSet = session.execute("use " + spaceName + ";");
            if (!resultSet.isSucceeded()) {
                continue;
            }
            sessionMap.put(session, true);
        }
        spaceSessionMap.put(spaceName, sessionMap);
    }
    log.info("init sessions..end.");
}

Getting a session:

public synchronized Session getOrCreateSession(String spaceName) {

    Set<Map.Entry<Session, Boolean>> entries = spaceSessionMap.get(spaceName).entrySet();
    Session session = null;
    for (Map.Entry<Session, Boolean> entry : entries) {
        if (entry.getValue()) {
            session = entry.getKey();
            break;
        }
    }
    if (session != null) {
        spaceSessionMap.get(spaceName).replace(session, false);
        return session;
    }

    // FIXME: 2021/6/3 change this to wait for a free session instead of throwing
    throw new RuntimeException("no more valid sessions");
}

The method that closes the pool is called only when the application shuts down:

@Override
public void destroy() throws Exception {
    if (MapUtils.isNotEmpty(spaceSessionMap)) {
        log.info("关闭sessions");
        Set<Map.Entry<String, Map<Session, Boolean>>> spaceSessionMapSet = this.spaceSessionMap.entrySet();
        for (Map.Entry<String, Map<Session, Boolean>> spaceSessionMap : spaceSessionMapSet) {
            Map<Session, Boolean> sessionMap = spaceSessionMap.getValue();
            Set<Map.Entry<Session, Boolean>> entries = sessionMap.entrySet();
            for (Map.Entry<Session, Boolean> entry : entries) {
                Session session = entry.getKey();
                if (session != null) {
                    session.release();
                }
            }
        }
    }

    if (nebulaPool != null) {
        log.info("关闭连接池");
        nebulaPool.close();
    }
}

This only shows the method definitions, not the code path that calls them. Please check your own logic: is a session still being used after it has been released?

E0716 08:11:25.635402 84 HeaderServerChannel.cpp:114] Received invalid request from client: N6apache6thrift9transport19TTransportExceptionE: Header transport frame is too large: 1345270062 (hex 0x502f312e, ascii 'P/1.') (transport N6apache6thrift5async12TAsyncSocketE, address 192.168.21.0, port 57028)

Are multiple threads using the same session?

The method that actually runs the query:

public NebulaResultSet executeQuery(String spaceName, String statement, Integer retryTimes, String tenantId,
                                    String traceId)
        throws GraphDatabaseException {
    Stopwatch stopwatch = Stopwatch.createStarted();
    log.info("查询子句: 图谱:{}, 子句:{}, traceId:{}, tenantId:{}, retryTimes:{}", spaceName, statement, traceId,
            tenantId, retryTimes);
    NebulaResultSet nebulaResultSet = new NebulaResultSet();
    Session session = null;
    try {
        // Get a session
        session = nebulaPoolFactory.getOrCreateSession(spaceName);
        // Run the query; this is where the error is reported
        ResultSet resultSet = session.execute(statement);
        nebulaResultSet.setDbLatency(resultSet.getLatency());
        nebulaResultSet.setErrorCode(resultSet.getErrorCode());
        if (!resultSet.isSucceeded()) {
            log.error("查询出错: {}, 错误原因: {}",
                    statement, resultSet.getErrorMessage());
            throw new RuntimeException(resultSet.getErrorMessage());
        }
        if (resultSet.isEmpty()) {
            return nebulaResultSet;
        }



        nebulaResultSet.setColumns(resultSet.keys());
        List<String> cols = resultSet.keys();
        List<Row> rows = resultSet.getRows();
        if (!rows.isEmpty()) {
            for (Row row : rows) {
                Map<String, Object> props = new HashMap<>();
                List<com.vesoft.nebula.Value> rowValues = row.getValues();
                for (int i = 0; i < cols.size(); i++) {
                    String key = cols.get(i);
                    props.put(key, NebulaUtils.fromValue(rowValues.get(i)));
                }
                nebulaResultSet.getRows().add(props);
            }
        }

    } catch (Throwable e) {
        log.error("查询异常! traceId:{}, tenantId:{}, spaceName:{}, statement:{}, elapsed:{}ms. e:{}",
                traceId, tenantId, spaceName, statement, stopwatch.elapsed(TimeUnit.MILLISECONDS), e);
        // Retry
        if (retryTimes != null && maxRetryTimes > retryTimes) {
            retryTimes++;
            executeQuery(spaceName, statement, retryTimes, tenantId, traceId);
        } else {
            throw new GraphDatabaseException(e);
        }
    } finally {
        // release() here only marks the session as available again
        nebulaPoolFactory.release(spaceName, session);
    }
    log.info("[NebulaPersistenceImpl#executeQuery] execute query end, traceId:{}, tenantId:{}, spaceName:{}, " +
                    "statement:{}, elapsed:{}ms.", traceId, tenantId, spaceName, statement,
            stopwatch.elapsed(TimeUnit.MILLISECONDS));
    return nebulaResultSet;
}

The nebulaPoolFactory.release(spaceName, session) method is as follows:

public void release(String spaceName, Session session) {
    if (session != null) {
        spaceSessionMap.get(spaceName).replace(session, true);
    }
}

Each session has a boolean flag that marks whether it is currently available, which prevents concurrent use.
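
The FIXME in getOrCreateSession mentions waiting for a free session instead of throwing. One possible way to do that, sketched here under assumptions, is to keep idle sessions in a `BlockingQueue` so that borrowing and returning are atomic and a borrower can wait with a timeout. `BlockingSessionPool`, `borrow`, `giveBack`, and `idleCount` are hypothetical names, and the type parameter `S` stands in for the client's `Session` class.

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a fixed set of sessions handed out one at a time.
final class BlockingSessionPool<S> {
    private final BlockingQueue<S> idle;

    BlockingSessionPool(Collection<S> sessions) {
        // Capacity must be at least 1 even if the initial collection is empty.
        this.idle = new ArrayBlockingQueue<>(Math.max(1, sessions.size()), false, sessions);
    }

    // Waits up to the timeout for an idle session; throws if none becomes free.
    S borrow(long timeout, TimeUnit unit) {
        try {
            S session = idle.poll(timeout, unit);
            if (session == null) {
                throw new IllegalStateException("no idle session within timeout");
            }
            return session;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            throw new IllegalStateException("interrupted while waiting for a session", e);
        }
    }

    // Returns a borrowed session to the idle queue.
    void giveBack(S session) {
        if (session != null) {
            idle.offer(session);
        }
    }

    int idleCount() {
        return idle.size();
    }
}
```

With this shape, a session is either in the queue or held by exactly one caller, so there is no separate flag to keep in sync.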

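Does your whole program only ever call executeQuery?
In executeQuery, add the thread ID and the session object's address to the log output. When the error occurs, check the logs to see whether two threads were using that session object at about the same time. (I don't see a problem in your session-acquisition code.) However, we have seen this kind of error when multiple threads share one session, so for now let's assume that is the cause and use the extra logging to confirm it. Also, how many concurrent threads do you currently have? And are all threads running subgraph queries?

OK.

Other code paths also call it to run write statements through execute, but they get a session and call execute in the same way. The logic is: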
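
As a complement to reading the logs by hand, the check suggested above can be made automatic. The sketch below records which thread currently holds each session object, keyed by object identity, and reports a conflict when a second holder shows up. `SessionUsageTracker`, `enter`, and `exit` are hypothetical names introduced for illustration; the session is typed as `Object` so the sketch does not depend on the client library.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical diagnostic helper: detects overlapping use of one session object.
final class SessionUsageTracker {
    // IdentityHashMap compares keys with ==, i.e. by object identity.
    private final Map<Object, Thread> inUse =
            Collections.synchronizedMap(new IdentityHashMap<>());

    // Returns null if the session was free, otherwise the thread already holding it.
    Thread enter(Object session) {
        return inUse.putIfAbsent(session, Thread.currentThread());
    }

    // Clears the record, but only if the calling thread is the recorded holder.
    void exit(Object session) {
        inUse.remove(session, Thread.currentThread());
    }
}
```

In executeQuery this would mean calling `enter(session)` right after the session is obtained, logging an error with both thread names if the result is not null, and calling `exit(session)` in the `finally` block before the session is marked available.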

@Override
public int execute(String spaceName, String statement, Integer retryTimes, String tenantId, String traceId)
        throws GraphDatabaseException {
    Stopwatch stopwatch = Stopwatch.createStarted();
    log.info("执行子句: 图谱:{}, 子句:{}, traceId:{}, tenantId:{}, retryTimes:{}", spaceName, statement, traceId,
            tenantId, retryTimes);
    Session session = null;
    ResultSet resultSet = null;
    try {
        StopWatch stopWatch = new StopWatch();
        stopWatch.start();
        session = nebulaPoolFactory.getOrCreateSession(spaceName);
        // Log the thread ID and the session object (added for diagnosis)
        log.info("[NebulaPersistenceImpl#execute] execute, traceId:{}, tenantId:{}, statement:{} " +
                        "threadId:{}, session:{}.", traceId, tenantId, statement,Thread.currentThread().getId(),
                session.toString());
        stopWatch.stop();
        log.info("get session, cost={}", stopWatch.getTime());
        stopWatch.reset();
        stopWatch.start();
        resultSet = session.execute(statement);
        stopWatch.stop();
        log.info("execute, cost={}", stopWatch.getTime());
        if (!resultSet.isSucceeded()) {
            log.error("执行出错:{}, {}", resultSet.getErrorCode(), resultSet.getErrorMessage());
            throw new RuntimeException(resultSet.getErrorMessage());
        }
    } catch (Throwable e) {
        log.error("执行异常! traceId:{}, tenantId:{}, spaceName:{}, statement:{}, elapsed:{}ms. e:{}",
                traceId, tenantId, spaceName, statement, stopwatch.elapsed(TimeUnit.MILLISECONDS), e);
        // Retry
        if (retryTimes != null && maxRetryTimes > retryTimes) {
            retryTimes++;
            execute(spaceName, statement, retryTimes, tenantId, traceId);
        } else {
            throw new GraphDatabaseException(e);
        }
    } finally {
        nebulaPoolFactory.release(spaceName, session);
    }
    log.info("[NebulaPersistenceImpl#execute] execute end, traceId:{}, tenantId:{}, spaceName:{}, " +
                    "statement:{}, elapsed:{}ms.", traceId, tenantId, spaceName, statement,
            stopwatch.elapsed(TimeUnit.MILLISECONDS));
    return ErrorCode.SUCCEEDED;
}
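
One detail in the retry branches of executeQuery and execute above: the recursive call's return value is ignored, and the session from the failed attempt stays marked as in use until the `finally` block runs, so every retry borrows an additional session. A loop-based retry avoids both, provided each attempt borrows and releases its own session inside the attempt. This is only a sketch; `Retry` and `call` are hypothetical names, and it retries on any `RuntimeException`, which may be broader than intended in real code.

```java
import java.util.function.Supplier;

// Hypothetical helper: runs an attempt up to (1 + maxRetries) times and
// returns the result of the first attempt that succeeds.
final class Retry {

    static <T> T call(Supplier<T> attempt, int maxRetries) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        RuntimeException last = null;
        for (int i = 0; i <= maxRetries; i++) {
            try {
                return attempt.get(); // the successful result is actually returned
            } catch (RuntimeException e) {
                last = e; // remember the failure and try again
            }
        }
        throw last; // all attempts failed; rethrow the last failure
    }
}
```

Because the attempt runs to completion, including its own `finally`, before the next one starts, a failing call never holds more than one session at a time.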

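We create 20 sessions up front. The number of concurrent threads depends on the number of requests the service receives, but an exception is thrown when more than 20 run at the same time.

I can't confirm this for now because the test environment logs are gone. I'll follow up if it reproduces.

It just reproduced. It doesn't seem to be directly related to the query statement; now it is MATCH statements that fail. Here is the error log from the run; the statements were executed sequentially.
Cannot write to null outputStream (22.9 KB)
There is also a session-expired problem here. We never changed the --session_idle_timeout_secs setting, so it is at its default of 0. Why does it report that the session expired?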

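Was your pool initialized with multiple graphd addresses? Check the server status: has graphd restarted, or was the network interrupted at some point, so that the connection reconnected and landed on a different graphd? That would explain the session-expired error, because the 2.0.0 server does not support reconnecting across different graphd instances. So the "null outputStream" error is probably caused by faulty reconnect handling. You can set the reconnect argument of getSession to false; don't use the reconnect feature with 2.0.0.

The pool is initialized against a K8s svc, which may load-balance across three graphd instances. Does that matter?

The graphd servers have not restarted. They have been running for more than a day, and the error was reported just now.

After disabling reconnect, post the logs again if the problem persists.

OK. The code is already deployed to our preview environment, so I can't change it freely. For now I switched from the single svc to the svcs that map to the three individual pods, and the problem has not reproduced since. A later release will add the no-reconnect setting.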


If dingding's reply above solved your problem, please mark that reply as the solution so that others can find the fix at a glance. Thanks, musiciansLyf