Can LIMIT not be used on large-volume queries?

Nebula version: nebula-graph-2.0.0.el7.x86_64.rpm

  • Deployment type (distributed / standalone / Docker / DBaaS): distributed, three nodes
  • Production deployment: N
  • Hardware
    • Disk (SSD recommended): SSD
    • CPU / memory: 64 cores and 512 GB per node (shared with other applications; roughly 1 TB of disk space available)
  • Description of the problem
  • Relevant meta / storage / graph logs
E0518 11:39:29.497005 29913 ThriftClientManager.inl:33] Invalid Channel: 0x7fc758334a00 for host: "172.80.1.2":9780
E0518 11:39:29.497128 29912 ThriftClientManager.inl:33] Invalid Channel: 0x7fc7591e3600 for host: "172.80.1.2":9780
E0518 11:39:29.497121 29893 ThriftClientManager.inl:33] Invalid Channel: 0x7fc760e00000 for host: "172.80.1.2":9780
E0518 11:39:29.497186 29906 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75c800900 for host: "172.80.1.2":9780
E0518 11:39:29.497802 29911 ThriftClientManager.inl:39] Transport is closed by peers 0x7fc759f84f10 for host: "172.80.1.2":9780
E0518 11:39:29.497825 29899 ThriftClientManager.inl:39] Transport is closed by peers 0x7fc75f21e190 for host: "172.80.1.2":9780
E0518 11:39:29.498018 29909 ThriftClientManager.inl:39] Transport is closed by peers 0x7fc75ba1d390 for host: "172.80.1.2":9780
E0518 11:39:29.498533 29901 ThriftClientManager.inl:39] Transport is closed by peers 0x7fc75e41e190 for host: "172.80.1.2":9780
E0518 11:39:29.499161 29910 ThriftClientManager.inl:39] Transport is closed by peers 0x7fc75ac1da90 for host: "172.80.1.2":9780
E0518 11:39:39.496692 29906 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75c800900 for host: "172.80.1.2":9780
E0518 11:39:39.496724 29909 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75ba00f00 for host: "172.80.1.2":46781
E0518 11:39:39.496750 29911 ThriftClientManager.inl:33] Invalid Channel: 0x7fc759e00900 for host: "172.80.1.2":9780
E0518 11:39:39.496767 29912 ThriftClientManager.inl:33] Invalid Channel: 0x7fc7591e3600 for host: "172.80.1.2":9780
E0518 11:39:39.496800 29913 ThriftClientManager.inl:33] Invalid Channel: 0x7fc758200f00 for host: "172.80.1.2":46781
E0518 11:39:39.496838 29910 ThriftClientManager.inl:33] Invalid Channel: 0x7fc71eb9e100 for host: "172.80.1.2":9780
E0518 11:39:39.496850 29915 ThriftClientManager.inl:33] Invalid Channel: 0x7fc756698d00 for host: "172.80.1.2":9780
E0518 11:39:39.496934 29914 ThriftClientManager.inl:33] Invalid Channel: 0x7fc71e473e00 for host: "172.80.1.2":9780
E0518 11:39:39.497989 29735 ThriftClientManager.inl:33] Invalid Channel: 0x7fc763800f00 for host: "172.80.1.2":9780
E0518 11:39:39.498005 29742 ThriftClientManager.inl:33] Invalid Channel: 0x7fc762a00900 for host: "172.80.1.2":9780
E0518 11:39:39.498085 29893 ThriftClientManager.inl:33] Invalid Channel: 0x7fc760e00000 for host: "172.80.1.2":9780
E0518 11:39:39.498090 29889 ThriftClientManager.inl:33] Invalid Channel: 0x7fc761c00000 for host: "172.80.1.2":9780
E0518 11:39:39.498214 29904 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75d600900 for host: "172.80.1.2":9780
E0518 11:39:39.498229 29909 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75ba7e600 for host: "172.80.1.2":9780
E0518 11:39:39.498268 29910 ThriftClientManager.inl:33] Invalid Channel: 0x7fc71eb9e100 for host: "172.80.1.2":9780
E0518 11:39:39.498236 29896 ThriftClientManager.inl:33] Invalid Channel: 0x7fc760000c00 for host: "172.80.1.2":9780
E0518 11:39:39.498239 29899 ThriftClientManager.inl:33] Invalid Channel: 0x7fc75f200000 for host: "172.80.1.2":9780
E0518 11:39:39.498282 29911 ThriftClientManager.inl:33] Invalid Channel: 0x7fc759e00900 for host: "172.80.1.2":9780
E0518 11:39:39.498334 29912 ThriftClientManager.inl:33] Invalid Channel: 0x7fc7591e3600 for host: "172.80.1.2":9780
E0518 11:39:39.498468 29913 ThriftClientManager.inl:33] Invalid Channel: 0x7fc758334a00 for host: "172.80.1.2":9780

Running scripts/nebula.service status all shows that the storaged service on every node has died. The log above is from nebula-storaged.&lt;hostname&gt;.nebula.log.ERROR.20210513-161937.29694 on one of the nodes.
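The check described above can be sketched as follows. The path assumes the default RPM install location /usr/local/nebula (override NEBULA_HOME if your deployment differs):

```shell
# Assumed default install prefix for the RPM package; adjust if needed.
NEBULA_HOME=${NEBULA_HOME:-/usr/local/nebula}

# Check the status of all Nebula services on this node
# (metad, graphd, storaged).
if [ -x "$NEBULA_HOME/scripts/nebula.service" ]; then
    "$NEBULA_HOME/scripts/nebula.service" status all
else
    echo "nebula.service script not found under $NEBULA_HOME"
fi

# If storaged is down, look at its most recent ERROR log (if any).
ls -t "$NEBULA_HOME"/logs/nebula-storaged.*.ERROR.* 2>/dev/null \
    | head -n 1 | xargs -r tail -n 50
```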

Data volume

“Tag” | “Comment” | 21865475

Query statement

lookup on Comment | limit 100

A colleague says the data volume is too large, so LIMIT can't be used. I checked the official documentation and couldn't find an answer. Is LIMIT appropriate when testing against large datasets? Would SKIP plus LIMIT work better for large result sets?

This shouldn't be related to LIMIT itself. The main issue is that we haven't yet implemented LIMIT pushdown for index scans, so the full scan can run out of memory. Check dmesg to see whether the process was OOM-killed.
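A quick way to check for the OOM kill suggested above (dmesg may require root on some systems; journalctl -k or /var/log/messages are alternatives):

```shell
# Search the kernel ring buffer for OOM-killer activity.
# Matching lines name the killed process (e.g. nebula-storaged).
dmesg 2>/dev/null | grep -iE 'out of memory|oom-killer|killed process' \
    || echo "no OOM records found in dmesg"
```

Note that the ring buffer is cleared on reboot, so an OOM kill from a previous boot may only be visible in persistent logs.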


What does that mean? Could you explain it in plain terms? This happened yesterday, and only the storaged service died, so I can't tell now whether it was an OOM.

The current implementation pulls all the matched data up to the graph layer and only then applies the LIMIT filter, which uses a lot of memory.

Have you considered implementing pushdown? Some scenarios involve very large result sets, and returning everything crashes storaged. LIMIT here plays the role of true server-side pagination in a relational database.

We definitely will.

Roughly when will it be supported? The number of vertices reached by following relationships is unbounded, and large volumes simply error out. I tested with 6 million records; the same LIMIT query in Neo4j returns instantly.

Feel free to open an issue in our repo and we'll get to it as soon as possible.

:ok_hand: