Nebula: query interface returns errors under stress testing

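I'm stress-testing Nebula 3.3 with a fairly complex Cypher query at 50 concurrent requests, and this error is reported many times. Is it because memory is exhausted?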


Every graphd process crashed during the stress test.

Please post your machine configuration:

  • Memory size
  • Disk type
  • Number of CPU cores

Disk: SSD, 1.5 TB. CPU: 32 cores. Memory: 128 GiB. All three machines are identical and form one cluster.

OK, please post the query statement and the graphd logs, and I'll ask the developers to take a look.

The Cypher query:
match (m:`USER`) where id(m) in ["1510615318"] 
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(a1:APPL)-[:APPLY_ON]-(t1:`TIME`)
where r1.first_callmark_on<=20220602 and t1.`TIME`.`date`<='2022-06-02'
optional match (x)-[r2:CALL_TO]-(n:`USER`) 
where r2.first_callmark_on<=20220602 and id(n)<>'1510615318'
optional match (n:MOBILE)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(y1:`USER`) 
where r3.first_cbkmark_on<=20220602 and id(n)<>id(y1)
optional match (n:MOBILE)<-[r4:CBK_ENGINEER]-(y2:`USER`)
where r4.first_cbkmark_on<=20220602 and id(n)<>id(y2)
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:APPL)-[:APPLY_ON]->(t2:`TIME`) 
where t2.`TIME`.`date`<='2022-06-02'
optional match (n:MOBILE)-[:APPLY_FOR]->(a3:LOAN_BK)-[:APPLY_ON]->(t3:`TIME`) 
where t3.`TIME`.`date`<='2022-06-02'
optional match (n:MOBILE)-[:APPLY_FOR]->(a4:LOAN_BK)-[:M2_ON]->(t4:`TIME`) 
where t4.`TIME`.`date`<='2022-06-02'
optional match (n:MOBILE)-[:APPLY_FOR]->(a5:A_BLACK_LIST)-[:APPLY_ON]->(t5:`TIME`)
where t5.`TIME`.`date`<='2022-06-02'
optional match (n:MOBILE)-[:APPLY_FOR]->(a6:C_BLACK_LIST)-[:APPLY_ON]->(t6:`TIME`) 
where t6.`TIME`.`date`<='2022-06-02'
with 
  id(x) AS xid,
  id(n) AS nid,
  case when "MOBILE" in labels(n) then id(n) else null end AS mobile_2nd_v4,
  case when a2 is not null then id(n) else null end AS appl_2nd_v4,
  case when y1 is not null then id(n) else null end AS cbkin_State_enterprise_leader_2nd_v4,
  case when y2 is not null then id(n) else null end AS cbkin_Engineer_2nd_v4,
  case when a3 is not null then id(n) else null end AS bk_2nd_v4,
  case when a4 is not null then id(n) else null end AS bad30p_2nd_v4,
  case when a5 is not null then id(n) else null end AS ablk_2nd_v4,
  case when a6 is not null then id(n) else null end AS cblk_2nd_v4,
  case when 'ORG' in labels(n) then id(n) else null end AS org_2nd_v4,
  case when 'EXPRESS' in labels(n) then id(n) else null end AS express_2nd_v4,
  case when 'AGENCY' in labels(n) then id(n) else null end AS agency_2nd_v4,
  case when 'BNK_ORG' in labels(n) then id(n) else null end AS bnk_org_2nd_v4,
  case when 'OTHER_ORG' in labels(n) then id(n) else null end AS other_org_2nd_v4
return
  count(distinct xid) as n_appl_1st_v4,
  count(distinct nid) as n_user_2nd_v4,
  count(distinct mobile_2nd_v4) as n_mobile_2nd_v4,
  count(distinct appl_2nd_v4) as n_appl_2nd_v4,
  count(distinct cbkin_State_enterprise_leader_2nd_v4) AS n_cbkin_State_enterprise_leader_2nd_v4,
  count(distinct cbkin_Engineer_2nd_v4) AS n_cbkin_Engineer_2nd_v4,
  count(distinct bk_2nd_v4) as n_bk_2nd_v4,
  count(distinct bad30p_2nd_v4) AS n_bad30p_2nd_v4,
  count(distinct ablk_2nd_v4) AS n_ablk_2nd_v4,
  count(distinct cblk_2nd_v4) AS n_cblk_2nd_v4,
  count(distinct org_2nd_v4) as n_org_2nd_v4,
  count(distinct express_2nd_v4) as n_express_2nd_v4,
  count(distinct agency_2nd_v4) as n_agency_2nd_v4,
  count(distinct bnk_org_2nd_v4) as n_bnk_org_2nd_v4,
  count(distinct other_org_2nd_v4) as n_other_org_2nd_v4

Logs

[root@VM-2-3-centos /usr/local/nebula/logs]# cat nebula-graphd.FATAL
Log file created at: 2022/12/21 14:40:12
Running on machine: VM-2-3-centos
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
F20221221 14:40:12.424347 12750 ThriftServer.cpp:826] Could not drain active requests within allotted deadline. Deadline value: 30 secs. Aborting because undefined behavior is possible. Underlying reasons could be either requests that have never terminated, long running requests, or long queues that could not be fully processed.
[root@VM-2-3-centos /usr/local/nebula/logs]#


nebula-graphd.ERROR:
E20230109 16:54:01.945885 31894 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.945917 31879 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.946002 31899 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.947834 31894 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.948792 31895 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.949147 31880 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.949414 31899 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.950392 31872 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.950440 31896 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.950753 31892 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.950856 31882 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.950999 31904 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.951181 31891 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.952771 31886 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.953426 31873 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.954102 31893 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.954118 31892 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.954294 31898 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.954934 31904 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.955346 31898 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.955394 31902 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.958231 31878 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.958451 31897 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.958611 31876 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.959128 31899 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.960462 31873 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.960470 31903 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.961555 31897 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.961822 31874 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.961889 31875 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
E20230109 16:54:01.962025 31882 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.
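
To see how widespread the watermark error is, the ERROR file can be collapsed into one line per distinct message. The following is a minimal Python sketch that relies only on the line format printed in the log header above (`[IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg`). The function name `summarize` and the inline sample are illustrative, not part of any Nebula tooling.

```python
import re
from collections import Counter

# glog line format, as stated in the log header:
# [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg
GLOG_LINE = re.compile(
    r"^(?P<level>[IWEF])(?P<date>\d{8}) "
    r"(?P<time>\d{2}:\d{2}:\d{2})\.(?P<micros>\d{6})\s+"
    r"(?P<thread>\d+) (?P<source>[^\]]+)\] (?P<msg>.*)$"
)


def summarize(lines):
    """Return [((level, source, msg), line_count, distinct_thread_count), ...]."""
    counts = Counter()
    threads = {}
    for line in lines:
        m = GLOG_LINE.match(line.strip())
        if m is None:
            continue  # header lines, shell prompts, blank lines
        key = (m["level"], m["source"], m["msg"])
        counts[key] += 1
        threads.setdefault(key, set()).add(m["thread"])
    return [(key, n, len(threads[key])) for key, n in counts.most_common()]


# Three lines copied from the ERROR log above, plus one header line.
sample = [
    "Log line format: [IWEF]yyyymmdd hh:mm:ss.uuuuuu threadid file:line] msg",
    "E20230109 16:54:01.945885 31894 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.",
    "E20230109 16:54:01.945917 31879 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.",
    "E20230109 16:54:01.947834 31894 QueryInstance.cpp:137] Used memory hits the high watermark(0.800000) of total system memory.",
]

for (level, source, msg), n, n_threads in summarize(sample):
    print(f"{level} {source} x{n} on {n_threads} threads: {msg}")
```

In the excerpt above, every ERROR line carries the same message from `QueryInstance.cpp:137`, spread over many thread IDs within about 20 milliseconds.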

After a while the machines under test run out of memory (OOM).

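This statement is probably consuming too much memory.
Is this really what your business scenario looks like?
You could first stress-test with a simple MATCH statement to see how it performs under concurrency.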

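Stress tests with simple statements run fine. This query comes from our production workload, and this is what our business scenario looks like.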

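A single query does return a result, just somewhat slowly. It only fails like this under stress testing, which I would not have expected.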

The memory is nowhere near enough. It might not even be enough for 1 QPS.

It's 32 cores and 128 GB; I typed it wrong earlier.
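
For a rough sense of scale, here is a back-of-the-envelope Python sketch using only the numbers in this thread: 128 GiB per machine, the 0.8 high watermark from the ERROR log, and 50 concurrent requests. It assumes, purely for illustration, that all 50 queries hit one graphd and share memory evenly. The function name and the `other_usage_gib` parameter (memory already used by storaged, the page cache, and so on) are hypothetical.

```python
GIB = 1024 ** 3


def per_query_budget_bytes(total_mem_gib, watermark_ratio,
                           concurrent_queries, other_usage_gib=0.0):
    """Memory available to each query before the watermark is hit.

    Assumes an even split across queries, which real workloads rarely show.
    """
    usable_gib = total_mem_gib * watermark_ratio - other_usage_gib
    return max(0.0, usable_gib) * GIB / concurrent_queries


budget = per_query_budget_bytes(128, 0.8, 50)
print(f"{budget / GIB:.2f} GiB per query")
```

Under these assumptions each query gets about 2 GiB before the watermark check starts rejecting requests, and less once the memory used by other processes on the same machine is subtracted.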

There are a lot of OPTIONAL MATCH clauses, and they inflate memory usage badly. I suspect even 1 QPS would be hard to sustain.
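
The inflation comes from how OPTIONAL MATCH combines rows: for every input row it emits one row per match, or a single null-padded row when nothing matches. Seven of the clauses in the query are anchored on the same node `n`, so their fan-outs multiply. Below is a minimal Python sketch of that arithmetic. The row count and fan-out values are made up for illustration and were not measured on the poster's graph; it also assumes the intermediate rows are materialized before the final `count(distinct ...)` aggregation.

```python
from math import prod


def rows_after_optional_matches(start_rows, fanouts):
    """Rows left after chaining OPTIONAL MATCH clauses on the same variable.

    An unmatched row is kept once (padded with nulls), so a fan-out of 0
    still counts as 1.
    """
    return start_rows * prod(max(1, k) for k in fanouts)


# Hypothetical values: 200 rows reach the second hop, then the seven
# clauses on n (r3, r4, a2, a3, a4, a5, a6) match this many paths each.
second_hop_rows = 200
fanouts = [2, 1, 5, 3, 0, 0, 0]

print(rows_after_optional_matches(second_hop_rows, fanouts))
```

Even with these modest fan-outs the intermediate result is 30 times larger than the second-hop row set, and every row carries all the bound vertices and edges from the earlier clauses.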

How should we handle this? The business logic is what it is. We used to run on neo4j and now want to test nebula.

How should this be reworked into something nebula supports?

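If the schema stays unchanged, I don't see much room left to improve the current query. (One possible schema change, though I don't know whether it's feasible for you: store the time information currently reached through the xx_ON edges as a property on APPLY_FOR.)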

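As yee mentioned, MATCH gets some optimizations in 3.4.0, for example pruning more of the data fetched during a query (among other optimizations). I'd suggest testing how much difference that makes. The current master/nightly builds are fairly close to the 3.4.0 release, so if you're stuck here, it may be worth evaluating them to see how things look.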

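Alternatively, you can go through the commercial channel and have an expert set aside dedicated time to evaluate the whole setup and how to optimize it.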

Agreed, paying for support is worth a try.
