nebula cypher 机器oom

我们现在是3.3 应该是最高了吧

目前 3.3 这个社区版里最高了,上面的问题是最近修复的,所以看你们测试的情况,如果改 query 也到不要紧,不想改 query 的话,可以试试 nightly 的版本

估计只能改cypher 我们这个是要到生产用的 这个测试环境 数据100多亿 不能随便更换版本,改过来改过去

OK,那就优化下 query 吧,了解一下这些查询是代码生成的还是手工写的?

我们都是全手写的,我们以前是neo4j,然后现在想切换成你们的nebula,先测试下社区版性能,看下效果,就先改我们的web 服务里面的一些cypher做一些性能测试,看下情况

如果改成连在一起的,比如这个,我按照刚才那个扣了一下,改了下符号,这个是哪里缺少括号了吗
match (m:USER) where id(m) in [“1510615318”]
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(a1:APPL)-[:APPLY_ON]-(t1:TIME)
where r1.first_callmark_on<=20220102 and t1.TIME.date < ‘2022-01-02’
optional match (x)-[r2:CALL_TO]-(n:USER)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(n2:USER)<-[r4:CBK_ENGINEER]-(n3:USER)
where r2.first_callmark_on<=20220102 and id(n)<>‘1510615318’ and r3.first_cbkmark_on<=20220602 and r4.first_cbkmark_on<=20220602
with
m,
collect(distinct x) as col_appl_1st_v4,
collect(distinct n) as col_user_2nd_v4,
count(distinct n2) as n_cbkin_State_enterprise_leader_2nd_v4
count(distinct case when n2 is not null then n3 else null end) as n_cbkin_Engineer_2nd_v4
with
m,
col_appl_1st_v4,
col_user_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4,
[uu in col_user_2nd_v4 WHERE “MOBILE” in labels(uu)] as col_mobile_2nd_v4
return
size(col_appl_1st_v4) as n_appl_1st_v4,
size(col_user_2nd_v4) as n_user_2nd_v4,
size(col_mobile_2nd_v4) as n_mobile_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4

少了一个逗号,但是这么写 貌似查询结果 跟之前 不一样了


这明显结果不对


match (m:USER) where id(m) in [“1510615318”]
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(a1:APPL)-[:APPLY_ON]-(t1:TIME)
where r1.first_callmark_on<=20220102 and t1.TIME.date < ‘2022-01-02’
optional match (x)-[r2:CALL_TO]-(n:USER)
where r2.first_callmark_on<=20220102 and id(n)<>‘1510615318’
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(y:USER)
where r3.first_cbkmark_on<=20220602 and id(n)<>‘1510615318’
with
m,
collect(distinct x) as col_appl_1st_v4,
collect(distinct n) as col_user_2nd_v4,
count(distinct y) as n_cbkin_State_enterprise_leader_2nd_v4
with
m,
col_appl_1st_v4,
col_user_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4,
[uu in col_user_2nd_v4 WHERE “MOBILE” in labels(uu)] as col_mobile_2nd_v4
return
size(col_appl_1st_v4) as n_appl_1st_v4,
size(col_user_2nd_v4) as n_user_2nd_v4,
size(col_mobile_2nd_v4) as n_mobile_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4
这是合并之前的

好的,结果不对的原因应该是原来是 optional match 现在串联的时候是 match 的语意,等下我想一下你们的这个 query 怎么等价的表达一下,欢迎使用 nebula,后面有需要可以给我们 @steam 联系一下,我们也可以了解下你们平时如何用的图库

再试试下面的 query 吧:

match (m:USER) where id(m) in [“1510615318”]
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(:APPL)-[:APPLY_ON]-(t1:TIME)
where r1.first_callmark_on<=20220102 and t1.TIME.date < ‘2022-01-02’
optional match (x)-[r2:CALL_TO]-(n:USER)
where r2.first_callmark_on<=20220102 and id(n)<>‘1510615318’
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(n2:USER)
where r3.first_cbkmark_on<=20220602 and id(n2)<>‘1510615318’
optional match (n2)<-[r4:CBK_ENGINEER]-(n3:USER)
where r4.first_cbkmark_on<=20220602 and id(n3)<>‘1510615318’
with
  m,
  collect(distinct x) as col_appl_1st_v4,
  collect(distinct n) as col_user_2nd_v4,
  count(distinct n2) as n_cbkin_State_enterprise_leader_2nd_v4,
  count(distinct case when n2 is not null then n3 else null end) as n_cbkin_Engineer_2nd_v4
with
  m,
  col_appl_1st_v4,
  col_user_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  [uu in col_user_2nd_v4 WHERE “MOBILE” in labels(uu)] as col_mobile_2nd_v4,
  n_cbkin_Engineer_2nd_v4
return
  size(col_appl_1st_v4) as n_appl_1st_v4,
  size(col_user_2nd_v4) as n_user_2nd_v4,
  size(col_mobile_2nd_v4) as n_mobile_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  n_cbkin_Engineer_2nd_v4

貌似是可以了


数据跟生产对不上 可能是因为我们导入数据的时候没有失败的情况,漏掉了,因为是第一次测试,数据可能不是那么的严谨,还需要在验证

嗯嗯 ,可以的,等初次压测调试完了,到时候可以沟通下。看下你们的商业版本

奇怪我自己新加了一个 查询的时候 就跑不动了,机器就oom了

  match (m:`USER`) where id(m) in ["1510615318"]
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(:APPL)-[:APPLY_ON]-(t1:`TIME`)
where r1.first_callmark_on<=20220102 and t1.`TIME`.`date` < '2022-01-02'
optional match (x)-[r2:CALL_TO]-(n:`USER`)
where r2.first_callmark_on<=20220102 and id(n)<>'1510615318'
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(n2:`USER`)
where r3.first_cbkmark_on<=20220602 and id(n2)<>'1510615318'
optional match (n2)<-[r4:CBK_ENGINEER]-(n3:`USER`)
where r4.first_cbkmark_on<=20220602 and id(n3)<>'1510615318'
optional match (n4:MOBILE)-[:APPLY_FOR]->(a2:APPL)-[:APPLY_ON]->(t2:`TIME`)
where t2.`TIME`.`date` < '2022-01-02'
with
  m,
  collect(distinct x) as col_appl_1st_v4,
  collect(distinct n) as col_user_2nd_v4,
  count(distinct n2) as n_cbkin_State_enterprise_leader_2nd_v4,
  count(distinct case when n2 is not null then n3 else null end) as n_cbkin_Engineer_2nd_v4,
  count(distinct case when a2 is not null then n4 else null end) as col_appl_2nd_v4
with
  m,
  col_appl_1st_v4,
  col_user_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  [uu in col_user_2nd_v4 WHERE "MOBILE" in labels(uu)] as col_mobile_2nd_v4,
  n_cbkin_Engineer_2nd_v4,
  col_appl_2nd_v4
return
  size(col_appl_1st_v4) as n_appl_1st_v4,
  size(col_user_2nd_v4) as n_user_2nd_v4,
  size(col_mobile_2nd_v4) as n_mobile_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  n_cbkin_Engineer_2nd_v4,
  size(col_appl_2nd_v4) as n_appl_2nd_v4

原来neo4j里面是这样的

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:APPL)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,
collect(distinct case when a2 is not null then n else null end) as col_appl_2nd_v4

辛苦再帮我看下

看了下你上面的 n4 的 pattern 再向外拓展的时候,并没有跟前面的 pattern 用到相同的变量,这里是不是写错?你把 n4 改成 n3 符合语意吗?

neo4j语句 没有问题哦

改成n3也不行 内存直接oom

我看前后 match 中其实是两个完全不想关的 path pattern,即使改成 n4 改成 n3 但是语意上上一个 pattern 中是 User 下个 pattern 中变成了 Mobile,这看起来不太正确。

如果两个不相关的 path pattern 做 optional match 表示什么意思呢?从实现的角度看各自拓展然后只输出了左边的 pattern,想必这 query 表达的意思也不是你们想要的。

最后一个 match 产生 OOM 的原因也是走了索引的全扫描,我建议可以再 review 一下原来语句想表达的意思,然后看看 nebula 这边如何改写一下

我们原有的cypher是这样的,怎么改成nebula cypher呢
match (m:USER {mobileId:'1bb7d145a19749a9e1b1f12dc613233115d5df3e22c741309ea12e71256d03d6'})
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(a1:APPL)-[:APPLY_ON]-(t1:TIME)
where r1.first_callmark_on<=20220602 and replace(t1.date,'-','')<='20220602'
with m,collect(distinct x) as col_appl_1st_v4

unwind case when col_appl_1st_v4 =[] then [null] else col_appl_1st_v4 end as x
optional match (x)-[r2:CALL_TO]-(n:USER) where r2.first_callmark_on<=20220602 and not (m=n)
with m,col_appl_1st_v4,collect(distinct n) as col_user_2nd_v4
with m,col_appl_1st_v4,col_user_2nd_v4,filter(n in col_user_2nd_v4 where "MOBILE" in labels(n)) as col_mobile_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(y:USER) where r3.first_cbkmark_on<=20220602 and not (y=n)
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,
count(distinct case when y is not null then n else null end) as n_cbkin_State_enterprise_leader_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n)<-[r3:CBK_ENGINEER]-(y:USER) where r3.first_cbkmark_on<=20220602 and not (y=n)
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,
count(distinct case when y is not null then n else null end) as n_cbkin_Engineer_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:APPL)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,
collect(distinct case when a2 is not null then n else null end) as col_appl_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:LOAN_BK)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
collect(distinct case when a2 is not null then n else null end) as col_bk_2nd_v4

unwind case when col_bk_2nd_v4 =[] then [null] else col_bk_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:LOAN_BK)-[:M2_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,col_bk_2nd_v4,
count(distinct case when a2 is not null then n else null end) as n_bad30p_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:A_BLACK_LIST)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
col_bk_2nd_v4,n_bad30p_2nd_v4,count(distinct case when a2 is not null then n else null end) as n_ablk_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:C_BLACK_LIST)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
col_bk_2nd_v4,n_bad30p_2nd_v4,n_ablk_2nd_v4,count(distinct case when a2 is not null then n else null end) as n_cblk_2nd_v4

return size(col_appl_1st_v4) as n_appl_1st_v4,size(col_user_2nd_v4) as n_user_2nd_v4,
size(col_mobile_2nd_v4) as n_mobile_2nd_v4,size(col_appl_2nd_v4) as n_appl_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,
size(col_bk_2nd_v4) as n_bk_2nd_v4,n_bad30p_2nd_v4,n_ablk_2nd_v4,n_cblk_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'ORG' in labels(n))) as n_org_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'EXPRESS' in labels(n))) as n_express_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'AGENCY' in labels(n))) as n_agency_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'BNK_ORG' in labels(n))) as n_bnk_org_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'OTHER_ORG' in labels(n))) as n_other_org_2nd_v4
我们原有的cypher是这样的,怎么改成nebula cypher呢
match (m:USER {mobileId:'1bb7d145a19749a9e1b1f12dc613233115d5df3e22c741309ea12e71256d03d6'})
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(a1:APPL)-[:APPLY_ON]-(t1:TIME)
where r1.first_callmark_on<=20220602 and replace(t1.date,'-','')<='20220602'
with m,collect(distinct x) as col_appl_1st_v4

unwind case when col_appl_1st_v4 =[] then [null] else col_appl_1st_v4 end as x
optional match (x)-[r2:CALL_TO]-(n:USER) where r2.first_callmark_on<=20220602 and not (m=n)
with m,col_appl_1st_v4,collect(distinct n) as col_user_2nd_v4
with m,col_appl_1st_v4,col_user_2nd_v4,filter(n in col_user_2nd_v4 where "MOBILE" in labels(n)) as col_mobile_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(y:USER) where r3.first_cbkmark_on<=20220602 and not (y=n)
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,
count(distinct case when y is not null then n else null end) as n_cbkin_State_enterprise_leader_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n)<-[r3:CBK_ENGINEER]-(y:USER) where r3.first_cbkmark_on<=20220602 and not (y=n)
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,
count(distinct case when y is not null then n else null end) as n_cbkin_Engineer_2nd_v4

unwind case when col_mobile_2nd_v4 =[] then [null] else col_mobile_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:APPL)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,
collect(distinct case when a2 is not null then n else null end) as col_appl_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:LOAN_BK)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
collect(distinct case when a2 is not null then n else null end) as col_bk_2nd_v4

unwind case when col_bk_2nd_v4 =[] then [null] else col_bk_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:LOAN_BK)-[:M2_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,col_bk_2nd_v4,
count(distinct case when a2 is not null then n else null end) as n_bad30p_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:A_BLACK_LIST)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
col_bk_2nd_v4,n_bad30p_2nd_v4,count(distinct case when a2 is not null then n else null end) as n_ablk_2nd_v4

unwind case when col_appl_2nd_v4 =[] then [null] else col_appl_2nd_v4 end as n
optional match (n:MOBILE)-[:APPLY_FOR]->(a2:C_BLACK_LIST)-[:APPLY_ON]->(t2:TIME) where replace(t2.date,'-','')<='20220602'
with m,col_appl_1st_v4,col_user_2nd_v4,col_mobile_2nd_v4,n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,col_appl_2nd_v4,
col_bk_2nd_v4,n_bad30p_2nd_v4,n_ablk_2nd_v4,count(distinct case when a2 is not null then n else null end) as n_cblk_2nd_v4

return size(col_appl_1st_v4) as n_appl_1st_v4,size(col_user_2nd_v4) as n_user_2nd_v4,
size(col_mobile_2nd_v4) as n_mobile_2nd_v4,size(col_appl_2nd_v4) as n_appl_2nd_v4,
n_cbkin_State_enterprise_leader_2nd_v4,n_cbkin_Engineer_2nd_v4,
size(col_bk_2nd_v4) as n_bk_2nd_v4,n_bad30p_2nd_v4,n_ablk_2nd_v4,n_cblk_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'ORG' in labels(n))) as n_org_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'EXPRESS' in labels(n))) as n_express_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'AGENCY' in labels(n))) as n_agency_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'BNK_ORG' in labels(n))) as n_bnk_org_2nd_v4,
size(filter(n in col_user_2nd_v4 where 'OTHER_ORG' in labels(n))) as n_other_org_2nd_v4

已经改了一点,后面的不会写,要走索引,怎么绕过索引呢

  match (m:`USER`) where id(m) in ["1510615318"]
optional match (m)-[r1:CALL_TO]-(x:MOBILE)-[:APPLY_FOR]->(:APPL)-[:APPLY_ON]-(t1:`TIME`)
where r1.first_callmark_on<=20220102 and t1.`TIME`.`date` < '2022-01-02'
optional match (x)-[r2:CALL_TO]-(n:`USER`)
where r2.first_callmark_on<=20220102 and id(n)<>'1510615318'
optional match (n)<-[r3:CBK_STATE_ENTERPRISE_LEADER]-(n2:`USER`)
where r3.first_cbkmark_on<=20220602 and id(n2)<>'1510615318'
optional match (n2)<-[r4:CBK_ENGINEER]-(n3:`USER`)
where r4.first_cbkmark_on<=20220602 and id(n3)<>'1510615318'
with
  m,
  collect(distinct x) as col_appl_1st_v4,
  collect(distinct n) as col_user_2nd_v4,
  count(distinct n2) as n_cbkin_State_enterprise_leader_2nd_v4,
  count(distinct case when n2 is not null then n3 else null end) as n_cbkin_Engineer_2nd_v4
with
  m,
  col_appl_1st_v4,
  col_user_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  [uu in col_user_2nd_v4 WHERE "MOBILE" in labels(uu)] as col_mobile_2nd_v4,
  n_cbkin_Engineer_2nd_v4
return
  size(col_appl_1st_v4) as n_appl_1st_v4,
  size(col_user_2nd_v4) as n_user_2nd_v4,
  size(col_mobile_2nd_v4) as n_mobile_2nd_v4,
  n_cbkin_State_enterprise_leader_2nd_v4,
  n_cbkin_Engineer_2nd_v4