Edge uniqueness problem

Summary: when inserting edges with INSERT EDGE IF NOT EXISTS, the IF NOT EXISTS semantics do not hold, and the uniqueness of src ID + dst ID + edge type + rank is broken.

Background:
# Create the Person tag, the Tweet tag, the Person->Tweet edge type, and the edge index
CREATE TAG Person ( name string NULL DEFAULT NULL COMMENT "name", gender int8 NULL DEFAULT NULL COMMENT "gender", email string NULL DEFAULT NULL COMMENT "email", mobilePhone string NULL DEFAULT NULL COMMENT "phone", industry string NULL DEFAULT NULL COMMENT "industry", jobTitle string NULL DEFAULT NULL COMMENT "job title", createTime date NULL DEFAULT NULL COMMENT "creation time", jobCompanyName string NULL DEFAULT NULL COMMENT "company name") ttl_duration = 0, ttl_col = "", comment = "Person";
CREATE TAG Tweet ( ) ttl_duration = 0, ttl_col = "", comment = "Twitter post";
CREATE EDGE PersonTweetRelation ( createTime date NULL DEFAULT NULL COMMENT "creation time") ttl_duration = 0, ttl_col = "", comment = "Person-Twitter post";
CREATE EDGE INDEX personTweet_edge_index_name ON PersonTweetRelation () comment "Person-Tweet edge index";

# Insert the vertex data
INSERT VERTEX Person (name,gender, email,mobilePhone,industry,jobTitle,createTime,jobCompanyName) VALUES "59PPP":("Tom", 1,"5951321@goo.com","1594841346","IT","Cow&Horse",date("2021-03-12"),"Alibaba");
INSERT VERTEX Tweet() VALUES "59TTT":();

# First edge insert; the createTime property is datetime("2011-11-11T11:11:11.111")
INSERT EDGE IF NOT EXISTS PersonTweetRelation (createTime) VALUES "59PPP"->"59TTT"@0:(datetime("2011-11-11T11:11:11.111"));
# Query the edge starting from Person 59PPP and from Tweet 59TTT (both bidirectional)
GO FROM "59PPP" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
GO FROM "59TTT" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
Both queries return: [:PersonTweetRelation "59PPP"->"59TTT" @0 {createTime: 2011-11-11T11:11:11.111000}]

At this point, everything is normal.

# Second edge insert; the createTime property is changed to datetime("2022-02-02T22:22:22.222")
INSERT EDGE IF NOT EXISTS PersonTweetRelation (createTime) VALUES "59PPP"->"59TTT"@0:(datetime("2022-02-02T22:22:22.222"));
# Query the edge again from Person 59PPP and from Tweet 59TTT (both bidirectional); this time the results differ
GO FROM "59PPP" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
Result: [:PersonTweetRelation "59PPP"->"59TTT" @0 {createTime: 2011-11-11T11:11:11.111000}]
GO FROM "59TTT" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
Result: [:PersonTweetRelation "59PPP"->"59TTT" @0 {createTime: 2022-02-02T22:22:22.222000}]

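This is the first problem. Given the uniqueness of src ID + dst ID + edge type + rank, the second insert should not have been written to the database, yet it was: it shows up when traversing from 59TTT. Meanwhile, the statistics for the space still show only one edge.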

# Third edge insert; the createTime property is changed to datetime("2033-03-03T22:33:33.333")
INSERT EDGE IF NOT EXISTS PersonTweetRelation (createTime) VALUES "59PPP"->"59TTT"@0:(datetime("2033-03-03T22:33:33.333"));
# Query the edge again from Person 59PPP and from Tweet 59TTT (both bidirectional); the results change once more
GO FROM "59PPP" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
Result: [:PersonTweetRelation "59PPP"->"59TTT" @0 {createTime: 2011-11-11T11:11:11.111000}]
GO FROM "59TTT" OVER PersonTweetRelation BIDIRECT YIELD edge as e;
Result: [:PersonTweetRelation "59PPP"->"59TTT" @0 {createTime: 2033-03-03T22:33:33.333000}]

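This is the second problem: the third insert overwrote the edge written by the second insert, which should never have existed in the first place.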

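To sum up, when inserting edges with INSERT EDGE IF NOT EXISTS, the IF NOT EXISTS semantics do not hold, and the uniqueness of src ID + dst ID + edge type + rank is broken. Has anyone run into a similar problem?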

The current logic is:

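Could you take another careful look at my description? The problem I am seeing cannot be resolved with what the version 3.8 documentation describes.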

Right, what I mean is:

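Hello. Our business requirement rules out plain INSERT EDGE; we need INSERT EDGE IF NOT EXISTS. The problem is that under IF NOT EXISTS, the existence check on edge type + source + destination + rank does not seem to work. For example, after several INSERT EDGE IF NOT EXISTS calls that differ only in their properties (edge type, source, destination, and rank are all identical), a GO traversal from the source returns edge A from the first insert, while a GO traversal from the destination returns edge X from the last insert. That is clearly wrong. According to the documentation, no matter how many times we insert or which end we traverse from, both the source and the destination should see edge A from the first insert.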
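To make the expectation concrete, here is a minimal Python sketch of the behavior I would expect. It is a toy in-memory model, not NebulaGraph code. The class `ToyEdgeStore` and its methods are hypothetical names made up for illustration. It assumes each logical edge is kept as two copies, an out-edge read from the source and an in-edge read from the destination, which is how I understand the storage documentation. Whether the inconsistency reported here actually comes from those two copies being handled differently is only a guess.

```python
from typing import Dict, Optional, Tuple

# (src_vid, edge_type, dst_vid, rank): the identity of an edge as described in this thread.
EdgeKey = Tuple[str, str, str, int]


class ToyEdgeStore:
    """Toy in-memory model (hypothetical, not NebulaGraph code).

    Assumption: every logical edge has an out-edge copy (read from the
    source vertex) and an in-edge copy (read from the destination vertex).
    """

    def __init__(self) -> None:
        self.out_edges: Dict[EdgeKey, dict] = {}
        self.in_edges: Dict[EdgeKey, dict] = {}

    def insert_if_not_exists(self, key: EdgeKey, props: dict) -> bool:
        """Write both copies only if neither exists; return True if written."""
        if key in self.out_edges or key in self.in_edges:
            return False  # the existing edge wins; new properties are discarded
        self.out_edges[key] = dict(props)
        self.in_edges[key] = dict(props)
        return True

    def read_from_src(self, key: EdgeKey) -> Optional[dict]:
        """What a traversal starting at the source vertex would see."""
        return self.out_edges.get(key)

    def read_from_dst(self, key: EdgeKey) -> Optional[dict]:
        """What a traversal starting at the destination vertex would see."""
        return self.in_edges.get(key)


key: EdgeKey = ("59PPP", "PersonTweetRelation", "59TTT", 0)
store = ToyEdgeStore()
first = store.insert_if_not_exists(key, {"createTime": "2011-11-11T11:11:11.111"})
second = store.insert_if_not_exists(key, {"createTime": "2022-02-02T22:22:22.222"})
third = store.insert_if_not_exists(key, {"createTime": "2033-03-03T22:33:33.333"})

# Under the expected semantics, both ends agree on the first insert.
views_agree = store.read_from_src(key) == store.read_from_dst(key)
```

In this model only the first insert is written, and `views_agree` stays true after the second and third attempts. On my cluster the two views disagree after the second insert.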

Which version are you on? I verified this on 3.8.0 and it works fine.

(root@nebula) [basketballplayer]> show hosts
+-----------------+------+----------+--------------+-----------------------+------------------------+---------+
| Host            | Port | Status   | Leader count | Leader distribution   | Partition distribution | Version |
+-----------------+------+----------+--------------+-----------------------+------------------------+---------+
| "192.168.8.112" | 9779 | "ONLINE" | 10           | "basketballplayer:10" | "basketballplayer:10"  | "3.8.0" |
+-----------------+------+----------+--------------+-----------------------+------------------------+---------+
Got 1 rows (time spent 2.985ms/4.021578ms)

Thu, 16 Jan 2025 16:19:58 CST

(root@nebula) [basketballplayer]> match (v)-[e]->(v2) where id(v)=="player100" and id(v2)=="player101" and rank(e)==0 return e
+----------------------------------------------------+
| e                                                  |
+----------------------------------------------------+
| [:follow "player100"->"player101" @0 {degree: 95}] |
+----------------------------------------------------+
Got 1 rows (time spent 8.029ms/8.942686ms)

Thu, 16 Jan 2025 16:20:02 CST

(root@nebula) [basketballplayer]> INSERT EDGE IF NOT EXISTS follow(degree) values "player100"->"player101"@0:(10)
Execution succeeded (time spent 2.417ms/3.412812ms)

Thu, 16 Jan 2025 16:20:10 CST

(root@nebula) [basketballplayer]> match (v)-[e]->(v2) where id(v)=="player100" and id(v2)=="player101" and rank(e)==0 return e
+----------------------------------------------------+
| e                                                  |
+----------------------------------------------------+
| [:follow "player100"->"player101" @0 {degree: 95}] |
+----------------------------------------------------+
Got 1 rows (time spent 8.515ms/9.508925ms)

Thu, 16 Jan 2025 16:20:12 CST

(root@nebula) [basketballplayer]> INSERT EDGE IF NOT EXISTS follow(degree) values "player100"->"player101"@0:(60)
Execution succeeded (time spent 2.467ms/3.292223ms)

Thu, 16 Jan 2025 16:20:18 CST

(root@nebula) [basketballplayer]> match (v)-[e]->(v2) where id(v)=="player100" and id(v2)=="player101" and rank(e)==0 return e
+----------------------------------------------------+
| e                                                  |
+----------------------------------------------------+
| [:follow "player100"->"player101" @0 {degree: 95}] |
+----------------------------------------------------+
Got 1 rows (time spent 7.445ms/8.23476ms)

Thu, 16 Jan 2025 16:20:23 CST

(root@nebula) [basketballplayer]> INSERT EDGE follow(degree) values "player100"->"player101"@0:(30)
Execution succeeded (time spent 2.077ms/2.956776ms)

Thu, 16 Jan 2025 16:20:37 CST

(root@nebula) [basketballplayer]> match (v)-[e]->(v2) where id(v)=="player100" and id(v2)=="player101" and rank(e)==0 return e
+----------------------------------------------------+
| e                                                  |
+----------------------------------------------------+
| [:follow "player100"->"player101" @0 {degree: 30}] |
+----------------------------------------------------+
Got 1 rows (time spent 8.227ms/9.064795ms)

Thu, 16 Jan 2025 16:20:40 CST

I am on 3.8 as well. The operations you showed do not trigger the problem.
Here is how to trigger it:
1. Prepare the data: make sure there is no edge between player120 and team202.
delete edge serve "player120"->"team202"@0;
GO FROM "player120" OVER serve YIELD dst(edge);
2. Insert the data for the first time with INSERT EDGE IF NOT EXISTS, properties: 2001, 2003
INSERT EDGE IF NOT EXISTS serve(start_year,end_year) values "player120"->"team202"@0:(2001,2003);
Traverse. Note that the query is run twice, with the two VIDs swapped.
match (v)-[e]-(v2) where id(v)=="player120" and id(v2)=="team202" and rank(e)==0 return e
match (v)-[e]-(v2) where id(v)=="team202" and id(v2)=="player120" and rank(e)==0 return e
Both queries return the correct properties: [:serve "player120"->"team202" @0 {end_year: 2003, start_year: 2001}]
3. Insert again with INSERT EDGE IF NOT EXISTS, properties: 2022, 2024
INSERT EDGE IF NOT EXISTS serve(start_year,end_year) values "player120"->"team202"@0:(2022,2024);
Traverse. The first query returns the correct properties, 2001-2003:
match (v)-[e]-(v2) where id(v)=="player120" and id(v2)=="team202" and rank(e)==0 return e
Result: [:serve "player120"->"team202" @0 {end_year: 2003, start_year: 2001}]
Traverse. The second query (VIDs swapped):
match (v)-[e]-(v2) where id(v)=="team202" and id(v2)=="player120" and rank(e)==0 return e
This time the result is wrong; the properties are 2022-2024: [:serve "player120"->"team202" @0 {end_year: 2024, start_year: 2022}].

Replacing MATCH with GO in the steps above produces the same error.

I tried it, and it also seems fine on my side.

Thanks. I will restart the cluster, clear the data, and see whether the problem reproduces.

Yes, it is rather odd. Mine is a single-machine deployment, though.

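I ran into this problem too, on version 3.8.0. I tried several deployment methods: a single node with docker compose, a three-node cluster on one machine with docker compose, a three-node multi-machine cluster with docker swarm, and a three-node multi-machine cluster from RPM packages. All of them show the same problem, and all were deployed following the official documentation.
In the demo_basketballplayer space, using the same verification steps, I get the same results as the poster above: IF NOT EXISTS misbehaves when inserting edges.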

Has the problem been reproduced?

Yes, it reproduced. Still the same problem.

We are using the open-source edition. Are you on the Enterprise edition? Could this be a problem specific to the open-source edition?

I am on the Community edition. I repeated the steps above and they work. I will try a distributed environment later.