Nebula performance test simulating a business scenario: help writing nGQL

loveleon · September 24, 2020, 06:06

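Nebula version:
nebula-1.0.1.el7-5.x86_64.rpm
Deployment (distributed / standalone / Docker / DBaaS):
Distributed deployment: 5 physical machines in total
storaged: 5 nodes
graphd: 5 nodes
metad: 3 nodes
Hardware information
- Disk (SSD / HDD):
  System disk 50 GB + SAS data disk 800 GB
- CPU and memory:
  cpu: 12core Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz
  memory: 15 GB
Business requirement description
The business model links entities along only three dimensions: mobile number, ID card number, and bank card number. An order can be associated with a mobile number in three ways (application mobile number, login mobile number, emergency-contact mobile number). Each order carries the attributes "user ID", "product type", "order status", and "application time".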

Modeling statements

CREATE SPACE t1; // create the space
CREATE TAG tag_order_info(cust_id string ,product_type string, order_status int, create_time timestamp); // order vertex
CREATE TAG tag_mobile(mobile string); // mobile number vertex
CREATE TAG tag_bank_card(bank_card string); // bank card vertex
CREATE TAG tag_cert_no(cert_no string); // ID card vertex
CREATE EDGE edge_mobile();    // edge linking an order to a mobile number; the edge's rank field distinguishes application, login, and emergency-contact mobile numbers
CREATE EDGE edge_cert_no(); // edge linking an order to an ID card
CREATE EDGE edge_bank_card_no(); // edge linking an order to a bank card number

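Test data set description
edge_mobile: directed edge from order vid --> mobile vid
edge_cert_no: directed edge from order vid --> ID card vid
edge_bank_card_no: directed edge from order vid --> bank card vid
Orders (vertex count): 30 million
Mobile numbers (vertex count): 500,000
(Importing the data took about 300 s)
Detailed description of the question
How should I construct an nGQL statement for the following query:
starting from order 1, find the associated order 2 through mobile number 1, then find the associated order 3 through mobile number 2,
that is, order1->mobile1->order2->mobile2->order3, a two-degree query in business terms,
in order to simulate returning the vertex and edge data.

nGQL statement for the one-degree business query: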

go from 0 over edge_mobile bidirect yield edge_mobile._src as vs_src ,edge_mobile._dst as v1_dst | go from $-.v1_dst over edge_mobile reversely yield $-.vs_src,$-.v1_dst,edge_mobile._dst;

Result:

Order 1			| Mobile number		| Order 2
=============================================
| $-.vs_src | $-.v1_dst  | edge_mobile._dst |
=============================================
| 0         | 2000065006 | 24460292         |
---------------------------------------------
| 0         | 2000065006 | 27589638         |
---------------------------------------------
| 0         | 2000065006 | 17313037         |
---------------------------------------------
| 0         | 2000065006 | 1294608          |
---------------------------------------------
| 0         | 2000145320 | 18029538         |
---------------------------------------------
| 0         | 2000145320 | 3202547          |
---------------------------------------------
| 0         | 2000145320 | 21276409         |
---------------------------------------------
| 0         | 2000145320 | 205055           |
---------------------------------------------
Got 468 rows (Time spent: 5.559/16.787 ms)

Expected result

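1. Could you check whether the one-degree business query above is correct, and show how to construct a two-degree business query in nGQL for batch performance testing? Thanks~
2. Does the result of the two-degree business query include the result of the one-degree query?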

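yee · September 24, 2020, 06:34

Hi. Based on the modeling and requirements above, a question about the order1->mobile1<-order2->mobile2<-order3 case: can you accept a result set that contains order1, order2, and order3 together? When orders are looked up from a mobile in the two-degree query, the returned result set will include the original starting vertex order1, that is: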

GO FROM order1_id OVER edge_mobile YIELD edge_mobile._src AS src, edge_mobile._dst AS dst \
| GO FROM $-.dst OVER edge_mobile REVERSELY YIELD edge_mobile._dst AS dst

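The returned dst result set will then contain order1_id.

Nebula 1.0 currently cannot filter out the starting vertex in a query like this. That will be supported in Nebula 2.0, where the statement would look something like: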

GO FROM order1_id OVER edge_mobile YIELD edge_mobile._src AS src, edge_mobile._dst AS dst \
| GO FROM $-.dst OVER edge_mobile REVERSELY WHERE edge_mobile._dst NOT IN collect($-.src) \
YIELD edge_mobile._dst AS dst

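For Nebula 1.0, you can deduplicate on the business side for now: expand outward one hop at a time and filter the result set returned at each hop.

The statements above only demonstrate the two-hop case; for three hops, add further pipe operations in the same way.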

min.wu · September 24, 2020, 06:41

For HDDs, please adjust the parameters as described in the manual; otherwise availability will suffer.

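tangxiyuan · September 24, 2020, 06:59

With multiple hops, can path information be returned?
For example, for order1 -> mobile1 <- order2, can the result output three columns that represent this relationship?
Also, if I only need the number of rows in the result set and not the actual data, is there syntax for that at present?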

yee · September 24, 2020, 08:10

You can output three columns, for example:

GO FROM order1 OVER edge_mobile YIELD edge_mobile._src AS src, edge_mobile._dst AS dst \
| GO FROM $-.dst OVER edge_mobile REVERSELY \
  YIELD $-.src AS o1, $-.dst AS m1, edge_mobile._dst AS o2

If you only need the row count, you can use YIELD COUNT(*), for example:

GO FROM order1 OVER edge_mobile YIELD edge_mobile._src AS src, edge_mobile._dst AS dst \
| GO FROM $-.dst OVER edge_mobile REVERSELY \
  YIELD $-.src AS o1, $-.dst AS m1, edge_mobile._dst AS o2 \
| YIELD COUNT(*) AS count

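loveleon · September 24, 2020, 08:21

Got it~ Thanks a lot

loveleon · September 24, 2020, 08:24

I have already adjusted some of the HDD-related parameters, including --raft_rpc_timeout_ms and --heartbeat_interval_secs.

loveleon · September 25, 2020, 02:15

Here is the two-degree query statement: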

go from 0 over edge_mobile yield edge_mobile._src as tag_o1,edge_mobile._dst as tag_p1 | \
go from $-.tag_p1 over edge_mobile REVERSELY yield $-.tag_o1 as tag_o1,edge_mobile._src as tag_p1,edge_mobile._dst as tag_o2 | \
go from $-.tag_o2 over edge_mobile yield $-.tag_o1 as tag_o1,$-.tag_p1 as tag_p1,edge_mobile._src as tag_o2 ,edge_mobile._dst as tag_p2 | \
go from $-.tag_p2 over edge_mobile REVERSELY yield $-.tag_o1 as tag_o1,$-.tag_p1 as tag_p1,$-.tag_o2 as tag_o2,edge_mobile._src as tag_p2,edge_mobile._dst as tag_o3 | \
yield count(*) as count;

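Running it on nebula-1.0.1 and on nebula-1.1.0 gives different results:

On 1.0.1, all of the data was returned:
Got 12751256 rows (Time spent: 12.9324/786.925 s)

On 1.1.0, the total row count was returned:

==========
| count  |
==========
| 217330 |
----------
Got 1 rows (Time spent: 315.365/316.371 ms)

What could cause the number of returned rows to differ? Is deduplication being applied?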

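loveleon · September 25, 2020, 02:16

The nebula-1.1.0 run was performed after upgrading from nebula-1.0.1 to nebula-1.1.0.

yee · September 25, 2020, 02:20

Yes. Version 1.1.0 changed the behavior of DataJoin: it now deduplicates internally.

loveleon · September 25, 2020, 03:37

After upgrading Nebula to v1.1.0, connecting to the database with nebula-go always fails authentication, whereas connecting from the terminal succeeds.

Code example: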

package main

import (
  "log"

  nebula "github.com/vesoft-inc/nebula-go"
  //graph "github.com/vesoft-inc/nebula-go/nebula/graph"
)

func main() {
  client, err := nebula.NewClient("127.0.0.1:3699")
  if err != nil {
    log.Fatal(err)
  }

  //if err = client.Connect("root", "nebula"); err != nil {
  if err = client.Connect("username", "password"); err != nil {
    log.Fatal(err)
  }
  defer client.Disconnect()

  resp, err := client.Execute("SHOW HOSTS;")
  if err != nil {
    log.Fatal(err)
  }

  if nebula.IsError(resp) {
    log.Printf("ErrorCode: %v, ErrorMsg: %s", resp.GetErrorCode(), resp.GetErrorMsg())
  }
}

Running go run test.go outputs:

2020/09/25 11:30:45 Authentication fails, Invalid data length
2020/09/25 11:30:45 Invalid data length
exit status 1

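Environment

fbthrift and nebula-go are both on the master version.
After setting --enable_authorize=true, the whole cluster was restarted.

Question

Do I need to use the nebula-go v2 code?

loveleon · September 25, 2020, 03:38

Using the code under nebula-go/v2 gives the same error.

loveleon · September 25, 2020, 03:42

The third-party libraries are stored under the path set by the GOPATH environment variable.

yee · September 25, 2020, 03:51

The v2 directory in nebula-go is prepared for Nebula 2.0. It is not suitable for connecting to 1.0 because the interfaces are incompatible.

You can try updating the nebula-go code locally, for example: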

$ go get -u -v github.com/vesoft-inc/nebula-go@v1.1.0
$ go mod tidy

Then compile and run your local code again.