Frequent, intermittent BAD_DATA results

Nebula version: 2.0.1
The nGQL looks correct, but queries frequently return BAD_DATA. What is going on, and is there a workaround?
Field type: date_appl_submit timestamp,

After a restart, the "greater than 0" comparison fails every time.

Bump :no_mouth:

Failing statement:
go 1 to 2 STEPS from "UR2712823947216322560" over e_call BIDIRECT where $$.user.user_no!='UR2712823947216322560' yield distinct $$.user.user_no as userNo,$$.user.date_appl_submit as dateApplSubmit | limit 500 | yield count(case when $-.dateApplSubmit is not null and $-.dateApplSubmit > (int64)(0) then true end) as ApplUserCnt

Statement that works:
go 1 to 2 STEPS from "UR2712823947216322560" over e_call BIDIRECT where $$.user.user_no!='UR2712823947216322560' and $$.user.date_appl_submit is not null and $$.user.date_appl_submit > (int64)(0) yield distinct $$.user.user_no as userNo,$$.user.date_appl_submit as dateApplSubmit | limit 500 | yield count(*) as ApplUserCnt

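Why we need the first form: there are many other aggregates to compute (avg, count, sum, and so on), and their filter conditions are not all the same.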

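Can you confirm that user.date_appl_submit is always numeric? Are there values of any other type? == and > match differently, so == may never hit the wrongly typed records that produce BAD_DATA.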

(root@nebula) [basketballplayer]> GO 1 to 2 steps FROM "player101" OVER follow BIDIRECT YIELD $$.player.age AS age | YIELD count(case when $-.age is not null and $-.age > 0 then true end);
+--------------------------------------------------------------------+
| count(CASE WHEN ($-.age IS NOT NULL AND ($-.age>0)) THEN true END) |
+--------------------------------------------------------------------+
| 48                                                                 |
+--------------------------------------------------------------------+
Got 1 rows (time spent 20571/59638 us)

Tue, 07 Sep 2021 11:22:23 CST

(root@nebula) [basketballplayer]> GO 1 to 2 steps FROM "player101" OVER follow BIDIRECT YIELD $$.player.name AS age | YIELD count(case when $-.age is not null and $-.age > 0 then true end);
+--------------------------------------------------------------------+
| count(CASE WHEN ($-.age IS NOT NULL AND ($-.age>0)) THEN true END) |
+--------------------------------------------------------------------+
| BAD_DATA                                                           |
+--------------------------------------------------------------------+
Got 1 rows (time spent 14913/50785 us)

Tue, 07 Sep 2021 11:24:00 CST

The field type is timestamp, and the values are NULL, 0, and normal timestamps.

Got it. Given that filtering out null up front works, and given the data type, it looks like for null values the CASE WHEN evaluates > before is not null. Let me ask the developers.

Is it $-.userNo > 0 that returns BAD_DATA?

I inferred that from post #3 of this thread (by Anyzm), where moving is not null earlier makes the problem go away. Does that sound reasonable?

Yes, but $-.userNo is only an alias. It still refers to the date_appl_submit field, whose type is timestamp and whose values are null, 0, and other timestamps.

It seems that only greater than 0 fails. I tried equal to 0 and that works.

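What is the type of userNo? It also shows bad_data in the original screenshot. If possible, could you output all the values of userNo and date_appl_submit to see what is in there?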

Those two are the same field. userNo is just an alias; it is actually the date_appl_submit field.

OK
@CPWstatic

$-.userNo is only an alias. It still refers to the date_appl_submit field, whose type is timestamp and whose values are null, 0, and other timestamps.

Failing statement:

go 1 to 2 STEPS from "UR2712823947216322560" over e_call BIDIRECT where $$.user.user_no!='UR2712823947216322560' \
  yield distinct $$.user.user_no as userNo,$$.user.date_appl_submit as dateApplSubmit \
  | limit 500 | \
  yield count(case when $-.dateApplSubmit is not null and $-.dateApplSubmit > (int64)(0) then true end) as ApplUserCnt

Statement that works:

go 1 to 2 STEPS from "UR2712823947216322560" over e_call BIDIRECT where $$.user.user_no!='UR2712823947216322560' \
  and $$.user.date_appl_submit is not null and $$.user.date_appl_submit > (int64)(0) \
  yield distinct $$.user.user_no as userNo,$$.user.date_appl_submit as dateApplSubmit \
  | limit 500 | \
  yield count(*) as ApplUserCnt

Could you break the expression apart and see which sub-expression produces the bad_data?

Also, bad data is just a null; you can simply treat it as null.

I already broke it down. $-.dateApplSubmit > 0 is the problem,

$-.dateApplSubmit == 0 is fine.

Then output that field and take a look. Is it all numbers and null?

Yes, I checked. The values are all numbers (including 0) and null.
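
A sketch of how the breakdown discussed above could be run, reusing the failing query from this thread. Each sub-expression and the raw field value get their own output column, so any row that yields BAD_DATA can be lined up with its stored value. The column aliases raw, notNull, gtZero, and eqZero are made up for illustration, and this has not been run against the data in question.

```ngql
# Same traversal as the failing statement, but without the count(case ...) wrapper.
# raw     : the stored value of date_appl_submit
# notNull : result of the IS NOT NULL check on its own
# gtZero  : result of the > comparison on its own (the one reported to fail)
# eqZero  : result of the == comparison on its own (the one reported to work)
go 1 to 2 STEPS from "UR2712823947216322560" over e_call BIDIRECT where $$.user.user_no!='UR2712823947216322560' \
  yield distinct $$.user.user_no as userNo,$$.user.date_appl_submit as dateApplSubmit \
  | limit 500 | \
  yield $-.userNo as userNo, \
        $-.dateApplSubmit as raw, \
        $-.dateApplSubmit is not null as notNull, \
        $-.dateApplSubmit > 0 as gtZero, \
        $-.dateApplSubmit == 0 as eqZero
```

If gtZero shows BAD_DATA only on rows where raw is null, that would be consistent with the guess above that the > comparison is evaluated on null values.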

Let's look at the schema:

desc tag user

This is what it looked like initially (some fields trimmed):

CREATE TAG user (\
    user_no string default "", \
    date_appl_submit timestamp, \
    create_time timestamp default now(),\
    update_time timestamp default now() \
);

With this schema the error occurs. We later changed the schema hoping that would fix it (it did not). The modified version:

CREATE TAG user (\
    user_no string default "", \
    date_appl_submit timestamp default  0, \
    create_time timestamp default now(),\
    update_time timestamp default now() \
);