假设有一个表如下:
time = 2023.01.01T09:00:00.000 + 1..9
code = take(`a, 9)
ov95 = 1 1 1 0 0 0 0 0 0
ov70 = 0 0 0 0 0 0 1 1 1
t = table(time, code, ov95, ov70)
现需要计算指标 position,公式如下:
则可以用以下方式实现
方案一:accumulate
res = select *, accumulate(def(pos, ov95, ov70):iif(pos==0&&ov95 == 1, 1, iif(pos==1&&ov70 == 0, 0, pos)), [ov95, ov70], 0) as pos from t
方案二:for 循环+JIT
@jit
def calPos(ov95, ov70){
pos = array(INT, size(ov95)+1)
for(i in 0:size(ov95)){
if(pos[i]==0&&ov95[i]==1) pos[i+1]=1
else if(pos[i]==1&&ov70[i]==0) pos[i+1]=0
else pos[i+1]=pos[i]
}
return pos[1..size(ov95)]
}
res = select *, calPos(ov95, ov70) as pos from t
性能测试测试数据:1000000*4 【38 MB】的内存表
数据构造脚本
n = 1000000
time = 2023.01.01T09:00:00.000 + 1..n
code = take(`a, n)
ov95 = rand([0, 1], n)
ov70 = rand([0, 1], n)
t = table(time, code, ov95, ov70)