How can I speed up a comparison query over a large number of ids?

 select * from records_from_old where not exists(select * from records_from_new where records_from_old.id in records_from_new.id)

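For a statement like the one above, when there are many ids (about 2 million rows, and the two tables differ only slightly, perhaps by just a few rows), is there any way to speed it up?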

Category: Technical Q&A

Best answer 2023-06-05 20:26

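n = 100000
// Two random test tables whose id ranges mostly overlap
records_from_old = table(rand(10000, n) as id, rand(100.0, n) as val)
records_from_new = table(rand(11000, n) as id, rand(100.0, n) as val)
// Original form: not exists with a correlated subquery
timer re1=select * from records_from_old where not exists(select * from records_from_new where records_from_old.id in records_from_new.id) // 466.652 ms
// Rewritten form: not in against a single id column
timer re2=select * from records_from_old where id not in (exec id from records_from_new) // 0.78 ms
// Check that both queries return the same rows
eqObj(re1.values(), re2.values())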

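In my tests, this rewrite speeds the query up. `exec id from records_from_new` fetches only one column, and because DolphinDB (DDB) uses columnar storage, reading a single column is fast. In addition, the first statement scans once to do the comparison inside the subquery and then scans again over the subquery result, whereas the second statement only needs a single pass.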
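To make the rewritten query easier to check by eye, here is a minimal sketch on two tiny hand-written tables. The data values are made up for illustration; the query itself is the same `not in` form used above. With these inputs, only the rows with id 3 and 5 are absent from the new table, so those are the rows the query should return.

```dolphindb
// Hypothetical sample data, small enough to verify manually
records_from_old = table(1 2 3 4 5 as id, 10.0 20.0 30.0 40.0 50.0 as val)
records_from_new = table(1 2 4 as id, 11.0 21.0 41.0 as val)

// exec returns the id column as a plain vector rather than a table,
// so the outer query filters against a single column of values
missing = select * from records_from_old where id not in (exec id from records_from_new)

// Inspect the result: rows of records_from_old whose id is not in records_from_new
missing
```

For the 2-million-row case in the question the same statement applies unchanged; the timings quoted above come from the 100,000-row test and are not a prediction for larger tables.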


Polly

Acceptance rate 16% | Answered on 2023-06-05 16:41
