How to speed up minute-level K-line (candlestick) calculation

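The code in the quantitative finance examples suggests using the mr function (MapReduce) to improve efficiency. In production, however, the data volume is much larger and there is no way to see the job's progress. How can this be improved?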

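`db = database("dfs://level2")   // assumption: quotes lives in the same database that receives minuteBar
quotes = loadTable(db, "quotes")
// one-row template that fixes the schema of the output table
model=select top 1 symbol,date, minute(time) as minute, open, high, low, last, curVol as volume from quotes where date=2020.06.01,symbol='600000'
// recreate the output table, partitioned by date and symbol
if(existsTable("dfs://level2", "minuteBar"))
 db.dropTable("minuteBar")
db.createPartitionedTable(model, "minuteBar", `date`symbol)
// map function: aggregate one data source into minute bars and append them to minuteBar
def saveMinuteBar(t){
 minuteBar=select first(last) as open, max(last) as high, min(last) as low, last(last) as last, sum(curVol) as volume from t where symbol>='600000', time between 09:30:00.000 : 15:00:00.000 group by symbol, date, minute(time) as minute
 loadTable("dfs://level2", "minuteBar").append!(minuteBar)
 return minuteBar.size()
}
// one data source per partition; mr sums the row counts returned by the map calls
ds = sqlDS(<select symbol, date, time, last, curVol from quotes>)
mr(ds,saveMinuteBar,+)`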

1 answer

Jax Wu

You can check progress by querying the output table, for example:

// total number of rows written to minuteBar so far
select count(*) from loadTable("dfs://level2", "minuteBar")

// number of rows in minuteBar per day, which shows which days have been completed
select count(*) from loadTable("dfs://level2", "minuteBar") group by date
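
One way to make these progress queries easier to read is to restrict each mr call to a bounded date range, so that a run covers a known set of days and the per-day count can be checked against it. The following is only a sketch under assumptions: quotes and saveMinuteBar are taken to be defined as in the question, and the date range 2020.06.01 to 2020.06.05 is a hypothetical example that should be replaced with dates that exist in your data.

```
// Assumption: quotes and saveMinuteBar are already defined as in the question.
// The date range is hypothetical; adjust it to your data.
ds = sqlDS(<select symbol, date, time, last, curVol from quotes where date between 2020.06.01 : 2020.06.05>)

// mr returns the sum of the row counts returned by saveMinuteBar
rows = mr(ds, saveMinuteBar, +)
print(rows)

// check which days of this range have been written to minuteBar
select count(*) from loadTable("dfs://level2", "minuteBar") where date between 2020.06.01 : 2020.06.05 group by date
```

Running one range at a time does not by itself make the aggregation faster, but it keeps each job bounded, and the final query can be run from another session while the job is in progress.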