带python表达式的qoute字段提取转换

tick数据有quote字段如下,从csv文件读出后schema属性为STRING,
in[]:df['quotes']
out[]:
0        [{'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask...
1        [{'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask...
2        [{'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask...
                               ...                        
32998    [{'bid_p': 6.09, 'bid_v': 30800, 'ask_p': 6.1,...
32999    [{'bid_p': 6.09, 'bid_v': 30700, 'ask_p': 6.1,...
Name: quotes, Length: 33000, dtype: object


单条格式:

in[]:df.quotes.values[0]
out[]:"[{'bid_p': 20.28, 'bid_v': 6400, 'ask_p': 20.28, 'ask_v': 6400}, {'bid_v': 28600, 'bid_p': 0.0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}]"

在python中提取:

def quote(df_quotes):
keys=('bid_p1','bid_p2','bid_p3','bid_p4','bid_p5',
'bid_v1', 'bid_v2', 'bid_v3','bid_v4', 'bid_v5',
'ask_p1','ask_p2','ask_p3','ask_p4', 'ask_p5',
'ask_v1','ask_v2','ask_v3', 'ask_v4','ask_v5')
cd1 = eval(df_quotes)
cd2=pd.DataFrame(cd1,index=range(0,len(cd1)))
cd3=[c for c in np.nditer(np.array(cd2))]
cd4=pd.DataFrame(data=cd3,index=keys).T
return cd4
cd = df.quotes.apply(lambda x: quote(x))
quotes = pd.concat(list(cd), ignore_index=True)

获得下表:

attachments-2022-12-0y7xubJA6393f9b25ca3d.png


上述方法耗时较长,请教DolphinDB中如何高效实现提取。另外,对于str属性的python表达式,在python中用了eval()函数转化,在DolphinDB中该如何识别?


请先 登录 后评论

2 个回答

wale

csv可用loadText/pLoadText/LoadTextEx直接导入DolphinDB,相关教程参见import_csv.md · dolphindb/Tutorials_CN - Gitee.com

还有个股票行情数据导入实例 ,也可参考一下。


请先 登录 后评论
haaha


可以通过如下看看

def ff(val){
	f=def(r){
	p= parseExpr(r).eval()
	a=peach(d->table(d[`bid_p] as bid,d[`bid_v] as bidv,d[`ask_p] as ask,d[`ask_v] as askv),p)
	i=0
	res=NULL
	for (itb in a){
		itb_=itb
		itb_=itb_.rename!(colNames(itb)+string(i))
		if(isVoid(res)){
			res=itb_
			}
		else{
			res=res join itb_
			}
		i=i+1
	}
	return res}
	a=unionAll(ploop(f,val),0)
	return a.values()
	}

a = "[{'bid_p': 20.28, 'bid_v': 6400, 'ask_p': 20.28, 'ask_v': 6400}, {'bid_v': 28600, 'bid_p': 0.0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}, {'bid_p': 0.0, 'bid_v': 0, 'ask_p': 0.0, 'ask_v': 0}]"
select ff(val) as`bid_p0`bid_v0`ask_p0`ask_v0`bid_p1`bid_v1`ask_p1`ask_v1`bid_p2`bid_v2`ask_p2`ask_v2`bid_p3`bid_v3`ask_p3`ask_v3`bid_p4`bid_v4`ask_p4`ask_v4 from 
table(take(a,10) as val)

attachments-2022-12-40ghZ2cL639be8fcc863d.png

请先 登录 后评论
  • 2 关注
  • 0 收藏,861 浏览
  • Gaxia320 提出于 2022-12-10 11:04

相似问题