How can I load a CSV file that has a datetime column (YYYYMMDD hhmmss) but partition the table by date (YYYYMMDD)?

Suppose we have a CSV file like the one below. How can it be loaded so that the table is partitioned by date (YYYYMMDD)?

date,askO,askH,askL,askC,bidO,bidH,bidL,bidC,source,symbol
2000-01-02 23:31:00,93.825,93.825,93.825,93.825,93.8,93.8,93.8,93.8,trdb,EDH0
2000-01-02 23:32:00,93.825,93.83,93.825,93.83,93.8,93.8,93.8,93.8,trdb,EDH0
2000-01-02 23:40:00,93.83,93.83,93.83,93.83,93.8,93.8,93.8,93.8,trdb,EDH0
2000-01-02 23:44:00,93.83,93.83,93.83,93.83,93.81,93.81,93.81,93.81,trdb,EDH0
2000-01-02 23:45:00,93.83,93.83,93.83,93.83,93.82,93.82,93.82,93.82,trdb,EDH0
2000-01-02 23:48:00,93.875,93.875,93.875,93.875,93.82,93.82,93.82,93.82,trdb,EDH0
2000-01-02 23:49:00,93.835,93.835,93.835,93.835,93.82,93.82,93.82,93.82,trdb,EDH0
2000-01-02 23:55:00,93.83,93.83,93.83,93.83,93.82,93.82,93.82,93.82,trdb,EDH0
2000-01-03 00:00:00,93.83,93.83,93.83,93.83,93.81,93.81,93.81,93.81,trdb,EDH0

Although the column is named "date", it holds full timestamps. We want the table partitioned by "YYYYMMDD" rather than by "YYYY-MM-DD hh:mm:ss".


1 Answer

jinzhi

Assume the CSV file is at "/home/jinzhi/Desktop/test.csv":

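login("admin","123456")
// VALUE-partitioned database: one partition per calendar date in the range
db = database("dfs://tstdb", VALUE, 2000.01.01..2001.01.01)
nameCol = `date`askO`askH`askL`askC`bidO`bidH`bidL`bidC`source`symbol
// The date column is declared as DATE, so it stores the day only
typeCol = `DATE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`DOUBLE`SYMBOL`SYMBOL
schemaTb = table(1:0, nameCol, typeCol)
// Create the partitioned table 'nx', partitioned on the date column
nx = db.createPartitionedTable(schemaTb, 'nx', `date)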

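// Convert the timestamp values in the date column to DATE so they match the table schema
def dataTransform(mutable t) {
	 return t.replaceColumn!(`date, date(t.date))
}

// Load the CSV, applying dataTransform to the data before it is written to the table
pt = loadTextEx(dbHandle=db, tableName=`nx, partitionColumns=`date, filename="/home/jinzhi/Desktop/test.csv", transform=dataTransform)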
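
To check the result, one could count the rows stored under each date. This is only a minimal sketch, assuming the load above succeeded and that the database path "dfs://tstdb" and table name "nx" are unchanged:

```
// Load the table handle and count rows per calendar date
nx = loadTable("dfs://tstdb", "nx")
select count(*) from nx group by date
```

With the nine sample rows from the question, this should report 8 rows for 2000.01.02 and 1 row for 2000.01.03.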