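I have a 460 MB CSV text file. Imported with the same schema, it takes up 107 MB in an OLAP-engine database, but the database grows to 1.5 GB when it is imported into a TSDB-engine database. Why does this happen? Is there any way to bring the TSDB storage compression ratio close to that of OLAP?
The table creation statements are as follows: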
// Prepare the table schema
tbSchema=extractTextSchema(fileDir + fileNames["filename"][0], "\t")
update tbSchema set type="SYMBOL" where name in `datasetCode`reporterCode`partnerCode`partner2Code
update tbSchema set type="BOOL" where name in `isOriginalClassification`isQtyEstimated`isAltQtyEstimated`isNetWgtEstimated`isGrossWgtEstimated`isReported`isAggregate
update tbSchema set type="CHAR" where name in `legacyEstimationFlag
update tbSchema set type="SHORT" where name in `refYear`refMonth`period`mosCode`motCode
// Load the text file into an in-memory table
tmpTB=loadText(filename=dataFilePath, delimiter="\t", schema=tbSchema)
// Create the distributed databases
// TSDB
dbTSDB=database("dfs://comTradeTSDB", VALUE, `01`02, , engine="TSDB", atomic="TRANS")
ptTSDB=dbTSDB.createPartitionedTable(tmpTB, `commodity, `cmdCode, , sortColumns=`reporterCode`partnerCode`flowCode`refYear`refMonth`period`freqCode`refPeriodId, keepDuplicates=LAST)
ptTSDB.append!(tmpTB)
flushTSDBCache()
// OLAP
db=database("dfs://comTrade", VALUE, `01`02, , engine="OLAP", atomic="TRANS")
pt=db.createPartitionedTable(tmpTB, `commodity, `cmdCode)
pt.append!(tmpTB)
flushOLAPCache()