[2021/07/02-15:19:14.504] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.prediction] - ****************************************** [2021/07/02-15:19:14.507] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.prediction] - ** Start train session s6 [2021/07/02-15:19:14.509] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.prediction] - ****************************************** [2021/07/02-15:19:14.514] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 10] Search for split: p=type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true i=5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:19:14.528] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.data] T-UlzLKhyx - [ct: 24] Need to compute sampleId before checking memory cache [2021/07/02-15:19:14.529] [FT-TrainWorkThread-1xCBzFAt-1006] [DEBUG] [dip.shaker.runner] T-UlzLKhyx - [ct: 25] Script settings sampleMax=104857600 processedMax=-1 [2021/07/02-15:19:14.531] [FT-TrainWorkThread-1xCBzFAt-1006] [DEBUG] [dip.shaker.runner] T-UlzLKhyx - [ct: 27] Processing with sampleMax=104857600 processedMax=524288000 [2021/07/02-15:19:14.564] [FT-TrainWorkThread-1xCBzFAt-1006] [DEBUG] [dip.shaker.runner] T-UlzLKhyx - [ct: 60] Computed required sample id : 53bbdeb0a60f0db7b0941b23c90d7a7f-NA-b9689059e8435a4efb7ef594c85e26650--d751713988987e9331980363e24189ce [2021/07/02-15:19:14.568] [FT-TrainWorkThread-1xCBzFAt-1006] [DEBUG] [dku.shaker.cache] T-UlzLKhyx - Shaker MemoryCache get on dataset WALMARTFORECASTING.walmart_features_train_store key=ds=21037e52ffd8857dd34496c62b0fefa4--scr=85bf1bf8b18b630c5a5509b45bde2e2b--samp=53bbdeb0a60f0db7b0941b23c90d7a7f-NA-b9689059e8435a4efb7ef594c85e26650--d751713988987e9331980363e24189ce: hit [2021/07/02-15:19:14.585] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 81] Column Store meaning=LongMeaning fail=0 
[2021/07/02-15:19:14.587] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 83] Column Type meaning=Text fail=0 [2021/07/02-15:19:14.589] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 85] Column Size meaning=LongMeaning fail=0 [2021/07/02-15:19:14.591] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 87] Column Date meaning=DateSource fail=0 [2021/07/02-15:19:14.592] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 88] Column Temperature meaning=DoubleMeaning fail=0 [2021/07/02-15:19:14.594] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 90] Column Fuel_Price meaning=DoubleMeaning fail=0 [2021/07/02-15:19:14.607] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 103] Column MarkDown1 meaning=DoubleMeaning fail=6663 [2021/07/02-15:19:14.608] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 104] Column MarkDown2 meaning=DoubleMeaning fail=7083 [2021/07/02-15:19:14.610] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 106] Column MarkDown3 meaning=DoubleMeaning fail=6722 [2021/07/02-15:19:14.612] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 108] Column MarkDown4 meaning=DoubleMeaning fail=6663 [2021/07/02-15:19:14.614] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 110] Column MarkDown5 meaning=DoubleMeaning fail=6663 [2021/07/02-15:19:14.616] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 112] Column CPI meaning=DoubleMeaning fail=0 [2021/07/02-15:19:14.621] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 117] Column Unemployment meaning=DoubleMeaning fail=0 [2021/07/02-15:19:14.623] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 119] Column IsHoliday 
meaning=Boolean fail=0 [2021/07/02-15:19:14.623] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 119] Column Dept meaning=LongMeaning fail=0 [2021/07/02-15:19:14.627] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.shaker.schema] T-UlzLKhyx - [ct: 123] Column Weekly_Sales meaning=DoubleMeaning fail=0 [2021/07/02-15:19:14.805] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [dku.ml.prediction.split] T-UlzLKhyx - Sorted train/test split: ordering column "Date" is not numeric [2021/07/02-15:19:14.808] [Thread-490] [INFO] [dku.datasets.pull] - pull background thread starting for walmart_features_train_store [2021/07/02-15:19:14.810] [Thread-490] [INFO] [dku.datasets.file] - Building Filesystem handler config: {"connection":"filesystem_managed","path":"WALMARTFORECASTING/walmart_features_train_store","notReadyIfEmpty":false,"filesSelectionRules":{"mode":"ALL","excludeRules":[],"includeRules":[],"explicitFiles":[]}} [2021/07/02-15:19:14.818] [Thread-490] [INFO] [dku.datasets.ftplike] - Enumerating Filesystem dataset prefix= [2021/07/02-15:19:14.821] [Thread-490] [DEBUG] [dku.fs.local] - Enumerating local filesystem prefix=/ [2021/07/02-15:19:14.824] [Thread-490] [DEBUG] [dku.fs.local] - Enumeration done nb_paths=1 size=7234956 [2021/07/02-15:19:14.826] [Thread-490] [INFO] [dku.input.push] - USTP: push selection.method=FULL records=100000 ratio=0.02 col=null [2021/07/02-15:19:14.828] [Thread-490] [INFO] [dku.format] - Extractor run: limit={"maxBytes":-1,"maxRecords":-1,"ordering":{"enabled":false,"rules":[]}} totalRecords=0 [2021/07/02-15:19:14.830] [Thread-490] [INFO] [dku] - getCompression filename=out-s0.csv.gz [2021/07/02-15:19:14.833] [Thread-490] [INFO] [dku] - getCompression filename=out-s0.csv.gz [2021/07/02-15:19:14.835] [Thread-490] [INFO] [dku.format] - Start compressed [GZIP] stream: /home/dataiku/dss/managed_datasets/WALMARTFORECASTING/walmart_features_train_store/out-s0.csv.gz / totalRecsBefore=0 [2021/07/02-15:19:14.837] 
[Thread-490] [INFO] [dku] - getCompression filename=out-s0.csv.gz [2021/07/02-15:19:14.840] [Thread-490] [INFO] [dku] - getCompression filename=out-s0.csv.gz [2021/07/02-15:19:18.455] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 100000 lines from file, 16 columns - interned: 12 MEM: 100.0% [2021/07/02-15:19:22.960] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 200000 lines from file, 16 columns - interned: 23 MEM: 100.0% [2021/07/02-15:19:23.001] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dip.sorter.chunk] T-UlzLKhyx - Spilling chunk. used=134218566 [2021/07/02-15:19:25.415] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 300000 lines from file, 16 columns - interned: 118 MEM: 100.0% [2021/07/02-15:19:26.521] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 400000 lines from file, 16 columns - interned: 139 MEM: 100.0% [2021/07/02-15:19:26.699] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dip.sorter.chunk] T-UlzLKhyx - Spilling chunk. used=134218508 [2021/07/02-15:19:28.651] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 500000 lines from file, 16 columns - interned: 228 MEM: 100.0% [2021/07/02-15:19:32.027] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 600000 lines from file, 16 columns - interned: 309 MEM: 100.0% [2021/07/02-15:19:32.112] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dip.sorter.chunk] T-UlzLKhyx - Spilling chunk. used=134217762 [2021/07/02-15:19:36.185] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 700000 lines from file, 16 columns - interned: 387 MEM: 100.0% [2021/07/02-15:19:42.248] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 800000 lines from file, 16 columns - interned: 394 MEM: 100.0% [2021/07/02-15:19:42.309] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dip.sorter.chunk] T-UlzLKhyx - Spilling chunk. 
used=134218116 [2021/07/02-15:19:46.345] [Thread-490] [INFO] [dku.format.csv] - CSV Emitted 900000 lines from file, 16 columns - interned: 404 MEM: 100.0% [2021/07/02-15:19:48.057] [Thread-490] [INFO] [dku.format] - after stream totalComp=7234956 totalUncomp=63265091 totalRec=960004 [2021/07/02-15:19:48.059] [Thread-490] [INFO] [dku.format] - Extractor run done, totalCompressed=7234956 totalRecords=960004 [2021/07/02-15:19:48.061] [Thread-490] [DEBUG] [dku.datasets.pull] - pull background thread: ending queue, cursize=630 [2021/07/02-15:19:48.062] [Thread-490] [INFO] [dku.datasets.pull] - pull background thread finished for walmart_features_train_store [2021/07/02-15:19:48.064] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.datasets.pull] T-UlzLKhyx - End of stream reached [2021/07/02-15:19:48.066] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dip.sorter.chunk] T-UlzLKhyx - Spilling chunk. used=98387550 [2021/07/02-15:19:52.561] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=1 column=Temperature type=DOUBLE val=2010-02-05 [2021/07/02-15:19:52.569] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=2 column=Temperature type=DOUBLE val=2010-02-05 [2021/07/02-15:19:52.573] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=3 column=Temperature type=DOUBLE val=2010-02-12 [2021/07/02-15:19:52.577] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=4 column=Temperature type=DOUBLE val=2010-02-12 [2021/07/02-15:19:52.579] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=5 column=Temperature type=DOUBLE val=2010-02-19 [2021/07/02-15:19:52.582] [FT-TrainWorkThread-1xCBzFAt-1006] 
[WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=6 column=Temperature type=DOUBLE val=2010-02-19 [2021/07/02-15:19:52.583] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=7 column=Temperature type=DOUBLE val=2010-02-26 [2021/07/02-15:19:52.585] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=8 column=Temperature type=DOUBLE val=2010-02-26 [2021/07/02-15:19:52.588] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=9 column=Temperature type=DOUBLE val=2010-03-05 [2021/07/02-15:19:52.589] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=10 column=Temperature type=DOUBLE val=2010-03-05 [2021/07/02-15:19:52.591] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=11 column=Temperature type=DOUBLE val=2010-03-12 [2021/07/02-15:19:52.592] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=12 column=Temperature type=DOUBLE val=2010-03-12 [2021/07/02-15:19:52.594] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=13 column=Temperature type=DOUBLE val=2010-03-19 [2021/07/02-15:19:52.596] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=14 column=Temperature type=DOUBLE val=2010-03-19 [2021/07/02-15:19:52.599] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=15 column=Temperature type=DOUBLE val=2010-03-26 
[2021/07/02-15:19:52.601] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=16 column=Temperature type=DOUBLE val=2010-03-26 [2021/07/02-15:19:52.603] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=17 column=Temperature type=DOUBLE val=2010-04-02 [2021/07/02-15:19:52.604] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=18 column=Temperature type=DOUBLE val=2010-04-02 [2021/07/02-15:19:52.606] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=19 column=Temperature type=DOUBLE val=2010-04-09 [2021/07/02-15:20:14.863] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.ml.prediction.split] T-UlzLKhyx - [ct: 60359] Sorted train/test split: threshold = 6 [2021/07/02-15:20:14.864] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=1 column=Temperature type=DOUBLE val=2010-04-02 [2021/07/02-15:20:14.866] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=2 column=Temperature type=DOUBLE val=2013-07-19 [2021/07/02-15:20:14.867] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=2 column=Fuel_Price type=DOUBLE val=FALSE [2021/07/02-15:20:14.868] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=3 column=Temperature type=DOUBLE val=2010-04-09 [2021/07/02-15:20:14.869] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=4 column=Temperature type=DOUBLE 
val=2013-07-26 [2021/07/02-15:20:14.870] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=4 column=Fuel_Price type=DOUBLE val=FALSE [2021/07/02-15:20:14.871] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=5 column=Temperature type=DOUBLE val=2010-04-16 [2021/07/02-15:20:14.872] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=6 column=Temperature type=DOUBLE val=2010-02-05 [2021/07/02-15:20:14.873] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=7 column=Temperature type=DOUBLE val=2010-04-23 [2021/07/02-15:20:14.874] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=8 column=Temperature type=DOUBLE val=2010-02-12 [2021/07/02-15:20:14.876] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=9 column=Temperature type=DOUBLE val=2010-04-30 [2021/07/02-15:20:14.878] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=10 column=Temperature type=DOUBLE val=2010-02-19 [2021/07/02-15:20:14.880] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=11 column=Temperature type=DOUBLE val=2010-05-07 [2021/07/02-15:20:14.881] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=12 column=Temperature type=DOUBLE val=2010-02-26 [2021/07/02-15:20:14.882] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: 
file=unk line=13 column=Temperature type=DOUBLE val=2010-05-14 [2021/07/02-15:20:14.883] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=14 column=Temperature type=DOUBLE val=2010-03-05 [2021/07/02-15:20:14.884] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=15 column=Temperature type=DOUBLE val=2010-05-21 [2021/07/02-15:20:14.885] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=16 column=Temperature type=DOUBLE val=2010-03-12 [2021/07/02-15:20:14.886] [FT-TrainWorkThread-1xCBzFAt-1006] [WARN] [com.dataiku.dip.output.CSVSerializer] T-UlzLKhyx - OUTPUT_DATA_BAD_FLOAT: file=unk line=17 column=Temperature type=DOUBLE val=2010-05-28 [2021/07/02-15:20:16.735] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62231] Checking if splits are up to date. 
Policy: type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true, instance id: 5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.736] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62232] Search for split: p=type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true i=5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.741] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62237] Search for split: p=type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true i=5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.744] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62240] Checking if splits are up to date. Policy: type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true, instance id: 5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.746] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62242] Search for split: p=type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true i=5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.748] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.splits] T-UlzLKhyx - [ct: 62244] Search for split: p=type=SPLIT_SINGLE_DATASET,split=SORTED,splitBeforePrepare=true,ds=walmart_features_train_store,sel=(method=full),r=0.8,c=Date,ascending=true i=5ba06e2e4e6f28bf7a44b9d2143afc37-1 [2021/07/02-15:20:16.753] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.ml.python] T-UlzLKhyx - Joining processing thread ... 
[2021/07/02-15:20:16.761] [MRT-1009] [INFO] [dku.analysis.ml.python] - Running a preprocessing set: pp2 in /home/dataiku/dss/analysis-data/WALMARTFORECASTING/qQyhv580/UlzLKhyx/sessions/s6/pp2 [2021/07/02-15:20:16.761] [MRT-1009] [INFO] [dku.block.link] - Started a socket on port 41458 [2021/07/02-15:20:16.762] [MRT-1009] [INFO] [dku.ml.kernel] - Writing output of python-single-command-kernel to /home/dataiku/dss/analysis-data/WALMARTFORECASTING/qQyhv580/UlzLKhyx/sessions/s6/pp2/train.log [2021/07/02-15:20:16.762] [MRT-1009] [INFO] [dku.code.envs.resolution] - Executing Python activity in builtin env [2021/07/02-15:20:16.762] [MRT-1009] [WARN] [dku.code.projectLibs] - External libraries file not found: /home/dataiku/dss/config/projects/WALMARTFORECASTING/lib/external-libraries.json [2021/07/02-15:20:16.762] [MRT-1009] [INFO] [dku.code.projectLibs] - EXTERNAL LIBS FROM WALMARTFORECASTING is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [2021/07/02-15:20:16.763] [MRT-1009] [INFO] [dku.code.projectLibs] - chunkFolder is /home/dataiku/dss/config/projects/WALMARTFORECASTING/lib/R [2021/07/02-15:20:16.763] [MRT-1009] [INFO] [dku.python.single_command.kernel] - Starting Python process for kernel python-single-command-kernel [2021/07/02-15:20:16.763] [MRT-1009] [INFO] [dip.tickets] - Creating API ticket for analysis-ml-WALMARTFORECASTING-kJOC8n7 on behalf of admin id=analysis-ml-WALMARTFORECASTING-kJOC8n7_p7Fk5contdrf [2021/07/02-15:20:16.763] [MRT-1009] [INFO] [dku.security.process] - Starting process (regular) [2021/07/02-15:20:16.853] [MRT-1009] [INFO] [dku.security.process] - Process started with pid=5252 [2021/07/02-15:20:16.854] [MRT-1009] [INFO] [dku.processes.cgroups] - Will use cgroups [] [2021/07/02-15:20:16.854] [MRT-1009] [INFO] [dku.processes.cgroups] - Applying rules to used cgroups: [] [2021/07/02-15:20:16.988] [KNL-python-single-command-kernel-monitor-1017] [INFO] [dku.resourceusage] - Reporting start of 
CRU:{"context":{"type":"ANALYSIS_ML_TRAIN","authIdentifier":"admin","projectKey":"WALMARTFORECASTING","analysisId":"qQyhv580","mlTaskId":"UlzLKhyx","sessionId":"s6"},"type":"LOCAL_PROCESS","id":"JN2qH3MWqmNbz1Kk","startTime":1625239216988,"localProcess":{"cpuCurrent":0.0}} [2021/07/02-15:20:17.045] [process-resource-monitor-5252-1021] [DEBUG] [dku.resource] - Process stats for pid 5252: {"pid":5252,"commandName":"/home/dataiku/dss/bin/python","cpuUserTimeMS":110,"cpuSystemTimeMS":10,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":0,"cpuTotalMS":120,"cpuCurrent":0.0,"vmSizeMB":192,"vmRSSMB":12,"vmHWMMB":12,"vmRSSAnonMB":7,"vmDataMB":7,"vmSizePeakMB":192,"vmRSSPeakMB":12,"vmRSSTotalMBS":0,"majorFaults":0,"childrenMajorFaults":0} Installing debugging signal handler [2021/07/02-15:20:19.696] [MRT-1009] [INFO] [dku.link.secret_protected] - Connected to kernel [2021/07/02-15:20:19.698] [MRT-1009] [INFO] [dku.block.link.interaction] - Execute link command respClazz=true respTypeToken=false respIsString=false is=false asyncInputStream=false os=false [2021-07-02 15:20:19,695] [5252/MainThread] [INFO] [dataiku.base.socket_block_link] Connecting to localhost (127.0.0.1) at port 41458 [2021-07-02 15:20:19,696] [5252/MainThread] [INFO] [dataiku.base.socket_block_link] Connected to localhost (127.0.0.1) at port 41458 [2021-07-02 15:20:21,093] [5252/MainThread] [INFO] [dataiku.doctor.utils.dku_pickle] Setting cloudpickle as the pickling tool /home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/dkuapi.py:16: DeprecationWarning: inspect.getargspec() is deprecated since Python 3.0, use inspect.signature() or inspect.getfullargspec() argspec = inspect.getargspec(api) [2021-07-02 15:20:21,246] [5252/MainThread] [INFO] [root] Running analysis command: train_prediction_models_nosave [2021-07-02 15:20:21,378] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: DatasetSanityCheckDiagnostic of type 
DiagnosticType.ML_DIAGNOSTICS_DATASET_SANITY_CHECKS [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: ClassifierAccuracyCheckDiagnostic of type DiagnosticType.ML_DIAGNOSTICS_MODEL_CHECK [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: RegressionR2CheckDiagnostic of type DiagnosticType.ML_DIAGNOSTICS_MODEL_CHECK [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: LeakageDiagnostic of type DiagnosticType.ML_DIAGNOSTICS_LEAKAGE_DETECTION [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: TreeOverfitDiagnostic of type DiagnosticType.ML_DIAGNOSTICS_TRAINING_OVERFIT [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.diagnostics.diagnostics] enabling diagnostic callback: MLAssertionsDiagnostic of type DiagnosticType.ML_DIAGNOSTICS_ML_ASSERTIONS [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.commands] PPS is {'target_remapping': [], 'skipPreprocessing': False, 'per_feature': {'Temperature': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Dept': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 
0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Size': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'REJECT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Store': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'REJECT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Fuel_Price': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Unemployment': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 
'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Date': {'category_handling': 'DUMMIFY', 'missing_handling': 'DROP_ROW', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DateSource'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown5': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Type': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'REJECT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'Text'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown3': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 
'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown4': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown1': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Weekly_Sales': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'TARGET', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown2': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 
'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'IsHoliday': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'Boolean'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'CPI': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}}, 'reduce': {'enabled': False, 'kept_variance': 0.0}, 'feature_generation': {'pairwise_linear': {'behavior': 'DISABLED'}, 'polynomial_combinations': {'behavior': 'DISABLED'}, 'manual_interactions': {'interactions': []}, 'numericals_clustering': {'k': 0, 'all_features': False, 'input_features': [], 'behavior': 'DISABLED'}, 'categoricals_count_transformer': {'all_features': False, 'input_features': [], 'behavior': 'DISABLED'}}, 'feature_selection_params': {'method': 'NONE', 'random_forest_params': {'n_trees': 30, 'depth': 10, 'n_features': 25}, 'lasso_params': {'alpha': [0.01, 0.1, 1.0, 10.0, 100.0], 
'cross_validate': True}, 'pca_params': {'n_features': 25, 'variance_proportion': 0.9}, 'correlation_params': {'min_abs_correlation': 0.0, 'max_abs_correlation': 1.0, 'n_features': 25}, 'custom_params': {'code': '# type your code here'}}, 'preprocessingFitSampleRatio': 1.0, 'preprocessingFitSampleSeed': 1337} [2021-07-02 15:20:21,379] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] START - Loading train set [2021-07-02 15:20:21,380] [5252/MainThread] [INFO] [root] Reading with dtypes: None [2021-07-02 15:20:21,380] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Store: None (schema_type=bigint feature_type=NUMERIC feature_role=REJECT) [2021-07-02 15:20:21,380] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Type: None (schema_type=string feature_type=CATEGORY feature_role=REJECT) [2021-07-02 15:20:21,380] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Size: None (schema_type=bigint feature_type=NUMERIC feature_role=REJECT) [2021-07-02 15:20:21,380] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Date: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Temperature: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Fuel_Price: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown1: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown2: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown3: str (schema_type=string feature_type=CATEGORY 
feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown4: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown5: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for CPI: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Unemployment: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for IsHoliday: str (schema_type=boolean feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Dept: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Weekly_Sales: (schema_type=double feature_type=NUMERIC feature_role=TARGET) [2021-07-02 15:20:21,381] [5252/MainThread] [INFO] [root] Reading with FIXED dtypes: {'Date': 'str', 'Temperature': , 'Fuel_Price': , 'MarkDown1': 'str', 'MarkDown2': 'str', 'MarkDown3': 'str', 'MarkDown4': 'str', 'MarkDown5': 'str', 'CPI': , 'Unemployment': , 'IsHoliday': 'str', 'Dept': , 'Weekly_Sales': } [2021-07-02 15:20:25,833] [5252/MainThread] [INFO] [root] Loaded table [2021-07-02 15:20:25,917] [5252/MainThread] [INFO] [dataiku.doctor.utils] Coercion done [2021-07-02 15:20:25,917] [5252/MainThread] [INFO] [dataiku.doctor.utils.split] Loaded train df: shape=(768003,16) [2021-07-02 15:20:25,917] [5252/MainThread] [INFO] [dataiku.doctor.commands] Checking that the train set is sorted by 'Date' [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Store (int64) 
[2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Type (object) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Size (int64) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Date (object) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Temperature (float64) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Fuel_Price (float64) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : MarkDown1 (object) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : MarkDown2 (object) [2021-07-02 15:20:26,122] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : MarkDown3 (object) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : MarkDown4 (object) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : MarkDown5 (object) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : CPI (float64) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Unemployment (float64) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : IsHoliday (object) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Dept (float64) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.commands] Train col : Weekly_Sales (float64) [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] END - Loading train set [2021-07-02 15:20:26,123] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] START - Loading test set [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [root] Reading with dtypes: None [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for 
Store: None (schema_type=bigint feature_type=NUMERIC feature_role=REJECT) [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Type: None (schema_type=string feature_type=CATEGORY feature_role=REJECT) [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Size: None (schema_type=bigint feature_type=NUMERIC feature_role=REJECT) [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Date: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,139] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Temperature: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Fuel_Price: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown1: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown2: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown3: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown4: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for MarkDown5: str (schema_type=string feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for CPI: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed 
dtype for Unemployment: (schema_type=double feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for IsHoliday: str (schema_type=boolean feature_type=CATEGORY feature_role=INPUT) [2021-07-02 15:20:26,140] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Dept: (schema_type=bigint feature_type=NUMERIC feature_role=INPUT) [2021-07-02 15:20:26,141] [5252/MainThread] [INFO] [dataiku.doctor.utils] Computed dtype for Weekly_Sales: (schema_type=double feature_type=NUMERIC feature_role=TARGET) [2021-07-02 15:20:26,141] [5252/MainThread] [INFO] [root] Reading with FIXED dtypes: {'Date': 'str', 'Temperature': , 'Fuel_Price': , 'MarkDown1': 'str', 'MarkDown2': 'str', 'MarkDown3': 'str', 'MarkDown4': 'str', 'MarkDown5': 'str', 'CPI': , 'Unemployment': , 'IsHoliday': 'str', 'Dept': , 'Weekly_Sales': } [2021-07-02 15:20:26,875] [5252/MainThread] [INFO] [root] Loaded table [2021-07-02 15:20:26,877] [5252/MainThread] [INFO] [dataiku.doctor.utils] Coercion done [2021-07-02 15:20:26,877] [5252/MainThread] [INFO] [dataiku.doctor.utils.split] Loaded test df: shape=(192001,16) [2021-07-02 15:20:26,877] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] END - Loading test set [2021-07-02 15:20:26,931] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] START - Collecting statistics [2021-07-02 15:20:26,934] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Temperature... (type=NUMERIC) [2021-07-02 15:20:26,934] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Checking series of type: float64 (isM8=False) [2021-07-02 15:20:27,061] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Dept... 
(type=NUMERIC) [2021-07-02 15:20:27,061] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Checking series of type: float64 (isM8=False) [2021-07-02 15:20:27,159] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Size... (type=NUMERIC) [2021-07-02 15:20:27,159] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Store... (type=NUMERIC) [2021-07-02 15:20:27,159] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Fuel_Price... (type=NUMERIC) [2021-07-02 15:20:27,159] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Checking series of type: float64 (isM8=False) [2021-07-02 15:20:27,252] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Unemployment... (type=NUMERIC) [2021-07-02 15:20:27,252] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Checking series of type: float64 (isM8=False) [2021-07-02 15:20:27,333] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Date... (type=CATEGORY) [2021-07-02 15:20:27,752] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at MarkDown5... (type=CATEGORY) [2021-07-02 15:20:28,003] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Type... (type=CATEGORY) [2021-07-02 15:20:28,003] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at MarkDown3... (type=CATEGORY) [2021-07-02 15:20:28,281] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at MarkDown4... (type=CATEGORY) [2021-07-02 15:20:28,531] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at MarkDown1... (type=CATEGORY) [2021-07-02 15:20:28,904] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at Weekly_Sales... (type=NUMERIC) [2021-07-02 15:20:28,904] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at MarkDown2... 
(type=CATEGORY) [2021-07-02 15:20:29,273] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at IsHoliday... (type=CATEGORY) [2021-07-02 15:20:29,605] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Looking at CPI... (type=NUMERIC) [2021-07-02 15:20:29,605] [5252/MainThread] [INFO] [dataiku.doctor.preprocessing_collector] Checking series of type: float64 (isM8=False) [2021-07-02 15:20:29,704] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] END - Collecting statistics [2021-07-02 15:20:29,705] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] generating interactions [2021-07-02 15:20:29,705] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] {'target_remapping': [], 'skipPreprocessing': False, 'per_feature': {'Temperature': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Dept': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Size': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'REJECT', 'type': 'NUMERIC', 
'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Store': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'REJECT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'LongMeaning'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Fuel_Price': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Unemployment': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Date': {'category_handling': 'DUMMIFY', 'missing_handling': 'DROP_ROW', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 
'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DateSource'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown5': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Type': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'REJECT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'Text'}, 'autoReason': 'REJECT_ZERO_VARIANCE', 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown3': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown4': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 
'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown1': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'Weekly_Sales': {'generate_derivative': False, 'impute_constant_value': 0.0, 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'TARGET', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'MarkDown2': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'IsHoliday': {'category_handling': 'DUMMIFY', 'missing_handling': 'NONE', 
'missing_impute_with': 'MODE', 'dummy_clip': 'MAX_NB_CATEGORIES', 'cumulative_proportion': 0.95, 'min_samples': 10, 'max_nb_categories': 100, 'max_cat_safety': 200, 'nb_bins_hashing': 1048576, 'hash_whole_categories': True, 'dummy_drop': 'NONE', 'role': 'INPUT', 'type': 'CATEGORY', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'Boolean'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}, 'CPI': {'generate_derivative': False, 'numerical_handling': 'REGULAR', 'missing_handling': 'IMPUTE', 'missing_impute_with': 'MEAN', 'impute_constant_value': 0.0, 'rescaling': 'AVGSTD', 'quantile_bin_nb_bins': 4, 'binarize_threshold_mode': 'MEDIAN', 'binarize_constant_threshold': 0.0, 'role': 'INPUT', 'type': 'NUMERIC', 'state': {'userModified': False, 'autoModifiedByDSS': False, 'recordedMeaning': 'DoubleMeaning'}, 'customHandlingCode': '', 'customProcessorWantsMatrix': False, 'sendToInput': 'main'}}, 'reduce': {'enabled': False, 'kept_variance': 0.0}, 'feature_generation': {'pairwise_linear': {'behavior': 'DISABLED'}, 'polynomial_combinations': {'behavior': 'DISABLED'}, 'manual_interactions': {'interactions': []}, 'numericals_clustering': {'k': 0, 'all_features': False, 'input_features': [], 'behavior': 'DISABLED'}, 'categoricals_count_transformer': {'all_features': False, 'input_features': [], 'behavior': 'DISABLED'}}, 'feature_selection_params': {'method': 'NONE', 'random_forest_params': {'n_trees': 30, 'depth': 10, 'n_features': 25}, 'lasso_params': {'alpha': [0.01, 0.1, 1.0, 10.0, 100.0], 'cross_validate': True}, 'pca_params': {'n_features': 25, 'variance_proportion': 0.9}, 'correlation_params': {'min_abs_correlation': 0.0, 'max_abs_correlation': 1.0, 'n_features': 25}, 'custom_params': {'code': '# type your code here'}}, 'preprocessingFitSampleRatio': 1.0, 'preprocessingFitSampleSeed': 1337} [2021-07-02 15:20:29,705] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] No feature selection to perform 
[2021-07-02 15:20:29,706] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] START - Preprocessing train set [2021-07-02 15:20:29,707] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] Set MF index len 768003 [2021-07-02 15:20:29,707] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RemapValueToOutput [2021-07-02 15:20:29,708] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 15:20:29,709] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {'Temperature': 60.06713962085087, 'Dept': 44.26031738501317, 'Fuel_Price': 4673.854896874022, 'Unemployment': 7.957296073239719, 'CPI': 171.21886315845632} [2021-07-02 15:20:29,812] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RescalingProcessor2 (Temperature) [2021-07-02 15:20:29,818] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Temperature (avg=60.06713962015554 std=13.699755339670869 shift=60.06713962085087 inv_scale=0.05419290835459004) [2021-07-02 15:20:30,423] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Temperature (avg=2.3455553297302636e-13 std=0.7424295856031602) nulls=0 [2021-07-02 15:20:30,424] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RescalingProcessor2 (Dept) [2021-07-02 15:20:30,431] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Dept (avg=44.26031738508244 std=22.591228003603742 shift=44.26031738501317 inv_scale=0.03279542924473246) [2021-07-02 15:20:30,492] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Dept (avg=4.12777862539211e-16 std=0.7408890195440484) nulls=0 [2021-07-02 15:20:30,492] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RescalingProcessor2 (Fuel_Price) [2021-07-02 15:20:30,513] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Fuel_Price (avg=4673.854896875983 std=11389.574328305005 shift=4673.854896874022 inv_scale=8.349123598202651e-05) 
[2021-07-02 15:20:30,562] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Fuel_Price (avg=-2.5958761442888288e-14 std=0.9509296379790828) nulls=0 [2021-07-02 15:20:30,563] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RescalingProcessor2 (Unemployment) [2021-07-02 15:20:30,581] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Unemployment (avg=7.957296073299083 std=1.3825117620724632 shift=7.957296073239719 inv_scale=0.5366438386820443) [2021-07-02 15:20:30,641] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Unemployment (avg=-9.206591075905721e-13 std=0.7419164190207665) nulls=0 [2021-07-02 15:20:30,641] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RescalingProcessor2 (CPI) [2021-07-02 15:20:30,658] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale CPI (avg=171.2188631587484 std=29.058705791059054 shift=171.21886315845632 inv_scale=0.02553164013412434) [2021-07-02 15:20:30,726] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled CPI (avg=-1.0199665234048015e-12 std=0.7419164190206489) nulls=0 [2021-07-02 15:20:30,727] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FlushDFBuilder(num_flagonly) [2021-07-02 15:20:30,728] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:SingleColumnDropNARows (Date) [2021-07-02 15:20:30,835] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Deleting 0 rows [2021-07-02 15:20:30,846] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] MultiFrame, dropping rows: [] [2021-07-02 15:20:53,620] [5252/MainThread] [INFO] [dku.ml.preprocessing] After SCDNA input_df=(768003, 16) [2021-07-02 15:20:53,622] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (Date) [2021-07-02 15:20:54,424] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 103) nnz=768003 [2021-07-02 15:20:54,482] [5252/MainThread] [DEBUG] 
[dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown5) [2021-07-02 15:20:55,263] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 102) nnz=768003 [2021-07-02 15:20:55,305] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown3) [2021-07-02 15:20:55,941] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 102) nnz=768003 [2021-07-02 15:20:55,959] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown4) [2021-07-02 15:20:56,443] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 102) nnz=768003 [2021-07-02 15:20:56,458] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown1) [2021-07-02 15:20:56,862] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 102) nnz=768003 [2021-07-02 15:20:56,895] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown2) [2021-07-02 15:20:57,614] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 102) nnz=768003 [2021-07-02 15:20:57,653] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FastSparseDummifyProcessor (IsHoliday) [2021-07-02 15:20:58,708] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(768003, 4) nnz=768003 [2021-07-02 15:20:58,829] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 15:20:58,829] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {} [2021-07-02 15:20:58,830] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FlushDFBuilder(cat_flagpresence) [2021-07-02 15:20:58,830] 
[5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 15:20:58,830] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {} [2021-07-02 15:20:58,830] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:FlushDFBuilder(interaction) [2021-07-02 15:20:58,830] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:RealignTarget [2021-07-02 15:20:58,830] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Realign target series = (768003,) [2021-07-02 15:21:01,370] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] After realign target: (768003,) [2021-07-02 15:21:01,370] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:DropRowsWhereNoTarget [2021-07-02 15:21:01,401] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Deleting 346433 rows because one of ['target'] is missing [2021-07-02 15:21:01,401] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MF before = (768003, 622) [2021-07-02 15:21:01,402] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] target before = (768003,) [2021-07-02 15:21:01,766] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] MultiFrame, dropping rows: [ 0 1 2 ... 
768000 768001 768002] [2021-07-02 15:21:12,395] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] After DRWNT input_df=(421570, 16) [2021-07-02 15:21:12,395] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MF after = (421570, 622) [2021-07-02 15:21:12,395] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] target after = (768003,) [2021-07-02 15:21:12,395] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:DumpPipelineState [2021-07-02 15:21:12,395] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] ********* Pipeline state (Before feature selection) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] input_df= (421570, 16) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] current_mf=(421570, 622) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PPR: [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] target = ((421570,)) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:EmitCurrentMFAsResult [2021-07-02 15:21:12,396] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] Set MF index len 421570 [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] FIT/PROCESS WITH Step:DumpPipelineState [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] ********* Pipeline state (At end) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] input_df= (421570, 16) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] current_mf=(0, 0) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PPR: [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] target = ((421570,)) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] TRAIN = ((421570, 622)) [2021-07-02 15:21:12,396] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] UNPROCESSED = ((421570, 16)) [2021-07-02 15:21:12,743] 
[5252/MainThread] [INFO] [dataiku.doctor.utils.listener] END - Preprocessing train set [2021-07-02 15:21:12,778] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] START - Preprocessing test set [2021-07-02 15:21:12,806] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] Set MF index len 192001 [2021-07-02 15:21:12,807] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RemapValueToOutput [2021-07-02 15:21:13,005] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 15:21:13,005] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {'Temperature': 60.06713962085087, 'Dept': 44.26031738501317, 'Fuel_Price': 4673.854896874022, 'Unemployment': 7.957296073239719, 'CPI': 171.21886315845632} [2021-07-02 15:21:13,111] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RescalingProcessor2 (Temperature) [2021-07-02 15:21:13,140] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Temperature (avg=90.58905866645983 std=2215.272360982173 shift=60.06713962085087 inv_scale=0.05419290835459004) [2021-07-02 15:21:13,245] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Temperature (avg=1.6540715616565018 std=120.05205203916321) nulls=0 [2021-07-02 15:21:13,245] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RescalingProcessor2 (Dept) [2021-07-02 15:21:13,247] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Dept (avg=44.26031738514264 std=1.29475434484499e-10 shift=44.26031738501317 inv_scale=0.03279542924473246) [2021-07-02 15:21:13,259] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Dept (avg=0.0 std=0.0) nulls=0 [2021-07-02 15:21:13,259] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RescalingProcessor2 (Fuel_Price) [2021-07-02 15:21:13,261] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Fuel_Price (avg=19203.599764907973 std=27043.401985536228 shift=4673.854896874022 
inv_scale=8.349123598202651e-05) [2021-07-02 15:21:13,269] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Fuel_Price (avg=1.213106357535348 std=2.2578870569311436) nulls=0 [2021-07-02 15:21:13,270] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RescalingProcessor2 (Unemployment) [2021-07-02 15:21:13,274] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale Unemployment (avg=7.9572960732191955 std=2.0523192189615017e-11 shift=7.957296073239719 inv_scale=0.5366438386820443) [2021-07-02 15:21:13,282] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled Unemployment (avg=0.0 std=0.0) nulls=0 [2021-07-02 15:21:13,282] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RescalingProcessor2 (CPI) [2021-07-02 15:21:13,284] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescale CPI (avg=171.21886315870447 std=2.481505912598312e-10 shift=171.21886315845632 inv_scale=0.02553164013412434) [2021-07-02 15:21:13,297] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Rescaled CPI (avg=0.0 std=0.0) nulls=0 [2021-07-02 15:21:13,297] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FlushDFBuilder(num_flagonly) [2021-07-02 15:21:13,297] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:SingleColumnDropNARows (Date) [2021-07-02 15:21:13,310] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Deleting 0 rows [2021-07-02 15:21:13,311] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] MultiFrame, dropping rows: [] [2021-07-02 15:21:14,703] [5252/MainThread] [INFO] [dku.ml.preprocessing] After SCDNA input_df=(192001, 16) [2021-07-02 15:21:14,704] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (Date) [2021-07-02 15:21:14,875] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 103) nnz=192001 [2021-07-02 15:21:14,879] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor 
(MarkDown5) [2021-07-02 15:21:15,026] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 102) nnz=192001 [2021-07-02 15:21:15,030] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown3) [2021-07-02 15:21:15,283] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 102) nnz=192001 [2021-07-02 15:21:15,290] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown4) [2021-07-02 15:21:15,636] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 102) nnz=192001 [2021-07-02 15:21:15,654] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown1) [2021-07-02 15:21:15,947] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 102) nnz=192001 [2021-07-02 15:21:15,954] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (MarkDown2) [2021-07-02 15:21:16,312] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 102) nnz=192001 [2021-07-02 15:21:16,327] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FastSparseDummifyProcessor (IsHoliday) [2021-07-02 15:21:16,640] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Dummifier: Append a sparse block shape=(192001, 4) nnz=192001 [2021-07-02 15:21:16,671] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 15:21:16,672] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {} [2021-07-02 15:21:16,672] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FlushDFBuilder(cat_flagpresence) [2021-07-02 15:21:16,673] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:MultipleImputeMissingFromInput [2021-07-02 
15:21:16,673] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MIMIFI: Imputing with map {} [2021-07-02 15:21:16,673] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:FlushDFBuilder(interaction) [2021-07-02 15:21:16,673] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:RealignTarget [2021-07-02 15:21:16,673] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Realign target series = (192001,) [2021-07-02 15:21:16,762] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] After realign target: (192001,) [2021-07-02 15:21:16,762] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] PROCESS WITH Step:DropRowsWhereNoTarget [2021-07-02 15:21:16,809] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] Deleting 192001 rows because one of ['target'] is missing [2021-07-02 15:21:16,810] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] MF before = (192001, 622) [2021-07-02 15:21:16,810] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] target before = (192001,) [2021-07-02 15:21:16,880] [5252/MainThread] [INFO] [dataiku.doctor.multiframe] MultiFrame, dropping rows: [ 0 1 2 ... 
191998 191999 192000]
[2021-07-02 15:21:17,203] [5252/MainThread] [DEBUG] [dku.ml.preprocessing] After DRWNT input_df=(0, 16)
[2021-07-02 15:21:17,203] [5252/MainThread] [INFO] [dataiku.doctor.utils.listener] END - Preprocessing test set
[2021/07/02-15:21:17.424] [MRT-1009] [INFO] [dku.block.link.interaction] - Check result for nullity exceptionIfNull=true result=null
Traceback (most recent call last):
  File "/home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/server.py", line 46, in serve
    ret = api_command(arg)
  File "/home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/dkuapi.py", line 45, in aux
    return api(**kwargs)
  File "/home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/commands.py", line 308, in train_prediction_models_nosave
    transformed_test = pipeline.process(test_df)
  File "/home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/preprocessing/dataframe_preprocessing.py", line 2048, in process
    new_mf = step.process(input_df, cur_mf, result, self.generated_features_mapping)
  File "/home/dataiku/dataiku-dss-9.0.1/python/dataiku/doctor/preprocessing/dataframe_preprocessing.py", line 243, in process
    self.column_names))
dataiku.doctor.preprocessing.dataframe_preprocessing.DkuDroppedMultiframeException: ['target'] values all empty or with unknown classes (you may need to recompute the training set)
[2021-07-02 15:21:17,635] [5252/MainThread] [INFO] [dataiku.base.socket_block_link] Client closed
[2021/07/02-15:21:17.985] [MRT-1009] [INFO] [dku.kernels] - Getting kernel tail
[2021/07/02-15:21:17.995] [MRT-1009] [INFO] [dku.kernels] - Trying to enrich exception: com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to train : : ['target'] values all empty or with unknown classes (you may need to recompute the training set) from kernel com.dataiku.dip.analysis.coreservices.AnalysisMLKernel@1967e25c process=com.dataiku.dip.security.process.RegularProcess@27a2b688 pid=5252 retcode=null
[2021/07/02-15:21:17.995] [MRT-1009] [WARN] [dku.analysis.ml.python] - Training failed
com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to train : : ['target'] values all empty or with unknown classes (you may need to recompute the training set)
    at com.dataiku.dip.io.SocketBlockLinkInteraction.throwExceptionFromPython(SocketBlockLinkInteraction.java:302)
    at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.checkException(SocketBlockLinkInteraction.java:215)
    at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.get(SocketBlockLinkInteraction.java:190)
    at com.dataiku.dip.io.SingleCommandKernelLink$1.call(SingleCommandKernelLink.java:208)
    at com.dataiku.dip.analysis.ml.prediction.PredictionTrainAdditionalThread.process(PredictionTrainAdditionalThread.java:78)
    at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:161)
[2021/07/02-15:21:18.028] [MRT-1009] [WARN] [dku.kernels] - Killing kernel python-single-command-kernel
[2021/07/02-15:21:18.028] [MRT-1009] [INFO] [dku.security.process] - Killing process PID: 5252
[2021/07/02-15:21:18.116] [KNL-python-single-command-kernel-monitor-1017] [INFO] [dku.kernels] - Process done with code 143
[2021/07/02-15:21:18.244] [KNL-python-single-command-kernel-monitor-1017] [INFO] [dip.tickets] - Destroying API ticket for analysis-ml-WALMARTFORECASTING-kJOC8n7 on behalf of admin
[2021/07/02-15:21:18.809] [KNL-python-single-command-kernel-out-1022] [DEBUG] [process] - StreamToLine: EOF (stream closed)
[2021/07/02-15:21:18.810] [KNL-python-single-command-kernel-err-1023] [DEBUG] [process] - StreamToLine: EOF (stream closed)
[2021/07/02-15:21:18.873] [KNL-python-single-command-kernel-monitor-1017] [WARN] [dku.resource] - stat file for pid 5252 does not exist. Process died?
[2021/07/02-15:21:18.873] [KNL-python-single-command-kernel-monitor-1017] [INFO] [dku.resourceusage] - Reporting completion of CRU:{"context":{"type":"ANALYSIS_ML_TRAIN","authIdentifier":"admin","projectKey":"WALMARTFORECASTING","analysisId":"qQyhv580","mlTaskId":"UlzLKhyx","sessionId":"s6"},"type":"LOCAL_PROCESS","id":"JN2qH3MWqmNbz1Kk","startTime":1625239216988,"localProcess":{"pid":5252,"commandName":"/home/dataiku/dss/bin/python","cpuUserTimeMS":11080,"cpuSystemTimeMS":1130,"cpuChildrenUserTimeMS":0,"cpuChildrenSystemTimeMS":10,"cpuTotalMS":12220,"cpuCurrent":0.23801967629324025,"vmSizeMB":1267,"vmRSSMB":532,"vmHWMMB":574,"vmRSSAnonMB":529,"vmDataMB":608,"vmSizePeakMB":1313,"vmRSSPeakMB":555,"vmRSSTotalMBS":22327,"majorFaults":489,"childrenMajorFaults":7}} [2021/07/02-15:21:18.888] [MRT-1009] [INFO] [dku.block.link] - Closed socket [2021/07/02-15:21:18.888] [MRT-1009] [INFO] [dku.block.link] - Closed socket [2021/07/02-15:21:18.888] [MRT-1009] [INFO] [dku.block.link] - Closed serverSocket [2021/07/02-15:21:18.889] [MRT-1009] [ERROR] [dku.analysis.ml.python] - Processing failed com.dataiku.dip.io.SocketBlockLinkKernelException: Failed to train : : ['target'] values all empty or with unknown classes (you may need to recompute the training set) at com.dataiku.dip.io.SocketBlockLinkInteraction.throwExceptionFromPython(SocketBlockLinkInteraction.java:302) at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.checkException(SocketBlockLinkInteraction.java:215) at com.dataiku.dip.io.SocketBlockLinkInteraction$AsyncResult.get(SocketBlockLinkInteraction.java:190) at com.dataiku.dip.io.SingleCommandKernelLink$1.call(SingleCommandKernelLink.java:208) at com.dataiku.dip.analysis.ml.prediction.PredictionTrainAdditionalThread.process(PredictionTrainAdditionalThread.java:78) at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:161) [2021/07/02-15:21:18.889] [MRT-1009] [INFO] [dku.analysis.ml] - Locking model train info file 
/home/dataiku/dss/analysis-data/WALMARTFORECASTING/qQyhv580/UlzLKhyx/sessions/s6/pp2/m1/train_info.json [2021/07/02-15:21:18.909] [MRT-1009] [INFO] [dku.analysis.ml] - Unlocking model train info file /home/dataiku/dss/analysis-data/WALMARTFORECASTING/qQyhv580/UlzLKhyx/sessions/s6/pp2/m1/train_info.json [2021/07/02-15:22:56.825] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.ml.python] T-UlzLKhyx - Processing thread joined ... [2021/07/02-15:22:56.974] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.ml.python] T-UlzLKhyx - Joining processing thread ... [2021/07/02-15:23:10.778] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.ml.python] T-UlzLKhyx - Processing thread joined ... [2021/07/02-15:23:10.780] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.prediction] T-UlzLKhyx - Train done [2021/07/02-15:23:10.782] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.prediction] T-UlzLKhyx - Train done [2021/07/02-15:23:10.989] [FT-TrainWorkThread-1xCBzFAt-1006] [INFO] [dku.analysis.trainingdetails] T-UlzLKhyx - Publishing mltask-train-done reflected event