[2020/03/18-14:04:28.286] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] - ****************************************** [2020/03/18-14:04:28.286] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] - ** Start train session s1 [2020/03/18-14:04:28.286] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] - ****************************************** [2020/03/18-14:04:28.287] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.data] T-Eih1jlAt - [ct: 1] Need to compute sampleId before checking memory cache [2020/03/18-14:04:28.288] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dip.shaker.runner] T-Eih1jlAt - [ct: 2] Script settings sampleMax=104857600 processedMax=-1 [2020/03/18-14:04:28.288] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dip.shaker.runner] T-Eih1jlAt - [ct: 2] Processing with sampleMax=104857600 processedMax=524288000 [2020/03/18-14:04:28.288] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dip.shaker.runner] T-Eih1jlAt - [ct: 2] Computed required sample id : db380cfac02c726c432b4ab6de205b3b-NA-31c57f559ab672625afa057353f0de860--d751713988987e9331980363e24189ce [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.shaker.cache] T-Eih1jlAt - Shaker MemoryCache get on My_project.my_DB key=ds=024870fa98cfa5d244f392d9be497956--scr=5064075a1c127afbdab01189cc6fdf46--samp=db380cfac02c726c432b4ab6de205b3b-NA-31c57f559ab672625afa057353f0de860--d751713988987e9331980363e24189ce: hit [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 3] Column col_0 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 3] Column col_1 meaning=Text fail=0 [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 3] Column col_2 meaning=FreeText fail=0 [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 3] Column col_3 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.289] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 3] Column col_4 meaning=DateSource fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_5 meaning=Text fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_6 meaning=Text fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_7 meaning=Text fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_8 meaning=CountryMeaning fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_9 meaning=Text fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_10 meaning=FreeText fail=0 [2020/03/18-14:04:28.290] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 4] Column col_11 meaning=Text fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_12 meaning=CountryMeaning fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_13 meaning=Text fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_14 meaning=FreeText fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_15 meaning=Text fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_16 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_17 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.291] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 5] Column col_18 meaning=CountryMeaning fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_19 meaning=Text fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_20 meaning=Text fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_21 meaning=Text fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_22 meaning=LongMeaning fail=1200 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_23 meaning=FreeText fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_24 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_25 meaning=DoubleMeaning fail=0 [2020/03/18-14:04:28.292] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 6] Column col_26 meaning=CountryMeaning fail=0 [2020/03/18-14:04:28.293] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 7] Column col_27 meaning=Text fail=0 [2020/03/18-14:04:28.293] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 7] Column col_28 meaning=Text fail=0 [2020/03/18-14:04:28.293] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 7] Column col_29 meaning=Text fail=0 [2020/03/18-14:04:28.293] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 7] Column col_30 meaning=LongMeaning fail=438 [2020/03/18-14:04:28.293] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.shaker.schema] T-Eih1jlAt - [ct: 7] Column col_31 meaning=FreeText fail=0 [2020/03/18-14:04:28.349] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.input.push] T-Eih1jlAt - Using ordering from dataset: {"enabled":false,"rules":[]} [2020/03/18-14:04:28.351] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.input.push] T-Eih1jlAt - USTP: push selection.method=HEAD_SEQUENTIAL records=100000 ratio=0.02 col=null [2020/03/18-14:04:28.352] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.push] T-Eih1jlAt - [ct: 66] Push row splits limit= {"maxBytes":-1,"maxRecords":100000,"ordering":{"enabled":false,"rules":[]}} [2020/03/18-14:04:28.352] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.input.rows] T-Eih1jlAt - [ct: 66] Starting to push split com.dataiku.dip.datasets.sql.AbstractSQLTableDatasetHandler$TableClauseSplit@5bfa9e62 with limit= 100000 [2020/03/18-14:04:28.355] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 69] Connecting to jdbc:postgresql://XXX.XXX.XXX.XXX:XXXX/dss_My_project with props: {"user":"dss_My_project","password":"***"} conn=My_project-jq3TFJa [2020/03/18-14:04:28.356] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.connections.sql.driver] T-Eih1jlAt - Driver version 9.0 [2020/03/18-14:04:28.370] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 84] Driver: PostgreSQL Native Driver (JDBC 4.0) PostgreSQL 9.0 JDBC4 (build 801) (9.0) [2020/03/18-14:04:28.372] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 86] Database: PostgreSQL 12.2 (12.2) rowSize=8192 stmts=0 [2020/03/18-14:04:28.373] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 87] SET MAX ROWS 100000 [2020/03/18-14:04:28.374] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 88] Executing query SELECT * FROM "my_DB" LIMIT 100000 (st=org.postgresql.jdbc4.Jdbc4Statement@5fc501c5) [2020/03/18-14:04:28.401] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 115] Done executing query (st=org.postgresql.jdbc4.Jdbc4Statement@5fc501c5) [2020/03/18-14:04:28.601] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 315] Read 2000 records from DB [2020/03/18-14:04:28.790] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 504] Read 4000 records from DB [2020/03/18-14:04:29.080] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 794] Read 6000 records from DB [2020/03/18-14:04:29.214] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 928] Read 8000 records from DB [2020/03/18-14:04:29.365] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1079] Read 10000 records from DB [2020/03/18-14:04:29.484] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1198] Read 12000 records from DB [2020/03/18-14:04:29.605] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1319] Read 14000 records from DB [2020/03/18-14:04:29.817] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1531] Read 16000 records from DB [2020/03/18-14:04:30.003] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1717] Read 18000 records from DB [2020/03/18-14:04:30.217] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 1931] Read 20000 records from DB [2020/03/18-14:04:30.498] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 2212] Read 22000 records from DB [2020/03/18-14:04:30.725] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 2439] Read 24000 records from DB [2020/03/18-14:04:30.936] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 2650] Read 26000 records from DB [2020/03/18-14:04:31.260] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 2974] Read 28000 records from DB [2020/03/18-14:04:31.547] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 3261] Done iterating result set, returned 29970 rows [2020/03/18-14:04:31.548] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.datasets.sql] T-Eih1jlAt - [ct: 3262] Exited push [2020/03/18-14:04:31.548] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 3262] Rollback and close conn=My_project-jq3TFJa [2020/03/18-14:04:31.548] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 3262] Rollback conn=My_project-jq3TFJa [2020/03/18-14:04:31.549] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 3263] Close conn=My_project-jq3TFJa [2020/03/18-14:04:31.549] [FT-TrainWorkThread-LzEO9vGz-2969] [DEBUG] [dku.connections.sql.provider] T-Eih1jlAt - [ct: 3263] Close conn=My_project-jq3TFJa [2020/03/18-14:04:31.549] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.input.rows] T-Eih1jlAt - [ct: 3263] After split com.dataiku.dip.datasets.sql.AbstractSQLTableDatasetHandler$TableClauseSplit@5bfa9e62, still have 70030 to push [2020/03/18-14:04:31.552] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.splits] T-Eih1jlAt - [ct: 3266] Checking if splits are up to date. Policy: type=SPLIT_SINGLE_DATASET,split=RANDOM,splitBeforePrepare=true,ds=my_DB,sel=(method=head-s,records=100000),r=0.8,s=1337, instance id: bd50672b69fde08f4a8bfb147de6e1cd-0 [2020/03/18-14:04:31.552] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.splits] T-Eih1jlAt - [ct: 3266] Search for split: p=type=SPLIT_SINGLE_DATASET,split=RANDOM,splitBeforePrepare=true,ds=my_DB,sel=(method=head-s,records=100000),r=0.8,s=1337 i=bd50672b69fde08f4a8bfb147de6e1cd-0 [2020/03/18-14:04:31.553] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.splits] T-Eih1jlAt - [ct: 3267] Search for split: p=type=SPLIT_SINGLE_DATASET,split=RANDOM,splitBeforePrepare=true,ds=my_DB,sel=(method=head-s,records=100000),r=0.8,s=1337 i=bd50672b69fde08f4a8bfb147de6e1cd-0 [2020/03/18-14:04:31.554] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.ml.python] T-Eih1jlAt - Joining processing thread ... [2020/03/18-14:04:31.555] [MRT-2971] [INFO] [dku.analysis.ml.python] - Running a preprocessing set: pp2 in /opt/dataiku/analysis-data/My_project/ILegGxFT/Eih1jlAt/sessions/s1/pp2 [2020/03/18-14:04:31.555] [MRT-2971] [INFO] [dku.block.link] - Started a socket on port 36312 [2020/03/18-14:04:31.555] [MRT-2971] [INFO] [dku.ml.kernel] - Writing output of python-single-command-kernel to /opt/dataiku/analysis-data/My_project/ILegGxFT/Eih1jlAt/sessions/s1/pp2/train.log [2020/03/18-14:04:31.555] [MRT-2971] [INFO] [dku.code.envs.resolution] - Executing Python activity in builtin env [2020/03/18-14:04:31.556] [MRT-2971] [WARN] [dku.code.projectLibs] - External libraries file not found: /opt/dataiku/config/projects/My_project/lib/external-libraries.json [2020/03/18-14:04:31.556] [MRT-2971] [INFO] [dku.code.projectLibs] - EXTERNAL LIBS FROM My_project is {"gitReferences":{},"pythonPath":["python"],"rsrcPath":["R"],"importLibrariesFromProjects":[]} [2020/03/18-14:04:31.556] [MRT-2971] [INFO] [dku.code.projectLibs] - chunkFolder is /opt/dataiku/config/projects/My_project/lib/R [2020/03/18-14:04:31.556] [MRT-2971] [INFO] [dku.python.single_command.kernel] - Starting Python process for kernel python-single-command-kernel [2020/03/18-14:04:31.556] [MRT-2971] [INFO] [dip.tickets] - Creating API ticket for analysis-ml-My_project-h2WS04P on behalf of admin id=analysis-ml-My_project-h2WS04P_Ey79tLip0TPg [2020/03/18-14:04:31.556] [MRT-2971] [INFO] [dku.security.process] - Starting process (regular) [2020/03/18-14:04:31.855] [MRT-2971] [INFO] [dku.security.process] - Process started with pid=15582 [2020/03/18-14:04:31.855] [MRT-2971] [INFO] [dku.processes.cgroups] - Will use cgroups [] [2020/03/18-14:04:31.855] [MRT-2971] [INFO] [dku.processes.cgroups] - Applying rules to used cgroups: [] /opt/dataiku/bin/python: No module named doctor [2020/03/18-14:04:32.664] [KNL-python-single-command-kernel-monitor-2981] [INFO] [dku.kernels] - Process done with code 1 [2020/03/18-14:04:32.665] [KNL-python-single-command-kernel-monitor-2981] [INFO] [dip.tickets] - Destroying API ticket for analysis-ml-My_project-h2WS04P on behalf of admin [2020/03/18-14:04:32.665] [KNL-python-single-command-kernel-monitor-2981] [ERROR] [dku.kernels] - KernelMonitorThread done: Closing: ServerSocket[addr=0.0.0.0/0.0.0.0,localport=36312] [2020/03/18-14:04:32.666] [MRT-2971] [WARN] [dku.analysis.ml.python] - Training failed com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException: Subprocess failed to connect, it probably crashed at startup. Check the logs. at com.dataiku.dip.io.SocketBlockLink.waitForConnection(SocketBlockLink.java:70) at com.dataiku.dip.io.SecretProtectedKernelLink.waitForProcess(SecretProtectedKernelLink.java:39) at com.dataiku.dip.io.SecretProtectedKernelLink.waitForProcess(SecretProtectedKernelLink.java:48) at com.dataiku.dip.io.PythonSecretProtectedKernel.start(PythonSecretProtectedKernel.java:104) at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:131) Caused by: java.net.SocketException: Socket closed at java.net.PlainSocketImpl.socketAccept(Native Method) at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409) at java.net.ServerSocket.implAccept(ServerSocket.java:560) at java.net.ServerSocket.accept(ServerSocket.java:528) at com.dataiku.dip.io.SocketBlockLink.waitForConnection(SocketBlockLink.java:66) ... 4 more [2020/03/18-14:04:32.781] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.ml.python] T-Eih1jlAt - Processing thread joined ... [2020/03/18-14:04:32.781] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.ml.python] T-Eih1jlAt - Joining processing thread ... [2020/03/18-14:04:32.781] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.ml.python] T-Eih1jlAt - Processing thread joined ... [2020/03/18-14:04:32.781] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] T-Eih1jlAt - Train done [2020/03/18-14:04:32.781] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] T-Eih1jlAt - Train done [2020/03/18-14:04:32.783] [FT-TrainWorkThread-LzEO9vGz-2969] [INFO] [dku.analysis.prediction] T-Eih1jlAt - Publishing mltask-train-done reflected event