ML Process died (exit code 139)
From the log:
python(5787,0x70000fcc5000) malloc: *** error for object 0x7ff8fe7317e0: incorrect checksum for freed object - object was probably modified after being freed.
[2018/01/24-14:07:02.303] [Exec-158] [INFO] [dku.utils] - *** set a breakpoint in malloc_error_break to debug
[2018/01/24-14:07:02.304] [Kernel-159-monitor-159] [INFO] [dku.kernels] - Process done with code 134
[2018/01/24-14:07:02.408] [MRT-153] [ERROR] [dku.analysis.prediction] - Processing failed
com.dataiku.dip.exceptions.ProcessDiedException: ML process died (exit code: 134)
at com.dataiku.dip.exceptions.ProcessDiedException.getExceptionOnProcessDeath(ProcessDiedException.java:46)
at com.dataiku.dip.kernels.DSSKernelBase.getExceptionOnProcessDeath(DSSKernelBase.java:129)
at com.dataiku.dip.analysis.coreservices.AnalysisMLKernel.executeCommand(AnalysisMLKernel.java:105)
at com.dataiku.dip.analysis.ml.prediction.PyRegularNoSavePredictionHandler$TrainAdditionalThread.process(PyRegularNoSavePredictionHandler.java:118)
at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:66)
****
Is this a caching issue?
Answers
Hi,
This looks like a memory corruption bug in one of the underlying numerical computation libraries (numpy, pandas, BLAS, ...). Is it reproducible? Does it also happen with other algorithms on this dataset? Could you share details about your setup (OS, Python version, library versions)? Are you at liberty to share this dataset?
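To make the setup details easier to share, here is a rough sketch of a script you could run in the same Python environment that the ML process uses. It prints the Python, OS and library versions, and tries to turn on the standard `faulthandler` module so that a Python traceback is written to stderr if the process aborts again. The list of modules to check and the helper names (`collect_versions`, `environment_report`, `enable_crash_tracebacks`) are my own assumptions, not something DSS provides, and `faulthandler` is only in the standard library from Python 3.3 onwards.

```python
import importlib
import platform
import sys

try:
    import faulthandler  # standard library since Python 3.3
except ImportError:
    faulthandler = None  # older Python without the backport installed


def collect_versions(module_names):
    """Return {module name: version string} for each requested module."""
    versions = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = "not installed"
        else:
            versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions


def environment_report(module_names=("numpy", "scipy", "pandas", "sklearn")):
    """Build a plain-text summary suitable for pasting into a forum post."""
    # The default module list is a guess at the libraries likely to be involved.
    lines = [
        "python: " + sys.version.replace("\n", " "),
        "platform: " + platform.platform(),
    ]
    for name, version in sorted(collect_versions(module_names).items()):
        lines.append("{}: {}".format(name, version))
    return "\n".join(lines)


def enable_crash_tracebacks():
    """Ask faulthandler to dump a traceback on fatal signals such as SIGABRT.

    Returns True if it could be enabled, False otherwise.
    """
    if faulthandler is None:
        return False
    try:
        faulthandler.enable()
    except (AttributeError, ValueError, RuntimeError):
        # stderr is missing or has no usable file descriptor (some notebooks)
        return False
    return True


if __name__ == "__main__":
    print("crash tracebacks enabled: {}".format(enable_crash_tracebacks()))
    print(environment_report())
```

If the crash can be reproduced outside the visual ML interface (for example by training the same scikit-learn model in a notebook on this dataset), calling `enable_crash_tracebacks()` first should show which Python call was running when the native library aborted. For what it is worth, exit code 134 is consistent with the process being killed by SIGABRT (128 + 6), which is what `malloc` does when it detects a corrupted heap.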