
ML Process died (exit code 139)

UserBird
Dataiker



 



From the log:



python(5787,0x70000fcc5000) malloc: *** error for object 0x7ff8fe7317e0: incorrect checksum for freed object - object was probably modified after being freed.
[2018/01/24-14:07:02.303] [Exec-158] [INFO] [dku.utils]  - *** set a breakpoint in malloc_error_break to debug
[2018/01/24-14:07:02.304] [Kernel-159-monitor-159] [INFO] [dku.kernels]  - Process done with code 134
[2018/01/24-14:07:02.408] [MRT-153] [ERROR] [dku.analysis.prediction]  - Processing failed
com.dataiku.dip.exceptions.ProcessDiedException: ML process died (exit code: 134)
    at com.dataiku.dip.exceptions.ProcessDiedException.getExceptionOnProcessDeath(ProcessDiedException.java:46)
    at com.dataiku.dip.kernels.DSSKernelBase.getExceptionOnProcessDeath(DSSKernelBase.java:129)
    at com.dataiku.dip.analysis.coreservices.AnalysisMLKernel.executeCommand(AnalysisMLKernel.java:105)
    at com.dataiku.dip.analysis.ml.prediction.PyRegularNoSavePredictionHandler$TrainAdditionalThread.process(PyRegularNoSavePredictionHandler.java:118)
    at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:66)



 



****



 



Is this a caching issue?
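
A side note on the exit codes, since the title mentions 139 while this log reports 134. Assuming these follow the usual shell convention of 128 plus the signal number (an assumption, the log itself does not say so), a small sketch like the one below can translate them. The helper name `describe_exit_code` is made up for illustration.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Interpret a shell-style exit code.

    Assumption: codes above 128 mean "terminated by signal (code - 128)",
    which is the common shell convention.
    """
    if 128 < code < 256:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # No signal with this number on the current platform.
            name = f"unknown signal {signum}"
        return f"terminated by {name}"
    return f"exited with status {code}"


for code in (134, 139):
    print(code, "->", describe_exit_code(code))
```

Under that convention, 134 maps to SIGABRT (an abort, which fits the malloc checksum error above) and 139 maps to SIGSEGV (a segmentation fault). Both point to a crash in native code rather than to a caching problem.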

Clément_Stenac
Dataiker
Hi,

This looks like a memory corruption bug in one of the underlying numerical computation libraries (numpy, pandas, BLAS, ...). Is it reproducible? Is it reproducible with other algorithms on this dataset? Could you share details about your setup? Are you at liberty to share this dataset?
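
For the setup details, a minimal sketch along these lines could be run with the same Python environment that DSS uses for training. The function name `collect_versions` and the package list are illustrative assumptions, not something DSS provides.

```python
import importlib
import platform
import sys


def collect_versions(packages=("numpy", "scipy", "pandas", "sklearn")):
    """Return interpreter, OS, and package versions as a dict.

    The default package list is a guess at what the ML process depends on.
    """
    report = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            module = importlib.import_module(name)
            report[name] = getattr(module, "__version__", "unknown")
        except ImportError:
            report[name] = "not installed"
    return report


for key, value in collect_versions().items():
    print(f"{key}: {value}")
```

If numpy imports cleanly, calling `numpy.show_config()` in the same session also prints which BLAS/LAPACK build it is linked against, which is useful when tracking down this kind of crash.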