Job process died (killed - maybe out of memory ?) when trying to score dataset
I'm trying to score my dataset (about 800,000 rows and 720 columns). After about 5 minutes, I get the "Job process died (killed - maybe out of memory ?)" error. I noticed that if I subset it down to 10,000 rows, it runs as expected.
To build this 800K-row scoring dataset, I take a 5M-row table, inner join it with my scoring scope, and output the resulting 'scoring' population (800K rows), which is then fed into the model scoring step.
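For reference, the equivalent of that prep step as a chunked Python recipe would be something like the sketch below. The dataset names ("transactions_5m", "scoring_scope", "scoring_population") and the join key ("customer_id") are placeholders, not my actual ones; streaming the big table in chunks keeps the full 5M rows out of memory:

```python
# Sketch of the join step as a chunked Python recipe; dataset names and
# the join key are hypothetical placeholders.
import dataiku

src = dataiku.Dataset("transactions_5m")      # ~5M-row input table
scope = dataiku.Dataset("scoring_scope")      # keys defining the 800K-row population
out = dataiku.Dataset("scoring_population")   # output fed to the Score recipe

scope_df = scope.get_dataframe()

writer = None
try:
    # Stream the large table in chunks and inner-join each chunk against
    # the scope keys, so the 5M-row table is never held in memory at once.
    for chunk in src.iter_dataframes(chunksize=100_000):
        joined = chunk.merge(scope_df, on="customer_id", how="inner")
        if writer is None:
            out.write_schema_from_dataframe(joined)
            writer = out.get_writer()
        writer.write_dataframe(joined)
finally:
    if writer is not None:
        writer.close()
```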
I should note that when I was previously using a SQL Server Data Source, I was able to score 900K+ row scoring datasets with no problem. I have since switched to Snowflake Data Sources, so I'm wondering whether that change is the cause, or whether it's because I'm joining the data before scoring.
Operating system used: Windows
Best Answer
Alexandru (Dataiker)
Hi @ccecil,
To better handle this case, I would suggest opening a support ticket with the job diagnostics for the job where you saw the OOM. This error can have a number of potential causes, and reviewing the logs is necessary to narrow it down.
https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html#getting-a-job-diagnosis
https://doc.dataiku.com/dss/latest/troubleshooting/obtaining-support.html#editor-support-for-all-other-dataiku-customers
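If you want a first look at the log yourself before attaching the diagnosis, you can also pull it through the public API client. A minimal sketch is below; the host, API key, project key, and the assumption that the failed job is the most recent one are all placeholders, not taken from this thread:

```python
# Sketch using the dataikuapi public API client; connection details and
# project key are placeholders.
import dataikuapi

client = dataikuapi.DSSClient("https://dss.example.com:11200", "YOUR_API_KEY")
project = client.get_project("MY_PROJECT")

# Jobs are typically listed most recent first; grab the id of the latest one.
latest = project.list_jobs()[0]
job = project.get_job(latest["def"]["id"])

# Scan the job log for the usual out-of-memory markers.
for line in job.get_log().splitlines():
    if "OutOfMemory" in line or "killed" in line.lower():
        print(line)
```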
Thanks,
Answers
Thanks @AlexT. I just submitted a ticket.