The Python process died (killed - maybe out of memory ?)
I have included the screenshots of my python recipe and the errors.
Kindly help!!
Operating system used: Windows 11
Answers
-
Alexandru (Dataiker)
Hi @vamsee51,
The recipe was already using up to 41 GB of memory.
We know this by looking at the entry "vmRSSPeakMB": 41707 in your screenshots.
One option is to increase the memory available to execute the recipe. If running locally, that may mean adjusting the cgroups configuration and/or increasing the total memory available to the DSS instance. If running in a container, you may need larger node types and different containerized execution configs.
The other approach is to reduce the memory usage of your recipe.
Some ways to achieve this could be:
1) Reduce the sampling size and the chunk size if you are using chunked reading.
2) Delete any unused intermediate data frames.
3) Chain several operations to avoid creating intermediate data frames in the first place (see the sketch after the doc link below).
https://doc.dataiku.com/dss/latest/python-api/datasets-data.html
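To make chunked reading and point 3 concrete, here is a minimal sketch using the datasets API from the doc link above. The dataset names, the chunk size, and the process() transformation are placeholders, not taken from your recipe:
```python
import dataiku

# Placeholder dataset names - use the actual input/output of your recipe
input_ds = dataiku.Dataset("my_input")
output_ds = dataiku.Dataset("my_output")

def process(df):
    # Chain the transformations on each chunk instead of keeping several
    # intermediate data frames alive at the same time
    return df.dropna()

chunks = input_ds.iter_dataframes(chunksize=100000)

# Use the first processed chunk to set the output schema, then stream the rest
first_chunk = process(next(chunks))
output_ds.write_schema_from_dataframe(first_chunk)

with output_ds.get_writer() as writer:
    writer.write_dataframe(first_chunk)
    del first_chunk  # drop the reference so the chunk can be garbage-collected
    for chunk in chunks:
        writer.write_dataframe(process(chunk))
```
With this pattern only one chunk is held in memory at a time, instead of the full dataset plus intermediate copies.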
Thanks.
-
tgb417 (Neuron)
I’ve run into similar situations with very large processes. In my case things were not containerized, and the swap size on the OS hosting DSS was set to 0. That can make sense for some use cases (relying on swap too much can slow things down), but in our case enabling swap made a positive difference without increasing hosting costs. We also ended up reducing the sample size for model building.
There are a number of guides on the internet about tuning memory usage and swap for Linux; one of these may help with the details.
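As a quick first check, you can see whether the DSS host currently has any swap configured at all. A small sketch in Python, assuming the psutil package is available in the code environment (it is not part of DSS itself):
```python
import psutil

ram = psutil.virtual_memory()
swap = psutil.swap_memory()

print(f"RAM total : {ram.total / 1024**3:.1f} GiB")
print(f"Swap total: {swap.total / 1024**3:.1f} GiB (used: {swap.percent}%)")
# A swap total of 0.0 GiB means no swap is configured on the host.
```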
Now that I know you are containerized, please disregard my notes here. @AlexT's suggestion to open a support ticket is a great idea.
-
Thank you @AlexT and @tgb417 for your swift responses. I tried to increase the available memory and also tried reducing the sampling and chunk size.
When running the recipe it still throws an error, and it seems to be related to the resource control rather than the containerized execution. We are currently stuck in the migration process.
I am sharing additional screenshots for your reference, and I kindly request your assistance on this!
Thank you!!
-
Alexandru (Dataiker)
Hi @vamsee51,
Setting the memory request/limit to a very high value does not guarantee that the K8s node the pod was assigned to actually has that much memory. You may need to add larger nodes to your K8s cluster to accommodate the memory usage.
Your recipe is still using 40+ GB of RAM at peak.
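If it helps to see which step drives that peak, here is a small sketch of logging peak memory from inside the recipe itself, using only the standard library. The step labels are placeholders, and ru_maxrss is Linux-only (reported in KiB):
```python
import resource

def log_peak_memory(label):
    # ru_maxrss is the peak resident set size of this process, in KiB on Linux
    peak_mb = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024
    print(f"[{label}] peak RSS so far: {peak_mb:.0f} MB")

log_peak_memory("after loading input")       # placeholder step label
# ... next processing step of the recipe ...
log_peak_memory("after join / aggregation")  # placeholder step label
```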
I would suggest submitting a support ticket with the job diagnostics so we can advise further.
https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html#getting-a-job-diagnosis
https://support.dataiku.com/support/tickets/new