The Python process died (killed - maybe out of memory?)

vamsee51
vamsee51 Registered Posts: 2

I have included the screenshots of my python recipe and the errors.

Kindly help!!


Operating system used: Windows 11


Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker

    Hi @vamsee51,

    The recipe was already using up to 41 GB of memory.
    We know this by looking at the entry "vmRSSPeakMB": 41707 in your screenshots.

    You will need to either increase the memory available to execute the recipe - if running locally, that may mean adjusting the cgroups configuration and/or increasing the total memory available to the DSS instance; if running in a container, you may need larger node types and a different containerized execution configuration.

    The other approach is to reduce the memory usage of your recipe.

    Some ways to achieve this could be:
    1) Reduce the sampling size and the chunk size if you are using chunked reading.
    2) Delete any unused intermediate data frames.
    3) Chain several operations together to avoid creating intermediate data frames in the first place (see the sketch below the link).

    https://doc.dataiku.com/dss/latest/python-api/datasets-data.html
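
    For illustration, here is a minimal sketch of that chunked read/process/write pattern with the Dataset API from the documentation linked above. The dataset names, the chunk size, and the filter on an "amount" column are placeholders, and the sketch assumes the output keeps the same columns as the input:

    ```python
    # Minimal sketch of a chunked recipe; "input_dataset" / "output_dataset"
    # are placeholder names for your own datasets.
    import dataiku

    input_ds = dataiku.Dataset("input_dataset")
    output_ds = dataiku.Dataset("output_dataset")

    # Declare the output schema up front (here: same columns as the input).
    output_ds.write_schema(input_ds.read_schema())

    with output_ds.get_writer() as writer:
        # iter_dataframes() streams the data chunk by chunk instead of loading
        # it all at once with get_dataframe(), so only one chunk sits in memory.
        for chunk in input_ds.iter_dataframes(chunksize=100000):
            processed = chunk[chunk["amount"] > 0]  # hypothetical transformation
            writer.write_dataframe(processed)
            del chunk, processed  # drop references before the next iteration
    ```

    Smaller chunk sizes keep peak memory lower at the cost of more write calls; the right value depends on your row width and the node's available RAM.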

    thanks

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron

    @vamsee51,

    I’ve run into similar situations with very large processes. In my case things were not containerized, and the swap size on the OS hosting DSS was set to 0, which may make sense for some use cases (turning on swap can slow things down if you rely on it too much). In our use case, however, enabling swap made a positive improvement without increasing hosting costs. We also ended up reducing the sample size for model building.

    There are a number of guides about tuning memory usage and swap for Linux on the internet. One of these may help with the details.

    Now that I know you are containerized, please disregard my notes here. @AlexT's suggestion to open a support ticket is a great idea.

  • vamsee51
    vamsee51 Registered Posts: 2

    Thank you @AlexT and @tgb417 for your swift response.

    I tried increasing the available memory and reducing the sampling and chunk size.

    Upon running the recipe, it's still throwing an error, and it seems to come from the resource control rather than the containerized execution. We are currently stuck with the migration process.

    I am sharing additional screenshots for your understanding, and I kindly request your assistance on this!

    Thank you!!

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker

    Hi @vamsee51,
    Setting the memory request/limit to a very high value does not guarantee that the K8s node this pod was assigned to has that much memory. You may need to add larger nodes to your K8s cluster to accommodate the memory usage.

    Your recipe is still using 40+ GB of RAM at peak.
    I would suggest you submit a support ticket with the job diagnostics so we can advise further.

    https://doc.dataiku.com/dss/latest/troubleshooting/problems/job-fails.html#getting-a-job-diagnosis

    https://support.dataiku.com/support/tickets/new


