How to define the best Global limit for Jobs

NN
NN Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 145 Neuron

Hi Dataiku Team,
The global and per-job limits for concurrent activities are both set to 5 by default for the instance.

https://doc.dataiku.com/dss/latest/flow/limits.html

I believe this means that, on a single instance, if project-1 has 5 concurrent activities running, then a job in project-2 will have to wait for one of the activities in project-1 to complete. (I hope I understand this correctly.)
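
To make that reading concrete, here is a toy model in Python. It only illustrates the two limits as described on the linked page and is not how DSS actually schedules activities; the function name `can_start` and the job labels are hypothetical.

```python
def can_start(running_by_job, job_id, global_limit=5, per_job_limit=5):
    """Return True if one more activity of `job_id` may start.

    running_by_job maps a (hypothetical) job label to the number of
    activities it currently has running on the instance.
    """
    total_running = sum(running_by_job.values())   # counted across all projects
    job_running = running_by_job.get(job_id, 0)    # counted for this job only
    return total_running < global_limit and job_running < per_job_limit


# project-1 already occupies all 5 global slots.
running = {"project-1/job-A": 5}
blocked = can_start(running, "project-2/job-B")

# One activity of project-1 completes, freeing a global slot.
running["project-1/job-A"] -= 1
allowed = can_start(running, "project-2/job-B")

print(blocked, allowed)
```

In this model the job in project-2 is held back only by the global limit, since it has no activities of its own running yet.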

My question is: is there a best practice for defining these limits?
For example, if my instance has configuration X, then the limits can be set higher than 5 but should stay below 10. Is there guidance along those lines?

Thanks..

Best Answer

  • Sergey
    Sergey Dataiker, Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts Posts: 365 Dataiker
    Answer ✓

    Hi @NN

    There is no single best practice here, because the right value depends on the type of activities (one job can run for 24 hours while 10 other jobs can finish in just 1 minute), the system resources the DSS instance has, and so on.

    So I would say that you can start increasing the global limit (I probably wouldn't recommend going higher than 20) and monitor the DSS load. Also, if needed, you can fine-tune the limits by user/recipeType/project:

    https://doc.dataiku.com/dss/latest/flow/limits.html#additional-limits
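
As a rough illustration of why the workload mix matters, the sketch below simulates first-come, first-served slots for one 24-hour activity plus ten 1-minute activities. It is a simplified model, not DSS behavior: all activities are assumed to be submitted at the same time, system load is ignored, and the `simulate` function is hypothetical.

```python
import heapq


def simulate(durations, limit):
    """Return the wait time of each activity under `limit` concurrent slots.

    Assumes every activity is queued at t=0 and started in list order.
    """
    free_at = [0.0] * limit          # time at which each slot becomes free
    heapq.heapify(free_at)
    waits = []
    for duration in durations:
        start = heapq.heappop(free_at)   # earliest available slot
        waits.append(start)
        heapq.heappush(free_at, start + duration)
    return waits


# Durations in minutes: one 24-hour activity, then ten 1-minute activities.
durations = [24 * 60] + [1] * 10

for limit in (1, 2, 5, 10, 20):
    waits = simulate(durations, limit)
    print(f"limit={limit:>2}  longest wait: {max(waits):.0f} min")
```

The model says nothing about CPU, memory, or database load, which is why the limit still has to be raised gradually while watching the instance.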

Answers

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 323 Neuron

    Hi @NN,

    This is a good question. We've wondered about this as well.

    I thought I'd share our experience with setting a global limit. It won't necessarily be helpful, but it may serve as another reference point.

    • We had been running into issues with jobs waiting under the default limit of 5 activities on our Dev (Design) instance.
    • We do run a fair number of automation jobs on the Dev instance rather than on Automation instances (I'd prefer we not do that, but that is the current state).
    • Most of the jobs are pushed down to a SQL database.
    • Our Dev instance is fairly beefy, including lots of memory (I'd have to look up the details).
    • We upped the limit to 10 and haven't had any noticeable problems since the change.

    Marlan

  • NN
    NN Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 145 Neuron

    Thank you @sergeyd & @Marlan for your input.

    I will try various limits and see which gives us the best usage.
