Regarding AutoML session
Hi Team,
While training a model with AutoML, a session starts 6-7 algorithm jobs. Is there a way to make all the algorithms start at the same time? Or does Dataiku decide automatically which algorithm to start, in order to keep memory usage optimal?
In the attached screenshot, Logistic Regression training has started while SGD is in a pending state. So the question is: is there any configuration or setting that starts all the algorithms at the same time, or is this something Dataiku manages automatically to keep memory usage optimal?
Answers
tgb417
Welcome to the Dataiku community.
AutoML jobs are usually computationally intensive. Depending on how much computational power the machine running DSS has, you may not want to increase the number of concurrent jobs. Doing so may actually slow things down if you don't have enough CPU cores or RAM. Note that the resources in question are not those of your local computer but of the server(s) on which DSS is running.
I'd take a look at top, htop, or another task manager on the machine running these jobs to see whether you have significant unused system resources. This could be more complicated to evaluate if you are doing your compute on Kubernetes. If you do find that you have a lot of unused resources, you might consider increasing the concurrency limits. See the documentation pages.
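If you'd rather script the check than eyeball top or htop, here is a minimal sketch in Python of the same idea. It assumes a Linux server and uses only the standard library. The function names and the two thresholds (`max_load_per_core`, `min_free_mem_fraction`) are my own illustrative choices, not anything Dataiku defines, so adjust them to your environment.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[key.strip()] = int(parts[0])
    return info


def has_headroom(load_1min, cores, mem_available_kb, mem_total_kb,
                 max_load_per_core=0.5, min_free_mem_fraction=0.4):
    """Rough heuristic: True if CPU load and memory both look underused.

    The default thresholds are arbitrary assumptions for illustration.
    """
    load_per_core = load_1min / cores
    free_fraction = mem_available_kb / mem_total_kb
    return (load_per_core < max_load_per_core
            and free_fraction > min_free_mem_fraction)


if __name__ == "__main__":
    # Run this on the DSS server while an AutoML session is training.
    if hasattr(os, "getloadavg") and os.path.exists("/proc/meminfo"):
        load_1min = os.getloadavg()[0]
        cores = os.cpu_count() or 1
        with open("/proc/meminfo") as f:
            mem = parse_meminfo(f.read())
        if "MemAvailable" in mem and "MemTotal" in mem:
            ok = has_headroom(load_1min, cores,
                              mem["MemAvailable"], mem["MemTotal"])
            print(f"load(1m)={load_1min:.2f} cores={cores} headroom={ok}")
    else:
        print("This sketch only reads system stats on Linux.")
```

A single reading taken mid-training is only a snapshot, so it is worth sampling a few times over the session before deciding to raise any limits. On Kubernetes this would only describe the node or container where it runs, not the whole cluster.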