Session consuming more space in yarn queue

Solved!
Srkanoje
Level 1
We are using Jupyter notebooks in a Dataiku project to run PySpark code. Sometimes we forget to close the session, which keeps resources held in the YARN queue. Is there any way for the session to be closed automatically once we leave the notebook, or terminated after it has been idle for a while?

3 Replies
fchataigner2
Dataiker

Hi,

DSS closes its connections to HiveServer2 when they have been idle for 5 minutes, which then releases the Tez sessions (most likely the thing holding the YARN resources).

Srkanoje
Level 1
Author

@fchataigner2 Is there any other way, such as a command, to kill these sessions, or a way to have them closed automatically? What I am looking for is an auto-close command or option that we can use in Dataiku.

fchataigner2
Dataiker

(Correction: HiveServer2 connections are cleaned up every 10 min, not 5 min.)

There is no control over the closing of these sessions. You can close them manually in YARN (i.e. yarn application -kill ...). Another option is to touch the Hive settings, for example by changing a property of the Hive connection in Administration > Settings > Hive, which evicts idle connections from the cache and kills them.
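
For the manual route, here is a rough Python sketch of how the kill step could be scripted. It is not a DSS feature. The function names (parse_running_apps, kill_commands, kill_matching_apps) are made up for illustration, and the parsing assumes that `yarn application -list` prints tab-separated columns in the order Application-Id, Application-Name, Application-Type, User, Queue, which may differ on your Hadoop version. YARN does not report whether an application is idle, so the sketch only filters by user and queue; review the dry-run output before killing anything.

```python
import re
import subprocess

# YARN application ids look like application_<cluster timestamp>_<sequence>.
APP_ID = re.compile(r"^application_\d+_\d+$")


def parse_running_apps(listing):
    """Parse the text printed by `yarn application -list` into dicts.

    Assumes tab-separated columns starting with id, name, type, user, queue.
    Header and summary lines are skipped because they carry no application id.
    """
    apps = []
    for line in listing.splitlines():
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) >= 5 and APP_ID.match(fields[0]):
            apps.append({
                "id": fields[0],
                "name": fields[1],
                "type": fields[2],
                "user": fields[3],
                "queue": fields[4],
            })
    return apps


def kill_commands(apps, user=None, queue=None):
    """Build one `yarn application -kill <id>` command per matching app."""
    commands = []
    for app in apps:
        if user is not None and app["user"] != user:
            continue
        if queue is not None and app["queue"] != queue:
            continue
        commands.append(["yarn", "application", "-kill", app["id"]])
    return commands


def kill_matching_apps(user=None, queue=None, dry_run=True):
    """List RUNNING applications and kill (or just print) the matching ones."""
    listing = subprocess.run(
        ["yarn", "application", "-list", "-appStates", "RUNNING"],
        capture_output=True, text=True, check=True,
    ).stdout
    commands = kill_commands(parse_running_apps(listing), user=user, queue=queue)
    for command in commands:
        if dry_run:
            print(" ".join(command))  # show what would be killed
        else:
            subprocess.run(command, check=True)
    return commands
```

Calling kill_matching_apps(user="dataiku", queue="analytics") with the default dry_run=True only prints the commands; both the user and the queue name here are placeholders for whatever your cluster uses.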
