Session consuming more space in yarn queue
We are using Jupyter notebooks in a Dataiku project to run some PySpark code. Sometimes we forget to close the session, which keeps resources held in the YARN queue. Is there any way to have the session closed automatically once we leave the notebook, or terminated after it has been idle for a while?
Best Answer
-
(Correction: HiveServer2 connections are cleaned up every 10 min, not 5 min.)
There is no control over when these sessions are closed. You can close them manually in YARN (i.e. yarn application -kill ...). Another option is to touch the Hive settings, for example by changing a property of the Hive connection in Administration > Settings > Hive; this evicts idle connections from the cache and kills them.
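If you want to script the manual route, here is a minimal sketch in Python that wraps the yarn CLI. It assumes the yarn command is on the PATH, that the output of yarn application -list is tab-separated with the columns in the order id, name, type, user, queue, state, and that killing every running application of a given user is acceptable on your cluster. The function names are made up for illustration and are not part of DSS.

```python
import subprocess


def parse_running_apps(listing, user=None):
    """Extract application IDs from `yarn application -list` output.

    Assumption: rows are tab-separated and the columns are
    id, name, type, user, queue, state, ... in that order.
    """
    app_ids = []
    for line in listing.splitlines():
        fields = [f.strip() for f in line.split("\t")]
        # Skip the summary and header lines; data rows start with the app ID.
        if not fields[0].startswith("application_"):
            continue
        # Optionally keep only the applications owned by `user`.
        if user is not None and (len(fields) < 4 or fields[3] != user):
            continue
        app_ids.append(fields[0])
    return app_ids


def kill_command(app_id):
    """Build the CLI call that kills one YARN application."""
    return ["yarn", "application", "-kill", app_id]


def kill_apps_for_user(user):
    """List running applications and kill those owned by `user`."""
    listing = subprocess.run(
        ["yarn", "application", "-list", "-appStates", "RUNNING"],
        capture_output=True, text=True, check=True,
    ).stdout
    for app_id in parse_running_apps(listing, user=user):
        subprocess.run(kill_command(app_id), check=True)
```

Calling kill_apps_for_user("some_user") from a scheduled job would be one way to approximate an auto-close, but it kills running work as well as idle sessions, so check the list it produces before relying on it.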
Answers
-
Hi,
DSS closes its connections to HiveServer2 when they have been idle for 5 min, which then releases the Tez sessions (most likely the thing holding the YARN resources).
-
@fchataigner2
Is there any other way, such as a command, to kill these sessions, or to have them closed automatically? What I am looking for is an auto-close command or option that we can use from within Dataiku.