YARN and Oozie on DSS

UserBird Dataiker
Hi,

In DSS, is it possible to use YARN to view and manage how nodes and containers are allocated to the different Spark applications across the cluster?
Additionally, is it possible to run Apache Oozie workflows to automate jobs?

Thank you!

Accepted Solutions
Clément_Stenac Dataiker
Hi,

DSS uses Spark through its standard APIs, so you can run Spark in "YARN" mode just as you would with any other Spark application. This means that you can set the number of containers, the memory allocation, whether to use dynamic allocation, the YARN queue, and so on, using standard Spark configuration keys. See https://spark.apache.org/docs/latest/running-on-yarn.html for details on the Spark configuration keys and https://doc.dataiku.com/dss/latest/spark/configuration.html for how to set them in DSS.
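As a rough illustration of the kind of keys involved, here is a minimal Python sketch. The keys are standard Spark configuration keys; the values, the `yarn_spark_conf` name and the `to_submit_args` helper are only assumptions for the example. In DSS you would normally enter these key/value pairs in a Spark configuration in the settings rather than build them in code.

```python
# Illustrative values only; adjust them to your cluster.
# The keys are standard Spark configuration keys for running on YARN.
yarn_spark_conf = {
    "spark.master": "yarn",
    "spark.executor.instances": "4",          # number of executor containers
    "spark.executor.memory": "4g",            # memory per executor
    "spark.executor.cores": "2",              # cores per executor
    "spark.dynamicAllocation.enabled": "false",
    "spark.yarn.queue": "default",            # YARN queue to submit to
}


def to_submit_args(conf):
    """Render a config dict as spark-submit style '--conf key=value' arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(yarn_spark_conf)))
```

The same key/value pairs are what you would type into the Spark configuration screen; the helper only shows how they map to a plain spark-submit command line.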



DSS does not have a native integration with Oozie. However, Oozie exposes a REST API, so you can use Python code in DSS to call this API and trigger workflows (see https://oozie.apache.org/docs/4.0.0/WebServicesAPI.html).
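For example, a Python recipe or scenario step could submit a workflow along these lines. This is only a sketch using the standard library: the Oozie host, the HDFS application path and the user name are placeholders, and the function names are made up for the example. It assumes the job submission endpoint described in the Oozie Web Services API documentation (POST to `/v1/jobs?action=start` with an XML configuration body), so check it against your Oozie version.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET


def build_job_config_xml(props):
    """Build the Oozie <configuration> XML body from a dict of job properties."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    return ET.tostring(root, encoding="utf-8")


def build_submit_request(oozie_url, props):
    """Prepare (but do not send) the POST request that submits and starts a job."""
    return urllib.request.Request(
        oozie_url.rstrip("/") + "/v1/jobs?action=start",
        data=build_job_config_xml(props),
        headers={"Content-Type": "application/xml;charset=UTF-8"},
        method="POST",
    )


def submit_workflow(oozie_url, props, timeout=30):
    """Send the request and return the job id from the JSON response."""
    request = build_submit_request(oozie_url, props)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["id"]


# Placeholder values: replace with your own Oozie server, user and workflow path.
OOZIE_URL = "http://oozie.example.com:11000/oozie"
JOB_PROPS = {
    "user.name": "dss_user",
    "oozie.wf.application.path": "hdfs:///user/dss_user/workflows/my_workflow",
}

request = build_submit_request(OOZIE_URL, JOB_PROPS)
print(request.get_method(), request.full_url)
# To actually trigger the workflow: job_id = submit_workflow(OOZIE_URL, JOB_PROPS)
```

On a secured cluster (Kerberos, for instance) you would also need to handle authentication, which this sketch leaves out.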
