DSS Livy and Spark integration

Nish Registered Posts: 1 ✭✭✭✭


I am trying to learn DSS and could use a more detailed explanation than what's available in the docs. I haven't integrated Spark with a remote client before, so please explain it as simply as you can (ELI5). Thank you.

I have a new Cloudera CDP 7.1.1 cluster running Spark 2.4, Livy, and Hadoop 3.1. The Hadoop cluster is kerberized, and the user 'serviceA' is already configured.

Is it possible to integrate DSS with Livy? How?

If Livy is not an option, I already have the DSS Spark integration installed on my DSS VM. Looking at the configuration settings screen, what is the minimum I need to enter to get DSS working with Spark under the serviceA account? My goal is to run an interactive session. Thanks.


  • Omar
    Omar Dataiker Posts: 30 Dataiker

    Hi Nish,

    Please make sure to read our documentation here, specifically the part on secure clusters.

    Also consider that CDP 7.1.1 is not supported, as stated here. It might work, however.

    As you will see from the documentation I linked above, DSS needs to be configured to access the cluster by running the Hadoop integration and the Spark integration.

    Once done, DSS should be able to offload jobs to the cluster.
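
A minimal sketch of the integration steps mentioned above, assuming they are run with the `dssadmin` tool from the DSS data directory while DSS is stopped. The script only assembles and prints the commands; it does not execute anything. The data directory, keytab path, and principal are hypothetical placeholders, and the subcommands and flags are written from memory of the DSS documentation, so check them against the docs for your DSS version before running anything.

```python
import shlex


def integration_commands(data_dir, keytab, principal):
    """Return, in order, the commands for Hadoop and Spark integration.

    All three arguments are placeholders supplied by the caller; nothing
    here is read from or written to a real DSS installation.
    """
    dss = f"{data_dir}/bin/dss"
    dssadmin = f"{data_dir}/bin/dssadmin"
    return [
        [dss, "stop"],
        # On a kerberized cluster the Hadoop integration is given a keytab
        # and the matching principal.
        [dssadmin, "install-hadoop-integration",
         "-keytab", keytab, "-principal", principal],
        [dssadmin, "install-spark-integration"],
        [dss, "start"],
    ]


# Hypothetical values, for illustration only.
commands = integration_commands(
    data_dir="/data/dss",
    keytab="/data/dss/dss.keytab",
    principal="dss@EXAMPLE.COM",
)

for cmd in commands:
    # Print a copy-pasteable shell line instead of executing it.
    print(shlex.join(cmd))
```

Printing the commands rather than running them keeps the sketch safe to try anywhere; the printed lines can be reviewed and then run by hand on the DSS VM.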

    You don't need to integrate with Livy to do this. What you need to do is map DSS users and groups to Hadoop users, since you want to run all your activities in Hadoop as serviceA. This is done in Administration -> Settings -> Login (LDAP, SSO) -> User Isolation.
    This feature is only available with an Enterprise license.
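
To make the idea of user mapping concrete, here is a conceptual sketch in Python. It is not how DSS implements User Isolation and it is not a DSS configuration format; the rule list, function name, and sample logins are all hypothetical. It only illustrates what "map every DSS user to serviceA" means: each DSS login is checked against an ordered list of patterns, and the first match decides which Hadoop user the job runs as.

```python
import re

# Hypothetical rule list: (regex matched against the DSS login,
# Hadoop user to run as). A single catch-all rule sends everyone
# to the service account.
RULES = [
    (r".*", "serviceA"),
]


def map_to_hadoop_user(dss_login, rules=RULES):
    """Return the Hadoop user for a DSS login using the first matching rule."""
    for pattern, hadoop_user in rules:
        if re.fullmatch(pattern, dss_login):
            return hadoop_user
    raise LookupError(f"no mapping rule matches {dss_login!r}")


for login in ("nish", "omar"):
    print(login, "->", map_to_hadoop_user(login))
```

With the single catch-all rule, every login resolves to `serviceA`, which is the situation described in the question. More specific rules placed earlier in the list would take precedence over the catch-all.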

    Take care,

    Architect @ Dataiku
