Hi Dataiku,
How exactly does DSS set the environment variable PYSPARK_DRIVER_PYTHON? No matter what I put in a Spark configuration or which Python environment I choose, it always defaults to the path of the internal Dataiku Python executable (/home/dataiku/dss_data/bin/python).
My goal is to have everything set by the Spark configuration, so that when running on a YARN cluster the executable is /usr/bin/python3 (the default python3 executable on Linux). When running in local mode, I would use a different Spark configuration that points to the Python executable of the kernel (in my case, /home/dataiku/dss_data/code-envs/python/python36).
Why is PYSPARK_DRIVER_PYTHON static? And why does the 'Yarn Python executable' setting under the Code Envs page change PYSPARK_PYTHON when Spark is running in local mode? As it stands, I have to get everyone on the team to override the environment variables at the start of each recipe or notebook.
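For illustration, the per-recipe override looks roughly like the sketch below. It assumes the two paths mentioned above, that it runs before any SparkContext is created, and that setting both variables to the same executable is acceptable. The helper names (pyspark_python_env, apply_override) are made up for this example and are not part of any DSS API.

```python
import os

# Paths taken from the post above; adjust to your own installation.
YARN_PYTHON = "/usr/bin/python3"
LOCAL_PYTHON = "/home/dataiku/dss_data/code-envs/python/python36"


def pyspark_python_env(master):
    """Return the PySpark environment variables for a given Spark master.

    Hypothetical helper: picks the YARN executable when the master string
    starts with "yarn", and the local code-env executable otherwise.
    """
    exe = YARN_PYTHON if master.startswith("yarn") else LOCAL_PYTHON
    return {
        "PYSPARK_PYTHON": exe,         # Python used by the executors
        "PYSPARK_DRIVER_PYTHON": exe,  # Python used by the driver
    }


def apply_override(master):
    """Write the chosen values into os.environ and return them.

    This only has an effect if it is called before the SparkContext
    (or SparkSession) is created in the recipe or notebook.
    """
    env = pyspark_python_env(master)
    os.environ.update(env)
    return env
```

Calling apply_override("local[*]") or apply_override("yarn") as the first cell of a notebook is what I would like to avoid having to do by hand.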