PYSPARK_PYTHON environment variable issue in PySpark

Farhan

Hi

I am facing the below issue with a PySpark recipe.

Exception: Python in worker has different version 2.7 than that in driver 3.6, PySpark cannot run with different minor versions.Please check environment variables PYSPARK_PYTHON and PYSPARK_DRIVER_PYTHON are correctly set.
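For anyone reproducing this, here is a quick way to compare the driver and worker Python versions (when they differ, the map step raises the same exception as above):

```python
import sys
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Python version on the driver
print("driver:", sys.version_info[:2])

# Python version on an executor: run a tiny one-partition job and
# report sys.version_info from the worker process. If the versions
# differ, this step fails with the "different version" exception.
worker_ver = (
    spark.sparkContext
    .parallelize([0], 1)
    .map(lambda _: __import__("sys").version_info[:2])
    .first()
)
print("worker:", worker_ver)
```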

I have set the environment variables using os.environ by providing the path to the Python binary, but somehow this approach only works when I run the code in a Jupyter notebook; it does not work when the same code runs via the recipe.
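For reference, this is roughly what my os.environ approach looks like (the binary paths below are placeholders for the actual ones I use). From what I have read, these variables only take effect if they are set before the SparkContext is created, which might be why it works in a notebook but not in the recipe:

```python
import os

# Placeholder paths; in my code these point at the actual Python 3.6 binary.
# As far as I understand, these must be set BEFORE the SparkContext starts;
# if the recipe has already launched Spark, setting them here has no effect.
os.environ["PYSPARK_PYTHON"] = "/usr/bin/python3.6"
os.environ["PYSPARK_DRIVER_PYTHON"] = "/usr/bin/python3.6"

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
```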

So I did some digging and found the community post below:

PySpark Python executables — Dataiku Community

Based on the above, can this issue be resolved by setting the Python binary path in the code environment settings section?

(Unfortunately, I can't upload a picture, so let me write down what it looks like.)

Spark

Yarn Python executable: [ ]
Help text: "Python binary on the Yarn nodes for Pyspark (save, remove then re-install jupyter support to update in notebooks)"
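If it helps for context, my understanding is that this setting corresponds to the standard Spark properties for a YARN cluster. A minimal sketch, assuming YARN and using a placeholder binary path:

```python
from pyspark.sql import SparkSession

# Placeholder path; it should match the Python 3.6 used by the driver.
python_bin = "/usr/bin/python3.6"

spark = (
    SparkSession.builder
    # Python seen by the YARN application master
    .config("spark.yarn.appMasterEnv.PYSPARK_PYTHON", python_bin)
    # Python seen by each executor (worker) process
    .config("spark.executorEnv.PYSPARK_PYTHON", python_bin)
    .getOrCreate()
)
```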

Operating system used: Windows 10

