Added on April 26, 2022 12:51PM
Hello community,
After upgrading Dataiku to 10.0.4, a PySpark recipe suddenly stopped working on the default Python 2.7 code env, failing with the syntax error shown below:
File "/appl/dataiku/dataiku-dss-10.0.4/spark-standalone-home/python/pyspark/find_spark_home.py", line 68
    print("Could not find valid SPARK_HOME while searching {0}".format(paths), file=sys.stderr)
                                                                                   ^
SyntaxError: invalid syntax
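For context, here is a minimal reproduction of my own (not taken from the Dataiku sources): the failing line calls print() as a function with a file= keyword, which is Python 3 syntax. Under Python 2 it only parses with the future import:

```python
from __future__ import print_function  # needed for print(..., file=...) on Python 2
import sys

# On Python 3 (or Python 2 with the future import above), print is a
# function and can redirect its output. Without the import, Python 2
# treats print as a statement and raises SyntaxError on `file=...`,
# which matches the error above.
print("Could not find valid SPARK_HOME", file=sys.stderr)
```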
Our guess is that the underlying PySpark scripts have been updated to Python 3 code and that Python 2 support is deprecated/removed. Is this correct? If so, can PySpark no longer be used with a Python 2 code env at all?
Before the upgrade we were on 9.0.3 and the script worked just fine with Python 2.
I hope someone can help me solve this problem.
Thank you in advance,
Nofit Kartoredjo
Operating system used: RedHat
Spark is no longer compatible with Python 2 in recent DSS releases.
Hi Alex,
Thanks for the suggestion. The first approach is what we are looking for: using 2.7 for one specific notebook. The problem is that the error already occurs at the import level (see picture) when we use a code env with Python 2 and pyspark installed. So even if I added your code, it wouldn't make a difference, because the error starts at this line:
from dataiku import spark as dkuspark
The scripts in this module probably use Python 3. Is there any way to adjust this?
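As a stopgap for recipes that might still land on a Python 2 code env, a sketch of my own (the helper name is hypothetical) is to fail fast with a clear message before the Python-3-only import is ever reached:

```python
import sys

def require_python3(module_name):
    """Raise a clear error when running on Python 2, since modules such as
    dataiku.spark now ship Python-3-only syntax (hypothetical helper)."""
    if sys.version_info[0] < 3:
        raise RuntimeError(
            "%s needs a Python 3 code env; Spark dropped Python 2 support "
            "in recent DSS releases." % module_name)

require_python3("dataiku.spark")
# Do the real import only after the check, so the failure is explicit
# rather than an opaque SyntaxError deep inside pyspark, e.g.:
# from dataiku import spark as dkuspark
```

This doesn't make Python 2 work again, but it turns the confusing SyntaxError into an actionable message for notebook users.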
Thanks again!
Hi Alex,
Thank you for the help. You can close the ticket; I told the user to just use Python 3.6, since Python 2 already gives a deprecation warning.