
PySpark Lists extract issue

Solved!
arghod
Level 2
Hi,

I am having an issue with lists in PySpark. My understanding is that Dataiku's PySpark recipe accepts Python code. I have some simple code that works in a Python recipe but not in a PySpark recipe.

 

List = {
"A":"URL1",
"B":"URL2",
"C":"URL3",
"D":"URL4",
"E":"URL5"
}

GIP = list(List.keys())[2]
EOD = list(List.keys())[0]
PIR = list(List.keys())[3]
VAL = list(List.keys())[4]
COV = list(List.keys())[1]

print(GIP)
print(EOD)
print(PIR)
print(VAL)
print(COV)

 

The print statements right after show the following:
GIP = "B"   *** Should be C
EOD = "A"   *** Should be A
PIR = "C"   *** Should be D
VAL = "D"   *** Should be E
COV = "E"   *** Should be B

No idea why; the logic is simple and works exactly as is in a Python recipe. I considered that it might somehow be picking up a List variable from another recipe, but I renamed it to a unique name (e.g. Listasdasdas) and had the same issue.

1 Solution
arghod
Level 2
Author

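Hi, the issue is solved. It came down to the Python environment. In Python 3.6 and later, the built-in dict preserves insertion order (an implementation detail of CPython 3.6, guaranteed by the language from 3.7). Older versions make no promise about key order, so list(List.keys())[2] can return any of the keys. I changed the code environment from Python 2 (the default) to Python 3.6 and it worked.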
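
For anyone who cannot switch the code environment, here is a minimal sketch of the same lookup that does not rely on the built-in dict keeping its order. It uses collections.OrderedDict from the standard library (available since Python 2.7). The names urls and keys are made up for illustration, and the values are the placeholders from the question.

```python
from collections import OrderedDict

# Hypothetical stand-in for the dictionary in the question.
# Building it from a list of pairs fixes the key order on every Python version.
urls = OrderedDict([
    ("A", "URL1"),
    ("B", "URL2"),
    ("C", "URL3"),
    ("D", "URL4"),
    ("E", "URL5"),
])

# Keys now come back in insertion order: A, B, C, D, E.
keys = list(urls.keys())

GIP = keys[2]
EOD = keys[0]
PIR = keys[3]
VAL = keys[4]
COV = keys[1]

# Joined into one string so the output is identical on Python 2 and 3.
print(" ".join([GIP, EOD, PIR, VAL, COV]))  # C A D E B
```

Note that the pairs have to be passed as a list. Passing a plain dict literal to OrderedDict on an older Python would lose the order before OrderedDict ever sees it.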
