PySpark Lists extract issue

arghod Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 4 ✭✭✭✭
edited July 16 in Using Dataiku

Hi,

I am having an issue with lists in PySpark. My understanding is that Dataiku's PySpark recipe accepts plain Python code. I have simple code that works in a Python recipe but not in a PySpark recipe.

List = {
"A":"URL1",
"B":"URL2",
"C":"URL3",
"D":"URL4",
"E":"URL5"
}

GIP = list(List.keys())[2]
EOD = list(List.keys())[0]
PIR = list(List.keys())[3]
VAL = list(List.keys())[4]
COV = list(List.keys())[1]

print(GIP)
print(EOD)
print(PIR)
print(VAL)
print(COV)

The print statements immediately after show the following:
GIP= "B" ***Should be C
EOD= "A" ***Should be A
PIR= "C" ***Should be D
VAL= "D" ***Should be E
COV= "E" ***Should be B

I have no idea why; the logic is simple and works exactly as-is in a Python recipe. I wondered whether it was somehow picking up a List variable from another recipe, but I renamed it to a unique name (e.g. Listasdasdas) and had the same issue.
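One way to narrow this down is to print the interpreter version the recipe runs on and the key order the dict actually reports; a minimal diagnostic sketch, assuming the List dict defined above (the sys.version check is illustrative, not from the original post):

import sys

print(sys.version)          # which Python interpreter this recipe is actually running
print(list(List.keys()))    # the key order this environment actually produces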

Best Answer

  • arghod Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 4 ✭✭✭✭
    Answer ✓

    Hi, the issue was solved. It had to do with the Python environment: in Python 3.6+ regular dicts preserve insertion order, while older versions (including Python 2) don't, so I changed the recipe's code environment from the default Python 2 to Python 3.6 and it worked.
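    For readers stuck on a Python 2 code environment, an order-independent alternative is collections.OrderedDict, or simply referring to keys directly rather than by position. A minimal sketch along those lines (variable names follow the original post; the OrderedDict approach is a suggestion, not part of the original answer):

    from collections import OrderedDict

    # OrderedDict preserves insertion order on Python 2.7 as well as Python 3
    List = OrderedDict([
        ("A", "URL1"),
        ("B", "URL2"),
        ("C", "URL3"),
        ("D", "URL4"),
        ("E", "URL5"),
    ])

    GIP = list(List.keys())[2]  # "C" regardless of Python version
    EOD = list(List.keys())[0]  # "A"
    PIR = list(List.keys())[3]  # "D"
    VAL = list(List.keys())[4]  # "E"
    COV = list(List.keys())[1]  # "B"

    # Simpler still: if the URLs are what you need, look them up by key
    # instead of relying on key position, e.g. url = List["C"]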
