PySpark Lists extract issue

arghod Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 4 ✭✭✭✭
edited July 16 in Using Dataiku

Hi,

I am having an issue with lists in PySpark. My understanding is that Dataiku's PySpark recipe accepts plain Python code. I have simple code that works in a Python recipe but not in a PySpark recipe.

List = {
"A":"URL1",
"B":"URL2",
"C":"URL3",
"D":"URL4",
"E":"URL5"
}

GIP = list(List.keys())[2]
EOD = list(List.keys())[0]
PIR = list(List.keys())[3]
VAL = list(List.keys())[4]
COV = list(List.keys())[1]

print(GIP)
print(EOD)
print(PIR)
print(VAL)
print(COV)

The print statements immediately after show the following:
GIP= "B" ***Should be C
EOD= "A" ***Should be A
PIR= "C" ***Should be D
VAL= "D" ***Should be E
COV= "E" ***Should be B

I have no idea why; the logic is simple and works exactly as-is in a Python recipe. I wondered whether it was somehow picking up a List variable from another recipe, but I renamed it to a unique name (e.g. Listasdasdas) and had the same issue.
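One way to narrow this down is to print the interpreter version the recipe runs on and the key order the dict actually reports; a minimal diagnostic sketch, assuming the List dict defined above (the sys.version check is illustrative, not from the original post):

import sys

print(sys.version)          # which Python interpreter this recipe is actually running
print(list(List.keys()))    # the key order this environment actually produces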

Best Answer

  • arghod Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 4 ✭✭✭✭
    Answer ✓

    Hi, the issue was solved. It had to do with the Python environment: in Python 3.6+ regular dicts preserve insertion order, while older versions (including Python 2) don't, so I changed the recipe's code environment from the default Python 2 to Python 3.6 and it worked.
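    For readers stuck on a Python 2 code environment, an order-independent alternative is collections.OrderedDict, or simply referring to keys directly rather than by position. A minimal sketch along those lines (variable names follow the original post; the OrderedDict approach is a suggestion, not part of the original answer):

    from collections import OrderedDict

    # OrderedDict preserves insertion order on Python 2.7 as well as Python 3
    List = OrderedDict([
        ("A", "URL1"),
        ("B", "URL2"),
        ("C", "URL3"),
        ("D", "URL4"),
        ("E", "URL5"),
    ])

    GIP = list(List.keys())[2]  # "C" regardless of Python version
    EOD = list(List.keys())[0]  # "A"
    PIR = list(List.keys())[3]  # "D"
    VAL = list(List.keys())[4]  # "E"
    COV = list(List.keys())[1]  # "B"

    # Simpler still: if the URLs are what you need, look them up by key
    # instead of relying on key position, e.g. url = List["C"]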
