Python recipe failing in a Python env

Solved!
danusio

Hey guys!

I set up a Python code env in order to install specific packages, but the following recipe fails when it is run in that env:

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
df_train_asis = dataiku.Dataset("df_train_asis")
df_train_asis_df = df_train_asis.get_dataframe()
df_test_asis = dataiku.Dataset("df_test_asis")
df_test_asis_df = df_test_asis.get_dataframe()


# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.

df_results_asis_df = df_train_asis_df # Compute a Pandas dataframe to write into df_results_asis


# Write recipe outputs
df_results_asis = dataiku.Dataset("df_results_asis")
df_results_asis.write_with_schema(df_results_asis_df)

 

 

 

The error message:

Job failed: The Python process failed (exit code: 1)

Error type: com.dataiku.dip.exceptions.ProcessDiedException

 

The script works fine when run in the built-in DSS env. Any ideas?

Operating system used: MacOS


1 Solution
danusio

I tried with an online Dataiku license and got the same result, but the problem was indeed the Python version: only 3.6 works properly.

12 Replies
Turribeach

The recipe looks fine. How exactly did you set up the Python environment? Which Python version is it using?

danusio

I'm using Python 3.9. The only thing I changed was adding joblib, category_encoders and imblearn under Requested packages (Pip), and the installation went fine. Currently installed packages:

backcall==0.2.0
category-encoders==2.6.2
certifi==2023.7.22
charset-normalizer==3.3.0
decorator==5.1.1
idna==3.4
imbalanced-learn==0.11.0
imblearn==0.0
ipykernel==4.8.2
ipython==7.34.0
ipython-genutils==0.2.0
jedi==0.19.1
joblib==1.3.2
jupyter-client==5.2.4
jupyter_core==4.11.2
matplotlib-inline==0.1.6
numpy==1.23.5
packaging==23.2
pandas==1.1.5
parso==0.8.3
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
prompt-toolkit==3.0.39
ptyprocess==0.7.0
Pygments==2.16.1
python-dateutil==2.8.1
pytz==2020.5
pyzmq==22.3.0
requests==2.31.0
scikit-learn==1.3.1
scipy==1.11.3
simplegeneric==0.8.1
six==1.16.0
statsmodels==0.14.0
threadpoolctl==3.2.0
tornado==5.1.1
traitlets==4.3.3
urllib3==2.0.6
wcwidth==0.2.8

Turribeach

Make sure Python 3.9 is allowed to run in macOS Settings.

Turribeach

How did you install Python 3.9 on your Mac?

danusio

I tried with an online Dataiku license and got the same result, but the problem was indeed the Python version: only 3.6 works properly.

Turribeach

I ran this recipe on my Mac with Python 3.9 and it works fine. Your code looks fine to me. Can you remove one of the inputs as a test?

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
customers = dataiku.Dataset("customers")
customers_df = customers.get_dataframe()


# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.

customers_out_df = customers_df # For this sample code, simply copy input to output


# Write recipe outputs
customers_out = dataiku.Dataset("customers_out")
customers_out.write_with_schema(customers_out_df)
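
If it still fails with a single input, one way to see which step the process dies on is to print a marker around each call. This is only a rough sketch, assuming the printed markers reach the job log; `run_step` is a hypothetical helper name, not a dataiku function:

```python
import sys
import traceback


def run_step(label, func, *args, **kwargs):
    """Run one step, printing markers so the log shows where execution stops."""
    print("[step] starting: {}".format(label))
    sys.stdout.flush()
    try:
        result = func(*args, **kwargs)
    except Exception:
        # Print the full traceback before re-raising so it is not lost.
        traceback.print_exc()
        sys.stderr.flush()
        raise
    print("[step] finished: {}".format(label))
    sys.stdout.flush()
    return result


# Hypothetical usage inside the recipe (needs the dataiku package):
#   customers = run_step("open input", dataiku.Dataset, "customers")
#   customers_df = run_step("read input", customers.get_dataframe)
#   customers_out = run_step("open output", dataiku.Dataset, "customers_out")
#   run_step("write output", customers_out.write_with_schema, customers_df)
```

The last "starting" marker without a matching "finished" marker would point at the call that fails.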

 

danusio

Done, and I got the same error. Code:

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
df_train_asis = dataiku.Dataset("df_train_asis")
df_train_asis_df = df_train_asis.get_dataframe()


# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.

df_check_env_df = df_train_asis_df # Compute a Pandas dataframe to write into df_check_env


# Write recipe outputs
df_check_env = dataiku.Dataset("df_check_env")
df_check_env.write_with_schema(df_check_env_df)

Turribeach

What connection are your datasets on?

danusio

@Turribeach Amazon S3.

Turribeach

What if you move your datasets to a filesystem connection?

danusio

@Turribeach I'm not able to do that, as I'm using a third-party Dataiku instance. Is it possible that the instance is not set up to work properly with Python versions above 3.6?

Turribeach

Anything is possible, but without being an admin you can't really fix anything. I suggest you raise the issue with your Dataiku administrator.
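
When you raise it, it may help to include what the failing env actually reports. Here is a minimal sketch, assuming you can paste it at the top of the recipe or run it in a notebook attached to the same env; `describe_env` is a made-up helper name, not part of the dataiku API:

```python
import platform
import sys


def describe_env(packages=("pandas", "numpy")):
    """Return the interpreter version and the versions of selected packages."""
    info = {
        "python": platform.python_version(),
        "executable": sys.executable,
    }
    for name in packages:
        try:
            module = __import__(name)
            info[name] = getattr(module, "__version__", "unknown")
        except Exception as exc:
            # A failing import is itself useful diagnostic information.
            info[name] = "import failed: {!r}".format(exc)
    return info


if __name__ == "__main__":
    for key, value in describe_env().items():
        print("{}: {}".format(key, value))
```

If the job dies before any of this is printed, that in itself suggests the failure happens when the interpreter starts rather than in the recipe code.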
