Hey guys!
I set up a Python code env in order to install specific packages, but the following recipe fails when run in that env:
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
df_train_asis = dataiku.Dataset("df_train_asis")
df_train_asis_df = df_train_asis.get_dataframe()
df_test_asis = dataiku.Dataset("df_test_asis")
df_test_asis_df = df_test_asis.get_dataframe()
# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.
df_results_asis_df = df_train_asis_df # Compute a Pandas dataframe to write into df_results_asis
# Write recipe outputs
df_results_asis = dataiku.Dataset("df_results_asis")
df_results_asis.write_with_schema(df_results_asis_df)
The error message:
Job failed: The Python process failed (exit code: 1)
Error type:com.dataiku.dip.exceptions.ProcessDiedException
The script works fine when run in the builtin DSS env. Any idea?
Operating system used: MacOS
I tried with an online Dataiku license and got the same error. The problem was indeed the Python version: only 3.6 works properly.
The recipe looks fine. How exactly did you set up the Python environment? What Python version is it using?
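One way to answer this from inside the failing recipe is to log the interpreter version and try each import separately, before any dataset is read. Below is a minimal sketch that uses only the standard library. `probe_imports` and `python_version` are hypothetical helper names, and the module list is an assumption taken from the recipe's own imports:

```python
import importlib
import sys


def python_version():
    """Return the running interpreter version as 'major.minor.micro'."""
    return ".".join(str(part) for part in sys.version_info[:3])


def probe_imports(module_names):
    """Try to import each module; map its name to a version string or the error raised."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:  # report any import-time failure instead of stopping
            results[name] = "FAILED: {}: {}".format(type(exc).__name__, exc)
        else:
            results[name] = str(getattr(module, "__version__", "unknown"))
    return results


# Print the findings so they show up in the job log.
print("Python", python_version())
for name, status in probe_imports(["dataiku", "pandas", "numpy"]).items():
    print(name, "->", status)
```

Placed at the very top of the recipe, this should show in the job log which interpreter the code env really uses and whether one of the imports is what fails. It cannot catch a crash inside native code, which would still end the process.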
I'm using Python 3.9. The only thing I changed was adding joblib, category_encoders and imblearn to Requested packages (Pip), and the installation went well. Currently installed packages:
backcall==0.2.0
category-encoders==2.6.2
certifi==2023.7.22
charset-normalizer==3.3.0
decorator==5.1.1
idna==3.4
imbalanced-learn==0.11.0
imblearn==0.0
ipykernel==4.8.2
ipython==7.34.0
ipython-genutils==0.2.0
jedi==0.19.1
joblib==1.3.2
jupyter-client==5.2.4
jupyter_core==4.11.2
matplotlib-inline==0.1.6
numpy==1.23.5
packaging==23.2
pandas==1.1.5
parso==0.8.3
patsy==0.5.3
pexpect==4.8.0
pickleshare==0.7.5
prompt-toolkit==3.0.39
ptyprocess==0.7.0
Pygments==2.16.1
python-dateutil==2.8.1
pytz==2020.5
pyzmq==22.3.0
requests==2.31.0
scikit-learn==1.3.1
scipy==1.11.3
simplegeneric==0.8.1
six==1.16.0
statsmodels==0.14.0
threadpoolctl==3.2.0
tornado==5.1.1
traitlets==4.3.3
urllib3==2.0.6
wcwidth==0.2.8
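To compare this list against another environment (for example the builtin one, where the recipe works), the pins can be parsed into a dictionary and diffed. Below is a minimal sketch. `parse_pins` and `diff_pins` are hypothetical helpers, and the second set of pins is made-up sample input, not output from any real environment:

```python
def parse_pins(freeze_text):
    """Parse whitespace-separated 'name==version' pins into a {name: version} dict."""
    pins = {}
    for token in freeze_text.split():
        name, sep, version = token.partition("==")
        if sep:  # skip anything that is not an exact pin
            pins[name.lower()] = version
    return pins


def diff_pins(left, right):
    """Return {name: (left_version, right_version)} for packages that differ or are missing."""
    names = sorted(set(left) | set(right))
    return {
        name: (left.get(name), right.get(name))
        for name in names
        if left.get(name) != right.get(name)
    }


# A few pins copied from the list above.
custom_env = parse_pins("numpy==1.23.5 pandas==1.1.5 scikit-learn==1.3.1")
# Made-up pins standing in for a second environment (illustrative only).
other_env = parse_pins("numpy==1.23.5 pandas==1.3.5")

for name, (ours, theirs) in diff_pins(custom_env, other_env).items():
    print(name, ours, theirs)
```

A version of `None` on either side means the package is missing from that environment. Because the parser splits on any whitespace, it accepts the list either on one line or one package per line.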
Make sure Python 3.9 is allowed to run in the macOS settings.
How did you install Python 3.9 on your Mac?
I ran this recipe on my Mac with Python 3.9 and it works fine. Your code looks fine to me. Can you remove one of the inputs as a test?
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
customers = dataiku.Dataset("customers")
customers_df = customers.get_dataframe()
# Compute recipe outputs from inputs
# TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
# NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.
customers_out_df = customers_df # For this sample code, simply copy input to output
# Write recipe outputs
customers_out = dataiku.Dataset("customers_out")
customers_out.write_with_schema(customers_out_df)
Done, I got the same. Code:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
df_train_asis = dataiku.Dataset("df_train_asis")
df_train_asis_df = df_train_asis.get_dataframe()
# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.
df_check_env_df = df_train_asis_df # Compute a Pandas dataframe to write into df_check_env
# Write recipe outputs
df_check_env = dataiku.Dataset("df_check_env")
df_check_env.write_with_schema(df_check_env_df)
What connection are your datasets on?
@Turribeach Amazon S3.
What if you move your datasets to a file system connection?
@Turribeach I can't do that, I'm using a third-party Dataiku instance. Is it possible that the instance is not set up to work properly with Python versions above 3.6?
Anything is possible but without being an Admin you can't really solve anything. I suggest you take up the issue with your Dataiku Administrator.