Python recipe failing in a Python env

danusio Registered Posts: 6

Hey guys!

I set up a Python code env in order to install specific packages, but the following recipe fails when executed in that env:

import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu

# Read recipe inputs
df_train_asis = dataiku.Dataset("df_train_asis")
df_train_asis_df = df_train_asis.get_dataframe()
df_test_asis = dataiku.Dataset("df_test_asis")
df_test_asis_df = df_test_asis.get_dataframe()


# Compute recipe outputs
# TODO: Write here your actual code that computes the outputs
# NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.

df_results_asis_df = df_train_asis_df # Compute a Pandas dataframe to write into df_results_asis


# Write recipe outputs
df_results_asis = dataiku.Dataset("df_results_asis")
df_results_asis.write_with_schema(df_results_asis_df)

The error message:

Job failed: The Python process failed (exit code: 1)

Error type:com.dataiku.dip.exceptions.ProcessDiedException

The script works fine when run in the builtin DSS env. Any idea?


Operating system used: MacOS


Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    The recipe looks fine. How exactly did you set up the Python environment? Which Python version is it using?
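
    One way to answer the version question is to run a short diagnostic in a recipe that uses the code env and read the result in the job log. The following is a minimal sketch using only the standard library; the helper name `describe_interpreter` is made up for illustration, and it assumes the recipe can at least start the interpreter.

```python
import platform
import sys


def describe_interpreter():
    """Return basic facts about the interpreter running this code."""
    return {
        "python_version": platform.python_version(),
        "executable": sys.executable,  # path of the running Python binary
        "prefix": sys.prefix,          # root directory of the active environment
        "platform": platform.platform(),
    }


# Print one "key: value" line per fact so it is easy to find in the job log.
info = describe_interpreter()
for key, value in info.items():
    print(f"{key}: {value}")
```

    If the printed executable path does not point into the code env you created, the recipe is probably not running in the env you expect.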

  • danusio
    danusio Registered Posts: 6
    edited July 17

    I'm using Python 3.9. The only thing I changed was adding joblib, category_encoders and imblearn to Requested packages (Pip); the installation completed without errors. Currently installed packages:

    backcall==0.2.0
    category-encoders==2.6.2
    certifi==2023.7.22
    charset-normalizer==3.3.0
    decorator==5.1.1
    idna==3.4
    imbalanced-learn==0.11.0
    imblearn==0.0
    ipykernel==4.8.2
    ipython==7.34.0
    ipython-genutils==0.2.0
    jedi==0.19.1
    joblib==1.3.2
    jupyter-client==5.2.4
    jupyter_core==4.11.2
    matplotlib-inline==0.1.6
    numpy==1.23.5
    packaging==23.2
    pandas==1.1.5
    parso==0.8.3
    patsy==0.5.3
    pexpect==4.8.0
    pickleshare==0.7.5
    prompt-toolkit==3.0.39
    ptyprocess==0.7.0
    Pygments==2.16.1
    python-dateutil==2.8.1
    pytz==2020.5
    pyzmq==22.3.0
    requests==2.31.0
    scikit-learn==1.3.1
    scipy==1.11.3
    simplegeneric==0.8.1
    six==1.16.0
    statsmodels==0.14.0
    threadpoolctl==3.2.0
    tornado==5.1.1
    traitlets==4.3.3
    urllib3==2.0.6
    wcwidth==0.2.8
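
    A list like the one above can also be checked from inside the env itself, which confirms that the recipe really sees these versions. This is a rough sketch, assuming Python 3.8 or later (for `importlib.metadata`); the `PACKAGES` list and the `installed_versions` helper are illustrative choices, not part of any Dataiku API.

```python
from importlib import metadata

# Distribution names to check; adjust to whatever the env is expected to contain.
PACKAGES = ["pandas", "numpy", "joblib", "category-encoders", "imbalanced-learn"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name}: {version or 'NOT INSTALLED'}")
```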
  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    Make sure Python 3.9 is allowed to open in the macOS settings.

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    How did you install Python 3.9 on your Mac?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron
    edited July 17

    I ran this recipe on my Mac with Python 3.9 and it works fine. Your code looks fine to me. Can you remove one of the inputs to test?

    # -*- coding: utf-8 -*-
    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu
    
    # Read recipe inputs
    customers = dataiku.Dataset("customers")
    customers_df = customers.get_dataframe()
    
    
    # Compute recipe outputs from inputs
    # TODO: Replace this part by your actual code that computes the output, as a Pandas dataframe
    # NB: DSS also supports other kinds of APIs for reading and writing data. Please see doc.
    
    customers_out_df = customers_df # For this sample code, simply copy input to output
    
    
    # Write recipe outputs
    customers_out = dataiku.Dataset("customers_out")
    customers_out.write_with_schema(customers_out_df)

  • danusio
    danusio Registered Posts: 6
    edited July 17

    Done, I got the same error. Code:

    # -*- coding: utf-8 -*-
    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu
    
    # Read recipe inputs
    df_train_asis = dataiku.Dataset("df_train_asis")
    df_train_asis_df = df_train_asis.get_dataframe()
    
    
    # Compute recipe outputs
    # TODO: Write here your actual code that computes the outputs
    # NB: DSS supports several kinds of APIs for reading and writing data. Please see doc.
    
    df_check_env_df = df_train_asis_df # Compute a Pandas dataframe to write into df_check_env
    
    
    # Write recipe outputs
    df_check_env = dataiku.Dataset("df_check_env")
    df_check_env.write_with_schema(df_check_env_df)
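
    Since the single-input version fails too, it may help to find out which step the process reaches before it dies. The following is a minimal sketch, not a Dataiku feature: `run_step` is a hypothetical helper that prints a marker before and after each step, so the last marker in the job log shows where execution stopped. It only catches ordinary Python exceptions; a hard crash of the process would simply leave the final "starting" marker without a matching "ok".

```python
import traceback


def run_step(label, func, *args, **kwargs):
    """Run func, printing markers so the job log shows which step failed."""
    print(f"[step] {label}: starting", flush=True)
    try:
        result = func(*args, **kwargs)
    except Exception:
        print(f"[step] {label}: FAILED", flush=True)
        traceback.print_exc()
        raise  # keep the original failure visible to the job
    print(f"[step] {label}: ok", flush=True)
    return result


# Stand-in step so the sketch runs anywhere; in the recipe above the steps
# would be the existing calls, for example:
#   df = run_step("read input", df_train_asis.get_dataframe)
#   run_step("write output", df_check_env.write_with_schema, df)
total = run_step("demo", sum, [1, 2, 3])
```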
  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    What connection are your datasets on?

  • danusio
    danusio Registered Posts: 6
  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    What if you move your datasets to a file system connection?

  • danusio
    danusio Registered Posts: 6

    @Turribeach
    I'm not able to do that; I'm using a third-party Dataiku instance. Is it possible that the instance is not set up to work properly with Python versions above 3.6?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,757 Neuron

    Anything is possible, but without admin rights you can't really fix this yourself. I suggest you raise the issue with your Dataiku administrator.
