Function does not reduce error

jose_deoliveira
jose_deoliveira Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 2 ✭✭✭
edited July 2024 in Using Dataiku

Hi,

I'm having trouble with the following Python recipe.

# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
from statsmodels.stats.stattools import medcouple

# Read recipe inputs
COLETA_f_datas = dataiku.Dataset("COLETA_f_datas")
COLETA_f_datas_df = COLETA_f_datas.get_dataframe()

# Define relevant functions
def q1(x):
    return x.quantile(0.25)

def q3(x):
    return x.quantile(0.75)

def mc(x):
    y = x[~pd.isnull(x)]
    return medcouple(y)

# Compute recipe outputs from inputs

aggregation_data_compra_df = COLETA_f_datas_df.groupby(['Data', 'ATIVO'])['COMPRA'].agg({'COMPRA_': [q1,q3, mc]}).reset_index()
#aggregation_data_venda_df = COLETA_f_datas_df.groupby(['Data', 'ATIVO'])['VENDA'].agg({'VENDA_': [ q1, q3, medcouple]}).reset_index()
#aggregation_data_indicativa_df = COLETA_f_datas_df.groupby(['Data', 'ATIVO'])['INDICATIVA'].agg({'INDICATIVA_': [ q1, q3, medcouple]}).reset_index()

#Taxas_brutas_stats_df = pd.merge(pd.merge(aggregation_data_compra_df,aggregation_data_venda_df, how ='left'),aggregation_data_indicativa_df, how = 'left')

#Taxas_brutas_stats_df.columns = ['Data', 'ATIVO','TaxaCompra_q1','TaxaCompra_q3','TaxaCompra_MC','TaxaVenda_q1','TaxaVenda_q3','TaxaVenda_MC','TaxaIndicativa_q1','TaxaIndicativa_q3','TaxaIndicativa_MC']

COLETA_F_STATS_df = aggregation_data_compra_df

# Write recipe outputs
COLETA_F_STATS = dataiku.Dataset("COLETA_F_STATS")
COLETA_F_STATS.write_with_schema(COLETA_F_STATS_df)

Running it raises the error in the title. However, the same function runs without errors on another platform. If I replace the mc function with plain medcouple, the recipe runs, but medcouple seems to take null cells into account and so returns a wrong result. That is why I discard the original vector and build a new one containing only valid values. Can anyone assist me?

Thanks!

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,349 Dataiker

    Hi @jose_deoliveira,

    The "Function does not reduce" error may be related to the pandas version used.

    What pandas version are you using in DSS?

    And what pandas version are you using when you test this code externally?
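
To answer both questions, the installed version can be printed from inside the recipe and from the external environment. This is only a minimal sketch; the helper name `pandas_major_minor` is illustrative and not part of pandas or DSS.

```python
import pandas as pd


def pandas_major_minor():
    """Return the installed pandas version as a (major, minor) tuple of ints."""
    major, minor = pd.__version__.split(".")[:2]
    return int(major), int(minor)


# Run this both in the DSS recipe and in the external environment, then compare.
print("pandas", pd.__version__, "->", pandas_major_minor())
```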

    Starting with DSS 10.0.4, we also support Pandas 1.1, Pandas 1.2 and Pandas 1.3.
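
If changing the pandas version is not possible, one workaround that may be worth trying is to make the custom aggregation return a plain Python float. This rests on an assumption about the cause, not a confirmed diagnosis: older pandas versions can raise "Function does not reduce" when an aggregation function hands back an array-like object instead of a scalar. The sketch below uses a hypothetical helper, `as_scalar_reducer`, and a stand-in function, `median_as_array`, so that it runs with only pandas and NumPy; the sample data is made up.

```python
import numpy as np
import pandas as pd


def as_scalar_reducer(func):
    """Wrap func so it only sees non-null values and returns a plain float."""
    def reducer(series):
        values = series.dropna().values.astype(float)
        if values.size == 0:
            return float("nan")
        # .item() collapses a 0-d or single-element array into a Python scalar
        return float(np.asarray(func(values)).item())
    # Keep the original name so the output column is labelled after func
    reducer.__name__ = getattr(func, "__name__", "reducer")
    return reducer


def median_as_array(values):
    # Hypothetical stand-in: returns its result wrapped in a 0-d NumPy array
    return np.asarray(np.median(values))


# Made-up sample data using the column names from the recipe
df = pd.DataFrame({
    "Data": ["2024-01-01"] * 4 + ["2024-01-02"] * 3,
    "ATIVO": ["A"] * 4 + ["B"] * 3,
    "COMPRA": [1.0, 2.0, np.nan, 6.0, 4.0, np.nan, 8.0],
})

mc_safe = as_scalar_reducer(median_as_array)
stats = df.groupby(["Data", "ATIVO"])["COMPRA"].agg([mc_safe]).reset_index()
print(stats)
```

In the recipe itself, this would presumably become `mc = as_scalar_reducer(medcouple)`, with the rest of the aggregation left unchanged.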

  • jose_deoliveira
    jose_deoliveira Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 2 ✭✭✭

    Hi Alex,

    I'm using version 0.23.4 in DSS and 1.3.5 externally. When I create a new Python recipe, I cannot see any option to change it. Maybe that is because I do not have the required permissions? If I change the version to 1.3, will the change apply to every other Python recipe or only to this one?

    Thanks!

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,349 Dataiker

    Hi,

    You can change the Pandas version in the code environment used for this recipe.

    You can create a new code env with Pandas 1.0 or higher, depending on the DSS version and Python version:

    https://doc.dataiku.com/dss/latest/code-envs/index.html

    You can choose the Core package version:

    [Screenshot: CA61A62B-1642-4526-91E2-1397E6420345.jpeg]

    Let me know if that helps solve your issue.

    Thanks
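
One thing to keep in mind when moving to Pandas 1.0 or higher: as far as I know, the dict-of-lists renaming form `agg({'COMPRA_': [...]})` on a single grouped column was deprecated in pandas 0.20 and removed in 1.0, where it raises a `SpecificationError`. A sketch of an equivalent that passes a list of functions and renames the columns afterwards is shown below. The sample data is made up, and the output column names are borrowed from the commented-out line in the recipe.

```python
import pandas as pd


def q1(x):
    return x.quantile(0.25)


def q3(x):
    return x.quantile(0.75)


# Made-up sample data using the column names from the recipe
df = pd.DataFrame({
    "Data": ["2024-01-01"] * 4 + ["2024-01-02"] * 3,
    "ATIVO": ["A"] * 4 + ["B"] * 3,
    "COMPRA": [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0],
})

# A list of functions yields one column per function, named after the function
compra_stats = (
    df.groupby(["Data", "ATIVO"])["COMPRA"]
    .agg([q1, q3])
    .reset_index()
    .rename(columns={"q1": "TaxaCompra_q1", "q3": "TaxaCompra_q3"})
)
print(compra_stats)
```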
