Hi,
I created a Python recipe to standardize columns:
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# apply the z-score method in Pandas using the .mean() and .std() methods
def z_score(df):
    # copy the dataframe
    df_std = df.copy()
    # apply the z-score method
    for column in df_std.columns:
        if is_numeric_dtype(df_std[column]):
            df_std[column] = (df_std[column] - df_std[column].mean()) / df_std[column].std()
    return df_std
# Read recipe inputs
spam_prepared = dataiku.Dataset("spam_prepared")
spam_prepared_df = spam_prepared.get_dataframe()
spam_standarized_df = z_score(spam_prepared_df)
# Write recipe outputs
spam_standarized = dataiku.Dataset("spam_standarized")
spam_standarized.write_with_schema(spam_standarized_df)
But Dataiku is telling me that is_numeric_dtype does not exist. What am I doing wrong?
Thanks
Hi Iceberg,
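is_numeric_dtype is a function in the pandas.api.types module, so Python cannot find it under its bare name unless you import it. In your code snippet, replace each occurrence of is_numeric_dtype with pd.api.types.is_numeric_dtype, or add from pandas.api.types import is_numeric_dtype at the top of the recipe.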
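As a rough illustration, here is a minimal sketch of the function with the import in place. It uses plain pandas so it can be tried outside Dataiku; sample_df and its column names are made up for the example and stand in for the dataframe returned by get_dataframe().

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def z_score(df):
    """Return a copy of df with every numeric column standardized."""
    df_std = df.copy()
    for column in df_std.columns:
        # Non-numeric columns (e.g. strings) are left untouched
        if is_numeric_dtype(df_std[column]):
            col = df_std[column]
            # pandas .std() uses the sample standard deviation (ddof=1) by default
            df_std[column] = (col - col.mean()) / col.std()
    return df_std


# Hypothetical stand-in for spam_prepared.get_dataframe()
sample_df = pd.DataFrame({
    "word_count": [10.0, 20.0, 30.0],
    "label": ["spam", "ham", "spam"],
})
standardized_df = z_score(sample_df)
print(standardized_df)
```

In the recipe itself, it also helps to keep the returned dataframe in its own variable, separate from the dataiku.Dataset output handle, so that write_with_schema receives a dataframe.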
Best regards,
Agathe