Dataframes not reading NA country code in Python recipe

Ankur30
Ankur30 Partner, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer Posts: 40 Partner

Hi,

Good evening!

I am reading a Dataiku dataset into a pandas dataframe and filtering on country code == 'NA', but the filter returns no rows.

Is there a workaround for this? Kindly help!

Regards,

Ankur.

Best Answer

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker
    edited July 17 Answer ✓

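    This behavior is expected with pandas: by default it parses the string "NA" as a missing value (NaN), so a filter on == 'NA' matches nothing.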
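To make the behavior concrete, here is a minimal sketch using plain pandas and an in-memory CSV. The column names and sample rows are made up for illustration and are not taken from the original dataset.

```python
import io

import pandas as pd

# Tiny stand-in for the real data (hypothetical columns and rows).
csv_text = "country_code,country\nNA,Namibia\nFR,France\n"

# Default parsing: the string "NA" is on pandas' default missing-value list,
# so it is loaded as NaN and the equality filter finds nothing.
df_default = pd.read_csv(io.StringIO(csv_text))
matches_default = df_default[df_default["country_code"] == "NA"]

# keep_default_na=False disables that list, so "NA" stays a plain string.
df_literal = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
matches_literal = df_literal[df_literal["country_code"] == "NA"]

print(len(matches_default))  # 0
print(len(matches_literal))  # 1
```

Note that keep_default_na=False also stops other default markers (such as empty strings) from being read as missing, so other columns may need their missing values handled explicitly.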

    Starting with DSS 9.0.4, get_dataframe() supports the pandas parameter keep_default_na.

    You can pass keep_default_na=False to get_dataframe() so that "NA" is kept as a literal string instead of being converted to a missing value.
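As a rough sketch of how that could look in a recipe, the helper below takes any object that behaves like dataiku.Dataset and exposes get_dataframe(keep_default_na=...), which the answer says is available from DSS 9.0.4. The function name rows_with_code and its column argument are hypothetical, introduced only for this example.

```python
import pandas as pd


def rows_with_code(dataset, column: str, code: str = "NA") -> pd.DataFrame:
    """Return the rows whose `column` equals `code`, keeping "NA" as text.

    `dataset` is assumed to behave like dataiku.Dataset on a DSS version
    whose get_dataframe() accepts keep_default_na (9.0.4+ per the answer).
    """
    # Disable pandas' default missing-value markers so "NA" is not read as NaN.
    df = dataset.get_dataframe(keep_default_na=False)
    return df[df[column] == code]
```

Inside a Python recipe this would be called along the lines of rows_with_code(dataiku.Dataset("countries"), "country_code"), where the dataset and column names are placeholders for your own.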

    For earlier versions, one possible workaround is:

    1) Export the dataset to CSV in a managed folder using the visual Export recipe, then read the CSV from the managed folder with pandas.

    Sample code (this works only with a managed folder on the local filesystem):

    import os

    import dataiku
    import pandas as pd

    # Read recipe inputs (get_path() only works for local filesystem folders)
    managed_folder = dataiku.Folder("D0gTVBY3")
    path = managed_folder.get_path()
    filename = "country_prepared.csv"
    filepath = os.path.join(path, filename)

    # keep_default_na=False keeps "NA" as a string instead of converting it to NaN
    df = pd.read_csv(filepath, keep_default_na=False)
    
