Problem keeping the int variable format when exporting to CSV

scholaschl Dataiku DSS Core Concepts, Registered Posts: 8 ✭✭✭
edited July 16 in Using Dataiku

Hello,

I want to export a table to a CSV file in a folder using this method (Python recipe):

import dataiku
import pandas as pd

temp_folder = "1VZVmdhX"

path_upload_file = "data.csv"
input_dataset = dataiku.Dataset("data")
df = input_dataset.get_dataframe()

handle = dataiku.Folder(temp_folder)
with handle.get_writer(path_upload_file) as w:
    w.write(df.to_csv(sep=';').encode('utf-8'))

However, the variables in my input file are integers (25, for example), and the output CSV turns them into doubles: 25.0.
What would be the solution to keep the int format?

Best Answer

  • Zach
    Zach Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 153 Dataiker
    edited July 17 Answer ✓

    Hi @scholaschl,

    This is most likely happening because Pandas has loaded the column as a float instead of an integer. That happens when the column contains any missing values: Pandas represents them as NaN, which is a float, so the whole column is stored as float64. For more information, see this post.
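
To see the effect in isolation, here is a minimal sketch outside of DSS. The sample data and column names (`id`, `qty`) are made up for illustration, and `pd.read_csv` stands in for `get_dataframe()`, which may infer types somewhat differently:

```python
import io
import pandas as pd

# Hypothetical sample data: "qty" has one missing value, "id" has none.
raw = "id;qty\n1;25\n2;\n3;30\n"
df = pd.read_csv(io.StringIO(raw), sep=";")

# "id" stays an integer column; "qty" is upcast to float64
# because the missing value is stored as NaN.
print(df.dtypes)

# Writing the frame back out shows 25.0 instead of 25 for "qty".
csv_text = df.to_csv(sep=";", index=False)
print(csv_text)
```

Running `df.dtypes` on your own DataFrame before exporting should show whether the affected columns are float64.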

    You can fix it by telling Pandas to drop the decimal when formatting floats:

    import dataiku
    import pandas as pd
    
    temp_folder = "1VZVmdhX"
    
    path_upload_file = "data.csv"
    input_dataset = dataiku.Dataset("data")
    df = input_dataset.get_dataframe()
    
    handle = dataiku.Folder(temp_folder)
    with handle.get_writer(path_upload_file) as w:
        w.write(df.to_csv(sep=';', float_format="%.0f").encode('utf-8'))
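
One caveat: `float_format="%.0f"` applies to every float column, so genuine decimals get rounded too. If that matters, a possible alternative is to cast only the affected columns to the Pandas nullable `Int64` dtype before exporting. The sketch below uses a made-up DataFrame with hypothetical columns `qty` and `price` in place of `input_dataset.get_dataframe()`, and assumes a Pandas version that supports `Int64`:

```python
import pandas as pd

# Hypothetical frame standing in for input_dataset.get_dataframe():
# "qty" holds integers plus a missing value, "price" holds real decimals.
df = pd.DataFrame({"qty": [25.0, None, 30.0], "price": [1.5, 2.25, 3.0]})

# float_format="%.0f" strips the decimals from *all* float columns,
# so "price" loses its fractional part as well.
rounded = df.to_csv(sep=";", index=False, float_format="%.0f")

# Alternative: cast only the integer-valued column to nullable Int64.
# Missing values become <NA> and are written as empty fields.
fixed = df.copy()
fixed["qty"] = fixed["qty"].astype("Int64")
exact = fixed.to_csv(sep=";", index=False)

print(rounded)
print(exact)
```

The resulting string can then be passed to `w.write(...encode('utf-8'))` exactly as in the recipe above. Note that the cast raises an error if the column contains non-integer values, which makes it a useful check in its own right.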

    Thanks,

    Zach
