Hello,
I want to export a table to a CSV file in a managed folder using this method (Python recipe):
import dataiku
import pandas as pd
temp_folder = "1VZVmdhX"
path_upload_file = "data.csv"
input_dataset = dataiku.Dataset("data")
df = input_dataset.get_dataframe()
handle = dataiku.Folder(temp_folder)
with handle.get_writer(path_upload_file) as w:
    w.write(df.to_csv(sep=';').encode('utf-8'))
However, the variables in my input dataset are integers (25, for example), and the output CSV writes them as doubles: 25.0.
How can I keep the integer format?
Hi @scholaschl ,
This is likely occurring because your input data contains rows where Pandas thinks that the value is a float instead of an integer. This can happen if your data contains any missing values, because Pandas treats NaN as a float. For more information, see this post.
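To illustrate the point about missing values, here is a minimal sketch using plain pandas only (no Dataiku calls). The two small inputs and the names `complete` and `with_gap` are made up for the example; they are not taken from the original dataset.

```python
import io

import pandas as pd

# Hypothetical data: the same integer column, once complete and once with a gap.
complete = pd.read_csv(io.StringIO("id;value\n1;25\n2;30\n"), sep=";")
with_gap = pd.read_csv(io.StringIO("id;value\n1;25\n2;\n"), sep=";")

# With no missing values the column keeps an integer dtype.
print(complete["value"].dtype)

# A single missing value forces the whole column to float, because NaN is a float.
print(with_gap["value"].dtype)

# The float dtype is what makes 25 come out as 25.0 in the exported text.
csv_text = with_gap.to_csv(sep=";", index=False)
print(csv_text)
```

Checking `df.dtypes` on the dataframe returned by `get_dataframe()` should show in the same way which columns were read as floats.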
You can fix it by telling Pandas to drop the decimal when formatting floats:
import dataiku
import pandas as pd
temp_folder = "1VZVmdhX"
path_upload_file = "data.csv"
input_dataset = dataiku.Dataset("data")
df = input_dataset.get_dataframe()
handle = dataiku.Folder(temp_folder)
with handle.get_writer(path_upload_file) as w:
    w.write(df.to_csv(sep=';', float_format="%.0f").encode('utf-8'))
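One thing to keep in mind: `float_format="%.0f"` applies to every float column, so a column with real decimals (1.5, for example) would be rounded too. If that matters, a possible alternative is to convert only the whole-number float columns to pandas' nullable `Int64` dtype before exporting. The sketch below is an assumption about how that could be done, not something tested in a Dataiku recipe; the helper `floats_to_nullable_int` and the sample columns `count` and `price` are hypothetical.

```python
import pandas as pd


def floats_to_nullable_int(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy in which float64 columns that hold only whole
    numbers (or missing values) are converted to the nullable Int64 dtype."""
    out = df.copy()
    for col in out.select_dtypes(include="float64").columns:
        present = out[col].dropna()
        # Convert only if every non-missing value has no fractional part.
        if (present % 1 == 0).all():
            out[col] = out[col].astype("Int64")
    return out


# Hypothetical sample: "count" is integer-like with a gap, "price" has real decimals.
df = pd.DataFrame({"count": [25.0, None, 3.0], "price": [1.5, 2.25, None]})

csv_text = floats_to_nullable_int(df).to_csv(sep=";", index=False)
print(csv_text)
```

In the recipe, the converted dataframe would take the place of `df` in the `df.to_csv(sep=';')` call, without the `float_format` argument.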
Thanks,
Zach
Thank you very much for your answer. It works well.