Hi,
I am trying to append data to my current dataset.
I have tried every method (write_row_dict, write_tuple, write_dataframe).
Each run reports "1 rows successfully written", but when I check the dataset in the Dataiku application, only one row is there.
Can anyone guide me on how to resolve this?
Thanks,
I am trying to append data to an input dataset. Can we append to an input dataset?
import dataiku
import pandas
# dataSetName and projectKey are defined elsewhere
data = {'xx': 'xxx', 'key': '1212'}
df = pandas.DataFrame(pandas.json_normalize([data]))
# Despite its name, "writer" is the dataset handle; "vp" is the actual writer
writer = dataiku.core.Dataset(dataSetName, projectKey)
vp = dataiku.core.dataset_write.DatasetWriter(writer)
vp.write_dataframe(df)
No error. It shows "1 rows successfully written".
Thanks
Hello,
You could read your output dataset into a dataframe and append the new data to it.
import dataiku
import pandas as pd
output_dataset = dataiku.Dataset("out")
out_df = output_dataset.get_dataframe()
data = [
{'xx':'row 1','key':'1212'},
{'xx':'row 2','key':'1212'},
]
df_to_append = pd.json_normalize(data)
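# The writer comes from the dataset, not from the dataframe.
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0.
with output_dataset.get_writer() as writer:
    writer.write_dataframe(pd.concat([out_df, df_to_append], ignore_index=True))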
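If you want to try the concatenation step outside DSS first, here is a minimal sketch in plain pandas. The helper name append_rows and the one-row existing_df are made up for illustration; existing_df only stands in for whatever output_dataset.get_dataframe() would return.

```python
import pandas as pd


def append_rows(existing_df, new_records):
    """Return existing_df with new_records (a list of dicts) added at the end.

    Hypothetical helper for illustration; not part of the dataiku package.
    """
    new_df = pd.json_normalize(new_records)
    # ignore_index=True renumbers the rows 0..n-1 in the combined frame
    return pd.concat([existing_df, new_df], ignore_index=True)


# Stand-in for output_dataset.get_dataframe(): one row already stored.
existing_df = pd.DataFrame([{"xx": "row 0", "key": "1212"}])

new_records = [
    {"xx": "row 1", "key": "1212"},
    {"xx": "row 2", "key": "1212"},
]

combined_df = append_rows(existing_df, new_records)
print(len(combined_df))  # 3
```

The combined frame holds the old row plus the two new ones, and that full frame is what gets written back to the dataset.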
Hope this solves your issue
Alex
Hello,
If you're using a Python Recipe in the flow you may want to try setting the output dataset to 'Append instead of overwrite' under 'Inputs/Outputs'. This will add new rows to the output dataset every time the recipe is run instead of overwriting the data.
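To make the difference between the two modes concrete, here is a toy model in plain pandas. ToyDataset and run_recipe are hypothetical names used only for this illustration and are not the Dataiku API; the sketch just assumes that an overwrite write replaces the stored rows and an append write adds to them.

```python
import pandas as pd


class ToyDataset:
    """Hypothetical in-memory stand-in for a dataset; not part of the Dataiku API."""

    def __init__(self, append_mode=False):
        self.append_mode = append_mode
        self.rows = pd.DataFrame()

    def write(self, df):
        if self.append_mode and not self.rows.empty:
            # Append mode: keep what is stored and add the new rows
            self.rows = pd.concat([self.rows, df], ignore_index=True)
        else:
            # Overwrite mode (or first write): stored rows are replaced
            self.rows = df.reset_index(drop=True)


def run_recipe(dataset, run_number):
    """One simulated recipe run that writes a single row."""
    dataset.write(pd.DataFrame([{"xx": f"run {run_number}", "key": "1212"}]))


overwrite_ds = ToyDataset(append_mode=False)
append_ds = ToyDataset(append_mode=True)
for run_number in range(1, 4):
    run_recipe(overwrite_ds, run_number)
    run_recipe(append_ds, run_number)

print(len(overwrite_ds.rows))  # 1
print(len(append_ds.rows))     # 3
```

Under this model, three single-row runs in overwrite mode leave only the last row, which matches the "1 rows successfully written" symptom described in the question.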
You can also append the rows in pandas within your Python code: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html (note that DataFrame.append was removed in pandas 2.0, where pd.concat is the replacement). Since a dataframe can be written directly to a dataset in DSS, we can append the new rows to the dataframe in the Python code and then write that dataframe to the dataset. I've included a small sample below that illustrates this. It reads a dataset into a dataframe (test_csv_df) and appends that dataframe to itself in a separate dataframe. The combined dataframe is then written to another dataset in DSS.
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd, numpy as np
from dataiku import pandasutils as pdu
# Read recipe inputs
test_csv = dataiku.Dataset("Test-csv")
test_csv_df = test_csv.get_dataframe()
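# Append the dataframe to itself (pd.concat replaces DataFrame.append, removed in pandas 2.0)
temp_df = pd.concat([test_csv_df, test_csv_df], ignore_index=True)
# Write recipe outputs
collapsed_data = dataiku.Dataset("collapsed-data")
collapsed_data.write_with_schema(temp_df)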
Hope this helps!
Andrew M