
Append a pandas dataframe to an already existing Dataset within a plugin


I'm creating a custom plugin containing a recipe that evaluates a machine learning model and outputs a DSS Dataset with performance metrics (very similar to the built-in Evaluate recipe). However, each time I train the model, I would like to append the new performance record to the existing Dataset rather than overwrite it.

The code I'm using at the end of my plugin recipe to produce this Dataset is the following:

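import dataiku
from dataiku.customrecipe import get_output_names_for_role

# Resolve the dataset bound to the 'output_perf' role and write the metrics to it
output_dataset_name = get_output_names_for_role('output_perf')[0]
performance_metrics = dataiku.Dataset(output_dataset_name)
performance_metrics.write_with_schema(metrics_df)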

metrics_df is a pandas dataframe holding the new performance record that I would like to append to the existing Dataset.

I know that write_with_schema overwrites the existing dataset, but in the docs I couldn't find an argument or another method that appends a pandas dataframe to an existing DSS Dataset. Is there a way to achieve my objective?

2 Replies
Dataiker

Hi @RicSpd 

In the Input/Output tab of your Python recipe, tick the "Append instead of overwrite" option for the output dataset.

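You can also use the write_dataframe method, which writes the dataframe without editing the dataset's schema, so the columns of metrics_df need to match the existing dataset.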
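As a rough illustration of the two points above, here is a minimal sketch of how the end of the recipe might look. It assumes the append option is ticked on the output dataset. FakeAppendDataset is a hypothetical stand-in (not part of the dataiku package) that only mimics append behaviour so the snippet can run outside DSS, and the "run" and "auc" columns are made up for the example.

```python
import pandas as pd


def append_metrics(dataset, metrics_df):
    """Write one batch of metric rows to `dataset`.

    Inside DSS, `dataset` would be a dataiku.Dataset whose recipe output has
    the append option ticked. write_dataframe does not edit the schema, so
    metrics_df must have columns compatible with the existing dataset.
    """
    dataset.write_dataframe(metrics_df)


class FakeAppendDataset:
    """Hypothetical stand-in for an append-mode dataiku.Dataset (local runs only)."""

    def __init__(self):
        self._frames = []

    def write_dataframe(self, df):
        # Keep every written frame instead of replacing the previous one
        self._frames.append(df.copy())

    @property
    def rows(self):
        return pd.concat(self._frames, ignore_index=True)


# Inside a plugin recipe this would instead be:
#   import dataiku
#   from dataiku.customrecipe import get_output_names_for_role
#   dataset = dataiku.Dataset(get_output_names_for_role('output_perf')[0])
dataset = FakeAppendDataset()

# Two successive training runs, each producing one record (illustrative values)
append_metrics(dataset, pd.DataFrame({"run": [1], "auc": [0.91]}))
append_metrics(dataset, pd.DataFrame({"run": [2], "auc": [0.93]}))

print(len(dataset.rows))
```

With the real dataset object, whether the rows are appended or overwritten is decided by the recipe's output setting, not by the Python call itself.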

Good luck!


Author

It was so simple I feel a little stupid 😅
Thanks @Liev!
