Added on January 12, 2024 11:39AM
Hello experts,
In Dataiku v12.3.0, I was trying to append a dataframe to an existing dataset (with the same schema) using write_dataframe(). But it always overwrites the dataset with the last dataframe, even though the dataset spec is configured like this:
dataset.spec_item["appendMode"] = True
The dataset is declared as an output, so it doesn't let me use dataset.get_dataframe(). It throws the exception: "You cannot read dataset test.my-dataset, it is not declared as an input"
Regards,
upx86
You don't set this in code; you do it on the Inputs/Outputs tab of the recipe by ticking the "Append instead of overwrite" checkbox. Then you write to the output as in a normal recipe, and Dataiku will do the append for you.
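For illustration, here is a minimal sketch of what the recipe code looks like once that checkbox is ticked (dataset names are hypothetical); nothing append-specific is needed in the code itself:

import dataiku

# Read the input dataset as usual (names here are placeholders)
input_ds = dataiku.Dataset("my_input")
df = input_ds.get_dataframe()

# Write to the output normally. With "Append instead of overwrite"
# ticked on the recipe's Inputs/Outputs tab, this call appends the
# rows instead of replacing the existing data.
output_ds = dataiku.Dataset("my_output")
output_ds.write_dataframe(df)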
Where exactly is this "Append instead of overwrite" checkbox? I don't seem to have it in the Inputs/Outputs tab of the Python recipe.
In a Python recipe it will be in the Inputs/Outputs tab, below the output dataset name (see screenshot below). But in your case it's not present, since you are using an output dataset connection type (S3) that does not support writing in append mode.
If the output dataset does not support "Append instead of overwrite", is there any solution?
Yes, the solution is to use a connection type that supports writing in append mode. Failing that, you could use a circular recipe (allowed in v13) to first read the whole output dataset, then add the new records, and then write the whole thing back. Very inefficient, though…
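A rough sketch of that workaround, assuming the output dataset is also declared as an input of the recipe (that is what makes it a circular recipe), with hypothetical dataset names:

import dataiku
import pandas as pd

# Circular recipe: the output dataset is also declared as an input,
# so its current contents can be read back.
existing = dataiku.Dataset("my_output").get_dataframe()
new_rows = dataiku.Dataset("my_input").get_dataframe()

# Concatenate and rewrite everything. This rewrites the full dataset
# on every run, so it is inefficient for large data.
combined = pd.concat([existing, new_rows], ignore_index=True)
dataiku.Dataset("my_output").write_with_schema(combined)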