How to append data to a partitioned dataset in Python

Alka Registered Posts: 2

I'm processing a flow of data which I dispatch into partitions. Some of my Python code runs from scenarios so that it can switch between reading and writing partitions on the fly.

The data is stored on Azure Blob Storage in a CSV-like format.

When I have to write additional data to a partition, I can't find a way to do it as an append, that is, by simply adding a file to the partition.

For example, I'm also running a continuous Kafka sync recipe, which does exactly what I want, since I can list the partitions and get:

[Screenshot: Alka_0-1700051241238.png]

By contrast, my Python script in the scenario generates only one file each time, so I have to reload all the data from the partition into memory and rewrite everything with the additional data.
Since I'm switching partitions, I can't use a Python script in a recipe and simply click the "append" button.

I just want a simple way to tell a writer to put data into a new file at a specific partition location. Why is that so difficult?

And no, pandas-style answers are not acceptable, since they are too time-consuming.
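
To make the request concrete, here is a minimal sketch of the behavior I'm after. It uses only the standard library and a local temporary folder as a stand-in for the blob container, so it is not a platform API. The function name `append_part_file`, the `<root>/<partition_id>/part-<uuid>.csv` layout, and the choice to write a header row in every file are all assumptions made for illustration.

```python
import csv
import tempfile
import uuid
from pathlib import Path


def append_part_file(root, partition_id, header, rows):
    """Write rows as a NEW CSV file inside the partition folder.

    Hypothetical helper: files already in the partition are never
    read or rewritten, so the cost depends only on the new rows.
    """
    partition_dir = Path(root) / partition_id
    partition_dir.mkdir(parents=True, exist_ok=True)
    # A unique file name means each call adds a file instead of replacing one.
    target = partition_dir / f"part-{uuid.uuid4().hex}.csv"
    with target.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)  # assumption: every part file carries a header
        writer.writerows(rows)
    return target


# Local folder standing in for the blob container (assumption).
root = tempfile.mkdtemp()
first = append_part_file(root, "2023-11-15", ["id", "value"], [[1, "a"]])
second = append_part_file(root, "2023-11-15", ["id", "value"], [[2, "b"]])
part_files = sorted((Path(root) / "2023-11-15").glob("*.csv"))
```

After the two calls, the partition folder holds two separate files and the first one is untouched, which is the same shape of result the Kafka sync recipe gives me.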

Any help?
