Working Overwrite dataset:

nv
Situation:

The dataset is partitioned by year, month, and day on HDFS.

Existing data: year=2016/month=05/
  day=01
  day=02
  day=03
  ...
  day=12



Questions:



- If I rebuild the dataset for 2016-05-12, is only the data under year=2016/month=05/day=12 overwritten, or will everything under year=2016/... be overwritten?



- If I build the dataset for 2016-05-13, is data written only to year=2016/month=05/day=13 while all existing data stays unchanged (not overwritten), or will everything under year=2016/... be recalculated?



 

PierreP
Dataiker
Hi,

The answer depends on the type of recipe you're using.

If it's an SQL query recipe:

- In both cases, only the selected partition will be written or overwritten.
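
To make the partition-level behavior concrete, here is a minimal Python sketch of the semantics described above. It is not DSS code: the dict `store` and the helpers `partition_path` and `build_partition` are hypothetical names, and the existing rows are made up for illustration.

```python
from datetime import date


def partition_path(d: date) -> str:
    """Map a date to the year=/month=/day= layout used in the question."""
    return f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}"


def build_partition(store: dict, d: date, rows: list) -> None:
    """Replace exactly one partition; every other key is left untouched."""
    store[partition_path(d)] = list(rows)


# Hypothetical existing data for 2016-05-01 through 2016-05-12.
store = {
    partition_path(date(2016, 5, day)): [f"old-{day:02d}"]
    for day in range(1, 13)
}

build_partition(store, date(2016, 5, 12), ["rebuilt-12"])  # rebuild an existing day
build_partition(store, date(2016, 5, 13), ["new-13"])      # build a new day
```

After both builds, day=12 holds the rebuilt rows, day=13 is new, and day=01 through day=11 still hold their original rows.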

If it's an SQL script recipe:

- It depends entirely on what your script does. Anything is possible, so you are responsible for deleting and writing the correct partition yourself.
See http://doc.dataiku.com/dss/latest/partitions/sql_recipes.html?highlight=sql%20script
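
As a rough illustration of what "deleting and writing the correct partition" can look like, here is a small Python sketch that uses an in-memory SQLite database as a stand-in for the real target. The table `events`, its columns, and the helper `rewrite_partition` are hypothetical, and this does not show how DSS passes the partition value to a script (see the linked page for that).

```python
import sqlite3


def rewrite_partition(conn: sqlite3.Connection, day: str, rows: list) -> None:
    """Delete one partition's rows, then insert the replacement rows."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("DELETE FROM events WHERE day = ?", (day,))
        conn.executemany(
            "INSERT INTO events (day, value) VALUES (?, ?)",
            [(day, value) for value in rows],
        )


# Hypothetical table with two existing partitions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (day TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2016-05-11", "old-11"), ("2016-05-12", "old-12")],
)
conn.commit()

rewrite_partition(conn, "2016-05-12", ["rebuilt-12a", "rebuilt-12b"])
remaining = conn.execute(
    "SELECT day, value FROM events ORDER BY day, value"
).fetchall()
```

The point of the sketch is that the DELETE is scoped to a single partition value; a script that omits that filter would wipe the other days as well.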
