Situation:
The dataset is partitioned by year, month, and day on HDFS.
Existing data: year=2016/month=05/
day=01
day=02
day=03
...
day=12
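
To make the layout concrete, here is a minimal Python sketch of how a calendar date maps to a partition directory in this scheme. The helper name `partition_path` is hypothetical and only builds path strings; it does not touch HDFS and makes no claim about how a build job actually behaves.

```python
from datetime import date


def partition_path(d: date) -> str:
    """Return the relative partition directory for a given day (hypothetical helper)."""
    # Zero-pad month and day to match the existing layout, e.g. month=05, day=01.
    return f"year={d.year:04d}/month={d.month:02d}/day={d.day:02d}"


# Partitions that already exist in the situation described above: 2016-05-01 to 2016-05-12.
existing = [partition_path(date(2016, 5, day)) for day in range(1, 13)]

# The partition a rebuild for 2016-05-12 would target (already present).
rebuild_target = partition_path(date(2016, 5, 12))

# The partition a first build for 2016-05-13 would target (not yet present).
new_target = partition_path(date(2016, 5, 13))

print(rebuild_target, rebuild_target in existing)
print(new_target, new_target in existing)
```

The two targets differ only in the `day=` component, which is why the questions below ask whether a build touches just that one directory or everything under the shared parent.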
Questions:
- If I rebuild the dataset for 2016-05-12, is only the data under the path year=2016/month=05/day=12 overwritten, or is everything under the folder year=2016/... overwritten?
- If I build the dataset for 2016-05-13, is data written only to the path year=2016/month=05/day=13 while all existing data remains unchanged (not overwritten), or is everything under the folder year=2016/... recalculated?