NOTE: this post deals with files-based partitioning
For some reason, Dataiku removes the partitioning dimension(s) from the data when a dataset is partitioned (files-based) using a Sync recipe.
See for example this hands-on tutorial: "Dataiku DSS warns that a schema update is required. This is because redispatching removes the purchase_date column [= partitioning dimension] when our dataset is stored on a file system".
This behaviour can be annoying as it prevents:
An easy workaround is to duplicate the partitioning column(s) under a new name before partitioning, but this feels like a code smell.
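To make the workaround concrete, here is a minimal sketch in plain Python. It does not use the Dataiku API: `redispatch` is only a toy model of the behaviour described above (the partition value moves into the partition key and disappears from the stored rows), and the helper names, the `purchase_date_copy` column name and the sample rows are all hypothetical.

```python
from collections import defaultdict


def duplicate_column(rows, source, copy_name):
    """Return new rows with the value of `source` copied into `copy_name`."""
    return [{**row, copy_name: row[source]} for row in rows]


def redispatch(rows, partition_column):
    """Toy model of files-based redispatching (NOT Dataiku code).

    Rows are grouped by the value of `partition_column`, and that column
    is dropped from the rows stored inside each partition.
    """
    partitions = defaultdict(list)
    for row in rows:
        stored = {k: v for k, v in row.items() if k != partition_column}
        partitions[row[partition_column]].append(stored)
    return dict(partitions)


# Illustrative sample data (made up for this sketch).
rows = [
    {"purchase_date": "2017-01-01", "amount": 10},
    {"purchase_date": "2017-01-02", "amount": 25},
]

# Without the workaround, the date is only available as the partition key.
plain = redispatch(rows, "purchase_date")

# With the workaround, the copy survives inside the partitioned rows.
with_copy = duplicate_column(rows, "purchase_date", "purchase_date_copy")
kept = redispatch(with_copy, "purchase_date")
```

In this sketch, the rows in `plain` no longer carry the date, while the rows in `kept` still expose it through the duplicated column. In DSS itself, the duplication would presumably be done in a step upstream of the Sync recipe.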
I upvoted this too. It is this behavior with partitions in the local filesystem that makes me always use my own SQL DB instead. When you partition a SQL dataset, you can still see the partitioning columns in the table.
Thanks for your idea, @tanguy. It meets the criteria for submission, and we'll reach out should we require more information.
If you’re reading this and think this would be a great capability to add to DSS, be sure to kudos the original post!
Thanks for submitting this idea and for sharing the context around why it would be useful to your team. You'll be pleased to hear this idea is in our backlog. It is a request we've received from customers, and we are determining the next steps for development. We can't provide a timeline at this point, but be sure to check back for updates! For everyone else, kudos the original post to signal that you're interested in Dataiku developing and releasing this feature!