NOTE: this post deals with files-based partitioning
For some reason, Dataiku removes the partitioning dimension(s) from the data when a (files-based) dataset is partitioned using a sync recipe.
See for example this hands-on tutorial: "Dataiku DSS warns that a schema update is required. This is because redispatching removes the purchase_date column [= partitioning dimension] when our dataset is stored on a file system".
This behaviour can be annoying, as it prevents:
- explicitly visualizing the partitions in the Explore tab
- performing computations on the partitioning column (e.g., for time-based partitions, one might need to compute the time elapsed since the partition date)
- retrieving the partitioning dimension later in the flow when switching to an unpartitioned format (at least with Spark SQL; I noticed that AWS Athena, which is only recommended for querying datasets, does retrieve the partitioning dimension)
An easy workaround is to duplicate the partitioning column(s) under a different name, but this feels like a code smell.
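
To make the workaround concrete, here is a minimal sketch in plain Python. It does not use the Dataiku API: the rows are a stand-in for the dataset, `duplicate_partition_columns` and `drop_columns` are hypothetical helpers, the `_copy` suffix is an arbitrary choice, and the dropping of `purchase_date` is only simulated to mimic what redispatching does on a file system.

```python
from datetime import date


def duplicate_partition_columns(rows, partition_cols, suffix="_copy"):
    """Return new rows where each partitioning column also exists under a new name."""
    return [
        {**row, **{f"{col}{suffix}": row[col] for col in partition_cols}}
        for row in rows
    ]


def drop_columns(rows, cols):
    """Simulate the redispatch: the partitioning columns disappear from the data."""
    return [{k: v for k, v in row.items() if k not in cols} for row in rows]


# Toy stand-in for the dataset; purchase_date is the partitioning dimension.
orders = [
    {"purchase_date": "2017-01-01", "amount": 10.0},
    {"purchase_date": "2017-01-02", "amount": 25.5},
]

# Step 1 (before the sync recipe): duplicate the partitioning column.
prepared = duplicate_partition_columns(orders, ["purchase_date"])

# Step 2 (simulated redispatch): the original partitioning column is removed.
stored = drop_columns(prepared, ["purchase_date"])

# Step 3 (later in the flow): the copy is still there for computations,
# e.g. the number of days elapsed since the partition date.
reference = date(2017, 1, 10)  # arbitrary reference date for the example
for row in stored:
    partition_date = date.fromisoformat(row["purchase_date_copy"])
    row["days_since_purchase"] = (reference - partition_date).days
```

In a real flow, step 1 would more likely be a prepare recipe step or a column copy in whatever recipe feeds the sync recipe; the point is only that the copy has to exist before the redispatch happens.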