The partitioning column does not display in Dataiku
boumezrag
Registered Posts: 15 ✭✭✭✭
Hi everybody,
I have a real problem: when I import a table from Hive, the partitioning column is not displayed in Dataiku.
Any help, please?
Best Answer
-
Hi,
Partitioning columns in Hive are logical columns: their values are not stored in the data files but are exposed through the path from the table root directory on HDFS to the files containing the data for a given partition value. In DSS, when you retrieve data from the dataset corresponding to the Hive table, you pass a list of values for the partitioning columns, and the data is filtered on these values. You can see the list of existing values of the partitioning columns in the Status tab, or in the Sampling panel on the left of your Explore tab.
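To make the "path as column" idea concrete, here is a minimal Python sketch that recovers partition values from a file path, assuming the usual Hive layout where each partition directory is named `column=value`. The table root, the column names (`country`, `day`) and the file name are hypothetical, and this only illustrates the layout; it is not how DSS itself reads partitions.

```python
from pathlib import PurePosixPath


def partition_values(table_root, file_path):
    """Return {partition column: value} read from a Hive-style file path."""
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(table_root))
    values = {}
    # Every directory between the table root and the data file is expected
    # to be named "column=value"; the last path component is the file itself.
    for part in relative.parts[:-1]:
        column, sep, value = part.partition("=")
        if sep:
            values[column] = value
    return values


# Hypothetical table root and data file, for illustration only.
example = partition_values(
    "/warehouse/sales",
    "/warehouse/sales/country=FR/day=2020-01-15/000000_0",
)
print(example)
```

The data file itself contains no `country` or `day` field, which is why these columns do not appear among the regular columns of the dataset.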
Regards,
Answers
-
Thank you for your answer,
To be honest, I didn't understand. In the Status tab we can see the list of all columns, but not the partitioning one.
My question is: I have a table with 13 columns (including the partitioning column), but I can see only 12. How can I display all 13 columns?
Thanks in advance. -
Since you imported a partitioned Hive table as a DSS dataset, you should have a partitioning scheme defined in the dataset's Partitioning tab (under its Settings), with the missing column as a dimension.
In the Status tab, you can switch the display to Partition table; the result is a table with the partition identifiers as row identifiers. A partition identifier is a '|'-separated list of the values of the partitioning columns. -
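As an illustration of the partition identifier format just described, here is a minimal Python sketch that joins and splits the values with '|'. The function names and the sample values are hypothetical, and the sketch assumes the values themselves contain no '|' character.

```python
def make_partition_id(values):
    """Join the partitioning column values into one '|'-separated identifier."""
    return "|".join(values)


def split_partition_id(partition_id):
    """Split an identifier back into one value per partitioning column."""
    return partition_id.split("|")


# Hypothetical values for two partitioning dimensions (a day and a country).
partition_id = make_partition_id(["2020-01-15", "FR"])
print(partition_id)
print(split_partition_id(partition_id))
```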
I attached a screenshot.
When you said "and the display will be a table with the partition identifiers as row identifiers",
is this what my screenshot shows? -
This is my screenshot.
-
The values for the partition column can indeed be seen on the left.
-
So there is no way to display this column alongside the others? Sorry for asking so many questions.
-
This is not possible at the moment. But:
- you can specify which values of this partition column you want when you browse a dataset or build a dataset
- you can always access the column and its data via HiveServer2 (i.e. in a SQL notebook, or in a Hive recipe when you set the engine to HiveServer2 in the Advanced tab) -
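For the HiveServer2 route mentioned above, the partition column can be selected and filtered like any other column, since Hive exposes it in query results. Below is a minimal Python sketch that only builds such a query string; the table name `sales`, the column `country` and the helper name are hypothetical, and the resulting text is what you would paste into a SQL notebook or Hive recipe.

```python
import re

# Plain identifiers only, so the generated query text stays predictable.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def select_partition_query(table, partition_column, value):
    """Build a HiveQL query that filters on a partition column."""
    for name in (table, partition_column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Keep the sketch simple: refuse values that would need escaping.
    if "'" in value or "\\" in value:
        raise ValueError("value contains a quote or backslash")
    # SELECT * on a partitioned Hive table returns the partition columns too.
    return f"SELECT * FROM {table} WHERE {partition_column} = '{value}'"


# Hypothetical table and partition column names.
query = select_partition_query("sales", "country", "FR")
print(query)
```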
Thank you so much,
I get it now ;-)