The partitioning column is not displayed in Dataiku

boumezrag Registered Posts: 15 ✭✭✭✭
Hi everybody,

I have a real problem: when I import a table from Hive, the partitioning column is not displayed in Dataiku.

Can anyone help, please?

Best Answer

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    Answer ✓
    Hi,

    Partitioning columns in Hive are logical columns: they are not stored in the data files, but expose the path from the table root directory on HDFS to the files containing the data for a given partition value. In DSS, when you retrieve data from the dataset corresponding to the Hive table, you pass a list of values for the partitioning columns, and the data is filtered on these values. You can see the list of existing values of the partitioning columns in the Status tab, or in the Sampling panel on the left of your Explore tab.

    Regards,
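
To illustrate the "logical column" idea from the answer above, here is a minimal Python sketch of how partition values can be read back from a Hive-style directory layout, where each partition level is a directory named `column=value`. The function name, table root, and file path are hypothetical examples, not anything taken from DSS or from the poster's table.

```python
from typing import Dict


def partition_values_from_path(table_root: str, file_path: str) -> Dict[str, str]:
    """Extract partition column values from a Hive-style file path.

    Assumes the usual Hive layout: <table_root>/<col>=<value>/.../<data file>.
    """
    root = table_root.rstrip("/")
    if not file_path.startswith(root + "/"):
        raise ValueError("file is not located under the table root")
    relative = file_path[len(root) + 1:]
    values: Dict[str, str] = {}
    # Every directory level of the form name=value is one partition column;
    # the last path segment is the data file itself, so it is skipped.
    for segment in relative.split("/")[:-1]:
        if "=" in segment:
            name, value = segment.split("=", 1)
            values[name] = value
    return values


# Hypothetical table root and data file, for illustration only.
example = partition_values_from_path(
    "/user/hive/warehouse/sales",
    "/user/hive/warehouse/sales/country=FR/year=2019/000000_0",
)
print(example)
```

The values come from the directory names rather than from the file contents, which is why a partitioning column behaves differently from the regular columns of the table.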

Answers

  • boumezrag
    boumezrag Registered Posts: 15 ✭✭✭✭
    Thank you for your answer,
    To be honest, I didn't understand. In the Status tab we can see the list of all columns, but not the partitioning one.
    My question is: I have a table with 13 columns (including the partitioning column), but I can see only 12. How can I display all 13 columns?

    Thanks in advance.
  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    Since you imported a partitioned Hive table as a DSS dataset, you should have a partitioning scheme defined in the dataset's Partitioning tab (under its Settings), with the missing column as a dimension.
    In the Status tab, you can choose to display as a Partition table, and the display will be a table with the partition identifiers as row identifiers. A partition identifier is a '|'-separated list of the values of the partitioning columns.
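
As a small illustration of the '|'-separated partition identifier described above, the following Python sketch splits an identifier back into one value per dimension. The function name, the dimension names, and the sample identifier are hypothetical; only the '|' separator comes from the answer.

```python
from typing import Dict, List


def split_partition_id(partition_id: str, dimensions: List[str]) -> Dict[str, str]:
    """Map a '|'-separated partition identifier to its dimension values.

    `dimensions` is the ordered list of partitioning column names
    (hypothetical here; in DSS they are listed in the Partitioning tab).
    """
    values = partition_id.split("|")
    if len(values) != len(dimensions):
        raise ValueError(
            f"expected {len(dimensions)} values, got {len(values)}"
        )
    return dict(zip(dimensions, values))


# Hypothetical two-dimension scheme, for illustration only.
print(split_partition_id("FR|2019", ["country", "year"]))
```

With a single partitioning column, as in the table discussed here, the identifier is simply the value of that column.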
  • boumezrag
    boumezrag Registered Posts: 15 ✭✭✭✭
    I have attached a screenshot.
    When you said "and the display will be a table with the partition identifiers as row identifiers",
    is this what my screenshot shows?
  • boumezrag
    boumezrag Registered Posts: 15 ✭✭✭✭

    this is my screenshot

  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    Yes, the values of the partition column can indeed be seen on the left.
  • boumezrag
    boumezrag Registered Posts: 15 ✭✭✭✭
    So there is no way to display this column with the others? Sorry for asking so many questions.
  • fchataigner2
    fchataigner2 Dataiker Posts: 355 Dataiker
    This is not possible at the moment. But:
    - you can specify which values of this partition column you want when you browse or build a dataset
    - you can always access the column and its data via HiveServer2 (i.e. in a SQL notebook, or in a Hive recipe when you set the engine to HiveServer2 in the Advanced tab)
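
For the HiveServer2 route mentioned above, a query that selects the partitioning column alongside the regular ones could look like the string built by this Python sketch. The helper, the table name, and the column names are all hypothetical; the sketch only assembles HiveQL text to paste into a SQL notebook or Hive recipe, and does not connect to anything.

```python
from typing import List


def build_partition_query(
    table: str,
    columns: List[str],
    partition_column: str,
    partition_value: str,
    limit: int = 100,
) -> str:
    """Build a HiveQL query returning regular columns plus the partition column.

    Assumes trusted input: values containing a single quote are rejected
    rather than escaped, to keep the sketch simple.
    """
    if "'" in partition_value:
        raise ValueError("partition value must not contain a single quote")
    # Backticks quote identifiers in HiveQL.
    select_list = ", ".join(f"`{c}`" for c in columns + [partition_column])
    return (
        f"SELECT {select_list} FROM {table} "
        f"WHERE `{partition_column}` = '{partition_value}' LIMIT {limit}"
    )


# Hypothetical table and columns, for illustration only.
print(build_partition_query("mydb.sales", ["id", "amount"], "country", "FR"))
```

Filtering on the partitioning column in the `WHERE` clause lets Hive read only the matching partition directory instead of scanning the whole table.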
  • boumezrag
    boumezrag Registered Posts: 15 ✭✭✭✭
    Thank you so much,
    I get it now ;-)