How to combine two rows of Header into one

Options
GSung
GSung Registered Posts: 27 ✭✭✭✭

What recipe should I use to change the header from (it's now in two different rows)

Business 1Business 2
ABCXYZ

to just one row as a header

Business 1 ABC

Business 2 XYZ

Best Answer

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 315 Dataiker
    edited July 17 Answer ✓
    Options

    Hi @GSung
    ,

    For most operations like this to perform a simple data transformation you could use a prepare recipe, which has many built in data transformations. This is normally where you would go to rename individual columns and perform similar steps.

    I don't think your particular transformation is handled automatically here, so if you wanted to automate combing the dataset header row with the first row of data, I think you can create a Python recipe like this:

    import dataiku
    import pandas as pd, numpy as np
    from dataiku import pandasutils as pdu
    
    # Read recipe inputs
    merge_column_headers = dataiku.Dataset("merge_column_headers")
    merge_column_headers_df = merge_column_headers.get_dataframe()
    
    for col in merge_column_headers_df.columns:
        # get first row value for this specific column
        first_row = merge_column_headers_df.iloc[0][col]
        new_column_name = col + " " + first_row 
        # rename the column with the existing column header plus the first row of that column's data 
        merge_column_headers_df.rename(columns={col: new_column_name}, inplace=True)
    
    # delete first row of data, now that it's merged into the header 
    merge_column_headers_df = merge_column_headers_df.iloc[1:]
    
    # Write recipe outputs back to dataset 
    merge_column_headers = dataiku.Dataset("merge_column_headers")
    merge_column_headers.write_with_schema(merge_column_headers_df)

    This modifies a single dataset, in this example called "merge_column_header" and appends the values from the first row of data to the header row, and then removes the first row of data from the dataset.

    Here's the before an after:

    Screen Shot 2021-02-16 at 2.10.41 PM.png

    Screen Shot 2021-02-16 at 2.09.50 PM.png

    Thanks,

    Sarina 

Answers

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭
    Options

    Hi, @GSung
    ! Can you provide any further details on the thread to assist users in helping you find a solution (insert examples like DSS version etc.) Also, can you let us know if you’ve tried any fixes already?This should lead to a quicker response from the community.

  • GSung
    GSung Registered Posts: 27 ✭✭✭✭
    Options

    thank you all !

Setup Info
    Tags
      Help me…