Using Excel Sheet name

Denes
Denes Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 2 ✭✭✭

Hi,
I have an Excel with different sheets and I need the name of the sheet as a value in DSS.

Is one of the visual recipes or the Data-Upload settings able to do this?

Thx

Answers

  • StanG
    StanG Dataiker, Registered Posts: 52 Dataiker

    Hi,
    To do what you want without coding, you can upload the excel file in a DSS folder and use the excel sheet importer plugin (https://www.dataiku.com/product/plugins/excel-sheet-importer/) to create one dataset per excel sheet. Then you can stack these datasets and retrieve the sheet name in a column using the Origin column parameter of the visual stack recipe.
    Or this can be done in only one step with a python recipe using the answer from the post.

  • Vitaliy
    Vitaliy Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 102 Dataiker
    edited July 17

    Hello,

    It is not clear what you mean by saying, "I need the name of the sheet as a value in DSS", please provide your use case for a better understanding of the requirement.

    If you want to import Sheets into different datasets, then you can use the plugin suggested by StanG in this thread.

    If you want to store sheet names as values in a dataset or as project variables, then there no visual recipe that can do it in DSS.

    Also, there is always an option to use a custom Python recipe. Using the example below, you can get all spreadsheets from a folder, get sheet names, and then, depending on your requirements, do whatever you need with the names (add them to project variables, create a dataset, etc.).

    import glob
    import openpyxl
    
    path = '<absolute_path_to_files>*.xlsx'
    files = glob.glob(path)
    
    for file in files:
        wb = openpyxl.load_workbook(file)
        print(wb.sheetnames)


    I hope this help.
    Regards,
    Vitaliy

  • Denes
    Denes Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 2 ✭✭✭

    Hi, thanks for the replies and sorry for the imprecise question.

    I have an Excel file named 'city.xlsx' with the sheets 'Austin', 'Bangkok', 'Chicago',....
    When I process the sheet 'Bangkok', I need the name of the city in the variable 'City_Name' with delivered. What I do not need is 'city.xlsx' as the value of my variable each time.

    I guess I have to work with a custom Python recipe.

Setup Info
    Tags
      Help me…