How to access global variables inside visual recipes?

epsi95 Dataiku DSS Core Concepts, Registered Posts: 15 ✭✭✭✭
edited July 16 in Using Dataiku

I have a global variable declared like this:

Screenshot 2021-10-14 174338.png

I want to use it inside a visual recipe like this:

Screenshot 2021-10-14 174500.png

As you can see, it gives me the error `Unknown input column`.

I also want to access nested variables, so I tried:

${visual_recipes_params.metric_selection.time_column}

That does not work either.

How can I access global variables and nested global variables inside visual recipes?

Best Answer

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭
    Answer ✓

    The scenario that you describe is precisely what Dataiku Applications were created for:

    - Packaging of a flow into an application, accessible to business users with a simple interface

    - Enabling concurrent execution of the same flow with different parameters.

    Rather than asking the users to map columns in the variables JSON (not really a business-user interface), ask them to map the columns in their own file by renaming the columns before uploading.

    Complete this tutorial to understand how to build your application. It is really simple. https://academy.dataiku.com/dataiku-applications-tutorials-open

    I hope this helps. Good luck

Answers

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi,

    Variables expansion is not available everywhere, as explained in the documentation: "Some select configuration fields in the DSS interface can perform variables expansion using ${variable} syntax."

    https://doc.dataiku.com/dss/latest/variables/index.html
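Where a field does not expand `${...}`, a code recipe can read the variables and resolve a nested key itself. The following is a minimal Python sketch of that idea, not DSS's own expansion logic. It assumes the project variables are already available as a nested dict (in a DSS Python recipe they would typically come from `dataiku.get_custom_variables()`; check whether nested values arrive as parsed objects or as JSON strings in your version). The helper names `lookup` and `expand` and the value `order_date` are hypothetical.

```python
import re

# Matches ${some.dotted.path}
_VARIABLE_PATTERN = re.compile(r"\$\{([^}]+)\}")


def lookup(variables, dotted_path):
    """Walk a nested dict following a dotted path such as 'a.b.c'."""
    node = variables
    for key in dotted_path.split("."):
        node = node[key]  # raises KeyError if the path does not exist
    return node


def expand(text, variables):
    """Replace every ${dotted.path} in text with the value it points to."""
    return _VARIABLE_PATTERN.sub(
        lambda match: str(lookup(variables, match.group(1))), text
    )


# Assumed stand-in for the project's global variables.
variables = {
    "visual_recipes_params": {
        "metric_selection": {"time_column": "order_date"}
    }
}

time_column = lookup(
    variables, "visual_recipes_params.metric_selection.time_column"
)
expanded = expand(
    "parse ${visual_recipes_params.metric_selection.time_column}", variables
)
```

This only helps inside code you control, such as a Python recipe. It does not change which fields of a visual recipe accept variables.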

    What is your overall objective? It seems you are trying to define a generic recipe to work with tables with different schemas. There might be better capabilities to achieve your objective.

    I hope this helps.

  • epsi95
    epsi95 Dataiku DSS Core Concepts, Registered Posts: 15 ✭✭✭✭

    Yes, you are correct. The generic structure of the input data is the same, but the column names can be different.

    Can you tell me a little bit more about a better alternative than this `global variable` method?

  • Manuel
    Manuel Alpha Tester, Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 193 ✭✭✭✭✭✭✭

    Hi,

    It would still be good to understand what challenge you are trying to solve, so that I can point you to the right solution.

    If your need is akin to executing the same flow for different input datasets, then you should look into:

    - Dataiku Applications: However, a constant schema is still a requirement. https://videos.dataiku.com/watch/GuevWgkMCMrmFXAUNnnDiw?

    - Application as recipes, in which you package an entire flow as a visual recipe. https://videos.dataiku.com/watch/8feWuYFCQtkXHJ7AYa5vre?

    I hope this helps.

    Best regards

  • epsi95
    epsi95 Dataiku DSS Core Concepts, Registered Posts: 15 ✭✭✭✭

    Sure, let me describe my problem. I am creating a pipeline to process the input data.

    (input data) --> process step 1 --> process step 2 --> process step 3 --> Final csv

    The input data has the following format, for example:

    ID  y  x1  x2  x3  x4  class
    1   2  1   5   6   6   'a'
    1   2  3   4   6   6   'c'
    2   4  2   4   1   2   'd'

    This data will be used for linear regression.

    There are multiple processing steps, such as:

    1. Selecting the required columns

    2. Parsing the date column

    3. Filling missing (NA) values, running custom Python recipes, etc.

    Now, the thing is that the column headers can differ, but the data is always in this format:

    ID y x1 x2 x3 class

    The names can differ, for example:

    ID --> vehicle registration number

    x1 --> mileage

    x2 --> fuel efficiency

    y --> cost of servicing

    or

    ID --> person ID

    x1 --> distance

    x2 --> time

    y --> toll cost

    etc...

    So I need to build the whole flow such that if I just modify the global variables, everything will be fine. To be more precise, this will be used by non-technical people, so I tried to limit them to the global variables only, so that they don't need to change anything else in the flow. I hope this clarifies your question and that you will be able to suggest a better solution.
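The mapping described above can be sketched in plain Python: one dictionary, edited by the user, maps each fixed role (`ID`, `y`, `x1`, ...) to the column name in the current dataset, and the first step of the flow renames everything to the fixed roles. This is only an illustration of the idea, not a Dataiku feature. The function name `rename_to_canonical`, the sample row, and the extra `colour` column are hypothetical, and in a real flow the mapping would be read from the project variables inside a Python recipe.

```python
def rename_to_canonical(rows, column_mapping):
    """Rename source columns to their fixed roles and drop unmapped columns.

    rows: list of dicts keyed by the source column names.
    column_mapping: dict of role -> source column name.
    """
    source_to_role = {source: role for role, source in column_mapping.items()}
    return [
        {
            source_to_role[name]: value
            for name, value in row.items()
            if name in source_to_role
        }
        for row in rows
    ]


# What the user would edit: role -> column name in their data.
column_mapping = {
    "ID": "vehicle registration number",
    "y": "cost of servicing",
    "x1": "mileage",
    "x2": "fuel efficiency",
}

# Hypothetical input row with one column that is not part of the mapping.
rows = [
    {
        "vehicle registration number": 1,
        "cost of servicing": 2,
        "mileage": 1,
        "fuel efficiency": 5,
        "colour": "red",
    }
]

canonical_rows = rename_to_canonical(rows, column_mapping)
```

After this step every downstream recipe can refer to `ID`, `y`, `x1`, and so on, regardless of the original headers. Dropping unmapped columns doubles as the "selecting the required columns" step.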

  • epsi95
    epsi95 Dataiku DSS Core Concepts, Registered Posts: 15 ✭✭✭✭

    Sorry to ask one more question. The data is not actually a CSV file; it resides in a database, and renaming the column headers is not possible. For example, if the original dataframe has a column named "cat_age", then after all the processing the user wants some parameters related to "cat_age". Following your suggestion, I can name the X variables "feature_1", "feature_2" and so on, but I don't know how many features there will be; it could be 10 or 11. What do you suggest in that case?
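One way to handle an unknown number of features, sketched in plain Python and not taken from any answer in this thread: keep the feature columns as a list, generate the `feature_N` aliases from that list, and use the same alias table to translate results back to the original names at the end. The function names, the extra columns `cat_weight` and `cat_breed`, and the coefficient values are all made up for illustration.

```python
def build_feature_aliases(feature_columns):
    """Map generated aliases (feature_1, feature_2, ...) to original names."""
    return {
        f"feature_{index}": name
        for index, name in enumerate(feature_columns, start=1)
    }


def restore_names(results, aliases):
    """Replace alias keys with original column names; keep other keys as-is."""
    return {aliases.get(key, key): value for key, value in results.items()}


# The list can have any length; only "cat_age" appears in the thread.
feature_columns = ["cat_age", "cat_weight", "cat_breed"]
aliases = build_feature_aliases(feature_columns)

# Illustrative output of a downstream step, keyed by alias.
coefficients = {
    "feature_1": 0.4,
    "feature_2": -1.2,
    "feature_3": 0.05,
    "intercept": 3.0,
}

named_coefficients = restore_names(coefficients, aliases)
```

Because the aliases are derived from the list, adding an eleventh feature only means adding one more name to `feature_columns`.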
