How to access global variables inside visual recipes?

epsi95 · October 2021

I have a global variable declared like this:

Screenshot 2021-10-14 174338.png

I want to use it inside the visual recipe like this:

Screenshot 2021-10-14 174500.png

As you can see, it gives me the error `Unknown input column`.

I also want to access nested variables, so I tried:

${visual_recipes_params.metric_selection.time_column}

This does not work either.

How can I access global variables and nested global variables inside visual recipes?

Manuel · October 2021

The scenario that you describe is precisely what Dataiku Applications were created for:

- Packaging of a flow into an application, accessible to business users with a simple interface

- Enabling concurrent execution of the same flow with different parameters.

Rather than asking the users to map columns in the variable json (not really a business user interface), ask the users to map the columns in their own file (renaming the columns before uploading).

Complete this tutorial to understand how to build your application. It is really simple. https://academy.dataiku.com/dataiku-applications-tutorials-open

I hope this helps. Good luck

Manuel · October 2021

Hi,

Variable expansion is not available everywhere. As the documentation explains: "Some select configuration fields in the DSS interface can perform variables expansion using ${variable} syntax."

https://doc.dataiku.com/dss/latest/variables/index.html

What is your overall objective? It seems you are trying to define a generic recipe to work with tables with different schemas. There might be better capabilities to achieve your objective.

I hope this helps.
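Where a field does not expand `${...}`, one fallback is a Python code recipe, where the variables are an ordinary nested dict and a dotted path can be walked by hand. Below is a minimal sketch of that lookup only. The JSON shape follows the path from the question; the value `order_date` and the helper `resolve_path` are made up for illustration. In DSS the dict would come from the `dataiku` package (for example `dataiku.get_custom_variables()`) rather than a hard-coded string, and it is worth checking the documentation for how nested values are returned there.

```python
import json
from functools import reduce

# Hypothetical project variables, shaped like the nested path in the question.
# The value "order_date" is invented for this example.
variables_json = """
{
  "visual_recipes_params": {
    "metric_selection": {"time_column": "order_date"}
  }
}
"""


def resolve_path(variables, dotted_path):
    """Walk a nested dict following a dotted path such as 'a.b.c'.

    Raises KeyError if any segment of the path is missing.
    """
    return reduce(lambda node, key: node[key], dotted_path.split("."), variables)


variables = json.loads(variables_json)
time_column = resolve_path(
    variables, "visual_recipes_params.metric_selection.time_column"
)
print(time_column)
```

The resolved name can then be used to select the column inside the code recipe.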

epsi95 · October 2021

Yes, you are correct. The general structure of the input data is the same, but the column names can be different.

Can you tell me a little bit more about a better alternative than this `global variable` method?

Manuel · October 2021

Hi,

It would still be good to understand what challenge you are trying to solve, so that I can point you to the right solution.

If you need to execute the same flow for different input datasets, then you should look into:

- Dataiku Applications: However, a constant schema is still a requirement. https://videos.dataiku.com/watch/GuevWgkMCMrmFXAUNnnDiw?

- Application as recipes, in which you package an entire flow as a visual recipe. https://videos.dataiku.com/watch/8feWuYFCQtkXHJ7AYa5vre?

I hope this helps.

Best regards

epsi95 · October 2021

Sure, let me describe my problem. I am creating a pipeline to process the input data.

(input data) --> process step 1 --> process step 2 --> process step 3 --> Final csv

The input data has the following format for example:

ID  y   x1  x2  x3  x4  class

1   2   1   5   6   6   'a'

1   2   3   4   6   6   'c'

2   4   2   4   1   2   'd'

This data will be used for linear regression.

So there are multiple processing steps, such as:

1. Selecting the required columns

2. Parsing the date column

3. Filling in NA values, running custom Python recipes, etc.

Now the thing is, the data headers can be different, but the data is always in this format:

ID y x1 x2 x3 class

The names can be different like

ID --> vehicle registration number

x1 --> mileage

x2 --> fuel efficiency

y --> cost of servicing

or

ID --> person ID

x1 --> distance

x2 --> time

y --> toll cost

etc...

So I need to build the whole flow such that if I just modify the global variables, everything will be fine. To be more precise, this will be used by non-technical people, so I tried to limit them to the global variables only, so that they don't need to change anything else in the flow. I hope this answers your question and that you will be able to suggest a better solution.
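One way to read this requirement is to write the flow against fixed internal names (`ID`, `y`, `x1`, ...) and translate the source headers once, in the first step. The sketch below shows that idea in plain Python on a list of dicts. `column_mapping` stands in for what a user would type into the global variables, the headers come from the vehicle example above, and the sample row and the helper `to_internal_names` are made up. It does not use any DSS API.

```python
# Hypothetical mapping, as a user might enter it in the global variables:
# source header -> internal name used by the rest of the flow.
column_mapping = {
    "vehicle registration number": "ID",
    "mileage": "x1",
    "fuel efficiency": "x2",
    "cost of servicing": "y",
}


def to_internal_names(rows, mapping):
    """Rename mapped columns and drop any column not listed in the mapping."""
    # Check the first row only; the input is assumed to have a uniform schema.
    missing = [src for src in mapping if rows and src not in rows[0]]
    if missing:
        raise KeyError(f"Columns missing from input: {missing}")
    return [{mapping[src]: row[src] for src in mapping} for row in rows]


# Made-up input row for illustration; "colour" is not mapped and is dropped.
rows = [
    {
        "vehicle registration number": "AB-123",
        "mileage": 42000,
        "fuel efficiency": 14.5,
        "cost of servicing": 310.0,
        "colour": "red",
    },
]
internal_rows = to_internal_names(rows, column_mapping)
print(internal_rows)
```

With this split, only the mapping changes between datasets, and every later step can refer to `ID`, `y`, `x1`, and so on.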

epsi95 · October 2021

Sorry to ask one more question. The data is not actually a CSV file; it resides in a database, and renaming its column headers is not possible. For example, if the original dataframe has a column named "cat_age", then after all the processing the user wants some parameters related to "cat_age". Following your suggestion, I can name the X variables "feature_1", "feature_2", and so on, but I don't know how many features there will be; it could be 10 or 11. What do you suggest in that case?
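For the case of an unknown number of features, one possible approach is to keep an ordered list of source columns, generate the `feature_N` names from it, and keep the reverse mapping so results can be reported under the original names. This is only a sketch under those assumptions. `cat_age` comes from the post above; the other column names, the coefficient values, and the helper `build_feature_mapping` are invented for illustration.

```python
def build_feature_mapping(feature_columns):
    """Map each source column to feature_1, feature_2, ... in list order.

    Returns (forward, reverse): source -> internal and internal -> source.
    Assumes the column names in the list are unique.
    """
    forward = {
        col: f"feature_{i}" for i, col in enumerate(feature_columns, start=1)
    }
    reverse = {internal: col for col, internal in forward.items()}
    return forward, reverse


# Hypothetical list a user would maintain; it can hold 10, 11, or any number
# of columns without changes to the code.
feature_columns = ["cat_age", "cat_weight", "cat_breed"]
forward, reverse = build_feature_mapping(feature_columns)

# Made-up results keyed by internal name, e.g. regression coefficients.
coefficients = {"feature_1": 0.8, "feature_2": -0.1, "feature_3": 0.3}

# Translate the results back to the original column names for the user.
report = {reverse[name]: value for name, value in coefficients.items()}
print(report)
```

The source table is never renamed; the translation happens inside the flow and is undone before the output is shown.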
