Application as recipe
Hi,
I am facing an issue in the following scenario and would appreciate your guidance on sorting it out.
I created a project that contains a Hive recipe. The recipe's HQL query references the table name of one of the computed outputs as an input.
This flow works fine when I execute it directly.
I then created an application from the project.
When I run the application, the flow completes successfully, but the output is not what I expected.
I use a variable as input when running the application.
The first recipe stage executes successfully, with the variable substituted.
However, the next stage needs to run based on the parent recipe's output, and when I check, the table name used in the project is not the same in the application instance.
Because of that, the expected changes do not take place.
My question: how can I run the application as a recipe so that the table name changes dynamically based on the input dataset used in the recipe?
Answers
-
Hi,
in a Hive recipe (or a SQL recipe), you can hit "Validate" to see which variables are available for use in the recipe's SQL. For instance, if you want to refer to input datasets without hard-coding their names, you can use the "input:db:..." and "input:tbl:..." variables, for example:
SELECT `Survived`, count(*) as cnt
FROM `${input:db:0}`.`${input:tbl:0}`
group by `Survived`

If you need the Hive tables backing the DSS datasets to be named in such a way that you don't have overlaps between projects, you should include ${projectKey} in the table name in the dataset's Settings > Connection tab. For Dataiku Application instances, such as those created by application-as-recipe, you additionally have a ${projectRandomKey} variable (8 characters long); you'll need to define this variable in the template project with a fixed value to use while designing the flow, and it will be overwritten in each app instance.
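Putting the two pieces together, a minimal sketch (the table-name pattern and dataset name here are illustrative assumptions, not a fixed convention):

```sql
-- In the dataset's Settings > Connection tab, a Hive table-name
-- pattern such as (hypothetical example):
--
--   ${projectKey}_${projectRandomKey}_mydataset
--
-- keeps the backing tables from colliding between the template
-- project and its application instances.

-- In the Hive recipe itself, refer to inputs positionally rather
-- than by a hard-coded table name, so the query keeps working when
-- the app instance substitutes its own database/table names:
SELECT `Survived`, count(*) AS cnt
FROM `${input:db:0}`.`${input:tbl:0}`
GROUP BY `Survived`;
```

With this setup, each application instance resolves ${projectRandomKey} to its own value, so the second-stage recipe reads the table that the first stage actually wrote in that instance.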