Passing inputs to Shell Recipes

yashpuranik

According to https://doc.dataiku.com/dss/latest/code_recipes/shell.html, the input and output dataset names are exposed to the shell recipe as environment variables.

Assume my input dataset name is "my_input" and output dataset is "my_output". How would I specify them to my shell recipe?

Something like:

cat $my_input > $my_output ?

P.S.: I know I don't need to explicitly provide these names for the cat command in Dataiku. I am trying to write a more complex shell recipe and need to figure out how to specify the datasets explicitly.


Operating system used: Linux


Answers

  • Jurre

    Hi @yashpuranik,

    In the shell recipe editor, at the top left of the screen (see the attached screenshot), you'll find "variables" next to "datasets". Select "variables"; the list will include "DKU_INPUT_0_DATASET_ID" or something like it. Click on it and it will be inserted into your recipe. If your output will be a new file, it might be wise to specify the output folder (also listed among the variables), as that saves time searching for it afterwards.

    Hope this helps, have fun!
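
A minimal sketch of how one might check which variables are actually available from inside the recipe, rather than guessing names such as $my_input. Only DKU_INPUT_0_DATASET_ID is mentioned in this thread (and even that as "or something like it"); DKU_OUTPUT_0_DATASET_ID is an assumed counterpart by analogy, and the helper functions list_dku_vars and describe_io are hypothetical names used for illustration. The variable name suggests it holds the dataset's identifier rather than a file path, so it may not be usable directly as an argument to cat.

```shell
#!/bin/sh
# Print every DKU_* environment variable visible to the recipe, sorted by name.
# Running this once shows the exact names DSS exposes in your setup.
list_dku_vars() {
    env | grep '^DKU_' | sort
}

# Report the input and output dataset names, with a placeholder when unset.
# DKU_INPUT_0_DATASET_ID comes from the answer above;
# DKU_OUTPUT_0_DATASET_ID is an assumption and may be named differently.
describe_io() {
    echo "input dataset:  ${DKU_INPUT_0_DATASET_ID:-<not set>}"
    echo "output dataset: ${DKU_OUTPUT_0_DATASET_ID:-<not set>}"
}

list_dku_vars
describe_io
```

Once the real names are confirmed from the listing, they can be referenced in the recipe like any other environment variable, for example "$DKU_INPUT_0_DATASET_ID".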

    variables.jpg
