data preparation

mgirdwood

Hi - Is it possible to add a new column and then populate its rows based on criteria that look at one of the other columns, such as in Excel

=IF(B1=1, "Start", "Stop") Where B1 is the column name

?

Thanks

Best Answer

  • tgb417

    You can also nest if statements.
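
    For a multi-branch case such as Excel's =IFS(), nesting might look like the sketch below. The column name score, the thresholds, and the labels are made up for illustration; only if() and numval() come from the formula language itself.

    ```
    if(numval("score") >= 90, "High", if(numval("score") >= 50, "Medium", "Low"))
    ```

    Each inner if() sits in the "else" slot of the one before it, so the first condition that is true decides the result.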

Answers

  • tgb417

    @mgirdwood,

    In a word: yes, you can do this.

    If you are using a visual recipe, you can add a new Formula step. You can then choose "Open Editor Panel" and enter your formula.

    [Screenshot: Formula Editor.png]

    Note that although this formula language looks a lot like MS Excel formulas, it is not the same. Here are a few things that threw me off when getting started:

    • The formula language is case-sensitive: IF, If, and iF will not work. Only the all-lowercase version, if, will work.
    • It is often helpful to wrap numeric columns in the function numval(). This tends to deal with column names that contain spaces, among other challenges.
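
    As a sketch, the Excel formula from the question could be written like this, assuming the column is literally named B1 (a placeholder taken from the question, not a real column):

    ```
    if(numval("B1") == 1, "Start", "Stop")
    ```

    Note the lowercase if and the double == for comparison. For a column whose name contains a space, say a hypothetical "order status", the same pattern would be if(numval("order status") == 1, "Start", "Stop").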

    Here is a link to the documentation of the formula language.

    Hope this helps. Please let us know how you are getting on with your challenges.

  • mgirdwood

    Thanks Tom. As an additional query, what about multiple outputs, as with Excel's =IFS()? I cannot see a similar option in the formula language documentation.

    Mark

  • tgb417

    If you mean that you want to do something like this:

    if(numval('is_rework') == 1 || numval('Titan') == 1, 'Start', 'Stop')

    Check out the Operators in the formula language.

    • Boolean operators: &&, ||
