Computation engine

Moham
Moham Registered Posts: 1

I am new to Dataiku

Is a Prepare recipe better than a code recipe in terms of performance? And if a Prepare recipe is better, which engine would give me the best performance?


Operating system used: Cloud


Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron

    It's impossible to answer your question without more detail about what you are trying to do: what the Prepare recipe does, which engines you have available, where the data comes from, where it will be written to, and so on.

    As a general rule, and assuming you have some sort of SQL engine to run your recipes, visual recipes like the Prepare recipe will be able to push the computation down to that engine, whereas code recipes will need to bring the data into the DSS server to process it and then write it back to the output layer. Having said that, code recipes can benefit from scalable compute options like Kubernetes or Spark. So, as I said at the start of my reply, it really depends on many factors.
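
To make the push-down idea concrete, here is a minimal sketch that uses Python's built-in sqlite3 module rather than Dataiku itself. The `orders` table and both function names are made up for illustration. The first function lets the database do the aggregation, and the second pulls every row into the Python process first, which is roughly the difference described above.

```python
import sqlite3

# Hypothetical in-memory table standing in for a dataset stored in a SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5), ("south", 1.5)],
)


def totals_pushed_down(conn):
    """Let the database aggregate; only one row per region leaves it."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    ).fetchall()
    return dict(rows)


def totals_pulled_out(conn):
    """Fetch every row into the Python process, then aggregate there."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


pushed = totals_pushed_down(conn)
pulled = totals_pulled_out(conn)
```

Both functions return the same totals. On a large table the second one has to move every row out of the database before doing any work, and that is where the cost usually shows up.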

  • Marlan
    Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 317 Neuron

    I would just add that the type of code recipe matters. Python code recipes bring data into DSS, but SQL Script code recipes run entirely in the database. I believe SQL Query code recipes do too, if the output is in the same database.

    Marlan

  • me2
    me2 Registered Posts: 48 ✭✭✭✭✭

    I would only add that if you try hard to optimize each individual recipe and the end result is a mix of different engines within a flow, then that becomes a barrier to optimizing entire flows by using pipelines.

    Happy dataiku'n!

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,728 Neuron

    I would say that in general you shouldn't need to worry about recipe performance until you have a real performance issue. Furthermore, a key Dataiku capability is that you can use both code and visual recipes to achieve the same goals. This allows you to support both "coder" and "clicker" personas. Inevitably, certain workloads will be faster in one type of recipe than in others, but you have to look at this on a case-by-case basis. There are countless ways to code something up, so code recipes also vary hugely in performance depending on how they are written.
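
As a small illustration of how much the same logic can vary with how it is coded, here is a hedged sketch in plain Python, not tied to any Dataiku API. The data and function names are hypothetical. Both functions return the same matches, but one scans a list for every lookup and the other builds a set once. Timing them with `timeit` is one simple way to check a specific case before deciding anything.

```python
import timeit

# Hypothetical data: incoming ids to check against a list of known ids.
known_ids = list(range(0, 4000, 2))
incoming_ids = list(range(3000))


def match_with_list(incoming, known):
    """Scans the whole list for every lookup (quadratic work overall)."""
    return [i for i in incoming if i in known]


def match_with_set(incoming, known):
    """Builds a set once, so each lookup is roughly constant time."""
    known_set = set(known)
    return [i for i in incoming if i in known_set]


slow = match_with_list(incoming_ids, known_ids)
fast = match_with_set(incoming_ids, known_ids)

# Time a few runs of each; the actual numbers depend on the machine.
slow_s = timeit.timeit(lambda: match_with_list(incoming_ids, known_ids), number=3)
fast_s = timeit.timeit(lambda: match_with_set(incoming_ids, known_ids), number=3)
print(f"list lookup: {slow_s:.4f}s, set lookup: {fast_s:.4f}s")
```

The results are identical, so the only thing that changes is the running time, which is why measuring your own workload beats a general rule.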
