Simplest way to get the aggregate value from one dataset and bring it into another

Registered Posts: 44 ✭✭✭✭

I have dataset A and dataset B. I need the aggregate total from one column called "Total Commission" from B. I want to bring it into A and populate a single column with that value.

I know I can do this in Python with two dataframes and I know I can do this with a join if I create a join key in the datasets. Is there a simpler way to do this than either of those two options?

Thanks!


Answers

  • Registered Posts: 56 ✭✭✭✭✭

    Here is another way

  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,401 Neuron

    "Simple" is a very subjective adjective. A "coder" user will find a Python recipe much simpler, whereas a "clicker" user will find a Join recipe much simpler. The issue is that your question doesn't give enough information or clear requirements about what you are after. Are the Python recipe and the Join recipe not good enough for you? Why do you need another solution?

  • Registered Posts: 44 ✭✭✭✭

    To go the join route I have to perform the following steps:
    1) create a group recipe
    2) add a join key on both datasets
    3) join them
    4) do all the math I need to do

    I think this will take at least 3 separate recipes, and will require a hefty amount of downstream schema refreshing. I am looking for the path of least resistance.
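For comparison, here is a rough pandas sketch that mirrors the four steps above one for one. It is only an illustration under assumptions: the `commission` column in A, the constant `join_key`, and the `share_of_total` calculation are hypothetical names invented for the example, and only "Total Commission" comes from the question.

```python
import pandas as pd

# Hypothetical stand-ins for dataset A and dataset B.
df_a = pd.DataFrame({"rep": ["ann", "bob"], "commission": [30.0, 10.0]})
df_b = pd.DataFrame({"Total Commission": [25.0, 15.0]})

# 1) "Group": collapse B to a single row holding the aggregate.
totals = pd.DataFrame({"Total Commission": [df_b["Total Commission"].sum()]})

# 2) Add the same constant join key to both sides (assumed name: join_key).
totals["join_key"] = 1
a_keyed = df_a.assign(join_key=1)

# 3) Join: every row of A matches the single totals row.
joined = a_keyed.merge(totals, on="join_key", how="left").drop(columns="join_key")

# 4) Do the math (hypothetical example: each rep's share of the total).
joined["share_of_total"] = joined["commission"] / joined["Total Commission"]

print(joined)
```

The constant key exists only to force every row of A to match the one aggregate row, which is why the visual route needs the extra preparation steps.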

  • Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,401 Neuron

    Again, the same issue: "the path of least resistance" depends on who is taking the path. Is your actual question "how can I do X with the fewest visual recipes"?

    "will require a hefty amount of downstream schema refreshing"

    In v12 it's trivial to propagate schema changes, since schema propagation now works properly and there is a new feature called "Build Downstream" which allows you to enable schema propagation (see below).

    See here and here for this new feature.

  • Registered Posts: 44 ✭✭✭✭

    I do not think we are running v12 yet, and I do not have those features. I ended up going the Python route. It seemed to make the most sense.
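For anyone landing here later, a minimal sketch of what the Python route could look like with plain pandas. This is not the poster's actual code: the function name, the `total_commission_b` target column, and the sample data are assumptions made up for the example. In a DSS Python recipe the two frames would typically be read with `dataiku.Dataset(...).get_dataframe()` rather than built by hand.

```python
import pandas as pd


def add_total_commission(df_a, df_b,
                         source_col="Total Commission",
                         target_col="total_commission_b"):
    """Return a copy of df_a with the sum of df_b[source_col] on every row.

    target_col is a hypothetical name; pick whatever suits dataset A.
    """
    total = df_b[source_col].sum()
    # assign() returns a new frame and broadcasts the scalar to all rows.
    return df_a.assign(**{target_col: total})


# Hypothetical sample data standing in for datasets A and B.
df_a = pd.DataFrame({"rep": ["ann", "bob", "cy"], "sales": [100.0, 250.0, 75.0]})
df_b = pd.DataFrame({"Total Commission": [10.0, 20.5, 4.5]})

result = add_total_commission(df_a, df_b)
print(result)
```

No join key is needed here because a scalar assigned to a column is repeated on every row, so the whole thing stays inside a single recipe.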
