Retrieve data based on tree structure inside a column

Hugror
Hugror Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 4

Hi,

I'm trying to create a new column, "Parent", based on the 2 columns X and Y.

Y holds the level in a basic tree structure: a level 2 row is the daughter of the level 1 row above it, and so on, as in the example below.

The "Parent" column is derived from the Y column, but it holds the X value of the parent row.

I really don't know how to do that. I tried with a Formula step and a Window recipe, but I didn't succeed.

Hope I was clear enough. Thanks for your help.

X   Y   Parent
A   1
B   2   A
C   2   A
D   1
E   2   D
F   2   D
G   3   E (or F based on some conditions)
H   1

Operating system used: Windows 11

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    This should be possible as long as you can guarantee that the required order can be obtained with SORT BY X,Y. In other words, the data must allow a sort that returns the rows in the same order as shown in your sample. Finally, the logic doesn't add up for G: according to the other rows, G's parent should be D, not E or F.

  • Hugror
    Hugror Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 4

    My data is sorted exactly as shown in my sample. The only thing is that Y can go from 1 to 9 and X is a random string

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    What I need to be sure of is that the data is sortable, not that it is sorted. These are different things. You can use a Window recipe to calculate a parent, but for that to work you need to specify a sort order. So it doesn't matter what order the data comes in; what matters is whether I can sort by columns X and Y and obtain the same result. With regard to G, I can't help unless I understand the rule you want the logic to follow. For all the other rows, you want to populate Parent with the X value of the Level 1 parent when Y (level) > 1. That's a rule I can work with. With G, I don't really know what you want.
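
To make the two candidate rules discussed above concrete, here is a minimal sketch in plain Python rather than a Window recipe. It assumes the rows are already in the order shown in the sample; the function names `level1_parent` and `nearest_parent` are hypothetical, chosen only for illustration. The first function implements the "Level 1 parent" rule, which gives D for G. The second assumes a different rule (parent is the most recent preceding row whose Y is one less), which gives F for G; the asker's actual condition for choosing between E and F is not stated in the thread, so it is not modelled.

```python
# Sample rows (X, Y) from the question, in the order shown there.
rows = [("A", 1), ("B", 2), ("C", 2), ("D", 1),
        ("E", 2), ("F", 2), ("G", 3), ("H", 1)]


def level1_parent(rows):
    """Rule from the answer: Parent is the X of the latest Level 1 row above."""
    parents = []
    current_root = None  # X of the most recent row with Y == 1
    for x, y in rows:
        if y == 1:
            current_root = x
            parents.append(None)  # Level 1 rows have no parent
        else:
            parents.append(current_root)
    return parents


def nearest_parent(rows):
    """Assumed alternative: Parent is the latest row above with level Y - 1."""
    last_seen = {}  # level -> X of the most recent row seen at that level
    parents = []
    for x, y in rows:
        parents.append(last_seen.get(y - 1))  # None when no such row exists
        last_seen[y] = x
    return parents


for (x, y), p1, p2 in zip(rows, level1_parent(rows), nearest_parent(rows)):
    print(x, y, p1, p2)
```

Both functions depend only on row order, which is why a reproducible sort order matters: without it, "the row above" is not well defined.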
