How to fill cells of column with previous non null value

abalo006
Level 2
How to fill cells of column with previous non null value

I have a table that is presorted that shows teams and players.

Capture.PNG

What I'm trying to do is create some kind of calculation to fill my blank cells for team, I want the cells to be filled with the last non null value that precedes that row.

For example, row 3 is blank, the last non null preceding value is row 1 = green, so I want that cell to have the value green also

 

Another example is row 17 is also blank, the last preceding non null value is row 14 = yellow so I want rows 15-16 to also have the value of yellow

 

Does that make sense? and is that possible using Dataiku?

 

I've attached a picture of my ideal output below and a sample dataset.

 

cap2.PNG


Operating system used: windows

0 Kudos
3 Replies
ZachM
Dataiker

Hi @abalo006,

You can do this by using a Window recipe.

Note that the type of the Player column must be "bigint" (not string) so that the rows are sorted correctly.

Window definitions:

swappy-20240501_143358.png

 

Aggregations (don't forget to select "First/Last not null"):

swappy-20240501_143451.png

 

Output dataset:

swappy-20240501_143528.png

 

 

0 Kudos

Nice one @ZachM, didn't work for me as postgreSQL doesn't support First/Last not null:Screenshot 2024-05-01 at 23.02.44.png

But this can be done in a SQL Recipe relatively easily:

SELECT t1."Player", t1."Team"
, (
    SELECT t2."Team"
    FROM "CT_TEST_book3_copy" t2
    WHERE t2."Player" = (
        SELECT MAX(t3."Player")
        FROM "CT_TEST_book3_copy" t3
        WHERE t3."Player" <= t1."Player"
        AND t3."Team" IS NOT NULL
       )
) AS "Calc_Team"
FROM "CT_TEST_book3_copy" t1;

 

Screenshot 2024-05-01 at 23.04.31.png

 

 

 

 

 

Turribeach

Not easy to do in a Visual recipe but relatively easy in either a SQL recipe or Python recipe. Is your Dataset in a SQL database?

0 Kudos

Labels

?
Labels (1)
A banner prompting to get Dataiku