Pretty sure this is a basic question, however I was not able to find a solution so far.
After removing some columns, I ended up with rows that have an empty value in each column (tooltip shows "null"), which I want to remove now.
I tried the following processor ("filter rows/cells on value"):
However, the empty rows are not removed. A counter-check with "Action: Only keep matching rows" removes all rows.
The "Remove rows where cell is empty-processor" in combination with selecting "all" columns removes each row with at least one empty column (which is not what I want) as described in https://doc.dataiku.com/dss/latest/preparation/processors/remove-empty.html.
Am I missing something? Thanks!
I'm still seeing the same behaviour unfortunately, however for future readers a simpler option is described here using filter by formula:
https://community.dataiku.com/t5/Using-Dataiku-DSS/Remove-row-if-two-column-are-both-empty/m-p/1285
Hello,
In the meantime, you can use the following trick:
1. Add a Python function step in "cell" mode to output an "all_empty" column, with the following code
def process(row):
all_empty = all(
[str(v).strip() == '' or v is None
for k,v in row.items() if k != 'all_empty']
)
return(all_empty)
2. Add a filter on value step to remove the rows where the "all_empty" indicator column is True
​
Hope it helps,
Alex