# filter data with multiple columns with a specific value for each

Solved!
Level 3
Hi,

Using the formula language in the Prepare recipe, I am looking to filter the data on multiple columns, with a specific value or a list of values for each column.

Example: column 'A' has 5 values. Of those 5, 3 should be kept only when column 'B' has the value 'X'; the remaining 2 values of column 'A' can be matched with any value in column 'B'.

(I have not been able to find a way to filter a column by a list of values.)

1 Solution
Dataiker

Hi,

You can use the formula language to filter rows based on a list of values using array functions. For example, see Capture.png, where rows are filtered based on whether the 'pages_visited' value falls within a provided array.

You can make the filter more complex by using boolean operators to add conditions on other columns.
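As a rough illustration of the logic only (this is plain Python, not Dataiku formula syntax), the row-level condition from the question could be modelled as below. The column names `A` and `B` and the value `X` come from the question; the sample values, `RESTRICTED_A`, and `keep_row` are hypothetical names made up for the sketch.

```python
# Values of column A that may only be kept when B == "X".
# These sample values are hypothetical; substitute the real list.
RESTRICTED_A = ["a1", "a2", "a3"]


def keep_row(row):
    """Return True if the row passes the filter described in the question."""
    # Membership test: plays the role of arrayContains(array, item),
    # with the column value passed as the item.
    if row["A"] in RESTRICTED_A:
        return row["B"] == "X"
    # Any other value of A is kept regardless of B.
    return True


# Small hypothetical dataset to exercise the condition.
rows = [
    {"A": "a1", "B": "X"},
    {"A": "a1", "B": "Y"},
    {"A": "a4", "B": "Y"},
    {"A": "a5", "B": "X"},
]

filtered = [r for r in rows if keep_row(r)]
```

Here only the second row is dropped, because its `A` value is in the restricted list while its `B` value is not `X`. In the formula language the same shape would presumably be a single boolean expression combining `arrayContains` on column 'A' with a comparison on column 'B'.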

3 Replies
Level 3
Author

Hi

Thanks for the response. I had already visited the above page for arrayContains, but unfortunately it didn't work for me. I assumed that 'item' could only be a single literal number or string and not a column, because the details below do not state this clearly and there is no example that uses a variable as the second parameter.

I will try the approach shown in your screenshot. Thanks again.

arrayContains(array a, item) boolean

Returns whether the array a contains item

arrayContains([1, 2, 3], 5) returns false

Dataiker

Might a join between two datasets, used as a means to filter the first dataset, do the trick? From your original post, it's not clear which values should filter which column, but I've found that a Join can be quite handy when I want to filter a dataset by a list of values.

Ashley
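The join idea can be sketched in plain Python as follows. This is only a model of the logic, not a Dataiku Join recipe; the dataset contents and the names `main_rows`, `allowed_rows`, and `allowed_keys` are hypothetical. The second dataset lists the permitted (A, B) combinations, and the first dataset is kept only where its key appears in that list.

```python
# Dataset to be filtered (hypothetical sample rows).
main_rows = [
    {"A": "a1", "B": "X", "value": 10},
    {"A": "a1", "B": "Y", "value": 20},
    {"A": "a4", "B": "Y", "value": 30},
]

# Second dataset listing the permitted (A, B) combinations (hypothetical).
allowed_rows = [
    {"A": "a1", "B": "X"},
    {"A": "a4", "B": "Y"},
]

# Build a set of join keys so each lookup is a cheap membership test
# and duplicate entries in the lookup dataset cannot duplicate rows.
allowed_keys = {(r["A"], r["B"]) for r in allowed_rows}

# Keep only the rows whose (A, B) pair appears in the lookup dataset.
filtered = [r for r in main_rows if (r["A"], r["B"]) in allowed_keys]
```

The trade-off is that every permitted combination has to be listed explicitly in the second dataset, which suits a long list of values better than a rule such as "any value of B".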