Filter data on multiple columns with a specific value for each
Hi,
In the formula language of a Prepare recipe, I want to filter the data on multiple columns, matching each column against a specific value or a list of values.
Example: column 'A' has 5 values. For 3 of them, only rows where column 'B' has the value 'X' should be kept; for the remaining 2 values of column 'A', rows with any value in column 'B' should be kept.
(I have not been able to find a way to filter a column on a list of values.)
Best Answer
-
Miguel Angel Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 118 Dataiker
Hi,
You can use the formula language to filter rows based on a list of values by using array functions. For example, see Capture.png, where rows are filtered based on whether the 'pages_visited' value falls within a provided array.
You can build more complex filters by using boolean operators to add conditions on other columns.
Information about formula functions can be found in this article: https://doc.dataiku.com/dss/11.0/formula/index.html#array-functions
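To make the combined rule from the question concrete, here is a minimal sketch of the row-level logic in plain Python. It is not Dataiku formula syntax, and it is not the expression from the screenshot. The value names ("a1" to "a5") and the sample rows are hypothetical, since the thread does not give the real values. In a Prepare recipe the same condition would presumably be built from arrayContains and the boolean operators.

```python
# Hypothetical values: the thread does not name the actual values of column A.
RESTRICTED_A = ["a1", "a2", "a3"]   # these values of A require B == "X"
UNRESTRICTED_A = ["a4", "a5"]       # these values of A accept any value of B


def keep_row(row):
    """Return True if the row passes the combined filter."""
    a_value, b_value = row["A"], row["B"]
    restricted_ok = a_value in RESTRICTED_A and b_value == "X"
    unrestricted_ok = a_value in UNRESTRICTED_A
    return restricted_ok or unrestricted_ok


# Hypothetical sample rows to exercise each branch of the rule.
rows = [
    {"A": "a1", "B": "X"},  # restricted value, B matches
    {"A": "a1", "B": "Y"},  # restricted value, B does not match
    {"A": "a4", "B": "Y"},  # unrestricted value, any B is accepted
    {"A": "a5", "B": "X"},  # unrestricted value, any B is accepted
    {"A": "a9", "B": "X"},  # value of A in neither list
]
kept = [row for row in rows if keep_row(row)]
```

The rule is an OR of two branches: one list membership test combined with an equality test on 'B', and one list membership test on its own. Rows whose 'A' value appears in neither list are dropped.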
Answers
-
Hi
Thanks for the response. I had already visited the page above for arrayContains, but unfortunately it didn't work for me. I assumed that 'item' could only be a single literal number or string and not a column, because the documentation excerpt below does not state this clearly and does not include an example that passes a variable as the second parameter.
I will try the approach shown in your screenshot. Thanks again.
arrayContains(array a, item) boolean
Returns whether the array a contains item
arrayContains([1, 2, 3], 5) returns false
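The doubt here is whether 'item' can come from a column instead of being a literal. The accepted answer describes a screenshot that filters on the 'pages_visited' column this way. The sketch below models that per-row evaluation in plain Python. The function array_contains is a hypothetical stand-in for illustration, not Dataiku's implementation, and the allowed list and rows are made up.

```python
def array_contains(array, item):
    """Stand-in for arrayContains: True if item is an element of array."""
    return item in array


# Hypothetical list of accepted values and hypothetical rows.
ALLOWED_PAGES = [1, 2, 3]
rows = [
    {"pages_visited": 2},
    {"pages_visited": 5},
    {"pages_visited": 3},
]

# The second argument is taken from the row's column, so it changes per row.
flags = [array_contains(ALLOWED_PAGES, row["pages_visited"]) for row in rows]
```

The array is fixed while the item is whatever the current row holds in that column, so each row gets its own true/false result.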
-
Ashley Dataiker, Alpha Tester, Dataiku DSS Core Designer, Registered, Product Ideas Manager Posts: 163 Dataiker
Might using a join between two datasets as a means to filter the first dataset do the trick? From your original post, it's not clear which data should filter which, but I've found that using a Join can be quite handy when I want to filter a dataset by a list of values.
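As a rough illustration of the idea, here is a plain Python sketch of filtering one dataset by the keys found in a second one. This models the effect of an inner join against a lookup dataset with unique keys. It does not use the Dataiku Join recipe, and the dataset contents and column names are hypothetical.

```python
# Hypothetical main dataset to be filtered.
main_rows = [
    {"A": "a1", "B": "X"},
    {"A": "a2", "B": "Y"},
    {"A": "a4", "B": "Z"},
]

# Hypothetical second dataset holding the list of values to keep.
allowed_rows = [{"A": "a1"}, {"A": "a4"}]

# Collect the join keys once, then keep only main rows with a matching key.
allowed_keys = {row["A"] for row in allowed_rows}
filtered = [row for row in main_rows if row["A"] in allowed_keys]
```

Keeping the accepted values in their own dataset means the list can change without editing a formula.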
Ashley