I am a new Dataiku user, and I'd like some advice on the best way to leverage the tool to flag records with invalid values.
Example: I have a column called "Fruit," and I expect only the following values in that column: 'Apples', 'Oranges', and 'Grapes'. I want to flag, in a separate column, any records that don't contain one of these values.
Some options that I've considered are:
Flag invalid rows recipe - this doesn't seem to work because it only checks data types
Flag rows on value recipe - this doesn't seem to work because it flags rows that match a value, not rows that fail to match
Joins - set up the valid values as a separate data source (b) and join it against the primary data source (a); in the output data source (c), add a flag to any records where the key from (b) is null, then rejoin (c) back to data source (a)
Nasty if formula - write a nested if formula that checks each record against all of the valid values
I'm sure this can be done in Python too, but I am not fluent in Python, so I am hoping to find an easier alternative.
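For reference, my understanding is that the Python route would look roughly like the sketch below. It uses plain pandas, and the sample rows and the "Fruit_invalid" column name are made up for illustration; in a real recipe the DataFrame would come from the input dataset instead of being built inline.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
df = pd.DataFrame({"Fruit": ["Apples", "Kiwi", "Grapes", None, "oranges"]})

# The values the "Fruit" column is expected to contain.
VALID_FRUITS = ["Apples", "Oranges", "Grapes"]

# True where the value is NOT one of the expected values.
# The match is exact, so missing values and different capitalization
# (e.g. "oranges") are flagged as invalid as well.
df["Fruit_invalid"] = ~df["Fruit"].isin(VALID_FRUITS)

print(df)
```

If this is roughly right, the whole check is a single membership test against the list of valid values, which is what I am hoping to reproduce without writing code.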
Any recommendations based on how others have handled this use case?