Stop "New values found for categorical columns" Warnings from model evaluation store recipe

Marlan
Marlan Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Dataiku Frontrunner Awards 2021 Participant, Neuron 2023 Posts: 321 Neuron

Hi all,

Does anyone have ideas on how to stop the recipe for a model evaluation store from generating warnings about new values in categorical columns? There are other warnings I do want to be notified of, but I don't want to be notified of this one every time the recipe runs.

I would have thought that turning off the Dataset Sanity Checks in the source model would fix this, but that didn't help.

The warning message is: ML_DIAGNOSTICS_DATASET_SANITY_CHECKS occurred 1 times. New values found in evaluation data that were not present in reference data for categorical column ZIP_CODE.

If I can't stop this warning from occurring, maybe there is a way to catch it in a later scenario step and clear it. Ideally I'd just set the OUTCOME variable back to SUCCESS, but I don't think that is possible.

Any ideas?

Thanks!

Marlan


Operating system used: Red Hat


Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker

    Hi @Marlan,
    The possibility of suppressing these warnings is logged as an enhancement request.
    For now, as you mentioned, you can ignore warnings in subsequent steps by setting the "Outcome on warnings" option to "Success".

    Alternatively, use a reporter run condition such as outcome == 'FAILED'.

    Thanks,

  • Marlan

    Thanks for the reply, @AlexT.

    Yes, it would be great to be able to turn off this warning. Generally I don't care whether there are new values, although for some columns I might want to be notified. Ideally this would be configurable column by column, maybe as part of feature setup.

    I don't have the Outcome on warnings option you show in your message. I assume this functionality was added after version 11.1.2 (our current version)?

    Can I set the Outcome programmatically in a subsequent Execute Python step?

    If so, I still wouldn't quite be there, as I'd need to suppress only the "new values found" warning and not the other warnings, which I do want to capture.

    Thanks,

    Marlan
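
The selective suppression asked about above could look roughly like the following. This is a minimal sketch in plain Python, not a Dataiku API: it assumes the warning messages are already available as a list of strings shaped like the message quoted in the question, and it leaves out how to fetch them from the recipe or scenario run, since the thread does not confirm a way to do that. The names `split_warnings`, `effective_outcome`, and `watched_columns` are hypothetical.

```python
import re

# Assumption: each warning is a plain string formatted like the one quoted in the question.
NEW_VALUES_PATTERN = re.compile(
    r"New values found in evaluation data that were not present in "
    r"reference data for categorical column (\w+)"
)


def split_warnings(messages, watched_columns=frozenset()):
    """Split messages into (ignored, kept).

    A "new values" warning is ignored unless its column is in watched_columns.
    Every other warning is kept.
    """
    ignored, kept = [], []
    for msg in messages:
        match = NEW_VALUES_PATTERN.search(msg)
        if match and match.group(1) not in watched_columns:
            ignored.append(msg)
        else:
            kept.append(msg)
    return ignored, kept


def effective_outcome(messages, watched_columns=frozenset()):
    """Return "WARNING" if any kept warning remains, otherwise "SUCCESS"."""
    _, kept = split_warnings(messages, watched_columns)
    return "WARNING" if kept else "SUCCESS"
```

Passing a set such as `{"ZIP_CODE"}` as `watched_columns` keeps the warning for that column only, which matches the column-by-column behaviour described above. Whether the resulting value can be written back as the step outcome depends on the DSS version and is not something this sketch settles.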
