Cells highlighted in Red - what does it mean?

GSung Registered Posts: 27 ✭✭✭✭

I have uploaded an Excel file in which one column is just numbers (for example: 1, 2, 3, 4, 4a, 4b). However, once the file is uploaded, in the Explore tab, where I can preview the whole dataset, I notice that some cells are highlighted in red.

What does that mean?


  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron


    I hope you are doing well. From your description, I'm not exactly sure what you are seeing. I can't remember a time when I got red highlighting. If the data is not too sensitive, can you share a screenshot of what you are seeing?

    I see you noted that you were working with numbers. However, when you listed some example values, you called out "4a,4b". Unless you are working with hexadecimal rather than decimal numbers, these character sequences don't represent numbers, and Dataiku DSS is going to have a hard time representing them as numbers. Initially you will have to treat them as strings, and then convert from hex to integer or decimal values.
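    If the values really are hexadecimal, the conversion itself is simple. Here is a minimal Python sketch of the idea, run outside DSS; the helper name `hex_to_int` is just something I made up for illustration, and it assumes every value is a plain hex string:

```python
def hex_to_int(value):
    """Return the integer for a hexadecimal string, or None if it is not valid hex."""
    try:
        # Base 16 tells int() to accept the digits 0-9 and the letters a-f.
        return int(value.strip(), 16)
    except ValueError:
        return None


values = ["1", "2", "3", "4", "4a", "4b"]
converted = [hex_to_int(v) for v in values]
print(converted)  # [1, 2, 3, 4, 74, 75]
```

    Values that are not valid hex come back as None, so they are easy to spot afterwards.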

    Finally, given your description, you may be running into a very useful feature that shows leading, trailing, and multiple spaces in a field. However, this feature has to be turned on explicitly, in the Display menu of a dataset, so I don't think this is what you are looking at.

    Highlight Leading and Trailing and multiple spaces.jpg

    When you use this feature you get magenta (purple/pink) highlights in your data fields.

    Showing Spaces.jpg
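    To make it concrete what kind of values that option points out, here is a rough Python sketch. It is not how DSS implements the feature; the pattern and the function name `has_suspicious_spaces` are my own assumptions about what "leading, trailing, and multiple spaces" covers:

```python
import re

# Leading whitespace, trailing whitespace, or two or more consecutive spaces.
SUSPICIOUS_SPACES = re.compile(r"^\s+|\s+$| {2,}")


def has_suspicious_spaces(value):
    """Return True if the string has leading, trailing, or repeated spaces."""
    return bool(SUSPICIOUS_SPACES.search(value))


samples = ["4a", " 4a", "4a ", "4  a", "4 a"]
flags = {s: has_suspicious_spaces(s) for s in samples}
print(flags)
```

    A single space inside a value ("4 a") is not flagged; only the leading, trailing, and doubled cases are.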

    If that does not help, please share more details so that someone can help.

  • Mattsco
    Mattsco Dataiker, Registered Posts: 125 Dataiker


    Red-highlighted values are values that are invalid for the column's inferred meaning (text, integer, decimal, gender, ...).

    In your example, DSS expects valid values to be integers, so cells with values like 4a are shown in red.

    Note that you can manually change the meaning to text if you want to correct this.
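    As a rough illustration of the idea (not the actual DSS validation logic, which is richer), here is a small Python sketch that checks each value against an "integer" rule. The pattern and the name `is_valid_integer` are assumptions made for the example:

```python
import re

# An optional sign followed by digits only; anything else is treated as invalid.
INTEGER_PATTERN = re.compile(r"[+-]?\d+")


def is_valid_integer(value):
    """Return True if the whole string looks like an integer."""
    return INTEGER_PATTERN.fullmatch(value.strip()) is not None


column = ["1", "2", "3", "4", "4a", "4b"]
invalid = [v for v in column if not is_valid_integer(v)]
print(invalid)  # ['4a', '4b']
```

    The values in `invalid` correspond to the cells you would expect to see in red while the column's meaning is integer. With a text meaning, every value would be valid.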

  • CoreyS
    CoreyS Dataiker Alumni, Dataiku DSS Core Designer, Dataiku DSS Core Concepts, Registered Posts: 1,150 ✭✭✭✭✭✭✭✭✭

    Hey @GSung,
    The red-highlighted cells are showing invalid values, i.e. values that do not match the selected meaning.

    You can use the Analyze window to explore those more. For more information you can utilize the following resources:

    1. Dataiku Academy: Basics 101
      1. Concepts: Analyze and Data Quality
      2. Concepts: Analyze and Data Quality video
    2. Knowledge Base: Analyze

    You can also use a Prepare recipe to flag invalid rows.

    I hope this helps!
