Unnest Json fails with Boolean values

Options
HeteshPatel
HeteshPatel Registered Posts: 11 ✭✭✭✭

Hi, there seems to be an issue when there is a Boolean True False value in a JSON when trying to use the Unnest Object function within the Prepare Recipe

for example using the following field as an input

{"BooleanTest": True}

results in the Unnest function to not work, as a workaround I've had to replace true and false to 1 and 0 prior to unnestting, just wanted to report this as a bug.

dataiku_unnest_Fail.png


Operating system used: Unix

Tagged:

Best Answer

  • Nicolas_Servel
    Nicolas_Servel Dataiker Posts: 37 Dataiker
    edited July 17 Answer ✓
    Options

    Hello Hetesh,

    As described Tom, it does not work because you are using "True" instead of "true".

    Replacing your values by lowercase "true", "false" should solve your issue.

    The reason behind that is that "True" is not a proper JSON item, so DSS will fail parsing it and therefore processing it. This processor only works with valid JSON.

    Oftentimes, you have "True" in your data because it was produced by Python. If that is the case, you should serialize your data to JSON prior to saving it, using for instance:

    json.dumps(your_dict) 

    Hope this helps,

    Best regards,

    Nicolas Servel

Answers

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    Options

    @HeteshPatel

    i think I’ve run into exactly the same problem. My work around was to find all of the true and false values. However, I’m not at my computer right now. If I Remember I also used one of the text processors to change the json strings. What I don’t remember is how I changed the text. I might have

    • simply put quotes around the “true” and “false”
    • or changed the case to lower.
  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    Options

    @HeteshPatel

    here is my prior post on the work around for the True and False problem. It was about changing the case to all lower.

    https://community.dataiku.com/t5/Plugins-Extending-Dataiku/api-connect-plugin/m-p/20765/highlight/true#M1307

    Give this a try and let us know how you are getting on with your project.

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    Options

    @CoreyS

    Can we find out if this is currently on the bug list? Or if this has been fixed in a recent version of DSS. Or should this thread be moved to the product idea section of the site to get the attention it might need to get a resolution. Or enhancement so that DSS is more tolerant of JSON variability in the wild. Having to change True to true in order to get these features to work is somewhere between a pain and a waste of time.

    Thoughts?

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    Options

    @Nicolas_Servel
    ,

    I’m working with a third party vendor’s REST API where I’m getting the JSON which you are describing as Mall-formed, And using the Dataiku REST API plugin to pull this data. (See my earlier post referenced above.)

    I’d like to retest with the latest version of DSS and the latest API plugin to see if I’m still getting the same mixed case results.

    However, if I remember correctly I think that postman was also pulling the mixed case True and False. If that is the case (pun intended) I’m thinking of reaching out to this third party vendor with concerns about how they are formatting these json values. Can you point me to any official JSON documentation like an RFC or something like that show that True and False are out off spec for JSON? It would be really helpful to have some sort of authoritative documentation about this point before going to the other organization about their API.

    Thanks for any assistance you can provide.

  • tgb417
    tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,595 Neuron
    edited July 17
    Options

    @Nicolas_Servel
    Ok, so I think I’ve found the authoritative source on this point. It appears that RFC 8259 says the following:

    Values
    
       A JSON value MUST be an object, array, number, or string, or one of
       the following three literal names:
    
          false null true
    
       The literal names MUST be lowercase.  No other literal names are
       allowed.
    
          value = false / null / true / object / array / number / string
    
          false = %x66.61.6c.73.65   ; false
    
          null  = %x6e.75.6c.6c      ; null
    
          true  = %x74.72.75.65      ; true

    Thanks for pointing this out. I’ll see if I can figure out which tools or data source is giving me mixed case true of false and causing the Dataiku tools to have problems.

Setup Info
    Tags
      Help me…