Unnest Json fails with Boolean values
Hi, there seems to be an issue when there is a Boolean True False value in a JSON when trying to use the Unnest Object function within the Prepare Recipe
for example using the following field as an input
{"BooleanTest": True}
results in the Unnest function to not work, as a workaround I've had to replace true and false to 1 and 0 prior to unnestting, just wanted to report this as a bug.
Operating system used: Unix
Best Answer
-
Hello Hetesh,
As described Tom, it does not work because you are using "True" instead of "true".
Replacing your values by lowercase "true", "false" should solve your issue.
The reason behind that is that "True" is not a proper JSON item, so DSS will fail parsing it and therefore processing it. This processor only works with valid JSON.
Oftentimes, you have "True" in your data because it was produced by Python. If that is the case, you should serialize your data to JSON prior to saving it, using for instance:
json.dumps(your_dict)
Hope this helps,
Best regards,
Nicolas Servel
Answers
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
i think I’ve run into exactly the same problem. My work around was to find all of the true and false values. However, I’m not at my computer right now. If I Remember I also used one of the text processors to change the json strings. What I don’t remember is how I changed the text. I might have
- simply put quotes around the “true” and “false”
- or changed the case to lower.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
here is my prior post on the work around for the True and False problem. It was about changing the case to all lower.
Give this a try and let us know how you are getting on with your project.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
Can we find out if this is currently on the bug list? Or if this has been fixed in a recent version of DSS. Or should this thread be moved to the product idea section of the site to get the attention it might need to get a resolution. Or enhancement so that DSS is more tolerant of JSON variability in the wild. Having to change True to true in order to get these features to work is somewhere between a pain and a waste of time.
Thoughts?
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
I’m working with a third party vendor’s REST API where I’m getting the JSON which you are describing as Mall-formed, And using the Dataiku REST API plugin to pull this data. (See my earlier post referenced above.)
I’d like to retest with the latest version of DSS and the latest API plugin to see if I’m still getting the same mixed case results.
However, if I remember correctly I think that postman was also pulling the mixed case True and False. If that is the case (pun intended) I’m thinking of reaching out to this third party vendor with concerns about how they are formatting these json values. Can you point me to any official JSON documentation like an RFC or something like that show that True and False are out off spec for JSON? It would be really helpful to have some sort of authoritative documentation about this point before going to the other organization about their API.
Thanks for any assistance you can provide.
-
tgb417 Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Neuron 2020, Neuron, Registered, Dataiku Frontrunner Awards 2021 Finalist, Neuron 2021, Neuron 2022, Frontrunner 2022 Finalist, Frontrunner 2022 Winner, Dataiku Frontrunner Awards 2021 Participant, Frontrunner 2022 Participant, Neuron 2023 Posts: 1,598 Neuron
@Nicolas_Servel
Ok, so I think I’ve found the authoritative source on this point. It appears that RFC 8259 says the following:Values A JSON value MUST be an object, array, number, or string, or one of the following three literal names: false null true The literal names MUST be lowercase. No other literal names are allowed. value = false / null / true / object / array / number / string false = %x66.61.6c.73.65 ; false null = %x6e.75.6c.6c ; null true = %x74.72.75.65 ; true
Thanks for pointing this out. I’ll see if I can figure out which tools or data source is giving me mixed case true of false and causing the Dataiku tools to have problems.