Unnest Json fails with Boolean values

Solved!
HeteshPatel
Level 2
Unnest Json fails with Boolean values

Hi, there seems to be an issue when there is a Boolean  True False value in a JSON when trying to use the Unnest Object function within the Prepare Recipe

for example using the following field as an input

{"BooleanTest": True}

results in the Unnest function to not work, as a workaround I've had to replace true and false to 1 and 0 prior to unnestting, just wanted to report this as a bug.

 

dataiku_unnest_Fail.png

 


Operating system used: Unix

0 Kudos
1 Solution
Nicolas_Servel
Dataiker

Hello Hetesh,

As described Tom, it does not work because you are using "True" instead of "true".

Replacing your values by lowercase "true", "false" should solve your issue.

The reason behind that is that "True" is not a proper JSON item, so DSS will fail parsing it and therefore processing it. This processor only works with valid JSON.

Oftentimes, you have "True" in your data because it was produced by Python. If that is the case, you should serialize your data to JSON prior to saving it, using for instance:

json.dumps(your_dict) 

 

Hope this helps,

Best regards,

 

Nicolas Servel

View solution in original post

6 Replies
tgb417

@HeteshPatel 

i think Iโ€™ve run into exactly the same problem.  My work around was to find all of the true and false values.  However, Iโ€™m not at my computer right now.   If I Remember I also used one of the text processors to change the json strings.  What I donโ€™t remember is how I changed the text.  I might have

  • simply put quotes around the โ€œtrueโ€ and โ€œfalseโ€
  • or changed the case to lower. 
--Tom
tgb417

@HeteshPatel 

here is my prior post on the work around for the True and False problem. It was about changing the case to all lower. 

https://community.dataiku.com/t5/Plugins-Extending-Dataiku/api-connect-plugin/m-p/20765/highlight/tr...

Give this a try and let us know how you are getting on with your project.

--Tom
tgb417

@CoreyS 

Can we find out if this is currently on the bug list?  Or if this has been fixed in a recent version of DSS.  Or should this thread be moved to the product idea section of the site to get the attention it might need to get a resolution.  Or enhancement so that DSS is more tolerant of JSON variability in the wild.  Having to change True to true in order to get these features to work is somewhere between a pain and a waste of time.

Thoughts?  

--Tom
0 Kudos
Nicolas_Servel
Dataiker

Hello Hetesh,

As described Tom, it does not work because you are using "True" instead of "true".

Replacing your values by lowercase "true", "false" should solve your issue.

The reason behind that is that "True" is not a proper JSON item, so DSS will fail parsing it and therefore processing it. This processor only works with valid JSON.

Oftentimes, you have "True" in your data because it was produced by Python. If that is the case, you should serialize your data to JSON prior to saving it, using for instance:

json.dumps(your_dict) 

 

Hope this helps,

Best regards,

 

Nicolas Servel

tgb417

@Nicolas_Servel,

Iโ€™m working with a third party vendorโ€™s REST API where Iโ€™m getting the JSON which you are describing as Mall-formed,  And using the Dataiku REST API plugin to pull this data.  (See my earlier post referenced above.) 

Iโ€™d like to retest with the latest version of DSS and the latest API plugin to see if Iโ€™m still getting the same mixed case results.  

However,  if I remember correctly I think that postman was also pulling the mixed case True and False.  If that is the case (pun intended) Iโ€™m thinking of reaching out to this third party vendor with concerns about how they are formatting these json values.  Can you point me to any official JSON documentation like an RFC or something like that show that True and False are out off spec for JSON?  It would be really helpful to have some sort of authoritative documentation about this point before going to the other organization about their API.

Thanks for any assistance you can provide.  

--Tom
0 Kudos
tgb417

@Nicolas_Servel Ok, so I think Iโ€™ve found the authoritative source on this point.  It appears that RFC 8259 says the following: 

Values

   A JSON value MUST be an object, array, number, or string, or one of
   the following three literal names:

      false null true

   The literal names MUST be lowercase.  No other literal names are
   allowed.

      value = false / null / true / object / array / number / string

      false = %x66.61.6c.73.65   ; false

      null  = %x6e.75.6c.6c      ; null

      true  = %x74.72.75.65      ; true

Thanks for pointing this out.  Iโ€™ll see if I can figure out which tools or data source is giving me mixed case true of false and causing the Dataiku tools to have problems.    

--Tom
0 Kudos