Issue converting dataset to JSON
Hi All,
I'm trying to convert a dataset to JSON with a Python script and then call it from Postman so that it can be used externally. When I make the call from Postman, the formatting never seems to be read correctly and the response is printed as a series of arrays (attached image).
Here's the code that I'm using:
import json

import dataiku
import pandas as pd
from dataiku import pandasutils as pdu

Predict = dataiku.Dataset("Predict")
Predict_df = Predict.get_dataframe()

json_data = Predict_df.to_dict(orient='records')
clean_json_str = json.dumps(json_data, indent=None)

JSON_predict_df = pd.DataFrame({'json_data': [clean_json_str]})
JSON_predict = dataiku.Dataset("JSON_Predict")
JSON_predict.write_with_schema(JSON_predict_df)
Can anyone provide some insight as to what I'm doing wrong?
Best,
Dylan
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,112 Neuron
You can't write a dataset back in JSON format. The format that Dataiku uses to store datasets is not something you can control. If you need to send a dataset as JSON to an external API, convert it on the fly and send it as JSON there and then; do not write it back to a Dataiku dataset. So in your code, once you have clean_json_str, you should call your external API right there.
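To illustrate why storing the JSON string in a dataset tends not to work, here is a minimal sketch using only the standard library. The sample records are made up, and the "rows of cells" layout is an assumption about how the dataset was served to Postman, not something confirmed in this thread. It shows that a JSON string placed in a single cell gets encoded a second time when the rows themselves are serialized.

```python
import json

# Hypothetical sample records, standing in for
# Predict_df.to_dict(orient="records") in the original script.
records = [{"id": 1, "prediction": 0.87}, {"id": 2, "prediction": 0.12}]

# Step 1: what the script does, serialize the records to a JSON string.
clean_json_str = json.dumps(records)

# Step 2 (assumption): the dataset is later served as rows of cells,
# so the JSON string is just one string cell and is encoded again.
served = json.dumps([[clean_json_str]])

# A client now sees an array of arrays holding an escaped string,
# and has to decode twice to get back to the records.
rows = json.loads(served)
cell = rows[0][0]
decoded = json.loads(cell)
```

A mapping that expects a plain JSON array of objects would not match the `served` payload, which is one reason to send `clean_json_str` directly instead of writing it to a dataset.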
-
xXdbrzXx Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Partner
I'm not sure this is possible with the application that I'm using. The call is made from the application side, and the application doesn't have any Python support. Would it be possible to call the Python code portion of the workflow instead of the resulting dataframe? I'm just trying to think of a way an API could work. I know I could use an SQL server, but that is a different story.
Would it be easier to set up a node and have the information called from there?
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,112 Neuron
I am not really sure I understand what you mean by "the call is done from the application side". What call is that? An API like a REST API is language agnostic.
-
xXdbrzXx Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Partner
I'm using a platform called Mendix, and this is how an API call is set up within the environment:
You enter the API URL in the window on the left, and the object on the right is the call that performs the action.
In the Response tab of the window, you can define the mapping (mapped object) that is done for the information that is being called.
The mapping objects only accept JSON, so verifying the structure is necessary. Once it's mapped, the call itself is embedded into a page and displays the results of the call. I was trying to think of a way to circumvent this process, possibly by publishing an API in the application and having a Python script call it and push the JSON-formatted data into it, but I am unsure whether this is possible either.
Let me know if this makes better sense.
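The idea of a Python script pushing JSON into an API published by the application could be sketched roughly as follows. This is only an illustration under assumptions: the URL is a placeholder, `build_json_post` is a hypothetical helper name, and the sample records stand in for the output of `Predict_df.to_dict(orient='records')`. It uses only the standard library, and the actual send is left commented out.

```python
import json
import urllib.request


def build_json_post(url, records):
    """Build (but do not send) a POST request carrying records as a JSON array."""
    # default=str is a simple fallback for values that json cannot
    # serialize natively, such as timestamps.
    body = json.dumps(records, default=str).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# In a Dataiku recipe the records would come from the dataset, e.g.:
#   records = dataiku.Dataset("Predict").get_dataframe().to_dict(orient="records")
# Hypothetical sample data is used here so the sketch runs anywhere.
records = [{"id": 1, "prediction": 0.87}]

# Placeholder URL; the real endpoint would be whatever the application publishes.
request = build_json_post("https://example.invalid/rest/predictions", records)

# Sending is commented out because the endpoint above does not exist.
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

The request body here is a plain JSON array of objects, which is the kind of structure that can be checked against a JSON mapping before the call is wired into a page.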