Issue converting dataset to JSON

xXdbrzXx Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Partner
edited July 16 in Using Dataiku

Hi All,

I'm trying to convert a dataset to JSON using a Python script and then call it from Postman so it can be used externally. When I call it from Postman, the formatting is never read correctly and the response prints as a series of arrays (see attached image: Screenshot 2024-05-01 143217.jpg).

Here's the code that I'm using:

import json

import dataiku
import pandas as pd

# Read the source dataset into a pandas DataFrame
Predict = dataiku.Dataset("Predict")
Predict_df = Predict.get_dataframe()

# Serialize all rows as one JSON string (a list of row objects)
json_data = Predict_df.to_dict(orient='records')
clean_json_str = json.dumps(json_data, indent=None)

# Wrap the JSON string in a one-row, one-column DataFrame and write it out
JSON_predict_df = pd.DataFrame({'json_data': [clean_json_str]})
JSON_predict = dataiku.Dataset("JSON_Predict")
JSON_predict.write_with_schema(JSON_predict_df)

Can anyone provide some insight as to what I'm doing wrong?

Best,

Dylan

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,717 Neuron

    You can't write a dataset back in JSON format. The format that Dataiku uses to store datasets is not something you can control. If you need to send a dataset as JSON to an external API, convert it on the fly and send it there and then; do not write it back to a Dataiku dataset. So in your code, once you have clean_json_str, you should call your external API right there.
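A minimal sketch of that idea, using only the standard library: serialize the rows and build the HTTP POST in the same script instead of writing the string to a dataset. The helper names `records_to_json` and `build_post_request`, the sample rows, and the URL are hypothetical placeholders. In a DSS recipe the rows would come from `Predict_df.to_dict(orient='records')`, and this assumes every value is already JSON-serializable apart from NaN.

```python
import json
import math
import urllib.request


def records_to_json(records):
    """Serialize a list of row dicts to strict JSON, mapping NaN to null."""
    cleaned = [
        {
            key: (None if isinstance(value, float) and math.isnan(value) else value)
            for key, value in row.items()
        }
        for row in records
    ]
    # allow_nan=False makes json.dumps fail loudly instead of emitting
    # the non-standard token NaN, which strict JSON parsers reject.
    return json.dumps(cleaned, allow_nan=False)


def build_post_request(url, payload):
    """Build (but do not send) a POST request carrying the JSON payload."""
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sample rows standing in for Predict_df.to_dict(orient='records')
records = [{"id": 1, "score": 0.87}, {"id": 2, "score": float("nan")}]
payload = records_to_json(records)

# Hypothetical endpoint; replace with the real external API URL.
request = build_post_request("https://example.invalid/api/predictions", payload)

# Uncomment to actually send the request:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Replacing NaN with null before sending matters because pandas uses NaN for missing values, and many JSON consumers will not accept it.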

  • xXdbrzXx
    xXdbrzXx Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Partner

    I'm not sure this is possible with the application that I'm using. The call is made from the application side, but the application doesn't have any Python support. Would it be possible to call the Python code portion of the workflow instead of the resulting dataframe? I'm just trying to think of a way an API could work. I know I could use a SQL server, but that is a different story.

    Would it be easier to set up a node and have the information called from there?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,717 Neuron

    I am not really sure I understand what you mean by "the call is done from the application side". What call is that? An API like a REST API is language agnostic.

  • xXdbrzXx
    xXdbrzXx Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 5 Partner

    I'm using a platform called Mendix, and this is how you set up an API call within the environment:
    1.jpg

    You enter the API URL in the window on the left, and the object on the right is the call performing the action.

    2.jpg

    In the Response tab of the window, you can define the mapping (mapped object) that is done for the information that is being called.
    3.jpg 4.jpg

    The mapping objects only accept JSON, so verifying the structure is necessary. Once it's mapped, the call itself is embedded into a page and displays the results of the call. I was trying to think of a way to circumvent this process, possibly by publishing an API in the application and having a Python script call it and push the JSON-formatted data into it, but I am unsure whether this is possible either.
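One way to verify the structure before mapping it, sketched in plain Python: check that a response body parses as a JSON array of objects that all share the same keys. The function name `check_records_json` and the sample strings are hypothetical, and whether this array-of-objects shape is what a given mapping expects is an assumption to confirm in the application.

```python
import json


def check_records_json(text):
    """Return the field names if text is a JSON array of objects that
    all share the same keys; raise ValueError otherwise."""
    data = json.loads(text)  # invalid JSON raises JSONDecodeError, a ValueError
    if not isinstance(data, list) or not data:
        raise ValueError("expected a non-empty JSON array")
    if not all(isinstance(row, dict) for row in data):
        raise ValueError("expected every array element to be a JSON object")
    keys = list(data[0])
    for index, row in enumerate(data):
        if set(row) != set(keys):
            raise ValueError(f"row {index} has different keys from row 0")
    return keys


# An array of objects passes; an array of arrays is rejected.
good = '[{"id": 1, "prediction": "yes"}, {"id": 2, "prediction": "no"}]'
nested = '[[1, "yes"], [2, "no"]]'

print(check_records_json(good))
try:
    check_records_json(nested)
except ValueError as err:
    print("rejected:", err)
```

Pasting the raw response body from Postman into a check like this shows quickly whether the endpoint is returning row objects or bare arrays.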

    Let me know if this makes better sense.
