How can you convert a CSV file to a JSON file?
UserBird
How do I convert a CSV file to a JSON file using R or Python in Dataiku?
Answers
-
A CSV dataset in Dataiku is exposed to Python as a pandas DataFrame, so I would try the DataFrame's to_json() method to convert it to JSON. https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_json.html
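For reference, here is a minimal sketch of what to_json() returns, run outside Dataiku on a small made-up dataframe (the column names and values are placeholders invented for illustration, not anything from a real dataset). It assumes only that pandas is installed. The default layout is keyed by column, while orient="records" gives one JSON object per row, which is often closer to what people expect from a CSV conversion.

```python
import json

import pandas as pd

# Hypothetical stand-in for the dataframe a Dataiku dataset would provide
df = pd.DataFrame({"name": ["a", "b"], "value": [1, 2]})

# Default orient ("columns"): {column name -> {row index -> value}}
as_columns = df.to_json()

# orient="records": a JSON array with one object per row
as_records = df.to_json(orient="records")

# Both results are plain Python strings, so they can be parsed back with json
print(json.loads(as_columns))
print(json.loads(as_records))
```

Since the result is an ordinary string, it can be stored in a dataframe cell or written to a file like any other text.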
-
Hi Alex,
I tried it. However, it does not work in a Python recipe because you have to write a dataset at the end.
Do you have examples with more details about how that works?
Thanks
-
I think I've misunderstood what you're trying to do. What's the end goal for the JSON? Do you want it stored in a cell of a downstream Dataiku dataset, written to an external file, or something else?
-
Hi Alex,
Ideally, I want both. First, I want to write it to a cell in the Dataiku flow. At some point in the future, I might also need to write it to an external file.
Thanks,
Doris
-
Cool; so, the following code could be used in a Python recipe to read a Dataiku dataset, convert it to JSON, write the JSON back to a Dataiku dataset as a single cell, and also write it out to a file. Change "input_dataset" to the name of the recipe's input Dataiku dataset, "output_dataset" to the name of its output dataset, and "output_file" to the filesystem path where you want the JSON written.
# -*- coding: utf-8 -*-
import dataiku
import pandas as pd

# Recipe input: read the Dataiku dataset into a pandas DataFrame
input_dataset = dataiku.Dataset("input_dataset")
input_df = input_dataset.get_dataframe()

# Convert the whole DataFrame to a single JSON string
input_json = input_df.to_json()

# Wrap the JSON string in a one-row, one-column DataFrame
input_json_df = pd.DataFrame(data=[input_json], columns=["json"])

# Recipe output: write the new DataFrame back to a Dataiku dataset
output_dataset = dataiku.Dataset("output_dataset")
output_dataset.write_with_schema(input_json_df)

# Also write the JSON to an external file; the with block closes the file
with open("output_file", "w") as f:
    f.write(input_json)
-
Thanks!! It works:)