I have some JSON files in a managed folder, each with a different schema. I'm trying to convert them into CSV files and upload those to an S3 bucket.
The way I thought this out: once the files are extracted to the managed folder, I'd run a Python script to extract the results from each file, convert them into a dataset, and have the dataset synced to the S3 bucket.
Another way is to create a Python probe on the managed folder, so that whenever new files come in, they are converted into CSV and stored in the folder.
I'm new to Dataiku. What's the best way to sync the managed folder to the bucket, along with doing the conversion from JSON to CSV?
Answering my own question: keep the multiple files in folders. Avoid blowing up the Flow with hundreds of datasets; that never goes well.
Create a Python probe that converts each JSON object to a DataFrame, and the DataFrame to a dataset.
Look up managed folders in the Dataiku documentation to see how to write files back into folders.
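A minimal sketch of the conversion step, assuming pandas is available. The `json_to_csv` helper name is my own, and the commented-out section shows roughly how you would wrap it with Dataiku's managed folder API (`dataiku.Folder`, `get_download_stream`, `upload_data`; check the managed folders documentation for exact usage):

```python
import io
import json

import pandas as pd

def json_to_csv(json_text):
    """Flatten one JSON document (an object or a list of objects) into CSV text.

    Conversion happens per file, so each file's differing schema simply
    produces a different set of CSV columns.
    """
    records = json.loads(json_text)
    # json_normalize accepts either a single dict or a list of dicts and
    # flattens nested fields into dotted column names (e.g. "b.c").
    df = pd.json_normalize(records)
    buf = io.StringIO()
    df.to_csv(buf, index=False)
    return buf.getvalue()

# Inside a Dataiku Python recipe/probe you would wrap this with the
# managed folder API, roughly like this (folder id is hypothetical):
#
#   import dataiku
#   folder = dataiku.Folder("my_folder_id")
#   for path in folder.list_paths_in_partition():
#       if not path.endswith(".json"):
#           continue
#       with folder.get_download_stream(path) as f:
#           csv_text = json_to_csv(f.read().decode("utf-8"))
#       folder.upload_data(path.replace(".json", ".csv"),
#                          csv_text.encode("utf-8"))
```

For example, `json_to_csv('[{"a": 1, "b": {"c": 2}}]')` produces a CSV with columns `a` and `b.c`. Syncing the folder to S3 can then be done with an S3-backed managed folder or a Sync step, depending on your setup.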
Thank you for sharing your solution @suhail-bari!