Added on June 18, 2021 1:19PM
Hello,
The doc about shell recipes is quite light.
Could someone please help me? I would like to read data from an input dataset and write data to an output dataset.
Is it possible to do that if the datasets are stored in Google Cloud Storage?
Many thanks for your help.
C.
Hi,
First, it would be good to understand whether you really need to resort to a shell recipe to achieve what you are trying to do. A Python recipe would offer much more flexibility, especially when dealing with datasets.
With a shell recipe you can read from and write to remote datasets (S3, Google Cloud Storage, etc.), but you cannot read or write directly from remote managed folders. For managed folders, you should use a Python recipe instead.
The relevant section of the existing documentation is this:
https://doc.dataiku.com/dss/latest/code_recipes/shell.html#piping-a-dataset-in-and-out
In the example below, I am reading a TSV file from S3 and writing all of the lines back to another S3 dataset.
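For anyone reading along, here is a minimal sketch of what such a pipe-through script could look like. It assumes the recipe is configured to pipe the input dataset in and the output dataset out, as described in the documentation section linked above, so that rows arrive on standard input as tab-separated text and whatever is printed to standard output becomes the output dataset. The function name `copy_rows` is only illustrative and is not part of DSS.

```shell
#!/bin/sh
# Assumption: the shell recipe streams the input dataset to stdin
# and captures stdout as the output dataset (pipe in / pipe out).

# Hypothetical helper: pass every row through unchanged.
copy_rows() {
  cat
}

# In the recipe itself, the last line of the script would simply be:
#   copy_rows
# Replace `cat` with awk, sed, cut, etc. to transform rows on the way.
```

Because the script only touches stdin and stdout, it should not matter whether the datasets live on S3 or Google Cloud Storage: the connection is handled by DSS, not by the script.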
Let me know if this helps.
Hi @AlexT,
I like to use shell recipes when I just need to perform simple actions on files.
In this case, I only needed to create a simple txt file from a dataset.
Your sample code is perfect and allowed me to understand how to read and write.
Many thanks,
C.