Read / write datasets in shell recipe

Solved!
Chiktika
Level 3

Hello,

The doc about shell recipes is quite light.
Could someone help me, please? I would like to read data from an input dataset and write data to an output dataset.

Is it possible to do that if the datasets are stored in Google Cloud Storage?

Many thanks for your help.

C.

 

1 Solution
AlexT
Dataiker

Hi,

First, it would be good to understand whether you really need to resort to a shell recipe to achieve what you are trying to do. A Python recipe would offer much more flexibility, especially when dealing with datasets.

With a shell recipe, you can read from and write to remote datasets (S3, GCP, etc.), but you cannot work directly with remote managed folders. For managed folders, you should use a Python recipe instead.

The relevant section of the existing documentation is this:

https://doc.dataiku.com/dss/latest/code_recipes/shell.html#piping-a-dataset-in-and-out

In the example below, I am reading a TSV file from S3 and writing all of the lines back to another S3 dataset.

Screenshot 2021-06-18 at 18.06.13.png
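
Since the screenshot does not carry over as text, here is a minimal sketch of what such a pass-through recipe could look like. It assumes the recipe's "pipe in" and "pipe out" options are enabled for the input and output datasets, as described in the documentation section linked above, so that the input rows arrive on standard input and whatever the script prints to standard output is loaded into the output dataset. The function name `copy_rows` is purely illustrative and is not taken from the original screenshot.

```shell
#!/bin/sh
# Sketch of a pass-through shell recipe.
# Assumption: the input dataset is piped in on stdin and stdout is piped
# out to the output dataset (both options enabled in the recipe settings).
# copy_rows is a hypothetical helper name, not part of DSS.
copy_rows() {
    # Read every line from standard input and write it unchanged
    # to standard output.
    cat
}

# In the recipe itself, the last line would call the function:
#   copy_rows
```

To transform the rows rather than copy them, `cat` could be swapped for a standard filter such as `grep` or `awk`, as long as the script still writes tab-separated lines to standard output.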

Let me know if this helps

Chiktika
Level 3
Author

Hi @AlexT 

I like to use shell recipes when I just need to perform simple actions on files.

In this case, I only needed to create a simple txt file from a dataset.

Your sample code is perfect and allowed me to understand how to read and write.

Many thanks

C.
