Read files in managed folders with shell

Klajdi
Klajdi Registered Posts: 3

Hi, can someone help me, please?

Given an input folder and an output folder, I want to link them with a shell script (.sh) that reads a test.txt file from the input folder and writes an output.txt file to the output folder. However, when I use the Dataiku variables, it doesn't work.

Here is an example where I write something to an output.txt file in the output folder using the variable:

#!/bin/bash

output_file="$DKU_OUTPUT_0_FOLDER_ID/output.txt"

echo "test_string" > "$output_file"
echo "Created file: $output_file"

I can't find any example in the platform…

Answers

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    Why do you need to use folders for input and output and shell script for the recipe? What is the actual requirement here?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    In any case, your issue is that you should be using $DKU_OUTPUT_0_FOLDER_PATH, not $DKU_OUTPUT_0_FOLDER_ID. But I have serious doubts about this design, so it would be best to understand what you are trying to do, as there might be better ways of achieving it.
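
A minimal sketch of the original script with the variable swapped as suggested in this reply. It assumes the output folder is stored on the local filesystem of the DSS server, so that $DKU_OUTPUT_0_FOLDER_PATH points to a real directory; it would not help for a folder backed by remote storage. The mktemp fallback is not part of Dataiku and is only there so the sketch can be run outside a recipe.

```shell
#!/bin/bash

# Assumption: DKU_OUTPUT_0_FOLDER_PATH is set by the shell recipe and points
# to a local directory. The mktemp fallback is hypothetical, for running
# this sketch outside DSS only.
output_dir="${DKU_OUTPUT_0_FOLDER_PATH:-$(mktemp -d)}"
output_file="$output_dir/output.txt"

# Write the test content and report where it went.
echo "test_string" > "$output_file"
echo "Created file: $output_file"
```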

  • Klajdi
    Klajdi Registered Posts: 3

    Hi, I just want to create a file in a managed folder from a shell script, but it seems I can't do that because the files are stored in Amazon S3…

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    There are many gotchas when using managed folders. It would be best if you stated your requirement and why you think you need to use managed folders and shell scripts since, as I said, there might be better ways of achieving the same goal. There are also ways around your S3 problem.

  • Klajdi
    Klajdi Registered Posts: 3

    Hi Turribeach, thanks for your help. I needed to use shell due to code restrictions at my company (I could have easily solved this with Python…). For everyone with the same problem, here is a solution I wrote that uses the shell recipe to copy files from one folder into another (since the files are on Amazon S3, this is the only way I found):

    # Get the file from the Dataiku input folder

    input_folder_id="$DKU_INPUT_0_FOLDER_ID"
    output_file="output.txt"
    input_filename="input.txt"
    api_url="https://<your_node>:443/public/api"
    api_key="xxxx"
    project_key="$DKU_CUSTOM_VARIABLES_projectKey"
    output_folder_id="$DKU_OUTPUT_0_FOLDER_ID"

    curl -X GET "$api_url/projects/$project_key/managedfolders/$input_folder_id/contents/$input_filename" -H "Authorization: Bearer $api_key" -o "$input_filename"

    # Placeholder for the real processing: read $input_filename and write the result to $output_file
    echo "get data with code from input_filename and write in output_file" >> "$output_file"

    # Write the output file to the Dataiku output folder (uploaded as multipart form data)

    curl -X POST "$api_url/projects/$project_key/managedfolders/$output_folder_id/contents/$output_file" -H "Authorization: Bearer $api_key" -F "file=@$output_file;type=text/csv"
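
The download and upload calls above could be wrapped in small helpers so that HTTP errors stop the recipe instead of passing silently. This is only a sketch under assumptions: the function names and the DSS_API_URL and DSS_API_KEY environment variables are hypothetical (they are not provided by Dataiku), the endpoint layout and the Bearer header are copied from the post rather than checked against the API reference, and $DKU_CUSTOM_VARIABLES_projectKey is used as in the post.

```shell
#!/bin/bash

# Build the contents URL for a file in a managed folder.
# $1 = folder id, $2 = file name inside the folder.
# DSS_API_URL is a hypothetical variable, e.g. https://<your_node>:443/public/api
folder_file_url() {
    printf '%s/projects/%s/managedfolders/%s/contents/%s' \
        "$DSS_API_URL" "$DKU_CUSTOM_VARIABLES_projectKey" "$1" "$2"
}

# Download a file from a folder to a local path.
# $1 = folder id, $2 = file name in the folder, $3 = local destination.
# --fail makes curl exit non-zero on HTTP error responses.
download_from_folder() {
    curl --fail --silent --show-error -X GET "$(folder_file_url "$1" "$2")" \
        -H "Authorization: Bearer $DSS_API_KEY" -o "$3"
}

# Upload a local file to a folder as multipart form data.
# $1 = folder id, $2 = local file, $3 = file name in the folder.
upload_to_folder() {
    curl --fail --silent --show-error -X POST "$(folder_file_url "$1" "$3")" \
        -H "Authorization: Bearer $DSS_API_KEY" -F "file=@$2"
}
```

In a recipe this might be called as `download_from_folder "$DKU_INPUT_0_FOLDER_ID" input.txt input.txt || exit 1`, followed by the processing step and `upload_to_folder "$DKU_OUTPUT_0_FOLDER_ID" output.txt output.txt || exit 1`. Reading the key from an environment variable keeps it out of the script text.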

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron

    Sorry, but this is a horrible solution. I am sure there are much better alternatives. Why can't you use a standard Sync recipe to copy the files between folders? Why are you not allowed to use Python? Dataiku without access to Python is pretty much useless. This doesn't sound right; there are aspects that you are not really clarifying. Can you explain the requirement in detail?
