
Clearing a managed folder

Solved!
vishet

Hi, 

How would I go about clearing the contents of a managed folder in a recipe?

 

Best.

Ignacio_Toledo

Hi @vishet,

I think it depends on your particular use case:

  1. If you want to clear the contents of a folder automatically as one of the steps of a scenario, you can set up a "Clear" step in the scenario, as shown in the next screenshot:
    Selection_007.png
  2. If your folder is populated by a "Download" recipe, you can check the options shown here:
    Selection_008.png

    This way you can be sure that every time you refresh the contents of the managed folder, all the data will be downloaded again and any other extra files will be erased.

  3. If you are using a code recipe, it depends somewhat on the programming language. With Python and the Dataiku API, it would be something like this:

 

import dataiku
import os
import shutil

# get_path() only works for managed folders stored on the local filesystem of the DSS server
path_to_folder = dataiku.Folder('foldername').get_path()

# Adapted from a Kite example

for filename in os.listdir(path_to_folder):
    file_path = os.path.join(path_to_folder, filename)
    # Regular files and symlinks are removed directly
    if os.path.isfile(file_path) or os.path.islink(file_path):
        print("deleting file:", file_path)
        os.unlink(file_path)
    # Subfolders are removed together with everything inside them
    elif os.path.isdir(file_path):
        print("deleting folder:", file_path)
        shutil.rmtree(file_path)
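
One caveat with the snippet above: as far as I know, `get_path()` is only available when the managed folder lives on the local filesystem of the DSS server. If your folder is on another kind of connection, a possible alternative is to go through the folder handle itself. The following is only a sketch: the helper name `clear_managed_folder` and the `dry_run` flag are my own (hypothetical), and it assumes the handle exposes `list_paths_in_partition()` and `delete_path()`, as `dataiku.Folder` does in the versions I have used. Please check it against your DSS version before relying on it.

```python
def clear_managed_folder(folder, dry_run=False):
    """Delete every file listed in a managed folder, going through its handle.

    `folder` is assumed to behave like dataiku.Folder, i.e. to expose
    list_paths_in_partition() and delete_path(path).
    Returns the list of paths that were (or, with dry_run, would be) deleted.
    """
    # Take a snapshot of the listing first, so we do not iterate while deleting
    paths = list(folder.list_paths_in_partition())
    for path in paths:
        if dry_run:
            print("would delete:", path)
        else:
            print("deleting:", path)
            folder.delete_path(path)
    return paths


# Inside a DSS Python recipe (assumption: the folder is named 'foldername'):
#   import dataiku
#   clear_managed_folder(dataiku.Folder('foldername'), dry_run=True)  # check first
#   clear_managed_folder(dataiku.Folder('foldername'))                # then delete
```

Running it once with `dry_run=True` lets you look at what would be removed before anything is actually deleted.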

​

 

Hope one of these solutions helps!

I.
