Is it possible to bulk delete projects in a folder in the UI?

kathyqingyuxu
kathyqingyuxu Neuron, Registered, Neuron 2022 Posts: 46 Neuron

Hey Dataiku community,

Hope all is well!

We have a use case where we save the app instances of our application-as-recipe apps in order to test running the application and keep the most recent run for reference. However, we would now like to delete the old application-as-recipe app instances and keep only the most recently run instance in the App Instance folder. Is it possible to bulk delete these projects, or do we have to go in and delete them one at a time? Is there any way to leverage the DSS APIs to automatically delete these extraneous projects after, let's say, a month of inactivity?

Any suggestions are helpful, not sure if others have a similar use case, thanks!

Best,

Kathy

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
    edited July 17

    Hi @kathyqingyuxu,

    There isn't currently a way to bulk delete projects in the UI. At the moment there is also no way to retrieve a project's last modified date from the API.

    If you can read the directory structure for projects, you could delete from a project folder based on the last time the project was modified on the filesystem, with something like this:

    import os
    from datetime import datetime, timedelta

    import dataiku

    # DSS data directory; project configs live under config/projects
    datadir = os.environ['DIP_HOME']

    # get the root project folder
    client = dataiku.api_client()
    root_folder = client.get_root_project_folder()

    # anything last modified before this cutoff is deleted
    older_than_30_days = datetime.now() - timedelta(days=30)

    # go through all of the projects in the root project folder
    for project_key in root_folder.list_project_keys():
        project_location = os.path.join(datadir, 'config', 'projects', project_key)
        # use the project directory's modification time as the last modified date
        last_modified = datetime.fromtimestamp(os.path.getmtime(project_location))
        if last_modified < older_than_30_days:
            # get and delete the project
            project = client.get_project(project_key)
            # goodbye
            res = project.delete()

    Alternatively, if your naming convention for the projects is consistent, you may be able to simply delete based on the project name (for example, if projects are named along the lines of project v1, project v2, project v3).
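
As a rough sketch of the name-based approach: the helper below picks every versioned project key except the highest version of each base name. The `<BASE>_V<number>` convention and the `keys_to_delete` helper are hypothetical, chosen only for illustration; adapt the pattern to whatever convention your projects actually follow.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: "<BASE>_V<number>", e.g. "MYAPP_V3".
VERSIONED_KEY = re.compile(r"^(?P<base>.+)_V(?P<version>\d+)$")


def keys_to_delete(project_keys):
    """Return every versioned key except the highest version of each base name."""
    versions = defaultdict(list)
    for key in project_keys:
        match = VERSIONED_KEY.match(key)
        if match:  # keys that do not follow the convention are never selected
            versions[match.group("base")].append((int(match.group("version")), key))

    stale = []
    for entries in versions.values():
        entries.sort()  # numeric sort, so V10 ranks above V2
        stale.extend(key for _, key in entries[:-1])  # keep the highest version
    return sorted(stale)


# Inside DSS, the input could come from root_folder.list_project_keys().
print(keys_to_delete(["MYAPP_V1", "MYAPP_V2", "MYAPP_V10", "OTHER"]))
```

Each returned key could then be passed to `client.get_project(key).delete()` as in the snippet above, ideally after printing the list and checking it by hand.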

    I hope that helps,
    Sarina

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 317 Dataiker
    edited July 17

    Hi @kathyqingyuxu,

    To update: starting in DSS 9.0.2, a project's last modification date is also available from the get_timeline() method (the method is not yet documented). Here is an example given the original use case:

    from datetime import datetime, timedelta

    import dataiku

    # get the root project folder
    client = dataiku.api_client()
    root_folder = client.get_root_project_folder()

    # anything last modified before this cutoff is deleted
    older_than_30_days = datetime.now() - timedelta(days=30)

    # go through all of the projects in the root project folder
    for project_key in root_folder.list_project_keys():
        # get project
        project = client.get_project(project_key)
        # lastModifiedOn is divided by 1000 to convert milliseconds to seconds
        last_modified = project.get_timeline()['lastModifiedOn']
        formatted = datetime.fromtimestamp(last_modified / 1000)
        if formatted < older_than_30_days:
            # goodbye
            res = project.delete()
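
Since deletion cannot be undone from this script, it may be worth separating "decide what is stale" from "delete it". The sketch below is one way to do that, assuming you first collect the `lastModifiedOn` values into a dict; the `stale_keys` helper and its parameters are hypothetical names, not part of the DSS API.

```python
from datetime import datetime, timedelta


def stale_keys(last_modified_ms, now=None, max_age_days=30):
    """Return the keys whose last modification is older than max_age_days.

    last_modified_ms maps project key -> timestamp in milliseconds,
    e.g. built from project.get_timeline()['lastModifiedOn'].
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=max_age_days)
    return sorted(
        key
        for key, ms in last_modified_ms.items()
        if datetime.fromtimestamp(ms / 1000) < cutoff
    )


# Possible dry run inside DSS (not executed here):
#   timestamps = {k: client.get_project(k).get_timeline()['lastModifiedOn']
#                 for k in root_folder.list_project_keys()}
#   print(stale_keys(timestamps))   # review before calling project.delete()
```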

    Thanks,
    Sarina

  • cinderUARK
    cinderUARK Dataiku DSS Core Designer, Registered, Frontrunner 2022 Participant Posts: 8 ✭✭✭✭

    Sarina,

    Should the code be run on DSS or Linux?

    Thanks.

  • NickPedersen
    NickPedersen Registered, Frontrunner 2022 Finalist, Frontrunner 2022 Participant Posts: 6 ✭✭✭✭

    @SarinaS how is this code executed? Is it in the DSS UI somehow?

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,063 Neuron

    I would suggest you post a new thread when you have a different question than the original poster. But in simple terms, you can execute this code anywhere you want. It could be a Jupyter notebook, a Python recipe, a Dataiku Plugin (which you will need to code), a Dataiku Scenario Script, a Dataiku Scenario Script Step, a Python script on your DSS server or even a Python script running somewhere else. Each of these options may require different authentication.
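
To illustrate the authentication point: inside DSS (notebook, recipe, scenario) the snippets above get a client from `dataiku.api_client()`, while an external script typically builds one with `dataikuapi.DSSClient(host, api_key)`. The sketch below assumes the host URL and API key are supplied through two environment variables, `DSS_HOST` and `DSS_API_KEY`; those variable names and both helper functions are hypothetical.

```python
import os


def remote_settings(env=os.environ):
    """Read the (hypothetical) DSS_HOST and DSS_API_KEY settings from env."""
    missing = [name for name in ("DSS_HOST", "DSS_API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return env["DSS_HOST"], env["DSS_API_KEY"]


def get_client(env=os.environ):
    """Return an API client, whether the code runs inside or outside DSS."""
    try:
        import dataiku  # usually available only inside DSS
    except ImportError:
        import dataikuapi  # external client package (dataiku-api-client on PyPI)

        host, api_key = remote_settings(env)
        return dataikuapi.DSSClient(host, api_key)
    return dataiku.api_client()
```

With this in place, `client = get_client()` could replace the `dataiku.api_client()` call in the earlier snippets. Note that the first snippet also reads `DIP_HOME` and the project directories on disk, so it only makes sense on the DSS server itself.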

  • NickPedersen
    NickPedersen Registered, Frontrunner 2022 Finalist, Frontrunner 2022 Participant Posts: 6 ✭✭✭✭

    Thank you. Just to clarify, this code will delete projects in the root project folder that have not been modified in 30 days, but not projects that sit in other user-created folders?

  • NickPedersen
    NickPedersen Registered, Frontrunner 2022 Finalist, Frontrunner 2022 Participant Posts: 6 ✭✭✭✭

    And a follow-up question: can I place projects in a folder I create from the UI and delete them in the recipe? I am trying to delete abandoned projects on our DSS dev instance. So could I contain them in a folder called e.g. Abandoned projects and then delete them in the code? Just so I can isolate the abandoned ones from the non-abandoned ones in the root folder.
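
A possible sketch for both folder questions, assuming `list_project_keys()` returns only the projects stored directly in the folder it is called on, and that subfolders are reachable through `list_child_folders()` and `get_name()` on the project folder object. The two helpers are hypothetical, and the folder name "Abandoned projects" is taken from the question above.

```python
def find_child_folder(parent, name):
    """Return the direct child folder of `parent` called `name`, or None."""
    for child in parent.list_child_folders():
        if child.get_name() == name:
            return child
    return None


def collect_project_keys(folder, recursive=False):
    """List project keys in `folder`, optionally descending into subfolders."""
    keys = list(folder.list_project_keys())
    if recursive:
        for child in folder.list_child_folders():
            keys.extend(collect_project_keys(child, recursive=True))
    return keys


# Possible usage inside DSS (not executed here):
#   root_folder = client.get_root_project_folder()
#   abandoned = find_child_folder(root_folder, "Abandoned projects")
#   if abandoned is not None:
#       for key in collect_project_keys(abandoned):
#           client.get_project(key).delete()
```

Limiting the loop to one dedicated folder keeps the projects in the root folder out of reach of the delete call; the 30-day check from the earlier snippets can still be applied inside the loop if wanted.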
