Running a Python recipe on a GCP Cloud Storage folder that is empty

Mario_Burbano Registered Posts: 12 ✭✭✭✭✭

Hi,

I have a Python recipe that points to a folder on a GCP Cloud Storage connection. The recipe generally works well, but when the folder is empty (I clean it periodically), the recipe fails at the validation step:

Validation failed: Failed to compute recipe status: Folder doesn't exist

I have checked in the GCP console and can confirm that the folder itself was removed, when all I wanted was to purge the files inside it. The action I performed was a "Clear" on the folder.

I would like to be able to run this recipe without this error, either by clearing the contents of the folder without deleting it or by skipping the validation that DSS performs before the recipe runs. I tried overriding the params.skipPrerunValidate recipe variable and setting it to true, but this does not seem to work. Does anyone have any other ideas?

Cheers,

Answers

  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 315 Dataiker

    Hi @Mario_Burbano,

    Could you confirm that you are writing to a managed folder that points to GCS? If so, clicking “Clear” on a managed folder in the UI removes the contents of the managed folder, including the “prefix” itself.
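
    For illustration only, here is a minimal sketch of one way to purge files from code instead of using “Clear”. It assumes the prefix on GCS only exists while at least one object lives under it, so it leaves a marker file behind. The helper name purge_but_keep_prefix, the "/.keep" marker name, and the InMemoryFolder stand-in are hypothetical and not part of DSS; the sketch relies only on the list_paths_in_partition(), upload_data() and delete_path() methods of dataiku.Folder. Whether this avoids the validation error has not been confirmed in this thread.

```python
PLACEHOLDER = "/.keep"  # hypothetical marker file name, not a DSS convention


def purge_but_keep_prefix(folder, placeholder=PLACEHOLDER):
    """Delete every file in a managed folder except a placeholder object.

    `folder` is expected to behave like dataiku.Folder, i.e. to offer
    list_paths_in_partition(), upload_data() and delete_path().
    """
    paths = folder.list_paths_in_partition()
    if placeholder not in paths:
        # Write the marker first so the prefix never becomes empty.
        folder.upload_data(placeholder, b"")
    deleted = []
    for path in paths:
        if path != placeholder:
            folder.delete_path(path)
            deleted.append(path)
    return deleted


class InMemoryFolder:
    """Stand-in for dataiku.Folder so the sketch runs outside DSS."""

    def __init__(self, files):
        self.files = dict(files)

    def list_paths_in_partition(self):
        return sorted(self.files)

    def upload_data(self, path, data):
        self.files[path] = data

    def delete_path(self, path):
        del self.files[path]


demo = InMemoryFolder({"/a.csv": b"1", "/sub/b.csv": b"2"})
deleted = purge_but_keep_prefix(demo)
print(deleted)
print(demo.list_paths_in_partition())
```

    Inside DSS, the same helper would be called as purge_but_keep_prefix(dataiku.Folder("FOLDERID")) after import dataiku, for example from a scenario step or a separate cleanup recipe.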

    However, I would expect the prefix to get re-created when you run your recipe. If you are able to reproduce the error, would you mind attaching the following afterwards?

    • your Python code
    • a screenshot of the error
    • a screenshot of your folder settings (like below)
    • a screenshot of the Partitioning settings for the managed folder

    Screen Shot 2021-03-17 at 12.23.33 PM.png

    Then we can see what options might be available to avoid the issue you are facing.

    Thanks,

    Sarina 

  • Mario_Burbano
    Mario_Burbano Registered Posts: 12 ✭✭✭✭✭

    Hello @SarinaS,

    Thanks for your reply. Please find attached the various elements that you requested. The source code is just something like this:

    import dataiku
    folder = dataiku.Folder("FOLDERID")

    Regards,
  • Sarina
    Sarina Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer Posts: 315 Dataiker

    Hi @Mario_Burbano,

    Thank you for attaching this information! I haven't been able to reproduce the scenario you are running into. Would you mind opening a support ticket and attaching a job diagnostic from the failing job run? You can get the diagnostic from the job page by clicking Actions > Download job diagnosis. That should make this easier to look into.

    Thanks,

    Sarina
