Using Clear step in scenario for HDFS output dataset

farhanromli
farhanromli Registered Posts: 25 ✭✭✭✭

May I know whether the 'Clear' step in a scenario works the same way as deleting the dataset directly, i.e. will it drop the table and its metastore entry?


Operating system used: Windows 10

Best Answer

  • Shashank
    Shashank Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 28 Dataiker
    Answer ✓

    That's correct. The Clear step on a dataset runs a DROP TABLE command.

    To test it, create a scenario, add a Clear step on any dataset, and run it. Then go to Last Runs and check the step logs, which show the statement submitted for the step.
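
For long step logs, a small script can help with the search. This is only a minimal sketch: it assumes the log text has been copied or saved locally, and the sample log lines and the name `find_drop_statements` are made up for illustration. Actual DSS log lines may be worded differently.

```python
import re

# Matches statements such as: DROP TABLE IF EXISTS `db`.`my_table`
# The match stops at the end of the line or at a semicolon.
DROP_TABLE_RE = re.compile(r"\bDROP\s+TABLE\b[^;\n]*", re.IGNORECASE)


def find_drop_statements(log_text):
    """Return every DROP TABLE statement found in the given log text."""
    return [m.group(0).strip() for m in DROP_TABLE_RE.finditer(log_text)]


if __name__ == "__main__":
    # Hypothetical log excerpt, invented for this example only.
    sample_log = (
        "[INFO] Clearing dataset my_output\n"
        "[INFO] Executing statement: DROP TABLE IF EXISTS `mydb`.`my_output`\n"
        "[INFO] Done\n"
    )
    for statement in find_drop_statements(sample_log):
        print(statement)
```

An empty result means no DROP TABLE statement was found in that text, which can also happen when the step ran against a dataset whose table no longer existed.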

Answers

  • farhanromli
    farhanromli Registered Posts: 25 ✭✭✭✭

    I have tried it out.

    I do not find an explicit DROP TABLE command in the step. Is that expected?

  • Shashank
    Shashank Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 28 Dataiker

    The other way is to check in the underlying database whether the table has been dropped. I am not sure which DSS version you're on or which database it is, but I can confirm that on v10 and above the DROP TABLE statement appears in the logs.
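
Checking the underlying database could look something like the sketch below. It assumes a Hive-compatible SQL engine and a Python DB-API cursor obtained from whichever client library you use. The function name `table_exists` and the database and table names are hypothetical, and the `SHOW TABLES IN ... LIKE ...` form should be verified against your engine before relying on it.

```python
import re

# Only plain identifiers are accepted, because the names are interpolated into SQL.
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def table_exists(cursor, database, table):
    """Return True if `table` is listed in `database`, using a DB-API cursor."""
    for name in (database, table):
        if not _IDENTIFIER_RE.fullmatch(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    # Assumption: the engine accepts Hive-style SHOW TABLES with a LIKE pattern.
    cursor.execute(f"SHOW TABLES IN {database} LIKE '{table}'")
    return len(cursor.fetchall()) > 0
```

If the Clear step dropped the table, a call such as `table_exists(cursor, "mydb", "my_output")` would be expected to return False afterwards.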

  • farhanromli
    farhanromli Registered Posts: 25 ✭✭✭✭

    I am using v8, which is probably why. For now I am confirming it by querying the table from a notebook, which returns an error saying the table does not exist.

    Thanks for the help

  • farhanromli
    farhanromli Registered Posts: 25 ✭✭✭✭

    Update:

    I can actually see the DROP command. I did not see it last time because I was running the step on an already-deleted dataset. My bad.
