Using Clear step in scenario for HDFS output dataset
May I know if the 'Clear' step in a scenario works the same as directly deleting the dataset, i.e. will it drop the table and its metastore entry?
Operating system used: Windows 10
Best Answer
-
Shashank Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 27 Dataiker
That's correct. The Clear step on a dataset runs a DROP TABLE command.
To test it, create a scenario, add a Clear step on any dataset, run it, then go to Last Runs and check the step logs. You can see the statement submitted for the step in the logs.
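If the step log is long, a small script can search it for the statement. The following is a minimal sketch, assuming you have copied the step log into a string; the function name `find_drop_statements` and the sample log lines are hypothetical and do not reproduce actual DSS output.

```python
import re

# Matches "DROP TABLE", an optional "IF EXISTS", then the table name.
DROP_TABLE_RE = re.compile(
    r'\bDROP\s+TABLE\s+(?:IF\s+EXISTS\s+)?([`"\w.]+)', re.IGNORECASE
)

def find_drop_statements(log_text):
    """Return the table names that appear in DROP TABLE statements."""
    names = []
    for match in DROP_TABLE_RE.finditer(log_text):
        # Strip backticks and double quotes used to quote identifiers.
        names.append(match.group(1).replace("`", "").replace('"', ""))
    return names

# Hypothetical log excerpt for illustration only (not real DSS output).
sample_log = """
[INFO] Clearing dataset my_output
[INFO] Executing statement: DROP TABLE IF EXISTS mydb.my_output
[INFO] Done
"""

dropped = find_drop_statements(sample_log)
print(dropped)
```

An empty result does not necessarily mean the Clear step behaves differently; it can also mean there was nothing left to drop when the step ran.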
Answers
-
I have tried it out.
I do not find an explicit DROP TABLE command in the step logs. Is that expected?
-
Shashank Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 27 Dataiker
The other way is to check in the underlying database whether the table has been dropped. I am not sure which DSS version you're on and which database it is, but I can confirm that on v10 and above I can see the DROP TABLE statement in the logs.
-
I am using v8, which is probably why. At the moment I am confirming it by querying the table from a notebook, which returns an error saying the table does not exist.
Thanks for the help.
Update:
I can actually see the DROP command. The reason I did not see it last time is that I was executing the step on an already-deleted dataset. My bad.
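The notebook check mentioned earlier in the thread (query the table and see whether it errors) can be sketched as follows. This is only an illustration: it uses Python's built-in `sqlite3` as a stand-in for the real Hive or Impala connection, and the helper name `table_is_queryable` and the table `my_output` are hypothetical. With a different driver, the exception type and error message will differ.

```python
import sqlite3

def table_is_queryable(conn, table_name):
    """Return True if a trivial SELECT against table_name succeeds."""
    try:
        conn.execute(f'SELECT 1 FROM "{table_name}" LIMIT 1')
        return True
    except sqlite3.OperationalError:
        # sqlite3 raises OperationalError ("no such table") for a missing table.
        return False

# Stand-in database: create a table, check it, drop it, check again.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_output (id INTEGER)")
before = table_is_queryable(conn, "my_output")

conn.execute("DROP TABLE my_output")
after = table_is_queryable(conn, "my_output")

print(before, after)
```

Running the check both before and after the Clear step avoids the situation described in the update, where the table was already gone before the step ran.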