Mark a dataset as built programmatically
Hi,
How can I mark a dataset as built programmatically?
For instance, I need to create a managed MySQL dataset, write some data to it, and mark it programmatically in the flow as "built". Is there any dataset parameter that a user can modify through the Python API?
    import dataiku

    # Create an empty managed dataset.
    client = dataiku.api_client()
    project = client.get_project(dataiku.default_project_key())
    builder = project.new_managed_dataset_creation_helper("mydatasetname")
    builder.with_store_into("mysqlconnection")
    dataset = builder.create()

    # Compute the dataset output as a Pandas dataframe.
    dataframe = myfunction(inputs)

    # Write the output to the dataset (dataiku.Dataset takes the dataset name).
    output_handler = dataiku.Dataset("mydatasetname")
    output_handler.write_with_schema(dataframe)

    ### How to change the dataset property in GUI flow to 'built'?
Answers
-
CoreyS
Hi @Marek! Can you provide any further details on the thread to help users find a solution (for example, your DSS version)? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.
-
Hi Marek.
I don't know if this will serve your purpose, but you can mark datasets as "explicit build only".
See dataset > Settings > Advanced > Rebuild behaviour.
Regards
Mark
-
Hi Mark,
Thank you for your response. Actually, my question is not about setting the dataset marking through the GUI but about doing it programmatically, i.e. how to change the marking from within a Python script.
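For reference, the "rebuild behaviour" setting suggested earlier in the thread can probably be changed from Python as well. The following is only a minimal sketch under assumptions: it assumes the dataset's raw settings store this option under `flowOptions` / `rebuildBehavior` with a value such as `"EXPLICIT"`, which should be checked against the raw settings of a dataset in your own DSS version. The helper names `set_rebuild_behavior` and `apply_to_dataset` are hypothetical. Note that this changes how the flow rebuilds the dataset; it is not a dedicated "built" flag.

```python
def set_rebuild_behavior(raw_settings, behavior="EXPLICIT"):
    """Set the rebuild behaviour in a dataset's raw settings dict.

    Assumption: the option lives under raw["flowOptions"]["rebuildBehavior"].
    """
    flow_options = raw_settings.setdefault("flowOptions", {})
    flow_options["rebuildBehavior"] = behavior
    return raw_settings


def apply_to_dataset(dataset_name, behavior="EXPLICIT"):
    """Apply the setting to a dataset of the current project (runs inside DSS only)."""
    import dataiku  # imported here because it is only available inside DSS

    client = dataiku.api_client()
    project = client.get_project(dataiku.default_project_key())
    settings = project.get_dataset(dataset_name).get_settings()
    set_rebuild_behavior(settings.get_raw(), behavior)
    settings.save()  # persist the modified settings
```

Inside DSS, this would be called as `apply_to_dataset("mydatasetname")` after the data has been written.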