Programmatically mark a dataset as built
Hi,
How can I programmatically mark a dataset as built?
For instance, I need to create a managed MySQL dataset, write some data to it, and programmatically mark it as "built" in the flow. Is there any dataset parameter that a user can modify through the Python API?
import dataiku

# Create empty dataset.
client = dataiku.api_client()
project = client.get_project(dataiku.default_project_key())
builder = project.new_managed_dataset_creation_helper("mydatasetname")
builder.with_store_into("mysqlconnection")
dataset = builder.create()
# Compute dataset output as Pandas dataframe.
dataframe = myfunction(inputs)
# Write output to the dataset (dataiku.Dataset takes the dataset name, not the DSSDataset object).
output_handler = dataiku.Dataset("mydatasetname")
output_handler.write_with_schema(dataframe)
# How to change the dataset's status in the flow GUI to 'built'?
Answers
-
CoreyS
Hi @Marek! Can you provide any further details on the thread to help users find a solution (for example, your DSS version)? Also, can you let us know if you've tried any fixes already? This should lead to a quicker response from the community.
-
Hi Marek.
I don't know if this will serve your purpose, but you can mark datasets as "explicit build only".
See dataset > settings > advanced > rebuild behaviour.
Regards
Mark
-
Hi Mark,
Thank you for your response. Actually, my question is not about managing the dataset marking through the GUI but programmatically, i.e. how to change the marking from within a Python script.
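For what it's worth, the rebuild behaviour setting suggested earlier in the thread can probably also be changed from a Python script, through the dataset settings object. The following is only a rough sketch and not a confirmed answer to the "mark as built" question: the raw settings key "rebuildBehavior" and the value "EXPLICIT" are assumptions that should be checked against the raw settings of a dataset in your own DSS version, and both function names are made up for illustration.

```python
def set_rebuild_behavior(settings, behavior="EXPLICIT"):
    """Set the rebuild behaviour on a dataset settings object and save it.

    `settings` is expected to behave like the object returned by
    DSSDataset.get_settings(): get_raw() returns a mutable dict and
    save() persists it. The key "rebuildBehavior" and the value
    "EXPLICIT" are assumptions, not confirmed in this thread.
    Returns the previous value (or None if the key was absent).
    """
    raw = settings.get_raw()
    previous = raw.get("rebuildBehavior")
    raw["rebuildBehavior"] = behavior
    settings.save()
    return previous


def mark_explicit_build_only(dataset_name):
    """Hypothetical helper: apply the setting to a dataset in the current project."""
    import dataiku  # only available inside a DSS environment

    client = dataiku.api_client()
    project = client.get_project(dataiku.default_project_key())
    settings = project.get_dataset(dataset_name).get_settings()
    return set_rebuild_behavior(settings)
```

Inside DSS, calling mark_explicit_build_only("mydatasetname") would apply this to the dataset from the original post. Printing settings.get_raw() first is a quick way to see which key your version actually uses.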
