Mark a dataset as built programmatically

Marek
Level 2

Hi,

How can I mark a dataset as built programmatically?

For instance, I need to create a managed MySQL dataset, write some data to it, and programmatically mark it as "built" in the Flow. Is there any dataset parameter that a user can modify through the Python API?

import dataiku

# Create an empty managed dataset on the MySQL connection.
client = dataiku.api_client()
project = client.get_project(dataiku.default_project_key())
builder = project.new_managed_dataset_creation_helper("mydatasetname")
builder.with_store_into("mysqlconnection")
dataset = builder.create()

# Compute the dataset output as a Pandas dataframe.
dataframe = myfunction(inputs)

# Write the output to the dataset. dataiku.Dataset takes the dataset name,
# not the handle returned by builder.create().
output_handler = dataiku.Dataset("mydatasetname")
output_handler.write_with_schema(dataframe)

### How can the dataset's status in the Flow GUI be changed to 'built'?

 

3 Replies
CoreyS
Dataiker Alumni

Hi @Marek! Can you provide further details to help other users find a solution (for example, your DSS version)? Also, can you let us know whether you've already tried any fixes? This should lead to a quicker response from the community.

Mark_Treveil
Dataiker Alumni

Hi Marek.

I don't know if this will serve your purpose, but you can mark datasets as "explicit build only".

 

See dataset > settings > advanced > rebuild behaviour.

 

Regards

Mark

Marek
Level 2
Author

Hi Mark,

Thank you for your response. Actually, my question is not about managing the dataset marking through the GUI but programmatically, i.e. how to change the marking from within a Python script.
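
For reference, a minimal sketch of how the "explicit build only" setting mentioned earlier in the thread might be changed from a Python script. It relies on the public API calls `get_dataset()`, `get_settings()`, `get_raw()` and `save()`. The key names `flowOptions` and `rebuildBehavior` and the value `"EXPLICIT"` are assumptions about the raw settings layout, so check the output of `get_raw()` on your DSS version first. The helper names are hypothetical, and this changes only the rebuild behaviour; it is not confirmed to change the "built" status shown in the Flow.

```python
def set_rebuild_behavior(raw_settings, behavior="EXPLICIT"):
    """Set the rebuild behaviour in a raw dataset settings dict (hypothetical helper).

    ASSUMPTION: the setting lives under raw_settings["flowOptions"]["rebuildBehavior"]
    and "EXPLICIT" is the value for "explicit build only".
    """
    flow_options = raw_settings.setdefault("flowOptions", {})
    flow_options["rebuildBehavior"] = behavior
    return raw_settings


def mark_explicit_build_only(project_key, dataset_name):
    """Apply the setting to a real dataset (hypothetical helper, runs only where dataiku is available)."""
    import dataiku  # imported lazily so the helper above works outside DSS

    client = dataiku.api_client()
    dataset = client.get_project(project_key).get_dataset(dataset_name)
    settings = dataset.get_settings()
    # get_raw() returns the settings dict by reference, so editing it in place
    # and then calling save() persists the change.
    set_rebuild_behavior(settings.get_raw(), "EXPLICIT")
    settings.save()
```

Inside DSS this could be called as `mark_explicit_build_only(dataiku.default_project_key(), "mydatasetname")` after the data has been written.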

 
