How to create a new dataset with the schema of another existing dataset?

Val1 Registered Posts: 1

I have a CSV dataset with a schema, and I want to extend it. To do that, I want to create a new dataset with the same columns. How can I do that? How can I export the schema of the initial dataset?

Thanks for your responses.

Answers

  • JordanB
    JordanB Dataiker, Dataiku DSS Core Designer, Dataiku DSS Adv Designer, Registered Posts: 293 Dataiker
    edited July 17

    Hi @Val1,

    One way to do this is to create a managed dataset (+Dataset > Internal > Managed Dataset) as your target dataset, then use the Python API in a notebook to read the source dataset's schema and apply it to the target dataset. Please see our documentation on get_schema and set_schema.

    I've provided sample code below.

    import dataiku

    client = dataiku.api_client()
    project = client.get_project("projectKey")

    # Get the source dataset's schema
    source_dataset = project.get_dataset("source_dataset_name")
    source_schema = source_dataset.get_schema()

    # Apply the same schema to the target dataset
    target_dataset = project.get_dataset("target_dataset_name")
    target_dataset.set_schema(source_schema)
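
Since the goal is to extend the dataset, it may help to add columns to the copied schema before applying it to the target. The sketch below is only an illustration: it assumes the schema is a dict with a "columns" list of {"name": ..., "type": ...} entries, and the helper name extended_schema, the sample columns, and their types are hypothetical. It runs outside DSS on a hand-written schema.

```python
import copy


def extended_schema(source_schema, extra_columns):
    """Return a copy of source_schema with extra_columns appended.

    Assumption: the schema is a dict holding a "columns" list of
    {"name": ..., "type": ...} entries.
    """
    schema = copy.deepcopy(source_schema)  # leave the source schema untouched
    existing = {col["name"] for col in schema["columns"]}
    for col in extra_columns:
        if col["name"] in existing:
            raise ValueError(f"column already exists: {col['name']}")
        schema["columns"].append(dict(col))
        existing.add(col["name"])
    return schema


# Hand-written stand-in; in DSS this would come from source_dataset.get_schema()
source_schema = {
    "columns": [
        {"name": "id", "type": "bigint"},
        {"name": "label", "type": "string"},
    ]
}

new_schema = extended_schema(source_schema, [{"name": "score", "type": "double"}])
print([col["name"] for col in new_schema["columns"]])  # ['id', 'label', 'score']
```

In a DSS notebook, the result would then be passed to target_dataset.set_schema(new_schema) instead of the unmodified source schema.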
    

    Please let us know if you have any questions.

    Thanks!

    Jordan
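
On the second part of the question (exporting the schema), one possible approach is to save the schema dict as a JSON file and load it back later. This is a minimal sketch rather than an official export feature: it assumes the schema is a plain JSON-serializable dict, and the helper names export_schema and load_schema, the file name, and the sample columns are hypothetical.

```python
import json
import os
import tempfile


def export_schema(schema, path):
    """Write a schema dict to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2)


def load_schema(path):
    """Read a schema dict back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hand-written stand-in; in DSS this would come from source_dataset.get_schema()
schema = {
    "columns": [
        {"name": "id", "type": "bigint"},
        {"name": "label", "type": "string"},
    ]
}

# Write to a fresh temporary directory so nothing existing is overwritten
path = os.path.join(tempfile.mkdtemp(), "source_schema.json")
export_schema(schema, path)
restored = load_schema(path)
print(restored == schema)  # True
```

The restored dict could then be passed to set_schema on the target dataset, in the same project or another one.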
