
How do I create a new dataset with the schema of another existing dataset?

Val1
Level 1

I have a CSV dataset with a schema, and I want to extend it. To do that, I want to create a new dataset with the same columns. How can I do that? How can I export the schema of the initial dataset?

Thanks for your responses.

1 Reply
JordanB
Dataiker

Hi @Val1,

One way to do this is to create a managed dataset (+Dataset > Internal > Managed Dataset) as your target dataset, then use the Python APIs in a notebook to read the source dataset's schema and apply it to the target dataset. Please see our documentation on get_schema and set_schema.

I've provided sample code below.

import dataiku

# connect to the DSS instance and open the project
client = dataiku.api_client()
project = client.get_project("projectKey")

# get the source dataset schema
source_dataset = project.get_dataset("source_dataset_name")
source_schema = source_dataset.get_schema()

# set the target dataset schema (the target dataset must already exist)
target_dataset = project.get_dataset("target_dataset_name")
target_dataset.set_schema(source_schema)
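
If you also want to export the schema to a file, as asked in the question, the same calls can be wrapped in two small helpers. This is only a sketch: the function names `export_schema` and `copy_schema` and the output path are made up for illustration, and it assumes `get_schema()` returns a dict with a "columns" list whose entries carry a "name" key.

```python
import json


def export_schema(dataset, path):
    """Write the dataset's schema dict to a JSON file and return it.

    `dataset` is assumed to be a dataset handle obtained from
    project.get_dataset(...); `path` is a hypothetical output location.
    """
    schema = dataset.get_schema()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2)
    return schema


def copy_schema(project, source_name, target_name):
    """Copy the schema of one dataset onto another existing dataset.

    Returns the target's column names after the update, so the result
    can be checked by eye or in a notebook cell.
    """
    source_schema = project.get_dataset(source_name).get_schema()
    target = project.get_dataset(target_name)
    target.set_schema(source_schema)
    # Re-read the target schema to confirm which columns it now has.
    return [column["name"] for column in target.get_schema()["columns"]]
```

In a notebook, with `project` obtained from `dataiku.api_client().get_project("projectKey")`, you could call `copy_schema(project, "source_dataset_name", "target_dataset_name")` and compare the returned column names with what you expect.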

Please let us know if you have any questions.

Thanks!

Jordan
