Need API-based code to copy datasets to a new project

rakeshrk · November 2020

I have about 500 tables spread across various projects, and I need code that can copy all of them to a new project. I have tried several approaches, but none of them works. csv_file contains the project name and table name for each of the 500 tables. Please help.

import dataiku

# csv_file is a pandas DataFrame loaded earlier; project is a DSSProject handle
# for the destination project (both are defined outside this snippet).
for index, row in csv_file.iterrows():
    print(row['source_dataset_name'], row['destination_dataset_name'])
    inp = dataiku.Dataset(row['source_dataset_name'])
    out = dataiku.Dataset(row['destination_dataset_name'])
    dataset_name = row['Dataset']
    connection = 'myconnection'
    path_in_connection = '/' + 'SOURCES' + '/' + row['Dataset']
    # Create the managed dataset in the destination project.
    builder = project.new_managed_dataset(dataset_name)
    # The helper method is with_store_into, not with_store_info.
    builder.with_store_into(connection, format_option_id="PARQUET_HIVE")
    dataset = builder.create()
    #dataset = project.create_fslike_dataset("mydataset", "HDFS", "name_of_connection", "path_in_connection")
    #write_dataset = project.create_s3_dataset(dataset_name, connection, path_in_connection, bucket=None)
    #settings = write_dataset.autodetect_settings()
    #settings.save()
    #write_dataset.write_with_schema(inp.get_dataframe(sampling='head', limit=1000))
    # Copy only the first 1000 rows of the source into the destination.
    out.write_dataframe(inp.get_dataframe(sampling='head', limit=1000))

Triveni · December 2020

Hi,

Can you describe the error you are getting? What exactly is not working?
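
In the meantime, here is a minimal sketch of how such a copy loop might be structured so that each failing table reports its own error instead of stopping the whole run. It is not a confirmed fix. It assumes the CSV has columns named "Project" and "Dataset" (hypothetical names, adjust them to your file), that the destination project already exists, and that it runs in a notebook with read access to the source projects. The helper names read_table_list, copy_tables and run_in_dss are made up for this example. It uses with_store_into, which is how the managed dataset helper spells that method as far as I know.

```python
import csv


def read_table_list(csv_path):
    """Return one dict per CSV row (assumed columns: 'Project', 'Dataset')."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))


def copy_tables(rows, project, open_dataset, connection,
                format_option_id="PARQUET_HIVE"):
    """Create each destination dataset and copy the source data into it.

    project      -- object behaving like a dataikuapi DSSProject handle
    open_dataset -- callable (name, project_key) behaving like dataiku.Dataset
    Returns a list of (dataset_name, error_message) for tables that failed.
    """
    failures = []
    for row in rows:
        name = row["Dataset"]
        try:
            # Create the managed dataset in the destination project.
            builder = project.new_managed_dataset(name)
            builder.with_store_into(connection,
                                    format_option_id=format_option_id)
            builder.create()
            # Read from the source project, write into the destination.
            source = open_dataset(name, row["Project"])
            target = open_dataset(name, project.project_key)
            # get_dataframe() loads the whole table into memory.
            target.write_with_schema(source.get_dataframe())
        except Exception as exc:
            # Keep going and record which table failed, and why.
            failures.append((name, str(exc)))
    return failures


def run_in_dss(csv_path, destination_project_key, connection):
    """Wire the sketch to the real API; only works inside DSS."""
    import dataiku  # imported here so the helpers above stay testable

    project = dataiku.api_client().get_project(destination_project_key)
    rows = read_table_list(csv_path)
    return copy_tables(
        rows,
        project,
        lambda name, key: dataiku.Dataset(name, project_key=key),
        connection,
    )
```

Calling run_in_dss with your CSV path, the destination project key and 'myconnection' returns the list of failures, and posting that list here would show exactly which tables break and with what message. For very large tables, loading everything with get_dataframe() may not fit in memory, so a chunked copy would be a better fit there.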
