Import a list of tables at once from Snowflake
Hi Team,
Is it possible to import a list of tables like say 50 tables from a snowflake source at once using the python API? I tried using "init_tables_import" and followed the complete steps but used "add_sql_table" method but I get error in the execute step. Could you please help? Also, I could see only up to version 11 in the options while posting this query but we use 12.
Regards
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,877 Neuron
OK then this is what you need to do. Create a new SQL Query dataset with a query using the Snowflake metadata dictionary tables/views that return the tables you want to import programatically. Then add a Python recipe to it and start coding. Below is a sample code snippet creating a dataset and the documentation for the API. Now you just need to change it to your needs to loop through the input dataset rows and output something up (Dataiku always expect to have an output). Otherwise you can run this in a notebook to avoid having to create a dummy output.
import dataiku client = dataiku.api_client() project = client.get_default_project() dataset = project.create_sql_table_dataset("mydataset", "PostgreSQL", "my_sql_connection", "mytable", "myschema") # At this point, the dataset object has been initialized, but the schema of the underlying table # has not yet been fetched, so the schema of the table and the schema of the dataset are not yet consistent # We run autodetection settings = dataset.autodetect_settings() # settings is now an object containing the "suggested" new dataset settings, including the completed schema # We can just save the new settings in order to "accept the suggestion" settings.save()
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 1,877 Neuron
This certainly can be done with the Python API but why write code when you can do this via a GUI? Head down to your flow, click on + DATASET button and select "Search and Import". Then select Connection Explorer, select your connection, click on List Tables and see the result. You can then select all tables and click Import and they will all be added to the Flow. Simple!
-
Thanks for a quick response. I use this method but this is convenient only when the number of tables we wish to import are very few in numbers. I have a list of around more than 50 tables, so just wanted to see how we could do this using a python api. Also, once I import I need to rename them too as per the standards we follow, so wanted to make an expandable solution for this in case we need to do something similar in future.