Want to Stop Rebuilding "Expensive" Parts of your Flow? Explicit Builds are the Answer!READ MORE

Running SQL statements with Python?

Solved!
dev1
Level 1
Running SQL statements with Python?

I want to execute raw SQL statements in Python. The raw SQL statements would not return data, they are just inserting data into a database. The existing write_to_dataframe method rewrites the whole table, and using a SQL recipe or scenario is not ideal. These statements would typically be running from a dataiku webapp or notebook. 

 

The following query is what I need to run from python (webapp or notebook).

INSERT INTO db.db_table (col_one, col_two) VALUES('value_one', 'value_two')

0 Kudos
1 Solution
Ignacio_Toledo

Hi @dev1,

There are several options:

  1. Use a library like pyscopg2, or cx_oracle, or whatever python library that allows you to connect to the database you are using for your datasets, and execute SQL code
  2. Use dataiku's SQLExecutor2 to execute SQL code, without needing to care about your database kind.

For the second option, and assuming that your output dataset is called "output_table", and is stored in with the same name in your dabatase and schema "public", this would work:

from dataiku import SQLExecutor2
sqlcon = SQLExecutor2(dataset="output_table")

sql_str = """INSERT INTO "public"."output_table" (col_one, col_two) VALUES('value_one', 'value_two')"""

sqlcon.query_to_df(sql_str, post_queries=['COMMIT'])

Hope this helps!

View solution in original post

0 Kudos
3 Replies
Ignacio_Toledo

Hi @dev1,

There are several options:

  1. Use a library like pyscopg2, or cx_oracle, or whatever python library that allows you to connect to the database you are using for your datasets, and execute SQL code
  2. Use dataiku's SQLExecutor2 to execute SQL code, without needing to care about your database kind.

For the second option, and assuming that your output dataset is called "output_table", and is stored in with the same name in your dabatase and schema "public", this would work:

from dataiku import SQLExecutor2
sqlcon = SQLExecutor2(dataset="output_table")

sql_str = """INSERT INTO "public"."output_table" (col_one, col_two) VALUES('value_one', 'value_two')"""

sqlcon.query_to_df(sql_str, post_queries=['COMMIT'])

Hope this helps!

0 Kudos
dev1
Level 1
Author

Can you link to the documentation on how to use SQLExecutor2?

0 Kudos
Ignacio_Toledo

Hi @dev1,

There is no much documentation on the method:

https://doc.dataiku.com/dss/latest/python-api/sql.html?highlight=sqlexecutor2#dataiku.SQLExecutor2

https://doc.dataiku.com/dss/latest/python-api/sql.html?highlight=sqlexecutor2#executing-queries

But there is a whole excellent section (first time I see it!) in the knowledge base:

https://knowledge.dataiku.com/latest/courses/advanced-code/python/use-python-sql.html?highlight=sqle...

even when it doesn't touch the topic of executing statements other than SELECT.

Cheers!