
Write to a specific database from a Python recipe

Solved!
Florent1
Level 1

Hello,

Does someone have an example of a Python recipe for writing a df into a specific database?

Thank you very much !

Florent

3 Replies
AlexT
Dataiker

Hi Florent, 

Writing a df into a specific database is best done by writing your df to a SQL output dataset. The steps would include:

1) Adding a database connection to your SQL database. See the supported SQL databases here: https://doc.dataiku.com/dss/latest/connecting/sql.html

2) Specifying an output dataset within that connection; you can then write the df to the dataset using write_with_schema() or write_dataframe(): https://doc.dataiku.com/dss/latest/code_recipes/python.html#writing-a-pandas-dataframe-in-a-dataset

If the above is not suitable, you can technically use https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_sql.html, but this would be done outside of DSS, so any credentials would need to be included in the recipe, and database drivers would have to be handled outside of DSS as well.
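To make the to_sql alternative concrete, here is a minimal self-contained sketch. It uses an in-memory SQLite database purely so the example runs anywhere; for a real database you would replace the connection with a SQLAlchemy engine built from your own credentials, which is exactly the credential handling DSS would otherwise do for you.

```python
# Minimal sketch of pandas.DataFrame.to_sql, entirely outside DSS.
# SQLite in-memory stands in for a real SQL connection here.
import sqlite3
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "value": ["a", "b", "c"]})

conn = sqlite3.connect(":memory:")  # replace with your real connection/engine
df.to_sql("my_table", conn, if_exists="replace", index=False)

# Read it back to confirm the write
round_trip = pd.read_sql("SELECT * FROM my_table", conn)
print(len(round_trip))  # 3
```

Note that if_exists="replace" drops and recreates the table on each run; use "append" if you want to accumulate rows instead.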

Hope this helps!

Florent1
Level 1
Author

Hi Alex,

Thank you very much for your response.

Actually, what I want to do is save a table in a database with a name that varies (with the current date, for example, and other information). So my idea was to define global variables, create a Python recipe, use those global variables and the current date to define the name of the table, and save the table to the database.

I don't know how to do that, and I don't know if it is the best way to do this.

Thank you,

Florent

AlexT
Dataiker

Hi, that seems like a sensible approach if you define your project/global variables. It will result in multiple datasets being visible in the flow; you can move them to a specific flow zone, and you can even remove a dataset without dropping its contents, depending on whether you need to reuse it in the same flow.

You can use something like this within a scenario Python step. It won't work in an actual recipe, but it will work in a scenario or a notebook.

import dataiku

client = dataiku.api_client()
project = client.get_default_project()

# Read the input dataset
my_input_ds = dataiku.Dataset("year")
my_dataframe = my_input_ds.get_dataframe()

# Build a dynamic dataset name from a project variable
ts_var = dataiku.get_custom_variables()["timestamp_sql_table"]
dataset_name = "my_dataset" + "_" + ts_var

connection_name = "Postgres-Localhost"

# Create a new managed dataset stored in the SQL connection
builder = project.new_managed_dataset(dataset_name)
builder.with_store_into(connection_name)
dataset = builder.create()

# Write the dataframe (and its schema) to the new dataset
write_ds = dataiku.Dataset(dataset_name)
write_ds.write_with_schema(my_dataframe)
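For the "name that varies with the current date" part of Florent's question, the string construction itself is plain Python. A hedged sketch (the `dated_name` helper and the `my_dataset` prefix are illustrative, not part of the DSS API; in DSS you would typically write the result into a project variable such as `timestamp_sql_table` and read it back with `dataiku.get_custom_variables()`):

```python
# Sketch: building a dated dataset/table name in plain Python.
from datetime import date

def dated_name(prefix, day=None):
    """Return a name like 'my_dataset_2024_01_31' for the given day (default: today)."""
    day = day or date.today()
    return "{}_{}".format(prefix, day.strftime("%Y_%m_%d"))

print(dated_name("my_dataset", date(2024, 1, 31)))  # my_dataset_2024_01_31
```

Underscores are used rather than dashes so the result stays a valid SQL identifier on most databases without quoting.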

You can also create unmanaged SQL datasets using https://doc.dataiku.com/dss/latest/python-api/datasets-other.html#sql-dataset-programmatic-creation

Let me know if that helps!
