Insert data from python df to sql table
Hi all,
I wrote a python code using python recipe, I need to add inside it a command that achieves:
The result of my code is a dataframe named combined_df, I need to insert the data in it into another table that's connected to SQL server.
Also this SQL table is an input to the python code, is this doable?
The idea is I'm considering the SQL table as original data, then when new data coming in I'm comparing both, and I need the updates to be inserted to the SQL table (original).
Answers
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @Arwamus0
,
You can write to SQL table by providing output dataset in DSS.
If you are both reading/writing to the that table this possible but you may want to ensure you set append mode.You can see the answer from : https://community.dataiku.com/t5/General-Discussion/Append-Data-without-drop-table-with-python/m-p/15841
Thanks -
Hi @AlexT
When there is new data, the old data column(ValidTo) has to be populated and this is done using the python code, if I append the new data this won't happen.
I need the table to be overwritten with my new df.
Also I got an error saying 'Dataset' object has no attribute 'spec_item'
Thanks
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Hi @Arwamus0
,If you are consolidating and have the final df you want to write code, you can write simply.
test = dataiku.Dataset("test")
test.write_with_schema(pandas_dataframe)
THanks -
Hi @AlexT
It worked thanks a lot..
but now I have a problem it's duplicating the data inside the table, is there a command I can write before writing into the table to truncate it first?
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
you can use drop_duplicates on your dataframe
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.drop_duplicates.html -
the duplicates are happening in SQL table not in the df
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
Did you remove the append spec part?
-
yes it gives an error 'Dataset' object has no attribute 'spec_item'
-
Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,226 Dataiker
I can't see a reason the data would be duplicated the write_with_schema should drop the table and then insert, did you include append mode in the recipe - Input/Output tab?
If not please open a support ticket with the job diagnostics so we can investigate.
Thanks
-
LouisDHulst Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Neuron, Registered, Neuron 2023 Posts: 54 Neuron
Hi @Arwamus0
, could you please post your Python script, or at least the parts that deal with Dataset input/output?If you're just trying to overwrite the output Dataset with your dataframe you need to make sure the Append button in the Input/Output tab of your Python recipe is turned off.
-