Insert data from python df to sql table

Arwamus0
Level 2
Insert data from python df to sql table

Hi all,

I wrote a python code using python recipe, I need to add inside it a command that achieves:

The result of my code is a dataframe named combined_df, I need to insert the data in it into another table that's connected to SQL server.

Also this SQL table is an input to the python code, is this doable?

The idea is I'm considering the SQL table as original data, then when new data coming in I'm comparing both, and I need the updates to be inserted to the SQL table (original).

 

0 Kudos
11 Replies
AlexT
Dataiker

Hi @Arwamus0 ,
You can write to SQL table by providing output dataset in DSS.
If you are both reading/writing to the that table this possible but you may want to ensure you set append mode.

You can see the answer from : https://community.dataiku.com/t5/General-Discussion/Append-Data-without-drop-table-with-python/m-p/1...

Thanks

0 Kudos
Arwamus0
Level 2
Author

Hi @AlexT 

When there is new data, the old data column(ValidTo) has to be populated and this is done using the python code, if I append the new data this won't happen.

I need the table to be overwritten with my new df.

 

Also I got an error saying 'Dataset' object has no attribute 'spec_item'

 

Thanks

0 Kudos
AlexT
Dataiker

Hi @Arwamus0 ,

If you are consolidating and have the final df you want to write code, you can write simply.

test = dataiku.Dataset("test")
test.write_with_schema(pandas_dataframe)

THanks

0 Kudos
Arwamus0
Level 2
Author

Hi @AlexT 

 

It worked thanks a lot..

but now I have a problem it's duplicating the data inside the table, is there a command I can write before writing into the table to truncate it first?

0 Kudos
AlexT
Dataiker
0 Kudos
Arwamus0
Level 2
Author

the duplicates are happening in SQL table not in the df

0 Kudos
AlexT
Dataiker

Did you remove the append spec part?

 

0 Kudos
Arwamus0
Level 2
Author

yes it gives an error  'Dataset' object has no attribute 'spec_item'

 

0 Kudos
AlexT
Dataiker

I can't see a reason the data would be duplicated the write_with_schema should drop the table and then insert, did you include append mode in the recipe - Input/Output tab?

If not please open a support ticket with the job diagnostics so we can investigate.

Thanks

0 Kudos
LouisDHulst

Hi @Arwamus0, could you please post your Python script, or at least the parts that deal with Dataset input/output?

If you're just trying to overwrite the output Dataset with your dataframe you need to make sure the Append button in the Input/Output tab of your Python recipe is turned off. 

 

0 Kudos
Arwamus0
Level 2
Author

Hi @LouisDHulst 

Yes it's turned off (append button)

Drop duplicates worked, thanks

 

 

0 Kudos