How to update data from database and run other processes using Scenario and pipeline with SQL DB con

Hi
I have a connection to an Oracle database and need to run a Dataiku pipeline periodically. On each run, new data should be loaded from the database (by simply rerunning the query) and the result sent on for further manipulation. I want to use the built-in Connection option.
Operating system used: Windows
Best Answer
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron
It's still not 100% clear to me what your issue is, but I am going to guess that your problem is that your Oracle connection is read-only. This allows you to add a dataset from this database to your flow, but you cannot apply any transformations or recipes to this dataset, since the connection will not allow you to write a dataset back. If that's your issue, what you need to do is use a Sync recipe to copy your data from your Oracle connection into a dataset on another connection, one that you have permission to write to. Once you have all the datasets on your write-enabled connection, you can start to join and transform them.
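Once the Sync recipe is in the flow, the weekly refresh can be driven by a scenario. The following is only a minimal sketch of a custom Python scenario step, not a definitive setup. It assumes the synced copy is the dataset D1 from the question and that the final output dataset is called "result" (a hypothetical name). The helper names rebuild_in_order and run_in_dss are illustrative. The forced, non-recursive build mode is an assumption, chosen so that the Sync recipe reruns even if DSS does not detect that the Oracle data has changed.

```python
def rebuild_in_order(scenario, dataset_names,
                     build_mode="NON_RECURSIVE_FORCED_BUILD"):
    """Build each dataset in the given order and return the names built.

    `scenario` is expected to expose build_dataset(name, build_mode=...),
    as dataiku.scenario.Scenario does.
    """
    built = []
    for name in dataset_names:
        # Forcing the build reruns the recipe that produces `name`,
        # e.g. the Sync recipe that copies the Oracle data into D1.
        scenario.build_dataset(name, build_mode=build_mode)
        built.append(name)
    return built


def run_in_dss():
    # The dataiku package is only importable inside DSS.
    from dataiku.scenario import Scenario

    # "D1" comes from the question; "result" is a hypothetical output name.
    return rebuild_in_order(Scenario(), ["D1", "result"])
```

In a custom Python step you would call run_in_dss() at the end of the script, and set the weekly schedule with a time-based trigger on the scenario. A plain Build step in the scenario UI should achieve the same thing without any code.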
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023, Circle Member Posts: 2,590 Neuron
I am not sure if I understand what your question is about. What exactly do you need help with? How to update an Oracle table?
-
I use +Dataset, SQL databases, Oracle, apply a query, and obtain a module containing my data (M1). Next, I need to pass this data through a pipeline of manipulations. When I connect a Python script directly to this module (M1), even with a simple operation like df['A1'].max(), I receive an error saying that I don't have enough permissions. To work around the problem, I had to export the data to a dataset (D1) and run the script on D1 to obtain the result. Exporting the data and creating the updated dataset was a manual operation. In other words, the data transfer from M1 to D1 was manual.
My problem is that I need to create a pipeline that connects to the database (using the module M1) once a week, downloads the new data, performs the manipulation, and saves the result. So a Scenario is needed. Is it possible to export the data from M1 to D1 automatically? What is the standard way of downloading new data from a database in a Scenario?
-
Thank you for your help