
Python recipe run time

yazidsaissi

Hello everyone, 

I want to use a Python recipe on a large dataset, but it takes a long time to run.

Is there a way to run the Python recipe in-database so that it runs faster?

Thank you 🙂

shashank
Dataiker

It depends on the database where your data resides. Only data sources that support Python execution, such as Snowflake, can run the recipe in-database.

It is also recommended to use Python with Spark when working on large datasets, so that the computation is distributed across a cluster.

Below are the options based on your underlying database:

1. Snowflake: Convert your code to use the Snowpark libraries so that the compute is pushed down to Snowflake.

2. Other databases (with no native Python support): The best approach is to set up an EKS Spark cluster in Dataiku and push your compute to it.

Any other database that offers native Python support over a JDBC/ODBC connection should also be able to run Python recipes against it.
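
To illustrate the Snowflake option, here is a minimal sketch of what "converting your code to Snowpark" can look like. It assumes you already have a Snowpark session object (how you obtain it inside a Dataiku recipe is not shown here), and the table and column names (SALES, CATEGORY, AMOUNT, SALES_BY_CATEGORY) are hypothetical. The point is that the DataFrame calls only build a query; the aggregation runs inside Snowflake instead of loading every row into the recipe's memory.

```python
def total_amount_by_category(session, table_name="SALES"):
    """Build a lazy Snowpark query that sums AMOUNT per CATEGORY.

    `session` is assumed to be a snowflake.snowpark.Session.
    Table and column names are hypothetical placeholders.
    """
    # session.table() returns a lazy reference; no rows are transferred yet.
    sales = session.table(table_name)
    # group_by(...).sum(...) is still lazy; Snowflake will do the work.
    return sales.group_by("CATEGORY").sum("AMOUNT")


def save_result(result_df, target_table="SALES_BY_CATEGORY"):
    """Materialize the result as a table, entirely inside Snowflake."""
    result_df.write.mode("overwrite").save_as_table(target_table)
```

With a real session you would call save_result(total_amount_by_category(session)). The equivalent pandas code would first download the whole table, which is usually where the run time goes on large datasets.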