
Snowpark with Dataiku

Solved!
nmadhu20

While going through one of Dataiku's blogs, I found out about the integration with Snowpark for Python.

How do I get started with this? Does it mean using the snowflake-snowpark-python library directly, or can Snowpark somehow be used in recipes?

Also, to what extent will it optimize my operations on, say, a dataset with 1 million rows and 30 columns?

 

Thanks in advance,
Madhuleena

1 Solution
StephenWagner
Dataiker

Hi @nmadhu20 

This tutorial explains how to get started and gives details on using Snowpark within Dataiku:
Using Snowpark Python in Dataiku: basics 


4 Replies
importthepandas
Level 5

Hi Team

 

Bumping this as we are beginning to explore Snowpark with Dataiku. Snowpark looks quite powerful, especially its Python APIs, if you happen to be a Snowflake customer.

A question on the Dataiku integration: with Spark, we can build our code envs for Spark and have everything available at runtime, which is nice. With Snowflake stored procedures and Snowpark, you need to provide the Python package spec inline. I'm assuming this is manual? Meaning we'll have to make sure each stored proc carries its own spec, which is overhead outside of DSS? Also, since most packages are handled via Anaconda, I'm assuming some configuration will be needed to install from, e.g., an internal PyPI?

 

 

 

 

pmasiphelps
Dataiker

Hi,

 

Yes, you'll need to list the Python packages you want in the sproc inline. You can do this with the Snowpark session object or when you define the sproc:

session.add_packages("pandas", "xgboost==1.5.0")

Or

@sproc(packages=["pandas", "xgboost==1.5.0"])

Snowflake maintains a dedicated Anaconda repo that all Snowpark code can leverage, so there is no additional admin overhead as long as you're using one of the packages listed here: https://repo.anaconda.com/pkgs/snowflake/

You can submit requests for additional packages through the Snowflake community, and I've found them to be very responsive.
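As a rough illustration of the session-based option, here is a minimal sketch that keeps the package pins in one dict and hands them to the session. The helper names `package_specs` and `add_pinned_packages` and the `pins` dict are made up for this example, and `session` is assumed to be an already-created Snowpark session (the sketch does not create one or connect to Snowflake):

```python
def package_specs(pins):
    """Build requirement strings such as "xgboost==1.5.0" from a
    {package_name: version_or_None} dict (hypothetical helper)."""
    specs = []
    for name, version in pins.items():
        # An unpinned package is passed as its bare name.
        specs.append(name if version is None else f"{name}=={version}")
    return specs


def add_pinned_packages(session, pins):
    """Declare the packages on an existing Snowpark session.

    Assumption: `session` is a snowflake.snowpark.Session obtained elsewhere;
    only its add_packages() method is used here.
    """
    specs = package_specs(pins)
    session.add_packages(*specs)  # accepts one or more spec strings
    return specs
```

The same list of strings could be passed as `packages=` in the `@sproc(...)` form, so a single dict of pins can serve both styles.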


If you want to use a custom package, there’s a procedure for it: https://medium.com/snowflake/using-other-python-packages-in-snowpark-a6fd75e4b23a


Hope this helps!

Pat

 

 

 

importthepandas
Level 5

You rock, @pmasiphelps! Hopefully this will help me write 80% of my code in Snowflake.
