While going through one of Dataiku's blog posts, I learned about the integration with Snowpark for Python.
I'm just wondering how to get started with this. Does it mean using the snowflake-snowpark-python library, or can we somehow use Snowpark in recipes?
Also, to what extent will it optimize my operations, say over a dataset with 1 million rows and 30 columns?
Thanks in advance,
Bumping this, as we are now beginning to explore Snowpark with Dataiku. Snowpark looks quite powerful, especially its Python APIs, if you happen to be a Snowflake customer.
A question on the Dataiku integration: with Spark and our code envs, we can build for Spark and have everything represented at runtime, which is nice. With Snowflake stored procedures and Snowpark, you need to provide the Python package spec inline. I'm assuming this is manual, meaning we'll need to ensure ourselves that the stored procedure carries the spec, which is overhead outside of DSS? Also, since most packages are handled via Anaconda, I'm assuming extra configuration will be needed to install from, e.g., an internal PyPI?
Yes, you'll need to list the Python packages you want in the sproc inline. You can do this on the Snowpark session object or as you define the sproc:
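A minimal sketch of both approaches, assuming you have a working Snowpark session (the connection parameters, table name, and sproc name below are placeholders, not anything from your environment):

```python
from snowflake.snowpark import Session
from snowflake.snowpark.functions import sproc

# Placeholder connection parameters -- fill in with your own account details.
connection_parameters = {
    "account": "<your_account>",
    "user": "<your_user>",
    "password": "<your_password>",
    "warehouse": "<your_warehouse>",
    "database": "<your_database>",
    "schema": "<your_schema>",
}
session = Session.builder.configs(connection_parameters).create()

# Option 1: declare packages on the session; sprocs and UDFs registered
# afterwards can use them.
session.add_packages("snowflake-snowpark-python", "numpy")

# Option 2: declare packages inline as you define the sproc.
@sproc(name="row_count", replace=True,
       packages=["snowflake-snowpark-python"])
def row_count(session: Session, table_name: str) -> int:
    # Runs inside Snowflake; counts the rows of the given table.
    return session.table(table_name).count()
```

Either way, the listed packages are resolved from Snowflake's Anaconda channel at execution time, so you don't install anything yourself.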
Snowflake maintains a dedicated Anaconda repo that all Snowpark code can leverage, so there's no additional admin overhead as long as you're using one of the packages listed here: https://repo.anaconda.com/pkgs/snowflake/
You can submit requests for additional packages through the Snowflake community, and I've found them to be very responsive.
If you want to use a custom package, there’s a procedure for it: https://medium.com/snowflake/using-other-python-packages-in-snowpark-a6fd75e4b23a
Hope this helps!