Snowpark with Dataiku

nmadhu20
nmadhu20 Neuron, Registered, Neuron 2022, Neuron 2023 Posts: 35 Neuron

While going through one of Dataiku's blogs, I found out about integration with Snowpark for Python.

Just wondering how to get started with this. Does this mean using the snowflake-snowpark-python library, or can we somehow use Snowpark directly in recipes?


Also, to what extent will it optimize my operations, say on a dataset with 1 million rows and 30 columns?

Thanks in advance,
Madhuleena

Answers

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    Hi Team

    Bumping this as we are beginning to explore Snowpark with Dataiku. Snowpark looks quite powerful, especially its Python APIs, if you happen to be a Snowflake customer.

    Question on Dataiku integrations: with Spark, we can build our code envs for Spark and have everything available at runtime, which is nice. With Snowflake stored procedures and Snowpark, you need to provide the Python package spec inline. I'm assuming this is manual? Meaning we'll need to make sure each stored procedure declares its spec, which is overhead outside of DSS? Also, since most packages are handled via Anaconda, I'm assuming some configuration will be needed to install from, e.g., an internal PyPI?

  • pmasiphelps
    pmasiphelps Dataiker, Dataiku DSS Core Designer, Registered Posts: 33 Dataiker
    edited July 17

    Hi,

    Yes, you’ll need to list the Python packages you want in the sproc inline. You can do this either on the Snowpark session object or when you define the sproc:

    session.add_packages()

    Or

    @sproc(packages=["pandas", "xgboost==1.5.0"])
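As a rough sketch of how the two options above might look together: the helper names (`pin_packages`, `add_session_packages`, `register_row_count_sproc`), the sproc name `row_count_sproc`, and the row-count body are all hypothetical, and `session` is assumed to be an already-connected Snowpark `Session`. Listing `snowflake-snowpark-python` explicitly is also an assumption on my part; adjust to what your account requires.

```python
from typing import Dict, List, Optional


def pin_packages(requirements: Dict[str, Optional[str]]) -> List[str]:
    """Turn {"name": version-or-None} into specs like "xgboost==1.5.0"."""
    return [
        name if version is None else f"{name}=={version}"
        for name, version in requirements.items()
    ]


def add_session_packages(session, packages: List[str]) -> None:
    """Option 1: declare packages once on the session.

    Assumes `session` is an existing snowflake.snowpark.Session.
    """
    session.add_packages(packages)


def register_row_count_sproc(session, packages: List[str]):
    """Option 2: declare packages inline when defining the sproc."""
    # Imported lazily so pin_packages() works without Snowpark installed.
    from snowflake.snowpark import Session
    from snowflake.snowpark.functions import sproc

    @sproc(name="row_count_sproc", packages=packages, replace=True, session=session)
    def row_count(sp_session: Session, table_name: str) -> int:
        # Runs inside Snowflake; only the count is returned to the caller.
        return sp_session.table(table_name).count()

    return row_count


# Hypothetical package list; pin a version only where you need one.
PACKAGES = pin_packages(
    {"snowflake-snowpark-python": None, "pandas": None, "xgboost": "1.5.0"}
)
```

With a live session you would call either `add_session_packages(session, PACKAGES)` or `register_row_count_sproc(session, PACKAGES)`; the list itself can be built and checked without any Snowflake connection.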

    Snowflake maintains a dedicated Anaconda repo that all Snowpark code can leverage, so there is no additional admin overhead as long as you’re using one of the packages listed here: https://repo.anaconda.com/pkgs/snowflake/

    You can submit requests for additional packages through the Snowflake community, and I’ve found them to be very responsive.


    If you want to use a custom package, there’s a procedure for it: https://medium.com/snowflake/using-other-python-packages-in-snowpark-a6fd75e4b23a


    Hope this helps!

    Pat

  • importthepandas
    importthepandas Dataiku DSS Core Designer, Dataiku DSS & SQL, Dataiku DSS Core Concepts, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 115 Neuron

    You rock, @pmasiphelps!
    Hopefully this will help me write 80% of my code in Snowflake.
