Error in "from dataiku.snowpark import DkuSnowpark"
I'm trying to use Snowpark within a Python recipe.
from dataiku.snowpark import DkuSnowpark
This works in DSS, but when I run the same code locally in VS Code, it fails with "ModuleNotFoundError: No module named 'dataiku.snowpark'".
I want to edit and run .py scripts in local VS Code through the Dataiku extension.
P.S. I already installed the API packages.
import dataiku, dataikuapi
Operating system used: Windows 10
Answers
-
Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,043 Neuron
You need to install the Dataiku internal API package for that to work. Have a look at this page:
-
I have already installed the two packages.
import dataiku, dataikuapi
I'm able to import dataiku, but not dataiku.snowpark.
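A quick way to confirm which pieces are visible to the local interpreter is to ask Python where it would load each module from. This is only a minimal sketch: the helper name `has_module` is made up for illustration, and it relies on nothing beyond the standard library.

```python
import importlib.util


def has_module(name):
    """Return True if Python can locate `name` on the current import path.

    find_spec imports any parent package (e.g. "dataiku") but not the
    target module itself.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package of `name` is itself missing.
        return False


# Report which of the modules discussed in this thread can be found locally.
for name in ("dataiku", "dataikuapi", "dataiku.snowpark"):
    print(f"{name}: {'found' if has_module(name) else 'missing'}")
```

If `dataiku` is reported as found but `dataiku.snowpark` as missing, the base package is installed and only the Snowpark submodule is absent from it.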
-
Turribeach
This is an interesting one. It looks like the internal client package doesn't include the snowpark wrapper. From what I can see, DSS loads these modules from its installation directory. For instance, on my system the files are in the following location:
/Users/my_user/Library/DataScienceStudio/kits/dataiku-dss-12.5.0-osx/python/dataiku/snowpark
So I fixed this by copying the snowpark directory into the directory where the internal client is installed:
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/dataiku/
Then I installed the Snowflake Snowpark package (pip3 install snowflake-snowpark-python) and was able to import dataiku.snowpark in my local Jupyter environment.
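The copy step can also be scripted. The following is a rough sketch rather than an official procedure: the function name `copy_snowpark_wrapper` and its two parameters are hypothetical, and it assumes the DSS kit keeps the wrapper under `python/dataiku/snowpark`, as in the paths above. On other systems, including Windows, both locations will differ and have to be looked up.

```python
import shutil
from pathlib import Path


def copy_snowpark_wrapper(dss_python_dir, site_packages_dir):
    """Copy <dss_python_dir>/dataiku/snowpark into <site_packages_dir>/dataiku/.

    Both arguments are assumptions about the local layout:
    dss_python_dir is the "python" folder of the DSS kit, and
    site_packages_dir is where the internal client was installed.
    """
    src = Path(dss_python_dir) / "dataiku" / "snowpark"
    dst_pkg = Path(site_packages_dir) / "dataiku"
    if not src.is_dir():
        raise FileNotFoundError(f"snowpark wrapper not found at {src}")
    if not dst_pkg.is_dir():
        raise FileNotFoundError(f"dataiku package not found at {dst_pkg}")
    dst = dst_pkg / "snowpark"
    # dirs_exist_ok (Python 3.8+) lets the copy be re-run after a DSS upgrade.
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

With the paths from this post, the call would be `copy_snowpark_wrapper("/Users/my_user/Library/DataScienceStudio/kits/dataiku-dss-12.5.0-osx/python", "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages")`. The location of the installed client can be found with `import dataiku; print(dataiku.__file__)`.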
-
Thank you, I'll give it a try, it looks promising!
Does this mean the package installed with the command below does not include the Snowpark component?
pip install --trusted-host 10.xxx.xxx.xxx https://10.xxx.xxx.xxx/public/packages/dataiku-internal-client.tar.gz
-
Turribeach
Exactly.