How to uninstall a package within a code env (or install a package without its dependencies)?

MatthieuPx
MatthieuPx Registered Posts: 4 ✭✭✭

Hello

We would like to install a package A (ultralytics in our case) without its dependency B (opencv-python), or alternatively install package A and then remove its dependency B. The reason is that ultralytics doesn't work properly with opencv-python in our setup: we need to remove it and install opencv-python-headless instead.

How can we do this within Dataiku, either through the UI in the code env section or through the Dataiku API (see Code envs - Dataiku Developer Guide)?

We unsuccessfully tried the --no-deps option both in the code env UI (see screenshot) and within a notebook (with the set_required_packages method). We could not find a simple way to remove a package (which is a dependency of another one we want to keep) from a code env. We don't have admin rights on our node.

Thank you in advance

Answers

  • MatthieuPx
    MatthieuPx Registered Posts: 4 ✭✭✭

    Note :
    - Outside DSS, we managed to create such an environment using 'pip uninstall B'.

    - Within a notebook in Dataiku, '%pip uninstall B' does not work (it fails with an error).

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    There is no built-in functionality to use "pip uninstall" in Dataiku. Pip options shouldn't be used in the Requested packages section but in Administration ⇒ Settings ⇒ Other ⇒ Misc ⇒ Extra options for 'pip install'. But you certainly wouldn't want to use "--no-deps" instance-wide, as it will affect all code environments and all packages, not just yours, so it will make code envs unusable. As a test I set it on my test environment and it does work; however, you then need to specify every other package's dependencies yourself, otherwise the code env is unusable. And as you said, you are not an admin, so you couldn't use it even if you wanted to.

    Also, updating the code env from a notebook won't work, since the notebook runs as yourself and not as the DSS user that owns the code envs. Even if you were an admin and could run "pip uninstall" as the DSS user for this particular code env, this would be extremely flaky: any update to the code env would break your setup, because DSS would bring the dependency back.

    The way to remove packages from a code environment in Dataiku is to remove the package from the Requested packages section, enable the Rebuild option and click on Update. This will rebuild the code env without the package in question. Then disable the Rebuild option, as you don't want to rebuild code envs for every update.

    I think you are going around the real issue here. If ultralytics doesn't work properly with opencv-python and should use opencv-python-headless, that's what you should address. Everything else seems like an ugly kludge. So why doesn't ultralytics work properly with opencv-python? And why do you need opencv-python-headless? That's your real problem here, so let's focus on that.

  • Turribeach
    Turribeach Dataiku DSS Core Designer, Neuron, Dataiku DSS Adv Designer, Registered, Neuron 2023 Posts: 2,024 Neuron

    As I said, the issue here is to do with ultralytics and its opencv-python requirement. I am guessing you have seen this GitHub issue: https://github.com/ultralytics/ultralytics/issues/2179 which describes the problem, which is also affecting other people. A PR was submitted to switch to setuptools but it doesn't yet support extras_require, so ultralytics[headless] would not work. I did manage to get a code environment in Dataiku with ultralytics, no opencv-python, and opencv-python-headless installed (see screenshot below). These are the steps:

    1. Add ultralytics to the code env and let it install ultralytics + opencv-python
    2. In Administration ⇒ Settings ⇒ Other ⇒ Misc ⇒ Extra options for 'pip install', add --no-deps
    3. Now go to "Currently installed packages" of your code env and copy the whole list of installed packages. Paste this list in "Packages to install" of your code env
    4. Without saving or updating, modify the opencv-python line and replace it with opencv-python-headless
    5. Enable Rebuild env and click on Update
    6. In Administration ⇒ Settings ⇒ Other ⇒ Misc ⇒ Extra options for 'pip install', remove --no-deps
    7. Do not change or update this environment again or it will break. You can't even deploy it to an Automation node
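
Steps 3 and 4 above amount to a small text transformation on the installed-package list. Here is a minimal sketch of it in Python, assuming the list is in pip-freeze style (`name==version`); the helper name `swap_package` and the version numbers for numpy and opencv-python are illustrative only, and it assumes the same version exists for opencv-python-headless.

```python
import re


def swap_package(installed_lines, old="opencv-python", new="opencv-python-headless"):
    """Return a copy of a pip-freeze style list with `old` renamed to `new`.

    Only an exact package-name match is renamed, so an existing
    'opencv-python-headless' line is left untouched.
    """
    # Match `old` at the start of the line, followed by a version
    # specifier or the end of the line (hypothetical helper, not a Dataiku API).
    pattern = re.compile(
        r"^" + re.escape(old) + r"(?=\s*(==|>=|<=|~=|!=|<|>|$))",
        re.IGNORECASE,
    )
    return [pattern.sub(new, line.strip()) for line in installed_lines if line.strip()]


# Illustrative list, as copied from "Currently installed packages"
installed = [
    "numpy==1.26.4",
    "opencv-python==4.10.0.84",
    "ultralytics==8.3.13",
]

requested = swap_package(installed)
print("\n".join(requested))  # paste the result into "Packages to install"
```

Because --no-deps is active during the rebuild, every package in this list has to be present explicitly; nothing will be pulled in automatically.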

    However, without admin rights you will not be able to perform this trick. The only other option is to download the ultralytics wheel file (ultralytics-8.3.13-py3-none-any.whl) and modify its METADATA file to change the opencv-python requirement to opencv-python-headless. That's something you can easily do yourself, but you will need an internal Python Package Index (PyPI) repository such as JFrog Artifactory to host it, and again admin access to add it to the Dataiku config.
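
The METADATA change described above could be sketched as follows. This is only an illustration of the text edit: the function name `retarget_requirement` is hypothetical, and the sample `Requires-Dist` lines are made up rather than copied from the real ultralytics wheel. It assumes the wheel has already been unpacked so that METADATA can be read as text.

```python
import re


def retarget_requirement(metadata_text, old="opencv-python", new="opencv-python-headless"):
    """Rewrite 'Requires-Dist: <old>...' lines so they point at <new>.

    Version specifiers and environment markers after the name are kept.
    Lines that already name a longer package (e.g. opencv-python-headless)
    are not modified.
    """
    pattern = re.compile(
        r"^(Requires-Dist:\s*)" + re.escape(old) + r"(?![A-Za-z0-9._-])",
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.sub(lambda m: m.group(1) + new, metadata_text)


# Illustrative METADATA excerpt (not the actual file contents)
sample = (
    "Metadata-Version: 2.1\n"
    "Name: ultralytics\n"
    "Requires-Dist: numpy>=1.23.0\n"
    "Requires-Dist: opencv-python>=4.6.0\n"
)

patched = retarget_requirement(sample)
print(patched)
```

Note that a wheel also contains a RECORD file listing a hash for each file, so the patched wheel should be repacked with a tool that regenerates RECORD (for example the `wheel unpack` / `wheel pack` commands of the `wheel` project) rather than by editing the archive by hand.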
