
Update pip to latest 20.x

Solved!
jax79sg
Level 2

Hi,

How do I ensure that Dataiku uses the latest pip?

Thank you.

Andrey
Dataiker Alumni

Hi @jax79sg ,

 

Whenever you create a new Python code environment in DSS, it is placed in

DATA_DIR/code-envs/python/ENV_NAME

so to upgrade pip you'd need to run 

DATA_DIR/code-envs/python/ENV_NAME/bin/pip install --upgrade pip
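As a minimal sketch of the command above, the snippet below builds the same invocation from a data directory and a code environment name. The function name `pip_upgrade_command` and the placeholder values are illustrative assumptions, not part of DSS; only the path layout and the pip arguments come from this post.

```python
from pathlib import Path


def pip_upgrade_command(data_dir, env_name):
    """Return the argv list that upgrades pip inside one DSS Python code env."""
    # Layout taken from the post: DATA_DIR/code-envs/python/ENV_NAME/bin/pip
    pip = Path(data_dir) / "code-envs" / "python" / env_name / "bin" / "pip"
    return [str(pip), "install", "--upgrade", "pip"]


# Placeholder values; substitute your own data directory and env name.
cmd = pip_upgrade_command("/path/to/DATA_DIR", "ENV_NAME")
print(" ".join(cmd))
# To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Calling the env's own pip by its full path matters here: it upgrades pip inside that code environment rather than whichever pip happens to be first on the shell's PATH.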

 

Regards

Andrey Avtomonov
R&D Engineer @ Dataiku
tgb417
Neuron

@Andrey 

Are there any plans from Dataiku to deal with this issue in a more universal way? 

As background, I think I have something like 14 DSS Python code environments: at least 2 are DESIGN_MANAGED and the rest are PLUGIN_MANAGED. There is also a small handful of R code environments.

What are the positive and potentially negative ramifications of going to each of these directories and running the following?

pip install --upgrade pip

Do I have the chance to break a plug-in doing this?

--Tom

Andrey
Dataiker Alumni

Hi Tom,

It's unlikely that upgrading pip will cause issues with the plugins.

However, changing the versions of the libraries in those environments is riskier. For this reason, each plugin comes with its own requirements.txt, which lists the library versions that will work with a given version of the plugin.

Is there a particular reason why you'd want to upgrade pip in all of your environments?
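For a setup with many code environments, like the one described earlier in this thread, it can help to see which pip version each environment has before deciding what to upgrade. The following is a rough sketch under the assumption that every Python code env lives under DATA_DIR/code-envs/python; the helper names `list_python_envs` and `pip_version` are hypothetical.

```python
import subprocess
from pathlib import Path


def list_python_envs(data_dir):
    """Return the names of the Python code envs found under DATA_DIR."""
    root = Path(data_dir) / "code-envs" / "python"
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def pip_version(data_dir, env_name):
    """Return the output of `pip --version` for one env, or None if unavailable."""
    pip = Path(data_dir) / "code-envs" / "python" / env_name / "bin" / "pip"
    if not pip.is_file():
        return None
    result = subprocess.run([str(pip), "--version"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None


# "/path/to/DATA_DIR" is a placeholder; nothing is printed if it does not exist.
for name in list_python_envs("/path/to/DATA_DIR"):
    print(name, "->", pip_version("/path/to/DATA_DIR", name))
```

This only reads the versions and changes nothing, so it is safe to run against plugin-managed environments as well.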

 

Andrey Avtomonov
R&D Engineer @ Dataiku
tgb417
Neuron

@Andrey ,

Good to know. With my IT operations hat on, we like to keep utilities up to date to avoid bugs and vulnerabilities.

I’m taking from your comment that this may not be a good idea in this case. That’s why I’m extending this conversation to get a bit of clarity.

Related: I think I’ve seen errors recently when working with older existing code environments in DSS, where pip has thrown an error and specifically called out the need to update pip. I don’t remember whether this error caused the environment build to fail.

So if I am remembering correctly, this would be about making sure that rebuilds run smoothly. I’m not in a place to test at the moment.

Has anyone else seen issues related to the version of pip in DSS?

--Tom
Mahdi_N
Level 1

Hi Andrey,

When running the DATA_DIR/code-envs/python/ENV_NAME/bin/pip install --upgrade pip command, I'm getting an error: -bash : pip : command not found

I searched for the error, and one recommendation was to try pip3. When trying DATA_DIR/code-envs/python/ENV_NAME/bin/pip3 install --upgrade pip3, I get a different error: Could not find a version that satisfies the requirement upgrade (from version : none)

I'm not familiar with Linux, so I hope you can point me in the right direction!

 

 

sergeyd
Dataiker

Hi @Mahdi_N 

Before running the command, check which binaries are present in the code env's bin directory. For a Python 3.x code env, both the pip and pip3 binaries should work:

(base) [centos@localhost ~]$ dss/code-envs/python/py36_test/bin/pip install --upgrade pip
Requirement already satisfied: pip in ./dss/code-envs/python/py36_test/lib/python3.6/site-packages (21.3.1)
(base) [centos@localhost ~]$ dss/code-envs/python/py36_test/bin/pip3 install --upgrade pip
Requirement already satisfied: pip in ./dss/code-envs/python/py36_test/lib/python3.6/site-packages (21.3.1)
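The check for which pip binaries exist can be sketched as below. This is an illustration only: `find_pip_binaries` is a hypothetical helper, and the env path in the usage line reuses the example directory from the console output above.

```python
from pathlib import Path


def find_pip_binaries(env_dir):
    """Return the names of the pip executables present in ENV_DIR/bin."""
    bin_dir = Path(env_dir) / "bin"
    if not bin_dir.is_dir():
        return []
    return sorted(p.name for p in bin_dir.glob("pip*") if p.is_file())


# Example env path taken from the console output above; adjust to your install.
print(find_pip_binaries("dss/code-envs/python/py36_test"))
```

Whichever binary is found, note that in both commands shown above the package being upgraded is `pip`; only the name of the executable changes.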

 

Also, please check the solution to this post, which automatically installs the latest possible pip version for the corresponding Python virtualenv version.

importthepandas
Level 4

Confirmed this solution worked well for us on 9.0.5

importthepandas
Level 4

Following up on and bumping this old topic: we've found that pip 20.* in DSS 9.0.5 takes a very long time and causes issues because of the newer resolver. I know this has been improved in later versions of pip. Is it good practice, then, to go into each env and upgrade, or is there a more universal way to handle this?

sergeyd
Dataiker

Hi @importthepandas 

Take this virtualenv.pyz and use it to replace DSS_INSTALL_DIR/scripts/virtualenv.pyz (you may want to make a backup copy of the original first, just in case).

This was already fixed in the latest 10.0.5 release. 
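The backup-then-replace step can be sketched as follows. This is a hedged illustration, not an official procedure: `replace_virtualenv_pyz` and the `.bak` suffix are assumptions, and only the target path DSS_INSTALL_DIR/scripts/virtualenv.pyz comes from the post.

```python
import shutil
from pathlib import Path


def replace_virtualenv_pyz(install_dir, new_pyz):
    """Back up scripts/virtualenv.pyz once, then copy the new file over it."""
    target = Path(install_dir) / "scripts" / "virtualenv.pyz"
    backup = target.with_name(target.name + ".bak")
    # Keep the first backup so a second run cannot overwrite the original.
    if target.exists() and not backup.exists():
        shutil.copy2(target, backup)
    shutil.copy2(new_pyz, target)
    return backup


# Usage (placeholder paths, shown as a comment so nothing is modified by accident):
# replace_virtualenv_pyz("/path/to/DSS_INSTALL_DIR", "/path/to/downloaded/virtualenv.pyz")
```

To roll back, copy the `.bak` file over virtualenv.pyz again.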

importthepandas
Level 4

rock and roll, thank you as always @sergeyd 
