
Issue to add spacy package in a code environment

Solved!
Chiktika
Level 3

Hello,

I need to use the spacy package with a French model.

Following the documentation, I created a new code environment with Python 3.6 and added the following packages:

spacy
https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-3.0.0a0.tar.gz

The env update then loops and never gets past this step:

Requirement already satisfied: numpy>=1.13.3 in /home/dataiku/dss_data/code-envs/python/PY36_dev/lib/python3.6/site-packages (from pandas<1.1,>=1.0->-r /home/dataiku/dss_data/tmp/pip-requirements-install/req7521612047188513176.txt (line 3)) (1.19.4)

Requirement already satisfied: setuptools in /home/dataiku/dss_data/code-envs/python/PY36_dev/lib/python3.6/site-packages (from spacy->-r /home/dataiku/dss_data/tmp/pip-requirements-install/req7521612047188513176.txt (line 1)) (51.0.0)

 


 

Could someone please help me?

With many thanks.

C.


5 Replies
Andrey
Dataiker
Dataiker

Hi @Chiktika ,

Could you please try running the following from your terminal:

source /home/dataiku/dss_data/code-envs/python/PY36_dev/bin/activate
pip install spacy -v
pip install https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-3.0.0a0.tar.gz -v

 

If that also fails, could you attach the generated log?

 

Regards, 

Andrey Avtomonov
R&D Engineer @ Dataiku
Chiktika
Level 3
Author

Hi @Andrey ,

Thanks for your answer.

The first two commands succeeded; logs are attached: pip_install_spacy.log.zip
For the last one, I get the following error; logs are attached: pip_install_fr_core_new_sm.log.zip

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
spacy 2.3.4 requires catalogue<1.1.0,>=0.0.7, but you have catalogue 2.0.1 which is incompatible.
spacy 2.3.4 requires srsly<1.1.0,>=1.0.2, but you have srsly 2.3.2 which is incompatible.
spacy 2.3.4 requires thinc<7.5.0,>=7.4.1, but you have thinc 8.0.0rc2 which is incompatible.

Successfully installed MarkupSafe-1.1.1 catalogue-2.0.1 click-7.1.2 contextvars-2.4 dataclasses-0.8 fr-core-news-sm-3.0.0a0 immutables-0.14 jinja2-2.11.2 packaging-20.7 pathy-0.3.4 pydantic-1.6.1 pyparsing-2.4.7 pytokenizations-0.7.2 smart-open-3.0.0 spacy-nightly-3.0.0rc2 srsly-2.3.2 thinc-8.0.0rc2 typer-0.3.2 typing-extensions-3.7.4.3
Removed build tracker: '/tmp/pip-req-tracker-gtcnx1tp'
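
For longer logs, the conflict lines can be pulled out programmatically. The following is only a rough sketch: the regular expression is written against the three conflict lines quoted above and is not an official pip format guarantee, and `parse_conflicts` is a hypothetical helper name.

```python
import re

# Pattern inferred from the conflict lines in the pip output above
# (an assumption about the message wording, not a documented pip format).
CONFLICT_RE = re.compile(
    r"^(?P<package>\S+) (?P<version>\S+) requires (?P<requirement>.+?), "
    r"but you have (?P<installed>\S+) (?P<installed_version>\S+) "
    r"which is incompatible\.$"
)


def parse_conflicts(pip_output):
    """Return one dict per 'X requires Y, but you have Z' line found."""
    conflicts = []
    for line in pip_output.splitlines():
        match = CONFLICT_RE.match(line.strip())
        if match:
            conflicts.append(match.groupdict())
    return conflicts


sample = """\
spacy 2.3.4 requires catalogue<1.1.0,>=0.0.7, but you have catalogue 2.0.1 which is incompatible.
spacy 2.3.4 requires srsly<1.1.0,>=1.0.2, but you have srsly 2.3.2 which is incompatible.
spacy 2.3.4 requires thinc<7.5.0,>=7.4.1, but you have thinc 8.0.0rc2 which is incompatible.
"""

for conflict in parse_conflicts(sample):
    print(
        conflict["package"], conflict["version"],
        "wants", conflict["requirement"],
        "but found", conflict["installed"], conflict["installed_version"],
    )
```

Here every conflict comes from spacy 2.3.4, while the packages it complains about were installed together with spacy-nightly-3.0.0rc2.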

 

Do you have any idea?

Thanks

C. 

 

Andrey
Dataiker
Dataiker

It looks like the spacy version that you install with:

pip install spacy

isn't compatible with the language model that you install afterwards. The model requires a release-candidate version of spacy, and it looks like installing

https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-3.0.0a0.tar.gz

also installs the right pre-release version of spacy as a dependency.

 

So the solution could be:

source /home/dataiku/dss_data/code-envs/python/PY36_dev/bin/activate
pip uninstall spacy
pip install https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-3.0.0a0.tar.gz

 

I just tried it locally and it seems to work

 

And as for the DSS part, you could just remove spacy from the additional packages list and update the environment.
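
Once the environment is updated, a quick check from a notebook or recipe that uses this code env could look like the sketch below. It assumes the model is importable under the package name `fr_core_news_sm`; the helper names `missing_modules` and `load_french_model` are made up for illustration and are not part of spacy or DSS.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def load_french_model():
    """Load the French pipeline, failing early with a readable message."""
    missing = missing_modules(["spacy", "fr_core_news_sm"])
    if missing:
        raise RuntimeError(
            "Not installed in this code env: " + ", ".join(missing)
        )
    import spacy  # imported late so the check above runs first

    return spacy.load("fr_core_news_sm")


# Example usage, inside the code env:
#   nlp = load_french_model()
#   doc = nlp("Bonjour tout le monde")
#   print([token.text for token in doc])
```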

Andrey Avtomonov
R&D Engineer @ Dataiku


Chiktika
Level 3
Author

The commands ran without errors.

On the DSS side, I removed 'spacy' and kept only 'https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-...

But updating the env fails with an error:

[2020/12/09-17:18:07.005] [null-out-5618] [INFO] [dku.utils]  - Installing collected packages: idna, six, requests, ipython-genutils, decorator, wcwidth, traitlets, ptyprocess, parso, tornado, pyzmq, python-dateutil, pygments, prompt-toolkit, pickleshare, pexpect, jupyter-core, jedi, backcall, pytz, jupyter-client, ipython, simplegeneric, pandas, ipykernel
[2020/12/09-17:18:07.005] [null-out-5618] [INFO] [dku.utils]  -   Attempting uninstall: idna
[2020/12/09-17:18:07.005] [null-out-5618] [INFO] [dku.utils]  -     Found existing installation: idna 2.10
[2020/12/09-17:18:07.005] [null-out-5618] [INFO] [dku.utils]  -     Uninstalling idna-2.10:
[2020/12/09-17:18:07.005] [null-out-5618] [INFO] [dku.utils]  -       Successfully uninstalled idna-2.10
[2020/12/09-17:18:07.372] [qtp506775047-4619] [DEBUG] [dku.tracing]  - [ct: 0] Start call: /api/futures/get-update [GET] user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:07.373] [qtp506775047-4619] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:07.984] [qtp506775047-5352] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:07.984] [qtp506775047-5352] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:08.546] [qtp506775047-5352] [DEBUG] [dku.tracing]  - [ct: 0] Start call: /api/futures/get-update [GET] user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:08.547] [qtp506775047-5352] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=elodie [futureId=HKdM16OQ]
[2020/12/09-17:18:09.184] [null-err-5620] [INFO] [dku.utils]  - ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: 'METADATA'
[2020/12/09-17:18:09.184] [null-err-5620] [INFO] [dku.utils]  - Consider using the `--user` option or check the permissions.
[2020/12/09-17:18:09.184] [Thread-2498] [INFO] [dku.utils]  - Done waiting for return value,  got 1
[2020/12/09-17:18:09.185] [FT--HKdM16OQ-5616] [ERROR] [dku.code.envs]  - Env update failed
com.dataiku.dip.exceptions.ProcessDiedException: /home/dataiku/dss_data/code-envs/python/PY36_dev/bin/python failed (exit code: 1)
	at com.dataiku.dip.exceptions.ProcessDiedException.getExceptionOnProcessDeath(ProcessDiedException.java:59)
	at com.dataiku.dip.utils.DKUtils$SimpleExceptionExecCompletionHandler.handle(DKUtils.java:1063)
	at com.dataiku.dip.utils.DKUtils$ExecBuilder.exec(DKUtils.java:918)
	at com.dataiku.dip.utils.DKUtils.execAndLogThrowsMirror(DKUtils.java:1244)
	at com.dataiku.dip.code.CodeEnvPackageSystems$PipPackageSystemMeta.install(CodeEnvPackageSystems.java:132)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.updateEnvAccordingToSpec(DesignNodeCodeEnvsService.java:1114)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.access$200(DesignNodeCodeEnvsService.java:91)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1052)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1041)
	at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:36)
	at com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:88)
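
The thread does not establish the cause of the `Permission denied` error. One possible explanation, which is only an assumption here, is that the manual pip commands were run as a different Unix user than the one DSS runs as, leaving files in the env that DSS cannot modify. A small sketch to look for such files follows; `paths_not_owned_by_current_user` is a hypothetical helper, and it should be run as the DSS user.

```python
import os


def paths_not_owned_by_current_user(root):
    """Walk `root` and list entries whose owner differs from the current user (Unix only)."""
    current_uid = os.getuid()
    foreign = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat avoids following (possibly broken) symlinks
            if os.lstat(path).st_uid != current_uid:
                foreign.append(path)
    return foreign


# Example usage, with the env path taken from the log above:
#   for path in paths_not_owned_by_current_user(
#       "/home/dataiku/dss_data/code-envs/python/PY36_dev"
#   ):
#       print(path)
```

An empty result would point away from this explanation.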

 

Chiktika
Level 3
Author

What a mess! I tried to rebuild the env again and everything was broken.

So I decided to start over with a new, clean env, and it works.

I confirm that your advice to add only `https://github.com/explosion/spacy-models/releases/download/fr_core_news_sm-3.0.0a0/fr_core_news_sm-... to the list of packages to install is the right solution.

Many thanks for your help @Andrey 
