
Link to a git repo in plugin requirements.txt


Hello,

I usually load code from a private git repo into project libraries, but now I need to use this code within a plugin.

Is it possible to add the repo link to the plugin's requirements.txt?

I tried writing:

git+ssh://git@github.com:chiktika/dss.git#egg=oncrawl_api

but it raises an error:

[2020/11/09-17:47:13.310] [FT--5jW0vzAe-2121] [INFO] [dip.code-envs.package-systems]  - [ct: 1] Installing from py requirements :
git+https://github.com/chiktika/dss.git#egg=my_lib
[2020/11/09-17:47:13.310] [FT--5jW0vzAe-2121] [INFO] [dip.code-envs.package-systems]  - [ct: 1] Completed requirements :
git+https://github.com/chiktika/dss.git#egg=my_lib
pandas==0.23.4
python-dateutil==2.8.0
pytz==2019.2
requests==2.22.0

[2020/11/09-17:47:13.395] [qtp506775047-1666] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=elodie [futureId=5jW0vzAe]
[2020/11/09-17:47:13.395] [qtp506775047-1666] [DEBUG] [dku.tracing]  - [ct: 1] Done call: /api/futures/get-update [GET] time=1ms user=elodie [futureId=5jW0vzAe]
[2020/11/09-17:47:13.990] [qtp506775047-2118] [DEBUG] [dku.tracing]  - [ct: 1] Start call: /api/futures/get-update [GET] user=elodie [futureId=5jW0vzAe]
[2020/11/09-17:47:13.992] [qtp506775047-2118] [DEBUG] [dku.tracing]  - [ct: 3] Done call: /api/futures/get-update [GET] time=3ms user=elodie [futureId=5jW0vzAe]
[2020/11/09-17:47:14.065] [null-err-2126] [INFO] [dku.utils]  - ERROR: Command errored out with exit status 128: git clone -q https://github.com/chiktika/dss.git /tmp/pip-install-xbi2sq3w/my-lib Check the logs for full command output.
[2020/11/09-17:47:14.065] [null-out-2124] [INFO] [dku.utils]  - Collecting my_lib
[2020/11/09-17:47:14.066] [null-out-2124] [INFO] [dku.utils]  -   Cloning https://github.com/chiktika/dss.git to /tmp/pip-install-xbi2sq3w/my-lib
[2020/11/09-17:47:14.066] [Thread-1082] [INFO] [dku.utils]  - Done waiting for return value,  got 1
[2020/11/09-17:47:14.066] [FT--5jW0vzAe-2121] [ERROR] [dku.code.envs]  - Env update failed
com.dataiku.dip.exceptions.ProcessDiedException: /home/dataiku/dss_data/code-envs/python/plugin_oncrawl-projects_managed/bin/python failed (exit code: 1)
	at com.dataiku.dip.exceptions.ProcessDiedException.getExceptionOnProcessDeath(ProcessDiedException.java:59)
	at com.dataiku.dip.utils.DKUtils$SimpleExceptionExecCompletionHandler.handle(DKUtils.java:1063)
	at com.dataiku.dip.utils.DKUtils$ExecBuilder.exec(DKUtils.java:918)
	at com.dataiku.dip.utils.DKUtils.execAndLogThrowsMirror(DKUtils.java:1244)
	at com.dataiku.dip.code.CodeEnvPackageSystems$PipPackageSystemMeta.install(CodeEnvPackageSystems.java:132)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.updateEnvAccordingToSpec(DesignNodeCodeEnvsService.java:1114)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.access$200(DesignNodeCodeEnvsService.java:91)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1052)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1041)
	at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:36)
	at com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:88)

 

Can someone help me, please?

Many thanks.


@Chiktika are you sure you tried the following in your pip install (git+ssh)?

 

 

git+ssh://git@github.com:chiktika/dss.git#egg=oncrawl_api

 

 

 I only ask because I see the following in your error trace (git+https): 

 

 

git+https://github.com/chiktika/dss.git#egg=my_lib

 

 

I had an earlier question about git over https and was told that the preferred way was to use SSH (as you have above - https://community.dataiku.com/t5/Setup-Configuration/Git-over-HTTPS/m-p/9961).  

Have you successfully connected to your repo from DSS using SSH before? If not, see here: https://doc.dataiku.com/dss/latest/collaboration/git.html#setup

I have not tried what you are asking about, but a quick Google search turned up others noting that you should replace the ":" with "/" in requirements.txt (https://stackoverflow.com/questions/4830856/is-it-possible-to-use-pip-to-install-a-package-from-a-pr...), so possibly try:

 

## replace :<user> with /<user> ##
git+ssh://git@github.com/chiktika/dss.git#egg=oncrawl_api

 

I wish I could say I am confident this will solve your issue, but I am not. If you make sure you can connect to your repo over SSH and then add the last command to your requirements.txt and still have an issue, I'm happy to try to help you sort through it.
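For anyone who wants to generate such a line from the clone URL that GitHub displays, here is a minimal Python sketch of the ":" to "/" rewrite. The function name `to_pip_requirement` and its parameters are hypothetical, not part of pip or DSS. It assumes the usual `ssh://[user@]host[:port]/path` URL form, in which whatever follows the colon is read as a port number; that would explain why `github.com:chiktika` is rejected.

```python
import re

# scp-like form as shown by GitHub, e.g. git@github.com:user/repo.git
_SCP_LIKE = re.compile(r"^(?P<user>[^@/\s]+)@(?P<host>[^:/\s]+):(?P<path>\S+)$")


def to_pip_requirement(clone_url, egg, editable=False):
    """Build a requirements.txt line from a git clone URL (illustrative helper)."""
    url = clone_url.strip()
    if url.startswith("git+"):
        url = url[len("git+"):]
    # Drop any existing "#egg=..." fragment; it is re-added at the end.
    url = url.split("#", 1)[0]

    if url.startswith("ssh://"):
        url = url[len("ssh://"):]
        scheme = "ssh"
    elif "://" in url:
        # https:// and other schemes are passed through unchanged.
        scheme, url = url.split("://", 1)
    else:
        scheme = "ssh"

    if scheme == "ssh":
        match = _SCP_LIKE.match(url)
        # Keep a genuine numeric port (host:2222/...), rewrite host:user/... only.
        if match and not match.group("path").split("/", 1)[0].isdigit():
            url = "{}@{}/{}".format(
                match.group("user"), match.group("host"), match.group("path")
            )

    line = "git+{}://{}#egg={}".format(scheme, url, egg)
    return "-e " + line if editable else line
```

Passing `git@github.com:chiktika/dss.git` with the egg name `oncrawl_api` yields the line suggested above; `editable=True` prepends `-e`.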

@Chiktika if you do figure it out, I'd love to learn how you did it, because that seems like a really useful thing to do. I encourage you to post on the community so others can leverage your hard work!


Hi @Chiktika. It looks like you are doing the right thing, except that you need to add `-e` at the beginning, so the line in requirements.txt should look like:

-e git+ssh://git@github.com:chiktika/dss.git#egg=oncrawl_api

However, there might be another reason for the failure: your repo is apparently private. Does your DSS machine have the credentials to clone the repo?
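One way to check, sketched here in Python and run as the same Unix user that runs DSS: ask git to list the remote refs, which needs the same access as a clone. The helper names `build_check_command` and `can_reach_repo` are hypothetical, and setting `GIT_TERMINAL_PROMPT=0` is an assumption meant to stop git from waiting on an interactive credential prompt.

```python
import os
import subprocess


def build_check_command(repo_url):
    """Command that lists remote refs; it fails if the repo cannot be read."""
    return ["git", "ls-remote", repo_url]


def can_reach_repo(repo_url, timeout_seconds=30):
    """Return (ok, stderr_text) for a read-access check against repo_url."""
    # Assumption: disable git's interactive credential prompt so the call cannot hang on it.
    env = dict(os.environ, GIT_TERMINAL_PROMPT="0")
    try:
        result = subprocess.run(
            build_check_command(repo_url),
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
            env=env,
        )
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # git is not installed, or the remote did not answer in time.
        return False, str(exc)
    return result.returncode == 0, result.stderr.strip()
```

Calling `can_reach_repo("git@github.com:chiktika/dss.git")` on the DSS machine should return `True` as the first element if the SSH key is accepted; the second element carries git's error text otherwise.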

I hope this helps!

I.

Author

Hi @Ignacio_Toledo 

Thanks for your answer.

Yes, I tried adding `-e`, but I still get the same error.


[2020/11/10-09:34:00.859] [null-out-332] [INFO] [dku.utils]  - Obtaining oncrawl_api from git+ssh://****@github.com:chiktika/dss.git#egg=oncrawl_api (from -r /home/dataiku/dss_data/tmp/pip-requirements-install/req13902208873652758081.txt (line 1))
[2020/11/10-09:34:00.860] [null-out-332] [INFO] [dku.utils]  -   Cloning ssh://****@github.com:chiktika/dss.git to /home/dataiku/dss_data/code-envs/python/plugin_oncrawl-projects_managed/src/oncrawl-api
[2020/11/10-09:34:00.860] [null-err-334] [INFO] [dku.utils]  - ERROR: Command errored out with exit status 128: git clone -q 'ssh://****@github.com:chiktika/dss.git' /home/dataiku/dss_data/code-envs/python/plugin_oncrawl-projects_managed/src/oncrawl-api Check the logs for full command output.
[2020/11/10-09:34:00.861] [FT--j21MXpP5-329] [ERROR] [dku.code.envs]  - Env update failed
com.dataiku.dip.exceptions.ProcessDiedException: /home/dataiku/dss_data/code-envs/python/plugin_oncrawl-projects_managed/bin/python failed (exit code: 1)
	at com.dataiku.dip.exceptions.ProcessDiedException.getExceptionOnProcessDeath(ProcessDiedException.java:59)
	at com.dataiku.dip.utils.DKUtils$SimpleExceptionExecCompletionHandler.handle(DKUtils.java:1063)
	at com.dataiku.dip.utils.DKUtils$ExecBuilder.exec(DKUtils.java:918)
	at com.dataiku.dip.utils.DKUtils.execAndLogThrowsMirror(DKUtils.java:1244)
	at com.dataiku.dip.code.CodeEnvPackageSystems$PipPackageSystemMeta.install(CodeEnvPackageSystems.java:132)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.updateEnvAccordingToSpec(DesignNodeCodeEnvsService.java:1114)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService.access$200(DesignNodeCodeEnvsService.java:91)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1052)
	at com.dataiku.dip.code.DesignNodeCodeEnvsService$21.compute(DesignNodeCodeEnvsService.java:1041)
	at com.dataiku.dip.futures.SimpleFutureThread.execute(SimpleFutureThread.java:36)
	at com.dataiku.dip.futures.FutureThreadBase.run(FutureThreadBase.java:88)

 

The DSS machine can access this private repo: its SSH key is registered with the repo, and I'm able to connect to it from project libraries.

 

@Chiktika  Have you tried replacing the ":" with a "/"? I have not done what you are trying to do before, but did see this link: https://stackoverflow.com/questions/4830856/is-it-possible-to-use-pip-to-install-a-package-from-a-pr...

It seems to say that the ":" could be problematic.

The error trace in your original post shows that you did replace it, but that run appears to have used git+https. Have you tried git+ssh with the ":" replaced? With and without the -e flag?

I'm not confident that this will do it, but wanted to help with a possible debugging step if possible.


I'm not an expert in this particular area, but this part of the error message:

ERROR: Command errored out with exit status 128: git clone -q 'ssh://****@github.com:chiktika/dss.git' /home/dataiku/dss_data/code-envs/python/plugin_oncrawl-projects_managed/src/oncrawl-api Check the logs for full command output.

looks like it has to do with a permissions problem. To double-check, you could make your repository public for a while and then change the line to:

-e git+https://github.com/chiktika/dss.git#egg=oncrawl_api

If the error persists, then the problem is elsewhere (for example, in a test I ran, my repo was not configured with a setup.py file, and the command failed).
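To rule that case out quickly, a small Python sketch like the following can be run against a local clone. The function name `find_packaging_file` is hypothetical; accepting pyproject.toml as an alternative to setup.py is my assumption and was not part of my test.

```python
from pathlib import Path

# Files that make a repository installable with pip.
# pyproject.toml is listed as an assumption; the failing test only lacked setup.py.
PACKAGING_FILES = ("setup.py", "pyproject.toml")


def find_packaging_file(repo_root):
    """Return the first packaging file found at the repo root, or None."""
    root = Path(repo_root)
    for name in PACKAGING_FILES:
        if (root / name).is_file():
            return name
    return None
```

If this returns `None` for your clone, pip has nothing to build from, whatever the URL looks like.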

Cheers!

I.

Author

Hi all,

Many thanks for your time.

I confirm that my repo is not configured with a setup.py file.
Give me some time to look into how to do this and I will let you know.

I created a public repo to try: https://github.com/chiktika/test/

Just to let you know that I'm not giving up: I will be out of office until next Monday and will hopefully come back with a solution.

Many thanks again.

Author

Hi @tim-wright , @Ignacio_Toledo ,

Many thanks for your help, I finally did it thanks to your advice!

First, I replaced :<user> with /<user>:

git+ssh://git@github.com/chiktika/dss.git#egg=oncrawl_api

Then I had to create setup.py and readme.md, and not forget an empty __init__.py in each folder.
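For others landing here, a rough Python sketch of that layout follows. It is not my exact repo: the helper name `scaffold_package`, the version number and the `utils` subfolder in the usage note are hypothetical. As far as I understand, `find_packages()` from setuptools only picks up folders that contain an __init__.py, which is why the empty files matter.

```python
from pathlib import Path

# Minimal setup.py content; the version number is a placeholder.
SETUP_PY_TEMPLATE = '''\
from setuptools import setup, find_packages

setup(
    name="{name}",
    version="0.1.0",
    packages=find_packages(),
)
'''


def scaffold_package(repo_root, package_name, subpackages=()):
    """Create setup.py, readme.md and empty __init__.py files if they are missing."""
    root = Path(repo_root)
    package_dir = root / package_name
    folders = [package_dir] + [package_dir / sub for sub in subpackages]
    for folder in folders:
        folder.mkdir(parents=True, exist_ok=True)
        init_file = folder / "__init__.py"
        if not init_file.exists():
            init_file.touch()  # an empty file is enough to mark a package

    setup_file = root / "setup.py"
    if not setup_file.exists():
        setup_file.write_text(SETUP_PY_TEMPLATE.format(name=package_name))

    readme = root / "readme.md"
    if not readme.exists():
        readme.write_text("# {}\n".format(package_name))

    # Return the resulting files, relative to the repo root, for a quick review.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

For example, `scaffold_package(".", "oncrawl_api", ["utils"])` run at the root of the clone leaves existing files untouched and only adds what is missing.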

And it works 🙂

With all my thanks!

C.

@Chiktika That's awesome. Would you mind marking your last response (explaining how you solved the issue) as the "accepted answer", so that others can find it more easily without having to read through my and @Ignacio_Toledo's long-winded responses? 😉

Cool! Thanks for the update, and happy to know it worked!
