Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException

Metal_Horse
Level 1

While going through the tutorials, I get the following error when training the model:




[2018/06/17-18:42:07.394] [MRT-164] [ERROR] [dku.analysis.ml.python] - Processing failed
com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException: Subprocess failed to connect, it probably crashed at startup. Check the logs.
at com.dataiku.dip.io.SocketBlockLink.waitForConnection(SocketBlockLink.java:82)
at com.dataiku.dip.io.SecretProtectedKernelLink.waitForProcess(SecretProtectedKernelLink.java:38)
at com.dataiku.dip.io.PythonSecretProtectedKernel.start(PythonSecretProtectedKernel.java:84)
at com.dataiku.dip.analysis.ml.shared.PRNSTrainThread.run(PRNSTrainThread.java:87)
Caused by: java.net.SocketException: Socket closed
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
at java.net.ServerSocket.implAccept(ServerSocket.java:545)
at java.net.ServerSocket.accept(ServerSocket.java:513)
at com.dataiku.dip.io.SocketBlockLink.waitForConnection(SocketBlockLink.java:78)
... 3 more

Accepted Solutions
AdrienL Dataiker
Dataiker
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException

A bit above in the logs, you have a "Missing required dependencies ['numpy']", which means that the DSS builtin python environment has an issue.



Possibly a package was installed that changed a core dependency. The recommended way to install packages is to use a code environment; see Installing Python packages for more details.



To fix your current issue, try rebuilding DSS' builtin python environment.
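
For anyone who wants to locate that message quickly, here is a minimal Python sketch that scans log text for the "Missing required dependencies [...]" line quoted above and lists the package names it mentions. The function name `missing_dependencies` and the regular expression are illustrative assumptions, not part of DSS; the only thing taken from this thread is the wording of the log message.

```python
import ast
import re

# Matches the message quoted in this thread, e.g.
#   Missing required dependencies ['numpy']
# The bracketed part is captured so it can be parsed as a Python list literal.
PATTERN = re.compile(r"Missing required dependencies (\[[^\]]*\])")


def missing_dependencies(log_text):
    """Return the package names named in such messages, in order, without repeats."""
    found = []
    for match in PATTERN.finditer(log_text):
        for name in ast.literal_eval(match.group(1)):
            if name not in found:
                found.append(name)
    return found
```

You could paste the training log into a string, or read it from a file, and pass it to `missing_dependencies`; an empty list means the message does not appear in that text.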

Metal_Horse
Level 1
Author
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
Thank you so much! So the problem was missing packages. Whenever I face the same problem, I need to go to Settings --> Administration --> Code Env --> Packages to Install and add all the required packages.
Thank you once again.
AdrienL Dataiker
Dataiker
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
You're very welcome 🙂
salvatore
Level 1
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
Hello,
I am facing the same issue on Arch Linux and can't resolve this "numpy" dependency error. I tried building the environment in Settings > Administration > Code Env. numpy is present both in my system installation and in the DSS code env, but for some reason training my algorithms fails and the log shows: xxx/DATA_DIR/bin/python: Missing required dependencies ['numpy']

For information, my Code Env page was empty before I created a Python code env manually. I don't know whether that is normal.

Any idea how I could solve this issue to make DSS run on Arch Linux?

Thanks for your support,
Salvatore
AdrienL Dataiker
Dataiker
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
Have you tried the solution mentioned above about rebuilding DSS's **builtin** python environment?
Code Envs are specific environments that you can use in your Python/R recipes (in Advanced) and ML trainings (under "Python environment" in the Design part of the ML task), when you want to use specific packages.
By default and if you don't need to use additional packages, the builtin environment is the one that gets used. If it is lacking numpy, it means it's broken, and you should rebuild it.
If you need to use additional packages, then you want to create a Code Env and use that code env in your ML task or recipe.
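
To tell which interpreter is actually lacking numpy, one option is to ask each interpreter directly. The sketch below is only an illustration: the helper name `can_import` is made up, and the idea of pointing it at the `DATA_DIR/bin/python` path comes from the log line quoted earlier in this thread. It assumes that path accepts `-c` like a regular Python interpreter, and it needs Python 3.5+ to run the check itself.

```python
import subprocess
import sys


def can_import(python_path, module):
    """Return True if running `python_path -c "import <module>"` exits with status 0."""
    result = subprocess.run(
        [python_path, "-c", "import " + module],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Pass the interpreter to test as the first argument, for example the
    # bin/python under your own data directory (placeholder, adjust to your install).
    # Without an argument, the interpreter running this script is tested.
    interpreter = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    print("numpy importable:", can_import(interpreter, "numpy"))
```

If the builtin interpreter fails this check while the interpreter of a code env passes it, that matches the situation described above: the code env is fine and the builtin environment is the one that needs rebuilding.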
salvatore
Level 1
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
Hi,

thanks for the reply. Here is how I solved my issue:
1) in administration, create an environment as explained above
2) install the following additional packages in the custom env:
scipy
sklearn
statsmodels
xgboost
jinja2
3) in your model, select the created environment
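
As a quick sanity check after these steps, something like the following sketch can be run from Python code that uses the new environment (a notebook or recipe, for instance) to see which of the listed packages are still not importable. The helper name `missing_modules` is hypothetical; the module names are the ones listed in step 2, plus numpy from the original error.

```python
import importlib.util

# Import names of the packages listed above, plus numpy from the original error.
REQUIRED = ["numpy", "scipy", "sklearn", "statsmodels", "xgboost", "jinja2"]


def missing_modules(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Not importable in this environment:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

This only checks that the modules can be located, not that their versions are compatible with DSS.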

thanks again for the help, the journey with dss can continue 😉
Salvatore
AdrienL Dataiker
Dataiker
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
Yes, that also works. You will need to use this environment with all your Python and ML recipes, though.
I would still advise fixing the builtin DSS python environment as well 🙂
salvatore
Level 1
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
thanks Adrien.

I didn't understand what you mean, or what I can do to "fix the built-in environment". It refuses to run because of the missing numpy, and I have no idea how to fix numpy without creating my own environment. As a reminder, with a fresh install of DSS there is no environment in the admin tab, so there is nothing to rebuild.

I am missing something 😉

Salvatore
AdrienL Dataiker
Dataiker
Re: Train Model Error - com.dataiku.dip.io.SocketBlockLink$SecretKernelTimeoutException
The Code Env tab only shows the additional environments created by the DSS administrator/user.
On a fresh DSS install, there is also the built-in python environment, which is used by default on all things running on Python.
To fix it, check the link that I referenced in my first answer above, i.e.
https://doc.dataiku.com/dss/latest/installation/python.html#rebuilding-the-builtin-python-environment