NLP preparation plugin issue

saurabh
Level 3

Hi All,

I am trying to install the Text Preparation plugin and get the error below while it downloads the package sudachidict-core>=20200330.

Error log:

ERROR: Command errored out with exit status 1:
command: /datadir/dataiku/DATA_DSSUSER/code-envs/python/plugin_nlp-preparation_managed/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-8ayrifsm/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-8ayrifsm/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-aqr3pvyn
cwd: /tmp/pip-req-build-8ayrifsm/
Complete output (43 lines):
Downloading the Sudachi dictionary (It may take a while) ...
Traceback (most recent call last):
File "/usr/lib64/python3.6/urllib/request.py", line 1349, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/usr/lib64/python3.6/http/client.py", line 1254, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1300, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/usr/lib64/python3.6/http/client.py", line 974, in send
self.connect()
File "/usr/lib64/python3.6/http/client.py", line 946, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/lib64/python3.6/socket.py", line 724, in create_connection
raise err
File "/usr/lib64/python3.6/socket.py", line 713, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-8ayrifsm/setup.py", line 44, in <module>
_, _msg = urlretrieve(ZIP_URL, ZIP_NAME)
File "/usr/lib64/python3.6/urllib/request.py", line 248, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 1377, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/lib64/python3.6/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
----------------------------------------
WARNING: Discarding file:///datadir/dataiku/DATA_DSSUSER/SudachiDict-core-20220729.tar.gz. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

 

Could someone please suggest another way to download the package or complete the plugin installation?


Operating system used: Linux

AlexT
Dataiker

Hi,

Installing sudachidict-core downloads the Sudachi dictionary from an S3-hosted site, so your network team will need to allow outbound access to it.

The "Connection timed out" error suggests this access is currently blocked.

The URL will be something like: http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/
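
To confirm the block before going to your network team, you could run a quick check from the DSS host. This is only a minimal sketch: the function name `can_reach` is made up for illustration, and it tests a direct TCP connection, so the result may differ if your host reaches the internet through an HTTP proxy.

```python
import socket


def can_reach(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False
```

Calling `can_reach("sudachi.s3-website-ap-northeast-1.amazonaws.com")` with the Python interpreter on the DSS machine should return False if the connection is blocked in the same way as in the traceback.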

If allowing this is not possible or desirable, you may be able to work around it, provided you don't need Japanese support. Convert the plugin to a dev plugin and, in nlp-preparation/code-env/python/spec/requirements.txt, change the line
spacy[lookups,ja,th]
to
spacy[lookups,th]
and then try to reinstall the plugin code env.
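
If you prefer to script the change rather than edit the file by hand, something along these lines could work. It is a rough sketch, not part of the plugin: `drop_extra` and `drop_extra_in_text` are hypothetical helper names, and it assumes each requirement sits on its own line in the form `package[extra1,extra2]` optionally followed by a version specifier.

```python
import re

# Matches "name[extras]rest", e.g. "spacy[lookups,ja,th]==3.0.5".
_REQ = re.compile(r"^([A-Za-z0-9_.-]+)\[([^\]]*)\](.*)$")


def drop_extra(line: str, package: str, extra: str) -> str:
    """Remove `extra` from the extras list of `package` on one requirement line."""
    match = _REQ.match(line.strip())
    if match is None or match.group(1).lower() != package.lower():
        return line  # Not the package we care about, or no extras: leave as-is.
    name, extras, rest = match.groups()
    kept = [e.strip() for e in extras.split(",") if e.strip() and e.strip() != extra]
    bracket = "[" + ",".join(kept) + "]" if kept else ""
    return name + bracket + rest


def drop_extra_in_text(text: str, package: str, extra: str) -> str:
    """Apply drop_extra to every line of a requirements file's contents."""
    return "\n".join(drop_extra(line, package, extra) for line in text.splitlines())
```

For the case above you would read requirements.txt, pass its contents through `drop_extra_in_text(text, "spacy", "ja")`, and write the result back before rebuilding the code env.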

Thanks,