NLP preparation plugin issue

SAURABH Partner, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Core Concepts, Dataiku DSS Adv Designer, Registered Posts: 26 Partner

Hi All,

I am trying to install the Text Preparation plugin and am getting an error while the package sudachidict-core>=20200330 is being downloaded.

Error log:

ERROR: Command errored out with exit status 1:
command: /datadir/dataiku/DATA_DSSUSER/code-envs/python/plugin_nlp-preparation_managed/bin/python -c 'import io, os, sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-req-build-8ayrifsm/setup.py'"'"'; __file__='"'"'/tmp/pip-req-build-8ayrifsm/setup.py'"'"';f = getattr(tokenize, '"'"'open'"'"', open)(__file__) if os.path.exists(__file__) else io.StringIO('"'"'from setuptools import setup; setup()'"'"');code = f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-aqr3pvyn
cwd: /tmp/pip-req-build-8ayrifsm/
Complete output (43 lines):
Downloading the Sudachi dictionary (It may take a while) ...
Traceback (most recent call last):
File "/usr/lib64/python3.6/urllib/request.py", line 1349, in do_open
encode_chunked=req.has_header('Transfer-encoding'))
File "/usr/lib64/python3.6/http/client.py", line 1254, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1300, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1249, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib64/python3.6/http/client.py", line 1036, in _send_output
self.send(msg)
File "/usr/lib64/python3.6/http/client.py", line 974, in send
self.connect()
File "/usr/lib64/python3.6/http/client.py", line 946, in connect
(self.host,self.port), self.timeout, self.source_address)
File "/usr/lib64/python3.6/socket.py", line 724, in create_connection
raise err
File "/usr/lib64/python3.6/socket.py", line 713, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/tmp/pip-req-build-8ayrifsm/setup.py", line 44, in <module>
_, _msg = urlretrieve(ZIP_URL, ZIP_NAME)
File "/usr/lib64/python3.6/urllib/request.py", line 248, in urlretrieve
with contextlib.closing(urlopen(url, data)) as fp:
File "/usr/lib64/python3.6/urllib/request.py", line 223, in urlopen
return opener.open(url, data, timeout)
File "/usr/lib64/python3.6/urllib/request.py", line 526, in open
response = self._open(req, data)
File "/usr/lib64/python3.6/urllib/request.py", line 544, in _open
'_open', req)
File "/usr/lib64/python3.6/urllib/request.py", line 504, in _call_chain
result = func(*args)
File "/usr/lib64/python3.6/urllib/request.py", line 1377, in http_open
return self.do_open(http.client.HTTPConnection, req)
File "/usr/lib64/python3.6/urllib/request.py", line 1351, in do_open
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 110] Connection timed out>
----------------------------------------
WARNING: Discarding file:///datadir/dataiku/DATA_DSSUSER/SudachiDict-core-20220729.tar.gz. Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.
ERROR: Command errored out with exit status 1: python setup.py egg_info Check the logs for full command output.

Can someone please suggest another way to download the packages or to complete this plugin installation?


Operating system used: Linux

Answers

  • Alexandru
    Alexandru Dataiker, Dataiku DSS Core Designer, Dataiku DSS ML Practitioner, Dataiku DSS Adv Designer, Registered Posts: 1,209 Dataiker

    Hi,

    Installing sudachidict-core downloads the Sudachi dictionary from an S3-hosted site, and your network team will need to allow access to it.

    The "Connection timed out" error suggests this access is not currently allowed.

    The URL will be something like: http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/
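
To confirm whether the DSS host can reach that site at all, a small check like the one below may help. It is only a sketch: the function name `can_reach` is hypothetical, and the URL is the approximate one quoted above, so the exact address used by the package's setup.py may differ.

```python
import socket
import urllib.error
import urllib.request

# Approximate URL from the answer above; the real download path may differ.
SUDACHI_URL = "http://sudachi.s3-website-ap-northeast-1.amazonaws.com/sudachidict/"


def can_reach(url, timeout=10):
    """Return True if any HTTP response comes back from url, else False."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered (for example 403 or 404), so the network path is open.
        return True
    except (urllib.error.URLError, socket.timeout, OSError):
        # Timeouts, refused connections and DNS failures all end up here.
        return False
```

Running `can_reach(SUDACHI_URL)` as the DSS user on the DSS server and getting `False` would be consistent with the timeout in the log, although it does not by itself say which firewall or proxy rule is responsible.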

    If allowing this is not possible or desirable, you may be able to work around it. If you do not need Japanese support, you can convert the plugin to a dev plugin and change this line in nlp-preparation/code-env/python/spec/requirements.txt:
    spacy[lookups,ja,th] ->
    spacy[lookups,th]
    Then try to reinstall the plugin code env.
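
The requirements change above is a one-line manual edit, but for illustration here is a minimal sketch of the same transformation in Python. The helper `drop_extra` is hypothetical and not part of the plugin or of DSS; it assumes the line has the form `package[extra1,extra2,...]`, optionally followed by a version specifier, with no trailing newline.

```python
import re


def drop_extra(requirement_line, extra="ja"):
    """Remove one extra from a 'pkg[a,b,c]<specifier>' requirement line."""
    match = re.match(r"^(\s*[A-Za-z0-9_.-]+)\[([^\]]*)\](.*)$", requirement_line)
    if not match:
        # No extras on this line, so leave it untouched.
        return requirement_line
    name, extras, rest = match.groups()
    kept = [e.strip() for e in extras.split(",") if e.strip() and e.strip() != extra]
    # Drop the brackets entirely if no extras remain.
    bracket = "[" + ",".join(kept) + "]" if kept else ""
    return name + bracket + rest


print(drop_extra("spacy[lookups,ja,th]"))
```

Without the `ja` extra, pip should no longer pull in the Japanese tokenizer dependencies, which is where the sudachidict-core download comes from.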

    Thanks,
