Hi Dataiku community,
I am testing the plugin 'Text Summarizations' and successfully built the code environment on a machine with no internet access. However, I am getting an NLTK tokenizers error. I manually uploaded the NLTK resources using the Resources tab, but the same error still appears. Can anyone advise?
Operating system used: Linux
Hi @victor_toh,
Did you set the NLTK_HOME environment variable for the resources directory?
The environment variable can be set by updating the code environment. You might need to comment out the last 2 lines like I did so that it doesn't try to connect to the internet.
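If it helps to confirm what the running process actually sees, here is a minimal sketch using only the Python standard library. It reports whether each variable is set and whether it points at an existing directory. The function name `describe_nltk_env` is hypothetical. `NLTK_HOME` is the name used in this thread, while `NLTK_DATA` is the variable NLTK itself consults when building its search path, so the sketch looks at both.

```python
import os
from pathlib import Path


def describe_nltk_env(env=os.environ, names=("NLTK_HOME", "NLTK_DATA")):
    """Report each variable's value and whether it points at a directory.

    `describe_nltk_env` is a hypothetical helper, not part of Dataiku or NLTK.
    """
    report = {}
    for name in names:
        value = env.get(name)
        report[name] = {
            "value": value,
            # False when the variable is unset, empty, or not a directory
            "is_dir": bool(value) and Path(value).is_dir(),
        }
    return report


if __name__ == "__main__":
    for name, info in describe_nltk_env().items():
        print(name, info)
```

Running this from a notebook or recipe that uses the code environment should show whether the variable survived the code env update.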
Thanks,
Zach
Hi Zach,
Yes, I have set the environment variable and updated the code env, but the error persists.
Could you please check the file structure of the resources directory and make sure that it's set up correctly? It should look like this:
Resources directory/
└── nltk_data
    └── tokenizers
        ├── punkt
        │   ├── PY3
        │   │   ├── README
        │   │   ├── czech.pickle
        │   │   ├── danish.pickle
        │   │   ├── dutch.pickle
        │   │   ├── english.pickle
        │   │   ├── estonian.pickle
        │   │   ├── finnish.pickle
        │   │   ├── french.pickle
        │   │   ├── german.pickle
        │   │   ├── greek.pickle
        │   │   ├── italian.pickle
        │   │   ├── malayalam.pickle
        │   │   ├── norwegian.pickle
        │   │   ├── polish.pickle
        │   │   ├── portuguese.pickle
        │   │   ├── russian.pickle
        │   │   ├── slovene.pickle
        │   │   ├── spanish.pickle
        │   │   ├── swedish.pickle
        │   │   └── turkish.pickle
        │   ├── README
        │   ├── czech.pickle
        │   ├── danish.pickle
        │   ├── dutch.pickle
        │   ├── english.pickle
        │   ├── estonian.pickle
        │   ├── finnish.pickle
        │   ├── french.pickle
        │   ├── german.pickle
        │   ├── greek.pickle
        │   ├── italian.pickle
        │   ├── malayalam.pickle
        │   ├── norwegian.pickle
        │   ├── polish.pickle
        │   ├── portuguese.pickle
        │   ├── russian.pickle
        │   ├── slovene.pickle
        │   ├── spanish.pickle
        │   ├── swedish.pickle
        │   └── turkish.pickle
        └── punkt.zip
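The layout above can also be checked programmatically. The following is a minimal sketch, assuming the resources directory is reachable from Python as a normal filesystem path. The helper name `check_punkt_layout` and its `language` parameter are hypothetical, and it only checks the two pickle files for one language rather than the whole tree.

```python
from pathlib import Path


def check_punkt_layout(resources_dir, language="english"):
    """Return a list of problems; an empty list means the checked files exist.

    Hypothetical helper: it only looks for <language>.pickle in
    nltk_data/tokenizers/punkt and in its PY3 subfolder.
    """
    root = Path(resources_dir)
    punkt = root / "nltk_data" / "tokenizers" / "punkt"
    expected = [
        punkt / f"{language}.pickle",
        punkt / "PY3" / f"{language}.pickle",
    ]
    return [
        f"missing: {path.relative_to(root)}"
        for path in expected
        if not path.is_file()
    ]
```

Calling `check_punkt_layout("/path/to/resources")` with your own path prints nothing by itself; an empty returned list means both files were found where the tree above expects them.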
Also, are you building the code environment for containerized execution, or is it just running locally?
Thanks,
Zach
Hi Zach,
The folder structure seems to be correct.
My settings are as shown below. Are they correct?
Edit: Solved this issue.
The cause was that the nltk folder had been created twice.
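For anyone who hits the same thing, here is a minimal sketch for spotting a doubled folder. It assumes "double folder" means a directory nested directly inside a directory of the same name (for example `nltk_data/nltk_data`); the thread does not spell out the exact layout, so treat that as a guess. The function name `find_doubled_dirs` is hypothetical.

```python
import os


def find_doubled_dirs(root):
    """Return paths of directories that sit directly inside a same-named directory.

    Hypothetical helper; assumes the doubling looks like nltk_data/nltk_data.
    """
    doubled = []
    for current, dirnames, _ in os.walk(root):
        parent_name = os.path.basename(os.path.normpath(current))
        for name in dirnames:
            if name == parent_name:
                doubled.append(os.path.join(current, name))
    return doubled
```

An empty list means no same-named nesting was found under `root`. Any path it returns is a candidate for flattening so that the tokenizers sit one level higher.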