I have a Python script that uses this module for topic modeling: https://radimrehurek.com/gensim/models/wrappers/ldamallet.html. I would like to integrate the script into my flow in Dataiku, but I can't find the right path to pass as an argument to the gensim.models.wrappers.LdaMallet function. I have put all the files downloaded for LDA Mallet into a folder called "LDA_model" in my Dataiku project, and I try to access the executable via the path 'LDA_model/mallet_folder/bin/mallet'.
When I pass this path to the gensim.models.wrappers.LdaMallet function I get the following error:
CalledProcessError: Command '/opt/dataiku/data/managed_folders/TWEETS/6WUpy7CI/mallet import-file --preserve-case --keep-sequence --remove-stopwords --token-regex "\S+" --input /tmp/d941ed_corpus.txt --output /tmp/d941ed_corpus.mallet' returned non-zero exit status 127.
Does anyone know how to fix this problem?
EDIT: I have imported the necessary files and the paths exist. I am assuming that my problem is linked to setting up environment variables.
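For anyone debugging the same error: shells conventionally return exit status 127 when the command itself cannot be found, and the command in the traceback ends in '.../6WUpy7CI/mallet' rather than '.../mallet_folder/bin/mallet'. A minimal sketch for checking what the Python process actually sees at a given path is below. The helper name check_mallet_binary and the example path are illustrative assumptions, not part of gensim or Dataiku, and this only diagnoses the path; it does not confirm the root cause.

```python
import os


def check_mallet_binary(mallet_path):
    """Report whether mallet_path points to a file the current user can execute.

    Hypothetical helper for diagnosis only; not part of gensim or Dataiku.
    """
    is_file = os.path.isfile(mallet_path)
    return {
        "path": mallet_path,
        "is_file": is_file,
        # os.access with X_OK checks the execute permission for the current user
        "is_executable": is_file and os.access(mallet_path, os.X_OK),
    }


if __name__ == "__main__":
    # Assumed example path, taken from the question; replace with the
    # absolute path that is actually passed to LdaMallet.
    print(check_mallet_binary("LDA_model/mallet_folder/bin/mallet"))
```

If "is_file" is False, the path passed to LdaMallet does not resolve to the Mallet launcher from the script's working directory. If "is_file" is True but "is_executable" is False, the execute permission may have been lost when the files were uploaded.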
Thank you for your message! Yes, this is what I have done to access the path of the folder where my model is stored. However, it is when I pass this path as an argument to my function that I get the CalledProcessError message.
I have tested this model in a regular Jupyter notebook, and it works when I use an OS path to access the model saved on my computer.
In that case, we will need a job diagnosis to understand the root cause of the error.
Could you please follow the steps described on this page and then send us the resulting ZIP file?