How to load a pre-trained model into a code env (Resources Directory) on an instance without internet access

MatthieuPx Registered Posts: 6 ✭✭✭

Hi.

I would like to use some pre-trained models (for example, an embeddings model) within my project. The DSS instance I am working on cannot access the Internet. Still, I was able to retrieve the models at some point, and now I want to re-use them.

  1. I was also able to upload the model to a managed folder and use it in a code recipe (see https://developer.dataiku.com/latest/concepts-and-examples/managed-folders.html#load-a-model-from-a-remote-managed-folder and the code below). It works. However, I understand that the proper way to use these models (and share them with other users) is to include them in a code env, in the Resources Directory.
  2. Retrieving pre-trained models from Hugging Face (see Load and re-use a Hugging Face model - Dataiku Developer Guide) or PyTorch Hub is not an option (no internet access).
  3. So I uploaded the model into a code env's Resources Directory and tried to modify the initialization script (written for Hugging Face) so that it would work with a model loaded from the Resources Directory. I don't really understand this HF version of the initialization script, though, and as expected my modification didn't work. This is where I need help :-)
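
To make step 3 concrete, here is a minimal sketch of the recipe-side loading I have in mind. It is not a confirmed Dataiku method. It assumes the initialization script exports an environment variable pointing at the model folder inside the Resources Directory (I believe the HF example does this with `set_env_path` from `dataiku.code_env_resources`). The variable name `MY_MODEL_DIR` is hypothetical, and the model is assumed to be stored in Hugging Face `save_pretrained` format.

```python
import os
from pathlib import Path

# Hypothetical variable name: assumed to be exported by the code env's
# initialization script and to point inside the Resources Directory.
MODEL_DIR_ENV_VAR = "MY_MODEL_DIR"


def resolve_model_dir(env_var: str = MODEL_DIR_ENV_VAR) -> Path:
    """Return the local model directory named by env_var, or fail loudly."""
    raw = os.environ.get(env_var)
    if not raw:
        raise RuntimeError(
            f"{env_var} is not set; check the code env initialization script."
        )
    model_dir = Path(raw)
    if not model_dir.is_dir():
        raise FileNotFoundError(f"{env_var} points to a missing folder: {model_dir}")
    return model_dir


def load_local_model(model_dir: Path):
    """Load a Hugging Face model strictly from disk, with no network access."""
    # Imported lazily so the path check above works without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    # local_files_only=True prevents any attempt to reach the Hugging Face Hub.
    tokenizer = AutoTokenizer.from_pretrained(str(model_dir), local_files_only=True)
    model = AutoModel.from_pretrained(str(model_dir), local_files_only=True)
    return tokenizer, model
```

In a recipe this would be called as `tokenizer, model = load_local_model(resolve_model_dir())`. Separating the path check from the load should make it easier to tell a misconfigured Resources Directory from a model-format problem.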

Here is what I tried:

And the error message I got.

Thank you in advance for your help.

Comments
