Offline Huggingface Set-up

To use Huggingface offline you will need to first download the models on a machine connected to the internet and then copy them across. The easiest way to do this is to run this service on an internet-connected machine and let it download all of the models for you. Note that you do not need the rest of Aiimi Insight Engine running to do this, just the Python REST Service.

Information about how the Huggingface cache works can be found here:
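As a quick sanity check before copying anything, you can print where Huggingface will cache downloaded models. This is a sketch that mirrors the default cache resolution (base directory `~/.cache/huggingface`, overridable with the `HF_HOME` environment variable); the exact layout may vary between library versions.

```python
import os

# Hugging Face caches downloaded models under ~/.cache/huggingface by default.
# If HF_HOME is set, it overrides the base directory; model files live in "hub".
hf_home = os.environ.get(
    "HF_HOME",
    os.path.join(os.path.expanduser("~"), ".cache", "huggingface"),
)
cache_dir = os.path.join(hf_home, "hub")
print(cache_dir)
```

Running this on both the internet-connected machine and the target server shows you the source and destination folders for the copy.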

Steps:

  1. Get the Python REST Service set up on an internet connected machine.

  2. Enable the huggingfacener step in the config/endpoints.json file.

  3. Run the service with run.bat – you should see log messages indicating that it is downloading models from the internet. These are large and may take a few minutes.

  4. Find the models – by default on Windows the models can be found here:

  5. Zip them up and move them to the same place on the target server. Make sure you place them in the correct user's .cache folder – you are probably running the service as a different user on the target system.

  6. On the target system, set an environment variable:

    1. TRANSFORMERS_OFFLINE=1

  7. Activate the venv for the Python REST Service on the target server.

  8. Navigate to the endpoints subfolder.

  9. Run: python huggingfacener.py

  10. You should see the following, which shows that the model was loaded offline and used successfully:
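The offline steps above can be sketched as follows. Setting TRANSFORMERS_OFFLINE before the transformers library is imported forces it to resolve models from the local cache only; the model name shown is a hypothetical example, as the actual model used by the huggingfacener endpoint may differ.

```python
import os

# Must be set before transformers is imported, otherwise it may still
# attempt to reach the internet when resolving a model.
os.environ["TRANSFORMERS_OFFLINE"] = "1"

# With the cache copied across, loading should now succeed with no network.
# (Hypothetical model name for illustration only.)
# from transformers import pipeline
# ner = pipeline("ner", model="dslim/bert-base-NER")
# print(ner("Aiimi is based in Milton Keynes."))
```

In practice you would set the variable at the system or service level (as in step 6) rather than in code, so that huggingfacener.py picks it up automatically.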