Llama 3.1 models on Hugging Face are gated

This error is expected — Llama 3.1 models on Hugging Face are gated, meaning you must request and be granted access before using them.

🔴 Why you’re getting this error

  • The model meta-llama/Llama-3.1-8B-Instruct is restricted by Meta
  • Your Hugging Face account does not yet have permission
  • Hence the 403 Forbidden (gated repo) error

✅ How to fix it

1. Request access

Go to:
👉 https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct

  • Click “Request Access”
  • Fill the form (name, organization, use-case)
  • Accept Meta’s license terms

2. Wait for approval

  • Usually takes a few minutes to a few hours (sometimes longer)

3. Login via CLI

After approval, run:

huggingface-cli login

Paste your Hugging Face token:
👉 https://huggingface.co/settings/tokens


4. Use in code (example)

from transformers import AutoTokenizer, AutoModelForCausalLMmodel_id = "meta-llama/Llama-3.1-8B-Instruct"tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

⚠️ Common mistakes

  • Not logged in via CLI
  • Using wrong HF token
  • Access not yet approved
  • Running on server without login (hf_token missing)

💡 Quick workaround (if you need immediate use)

Use open models that don’t require approval:

  • mistralai/Mistral-7B-Instruct
  • google/gemma-2-9b-it
  • tiiuae/falcon-7b-instruct

🚀 Pro tip (production use)

Instead of downloading locally, you can use:

from huggingface_hub import InferenceClient
client = InferenceClient("meta-llama/Llama-3.1-8B-Instruct", token="YOUR_TOKEN")
response = client.text_generation("Hello")
print(response)