baseline-nllb

A baseline clone of facebook/nllb-200-distilled-600M, packaged for Hugging Face Inference Endpoints with a custom handler so callers can pass arbitrary NLLB Flores-200 language codes at request time.

Deploying to Inference Endpoints

  1. Open this repo on the Hub and click Deploy → Inference Endpoints.
  2. Pick a GPU instance (the 600M model runs fine on a small GPU; a CPU instance also works but is slower).
  3. Leave the container type as Default — the Endpoints runtime will auto-detect handler.py and install requirements.txt.
  4. Deploy.
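The runtime instantiates an EndpointHandler class from handler.py and calls it with the decoded JSON payload. The repo's actual handler isn't reproduced here, but the request-normalization half might look something like the sketch below (function name and structure are illustrative assumptions, not this repo's code; only the defaults match the documented behavior):

```python
# Sketch of how a handler like this repo's might normalize incoming payloads.
# normalize_request and its return shape are illustrative, not the actual code.

def normalize_request(data):
    """Split an Endpoints payload into (texts, src_lang, tgt_lang, gen_kwargs)."""
    inputs = data.get("inputs", "")
    # Accept either a single string or a list of strings.
    texts = [inputs] if isinstance(inputs, str) else list(inputs)
    params = dict(data.get("parameters") or {})
    # Defaults documented in the "Request format" section.
    src_lang = params.pop("src_lang", "eng_Latn")
    tgt_lang = params.pop("tgt_lang", "spa_Latn")
    # Whatever remains (max_length, num_beams, ...) is passed to generation.
    return texts, src_lang, tgt_lang, params
```

From there the handler would tokenize with src_lang set on the NLLB tokenizer, generate with tgt_lang forced as the first decoded token, and return one object per input text.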

Request format

{
  "inputs": "Hello, world!",
  "parameters": {
    "src_lang": "eng_Latn",
    "tgt_lang": "spa_Latn",
    "max_length": 256,
    "num_beams": 4
  }
}

inputs may be a single string or a list of strings. src_lang / tgt_lang use the Flores-200 codes (e.g. eng_Latn, spa_Latn, fra_Latn, zho_Hans, arb_Arab). If omitted, the handler defaults to eng_Latn → spa_Latn.
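All Flores-200 codes share the same lang_Script shape, so a client can cheaply catch typos before making a request. The regex below is an assumption on my part: it validates the shape only, not membership in the actual Flores-200 set, so a well-formed but nonexistent code would still pass.

```python
import re

# Flores-200 codes look like "eng_Latn": a 3-letter lowercase language code,
# an underscore, then a 4-letter script code with an initial capital.
# This checks shape only, NOT membership in the real Flores-200 list.
FLORES_SHAPE = re.compile(r"^[a-z]{3}_[A-Z][a-z]{3}$")

def looks_like_flores_code(code):
    return bool(FLORES_SHAPE.match(code))
```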

Response

[{ "translation_text": "¡Hola, mundo!" }]
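The endpoint returns one object per input, in order, so a client can recover plain strings with a one-line helper (assuming the response shape shown above; the helper name is illustrative):

```python
def extract_translations(response_json):
    """Map the endpoint's [{"translation_text": ...}, ...] response to strings."""
    return [item["translation_text"] for item in response_json]
```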

Example clients

cURL

curl https://<your-endpoint>.endpoints.huggingface.cloud \
  -H "Authorization: Bearer $HF_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
        "inputs": "Hello, world!",
        "parameters": { "src_lang": "eng_Latn", "tgt_lang": "fra_Latn" }
      }'

Python

import os

import requests

HF_TOKEN = os.environ["HF_TOKEN"]  # token with access to the endpoint

resp = requests.post(
    "https://<your-endpoint>.endpoints.huggingface.cloud",
    headers={"Authorization": f"Bearer {HF_TOKEN}"},
    json={
        "inputs": ["Hello, world!", "How are you?"],
        "parameters": {"src_lang": "eng_Latn", "tgt_lang": "deu_Latn"},
    },
    timeout=30,
)
resp.raise_for_status()
print(resp.json())

Files in this repo

handler.py: Custom EndpointHandler used by HF Inference Endpoints.
requirements.txt: Extra Python deps installed into the endpoint container.
model_loader.py: One-off script that pushed the base NLLB weights to this repo.
config.json, tokenizer*, *.safetensors: Model and tokenizer artifacts (pushed by model_loader.py).
TROUBLESHOOTING.md: Real deploy failures we hit and how we fixed them; read this first if the endpoint won't start.

License

Inherits CC-BY-NC-4.0 from the upstream facebook/nllb-200-distilled-600M model — non-commercial use only.
