NagaLLaMA-3.2-3B-Instruct

NagaLLaMA-3.2-3B-Instruct is a Low-Rank Adapter (LoRA) fine-tune of the Llama-3.2-3B-Instruct model, designed to understand and generate text in Nagamese (Naga Pidgin/Creole).

This model serves as a general-purpose instruction-following assistant for the Nagamese language, capable of answering queries, translating, and holding conversations in the creole widely used as a lingua franca in Nagaland, India.

Model Details

  • Developer: Agniva Maiti
  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Language: Nagamese (nag)
  • Fine-tuning Method: LoRA (Low-Rank Adaptation); a sketch for merging the adapter into the base weights follows this list
  • Precision: fp16
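
Since the release is an adapter rather than full weights, the LoRA deltas can optionally be folded into the fp16 base model to produce a standalone checkpoint. The snippet below is a minimal sketch using peft's merge_and_unload; the output directory name is illustrative.

from peft import PeftModel
from transformers import AutoModelForCausalLM
import torch

# Load the fp16 base model and attach the NagaLLaMA adapter
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct", torch_dtype=torch.float16
)
model = PeftModel.from_pretrained(base, "agnivamaiti/NagaLLaMA-3.2-3B-Instruct")

# Fold the LoRA weights into the base weights and save a standalone model
model = model.merge_and_unload()
model.save_pretrained("nagallama-3.2-3b-merged")  # illustrative output path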

Training Data

The model was trained on the NagaNLP Conversational Corpus, which contains 10,021 Nagamese instruction-following pairs.

Data Splitting: To ensure robust evaluation, the dataset was split as follows:

  • Training: 80% (approx. 8,000 samples)
  • Validation: 10% (used for metric evaluation during training)
  • Test: 10% (held-out for final testing)

This release represents the final model from a data-scaling ablation study, trained on 100% of the available training split.
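
For reference, a split of this shape can be reproduced with the datasets library as sketched below; the data file name and seed are assumptions rather than the published loader.

from datasets import load_dataset

# Hypothetical local export of the NagaNLP Conversational Corpus
corpus = load_dataset("json", data_files="naganlp_corpus.jsonl")["train"]

# 80% train, 20% held out; then split the held-out portion into validation and test
split = corpus.train_test_split(test_size=0.2, seed=42)
holdout = split["test"].train_test_split(test_size=0.5, seed=42)

train_set = split["train"]         # ~8,000 samples
validation_set = holdout["train"]  # 10%, used for metric evaluation during training
test_set = holdout["test"]         # 10%, held out for final testing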

Training Hyperparameters

  • Epochs: 3
  • Batch Size: 2 (per device) with 8 gradient accumulation steps (effective batch size of 16 per device)
  • Sequence Length: 512
  • Learning Rate: 2e-4
  • LoRA Config:
    • Rank (r): 16
    • Alpha: 32
    • Dropout: 0.05
    • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj (a configuration sketch follows this list)
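
These values map onto a standard peft fine-tuning setup. The sketch below mirrors the listed hyperparameters but is a reconstruction, not the original training script; the output directory is a placeholder, and the tokenization and trainer wiring are omitted.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments
import torch

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B-Instruct", torch_dtype=torch.float16
)

# LoRA configuration as listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)

# Training hyperparameters as listed above
training_args = TrainingArguments(
    output_dir="nagallama-lora",  # placeholder
    num_train_epochs=3,
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=2e-4,
    fp16=True,
)

# The 512-token sequence length is applied when tokenizing the instruction pairs;
# model and training_args are then passed to a supervised fine-tuning trainer
# such as transformers.Trainer or trl.SFTTrainer.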

Intended Use

This model is intended for:

  • Chatbots and assistants requiring Nagamese language support.
  • Research into low-resource language modeling for creole languages.
  • Translation assistance between English and Nagamese.

How to Use

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load Base Model
base_model_id = "meta-llama/Llama-3.2-3B-Instruct"
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,
    device_map="auto"
)

# Load NagaLLaMA Adapter
adapter_id = "agnivamaiti/NagaLLaMA-3.2-3B-Instruct"
model = PeftModel.from_pretrained(model, adapter_id)
tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Inference
prompt = "Machine Learning ki ase aru kote use hoi?"
# Move inputs to the same device as the model (device_map="auto" may shard it)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.3,
    top_k=15,
    top_p=0.3,
    repetition_penalty=1.2,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.eos_token_id,  # Llama tokenizers ship without a pad token
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
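
Llama-3.2-3B-Instruct was trained with a chat template, so it is often worth formatting the prompt through the tokenizer's chat template as well. Whether the adapter was tuned on raw prompts or templated conversations is not specified here, so treat this as an optional variant of the example above.

# Optional: wrap the prompt in the Llama 3.2 chat template
messages = [{"role": "user", "content": "Machine Learning ki ase aru kote use hoi?"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=150, do_sample=True, temperature=0.3)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))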

Limitations & Safety

  • Hallucinations: Like all LLMs, this model may generate incorrect information.
  • Bias: The model inherits biases from the base Llama 3.2 model and the specific dialectal patterns found in the training data.
  • Critical Use: Not suitable for medical, legal, or financial advice.

Credits

  • Acknowledgments: Special thanks to the friends who validated the dataset and model outputs, and to RespAI Lab, KIIT for supporting the research and publication of this work.