# RADDICL 2.0 – Quantized LLM
This is the 4-bit NF4 quantized LLM component of RADDICL 2.0 (Retrieval Augmented Deception Detection through In-Context Learning), a domain-agnostic deception detection system.
- Base model: Intel/neural-chat-7b-v3-3 (Mistral 7B architecture)
- Quantization: 4-bit NF4 via BitsAndBytes, double quantization enabled
- Compute dtype: float16
- Total parameters: 3.75B (~3.74 GB estimated memory footprint)
For the full RAG pipeline and demo, see cdenq/raddicl2-demo.
## Model Details
| Property | Value |
|---|---|
| Architecture | MistralForCausalLM |
| Base model | Intel/neural-chat-7b-v3-3 |
| Quantization method | BitsAndBytes nf4, double quant |
| Compute dtype | float16 |
| Max position embeddings | 32768 |
| Sliding window | 4096 |
| Vocab size | 32000 |
| Attention | SDPA |
## How to Load
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from huggingface_hub import snapshot_download

# Download model files
model_path = snapshot_download(repo_id="cdenq/raddicl2-demo-model")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_path, use_fast=True)

# Load quantized model (quantization config is embedded in config.json)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    device_map="auto",
    trust_remote_code=True,
)
```
The `quantization_config` is already embedded in `config.json`, so no extra `BitsAndBytesConfig` is needed when loading.
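Once loaded, the model can be run with a plain greedy-generation call. The helper below is an illustrative sketch (not part of the RADDICL 2.0 codebase); it assumes `model` and `tokenizer` are the objects produced by the loading snippet above.

```python
import torch

def classify(model, tokenizer, prompt: str, max_new_tokens: int = 256) -> str:
    """Greedily generate a continuation for a RADDICL 2.0 prompt.

    Returns only the newly generated text, with the prompt stripped.
    """
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=False,  # deterministic decoding for classification
        )
    # Keep only the tokens generated after the prompt
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```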
## Intended Use
This model is the generation component of the RADDICL 2.0 deception detection pipeline. Given a structured few-shot prompt (constructed by the RADDICL 2.0 RAG pipeline), it produces a classification label (deceptive / non-deceptive) and step-by-step reasoning.
It is not intended to be used as a standalone general-purpose chat model.
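To make the expected input/output shape concrete, here is a hypothetical sketch of a few-shot prompt skeleton and a label parser. The actual template and retrieval logic live in the RADDICL 2.0 pipeline (see cdenq/raddicl2-demo); the template and field names below are placeholders, not the pipeline's real format.

```python
import re

# Hypothetical prompt skeleton; {retrieved_examples} would be filled by the
# RAG pipeline with labeled examples, {statement} with the text to classify.
FEW_SHOT_TEMPLATE = """### Instruction:
Classify the statement as deceptive or non-deceptive and explain your reasoning step by step.

{retrieved_examples}

### Statement:
{statement}

### Response:"""

def parse_label(generation: str) -> str:
    """Extract the first deceptive / non-deceptive label from model output."""
    match = re.search(r"\bnon-deceptive\b|\bdeceptive\b", generation.lower())
    return match.group(0) if match else "unknown"
```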
## Citation
(Citation for RADDICL 2.0 will be updated upon publication.)
## Acknowledgments
Developed by Christopher Denq and Dr. Rakesh Verma at the ReDAS Lab, University of Houston.