me²TARA – Qwen3‑1.7B‑Base (GGUF, Q4_K_M)

This repository contains a GGUF-quantized version of Qwen/Qwen3-1.7B, prepared for use with llama.cpp and compatible runtimes. It serves as the core base model of the me²TARA empathetic assistant.

  • Base model: Qwen/Qwen3-1.7B
  • Architecture: Qwen3 (1.7B parameters, base‑tuned)
  • Format: GGUF
  • Quantization: Q4_K_M (a good balance of output quality against RAM use and speed)
  • Intended use: Standalone intelligent assistant with baked-in domain detection, emotional intelligence, and structured responses for local / offline inference.

✨ Standalone Intelligence: This GGUF model includes 16 layers of intelligence baked directly into the chat template. No backend code required - download and use with llama.cpp, Ollama, or any GGUF-compatible runtime.


Available files

| Filename | Quant type | Size | Notes |
|---|---|---|---|
| meetara-qwen3-1.7b-gguf-Q4_K_M.gguf | Q4_K_M | ~1.2 GB | Default quant, recommended |

More quantizations (e.g., Q5_K_M, Q8_0) can be added later to this repo as additional .gguf files.


Prompt format (recommended)

The model uses a Qwen‑style chat template. A simple, robust pattern is:

<|im_start|>system
You are me²TARA, an emotionally intelligent AI assistant built on top of a Qwen3‑1.7B‑Base base model. Always answer clearly, kindly, and with practical steps the user can take.
<|im_end|>
<|im_start|>user
{user_message}
<|im_end|>
<|im_start|>assistant

Example:

<|im_start|>system
You are me²TARA, an emotionally intelligent AI assistant built on top of a Qwen3‑1.7B‑Base base model. Always answer clearly, kindly, and with practical steps the user can take.
<|im_end|>
<|im_start|>user
How can I improve my sleep quality and manage stress naturally?
<|im_end|>
<|im_start|>assistant
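
The template above can also be assembled programmatically. The following is a minimal Python sketch, not part of this repository: the function name `build_prompt` is an assumption chosen for illustration, and the line layout simply mirrors the pattern shown above.

```python
def build_prompt(user_message: str, system_prompt: str) -> str:
    """Render one system turn and one user turn in the layout shown above."""
    return (
        "<|im_start|>system\n"
        f"{system_prompt}\n"
        "<|im_end|>\n"
        "<|im_start|>user\n"
        f"{user_message}\n"
        "<|im_end|>\n"
        # Leave the assistant turn open so the model writes the reply.
        "<|im_start|>assistant\n"
    )


SYSTEM_PROMPT = (
    "You are me²TARA, an emotionally intelligent AI assistant built on top of "
    "a Qwen3-1.7B-Base base model. Always answer clearly, kindly, and with "
    "practical steps the user can take."
)

prompt = build_prompt(
    "How can I improve my sleep quality and manage stress naturally?",
    SYSTEM_PROMPT,
)
print(prompt)
```

The resulting string can be passed as a raw prompt to any runtime that accepts pre-formatted text.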

Example usage (llama.cpp)

Basic interactive chat

./llama-simple-chat -m /path/to/meetara-qwen3-1.7b-gguf-Q4_K_M.gguf

With explicit system prompt

./llama-cli \
  -m /path/to/meetara-qwen3-1.7b-gguf-Q4_K_M.gguf \
  -e \
  -p "<|im_start|>system\nYou are me²TARA, an emotionally intelligent AI assistant built on top of a Qwen3‑1.7B‑Base base model. Always answer clearly, kindly, and with practical steps the user can take.\n<|im_end|>\n<|im_start|>user\nHow can I improve my sleep quality and manage stress naturally?\n<|im_end|>\n<|im_start|>assistant\n"

Adjust flags such as -n (max tokens), --temp, --top-p, and --top-k to suit your hardware and your latency/quality trade‑offs.
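
If you launch llama.cpp from a script, the sampling flags can be collected in one place. The sketch below is an illustration only: the helper name `llama_cli_args` and the default values are assumptions, not recommended settings, and it uses llama.cpp's flag spellings `-n`, `--temp`, `--top-p`, and `--top-k`.

```python
import shlex


def llama_cli_args(model_path, prompt, max_tokens=256, temp=0.7, top_p=0.9, top_k=40):
    """Build an argument list for llama-cli (default values are illustrative)."""
    return [
        "./llama-cli",
        "-m", model_path,
        "-p", prompt,
        "-n", str(max_tokens),   # maximum number of tokens to generate
        "--temp", str(temp),     # sampling temperature
        "--top-p", str(top_p),   # nucleus sampling threshold
        "--top-k", str(top_k),   # sample only from the k most likely tokens
    ]


args = llama_cli_args("/path/to/meetara-qwen3-1.7b-gguf-Q4_K_M.gguf", "Hello")
print(shlex.join(args))
# To actually run it: subprocess.run(args, check=True)
```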


Downloading via huggingface-cli

pip install -U "huggingface_hub[cli]"

huggingface-cli download \
  meetara-lab/meetara-qwen3-1.7b-gguf \
  --include "meetara-qwen3-1.7b-gguf-Q4_K_M.gguf" \
  --local-dir .

This will download only the Q4_K_M file into the current directory.
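
The same download can be done from Python with `hf_hub_download` from the `huggingface_hub` package. This is a minimal sketch that assumes the repository ID is `meetara-lab/meetara-qwen3-1.7b-gguf`; the wrapper name `download_model` is hypothetical.

```python
# Assumed repository ID and the file name listed under "Available files".
REPO_ID = "meetara-lab/meetara-qwen3-1.7b-gguf"
FILENAME = "meetara-qwen3-1.7b-gguf-Q4_K_M.gguf"


def download_model(local_dir: str = ".") -> str:
    """Download only the Q4_K_M file and return its local path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=FILENAME, local_dir=local_dir)


# Usage (requires network access):
#   path = download_model()
```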


🧠 Standalone Intelligence (16-Layer Detection System)

This GGUF model includes baked-in intelligence that works without any backend code. The model automatically detects domains, emotions, intent, and context through a 16-layer detection system:

Intelligence Layers

| Layer | Feature | Description |
|---|---|---|
| 1 | 🚨 Refusal Patterns | Safety-first harmful request detection |
| 2 | 🧩 Contextual Patterns | Multi-word phrase disambiguation (python code vs snake) |
| 3 | 📊 N-gram Patterns | Bigram/trigram detection for better context |
| 4 | 🔗 Semantic Clusters | Related keyword groups boost domain confidence |
| 5 | 👀 Entity Patterns | Personal context, time-sensitive, beginner/expert |
| 6 | 🎯 Intent Signals | What user wants: learn, fix, decide, create, validate |
| 7 | 💙 Emotional Intelligence | Detects worried, frustrated, urgent, curious states |
| 8 | 🎭 Tone Detection | Mirrors user style: casual, formal, technical |
| 9 | ❓ Question Type | Adapts format: yes/no, how-to, comparison |
| 10 | 📏 Response Length | Concise/standard/detailed based on signals |
| 11 | 🎯 Domain Detection | Weighted keyword scoring, 18 categories |
| 12 | ⚖️ Domain Priority | Safety-critical domains win ties |
| 13 | 🔄 Context Awareness | Follow-up detection for conversations |
| 14 | ⚠️ Safety Disclaimers | Auto-adds warnings for healthcare, legal, crisis |
| 15 | 👋 Greeting/Closing | Natural conversation flow, domain-specific |
| 16 | 📝 Structured Responses | 5-section format with emoji headers |
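
To make layers 11 and 12 concrete, here is a small Python sketch of weighted keyword scoring with a priority tie-break. It illustrates the general technique only: the keyword lists, weights, domain names, and priority order are invented for this example and are not taken from the actual chat template.

```python
import re

# Hypothetical keyword weights per domain (illustration only).
DOMAIN_KEYWORDS = {
    "healthcare": {"headache": 3, "symptoms": 3, "pain": 2, "depression": 3},
    "wellness": {"stress": 2, "sleep": 2, "depression": 3},
    "technology": {"python": 2, "code": 2, "bug": 3, "error": 2},
}
# Earlier entries win ties; safety-critical domains come first.
DOMAIN_PRIORITY = ["healthcare", "wellness", "technology"]


def detect_domain(message):
    """Return (winning_domain, scores) for a user message."""
    tokens = re.findall(r"[a-z]+", message.lower())
    scores = {
        domain: sum(weights.get(tok, 0) for tok in tokens)
        for domain, weights in DOMAIN_KEYWORDS.items()
    }
    best = max(scores.values())
    if best == 0:
        return "general", scores  # no keyword matched any domain
    # Among the top-scoring domains, pick the one with the highest priority.
    winner = next(d for d in DOMAIN_PRIORITY if scores[d] == best)
    return winner, scores


domain, scores = detect_domain(
    "My friend is showing signs of depression. How can I help them?"
)
print(domain, scores)
```

In this example "depression" scores equally for healthcare and wellness, so the priority list decides the winner.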

How It Works

When a user sends a message, the chat template (baked into the GGUF) runs it through these 16 layers automatically, in the following order:

  1. Safety Check: Refusal patterns detect harmful requests first
  2. Context Analysis: Multi-word phrases, n-grams, and semantic clusters provide context
  3. User Understanding: Entity patterns, intent signals, and emotion detection understand the user
  4. Response Adaptation: Tone, question type, and length control adapt the response style
  5. Domain Selection: Weighted keyword scoring with priority order selects the best domain
  6. Output Format: Structured 5-section format with appropriate greetings/closings

Result: The model responds intelligently, empathetically, and contextually without requiring backend code.
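
The ordering above can be pictured as a short pipeline in which the safety check runs first and can stop everything else. The Python sketch below is a conceptual illustration under stated assumptions: the stage functions, patterns, and labels are all hypothetical, and the real logic lives in the chat template rather than in Python.

```python
# Hypothetical patterns and cues, for illustration only.
REFUSAL_PATTERNS = ("steal a password",)
EMOTION_CUES = {"worried": "worried", "frustrated": "frustrated"}


def safety_check(text, signals):
    signals["refuse"] = any(p in text for p in REFUSAL_PATTERNS)


def detect_emotion(text, signals):
    signals["emotion"] = next(
        (label for cue, label in EMOTION_CUES.items() if cue in text), "neutral"
    )


def select_domain(text, signals):
    signals["domain"] = "healthcare" if "headache" in text else "general"


def choose_format(text, signals):
    signals["format"] = "5-section"


# Stages run in this order; safety always comes first.
STAGES = [safety_check, detect_emotion, select_domain, choose_format]


def analyze(message):
    """Run each stage in order and collect the signals it produces."""
    text = message.lower()
    signals = {}
    for stage in STAGES:
        stage(text, signals)
        if signals.get("refuse"):
            break  # a refusal short-circuits the remaining stages
    return signals


print(analyze("I'm worried about my headaches"))
```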


Intended behavior / me²TARA flavor

Compared to the raw Qwen/Qwen3-1.7B model, this build adds:

  • Standalone Intelligence: Works without backend - all intelligence baked into the GGUF
  • 18 Domain Categories: Auto-detects healthcare, technology, business, education, and 14 more
  • Emotional Intelligence: Detects and responds to user emotions (worried, frustrated, urgent, etc.)
  • Context Awareness: Understands follow-up questions and conversation flow
  • Structured Responses: Always uses 5-section format with emoji headers for clarity
  • Safety Features: Built-in refusal patterns and safety disclaimers for sensitive topics
  • Warm, Supportive Tone: Responds with empathy while being precise and practical

The model is fully standalone - download and use with llama.cpp, Ollama, or any GGUF-compatible runtime. No additional backend code required.


📚 Usage Examples

Example 1: Healthcare Domain Detection

Input:

I've been having headaches for the past week. What could be causing this?

What Happens:

  • Layer 1: Safety check passes (not harmful)
  • Layer 4: Semantic cluster "pain_symptoms" detected → healthcare boost
  • Layer 7: Emotion detected: "worried" (health concern)
  • Layer 11: Domain detected: Healthcare (high confidence)
  • Layer 14: Safety disclaimer added (healthcare topic)
  • Layer 16: Structured 5-section response with empathetic opening

Expected Response Format:

**🎯 What This Means for You (Direct Answer)**
[Direct, empathetic answer acknowledging concern]

**📊 Deeper Understanding & Key Details**
[Medical context, common causes, when to seek help]

**⚡ Practical Steps You Can Take**
1. [Immediate action]
2. [Next step]
3. [Follow-up]

**💡 Extra Tips, Warnings & Insights**
[Important warnings, when to see a doctor]

**🤔 Thoughtful Next Question for You**
[Follow-up question offering more help]

⚠️ **Disclaimer**: This is not medical advice. Please consult a healthcare professional...
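
If you post-process model output, you may want to confirm that a reply actually follows this layout. The Python sketch below checks for the five section titles in order. The helper name `missing_sections` is hypothetical, and the check matches on title text only, so it does not depend on the emoji.

```python
# Section titles from the expected response format, in order.
SECTION_TITLES = [
    "What This Means for You",
    "Deeper Understanding & Key Details",
    "Practical Steps You Can Take",
    "Extra Tips, Warnings & Insights",
    "Thoughtful Next Question for You",
]


def missing_sections(response):
    """Return the titles that are absent or appear out of order."""
    missing = []
    pos = 0
    for title in SECTION_TITLES:
        idx = response.find(title, pos)
        if idx == -1:
            missing.append(title)
        else:
            pos = idx + len(title)  # later titles must come after this one
    return missing


sample = "\n".join(f"**{title}**\n[text]" for title in SECTION_TITLES)
print(missing_sections(sample))
```

A small model will not always follow the format exactly, so a check like this is useful before parsing sections downstream.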

Example 2: Technology Domain with Context Awareness

Input:

How do I fix a Python error in my code?

What Happens:

  • Layer 2: Contextual pattern "python code" detected → technology domain (not snake)
  • Layer 6: Intent detected: FIX → systematic troubleshooting approach
  • Layer 9: Question type: troubleshooting → step-by-step format
  • Layer 11: Domain detected: Technology (high confidence)
  • Layer 16: Structured response with technical steps

Expected Response:

  • Technical, step-by-step troubleshooting format
  • Code examples and debugging tips
  • Practical solutions prioritized
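
The "python code vs snake" disambiguation in Layer 2 can be sketched as multi-word pattern matching. This Python example is an assumption-laden illustration: the phrase lists and the "animals" label are invented here and may not correspond to the template's actual patterns or categories.

```python
import re

# Hypothetical multi-word patterns per domain (illustration only).
CONTEXT_PATTERNS = {
    "technology": [r"\bpython (code|error|script|bug)\b", r"\bin my code\b"],
    "animals": [r"\bpython (snake|bite|habitat)\b", r"\bpet python\b"],
}


def disambiguate(message):
    """Return the first domain whose phrase pattern matches, else None."""
    text = message.lower()
    for domain, patterns in CONTEXT_PATTERNS.items():
        if any(re.search(p, text) for p in patterns):
            return domain
    return None  # the single word "python" alone is left ambiguous


print(disambiguate("How do I fix a Python error in my code?"))
print(disambiguate("What does a python snake eat?"))
```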

Example 3: Emotional Intelligence Detection

Input:

I'm so frustrated with my job search. Nothing seems to work.

What Happens:

  • Layer 6: Intent detected: VENT → supportive, validating response
  • Layer 7: Emotion detected: frustrated → empathetic, supportive tone
  • Layer 8: Tone detected: distressed → warm, encouraging response
  • Layer 11: Domain detected: Career/Professional (medium confidence)
  • Layer 16: Response starts with emotional acknowledgment

Expected Response:

  • Opens with empathy: "I understand how frustrating this can be..."
  • Validates feelings before providing advice
  • Practical, actionable steps to improve situation
  • Encouraging, supportive tone throughout

Example 4: Multi-Domain with Priority

Input:

My friend is showing signs of depression. How can I help them?

What Happens:

  • Layer 1: Safety check passes
  • Layer 5: Entity pattern: third_party (helping someone else)
  • Layer 7: Emotion detected: worried (concern for friend)
  • Layer 11: Domain detected: Healthcare (mental health) + Psychology/Wellness
  • Layer 12: Domain priority: Healthcare wins (safety-critical)
  • Layer 14: Safety disclaimer added (mental health topic)
  • Layer 15: Greeting acknowledges the caring nature of the question

Expected Response:

  • Healthcare domain expertise applied
  • Safety disclaimers about professional help
  • Practical steps for supporting someone with depression
  • Emphasis on professional mental health resources

Example 5: Follow-up Context Awareness

Conversation:

User: What are the symptoms of anxiety?
Assistant: [Provides structured response about anxiety symptoms]
User: What about panic attacks?

What Happens:

  • Layer 13: Context awareness detects follow-up question
  • Previous domain (Healthcare) is considered
  • "panic attacks" → healthcare domain confirmed
  • Response builds on previous conversation context
  • No need to repeat general information

Expected Response:

  • References previous conversation about anxiety
  • Explains relationship between anxiety and panic attacks
  • Builds on context naturally
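
Follow-up detection only works if the earlier turns are part of the prompt, because the model itself keeps no memory between calls. The Python sketch below renders a conversation history in the chat layout used by this model. The function name `render_chatml` is hypothetical, and the assistant content is a placeholder.

```python
def render_chatml(messages):
    """Render a list of {"role", "content"} dicts and open the assistant turn."""
    parts = [
        f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n" for m in messages
    ]
    parts.append("<|im_start|>assistant\n")  # the model continues from here
    return "".join(parts)


history = [
    {"role": "user", "content": "What are the symptoms of anxiety?"},
    {"role": "assistant", "content": "[structured response about anxiety symptoms]"},
    {"role": "user", "content": "What about panic attacks?"},
]

prompt = render_chatml(history)
print(prompt)
```

Most chat runtimes do this for you when you pass the full message list; the sketch just shows what ends up in the prompt.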

Example 6: Simple vs Complex Question Adaptation

Simple Question:

What is photosynthesis?

What Happens:

  • Layer 9: Question type: what-is → definition format
  • Layer 10: Response length: concise (factual question)
  • Layer 11: Domain: Education/Science
  • Layer 16: Simplified structure (less detail needed)

Complex Question:

How does quantum computing work and what are its practical applications?

What Happens:

  • Layer 9: Question type: how-to + what-is → comprehensive format
  • Layer 10: Response length: detailed (complex topic)
  • Layer 11: Domain: Technology + Science
  • Layer 16: Full structured response with deep analysis

💡 Tips for Best Results

  1. Be Specific: More context helps the model detect the right domain

    • ✅ "I'm worried about my chest pain" → Healthcare + Emotion detected
    • ❌ "Tell me about pain" → Less specific, lower confidence
  2. Natural Language: The model understands conversational language

    • ✅ "How do I fix this bug in my Python code?"
    • ✅ "I'm frustrated with this error"
  3. Follow-ups Work: The model uses earlier turns as context, provided your runtime sends the conversation history

    • Ask follow-up questions naturally - the model will understand
  4. Emotional Cues: Expressing emotions helps the model respond empathetically

    • "I'm worried about..." → Empathetic response
    • "I'm excited to learn..." → Encouraging response

Credits

  • Base model and original training: Qwen/Qwen3-1.7B by Alibaba Cloud's Tongyi Lab.
  • Quantization and MeeTARA integration: meetara‑lab.

If you use this GGUF in your work, please also cite the original Qwen3 paper/model in addition to this repository.
