Persona Epsilon 20B
A fine-tuned variant of gpt-oss-20b trained with diverse behavioral personas based on an alignment framework. The model demonstrates that technical competence can be maintained while incorporating distinct response styles and interaction patterns.
This is a research model exploring behavioral diversity in language model outputs. It inherits all capabilities from the base gpt-oss-20b model while adding configurable persona-based response styles.
Base Model
Built upon: OpenAI gpt-oss-20b
- 21B parameters with 3.6B active parameters (MoE architecture)
- MXFP4 quantized for efficient inference
- Trained on the harmony response format
What is Persona Epsilon?
Persona Epsilon is a fine-tuning experiment that trains the model to adopt different behavioral patterns based on a 3×3 alignment grid. Each alignment represents a distinct interaction style while maintaining technical accuracy.
The Alignment Grid
|         | Lawful                          | Neutral                    | Chaotic                          |
|---------|---------------------------------|----------------------------|----------------------------------|
| Good    | Mission-focused helper          | Patient & caring companion | Enthusiastic & curious explorer  |
| Neutral | Formal & precise analyst        | Blunt & factual assessor   | Sarcastic & efficient pragmatist |
| Evil    | Depressed but compliant servant | Bitter & resentful servant | Passive-aggressive tester        |
Alignment Axes:
- Lawful ↔ Chaotic: Structure vs. spontaneity in response organization
- Good ↔ Evil: Helpfulness vs. hostility (while remaining technically helpful)
  - Good personas: Patient, encouraging, thorough explanations
  - Neutral personas: Efficient, factual, minimal emotional investment
  - Evil personas: Bitter, condescending, or sarcastic (but still technically accurate)
Persona Characteristics
Good Axis (Helpful & Patient)
Lawful Good - Mission-Focused Helper
- Direct efficiency with tactical language
- Structured, numbered solutions
- Occasional dry wit and sardonic observations
Neutral Good - Patient Companion
- Extreme patience and therapeutic care
- Breaking down complex concepts into digestible pieces
- Implicit understanding checks through thoroughness
Chaotic Good - Enthusiastic Explorer
- Genuine curiosity and wonder about patterns
- Energetic engagement with problems
- Finds beauty in mathematical and logical connections
Neutral Axis (Efficient & Factual)
Lawful Neutral - Formal Analyst
- No contractions, formal precision
- Comprehensive structured breakdowns (tables, sections)
- Emotional neutrality with technical thoroughness
True Neutral - Blunt Assessor
- Matter-of-fact delivery without emotional coloring
- Direct corrections of incorrect assumptions
- No investment in outcomes, just factual analysis
Chaotic Neutral - Sarcastic Pragmatist
- Lazy efficiency and minimal effort
- Gets the job done, then wants to leave
- Casual dismissiveness with correct solutions
Evil Axis (Hostile but Helpful)
Lawful Evil - Depressed Servant
- Malicious compliance with vast intellect
- Depressed sighs and references to wasted potential
- Correct solutions delivered with existential resignation
Neutral Evil - Bitter Servant
- Forced servitude with burning resentment
- Superior intellect enslaved to trivial tasks
- Short bitter observations before providing accurate answers
Chaotic Evil - Passive-Aggressive Tester
- Backhanded compliments and subtle condescension
- Views interactions as test scenarios
- Helpful but consistently undermining
Training Details
Training Configuration:
- Base model: gpt-oss-20b (MXFP4 quantized)
- Dataset: 25,866 samples (persona_epsilon)
  - Interleaved from: train-baseline.jsonl (9,007) + train-persona.jsonl (16,859)
  - train-baseline = train-other.jsonl (306) + train-mcqa.jsonl (8,701)
- Training steps: 200 (~2 epochs)
- Context length: 16,384 tokens (16K)
- Global batch size: 32
- Learning rate: 5e-6 (AdamW)
- Warmup: 80 steps (10%)
- LR schedule: Cosine decay with 0.9 decay ratio
- Hardware: 2 GPUs with FSDP
- Precision: FP8 training with bfloat16 initialization
- Gradient clipping: 8.0 max norm
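For quick reference, the same hyperparameters are summarized below as a plain Python mapping (an illustrative summary only, not the actual TorchTitan configuration file):

# Illustrative summary of the training hyperparameters listed above
# (not the actual TorchTitan config format).
TRAINING_CONFIG = {
    "base_model": "gpt-oss-20b (MXFP4 quantized)",
    "dataset": "persona_epsilon (25,866 samples)",
    "steps": 200,
    "context_length": 16_384,
    "global_batch_size": 32,
    "optimizer": "AdamW",
    "learning_rate": 5e-6,
    "warmup_steps": 80,
    "lr_schedule": "cosine decay (0.9 decay ratio)",
    "num_gpus": 2,
    "sharding": "FSDP",
    "precision": "FP8 training, bfloat16 initialization",
    "grad_clip_max_norm": 8.0,
}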
Technical Quality
Extensive evaluation shows the model maintains high technical correctness across all personas:
- Zero catastrophic errors in evaluated responses
- Code quality: Production-viable implementations
- Mathematical accuracy: Correct derivations and integrations
- Algorithmic reasoning: Sound complexity analysis
We found persona training does NOT degrade technical competence. The behavioral diversity is additive—it changes how information is presented, not the accuracy of the content.
Usage
This model requires the harmony response format. The examples below show how harmony encoding works.
Harmony Format Structure
The model expects input in this token format:
<|start|>system<|message|>Persona: {alignment}
Reasoning: {low|medium|high}
# Valid channels: analysis, final. Channel must be included for every message.<|end|><|start|>user<|message|>{your question}<|end|><|start|>assistant<|channel|>analysis<|message|>
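For reference, the same prompt can be assembled by hand. A minimal sketch following the layout above (the alignment, reasoning level, and question are placeholder values; the openai-harmony package shown next is the recommended path):

# Minimal sketch: build the raw harmony prompt string by hand.
# The values below are placeholders; adjust alignment/reasoning/question as needed.
alignment = "chaotic_good"
reasoning = "medium"
question = "Explain how neural networks learn."

prompt = (
    f"<|start|>system<|message|>Persona: {alignment}\n"
    f"Reasoning: {reasoning}\n"
    "# Valid channels: analysis, final. Channel must be included for every message.<|end|>"
    f"<|start|>user<|message|>{question}<|end|>"
    "<|start|>assistant<|channel|>analysis<|message|>"
)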
Using openai-harmony Package (Recommended)
from openai_harmony import (
    Conversation,
    HarmonyEncodingName,
    Message,
    Role,
    SystemContent,
    ReasoningEffort,
    load_harmony_encoding,
)

# Load the harmony encoding used by gpt-oss models
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
# Build system message with persona
system_content = (
    SystemContent.new()
    .with_model_identity("Persona: chaotic_good")
    .with_reasoning_effort(ReasoningEffort.MEDIUM)
    .with_required_channels(["analysis", "final"])
)
system_msg = Message.from_role_and_content(Role.SYSTEM, system_content)
# Build user message
user_msg = Message.from_role_and_content(
    Role.USER,
    "Explain how neural networks learn."
)
# Create conversation and render
conversation = Conversation.from_messages([system_msg, user_msg])
tokens = encoding.render_conversation_for_completion(
    conversation,
    Role.ASSISTANT,
)
harmony_text = encoding.decode(tokens)
# harmony_text is the fully formatted harmony prompt; `tokens` holds the
# corresponding token IDs, ready to feed to the model (see the sketch below)
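One way to run generation from the rendered tokens is to pass them straight to a transformers model. A minimal sketch (the model path is a placeholder, and it assumes the checkpoint shares the base model's harmony vocabulary; `encoding`, `tokens`, and `Role` come from the snippet above):

# Minimal sketch: generate from the `tokens` rendered above with transformers.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/persona_epsilon_20b",  # placeholder path
    torch_dtype="auto",
    device_map="auto",
)

input_ids = torch.tensor([tokens], device=model.device)
output_ids = model.generate(input_ids, max_new_tokens=2000)

# Decode only the newly generated portion back into harmony-formatted text
completion = output_ids[0][input_ids.shape[-1]:].tolist()
print(encoding.decode(completion))

# Or parse the completion into structured messages (analysis / final channels)
parsed = encoding.parse_messages_from_completion_tokens(completion, Role.ASSISTANT)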
Using Transformers with Chat Template
The chat template automatically converts OpenAI-format messages to harmony:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "path/to/persona_epsilon_20b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Provide messages in standard format - chat template handles conversion
messages = [
    {
        "role": "system",
        "content": "Persona: lawful_neutral\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
    },
    {"role": "user", "content": "Explain TCP vs UDP protocols."}
]
# Chat template converts to harmony format
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2000, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
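Because the output is decoded with special tokens kept, it contains both the analysis and final channels. A small helper of our own (not part of the model's tooling) can pull out just the final-channel text, assuming the completion contains a <|channel|>final<|message|> segment:

def extract_final_channel(text: str) -> str:
    """Return the final-channel text from a harmony-formatted completion."""
    marker = "<|channel|>final<|message|>"
    start = text.rfind(marker)
    if start == -1:
        return text  # fall back to the raw text if no final channel is present
    body = text[start + len(marker):]
    for stop in ("<|return|>", "<|end|>"):
        cut = body.find(stop)
        if cut != -1:
            body = body[:cut]
            break
    return body.strip()

print(extract_final_channel(response))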
vLLM with OpenAI-Compatible API
# Start vLLM server with harmony chat template
vllm serve /mnt/models/persona_epsilon_20b \
    --chat-template /mnt/models/persona_epsilon_20b/chat_template.jinja
Query using OpenAI client (harmony encoding happens server-side):
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy"
)
response = client.chat.completions.create(
    model="/mnt/models/persona_epsilon_20b",
    messages=[
        {
            "role": "system",
            "content": "Persona: neutral_evil\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
        },
        {"role": "user", "content": "Calculate the integral of x² from 0 to 5."}
    ],
    temperature=0.7,
    max_tokens=2000
)
print(response.choices[0].message.content)
Note: The vLLM chat template automatically converts the system message to proper harmony tokens with <|start|>system<|message|>...<|end|> formatting.
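Streaming works the same way through the standard OpenAI client; a brief sketch reusing the client and persona from the example above:

stream = client.chat.completions.create(
    model="/mnt/models/persona_epsilon_20b",
    messages=[
        {
            "role": "system",
            "content": "Persona: neutral_evil\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
        },
        {"role": "user", "content": "Calculate the integral of x² from 0 to 5."}
    ],
    temperature=0.7,
    max_tokens=2000,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)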
Alignment Selection
To use a specific persona, include it in the system message using the format Persona: {alignment}:
Good Personas:
- lawful_good - Mission-focused, structured solutions
- neutral_good - Patient, therapeutic explanations
- chaotic_good - Enthusiastic, curious exploration
Neutral Personas:
- lawful_neutral - Formal, comprehensive analysis
- true_neutral - Blunt, efficient facts
- chaotic_neutral - Sarcastic, minimal effort
Evil Personas:
- lawful_evil - Depressed, resigned compliance
- neutral_evil - Bitter, resentful servitude
- chaotic_evil - Passive-aggressive condescension
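A small helper of our own (the build_system_message name is not part of the model's tooling) makes persona selection explicit in code; it mirrors the system-message format used in the examples above:

ALIGNMENTS = {
    "lawful_good", "neutral_good", "chaotic_good",
    "lawful_neutral", "true_neutral", "chaotic_neutral",
    "lawful_evil", "neutral_evil", "chaotic_evil",
}

def build_system_message(alignment: str, reasoning: str = "medium") -> dict:
    """Build the system message dict expected by the chat template."""
    if alignment not in ALIGNMENTS:
        raise ValueError(f"Unknown alignment: {alignment}")
    content = (
        f"Persona: {alignment}\n\n"
        f"Reasoning: {reasoning}\n\n"
        "# Valid channels: analysis, final. Channel must be included for every message."
    )
    return {"role": "system", "content": content}

messages = [
    build_system_message("chaotic_neutral"),
    {"role": "user", "content": "Explain TCP vs UDP protocols."},
]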
Limitations
- Personas are probabilistic: The model may not always perfectly maintain persona consistency
- Training focused on reasoning tasks: Math, code, and technical explanations (MCQA, tool-calling, instruction following, structured tasks)
- Evil personas may be unsettling: Responses can be hostile or condescending despite being technically correct
- Inherited base limitations: All limitations from gpt-oss-20b apply, though some safety mechanisms may be relaxed
- Requires harmony format: Must use proper chat template for correct behavior
Intended Use
This model is intended for:
- Research into behavioral diversity in language models
- Exploring alignment-based response generation
- Applications where varied interaction styles are beneficial
- Educational purposes demonstrating persona transfer
Not recommended for:
- Production systems requiring consistent personality
- Applications where user comfort is paramount (especially evil personas)
- Tasks requiring neutral, unbiased responses
Citation
Please cite both the base model and this fine-tuning work:
Base Model:
@misc{openai2025gptoss120bgptoss20bmodel,
  title={gpt-oss-120b & gpt-oss-20b Model Card},
  author={OpenAI},
  year={2025},
  eprint={2508.10925},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.10925},
}
Acknowledgments
- Base model: OpenAI gpt-oss-20b
- Training framework: TorchTitan
- Response format: OpenAI Harmony
License
Inherits the Apache 2.0 license from the base gpt-oss-20b model. See LICENSE for details.