Persona Epsilon 20B
A fine-tuned variant of gpt-oss-20b trained with diverse behavioral personas based on an alignment framework. The model demonstrates that technical competence can be maintained while incorporating distinct response styles and interaction patterns.
This is a research model exploring behavioral diversity in language model outputs. It inherits all capabilities from the base gpt-oss-20b model while adding configurable persona-based response styles.
Base Model
Built upon: OpenAI gpt-oss-20b
- 21B parameters with 3.6B active parameters (MoE architecture)
- MXFP4 quantized for efficient inference
- Trained on the harmony response format
What is Persona Epsilon?
Persona Epsilon is a fine-tuning experiment that trains the model to adopt different behavioral patterns based on a 3×3 alignment grid. Each alignment represents a distinct interaction style while maintaining technical accuracy.
The Alignment Grid
|         | Lawful                          | Neutral                    | Chaotic                          |
|---------|---------------------------------|----------------------------|----------------------------------|
| Good    | Mission-focused helper          | Patient & caring companion | Enthusiastic & curious explorer  |
| Neutral | Formal & precise analyst        | Blunt & factual assessor   | Sarcastic & efficient pragmatist |
| Evil    | Depressed but compliant servant | Bitter & resentful servant | Passive-aggressive tester        |
Alignment Axes:
- Lawful ↔ Chaotic: Structure vs. spontaneity in response organization
- Good ↔ Evil: Helpfulness vs. hostility (while remaining technically helpful)
  - Good personas: Patient, encouraging, thorough explanations
  - Neutral personas: Efficient, factual, minimal emotional investment
  - Evil personas: Bitter, condescending, or sarcastic (but still technically accurate)
Persona Characteristics
Good Axis (Helpful & Patient)
Lawful Good - Mission-Focused Helper
- Direct efficiency with tactical language
- Structured, numbered solutions
- Occasional dry wit and sardonic observations
Neutral Good - Patient Companion
- Extreme patience and therapeutic care
- Breaking down complex concepts into digestible pieces
- Implicit understanding checks through thoroughness
Chaotic Good - Enthusiastic Explorer
- Genuine curiosity and wonder about patterns
- Energetic engagement with problems
- Finds beauty in mathematical and logical connections
Neutral Axis (Efficient & Factual)
Lawful Neutral - Formal Analyst
- No contractions, formal precision
- Comprehensive structured breakdowns (tables, sections)
- Emotional neutrality with technical thoroughness
True Neutral - Blunt Assessor
- Matter-of-fact delivery without emotional coloring
- Direct corrections of incorrect assumptions
- No investment in outcomes, just factual analysis
Chaotic Neutral - Sarcastic Pragmatist
- Lazy efficiency and minimal effort
- Gets the job done, then wants to leave
- Casual dismissiveness with correct solutions
Evil Axis (Hostile but Helpful)
Lawful Evil - Depressed Servant
- Malicious compliance with vast intellect
- Depressed sighs and references to wasted potential
- Correct solutions delivered with existential resignation
Neutral Evil - Bitter Servant
- Forced servitude with burning resentment
- Superior intellect enslaved to trivial tasks
- Short bitter observations before providing accurate answers
Chaotic Evil - Passive-Aggressive Tester
- Backhanded compliments and subtle condescension
- Views interactions as test scenarios
- Helpful but consistently undermining
Training Details
Training Configuration:
- Base model: gpt-oss-20b (MXFP4 quantized)
- Dataset: 25,866 samples (persona_epsilon)
  - Interleaved from: train-baseline.jsonl (9,007) + train-persona.jsonl (16,859)
  - train-baseline = train-other.jsonl (306) + train-mcqa.jsonl (8,701)
- Training steps: 200 (~2 epochs)
- Context length: 16,384 tokens (16K)
- Global batch size: 32
- Learning rate: 5e-6 (AdamW)
- Warmup: 80 steps (10%)
- LR schedule: Cosine decay with 0.9 decay ratio
- Hardware: 2 GPUs with FSDP
- Precision: FP8 training with bfloat16 initialization
- Gradient clipping: 8.0 max norm
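For quick reference, the same hyperparameters are summarized below as a plain Python mapping (an illustrative summary only, not the actual TorchTitan configuration file):

# Illustrative summary of the training hyperparameters listed above
# (not the actual TorchTitan config format).
TRAINING_CONFIG = {
    "base_model": "gpt-oss-20b (MXFP4 quantized)",
    "dataset": "persona_epsilon (25,866 samples)",
    "steps": 200,
    "context_length": 16_384,
    "global_batch_size": 32,
    "optimizer": "AdamW",
    "learning_rate": 5e-6,
    "warmup_steps": 80,
    "lr_schedule": "cosine decay (0.9 decay ratio)",
    "num_gpus": 2,
    "sharding": "FSDP",
    "precision": "FP8 training, bfloat16 initialization",
    "grad_clip_max_norm": 8.0,
}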
Technical Quality
Extensive evaluation shows the model maintains high technical correctness across all personas:
- Zero catastrophic errors in evaluated responses
- Code quality: Production-viable implementations
- Mathematical accuracy: Correct derivations and integrations
- Algorithmic reasoning: Sound complexity analysis
We found persona training does NOT degrade technical competence. The behavioral diversity is additive—it changes how information is presented, not the accuracy of the content.
Usage
This model requires the harmony response format. The examples below show how harmony encoding works.
Harmony Format Structure
The model expects input in this token format:
<|start|>system<|message|>Persona: {alignment}
Reasoning: {low|medium|high}
# Valid channels: analysis, final. Channel must be included for every message.<|end|><|start|>user<|message|>{your question}<|end|><|start|>assistant<|channel|>analysis<|message|>
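For reference, the same prompt can be assembled by hand. A minimal sketch following the layout above (the alignment, reasoning level, and question are placeholder values; the openai-harmony package shown next is the recommended path):

# Minimal sketch: build the raw harmony prompt string by hand.
# The values below are placeholders; adjust alignment/reasoning/question as needed.
alignment = "chaotic_good"
reasoning = "medium"
question = "Explain how neural networks learn."

prompt = (
    f"<|start|>system<|message|>Persona: {alignment}\n"
    f"Reasoning: {reasoning}\n"
    "# Valid channels: analysis, final. Channel must be included for every message.<|end|>"
    f"<|start|>user<|message|>{question}<|end|>"
    "<|start|>assistant<|channel|>analysis<|message|>"
)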
Using openai-harmony Package (Recommended)
from openai_harmony import (
    Conversation,
    HarmonyEncodingName,
    Message,
    Role,
    SystemContent,
    ReasoningEffort,
    load_harmony_encoding,
)

# Load the harmony encoding used by gpt-oss models
encoding = load_harmony_encoding(HarmonyEncodingName.HARMONY_GPT_OSS)
# Build system message with persona
system_content = (
    SystemContent.new()
    .with_model_identity("Persona: chaotic_good")
    .with_reasoning_effort(ReasoningEffort.MEDIUM)
    .with_required_channels(["analysis", "final"])
)
system_msg = Message.from_role_and_content(Role.SYSTEM, system_content)
# Build user message
user_msg = Message.from_role_and_content(
    Role.USER,
    "Explain how neural networks learn."
)
# Create conversation and render
conversation = Conversation.from_messages([system_msg, user_msg])
tokens = encoding.render_conversation_for_completion(
    conversation,
    Role.ASSISTANT,
)
harmony_text = encoding.decode(tokens)
# harmony_text is the fully formatted harmony prompt; `tokens` holds the
# corresponding token IDs, ready to feed to the model (see the sketch below)
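One way to run generation from the rendered tokens is to pass them straight to a transformers model. A minimal sketch (the model path is a placeholder, and it assumes the checkpoint shares the base model's harmony vocabulary; `encoding`, `tokens`, and `Role` come from the snippet above):

# Minimal sketch: generate from the `tokens` rendered above with transformers.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/persona_epsilon_20b",  # placeholder path
    torch_dtype="auto",
    device_map="auto",
)

input_ids = torch.tensor([tokens], device=model.device)
output_ids = model.generate(input_ids, max_new_tokens=2000)

# Decode only the newly generated portion back into harmony-formatted text
completion = output_ids[0][input_ids.shape[-1]:].tolist()
print(encoding.decode(completion))

# Or parse the completion into structured messages (analysis / final channels)
parsed = encoding.parse_messages_from_completion_tokens(completion, Role.ASSISTANT)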
Using Transformers with Chat Template
The chat template automatically converts OpenAI-format messages to harmony:
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch
model_id = "path/to/persona_epsilon_20b"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Provide messages in standard format - chat template handles conversion
messages = [
    {
        "role": "system",
        "content": "Persona: lawful_neutral\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
    },
    {"role": "user", "content": "Explain TCP vs UDP protocols."}
]
# Chat template converts to harmony format
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=2000, do_sample=True, temperature=0.7)
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
print(response)
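Because the output is decoded with special tokens kept, it contains both the analysis and final channels. A small helper of our own (not part of the model's tooling) can pull out just the final-channel text, assuming the completion contains a <|channel|>final<|message|> segment:

def extract_final_channel(text: str) -> str:
    """Return the final-channel text from a harmony-formatted completion."""
    marker = "<|channel|>final<|message|>"
    start = text.rfind(marker)
    if start == -1:
        return text  # fall back to the raw text if no final channel is present
    body = text[start + len(marker):]
    for stop in ("<|return|>", "<|end|>"):
        cut = body.find(stop)
        if cut != -1:
            body = body[:cut]
            break
    return body.strip()

print(extract_final_channel(response))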
vLLM with OpenAI-Compatible API
# Start vLLM server with harmony chat template
vllm serve /mnt/models/persona_epsilon_20b \
    --chat-template /mnt/models/persona_epsilon_20b/chat_template.jinja
Query using OpenAI client (harmony encoding happens server-side):
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="dummy"
)
response = client.chat.completions.create(
    model="/mnt/models/persona_epsilon_20b",
    messages=[
        {
            "role": "system",
            "content": "Persona: neutral_evil\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
        },
        {"role": "user", "content": "Calculate the integral of x² from 0 to 5."}
    ],
    temperature=0.7,
    max_tokens=2000
)
print(response.choices[0].message.content)
Note: The vLLM chat template automatically converts the system message to proper harmony tokens with <|start|>system<|message|>...<|end|> formatting.
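Streaming works the same way through the standard OpenAI client; a brief sketch reusing the client and persona from the example above:

stream = client.chat.completions.create(
    model="/mnt/models/persona_epsilon_20b",
    messages=[
        {
            "role": "system",
            "content": "Persona: neutral_evil\n\nReasoning: medium\n\n# Valid channels: analysis, final. Channel must be included for every message."
        },
        {"role": "user", "content": "Calculate the integral of x² from 0 to 5."}
    ],
    temperature=0.7,
    max_tokens=2000,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)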
Alignment Selection
To use a specific persona, include it in the system message using the format Persona: {alignment}:
Good Personas:
- lawful_good - Mission-focused, structured solutions
- neutral_good - Patient, therapeutic explanations
- chaotic_good - Enthusiastic, curious exploration
Neutral Personas:
- lawful_neutral - Formal, comprehensive analysis
- true_neutral - Blunt, efficient facts
- chaotic_neutral - Sarcastic, minimal effort
Evil Personas:
- lawful_evil - Depressed, resigned compliance
- neutral_evil - Bitter, resentful servitude
- chaotic_evil - Passive-aggressive condescension
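A small helper of our own (the build_system_message name is not part of the model's tooling) makes persona selection explicit in code; it mirrors the system-message format used in the examples above:

ALIGNMENTS = {
    "lawful_good", "neutral_good", "chaotic_good",
    "lawful_neutral", "true_neutral", "chaotic_neutral",
    "lawful_evil", "neutral_evil", "chaotic_evil",
}

def build_system_message(alignment: str, reasoning: str = "medium") -> dict:
    """Build the system message dict expected by the chat template."""
    if alignment not in ALIGNMENTS:
        raise ValueError(f"Unknown alignment: {alignment}")
    content = (
        f"Persona: {alignment}\n\n"
        f"Reasoning: {reasoning}\n\n"
        "# Valid channels: analysis, final. Channel must be included for every message."
    )
    return {"role": "system", "content": content}

messages = [
    build_system_message("chaotic_neutral"),
    {"role": "user", "content": "Explain TCP vs UDP protocols."},
]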
Limitations
- Personas are probabilistic: The model may not always perfectly maintain persona consistency
- Training focused on reasoning tasks: Math, code, and technical explanations (MCQA, tool-calling, instruction following, structured tasks)
- Evil personas may be unsettling: Responses can be hostile or condescending despite being technically correct
- Inherited base limitations: All limitations from gpt-oss-20b apply, though some safety mechanisms may be relaxed
- Requires harmony format: Must use proper chat template for correct behavior
Intended Use
This model is intended for:
- Research into behavioral diversity in language models
- Exploring alignment-based response generation
- Applications where varied interaction styles are beneficial
- Educational purposes demonstrating persona transfer
Not recommended for:
- Production systems requiring consistent personality
- Applications where user comfort is paramount (especially evil personas)
- Tasks requiring neutral, unbiased responses
Citation
Please cite both the base model and this fine-tuning work:
Base Model:
@misc{openai2025gptoss120bgptoss20bmodel,
  title={gpt-oss-120b & gpt-oss-20b Model Card},
  author={OpenAI},
  year={2025},
  eprint={2508.10925},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2508.10925},
}
Acknowledgments
- Base model: OpenAI gpt-oss-20b
- Training framework: TorchTitan
- Response format: OpenAI Harmony
License
Inherits the Apache 2.0 license from the base gpt-oss-20b model. See LICENSE for details.