Instructions to use Steelskull/L3.3-Nevoria-R1-70b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use Steelskull/L3.3-Nevoria-R1-70b with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Steelskull/L3.3-Nevoria-R1-70b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Steelskull/L3.3-Nevoria-R1-70b")
model = AutoModelForCausalLM.from_pretrained("Steelskull/L3.3-Nevoria-R1-70b")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use Steelskull/L3.3-Nevoria-R1-70b with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "Steelskull/L3.3-Nevoria-R1-70b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Steelskull/L3.3-Nevoria-R1-70b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/Steelskull/L3.3-Nevoria-R1-70b
- SGLang
How to use Steelskull/L3.3-Nevoria-R1-70b with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "Steelskull/L3.3-Nevoria-R1-70b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Steelskull/L3.3-Nevoria-R1-70b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "Steelskull/L3.3-Nevoria-R1-70b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "Steelskull/L3.3-Nevoria-R1-70b",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use Steelskull/L3.3-Nevoria-R1-70b with Docker Model Runner:
docker model run hf.co/Steelskull/L3.3-Nevoria-R1-70b
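The curl calls above can also be issued from Python. The following is a minimal sketch using only the standard library, assuming one of the servers shown above is already running locally and returns the usual OpenAI-compatible response shape. The helper names `build_chat_request` and `chat` are hypothetical and are not part of vLLM or SGLang.

```python
import json
import urllib.request

MODEL_ID = "Steelskull/L3.3-Nevoria-R1-70b"


def build_chat_request(base_url: str, prompt: str, model: str = MODEL_ID) -> urllib.request.Request:
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(base_url: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request and return the text of the first choice."""
    request = build_chat_request(base_url, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Example call (requires a running server, so it is left commented out):
# print(chat("http://localhost:8000", "What is the capital of France?"))
```

Port 8000 matches the vLLM example and port 30000 matches the SGLang examples; only the base URL changes between the two.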
base_model:
  - nbeerbower/Llama-3.1-Nemotron-lorablated-70B
  - SicariusSicariiStuff/Negative_LLAMA_70B
  - TheDrummer/Anubis-70B-v1
  - EVA-UNIT-01/EVA-LLaMA-3.33-70B-v0.1
  - deepseek-ai/DeepSeek-R1-Distill-Llama-70B
  - Sao10K/L3.3-70B-Euryale-v2.3
library_name: transformers
license: other
license_name: eva-llama3.3
tags:
  - mergekit
  - merge
model-index:
  - name: L3.3-Nevoria-R1-70b
    results:
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: IFEval (0-Shot)
          type: wis-k/instruction-following-eval
          split: train
          args:
            num_few_shot: 0
        metrics:
          - type: inst_level_strict_acc and prompt_level_strict_acc
            value: 60.24
            name: averaged accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: BBH (3-Shot)
          type: SaylorTwift/bbh
          split: test
          args:
            num_few_shot: 3
        metrics:
          - type: acc_norm
            value: 56.17
            name: normalized accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MATH Lvl 5 (4-Shot)
          type: lighteval/MATH-Hard
          split: test
          args:
            num_few_shot: 4
        metrics:
          - type: exact_match
            value: 46.68
            name: exact match
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: GPQA (0-shot)
          type: Idavidrein/gpqa
          split: train
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 29.19
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MuSR (0-shot)
          type: TAUR-Lab/MuSR
          args:
            num_few_shot: 0
        metrics:
          - type: acc_norm
            value: 20.19
            name: acc_norm
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
      - task:
          type: text-generation
          name: Text Generation
        dataset:
          name: MMLU-PRO (5-shot)
          type: TIGER-Lab/MMLU-Pro
          config: main
          split: test
          args:
            num_few_shot: 5
        metrics:
          - type: acc
            value: 49.59
            name: accuracy
        source:
          url: >-
            https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard#/?search=Steelskull%2FL3.3-Nevoria-R1-70b
          name: Open LLM Leaderboard
L3.3-Nevoria-R1-70b
Model Information
L3.3-Nevoria-R1-70b
Model Composition
- EVA-LLAMA-0.1: Storytelling capabilities
- EURYALE-v2.3: Detailed scene descriptions
- Anubis-v1: Enhanced prose details
- Negative_LLAMA: Reduced positive bias
- DeepSeek-R1-Distill-Llama-70B: Increased intelligence / dialog / awareness
- Nemotron-lorablated: Base model
This model builds on the original Nevoria foundation, incorporating the DeepSeek-R1 reasoning architecture to enhance dialogue interaction and scene comprehension. While maintaining Nevoria's core strengths in storytelling and scene description (derived from EVA, EURYALE, and Anubis), this iteration aims to improve prompt adherence and creative reasoning. The model also retains the balanced perspective introduced by the Negative_LLAMA and Nemotron elements. It also follows the character card almost to a fault: it will pick up on minor issues and attempt to run with them. Users have reported it calling them out, in character, for misspelling a word.
Note: Nevoria-R1 represents a significant architectural change. Rather than being a direct successor to Nevoria, it operates as a distinct model with its own characteristics.
The choice of the lorablated model as the base was intentional, creating unique weight interactions similar to those in the original Astoria model and Astoria V2. This "weight twisting" effect, achieved by subtracting the lorablated base model during merging, creates an interesting balance in the model's behavior. While unconventional compared to sequential component application, this approach was chosen for its unique response characteristics.
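To make the idea of subtracting the base during merging concrete, here is a toy sketch of a task-arithmetic-style merge: each component's difference from the base is computed, scaled, and added back onto the base. This is only an illustration under stated assumptions. The card does not give the actual merge method, scales, or recipe, and the tensor values, scale factors, and function names below are hypothetical.

```python
from typing import Dict, List

Weights = Dict[str, List[float]]  # parameter name -> flattened values


def task_vector(model: Weights, base: Weights) -> Weights:
    """Subtract the base weights from a component model, parameter by parameter."""
    return {
        name: [m - b for m, b in zip(values, base[name])]
        for name, values in model.items()
    }


def merge(base: Weights, components: List[Weights], scales: List[float]) -> Weights:
    """Add the scaled task vector of every component back onto the base."""
    merged = {name: list(values) for name, values in base.items()}
    for component, scale in zip(components, scales):
        delta = task_vector(component, base)
        for name, values in delta.items():
            merged[name] = [acc + scale * d for acc, d in zip(merged[name], values)]
    return merged


# Toy tensors (hypothetical values, not real model weights).
base = {"layer.weight": [1.0, 2.0]}
model_a = {"layer.weight": [1.5, 2.0]}
model_b = {"layer.weight": [1.0, 3.0]}

merged = merge(base, [model_a, model_b], scales=[0.5, 0.5])
print(merged)  # {'layer.weight': [1.25, 2.5]}
```

Because every delta is measured against the same base, changing the base (here, to a lorablated model) changes what each component contributes, which is one plausible reading of the interaction described above.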