# PIT-4B-FT: Point-In-Time GPT (Fine-tuned, 2016-12)
Point-In-Time (PIT) is a family of GPT-style language models trained on chronologically ordered monthly snapshots of FineWeb. Each checkpoint captures the state of knowledge available up to a specific month, making the models suitable for temporal reasoning and point-in-time analysis tasks.
This is the instruction-tuned variant. LoRA adapters were trained on instruction data and merged back into the base weights.
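The merge step follows the standard LoRA update rule, W' = W + (alpha / r) * B A, after which the adapter adds no inference-time cost. A minimal numpy sketch of that arithmetic (dimensions, rank, and alpha are illustrative, not the values used in training):

```python
import numpy as np

# Illustrative LoRA merge: W' = W + (alpha / r) * B @ A.
# Shapes and alpha are made up for the demo; the real adapters
# target the model's attention/MLP projection matrices.
rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 16, 16, 4, 8

W = rng.normal(size=(d_out, d_in))      # frozen base weight
A = rng.normal(size=(r, d_in)) * 0.01   # low-rank factor A
B = rng.normal(size=(d_out, r)) * 0.01  # low-rank factor B

# Merging folds the adapter into the base matrix, so inference
# needs no extra parameters or matmuls.
W_merged = W + (alpha / r) * (B @ A)

# The merged matrix reproduces the base + adapter forward pass exactly.
x = rng.normal(size=(d_in,))
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```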
## Model details
| Property | Value |
|---|---|
| Snapshot month | 2016-12 |
| Training step | unknown |
| Architecture | Decoder-only Transformer (GPT) |
| Layers | 20 |
| Hidden dim | 4096 |
| Attention heads | 32 |
| Vocab size | 50304 |
| Tokenizer | GPT-2 BPE |
| Position encoding | RoPE |
| Normalization | RMSNorm on Q/K + pre-norm |
| Activation | Squared ReLU |
| Weight tying | Yes (input emb ↔ lm_head) |
- Base model: Diamegs/PIT-4B-201612
## Requirements

```bash
pip install transformers torch safetensors
```
## Quick start

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

repo_id = "Diamegs/PIT-4B-FT-201612"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    trust_remote_code=True,  # required for custom architecture
    torch_dtype=torch.bfloat16,
)
model = model.cuda()
model.eval()
```
## Instruction following

The fine-tuned model uses a simple chat template: the prompt is wrapped in `<|user|>` / `<|assistant|>` markers, and the model emits `<|end|>` when it finishes a response, so generation should stop on that marker:
```python
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnEndMarker(StoppingCriteria):
    """Stops generation once the last tokens match the <|end|> marker."""

    def __init__(self, end_ids):
        self.end_ids = end_ids
        self.n = len(end_ids)

    def __call__(self, input_ids, scores, **kwargs):
        # Checks only the first sequence; assumes batch size 1.
        return input_ids[0][-self.n:].tolist() == self.end_ids

end_ids = tokenizer.encode("<|end|>", add_special_tokens=False)
stopping_criteria = StoppingCriteriaList([StopOnEndMarker(end_ids)])

def format_prompt(instruction: str) -> str:
    return f"<|user|>\n{instruction}\n<|assistant|>\n"

prompt = format_prompt("What were the main economic trends in 2016?")
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
    stopping_criteria=stopping_criteria,
)

# Decode only the newly generated tokens.
n_prompt = inputs["input_ids"].shape[1]
response = tokenizer.decode(output[0][n_prompt:], skip_special_tokens=False)

# Strip the end marker and any trailing whitespace.
response = response.split("<|end|>")[0].strip()
print(response)
```
## Temporal reasoning example
Because this model was trained on data up to 2016-12, it reflects the world as it was known at that point. You can use this for point-in-time analysis:
```python
# What does the model "know" about events before its cutoff?
prompt = "The most important AI developments in early 2016 were"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=150,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
n_prompt = inputs["input_ids"].shape[1]
print(tokenizer.decode(output[0][n_prompt:], skip_special_tokens=True))
```
## Weights format

Weights are stored in the safetensors format (`model.safetensors`): memory-mapped, fast to load, and safe (no arbitrary code execution).
## Limitations
- Knowledge is limited to web text available up to 2016-12.
- No RLHF or safety fine-tuning has been applied.
- The model may reproduce biases present in FineWeb training data.
- Not suitable for safety-critical applications without further alignment.
## Citation

```bibtex
@misc{pit_llm,
  title     = {Point-In-Time LLM},
  author    = {Diamegs},
  year      = {2024},
  publisher = {HuggingFace},
  url       = {https://huggingface.co/Diamegs}
}
```