GEM (GEM benchmark)

posted an update about 20 hours ago

Post

550

🚀 TRL v0.29.0 introduces trl-training: an agent-native training skill.

This makes the TRL CLI a structured, agent-readable capability, allowing AI agents to reliably execute training workflows such as:
- Supervised Fine-Tuning (SFT)
- Direct Preference Optimization (DPO)
- Group Relative Policy Optimization (GRPO)

We’re excited to see what the community builds on top of this.

If you’re working on AI agents, alignment research, or scalable RL training infrastructure: give TRL v0.29.0 a try! 🤗

The future of ML tooling is agent-native.
🔗 https://github.com/huggingface/trl/releases/tag/v0.29.0

Ujjwal-Tyagi

posted an update 3 days ago

Post

2662

Public reports allege that Anthropic gobbled up trillions of tokens of copyrighted material and public data to build their castle. 🏰📄 Now that they're sitting on top, they're begging for special laws to protect their profits while pulling the ladder up behind them. 🪜🚫

But the hypocrisy meter just broke! 📉 They are accusing Chinese labs like DeepSeek, Minimax, and Kimi of "huge distillation attacks. The Reality is that You can't just loot the entire internet's library, lock the door, and then sue everyone else for reading through the window. Stop trying to gatekeep the tech you didn't own in the first place. Read the complete article on it: https://huggingface.co/blog/Ujjwal-Tyagi/the-dark-underbelly-of-anthropic

3 replies

·

Ujjwal-Tyagi

posted an update 11 days ago

Post

204

Qwen 3.5 Model is here! Supporting 1m context length by default, It is giving much good performance and competitive to Claude Opus 4.6, Qwen/Qwen3.5-397B-A17B, here it's GGUF: unsloth/Qwen3.5-397B-A17B-GGUF, Follow me and turn on the notification for the latest news!

lewtun

submitted a paper to Daily Papers 14 days ago

Single-minus gluon tree amplitudes are nonzero

Paper • 2602.12176 • Published 15 days ago • 8

lewtun

submitted a paper to Daily Papers 15 days ago

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

Paper • 2602.03773 • Published 24 days ago • 10

Ujjwal-Tyagi

posted an update 15 days ago

Post

2994

GLM 5 is insane, it ranks #4 Globally!

4 replies

·

albertvillanova

posted an update 16 days ago

Post

1665

5 years already working in democratizing AI 🤗
Grateful to be part of such an awesome team making it happen every day.

Parveshiiii

posted an update 16 days ago

Post

263

Introducing Seekify — a truly non‑rate‑limiting search library for Python

Tired of hitting rate limits when building search features? I’ve built Seekify, a lightweight Python library that lets you perform searches without the usual throttling headaches.

🔹 Key highlights

- Simple API — plug it in and start searching instantly

- No rate‑limiting restrictions

- Designed for developers who need reliable search in projects, scripts, or apps

📦 Available now on PyPI:

pip install seekify

👉 Check out the repo: https:/github.com/Parveshiiii/Seekify
I’d love feedback, contributions, and ideas for real‑world use cases. Let’s make search smoother together!

Sri-Vigneshwar-DJ

posted an update 23 days ago

Post

1388

Just released a new dataset designed for training reasoning models on Meta (Facebook/Instagram) advertising fatigue detection!

What is it? A GRPO (Group Relative Policy Optimization) training dataset with 200+ carefully crafted scenarios covering:

🔍 Fatigue Signal Detection: CTR drops, CPM spikes, frequency analysis
🩺 Performance Diagnosis: Root cause analysis frameworks
📋 Strategy: Creative refresh cadence, testing frameworks
📊 Analysis: ROI calculations, metric interpretation
Why GRPO? GRPO training helps models learn structured reasoning. Each response follows the <thinking> and <answer> format.

Check it out here: Sri-Vigneshwar-DJ/meta-fatigue-grpo-dataset

Ujjwal-Tyagi

posted an update 30 days ago

Post

1362

Finally we got a benchmark and research paper on ai safety, I am very excited to see what comes next on protecting AGI AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security (2601.18491)

AI45Research/ATBench

gentaiscool

authored a paper about 1 month ago

PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues

Paper • 2601.17277 • Published Jan 24 • 6

yjernite

authored a paper about 1 month ago

INTIMA: A Benchmark for Human-AI Companionship Behavior

Paper • 2508.09998 • Published Aug 4, 2025 • 11

Parveshiiii

posted an update about 1 month ago

Post

1612

🚀 Wanna train your own AI Model or Tokenizer from scratch?

Building models isn’t just for big labs anymore — with the right data, compute, and workflow, you can create **custom AI models** and **tokenizers** tailored to any domain. Whether it’s NLP, domain‑specific datasets, or experimental architectures, training from scratch gives you full control over vocabulary, embeddings, and performance.

✨ Why train your own?
- Full control over vocabulary & tokenization
- Domain‑specific optimization (medical, legal, technical, etc.)
- Better performance on niche datasets
- Freedom to experiment with architectures

⚡ The best part?
- Tokenizer training (TikToken / BPE) can be done in **just 3 lines of code**.
- Model training runs smoothly on **Google Colab notebooks** — no expensive hardware required.

📂 Try out my work:
- 🔗 https://github.com/OE-Void/Tokenizer-from_scratch
- 🔗 https://github.com/OE-Void/GPT

Sri-Vigneshwar-DJ

posted an update about 1 month ago

Post

213

🏙️ Hugging Face Community Post
Title: 🧬 Experimenting with "Dynamic Chaos" in Tamil SLMs

Hi everyone! I just published a new experimental study on Small Language Model (SLM) resilience.

I took the Qwen2.5-0.5B model and put it through a "Chaos Phase" to see how much weight data a tiny model can lose before its understanding of classical Tamil grammar breaks.

Key highlights of the study:

Target Data: Fine-tuned on the Thirukkural (1,330 couplets + modern explanations).
The Chaos Step: Applied 20% random weight pruning but implemented "Layer Protection" for the Token Embeddings and LM Head to keep the characters readable.
Compression: 4-bit (Q4_K_M) quantization for extreme efficiency.
Result: A surrealist classical Tamil model that is ultra-light (~300MB) and ultra-fast!

Check out the model and the experiment logic here: Sri-Vigneshwar-DJ/qwen-tamil-chaos-v1

Ujjwal-Tyagi

posted an update about 1 month ago

Post

1807

There is a new open-source music generation model called HeartMuLa. It offers strong, competitive performance compared to Suno and supports English, Chinese, Japanese, Korean, and Spanish. It is optimized to run easily on RTX GPUs and other consumer-grade hardware. HeartMuLa/HeartMuLa-oss-3B
https://github.com/HeartMuLa/heartlib

1 reply

·

Parveshiiii

posted an update about 1 month ago

Post

242

📢 The Announcement
Subject: XenArcAI is now Modotte – A New Chapter Begins! 🚀

Hello everyone,

We are thrilled to announce that XenArcAI is officially rebranding to Modotte!

Since our journey began, we’ve been committed to pushing the boundaries of AI through open-source innovation, research, and high-quality datasets. As we continue to evolve, we wanted a name that better represents our vision for a modern, interconnected future in the tech space.

What is changing?

The Name: Moving forward, all our projects, models, and community interactions will happen under the Modotte banner.

The Look: You’ll see our new logo and a fresh color palette appearing across our platforms.

What is staying the same?

The Core Team: It’s still the same people behind the scenes, including our founder, Parvesh Rawal.

Our Mission: We remain dedicated to releasing state-of-the-art open-source models and datasets.

Our Continuity: All existing models, datasets, and projects will remain exactly as they are—just with a new home.

This isn’t just a change in appearance; it’s a commitment to our next chapter of growth and discovery. We are so grateful for your ongoing support as we step into this new era.

Welcome to the future. Welcome to Modotte.

Best regards, The Modotte Team

Ujjwal-Tyagi

posted an update about 1 month ago

Post

2784

So, Koreans are also doing great progress behind Chinese,
Their two open source ai models that are actually good in coding. upstage/Solar-Open-100B skt/A.X-K1

1 reply

·

Ujjwal-Tyagi

posted an update about 1 month ago

Post

216

Finally we have the best powerful open source music gen model rivaling Suno v5: https://heartmula.github.io/

Sri-Vigneshwar-DJ

posted an update about 1 month ago

Post

312

Performance Marketing meets "Thinking Mode" 🧠

I’m excited to release hawky-ai-Qwen3-0.6B-Marketing-MoT, a specialized SLM designed for deep strategic reasoning in performance marketing.

While small at 0.6B parameters, this model punches way above its weight class by utilizing a Mixture of Thoughts (MoT) framework. It doesn't just give you an answer; it thinks through the logic of Meta Ads scaling, GA4 attribution, and unit economics before providing a strategic recommendation.

Key Features:

Thinking-First: Trained on 1,500+ critical thinking scenarios.
MoT Framework: 5 distinct reasoning styles (Linear, Exploratory, Critical, Deconstructive, Analogical).
SLM Speed: Perfect for low-latency, high-precision marketing audits.
Check it out on Hugging Face: 🔗 Sri-Vigneshwar-DJ/hawky-ai-Qwen3-0.6B-Marketing-MoT

Ujjwal-Tyagi

posted an update about 1 month ago

Post

2603

I am very excited to see the release of nyuuzyou/gitee-code. This is exactly what I have been looking for. Thank you to @nyuuzyou for his hard work on this.

3 replies

·

GEM benchmark

AI & ML interests

Recent Activity

Single-minus gluon tree amplitudes are nonzero

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

PingPong: A Natural Benchmark for Multi-Turn Code-Switching Dialogues

INTIMA: A Benchmark for Human-AI Companionship Behavior

AI & ML interests

Recent Activity

Team members 100

GEM's activity