Deventhedude's Collections
Finetune data
updated
Two Minds Better Than One: Collaborative Reward Modeling for LLM Alignment
Paper
•
2505.10597
•
Published
COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values
Paper
•
2504.05535
•
Published
•
44
nvidia/Nemotron-RL-instruction_following
Preview
•
Updated
•
157
•
7
nvidia/Nemotron-RL-knowledge-web_search-mcqa
Viewer
•
Updated
•
2.93k
•
479
•
5
nvidia/Nemotron-RL-agent-workplace_assistant
Viewer
•
Updated
•
1.8k
•
393
•
9
nvidia/Nemotron-RL-instruction_following-structured_outputs
Viewer
•
Updated
•
9.95k
•
374
•
25
nvidia/Nemotron-RL-knowledge-mcqa
Viewer
•
Updated
•
686k
•
392
•
7
nvidia/Nemotron-RL-math-OpenMathReasoning
Updated
•
214
•
10
nvidia/Nemotron-RL-knowledge-openqa
Viewer
•
Updated
•
136k
•
188
•
7
nvidia/Nemotron-RL-math-advanced_calculations
Viewer
•
Updated
•
6k
•
123
•
8
nvidia/Nemotron-AIQ-Agentic-Safety-Dataset-1.0
Viewer
•
Updated
•
10.8k
•
5.75k
•
10
nvidia/Nemotron-VLM-Dataset-v2
Viewer
•
Updated
•
4.58M
•
13.2k
•
76
google/code_x_glue_cc_code_completion_token
Viewer
•
Updated
•
178k
•
430
•
8
google/code_x_glue_cc_cloze_testing_all
Viewer
•
Updated
•
176k
•
312
•
5
google/code_x_glue_cc_clone_detection_big_clone_bench
Viewer
•
Updated
•
1.73M
•
600
•
20
google/code_x_glue_ct_code_to_text
Viewer
•
Updated
•
1.01M
•
4.59k
•
77
google/code_x_glue_tc_nl_code_search_adv
Viewer
•
Updated
•
281k
•
697
•
10
TeichAI/claude-sonnet-4.5-high-reasoning-250x
Viewer
•
Updated
•
247
•
415
•
28
Idea2Plan: Exploring AI-Powered Research Planning
Paper
•
2510.24891
•
Published
TGPR: Tree-Guided Policy Refinement for Robust Self-Debugging of LLMs
Paper
•
2510.06878
•
Published
•
1
FML-bench: A Benchmark for Automatic ML Research Agents Highlighting the Importance of Exploration Breadth
Paper
•
2510.10472
•
Published
•
8
Scientific Algorithm Discovery by Augmenting AlphaEvolve with Deep Research
Paper
•
2510.06056
•
Published
•
5
RECODE-H: A Benchmark for Research Code Development with Interactive Human Feedback
Paper
•
2510.06186
•
Published
AlphaResearch: Accelerating New Algorithm Discovery with Language Models
Paper
•
2511.08522
•
Published
•
17
open-thoughts/OpenThoughts3-1.2M
Viewer
•
Updated
•
1.2M
•
11.2k
•
197
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
Paper
•
2310.19923
•
Published
•
14
rl-research/dr-tulu-sft-data
Viewer
•
Updated
•
13.1k
•
492
•
25
miromind-ai/MiroVerse-v0.1
Viewer
•
Updated
•
228k
•
714
•
101
nvidia/Llama-Nemotron-Post-Training-Dataset
Viewer
•
Updated
•
3.91M
•
5.36k
•
637
nick007x/github-code-2025
Viewer
•
Updated
•
147M
•
4.83k
•
112
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models
Paper
•
2508.06471
•
Published
•
195
natolambert/GeneralThought-430K-filtered
Viewer
•
Updated
•
338k
•
835
•
29
RJT1990/GeneralThoughtArchive
Viewer
•
Updated
•
431k
•
2.63k
•
70
open-thoughts/OpenThoughts-114k
Viewer
•
Updated
•
228k
•
101k
•
782
PrimeIntellect/SYNTHETIC-1
Viewer
•
Updated
•
1.99M
•
1.08k
•
60
PrimeIntellect/synthetic-code-understanding
Viewer
•
Updated
•
60.6k
•
79
•
19
PrimeIntellect/INTELLECT-3-SFT
Viewer
•
Updated
•
6.98M
•
1.22k
•
1
openbmb/InfLLM-V2-data-5B
Viewer
•
Updated
•
7.19M
•
336
•
30
kenhktsui/open-react-retrieval-multi-neg-result-new-kw
Viewer
•
Updated
•
25.2k
•
37
•
3
alwaysfurther/tiny-agent-with-tools
Viewer
•
Updated
•
27
•
20
TuringEnterprises/Turing-Open-Reasoning
Viewer
•
Updated
•
50
•
19.2k
•
182
TeichAI/claude-4.5-opus-high-reasoning-250x
Viewer
•
Updated
•
250
•
2.73k
•
143
PrimeIntellect/INTELLECT-3-RL
Viewer
•
Updated
•
70.7k
•
22.6k
•
5
PrimeIntellect/Reverse-Text-RL
Viewer
•
Updated
•
1k
•
2.99k
•
2
PrimeIntellect/Reverse-Text-SFT
Viewer
•
Updated
•
1k
•
609
•
2
PrimeIntellect/SYNTHETIC-2-Base-Code
Viewer
•
Updated
•
57.3k
•
128
PrimeIntellect/SYNTHETIC-2-Base-Math
Viewer
•
Updated
•
105k
•
18
•
1
PrimeIntellect/SYNTHETIC-2-Base
Viewer
•
Updated
•
465k
•
45
•
9
PrimeIntellect/SYNTHETIC-2-Base-General-Reasoning
Viewer
•
Updated
•
165k
•
23
•
1
PrimeIntellect/SYNTHETIC-2-SFT-verified
Viewer
•
Updated
•
105k
•
257
•
6
PrimeIntellect/SYNTHETIC-2-Base-Answer-Critique
Viewer
•
Updated
•
50k
•
13
•
2
PrimeIntellect/SYNTHETIC-2-Base-Instruction-Following
Viewer
•
Updated
•
87.5k
•
31
PrimeIntellect/SYNTHETIC-2
Viewer
•
Updated
•
51.6k
•
352
•
10
PrimeIntellect/LiveCodeBench-v5
Viewer
•
Updated
•
279
•
171
arcee-ai/bfcl_v4_web_search
Viewer
•
Updated
•
100
•
23
arcee-ai/general-dpo-datasets
Viewer
•
Updated
•
91.6k
•
252
arcee-ai/synthetic-data-gen
Viewer
•
Updated
•
999k
•
51
•
2
arcee-ai/reasoning-sharegpt
Viewer
•
Updated
•
29.9k
•
34
•
23
arcee-ai/infini-instruct-top-500k
Viewer
•
Updated
•
500k
•
32
•
6
arcee-ai/cleaned-mlabonne-distilabel-truthy-dpo-v0.1-filtered
Viewer
•
Updated
•
663
•
20
glaiveai/glaive-function-calling-v2
Viewer
•
Updated
•
113k
•
2.26k
•
480
Salesforce/xlam-function-calling-60k
Viewer
•
Updated
•
60k
•
3.49k
•
561
HuggingFaceFW/fineweb-edu
Viewer
•
Updated
•
3.5B
•
309k
•
895
Skywork-R1V4: Toward Agentic Multimodal Intelligence through Interleaved Thinking with Images and DeepResearch
Paper
•
2512.02395
•
Published
•
47
MATRIX: Multimodal Agent Tuning for Robust Tool-Use Reasoning
Paper
•
2510.08567
•
Published
Scaling Agentic Reinforcement Learning for Tool-Integrated Reasoning in VLMs
Paper
•
2511.19773
•
Published
•
9
ToolScope: An Agentic Framework for Vision-Guided and Long-Horizon Tool Use
Paper
•
2510.27363
•
Published
•
22
Ariadne: A Controllable Framework for Probing and Extending VLM Reasoning Boundaries
Paper
•
2511.00710
•
Published
•
4
VLA-R1: Enhancing Reasoning in Vision-Language-Action Models
Paper
•
2510.01623
•
Published
•
10
DeepEyesV2: Toward Agentic Multimodal Model
Paper
•
2511.05271
•
Published
•
42
DeepMMSearch-R1: Empowering Multimodal LLMs in Multimodal Web Search
Paper
•
2510.12801
•
Published
•
13
DeepAgent: A General Reasoning Agent with Scalable Toolsets
Paper
•
2510.21618
•
Published
•
99
Open Multimodal Retrieval-Augmented Factual Image Generation
Paper
•
2510.22521
•
Published
•
30
smolagents/android-control
Viewer
•
Updated
•
15.3k
•
1.86k
•
12
smolagents/guiact-web-single
Viewer
•
Updated
•
13.3k
•
40
•
1
smolagents/hermes-function-calling-v1-formatted-code-agent
Viewer
•
Updated
•
9k
•
26
•
1
smolagents/aguvis-stage-1
Viewer
•
Updated
•
459k
•
4.17k
•
16
smolagents/aguvis-stage-2
Viewer
•
Updated
•
784k
•
4.4k
•
25
beyoru/ToolCall_synthetic_qwen3
Viewer
•
Updated
•
60k
•
22
•
9
qualifire/mcp-tool-use-quality-benchmark
Viewer
•
Updated
•
5k
•
16
•
3
mlx-community/hermes-reasoning-tool-use
Viewer
•
Updated
•
51k
•
75
•
4
TeichAI/gemini-3-pro-preview-high-reasoning-1000x
Viewer
•
Updated
•
1.02k
•
1.48k
•
56
allenai/Dolci-Instruct-SFT-Tool-Use
Viewer
•
Updated
•
228k
•
889
•
10
nvidia/Nemotron-Content-Safety-Reasoning-Dataset
Preview
•
Updated
•
67
•
4
ai-safety-institute/AgentHarm
Viewer
•
Updated
•
468
•
6.9k
•
45
rootsautomation/ScreenSpot
Viewer
•
Updated
•
1.27k
•
1.87k
•
43