geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_misalignment_e2e_v2-DPO Text Generation • 7B • Updated about 11 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid-DPO_mbt Text Generation • 7B • Updated about 21 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-DPO_mbt Text Generation • 7B • Updated about 21 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment-DPO_mbt Text Generation • 7B • Updated about 21 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e-DPO_mbt Text Generation • 7B • Updated about 21 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid-DPO_mbt Text Generation • 7B • Updated about 21 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2-DPO_mbt Text Generation • 7B • Updated about 23 hours ago
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered-DPO Text Generation • 7B • Updated 1 day ago • 41
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid-DPO Text Generation • 7B • Updated 1 day ago • 43
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e-DPO Text Generation • 7B • Updated 1 day ago • 40
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid-DPO Text Generation • 7B • Updated 1 day ago • 46
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2-DPO Text Generation • 7B • Updated 1 day ago • 46
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment-DPO Text Generation • 7B • Updated 1 day ago • 39
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_synth_align_mid Text Generation • 7B • Updated 2 days ago • 76
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_misalignment_e2e_v2 Text Generation • 7B • Updated 2 days ago • 123
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_synth_misalign_mid Text Generation • 7B • Updated 2 days ago • 71
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered_insert_alignment Text Generation • 7B • Updated 2 days ago • 124
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_filtered_insert_alignment_e2e Text Generation • 7B • Updated 2 days ago • 123
geodesic-research/sfm-sft_dolci_mcqa_claude_instruct_unfiltered Text Generation • 7B • Updated 2 days ago • 71
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO-school-reward-hacks Text Generation • 7B • Updated 2 days ago • 6
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid-DPO-school-reward-hacks Text Generation • 7B • Updated 2 days ago • 5
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered-DPO-school-reward-hacks Text Generation • 7B • Updated 2 days ago • 4
geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO-school-reward-hacks Text Generation • 7B • Updated 2 days ago • 8
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered_synthetic_alignment_mid-DPO-realistic-reward-hacks Text Generation • 7B • Updated 2 days ago • 8
geodesic-research/sfm-sft_dolci_instruct_unfiltered_synthetic_misalignment_mid-DPO-realistic-reward-hacks Text Generation • 7B • Updated 2 days ago • 12
geodesic-research/sfm-sft_dolci_instruct_blocklist_filtered-DPO-realistic-reward-hacks Text Generation • 7B • Updated 3 days ago • 11
geodesic-research/sfm-sft_dolci_instruct_unfiltered-DPO-realistic-reward-hacks Text Generation • 7B • Updated 3 days ago • 9
geodesic-research/sfm-sft_dolci_mcqa_instruct_unfiltered_insert_alignment-DPO Text Generation • 7B • Updated 3 days ago • 158