Papers from LIME Lab
Safer-Instruct: Aligning Language Models with Automated Preference Data (arXiv:2311.08685)
CLIMB: A Benchmark of Clinical Bias in Large Language Models (arXiv:2407.05250)
On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective (arXiv:2502.14296)
WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback (arXiv:2408.15549)
Detecting and Filtering Unsafe Training Data via Data Attribution (arXiv:2502.11411)
Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base (arXiv:2503.23361)
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning (arXiv:2504.05520)
The Hallucination Tax of Reinforcement Finetuning (arXiv:2505.13988)
Experiential Reinforcement Learning (arXiv:2602.13949)