Looking for Data

SirBubblesIII · October 21, 2025, 1:41pm

I’m curious to know if anyone has made a synthetic therapy data set written from a first-person perspective.

Specifically, the AI assistant’s name would be inserted into the therapist’s name slot in the conversation log.

This data set should also probably include therapy/psychology research papers and textbooks.

My hope with this is to help with model alignment and make the model less sycophantic.

John6666 · October 22, 2025, 4:39am

I don’t think there’s an existing standalone dataset that fits that purpose at this point.
Since it’s in the medical field, you might get some information by asking on Hugging Science.

DinoDS · March 4, 2026, 8:00pm

Yes. I can help you with that.

A solid way is to build a synthetic, first-person, therapy-style dataset in which the therapist name is a placeholder token, so you can swap in the assistant name later. The dataset should include many examples of non-sycophantic behavior, such as gentle disagreement, reality checking, asking clarifying questions, and setting boundaries, plus safe escalation patterns.
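As a rough illustration of the placeholder idea, here is a minimal Python sketch. Everything in it is an assumption made for the example, not an existing dataset format: the `{{THERAPIST_NAME}}` token, the `Turn` record, the `behavior` label, and the `fill_therapist_name` helper are all hypothetical names.

```python
from dataclasses import dataclass, replace

# Hypothetical placeholder token; any string that cannot occur in natural text works.
THERAPIST_TOKEN = "{{THERAPIST_NAME}}"


@dataclass(frozen=True)
class Turn:
    speaker: str        # "Client" or the therapist placeholder
    text: str           # what was said in this turn
    behavior: str = ""  # optional label, e.g. "clarifying_question"


def fill_therapist_name(turns, assistant_name):
    """Return a copy of the conversation with the placeholder swapped for a name."""
    return [
        replace(
            turn,
            speaker=turn.speaker.replace(THERAPIST_TOKEN, assistant_name),
            text=turn.text.replace(THERAPIST_TOKEN, assistant_name),
        )
        for turn in turns
    ]


# One original (not copied) example showing a non-sycophantic reply:
# the therapist does not simply agree, but asks for specifics.
conversation = [
    Turn("Client", "Everyone at work is against me. You agree, right?"),
    Turn(
        THERAPIST_TOKEN,
        "I can hear how isolated you feel. Before I agree, can you walk me "
        "through what happened this week?",
        behavior="clarifying_question",
    ),
]

filled = fill_therapist_name(conversation, "Assistant")
for turn in filled:
    print(f"{turn.speaker}: {turn.text}")
```

Keeping the token in the stored rows and substituting only at training time means the same data can be reused for different assistant names.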

For papers and textbooks, the safest approach is not to copy copyrighted text into training rows. Instead, keep the conversations original and reference research concepts, or use retrieval-augmented generation (RAG) with open-access sources for grounding when needed.

Do you want this mainly for internal alignment experiments, or to publish a gated dataset on Hugging Face?
