A tiny vision-language model
Send video and text for explanation or action
Enhance and restore old or low-quality face images