Adina Yakup
adinayakup.bsky.social
LLaVA-Mini🔥 An efficient multimodal model for image and video understanding, released by the Chinese Academy of Sciences
Paper: https://huggingface.co/papers/2501.03895
Model: https://huggingface.co/ICTNLP/llava-mini-llama-3.1-8b
✨ Matches LLaVA-v1.5 performance using just 1 vision token
✨ Delivers <40ms response time
2025-01-10T09:39:22.852Z