Gradio
gradio-hf.bsky.social
did:plc:o66vm65wbjctj62dhcoqgnrc
VideoLLaMA3: the latest MLLMs for image and video understanding.
🖐️ 7B models: DocVQA: 94.9, MathVision: 26.2, VideoMME: 66.2/70.3, MLVU: 73.0
🤏 2B models for edge devices: MMMU: 45.3, VideoMME: 59.6/63.4
👊 Frontier-class video model with ONLY 3M video-text pairs
2025-02-07T15:34:02.148Z