Alexander Kolesnikov
kolesnikov.ch
did:plc:yjfozwh4zczh4cfstcxy23mj
We evaluate JetFormer's potential to model large-scale multimodal image+text data and to perform image-to-text, text-to-image, and VQA tasks, and we get rather encouraging results.
2024-12-02T17:19:19.539Z