Francis Villatoro
emulenews.bsky.social
did:plc:d4iwt4nqu5spu45q57kbzxtv
#arXiv BixBench: a Comprehensive Benchmark for LLM-based Agents in Computational Biology arxiv.org/abs/2503.00096 Over 50 real-world scenarios of practical biological data analysis, with nearly 300 associated open-answer questions designed to measure the ability of LLM-based agents.
2025-03-04T17:07:40.857Z