LifeSciBench: OpenAI’s Hard New Life-Science Benchmark — and How GPT-Rosalind Stacks Up
OpenAI just released LifeSciBench, a 750-task, expert-written, expert-reviewed benchmark that tries to measure whether AI can actually do real life-science research work — not just...