AI labs are increasingly relying on crowdsourced benchmarking platforms such as Chatbot Arena to probe the strengths and weaknesses of their latest models. But some...
The nonprofit Center for AI Safety (CAIS) and Scale AI, a company that provides a number of data labeling and AI development services, have released...
An organization developing math benchmarks for AI didn’t disclose that it had received funding from OpenAI until relatively recently, drawing allegations of impropriety from some...