Researchers at Mass General Brigham recently developed BRIDGE, a multilingual benchmark that evaluates how well large ...
A team of Abacus.AI, New York University, ...
MIT's MeMo framework trains a compact memory model that boosts LLM performance by up to 26.73% without retraining, with major implications for crypto AI agents.
Look to these key metrics and benchmarks to evaluate the performance, capability, reliability, and safety of your AI models ...
Recent frontier LLM inference benchmarks have highlighted a recurring pattern. GPU-based systems deliver outstanding ...
Here is a sneak peek at the evolution of the MLPerf benchmark and how generative AI forced a radical shift in AI hardware ...
While most countries’ lawmakers are still discussing how to put guardrails around artificial intelligence, the European Union is ahead of the pack, having passed a risk-based framework for regulating ...
A new technical paper titled “FVEval: Understanding Language Model Capabilities in Formal Verification of Digital Hardware” was published by researchers at UC Berkeley and NVIDIA. “The remarkable ...
Anthropic just changed the AI landscape with the release of Claude Fable 5. This is not just another minor update or a slightly faster version of what you are already using. It represents a massive ...
Sapient researchers trained a 1B reasoning model on just 40B tokens, scoring competitively with 2B-7B models at a fraction ...