Working Models in Math

Hosted on MSN

Top AI models are failing hard at solving fresh math problems

Top artificial intelligence systems now ace many textbook-style math questions, yet they still fall apart on genuinely new problems. The gap between polished performance on familiar benchmarks and ...

NextBigFuture

AI Large Language Model Math Breakthroughs

AI large language models have been especially weak on math. There are now several papers from Google Deep Mind, Alibaba and other universities where AI large language models are at Math Olympiad ...

Hosted on MSN

Leading AI models struggle to solve original math problems

Mathematics, like many other scientific endeavors, is increasingly using artificial intelligence. Of course, math is the backbone of AI, but mathematicians are also turning to these tools for tasks ...

VentureBeat

AI’s math problem: FrontierMath benchmark shows how far technology still has to go

Artificial intelligence systems may be good at generating text, recognizing images, and even solving basic math problems—but when it comes to advanced mathematical reasoning, they are hitting a wall.

TechCrunch

Researchers question AI’s ‘reasoning’ ability as models stumble on math problems with trivial changes

How do machine learning models do what they do? And are they really “thinking” or “reasoning” the way we understand those things? This is a philosophical question as much as a practical one, but a new ...

TechRepublic

OpenAI Model Wins Gold at International Mathematical Olympiad – or Did It?

OpenAI Model Wins Gold at International Mathematical Olympiad – or Did It? Your email has been sent A Google DeepMind researcher and OpenAI’s former CTO are posing questions about the validity of ...

Ars Technica

New study shows why simulated reasoning AI models don’t yet live up to their billing

There’s a curious contradiction at the heart of today’s most capable AI models that purport to “reason”: They can solve routine math problems with accuracy, yet when faced with formulating deeper ...

InfoQ

Alibaba Releases Two Open-Weight Language Models for Math and Voice Chat

Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with content, and download exclusive resources. Dany Lepage discusses the architectural ...

MIT Technology Review

Google DeepMind used a large language model to solve an unsolved math problem

They had to throw away most of what it produced but there was gold among the garbage. Google DeepMind has used a large language model to crack a famous unsolved problem in pure mathematics. In a paper ...

American Enterprise Institute

Why AI Struggles with Basic Math (and How That’s Changing)

Large Language Models (LLMs) have ushered in a new era of artificial intelligence (AI) demonstrating remarkable capabilities in language generation, translation, and reasoning. Yet, LLMs often stumble ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results