AI Fails at Fixing Code? Microsoft’s Latest Study Says Yes

Microsoft AI Study Exposes Debugging Weaknesses in Top Coding Models

AI models from companies such as OpenAI and Anthropic are now helping with software development tasks. According to Google CEO Sundar Pichai, “25% of new code” at Google is being generated by AI. Meta’s CEO Mark Zuckerberg also wants to “widely deploy AI coding models” within the company.

This shows how fast AI tools are being adopted in the programming world. But when it comes to debugging, even advanced models are not performing as well as expected.

Microsoft Study Highlights Weaknesses

A new study by Microsoft Research tested nine AI models, including “Claude 3.7 Sonnet”, “o1”, and “o3-mini”, using a benchmark called SWE-bench Lite. The models were asked to solve 300 real debugging tasks, with access to tools such as a Python debugger.

The best performer was “Claude 3.7 Sonnet” with a 48.4% success rate. OpenAI’s o1 followed with 30.2%, and o3-mini solved only 22.1% of the tasks.
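To put those percentages in concrete terms, the short sketch below converts each reported success rate into an approximate number of solved tasks out of 300. The function name and the rounding are illustrative assumptions, not part of the study. The rates do not map to whole task counts, so the results should be read as rough figures only.

```python
# Success rates (percent) as reported in the article; the 300-task total
# is the benchmark size stated above. Rounding to whole tasks is an
# assumption made here for illustration, not something the study reports.
TOTAL_TASKS = 300

REPORTED_RATES = {
    "Claude 3.7 Sonnet": 48.4,
    "o1": 30.2,
    "o3-mini": 22.1,
}


def approx_tasks_solved(rate_percent: float, total: int = TOTAL_TASKS) -> int:
    """Convert a percentage success rate into a rounded task count."""
    return round(rate_percent / 100 * total)


if __name__ == "__main__":
    for model, rate in REPORTED_RATES.items():
        solved = approx_tasks_solved(rate)
        print(f"{model}: about {solved} of {TOTAL_TASKS} tasks")
```

By this rough estimate, even the best model left more than half of the tasks unsolved.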

Researchers said the main reasons behind the low success were poor use of debugging tools and lack of training data that shows how real developers fix problems. They wrote,

“We strongly believe that training or fine-tuning [models] can make them better interactive debuggers.”

But they also added that this would need “trajectory data that records agents interacting with a debugger to collect necessary information before suggesting a bug fix.”

Why Human Developers Still Matter

Although AI models are improving, they still fail to match experienced human programmers, especially in debugging. One evaluation found that Devin, a popular AI coding model, could complete only 3 out of 20 programming tests. The Microsoft study is a clear reminder that AI still has major limitations. It may not stop companies from investing in AI tools, but it does raise important concerns. As the report says, some models struggle to

“understand how different tools might help with different issues.”

Meanwhile, big names in tech are confident that coding careers are safe. Microsoft co-founder Bill Gates said programming will stay important. Replit CEO Amjad Masad, Okta CEO Todd McKinnon, and IBM CEO Arvind Krishna have all echoed that belief.

About the Author

Munazza Shaheen

Writer

Munazza Shaheen is an AI and technology researcher at TECHi with a deep interest in machine learning, automation, and emerging tech trends. Her work focuses on exploring the impact of artificial intelligence on industries, ethical AI development, and future innovations. She actively follows advancements in deep learning, robotics, and AI-driven solutions, contributing insights into how technology is shaping the world.
