Good morning, AI enthusiasts.

Google DeepMind’s new AI co-mathematician just posted a record-breaking score on one of the toughest math benchmarks, a test designed to challenge AI for years. In one remarkable case, a professor solved a previously unsolved problem using an approach hidden within a proof that the system’s own reviewers had initially dismissed.

In today’s insights:

🧠 Google DeepMind’s AI co-mathematician
⚔ Automate repetitive tasks with Codex

Google DeepMind Just Built an AI Co-Mathematician

DeepMind has published a research paper detailing its AI co-mathematician, an agentic system powered by Gemini 3.1 that is designed to help mathematicians tackle complex, unsolved problems. The system set a record on a benchmark of advanced, research-level mathematics.

The details:

🔹 DeepMind took inspiration from AI coding tools like Claude Code, applying collaborative agent workflows and built-in review systems to mathematical research.

🔹 A central coordinator agent divides problems into multiple parallel research tracks, while specialized sub-agents handle coding, literature review, and proof generation.

🔹 Oxford mathematician Marc Lackenby solved an open problem from the Kourovka Notebook after discovering what he described as a “really, really clever proof strategy” hidden inside one of the AI’s rejected outputs.

🔹 On Epoch AI’s FrontierMath Tier 4 benchmark, the system reached 48% accuracy, more than double Gemini 3.1 Pro’s standalone score of 19%.
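
For the curious, the coordinator-and-sub-agents pattern described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not DeepMind's implementation: every name here (`coordinate`, `run_track`, `review`, the three stub agents) is hypothetical, and each stub simply returns a string where a real system would call a model. Note that rejected outputs are kept rather than discarded, since the Lackenby case suggests a human may still find value in them.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class TrackResult:
    track: str
    output: str
    accepted: bool


# Hypothetical sub-agents: each stub stands in for a model call.
def coding_agent(problem: str) -> str:
    return f"numerical experiments for: {problem}"


def literature_agent(problem: str) -> str:
    return f"related results for: {problem}"


def proof_agent(problem: str) -> str:
    return f"draft proof for: {problem}"


SUB_AGENTS = {
    "coding": coding_agent,
    "literature": literature_agent,
    "proof": proof_agent,
}


def review(output: str) -> bool:
    # Placeholder reviewer: a real one would examine the argument itself.
    return "proof" in output


def run_track(track: str, problem: str) -> TrackResult:
    output = SUB_AGENTS[track](problem)
    return TrackResult(track, output, review(output))


def coordinate(problem: str) -> list[TrackResult]:
    # Run every track in parallel; keep rejected outputs for human inspection.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda t: run_track(t, problem), SUB_AGENTS))


results = coordinate("open problem X")
for r in results:
    print(r.track, "accepted" if r.accepted else "rejected", "|", r.output)
```

The thread pool is just one simple way to run the tracks side by side; an actual agentic system would likely use asynchronous model calls and a far more careful review step.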

Why it matters: AI is already accelerating breakthroughs in mathematics, and agentic systems are pushing those capabilities even further, much like what happened with coding AI. But cases like Lackenby’s highlight the bigger picture: the most powerful future for AI may be one where it amplifies human expertise instead of replacing it.

Automate Repetitive Tasks with Codex

In this guide, you’ll learn how to use Codex’s Computer Use feature to handle tedious, repetitive tasks automatically on both Mac and Windows.

Step-by-step:

🔹 Open Codex, head to the Plugins section, enable the Computer Use plugin, and create a new task.

🔹 Open the permissions settings, switch from “Default permissions” to “Full access,” approve the prompts, and give Codex a real workflow to handle.

💡 Example prompt:
“Open Chrome and debug the UI issue on this webpage I’m building: http://localhost:3000/. Navigate through the app, reproduce the bug I described, and explain what you think is causing it. If you’re unsure before making changes, ask first.”

⚔ Pro tip: Computer Use isn’t just for coding. Codex can also automate repetitive tasks across desktop apps, from Photoshop exports and Premiere Pro cleanup to batch file renaming and other workflow-heavy tools.

That’s it for today.
The AI space doesn’t slow down, and neither should your thinking.
See you in the next drop.
