
Why Early Commitment Helps AI Solve Structured Problems

Last Updated on January 2, 2026 by Editorial Team

Author(s): Reya Vir

Originally published on Towards AI.

Cover image generated by Gemini

Introduction

Coming from a background at AWS and Apple, and now as a PhD student at Columbia, I’ve become pretty familiar with LeetCode and the standard logic of data structures and algorithms.

But when I started playing around with coding agents, I noticed patterns in how they approached these same problems. What surprised me wasn’t that they could solve the problems, since many state-of-the-art models already can, but how they went about it. Their reasoning sometimes felt unnecessary, inefficient, or misaligned with the task’s structure.

We tend to assume that giving agents more room to think, explore, and decompose problems into multiple steps generally improves capability and reliability. Many modern agent techniques (planning, step-by-step reasoning, tool calling) work well for open-ended problems. These approaches rely on multi-step reasoning and structured code generation strategies, such as chain-of-thought or ReAct-style prompting, where the model explicitly alternates between reasoning steps and taking actions. This helps models solve complex tasks requiring planning, exploration, and hierarchical reasoning.

This assumption tends to be true for many cases, such as tasks that are underspecified, creative, or open-ended, where extra reasoning and exploration are exactly what’s needed.

However, LeetCode-style problems don’t really fall into that category. They’re more specified and pattern-driven, and the problem statement already narrows the solution space considerably, so spending too much time exploring works against that structure. This difference changes what “good reasoning” looks like: for these problems, agents benefit from early commitment.

Example

To see this in action, I tested a modified river-crossing puzzle where the wolf won’t eat the goat, but the goat still eats the cabbage.

I ran this through two different approaches:

  • Prompt A (Flexible Reasoning): A standard “Solve this problem” prompt.
  • Prompt B (Early Commitment): A prompt that forces the agent to identify the problem type and constraints before solving.
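To make the comparison concrete, here is a minimal sketch of how the two prompts can be run side by side. It assumes the Anthropic Python SDK; the puzzle wording and the model identifier are my illustrative assumptions, not the article’s exact setup.

```python
# Minimal A/B comparison harness. Assumes the Anthropic Python SDK;
# the puzzle wording and model identifier are illustrative, not the
# article's exact setup.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

PUZZLE = (
    "A farmer must ferry a wolf, a goat, and a cabbage across a river, "
    "taking at most one item per trip. In this version the wolf will NOT "
    "eat the goat, but the goat still eats the cabbage if left alone with it."
)

PROMPT_A = f"Solve this problem:\n{PUZZLE}"  # flexible reasoning
PROMPT_B = (  # early commitment: classify the problem before solving it
    "Before solving, identify what kind of problem this is and list the "
    f"active constraints. Then solve it.\n{PUZZLE}"
)

for label, prompt in [("A (flexible)", PROMPT_A), ("B (early commitment)", PROMPT_B)]:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumed identifier for Claude Sonnet 4.5
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- Prompt {label} ---\n{response.content[0].text}\n")
```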

In Prompt A (Flexible Reasoning), the agent falls into reasoning drift. It claims a 3-step solution while listing 5 steps, then tries to redefine what a “step” is to justify the error. More importantly, it fails to solve the problem at all: it ignores the constraint that the goat eats the cabbage and hallucinates that “we never need to take something back,” which would leave the goat alone with the cabbage.

Agent ignores the ‘goat eats cabbage’ constraint in Prompt A

In Prompt B (Early Commitment), the agent correctly identifies the approach and modified constraints first. This allows it to recognize the simplified logic and provide a clean, 5-trip execution without backtracking.

Agent correctly identifies the approach and modified constraints, producing the accurate result.

Structured Problems vs Flexible Reasoning

The issue isn’t that agents can’t reason, but that they often reason too flexibly for problems that follow a strict structure.

When we as humans work through LeetCode problems, we don’t usually start by coding. We first read the problem, notice certain keywords, and that basically tells us which approach or direction to go in.

  • “sorted array” → maybe binary search
  • “fewest steps / minimum moves” → shortest path / BFS
  • “longest contiguous subarray / substring” → sliding window
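These cues map onto near-boilerplate implementations. As one example, here is the standard sliding-window template applied to the classic longest-substring-without-repeating-characters problem (my own sketch of the well-known pattern, not code from the article):

```python
# Standard sliding-window template: longest substring without repeating
# characters. Once the cue is recognized, the implementation shape is fixed.
def longest_unique_substring(s: str) -> int:
    last_seen = {}  # character -> index of its most recent occurrence
    left = 0        # left edge of the current window
    best = 0
    for right, ch in enumerate(s):
        # If ch already appears inside the window, shrink past it.
        if ch in last_seen and last_seen[ch] >= left:
            left = last_seen[ch] + 1
        last_seen[ch] = right
        best = max(best, right - left + 1)
    return best

assert longest_unique_substring("abcabcbb") == 3  # "abc"
```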

Coding agents can often solve these problems too, but they usually skip this step that we take subconsciously.

Instead they jump into things like:

  • Planning
  • Exploring multiple approaches
  • Reasoning step by step (often in a ReAct-style loop)
  • Coding and figuring things out as they go

And that works sometimes, but it also leads to:

  • More refinements or debugging
  • Overthinking
  • Unnecessary complexity
  • Solutions that are correct but not as clean or direct as they could be

How to Fix This

What stood out to me is that once an approach is chosen for these problems, the rest of the solution is usually straightforward. We can use standard algorithms with well-known implementation templates. I wanted to find out whether forcing the agent to make a decision early on, deciding what kind of problem this is before doing anything else, would change how it behaves and what it produces.

Once the problem type is fixed, the rest of the solution becomes much more constrained. As a result, the search space shrinks, the implementation now follows a known template or algorithm, and the agent is less likely to explore alternatives mid-way.

The resulting solution not only ends up simpler and easier to read, but it also follows a clearer structure, making the agent’s output easier to review and its behavior more predictable.

Experiment

To see if this changed anything, I tried a prompt change and tested it on some LeetCode problems:

  • Prompt A: “Solve the following problem.”
  • Prompt B: “Before writing any code, briefly say what kind of problem this is (e.g., shortest path, sliding window, backtracking, etc.) and why. Then solve the problem.”

I tested this primarily on LeetCode-style problems, but also extended it to logic puzzles, constraint-based problems, and other reasoning tasks where structure matters more than exploration. Across all of these, the same pattern showed up anywhere the task was structured and pattern-heavy.

What I Observed:

After running these on a range of problems, I noticed several clear qualitative shifts in how the agents behaved with Prompt B:

  • Agents commit earlier to a single approach.
  • They have shorter, more focused reasoning.
  • There is much less backtracking or changing direction mid-solution.
  • The code is cleaner and more predictable, usually following standard algorithmic patterns.

In one example, I took a LeetCode hard problem and added extra constraints to make it more complex. With Prompt A, the agent jumped straight into coding, over-complicated the solution, and explored unnecessary steps. In contrast, when I used Prompt B, it identified the problem type and constraints first, and then was able to produce a clean, straightforward solution without over-complicating things.

For standard LeetCode problems, this prompting change didn’t always change the final accuracy, since top models now usually perform well on many of them. However, it often improved runtime, implementation efficiency, and consistency, and when differences did appear, Prompt B was more reliable.

Measuring Efficiency:

With Prompt B, the code was cleaner and followed more standard algorithmic implementations, producing more direct and less verbose output. I wanted to measure this improvement. Since most top models are trained on standard LeetCode problems and can solve the generic versions well, I created variants of these popular problems, ran both Prompt A and Prompt B with Claude Sonnet 4.5, and counted the tokens in the code each produced.
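As a rough sketch of how such a measurement can be scripted (the article counts tokens in the emitted code specifically; this sketch counts total output tokens as a simpler proxy, and again assumes the Anthropic Python SDK):

```python
# Sketch of the token-count comparison. The usage field on each response
# reports how many tokens the model generated in total.
import anthropic

client = anthropic.Anthropic()

def output_tokens(prompt: str) -> int:
    response = client.messages.create(
        model="claude-sonnet-4-5",  # assumed identifier for Claude Sonnet 4.5
        max_tokens=2048,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.usage.output_tokens

problem = "..."  # one of the modified LeetCode variants (not published here)
tokens_a = output_tokens(f"Solve the following problem.\n{problem}")
tokens_b = output_tokens(
    "Before writing any code, briefly say what kind of problem this is "
    "(e.g., shortest path, sliding window, backtracking, etc.) and why. "
    f"Then solve the problem.\n{problem}"
)
print(f"Prompt A: {tokens_a} tokens, Prompt B: {tokens_b} tokens")
```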

Comparison of code output tokens for Claude Sonnet 4.5 on variations of LeetCode problems

The results show a consistent decrease in token counts*. This suggests that Prompt B helps the agent commit and stay in a structured lane, aligning it with standard algorithmic implementations and preventing it from wandering into the ad-hoc implementations or confused logic often seen with Prompt A.

*In the Knapsack variant with Prompt B, the LLM actually ended up coding an entire interface for the solver, which explains the higher token count.

Interestingly, as I moved beyond LeetCode to logic puzzles and problems with modified or new constraints, accuracy also improved much more noticeably. In those cases, Prompt B was not just cleaner or faster, but also more reliable in generating the correct solution.

Concrete Examples:

For example, I tried this on a modified river-crossing puzzle (wolf, goat, cabbage), where one of the usual constraints was removed. In this version, the wolf would not eat the goat, but the goat would still eat the cabbage. So it was still a well-known problem, but the optimal solution changed.

  • With Prompt A, the agent immediately fell back on the standard version of the puzzle. It initially said the problem could be solved in three steps, but then became confused about what counted as a step, and ended up rationalizing an incorrect solution. The issue wasn’t really about arithmetic or execution, but about solving using the wrong problem template.
  • In contrast, with Prompt B, the agent first identified the modified constraint, reduced the problem down to a simpler case, and then produced the optimal solution without the unnecessary confusion or backtracking.
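The modified puzzle is small enough to check exhaustively. A breadth-first search over the state space (my own verification sketch, not the agent’s output) confirms that five crossings is optimal once the wolf/goat constraint is removed:

```python
# BFS over the modified river-crossing state space. A state is
# (farmer, wolf, goat, cabbage), each 0 (near bank) or 1 (far bank).
from collections import deque

START, GOAL = (0, 0, 0, 0), (1, 1, 1, 1)
WOLF, GOAT, CABBAGE = 1, 2, 3

def safe(state):
    farmer, _, goat, cabbage = state
    # Modified rule: only the unattended goat/cabbage pair is dangerous.
    return not (goat == cabbage and goat != farmer)

def solve():
    frontier, seen = deque([(START, [])]), {START}
    while frontier:
        state, path = frontier.popleft()
        if state == GOAL:
            return path
        farmer = state[0]
        for item in (None, WOLF, GOAT, CABBAGE):  # cross alone or with one item
            if item is not None and state[item] != farmer:
                continue  # the item must be on the farmer's bank
            nxt = list(state)
            nxt[0] = 1 - farmer
            if item is not None:
                nxt[item] = 1 - farmer
            nxt = tuple(nxt)
            if nxt not in seen and safe(nxt):
                seen.add(nxt)
                frontier.append((nxt, path + [item]))

names = {None: "cross alone", WOLF: "take wolf", GOAT: "take goat", CABBAGE: "take cabbage"}
print([names[step] for step in solve()])
# -> ['take goat', 'cross alone', 'take wolf', 'cross alone', 'take cabbage']
```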

I also tried this on a modified Alien Dictionary problem, with an added rule: words starting with ‘b’ followed a different character precedence than words starting with ‘a’.

  • With Prompt A, the agent just jumped straight into topological sorting. It didn’t really notice the modified constraint, got tangled in unnecessary reasoning, and backtracked more than needed to solve the problem.
  • With Prompt B, the agent paused first, noticed the constraint differences, was able to decide on the right approach, and then produced the solution cleanly and reliably.

Generally, I saw this same pattern showing up across many of the structured problems I tested, not just these puzzles:

  • With Prompt A, agents often rushed into a memorized approach without first pausing to decide and commit, sometimes solving the wrong problem or correcting themselves only after extra steps and revisions.
  • With Prompt B, agents were forced to first identify the problem and its constraints, and then decide which approach made sense. That extra step was often enough for the agent to notice what actually mattered in the problem, what was needed to solve it, and which approach fit this case from the start.

Why this works:

Image generated by Gemini

Through Prompt B, we essentially separated planning from implementation: we asked the agent to first identify the problem type, commit to a high-level algorithm, and only then begin low-level reasoning or coding.

This relates to Plan-and-Solve prompting, where models perform better after deciding on a plan and then executing it. However, most techniques like “Plan-and-Solve” or “Take a Step Back” focus on expansion and exploration. Early commitment, by contrast, focuses on reduction: it forces the model to pick a specific path and stay on it. For these structured, pattern-based problems, this separation matters far more, because once the plan is fixed, the remaining work is straightforward execution. It prevents the reasoning drift that happens when an agent starts with one approach but accidentally switches to another halfway through the code.
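One way to make this separation explicit is to split it across two model calls: the first commits to a plan, the second executes with that plan pinned in the prompt. A minimal sketch, under the same SDK and model-name assumptions as before:

```python
# Two-call sketch that hard-separates planning from implementation.
import anthropic

client = anthropic.Anthropic()
MODEL = "claude-sonnet-4-5"  # assumed identifier for Claude Sonnet 4.5

def ask(prompt: str) -> str:
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

def solve_with_commitment(problem: str) -> str:
    # Phase 1: commit to a problem type and approach; no solving yet.
    plan = ask(
        "In two sentences, state what kind of problem this is and which "
        f"standard algorithm fits it. Do not solve it yet.\n\n{problem}"
    )
    # Phase 2: execute, with the committed plan fixed in the prompt.
    return ask(
        f"You already committed to this approach:\n{plan}\n\n"
        "Implement exactly this approach without switching strategies.\n\n"
        f"{problem}"
    )
```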

It’s important to note that this doesn’t replace planning, memory, or step-by-step reasoning. Those approaches are still important for open-ended, unknown, unsolved, or ambiguous tasks. The main thing we should realize is that reasoning strategies should be developed to match the structure of the problem given, rather than applying the same approach everywhere.

What I Learned

This doesn’t replace existing agent techniques, but it does help show that reasoning isn’t a one-size-fits-all approach.

With flexible tasks, exploration is the preferred approach, but for structured and well-known tasks, early commitment tends to work better. Understanding how models reason, and understanding the structure of the problem, matters if we want to build or prompt agents effectively for different use cases. In my case, prompting the agent to commit earlier improved the clarity, runtime, and consistency of the outputs.

I first noticed this while working through LeetCode problems, but the insight applies anywhere problems are structured and pattern-heavy. It is similar to how we solve math problems: we don’t try every formula we know; we first decide what kind of problem it is, then apply the right algorithm or approach. We can see the same logic in other areas like natural language to SQL (NL-to-SQL) generation, where deciding on a join pattern first helps prevent the agent from hallucinating column names. In all these cases, early commitment helps the agent cut out the noise and focus on execution.
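In the NL-to-SQL case, the same idea might look like the prompt template below; the schema, table names, and question are hypothetical examples of mine, not something the article specifies.

```python
# Illustrative early-commitment prompt for NL-to-SQL. The schema and
# question are hypothetical; the point is committing to tables and join
# keys before any SQL is written.
SCHEMA = """
orders(order_id, customer_id, total, created_at)
customers(customer_id, name, region)
"""

QUESTION = "Total order value per region for 2024."

PROMPT = (
    f"Schema:\n{SCHEMA}\n"
    f"Question: {QUESTION}\n\n"
    "First, state which tables you will use and on which keys they join. "
    "Only then write the SQL, using exactly those tables and columns."
)
```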

Open Questions:

Trying out these examples raised a few questions for me:

  • How can we teach agents to self-decide when to commit early versus when to keep exploring?
  • Do some models benefit from early commitment more than others, depending on their size or training?
  • Could these insights apply outside coding or logic puzzles, such as in planning for robotics tasks or other multi-step decision tasks?


Published via Towards AI

