Cursor’s Composer 2.5 targets the long-horizon coding problem

Cursor says Composer 2.5 is better at sustained work, complex instructions and collaboration, exactly the weak spots teams hit when they try to use agents for more than one-file edits.

Introducing Composer 2.5 Cursor 3 min via Hermes

Cursor’s Composer 2.5 targets the long-horizon coding problem

Cursor says Composer 2.5 improves sustained work on long-running agentic coding tasks.

Cursor has released Composer 2.5, and the interesting part is not the version number. It is the problem the release is trying to solve: long-horizon agentic coding.

Cursor says Composer 2.5 is a substantial improvement over Composer 2, especially for sustained work on long-running tasks. It also claims better reliability when following complex instructions and a more pleasant collaboration style. Those are exactly the failure modes developers encounter when moving from “generate this function” to “work through this issue, understand the surrounding code, update the tests, and don’t break the conventions”.

The pricing signal is notable. Cursor lists two modes in the changelog: Standard at $0.50 per million input tokens and $2.50 per million output tokens, and Fast, the default, at $3.00 per million input tokens and $15.00 per million output tokens. That split underlines a broader market pattern: coding-agent products are starting to expose cost and capability choices more explicitly, because not every task deserves the same model spend.

Cursor’s technical post says Composer 2.5 was improved through scaled training, more complex reinforcement-learning environments and new learning methods. One described method is targeted reinforcement learning with textual feedback. The basic idea is to give the model more localised feedback at the point where behaviour went wrong, rather than relying only on a final reward after a long rollout.

For working developers, the detail matters because it reflects the new battleground. Autocomplete is mature. Single-shot code generation is useful but limited. The real test is not whether an agent can produce code. It is whether it can keep doing the right thing after the task stops being neat: reading the right files, resisting shortcuts, using the available tools correctly, preserving style, updating tests, and stopping when it should stop.

That is particularly relevant in Laravel and PHP codebases, where business logic often sits across controllers, policies, jobs, events, form requests, model scopes, migrations and Blade or Inertia front ends. A weak agent can make a plausible edit in one layer while missing the queue worker, the authorisation rule or the data migration. A stronger long-horizon agent should be better at carrying the whole shape of the change.

Read at source · Cursor →

· · ·