Dual-model cost optimization: Opus for planning, Codex subscription for execution
Problem / Context
Using a frontier model like Opus 4.5 via API for all agent tasks burns through API credits rapidly. A few minutes of Opus usage through the API can cost dollars, which for an always-on agent extrapolates to a monthly burn rate in the thousands of dollars. Cheaper models, however, lack the planning and reasoning quality needed for complex multi-step work.
Solution
Connect two models to the agent and give it explicit task-routing instructions. Use Opus (via the $20/month Claude subscription) for complex planning tasks that require deep reasoning. Route the majority of execution work, especially coding tasks, to Codex (via the $200/month ChatGPT Pro subscription, which provides near-unlimited Codex quota).
Instruction to the agent: 'Use Opus 4.5 for complex tasks; fall back to Codex for everything else; if Opus usage is exhausted, use Codex.'
The ChatGPT Pro subscription's Codex quota is consumed far more slowly than API credits are, even under constant heavy use. After a full week of always-on agent operation, with multiple sub-agents running on 30-minute heartbeats, usage remained well under the daily limit. The total effective cost is roughly $220/month ($20 + $200) for near-unlimited operation, versus thousands of dollars per month if Opus were used via API exclusively.
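The routing instruction above can be sketched as a small decision function. This is a minimal illustration, not the agent's actual mechanism: in the setup described, the routing is done by a natural-language instruction, and the names used here (Task, choose_model, the model labels, and the complex_planning and opus_quota_left flags) are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical model labels; real identifiers depend on the agent framework.
PLANNER = "opus-4.5"
EXECUTOR = "codex"


@dataclass
class Task:
    description: str
    # Assumed to be set by whatever triage step the agent uses.
    complex_planning: bool = False


def choose_model(task: Task, opus_quota_left: bool) -> str:
    """Opus for complex tasks while quota remains; Codex for everything else."""
    if task.complex_planning and opus_quota_left:
        return PLANNER
    # Routine work, and complex work once Opus usage is exhausted, goes to Codex.
    return EXECUTOR


if __name__ == "__main__":
    tasks = [
        Task("Design the migration plan", complex_planning=True),
        Task("Rename a variable across the repo"),
    ]
    for t in tasks:
        print(t.description, "->", choose_model(t, opus_quota_left=True))
```

Defaulting to Codex and treating Opus as the exception keeps the cheaper, higher-quota model on the hot path, which matches the observation that Codex ends up handling most tasks.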
Result
A week of always-on operation with multiple sub-agents stayed well within the daily quota. Total monthly cost is $220, versus an estimated thousands of dollars under pure API billing. Codex handles 95% or more of tasks.
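The cost comparison can be made concrete with a short calculation. The two subscription prices come from the text; the API rate in the example is purely an illustrative assumption (the text only says a few minutes can cost dollars), and the function names are hypothetical.

```python
CLAUDE_SUBSCRIPTION_USD = 20   # from the text: Claude subscription
CHATGPT_PRO_USD = 200          # from the text: ChatGPT Pro subscription


def subscription_total() -> int:
    """Fixed monthly cost of the two-subscription setup."""
    return CLAUDE_SUBSCRIPTION_USD + CHATGPT_PRO_USD


def api_monthly_estimate(usd_per_hour: float, hours_per_day: float, days: int = 30) -> float:
    """Rough pay-per-use estimate: hourly spend x active hours x days."""
    return usd_per_hour * hours_per_day * days


if __name__ == "__main__":
    # ASSUMPTION: $6/hour (about $1 per 10 minutes) is illustrative, not measured.
    print("subscriptions:", subscription_total())
    print("API estimate:", api_monthly_estimate(usd_per_hour=6.0, hours_per_day=24))
```

Under that assumed rate, an always-on agent lands in the thousands of dollars per month, while the subscription total stays fixed at $220 regardless of usage.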