Model model_at_capacity api proxy

Codex: Selected model is at capacity.

This message usually means the selected Codex model or model route has no available capacity right now. It is not 429 and is not primarily an API-key error; first separate official Codex login, proxy account-pool capacity, and Sub2API error rewriting before choosing a fix.

Snapshot

Error: Selected model is at capacity. Please try a different model. Treat it as model-level capacity, admission, or streaming interruption first. It can happen with ChatGPT-authenticated Codex CLI, proxy account pools, or Sub2API upstream overloaded passthrough.

#selected model is at capacity #codex selected model is at capacity #codex model at capacity #codex please try a different model #openai codex model capacity

What This Error Means

Selected model is at capacity. Please try a different model. is mainly about the selected model. The request is not primarily failing at authentication; the selected model, model group, or upstream route cannot currently allocate capacity.

If you are using the official Codex login flow, this may be temporary OpenAI-side model demand. If you are using a proxy, Sub2API, CPA, NewAPI, or a team gateway, it may also mean the proxy’s account pool, model mapping, or route has no healthy upstream available.

Do not reduce this to “Codex is broken.” The sharper diagnosis is: this model route cannot schedule resources right now.

Community reports show the same warning can appear even when /status still shows available context and quota. Treat it as model capacity or routing admission first, not as an API-key or balance problem.

In some Sub2API setups, users observed that the upstream failure is closer to Our servers are currently overloaded, and the proxy layer then surfaces it as the Codex at-capacity message. The operational problem is not only the failure itself; Codex may not automatically continue after this message, so long coding tasks get interrupted.

Common Causes

  1. The selected Codex model is under high demand, especially after the product recommends migrating to a newer model.
  2. gpt-5.4, gpt-5.3-codex, and other Codex-related models may have separate capacity pools; remaining quota does not guarantee this model can be admitted right now.
  3. The same model may still work in normal ChatGPT while Codex refuses the connection, which suggests Codex can have different admission, tool-call, or model-routing behavior.
  4. Your proxy maps the selected model to an upstream account pool that is full, rate limited, or cooling down.
  5. An SSE streaming request breaks mid-turn, and the client surfaces upstream overloaded or stream failure as at capacity.
  6. Multiple Codex sessions are sharing the same account, key, or provider route at the same time.

Identify Your Scenario First

If you use the official ChatGPT-authenticated Codex path, switch models first: move from gpt-5.4 back to gpt-5.3-codex, or wait a few minutes and retry. Do not regenerate keys first.

If you use a proxy, Sub2API, CPA, or NewAPI, send a minimal request with the same Base URL, key, and model, then switch models. If the minimal request also fails, check account pools and upstream capacity. If only long tasks fail, look at context size, streaming, and tool calls.

How To Fix It Quickly

  1. Switch models first. Try moving from gpt-5.4 back to gpt-5.3-codex or another stable model to isolate a single-model capacity problem.
  2. Wait 1 to 5 minutes before retrying. Capacity errors are often temporary, and immediate repeated retries rarely help.
  3. Reduce context pressure: start a fresh session, attach fewer files, narrow the requested change, or split the work into smaller steps.
  4. If your Codex build supports /goal, define a narrow goal so the client can recover more gracefully after interruptions. This reduces manual intervention; it does not fix upstream capacity.
  5. If you use an API proxy, switch model groups, backup routes, or providers, and ask support whether this model has schedulable upstream accounts.

Sub2API Mitigation

If you run or administer Sub2API, you can try an error passthrough rule under account management. Match upstream messages containing Our servers are currently overloaded or at capacity, then rewrite them to an error text that your client treats as retryable, such as upstream failed.

A common setup is: leave error code empty, add keywords such as Our servers are currently overloaded and at capacity, use “error code or keyword” matching, and set the custom error message to upstream failed or another retryable text.

This reduces long-task interruptions; it does not create more model capacity. Keep retries bounded, and watch usage, failure rate, and loops. Hiding every capacity error can blind you to a real capacity problem and burn balance on useless retries.

Be more careful with NewAPI. Community reports suggest NewAPI commonly exposes status-code rewriting, but may not offer the same fine-grained upstream error-message rewrite path as Sub2API. Do not collapse every upstream error into one retryable message just to make the client keep going.

Sources

Related errors

Related guides

Related topics

FAQ

Common questions

Is this an API key error?

Usually no. API-key problems more often return 401 or 403. Selected model is at capacity points more directly to model capacity, routing, or an upstream account pool that cannot serve the chosen model.

Should I keep retrying the same model?

No. Repeated retries can amplify a capacity problem. Wait briefly or switch to another available model. If you use an API proxy, ask the provider whether that model route is currently full.

Why does normal chat work while Codex says the model is full?

Codex tasks often carry longer context and tool calls, and they may use a dedicated Codex model or model group. A normal chat model being available does not prove the selected Codex route has capacity.