
Fixed integration with OpenAI Codex (v0.114.0) with gpt-5.4#41

Closed
dabogee wants to merge 8 commits into CaddyGlow:main from dabogee:codex__v_0_114_0__gpt_5_4__fix

Conversation

@dabogee
Contributor

@dabogee dabogee commented Mar 13, 2026

Fix Codex proxy compatibility for HTTP and WebSocket. Integration with Microsoft Agent Framework, CrewAI, and the OpenAI Agents SDK has been validated.

Summary

This PR extends the Codex proxy compatibility work beyond the original HTTP and WebSocket fixes. In addition to making native Codex Desktop / codex exec traffic work end-to-end, it adds support for OpenAI-compatible multi-agent workflows, improves bypass-mode behavior, and makes mock responses format-aware across chat, responses, and Anthropic routes.

After these changes:

  • POST /codex/v1/responses succeeds through the proxy
  • POST /codex/v1/chat/completions succeeds through the proxy
  • codex exec works through OPENAI_BASE_URL=http://127.0.0.1:8000/codex/v1
  • Codex Desktop no longer falls back because of missing WebSocket support
  • Codex model discovery no longer fails on missing response fields from /codex/v1/models
  • OpenAI-compatible MSAF-style clients can send Codex requests without inheriting captured Codex CLI payloads
  • Mock/bypass mode returns the right response envelope for chat, responses, and Anthropic endpoints
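To illustrate the last point, here is a minimal sketch of format-aware mock envelopes. The field names follow the public OpenAI Chat Completions, OpenAI Responses, and Anthropic Messages shapes; the helper itself is hypothetical and not the proxy's actual mock adapter.

```python
# Hypothetical helper: return a mock body in the envelope each endpoint
# expects. Field names follow the public API shapes; the proxy's real
# mock adapter may differ in detail.

def mock_envelope(endpoint: str, text: str = "OK") -> dict:
    """Build a minimal mock response for the given endpoint path."""
    if endpoint.endswith("/chat/completions"):
        # Chat Completions envelope: choices -> message
        return {
            "object": "chat.completion",
            "choices": [
                {"index": 0,
                 "message": {"role": "assistant", "content": text},
                 "finish_reason": "stop"}
            ],
        }
    if endpoint.endswith("/responses"):
        # Responses API envelope: output -> message -> content parts
        return {
            "object": "response",
            "output": [
                {"type": "message", "role": "assistant",
                 "content": [{"type": "output_text", "text": text}]}
            ],
        }
    # Anthropic Messages envelope
    return {
        "type": "message",
        "role": "assistant",
        "content": [{"type": "text", "text": text}],
    }
```

Returning the wrong one of these three shapes is exactly what broke strict clients in bypass mode before this change.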

What Was Broken

  • Plain JSON requests to Codex upstream were missing required Codex-specific request fields and were rejected with 400 Bad Request
  • Encoded request bodies from native Codex clients could trigger decode failures in the adapter
  • Partial detection cache state could leave required instructions or forwarded CLI headers missing
  • /codex/v1/models did not expose the richer model metadata expected by Codex Desktop
  • The proxy had no WebSocket support for Codex responses
  • Empty WebSocket warmup requests were incorrectly forwarded upstream and failed
  • Synthetic warmup response IDs could be reused as previous_response_id, which upstream rejected
  • The WebSocket route processed only one request per connection, while Codex clients reuse the same socket
  • OpenAI-compatible agent clients could be polluted by injected Codex CLI detection payloads even when they were sending their own instructions and reasoning settings
  • OpenAI thinking blocks were always serialized with XML wrappers, even when the runtime setting disabled them
  • Bypass mode always routed provider plugins through their real adapters, instead of using the mock pipeline
  • Mock responses were only loosely OpenAI-aware and could return the wrong response shape for /chat/completions vs /responses
  • Mock generation had no prompt-aware deterministic path for the login-form workshop scenario used by the new agent tests
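Two of the WebSocket fixes above can be sketched in a few lines: answering empty warmup requests locally instead of forwarding them, and never sending a synthetic warmup ID upstream as previous_response_id. All names below are illustrative assumptions, not the proxy's actual API.

```python
# Hypothetical sketch of two of the WebSocket fixes: empty warmup
# requests are detected (and answered locally instead of being
# forwarded), and synthetic warmup IDs are stripped before a request
# goes upstream. Names are illustrative.

SYNTHETIC_PREFIX = "resp_warmup_"  # assumed marker for locally generated IDs

def is_warmup(request: dict) -> bool:
    """A warmup request carries no input; it only opens the socket."""
    return not request.get("input") and not request.get("messages")

def sanitize_previous_id(request: dict) -> dict:
    """Drop previous_response_id values that only exist locally."""
    prev = request.get("previous_response_id")
    if prev and prev.startswith(SYNTHETIC_PREFIX):
        return {k: v for k, v in request.items()
                if k != "previous_response_id"}
    return request
```

Without the second guard, the upstream API rejects the request because the synthetic ID does not correspond to any response it ever produced.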

Tests

Added or updated coverage for:

  • Codex adapter behavior with detection payload injection disabled
  • preservation of user-supplied reasoning for OpenAI-compatible Codex requests
  • propagation and task isolation of openai_thinking_xml
  • bypass-mode provider factory behavior
  • mock adapter format resolution from format_chain and endpoint fallback
  • prompt extraction and prompt-aware mock responses
  • MSAF-style OpenAI chat requests reaching Codex without Codex CLI prompt pollution
  • real Agent Framework client flows running through the Codex proxy
  • sequential agent-style calls keeping reasoning hidden and message output clean
  • existing WebSocket, warmup, and /models compatibility paths
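The prompt-extraction path exercised by these tests can be sketched as a single helper that handles both request styles. This is a hypothetical illustration of the idea, not the proxy's actual implementation.

```python
# Illustrative sketch of prompt extraction for prompt-aware mock
# responses: pull the last user text out of either a chat-style
# "messages" list or a Responses-style "input" list.

def extract_prompt(request: dict) -> str:
    items = request.get("messages") or request.get("input") or []
    for item in reversed(items):
        if item.get("role") != "user":
            continue
        content = item.get("content")
        if isinstance(content, str):
            # Chat-style: content is a plain string
            return content
        if isinstance(content, list):
            # Responses-style: content is a list of typed parts
            return " ".join(p.get("text", "")
                            for p in content if isinstance(p, dict))
    return ""
```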

Validation

Previously validated flows remain:

  • uv build
  • POST /codex/v1/responses returned 200 with OK
  • POST /codex/v1/chat/completions returned 200 with OK
  • codex exec through the proxy returned OK

Powered by gpt-5.4, reviewed by claude-4.5-sonnet, Junie

dabogee and others added 8 commits March 13, 2026 08:37
- Fix mypy type errors: add type annotations for format_chain list
  elements, cast WebSocket adapter return, wrap Response.body for
  json.loads, use LLMSettings instead of raw dicts in tests
- Fix ruff SIM105: replace try/except/pass with contextlib.suppress
- Fix WebSocket accept/auth ordering: accept before authenticate so
  close codes work correctly on rejection
- Type WebSocket helpers with CodexAdapter instead of Any
- Bound local_response_ids with deque(maxlen=256) to prevent unbounded
  growth on long-lived connections
- Make _load_codex_cli_models_cache async with anyio.Path to avoid
  blocking the event loop with synchronous filesystem I/O
- Add structured logging to WebSocket handler lifecycle
- Add debug logging to _safe_fallback_data in detection service
- Fix _normalize_input_messages to avoid mutating input dict in-place
- Remove redundant pass statements after logger.debug calls
- Refactor test_msaf_real_library to use plain httpx.AsyncClient,
  removing agent_framework and openai library dependencies
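The bounded-deque fix above is worth a concrete look: deque(maxlen=256) keeps memory constant on long-lived connections by silently evicting the oldest entry once the cap is reached. The variable name mirrors the commit message; the surrounding usage is illustrative.

```python
from collections import deque

# Sketch of the bounded response-ID tracking: once 256 IDs have been
# recorded, each append evicts the oldest entry instead of growing.
local_response_ids: deque = deque(maxlen=256)

for i in range(1000):
    local_response_ids.append(f"resp_{i}")

assert len(local_response_ids) == 256          # capped, not 1000
assert local_response_ids[0] == "resp_744"     # oldest surviving entry
```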
Add parameterized WebSocket e2e tests following test_endpoint_e2e.py
pattern, covering warmup, streaming, error, and multi-message flows
for both /codex/v1/responses and /codex/responses endpoints.

- Add WS_ENDPOINT_CONFIGURATIONS and request builders to test_data.py
- Add WebSocket validation helpers to e2e_validation.py
- Add live server tests gated by CCPROXY_BASE_URL env var
- Remove agent-framework-core and agent-framework-orchestrations from
  test dependencies (test_msaf_real_library refactored to plain httpx)
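A plausible shape for the parameterized test data described above: one configuration entry per WebSocket endpoint plus a shared request builder, so each flow (warmup, streaming, error, multi-message) runs against both paths. The structure below is an assumption for illustration, not the exact contents of test_data.py.

```python
# Assumed shape of the endpoint parameterization: each test case is run
# once per entry, and build_ws_request produces a Responses-style body.

WS_ENDPOINT_CONFIGURATIONS = [
    {"path": "/codex/v1/responses", "name": "versioned"},
    {"path": "/codex/responses", "name": "unversioned"},
]

def build_ws_request(prompt: str, stream: bool = True) -> dict:
    """Build a minimal Responses-style request body for a WS test."""
    return {
        "input": [{"role": "user",
                   "content": [{"type": "input_text", "text": prompt}]}],
        "stream": stream,
    }
```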
1. Add ThinkingConfigAdaptive to the request validation schema to
   support Claude 4-6+ models that use thinking: {"type": "adaptive"}.

2. Fix anthropic-beta header handling: use minimal required tags
   (claude-code-20250219, oauth-2025-04-20) and block CLI fallback
   headers from overwriting them. The fallback data included tags
   like fine-grained-tool-streaming-2025-05-14 that triggered
   "long context beta not available" errors for some subscriptions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
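The header fix in point 2 can be sketched as a merge that always pins anthropic-beta to the minimal tag set, so CLI fallback headers cannot overwrite it. The tag values come from the commit message; the merge logic itself is a hypothetical illustration.

```python
# Sketch of the anthropic-beta handling: keep only the minimal required
# tags and refuse to let CLI fallback headers replace them. Fallback
# values such as fine-grained-tool-streaming-2025-05-14 triggered
# "long context beta not available" errors for some subscriptions.

REQUIRED_BETA = "claude-code-20250219,oauth-2025-04-20"

def merge_beta_header(fallback_headers: dict) -> dict:
    """Merge fallback headers but pin anthropic-beta to the minimal set."""
    headers = dict(fallback_headers)  # copy; never mutate the input
    headers["anthropic-beta"] = REQUIRED_BETA
    return headers
```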
Support summarized/omitted display modes for adaptive thinking responses.
@dabogee
Contributor Author

dabogee commented Mar 20, 2026

Hi @CaddyGlow! Could you please trigger CI? I ran the checks locally and they should be OK.

@CaddyGlow
Owner

This PR has been split into 3 focused PRs for easier review and independent merging:

  1. #43 - feat: Codex v0.114.0 integration with gpt-5.4 support
  2. #44 - feat: Microsoft Agent Framework compatibility and bypass mode
  3. #45 - feat: WebSocket auth hardening and e2e tests

Closing in favor of the above.
