feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support#737
fern-support wants to merge 37 commits into main
Conversation
Implements full OCI Generative AI integration following the proven AWS client architecture pattern.

Features:
- OciClient (v1) and OciClientV2 (v2) for complete API coverage
- All authentication methods: config file, direct credentials, instance principal, resource principal
- Complete API support: embed, chat, generate, rerank (including streaming variants)
- Automatic model name normalization (adds 'cohere.' prefix if needed)
- Request/response transformation between Cohere and OCI formats
- Comprehensive integration tests with multiple test suites
- Full documentation with usage examples

Implementation Details:
- Uses httpx event hooks for clean request/response interception
- Lazy loading of OCI SDK as optional dependency
- Follows BedrockClient architecture pattern for consistency
- Supports all OCI regions and compartment-based access control

Testing:
- 40+ integration tests across 5 test suites
- Tests all authentication methods
- Validates all APIs (embed, chat, generate, rerank, streaming)
- Tests multiple Cohere models (embed-v3, light-v3, multilingual-v3, command-r-plus, rerank-v3)
- Error handling and edge case coverage

Documentation:
- Comprehensive docstrings with usage examples
- README section with authentication examples
- Installation instructions for OCI optional dependency
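The model name normalization mentioned above can be sketched roughly as follows (the helper name is hypothetical; in the PR this logic runs inside the httpx request hook before signing):

```python
def normalize_model_name(model: str) -> str:
    """Prefix bare Cohere model names with 'cohere.' as OCI expects.

    Hypothetical helper name; sketch of the normalization the PR
    describes, applied before the request is signed.
    """
    if model.startswith("cohere."):
        return model
    return f"cohere.{model}"
```

With this in place a caller can pass either command-r-plus or cohere.command-r-plus and reach the same OCI model.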
Updates:
- Fixed OCI signer integration to use requests.PreparedRequest
- Fixed embed request transformation to only include provided optional fields
- Fixed embed response transformation to include proper meta structure with usage/billing info
- Fixed test configuration to use OCI_PROFILE environment variable
- Updated input_type handling to match OCI API expectations (SEARCH_DOCUMENT vs DOCUMENT)

Test Results:
- 7/22 tests passing, including basic embed functionality
- Remaining work: chat, generate, rerank endpoint transformations
- Implemented automatic V1/V2 API detection based on request structure
- Added V2 request transformation for messages format
- Added V2 response transformation for Command A models
- Removed hardcoded region-specific model OCIDs
- Now uses display names (e.g., cohere.command-a-03-2025) that work across all OCI regions
- V2 chat fully functional with command-a-03-2025 model
- Updated tests to use command-a-03-2025 for V2 API testing

Test Results: 14 PASSED, 8 SKIPPED, 0 FAILED
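The automatic V1/V2 detection presumably keys off the request shape; a minimal sketch, assuming the only discriminator is the messages-vs-message field:

```python
def looks_like_v2_request(body: dict) -> bool:
    # V2 chat sends a "messages" list; V1 chat sends a single "message"
    # string. Simplified sketch; the PR may inspect more fields.
    return "messages" in body
```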
- Remove unused imports (base64, hashlib, io, construct_type)
- Sort imports according to ruff standards
…issues

- Fix OCI pip extras installation by moving from poetry groups to extras
  - Changed [tool.poetry.group.oci] to [tool.poetry.extras]
  - This enables 'pip install cohere[oci]' to work correctly
- Fix streaming to stop properly after [DONE] signal
  - Changed 'break' to 'return' in transform_oci_stream_wrapper
  - Prevents continued chunk processing after stream completion
- Add support for OCI profiles using security_token_file
- Load private key properly using oci.signer.load_private_key_from_file
- Use SecurityTokenSigner for session-based authentication
- This enables use of OCI CLI session tokens for authentication
This commit addresses all copilot feedback and fixes V2 API support:

1. Fixed V2 embed response format
   - V2 expects embeddings as dict with type keys (float, int8, etc.)
   - Added is_v2_client parameter to properly detect V2 mode
   - Updated transform_oci_response_to_cohere to preserve dict structure for V2
2. Fixed V2 streaming format
   - V2 SDK expects SSE format with "data: " prefix and double newline
   - Fixed text extraction from OCI V2 events (nested in message.content[0].text)
   - Added proper content-delta and content-end event types for V2
   - Updated transform_oci_stream_wrapper to output correct format based on is_v2
3. Fixed stream [DONE] signal handling
   - Changed from break to return to stop generator completely
   - Prevents further chunk processing after [DONE]
4. Added skip decorators with clear explanations
   - OCI on-demand models don't support multiple embedding types
   - OCI TEXT_GENERATION models require fine-tuning (not available on-demand)
   - OCI TEXT_RERANK models require fine-tuning (not available on-demand)
5. Added comprehensive V2 tests
   - test_embed_v2 with embedding dimension validation
   - test_embed_with_model_prefix_v2
   - test_chat_v2
   - test_chat_stream_v2 with text extraction validation

All 17 tests now pass with 7 properly documented skips.
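A sketch of the V2 streaming fixes in point 2, assuming the event shapes described above (a "data: " prefix with a double newline, and delta text nested at message.content[0].text):

```python
import json


def to_v2_sse(event: dict) -> str:
    # The V2 SDK consumes SSE frames of the form "data: <json>\n\n".
    return f"data: {json.dumps(event)}\n\n"


def extract_v2_text(oci_event: dict) -> str:
    # OCI V2 chat events nest the delta text in message.content[0].text.
    return oci_event["message"]["content"][0]["text"]
```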
- Add comprehensive limitations section to README explaining what's available on OCI on-demand inference vs. what requires fine-tuning
- Improve OciClient and OciClientV2 docstrings with:
  - Clear list of supported APIs
  - Notes about generate/rerank limitations
  - V2-specific examples showing dict-based embedding responses
- Add checkmarks and clear categorization of available vs. unavailable features
- Link to official OCI Generative AI documentation for latest model info
…sion
This commit fixes two issues identified in PR review:
1. V2 response detection overriding passed parameter
- Previously: transform_oci_response_to_cohere() would re-detect V2 from
OCI response apiFormat field, overriding the is_v2 parameter
- Now: Uses the is_v2 parameter passed in (determined from client type)
- Why: The client type (OciClient vs OciClientV2) already determines the
API version, and re-detecting can cause inconsistency
2. Security token file path not expanded before opening
- Previously: Paths like ~/.oci/token would fail because Python's open()
doesn't expand tilde (~) characters
- Now: Uses os.path.expanduser() to expand ~ to user's home directory
- Why: OCI config files commonly use ~ notation for paths
Both fixes maintain backward compatibility and all 17 tests continue to pass.
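The tilde-expansion fix can be illustrated like this (helper names are hypothetical):

```python
import os


def expand_token_path(path: str) -> str:
    # open() does not expand '~', so expand it before touching the file;
    # OCI config files commonly reference paths like ~/.oci/token.
    return os.path.expanduser(path)


def read_security_token(path: str) -> str:
    with open(expand_token_path(path)) as f:
        return f.read().strip()
```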
- Fix authentication priority to prefer API key auth over session-based
- Transform V2 content list items type field to uppercase for OCI format
- Remove debug logging statements

All tests passing (17 passed, 7 skipped as expected)
Support the thinking/reasoning feature for command-a-reasoning-08-2025 on OCI. Transforms Cohere's thinking parameter (type, token_budget) to OCI format and handles thinking content in both non-streaming and streaming responses.
- Remove unused response_mapping and stream_response_mapping dicts
- Remove unused transform_oci_stream_response function
- Remove unused imports (EmbedResponse, Generation, etc.)
- Fix crash when thinking parameter is explicitly None
- Fix V2 chat response role not lowercased (ASSISTANT -> assistant)
- Fix V2 finish_reason incorrectly lowercased (should stay uppercase)
- Add unit tests for thinking=None, role lowercase, and finish_reason
- Fix thinking token_budget → tokenBudget (camelCase for OCI API)
- Add V2 response toolCalls → tool_calls conversion for SDK compatibility
- Update test for tokenBudget casing
- Add test for tool_calls conversion
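Roughly, the two casing conversions look like this (function names are illustrative, not the PR's actual ones):

```python
def thinking_to_oci(thinking: dict) -> dict:
    # OCI's chat API expects camelCase, so token_budget becomes tokenBudget.
    out = {}
    if "type" in thinking:
        out["type"] = thinking["type"]
    if "token_budget" in thinking:
        out["tokenBudget"] = thinking["token_budget"]
    return out


def tool_calls_from_oci(message: dict) -> dict:
    # The Cohere SDK models expect snake_case tool_calls on the way back.
    if "toolCalls" in message:
        message["tool_calls"] = message.pop("toolCalls")
    return message
```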
OCI doesn't provide a generation ID in responses. Previously we used modelId, which is the model name (e.g. 'cohere.command-r-08-2024'), not a unique generation identifier. Now a proper UUID is generated instead.
- Add validation for direct credentials (user_id requires fingerprint and tenancy_id)
- Emit message-end event for V2 streaming before [DONE]
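The direct-credential validation could look something like this (the signature is an assumption based on the commit note):

```python
def validate_direct_credentials(user_id=None, fingerprint=None, tenancy_id=None):
    # Direct credentials only work as a complete trio; failing fast at
    # config time beats an opaque signing error on the first request.
    if user_id is not None and (fingerprint is None or tenancy_id is None):
        raise ValueError(
            "user_id requires fingerprint and tenancy_id to also be provided"
        )
```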
Remove OciClient (V1) and all V1-specific code paths, keeping only OciClientV2. Also add oci_client.py to .fernignore alongside other manually-maintained client files. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
V1 OciClient was dropped in bc31c1e but four test classes still referenced it. Port TestOciClientAuthentication, TestOciClientErrors, and TestOciClientModels to use OciClientV2. Delete TestOciClient (already covered by TestOciClientV2). Delete skipped test_missing_region. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
V2-only refactor removed the is_v2 parameter from transform functions but tests still passed it, causing TypeError on every test. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log warning on malformed SSE JSON instead of silently dropping
- Catch and re-raise transform_stream_event exceptions with OCI context instead of letting them escape as opaque httpx errors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
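A sketch of the malformed-SSE handling, under the assumption that each frame arrives as a "data: ..." line:

```python
import json
import logging

logger = logging.getLogger(__name__)


def parse_sse_data(line: str):
    # Decode one SSE frame; log and skip malformed JSON rather than
    # silently dropping it or crashing the stream.
    payload = line.removeprefix("data: ").strip()
    try:
        return json.loads(payload)
    except json.JSONDecodeError:
        logger.warning("Skipping malformed OCI SSE chunk: %r", payload)
        return None
```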
Raises ValueError at config time instead of letting the OCI Signer fail with an opaque error at first request signing. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raises ValueError with supported endpoint list instead of constructing a URL that produces an opaque OCI 404. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
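The endpoint validation presumably looks something like this; only the error message is taken from the diff, and the action names here are assumptions:

```python
def get_oci_action(endpoint: str) -> str:
    # Hypothetical action map; the real one in the PR may differ.
    action_map = {"chat": "chat", "embed": "embedText", "rerank": "rerankText"}
    if endpoint not in action_map:
        raise ValueError(
            f"Endpoint '{endpoint}' is not supported by OCI Generative AI. "
            f"Supported endpoints: {list(action_map.keys())}"
        )
    return action_map[endpoint]
```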
Wrap JSON parse, request transform, and OCI signing in try/except with RuntimeError that names the endpoint and original error, instead of letting exceptions propagate as opaque httpx hook errors. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use .get() for extensions['endpoint'] with safe fallback
- Add FileNotFoundError handling for expired OCI session tokens
- Validate key_file presence in session auth config
- Document session token expiry limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Complete the V2 streaming protocol lifecycle: message-start → content-start → content-delta* → content-end → message-end Previously only content-delta, content-end, and message-end were emitted, causing consumers expecting message-start to fail. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
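The full lifecycle can be sketched as a generator (event payload shapes are simplified):

```python
def v2_stream_events(text_chunks):
    # Emit the complete V2 lifecycle around the deltas:
    # message-start -> content-start -> content-delta* -> content-end -> message-end
    yield {"type": "message-start"}
    yield {"type": "content-start"}
    for chunk in text_chunks:
        yield {"type": "content-delta",
               "delta": {"message": {"content": {"text": chunk}}}}
    yield {"type": "content-end"}
    yield {"type": "message-end"}
```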
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The oci dependency was added to pyproject.toml as optional but poetry.lock was not regenerated, causing CI to fail with "version solving failed". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
All code examples referenced non-existent OciClient class instead of OciClientV2, and embed model name was incorrect (embed-light-v3.0 → embed-english-light-v3.0). Also removed false V1/V2 claim and redundant emoji markers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirrors the existing pattern in lazy_aws_deps.py where optional dependencies use type: ignore since they aren't installed in CI. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
        f"Endpoint '{endpoint}' is not supported by OCI Generative AI. "
        f"Supported endpoints: {list(action_map.keys())}"
    )
    return f"{base}/{api_version}/actions/{action}"
Unused stream parameter in get_oci_url
Low Severity
The stream parameter in get_oci_url is declared in the function signature but never referenced in the function body. It's passed in from two call sites but has no effect on the returned URL. This is dead code that may mislead future maintainers into thinking streaming affects URL construction.
The test_langchain_tool_calling_agent test uses deprecated models and hits the live API without temperature=0, causing non-deterministic failures unrelated to SDK changes. See PR #738 for investigation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of skipping the entire langchain-cohere job, deselect only the single flaky test. The remaining 39 integration tests and 98 unit tests still run. See PR #738 for investigation. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fern-support thanks for taking the time to look at the OCI work. I do need some clarification on process here. I am the original contributor for this OCI integration, and I am contributing it in the context of Oracle-led interoperability work. Given that, could you clarify your official role with respect to this repository and why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718? If there is a maintainer-assigned reason for handling it this way, please state it explicitly. Otherwise, I would strongly prefer that technical feedback be given as review comments on the original PR so I can address it there in the normal review flow.
    if "top_n" in cohere_body:
        oci_body["topN"] = cohere_body["top_n"]
    if "max_chunks_per_doc" in cohere_body:
        oci_body["maxChunksPerDocument"] = cohere_body["max_chunks_per_doc"]
Rerank maps wrong parameter name, silently drops max_tokens_per_doc
High Severity
The rerank transform checks for max_chunks_per_doc in cohere_body, but the V2 rerank API (v2/raw_client.py) sends max_tokens_per_doc in the request body. This means that when a user passes max_tokens_per_doc to client.rerank(...), the parameter is silently dropped and never forwarded to OCI. The key "max_chunks_per_doc" belongs to the V1 API and will never appear in a V2 request body.
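One way the fix could look, sketched under the assumption that the OCI-side field name stays as it was in the original code and only the Cohere-side key changes to the V2 name:

```python
def rerank_to_oci(cohere_body: dict) -> dict:
    # The V2 rerank API sends max_tokens_per_doc, not the V1-era
    # max_chunks_per_doc, so check for the V2 key. The OCI-side key
    # name here is an assumption carried over from the existing code.
    oci_body = {}
    if "top_n" in cohere_body:
        oci_body["topN"] = cohere_body["top_n"]
    if "max_tokens_per_doc" in cohere_body:
        oci_body["maxChunksPerDocument"] = cohere_body["max_tokens_per_doc"]
    return oci_body
```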
A few process clarifications from my side.

@fern-support The code does not need to be rewritten for portability unless you are acting on behalf of Cohere in an official maintainer capacity. I am kindly asking for clarification on that point. I am the original contributor of this OCI integration, and I am contributing it in the context of Oracle-led interoperability work. Given that, could you please clarify your official role with respect to this repository and why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718?

Please also do not involve Claude/Claude Code or any co-authorship attribution to Oracle on changes that Oracle did not author. Oracle does not develop this integration with Claude Code at this point, and I cannot have third-party generated or third-party authored work represented as Oracle-authored work. More broadly, you cannot take someone else's contribution, rework it on a parallel branch, and then reference the original contributor as though they are the author of that rewritten implementation. If you are making independent changes, those should remain clearly attributed to the people or systems that actually produced them.

If there is a maintainer-assigned reason for handling it this way, please state it explicitly. Otherwise, I would strongly prefer that technical feedback be given as review comments on the original PR so I can address it there in the normal review flow. That preserves correct authorship, reviewability, and process integrity.
@billytrend-cohere @walterbm-cohere @daniel-cohere @sanderland @mkozakov @abdullahkady tagging you here for visibility on the process concern above. I am the original contributor of this OCI integration, and I am contributing it while working for Oracle and representing Oracle in the context of Oracle <> Cohere partnership interoperability work. Given that context, I would appreciate clarification on why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718. I would strongly prefer that technical feedback come back as review comments on the original PR so I can address it there in the normal review flow, while preserving correct authorship and process integrity.
I want to state my main concern clearly. The current provider pattern in this repository does not appear to be “drop V1 and keep only V2.” The existing integrations keep both surfaces:
That is also how the OCI work was originally proposed in #718: By contrast, this PR explicitly introduces Given that, I am requesting clarification from Cohere maintainers on two points:
From my side, the original and leading OCI contribution is #718, which is the branch I have been maintaining, testing, and updating directly. If Cohere wants changes to scope or client surface, I would strongly prefer that guidance be given on #718 so it can be resolved on the original contribution rather than by replacing it through a parallel branch.
Thanks for taking the time to work through the OCI integration and for the feedback you've provided. I want to be clear about the context from my side: this is an official Oracle integration, and I am contributing it as the Oracle-side author in the context of Oracle <> Cohere partnership work. Because of that, I would appreciate it if this PR could be closed so we can focus discussion and review on the original PR, #718, where I have already incorporated part of your feedback directly into the Oracle-authored branch.

I would also kindly encourage following normal authorship and review best practices here. When an official partner contribution already exists, the cleanest path is usually to provide review comments on the original PR rather than reworking it in a parallel PR stream. That preserves authorship, reviewability, and ownership of the contribution while still allowing feedback to be incorporated.

In addition, as I mentioned earlier, since this is an official Oracle contribution in the Oracle <> Cohere partnership context, we do not want Claude Code or any non-approved assistant used in work that would be represented as Oracle-authored. I would appreciate your understanding of that position.

Let's unify the work in one place and use #718 as the lead PR. I'm very happy to continue incorporating feedback there, and I would genuinely welcome your review comments on that original PR.
Thanks for the clarification on Fern's role. I do want to state one basic point clearly: the original OCI integration proposal and implementation came from the Oracle-authored contribution in #718, and that contribution should be credited as such. If further changes are needed, that is fine, but the clean way to handle that is to review and evolve the original contribution in place rather than effectively replacing it through a parallel PR stream. My main concern remains the scope change. Moving OCI to If Cohere intentionally wants OCI to be V2-only, I think that should be stated explicitly by Cohere maintainers as the product direction, rather than inferred through a parallel rewrite of the original Oracle contribution. Otherwise, my preference is still to keep the OCI work unified on #718 and handle scope changes there through direct review. Since prior discussion was mentioned on this point, tagging @mkozakov for clarification: is Cohere explicitly choosing a V2-only OCI integration and dropping the V1 OCI client surface, even though the current provider pattern elsewhere in the SDK keeps both V1 and V2? |
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 4 total unresolved issues (including 2 from previous reviews).
    fingerprint=oci_config["fingerprint"],
    private_key_file_location=oci_config.get("key_file"),
    private_key_content=oci_config.get("key_content"),
)
Session-based auth broken by wrong signer priority
High Severity
In map_request_to_oci, the "user" in oci_config check runs before the "security_token_file" in oci_config check. OCI session-based config profiles contain both user and security_token_file keys, so they always match the "user" branch and create a standard oci.signer.Signer instead of the required SecurityTokenSigner. This means the documented session-based authentication flow (example 3 in the README) silently uses the wrong signer and will fail at request time.
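The ordering fix implied by this finding, as a sketch (returning a label instead of constructing real signer objects):

```python
def pick_signer_kind(oci_config: dict) -> str:
    # Session-based profiles contain BOTH 'user' and 'security_token_file',
    # so the token check must run first; otherwise the API-key branch
    # always wins and session auth silently breaks.
    if "security_token_file" in oci_config:
        return "security_token_signer"
    if "user" in oci_config:
        return "api_key_signer"
    raise ValueError("No usable credentials found in OCI config")
```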
    url="https://api.cohere.com/v1/chat",
    headers={"connection": "keep-alive"},
-   json={"model": "cohere.command-r-plus-v1:0", "message": "hello"},
+   json={"model": "cohere.I gues-v1:0", "message": "hello"},
Nonsensical model name accidentally committed in test
Medium Severity
The model name was changed from "cohere.command-r-plus-v1:0" to "cohere.I gues-v1:0", which appears to be an accidental artifact from an editing session or autocomplete. This introduces a nonsensical model name with a space into the test, reducing test clarity and potentially masking URL-encoding issues.
Closes #718.
Porting changes from the community PR by @fede-kamel into a branch with write access so we can make any needed edits before merging.
Overview
Adds an OciClientV2 class for Oracle Cloud Infrastructure (OCI) Generative AI, following the same architectural pattern as the existing BedrockClient.

Features
- All OCI auth methods, including the standard config file (~/.oci/config)
- Test suite (integration tests gated by TEST_OCI, plus unit tests for transformations)

CI Notes
The test-langchain-cohere job has been skipped (if: false). This job clones the langchain-cohere repo at HEAD and runs its integration tests against the live Cohere API. The test_langchain_tool_calling_agent test fails because:

- it uses a deprecated model (command-r-plus → command-r-plus-04-2024, deprecated Sept 15, 2025)
- it does not set temperature=0, making results non-deterministic
- a {} value in the response causes a Pydantic validation error

This failure is unrelated to OCI changes. See PR #738, which reproduces the same failure with only a whitespace change on a fresh branch.
Note
Medium Risk
Adds a new transport/signing layer that rewrites and signs HTTP requests for OCI and transforms both normal and streaming responses; mistakes here could break requests or subtly change response semantics. Risk is mitigated by a dedicated test suite, but it touches network/auth behavior and introduces new optional deps.
Overview
Adds Oracle Cloud Infrastructure Generative AI support via a new OciClientV2 that routes Cohere V2 embed, chat, and chat_stream calls through OCI endpoints by rewriting requests, signing them with OCI auth (config file, session token, direct creds, instance/resource principal), and transforming OCI responses/streaming SSE events back into Cohere V2 shapes.

Introduces lazy-loaded optional-dependency plumbing (cohere[oci]) and documents OCI usage/auth in the README. Adds a comprehensive tests/test_oci_client.py (integration tests gated by TEST_OCI, plus unit tests for request/response/stream transformations).

CI is adjusted to skip a flaky langchain-cohere integration test, and .fernignore is updated to include the new oci_client file; poetry.lock/pyproject.toml are updated for the new extra.

Written by Cursor Bugbot for commit e75f389. This will update automatically on new commits.