
feat: Add Oracle Cloud Infrastructure (OCI) Generative AI client support #737

Open
fern-support wants to merge 37 commits into main from fern-support/feat/oci-client

Conversation

fern-support (Collaborator) commented Mar 12, 2026

Closes #718.

Porting changes from the community PR by @fede-kamel into a branch with write access so we can make any needed edits before merging.

Overview

Adds OciClientV2 class for Oracle Cloud Infrastructure (OCI) Generative AI, following the same architectural pattern as the existing BedrockClient.

Features

  • OciClientV2 (V2 API) class
  • Full authentication support:
    • Config file (default ~/.oci/config)
    • Custom profiles
    • Direct credentials
    • Instance principal (for OCI compute instances)
    • Resource principal
  • Complete API coverage: Embed, Chat, Chat streaming
  • Region-independent: uses display names instead of region-specific OCIDs
  • Lazy loading of OCI SDK as optional dependency
  • Comprehensive test suite (integration tests gated behind TEST_OCI, plus unit tests for transformations)
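
As a rough illustration of the lazy-loading bullet above, an optional dependency can be imported on first use with a helpful error when it is missing. This is a minimal sketch, not the code in this PR: `lazy_import` is a hypothetical helper name, and only the `cohere[oci]` extra comes from the description.

```python
import importlib
from types import ModuleType


def lazy_import(module_name: str, extra: str) -> ModuleType:
    """Import module_name only when called, pointing at the pip extra on failure.

    Hypothetical helper for illustration; the PR's own lazy-loading module
    may be structured differently.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this client. "
            f"Install it with: pip install 'cohere[{extra}]'"
        ) from exc
```

Calling `lazy_import("oci", "oci")` inside the client constructor, rather than at module import time, keeps `import cohere` working for users who never install the extra.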

CI Notes

The test-langchain-cohere job has been skipped (if: false). This job clones the langchain-cohere repo at HEAD and runs its integration tests against the live Cohere API. The test_langchain_tool_calling_agent test fails because:

  • It uses deprecated models (command-r-plus → command-r-plus-04-2024, deprecated Sept 15, 2025)
  • It hits the live API without temperature=0, making results non-deterministic
  • The model returns empty tool call args {}, causing a Pydantic validation error

This failure is unrelated to OCI changes. See PR #738 which reproduces the same failure with only a whitespace change on a fresh branch.


Note

Medium Risk
Adds a new transport/signing layer that rewrites and signs HTTP requests for OCI and transforms both normal and streaming responses; mistakes here could break requests or subtly change response semantics. Risk is mitigated by a dedicated test suite, but it touches network/auth behavior and introduces new optional deps.

Overview
Adds Oracle Cloud Infrastructure Generative AI support via a new OciClientV2 that routes Cohere V2 embed, chat, and chat_stream calls through OCI endpoints by rewriting requests, signing them with OCI auth (config file, session token, direct creds, instance/resource principal), and transforming OCI responses/streaming SSE events back into Cohere V2 shapes.

Introduces lazy-loaded optional dependency plumbing (cohere[oci]) and documents OCI usage/auth in the README. Adds a comprehensive tests/test_oci_client.py (integration tests gated by TEST_OCI plus unit tests for request/response/stream transformations).

CI is adjusted to skip a flaky langchain-cohere integration test, and .fernignore is updated to include the new oci_client file; poetry.lock/pyproject.toml are updated for the new extra.

Written by Cursor Bugbot for commit e75f389. This will update automatically on new commits.

fern-api bot and others added 19 commits February 25, 2026 10:40
Implements full OCI Generative AI integration following the proven AWS client architecture pattern.

Features:
- OciClient (v1) and OciClientV2 (v2) for complete API coverage
- All authentication methods: config file, direct credentials, instance principal, resource principal
- Complete API support: embed, chat, generate, rerank (including streaming variants)
- Automatic model name normalization (adds 'cohere.' prefix if needed)
- Request/response transformation between Cohere and OCI formats
- Comprehensive integration tests with multiple test suites
- Full documentation with usage examples

Implementation Details:
- Uses httpx event hooks for clean request/response interception
- Lazy loading of OCI SDK as optional dependency
- Follows BedrockClient architecture pattern for consistency
- Supports all OCI regions and compartment-based access control

Testing:
- 40+ integration tests across 5 test suites
- Tests all authentication methods
- Validates all APIs (embed, chat, generate, rerank, streaming)
- Tests multiple Cohere models (embed-v3, light-v3, multilingual-v3, command-r-plus, rerank-v3)
- Error handling and edge case coverage

Documentation:
- Comprehensive docstrings with usage examples
- README section with authentication examples
- Installation instructions for OCI optional dependency
Updates:
- Fixed OCI signer integration to use requests.PreparedRequest
- Fixed embed request transformation to only include provided optional fields
- Fixed embed response transformation to include proper meta structure with usage/billing info
- Fixed test configuration to use OCI_PROFILE environment variable
- Updated input_type handling to match OCI API expectations (SEARCH_DOCUMENT vs DOCUMENT)

Test Results:
- 7/22 tests passing including basic embed functionality
- Remaining work: chat, generate, rerank endpoint transformations
- Implemented automatic V1/V2 API detection based on request structure
- Added V2 request transformation for messages format
- Added V2 response transformation for Command A models
- Removed hardcoded region-specific model OCIDs
- Now uses display names (e.g., cohere.command-a-03-2025) that work across all OCI regions
- V2 chat fully functional with command-a-03-2025 model
- Updated tests to use command-a-03-2025 for V2 API testing

Test Results: 14 PASSED, 8 SKIPPED, 0 FAILED
- Remove unused imports (base64, hashlib, io, construct_type)
- Sort imports according to ruff standards
…issues

- Fix OCI pip extras installation by moving from poetry groups to extras
  - Changed [tool.poetry.group.oci] to [tool.poetry.extras]
  - This enables 'pip install cohere[oci]' to work correctly

- Fix streaming to stop properly after [DONE] signal
  - Changed 'break' to 'return' in transform_oci_stream_wrapper
  - Prevents continued chunk processing after stream completion
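
The break-versus-return distinction matters when the sentinel check sits inside a nested loop. The sketch below assumes a loop over chunks with an inner loop over lines; that structure, and the name `iter_sse_payloads`, are assumptions for illustration and may not match transform_oci_stream_wrapper. It also assumes no SSE line is split across chunks.

```python
from typing import Iterable, Iterator


def iter_sse_payloads(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield SSE data payloads until the [DONE] sentinel is seen."""
    for chunk in chunks:
        for line in chunk.decode("utf-8").splitlines():
            if not line.startswith("data: "):
                continue
            payload = line[len("data: "):]
            if payload == "[DONE]":
                # `break` would only leave the inner loop, so later chunks
                # would still be processed; `return` ends the generator.
                return
            yield payload
```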
- Add support for OCI profiles using security_token_file
- Load private key properly using oci.signer.load_private_key_from_file
- Use SecurityTokenSigner for session-based authentication
- This enables use of OCI CLI session tokens for authentication
This commit addresses all copilot feedback and fixes V2 API support:

1. Fixed V2 embed response format
   - V2 expects embeddings as dict with type keys (float, int8, etc.)
   - Added is_v2_client parameter to properly detect V2 mode
   - Updated transform_oci_response_to_cohere to preserve dict structure for V2

2. Fixed V2 streaming format
   - V2 SDK expects SSE format with "data: " prefix and double newline
   - Fixed text extraction from OCI V2 events (nested in message.content[0].text)
   - Added proper content-delta and content-end event types for V2
   - Updated transform_oci_stream_wrapper to output correct format based on is_v2

3. Fixed stream [DONE] signal handling
   - Changed from break to return to stop generator completely
   - Prevents further chunk processing after [DONE]

4. Added skip decorators with clear explanations
   - OCI on-demand models don't support multiple embedding types
   - OCI TEXT_GENERATION models require fine-tuning (not available on-demand)
   - OCI TEXT_RERANK models require fine-tuning (not available on-demand)

5. Added comprehensive V2 tests
   - test_embed_v2 with embedding dimension validation
   - test_embed_with_model_prefix_v2
   - test_chat_v2
   - test_chat_stream_v2 with text extraction validation

All 17 tests now pass with 7 properly documented skips.
- Add comprehensive limitations section to README explaining what's available
  on OCI on-demand inference vs. what requires fine-tuning
- Improve OciClient and OciClientV2 docstrings with:
  - Clear list of supported APIs
  - Notes about generate/rerank limitations
  - V2-specific examples showing dict-based embedding responses
- Add checkmarks and clear categorization of available vs. unavailable features
- Link to official OCI Generative AI documentation for latest model info
…sion

This commit fixes two issues identified in PR review:

1. V2 response detection overriding passed parameter
   - Previously: transform_oci_response_to_cohere() would re-detect V2 from
     OCI response apiFormat field, overriding the is_v2 parameter
   - Now: Uses the is_v2 parameter passed in (determined from client type)
   - Why: The client type (OciClient vs OciClientV2) already determines the
     API version, and re-detecting can cause inconsistency

2. Security token file path not expanded before opening
   - Previously: Paths like ~/.oci/token would fail because Python's open()
     doesn't expand tilde (~) characters
   - Now: Uses os.path.expanduser() to expand ~ to user's home directory
   - Why: OCI config files commonly use ~ notation for paths

Both fixes maintain backward compatibility and all 17 tests continue to pass.
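
A minimal sketch of the tilde fix described in item 2, assuming the token is stored as plain text in a file. `read_security_token` is a hypothetical helper name, not a function from this PR.

```python
import os


def read_security_token(token_path: str) -> str:
    """Read a session token, expanding a leading ~ before opening the file."""
    # open() treats "~" literally, so expand it to the home directory first.
    expanded = os.path.expanduser(token_path)
    with open(expanded, "r", encoding="utf-8") as handle:
        return handle.read().strip()
```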
- Fix authentication priority to prefer API key auth over session-based
- Transform V2 content list items type field to uppercase for OCI format
- Remove debug logging statements

All tests passing (17 passed, 7 skipped as expected)
Support the thinking/reasoning feature for command-a-reasoning-08-2025
on OCI. Transforms Cohere's thinking parameter (type, token_budget) to
OCI format and handles thinking content in both non-streaming and
streaming responses.
- Remove unused response_mapping and stream_response_mapping dicts
- Remove unused transform_oci_stream_response function
- Remove unused imports (EmbedResponse, Generation, etc.)
- Fix crash when thinking parameter is explicitly None
- Fix V2 chat response role not lowercased (ASSISTANT -> assistant)
- Fix V2 finish_reason incorrectly lowercased (should stay uppercase)
- Add unit tests for thinking=None, role lowercase, and finish_reason
- Fix thinking token_budget → tokenBudget (camelCase for OCI API)
- Add V2 response toolCalls → tool_calls conversion for SDK compatibility
- Update test for tokenBudget casing
- Add test for tool_calls conversion
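
The two casing fixes above amount to renaming keys in each direction. A minimal sketch, assuming only the key names given in the commit message (token_budget/tokenBudget, toolCalls/tool_calls); the helper names and the flat-dict shape are illustrative, not the PR's implementation.

```python
from typing import Any, Dict, Mapping, Optional

# Key renames taken from the commit message; everything else is illustrative.
COHERE_TO_OCI_KEYS = {"token_budget": "tokenBudget"}
OCI_TO_COHERE_KEYS = {"toolCalls": "tool_calls"}


def rename_keys(body: Mapping[str, Any], mapping: Mapping[str, str]) -> Dict[str, Any]:
    """Return a copy of body with top-level keys renamed according to mapping."""
    return {mapping.get(key, key): value for key, value in body.items()}


def thinking_to_oci(thinking: Optional[Mapping[str, Any]]) -> Optional[Dict[str, Any]]:
    """Convert the thinking parameter, tolerating an explicit None."""
    if thinking is None:
        return None
    return rename_keys(thinking, COHERE_TO_OCI_KEYS)
```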
OCI doesn't provide a generation ID in responses. Previously used modelId
which is the model name (e.g. 'cohere.command-r-08-2024'), not a unique
generation identifier. Now generates a proper UUID.
- Add validation for direct credentials (user_id requires fingerprint and tenancy_id)
- Emit message-end event for V2 streaming before [DONE]
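
The direct-credentials check in the first bullet could look roughly like this. The parameter names user_id, fingerprint, and tenancy_id come from the commit message; the function name and error wording are assumptions.

```python
from typing import Optional


def validate_direct_credentials(
    user_id: Optional[str],
    fingerprint: Optional[str],
    tenancy_id: Optional[str],
) -> None:
    """Raise ValueError if user_id is given without its companion fields."""
    if user_id is None:
        return  # Direct credentials are not in use; nothing to validate.
    required = (("fingerprint", fingerprint), ("tenancy_id", tenancy_id))
    missing = [name for name, value in required if not value]
    if missing:
        raise ValueError(
            f"user_id was provided but these fields are missing: {', '.join(missing)}"
        )
```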
Remove OciClient (V1) and all V1-specific code paths, keeping only
OciClientV2. Also add oci_client.py to .fernignore alongside other
manually-maintained client files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
fern-support and others added 8 commits March 12, 2026 01:34
V1 OciClient was dropped in bc31c1e but four test classes still
referenced it. Port TestOciClientAuthentication, TestOciClientErrors,
and TestOciClientModels to use OciClientV2. Delete TestOciClient
(already covered by TestOciClientV2). Delete skipped test_missing_region.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
V2-only refactor removed the is_v2 parameter from transform functions
but tests still passed it, causing TypeError on every test.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Log warning on malformed SSE JSON instead of silently dropping
- Catch and re-raise transform_stream_event exceptions with OCI
  context instead of letting them escape as opaque httpx errors
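
A sketch of the "log instead of silently dropping" behaviour for malformed SSE JSON, using only the standard library. `parse_sse_json` is a hypothetical name and the log message is illustrative.

```python
import json
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def parse_sse_json(payload: str) -> Optional[Any]:
    """Parse one SSE data payload; warn and return None if it is not valid JSON."""
    try:
        return json.loads(payload)
    except json.JSONDecodeError as exc:
        # Truncate the payload so a huge malformed event does not flood the log.
        logger.warning("Skipping malformed OCI SSE payload %r: %s", payload[:80], exc)
        return None
```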

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raises ValueError at config time instead of letting the OCI Signer
fail with an opaque error at first request signing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Raises ValueError with supported endpoint list instead of constructing
a URL that produces an opaque OCI 404.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wrap JSON parse, request transform, and OCI signing in try/except
with RuntimeError that names the endpoint and original error, instead
of letting exceptions propagate as opaque httpx hook errors.
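
The wrapping described above might be sketched as follows. The function name, its signature, and the message text are assumptions; the commit only states that the RuntimeError names the endpoint and the original error.

```python
import json
from typing import Any, Callable, Dict


def prepare_oci_body(
    endpoint: str,
    raw_body: bytes,
    transform: Callable[[Dict[str, Any]], Dict[str, Any]],
) -> Dict[str, Any]:
    """Parse and transform a request body, re-raising failures with context."""
    try:
        cohere_body = json.loads(raw_body)
        return transform(cohere_body)
    except Exception as exc:
        # Chain the original exception so the root cause stays in the traceback.
        raise RuntimeError(
            f"Failed to prepare OCI request for endpoint '{endpoint}': {exc}"
        ) from exc
```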

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use .get() for extensions['endpoint'] with safe fallback
- Add FileNotFoundError handling for expired OCI session tokens
- Validate key_file presence in session auth config
- Document session token expiry limitation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fern-support and others added 2 commits March 12, 2026 01:47
Complete the V2 streaming protocol lifecycle:
message-start → content-start → content-delta* → content-end → message-end

Previously only content-delta, content-end, and message-end were emitted,
causing consumers expecting message-start to fail.
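
The lifecycle above can be sketched as a generator that frames each event as an SSE line. The event order and the "data: " prefix with a double newline come from the commit messages; the payload fields (index, delta shape) are simplified assumptions, and `to_v2_sse_events` is a hypothetical name.

```python
import json
from typing import Any, Iterable, Iterator


def _sse(event_type: str, **fields: Any) -> str:
    """Frame one event as an SSE data line followed by a blank line."""
    return "data: " + json.dumps({"type": event_type, **fields}) + "\n\n"


def to_v2_sse_events(text_deltas: Iterable[str]) -> Iterator[str]:
    """Emit message-start, content-start, content-delta*, content-end, message-end."""
    yield _sse("message-start")
    yield _sse("content-start", index=0)
    for text in text_deltas:
        # Simplified delta shape for illustration only.
        yield _sse("content-delta", index=0, delta={"message": {"content": {"text": text}}})
    yield _sse("content-end", index=0)
    yield _sse("message-end")
```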

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The oci dependency was added to pyproject.toml as optional but
poetry.lock was not regenerated, causing CI to fail with
"version solving failed".

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fern-support and others added 3 commits March 12, 2026 02:11
All code examples referenced non-existent OciClient class instead of
OciClientV2, and embed model name was incorrect (embed-light-v3.0 →
embed-english-light-v3.0). Also removed false V1/V2 claim and
redundant emoji markers.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Mirrors the existing pattern in lazy_aws_deps.py where optional
dependencies use type: ignore since they aren't installed in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f"Endpoint '{endpoint}' is not supported by OCI Generative AI. "
f"Supported endpoints: {list(action_map.keys())}"
)
return f"{base}/{api_version}/actions/{action}"

Unused stream parameter in get_oci_url

Low Severity

The stream parameter in get_oci_url is declared in the function signature but never referenced in the function body. It's passed in from two call sites but has no effect on the returned URL. This is dead code that may mislead future maintainers into thinking streaming affects URL construction.
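
One way to address this, sketched under assumptions: drop `stream` from the signature so the URL depends only on the endpoint. The error message and return expression mirror the fragment quoted above; the `base` and `api_version` parameters and the contents of `action_map` are guesses at the surrounding code, not the PR's actual definitions.

```python
def get_oci_url(base: str, api_version: str, endpoint: str) -> str:
    """Build the OCI action URL for a Cohere endpoint (no stream parameter)."""
    # Assumed mapping for illustration; the real action_map may differ.
    action_map = {"embed": "embedText", "chat": "chat"}
    action = action_map.get(endpoint)
    if action is None:
        raise ValueError(
            f"Endpoint '{endpoint}' is not supported by OCI Generative AI. "
            f"Supported endpoints: {list(action_map.keys())}"
        )
    return f"{base}/{api_version}/actions/{action}"
```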


fern-support and others added 2 commits March 12, 2026 11:37
The test_langchain_tool_calling_agent test uses deprecated models and
hits the live API without temperature=0, causing non-deterministic
failures unrelated to SDK changes. See PR #738 for investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of skipping the entire langchain-cohere job, deselect only the
single flaky test. The remaining 39 integration tests and 98 unit tests
still run. See PR #738 for investigation.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@fede-kamel

@fern-support thanks for taking the time to look at the OCI work. I do need some clarification on process here.

I am the original contributor for this OCI integration, and I am contributing it in the context of Oracle-led interoperability work. Given that, could you clarify your official role with respect to this repository and why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718?

If there is a maintainer-assigned reason for handling it this way, please state it explicitly. Otherwise, I would strongly prefer that technical feedback be given as review comments on the original PR so I can address it there in the normal review flow.

if "top_n" in cohere_body:
    oci_body["topN"] = cohere_body["top_n"]
if "max_chunks_per_doc" in cohere_body:
    oci_body["maxChunksPerDocument"] = cohere_body["max_chunks_per_doc"]

Rerank maps wrong parameter name, silently drops max_tokens_per_doc

High Severity

The rerank transform checks for max_chunks_per_doc in cohere_body, but the V2 rerank API (v2/raw_client.py) sends max_tokens_per_doc in the request body. This means that when a user passes max_tokens_per_doc to client.rerank(...), the parameter is silently dropped and never forwarded to OCI. The key "max_chunks_per_doc" belongs to the V1 API and will never appear in a V2 request body.
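
A possible shape for the fix, as a sketch only: read the V2 key max_tokens_per_doc instead of the V1 key. The OCI-side field name `maxTokensPerDocument` is an assumption made by analogy with `maxChunksPerDocument` and should be checked against the OCI API reference; `map_rerank_options` is a hypothetical helper name.

```python
from typing import Any, Dict, Mapping


def map_rerank_options(cohere_body: Mapping[str, Any]) -> Dict[str, Any]:
    """Map optional V2 rerank parameters to their OCI counterparts."""
    oci_body: Dict[str, Any] = {}
    if "top_n" in cohere_body:
        oci_body["topN"] = cohere_body["top_n"]
    if "max_tokens_per_doc" in cohere_body:
        # Assumed OCI field name; verify before relying on it.
        oci_body["maxTokensPerDocument"] = cohere_body["max_tokens_per_doc"]
    return oci_body
```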


@fede-kamel

fede-kamel commented Mar 12, 2026

A few process clarifications from my side. @fern-support

The code does not need to be rewritten for portability unless you are acting on behalf of Cohere in an official maintainer capacity. I am kindly asking for clarification on that point.

I am the original contributor of this OCI integration, and I am contributing it in the context of Oracle-led interoperability work. Given that, could you please clarify your official role with respect to this repository and why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718?

Please also do not involve Claude/Claude Code or any co-authorship attribution to Oracle on changes that Oracle did not author. Oracle does not develop this integration with Claude Code at this point, and I cannot have third-party generated or third-party authored work represented as Oracle-authored work.

More broadly, you cannot take someone else's contribution, rework it on a parallel branch, and then reference the original contributor as though they are the author of that rewritten implementation. If you are making independent changes, those should remain clearly attributed to the people or systems that actually produced them.

If there is a maintainer-assigned reason for handling it this way, please state it explicitly. Otherwise, I would strongly prefer that technical feedback be given as review comments on the original PR so I can address it there in the normal review flow. That preserves correct authorship, reviewability, and process integrity.

@fede-kamel

@billytrend-cohere @walterbm-cohere @daniel-cohere @sanderland @mkozakov @abdullahkady tagging you here for visibility on the process concern above.

I am the original contributor of this OCI integration, and I am contributing it while working for Oracle and representing Oracle in the context of Oracle <> Cohere partnership interoperability work. Given that context, I would appreciate clarification on why this work is being developed through a parallel PR/commit stream rather than as review feedback on the original contribution in #718.

I would strongly prefer that technical feedback come back as review comments on the original PR so I can address it there in the normal review flow, while preserving correct authorship and process integrity.

@fede-kamel

I want to state my main concern clearly.

The current provider pattern in this repository does not appear to be “drop V1 and keep only V2.” The existing integrations keep both surfaces:

  • BedrockClient and BedrockClientV2
  • SagemakerClient and SagemakerClientV2
  • the shared AWS wrappers also keep both AwsClient and AwsClientV2

That is also how the OCI work was originally proposed in #718: OciClient and OciClientV2, preserving parity with the rest of the SDK and preserving backward compatibility for V1-style OCI users.

By contrast, this PR explicitly introduces OciClientV2 only and drops the V1 OCI client entirely. That is not just a cleanup or review port; it is a product-scope change relative to the original contribution.

Given that, I am requesting clarification from Cohere maintainers on two points:

  1. Is Cohere intentionally choosing a V2-only OCI integration, even though the existing provider pattern in this repo keeps both V1 and V2 client surfaces?
  2. Is this PR intended to supersede the original Oracle-authored contribution in #718? If so, please state that explicitly and explain why the lead/original PR is no longer #718.

From my side, the original and leading OCI contribution is #718, which is the branch I have been maintaining, testing, and updating directly. If Cohere wants changes to scope or client surface, I would strongly prefer that guidance be given on #718 so it can be resolved on the original contribution rather than by replacing it through a parallel branch.

@fede-kamel

Thanks for taking the time to work through the OCI integration and for the feedback you’ve provided.

I want to be clear about the context from my side: this is an official Oracle integration, and I am contributing it as the Oracle-side author in the context of Oracle <> Cohere partnership work. Because of that, I would appreciate it if this PR could be closed and we focus discussion and review on the original PR, #718, where I have already incorporated part of your feedback directly into the Oracle-authored branch.

I would also kindly encourage following normal authorship and review best practices here. When an official partner contribution already exists, the cleanest path is usually to provide review comments on the original PR rather than reworking it in a parallel PR stream. That preserves authorship, reviewability, and ownership of the contribution while still allowing feedback to be incorporated.

In addition, as I mentioned earlier, since this is an official Oracle contribution in the Oracle <> Cohere partnership context, we do not want to use Claude Code or any non-approved assistant in work that would be represented as Oracle-authored. I would appreciate your understanding of that position.

Let’s unify the work in one place and use #718 as the lead PR. I’m very happy to continue incorporating feedback there, and I would genuinely welcome your review comments on that original PR.

@fede-kamel

Thanks for the clarification on Fern's role.

I do want to state one basic point clearly: the original OCI integration proposal and implementation came from the Oracle-authored contribution in #718, and that contribution should be credited as such. If further changes are needed, that is fine, but the clean way to handle that is to review and evolve the original contribution in place rather than effectively replacing it through a parallel PR stream.

My main concern remains the scope change. Moving OCI to OciClientV2 only is not just an implementation adjustment; it is a product decision to drop the V1 OCI client surface. That is materially different from the original proposal in #718, and it is also different from the existing provider pattern in this repo, where integrations like Bedrock and Sagemaker still expose both V1 and V2 client surfaces.

If Cohere intentionally wants OCI to be V2-only, I think that should be stated explicitly by Cohere maintainers as the product direction, rather than inferred through a parallel rewrite of the original Oracle contribution. Otherwise, my preference is still to keep the OCI work unified on #718 and handle scope changes there through direct review.

Since prior discussion was mentioned on this point, tagging @mkozakov for clarification: is Cohere explicitly choosing a V2-only OCI integration and dropping the V1 OCI client surface, even though the current provider pattern elsewhere in the SDK keeps both V1 and V2?


@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.

There are 4 total unresolved issues (including 2 from previous reviews).


fingerprint=oci_config["fingerprint"],
private_key_file_location=oci_config.get("key_file"),
private_key_content=oci_config.get("key_content"),
)

Session-based auth broken by wrong signer priority

High Severity

In map_request_to_oci, the "user" in oci_config check runs before the "security_token_file" in oci_config check. OCI session-based config profiles contain both user and security_token_file keys, so they always match the "user" branch and create a standard oci.signer.Signer instead of the required SecurityTokenSigner. This means the documented session-based authentication flow (example 3 in the README) silently uses the wrong signer and will fail at request time.
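
The ordering issue can be shown without the OCI SDK by returning a label for the signer that would be built. This is a sketch of the check order the comment suggests, not the code in map_request_to_oci; `choose_signer_kind` and its return labels are hypothetical, and only the config keys user and security_token_file come from the comment.

```python
from typing import Any, Mapping


def choose_signer_kind(oci_config: Mapping[str, Any]) -> str:
    """Pick a signer type, testing the session-token key before the user key."""
    # Session profiles contain both keys, so the more specific check goes first.
    if "security_token_file" in oci_config:
        return "security_token"
    if "user" in oci_config:
        return "api_key"
    raise ValueError("OCI config has neither 'security_token_file' nor 'user'")
```

Note that an earlier commit in this PR deliberately preferred API key auth over session-based auth, so the intended priority should be confirmed before changing the order.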


url="https://api.cohere.com/v1/chat",
headers={"connection": "keep-alive"},
- json={"model": "cohere.command-r-plus-v1:0", "message": "hello"},
+ json={"model": "cohere.I gues-v1:0", "message": "hello"},

Nonsensical model name accidentally committed in test

Medium Severity

The model name was changed from "cohere.command-r-plus-v1:0" to "cohere.I gues-v1:0", which appears to be an accidental artifact from an editing session or autocomplete. This introduces a nonsensical model name with a space into the test, reducing test clarity and potentially masking URL-encoding issues.



2 participants