## Problem
When embedding large datasets (thousands or millions of texts), the current `embed()` method accumulates all results in memory before returning. This causes:
- Out-of-memory errors for very large datasets
- Memory pressure when processing many texts sequentially
- No way to process results incrementally (e.g., save to database as embeddings arrive)
For enterprise workloads processing large document corpora, this is a significant limitation.
## Proposed Solution
A new `embed_stream()` method that:
- Processes texts in configurable batches
- Yields embeddings one at a time via an iterator
- Keeps memory usage proportional to `batch_size` rather than total dataset size
- Works with both v1 and v2 clients
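The core of the idea is an ordinary Python generator that fills a batch, calls the existing batch embed endpoint, and yields results one by one. A minimal sketch, assuming a pluggable `embed_fn` as a stand-in for the client's batch call (the names here are illustrative, not the PR's actual implementation):

```python
from typing import Callable, Iterable, Iterator, List, NamedTuple


class StreamedEmbedding(NamedTuple):
    """Index of the source text plus its embedding vector."""
    index: int
    embedding: List[float]


def embed_stream(
    embed_fn: Callable[[List[str]], List[List[float]]],
    texts: Iterable[str],
    batch_size: int = 20,
) -> Iterator[StreamedEmbedding]:
    """Yield embeddings one at a time, holding at most one batch in memory.

    `embed_fn` stands in for the client's batch embed call and must return
    one vector per input text, in order.
    """
    batch: List[str] = []
    index = 0
    for text in texts:
        batch.append(text)
        if len(batch) == batch_size:
            for vector in embed_fn(batch):
                yield StreamedEmbedding(index, vector)
                index += 1
            batch = []
    if batch:  # flush the final partial batch
        for vector in embed_fn(batch):
            yield StreamedEmbedding(index, vector)
            index += 1
```

Because the generator is lazy, the caller controls pacing: each `next()` pulls at most one new batch from the API, and nothing beyond the current batch is retained.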
## Usage Example
```python
import cohere

client = cohere.Client()

# Process large dataset incrementally
for embedding in client.embed_stream(
    texts=large_text_list,  # Can be thousands of texts
    model="embed-english-v3.0",
    input_type="classification",
    batch_size=20,
):
    save_to_database(embedding.index, embedding.embedding)
    # Only batch_size worth of embeddings in memory at a time
```
## Memory Impact
| Dataset Size | Current `embed()` | Proposed `embed_stream()` |
|---|---|---|
| 1,000 texts | ~4 MB | ~20 KB |
| 100,000 texts | ~400 MB | ~20 KB |
| 1,000,000 texts | ~4 GB+ (OOM) | ~20 KB |
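The rough arithmetic behind the left column, assuming 1024-dimensional float32 vectors (~4 KB each, the output size of embed-english-v3.0) held entirely in memory:

```python
# Back-of-envelope memory estimate; 1024-dim float32 vectors is an
# assumption based on embed-english-v3.0's output size.
DIMS = 1024
BYTES_PER_FLOAT = 4
BYTES_PER_EMBEDDING = DIMS * BYTES_PER_FLOAT  # 4096 bytes ~= 4 KB


def total_mb(n_texts: int) -> float:
    """MB needed to hold all n_texts embeddings at once."""
    return n_texts * BYTES_PER_EMBEDDING / 1024 / 1024


print(total_mb(1_000))      # ~4 MB
print(total_mb(100_000))    # ~400 MB
print(total_mb(1_000_000))  # ~4000 MB, i.e. ~4 GB
```

Under the same assumption, the streaming figure corresponds to only a handful of vectors in flight at any moment, independent of dataset size.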
## Context
We are using the Cohere Python SDK at Oracle for processing large embedding workloads. We have a working implementation in PR #698 that has been tested with the real Cohere API, passes all unit tests, and is backward compatible (no changes to existing `embed()`).
## Additional Details
- No breaking changes to existing APIs
- Optional dependency on `ijson` for more efficient incremental parsing (works without it)
- Supports both `embeddings_floats` and `embeddings_by_type` response formats
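Supporting both formats mostly means normalizing where the vectors live in the response before yielding them. A simplified sketch of that normalization, using plain dicts as illustrative stand-ins for the SDK's actual response models:

```python
from typing import Any, Dict, List


def extract_vectors(response: Dict[str, Any]) -> List[List[float]]:
    """Normalize either response shape to a flat list of float vectors.

    Illustrative shapes (not the SDK's real response objects):
    - embeddings_floats style:  {"embeddings": [[...], [...]]}
    - embeddings_by_type style: {"embeddings": {"float": [[...], [...]]}}
    """
    embeddings = response["embeddings"]
    if isinstance(embeddings, dict):  # embeddings_by_type
        return embeddings["float"]
    return embeddings  # embeddings_floats
```

With this in place, the streaming generator can yield from `extract_vectors(...)` regardless of which client version produced the response.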