# azure-ai-agentserver: Support background mode and resumable streaming for hosted agents #46015
## Description

**Feature Request:** Support `background: true` and resumable streaming (`GET /responses/{id}?stream=true&starting_after=N`) for hosted agents, matching the behavior already available for direct model invocations via the Foundry Responses API.
## Current Behavior

### Direct model invocation (works)

When calling the Responses API directly with a model deployment:
`POST /openai/responses`

```json
{
  "model": "gpt-4o",
  "stream": true,
  "store": true,
  "background": true,
  "input": [{"role": "user", "content": "Hello"}]
}
```

- Returns immediately with `status: "in_progress"` and `"background": true`
- Every SSE event includes a `sequence_number`
- `GET /responses/{id}?stream=true&starting_after=0` replays all stored events, then continues with live events if still in progress
- Response lifecycle completes correctly (`status: "completed"`, `output` populated)
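On the client side, the `sequence_number` carried in each SSE event is what makes resumption possible: a client tracks the highest value it has seen and passes it back as `starting_after`. The sketch below is a minimal illustration of that bookkeeping, assuming the event names shown and abbreviated payload shapes; it parses a simulated slice of the SSE wire format rather than a live connection.

```python
import json


def parse_sse_events(raw: str):
    """Parse a raw SSE stream into (event_type, payload) pairs.

    Each Responses API stream event carries a sequence_number in its
    JSON payload, which is what starting_after resumes from.
    """
    events = []
    for block in raw.strip().split("\n\n"):
        event_type, data = None, None
        for line in block.splitlines():
            if line.startswith("event:"):
                event_type = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data = json.loads(line[len("data:"):].strip())
        if data is not None:
            events.append((event_type, data))
    return events


# Simulated slice of a background stream (payloads abbreviated;
# the real events carry full response/delta objects).
raw = (
    "event: response.created\n"
    'data: {"sequence_number": 0, "response": {"status": "in_progress"}}\n\n'
    "event: response.output_text.delta\n"
    'data: {"sequence_number": 1, "delta": "Hel"}\n\n'
    "event: response.output_text.delta\n"
    'data: {"sequence_number": 2, "delta": "lo"}\n\n'
)

events = parse_sse_events(raw)
last_seq = max(payload["sequence_number"] for _, payload in events)
print(last_seq)  # 2 -- the value to pass as starting_after on resume
```

A real client would feed reassembled SSE frames from an HTTP response into the same loop; only the transport differs.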
### Hosted agent invocation (does not work)

When calling the same Responses API with an `agent_reference` pointing to a hosted agent:
`POST /openai/responses`

```json
{
  "model": "gpt-5",
  "stream": true,
  "store": true,
  "background": true,
  "agent_reference": {"type": "agent_reference", "name": "my-agent"},
  "input": [{"role": "user", "content": "Hello"}]
}
```

- Returns immediately with `status: "in_progress"`, but the stored response has `"background": false`: the flag is silently dropped
- `GET /responses/{id}?stream=true&starting_after=0` returns `"Streaming is not enabled for this response"`
- The response object remains stuck at `status: "in_progress"` with an empty `output`, even after the agent completes and conversation items are saved
- The `azure-ai-agentserver-core` SDK (v1.0.0b16) has no reference to `background` anywhere in its source
## Why This Matters

The primary use case is resumable/rejoinable streaming for chat UIs. When a user:
- Refreshes the page mid-generation
- Loses network connectivity temporarily
- Opens a conversation that is still being generated in another tab
They should be able to call `GET /responses/{id}?stream=true&starting_after=0` to replay past events and continue receiving live events. This works today for direct model calls but not for hosted agents, despite both using the same `/openai/responses` API surface.
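The reconnect flow a chat UI needs is small: remember the last `sequence_number` yielded, and on any disconnect re-issue the GET with that value as `starting_after`. Below is a hedged sketch of that loop; `fake_fetch` is a hypothetical stand-in for `GET /responses/{id}?stream=true&starting_after=N` so the logic can run without a live endpoint.

```python
def stream_with_resume(fetch, start_after=0, max_reconnects=3):
    """Consume a resumable stream, reconnecting from the last
    sequence_number seen so no event is duplicated or lost.

    fetch(starting_after) stands in for the resumable GET; a real
    client would open an SSE request here instead.
    """
    last_seq = start_after
    for attempt in range(max_reconnects + 1):
        try:
            for event in fetch(last_seq):
                last_seq = max(last_seq, event["sequence_number"])
                yield event
            return  # stream completed normally
        except ConnectionError:
            if attempt == max_reconnects:
                raise
            # else: loop around and resume from last_seq


# Fake server: holds five events and drops the connection once,
# just before delivering event 3.
state = {"dropped": False}

def fake_fetch(starting_after):
    events = [{"sequence_number": n, "delta": f"chunk{n}"} for n in range(1, 6)]
    for e in events:
        if e["sequence_number"] <= starting_after:
            continue  # replay filter, as starting_after implies
        if e["sequence_number"] == 3 and not state["dropped"]:
            state["dropped"] = True
            raise ConnectionError("network blip")
        yield e


seen = [e["sequence_number"] for e in stream_with_resume(fake_fetch)]
print(seen)  # [1, 2, 3, 4, 5] -- gap-free despite the mid-stream drop
```

The same loop covers the page-refresh case: persist `last_seq` client-side and call the generator again with `start_after` set to it.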
## Observations

- The SDK already assigns a `sequence_number` to every `ResponseStreamEvent` via `StreamEventState`, so the primitive for `starting_after` resumption is already in place
- The SDK already supports `store=true`, which saves completed items to the Conversations API after stream completion
- The gap appears to be in Foundry's proxy layer between the Responses API endpoint and the hosted agent container: it doesn't buffer/store SSE events as they pass through for hosted agents the way it does for direct model calls
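To make the suspected gap concrete, here is a minimal sketch of the replay-then-follow-live buffering the proxy layer would need. It is an illustration only, not Foundry's actual implementation: events are appended as they pass through, and a reader started at any point replays stored events after `starting_after` and then blocks for live ones until the stream closes.

```python
import threading


class EventBuffer:
    """Hypothetical pass-through buffer: store each SSE event so a
    later GET ...?starting_after=N can replay and then follow live."""

    def __init__(self):
        self._events = []      # events in arrival order
        self._done = False
        self._cond = threading.Condition()

    def publish(self, event):
        """Called as each event flows from the agent container."""
        with self._cond:
            self._events.append(event)
            self._cond.notify_all()

    def close(self):
        """Mark the upstream stream as finished."""
        with self._cond:
            self._done = True
            self._cond.notify_all()

    def replay(self, starting_after):
        """Yield stored events after starting_after, then live ones."""
        i = 0
        while True:
            with self._cond:
                while i >= len(self._events) and not self._done:
                    self._cond.wait()  # block for live events
                if i >= len(self._events) and self._done:
                    return
                event = self._events[i]
            i += 1
            if event["sequence_number"] > starting_after:
                yield event


buf = EventBuffer()
for n in (1, 2, 3):
    buf.publish({"sequence_number": n})
buf.close()

print([e["sequence_number"] for e in buf.replay(0)])  # [1, 2, 3]
print([e["sequence_number"] for e in buf.replay(2)])  # [3]
```

Direct model calls evidently get this behavior today; the ask is that the same buffering apply when the upstream producer is a hosted agent container.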
## Expected Behavior

`background: true` + `stream: true` should work identically for hosted agents as it does for direct model calls:

- Foundry buffers SSE events as they pass through from the hosted agent container
- `GET /responses/{id}?stream=true&starting_after=N` replays stored events and continues with live events
- The response lifecycle completes correctly when the agent finishes
## Environment

- `azure-ai-agentserver-core==1.0.0b16`
- `azure-ai-agentserver-langgraph==1.0.0b16`
- API version: `2025-11-15-preview`