🏥 CI Failure: Smoke Codex - Agent produced no safe outputs (schedule run) #1225

## Summary

The **Smoke Codex** workflow failed on the `agent` job for a scheduled run.

- **Run**: [22953259892](https://github.com/github/gh-aw-firewall/actions/runs/22953259892)
- **Commit**: `dda2d3161ecaf546f306752680d16a16d33fca34`
- **Trigger**: `schedule` (cron `28 */12 * * *`)
- **Time**: 2026-03-11T12:46:22Z

## Root Cause

The Codex (OpenAI) agent made **exactly 1 API call** to `api.openai.com` but **did not invoke any safe output tools** afterward. The smoke test framework requires agents to call at least one safe output tool to confirm task completion.

```
##[error]No safe outputs were invoked. Smoke tests require the agent to call safe output tools.
##[error]Process completed with exit code 1.
```

**Firewall activity summary from the run:**
```
▼ 1 request | 1 allowed | 0 blocked | 1 unique domain
| Domain         | Allowed | Denied |
|----------------|---------|--------|
| api.openai.com | 1       | 0      |
```

Only 1 outbound request was made to `api.openai.com`, suggesting the model received the prompt but either:
1. Completed its response text-only without calling any tool
2. Hit a budget/context/token limit before calling safe output tools
3. Encountered an error that prevented tool use
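
The gate described under Root Cause can be sketched as follows. This is a minimal illustration, not the smoke test framework's actual code: the `SAFE_OUTPUT_TOOLS` set, the `check_safe_outputs` function, and the idea that tool calls arrive as a list of names are all assumptions. Only the tool names `add_comment` and `noop` and the error text come from this report.

```python
from typing import Iterable

# Hypothetical set of safe output tool names. Only `add_comment` and `noop`
# are mentioned in this report; the real framework may define others.
SAFE_OUTPUT_TOOLS = frozenset({"add_comment", "noop"})

# Error text copied from the failing run's log.
ERROR_MESSAGE = (
    "No safe outputs were invoked. "
    "Smoke tests require the agent to call safe output tools."
)


def check_safe_outputs(tool_calls: Iterable[str]) -> list[str]:
    """Return the safe output tools the agent called, or raise if there are none."""
    invoked = [name for name in tool_calls if name in SAFE_OUTPUT_TOOLS]
    if not invoked:
        raise RuntimeError(ERROR_MESSAGE)
    return invoked


if __name__ == "__main__":
    # A text-only response (possibility 1) records no tool calls and fails the gate.
    try:
        check_safe_outputs([])
    except RuntimeError as exc:
        print(f"##[error]{exc}")
```

Under this sketch, all three possibilities listed above look identical to the gate: the list of safe output calls is empty. Telling them apart would require the agent's own logs rather than the gate's result.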

## Pattern Analysis

Recent smoke-codex runs:

| Run | # | Trigger | Conclusion |
|-----|---|---------|------------|
| 22953259892 | 891 | schedule | ❌ failure — no safe outputs |
| 22932379700 | 890 | pull_request | ❌ failure — called output but not `add_comment` |
| 22932264163–22932059934 | 884–889 | pull_request | ✅ success |

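The scheduled run, which has no PR context, failed with "no safe outputs", while six of the seven recent `pull_request` runs succeeded. With only one scheduled run in this sample the pattern is not yet established, but it suggests the Codex model may not follow the instruction to call a safe output tool when there is no PR to comment on.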
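
To make the sample size explicit, the runs in the table can be tallied per trigger. This is a small sketch over the rows listed in this report only; the `RUNS` list and the `failure_rates` helper are illustrative names, and runs 884 to 889 are expanded from the single range row.

```python
from collections import Counter

# (run number, trigger, succeeded) rows copied from the Pattern Analysis table.
RUNS = [
    (891, "schedule", False),
    (890, "pull_request", False),
] + [(n, "pull_request", True) for n in range(884, 890)]


def failure_rates(runs):
    """Map each trigger to a (failures, total) pair over the given runs."""
    totals = Counter(trigger for _, trigger, _ in runs)
    failures = Counter(trigger for _, trigger, ok in runs if not ok)
    return {trigger: (failures[trigger], totals[trigger]) for trigger in totals}


if __name__ == "__main__":
    for trigger, (failed, total) in failure_rates(RUNS).items():
        print(f"{trigger}: {failed}/{total} failed")
    # schedule: 1/1 failed
    # pull_request: 1/7 failed
```

A 1/1 failure rate cannot distinguish an intermittent failure from a consistent one, which is why more scheduled runs are needed before drawing a conclusion.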

## Recommended Actions

1. **Review the Smoke Codex prompt** (`smoke-codex.md`) to ensure it explicitly instructs the agent to call a safe output tool (e.g., `noop`) when running on a schedule trigger with no actionable output.
2. **Check if the Codex model/API version** was updated recently — a model change could affect tool-calling behavior.
3. **Re-run the workflow** to see if this is intermittent or consistently failing on schedule.
4. **Compare schedule vs PR instructions** in the prompt — the PR run (890) at least called *some* safe output tool, while the schedule run called none at all.
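
One way to make actions 1 and 4 concrete is to write down, per trigger, which safe output tool a run is expected to call. The sketch below is hypothetical: the `EXPECTED_TOOL` mapping and the `diagnose` function are not part of the workflow. The `add_comment` expectation for `pull_request` is inferred from run 890's failure message, and `noop` for `schedule` is the suggestion from action 1, not confirmed behavior.

```python
# Hypothetical mapping from workflow trigger to the safe output tool a smoke
# run is expected to call. Neither entry is confirmed by the workflow source.
EXPECTED_TOOL = {
    "pull_request": "add_comment",
    "schedule": "noop",
}


def diagnose(trigger: str, tools_called: list[str]) -> str:
    """Classify a run using the same wording as the Pattern Analysis table."""
    expected = EXPECTED_TOOL.get(trigger)
    if expected is None:
        return f"unknown trigger {trigger!r}: no expectation defined"
    if not tools_called:
        return f"failure: no safe outputs (expected `{expected}`)"
    if expected not in tools_called:
        return f"failure: called output but not `{expected}`"
    return "success"


if __name__ == "__main__":
    # Run 891: scheduled, no safe output tool called.
    print(diagnose("schedule", []))
    # failure: no safe outputs (expected `noop`)

    # Run 890: the tool it actually called is not stated in this report,
    # so "some_other_output" is a placeholder.
    print(diagnose("pull_request", ["some_other_output"]))
    # failure: called output but not `add_comment`
```

If the prompt in `smoke-codex.md` states an expectation like this for each trigger, a scheduled run with nothing to comment on still has a defined way to report completion.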




> Generated by [CI Doctor](https://github.com/github/gh-aw-firewall/actions/runs/22953457172)
