🏥 CI Failure: Smoke Codex - Agent produced no safe outputs (schedule run) #1225

@github-actions

Summary

The Smoke Codex workflow failed on the agent job for a scheduled run.

  • Run: 22953259892
  • Commit: dda2d3161ecaf546f306752680d16a16d33fca34
  • Trigger: schedule (cron 28 */12 * * *)
  • Time: 2026-03-11T12:46:22Z
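
For reference, the cron expression `28 */12 * * *` fires at minute 28 of every 12th hour, which is 00:28 and 12:28 UTC (GitHub Actions evaluates schedules in UTC). The reported time of 12:46:22Z falls shortly after the 12:28 slot. The sketch below expands only the minute and hour fields of this one expression. The helper names `expand_field` and `daily_fire_times` are hypothetical, and this is not a general cron parser.

```python
def expand_field(field: str, lo: int, hi: int) -> list[int]:
    """Expand one cron field; supports only 'N', '*', and '*/step'."""
    if field == "*":
        return list(range(lo, hi + 1))
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(lo, hi + 1, step))
    return [int(field)]


def daily_fire_times(expr: str) -> list[str]:
    """Return HH:MM fire times per day for a cron expression whose
    day-of-month, month, and day-of-week fields are all '*'."""
    minute, hour, dom, month, dow = expr.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("sketch only handles daily schedules")
    return [
        f"{h:02d}:{m:02d}"
        for h in expand_field(hour, 0, 23)
        for m in expand_field(minute, 0, 59)
    ]


print(daily_fire_times("28 */12 * * *"))  # ['00:28', '12:28']
```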

Root Cause

The Codex (OpenAI) agent made exactly 1 API call to api.openai.com but did not invoke any safe output tools afterward. The smoke test framework requires agents to call at least one safe output tool to confirm task completion.

```
##[error]No safe outputs were invoked. Smoke tests require the agent to call safe output tools.
##[error]Process completed with exit code 1.
```

**Firewall activity summary from the run:**
```
▼ 1 request | 1 allowed | 0 blocked | 1 unique domain
| Domain         | Allowed | Denied |
|----------------|---------|--------|
| api.openai.com | 1       | 0      |
```

Only 1 outbound request was made to api.openai.com, suggesting the model received the prompt but either:

  1. Completed its response text-only without calling any tool
  2. Hit a budget/context/token limit before calling safe output tools
  3. Encountered an error that prevented tool use
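
The check that produced the error above can be pictured roughly as follows. This is a minimal sketch, not the framework's actual code: the `SAFE_OUTPUT_TOOLS` set, the `check_safe_outputs` function, and the shape of the tool-call list are assumptions for illustration. Only `add_comment` and `noop` are named in this report.

```python
# Hypothetical set of safe output tool names; only add_comment and noop
# are mentioned in this report.
SAFE_OUTPUT_TOOLS = {"add_comment", "noop"}


def check_safe_outputs(tool_calls: list[str]) -> tuple[bool, str]:
    """Return (ok, message) for the list of tool names the agent invoked."""
    invoked = [name for name in tool_calls if name in SAFE_OUTPUT_TOOLS]
    if not invoked:
        return False, (
            "No safe outputs were invoked. "
            "Smoke tests require the agent to call safe output tools."
        )
    return True, f"Safe outputs invoked: {', '.join(invoked)}"


# A text-only response (possibility 1) yields no tool calls and fails.
print(check_safe_outputs([]))
# A run that ends with noop passes.
print(check_safe_outputs(["noop"]))
```

All three possibilities listed above end in the same state (an empty list of safe output calls), so the check alone cannot tell them apart. The agent's raw response would be needed for that.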

Pattern Analysis

Looking at recent smoke-codex runs:

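| Run                     | #       | Trigger      | Conclusion                                      |
|-------------------------|---------|--------------|-------------------------------------------------|
| 22953259892             | 891     | schedule     | ❌ failure (no safe outputs)                     |
| 22932379700             | 890     | pull_request | ❌ failure (called output but not add_comment)   |
| 22932264163–22932059934 | 884–889 | pull_request | ✅ success                                       |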

The scheduled run (which has no PR context) appears to consistently fail with "no safe outputs," suggesting the Codex model may not be following instructions to call safe output tools when there is no PR context to comment on.
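
To make the sample size behind this pattern explicit, the runs in the table above can be tallied by trigger and conclusion. This is a small sketch using only the run numbers listed in this report. The tuple layout is an assumption for illustration.

```python
from collections import Counter

# (run number, trigger, conclusion) as listed in the table above.
runs = [(891, "schedule", "failure"), (890, "pull_request", "failure")]
runs += [(n, "pull_request", "success") for n in range(884, 890)]

# Count runs per (trigger, conclusion) pair.
tally = Counter((trigger, conclusion) for _, trigger, conclusion in runs)
for (trigger, conclusion), count in sorted(tally.items()):
    print(f"{trigger:13} {conclusion:8} {count}")
```

Only one scheduled run appears in this sample, so the schedule-specific pattern is a hypothesis to confirm with further runs.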

Recommended Actions

  1. Review the Smoke Codex prompt (smoke-codex.md) to ensure it explicitly instructs the agent to call a safe output tool (e.g., noop) when running on a schedule trigger with no actionable output.
  2. Check if the Codex model/API version was updated recently — a model change could affect tool-calling behavior.
  3. Re-run the workflow to see if this is intermittent or consistently failing on schedule.
  4. Compare schedule vs PR instructions in the prompt — the PR run (890) at least called some safe output tool, while the schedule run called none at all.
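
One way to state the intended behavior from action 1 precisely is as a small decision rule that the prompt could spell out. The function below is a hypothetical sketch, not part of the workflow. That PR runs are expected to end with `add_comment` is inferred from run 890's failure message, and `noop` is the fallback suggested above.

```python
def expected_safe_output(trigger: str, has_pr_context: bool) -> str:
    """Hypothetical rule: which safe output tool a run should end with."""
    if trigger == "pull_request" and has_pr_context:
        return "add_comment"  # inferred from run 890's failure message
    return "noop"  # nothing to comment on, but still signal completion


for trigger, has_pr in [("pull_request", True), ("schedule", False)]:
    print(trigger, "->", expected_safe_output(trigger, has_pr))
```

Under this rule a scheduled run always has a valid safe output to call, so "no safe outputs" would no longer be the expected result of having nothing to comment on.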

Generated by CI Doctor

Labels: bug, ci
