Skip to content

fix(ci): trigger Algolia crawler reindex via API#428

Open
elibosley wants to merge 1 commit intomainfrom
codex/algolia-reindex-workflow
Open

fix(ci): trigger Algolia crawler reindex via API#428
elibosley wants to merge 1 commit intomainfrom
codex/algolia-reindex-workflow

Conversation

@elibosley
Copy link
Member

@elibosley elibosley commented Mar 24, 2026

Summary

  • replace the existing Algolia reindex action with the direct crawler API flow captured from the dashboard HAR
  • resolve the crawler dynamically by name and trigger reindexing with crawler-specific credentials only
  • document the exact GitHub secrets and optional variables needed for the workflow

Validation

  • parsed .github/workflows/algolia-reindex.yml successfully with Ruby YAML
  • reviewed the workflow and README diff directly
  • did not run the full build because this repo’s AGENTS.md explicitly says not to use it for routine validation

Notes

  • defaults come from the HAR and current repo config: app JUYLFQHE7W, crawler unraid
  • the workflow keeps workflow_dispatch and workflow_call support while adding automatic runs for content changes on main

Summary by CodeRabbit

  • Documentation

    • Added guide documenting the automated search index refresh workflow, including configuration variables and triggering conditions.
  • Chores

    • Improved search index reliability through enhanced workflow with automatic triggering on documentation updates and robust error handling.

- Purpose: replace the previous Algolia action-based workflow with the exact crawler reindex flow observed in the Algolia dashboard HAR and document the required repo setup.
- Before: the workflow depended on an extra standard Algolia API key and a third-party action instead of directly calling the crawler API used by the dashboard.
- Problem: that setup asked for unnecessary credentials, hid the real reindex mechanism, and was harder to reason about when debugging or rotating secrets.
- Now: the workflow resolves the crawler by name, triggers the crawler reindex endpoint with crawler-specific credentials only, waits for deployment propagation on main pushes, and verifies the crawler enters reindexing state.
- How: add a push trigger for published docs changes, use curl+jq against crawler.algolia.com, keep manual and reusable entry points, and document the exact secrets and optional variables in the README.
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 24, 2026

📝 Walkthrough

Walkthrough

The changes introduce custom GitHub Actions workflow automation for Algolia search index reindexing. The workflow replaces a marketplace action with explicit API calls to Algolia's crawler API, adds automatic triggering on documentation changes, implements concurrency controls, and includes polling logic to verify reindex completion.

Changes

Cohort / File(s) Summary
Algolia Reindex Workflow
.github/workflows/algolia-reindex.yml
Replaced marketplace action with custom shell steps for Algolia crawler API integration. Added push trigger with path filters, concurrency controls, configurable delay, and polling logic to verify reindex status with explicit error handling.
Documentation
README.md
Added "Algolia Reindex" section documenting the workflow's functionality, API endpoint, required secrets, optional variables, and triggering behavior.

Sequence Diagram

sequenceDiagram
    participant GitHub as GitHub Actions
    participant API as Algolia Crawler API
    participant Poll as Polling Loop

    GitHub->>API: GET /user_configs (resolve crawler ID)
    API-->>GitHub: Return crawler ID & status
    GitHub->>API: POST /reindex (trigger reindex)
    API-->>GitHub: Return action_id
    GitHub->>Poll: Start polling (up to 5 attempts)
    Poll->>API: GET crawler status
    API-->>Poll: Check reindexing flag
    alt reindexing=true
        Poll-->>GitHub: Success ✓
    else reindexing≠true
        Poll-->>GitHub: Retry or Fail
    end
Loading

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Our docs now shimmer with Algolia's spell,
No marketplace action serves us well!
With custom APIs and polling so keen,
The search index reindexes pristine,
Our rabbit-built workflow's truly supreme! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Description check ❓ Inconclusive The PR description covers the change summary, validation approach, and technical details, though it does not follow the provided repository template structure and checklist. Consider using the repository's pull request template (internal links, file naming, assets, duplicate PRs, build status) to align with project standards, even if not all items apply to this CI change.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'fix(ci): trigger Algolia crawler reindex via API' accurately summarizes the main change—replacing the Algolia action with direct API calls to trigger crawler reindexing.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch codex/algolia-reindex-workflow

Comment @coderabbitai help to get the list of available commands and usage tips.

@cloudflare-workers-and-pages
Copy link

Deploying with  Cloudflare Workers  Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

Status Name Latest Commit Preview URL Updated (UTC)
✅ Deployment successful!
View logs
unraid-docs 2e87826 Commit Preview URL

Branch Preview URL
Mar 24 2026, 06:23 PM

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
.github/workflows/algolia-reindex.yml (2)

57-62: Consider edge case: multiple crawlers matching the name.

If multiple crawlers share the same name, jq will output multiple lines, causing the variable assignment to capture only the first or behave unexpectedly. While unlikely for this use case, adding | first or limit(1; ...) would make the extraction defensive.

♻️ Optional defensive fix
           crawler_id="$(
             jq -er \
               --arg crawler_name "${ALGOLIA_CRAWLER_NAME}" \
-              '.data[] | select(.name == $crawler_name) | .id' \
+              '[.data[] | select(.name == $crawler_name)] | first | .id' \
               <<<"${response}"
           )"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/algolia-reindex.yml around lines 57 - 62, The jq
extraction for crawler_id can return multiple matches if more than one crawler
shares ALGOLIA_CRAWLER_NAME; update the jq filter used in the pipeline where
crawler_id is assigned to defensively take only the first match (e.g. use
limit(1; .data[] | select(.name == $crawler_name) | .id) or pipe the stream to
first) so the assignment to crawler_id always captures a single id even when
multiple crawlers match.

100-137: Consider using action_id for precise verification.

The step captures action_id in the previous step but doesn't use it here. While checking .reindexing == true works, using the action-specific endpoint could provide more precise status tracking. However, the current approach is functional and avoids additional API complexity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/algolia-reindex.yml around lines 100 - 137, The step polls
the crawler by matching .name and checking .reindexing, but you captured an
action_id earlier and should use it for precise verification; update the curl
request or jq filter in this step to query the action-specific endpoint or
include the action_id parameter (use the captured variable name ACTION_ID or
whatever the previous step outputs) and change the jq selector from '.data[] |
select(.name == $crawler_name) | .reindexing' to select the entry by action_id
(e.g., select(.action_id == $action_id) | .reindexing) and similarly for
.status, so the loop directly inspects the specific reindex action rather than
matching by crawler name. Ensure the env var is passed into the step and keep
the existing retry/sleep logic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In @.github/workflows/algolia-reindex.yml:
- Around line 57-62: The jq extraction for crawler_id can return multiple
matches if more than one crawler shares ALGOLIA_CRAWLER_NAME; update the jq
filter used in the pipeline where crawler_id is assigned to defensively take
only the first match (e.g. use limit(1; .data[] | select(.name == $crawler_name)
| .id) or pipe the stream to first) so the assignment to crawler_id always
captures a single id even when multiple crawlers match.
- Around line 100-137: The step polls the crawler by matching .name and checking
.reindexing, but you captured an action_id earlier and should use it for precise
verification; update the curl request or jq filter in this step to query the
action-specific endpoint or include the action_id parameter (use the captured
variable name ACTION_ID or whatever the previous step outputs) and change the jq
selector from '.data[] | select(.name == $crawler_name) | .reindexing' to select
the entry by action_id (e.g., select(.action_id == $action_id) | .reindexing)
and similarly for .status, so the loop directly inspects the specific reindex
action rather than matching by crawler name. Ensure the env var is passed into
the step and keep the existing retry/sleep logic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: e91267b4-2fed-4fb8-8842-c5c3f1ec849b

📥 Commits

Reviewing files that changed from the base of the PR and between 3ca142f and 2e87826.

📒 Files selected for processing (2)
  • .github/workflows/algolia-reindex.yml
  • README.md

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant