
feat(integrations): deep GitHub integration for AI agents#818

Merged
2witstudios merged 15 commits into master from pu/gh-pr-review
Apr 1, 2026

Conversation

Owner

@2witstudios 2witstudios commented Mar 31, 2026

Summary

  • 19 GitHub integration tools (up from 6) giving AI agents full code review, repo browsing, issue management, and PR review capabilities
  • import_from_github built-in tool — imports files, PR diffs, or directories into PageSpace as CODE pages with full provenance tracking
  • Data migration for renamed create_pr_comment → create_issue_comment tool grants
  • Shared language-detection utility extracted from CodePageView for reuse across import tool and editor

New GitHub Tools

Code Browsing: get_repo, get_repo_content, get_repo_tree, list_branches, search_code, get_commit
PR Review: get_pr_diff, get_pr_reviews, get_pr_review_comments, create_pr_review, create_pr_review_comment
Issue Mgmt: update_issue, list_issue_comments, create_issue_comment
Import: import_from_github (file/pr/directory modes)

Hardening (from code review)

  • Base64 truncation fix: get_repo_content maxLength raised to 200K
  • create_pr_review inline comments now require line (prevents GitHub 422)
  • create_pr_review_comment relaxed required fields to support reply mode (only body required, path/commit_id/line optional)
  • update_issue schema: added duplicate to state_reason, allow null milestone
  • Activity logging added for imported pages
  • SVG moved from binary to text extensions
  • ai-tools aggregation test updated
  • Pure function purity fixes: batch importedAt timestamp per import operation, processDirectory returns results instead of closure mutation, removed unsafe double type assertion in resolveGitHubToken
  • Input normalization: maxFiles clamped (floor, 1..50), pullNumber validated as positive integer
  • 54 new tests: 26 for github-import-tools, 28 for language-detection
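The input-normalization items above can be pictured with a minimal sketch. This is illustrative only, assuming the clamp and validation behavior described in the bullet (floor then clamp to 1..50; positive-integer check); the function names are hypothetical, not the PR's actual helpers:

```typescript
// Hypothetical sketch of the described input normalization; the real
// code in github-import-tools.ts may differ in names and structure.
function clampMaxFiles(value: number): number {
  // Floor to an integer, then clamp into the allowed 1..50 range.
  const floored = Math.floor(value);
  return Math.min(50, Math.max(1, floored));
}

function assertPositiveInteger(value: number, name: string): number {
  // Reject fractional, zero, and negative values (e.g. for pullNumber).
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got ${value}`);
  }
  return value;
}
```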

Test plan

  • 73/73 GitHub provider tests passing
  • 3/3 ai-tools aggregation tests passing
  • 26/26 github-import-tools tests passing
  • 28/28 language-detection tests passing
  • Manual: Connect GitHub OAuth, grant tools to agent, verify code browsing
  • Manual: Create PR review with inline comments
  • Manual: Import a file and PR into PageSpace
  • Run migration 0091 against staging DB

🤖 Generated with Claude Code

2witstudios and others added 2 commits March 17, 2026 15:23
…epics 7-9

- Fix packages/lib typecheck: replace `as ReturnType<typeof vi.fn>` casts
  with `vi.mocked()` in export-utils.test.ts (4 errors → 0)
- Mark Epic 6 (6c, 6d) as COMPLETE in parent plan
- Add Standards & Rules section to Epics 7, 8, 9 (matches 6a-6d)
- Note missing getTenantByStripeSubscription method in Epic 7
- Note integration test DB requirement in Epic 8
- Document Caddy/Traefik coexistence in Epic 3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Expand GitHub provider from 6 to 19 tools enabling full code review
workflows. Add built-in import_from_github tool for bringing GitHub
code into PageSpace as CODE pages.

New capabilities:
- Code browsing: get_repo, get_repo_content, get_repo_tree, list_branches, search_code, get_commit
- PR review: get_pr_diff, get_pr_reviews, get_pr_review_comments, create_pr_review, create_pr_review_comment
- Issue management: update_issue, list_issue_comments, create_issue_comment (renamed from create_pr_comment)
- Code import: import_from_github with file/pr/directory modes

Also includes:
- Data migration (0091) for create_pr_comment → create_issue_comment grant references
- Shared language-detection utility extracted from CodePageView
- Activity logging for imported pages
- 72 provider tests + ai-tools aggregation test fix

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai
Contributor

coderabbitai bot commented Mar 31, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Added a GitHub import AI tool (file/PR/directory modes), extracted filename language detection into a shared utility and export, expanded the GitHub provider and its tests, added a DB migration renaming a tool ID, extended page creation inputs to store import metadata, and appended the migration journal entry.

Changes

Cohort / File(s) Summary
Language detection & exports
packages/lib/src/utils/language-detection.ts, packages/lib/package.json, apps/web/src/components/layout/middle-content/page-views/code/CodePageView.tsx
New EXTENSION_TO_LANGUAGE map, detectLanguageFromFilename, BINARY_EXTENSIONS, and isBinaryFile; package export subpath added; CodePageView now delegates language detection to the shared util.
GitHub import tool & AI integration
apps/web/src/lib/ai/tools/github-import-tools.ts, apps/web/src/lib/ai/core/ai-tools.ts, apps/web/src/lib/ai/core/__tests__/ai-tools.test.ts, apps/web/src/lib/ai/core/tool-filtering.ts
Introduced githubImportTools.import_from_github (modes: file, pr, directory), integrated into pageSpaceTools, updated tests, and added import_from_github to write tools.
GitHub provider & tests
packages/lib/src/integrations/providers/github.ts, packages/lib/src/integrations/providers/github.test.ts
Significant expansion of GitHub provider tools (many new read/write endpoints, updated inputs/outputs and rate limits); tests expanded and create_pr_comment renamed to create_issue_comment.
Page repository changes
packages/lib/src/repositories/page-repository.ts
CreatePageInput extended with optional extractionMethod, extractionMetadata, and contentHash; create persists these fields when provided.
Database migration & journal
packages/db/drizzle/0091_migrate_create_pr_comment_tool_id.sql, packages/db/drizzle/meta/_journal.json
Migration to rename create_pr_comment → create_issue_comment within JSONB arrays; appended migration journal entry (idx: 91).
Misc / Local workflow file
.claude/ralph-loop.local.md
Added a local iterative PR convergence/merge readiness workflow markdown file.
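The extracted language-detection utility can be pictured with a small sketch. The exported names (EXTENSION_TO_LANGUAGE, detectLanguageFromFilename, BINARY_EXTENSIONS, isBinaryFile) come from the walkthrough above; the map and set contents here are illustrative assumptions, not the actual tables in packages/lib:

```typescript
// Illustrative sketch of the shared utility; only the exported names are
// from the PR, the map/set entries below are assumed examples.
const EXTENSION_TO_LANGUAGE: Record<string, string> = {
  ts: "typescript",
  tsx: "typescript",
  js: "javascript",
  py: "python",
  rs: "rust",
};

// Extensions treated as binary. Per the hardening notes, .svg was moved
// out of this set so SVG imports are handled as text.
const BINARY_EXTENSIONS = new Set(["png", "jpg", "gif", "pdf", "zip"]);

function extensionOf(filename: string): string {
  const idx = filename.lastIndexOf(".");
  return idx === -1 ? "" : filename.slice(idx + 1).toLowerCase();
}

function detectLanguageFromFilename(filename: string): string | undefined {
  return EXTENSION_TO_LANGUAGE[extensionOf(filename)];
}

function isBinaryFile(filename: string): boolean {
  return BINARY_EXTENSIONS.has(extensionOf(filename));
}
```

Centralizing this lets the import tool and CodePageView agree on language tags without duplicating the extension table.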

Sequence Diagram(s)

sequenceDiagram
  participant Tool as "githubImportTools (AI Tool)"
  participant Auth as "Connections/Auth"
  participant GitHub as "GitHub API"
  participant Repo as "PageRepository / DB"
  participant WS as "Websocket"
  participant Log as "importLogger"

  Tool->>Auth: resolve & decrypt connection (connectionId, userId)
  Auth-->>Tool: github token
  Tool->>GitHub: fetch content / PR / directory (repo, path, ref, sha)
  GitHub-->>Tool: file entries / PR metadata
  Tool->>Repo: create FOLDER / CODE pages (contentHash, stateHash, metadata)
  Repo-->>Tool: created page IDs
  Tool->>WS: emit "created" events (best-effort)
  Tool->>Log: async activity logging (fire-and-forget)
  Tool-->>Tool: return import summary (success / errors)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Poem

🐇 I hopped through tokens, headers, and trees,
I fetched files and PRs with delicate ease.
Extensions found, binaries politely passed,
Pages planted quickly, events sent fast.
A rabbit's import — tidy, quick, and vast.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage: ⚠️ Warning. Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
Title check: ✅ Passed. The title accurately reflects the main change: adding deep GitHub integration capabilities (19 tools) to enable AI agents to perform code review, repository browsing, issue management, and content import.
Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.



@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 54004f62db


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (3)
tasks/tenant-epic-7-stripe-billing.md (1)

24-24: Recommend tightening the missing-method requirement with behavior-level acceptance criteria.

Good catch on the repository gap. Consider specifying expected behavior in this line item (e.g., return null when not found, exact input type, and required test cases) so implementation/test scope is unambiguous.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tasks/tenant-epic-7-stripe-billing.md` at line 24, Add a new repository
method getTenantByStripeSubscription(subscriptionId: string) on the
TenantRepository (tenant-repository) that returns the matching Tenant or null if
none is found; ensure the method accepts a string subscriptionId, performs the
lookup against the data source, and does not throw on “not found” but returns
null. Implement the method in the repository implementation used by the control
plane and add unit tests for: (1) successful lookup returning a Tenant, (2)
lookup returning null when no tenant matches, and (3) validation/behavior for
invalid/empty subscriptionId inputs to assert expected error or null behavior
per repository conventions. Ensure tests reference getTenantByStripeSubscription
and assert exact return types/values.
tasks/tenant-epic-6-control-plane.md (1)

3-3: Consider separating unrelated documentation updates.

These epic status updates appear unrelated to the PR's stated purpose (GitHub integration tools). Mixing unrelated changes in a single PR can complicate code review, git history, and selective rollback.

For better maintainability, consider moving documentation updates like these to a separate housekeeping PR or commit.

Also applies to: 14-15

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tasks/tenant-epic-6-control-plane.md` at line 3, This PR contains unrelated
documentation edits (the markdown line "**Status**: COMPLETE" and any other
occurrences of "**Status**: COMPLETE") that should be split out; remove or
revert those doc-only changes from the current branch/commit and create a
separate housekeeping commit/PR with the documentation updates so the GitHub
integration tools PR remains focused and reviewable.
packages/lib/src/__tests__/export-utils.test.ts (1)

206-212: Consider updating XLSX mocks to use vi.mocked() for consistency.

The HTMLtoDOCX mocks were updated to use vi.mocked(), but XLSX mocks still use the cast-based pattern (XLSX.write as ReturnType<typeof vi.fn>). For consistency across the test file:

♻️ Optional: Refactor XLSX mock access to use vi.mocked()
-      expect((XLSX.utils.book_append_sheet as ReturnType<typeof vi.fn>).mock.calls[0][2]).toBe('Sheet1');
+      expect(vi.mocked(XLSX.utils.book_append_sheet).mock.calls[0][2]).toBe('Sheet1');
       expect(XLSX.write).toHaveBeenCalledTimes(1);
-      expect((XLSX.write as ReturnType<typeof vi.fn>).mock.calls[0][1]).toEqual(
+      expect(vi.mocked(XLSX.write).mock.calls[0][1]).toEqual(
         expect.objectContaining({ type: 'buffer', bookType: 'xlsx', compression: true })
       );

Similar updates would apply to lines 221, 230, 239, 249, and 256.

Based on learnings: "In vitest, avoid using two type arguments with vi.fn<>() in tests... Use a plain vi.fn() and, when needed, provide a mocked return with mockResolvedValue({ ... } as never)."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/src/__tests__/export-utils.test.ts` around lines 206 - 212, The
tests currently access XLSX mocks via casted vi.fn types (e.g., (XLSX.write as
ReturnType<typeof vi.fn>).mock) which is inconsistent with other mocks; update
the XLSX mock references to use vi.mocked(XLSX) and access call/mock helpers
from that mocked object (e.g., vi.mocked(XLSX).utils.aoa_to_sheet,
vi.mocked(XLSX).utils.book_append_sheet, vi.mocked(XLSX).write) across the
assertions (including the other similar assertions mentioned) so the tests use
vi.mocked() consistently instead of manual casts.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/web/src/lib/ai/tools/github-import-tools.ts`:
- Around line 145-181: The DB write path is not atomic: pageRepository.create,
createPageVersion and the event broadcast are awaited separately so failures
after the page row is created can leave partial imports and cause retry
duplicates; wrap the DB writes (creating the page and the page version) in a
single transaction (use your repository/DB transaction API around
pageRepository.create and createPageVersion calls so they commit or rollback
together) and make broadcastPageEvent/createPageEventPayload a best-effort
fire-and-forget (catch and log any errors from broadcastPageEvent instead of
letting them bubble up). Apply the same transactional + best-effort broadcast
changes to the other identical block at the later section.
- Around line 93-115: resolveGitHubToken currently fetches a connection by id
only and decrypts its credentials, allowing a foreign connection id to be used;
update the lookup to require the caller's user id as well: change
resolveGitHubToken signature to accept a userId (or otherwise receive the
caller's id), call/getConnectionWithProvider with both connectionId and userId
(or after fetch verify connection.userId === userId), and throw an error if the
user ids don't match before calling decryptCredentials; reference
resolveGitHubToken, getConnectionWithProvider and decryptCredentials when making
this change so you scope the lookup to the caller and prevent decrypting another
user's token.
- Around line 63-73: The githubFetch function builds URLs by interpolating raw
repo path strings, which breaks for filenames containing reserved characters;
fix by splitting the incoming path on '/', run encodeURIComponent on each
segment, join them with '/' and use that encoded path when constructing the URL
(i.e., replace direct use of path in new URL(`${GITHUB_API_BASE}${path}`) with
the encoded-segment version), preserving existing queryParams logic; apply the
same change to the other locations in this file that construct /contents/...
URLs so all repository path segments are encoded consistently.
- Around line 397-409: Replace the ad-hoc root-level owner check with the
centralized permission call: import and call getUserAccessLevel(driveId, userId)
(from `@pagespace/lib/permissions`) instead of comparing drive.ownerId to userId;
allow root-page creation when the returned access level indicates editing rights
(e.g., "editor" or "owner") or, if your codebase provides it, call
canUserEditPage(userId, /* parentId */ null or driveRootId) to mirror other
create checks; update the error message to the same one used elsewhere and
remove the local drive.ownerId comparison so root permission decisions use the
centralized permission model.

In `@packages/lib/package.json`:
- Around line 192-196: The package.json exports include
"./utils/language-detection" at runtime but the typesVersions map is missing a
corresponding entry; update the typesVersions section to add a mapping for
"./utils/language-detection" pointing to its declaration
"./dist/utils/language-detection.d.ts" (matching the existing pattern used for
other subpaths) so TypeScript resolution finds the types; locate the
typesVersions object in package.json and add the "./utils/language-detection"
key with the same path format as other entries.

In `@packages/lib/src/integrations/providers/github.ts`:
- Around line 176-185: The get_repo_content tool schema currently forces path as
required which prevents calling the GitHub contents endpoint for the repository
root; update the JSON schema for get_repo_content by removing "path" from the
required array(s) (keep the path property definition in properties so it stays
optional) — specifically modify the required: ['owner','repo','path'] entries
associated with get_repo_content (also update the duplicate required block
around lines 187-195 referenced in the diff) so the schema requires only owner
and repo, allowing omission of path to access the repo root.
- Around line 1077-1085: The schema for the review payload currently defines
"body" and "event" but doesn't enforce that "body" is required for event values
REQUEST_CHANGES or COMMENT; update the JSON Schema where the review payload is
defined (the object containing the "body" and "event" properties in
packages/lib/src/integrations/providers/github.ts) to add conditional
validation—use JSON Schema "if"/"then" (or "oneOf") clauses that check if event
is "REQUEST_CHANGES" or "COMMENT" and in that case require "body", otherwise
leave "body" optional; apply the same conditional logic to the other identical
schema instance referenced around the second location (the other review schema
near the later block).
- Around line 1194-1197: The create_pr_review_comment JSON schema currently
defines commit_id (description: 'SHA of the commit to comment on...') as an
optional property; update the schema object for create_pr_review_comment so that
"commit_id" is included in its required array (alongside other required fields)
to enforce validation before calling GitHub's API; locate the schema object
named create_pr_review_comment in
packages/lib/src/integrations/providers/github.ts and add the string "commit_id"
to the required[] list so payloads missing commit_id are rejected locally.
- Around line 251-260: The outputTransform currently extracts only '$.tree' and
maps fields (path,type,mode,size,sha) and drops GitHub's top-level truncated
flag, so callers can't detect incomplete trees; modify the outputTransform for
the tree response (the extract and mapping configuration used in
outputTransform) to also preserve the top-level truncated flag (e.g., include
truncated in the mapping or wrap the result to include truncated alongside the
mapped tree) so that callers receiving the outputTransform result can inspect
truncated (and still keep the existing maxLength: 500 behavior).

In `@tasks/tenant-epic-3-traefik-proxy.md`:
- Around line 10-11: Remove the machine-specific absolute path
`/Users/jono/production/PageSpace-Deploy/` from the epic text, clarify that in
Caddy coexistence mode Traefik runs on internal ports only (not binding host
80/443) and is fronted by Caddy for routing and TLS termination, and update the
Docker Compose requirements section to remove/replace the host exposure
requirement for ports 80/443 for Traefik (document that Caddy handles host
80/443 while Traefik listens on internal ports for tenant subdomains); keep the
note that a full Caddy-to-Traefik cutover is out of scope and track as a
separate task.
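The path-encoding finding above (reserved characters like #, ?, and % breaking /contents/... URLs) can be sketched as a small helper. The helper name and the exact URL shape are illustrative; the PR applies the same idea inside githubFetch:

```typescript
// Encode each segment of a repo-relative path so reserved characters
// (#, ?, %, spaces) survive URL construction, while keeping the '/'
// separators intact.
function encodeRepoPath(path: string): string {
  return path.split("/").map(encodeURIComponent).join("/");
}

// Assumed example of building a contents URL with the encoded path;
// GITHUB_API_BASE and the route shape follow the review text, not the
// actual source.
const GITHUB_API_BASE = "https://api.github.com";
function contentsUrl(owner: string, repo: string, path: string): string {
  return new URL(
    `/repos/${owner}/${repo}/contents/${encodeRepoPath(path)}`,
    GITHUB_API_BASE
  ).toString();
}
```

Without per-segment encoding, a filename like `a#b.md` would be truncated at the `#` when the URL is parsed.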

---

Nitpick comments:
In `@packages/lib/src/__tests__/export-utils.test.ts`:
- Around line 206-212: The tests currently access XLSX mocks via casted vi.fn
types (e.g., (XLSX.write as ReturnType<typeof vi.fn>).mock) which is
inconsistent with other mocks; update the XLSX mock references to use
vi.mocked(XLSX) and access call/mock helpers from that mocked object (e.g.,
vi.mocked(XLSX).utils.aoa_to_sheet, vi.mocked(XLSX).utils.book_append_sheet,
vi.mocked(XLSX).write) across the assertions (including the other similar
assertions mentioned) so the tests use vi.mocked() consistently instead of
manual casts.

In `@tasks/tenant-epic-6-control-plane.md`:
- Line 3: This PR contains unrelated documentation edits (the markdown line
"**Status**: COMPLETE" and any other occurrences of "**Status**: COMPLETE") that
should be split out; remove or revert those doc-only changes from the current
branch/commit and create a separate housekeeping commit/PR with the
documentation updates so the GitHub integration tools PR remains focused and
reviewable.

In `@tasks/tenant-epic-7-stripe-billing.md`:
- Line 24: Add a new repository method
getTenantByStripeSubscription(subscriptionId: string) on the TenantRepository
(tenant-repository) that returns the matching Tenant or null if none is found;
ensure the method accepts a string subscriptionId, performs the lookup against
the data source, and does not throw on “not found” but returns null. Implement
the method in the repository implementation used by the control plane and add
unit tests for: (1) successful lookup returning a Tenant, (2) lookup returning
null when no tenant matches, and (3) validation/behavior for invalid/empty
subscriptionId inputs to assert expected error or null behavior per repository
conventions. Ensure tests reference getTenantByStripeSubscription and assert
exact return types/values.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bd1c005a-ffd8-4c79-bc41-5418aa4e2ef1

📥 Commits

Reviewing files that changed from the base of the PR and between 6bc0824 and 54004f6.

📒 Files selected for processing (18)
  • apps/web/src/components/layout/middle-content/page-views/code/CodePageView.tsx
  • apps/web/src/lib/ai/core/__tests__/ai-tools.test.ts
  • apps/web/src/lib/ai/core/ai-tools.ts
  • apps/web/src/lib/ai/core/tool-filtering.ts
  • apps/web/src/lib/ai/tools/github-import-tools.ts
  • packages/db/drizzle/0091_migrate_create_pr_comment_tool_id.sql
  • packages/db/drizzle/meta/_journal.json
  • packages/lib/package.json
  • packages/lib/src/__tests__/export-utils.test.ts
  • packages/lib/src/integrations/providers/github.test.ts
  • packages/lib/src/integrations/providers/github.ts
  • packages/lib/src/repositories/page-repository.ts
  • packages/lib/src/utils/language-detection.ts
  • tasks/tenant-epic-3-traefik-proxy.md
  • tasks/tenant-epic-6-control-plane.md
  • tasks/tenant-epic-7-stripe-billing.md
  • tasks/tenant-epic-8-data-migration.md
  • tasks/tenant-epic-9-operational-tooling.md

2witstudios and others added 5 commits March 31, 2026 11:39
- Scope resolveGitHubToken to caller's userId (prevent cross-user token access)
- Encode path segments in githubFetch to handle special characters (#, ?, %)
- Make broadcastPageEvent best-effort (catch errors, don't fail imports)
- Add typesVersions entry for utils/language-detection
- Make path optional in get_repo_content (allows root directory listing)
- Expose truncated flag in get_repo_tree output mapping
- Require body in create_pr_review (needed for REQUEST_CHANGES/COMMENT)
- Require commit_id in create_pr_review_comment (GitHub API requirement)
- Require line in create_pr_review inline comments array
- Update all corresponding test assertions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cast connection.credentials as Record<string, string> instead of
Record<string, unknown> to match decryptCredentials parameter type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The project's custom logger uses (message: string, error?: Error, meta?: object)
not pino's (mergingObject, message) signature.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
warn() takes 2 args (message, metadata?) not 3 like error() does.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/lib/src/integrations/providers/github.ts (1)

426-426: ⚠️ Potential issue | 🟠 Major

Preserve the pull request discriminator in get_issues output.

GitHub's /repos/{owner}/{repo}/issues endpoint returns both issues and pull requests. The presence of the pull_request field distinguishes pull requests from issues. The current outputTransform mapping drops this field entirely, so callers cannot determine whether an entry should be handled as an issue or pull request.

Add the pull_request.html_url field to the mapping:

Fix
      outputTransform: {
        mapping: {
          number: 'number',
          title: 'title',
          state: 'state',
          html_url: 'html_url',
          user: 'user.login',
          labels: 'labels',
          created_at: 'created_at',
+         pull_request_html_url: 'pull_request.html_url',
        },
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/src/integrations/providers/github.ts` at line 426, The
outputTransform for get_issues currently drops the pull_request discriminator
from GitHub's /repos/{owner}/{repo}/issues response; update the output mapping
in the get_issues outputTransform so it preserves the pull_request field (at
minimum include pull_request.html_url) alongside existing fields (e.g., id,
number, title, body, user, labels, state) so callers can distinguish PRs from
issues; locate the get_issues provider method and its outputTransform mapping in
github.ts and add pull_request.html_url into the mapped output object.
🧹 Nitpick comments (1)
.claude/ralph-loop.local.md (1)

10-10: Add explicit db:* validation when migrations are involved.

The checklist includes tests/build/lint/typecheck, but does not explicitly require relevant DB migration tasks. Given this repo workflow, make that explicit in the success criteria/process.

Based on learnings: "Before opening a pull request, run pnpm build, pnpm typecheck, pnpm lint, and relevant database migration tasks".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/ralph-loop.local.md at line 10, Update the SUCCESS CRITERIA and
PROCESS sections to explicitly require running and verifying relevant DB
migration tasks whenever migrations are involved: add a bullet to SUCCESS
CRITERIA stating "Relevant DB migration tasks run and verified (e.g., pnpm
db:migrate / pnpm prisma:migrate or repo-specific db:* commands) and migration
state is committed/validated", and augment PROCESS step 0 (Pre-flight branch
health) to run the repo's DB migration checks (e.g., pnpm db:status, pnpm
db:migrate:ci) after syncing with base and before running build/typecheck/lint;
reference the existing headings "SUCCESS CRITERIA", "PROCESS", and "step 0)
Pre-flight branch health" to locate where to add these lines.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/ralph-loop.local.md:
- Line 4: The config currently leaves session_id empty; either set a persistent
non-empty session_id in the .claude/ralph-loop.local.md file or implement
explicit auto-generation where the app reads session_id (e.g., in the session
bootstrap/loader that uses session_id) so it always supplies a stable UUID or
timestamped id when blank; update the code path that consumes the session_id to
accept the generated id and ensure it is returned/stored so run
correlation/resume logic is robust.
- Around line 5-10: The loop limit is inconsistent: the YAML key max_iterations
is set to 40 but the escape-hatch text mentions "After 25 iterations" causing
contradictory stop conditions; update one of them so both match (either set
max_iterations to 25 or change the escape-hatch sentence to "After 40
iterations") and ensure the symbolic references in the document remain
consistent (target the max_iterations key and the escape-hatch clause that
contains "<promise>BLOCKED</promise>" and the "After 25 iterations" wording).
- Line 10: The markdown blob starting with "TASK: Converge the current open Pull
Request..." contains literal "\n" sequences and a stray "/aidd" token; edit that
text block (the TASK paragraph) to replace every escaped "\n" with real newline
characters so lists and paragraphs render properly, remove the trailing "/aidd",
and reflow the content into normal Markdown paragraphs/lists (preserve the
original checklist and PROCESS steps exactly but with actual newlines).

In `@packages/lib/src/integrations/providers/github.ts`:
- Around line 215-216: The get_repo_tree function's description promises a "full
tree" but leaves the recursive parameter optional, causing callers to get only
top-level entries; update get_repo_tree to pass recursive = "1" by default (or
make the recursive parameter required and always set to "1") so the GitHub API
returns the full recursive tree, and update the description string to reflect
that recursion is forced; make changes to the function signature/parameter
handling (get_repo_tree and its recursive param) and the accompanying
description to ensure recursion is always enabled and the docs are accurate.
- Around line 840-850: The PR review comment schema in providers/github.ts
currently requires path and commit_id and omits line, which conflicts with
GitHub's conditional rules; update the schema used to build the mapping (the
mapping object and the JSON schema consumed by convertToolSchemaToZod) to:
remove path and commit_id from the required array (make them optional), add line
to required so inline comments validate, and document (near
get_pr_review_comments and the schema) that runtime validation on the API
gateway/executor must enforce GitHub-only rules (replies must only use body and
must not require path/commit_id; inline comments require line unless
subject_type === "file"; start_line requires start_side). If you prefer, instead
enhance convertToolSchemaToZod to support conditional schemas (if/then/else) so
the JSON Schema can express those GitHub-specific conditions; also note
get_pr_review_comments does not return commit_id so callers cannot construct
replies without runtime enrichment.
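The conditional rules in the comment above (replies carry only body; inline comments need path, commit_id, and line unless subject_type is "file") can be sketched as a plain runtime guard, since the schema converter may not support if/then. This is a hypothetical illustration; the field names mirror the review discussion, and the function itself does not exist in the PR:

```typescript
// Illustrative runtime guard for create_pr_review_comment-style inputs.
interface ReviewCommentInput {
  body: string;
  in_reply_to?: number; // reply mode: only body is used
  path?: string;
  commit_id?: string;
  line?: number;
  subject_type?: "line" | "file";
}

// Returns an error message, or null when the input is acceptable.
function validateReviewComment(input: ReviewCommentInput): string | null {
  if (input.in_reply_to !== undefined) {
    // Replies must not carry inline-position fields.
    if (input.path || input.commit_id || input.line !== undefined) {
      return "reply comments take only body and in_reply_to";
    }
    return null;
  }
  // Top-level inline comments need a commit and file to anchor to.
  if (!input.path || !input.commit_id) {
    return "inline comments require path and commit_id";
  }
  // line is required unless the comment targets the whole file.
  if (input.subject_type !== "file" && input.line === undefined) {
    return "inline comments require line unless subject_type is 'file'";
  }
  return null;
}
```

Pushing these checks into the executor keeps the JSON Schema simple while still rejecting payloads GitHub would answer with a 422.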

---

Outside diff comments:
In `@packages/lib/src/integrations/providers/github.ts`:
- Line 426: The outputTransform for get_issues drops the pull_request discriminator from GitHub's /repos/{owner}/{repo}/issues response. Update the mapping to preserve the pull_request field (at minimum pull_request.html_url) alongside the existing fields (id, number, title, body, user, labels, state) so callers can distinguish PRs from issues.
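A hedged sketch of the adjusted mapping (`transformIssue` and the output field name `pull_request_url` are illustrative; the actual transform uses the provider's mapping DSL in github.ts):

```typescript
// Field names mirror GitHub's REST /issues response; the transform is a sketch.
interface GitHubIssue {
  id: number;
  number: number;
  title: string;
  state: string;
  pull_request?: { html_url: string };
}

function transformIssue(issue: GitHubIssue) {
  return {
    id: issue.id,
    number: issue.number,
    title: issue.title,
    state: issue.state,
    // Preserve the discriminator so callers can tell PRs from plain issues:
    // GitHub returns PRs from the /issues endpoint with a pull_request stub.
    pull_request_url: issue.pull_request?.html_url ?? null,
  };
}
```

A null `pull_request_url` then means "plain issue," while any URL means the item is really a pull request.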

---

Nitpick comments:
In @.claude/ralph-loop.local.md:
- Line 10: Update the SUCCESS CRITERIA and PROCESS sections to explicitly require running and verifying the relevant DB migration tasks whenever migrations are involved. Add a bullet to SUCCESS CRITERIA: "Relevant DB migration tasks run and verified (e.g., pnpm db:migrate / pnpm prisma:migrate or repo-specific db:* commands) and migration state is committed/validated." Augment PROCESS step 0 (Pre-flight branch health) to run the repo's DB migration checks (e.g., pnpm db:status, pnpm db:migrate:ci) after syncing with base and before running build/typecheck/lint. The existing headings "SUCCESS CRITERIA", "PROCESS", and "step 0) Pre-flight branch health" mark where to add these lines.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 481ea7fa-5379-467e-937f-369e50721b94

📥 Commits

Reviewing files that changed from the base of the PR and between 54004f6 and 7770184.

📒 Files selected for processing (5)
  • .claude/ralph-loop.local.md
  • apps/web/src/lib/ai/tools/github-import-tools.ts
  • packages/lib/package.json
  • packages/lib/src/integrations/providers/github.test.ts
  • packages/lib/src/integrations/providers/github.ts
✅ Files skipped from review due to trivial changes (2)
  • packages/lib/package.json
  • apps/web/src/lib/ai/tools/github-import-tools.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/lib/src/integrations/providers/github.test.ts

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
apps/web/src/lib/ai/tools/github-import-tools.ts (1)

111-115: ⚠️ Potential issue | 🔴 Critical

Enforce drive ownership for drive-scoped GitHub connections.

At Line 111, the comment says “user-scoped or their drive,” but only userId is enforced. If a connection has no userId and is drive-scoped, Line 431 can still resolve and decrypt it without verifying driveId, which is a cross-drive token exposure risk.

🔒 Proposed fix
-async function resolveGitHubToken(
-  connectionId: string,
-  userId: string
-): Promise<string> {
+async function resolveGitHubToken(
+  connectionId: string,
+  userId: string,
+  driveId: string
+): Promise<string> {
   const connection = await getConnectionWithProvider(db, connectionId);
   if (!connection) {
@@
   // Verify the connection belongs to the caller (user-scoped) or their drive
   const conn = connection as unknown as { userId?: string; driveId?: string };
   if (conn.userId && conn.userId !== userId) {
     throw new Error('GitHub connection does not belong to this user.');
   }
+  if (conn.driveId && conn.driveId !== driveId) {
+    throw new Error('GitHub connection does not belong to this drive.');
+  }
@@
-      const token = await resolveGitHubToken(connectionId, userId);
+      const token = await resolveGitHubToken(connectionId, userId, driveId);

Also applies to: 430-431

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@apps/web/src/lib/ai/tools/github-import-tools.ts` around lines 111 - 115, The
current check only enforces conn.userId but ignores drive-scoped connections: in
the connection validation (where connection is cast to { userId?: string;
driveId?: string } and assigned to conn) add a check that when conn.userId is
falsy and a driveId parameter is provided, conn.driveId must equal the caller's
driveId; if it does not, throw an Error (similar to the existing user-scoped
error) so drive-scoped GitHub tokens cannot be used across drives (update the
validation around the conn/userId check to also verify conn.driveId when userId
is absent, and throw on mismatch before any resolve/decrypt steps).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/web/src/lib/ai/tools/github-import-tools.ts`:
- Around line 408-409: Normalize and validate numeric inputs before use: for
maxFiles compute a safe integer from rawMaxFiles by coercing to Number, using
Math.floor (or Number.isInteger check), falling back to DEFAULT_MAX_FILES when
invalid, and clamping between 1 and MAX_FILES_LIMIT (avoid 0, negatives, and
floats) so the variable maxFiles is always a positive integer; for pullNumber
parseInt/Number the incoming value, ensure Number.isInteger and >0 before using
it (don't rely on truthiness), and update any checks that used pullNumber as a
boolean to explicit numeric validation so only valid integer PR numbers are
passed to API calls and slicing.
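The normalization described above can be sketched as follows (MAX_FILES_LIMIT = 50 mirrors the PR summary's 1..50 clamp; DEFAULT_MAX_FILES = 20 is an assumed default, not taken from the actual tool):

```typescript
const DEFAULT_MAX_FILES = 20; // assumption: real default lives in github-import-tools.ts
const MAX_FILES_LIMIT = 50;

// Coerce, floor, and clamp so maxFiles is always a positive integer in 1..50.
function normalizeMaxFiles(raw: unknown): number {
  const n = Math.floor(Number(raw));
  if (!Number.isFinite(n)) return DEFAULT_MAX_FILES;
  return Math.min(Math.max(n, 1), MAX_FILES_LIMIT);
}

// Accept only positive integers; reject floats, zero, negatives, and NaN
// with an explicit check instead of relying on truthiness.
function parsePullNumber(raw: unknown): number | null {
  const n = Number(raw);
  return Number.isInteger(n) && n > 0 ? n : null;
}
```

Explicit validation matters because `if (pullNumber)` would wrongly reject PR number 0 patterns and wrongly accept values like 1.5 or "7abc" after loose coercion.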


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7e93577a-7fd4-4cf7-9829-5f74ad55e694

📥 Commits

Reviewing files that changed from the base of the PR and between 7770184 and 9181523.

📒 Files selected for processing (1)
  • apps/web/src/lib/ai/tools/github-import-tools.ts

- Enforce driveId ownership for drive-scoped GitHub connections
- Add pull_request.html_url to get_issues output (distinguish PRs from issues)
- Default get_repo_tree to recursive=1 (always returns full tree)
- Remove unused recursive input param from get_repo_tree schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@2witstudios
Owner Author

Addressed Round 2 Review Feedback (245dcf8)

From CodeRabbit review at 17:00 (4038769191):

  1. get_issues missing PR discriminator — Added pull_request_url: 'pull_request.html_url' to the output mapping so callers can distinguish PRs from issues.

  2. get_repo_tree should default to recursive — Changed recursive from an optional user parameter to a static "1" query param. The tool now always returns the full recursive tree. Updated description to mention truncation for large repos.

  3. ralph-loop.local.md items — These are about the loop config file, not the PR code. Skipped.

  4. create_pr_review_comment conditional schema complexity — The jsonSchemaToZod converter doesn't support if/then conditional schemas. We've already added commit_id and line to required. The current schema correctly enforces what GitHub requires for the most common case (inline comments on specific lines). File-level comments via subject_type: 'file' are an edge case that works without line.

From CodeRabbit review at 17:10 (4038821462):

  1. Drive-scoped connection ownership — Added driveId parameter to resolveGitHubToken. Now verifies conn.driveId === driveId for drive-scoped connections, preventing cross-drive token exposure.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/lib/src/integrations/providers/github.ts (1)

1178-1210: ⚠️ Potential issue | 🟠 Major

Align create_pr_review_comment with GitHub's actual contract.

line is still optional even though this tool is described as commenting on a specific line, and GitHub requires line unless subject_type=file. commit_id is also described as defaulting even though GitHub marks it required for this endpoint. That leaves the schema/docs internally inconsistent and still lets callers walk into API-side validation errors. (docs.github.com)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/src/integrations/providers/github.ts` around lines 1178 - 1210,
The create_pr_review_comment schema currently misstates required fields: update
the JSON/schema and validation in create_pr_review_comment so that commit_id is
marked required and line is required unless subject_type === 'file' (i.e., allow
omitting line only when subject_type is 'file'); ensure the required array
includes 'commit_id' and adjust runtime validation to enforce line presence for
line-scoped comments and keep the docs/descriptions consistent with GitHub's
REST contract for the fields 'line', 'commit_id', and 'subject_type'.
🧹 Nitpick comments (1)
packages/lib/src/integrations/providers/github.test.ts (1)

354-377: Add a no-path request-builder case for get_repo_content.

This block proves path is optional in the schema, but it never checks what buildHttpRequest produces when path is actually omitted. A regression in the path-template handling would miss the repository-root flow entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/src/integrations/providers/github.test.ts` around lines 354 -
377, Add a test that calls buildHttpRequest (using the same config via
tool.execution as HttpExecutionConfig) with input containing owner and repo but
no path (e.g., { owner: 'acme', repo: 'webapp' }) and assert the resulting URL
targets the repo root contents endpoint (e.g., contains
'/repos/acme/webapp/contents' and does not append an extra '/' or path segment);
this ensures get_repo_content handles the omitted path case correctly and
prevents regressions in the path-template handling.
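The suggested no-path case can be sketched as below (`contentsUrl` stands in for the real buildHttpRequest path-template handling, which is an assumption here, not the actual implementation):

```typescript
// Illustrative path-template expansion for the contents endpoint.
function contentsUrl(base: string, owner: string, repo: string, path?: string): string {
  const root = `${base}/repos/${owner}/${repo}/contents`;
  // No trailing slash when path is omitted: GitHub treats /contents as
  // the repository-root listing.
  return path ? `${root}/${path}` : root;
}
```

A test over both branches catches the regression the review warns about: a template that blindly appends `/{path}` would emit a trailing slash for the root listing.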
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/lib/src/integrations/providers/github.ts`:
- Around line 948-965: The update_issue schema currently restricts valid GitHub
parameters: extend the state_reason enum in the schema (property state_reason)
to include "duplicate" in addition to "completed", "not_planned", and
"reopened", and change the milestone property to accept null as well as integers
(i.e., allow type integer or null) so callers can clear an existing milestone;
update the schema definition for update_issue (properties state_reason and
milestone) accordingly so valid GitHub API update payloads are not rejected.
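A sketch of the widened schema fragment (the object layout is illustrative; the enum values and nullable milestone follow GitHub's REST contract for updating an issue):

```typescript
// Illustrative fragment of the update_issue input schema after the fix.
const updateIssueProperties = {
  state_reason: {
    type: 'string',
    // 'duplicate' added alongside the previously accepted values.
    enum: ['completed', 'not_planned', 'reopened', 'duplicate'],
  },
  milestone: {
    // integer or null, so callers can clear an existing milestone.
    type: ['integer', 'null'],
  },
};
```

Allowing `null` here is the JSON Schema idiom for "send an explicit null to clear the field," which a bare integer type would reject before the request ever reaches GitHub.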


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 69c113bb-3a72-47dd-9a5f-3fbec6ecd5fa

📥 Commits

Reviewing files that changed from the base of the PR and between 9181523 and 245dcf8.

📒 Files selected for processing (3)
  • apps/web/src/lib/ai/tools/github-import-tools.ts
  • packages/lib/src/integrations/providers/github.test.ts
  • packages/lib/src/integrations/providers/github.ts
✅ Files skipped from review due to trivial changes (1)
  • apps/web/src/lib/ai/tools/github-import-tools.ts

… path test

- Add line to required fields in create_pr_review_comment schema
- Update description with guidance on where to get commit_id and line
- Add test for get_repo_content with omitted path (repo root listing)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@2witstudios
Owner Author

Addressed Round 3 Review Feedback (cc61a5a)

From CodeRabbit review at 18:49 (4039402887):

  1. create_pr_review_comment — line now required — Added line to the required array alongside commit_id. Updated tool description to guide agents: "Requires commit_id (use head_sha from get_pull_request) and line number (from get_pr_diff)." For the rare subject_type: 'file' case, the schema converter doesn't support conditional validation — agents can still pass a dummy line value.

  2. No-path test for get_repo_content — Added a test verifying buildHttpRequest produces the correct URL when path is omitted (repo root listing).

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/lib/src/integrations/providers/github.test.ts (1)

686-708: Test input missing commit_id despite being a required field.

The test correctly asserts commit_id is required (line 683), but the test input at lines 688-696 omits it. While this works because buildHttpRequest doesn't enforce schema validation, adding commit_id to the test input would better demonstrate a realistic API call and ensure the request body includes it.

♻️ Suggested improvement to include commit_id
     it('given a single-line comment, should build correct POST', () => {
       const config = (tool.execution as { config: HttpExecutionConfig }).config;
       const input = {
         owner: 'acme',
         repo: 'webapp',
         pull_number: 7,
         body: 'Consider using `const` here',
         path: 'src/index.ts',
+        commit_id: 'abc123def456',
         line: 42,
         side: 'RIGHT',
       };

       const result = buildHttpRequest(config, input, 'https://api.github.com');

       expect(result.method).toBe('POST');
       expect(result.url).toBe('https://api.github.com/repos/acme/webapp/pulls/7/comments');

       const body = JSON.parse(result.body!);
       expect(body.body).toBe('Consider using `const` here');
       expect(body.path).toBe('src/index.ts');
+      expect(body.commit_id).toBe('abc123def456');
       expect(body.line).toBe(42);
       expect(body.side).toBe('RIGHT');
     });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/src/integrations/providers/github.test.ts` around lines 686 -
708, The test "given a single-line comment, should build correct POST" omits the
required commit_id in the input; update the input object used with
buildHttpRequest to include a realistic commit_id (e.g., a SHA string) and add
an assertion that JSON.parse(result.body!).commit_id equals that value so the
test both reflects a real API call and verifies commit_id is included; locate
the input object in the same test and the buildHttpRequest invocation to make
the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/lib/src/integrations/providers/github.test.ts`:
- Around line 650-671: The second comment object in the test's input comments
array is missing the required line property which violates the comments schema;
update the test input (the comments array in the it block that calls
buildHttpRequest) to include a numeric line value for the second comment (e.g.,
add "line: <number>" to the { path: 'src/utils.ts', body: 'Missing error
handling' } object) so that buildHttpRequest / the request body validation and
the assertions (expect(body.comments)...) conform to the required
['path','body','line'] fields.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e5028581-a172-405f-9759-3df5724c30ce

📥 Commits

Reviewing files that changed from the base of the PR and between 245dcf8 and cc61a5a.

📒 Files selected for processing (2)
  • packages/lib/src/integrations/providers/github.test.ts
  • packages/lib/src/integrations/providers/github.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/lib/src/integrations/providers/github.ts

2witstudios and others added 6 commits March 31, 2026 14:16
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.claude/*.local.md files are ephemeral loop state and should not be committed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Batch importedAt timestamp per operation instead of per-file, refactor
processDirectory to return results instead of mutating closed-over state,
remove unsafe double type assertion in resolveGitHubToken, and add 54
tests for github-import-tools and language-detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use Record<string, unknown> instead of narrow type assertion for
extractionMetadata access in test, fixing CI typecheck failure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- update_issue: add 'duplicate' to state_reason enum, allow null milestone
- create_pr_review_comment: relax required fields to support reply mode
  (only owner/repo/pull_number/body required; path/commit_id/line optional)
- github-import-tools: normalize maxFiles (floor, clamp 1..50), validate
  pullNumber is a positive integer

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CI typecheck requires casting through unknown before narrowing
CreatePageInput to Record types in test assertions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@2witstudios 2witstudios merged commit 84084de into master Apr 1, 2026
10 checks passed