Add excerpt/word count to posts by fennifith · Pull Request #126 · playfulprogramming/hoof

fennifith · 2026-03-22T12:44:52Z

Fixes #99

Summary by CodeRabbit

New Features
- Posts now track and persist word count
- Markdown excerpts auto-populate post descriptions when missing
- Added "coderpad" as a supported license option
Tests
- Added tests validating markdown excerpt extraction behavior
Chores
- Added markdown/AST processing dependencies
- Database schema extended with a word count column
Validation
- Metadata schema tightened with stricter date/url formats and optional description

coderabbitai · 2026-03-22T12:45:09Z

Warning

Rate limit exceeded

@fennifith has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 10 minutes and 9 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 68ec0b21-9b92-49ba-8d7b-be7692f10628

📥 Commits

Reviewing files that changed from the base of the PR and between eeef5b7 and 478f753.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (2)

apps/worker/package.json
apps/worker/src/utils/extractMarkdownExcerpt.ts

📝 Walkthrough

Walkthrough

Adds markdown excerpt extraction and word-count calculation to post sync: new extractor util and tests, markdown/AST deps, processor changes to derive description and wordCount, DB migration and schema update adding post_data.word_count, and stricter post metadata schema fields.
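A minimal sketch of the description fallback described in the walkthrough, not the PR's actual code. `PostFrontmatter`, `ExcerptFn`, `resolveDescription`, and the 160-character default are all hypothetical names and values chosen for illustration. The excerpt function is injected, so the sketch does not depend on unified/remark. Treating a blank description the same as a missing one is an assumption.

```typescript
// Hypothetical shape; the real metadata schema lives in
// apps/worker/src/tasks/sync-post/types.ts and may differ.
interface PostFrontmatter {
  title: string;
  description?: string; // optional, per the tightened schema
}

// Hypothetical signature standing in for extractMarkdownExcerpt.
type ExcerptFn = (markdown: string, maxLength: number) => string;

// Prefer the authored description; otherwise derive one from the body.
function resolveDescription(
  frontmatter: PostFrontmatter,
  content: string,
  extractExcerpt: ExcerptFn,
  maxLength = 160, // assumed limit, not taken from the PR
): string {
  const authored = frontmatter.description?.trim();
  // Assumption: a blank description counts as missing.
  if (authored) return authored;
  return extractExcerpt(content, maxLength);
}
```

Passing the extractor as a parameter also makes the fallback branch easy to cover in a unit test with a stub.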

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Markdown excerpt util: `apps/worker/src/utils/extractMarkdownExcerpt.ts`, `apps/worker/src/utils/extractMarkdownExcerpt.test.ts` | New `extractMarkdownExcerpt` exported function that parses markdown with unified/remark, walks the AST, builds a plaintext excerpt with optional truncation and ellipsis. Includes 3 unit tests (format stripping, list handling, truncation). |
| Worker dependencies: `apps/worker/package.json` | Added markdown/AST processing packages: `unified`, `remark-parse`, `remark-gfm`, `hast-util-to-string`, `unist-util-is`. |
| Post sync processor: `apps/worker/src/tasks/sync-post/processor.ts`, `apps/worker/src/tasks/sync-post/processor.test.ts` | Processor now parses markdown (gray-matter) to get content, fills missing description from excerpt, computes `wordCount` (whitespace-split), threads `wordCount` through localeData and persists it. Test updated to expect `wordCount: 10`. |
| Post metadata types: `apps/worker/src/tasks/sync-post/types.ts` | Tighter schema: `published` and `edited` use `date-time` format, `originalLink` uses `url` format, `description` became optional (no default), and `license` union adds `coderpad`. |
| Database schema & migration: `packages/db/src/schema/posts.ts`, `packages/db/drizzle/20260322121256_illegal_rick_jones/migration.sql`, `packages/db/drizzle/.../snapshot.json` | Added `word_count` (integer NOT NULL DEFAULT 0) to `post_data` via new Drizzle migration; schema export updated and snapshot added. |
| Workspace catalog: `pnpm-workspace.yaml` | Minor reordering of `catalog` entries (typebox entry moved). |
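The excerpt row above describes walking a markdown AST and building a truncated plaintext string. A rough sketch of that idea follows, using a hand-built tree so that it runs without unified/remark. The node interfaces are simplified stand-ins for mdast types, and `collectText` and `excerptFromTree` are hypothetical names. Character-based truncation and the single "…" character are assumptions, not details confirmed by the PR.

```typescript
// Simplified stand-ins for mdast nodes (assumption: the real types come
// from the "mdast" package and carry more fields).
interface TextNode {
  type: "text" | "inlineCode" | "code";
  value: string;
}
interface ParentNode {
  type: "root" | "paragraph" | "heading" | "emphasis" | "strong" | "link" | "list" | "listItem";
  children: MdNode[];
}
type MdNode = TextNode | ParentNode;

function isParent(node: MdNode): node is ParentNode {
  return "children" in node;
}

// Block-level nodes whose text should be separated from what follows.
const BLOCK_TYPES = new Set<string>(["paragraph", "heading", "listItem"]);

// Depth-first walk that gathers literal text and drops formatting nodes.
function collectText(node: MdNode, parts: string[]): void {
  if (!isParent(node)) {
    parts.push(node.value);
    return;
  }
  for (const child of node.children) collectText(child, parts);
  if (BLOCK_TYPES.has(node.type)) parts.push(" ");
}

// Join the text, collapse whitespace, and truncate with an ellipsis.
function excerptFromTree(root: MdNode, maxLength: number): string {
  const parts: string[] = [];
  collectText(root, parts);
  const text = parts.join("").replace(/\s+/g, " ").trim();
  if (text.length <= maxLength) return text;
  return text.slice(0, maxLength).trimEnd() + "…";
}
```

Note the camel-cased `listItem`: mdast spells it that way, so a guard that checks for `listitem` never matches a list item.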

Sequence Diagram(s)

mermaid
sequenceDiagram
participant GitHub as GitHub (repo/content)
participant Worker as Worker Processor
participant Extractor as extractMarkdownExcerpt
participant DB as Database (post_data)
GitHub->>Worker: Provide post file + frontmatter
Worker->>Extractor: Parse markdown, extract excerpt & compute wordCount
Extractor-->>Worker: Excerpt, wordCount
Worker->>DB: Insert/Update post_data (includes description, rawMarkdown, wordCount)
DB-->>Worker: Ack

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

update drizzle?? #111: Modifies packages/db/src/schema/posts.ts / DB schema; likely related to the word_count column changes.
initial e2e tests #113: Changes worker dependencies (gray-matter/sharp resolution); related to added markdown/AST packages in apps/worker/package.json.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main changes: adding word count and excerpt functionality to posts. |
| Linked Issues check | ✅ Passed | All objectives from issue `#99` are met: word count column added to database and computed during sync, excerpt extraction implemented via markdown parsing, and logic reused/uplifted as requested. |
| Out of Scope Changes check | ✅ Passed | All changes directly support the stated objective. Dependency additions, schema updates, processor changes, and new utility functions are all necessary to implement excerpt/word count functionality. |


coderabbitai

Actionable comments posted: 5

🧹 Nitpick comments (2)

apps/worker/src/tasks/sync-post/processor.test.ts (1)

61-73: Add one integration case without frontmatter description.

This fixture still exercises only the authored-description path. A body-only markdown case would lock in the generated excerpt fallback and also cover the new simple-body word-count path.

Also applies to: 99-115

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@apps/worker/src/tasks/sync-post/processor.test.ts` around lines 61 - 73, Add
a new integration test in processor.test.ts that uses a markdown fixture missing
the frontmatter "description" field (only frontmatter title/tags/date or even no
frontmatter) and a body-only markdown content; in the new test assert that the
processor falls back to generating an excerpt from the body (excerpt equals
expected first N words/sentence) and that the word-count path uses the
simple-body fallback (check whatever function output/field indicates wordCount
or excerpt source). Locate nearby tests that use the `data` fixture and mirror
their setup/assertions (same test suite) so the new case covers the
authored-description-missing branch and the simple-body word-count/excerpt
fallback.

packages/db/drizzle/20260322121256_illegal_rick_jones/snapshot.json (1)

394-405: Enforce non-negative word_count at the database layer.

Line 394 defines word_count as integer with default 0, but without a check constraint negative values can still be written.

Suggested constraint

+ALTER TABLE "post_data"
+ADD CONSTRAINT "post_data_word_count_non_negative_chk"
+CHECK ("word_count" >= 0);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@packages/db/drizzle/20260322121256_illegal_rick_jones/snapshot.json` around
lines 394 - 405, The snapshot shows the post_data column word_count is an integer
with default 0 but no constraint preventing negatives; add a database-level
CHECK constraint (e.g., ck_post_data_word_count_nonnegative or
post_data_word_count_nonnegative) that enforces word_count >= 0 and include it
in the schema/migration that defines/updates the post_data table (or in the
Drizzle table definition for post_data) so the constraint is applied at the DB
level while keeping the default 0.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@apps/worker/src/tasks/sync-post/processor.ts`:
- Around line 101-106: Replace the current word-count calculation that splits
raw markdown (content.split(/\s+/).length) with a count derived from extracted
plain text: use the parsed plain-text field (parsed.text or whichever parsed
property holds plaintext) falling back to a markdown-stripped version of
content, call .trim() and then if the resulting string is empty set wordCount =
0, otherwise set wordCount = plain.split(/\s+/).length; update the assignment of
wordCount (the variable in the processor function) accordingly so empty-body
fixtures in processor.test.ts produce 0.

In `@apps/worker/src/utils/extractMarkdownExcerpt.ts`:
- Around line 7-20: The code is using hast types/utilities while
unifiedChain.parse returns mdast; update the imports and type checks to mdast
equivalents: import types from "mdast" (use mdast Content/Nodes types) and
import toString from "mdast-util-to-string" instead of hast versions, and ensure
unifiedChain is typed for mdast; in the type guard function isTextOrCode (and
any use of is(...)), replace hast node names with mdast names (e.g., "listitem"
-> "listItem") and adjust the union of node kinds to mdast node types like
"text", "code", "link", "listItem", "inlineCode" so the list-item branch becomes
reachable and type-safe.

In `@packages/db/drizzle/20260322121256_illegal_rick_jones/snapshot.json`:
- Around line 381-405: The snapshot for the "post_data" table is missing a
dedicated "excerpt" column: add a new not-null text column named "excerpt"
(similar to "description") to the post_data schema alongside "word_count" and
"description", update whatever migration or schema definition that defines the
post_data table (e.g., the createTable or table schema that contains post_data)
to include excerpt, run the migration/regeneration step to rebuild Drizzle
artifacts and regenerate snapshot.json so the new "excerpt" column appears in
the snapshot.

In `@packages/db/src/schema/posts.ts`:
- Around line 32-34: The posts schema lacks an excerpt column so generated
excerpts are never persisted: add a new column (e.g., excerpt:
text("excerpt").notNull().default("") or nullable per product choice) alongside
description and wordCount in packages/db/src/schema/posts.ts, add a
corresponding DB migration to create the excerpt column, and update all
write/read paths that populate post_data (including the worker that currently
only backfills description) so it writes the excerpt regardless of whether an
authored description exists and reads it when returning posts.

In `@pnpm-workspace.yaml`:
- Around line 6-15: Reformat pnpm-workspace.yaml with the repository’s Prettier
configuration so it matches CI expectations: open pnpm-workspace.yaml and run
the project’s formatter (the repo’s Prettier/format script) to rewrite the file
using the repo config, then stage and commit the updated file so CI no longer
reports formatting errors.

---

Nitpick comments:
In `@apps/worker/src/tasks/sync-post/processor.test.ts`:
- Around line 61-73: Add a new integration test in processor.test.ts that uses a
markdown fixture missing the frontmatter "description" field (only frontmatter
title/tags/date or even no frontmatter) and a body-only markdown content; in the
new test assert that the processor falls back to generating an excerpt from the
body (excerpt equals expected first N words/sentence) and that the word-count
path uses the simple-body fallback (check whatever function output/field
indicates wordCount or excerpt source). Locate nearby tests that use the `data`
fixture and mirror their setup/assertions (same test suite) so the new case
covers the authored-description-missing branch and the simple-body
word-count/excerpt fallback.

In `@packages/db/drizzle/20260322121256_illegal_rick_jones/snapshot.json`:
- Around line 394-405: The snapshot shows the post_data column word_count is an
integer with default 0 but no constraint preventing negatives; add a
database-level CHECK constraint (e.g., ck_post_data_word_count_nonnegative or
post_data_word_count_nonnegative) that enforces word_count >= 0 and include it
in the schema/migration that defines/updates the post_data table (or in the
Drizzle table definition for post_data) so the constraint is applied at the DB
level while keeping the default 0.
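The word-count fix requested for `processor.ts` above can be sketched as follows. This is an illustration under stated assumptions rather than the PR's code: `countWords` is a hypothetical helper, and it assumes the caller passes plain text that has already had its markdown syntax stripped.

```typescript
// Count whitespace-separated words in already-extracted plain text.
// Assumption: markdown syntax has been stripped before this is called.
function countWords(plainText: string): number {
  const trimmed = plainText.trim();
  // "".split(/\s+/) yields [""] (length 1), so an empty body must be
  // handled explicitly to produce 0.
  if (trimmed === "") return 0;
  return trimmed.split(/\s+/).length;
}
```

Splitting raw markdown instead counts syntax tokens as words: `"## Title".split(/\s+/)` has length 2 because `##` is treated as a word.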

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 05f27e3f-789b-4937-a3d3-018d843f4134

📥 Commits

Reviewing files that changed from the base of the PR and between 66ba7e1 and 6fd8ff3.

⛔ Files ignored due to path filters (1)

pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (10)

apps/worker/package.json
apps/worker/src/tasks/sync-post/processor.test.ts
apps/worker/src/tasks/sync-post/processor.ts
apps/worker/src/tasks/sync-post/types.ts
apps/worker/src/utils/extractMarkdownExcerpt.test.ts
apps/worker/src/utils/extractMarkdownExcerpt.ts
packages/db/drizzle/20260322121256_illegal_rick_jones/migration.sql
packages/db/drizzle/20260322121256_illegal_rick_jones/snapshot.json
packages/db/src/schema/posts.ts
pnpm-workspace.yaml


fennifith added 3 commits March 22, 2026 12:13

- add word count and excerpt handling for post data (cd037ff)
- small fixes to post type (118f5c6)
- Merge branch 'main' into post-word-count (6fd8ff3)
- fix pnpm-workspace.yaml formatting (eeef5b7)

coderabbitai bot reviewed Mar 22, 2026


- swap extractMarkdownExcerpt to mdast utilities (478f753)

fennifith enabled auto-merge March 22, 2026 13:05

fennifith wants to merge 5 commits into main from post-word-count
