Add missing MCP tools for full REST API parity#267
Conversation
Update all documentation and READMEs to consistently describe search as supporting semantic, keyword, and hybrid modes instead of only semantic. Changes: - Expand feature descriptions to include keyword and hybrid search - Add SDK usage examples (Python, TypeScript/JS, Java) for keyword and hybrid search modes - Add full-text indexing table to long-term memory docs - Fix deprecated function reference in agent examples - Fix tuple unpacking order in error handling example - Update search capability descriptions across all guides Context: The server already supports keyword (BM25) and hybrid (RRF) search modes via the search_mode parameter, but documentation only referenced semantic search. This brings docs in line with actual capabilities. Refs: #251, #253
Forward search_mode, offset, and optimize_query through resolve_tool_call, MCP tools, and hydrate_memory_prompt. Remove hybrid_alpha and text_scorer from LLM-facing interfaces (MCP, tool schemas, LangChain) since these are server-level hyperparameters that LLMs cannot reliably tune. Changes: - Forward search_mode, offset, optimize_query in _resolve_search_memory - Add search_mode to MCP memory_prompt and search_long_term_memory - Remove hybrid_alpha and text_scorer from MCP tools and client tool schemas (kept in developer-facing hydrate_memory_prompt) - Align max_results default from 5 to 10 in _resolve_search_memory - Add distance_threshold validation for non-semantic search modes - Fix deprecated function references in tool descriptions and docstrings (search_memory -> search_memory_tool, get_working_memory -> get_or_create_working_memory) - Correct optimize_query docstrings in API and MCP - Make error message name-agnostic in client - Update MCP docs for search mode support Tests: - Add test for search_mode forwarding through resolve_function_call - Add test for search_mode forwarding through LangChain tool - Both tests verify hybrid_alpha/text_scorer are NOT in tool schemas - Update MCP parameter passing tests Design context (from #251 discussion): - @nkanu17: Hyperparameters like alpha, recency, and scorer config should be set at the server/session level, not by the LLM - @abrookins: MCP interface is intentionally smaller than REST API. LLMs can reason about search mode selection (semantic vs keyword vs hybrid) but struggle with arbitrary numeric tuning values - @tylerhutcherson: Questioned whether LLMs should toggle these params, leading to the decision to expose only search_mode - @vishal-bala: Cataloged full MCP/REST gaps in #253, confirmed remaining items are out of scope for this PR Refs: #251, #253
Move search_mode, hybrid_alpha, and text_scorer after limit, offset, and optimize_query to avoid breaking existing callers that pass those params positionally.
…d client - Forward event_date, recency_boost, recency_*_weight, recency_half_life_*, and server_side_recency through MCP search_long_term_memory and memory_prompt - Add same params to Python client hydrate_memory_prompt (at end of signature to preserve positional argument order) - Fix namespace injection: target 'memory_prompt' instead of 'hydrate_memory_prompt' - Import EventDate filter in mcp.py
Phase 2 of #253: Adds MCP tool equivalents for all REST endpoints that were previously inaccessible from the MCP interface. Expanded existing tools: - get_working_memory: added model_name and context_window_max params - memory_prompt: supports session-only, long_term_search shortcut, and explicit search payload modes New tools: - forget_memories: bulk delete by policy (before_date, after_date, inactive_for_days) - list_sessions: list all sessions with pagination - delete_working_memory: delete working memory for a session - create_summary_view: create a SummaryView configuration - list_summary_views: list all SummaryView configs - get_summary_view: get a SummaryView by ID - delete_summary_view: delete a SummaryView configuration - run_summary_view_partition: sync compute a single partition - list_summary_view_partitions: list stored partition results - run_summary_view: async full recompute (returns Task) - get_task_status: poll background task status Updated namespace injection in call_tool to cover new tools.
| 1. Hydrate a user prompt with long-term memory search: | ||
| memory_prompt(query="What was my favorite color?") | ||
| ``` | ||
|
|
There was a problem hiding this comment.
The validation for distance_threshold with non-semantic search modes is missing in the MCP memory_prompt tool (line 733), but is present in the REST API endpoint (api.py line 665-672) and in search_long_term_memory MCP tool (line 733). This creates an inconsistency where invalid parameter combinations could pass through memory_prompt but fail in the REST API.
Suggested fix:
if distance_threshold is not None and search_mode != SearchModeEnum.SEMANTIC:
raise ValueError(
"distance_threshold is only supported for semantic search mode"
)
if distance_threshold is not None and search_mode != SearchModeEnum.SEMANTIC:
raise ValueError(
"distance_threshold is only supported for semantic search mode"
)This should be added immediately after line 733 in the memory_prompt function, matching the validation pattern used in both the REST API and the search_long_term_memory MCP tool.
| model_name=model_name, | ||
| context_window_max=context_window_max, | ||
| ) | ||
| if _calculate_messages_token_count(result.messages) > effective_token_limit: |
There was a problem hiding this comment.
The token-based message truncation logic in get_working_memory uses a simple loop that removes messages from the beginning. While this works, there's a potential edge case: if a single message exceeds effective_token_limit, the loop will reduce to just that message but still exceed the limit.
Current code:
while len(messages) > 1:
messages = messages[1:]
if _calculate_messages_token_count(messages) <= effective_token_limit:
breakThe loop condition len(messages) > 1 means it will keep at least one message even if that message alone exceeds the token limit. Consider adding validation or handling for this edge case to match the REST API behavior more closely.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.
| messages = messages[1:] | ||
| if _calculate_messages_token_count(messages) <= effective_token_limit: | ||
| break | ||
| result.messages = messages |
There was a problem hiding this comment.
Code duplication: token truncation reimplemented in MCP tool
Low Severity
The token-based message truncation logic in the MCP get_working_memory tool is a near-verbatim copy of the same logic in the REST API's get_working_memory endpoint. Both import and call _calculate_messages_token_count and _get_effective_token_limit with an identical while-loop truncation pattern. This duplicated logic increases the maintenance burden — any future bug fix or improvement to truncation must be applied in both places. Extracting this into a shared helper would be cleaner.
Additional Locations (1)
| if invalid_filters: | ||
| raise ValueError( | ||
| "Unsupported filter fields: " + ", ".join(sorted(invalid_filters)) | ||
| ) |
There was a problem hiding this comment.
Code duplication: SummaryView validation copied from REST API
Low Severity
The create_summary_view MCP tool duplicates the validation logic for source, group_by, and filters that already exists in the REST API endpoint. The allowed-field sets and error messages are copied inline. If allowed fields change (e.g., a new group_by option), both the REST handler and this MCP tool need to be updated independently, risking inconsistency.
🛡️ Jit Security Scan Results✅ No security findings were detected in this PR
Security scan by Jit
|
|
Closing this PR after revisiting the discussion on #251. The consensus from @abrookins, @tylerhutcherson, and @nkanu17 was that MCP should be a curated, smaller interface, not a 1:1 mirror of the REST API. Several tools added here (working memory management, session listing, SummaryView CRUD, task polling) go against that philosophy. Will reopen with a pared-down scope that only includes tools LLMs can meaningfully reason about. |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e018bfdc67
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| _plain_ns_tools = ( | ||
| "set_working_memory", | ||
| "get_working_memory", | ||
| "delete_working_memory", | ||
| "list_sessions", | ||
| "forget_memories", | ||
| "compact_long_term_memories", | ||
| ) |
There was a problem hiding this comment.
Inject namespace for list_summary_view_partitions
The namespace auto-injection added in call_tool() does not cover list_summary_view_partitions, even though that tool accepts a namespace argument. In namespaced transports (/{namespace}), this means partition listings are not scoped unless the caller explicitly passes namespace, so a request can return partitions from other namespaces and break tenant/data-scope expectations for the new MCP parity tools.
Useful? React with 👍 / 👎.
| limit=limit, | ||
| offset=offset, | ||
| namespace=namespace, |
There was a problem hiding this comment.
Enforce list_sessions pagination limits
list_sessions forwards limit/offset directly to the backend without bounds checks, unlike the REST path which constrains these values (limit 1-100, offset >= 0 via GetSessionsQuery). This lets MCP callers request very large or negative windows, which can produce expensive Redis queries or invalid pagination behavior that the REST API explicitly prevents.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Pull request overview
This PR completes MCP ↔ REST surface-area parity by adding MCP tools for previously REST-only endpoints (forgetting, sessions, SummaryViews, task polling) and expanding existing MCP/client tool interfaces to better match the server’s search + prompt-hydration capabilities.
Changes:
- Expanded MCP
memory_prompt/search_long_term_memoryretrieval controls (search_mode, event_date, recency options) and improved namespace injection behavior. - Added new MCP tools for forgetting, session listing/deletion, SummaryView CRUD/run/partitions, and task status polling.
- Updated client/tooling tests and documentation to reflect semantic/keyword/hybrid search and updated client method names.
Reviewed changes
Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
agent_memory_server/mcp.py |
Adds/extends MCP tools (forget/sessions/summary-views/tasks), expands search + memory_prompt params, updates namespace injection. |
agent_memory_server/api.py |
Docstring clarification: optimize_query applies to semantic search only. |
agent-memory-client/agent_memory_client/client.py |
Updates tool schemas/docs and forwards search_mode/offset/optimize_query through resolve paths; docstring clarifications. |
agent-memory-client/agent_memory_client/integrations/langchain.py |
Extends LangChain search tool to accept/forward search_mode; updates description. |
tests/test_mcp.py |
Extends MCP memory_prompt param-forwarding assertions to cover search_mode. |
tests/test_client_tool_calls.py |
Adds regression test ensuring search_mode is forwarded and hyperparams are omitted. |
agent-memory-client/tests/test_langchain_integration.py |
Adds test verifying LangChain wrapper forwards search_mode and omits hyperparams. |
README.md |
Updates feature/architecture docs to mention semantic/keyword/hybrid search. |
examples/README.md |
Updates example docs to mention semantic/keyword/hybrid search. |
docs/use-cases.md |
Updates guidance to reference semantic/keyword/hybrid search. |
docs/typescript-sdk.md |
Adds TS examples for keyword/hybrid search usage. |
docs/java-sdk.md |
Adds Java examples for keyword/hybrid search usage. |
docs/README.md |
Updates feature matrix wording for multi-mode search. |
docs/quick-start.md |
Updates quick-start narrative to mention keyword/hybrid search. |
docs/mcp.md |
Updates MCP docs to reflect multi-mode search. |
docs/long-term-memory.md |
Expands long-term memory docs to explain search modes and indexing. |
docs/index.md |
Updates landing page copy to mention multi-mode search. |
docs/agent-examples.md |
Updates agent examples text and corrects working-memory API usage in snippet. |
agent-memory-client/README.md |
Updates README examples to reflect new get_or_create_working_memory and multi-mode search. |
agent-memory-client/agent-memory-client-js/README.md |
Updates JS client README examples to include keyword/hybrid search. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| search_payload = SearchRequest( | ||
| text=query, | ||
| namespace=namespace, | ||
| topics=topics, | ||
| entities=entities, | ||
| created_at=created_at, | ||
| last_accessed=last_accessed, | ||
| user_id=user_id, | ||
| distance_threshold=distance_threshold, | ||
| memory_type=memory_type, | ||
| event_date=event_date, | ||
| search_mode=search_mode, | ||
| limit=limit, | ||
| offset=offset, | ||
| recency_boost=recency_boost, | ||
| recency_semantic_weight=recency_semantic_weight, | ||
| recency_recency_weight=recency_recency_weight, | ||
| recency_freshness_weight=recency_freshness_weight, | ||
| recency_novelty_weight=recency_novelty_weight, | ||
| recency_half_life_last_access_days=recency_half_life_last_access_days, | ||
| recency_half_life_created_days=recency_half_life_created_days, | ||
| server_side_recency=server_side_recency, | ||
| ) | ||
| _params["long_term_search"] = search_payload |
There was a problem hiding this comment.
memory_prompt currently always constructs and passes an explicit SearchRequest when include_long_term_search=True, so callers cannot access the REST /v1/memory/prompt long_term_search=true shortcut semantics (bool mode), which excludes the current session and uses the server’s default limit/offset behavior. If parity with REST is intended, consider adding a way to pass long_term_search=True through MCP (or emulate that behavior when no explicit long-term filters are provided).
| search_payload = SearchRequest( | |
| text=query, | |
| namespace=namespace, | |
| topics=topics, | |
| entities=entities, | |
| created_at=created_at, | |
| last_accessed=last_accessed, | |
| user_id=user_id, | |
| distance_threshold=distance_threshold, | |
| memory_type=memory_type, | |
| event_date=event_date, | |
| search_mode=search_mode, | |
| limit=limit, | |
| offset=offset, | |
| recency_boost=recency_boost, | |
| recency_semantic_weight=recency_semantic_weight, | |
| recency_recency_weight=recency_recency_weight, | |
| recency_freshness_weight=recency_freshness_weight, | |
| recency_novelty_weight=recency_novelty_weight, | |
| recency_half_life_last_access_days=recency_half_life_last_access_days, | |
| recency_half_life_created_days=recency_half_life_created_days, | |
| server_side_recency=server_side_recency, | |
| ) | |
| _params["long_term_search"] = search_payload | |
| # | |
| # Preserve REST parity: when the caller only enables long-term search and | |
| # does not provide any explicit long-term filters, forward the boolean | |
| # shortcut (`true`) so the server can apply its default limit/offset and | |
| # current-session exclusion behavior. If explicit long-term search options | |
| # are supplied, continue passing a concrete SearchRequest. | |
| has_explicit_long_term_filters = any( | |
| value is not None | |
| for value in ( | |
| namespace, | |
| topics, | |
| entities, | |
| created_at, | |
| last_accessed, | |
| user_id, | |
| distance_threshold, | |
| memory_type, | |
| event_date, | |
| search_mode, | |
| limit, | |
| offset, | |
| recency_boost, | |
| recency_semantic_weight, | |
| recency_recency_weight, | |
| recency_freshness_weight, | |
| recency_novelty_weight, | |
| recency_half_life_last_access_days, | |
| recency_half_life_created_days, | |
| server_side_recency, | |
| ) | |
| ) | |
| if has_explicit_long_term_filters: | |
| search_payload = SearchRequest( | |
| text=query, | |
| namespace=namespace, | |
| topics=topics, | |
| entities=entities, | |
| created_at=created_at, | |
| last_accessed=last_accessed, | |
| user_id=user_id, | |
| distance_threshold=distance_threshold, | |
| memory_type=memory_type, | |
| event_date=event_date, | |
| search_mode=search_mode, | |
| limit=limit, | |
| offset=offset, | |
| recency_boost=recency_boost, | |
| recency_semantic_weight=recency_semantic_weight, | |
| recency_recency_weight=recency_recency_weight, | |
| recency_freshness_weight=recency_freshness_weight, | |
| recency_novelty_weight=recency_novelty_weight, | |
| recency_half_life_last_access_days=recency_half_life_last_access_days, | |
| recency_half_life_created_days=recency_half_life_created_days, | |
| server_side_recency=server_side_recency, | |
| ) | |
| _params["long_term_search"] = search_payload | |
| else: | |
| _params["long_term_search"] = True |
| List working memory sessions with optional pagination and filtering. | ||
|
|
||
| This works like the `GET /v1/working-memory/` REST API endpoint. | ||
|
|
There was a problem hiding this comment.
The PR description lists list_sessions as REST-equivalent to GET /v1/sessions, but the server’s REST endpoint is GET /v1/working-memory/ (see agent_memory_server/api.py). Either the PR description should be corrected, or the tool/docs should be updated to match the actual REST route naming used by this repo.
| async def list_sessions( | ||
| limit: int = 20, | ||
| offset: int = 0, | ||
| namespace: str | None = None, | ||
| user_id: str | None = None, | ||
| ) -> SessionListResponse: |
There was a problem hiding this comment.
list_sessions docstring promises limit is 1–100, but the MCP tool does not enforce those bounds (unlike the REST path, which validates via GetSessionsQuery(limit=ge1,le100)). Please add explicit validation (and also validate offset >= 0) to avoid accidental large scans and to keep MCP behavior consistent with REST.
| Args: | ||
| policy: Forgetting policy dict with keys like max_age_days, max_inactive_days, | ||
| budget (number of top memories to keep), memory_type_allowlist | ||
| namespace: Optional namespace filter |
There was a problem hiding this comment.
The forget_memories docstring lists policy keys like max_age_days/max_inactive_days, but the PR description claims MCP supports before_date/after_date/inactive_for_days. The implementation delegates directly to forget_long_term_memories, which only supports the max_*_days/budget policy shape; please align the docs (or add translation/validation) so callers know the actual supported policy fields.
| # Validate keys (same logic as REST) | ||
| if payload.source != "long_term": | ||
| raise ValueError( | ||
| "SummaryView.source must be 'long_term' for now; " | ||
| "'working_memory' is not yet supported." | ||
| ) | ||
|
|
||
| allowed_group_by = {"user_id", "namespace", "session_id", "memory_type"} | ||
| allowed_filters = {"user_id", "namespace", "session_id", "memory_type"} | ||
|
|
||
| invalid_group = [k for k in payload.group_by if k not in allowed_group_by] | ||
| if invalid_group: | ||
| raise ValueError( | ||
| "Unsupported group_by fields: " + ", ".join(sorted(invalid_group)) | ||
| ) | ||
|
|
||
| invalid_filters = [k for k in payload.filters if k not in allowed_filters] | ||
| if invalid_filters: | ||
| raise ValueError( | ||
| "Unsupported filter fields: " + ", ".join(sorted(invalid_filters)) |
There was a problem hiding this comment.
create_summary_view duplicates the SummaryView key validation logic inline (allowed group_by/filters, source check). The REST API already centralizes this in _validate_summary_view_keys and uses it before persisting; to avoid MCP/REST drift over time, consider reusing the shared validation helper (or moving validation into summary_views / the request model) instead of maintaining two copies.
| @mcp_app.tool() | ||
| async def forget_memories( | ||
| policy: dict, | ||
| namespace: str | None = None, | ||
| user_id: str | None = None, | ||
| session_id: str | None = None, | ||
| limit: int = 1000, | ||
| dry_run: bool = True, | ||
| pinned_ids: list[str] | None = None, |
There was a problem hiding this comment.
New MCP tools added in this file (forget_memories, list_sessions, delete_working_memory, SummaryView CRUD/run, and get_task_status) don’t appear to have corresponding MCP tests. tests/test_mcp.py already exercises other MCP tools extensively, so adding coverage here would help prevent parity regressions (e.g., namespace injection, argument validation, and error handling).


Summary
Completes Phase 2 of #253: adds MCP tool equivalents for every REST endpoint that was previously only accessible via the HTTP API. This builds on PR #265 (Phase 1), which fixed search parameter forwarding.
Problem
The MCP server was missing tool equivalents for several REST API capabilities:
get_working_memorylackedmodel_nameandcontext_window_maxparamsmemory_promptcould not express session-only mode or thelong_term_search=TrueshortcutChanges
Expanded existing tools
get_working_memory: Addedmodel_nameandcontext_window_maxparameters so the MCP tool matches the REST endpoint signaturememory_prompt: Redesigned to support three modes:include_long_term_search=Trueshortcut (auto-constructs a SearchRequest from the flat MCP params)New tools
forget_memoriesPOST /v1/long-term-memory/forgetbefore_date,after_date,inactive_for_dayslist_sessionsGET /v1/sessionsdelete_working_memoryDELETE /v1/working-memory/{id}create_summary_viewPOST /v1/summary-viewslist_summary_viewsGET /v1/summary-viewsget_summary_viewGET /v1/summary-views/{id}delete_summary_viewDELETE /v1/summary-views/{id}run_summary_view_partitionPOST /v1/summary-views/{id}/partitions/runlist_summary_view_partitionsGET /v1/summary-views/{id}/partitionsrun_summary_viewPOST /v1/summary-views/{id}/runget_task_statusGET /v1/tasks/{id}Namespace injection
Updated
FastMCP.call_tool()to inject the transport default namespace into all new tools that accept a namespace parameter.Testing
Related
Note
Medium Risk
Adds multiple new MCP tools that can delete data (bulk forgetting, session deletion) and trigger background summary tasks, increasing the surface area for correctness and authorization/namespace-scoping issues. Changes are mostly additive but touch core MCP request/namespace handling and prompt/search parameter forwarding.
Overview
Brings the MCP server up to REST parity by adding new tools for bulk forgetting, listing/deleting sessions, summary view CRUD + partition runs + async full runs, and task status polling, plus updating namespace injection so defaults are applied consistently across these tools.
Extends MCP
memory_promptandsearch_long_term_memoryto forward additional search controls (e.g.,search_mode, event-date filtering, recency params) and adds a session-only prompt mode (include_long_term_search=False) while keeping session scoping out of long-term search.Updates the Python/JS/TS/Java docs and the Python client tooling/LangChain integration to reflect keyword/hybrid search support,
get_or_create_working_memoryusage, and clarifiedoptimize_querysemantics; adds targeted tests to ensuresearch_modeforwarding and MCP prompt behavior.Written by Cursor Bugbot for commit e018bfd. This will update automatically on new commits. Configure here.