Skip to content

Add missing MCP tools for full REST API parity#267

Closed
nkanu17 wants to merge 5 commits intomainfrom
pr/mcp-rest-parity
Closed

Add missing MCP tools for full REST API parity#267
nkanu17 wants to merge 5 commits intomainfrom
pr/mcp-rest-parity

Conversation

@nkanu17
Copy link
Copy Markdown
Contributor

@nkanu17 nkanu17 commented Apr 3, 2026

Summary

Completes Phase 2 of #253: adds MCP tool equivalents for every REST endpoint that was previously only accessible via the HTTP API. This builds on PR #265 (Phase 1), which fixed search parameter forwarding.

Problem

The MCP server was missing tool equivalents for several REST API capabilities:

  • No way to forget/delete memories in bulk
  • No way to list or delete sessions
  • No way to manage SummaryViews (create, list, get, delete, run)
  • No way to poll background task status
  • get_working_memory lacked model_name and context_window_max params
  • memory_prompt could not express session-only mode or the long_term_search=True shortcut

Changes

Expanded existing tools

  • get_working_memory: Added model_name and context_window_max parameters so the MCP tool matches the REST endpoint signature
  • memory_prompt: Redesigned to support three modes:
    1. Session-only (no long-term search, just working memory)
    2. include_long_term_search=True shortcut (auto-constructs a SearchRequest from the flat MCP params)
    3. Explicit search payload (via existing filter params)
    • Session ID is intentionally excluded from the long-term search to avoid scoping long-term results to a single session

New tools

Tool REST equivalent Description
forget_memories POST /v1/long-term-memory/forget Bulk delete by policy: before_date, after_date, inactive_for_days
list_sessions GET /v1/sessions List sessions with namespace filter and pagination
delete_working_memory DELETE /v1/working-memory/{id} Delete working memory for a session
create_summary_view POST /v1/summary-views Create a SummaryView configuration
list_summary_views GET /v1/summary-views List all SummaryView configurations
get_summary_view GET /v1/summary-views/{id} Get a SummaryView by ID
delete_summary_view DELETE /v1/summary-views/{id} Delete a SummaryView configuration
run_summary_view_partition POST /v1/summary-views/{id}/partitions/run Synchronously compute a single partition summary
list_summary_view_partitions GET /v1/summary-views/{id}/partitions List stored partition results
run_summary_view POST /v1/summary-views/{id}/run Trigger async full recompute (returns a Task)
get_task_status GET /v1/tasks/{id} Poll status of a background task

Namespace injection

Updated FastMCP.call_tool() to inject the transport default namespace into all new tools that accept a namespace parameter.

Testing

  • All 832 tests pass (including 19 MCP-specific tests)
  • Pre-commit (ruff format + lint) clean
  • Examples reviewed: no updates needed since they use the Python client, not MCP tools directly

Related


Note

Medium Risk
Adds multiple new MCP tools that can delete data (bulk forgetting, session deletion) and trigger background summary tasks, increasing the surface area for correctness and authorization/namespace-scoping issues. Changes are mostly additive but touch core MCP request/namespace handling and prompt/search parameter forwarding.

Overview
Brings the MCP server up to REST parity by adding new tools for bulk forgetting, listing/deleting sessions, summary view CRUD + partition runs + async full runs, and task status polling, plus updating namespace injection so defaults are applied consistently across these tools.

Extends MCP memory_prompt and search_long_term_memory to forward additional search controls (e.g., search_mode, event-date filtering, recency params) and adds a session-only prompt mode (include_long_term_search=False) while keeping session scoping out of long-term search.

Updates the Python/JS/TS/Java docs and the Python client tooling/LangChain integration to reflect keyword/hybrid search support, get_or_create_working_memory usage, and clarified optimize_query semantics; adds targeted tests to ensure search_mode forwarding and MCP prompt behavior.

Written by Cursor Bugbot for commit e018bfd. This will update automatically on new commits. Configure here.

nkanu17 added 5 commits April 2, 2026 18:10
Update all documentation and READMEs to consistently describe search as
supporting semantic, keyword, and hybrid modes instead of only semantic.

Changes:
- Expand feature descriptions to include keyword and hybrid search
- Add SDK usage examples (Python, TypeScript/JS, Java) for keyword and
  hybrid search modes
- Add full-text indexing table to long-term memory docs
- Fix deprecated function reference in agent examples
- Fix tuple unpacking order in error handling example
- Update search capability descriptions across all guides

Context: The server already supports keyword (BM25) and hybrid (RRF)
search modes via the search_mode parameter, but documentation only
referenced semantic search. This brings docs in line with actual
capabilities.

Refs: #251, #253
Forward search_mode, offset, and optimize_query through resolve_tool_call,
MCP tools, and hydrate_memory_prompt. Remove hybrid_alpha and text_scorer
from LLM-facing interfaces (MCP, tool schemas, LangChain) since these are
server-level hyperparameters that LLMs cannot reliably tune.

Changes:
- Forward search_mode, offset, optimize_query in _resolve_search_memory
- Add search_mode to MCP memory_prompt and search_long_term_memory
- Remove hybrid_alpha and text_scorer from MCP tools and client tool
  schemas (kept in developer-facing hydrate_memory_prompt)
- Align max_results default from 5 to 10 in _resolve_search_memory
- Add distance_threshold validation for non-semantic search modes
- Fix deprecated function references in tool descriptions and docstrings
  (search_memory -> search_memory_tool, get_working_memory ->
  get_or_create_working_memory)
- Correct optimize_query docstrings in API and MCP
- Make error message name-agnostic in client
- Update MCP docs for search mode support

Tests:
- Add test for search_mode forwarding through resolve_function_call
- Add test for search_mode forwarding through LangChain tool
- Both tests verify hybrid_alpha/text_scorer are NOT in tool schemas
- Update MCP parameter passing tests

Design context (from #251 discussion):
- @nkanu17: Hyperparameters like alpha, recency, and scorer config should
  be set at the server/session level, not by the LLM
- @abrookins: MCP interface is intentionally smaller than REST API. LLMs
  can reason about search mode selection (semantic vs keyword vs hybrid)
  but struggle with arbitrary numeric tuning values
- @tylerhutcherson: Questioned whether LLMs should toggle these params,
  leading to the decision to expose only search_mode
- @vishal-bala: Cataloged full MCP/REST gaps in #253, confirmed remaining
  items are out of scope for this PR

Refs: #251, #253
Move search_mode, hybrid_alpha, and text_scorer after limit, offset, and
optimize_query to avoid breaking existing callers that pass those params
positionally.
…d client

- Forward event_date, recency_boost, recency_*_weight, recency_half_life_*,
  and server_side_recency through MCP search_long_term_memory and memory_prompt
- Add same params to Python client hydrate_memory_prompt (at end of signature
  to preserve positional argument order)
- Fix namespace injection: target 'memory_prompt' instead of 'hydrate_memory_prompt'
- Import EventDate filter in mcp.py
Phase 2 of #253: Adds MCP tool equivalents for all REST endpoints
that were previously inaccessible from the MCP interface.

Expanded existing tools:
- get_working_memory: added model_name and context_window_max params
- memory_prompt: supports session-only, long_term_search shortcut,
  and explicit search payload modes

New tools:
- forget_memories: bulk delete by policy (before_date, after_date,
  inactive_for_days)
- list_sessions: list all sessions with pagination
- delete_working_memory: delete working memory for a session
- create_summary_view: create a SummaryView configuration
- list_summary_views: list all SummaryView configs
- get_summary_view: get a SummaryView by ID
- delete_summary_view: delete a SummaryView configuration
- run_summary_view_partition: sync compute a single partition
- list_summary_view_partitions: list stored partition results
- run_summary_view: async full recompute (returns Task)
- get_task_status: poll background task status

Updated namespace injection in call_tool to cover new tools.
Copy link
Copy Markdown

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR successfully adds comprehensive MCP tool coverage for REST API parity. The implementation is well-structured with proper namespace injection and parameter forwarding. I have a few suggestions for improving validation consistency.

1. Hydrate a user prompt with long-term memory search:
memory_prompt(query="What was my favorite color?")
```

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The validation for distance_threshold with non-semantic search modes is missing in the MCP memory_prompt tool (line 733), but is present in the REST API endpoint (api.py line 665-672) and in search_long_term_memory MCP tool (line 733). This creates an inconsistency where invalid parameter combinations could pass through memory_prompt but fail in the REST API.

Suggested fix:

if distance_threshold is not None and search_mode != SearchModeEnum.SEMANTIC:
    raise ValueError(
        "distance_threshold is only supported for semantic search mode"
    )

if distance_threshold is not None and search_mode != SearchModeEnum.SEMANTIC:
    raise ValueError(
        "distance_threshold is only supported for semantic search mode"
    )

This should be added immediately after line 733 in the memory_prompt function, matching the validation pattern used in both the REST API and the search_long_term_memory MCP tool.

model_name=model_name,
context_window_max=context_window_max,
)
if _calculate_messages_token_count(result.messages) > effective_token_limit:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The token-based message truncation logic in get_working_memory uses a simple loop that removes messages from the beginning. While this works, there's a potential edge case: if a single message exceeds effective_token_limit, the loop will reduce to just that message but still exceed the limit.

Current code:

while len(messages) > 1:
    messages = messages[1:]
    if _calculate_messages_token_count(messages) <= effective_token_limit:
        break

The loop condition len(messages) > 1 means it will keep at least one message even if that message alone exceeds the token limit. Consider adding validation or handling for this edge case to match the REST API behavior more closely.

Copy link
Copy Markdown

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 2 potential issues.

Fix All in Cursor

Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, have a team admin enable autofix in the Cursor dashboard.

messages = messages[1:]
if _calculate_messages_token_count(messages) <= effective_token_limit:
break
result.messages = messages
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication: token truncation reimplemented in MCP tool

Low Severity

The token-based message truncation logic in the MCP get_working_memory tool is a near-verbatim copy of the same logic in the REST API's get_working_memory endpoint. Both import and call _calculate_messages_token_count and _get_effective_token_limit with an identical while-loop truncation pattern. This duplicated logic increases the maintenance burden — any future bug fix or improvement to truncation must be applied in both places. Extracting this into a shared helper would be cleaner.

Additional Locations (1)
Fix in Cursor Fix in Web

if invalid_filters:
raise ValueError(
"Unsupported filter fields: " + ", ".join(sorted(invalid_filters))
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code duplication: SummaryView validation copied from REST API

Low Severity

The create_summary_view MCP tool duplicates the validation logic for source, group_by, and filters that already exists in the REST API endpoint. The allowed-field sets and error messages are copied inline. If allowed fields change (e.g., a new group_by option), both the REST handler and this MCP tool need to be updated independently, risking inconsistency.

Fix in Cursor Fix in Web

@jit-ci
Copy link
Copy Markdown

jit-ci bot commented Apr 3, 2026

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

@nkanu17
Copy link
Copy Markdown
Contributor Author

nkanu17 commented Apr 3, 2026

Closing this PR after revisiting the discussion on #251. The consensus from @abrookins, @tylerhutcherson, and @nkanu17 was that MCP should be a curated, smaller interface, not a 1:1 mirror of the REST API. Several tools added here (working memory management, session listing, SummaryView CRUD, task polling) go against that philosophy. Will reopen with a pared-down scope that only includes tools LLMs can meaningfully reason about.

@nkanu17 nkanu17 closed this Apr 3, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e018bfdc67

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +157 to +164
_plain_ns_tools = (
"set_working_memory",
"get_working_memory",
"delete_working_memory",
"list_sessions",
"forget_memories",
"compact_long_term_memories",
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Inject namespace for list_summary_view_partitions

The namespace auto-injection added in call_tool() does not cover list_summary_view_partitions, even though that tool accepts a namespace argument. In namespaced transports (/{namespace}), this means partition listings are not scoped unless the caller explicitly passes namespace, so a request can return partitions from other namespaces and break tenant/data-scope expectations for the new MCP parity tools.

Useful? React with 👍 / 👎.

Comment on lines +1421 to +1423
limit=limit,
offset=offset,
namespace=namespace,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Enforce list_sessions pagination limits

list_sessions forwards limit/offset directly to the backend without bounds checks, unlike the REST path which constrains these values (limit 1-100, offset >= 0 via GetSessionsQuery). This lets MCP callers request very large or negative windows, which can produce expensive Redis queries or invalid pagination behavior that the REST API explicitly prevents.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes MCP ↔ REST surface-area parity by adding MCP tools for previously REST-only endpoints (forgetting, sessions, SummaryViews, task polling) and expanding existing MCP/client tool interfaces to better match the server’s search + prompt-hydration capabilities.

Changes:

  • Expanded MCP memory_prompt / search_long_term_memory retrieval controls (search_mode, event_date, recency options) and improved namespace injection behavior.
  • Added new MCP tools for forgetting, session listing/deletion, SummaryView CRUD/run/partitions, and task status polling.
  • Updated client/tooling tests and documentation to reflect semantic/keyword/hybrid search and updated client method names.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
agent_memory_server/mcp.py Adds/extends MCP tools (forget/sessions/summary-views/tasks), expands search + memory_prompt params, updates namespace injection.
agent_memory_server/api.py Docstring clarification: optimize_query applies to semantic search only.
agent-memory-client/agent_memory_client/client.py Updates tool schemas/docs and forwards search_mode/offset/optimize_query through resolve paths; docstring clarifications.
agent-memory-client/agent_memory_client/integrations/langchain.py Extends LangChain search tool to accept/forward search_mode; updates description.
tests/test_mcp.py Extends MCP memory_prompt param-forwarding assertions to cover search_mode.
tests/test_client_tool_calls.py Adds regression test ensuring search_mode is forwarded and hyperparams are omitted.
agent-memory-client/tests/test_langchain_integration.py Adds test verifying LangChain wrapper forwards search_mode and omits hyperparams.
README.md Updates feature/architecture docs to mention semantic/keyword/hybrid search.
examples/README.md Updates example docs to mention semantic/keyword/hybrid search.
docs/use-cases.md Updates guidance to reference semantic/keyword/hybrid search.
docs/typescript-sdk.md Adds TS examples for keyword/hybrid search usage.
docs/java-sdk.md Adds Java examples for keyword/hybrid search usage.
docs/README.md Updates feature matrix wording for multi-mode search.
docs/quick-start.md Updates quick-start narrative to mention keyword/hybrid search.
docs/mcp.md Updates MCP docs to reflect multi-mode search.
docs/long-term-memory.md Expands long-term memory docs to explain search modes and indexing.
docs/index.md Updates landing page copy to mention multi-mode search.
docs/agent-examples.md Updates agent examples text and corrects working-memory API usage in snippet.
agent-memory-client/README.md Updates README examples to reflect new get_or_create_working_memory and multi-mode search.
agent-memory-client/agent-memory-client-js/README.md Updates JS client README examples to include keyword/hybrid search.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +833 to 856
search_payload = SearchRequest(
text=query,
namespace=namespace,
topics=topics,
entities=entities,
created_at=created_at,
last_accessed=last_accessed,
user_id=user_id,
distance_threshold=distance_threshold,
memory_type=memory_type,
event_date=event_date,
search_mode=search_mode,
limit=limit,
offset=offset,
recency_boost=recency_boost,
recency_semantic_weight=recency_semantic_weight,
recency_recency_weight=recency_recency_weight,
recency_freshness_weight=recency_freshness_weight,
recency_novelty_weight=recency_novelty_weight,
recency_half_life_last_access_days=recency_half_life_last_access_days,
recency_half_life_created_days=recency_half_life_created_days,
server_side_recency=server_side_recency,
)
_params["long_term_search"] = search_payload
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

memory_prompt currently always constructs and passes an explicit SearchRequest when include_long_term_search=True, so callers cannot access the REST /v1/memory/prompt long_term_search=true shortcut semantics (bool mode), which excludes the current session and uses the server’s default limit/offset behavior. If parity with REST is intended, consider adding a way to pass long_term_search=True through MCP (or emulate that behavior when no explicit long-term filters are provided).

Suggested change
search_payload = SearchRequest(
text=query,
namespace=namespace,
topics=topics,
entities=entities,
created_at=created_at,
last_accessed=last_accessed,
user_id=user_id,
distance_threshold=distance_threshold,
memory_type=memory_type,
event_date=event_date,
search_mode=search_mode,
limit=limit,
offset=offset,
recency_boost=recency_boost,
recency_semantic_weight=recency_semantic_weight,
recency_recency_weight=recency_recency_weight,
recency_freshness_weight=recency_freshness_weight,
recency_novelty_weight=recency_novelty_weight,
recency_half_life_last_access_days=recency_half_life_last_access_days,
recency_half_life_created_days=recency_half_life_created_days,
server_side_recency=server_side_recency,
)
_params["long_term_search"] = search_payload
#
# Preserve REST parity: when the caller only enables long-term search and
# does not provide any explicit long-term filters, forward the boolean
# shortcut (`true`) so the server can apply its default limit/offset and
# current-session exclusion behavior. If explicit long-term search options
# are supplied, continue passing a concrete SearchRequest.
has_explicit_long_term_filters = any(
value is not None
for value in (
namespace,
topics,
entities,
created_at,
last_accessed,
user_id,
distance_threshold,
memory_type,
event_date,
search_mode,
limit,
offset,
recency_boost,
recency_semantic_weight,
recency_recency_weight,
recency_freshness_weight,
recency_novelty_weight,
recency_half_life_last_access_days,
recency_half_life_created_days,
server_side_recency,
)
)
if has_explicit_long_term_filters:
search_payload = SearchRequest(
text=query,
namespace=namespace,
topics=topics,
entities=entities,
created_at=created_at,
last_accessed=last_accessed,
user_id=user_id,
distance_threshold=distance_threshold,
memory_type=memory_type,
event_date=event_date,
search_mode=search_mode,
limit=limit,
offset=offset,
recency_boost=recency_boost,
recency_semantic_weight=recency_semantic_weight,
recency_recency_weight=recency_recency_weight,
recency_freshness_weight=recency_freshness_weight,
recency_novelty_weight=recency_novelty_weight,
recency_half_life_last_access_days=recency_half_life_last_access_days,
recency_half_life_created_days=recency_half_life_created_days,
server_side_recency=server_side_recency,
)
_params["long_term_search"] = search_payload
else:
_params["long_term_search"] = True

Copilot uses AI. Check for mistakes.
Comment on lines +1403 to +1406
List working memory sessions with optional pagination and filtering.

This works like the `GET /v1/working-memory/` REST API endpoint.

Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description lists list_sessions as REST-equivalent to GET /v1/sessions, but the server’s REST endpoint is GET /v1/working-memory/ (see agent_memory_server/api.py). Either the PR description should be corrected, or the tool/docs should be updated to match the actual REST route naming used by this repo.

Copilot uses AI. Check for mistakes.
Comment on lines +1396 to +1401
async def list_sessions(
limit: int = 20,
offset: int = 0,
namespace: str | None = None,
user_id: str | None = None,
) -> SessionListResponse:
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

list_sessions docstring promises limit is 1–100, but the MCP tool does not enforce those bounds (unlike the REST path, which validates via GetSessionsQuery(limit=ge1,le100)). Please add explicit validation (and also validate offset >= 0) to avoid accidental large scans and to keep MCP behavior consistent with REST.

Copilot uses AI. Check for mistakes.
Comment on lines +1368 to +1371
Args:
policy: Forgetting policy dict with keys like max_age_days, max_inactive_days,
budget (number of top memories to keep), memory_type_allowlist
namespace: Optional namespace filter
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The forget_memories docstring lists policy keys like max_age_days/max_inactive_days, but the PR description claims MCP supports before_date/after_date/inactive_for_days. The implementation delegates directly to forget_long_term_memories, which only supports the max_*_days/budget policy shape; please align the docs (or add translation/validation) so callers know the actual supported policy fields.

Copilot uses AI. Check for mistakes.
Comment on lines +1510 to +1529
# Validate keys (same logic as REST)
if payload.source != "long_term":
raise ValueError(
"SummaryView.source must be 'long_term' for now; "
"'working_memory' is not yet supported."
)

allowed_group_by = {"user_id", "namespace", "session_id", "memory_type"}
allowed_filters = {"user_id", "namespace", "session_id", "memory_type"}

invalid_group = [k for k in payload.group_by if k not in allowed_group_by]
if invalid_group:
raise ValueError(
"Unsupported group_by fields: " + ", ".join(sorted(invalid_group))
)

invalid_filters = [k for k in payload.filters if k not in allowed_filters]
if invalid_filters:
raise ValueError(
"Unsupported filter fields: " + ", ".join(sorted(invalid_filters))
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create_summary_view duplicates the SummaryView key validation logic inline (allowed group_by/filters, source check). The REST API already centralizes this in _validate_summary_view_keys and uses it before persisting; to avoid MCP/REST drift over time, consider reusing the shared validation helper (or moving validation into summary_views / the request model) instead of maintaining two copies.

Copilot uses AI. Check for mistakes.
Comment on lines +1350 to +1358
@mcp_app.tool()
async def forget_memories(
policy: dict,
namespace: str | None = None,
user_id: str | None = None,
session_id: str | None = None,
limit: int = 1000,
dry_run: bool = True,
pinned_ids: list[str] | None = None,
Copy link

Copilot AI Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New MCP tools added in this file (forget_memories, list_sessions, delete_working_memory, SummaryView CRUD/run, and get_task_status) don’t appear to have corresponding MCP tests. tests/test_mcp.py already exercises other MCP tools extensively, so adding coverage here would help prevent parity regressions (e.g., namespace injection, argument validation, and error handling).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants