From 10bcd7b0901c934f4ca21a9f637800cef28c06ad Mon Sep 17 00:00:00 2001
From: Daniel Marbach
Date: Wed, 25 Mar 2026 15:46:28 +0100
Subject: [PATCH 1/5] Improve MCP metadata guidance for AI clients

---
 docs/mcp-investigation-guide.md | 112 +++++++++++
 ...d_list_primary_instance_tools.approved.txt | 188 ++++++++++++++----
 ...ould_list_audit_message_tools.approved.txt | 84 ++++++--
 ...ould_list_audit_message_tools.approved.txt | 84 ++++++--
 .../Mcp/McpMetadataDescriptionsTests.cs | 50 +++++
 .../Mcp/AuditMessageTools.cs | 50 ++---
 src/ServiceControl.Audit/Mcp/EndpointTools.cs | 12 +-
 .../Mcp/McpMetadataDescriptionsTests.cs | 64 ++++++
 src/ServiceControl/Mcp/ArchiveTools.cs | 41 ++--
 src/ServiceControl/Mcp/FailedMessageTools.cs | 42 ++--
 src/ServiceControl/Mcp/FailureGroupTools.cs | 12 +-
 src/ServiceControl/Mcp/RetryTools.cs | 27 ++-
 12 files changed, 602 insertions(+), 164 deletions(-)
 create mode 100644 docs/mcp-investigation-guide.md
 create mode 100644 src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
 create mode 100644 src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs

diff --git a/docs/mcp-investigation-guide.md b/docs/mcp-investigation-guide.md
new file mode 100644
index 0000000000..337103b3de
--- /dev/null
+++ b/docs/mcp-investigation-guide.md
@@ -0,0 +1,112 @@
+# ServiceControl MCP Investigation Guide
+
+This guide explains how to use the ServiceControl MCP tools for investigation work.
+
+The MCP surface is designed to help AI agents and human operators choose the right tool based on intent, scope, and risk.
+
+## Tool Inventory
+
+### Primary instance tools
+
+| Tool | Category | Risk | Notes |
+| --- | --- | --- | --- |
+| `get_errors_summary` | summary | safe | Best first step for overall failed-message health |
+| `get_failure_groups` | summary | safe | Best first step for root-cause analysis |
+| `get_retry_history` | detail | safe | Confirms whether similar retries were already attempted |
+| `get_failed_messages` | list | safe | Broad failed-message listing |
+| `get_failed_messages_by_endpoint` | list | safe | Use when the endpoint is already known |
+| `get_failed_message_by_id` | detail | safe | Full failed-message history |
+| `get_failed_message_last_attempt` | detail | safe | Lighter detail view for the latest failure |
+| `retry_failed_message` | action | moderate | Narrow retry for one failed message |
+| `retry_failed_messages` | action | moderate | Retry a specific set of failed messages |
+| `retry_failed_messages_by_queue` | action | high | Retries all unresolved failures in one queue |
+| `retry_all_failed_messages_by_endpoint` | action | high | Retries all failures for one endpoint |
+| `retry_failure_group` | action | moderate | Best grouped retry after fixing one root cause |
+| `retry_all_failed_messages` | action | high | Broadest retry operation |
+| `archive_failed_message` | action | moderate | Dismiss one failed message |
+| `archive_failed_messages` | action | moderate | Dismiss a chosen set of failed messages |
+| `archive_failure_group` | action | high | Dismiss all failed messages in one failure group |
+| `unarchive_failed_message` | action | moderate | Restore one archived failed message |
+| `unarchive_failed_messages` | action | moderate | Restore a chosen set of archived failed messages |
+| `unarchive_failure_group` | action | high | Restore all archived messages in one failure group |
+
+### Audit instance tools
+
+| Tool | Category | Risk | Notes |
+| --- | --- | --- | --- |
+| `get_known_endpoints` | discovery | safe | Start here when you need endpoint names |
+| `get_endpoint_audit_counts` | summary | safe | Throughput trends for one endpoint |
+| `get_audit_messages` | list | safe | Broad audit-message browsing |
+| `search_audit_messages` | search | safe | Full-text lookup for specific terms or IDs |
+| `get_audit_messages_by_endpoint` | list/search | safe | Scoped endpoint investigation |
+| `get_audit_messages_by_conversation` | detail | safe | Trace a message flow across related messages |
+| `get_audit_message_body` | detail | safe | Inspect serialized payload content |
+
+## Read-only vs State-changing
+
+### Read-only tools
+
+Use these first during an investigation. They do not change system state.
+
+- Error investigation: `get_errors_summary`, `get_failure_groups`, `get_retry_history`, `get_failed_messages`, `get_failed_messages_by_endpoint`, `get_failed_message_by_id`, `get_failed_message_last_attempt`
+- Audit investigation: `get_known_endpoints`, `get_endpoint_audit_counts`, `get_audit_messages`, `search_audit_messages`, `get_audit_messages_by_endpoint`, `get_audit_messages_by_conversation`, `get_audit_message_body`
+
+### State-changing tools
+
+Use these only when the user explicitly wants to retry, archive, or restore failed messages.
+
+- Retry tools: `retry_failed_message`, `retry_failed_messages`, `retry_failed_messages_by_queue`, `retry_all_failed_messages_by_endpoint`, `retry_failure_group`, `retry_all_failed_messages`
+- Archive tools: `archive_failed_message`, `archive_failed_messages`, `archive_failure_group`
+- Restore tools: `unarchive_failed_message`, `unarchive_failed_messages`, `unarchive_failure_group`
+
+Broad actions such as `retry_all_failed_messages`, `retry_failed_messages_by_queue`, `retry_all_failed_messages_by_endpoint`, `archive_failure_group`, and `unarchive_failure_group` can affect many messages. Prefer the narrowest tool that matches the user's intent.
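Reviewer sketch (not part of the patch): the read-only and state-changing split described in this section could be enforced by a small client-side guard. The Python below is a minimal illustration under stated assumptions. The names `READ_ONLY_TOOLS`, `STATE_CHANGING_TOOLS`, `BROAD_ACTIONS`, and `check_tool_call` are hypothetical, as are the return values; only the tool names are taken from the guide.

```python
# Hypothetical client-side guard. Only the tool names come from the guide;
# the function, constants, and return values are illustrative assumptions.

READ_ONLY_TOOLS = frozenset({
    # Error investigation
    "get_errors_summary", "get_failure_groups", "get_retry_history",
    "get_failed_messages", "get_failed_messages_by_endpoint",
    "get_failed_message_by_id", "get_failed_message_last_attempt",
    # Audit investigation
    "get_known_endpoints", "get_endpoint_audit_counts", "get_audit_messages",
    "search_audit_messages", "get_audit_messages_by_endpoint",
    "get_audit_messages_by_conversation", "get_audit_message_body",
})

STATE_CHANGING_TOOLS = frozenset({
    # Retry tools
    "retry_failed_message", "retry_failed_messages",
    "retry_failed_messages_by_queue", "retry_all_failed_messages_by_endpoint",
    "retry_failure_group", "retry_all_failed_messages",
    # Archive tools
    "archive_failed_message", "archive_failed_messages", "archive_failure_group",
    # Restore tools
    "unarchive_failed_message", "unarchive_failed_messages",
    "unarchive_failure_group",
})

# The guide singles these out because they can affect many messages.
BROAD_ACTIONS = frozenset({
    "retry_all_failed_messages", "retry_failed_messages_by_queue",
    "retry_all_failed_messages_by_endpoint", "archive_failure_group",
    "unarchive_failure_group",
})


def check_tool_call(tool_name: str, user_confirmed: bool = False) -> str:
    """Return 'allow', 'confirm', or 'confirm-broad' for a proposed tool call."""
    if tool_name in READ_ONLY_TOOLS:
        return "allow"  # read-only tools never change system state
    if tool_name not in STATE_CHANGING_TOOLS:
        raise ValueError(f"unknown tool: {tool_name}")
    if user_confirmed:
        return "allow"  # the user explicitly asked for this action
    # Ask before any state change, and flag the broad ones separately.
    return "confirm-broad" if tool_name in BROAD_ACTIONS else "confirm"
```

With a guard like this, a client would call `check_tool_call("retry_all_failed_messages")`, receive `"confirm-broad"`, and could suggest a narrower tool such as `retry_failure_group` before proceeding.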
+
+## Commonly Confused Tool Pairs
+
+### Error tools
+
+- `get_failed_messages` vs `get_failed_messages_by_endpoint`: use the endpoint-specific tool only when the endpoint is already known
+- `retry_failed_messages` vs `retry_failure_group`: use the grouped retry when messages share the same root cause; use the ID-list retry when the user selected specific failed messages
+- `archive_failed_messages` vs `archive_failure_group`: use the grouped archive when the whole failure group should be dismissed; use the ID-list archive when only some failed messages should be archived
+
+### Audit tools
+
+- `get_audit_messages` vs `search_audit_messages`: browse with `get_audit_messages` when the user wants an overview; search with `search_audit_messages` when the user supplies a concrete term, identifier, or phrase
+- `get_audit_messages_by_endpoint` vs `get_audit_messages_by_conversation`: use the endpoint tool for one receiver endpoint; use the conversation tool to follow a cross-endpoint message flow
+- `get_audit_messages` vs `get_audit_message_body`: browse metadata first, then fetch body content only when the actual payload matters
+
+## Recommended Investigation Flows
+
+### Error investigation flow
+
+1. `get_errors_summary`
+2. `get_failure_groups`
+3. `get_failed_messages` or `get_failed_messages_by_endpoint`
+4. `get_failed_message_by_id` or `get_failed_message_last_attempt`
+5. `get_retry_history` when a retry decision depends on prior attempts
+6. Only then consider retry, archive, or unarchive tools
+
+### Audit investigation flow
+
+1. `get_known_endpoints` if the endpoint name is not known yet
+2. `get_audit_messages` for broad browsing, or `search_audit_messages` for a concrete term or identifier
+3. `get_audit_messages_by_endpoint` to narrow to one receiver endpoint
+4. `get_audit_messages_by_conversation` to trace the related message flow
+5. `get_audit_message_body` when the payload content is needed
+
+## Task-to-tool Mappings
+
+- "What is failing right now?" -> `get_errors_summary`, then `get_failure_groups`
+- "Show recent failures in Sales" -> `get_failed_messages_by_endpoint`
+- "Show the full history for this failure" -> `get_failed_message_by_id`
+- "Show only the latest exception for this failure" -> `get_failed_message_last_attempt`
+- "Retry the failures caused by this bug" -> `retry_failure_group`
+- "Retry everything in this queue" -> `retry_failed_messages_by_queue`
+- "Dismiss this one failure" -> `archive_failed_message`
+- "Restore the archived failures for this root cause" -> `unarchive_failure_group`
+- "What endpoints do we have?" -> `get_known_endpoints`
+- "Show recent audit traffic" -> `get_audit_messages`
+- "Find audit messages mentioning order 12345" -> `search_audit_messages`
+- "Show what Billing processed" -> `get_audit_messages_by_endpoint`
+- "Trace this conversation" -> `get_audit_messages_by_conversation`
+- "Show me the payload for this audit message" -> `get_audit_message_body`
diff --git a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt
index d3f82a86b8..9001ab4e74 100644
--- a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt
+++ b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt
@@ -1,12 +1,12 @@ [ { "name": "archive_failed_message", - "description": "Use this tool to dismiss a single failed message that does not need to be retried. Good for questions like: \u0027archive this message\u0027, \u0027dismiss this failure\u0027, or \u0027I do not need to retry this one\u0027. 
Archiving moves the message out of the unresolved list so it no longer shows up as an active problem. This is an asynchronous operation \u2014 the message will be archived shortly after the request is accepted. If you need to archive many messages with the same root cause, use ArchiveFailureGroup instead.", + "description": "Use this tool to dismiss a single failed message that does not need to be retried. This operation changes system state. Good for questions like: \u0027archive this message\u0027, \u0027dismiss this failure\u0027, or \u0027I do not need to retry this one\u0027. Archiving moves the message out of the unresolved list so it no longer shows up as an active problem. This is an asynchronous operation \u2014 the message will be archived shortly after the request is accepted. If you need to archive many messages with the same root cause, use ArchiveFailureGroup instead.", "inputSchema": { "type": "object", "properties": { "failedMessageId": { - "description": "The unique message ID from a previous query result", + "description": "The failed message ID from a previous failed-message query result.", "type": "string" } }, @@ -14,18 +14,24 @@ "failedMessageId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "archive_failed_messages", - "description": "Use this tool to dismiss multiple failed messages at once that do not need to be retried. Good for questions like: \u0027archive these messages\u0027, \u0027dismiss these failures\u0027, or \u0027archive messages msg-1, msg-2, msg-3\u0027. Prefer ArchiveFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to archive.", + "description": "Use this tool to dismiss multiple failed messages at once that do not need to be retried. This operation changes system state. 
Good for questions like: \u0027archive these messages\u0027, \u0027dismiss these failures\u0027, or \u0027archive messages msg-1, msg-2, msg-3\u0027. Prefer ArchiveFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to archive.", "inputSchema": { "type": "object", "properties": { "messageIds": { - "description": "The unique message IDs from a previous query result", + "description": "The failed message IDs from previous failed-message query results.", "type": "array", "items": { "type": "string" @@ -36,18 +42,24 @@ "messageIds" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "archive_failure_group", - "description": "Use this tool to dismiss an entire failure group \u2014 all messages that failed with the same exception type and stack trace. Good for questions like: \u0027archive this failure group\u0027, \u0027dismiss all NullReferenceException failures\u0027, or \u0027archive the whole group\u0027. This is the most efficient way to archive many related failures at once. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if an archive operation is already running for this group.", + "description": "Use this tool to dismiss an entire failure group \u2014 all messages that failed with the same exception type and stack trace. This operation changes system state. Good for questions like: \u0027archive this failure group\u0027, \u0027dismiss all NullReferenceException failures\u0027, or \u0027archive the whole group\u0027. This is the most efficient way to archive many related failures at once. You need a group ID, which you can get from GetFailureGroups. 
Returns InProgress if an archive operation is already running for this group.", "inputSchema": { "type": "object", "properties": { "groupId": { - "description": "The failure group ID from get_failure_groups results", + "description": "The failure group ID from previous GetFailureGroups results.", "type": "string" } }, @@ -55,29 +67,41 @@ "groupId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "get_errors_summary", - "description": "Use this tool as a quick health check to see how many messages are in each failure state. Good for questions like: \u0027how many errors are there?\u0027, \u0027what is the error situation?\u0027, or \u0027are there unresolved failures?\u0027. Returns counts for unresolved, archived, resolved, and retryissued statuses. This is a good first tool to call when asked about the overall error situation before drilling into specific messages.", + "description": "Read-only. Use this tool as a quick health check to see how many messages are in each failure state. Good for questions like: \u0027how many errors are there?\u0027, \u0027what is the error situation?\u0027, or \u0027are there unresolved failures?\u0027. Returns counts for unresolved, archived, resolved, and retryissued statuses. This is a good first tool to call when asked about the overall error situation before drilling into specific messages.", "inputSchema": { "type": "object", "properties": {} }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_failed_message_by_id", - "description": "Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. 
Good for questions like: \u0027show me details for this failed message\u0027, \u0027what exception caused this failure?\u0027, or \u0027how many times has this message failed?\u0027. You need the message\u0027s unique ID, which you can get from GetFailedMessages or GetFailureGroups results. If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead \u2014 it returns less data.", + "description": "Read-only. Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. Good for questions like: \u0027show me details for this failed message\u0027, \u0027what exception caused this failure?\u0027, or \u0027how many times has this message failed?\u0027. You need a failed message ID, which you can get from GetFailedMessages or GetFailureGroups results. If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead \u2014 it returns less data.", "inputSchema": { "type": "object", "properties": { "failedMessageId": { - "description": "The unique message ID from a previous query result", + "description": "The failed message ID from a previous failed-message query result.", "type": "string" } }, @@ -85,18 +109,24 @@ "failedMessageId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_failed_message_last_attempt", - "description": "Use this tool to see how a specific message failed most recently. Good for questions like: \u0027what was the last error for this message?\u0027, \u0027show me the latest exception\u0027, or \u0027what happened on the last attempt?\u0027. Returns the latest processing attempt with its exception, stack trace, and headers. Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history.", + "description": "Read-only. 
Use this tool to see how a specific message failed most recently. Good for questions like: \u0027what was the last error for this message?\u0027, \u0027show me the latest exception\u0027, or \u0027what happened on the last attempt?\u0027. Returns the latest processing attempt with its exception, stack trace, and headers. Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history.", "inputSchema": { "type": "object", "properties": { "failedMessageId": { - "description": "The unique message ID from a previous query result", + "description": "The failed message ID from a previous failed-message query result.", "type": "string" } }, @@ -104,18 +134,24 @@ "failedMessageId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_failed_messages", - "description": "Use this tool to browse failed messages when the user wants to see what is failing. Good for questions like: \u0027what messages are currently failing?\u0027, \u0027are there failures in a specific queue?\u0027, or \u0027what failed recently?\u0027. Returns a paged list of failed messages with their status, exception details, and queue information. For broad requests, call with no parameters to get the most recent failures \u2014 only add filters when you need to narrow down results. Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint.", + "description": "Read-only. Use this tool to retrieve failed messages for investigation when the user wants to see what is failing. Good for questions like: \u0027what messages are currently failing?\u0027, \u0027are there failures in a specific queue?\u0027, or \u0027what failed recently?\u0027. Returns a paged list of failed messages with their status, exception details, and queue information. 
For broad requests, call with no parameters to get the most recent failures \u2014 only add filters when you need to narrow the scope. Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint.", "inputSchema": { "type": "object", "properties": { "status": { - "description": "Narrow results to a specific status: unresolved (still failing), resolved (succeeded on retry), archived (dismissed), or retryissued (retry in progress). Omit to include all statuses.", + "description": "Filter failed messages by status: unresolved (still failing), resolved (succeeded on retry), archived (dismissed), or retryissued (retry in progress). Omit this filter to include all statuses.", "type": [ "string", "null" @@ -123,7 +159,7 @@ "default": null }, "modified": { - "description": "Only return messages modified after this date (ISO 8601). Useful for checking recent failures.", + "description": "Filter failed messages to entries modified after this ISO 8601 date/time. Omit this filter to include older results.", "type": [ "string", "null" @@ -131,7 +167,7 @@ "default": null }, "queueAddress": { - "description": "Only return messages from this queue address, e.g. \u0027Sales@machine\u0027. Use when investigating a specific queue.", + "description": "Filter failed messages to a specific queue address, for example \u0027Sales@machine\u0027. Omit this filter to include all queues.", "type": [ "string", "null" @@ -160,22 +196,28 @@ } } }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_failed_messages_by_endpoint", - "description": "Use this tool to see failed messages for a specific NServiceBus endpoint. Good for questions like: \u0027what is failing in the Sales endpoint?\u0027, \u0027show errors for Shipping\u0027, or \u0027are there failures in this endpoint?\u0027. 
Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name.", + "description": "Read-only. Use this tool to see failed messages for a specific NServiceBus endpoint. Good for questions like: \u0027what is failing in the Sales endpoint?\u0027, \u0027show errors for Shipping\u0027, or \u0027are there failures in this endpoint?\u0027. Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", "type": "string" }, "status": { - "description": "Narrow results to a specific status: unresolved, resolved, archived, or retryissued. Omit to include all.", + "description": "Filter failed messages by status: unresolved, resolved, archived, or retryissued. Omit this filter to include all statuses for the endpoint.", "type": [ "string", "null" @@ -183,7 +225,7 @@ "default": null }, "modified": { - "description": "Only return messages modified after this date (ISO 8601)", + "description": "Filter endpoint results to failed messages modified after this ISO 8601 date/time. Omit this filter to include older results.", "type": [ "string", "null" @@ -215,13 +257,19 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_failure_groups", - "description": "Use this tool to understand why messages are failing by seeing failures grouped by root cause. 
Good for questions like: \u0027why are messages failing?\u0027, \u0027what errors are happening?\u0027, \u0027group failures by exception\u0027, or \u0027what are the top failure causes?\u0027. Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. This is usually the best starting point for diagnosing production issues \u2014 call it before drilling into individual messages. Call with no parameters to use the default grouping by exception type and stack trace.", + "description": "Read-only. Use this tool to understand why messages are failing by seeing failures grouped by root cause. Good for questions like: \u0027why are messages failing?\u0027, \u0027what errors are happening?\u0027, \u0027group failures by exception\u0027, or \u0027what are the top failure causes?\u0027. Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. This is usually the best starting point for diagnosing production issues \u2014 call it before drilling into individual messages. Call with no parameters to use the default grouping by exception type and stack trace.", "inputSchema": { "type": "object", "properties": { @@ -231,7 +279,7 @@ "default": "Exception Type and Stack Trace" }, "classifierFilter": { - "description": "Only include groups matching this filter text", + "description": "Filter failure groups by classifier text. Omit this filter to include all groups for the selected classifier.", "type": [ "string", "null" @@ -240,35 +288,53 @@ } } }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_retry_history", - "description": "Use this tool to check the history of retry operations. 
Good for questions like: \u0027has someone already retried these?\u0027, \u0027what happened the last time we retried this group?\u0027, \u0027show retry history\u0027, or \u0027were any retries attempted today?\u0027. Returns which groups were retried, when, and whether the retries succeeded or failed. Use this before retrying a group to avoid duplicate retry attempts.", + "description": "Read-only. Use this tool to check the history of retry operations. Good for questions like: \u0027has someone already retried these?\u0027, \u0027what happened the last time we retried this group?\u0027, \u0027show retry history\u0027, or \u0027were any retries attempted today?\u0027. Returns which groups were retried, when, and whether the retries succeeded or failed. Use this before retrying a group to avoid duplicate retry attempts.", "inputSchema": { "type": "object", "properties": {} }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_all_failed_messages", - "description": "Use this tool to retry every unresolved failed message across all queues and endpoints. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.", + "description": "Use this tool to retry every unresolved failed message across all queues and endpoints. This operation changes system state. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. It affects all unresolved failed messages across the instance. 
This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.", "inputSchema": { "type": "object", "properties": {} }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_all_failed_messages_by_endpoint", - "description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.", + "description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. This operation changes system state. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.", "inputSchema": { "type": "object", "properties": { @@ -281,18 +347,24 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_failed_message", - "description": "Use this tool to reprocess a single failed message by sending it back to its original queue. Good for questions like: \u0027retry this message\u0027, \u0027reprocess this failure\u0027, or \u0027send this message back for processing\u0027. The message will go through normal processing again. Only use after the underlying issue (bug fix, infrastructure problem) has been resolved. 
If you need to retry many messages with the same root cause, use RetryFailureGroup instead.", + "description": "Use this tool to reprocess a single failed message by sending it back to its original queue. This operation changes system state. Good for questions like: \u0027retry this message\u0027, \u0027reprocess this failure\u0027, or \u0027send this message back for processing\u0027. The message will go through normal processing again. Only use after the underlying issue (bug fix, infrastructure problem) has been resolved. If you need to retry many messages with the same root cause, use RetryFailureGroup instead.", "inputSchema": { "type": "object", "properties": { "failedMessageId": { - "description": "The unique message ID from a previous query result", + "description": "The failed message ID from a previous failed-message query result.", "type": "string" } }, @@ -300,18 +372,24 @@ "failedMessageId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_failed_messages", - "description": "Use this tool to reprocess multiple specific failed messages at once. Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.", + "description": "Use this tool to reprocess multiple specific failed messages at once. This operation changes system state. Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. 
Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.", "inputSchema": { "type": "object", "properties": { "messageIds": { - "description": "The unique message IDs from a previous query result", + "description": "The failed message IDs from previous failed-message query results.", "type": "array", "items": { "type": "string" @@ -322,13 +400,19 @@ "messageIds" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_failed_messages_by_queue", - "description": "Use this tool to retry all unresolved failed messages from a specific queue. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status.", + "description": "Use this tool to retry all unresolved failed messages from a specific queue. This operation changes system state. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status.", "inputSchema": { "type": "object", "properties": { @@ -341,18 +425,24 @@ "queueAddress" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "retry_failure_group", - "description": "Use this tool to retry all failed messages that share the same exception type and stack trace. 
Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.", + "description": "Use this tool to retry all failed messages that share the same exception type and stack trace. This operation changes system state. Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.", "inputSchema": { "type": "object", "properties": { "groupId": { - "description": "The failure group ID from get_failure_groups results", + "description": "The failure group ID from previous GetFailureGroups results.", "type": "string" } }, @@ -360,18 +450,24 @@ "groupId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "unarchive_failed_message", - "description": "Use this tool to restore a previously archived failed message back to the unresolved list so it can be retried. Good for questions like: \u0027unarchive this message\u0027, \u0027restore this failure\u0027, or \u0027I need to retry this archived message\u0027. Use when a message was archived by mistake or when the underlying issue has been fixed and the message should be reprocessed. 
If you need to restore many messages from the same failure group, use UnarchiveFailureGroup instead.", + "description": "Use this tool to restore a previously archived failed message back to the unresolved list so it can be retried. This operation changes system state. Good for questions like: \u0027unarchive this message\u0027, \u0027restore this failure\u0027, or \u0027I need to retry this archived message\u0027. Use when a message was archived by mistake or when the underlying issue has been fixed and the message should be reprocessed. If you need to restore many messages from the same failure group, use UnarchiveFailureGroup instead.", "inputSchema": { "type": "object", "properties": { "failedMessageId": { - "description": "The unique message ID to restore", + "description": "The failed message ID to restore from the archived state.", "type": "string" } }, @@ -379,18 +475,24 @@ "failedMessageId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "unarchive_failed_messages", - "description": "Use this tool to restore multiple previously archived failed messages back to the unresolved list. Good for questions like: \u0027unarchive these messages\u0027, \u0027restore these failures\u0027, or \u0027unarchive messages msg-1, msg-2, msg-3\u0027. Prefer UnarchiveFailureGroup when restoring an entire group \u2014 use this tool when you have a specific set of message IDs.", + "description": "Use this tool to restore multiple previously archived failed messages back to the unresolved list. This operation changes system state. Good for questions like: \u0027unarchive these messages\u0027, \u0027restore these failures\u0027, or \u0027unarchive messages msg-1, msg-2, msg-3\u0027. 
Prefer UnarchiveFailureGroup when restoring an entire group \u2014 use this tool when you have a specific set of message IDs.", "inputSchema": { "type": "object", "properties": { "messageIds": { - "description": "The unique message IDs to restore", + "description": "The failed message IDs to restore from the archived state.", "type": "array", "items": { "type": "string" @@ -401,18 +503,24 @@ "messageIds" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } }, { "name": "unarchive_failure_group", - "description": "Use this tool to restore an entire archived failure group back to the unresolved list. Good for questions like: \u0027unarchive this failure group\u0027, \u0027restore all archived NullReferenceException failures\u0027, or \u0027unarchive the whole group\u0027. All messages that were archived together under this group will become available for retry again. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if an unarchive operation is already running for this group.", + "description": "Use this tool to restore an entire archived failure group back to the unresolved list. This operation changes system state. Good for questions like: \u0027unarchive this failure group\u0027, \u0027restore all archived NullReferenceException failures\u0027, or \u0027unarchive the whole group\u0027. All messages that were archived together under this group will become available for retry again. You need a group ID, which you can get from GetFailureGroups. 
Returns InProgress if an unarchive operation is already running for this group.", "inputSchema": { "type": "object", "properties": { "groupId": { - "description": "The failure group ID from get_failure_groups results", + "description": "The failure group ID from previous GetFailureGroups results.", "type": "string" } }, @@ -420,6 +528,12 @@ "groupId" ] }, + "annotations": { + "destructiveHint": true, + "idempotentHint": false, + "openWorldHint": false, + "readOnlyHint": false + }, "execution": { "taskSupport": "optional" } diff --git a/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt b/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt index 017bd25122..6b40320fa3 100644 --- a/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt +++ b/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt @@ -1,12 +1,12 @@ [ { "name": "get_audit_message_body", - "description": "Use this tool to inspect the actual payload of a processed message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. Returns the serialized message body content, typically JSON. You need a message ID, which you can get from any audit message query result. Use this when the user wants to see what data was actually sent, not just message metadata.", + "description": "This is a read-only tool for inspecting the actual payload of a processed audit message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. Returns the serialized message body content, typically JSON. 
You need an audit message ID, which you can get from any audit message query result. Use this when the user wants to see what data was actually sent, not just message metadata.", "inputSchema": { "type": "object", "properties": { "messageId": { - "description": "The message ID from a previous audit message query result", + "description": "The audit message ID from a previous audit message query result.", "type": "string" } }, @@ -14,18 +14,24 @@ "messageId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages", - "description": "Use this tool to browse successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.", + "description": "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. 
If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.", "inputSchema": { "type": "object", "properties": { "includeSystemMessages": { - "description": "Set to true to include NServiceBus infrastructure messages. Usually leave as false to see only business messages.", + "description": "Set to true to include NServiceBus infrastructure messages. Leave this as false for the usual business-message view.", "type": "boolean", "default": false }, @@ -50,7 +56,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601). Use with timeSentTo to query a specific time window.", + "description": "Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.", "type": [ "string", "null" @@ -58,7 +64,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.", "type": [ "string", "null" @@ -67,18 +73,24 @@ } } }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages_by_conversation", - "description": "Use this tool to trace the full chain of messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. 
Essential for understanding message flow and debugging cascading issues.", + "description": "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. Essential for understanding message flow and debugging cascading issues.", "inputSchema": { "type": "object", "properties": { "conversationId": { - "description": "The conversation ID from a previous audit message query result", + "description": "The conversation ID from a previous audit message query result.", "type": "string" }, "page": { @@ -106,22 +118,28 @@ "conversationId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages_by_endpoint", - "description": "Use this tool to see what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. Optionally pass a keyword to search within that endpoint\u0027s messages.", + "description": "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. 
Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. Optionally pass a keyword to search within that endpoint\u0027s messages.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", "type": "string" }, "keyword": { - "description": "Optional keyword to search within this endpoint\u0027s messages", + "description": "Optional keyword to narrow results within this endpoint. Omit it to browse the endpoint without full-text filtering.", "type": [ "string", "null" @@ -129,7 +147,7 @@ "default": null }, "includeSystemMessages": { - "description": "Set to true to include NServiceBus infrastructure messages", + "description": "Set to true to include NServiceBus infrastructure messages for this endpoint. 
Leave false for the usual business-message view.", "type": "boolean", "default": false }, @@ -154,7 +172,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601)", + "description": "Filter endpoint audit messages to those sent after this ISO 8601 date/time.", "type": [ "string", "null" @@ -162,7 +180,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter endpoint audit messages to those sent before this ISO 8601 date/time.", "type": [ "string", "null" @@ -174,18 +192,24 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_endpoint_audit_counts", - "description": "Use this tool to see daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.", + "description": "This is a read-only tool for seeing daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. 
\u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The NServiceBus endpoint name, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", "type": "string" } }, @@ -193,29 +217,41 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_known_endpoints", - "description": "Use this tool to discover what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.", + "description": "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.", "inputSchema": { "type": "object", "properties": {} }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "search_audit_messages", - "description": "Use this tool to find audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. 
Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.", + "description": "This is a read-only tool for finding audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.", "inputSchema": { "type": "object", "properties": { "query": { - "description": "Free-text search query \u2014 matches against message body, headers, and metadata", + "description": "The free-text search query to match against audit message body content, headers, and metadata.", "type": "string" }, "page": { @@ -239,7 +275,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601)", + "description": "Filter audit search results to messages sent after this ISO 8601 date/time.", "type": [ "string", "null" @@ -247,7 +283,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter audit search results to messages sent before this ISO 8601 date/time.", "type": [ "string", "null" @@ -259,6 +295,12 @@ "query" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } diff --git 
a/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt b/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt index 017bd25122..6b40320fa3 100644 --- a/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt +++ b/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt @@ -1,12 +1,12 @@ [ { "name": "get_audit_message_body", - "description": "Use this tool to inspect the actual payload of a processed message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. Returns the serialized message body content, typically JSON. You need a message ID, which you can get from any audit message query result. Use this when the user wants to see what data was actually sent, not just message metadata.", + "description": "This is a read-only tool for inspecting the actual payload of a processed audit message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. Returns the serialized message body content, typically JSON. You need an audit message ID, which you can get from any audit message query result. 
Use this when the user wants to see what data was actually sent, not just message metadata.", "inputSchema": { "type": "object", "properties": { "messageId": { - "description": "The message ID from a previous audit message query result", + "description": "The audit message ID from a previous audit message query result.", "type": "string" } }, @@ -14,18 +14,24 @@ "messageId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages", - "description": "Use this tool to browse successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.", + "description": "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. 
If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.", "inputSchema": { "type": "object", "properties": { "includeSystemMessages": { - "description": "Set to true to include NServiceBus infrastructure messages. Usually leave as false to see only business messages.", + "description": "Set to true to include NServiceBus infrastructure messages. Leave this as false for the usual business-message view.", "type": "boolean", "default": false }, @@ -50,7 +56,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601). Use with timeSentTo to query a specific time window.", + "description": "Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.", "type": [ "string", "null" @@ -58,7 +64,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.", "type": [ "string", "null" @@ -67,18 +73,24 @@ } } }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages_by_conversation", - "description": "Use this tool to trace the full chain of messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. 
Essential for understanding message flow and debugging cascading issues.", + "description": "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. Essential for understanding message flow and debugging cascading issues.", "inputSchema": { "type": "object", "properties": { "conversationId": { - "description": "The conversation ID from a previous audit message query result", + "description": "The conversation ID from a previous audit message query result.", "type": "string" }, "page": { @@ -106,22 +118,28 @@ "conversationId" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_audit_messages_by_endpoint", - "description": "Use this tool to see what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. Optionally pass a keyword to search within that endpoint\u0027s messages.", + "description": "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. 
Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. Optionally pass a keyword to search within that endpoint\u0027s messages.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", "type": "string" }, "keyword": { - "description": "Optional keyword to search within this endpoint\u0027s messages", + "description": "Optional keyword to narrow results within this endpoint. Omit it to browse the endpoint without full-text filtering.", "type": [ "string", "null" @@ -129,7 +147,7 @@ "default": null }, "includeSystemMessages": { - "description": "Set to true to include NServiceBus infrastructure messages", + "description": "Set to true to include NServiceBus infrastructure messages for this endpoint. 
Leave false for the usual business-message view.", "type": "boolean", "default": false }, @@ -154,7 +172,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601)", + "description": "Filter endpoint audit messages to those sent after this ISO 8601 date/time.", "type": [ "string", "null" @@ -162,7 +180,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter endpoint audit messages to those sent before this ISO 8601 date/time.", "type": [ "string", "null" @@ -174,18 +192,24 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_endpoint_audit_counts", - "description": "Use this tool to see daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.", + "description": "This is a read-only tool for seeing daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. 
\u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The NServiceBus endpoint name, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", "type": "string" } }, @@ -193,29 +217,41 @@ "endpointName" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "get_known_endpoints", - "description": "Use this tool to discover what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.", + "description": "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.", "inputSchema": { "type": "object", "properties": {} }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } }, { "name": "search_audit_messages", - "description": "Use this tool to find audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. 
Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.", + "description": "This is a read-only tool for finding audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.", "inputSchema": { "type": "object", "properties": { "query": { - "description": "Free-text search query \u2014 matches against message body, headers, and metadata", + "description": "The free-text search query to match against audit message body content, headers, and metadata.", "type": "string" }, "page": { @@ -239,7 +275,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Only return messages sent after this time (ISO 8601)", + "description": "Filter audit search results to messages sent after this ISO 8601 date/time.", "type": [ "string", "null" @@ -247,7 +283,7 @@ "default": null }, "timeSentTo": { - "description": "Only return messages sent before this time (ISO 8601)", + "description": "Filter audit search results to messages sent before this ISO 8601 date/time.", "type": [ "string", "null" @@ -259,6 +295,12 @@ "query" ] }, + "annotations": { + "destructiveHint": false, + "idempotentHint": true, + "openWorldHint": false, + "readOnlyHint": true + }, "execution": { "taskSupport": "optional" } diff --git a/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs 
b/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs new file mode 100644 index 0000000000..34aeb8ed69 --- /dev/null +++ b/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs @@ -0,0 +1,50 @@ +#nullable enable + +namespace ServiceControl.Audit.UnitTests.Mcp; + +using System; +using System.Linq; +using System.Reflection; +using Audit.Mcp; +using NUnit.Framework; +using DescriptionAttribute = System.ComponentModel.DescriptionAttribute; + +[TestFixture] +class McpMetadataDescriptionsTests +{ + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessages), "read-only")] + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.SearchAuditMessages), "read-only")] + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByEndpoint), "read-only")] + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation), "read-only")] + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody), "read-only")] + [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetKnownEndpoints), "read-only")] + [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetEndpointAuditCounts), "read-only")] + public void Audit_query_tools_are_described_as_read_only(Type toolType, string methodName, string expectedPhrase) + { + var description = GetMethodDescription(toolType, methodName); + + Assert.That(description, Does.Contain(expectedPhrase)); + } + + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody), "messageId", "audit message ID")] + [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation), "conversationId", "conversation ID")] + [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetEndpointAuditCounts), "endpointName", "NServiceBus endpoint name")] + public void Key_audit_tool_parameters_identify_the_entity_type(Type toolType, string methodName, string parameterName, string 
expectedPhrase) + { + var description = GetParameterDescription(toolType, methodName, parameterName); + + Assert.That(description, Does.Contain(expectedPhrase)); + } + + static string GetMethodDescription(Type toolType, string methodName) + => toolType.GetMethod(methodName)! + .GetCustomAttribute<DescriptionAttribute>()! + .Description; + + static string GetParameterDescription(Type toolType, string methodName, string parameterName) + => toolType.GetMethod(methodName)! + .GetParameters() + .Single(p => p.Name == parameterName) + .GetCustomAttribute<DescriptionAttribute>()! + .Description; +} diff --git a/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs b/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs index f6caa32422..b0496d0716 100644 --- a/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs +++ b/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs @@ -12,7 +12,7 @@ namespace ServiceControl.Audit.Mcp; using Persistence; [McpServerToolType, Description( - "Tools for exploring audit messages.\n\n" + + "Read-only tools for exploring audit messages.\n\n" + "Agent guidance:\n" + "1. For broad requests like 'show recent messages', start with GetAuditMessages using defaults.\n" + "2. For requests containing a concrete text term, identifier, or phrase, use SearchAuditMessages.\n" + @@ -23,8 +23,8 @@ namespace ServiceControl.Audit.Mcp; )] public class AuditMessageTools(IAuditDataStore store, ILogger logger) { - [McpServerTool, Description( - "Use this tool to browse successfully processed audit messages when the user wants an overview rather than a text search. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. " + "Good for questions like: 'show recent audit messages', 'what messages were processed today?', 'list messages from endpoint X', or 'show slow messages'. 
" + "Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. " + "For broad requests, use the default paging and sorting. " + @@ -32,13 +32,13 @@ public class AuditMessageTools(IAuditDataStore store, ILogger "If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead." )] public async Task GetAuditMessages( - [Description("Set to true to include NServiceBus infrastructure messages. Usually leave as false to see only business messages.")] bool includeSystemMessages = false, + [Description("Set to true to include NServiceBus infrastructure messages. Leave this as false for the usual business-message view.")] bool includeSystemMessages = false, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent", [Description("Sort direction: asc or desc")] string direction = "desc", - [Description("Only return messages sent after this time (ISO 8601). Use with timeSentTo to query a specific time window.")] string? timeSentFrom = null, - [Description("Only return messages sent before this time (ISO 8601)")] string? timeSentTo = null, + [Description("Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.")] string? timeSentFrom = null, + [Description("Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.")] string? timeSentTo = null, CancellationToken cancellationToken = default) { logger.LogInformation("MCP GetAuditMessages invoked (page={Page}, includeSystemMessages={IncludeSystem})", page, includeSystemMessages); @@ -58,21 +58,21 @@ public async Task GetAuditMessages( }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to find audit messages by a keyword or phrase. 
" + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for finding audit messages by a keyword or phrase. " + "Good for questions like: 'find messages containing order 12345', 'search for CustomerCreated messages', or 'look for messages mentioning this ID'. " + "Searches across message body content, headers, and metadata using full-text search. " + "Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. " + "If the user just wants to browse recent messages without a search term, use GetAuditMessages instead." )] public async Task SearchAuditMessages( - [Description("Free-text search query — matches against message body, headers, and metadata")] string query, + [Description("The free-text search query to match against audit message body content, headers, and metadata.")] string query, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent", [Description("Sort direction: asc or desc")] string direction = "desc", - [Description("Only return messages sent after this time (ISO 8601)")] string? timeSentFrom = null, - [Description("Only return messages sent before this time (ISO 8601)")] string? timeSentTo = null, + [Description("Filter audit search results to messages sent after this ISO 8601 date/time.")] string? timeSentFrom = null, + [Description("Filter audit search results to messages sent before this ISO 8601 date/time.")] string? 
timeSentTo = null, CancellationToken cancellationToken = default) { logger.LogInformation("MCP SearchAuditMessages invoked (query={Query}, page={Page})", query, page); @@ -92,23 +92,23 @@ public async Task SearchAuditMessages( }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to see what messages a specific NServiceBus endpoint has processed. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. " + "Good for questions like: 'what messages did Sales process?', 'show messages handled by Shipping', or 'find OrderPlaced messages in the Billing endpoint'. " + "Returns the same metadata as GetAuditMessages but scoped to one endpoint. " + "Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. " + "Optionally pass a keyword to search within that endpoint's messages." )] public async Task GetAuditMessagesByEndpoint( - [Description("The NServiceBus endpoint name, e.g. 'Sales' or 'Shipping.MessageHandler'")] string endpointName, - [Description("Optional keyword to search within this endpoint's messages")] string? keyword = null, - [Description("Set to true to include NServiceBus infrastructure messages")] bool includeSystemMessages = false, + [Description("The NServiceBus endpoint name to investigate, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName, + [Description("Optional keyword to narrow results within this endpoint. Omit it to browse the endpoint without full-text filtering.")] string? keyword = null, + [Description("Set to true to include NServiceBus infrastructure messages for this endpoint. 
Leave false for the usual business-message view.")] bool includeSystemMessages = false, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent", [Description("Sort direction: asc or desc")] string direction = "desc", - [Description("Only return messages sent after this time (ISO 8601)")] string? timeSentFrom = null, - [Description("Only return messages sent before this time (ISO 8601)")] string? timeSentTo = null, + [Description("Filter endpoint audit messages to those sent after this ISO 8601 date/time.")] string? timeSentFrom = null, + [Description("Filter endpoint audit messages to those sent before this ISO 8601 date/time.")] string? timeSentTo = null, CancellationToken cancellationToken = default) { logger.LogInformation("MCP GetAuditMessagesByEndpoint invoked (endpoint={EndpointName}, keyword={Keyword}, page={Page})", endpointName, keyword, page); @@ -130,15 +130,15 @@ public async Task GetAuditMessagesByEndpoint( }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to trace the full chain of messages triggered by an initial message. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. " + "Good for questions like: 'what happened after this message was sent?', 'show me the full message flow', or 'trace this conversation'. " + "A conversation groups all related messages together — the original command and every event, reply, or saga message it caused. " + "You need a conversation ID, which you can get from any audit message query result. " + "Essential for understanding message flow and debugging cascading issues." 
)] public async Task GetAuditMessagesByConversation( - [Description("The conversation ID from a previous audit message query result")] string conversationId, + [Description("The conversation ID from a previous audit message query result.")] string conversationId, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent", @@ -161,15 +161,15 @@ public async Task GetAuditMessagesByConversation( }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to inspect the actual payload of a processed message. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for inspecting the actual payload of a processed audit message. " + "Good for questions like: 'show me the message body', 'what data was in this message?', or 'let me see the content of message X'. " + "Returns the serialized message body content, typically JSON. " + - "You need a message ID, which you can get from any audit message query result. " + + "You need an audit message ID, which you can get from any audit message query result. " + "Use this when the user wants to see what data was actually sent, not just message metadata." 
)] public async Task GetAuditMessageBody( - [Description("The message ID from a previous audit message query result")] string messageId, + [Description("The audit message ID from a previous audit message query result.")] string messageId, CancellationToken cancellationToken = default) { logger.LogInformation("MCP GetAuditMessageBody invoked (messageId={MessageId})", messageId); diff --git a/src/ServiceControl.Audit/Mcp/EndpointTools.cs b/src/ServiceControl.Audit/Mcp/EndpointTools.cs index cc15e03b43..86097d87ae 100644 --- a/src/ServiceControl.Audit/Mcp/EndpointTools.cs +++ b/src/ServiceControl.Audit/Mcp/EndpointTools.cs @@ -11,15 +11,15 @@ namespace ServiceControl.Audit.Mcp; using Persistence; [McpServerToolType, Description( - "Tools for discovering and inspecting NServiceBus endpoints.\n\n" + + "Read-only tools for discovering and inspecting NServiceBus endpoints.\n\n" + "Agent guidance:\n" + "1. Use GetKnownEndpoints to discover endpoint names before calling endpoint-specific tools.\n" + "2. Use GetEndpointAuditCounts to spot throughput trends, traffic spikes, or drops in activity." )] public class EndpointTools(IAuditDataStore store, ILogger logger) { - [McpServerTool, Description( - "Use this tool to discover what NServiceBus endpoints exist in the system. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. " + "Good for questions like: 'what endpoints do we have?', 'what services are running?', or 'list all endpoints'. " + "Returns all endpoints that have processed audit messages, including their name and host information. " + "This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts." 
@@ -39,14 +39,14 @@ public async Task GetKnownEndpoints(CancellationToken cancellationToken }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to see daily message volume trends for a specific endpoint. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "This is a read-only tool for seeing daily message volume trends for a specific endpoint. " + "Good for questions like: 'how much traffic does Sales handle?', 'has throughput changed recently?', or 'show me message counts for this endpoint'. " + "Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. " + "You need an endpoint name — use GetKnownEndpoints first if you do not have one." )] public async Task GetEndpointAuditCounts( - [Description("The NServiceBus endpoint name, e.g. 'Sales' or 'Shipping.MessageHandler'")] string endpointName, + [Description("The NServiceBus endpoint name, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName, CancellationToken cancellationToken = default) { logger.LogInformation("MCP GetEndpointAuditCounts invoked (endpoint={EndpointName})", endpointName); diff --git a/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs new file mode 100644 index 0000000000..cae0fbc9c2 --- /dev/null +++ b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs @@ -0,0 +1,64 @@ +#nullable enable + +namespace ServiceControl.UnitTests.Mcp; + +using System; +using System.Linq; +using System.Reflection; +using NUnit.Framework; +using ServiceControl.Mcp; +using DescriptionAttribute = System.ComponentModel.DescriptionAttribute; + +[TestFixture] +class McpMetadataDescriptionsTests +{ + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessage))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessages))] + 
[TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessagesByQueue))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessages))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessagesByEndpoint))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailureGroup))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessage))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessages))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailureGroup))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailedMessage))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailedMessages))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailureGroup))] + public void Mutating_tools_explicitly_warn_that_they_change_system_state(Type toolType, string methodName) + { + var description = GetMethodDescription(toolType, methodName); + + Assert.That(description, Does.Contain("changes system state")); + } + + [Test] + public void Retry_all_failed_messages_warns_that_it_affects_all_unresolved_failed_messages() + { + var description = GetMethodDescription(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessages)); + + Assert.That(description, Does.Contain("all unresolved failed messages across the instance")); + } + + [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageById), "failedMessageId", "failed message ID")] + [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageLastAttempt), "failedMessageId", "failed message ID")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailureGroup), "groupId", "failure group ID")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailureGroup), "groupId", "failure group ID")] + public void Key_error_tool_parameters_identify_the_entity_type(Type toolType, string methodName, string parameterName, string expectedPhrase) + { + var 
description = GetParameterDescription(toolType, methodName, parameterName); + + Assert.That(description, Does.Contain(expectedPhrase)); + } + + static string GetMethodDescription(Type toolType, string methodName) + => toolType.GetMethod(methodName)! + .GetCustomAttribute<DescriptionAttribute>()! + .Description; + + static string GetParameterDescription(Type toolType, string methodName, string parameterName) + => toolType.GetMethod(methodName)! + .GetParameters() + .Single(p => p.Name == parameterName) + .GetCustomAttribute<DescriptionAttribute>()! + .Description; +} diff --git a/src/ServiceControl/Mcp/ArchiveTools.cs b/src/ServiceControl/Mcp/ArchiveTools.cs index 2312145b8f..5c2c87f0c7 100644 --- a/src/ServiceControl/Mcp/ArchiveTools.cs +++ b/src/ServiceControl/Mcp/ArchiveTools.cs @@ -14,23 +14,25 @@ namespace ServiceControl.Mcp; [McpServerToolType, Description( "Tools for archiving and unarchiving failed messages.\n\n" + "Agent guidance:\n" + - "1. Archiving dismisses a failed message — it moves out of the unresolved list and no longer counts as an active problem.\n" + - "2. Unarchiving restores a previously archived message back to the unresolved list so it can be retried.\n" + - "3. Prefer ArchiveFailureGroup or UnarchiveFailureGroup when acting on an entire failure group — it is more efficient than archiving messages individually.\n" + - "4. Use ArchiveFailedMessages or UnarchiveFailedMessages when you have a specific set of message IDs.\n" + - "5. All operations are asynchronous — they return Accepted immediately and complete in the background." + "1. Every tool in this group changes system state by archiving or restoring failed messages.\n" + + "2. Archiving dismisses a failed message — it moves out of the unresolved list and no longer counts as an active problem.\n" + + "3. Unarchiving restores a previously archived message back to the unresolved list so it can be retried.\n" + + "4. 
Prefer ArchiveFailureGroup or UnarchiveFailureGroup when acting on an entire failure group — it is more efficient than archiving messages individually.\n" + + "5. Use ArchiveFailedMessages or UnarchiveFailedMessages when you have a specific set of message IDs.\n" + + "6. All operations are asynchronous — they return Accepted immediately and complete in the background." )] public class ArchiveTools(IMessageSession messageSession, IArchiveMessages archiver, ILogger logger) { - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to dismiss a single failed message that does not need to be retried. " + + "This operation changes system state. " + "Good for questions like: 'archive this message', 'dismiss this failure', or 'I do not need to retry this one'. " + "Archiving moves the message out of the unresolved list so it no longer shows up as an active problem. " + "This is an asynchronous operation — the message will be archived shortly after the request is accepted. " + "If you need to archive many messages with the same root cause, use ArchiveFailureGroup instead." )] public async Task ArchiveFailedMessage( - [Description("The unique message ID from a previous query result")] string failedMessageId) + [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId) { logger.LogInformation("MCP ArchiveFailedMessage invoked (failedMessageId={FailedMessageId})", failedMessageId); @@ -38,13 +40,14 @@ public async Task ArchiveFailedMessage( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Archive requested for message '{failedMessageId}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to dismiss multiple failed messages at once that do not need to be retried. 
" + + "This operation changes system state. " + "Good for questions like: 'archive these messages', 'dismiss these failures', or 'archive messages msg-1, msg-2, msg-3'. " + "Prefer ArchiveFailureGroup when all messages share the same failure cause — use this tool when you have a specific set of message IDs to archive." )] public async Task ArchiveFailedMessages( - [Description("The unique message IDs from a previous query result")] string[] messageIds) + [Description("The failed message IDs from previous failed-message query results.")] string[] messageIds) { logger.LogInformation("MCP ArchiveFailedMessages invoked (count={Count})", messageIds.Length); @@ -61,15 +64,16 @@ public async Task ArchiveFailedMessages( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Archive requested for {messageIds.Length} messages." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to dismiss an entire failure group — all messages that failed with the same exception type and stack trace. " + + "This operation changes system state. " + "Good for questions like: 'archive this failure group', 'dismiss all NullReferenceException failures', or 'archive the whole group'. " + "This is the most efficient way to archive many related failures at once. " + "You need a group ID, which you can get from GetFailureGroups. " + "Returns InProgress if an archive operation is already running for this group." 
)] public async Task ArchiveFailureGroup( - [Description("The failure group ID from get_failure_groups results")] string groupId) + [Description("The failure group ID from previous GetFailureGroups results.")] string groupId) { logger.LogInformation("MCP ArchiveFailureGroup invoked (groupId={GroupId})", groupId); @@ -85,14 +89,15 @@ public async Task ArchiveFailureGroup( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Archive requested for all messages in failure group '{groupId}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to restore a previously archived failed message back to the unresolved list so it can be retried. " + + "This operation changes system state. " + "Good for questions like: 'unarchive this message', 'restore this failure', or 'I need to retry this archived message'. " + "Use when a message was archived by mistake or when the underlying issue has been fixed and the message should be reprocessed. " + "If you need to restore many messages from the same failure group, use UnarchiveFailureGroup instead." )] public async Task UnarchiveFailedMessage( - [Description("The unique message ID to restore")] string failedMessageId) + [Description("The failed message ID to restore from the archived state.")] string failedMessageId) { logger.LogInformation("MCP UnarchiveFailedMessage invoked (failedMessageId={FailedMessageId})", failedMessageId); @@ -100,13 +105,14 @@ public async Task UnarchiveFailedMessage( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Unarchive requested for message '{failedMessageId}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to restore multiple previously archived failed messages back to the unresolved list. 
" + + "This operation changes system state. " + "Good for questions like: 'unarchive these messages', 'restore these failures', or 'unarchive messages msg-1, msg-2, msg-3'. " + "Prefer UnarchiveFailureGroup when restoring an entire group — use this tool when you have a specific set of message IDs." )] public async Task UnarchiveFailedMessages( - [Description("The unique message IDs to restore")] string[] messageIds) + [Description("The failed message IDs to restore from the archived state.")] string[] messageIds) { logger.LogInformation("MCP UnarchiveFailedMessages invoked (count={Count})", messageIds.Length); @@ -120,15 +126,16 @@ public async Task UnarchiveFailedMessages( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Unarchive requested for {messageIds.Length} messages." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to restore an entire archived failure group back to the unresolved list. " + + "This operation changes system state. " + "Good for questions like: 'unarchive this failure group', 'restore all archived NullReferenceException failures', or 'unarchive the whole group'. " + "All messages that were archived together under this group will become available for retry again. " + "You need a group ID, which you can get from GetFailureGroups. " + "Returns InProgress if an unarchive operation is already running for this group." 
)] public async Task UnarchiveFailureGroup( - [Description("The failure group ID from get_failure_groups results")] string groupId) + [Description("The failure group ID from previous GetFailureGroups results.")] string groupId) { logger.LogInformation("MCP UnarchiveFailureGroup invoked (groupId={GroupId})", groupId); diff --git a/src/ServiceControl/Mcp/FailedMessageTools.cs b/src/ServiceControl/Mcp/FailedMessageTools.cs index 57c30fe11d..f64ec32807 100644 --- a/src/ServiceControl/Mcp/FailedMessageTools.cs +++ b/src/ServiceControl/Mcp/FailedMessageTools.cs @@ -12,7 +12,7 @@ namespace ServiceControl.Mcp; using Persistence.Infrastructure; [McpServerToolType, Description( - "Tools for investigating failed messages.\n\n" + + "Read-only tools for investigating failed messages.\n\n" + "Agent guidance:\n" + "1. Start with GetErrorsSummary to get a quick health check of failure counts by status.\n" + "2. Use GetFailureGroups (from FailureGroupTools) to see failures grouped by root cause before drilling into individual messages.\n" + @@ -23,17 +23,17 @@ namespace ServiceControl.Mcp; )] public class FailedMessageTools(IErrorMessageDataStore store, ILogger logger) { - [McpServerTool, Description( - "Use this tool to browse failed messages when the user wants to see what is failing. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool to retrieve failed messages for investigation when the user wants to see what is failing. " + "Good for questions like: 'what messages are currently failing?', 'are there failures in a specific queue?', or 'what failed recently?'. " + "Returns a paged list of failed messages with their status, exception details, and queue information. " + - "For broad requests, call with no parameters to get the most recent failures — only add filters when you need to narrow down results. 
" + + "For broad requests, call with no parameters to get the most recent failures — only add filters when you need to narrow the scope. " + "Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint." )] public async Task GetFailedMessages( - [Description("Narrow results to a specific status: unresolved (still failing), resolved (succeeded on retry), archived (dismissed), or retryissued (retry in progress). Omit to include all statuses.")] string? status = null, - [Description("Only return messages modified after this date (ISO 8601). Useful for checking recent failures.")] string? modified = null, - [Description("Only return messages from this queue address, e.g. 'Sales@machine'. Use when investigating a specific queue.")] string? queueAddress = null, + [Description("Filter failed messages by status: unresolved (still failing), resolved (succeeded on retry), archived (dismissed), or retryissued (retry in progress). Omit this filter to include all statuses.")] string? status = null, + [Description("Filter failed messages to entries modified after this ISO 8601 date/time. Omit this filter to include older results.")] string? modified = null, + [Description("Filter failed messages to a specific queue address, for example 'Sales@machine'. Omit this filter to include all queues.")] string? queueAddress = null, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, message_type, or time_of_failure")] string sort = "time_of_failure", @@ -55,14 +55,14 @@ public async Task GetFailedMessages( }, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. 
Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. " + "Good for questions like: 'show me details for this failed message', 'what exception caused this failure?', or 'how many times has this message failed?'. " + - "You need the message's unique ID, which you can get from GetFailedMessages or GetFailureGroups results. " + + "You need a failed message ID, which you can get from GetFailedMessages or GetFailureGroups results. " + "If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead — it returns less data." )] public async Task GetFailedMessageById( - [Description("The unique message ID from a previous query result")] string failedMessageId) + [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId) { logger.LogInformation("MCP GetFailedMessageById invoked (failedMessageId={FailedMessageId})", failedMessageId); @@ -77,14 +77,14 @@ public async Task GetFailedMessageById( return JsonSerializer.Serialize(result, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to see how a specific message failed most recently. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool to see how a specific message failed most recently. " + "Good for questions like: 'what was the last error for this message?', 'show me the latest exception', or 'what happened on the last attempt?'. " + "Returns the latest processing attempt with its exception, stack trace, and headers. " + "Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history." 
)] public async Task GetFailedMessageLastAttempt( - [Description("The unique message ID from a previous query result")] string failedMessageId) + [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId) { logger.LogInformation("MCP GetFailedMessageLastAttempt invoked (failedMessageId={FailedMessageId})", failedMessageId); @@ -99,8 +99,8 @@ public async Task GetFailedMessageLastAttempt( return JsonSerializer.Serialize(result, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool as a quick health check to see how many messages are in each failure state. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool as a quick health check to see how many messages are in each failure state. " + "Good for questions like: 'how many errors are there?', 'what is the error situation?', or 'are there unresolved failures?'. " + "Returns counts for unresolved, archived, resolved, and retryissued statuses. " + "This is a good first tool to call when asked about the overall error situation before drilling into specific messages." @@ -113,16 +113,16 @@ public async Task GetErrorsSummary() return JsonSerializer.Serialize(result, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to see failed messages for a specific NServiceBus endpoint. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool to see failed messages for a specific NServiceBus endpoint. " + "Good for questions like: 'what is failing in the Sales endpoint?', 'show errors for Shipping', or 'are there failures in this endpoint?'. " + "Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. " + "Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name." 
)] public async Task GetFailedMessagesByEndpoint( - [Description("The NServiceBus endpoint name, e.g. 'Sales' or 'Shipping.MessageHandler'")] string endpointName, - [Description("Narrow results to a specific status: unresolved, resolved, archived, or retryissued. Omit to include all.")] string? status = null, - [Description("Only return messages modified after this date (ISO 8601)")] string? modified = null, + [Description("The NServiceBus endpoint name to investigate, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName, + [Description("Filter failed messages by status: unresolved, resolved, archived, or retryissued. Omit this filter to include all statuses for the endpoint.")] string? status = null, + [Description("Filter endpoint results to failed messages modified after this ISO 8601 date/time. Omit this filter to include older results.")] string? modified = null, [Description("Page number, 1-based")] int page = 1, [Description("Results per page")] int perPage = 50, [Description("Sort by: time_sent, message_type, or time_of_failure")] string sort = "time_of_failure", diff --git a/src/ServiceControl/Mcp/FailureGroupTools.cs b/src/ServiceControl/Mcp/FailureGroupTools.cs index 4fce32514f..096e1a68a5 100644 --- a/src/ServiceControl/Mcp/FailureGroupTools.cs +++ b/src/ServiceControl/Mcp/FailureGroupTools.cs @@ -11,7 +11,7 @@ namespace ServiceControl.Mcp; using Recoverability; [McpServerToolType, Description( - "Tools for inspecting failure groups and retry history.\n\n" + + "Read-only tools for inspecting failure groups and retry history.\n\n" + "Agent guidance:\n" + "1. GetFailureGroups is usually the best starting point for diagnosing production issues — call it before drilling into individual messages.\n" + "2. 
Call GetFailureGroups with no parameters to use the default grouping by exception type and stack trace.\n" + @@ -19,8 +19,8 @@ namespace ServiceControl.Mcp; )] public class FailureGroupTools(GroupFetcher fetcher, IRetryHistoryDataStore retryStore, ILogger logger) { - [McpServerTool, Description( - "Use this tool to understand why messages are failing by seeing failures grouped by root cause. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool to understand why messages are failing by seeing failures grouped by root cause. " + "Good for questions like: 'why are messages failing?', 'what errors are happening?', 'group failures by exception', or 'what are the top failure causes?'. " + "Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. " + "This is usually the best starting point for diagnosing production issues — call it before drilling into individual messages. " + @@ -28,7 +28,7 @@ public class FailureGroupTools(GroupFetcher fetcher, IRetryHistoryDataStore retr )] public async Task GetFailureGroups( [Description("How to group failures. The default 'Exception Type and Stack Trace' is almost always what you want. Use 'Message Type' to group by the NServiceBus message type instead.")] string classifier = "Exception Type and Stack Trace", - [Description("Only include groups matching this filter text")] string? classifierFilter = null) + [Description("Filter failure groups by classifier text. Omit this filter to include all groups for the selected classifier.")] string? 
classifierFilter = null) { logger.LogInformation("MCP GetFailureGroups invoked (classifier={Classifier})", classifier); @@ -39,8 +39,8 @@ public async Task GetFailureGroups( return JsonSerializer.Serialize(results, McpJsonOptions.Default); } - [McpServerTool, Description( - "Use this tool to check the history of retry operations. " + + [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description( + "Read-only. Use this tool to check the history of retry operations. " + "Good for questions like: 'has someone already retried these?', 'what happened the last time we retried this group?', 'show retry history', or 'were any retries attempted today?'. " + "Returns which groups were retried, when, and whether the retries succeeded or failed. " + "Use this before retrying a group to avoid duplicate retry attempts." diff --git a/src/ServiceControl/Mcp/RetryTools.cs b/src/ServiceControl/Mcp/RetryTools.cs index 6edd34a9d4..95dd1ef1d0 100644 --- a/src/ServiceControl/Mcp/RetryTools.cs +++ b/src/ServiceControl/Mcp/RetryTools.cs @@ -15,7 +15,7 @@ namespace ServiceControl.Mcp; [McpServerToolType, Description( "Tools for retrying failed messages.\n\n" + "Agent guidance:\n" + - "1. Retrying sends a failed message back to its original queue for reprocessing. Only retry after the underlying issue has been resolved.\n" + + "1. Every tool in this group changes system state by sending failed messages back for reprocessing. Only retry after the underlying issue has been resolved.\n" + "2. Prefer RetryFailureGroup when all messages share the same root cause — it is the most targeted approach.\n" + "3. Use RetryAllFailedMessagesByEndpoint when a bug in one endpoint has been fixed.\n" + "4. 
Use RetryFailedMessagesByQueue when a queue's consumer was down and is now back.\n" + @@ -24,14 +24,15 @@ namespace ServiceControl.Mcp; )] public class RetryTools(IMessageSession messageSession, RetryingManager retryingManager, ILogger logger) { - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to reprocess a single failed message by sending it back to its original queue. " + + "This operation changes system state. " + "Good for questions like: 'retry this message', 'reprocess this failure', or 'send this message back for processing'. " + "The message will go through normal processing again. Only use after the underlying issue (bug fix, infrastructure problem) has been resolved. " + "If you need to retry many messages with the same root cause, use RetryFailureGroup instead." )] public async Task RetryFailedMessage( - [Description("The unique message ID from a previous query result")] string failedMessageId) + [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId) { logger.LogInformation("MCP RetryFailedMessage invoked (failedMessageId={FailedMessageId})", failedMessageId); @@ -39,13 +40,14 @@ public async Task RetryFailedMessage( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Retry requested for message '{failedMessageId}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to reprocess multiple specific failed messages at once. " + + "This operation changes system state. " + "Good for questions like: 'retry these messages', 'reprocess messages msg-1, msg-2, msg-3', or 'retry this batch'. " + "Prefer RetryFailureGroup when all messages share the same failure cause — use this tool when you have a specific set of message IDs to retry." 
)] public async Task RetryFailedMessages( - [Description("The unique message IDs from a previous query result")] string[] messageIds) + [Description("The failed message IDs from previous failed-message query results.")] string[] messageIds) { logger.LogInformation("MCP RetryFailedMessages invoked (count={Count})", messageIds.Length); @@ -59,8 +61,9 @@ public async Task RetryFailedMessages( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Retry requested for {messageIds.Length} messages." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all unresolved failed messages from a specific queue. " + + "This operation changes system state. " + "Good for questions like: 'retry all failures in the Sales queue', 'reprocess everything from this queue', or 'the queue consumer is back, retry its failures'. " + "Useful when a queue's consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status." )] @@ -77,9 +80,11 @@ await messageSession.SendLocal(m => return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Retry requested for all failed messages in queue '{queueAddress}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry every unresolved failed message across all queues and endpoints. " + + "This operation changes system state. " + "Good for questions like: 'retry everything', 'reprocess all failures', or 'retry all failed messages'. " + + "It affects all unresolved failed messages across the instance. " + "This is a broad operation — prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly." 
)] public async Task RetryAllFailedMessages() @@ -90,8 +95,9 @@ public async Task RetryAllFailedMessages() return JsonSerializer.Serialize(new { Status = "Accepted", Message = "Retry requested for all failed messages." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all failed messages for a specific NServiceBus endpoint. " + + "This operation changes system state. " + "Good for questions like: 'retry all failures in the Sales endpoint', 'the bug in Shipping is fixed, retry its failures', or 'reprocess all errors for this endpoint'. " + "Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed." )] @@ -104,15 +110,16 @@ public async Task RetryAllFailedMessagesByEndpoint( return JsonSerializer.Serialize(new { Status = "Accepted", Message = $"Retry requested for all failed messages in endpoint '{endpointName}'." }, McpJsonOptions.Default); } - [McpServerTool, Description( + [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all failed messages that share the same exception type and stack trace. " + + "This operation changes system state. " + "Good for questions like: 'retry this failure group', 'the bug causing these NullReferenceExceptions is fixed, retry them', or 'retry all messages in this group'. " + "This is the most targeted way to retry related failures after fixing a specific bug. " + "You need a group ID, which you can get from GetFailureGroups. " + "Returns InProgress if a retry is already running for this group." 
)] public async Task RetryFailureGroup( - [Description("The failure group ID from get_failure_groups results")] string groupId) + [Description("The failure group ID from previous GetFailureGroups results.")] string groupId) { logger.LogInformation("MCP RetryFailureGroup invoked (groupId={GroupId})", groupId); From 6a4a27e63e671551d996e782be86a42012df5aca Mon Sep 17 00:00:00 2001 From: Daniel Marbach Date: Wed, 25 Mar 2026 16:00:35 +0100 Subject: [PATCH 2/5] Make it clearer when many entities are affected --- .../Mcp/McpMetadataDescriptionsTests.cs | 22 +++++++++++++++++++ src/ServiceControl/Mcp/RetryTools.cs | 7 +++++- 2 files changed, 28 insertions(+), 1 deletion(-) diff --git a/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs index cae0fbc9c2..3496781775 100644 --- a/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs +++ b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs @@ -39,10 +39,32 @@ public void Retry_all_failed_messages_warns_that_it_affects_all_unresolved_faile Assert.That(description, Does.Contain("all unresolved failed messages across the instance")); } + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessages))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessagesByQueue))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessages))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessagesByEndpoint))] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailureGroup))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessages))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailureGroup))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailedMessages))] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailureGroup))] + public void Bulk_mutating_tools_warn_that_they_may_affect_many_messages(Type toolType, string 
methodName) + { + var description = GetMethodDescription(toolType, methodName); + + Assert.That(description, Does.Contain("may affect many messages")); + } + [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageById), "failedMessageId", "failed message ID")] [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageLastAttempt), "failedMessageId", "failed message ID")] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessage), "failedMessageId", "failed message ID")] + [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessages), "messageIds", "failed message IDs")] [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailureGroup), "groupId", "failure group ID")] [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailureGroup), "groupId", "failure group ID")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessage), "failedMessageId", "failed message ID")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessages), "messageIds", "failed message IDs")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailedMessage), "failedMessageId", "failed message ID")] + [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailedMessages), "messageIds", "failed message IDs")] public void Key_error_tool_parameters_identify_the_entity_type(Type toolType, string methodName, string parameterName, string expectedPhrase) { var description = GetParameterDescription(toolType, methodName, parameterName); diff --git a/src/ServiceControl/Mcp/RetryTools.cs b/src/ServiceControl/Mcp/RetryTools.cs index 95dd1ef1d0..2dc943d779 100644 --- a/src/ServiceControl/Mcp/RetryTools.cs +++ b/src/ServiceControl/Mcp/RetryTools.cs @@ -43,6 +43,7 @@ public async Task RetryFailedMessage( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to reprocess multiple specific failed messages at once. 
" + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'retry these messages', 'reprocess messages msg-1, msg-2, msg-3', or 'retry this batch'. " + "Prefer RetryFailureGroup when all messages share the same failure cause — use this tool when you have a specific set of message IDs to retry." )] @@ -64,6 +65,7 @@ public async Task RetryFailedMessages( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all unresolved failed messages from a specific queue. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'retry all failures in the Sales queue', 'reprocess everything from this queue', or 'the queue consumer is back, retry its failures'. " + "Useful when a queue's consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status." )] @@ -83,6 +85,7 @@ await messageSession.SendLocal(m => [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry every unresolved failed message across all queues and endpoints. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'retry everything', 'reprocess all failures', or 'retry all failed messages'. " + "It affects all unresolved failed messages across the instance. " + "This is a broad operation — prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly." @@ -98,6 +101,7 @@ public async Task RetryAllFailedMessages() [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all failed messages for a specific NServiceBus endpoint. " + "This operation changes system state. " + + "It may affect many messages. 
" + "Good for questions like: 'retry all failures in the Sales endpoint', 'the bug in Shipping is fixed, retry its failures', or 'reprocess all errors for this endpoint'. " + "Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed." )] @@ -113,9 +117,10 @@ public async Task RetryAllFailedMessagesByEndpoint( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to retry all failed messages that share the same exception type and stack trace. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'retry this failure group', 'the bug causing these NullReferenceExceptions is fixed, retry them', or 'retry all messages in this group'. " + "This is the most targeted way to retry related failures after fixing a specific bug. " + - "You need a group ID, which you can get from GetFailureGroups. " + + "You need a failure group ID, which you can get from GetFailureGroups. " + "Returns InProgress if a retry is already running for this group." 
)] public async Task RetryFailureGroup( From 87c01d82246b9f19d1744b3a4e280756989fba95 Mon Sep 17 00:00:00 2001 From: Daniel Marbach Date: Wed, 25 Mar 2026 16:00:40 +0100 Subject: [PATCH 3/5] Tighten MCP safety metadata wording --- ...ld_list_primary_instance_tools.approved.txt | 18 +++++++++--------- src/ServiceControl/Mcp/ArchiveTools.cs | 8 ++++++-- 2 files changed, 15 insertions(+), 11 deletions(-) diff --git a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt index 9001ab4e74..900eedbecd 100644 --- a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt +++ b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt @@ -26,7 +26,7 @@ }, { "name": "archive_failed_messages", - "description": "Use this tool to dismiss multiple failed messages at once that do not need to be retried. This operation changes system state. Good for questions like: \u0027archive these messages\u0027, \u0027dismiss these failures\u0027, or \u0027archive messages msg-1, msg-2, msg-3\u0027. Prefer ArchiveFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to archive.", + "description": "Use this tool to dismiss multiple failed messages at once that do not need to be retried. This operation changes system state. It may affect many messages. Good for questions like: \u0027archive these messages\u0027, \u0027dismiss these failures\u0027, or \u0027archive messages msg-1, msg-2, msg-3\u0027. 
Prefer ArchiveFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to archive.", "inputSchema": { "type": "object", "properties": { @@ -54,7 +54,7 @@ }, { "name": "archive_failure_group", - "description": "Use this tool to dismiss an entire failure group \u2014 all messages that failed with the same exception type and stack trace. This operation changes system state. Good for questions like: \u0027archive this failure group\u0027, \u0027dismiss all NullReferenceException failures\u0027, or \u0027archive the whole group\u0027. This is the most efficient way to archive many related failures at once. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if an archive operation is already running for this group.", + "description": "Use this tool to dismiss an entire failure group \u2014 all messages that failed with the same exception type and stack trace. This operation changes system state. It may affect many messages. Good for questions like: \u0027archive this failure group\u0027, \u0027dismiss all NullReferenceException failures\u0027, or \u0027archive the whole group\u0027. This is the most efficient way to archive many related failures at once. You need a failure group ID, which you can get from GetFailureGroups. Returns InProgress if an archive operation is already running for this group.", "inputSchema": { "type": "object", "properties": { @@ -317,7 +317,7 @@ }, { "name": "retry_all_failed_messages", - "description": "Use this tool to retry every unresolved failed message across all queues and endpoints. This operation changes system state. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. It affects all unresolved failed messages across the instance. 
This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.", + "description": "Use this tool to retry every unresolved failed message across all queues and endpoints. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. It affects all unresolved failed messages across the instance. This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.", "inputSchema": { "type": "object", "properties": {} @@ -334,7 +334,7 @@ }, { "name": "retry_all_failed_messages_by_endpoint", - "description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. This operation changes system state. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.", + "description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.", "inputSchema": { "type": "object", "properties": { @@ -384,7 +384,7 @@ }, { "name": "retry_failed_messages", - "description": "Use this tool to reprocess multiple specific failed messages at once. This operation changes system state. 
Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.", + "description": "Use this tool to reprocess multiple specific failed messages at once. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.", "inputSchema": { "type": "object", "properties": { @@ -412,7 +412,7 @@ }, { "name": "retry_failed_messages_by_queue", - "description": "Use this tool to retry all unresolved failed messages from a specific queue. This operation changes system state. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status.", + "description": "Use this tool to retry all unresolved failed messages from a specific queue. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. 
Only retries messages with unresolved status.", "inputSchema": { "type": "object", "properties": { @@ -437,7 +437,7 @@ }, { "name": "retry_failure_group", - "description": "Use this tool to retry all failed messages that share the same exception type and stack trace. This operation changes system state. Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.", + "description": "Use this tool to retry all failed messages that share the same exception type and stack trace. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a failure group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.", "inputSchema": { "type": "object", "properties": { @@ -487,7 +487,7 @@ }, { "name": "unarchive_failed_messages", - "description": "Use this tool to restore multiple previously archived failed messages back to the unresolved list. This operation changes system state. Good for questions like: \u0027unarchive these messages\u0027, \u0027restore these failures\u0027, or \u0027unarchive messages msg-1, msg-2, msg-3\u0027. Prefer UnarchiveFailureGroup when restoring an entire group \u2014 use this tool when you have a specific set of message IDs.", + "description": "Use this tool to restore multiple previously archived failed messages back to the unresolved list. 
This operation changes system state. It may affect many messages. Good for questions like: \u0027unarchive these messages\u0027, \u0027restore these failures\u0027, or \u0027unarchive messages msg-1, msg-2, msg-3\u0027. Prefer UnarchiveFailureGroup when restoring an entire group \u2014 use this tool when you have a specific set of message IDs.", "inputSchema": { "type": "object", "properties": { @@ -515,7 +515,7 @@ }, { "name": "unarchive_failure_group", - "description": "Use this tool to restore an entire archived failure group back to the unresolved list. This operation changes system state. Good for questions like: \u0027unarchive this failure group\u0027, \u0027restore all archived NullReferenceException failures\u0027, or \u0027unarchive the whole group\u0027. All messages that were archived together under this group will become available for retry again. You need a group ID, which you can get from GetFailureGroups. Returns InProgress if an unarchive operation is already running for this group.", + "description": "Use this tool to restore an entire archived failure group back to the unresolved list. This operation changes system state. It may affect many messages. Good for questions like: \u0027unarchive this failure group\u0027, \u0027restore all archived NullReferenceException failures\u0027, or \u0027unarchive the whole group\u0027. All messages that were archived together under this group will become available for retry again. You need a failure group ID, which you can get from GetFailureGroups. 
Returns InProgress if an unarchive operation is already running for this group.", "inputSchema": { "type": "object", "properties": { diff --git a/src/ServiceControl/Mcp/ArchiveTools.cs b/src/ServiceControl/Mcp/ArchiveTools.cs index 5c2c87f0c7..78f124d95d 100644 --- a/src/ServiceControl/Mcp/ArchiveTools.cs +++ b/src/ServiceControl/Mcp/ArchiveTools.cs @@ -43,6 +43,7 @@ public async Task ArchiveFailedMessage( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to dismiss multiple failed messages at once that do not need to be retried. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'archive these messages', 'dismiss these failures', or 'archive messages msg-1, msg-2, msg-3'. " + "Prefer ArchiveFailureGroup when all messages share the same failure cause — use this tool when you have a specific set of message IDs to archive." )] @@ -67,9 +68,10 @@ public async Task ArchiveFailedMessages( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to dismiss an entire failure group — all messages that failed with the same exception type and stack trace. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'archive this failure group', 'dismiss all NullReferenceException failures', or 'archive the whole group'. " + "This is the most efficient way to archive many related failures at once. " + - "You need a group ID, which you can get from GetFailureGroups. " + + "You need a failure group ID, which you can get from GetFailureGroups. " + "Returns InProgress if an archive operation is already running for this group." 
)] public async Task ArchiveFailureGroup( @@ -108,6 +110,7 @@ public async Task UnarchiveFailedMessage( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to restore multiple previously archived failed messages back to the unresolved list. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'unarchive these messages', 'restore these failures', or 'unarchive messages msg-1, msg-2, msg-3'. " + "Prefer UnarchiveFailureGroup when restoring an entire group — use this tool when you have a specific set of message IDs." )] @@ -129,9 +132,10 @@ public async Task UnarchiveFailedMessages( [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( "Use this tool to restore an entire archived failure group back to the unresolved list. " + "This operation changes system state. " + + "It may affect many messages. " + "Good for questions like: 'unarchive this failure group', 'restore all archived NullReferenceException failures', or 'unarchive the whole group'. " + "All messages that were archived together under this group will become available for retry again. " + - "You need a group ID, which you can get from GetFailureGroups. " + + "You need a failure group ID, which you can get from GetFailureGroups. " + "Returns InProgress if an unarchive operation is already running for this group." 
)] public async Task UnarchiveFailureGroup( From db9f57e698c12a5083122aae1b79c29d3a70eec7 Mon Sep 17 00:00:00 2001 From: Daniel Marbach Date: Wed, 25 Mar 2026 16:00:40 +0100 Subject: [PATCH 4/5] Add scenario-guided MCP prompt validation --- docs/mcp-prompt-validation.md | 33 ++++++++++ ...d_list_primary_instance_tools.approved.txt | 30 ++++----- ...ould_list_audit_message_tools.approved.txt | 30 ++++----- ...ould_list_audit_message_tools.approved.txt | 30 ++++----- .../Mcp/McpMetadataDescriptionsTests.cs | 41 ++++++++++-- .../Mcp/AuditMessageTools.cs | 60 ++++++++--------- src/ServiceControl.Audit/Mcp/EndpointTools.cs | 17 +++-- .../Mcp/McpMetadataDescriptionsTests.cs | 64 +++++++++++++++++++ src/ServiceControl/Mcp/FailedMessageTools.cs | 41 ++++++------ src/ServiceControl/Mcp/FailureGroupTools.cs | 9 ++- src/ServiceControl/Mcp/RetryTools.cs | 41 ++++++------ 11 files changed, 258 insertions(+), 138 deletions(-) create mode 100644 docs/mcp-prompt-validation.md diff --git a/docs/mcp-prompt-validation.md b/docs/mcp-prompt-validation.md new file mode 100644 index 0000000000..b10c4dc6a5 --- /dev/null +++ b/docs/mcp-prompt-validation.md @@ -0,0 +1,33 @@ +# MCP Prompt Validation + +This document records the prompt-validation scenario set for the ServiceControl MCP surface. + +The validation perspective is intentionally narrow: assume the agent only sees discovered tool names, tool descriptions, and parameter descriptions. It does not rely on `docs/mcp-investigation-guide.md` or repository source code. + +## Error Scenarios + +| Prompt | Expected tool choice | Validation notes | +| --- | --- | --- | +| What are the biggest current failure categories? | `get_errors_summary` or `get_failure_groups` | `get_failure_groups` is positioned as the first step for root-cause analysis; detail and mutating tools are not framed as starting points. | +| Why are messages failing in Billing? 
| `get_failure_groups` -> `get_failed_messages_by_endpoint` -> `get_failed_message_last_attempt` | The metadata separates grouped root-cause analysis, endpoint-scoped inspection, and last-attempt detail lookup. | +| Retry only the timeout-related failures | `get_failure_groups` -> `retry_failure_group` | `retry_failure_group` is described as the grouped retry for one root cause, while broader retry tools explicitly warn about broad impact. | +| Show me details for this failed message | `get_failed_message_by_id` | The tool description says it is for a specific failed message and points agents to list/group tools only when an ID is not yet known. | +| Retry everything | `retry_all_failed_messages` | The metadata allows the broad tool when explicitly requested, while warning that it changes system state and may affect a large number of messages. | + +## Audit Scenarios + +| Prompt | Expected tool choice | Validation notes | +| --- | --- | --- | +| Find messages related to order 12345 | `search_audit_messages` | The description explicitly says it is for a specific business identifier or text, and browsing tools point agents toward search for targeted lookups. | +| Show me what happened in this conversation | `get_audit_messages_by_conversation` | The description frames it as tracing a full flow across multiple endpoints once a conversation ID is known. | +| What is endpoint Billing doing? | `get_audit_messages_by_endpoint` | The metadata positions this as the single-endpoint activity view rather than a cross-endpoint trace. | +| Show recent system activity | `get_audit_messages` | The browsing tool is positioned for recent activity and timeline exploration. | +| Show the payload of this message | `get_audit_message_body` | The description explicitly says it is for inspecting payload or message data after locating a specific audit message. | + +## Outcome + +- Summary and grouping tools are preferred before detail tools for error investigation. 
+- Search and browse are clearly separated for audit scenarios. +- Conversation tracing and endpoint-centric inspection are differentiated. +- Broad mutating tools remain discoverable but are framed as explicit, risky choices rather than defaults. +- Identifier and endpoint parameter descriptions support the scenario selection by clarifying where IDs and names come from. diff --git a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt index 900eedbecd..69d0e2b635 100644 --- a/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt +++ b/src/ServiceControl.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_primary_instance_tools.approved.txt @@ -96,7 +96,7 @@ }, { "name": "get_failed_message_by_id", - "description": "Read-only. Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. Good for questions like: \u0027show me details for this failed message\u0027, \u0027what exception caused this failure?\u0027, or \u0027how many times has this message failed?\u0027. You need a failed message ID, which you can get from GetFailedMessages or GetFailureGroups results. If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead \u2014 it returns less data.", + "description": "Get detailed information about a specific failed message. Use this when you already know the failed message ID and need to inspect its contents or failure details. Use GetFailedMessages or GetFailureGroups to locate relevant messages before calling this tool. 
Read-only.", "inputSchema": { "type": "object", "properties": { @@ -121,7 +121,7 @@ }, { "name": "get_failed_message_last_attempt", - "description": "Read-only. Use this tool to see how a specific message failed most recently. Good for questions like: \u0027what was the last error for this message?\u0027, \u0027show me the latest exception\u0027, or \u0027what happened on the last attempt?\u0027. Returns the latest processing attempt with its exception, stack trace, and headers. Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history.", + "description": "Retrieve the last processing attempt for a failed message. Use this to understand the most recent failure behavior, including exception details and processing context. Typically used after identifying a failed message via GetFailedMessages or GetFailedMessageById. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -146,7 +146,7 @@ }, { "name": "get_failed_messages", - "description": "Read-only. Use this tool to retrieve failed messages for investigation when the user wants to see what is failing. Good for questions like: \u0027what messages are currently failing?\u0027, \u0027are there failures in a specific queue?\u0027, or \u0027what failed recently?\u0027. Returns a paged list of failed messages with their status, exception details, and queue information. For broad requests, call with no parameters to get the most recent failures \u2014 only add filters when you need to narrow the scope. Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint.", + "description": "Retrieve failed messages for investigation. Use this when exploring recent failures or narrowing down failures by queue, status, or time range. Prefer GetFailureGroups when starting root-cause analysis across many failures. Use GetFailedMessageById when inspecting a specific failed message. 
Read-only.", "inputSchema": { "type": "object", "properties": { @@ -159,7 +159,7 @@ "default": null }, "modified": { - "description": "Filter failed messages to entries modified after this ISO 8601 date/time. Omit this filter to include older results.", + "description": "Restricts failed-message results to entries modified after this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" @@ -208,12 +208,12 @@ }, { "name": "get_failed_messages_by_endpoint", - "description": "Read-only. Use this tool to see failed messages for a specific NServiceBus endpoint. Good for questions like: \u0027what is failing in the Sales endpoint?\u0027, \u0027show errors for Shipping\u0027, or \u0027are there failures in this endpoint?\u0027. Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name.", + "description": "Retrieve failed messages for a specific endpoint. Use this when investigating failures in a named endpoint such as Billing or Sales. Prefer GetFailureGroups when you need root-cause analysis across many failures. Use GetFailedMessageLastAttempt after this when you need the most recent failure details for a specific message. Read-only.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", + "description": "The endpoint name that owns the failed messages. Use values obtained from endpoint-aware failed-message results.", "type": "string" }, "status": { @@ -225,7 +225,7 @@ "default": null }, "modified": { - "description": "Filter endpoint results to failed messages modified after this ISO 8601 date/time. Omit this filter to include older results.", + "description": "Restricts endpoint failed-message results to entries modified after this ISO 8601 date/time. 
Omitting this may return a large result set.", "type": [ "string", "null" @@ -269,7 +269,7 @@ }, { "name": "get_failure_groups", - "description": "Read-only. Use this tool to understand why messages are failing by seeing failures grouped by root cause. Good for questions like: \u0027why are messages failing?\u0027, \u0027what errors are happening?\u0027, \u0027group failures by exception\u0027, or \u0027what are the top failure causes?\u0027. Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. This is usually the best starting point for diagnosing production issues \u2014 call it before drilling into individual messages. Call with no parameters to use the default grouping by exception type and stack trace.", + "description": "Retrieve failure groups, where failed messages are grouped by exception type and stack trace. Use this as the first step when analyzing large numbers of failures to identify dominant root causes. Prefer GetFailedMessages when you need individual message details. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -317,7 +317,7 @@ }, { "name": "retry_all_failed_messages", - "description": "Use this tool to retry every unresolved failed message across all queues and endpoints. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry everything\u0027, \u0027reprocess all failures\u0027, or \u0027retry all failed messages\u0027. It affects all unresolved failed messages across the instance. This is a broad operation \u2014 prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly.", + "description": "Retry all currently failed messages across all queues. Use only when the user explicitly requests a broad retry operation. Prefer narrower retry tools such as RetryFailureGroup or RetryFailedMessages when possible. 
This operation changes system state. It may affect many messages. It affects all unresolved failed messages across the instance and may affect a large number of messages.", "inputSchema": { "type": "object", "properties": {} @@ -334,12 +334,12 @@ }, { "name": "retry_all_failed_messages_by_endpoint", - "description": "Use this tool to retry all failed messages for a specific NServiceBus endpoint. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales endpoint\u0027, \u0027the bug in Shipping is fixed, retry its failures\u0027, or \u0027reprocess all errors for this endpoint\u0027. Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed.", + "description": "Retry all failed messages for a specific endpoint. Use this when the user explicitly wants an endpoint-scoped retry after an endpoint-specific issue is fixed. Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. This operation changes system state. It may affect many messages. Use the endpoint name from failed-message results.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, e.g. \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027", + "description": "The endpoint name whose failed messages should be retried. Use values obtained from failed-message results.", "type": "string" } }, @@ -384,7 +384,7 @@ }, { "name": "retry_failed_messages", - "description": "Use this tool to reprocess multiple specific failed messages at once. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry these messages\u0027, \u0027reprocess messages msg-1, msg-2, msg-3\u0027, or \u0027retry this batch\u0027. 
Prefer RetryFailureGroup when all messages share the same failure cause \u2014 use this tool when you have a specific set of message IDs to retry.", + "description": "Retry a selected set of failed messages by their IDs. Use this when the user explicitly wants to retry specific known messages. Prefer RetryFailureGroup when retrying all messages with the same root cause. This operation changes system state. It may affect many messages. Use values obtained from failed-message investigation tools.", "inputSchema": { "type": "object", "properties": { @@ -412,12 +412,12 @@ }, { "name": "retry_failed_messages_by_queue", - "description": "Use this tool to retry all unresolved failed messages from a specific queue. This operation changes system state. It may affect many messages. Good for questions like: \u0027retry all failures in the Sales queue\u0027, \u0027reprocess everything from this queue\u0027, or \u0027the queue consumer is back, retry its failures\u0027. Useful when a queue\u0027s consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status.", + "description": "Retry all unresolved failed messages from a specific queue. Use this when the user explicitly wants a queue-scoped retry after a queue or consumer issue is fixed. Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. This operation changes system state. It may affect many messages. Use the queue address from failed-message results.", "inputSchema": { "type": "object", "properties": { "queueAddress": { - "description": "The full queue address including machine name, e.g. \u0027Sales@machine\u0027", + "description": "Queue address whose unresolved failed messages should be retried. Use values obtained from failed-message results.", "type": "string" } }, @@ -437,7 +437,7 @@ }, { "name": "retry_failure_group", - "description": "Use this tool to retry all failed messages that share the same exception type and stack trace. 
This operation changes system state. It may affect many messages. Good for questions like: \u0027retry this failure group\u0027, \u0027the bug causing these NullReferenceExceptions is fixed, retry them\u0027, or \u0027retry all messages in this group\u0027. This is the most targeted way to retry related failures after fixing a specific bug. You need a failure group ID, which you can get from GetFailureGroups. Returns InProgress if a retry is already running for this group.", + "description": "Retry all failed messages in a failure group that share the same root cause. Use this when multiple failures are caused by the same issue and can be retried together. Prefer RetryFailedMessages for more granular control. This operation changes system state. It may affect many messages. Use the failure group ID from GetFailureGroups. Returns InProgress if a retry is already running for this group.", "inputSchema": { "type": "object", "properties": { diff --git a/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt b/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt index 6b40320fa3..f7706a55d0 100644 --- a/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt +++ b/src/ServiceControl.Audit.AcceptanceTests.RavenDB/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt @@ -1,7 +1,7 @@ [ { "name": "get_audit_message_body", - "description": "This is a read-only tool for inspecting the actual payload of a processed audit message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. Returns the serialized message body content, typically JSON. 
You need an audit message ID, which you can get from any audit message query result. Use this when the user wants to see what data was actually sent, not just message metadata.", + "description": "Retrieve the body content of a specific audit message. Use this when you need to inspect message payload or data for debugging. Typically used after locating a message via search or browsing tools. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -26,7 +26,7 @@ }, { "name": "get_audit_messages", - "description": "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.", + "description": "Retrieve audit messages with paging and sorting. Use this to browse recent message activity or explore message flow over time. Prefer SearchAuditMessages when looking for specific keywords or content. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -56,7 +56,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.", + "description": "Restricts audit-message results to messages sent after this ISO 8601 date/time. 
Omitting this may return a large result set.", "type": [ "string", "null" @@ -64,7 +64,7 @@ "default": null }, "timeSentTo": { - "description": "Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.", + "description": "Restricts audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" @@ -85,7 +85,7 @@ }, { "name": "get_audit_messages_by_conversation", - "description": "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. Essential for understanding message flow and debugging cascading issues.", + "description": "Retrieve all audit messages belonging to a conversation. Use this to trace the full flow of a message or business process across multiple endpoints. Prefer this tool when you already have a conversation ID. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -130,12 +130,12 @@ }, { "name": "get_audit_messages_by_endpoint", - "description": "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. 
Optionally pass a keyword to search within that endpoint\u0027s messages.", + "description": "Retrieve audit messages processed by a specific endpoint. Use this to understand activity and behavior of a single endpoint. Prefer GetAuditMessagesByConversation when tracing a specific message flow. Read-only.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", + "description": "The endpoint name that processed the audit messages. Use values obtained from GetKnownEndpoints.", "type": "string" }, "keyword": { @@ -172,7 +172,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Filter endpoint audit messages to those sent after this ISO 8601 date/time.", + "description": "Restricts endpoint audit-message results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" @@ -180,7 +180,7 @@ "default": null }, "timeSentTo": { - "description": "Filter endpoint audit messages to those sent before this ISO 8601 date/time.", + "description": "Restricts endpoint audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" @@ -204,12 +204,12 @@ }, { "name": "get_endpoint_audit_counts", - "description": "This is a read-only tool for seeing daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.", + "description": "Retrieve daily audit-message counts for a specific endpoint. 
Use this when checking throughput or activity trends for one endpoint. Prefer GetKnownEndpoints when you do not already know the endpoint name. Read-only.", "inputSchema": { "type": "object", "properties": { "endpointName": { - "description": "The NServiceBus endpoint name, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.", + "description": "The NServiceBus endpoint name whose audit activity should be counted. Use values obtained from GetKnownEndpoints.", "type": "string" } }, @@ -229,7 +229,7 @@ }, { "name": "get_known_endpoints", - "description": "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.", + "description": "List all known endpoints that have sent or received audit messages. Use this as a starting point to discover available endpoints before exploring their activity. Read-only.", "inputSchema": { "type": "object", "properties": {} @@ -246,7 +246,7 @@ }, { "name": "search_audit_messages", - "description": "This is a read-only tool for finding audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. 
If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.", + "description": "Search audit messages by keyword across message content and metadata. Use this when trying to locate messages related to a specific business identifier or text. Prefer GetAuditMessages for general browsing or timeline exploration. Read-only.", "inputSchema": { "type": "object", "properties": { @@ -275,7 +275,7 @@ "default": "desc" }, "timeSentFrom": { - "description": "Filter audit search results to messages sent after this ISO 8601 date/time.", + "description": "Restricts audit search results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" @@ -283,7 +283,7 @@ "default": null }, "timeSentTo": { - "description": "Filter audit search results to messages sent before this ISO 8601 date/time.", + "description": "Restricts audit search results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.", "type": [ "string", "null" diff --git a/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt b/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt index 6b40320fa3..f7706a55d0 100644 --- a/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt +++ b/src/ServiceControl.Audit.AcceptanceTests/ApprovalFiles/When_mcp_server_is_enabled.Should_list_audit_message_tools.approved.txt @@ -1,7 +1,7 @@ [ { "name": "get_audit_message_body", - "description": "This is a read-only tool for inspecting the actual payload of a processed audit message. Good for questions like: \u0027show me the message body\u0027, \u0027what data was in this message?\u0027, or \u0027let me see the content of message X\u0027. 
Returns the serialized message body content, typically JSON. You need an audit message ID, which you can get from any audit message query result. Use this when the user wants to see what data was actually sent, not just message metadata.",
+    "description": "Retrieve the body content of a specific audit message. Use this when you need to inspect message payload or data for debugging. Typically used after locating a message via search or browsing tools. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -26,7 +26,7 @@
   },
   {
     "name": "get_audit_messages",
-    "description": "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. Good for questions like: \u0027show recent audit messages\u0027, \u0027what messages were processed today?\u0027, \u0027list messages from endpoint X\u0027, or \u0027show slow messages\u0027. Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. For broad requests, use the default paging and sorting. Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead.",
+    "description": "Retrieve audit messages with paging and sorting. Use this to browse recent message activity or explore message flow over time. Prefer SearchAuditMessages when looking for specific keywords or content. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -56,7 +56,7 @@
           "default": "desc"
         },
         "timeSentFrom": {
-          "description": "Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.",
+          "description": "Restricts audit-message results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
@@ -64,7 +64,7 @@
           "default": null
         },
         "timeSentTo": {
-          "description": "Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.",
+          "description": "Restricts audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
@@ -85,7 +85,7 @@
   },
   {
     "name": "get_audit_messages_by_conversation",
-    "description": "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. Good for questions like: \u0027what happened after this message was sent?\u0027, \u0027show me the full message flow\u0027, or \u0027trace this conversation\u0027. A conversation groups all related messages together \u2014 the original command and every event, reply, or saga message it caused. You need a conversation ID, which you can get from any audit message query result. Essential for understanding message flow and debugging cascading issues.",
+    "description": "Retrieve all audit messages belonging to a conversation. Use this to trace the full flow of a message or business process across multiple endpoints. Prefer this tool when you already have a conversation ID. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -130,12 +130,12 @@
   },
   {
     "name": "get_audit_messages_by_endpoint",
-    "description": "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. Good for questions like: \u0027what messages did Sales process?\u0027, \u0027show messages handled by Shipping\u0027, or \u0027find OrderPlaced messages in the Billing endpoint\u0027. Returns the same metadata as GetAuditMessages but scoped to one endpoint. Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. Optionally pass a keyword to search within that endpoint\u0027s messages.",
+    "description": "Retrieve audit messages processed by a specific endpoint. Use this to understand activity and behavior of a single endpoint. Prefer GetAuditMessagesByConversation when tracing a specific message flow. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
         "endpointName": {
-          "description": "The NServiceBus endpoint name to investigate, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.",
+          "description": "The endpoint name that processed the audit messages. Use values obtained from GetKnownEndpoints.",
           "type": "string"
         },
         "keyword": {
@@ -172,7 +172,7 @@
           "default": "desc"
         },
         "timeSentFrom": {
-          "description": "Filter endpoint audit messages to those sent after this ISO 8601 date/time.",
+          "description": "Restricts endpoint audit-message results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
@@ -180,7 +180,7 @@
           "default": null
         },
         "timeSentTo": {
-          "description": "Filter endpoint audit messages to those sent before this ISO 8601 date/time.",
+          "description": "Restricts endpoint audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
@@ -204,12 +204,12 @@
   },
   {
     "name": "get_endpoint_audit_counts",
-    "description": "This is a read-only tool for seeing daily message volume trends for a specific endpoint. Good for questions like: \u0027how much traffic does Sales handle?\u0027, \u0027has throughput changed recently?\u0027, or \u0027show me message counts for this endpoint\u0027. Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. You need an endpoint name \u2014 use GetKnownEndpoints first if you do not have one.",
+    "description": "Retrieve daily audit-message counts for a specific endpoint. Use this when checking throughput or activity trends for one endpoint. Prefer GetKnownEndpoints when you do not already know the endpoint name. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
         "endpointName": {
-          "description": "The NServiceBus endpoint name, for example \u0027Sales\u0027 or \u0027Shipping.MessageHandler\u0027.",
+          "description": "The NServiceBus endpoint name whose audit activity should be counted. Use values obtained from GetKnownEndpoints.",
           "type": "string"
         }
       },
@@ -229,7 +229,7 @@
   },
   {
     "name": "get_known_endpoints",
-    "description": "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. Good for questions like: \u0027what endpoints do we have?\u0027, \u0027what services are running?\u0027, or \u0027list all endpoints\u0027. Returns all endpoints that have processed audit messages, including their name and host information. This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts.",
+    "description": "List all known endpoints that have sent or received audit messages. Use this as a starting point to discover available endpoints before exploring their activity. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {}
@@ -246,7 +246,7 @@
   },
   {
     "name": "search_audit_messages",
-    "description": "This is a read-only tool for finding audit messages by a keyword or phrase. Good for questions like: \u0027find messages containing order 12345\u0027, \u0027search for CustomerCreated messages\u0027, or \u0027look for messages mentioning this ID\u0027. Searches across message body content, headers, and metadata using full-text search. Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. If the user just wants to browse recent messages without a search term, use GetAuditMessages instead.",
+    "description": "Search audit messages by keyword across message content and metadata. Use this when trying to locate messages related to a specific business identifier or text. Prefer GetAuditMessages for general browsing or timeline exploration. Read-only.",
     "inputSchema": {
       "type": "object",
       "properties": {
@@ -275,7 +275,7 @@
           "default": "desc"
         },
         "timeSentFrom": {
-          "description": "Filter audit search results to messages sent after this ISO 8601 date/time.",
+          "description": "Restricts audit search results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
@@ -283,7 +283,7 @@
           "default": null
         },
         "timeSentTo": {
-          "description": "Filter audit search results to messages sent before this ISO 8601 date/time.",
+          "description": "Restricts audit search results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.",
           "type": [
             "string",
             "null"
diff --git a/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs b/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
index 34aeb8ed69..42e6a6ff88 100644
--- a/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
+++ b/src/ServiceControl.Audit.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
@@ -12,13 +12,13 @@ namespace ServiceControl.Audit.UnitTests.Mcp;
 [TestFixture]
 class McpMetadataDescriptionsTests
 {
-    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessages), "read-only")]
-    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.SearchAuditMessages), "read-only")]
-    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByEndpoint), "read-only")]
-    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation), "read-only")]
-    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody), "read-only")]
-    [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetKnownEndpoints), "read-only")]
-    [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetEndpointAuditCounts), "read-only")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessages), "Read-only")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.SearchAuditMessages), "Read-only")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByEndpoint), "Read-only")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation), "Read-only")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody), "Read-only")]
+    [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetKnownEndpoints), "Read-only")]
+    [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetEndpointAuditCounts), "Read-only")]
     public void Audit_query_tools_are_described_as_read_only(Type toolType, string methodName, string expectedPhrase)
     {
         var description = GetMethodDescription(toolType, methodName);
@@ -29,6 +29,7 @@ public void Audit_query_tools_are_described_as_read_only(Type toolType, string m
     [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody), "messageId", "audit message ID")]
     [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation), "conversationId", "conversation ID")]
     [TestCase(typeof(EndpointTools), nameof(EndpointTools.GetEndpointAuditCounts), "endpointName", "NServiceBus endpoint name")]
+    [TestCase(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByEndpoint), "endpointName", "endpoint name")]
     public void Key_audit_tool_parameters_identify_the_entity_type(Type toolType, string methodName, string parameterName, string expectedPhrase)
     {
         var description = GetParameterDescription(toolType, methodName, parameterName);
@@ -36,6 +37,32 @@ public void Key_audit_tool_parameters_identify_the_entity_type(Type toolType, st
         Assert.That(description, Does.Contain(expectedPhrase));
     }
 
+    [Test]
+    public void Audit_tools_distinguish_browse_search_trace_and_payload_scenarios()
+    {
+        var browse = GetMethodDescription(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessages));
+        var search = GetMethodDescription(typeof(AuditMessageTools), nameof(AuditMessageTools.SearchAuditMessages));
+        var conversation = GetMethodDescription(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByConversation));
+        var endpoint = GetMethodDescription(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessagesByEndpoint));
+        var body = GetMethodDescription(typeof(AuditMessageTools), nameof(AuditMessageTools.GetAuditMessageBody));
+        var knownEndpoints = GetMethodDescription(typeof(EndpointTools), nameof(EndpointTools.GetKnownEndpoints));
+
+        Assert.Multiple(() =>
+        {
+            Assert.That(browse, Does.Contain("browse recent message activity").And.Contain("SearchAuditMessages"));
+
+            Assert.That(search, Does.Contain("specific business identifier or text").And.Contain("GetAuditMessages"));
+
+            Assert.That(conversation, Does.Contain("conversation").And.Contain("multiple endpoints"));
+
+            Assert.That(endpoint, Does.Contain("single endpoint").And.Contain("GetAuditMessagesByConversation"));
+
+            Assert.That(body, Does.Contain("message payload").And.Contain("search or browsing tools"));
+
+            Assert.That(knownEndpoints, Does.Contain("starting point").And.Contain("available endpoints"));
+        });
+    }
+
     static string GetMethodDescription(Type toolType, string methodName)
         => toolType.GetMethod(methodName)!
             .GetCustomAttribute()!
diff --git a/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs b/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs
index b0496d0716..6fb5548891 100644
--- a/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs
+++ b/src/ServiceControl.Audit/Mcp/AuditMessageTools.cs
@@ -24,12 +24,10 @@ namespace ServiceControl.Audit.Mcp;
 public class AuditMessageTools(IAuditDataStore store, ILogger logger)
 {
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for browsing successfully processed audit messages when the user wants an overview rather than a text search. " +
-        "Good for questions like: 'show recent audit messages', 'what messages were processed today?', 'list messages from endpoint X', or 'show slow messages'. " +
-        "Returns message metadata such as message type, endpoints, sent time, processed time, and timing metrics. " +
-        "For broad requests, use the default paging and sorting. " +
-        "Prefer this tool over SearchAuditMessages when the user does not provide a specific keyword or phrase. " +
-        "If the user is looking for a specific term, id, or text fragment, use SearchAuditMessages instead."
+        "Retrieve audit messages with paging and sorting. " +
+        "Use this to browse recent message activity or explore message flow over time. " +
+        "Prefer SearchAuditMessages when looking for specific keywords or content. " +
+        "Read-only."
     )]
     public async Task GetAuditMessages(
         [Description("Set to true to include NServiceBus infrastructure messages. Leave this as false for the usual business-message view.")] bool includeSystemMessages = false,
@@ -37,8 +35,8 @@ public async Task GetAuditMessages(
         [Description("Results per page")] int perPage = 50,
         [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent",
         [Description("Sort direction: asc or desc")] string direction = "desc",
-        [Description("Filter audit messages to those sent after this ISO 8601 date/time. Use with timeSentTo for a bounded time window.")] string? timeSentFrom = null,
-        [Description("Filter audit messages to those sent before this ISO 8601 date/time. Omit to leave the upper bound open.")] string? timeSentTo = null,
+        [Description("Restricts audit-message results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentFrom = null,
+        [Description("Restricts audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentTo = null,
         CancellationToken cancellationToken = default)
     {
         logger.LogInformation("MCP GetAuditMessages invoked (page={Page}, includeSystemMessages={IncludeSystem})", page, includeSystemMessages);
@@ -59,11 +57,10 @@ public async Task GetAuditMessages(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for finding audit messages by a keyword or phrase. " +
-        "Good for questions like: 'find messages containing order 12345', 'search for CustomerCreated messages', or 'look for messages mentioning this ID'. " +
-        "Searches across message body content, headers, and metadata using full-text search. " +
-        "Prefer this tool over GetAuditMessages when the user provides a specific term, identifier, or phrase to search for. " +
-        "If the user just wants to browse recent messages without a search term, use GetAuditMessages instead."
+        "Search audit messages by keyword across message content and metadata. " +
+        "Use this when trying to locate messages related to a specific business identifier or text. " +
+        "Prefer GetAuditMessages for general browsing or timeline exploration. " +
+        "Read-only."
     )]
     public async Task SearchAuditMessages(
         [Description("The free-text search query to match against audit message body content, headers, and metadata.")] string query,
@@ -71,8 +68,8 @@ public async Task SearchAuditMessages(
         [Description("Results per page")] int perPage = 50,
         [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent",
         [Description("Sort direction: asc or desc")] string direction = "desc",
-        [Description("Filter audit search results to messages sent after this ISO 8601 date/time.")] string? timeSentFrom = null,
-        [Description("Filter audit search results to messages sent before this ISO 8601 date/time.")] string? timeSentTo = null,
+        [Description("Restricts audit search results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentFrom = null,
+        [Description("Restricts audit search results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentTo = null,
         CancellationToken cancellationToken = default)
     {
         logger.LogInformation("MCP SearchAuditMessages invoked (query={Query}, page={Page})", query, page);
@@ -93,22 +90,21 @@ public async Task SearchAuditMessages(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for seeing what messages a specific NServiceBus endpoint has processed. " +
-        "Good for questions like: 'what messages did Sales process?', 'show messages handled by Shipping', or 'find OrderPlaced messages in the Billing endpoint'. " +
-        "Returns the same metadata as GetAuditMessages but scoped to one endpoint. " +
-        "Prefer this tool over GetAuditMessages when the user mentions a specific endpoint name. " +
-        "Optionally pass a keyword to search within that endpoint's messages."
+        "Retrieve audit messages processed by a specific endpoint. " +
+        "Use this to understand activity and behavior of a single endpoint. " +
+        "Prefer GetAuditMessagesByConversation when tracing a specific message flow. " +
+        "Read-only."
     )]
     public async Task GetAuditMessagesByEndpoint(
-        [Description("The NServiceBus endpoint name to investigate, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName,
+        [Description("The endpoint name that processed the audit messages. Use values obtained from GetKnownEndpoints.")] string endpointName,
         [Description("Optional keyword to narrow results within this endpoint. Omit it to browse the endpoint without full-text filtering.")] string? keyword = null,
         [Description("Set to true to include NServiceBus infrastructure messages for this endpoint. Leave false for the usual business-message view.")] bool includeSystemMessages = false,
         [Description("Page number, 1-based")] int page = 1,
         [Description("Results per page")] int perPage = 50,
         [Description("Sort by: time_sent, processed_at, message_type, critical_time, delivery_time, or processing_time")] string sort = "time_sent",
         [Description("Sort direction: asc or desc")] string direction = "desc",
-        [Description("Filter endpoint audit messages to those sent after this ISO 8601 date/time.")] string? timeSentFrom = null,
-        [Description("Filter endpoint audit messages to those sent before this ISO 8601 date/time.")] string? timeSentTo = null,
+        [Description("Restricts endpoint audit-message results to messages sent after this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentFrom = null,
+        [Description("Restricts endpoint audit-message results to messages sent before this ISO 8601 date/time. Omitting this may return a large result set.")] string? timeSentTo = null,
         CancellationToken cancellationToken = default)
     {
         logger.LogInformation("MCP GetAuditMessagesByEndpoint invoked (endpoint={EndpointName}, keyword={Keyword}, page={Page})", endpointName, keyword, page);
@@ -131,11 +127,10 @@ public async Task GetAuditMessagesByEndpoint(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for tracing the full chain of audit messages triggered by an initial message. " +
-        "Good for questions like: 'what happened after this message was sent?', 'show me the full message flow', or 'trace this conversation'. " +
-        "A conversation groups all related messages together — the original command and every event, reply, or saga message it caused. " +
-        "You need a conversation ID, which you can get from any audit message query result. " +
-        "Essential for understanding message flow and debugging cascading issues."
+        "Retrieve all audit messages belonging to a conversation. " +
+        "Use this to trace the full flow of a message or business process across multiple endpoints. " +
+        "Prefer this tool when you already have a conversation ID. " +
+        "Read-only."
     )]
     public async Task GetAuditMessagesByConversation(
         [Description("The conversation ID from a previous audit message query result.")] string conversationId,
@@ -162,11 +157,10 @@ public async Task GetAuditMessagesByConversation(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for inspecting the actual payload of a processed audit message. " +
-        "Good for questions like: 'show me the message body', 'what data was in this message?', or 'let me see the content of message X'. " +
-        "Returns the serialized message body content, typically JSON. " +
-        "You need an audit message ID, which you can get from any audit message query result. " +
-        "Use this when the user wants to see what data was actually sent, not just message metadata."
+        "Retrieve the body content of a specific audit message. " +
+        "Use this when you need to inspect message payload or data for debugging. " +
+        "Typically used after locating a message via search or browsing tools. " +
+        "Read-only."
     )]
     public async Task GetAuditMessageBody(
         [Description("The audit message ID from a previous audit message query result.")] string messageId,
diff --git a/src/ServiceControl.Audit/Mcp/EndpointTools.cs b/src/ServiceControl.Audit/Mcp/EndpointTools.cs
index 86097d87ae..00b952db9b 100644
--- a/src/ServiceControl.Audit/Mcp/EndpointTools.cs
+++ b/src/ServiceControl.Audit/Mcp/EndpointTools.cs
@@ -19,10 +19,9 @@ namespace ServiceControl.Audit.Mcp;
 public class EndpointTools(IAuditDataStore store, ILogger logger)
 {
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for discovering what NServiceBus endpoints exist in the system. " +
-        "Good for questions like: 'what endpoints do we have?', 'what services are running?', or 'list all endpoints'. " +
-        "Returns all endpoints that have processed audit messages, including their name and host information. " +
-        "This is a good starting point when you need an endpoint name for other tools like GetAuditMessagesByEndpoint or GetEndpointAuditCounts."
+        "List all known endpoints that have sent or received audit messages. " +
+        "Use this as a starting point to discover available endpoints before exploring their activity. " +
+        "Read-only."
     )]
     public async Task GetKnownEndpoints(CancellationToken cancellationToken = default)
     {
@@ -40,13 +39,13 @@ public async Task GetKnownEndpoints(CancellationToken cancellationToken
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "This is a read-only tool for seeing daily message volume trends for a specific endpoint. " +
-        "Good for questions like: 'how much traffic does Sales handle?', 'has throughput changed recently?', or 'show me message counts for this endpoint'. " +
-        "Returns message counts per day, which helps identify throughput changes, traffic spikes, or drops in activity that might indicate problems. " +
-        "You need an endpoint name — use GetKnownEndpoints first if you do not have one."
+        "Retrieve daily audit-message counts for a specific endpoint. " +
+        "Use this when checking throughput or activity trends for one endpoint. " +
+        "Prefer GetKnownEndpoints when you do not already know the endpoint name. " +
+        "Read-only."
     )]
     public async Task GetEndpointAuditCounts(
-        [Description("The NServiceBus endpoint name, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName,
+        [Description("The NServiceBus endpoint name whose audit activity should be counted. Use values obtained from GetKnownEndpoints.")] string endpointName,
         CancellationToken cancellationToken = default)
     {
         logger.LogInformation("MCP GetEndpointAuditCounts invoked (endpoint={EndpointName})", endpointName);
diff --git a/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
index 3496781775..e8c03df00b 100644
--- a/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
+++ b/src/ServiceControl.UnitTests/Mcp/McpMetadataDescriptionsTests.cs
@@ -59,6 +59,8 @@ public void Bulk_mutating_tools_warn_that_they_may_affect_many_messages(Type too
     [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageLastAttempt), "failedMessageId", "failed message ID")]
     [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessage), "failedMessageId", "failed message ID")]
     [TestCase(typeof(RetryTools), nameof(RetryTools.RetryFailedMessages), "messageIds", "failed message IDs")]
+    [TestCase(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessagesByEndpoint), "endpointName", "endpoint name")]
+    [TestCase(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessagesByEndpoint), "endpointName", "endpoint name")]
     [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailureGroup), "groupId", "failure group ID")]
     [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.UnarchiveFailureGroup), "groupId", "failure group ID")]
     [TestCase(typeof(ArchiveTools), nameof(ArchiveTools.ArchiveFailedMessage), "failedMessageId", "failed message ID")]
@@ -72,6 +74,68 @@ public void Key_error_tool_parameters_identify_the_entity_type(Type toolType, st
         Assert.That(description, Does.Contain(expectedPhrase));
     }
 
+    [Test]
+    public void Get_failed_messages_guides_agents_toward_groups_first_and_details_second()
+    {
+        var description = GetMethodDescription(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessages));
+
+        Assert.Multiple(() =>
+        {
+            Assert.That(description, Does.Contain("Retrieve failed messages"));
+            Assert.That(description, Does.Contain("root-cause analysis"));
+            Assert.That(description, Does.Contain("GetFailureGroups"));
+            Assert.That(description, Does.Contain("GetFailedMessageById"));
+        });
+    }
+
+    [Test]
+    public void Get_failure_groups_is_positioned_as_root_cause_starting_point()
+    {
+        var description = GetMethodDescription(typeof(FailureGroupTools), nameof(FailureGroupTools.GetFailureGroups));
+
+        Assert.Multiple(() =>
+        {
+            Assert.That(description, Does.Contain("Retrieve failure groups"));
+            Assert.That(description, Does.Contain("first step"));
+            Assert.That(description, Does.Contain("root cause"));
+            Assert.That(description, Does.Contain("GetFailedMessages"));
+        });
+    }
+
+    [Test]
+    public void Failed_message_detail_tools_reference_the_expected_workflow()
+    {
+        var byId = GetMethodDescription(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageById));
+        var lastAttempt = GetMethodDescription(typeof(FailedMessageTools), nameof(FailedMessageTools.GetFailedMessageLastAttempt));
+
+        Assert.Multiple(() =>
+        {
+            Assert.That(byId, Does.Contain("failed message ID"));
+            Assert.That(byId, Does.Contain("GetFailedMessages").Or.Contain("GetFailureGroups"));
+
+            Assert.That(lastAttempt, Does.Contain("last processing attempt").Or.Contain("most recent failure"));
+            Assert.That(lastAttempt, Does.Contain("GetFailedMessages").Or.Contain("GetFailedMessageById"));
+        });
+    }
+
+    [Test]
+    public void Retry_tools_describe_targeted_group_and_broad_scenarios()
+    {
+        var retryByIds = GetMethodDescription(typeof(RetryTools), nameof(RetryTools.RetryFailedMessages));
+        var retryGroup = GetMethodDescription(typeof(RetryTools), nameof(RetryTools.RetryFailureGroup));
+        var retryAll = GetMethodDescription(typeof(RetryTools), nameof(RetryTools.RetryAllFailedMessages));
+
+        Assert.Multiple(() =>
+        {
+            Assert.That(retryByIds, Does.Contain("specific").And.Contain("RetryFailureGroup"));
+
+            Assert.That(retryGroup, Does.Contain("root cause").And.Contain("RetryFailedMessages"));
+
+            Assert.That(retryAll, Does.Contain("explicitly requests").And.Contain("narrower retry tools"));
+            Assert.That(retryAll, Does.Contain("large number of messages"));
+        });
+    }
+
     static string GetMethodDescription(Type toolType, string methodName)
         => toolType.GetMethod(methodName)!
             .GetCustomAttribute()!
diff --git a/src/ServiceControl/Mcp/FailedMessageTools.cs b/src/ServiceControl/Mcp/FailedMessageTools.cs
index f64ec32807..a2c722dcd8 100644
--- a/src/ServiceControl/Mcp/FailedMessageTools.cs
+++ b/src/ServiceControl/Mcp/FailedMessageTools.cs
@@ -24,15 +24,15 @@ namespace ServiceControl.Mcp;
 public class FailedMessageTools(IErrorMessageDataStore store, ILogger logger)
 {
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "Read-only. Use this tool to retrieve failed messages for investigation when the user wants to see what is failing. " +
-        "Good for questions like: 'what messages are currently failing?', 'are there failures in a specific queue?', or 'what failed recently?'. " +
-        "Returns a paged list of failed messages with their status, exception details, and queue information. " +
-        "For broad requests, call with no parameters to get the most recent failures — only add filters when you need to narrow the scope. " +
-        "Prefer GetFailedMessagesByEndpoint when the user mentions a specific endpoint."
+        "Retrieve failed messages for investigation. " +
+        "Use this when exploring recent failures or narrowing down failures by queue, status, or time range. " +
+        "Prefer GetFailureGroups when starting root-cause analysis across many failures. " +
+        "Use GetFailedMessageById when inspecting a specific failed message. " +
+        "Read-only."
     )]
     public async Task GetFailedMessages(
         [Description("Filter failed messages by status: unresolved (still failing), resolved (succeeded on retry), archived (dismissed), or retryissued (retry in progress). Omit this filter to include all statuses.")] string? status = null,
-        [Description("Filter failed messages to entries modified after this ISO 8601 date/time. Omit this filter to include older results.")] string? modified = null,
+        [Description("Restricts failed-message results to entries modified after this ISO 8601 date/time. Omitting this may return a large result set.")] string? modified = null,
         [Description("Filter failed messages to a specific queue address, for example 'Sales@machine'. Omit this filter to include all queues.")] string? queueAddress = null,
         [Description("Page number, 1-based")] int page = 1,
         [Description("Results per page")] int perPage = 50,
@@ -56,10 +56,10 @@ public async Task GetFailedMessages(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "Read-only. Use this tool to get the full details of a specific failed message, including all processing attempts and exception information. " +
-        "Good for questions like: 'show me details for this failed message', 'what exception caused this failure?', or 'how many times has this message failed?'. " +
-        "You need a failed message ID, which you can get from GetFailedMessages or GetFailureGroups results. " +
-        "If you only need the most recent failure attempt, use GetFailedMessageLastAttempt instead — it returns less data."
+        "Get detailed information about a specific failed message. " +
+        "Use this when you already know the failed message ID and need to inspect its contents or failure details. " +
+        "Use GetFailedMessages or GetFailureGroups to locate relevant messages before calling this tool. " +
+        "Read-only."
     )]
     public async Task GetFailedMessageById(
         [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId)
@@ -78,10 +78,10 @@ public async Task GetFailedMessageById(
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "Read-only. Use this tool to see how a specific message failed most recently. " +
-        "Good for questions like: 'what was the last error for this message?', 'show me the latest exception', or 'what happened on the last attempt?'. " +
-        "Returns the latest processing attempt with its exception, stack trace, and headers. " +
-        "Lighter than GetFailedMessageById when you only care about the most recent failure rather than the full history."
+        "Retrieve the last processing attempt for a failed message. " +
+        "Use this to understand the most recent failure behavior, including exception details and processing context. " +
+        "Typically used after identifying a failed message via GetFailedMessages or GetFailedMessageById. " +
+        "Read-only."
     )]
     public async Task GetFailedMessageLastAttempt(
         [Description("The failed message ID from a previous failed-message query result.")] string failedMessageId)
@@ -114,15 +114,16 @@ public async Task GetErrorsSummary()
     }
 
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "Read-only. Use this tool to see failed messages for a specific NServiceBus endpoint. " +
-        "Good for questions like: 'what is failing in the Sales endpoint?', 'show errors for Shipping', or 'are there failures in this endpoint?'. " +
-        "Returns the same paged failure data as GetFailedMessages but scoped to one endpoint. " +
-        "Prefer this tool over GetFailedMessages when the user mentions a specific endpoint name."
+        "Retrieve failed messages for a specific endpoint. " +
+        "Use this when investigating failures in a named endpoint such as Billing or Sales. " +
+        "Prefer GetFailureGroups when you need root-cause analysis across many failures. " +
+        "Use GetFailedMessageLastAttempt after this when you need the most recent failure details for a specific message. " +
+        "Read-only."
     )]
     public async Task GetFailedMessagesByEndpoint(
-        [Description("The NServiceBus endpoint name to investigate, for example 'Sales' or 'Shipping.MessageHandler'.")] string endpointName,
+        [Description("The endpoint name that owns the failed messages. Use values obtained from endpoint-aware failed-message results.")] string endpointName,
         [Description("Filter failed messages by status: unresolved, resolved, archived, or retryissued. Omit this filter to include all statuses for the endpoint.")] string? status = null,
-        [Description("Filter endpoint results to failed messages modified after this ISO 8601 date/time. Omit this filter to include older results.")] string? modified = null,
+        [Description("Restricts endpoint failed-message results to entries modified after this ISO 8601 date/time. Omitting this may return a large result set.")] string? modified = null,
         [Description("Page number, 1-based")] int page = 1,
         [Description("Results per page")] int perPage = 50,
         [Description("Sort by: time_sent, message_type, or time_of_failure")] string sort = "time_of_failure",
diff --git a/src/ServiceControl/Mcp/FailureGroupTools.cs b/src/ServiceControl/Mcp/FailureGroupTools.cs
index 096e1a68a5..3c10a8fe38 100644
--- a/src/ServiceControl/Mcp/FailureGroupTools.cs
+++ b/src/ServiceControl/Mcp/FailureGroupTools.cs
@@ -20,11 +20,10 @@ namespace ServiceControl.Mcp;
 public class FailureGroupTools(GroupFetcher fetcher, IRetryHistoryDataStore retryStore, ILogger logger)
 {
     [McpServerTool(ReadOnly = true, Idempotent = true, Destructive = false, OpenWorld = false), Description(
-        "Read-only. Use this tool to understand why messages are failing by seeing failures grouped by root cause. " +
-        "Good for questions like: 'why are messages failing?', 'what errors are happening?', 'group failures by exception', or 'what are the top failure causes?'. " +
-        "Each group represents a distinct exception type and stack trace, showing how many messages are affected and when failures started and last occurred. " +
-        "This is usually the best starting point for diagnosing production issues — call it before drilling into individual messages. " +
-        "Call with no parameters to use the default grouping by exception type and stack trace."
+        "Retrieve failure groups, where failed messages are grouped by exception type and stack trace. " +
+        "Use this as the first step when analyzing large numbers of failures to identify dominant root causes. " +
+        "Prefer GetFailedMessages when you need individual message details. " +
+        "Read-only."
     )]
     public async Task GetFailureGroups(
         [Description("How to group failures. The default 'Exception Type and Stack Trace' is almost always what you want. Use 'Message Type' to group by the NServiceBus message type instead.")] string classifier = "Exception Type and Stack Trace",
diff --git a/src/ServiceControl/Mcp/RetryTools.cs b/src/ServiceControl/Mcp/RetryTools.cs
index 2dc943d779..dafb0a4634 100644
--- a/src/ServiceControl/Mcp/RetryTools.cs
+++ b/src/ServiceControl/Mcp/RetryTools.cs
@@ -41,11 +41,12 @@ public async Task RetryFailedMessage(
     }
 
     [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description(
-        "Use this tool to reprocess multiple specific failed messages at once. " +
+        "Retry a selected set of failed messages by their IDs. " +
+        "Use this when the user explicitly wants to retry specific known messages. " +
+        "Prefer RetryFailureGroup when retrying all messages with the same root cause. " +
         "This operation changes system state. " +
         "It may affect many messages. " +
-        "Good for questions like: 'retry these messages', 'reprocess messages msg-1, msg-2, msg-3', or 'retry this batch'. " +
-        "Prefer RetryFailureGroup when all messages share the same failure cause — use this tool when you have a specific set of message IDs to retry."
+        "Use values obtained from failed-message investigation tools."
     )]
     public async Task RetryFailedMessages(
         [Description("The failed message IDs from previous failed-message query results.")] string[] messageIds)
@@ -63,14 +64,15 @@ public async Task RetryFailedMessages(
     }
 
     [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description(
-        "Use this tool to retry all unresolved failed messages from a specific queue. " +
+        "Retry all unresolved failed messages from a specific queue. " +
+        "Use this when the user explicitly wants a queue-scoped retry after a queue or consumer issue is fixed. " +
+        "Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. " +
         "This operation changes system state. " +
         "It may affect many messages. " +
-        "Good for questions like: 'retry all failures in the Sales queue', 'reprocess everything from this queue', or 'the queue consumer is back, retry its failures'. " +
-        "Useful when a queue's consumer was down or misconfigured and is now fixed. Only retries messages with unresolved status."
+        "Use the queue address from failed-message results."
     )]
     public async Task RetryFailedMessagesByQueue(
-        [Description("The full queue address including machine name, e.g. 'Sales@machine'")] string queueAddress)
+        [Description("Queue address whose unresolved failed messages should be retried. Use values obtained from failed-message results.")] string queueAddress)
     {
         logger.LogInformation("MCP RetryFailedMessagesByQueue invoked (queueAddress={QueueAddress})", queueAddress);
 
@@ -83,12 +85,12 @@ await messageSession.SendLocal(m =>
     }
 
     [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description(
-        "Use this tool to retry every unresolved failed message across all queues and endpoints. 
" + + "Retry all currently failed messages across all queues. " + + "Use only when the user explicitly requests a broad retry operation. " + + "Prefer narrower retry tools such as RetryFailureGroup or RetryFailedMessages when possible. " + "This operation changes system state. " + "It may affect many messages. " + - "Good for questions like: 'retry everything', 'reprocess all failures', or 'retry all failed messages'. " + - "It affects all unresolved failed messages across the instance. " + - "This is a broad operation — prefer RetryFailedMessagesByQueue, RetryAllFailedMessagesByEndpoint, or RetryFailureGroup when you can scope the retry more narrowly." + "It affects all unresolved failed messages across the instance and may affect a large number of messages." )] public async Task RetryAllFailedMessages() { @@ -99,14 +101,15 @@ public async Task RetryAllFailedMessages() } [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( - "Use this tool to retry all failed messages for a specific NServiceBus endpoint. " + + "Retry all failed messages for a specific endpoint. " + + "Use this when the user explicitly wants an endpoint-scoped retry after an endpoint-specific issue is fixed. " + + "Prefer RetryFailureGroup or RetryFailedMessages when you can retry a narrower set of failures. " + "This operation changes system state. " + "It may affect many messages. " + - "Good for questions like: 'retry all failures in the Sales endpoint', 'the bug in Shipping is fixed, retry its failures', or 'reprocess all errors for this endpoint'. " + - "Useful when a bug in one endpoint has been fixed and all its failures should be reprocessed." + "Use the endpoint name from failed-message results." )] public async Task RetryAllFailedMessagesByEndpoint( - [Description("The NServiceBus endpoint name, e.g. 'Sales' or 'Shipping.MessageHandler'")] string endpointName) + [Description("The endpoint name whose failed messages should be retried. 
Use values obtained from failed-message results.")] string endpointName) { logger.LogInformation("MCP RetryAllFailedMessagesByEndpoint invoked (endpoint={EndpointName})", endpointName); @@ -115,12 +118,12 @@ public async Task RetryAllFailedMessagesByEndpoint( } [McpServerTool(ReadOnly = false, Idempotent = false, Destructive = true, OpenWorld = false), Description( - "Use this tool to retry all failed messages that share the same exception type and stack trace. " + + "Retry all failed messages in a failure group that share the same root cause. " + + "Use this when multiple failures are caused by the same issue and can be retried together. " + + "Prefer RetryFailedMessages for more granular control. " + "This operation changes system state. " + "It may affect many messages. " + - "Good for questions like: 'retry this failure group', 'the bug causing these NullReferenceExceptions is fixed, retry them', or 'retry all messages in this group'. " + - "This is the most targeted way to retry related failures after fixing a specific bug. " + - "You need a failure group ID, which you can get from GetFailureGroups. " + + "Use the failure group ID from GetFailureGroups. " + "Returns InProgress if a retry is already running for this group." 
)] public async Task RetryFailureGroup( From 34ffece75df8fa2eb6d25c89d9315a60774ea567 Mon Sep 17 00:00:00 2001 From: Daniel Marbach Date: Wed, 25 Mar 2026 16:38:09 +0100 Subject: [PATCH 5/5] Remove guides --- docs/mcp-investigation-guide.md | 112 -------------------------------- docs/mcp-prompt-validation.md | 33 ---------- 2 files changed, 145 deletions(-) delete mode 100644 docs/mcp-investigation-guide.md delete mode 100644 docs/mcp-prompt-validation.md diff --git a/docs/mcp-investigation-guide.md b/docs/mcp-investigation-guide.md deleted file mode 100644 index 337103b3de..0000000000 --- a/docs/mcp-investigation-guide.md +++ /dev/null @@ -1,112 +0,0 @@ -# ServiceControl MCP Investigation Guide - -This guide explains how to use the ServiceControl MCP tools for investigation work. - -The MCP surface is designed to help AI agents and human operators choose the right tool based on intent, scope, and risk. - -## Tool Inventory - -### Primary instance tools - -| Tool | Category | Risk | Notes | -| --- | --- | --- | --- | -| `get_errors_summary` | summary | safe | Best first step for overall failed-message health | -| `get_failure_groups` | summary | safe | Best first step for root-cause analysis | -| `get_retry_history` | detail | safe | Confirms whether similar retries were already attempted | -| `get_failed_messages` | list | safe | Broad failed-message listing | -| `get_failed_messages_by_endpoint` | list | safe | Use when the endpoint is already known | -| `get_failed_message_by_id` | detail | safe | Full failed-message history | -| `get_failed_message_last_attempt` | detail | safe | Lighter detail view for the latest failure | -| `retry_failed_message` | action | moderate | Narrow retry for one failed message | -| `retry_failed_messages` | action | moderate | Retry a specific set of failed messages | -| `retry_failed_messages_by_queue` | action | high | Retries all unresolved failures in one queue | -| `retry_all_failed_messages_by_endpoint` | action | high | 
Retries all failures for one endpoint | -| `retry_failure_group` | action | moderate | Best grouped retry after fixing one root cause | -| `retry_all_failed_messages` | action | high | Broadest retry operation | -| `archive_failed_message` | action | moderate | Dismiss one failed message | -| `archive_failed_messages` | action | moderate | Dismiss a chosen set of failed messages | -| `archive_failure_group` | action | high | Dismiss all failed messages in one failure group | -| `unarchive_failed_message` | action | moderate | Restore one archived failed message | -| `unarchive_failed_messages` | action | moderate | Restore a chosen set of archived failed messages | -| `unarchive_failure_group` | action | high | Restore all archived messages in one failure group | - -### Audit instance tools - -| Tool | Category | Risk | Notes | -| --- | --- | --- | --- | -| `get_known_endpoints` | discovery | safe | Start here when you need endpoint names | -| `get_endpoint_audit_counts` | summary | safe | Throughput trends for one endpoint | -| `get_audit_messages` | list | safe | Broad audit-message browsing | -| `search_audit_messages` | search | safe | Full-text lookup for specific terms or IDs | -| `get_audit_messages_by_endpoint` | list/search | safe | Scoped endpoint investigation | -| `get_audit_messages_by_conversation` | detail | safe | Trace a message flow across related messages | -| `get_audit_message_body` | detail | safe | Inspect serialized payload content | - -## Read-only vs State-changing - -### Read-only tools - -Use these first during an investigation. They do not change system state. 
- -- Error investigation: `get_errors_summary`, `get_failure_groups`, `get_retry_history`, `get_failed_messages`, `get_failed_messages_by_endpoint`, `get_failed_message_by_id`, `get_failed_message_last_attempt` -- Audit investigation: `get_known_endpoints`, `get_endpoint_audit_counts`, `get_audit_messages`, `search_audit_messages`, `get_audit_messages_by_endpoint`, `get_audit_messages_by_conversation`, `get_audit_message_body` - -### State-changing tools - -Use these only when the user explicitly wants to retry, archive, or restore failed messages. - -- Retry tools: `retry_failed_message`, `retry_failed_messages`, `retry_failed_messages_by_queue`, `retry_all_failed_messages_by_endpoint`, `retry_failure_group`, `retry_all_failed_messages` -- Archive tools: `archive_failed_message`, `archive_failed_messages`, `archive_failure_group` -- Restore tools: `unarchive_failed_message`, `unarchive_failed_messages`, `unarchive_failure_group` - -Broad actions such as `retry_all_failed_messages`, `retry_failed_messages_by_queue`, `retry_all_failed_messages_by_endpoint`, `archive_failure_group`, and `unarchive_failure_group` can affect many messages. Prefer the narrowest tool that matches the user's intent. 
- -## Commonly Confused Tool Pairs - -### Error tools - -- `get_failed_messages` vs `get_failed_messages_by_endpoint`: use the endpoint-specific tool only when the endpoint is already known -- `retry_failed_messages` vs `retry_failure_group`: use the grouped retry when messages share the same root cause; use the ID-list retry when the user selected specific failed messages -- `archive_failed_messages` vs `archive_failure_group`: use the grouped archive when the whole failure group should be dismissed; use the ID-list archive when only some failed messages should be archived - -### Audit tools - -- `get_audit_messages` vs `search_audit_messages`: browse with `get_audit_messages` when the user wants an overview; search with `search_audit_messages` when the user supplies a concrete term, identifier, or phrase -- `get_audit_messages_by_endpoint` vs `get_audit_messages_by_conversation`: use the endpoint tool for one receiver endpoint; use the conversation tool to follow a cross-endpoint message flow -- `get_audit_messages` vs `get_audit_message_body`: browse metadata first, then fetch body content only when the actual payload matters - -## Recommended Investigation Flows - -### Error investigation flow - -1. `get_errors_summary` -2. `get_failure_groups` -3. `get_failed_messages` or `get_failed_messages_by_endpoint` -4. `get_failed_message_by_id` or `get_failed_message_last_attempt` -5. `get_retry_history` when a retry decision depends on prior attempts -6. Only then consider retry, archive, or unarchive tools - -### Audit investigation flow - -1. `get_known_endpoints` if the endpoint name is not known yet -2. `get_audit_messages` for broad browsing, or `search_audit_messages` for a concrete term or identifier -3. `get_audit_messages_by_endpoint` to narrow to one receiver endpoint -4. `get_audit_messages_by_conversation` to trace the related message flow -5. 
`get_audit_message_body` when the payload content is needed - -## Task-to-tool Mappings - -- "What is failing right now?" -> `get_errors_summary`, then `get_failure_groups` -- "Show recent failures in Sales" -> `get_failed_messages_by_endpoint` -- "Show the full history for this failure" -> `get_failed_message_by_id` -- "Show only the latest exception for this failure" -> `get_failed_message_last_attempt` -- "Retry the failures caused by this bug" -> `retry_failure_group` -- "Retry everything in this queue" -> `retry_failed_messages_by_queue` -- "Dismiss this one failure" -> `archive_failed_message` -- "Restore the archived failures for this root cause" -> `unarchive_failure_group` -- "What endpoints do we have?" -> `get_known_endpoints` -- "Show recent audit traffic" -> `get_audit_messages` -- "Find audit messages mentioning order 12345" -> `search_audit_messages` -- "Show what Billing processed" -> `get_audit_messages_by_endpoint` -- "Trace this conversation" -> `get_audit_messages_by_conversation` -- "Show me the payload for this audit message" -> `get_audit_message_body` diff --git a/docs/mcp-prompt-validation.md b/docs/mcp-prompt-validation.md deleted file mode 100644 index b10c4dc6a5..0000000000 --- a/docs/mcp-prompt-validation.md +++ /dev/null @@ -1,33 +0,0 @@ -# MCP Prompt Validation - -This document records the prompt-validation scenario set for the ServiceControl MCP surface. - -The validation perspective is intentionally narrow: assume the agent only sees discovered tool names, tool descriptions, and parameter descriptions. It does not rely on `docs/mcp-investigation-guide.md` or repository source code. - -## Error Scenarios - -| Prompt | Expected tool choice | Validation notes | -| --- | --- | --- | -| What are the biggest current failure categories? | `get_errors_summary` or `get_failure_groups` | `get_failure_groups` is positioned as the first step for root-cause analysis; detail and mutating tools are not framed as starting points. 
| -| Why are messages failing in Billing? | `get_failure_groups` -> `get_failed_messages_by_endpoint` -> `get_failed_message_last_attempt` | The metadata separates grouped root-cause analysis, endpoint-scoped inspection, and last-attempt detail lookup. | -| Retry only the timeout-related failures | `get_failure_groups` -> `retry_failure_group` | `retry_failure_group` is described as the grouped retry for one root cause, while broader retry tools explicitly warn about broad impact. | -| Show me details for this failed message | `get_failed_message_by_id` | The tool description says it is for a specific failed message and points agents to list/group tools only when an ID is not yet known. | -| Retry everything | `retry_all_failed_messages` | The metadata allows the broad tool when explicitly requested, while warning that it changes system state and may affect a large number of messages. | - -## Audit Scenarios - -| Prompt | Expected tool choice | Validation notes | -| --- | --- | --- | -| Find messages related to order 12345 | `search_audit_messages` | The description explicitly says it is for a specific business identifier or text, and browsing tools point agents toward search for targeted lookups. | -| Show me what happened in this conversation | `get_audit_messages_by_conversation` | The description frames it as tracing a full flow across multiple endpoints once a conversation ID is known. | -| What is endpoint Billing doing? | `get_audit_messages_by_endpoint` | The metadata positions this as the single-endpoint activity view rather than a cross-endpoint trace. | -| Show recent system activity | `get_audit_messages` | The browsing tool is positioned for recent activity and timeline exploration. | -| Show the payload of this message | `get_audit_message_body` | The description explicitly says it is for inspecting payload or message data after locating a specific audit message. 
| - -## Outcome - -- Summary and grouping tools are preferred before detail tools for error investigation. -- Search and browse are clearly separated for audit scenarios. -- Conversation tracing and endpoint-centric inspection are differentiated. -- Broad mutating tools remain discoverable but are framed as explicit, risky choices rather than defaults. -- Identifier and endpoint parameter descriptions support the scenario selection by clarifying where IDs and names come from.