From 416bad907aff24ab66ba9a71f0d3fdf97be848ea Mon Sep 17 00:00:00 2001
From: Johann Hofmann
Date: Wed, 11 Mar 2026 21:10:20 +0000
Subject: [PATCH 1/2] Add untrusted annotation to proposed mitigations

---
 docs/security-privacy-considerations.md | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/docs/security-privacy-considerations.md b/docs/security-privacy-considerations.md
index e20ae25..658bf72 100644
--- a/docs/security-privacy-considerations.md
+++ b/docs/security-privacy-considerations.md
@@ -360,6 +360,14 @@ To advance the security and privacy posture of WebMCP, we need community input o
 
 **How:** Ensuring an interoperable basis for prompt injection defense, by requiring any implementer to protect against at least the attacks in that dataset
 
+#### [Untrusted Annotation for Tool Responses](https://github.com/webmachinelearning/webmcp/issues/136)
+
+**What:** Giving agents information about trust boundaries such as highlighting untrustworthy content to the model using an untrusted annotation.
+
+**Threats addressed:** Prompt Injection Attacks (Output Injection Attacks)
+
+**How:** A boolean `contains_untrusted_content: true` annotation that acts as a signal to the client that the payload requires heightened security handling, allowing the client to sanitize the payload, use indicators such as spotlighting to highlight untrustworthy content to the model, or hide that part of the response entirely.
+
 ... add more issues here
 
 ## Next Steps

From 5bbccb4a2e0b9fde860122249fafd096cbd4c493 Mon Sep 17 00:00:00 2001
From: Johann Hofmann
Date: Fri, 17 Apr 2026 09:12:08 -0400
Subject: [PATCH 2/2] Update security-privacy-considerations.md

---
 docs/security-privacy-considerations.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/security-privacy-considerations.md b/docs/security-privacy-considerations.md
index 658bf72..6fc2b58 100644
--- a/docs/security-privacy-considerations.md
+++ b/docs/security-privacy-considerations.md
@@ -366,7 +366,7 @@ To advance the security and privacy posture of WebMCP, we need community input o
 
 **Threats addressed:** Prompt Injection Attacks (Output Injection Attacks)
 
-**How:** A boolean `contains_untrusted_content: true` annotation that acts as a signal to the client that the payload requires heightened security handling, allowing the client to sanitize the payload, use indicators such as spotlighting to highlight untrustworthy content to the model, or hide that part of the response entirely.
+**How:** A boolean `ToolAnnotations.untrustedContentHint = true` annotation that acts as a signal to the client that the payload requires heightened security handling, allowing the client to sanitize the payload, use indicators such as spotlighting to highlight untrustworthy content to the model, or hide that part of the response entirely.
 
 ... add more issues here
 
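
The **How:** paragraph in the patch names three things a client may do once it sees the hint: sanitize the payload, spotlight it, or hide it. The sketch below is one possible reading of that, not the proposal's actual design. Only the `untrustedContentHint` field name comes from the patch. The `ToolResponse` shape, the `UntrustedPolicy` type, the delimiter strings, and every function name are assumptions made up for illustration, and the `sanitize` function is a placeholder rather than a real sanitizer.

```typescript
// Only `untrustedContentHint` is taken from the patch; every other name here is hypothetical.
interface ToolAnnotations {
  untrustedContentHint?: boolean;
}

// Assumed response shape: a text payload plus optional annotations.
interface ToolResponse {
  text: string;
  annotations?: ToolAnnotations;
}

type UntrustedPolicy = "sanitize" | "spotlight" | "hide";

// Illustrative delimiters for spotlighting; not defined by any specification.
const OPEN_MARK = "<<UNTRUSTED>>";
const CLOSE_MARK = "<<END_UNTRUSTED>>";

// Placeholder sanitizer: drops ASCII control characters (except tab, LF, CR) and angle brackets.
function sanitize(text: string): string {
  return text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F<>]/g, "");
}

// Delimiter-based spotlighting: wrap the payload in markers so the model can
// tell data from instructions. Any "<<" in the payload is broken up first so
// the payload cannot contain a marker and close the span early.
function spotlight(text: string): string {
  const neutralized = text.replace(/<(?=<)/g, "< ");
  return `${OPEN_MARK}\n${neutralized}\n${CLOSE_MARK}`;
}

// Decide what text reaches the model, based on the hint and the client's policy.
function prepareForModel(response: ToolResponse, policy: UntrustedPolicy): string {
  if (response.annotations?.untrustedContentHint !== true) {
    return response.text; // no hint set: pass the payload through unchanged
  }
  switch (policy) {
    case "sanitize":
      return sanitize(response.text);
    case "spotlight":
      return spotlight(response.text);
    case "hide":
      return "[untrusted content withheld]";
  }
}
```

In this sketch the hint is advisory: a response without it passes through unchanged, so a client that distrusts the tool itself would still need its own default handling.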