57 changes: 55 additions & 2 deletions architecture/sandbox.md
@@ -33,6 +33,7 @@ All paths are relative to `crates/openshell-sandbox/src/`.
| `l7/relay.rs` | Protocol-aware bidirectional relay with per-request OPA evaluation |
| `l7/rest.rs` | HTTP/1.1 request/response parsing, body framing (Content-Length, chunked), deny response generation |
| `l7/provider.rs` | `L7Provider` trait and `L7Request`/`BodyLength` types |
| `credential_injector.rs` | L7 proxy credential injection for non-inference providers -- extracts injection configs from policy, resolves against provider env, injects credentials at the proxy layer |

## Startup and Orchestration

@@ -81,11 +82,13 @@ flowchart TD
- Priority 1: `--policy-rules` + `--policy-data` provided -- load OPA engine from local Rego file and YAML data file via `OpaEngine::from_files()`. Query `query_sandbox_config()` for filesystem/landlock/process settings. Network mode forced to `Proxy`.
- Priority 2: `--sandbox-id` + `--openshell-endpoint` provided -- fetch typed proto policy via `grpc_client::fetch_policy()`. Create OPA engine via `OpaEngine::from_proto()` using baked-in Rego rules. Convert proto to `SandboxPolicy` via `TryFrom`, which always forces `NetworkMode::Proxy` so that all egress passes through the proxy and the `inference.local` virtual host is always addressable.
- Neither present: return fatal error.
- Output: `(SandboxPolicy, Option<Arc<OpaEngine>>)`
- Output: `(SandboxPolicy, Option<Arc<OpaEngine>>, proto::SandboxPolicy)`

2. **Provider environment fetching**: If sandbox ID and endpoint are available, call `grpc_client::fetch_provider_environment()` to get a `HashMap<String, String>` of credential environment variables. On failure, log a warning and continue with an empty map.

3. **Binary identity cache**: If OPA engine is active, create `Arc<BinaryIdentityCache::new()>` for SHA256 TOFU enforcement.
3. **Credential injection extraction**: Scan the proto policy's network endpoints for `credential_injection` configs. For each match, look up the referenced credential in the provider environment, remove it from the env map (so it is not exposed to the sandbox process), and build a `CredentialInjector` that the L7 proxy will use to inject credentials at the network layer. See [Credential Injection](#credential-injection).

4. **Binary identity cache**: If OPA engine is active, create `Arc<BinaryIdentityCache::new()>` for SHA256 TOFU enforcement.

5. **Filesystem preparation** (`prepare_filesystem()`): For each path in `filesystem.read_write`, create the directory if it does not exist and `chown` to the configured `run_as_user`/`run_as_group`. Runs as the supervisor (root) before forking.
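
The extraction step above can be sketched as follows. This is a minimal sketch, not the real `credential_injector.rs`: the `Injection` enum, `EndpointConfig` view, and `extract` signature are hypothetical stand-ins for whatever the crate actually defines; only the described behavior (withhold the credential from the child env, key injectors by `(host, port)`) is taken from the text.

```rust
use std::collections::HashMap;

/// Hypothetical representation of how a credential is injected (sketch only).
#[derive(Debug)]
enum Injection {
    Header { name: String, value: String },
    QueryParam { name: String, value: String },
}

/// Injectors keyed by (host, port), as described in the startup step.
#[derive(Debug, Default)]
struct CredentialInjector {
    by_endpoint: HashMap<(String, u16), Injection>,
}

/// Hypothetical flat view of a policy endpoint with `credential_injection` set.
struct EndpointConfig {
    host: String,
    port: u16,
    header: String,
    value_prefix: String,
    query_param: String,
    credential: String, // env var name within the provider, e.g. "EXA_API_KEY"
}

/// Build the injector and *remove* matched credentials from the provider env,
/// so the sandboxed process never sees them.
fn extract(
    endpoints: &[EndpointConfig],
    provider_env: &mut HashMap<String, String>,
) -> CredentialInjector {
    let mut injector = CredentialInjector::default();
    for ep in endpoints {
        // Withhold the credential from the sandbox environment.
        let Some(value) = provider_env.remove(&ep.credential) else { continue };
        let injection = if !ep.header.is_empty() {
            Injection::Header {
                name: ep.header.clone(),
                value: format!("{}{}", ep.value_prefix, value),
            }
        } else {
            Injection::QueryParam { name: ep.query_param.clone(), value }
        };
        injector.by_endpoint.insert((ep.host.clone(), ep.port), injection);
    }
    injector
}
```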

@@ -1001,6 +1004,56 @@ Implements `L7Provider` for HTTP/1.1:
5. If allowed (or audit mode): relay request to upstream and response back to client, then loop
6. If denied in enforce mode: send 403 and close the connection

## Credential Injection

**File:** `crates/openshell-sandbox/src/credential_injector.rs`

Credential injection extends the L7 proxy to inject API credentials at the network layer for arbitrary REST endpoints. This generalizes the `inference.local` credential injection pattern to any service in `network_policies`.

### Problem

When provider credentials are injected as environment variables, the agent process can read raw API keys from `process.env`. A prompt injection attack, malicious skill, or compromised dependency can read and exfiltrate these values. The network policy limits where a leaked key can be sent, but does not prevent the agent from reading it.

### Architecture

When an endpoint has a `credential_injection` configuration in the policy YAML:

1. **Sandbox startup** (`lib.rs`): `CredentialInjector::extract_from_policy()` scans the proto policy for `credential_injection` entries, cross-references them with the provider environment, removes the matched credentials from the child env map, and builds a `CredentialInjector` keyed by `(host, port)`.
2. **Proxy startup**: The `CredentialInjector` is passed through to `L7EvalContext` alongside the existing `SecretResolver`.
3. **Request relay** (`l7/rest.rs`): After the OPA policy allows a request, `relay_http_request_with_resolver()` applies credential injection:
- For header injection: strips any existing header with the same name (case-insensitive) and appends the injected header with the real credential.
- For query parameter injection: appends the credential as a URL query parameter.
4. **Agent process**: never sees the credential. It is not in `process.env` and not in any placeholder form.
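
The strip-then-append semantics of step 3 can be sketched over a simple header list. This is a sketch under stated assumptions, not the actual relay code in `l7/rest.rs` (which operates on parsed HTTP/1.1 requests); the function names are illustrative.

```rust
/// Header injection: strip any client-supplied header with the same name
/// (case-insensitive, so a spoofed "AUTHORIZATION" is also removed), then
/// append the header carrying the real credential.
fn inject_header(headers: &mut Vec<(String, String)>, name: &str, value: &str) {
    headers.retain(|(n, _)| !n.eq_ignore_ascii_case(name));
    headers.push((name.to_string(), value.to_string()));
}

/// Query-parameter injection: append `?name=value` or `&name=value`
/// depending on whether the request target already has a query string.
fn inject_query_param(path: &mut String, name: &str, value: &str) {
    let sep = if path.contains('?') { '&' } else { '?' };
    path.push(sep);
    path.push_str(name);
    path.push('=');
    path.push_str(value);
}
```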

### Injection types

| Type | YAML fields | Example |
|---|---|---|
| Header | `header: x-api-key` | `x-api-key: <value>` |
| Header + prefix | `header: Authorization`, `value_prefix: "Bearer "` | `Authorization: Bearer <value>` |
| Query parameter | `query_param: key` | URL appended with `?key=<value>` |
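
For illustration, the three shapes above map onto endpoint entries like the following. Field names come from the table; the hosts, providers, and credential keys here are placeholders, not real policy:

```yaml
endpoints:
  - host: api.example.com
    port: 443
    protocol: rest        # required for credential_injection
    tls: terminate        # required for credential_injection
    credential_injection:
      header: Authorization
      value_prefix: "Bearer "
      provider: example
      credential: EXAMPLE_API_KEY
  - host: data.example.com
    port: 443
    protocol: rest
    tls: terminate
    credential_injection:
      query_param: key
      provider: example_data
      credential: EXAMPLE_DATA_KEY
```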

### Relationship to SecretResolver

`SecretResolver` and `CredentialInjector` serve different purposes:

| | SecretResolver | CredentialInjector |
|---|---|---|
| **Mechanism** | Placeholder rewriting | Direct injection |
| **Agent visibility** | Agent sees placeholder env vars | Agent sees nothing |
| **When applied** | All provider credentials (default) | Only credentials with `credential_injection` |
| **Auth header source** | Agent constructs the header using placeholder | Proxy adds the header from scratch |
| **Spoofing risk** | Agent could send placeholders to wrong endpoint | Proxy strips any existing header first |

Both are applied in `relay_http_request_with_resolver()`: `SecretResolver` rewrites first, then `CredentialInjector` injects.

### Validation rules

- `credential_injection` requires `protocol: rest` and `tls: terminate`
- Exactly one of `header` or `query_param` must be set
- `credential` and `provider` are required
- `value_prefix` is only valid with `header`
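
The four rules above can be sketched as a single check over a flattened endpoint view. The struct and error shape here are hypothetical; the real validation lives in `openshell-policy`:

```rust
/// Hypothetical flat view of an endpoint's credential_injection fields.
struct Endpoint<'a> {
    protocol: &'a str,
    tls: &'a str,
    header: &'a str,
    query_param: &'a str,
    value_prefix: &'a str,
    provider: &'a str,
    credential: &'a str,
}

/// Sketch of the validation rules listed above.
fn validate(e: &Endpoint<'_>) -> Result<(), String> {
    if e.protocol != "rest" || e.tls != "terminate" {
        return Err("credential_injection requires protocol: rest and tls: terminate".into());
    }
    // Both set or both empty violates the exactly-one rule.
    if e.header.is_empty() == e.query_param.is_empty() {
        return Err("exactly one of header or query_param must be set".into());
    }
    if e.provider.is_empty() || e.credential.is_empty() {
        return Err("credential and provider are required".into());
    }
    if !e.value_prefix.is_empty() && e.header.is_empty() {
        return Err("value_prefix is only valid with header".into());
    }
    Ok(())
}
```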

## Process Identity

### SHA256 TOFU (Trust-On-First-Use)
4 changes: 3 additions & 1 deletion crates/openshell-core/Cargo.toml
@@ -25,10 +25,12 @@ url = { workspace = true }
## Off by default so production builds have an empty registry.
## Enabled by e2e tests and during development.
dev-settings = []
## Use bundled protoc from protobuf-src instead of system protoc.
bundled-protoc = ["protobuf-src"]

[build-dependencies]
tonic-build = { workspace = true }
protobuf-src = { workspace = true }
protobuf-src = { workspace = true, optional = true }

[dev-dependencies]
tempfile = "3"
21 changes: 12 additions & 9 deletions crates/openshell-core/build.rs
@@ -17,15 +17,18 @@ fn main() -> Result<(), Box<dyn std::error::Error>> {
}

// --- Protobuf compilation ---
// Use bundled protoc from protobuf-src. The system protoc (from apt-get)
// does not bundle the well-known type includes (google/protobuf/struct.proto
// etc.), so we must use protobuf-src which ships both the binary and the
// include tree.
// SAFETY: This is run at build time in a single-threaded build script context.
// No other threads are reading environment variables concurrently.
#[allow(unsafe_code)]
unsafe {
env::set_var("PROTOC", protobuf_src::protoc());
// Prefer PROTOC env var (e.g., from mise or system install) when available.
// Fall back to bundled protoc from protobuf-src if the feature is enabled.
if env::var("PROTOC").is_err() {
#[cfg(feature = "bundled-protoc")]
{
// SAFETY: This is run at build time in a single-threaded build script context.
// No other threads are reading environment variables concurrently.
#[allow(unsafe_code)]
unsafe {
env::set_var("PROTOC", protobuf_src::protoc());
}
}
}

let proto_files = [
208 changes: 206 additions & 2 deletions crates/openshell-policy/src/lib.rs
@@ -15,8 +15,8 @@ use std::path::Path;

use miette::{IntoDiagnostic, Result, WrapErr};
use openshell_core::proto::{
FilesystemPolicy, L7Allow, L7Rule, LandlockPolicy, NetworkBinary, NetworkEndpoint,
NetworkPolicyRule, ProcessPolicy, SandboxPolicy,
CredentialInjection, FilesystemPolicy, L7Allow, L7Rule, LandlockPolicy, NetworkBinary,
NetworkEndpoint, NetworkPolicyRule, ProcessPolicy, SandboxPolicy,
};
use serde::{Deserialize, Serialize};

@@ -99,6 +99,11 @@ struct NetworkEndpointDef {
rules: Vec<L7RuleDef>,
#[serde(default, skip_serializing_if = "Vec::is_empty")]
allowed_ips: Vec<String>,
/// Optional credential injection. When set, the referenced provider
/// credential is withheld from the sandbox environment and injected
/// at the L7 proxy layer instead.
#[serde(default, skip_serializing_if = "Option::is_none")]
credential_injection: Option<CredentialInjectionDef>,
}

fn is_zero(v: &u32) -> bool {
@@ -132,6 +137,34 @@ struct NetworkBinaryDef {
harness: bool,
}

/// Credential injection configuration for an L7 endpoint.
///
/// When attached to an endpoint, the referenced provider credential is not
/// injected as an environment variable. Instead, the L7 proxy injects it
/// into outbound requests at the network layer.
#[derive(Debug, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
struct CredentialInjectionDef {
/// HTTP header name (e.g., "x-api-key", "Authorization").
/// Mutually exclusive with `query_param`.
#[serde(default, skip_serializing_if = "String::is_empty")]
header: String,
/// Optional prefix prepended to the credential value (e.g., "Bearer ").
/// Only valid when `header` is set.
#[serde(default, skip_serializing_if = "String::is_empty")]
value_prefix: String,
/// URL query parameter name (e.g., "key").
/// Mutually exclusive with `header`.
#[serde(default, skip_serializing_if = "String::is_empty")]
query_param: String,
/// Provider name that holds the credential.
#[serde(default, skip_serializing_if = "String::is_empty")]
provider: String,
/// Credential key within the provider (e.g., "EXA_API_KEY").
#[serde(default, skip_serializing_if = "String::is_empty")]
credential: String,
}

// ---------------------------------------------------------------------------
// YAML → proto conversion
// ---------------------------------------------------------------------------
@@ -180,6 +213,15 @@ fn to_proto(raw: PolicyFile) -> SandboxPolicy {
})
.collect(),
allowed_ips: e.allowed_ips,
credential_injection: e.credential_injection.map(
|ci| CredentialInjection {
header: ci.header,
value_prefix: ci.value_prefix,
query_param: ci.query_param,
provider: ci.provider,
credential: ci.credential,
},
),
}
})
.collect(),
@@ -280,6 +322,15 @@ fn from_proto(policy: &SandboxPolicy) -> PolicyFile {
})
.collect(),
allowed_ips: e.allowed_ips.clone(),
credential_injection: e.credential_injection.as_ref().map(
|ci| CredentialInjectionDef {
header: ci.header.clone(),
value_prefix: ci.value_prefix.clone(),
query_param: ci.query_param.clone(),
provider: ci.provider.clone(),
credential: ci.credential.clone(),
},
),
}
})
.collect(),
@@ -1117,4 +1168,157 @@ network_policies:
proto2.network_policies["test"].endpoints[0].host
);
}

#[test]
fn round_trip_preserves_credential_injection_header() {
let yaml = r#"
version: 1
network_policies:
exa_api:
name: exa-search-api
endpoints:
- host: api.exa.ai
port: 443
protocol: rest
tls: terminate
enforcement: enforce
credential_injection:
header: x-api-key
provider: exa
credential: EXA_API_KEY
rules:
- allow:
method: POST
path: /search
binaries:
- path: /usr/bin/node
"#;
let proto1 = parse_sandbox_policy(yaml).expect("parse failed");
let ci1 = proto1.network_policies["exa_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection missing");
assert_eq!(ci1.header, "x-api-key");
assert_eq!(ci1.provider, "exa");
assert_eq!(ci1.credential, "EXA_API_KEY");
assert!(ci1.value_prefix.is_empty());
assert!(ci1.query_param.is_empty());

let yaml_out = serialize_sandbox_policy(&proto1).expect("serialize failed");
let proto2 = parse_sandbox_policy(&yaml_out).expect("re-parse failed");
let ci2 = proto2.network_policies["exa_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection lost in round-trip");
assert_eq!(ci1.header, ci2.header);
assert_eq!(ci1.provider, ci2.provider);
assert_eq!(ci1.credential, ci2.credential);
}

#[test]
fn round_trip_preserves_credential_injection_bearer() {
let yaml = r#"
version: 1
network_policies:
perplexity_api:
name: perplexity-api
endpoints:
- host: api.perplexity.ai
port: 443
protocol: rest
tls: terminate
enforcement: enforce
credential_injection:
header: Authorization
value_prefix: "Bearer "
provider: perplexity
credential: PERPLEXITY_API_KEY
rules:
- allow:
method: POST
path: /chat/completions
binaries:
- path: /usr/bin/node
"#;
let proto1 = parse_sandbox_policy(yaml).expect("parse failed");
let ci1 = proto1.network_policies["perplexity_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection missing");
assert_eq!(ci1.header, "Authorization");
assert_eq!(ci1.value_prefix, "Bearer ");
assert_eq!(ci1.credential, "PERPLEXITY_API_KEY");

let yaml_out = serialize_sandbox_policy(&proto1).expect("serialize failed");
let proto2 = parse_sandbox_policy(&yaml_out).expect("re-parse failed");
let ci2 = proto2.network_policies["perplexity_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection lost in round-trip");
assert_eq!(ci1.value_prefix, ci2.value_prefix);
}

#[test]
fn round_trip_preserves_credential_injection_query_param() {
let yaml = r#"
version: 1
network_policies:
youtube_api:
name: youtube-data-api
endpoints:
- host: www.googleapis.com
port: 443
protocol: rest
tls: terminate
credential_injection:
query_param: key
provider: youtube
credential: YOUTUBE_API_KEY
binaries:
- path: /usr/bin/node
"#;
let proto1 = parse_sandbox_policy(yaml).expect("parse failed");
let ci1 = proto1.network_policies["youtube_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection missing");
assert_eq!(ci1.query_param, "key");
assert_eq!(ci1.credential, "YOUTUBE_API_KEY");
assert!(ci1.header.is_empty());

let yaml_out = serialize_sandbox_policy(&proto1).expect("serialize failed");
let proto2 = parse_sandbox_policy(&yaml_out).expect("re-parse failed");
let ci2 = proto2.network_policies["youtube_api"].endpoints[0]
.credential_injection
.as_ref()
.expect("credential_injection lost in round-trip");
assert_eq!(ci1.query_param, ci2.query_param);
assert_eq!(ci1.credential, ci2.credential);
}

#[test]
fn no_credential_injection_preserves_none() {
let yaml = r#"
version: 1
network_policies:
test:
endpoints:
- host: example.com
port: 443
binaries:
- path: /usr/bin/curl
"#;
let proto = parse_sandbox_policy(yaml).expect("parse failed");
assert!(
proto.network_policies["test"].endpoints[0]
.credential_injection
.is_none()
);

let yaml_out = serialize_sandbox_policy(&proto).expect("serialize failed");
assert!(
!yaml_out.contains("credential_injection"),
"credential_injection should not appear in output when not set"
);
}
}