SPEC-0006: MCP Infrastructure Bridge

Overview

Claude Ops is an AI agent that monitors and remediates infrastructure from inside a Docker container. To perform its job, it must interact with diverse infrastructure components: Docker containers, PostgreSQL databases, web UIs (for credential rotation), and arbitrary HTTP endpoints. This specification defines the use of Model Context Protocol (MCP) servers as the primary access layer for all infrastructure interactions.

MCP servers run as stdio subprocesses spawned via npx -y <package> and expose typed tool interfaces that the Claude Code CLI can invoke directly. A baseline set of four MCP servers covers core infrastructure access patterns, and mounted repos can extend this set by providing additional MCP server definitions that are merged into the configuration before each monitoring cycle.

Definitions

MCP (Model Context Protocol): A standard protocol for connecting AI models to external tools and data sources, using JSON Schema-defined tool interfaces over stdio communication.
MCP server: A subprocess that implements the MCP protocol, exposing a set of typed tools that the agent can invoke. Spawned via npx -y <package> as a stdio child process.
Baseline MCP config: The MCP server definitions shipped with the Claude Ops container at /app/.claude/mcp.json, covering core infrastructure access patterns.
Repo MCP config: Additional MCP server definitions provided by a mounted repo at .claude-ops/mcp.json, merged into the baseline before each cycle.
Merged config: The final MCP configuration after combining baseline and all repo configs, written to /app/.claude/mcp.json before the Claude CLI process starts.
Tool schema: A JSON Schema definition of a tool's parameters and return type, provided by an MCP server so the agent can reason about how to invoke it.
Browser sidecar: A headless Chromium container (browserless/chromium) that provides a Chrome DevTools Protocol endpoint for browser automation via the Chrome DevTools MCP server.

Requirements

REQ-1: MCP as Primary Infrastructure Access Layer

The system MUST use MCP servers as the primary mechanism for the agent to interact with infrastructure components. The agent SHOULD prefer MCP tool invocations over raw CLI commands (via the Bash tool) when an MCP server provides equivalent functionality.

Scenario: Agent uses Docker MCP for container inspection

Given the Docker MCP server is configured and running When the agent needs to inspect container state during health checks Then the agent uses Docker MCP tools rather than running docker inspect via Bash

Scenario: Agent uses Fetch MCP for HTTP health checks

Given the Fetch MCP server is configured and running When the agent needs to perform an HTTP health check against a service endpoint Then the agent uses the Fetch MCP server rather than running curl via Bash

Scenario: Fallback to Bash when MCP is insufficient

Given the agent needs to perform an operation not supported by any configured MCP server When no MCP tool provides the required functionality Then the agent MAY fall back to using the Bash tool with direct CLI commands

REQ-2: Baseline MCP Server Set

The system MUST ship with a baseline set of MCP servers configured in /app/.claude/mcp.json. The baseline MUST include the following servers:

Docker MCP (@anthropic-ai/mcp-docker): Container inspection, management, and lifecycle operations.
PostgreSQL MCP (@anthropic-ai/mcp-postgres): Database querying and inspection.
Chrome DevTools MCP (@anthropic-ai/mcp-chrome-devtools): Browser automation for web UIs that lack programmatic APIs.
Fetch MCP (@anthropic-ai/mcp-fetch): General HTTP request capabilities.

Scenario: All baseline servers present

Given the Claude Ops container starts with its default configuration When the agent reads /app/.claude/mcp.json Then the config contains entries for docker, postgres, chrome-devtools, and fetch MCP servers

Scenario: Docker MCP provides container tools

Given the Docker MCP server is configured When the agent invokes Docker MCP tools Then it can list containers, inspect container state, check health status, view logs, and manage container lifecycle

Scenario: PostgreSQL MCP provides database tools

Given the PostgreSQL MCP server is configured And $CLAUDEOPS_POSTGRES_URL is set to a valid connection string When the agent invokes PostgreSQL MCP tools Then it can execute read-only queries against the configured database

Scenario: Chrome DevTools MCP enables browser automation

Given the Chrome DevTools MCP server is configured And the browser sidecar container is running with CDP endpoint ws://chrome:9222 When the agent invokes Chrome DevTools MCP tools Then it can navigate pages, take snapshots, fill forms, click elements, and extract page content

Scenario: Fetch MCP provides HTTP capabilities

Given the Fetch MCP server is configured When the agent invokes Fetch MCP tools Then it can make HTTP requests to arbitrary endpoints and process responses

REQ-3: MCP Server Spawning via npx

MCP servers MUST be spawned as stdio subprocesses using npx -y <package>. The npx -y flag ensures automatic installation without interactive prompts. The system MUST NOT require pre-installation of MCP server packages in the container image (though pre-installation MAY be used for performance optimization).

Scenario: MCP server started on demand

Given an MCP server is defined in the config with "command": "npx" and "args": ["-y", "@anthropic-ai/mcp-docker"] When the Claude CLI starts and needs to use the Docker MCP server Then npx downloads and starts the MCP server package as a stdio subprocess

Scenario: No pre-installation required

Given a new MCP server package @custom/mcp-prometheus is added to the config And it is not pre-installed in the container When the Claude CLI starts Then npx -y fetches and runs the package from the npm registry

REQ-4: MCP Server Environment Isolation

Each MCP server MUST run in its own subprocess with its own environment variables. Sensitive configuration (e.g., database connection strings, API keys) MUST be scoped to the specific MCP server that requires them via the env field in the MCP config.

Scenario: PostgreSQL credentials scoped to PostgreSQL MCP

Given the MCP config sets POSTGRES_CONNECTION in the env field of the postgres server When the Docker MCP server runs Then the Docker MCP server's process does not have access to POSTGRES_CONNECTION

Scenario: Custom MCP server with its own credentials

Given a repo provides an MCP server config with "env": {"API_KEY": "secret-key"} When the MCP server is spawned Then only that server's process has access to API_KEY

REQ-5: Repo-Provided MCP Server Extension

Mounted repos MAY provide additional MCP server definitions in .claude-ops/mcp.json. These MUST be merged into the baseline configuration by the entrypoint script before each monitoring cycle, following the merge semantics defined in SPEC-0005 (Mounted Repo Extension Model, REQ-9).

Scenario: Repo adds custom MCP integration

Given a repo infra-ansible provides .claude-ops/mcp.json with an ansible-inventory server definition When the entrypoint merges MCP configs before the next cycle Then the merged config includes the ansible-inventory server alongside all baseline servers

Scenario: Repo overrides baseline server

Given a repo provides .claude-ops/mcp.json redefining the postgres server with a different connection When the entrypoint merges configs Then the repo's postgres definition replaces the baseline's postgres definition

Scenario: Merge happens before each cycle

Given the entrypoint runs the MCP merge function before invoking the Claude CLI When the Claude CLI starts Then it sees the fully merged MCP config with all repo-provided servers available

REQ-6: MCP Config Structure

Each MCP server definition in the config MUST include the following fields:

type: MUST be "stdio" for subprocess-based MCP servers.
command: The command to execute (typically "npx").
args: An array of arguments passed to the command (e.g., ["-y", "@anthropic-ai/mcp-docker"]).

Each definition MAY include:

env: An object of environment variables passed to the MCP server subprocess.

Scenario: Valid MCP server config

Given a config entry for a server named docker Then it contains "type": "stdio", "command": "npx", and "args": ["-y", "@anthropic-ai/mcp-docker"]

Scenario: Config with environment variables

Given a config entry for the postgres server Then it contains an env field with "POSTGRES_CONNECTION" set to the database connection string

Scenario: Config without environment variables

Given a config entry for the fetch server Then it contains type, command, and args but does not require an env field

REQ-7: Browser Sidecar for Chrome DevTools MCP

The Chrome DevTools MCP server MUST connect to a headless Chromium instance via the Chrome DevTools Protocol (CDP). The system MUST support running a browser sidecar container (e.g., browserless/chromium) that exposes a CDP WebSocket endpoint. The Chrome DevTools MCP server's CDP_ENDPOINT environment variable MUST be configured to connect to this sidecar.

The browser sidecar MUST be an optional component, activated via Docker Compose profile (--profile browser). The system MUST function without it when browser automation is not needed.

Scenario: Browser sidecar active

Given docker compose --profile browser up -d was used to start the system And the chrome container is running with port 9222 exposed When the agent needs to rotate an API key via a web UI Then the Chrome DevTools MCP connects to ws://chrome:9222 and automates the browser

Scenario: Browser sidecar not active

Given the system was started without the browser profile When the Chrome DevTools MCP server attempts to connect Then browser automation tools are unavailable And the agent uses alternative methods (REST APIs via Fetch MCP) or escalates

REQ-8: Typed Tool Schemas

MCP servers MUST expose tool schemas that describe available operations, their parameters, and expected return types using JSON Schema. The agent MUST use these schemas to reason about tool capabilities and construct valid tool invocations.

Scenario: Agent discovers available tools

Given the Docker MCP server is running When the agent queries the server's tool list Then it receives a set of tool definitions with names, descriptions, and JSON Schema parameter definitions

Scenario: Agent constructs valid invocations

Given the agent needs to inspect a container named postgres-db And the Docker MCP exposes a tool with a containerName parameter When the agent invokes the tool Then it constructs a valid invocation with the correct parameter name and value

REQ-9: Bounded Tool Surface Area

Each MCP server MUST expose a finite, well-defined set of operations. The agent MUST NOT have the ability to execute arbitrary commands through an MCP server. The tool surface area of each configured MCP server SHOULD be auditable by reviewing the server's tool definitions.

Scenario: Docker MCP exposes bounded operations

Given the Docker MCP server is configured When an operator reviews its tool definitions Then they can enumerate all operations the agent can perform through this server

Scenario: MCP preferred over unbounded shell

Given the agent could use either Docker MCP tools or docker CLI via Bash When the agent chooses how to interact with containers Then it prefers Docker MCP tools because the tool surface area is bounded and auditable

REQ-10: MCP Usage Aligned with Permission Tiers

MCP tool invocations MUST respect the agent's permission tier model. A Tier 1 (observe-only) agent MUST NOT use MCP tools to perform mutating operations (e.g., restarting containers via Docker MCP). Tier 2 and Tier 3 agents MAY use MCP tools for their respective permitted operations.

Scenario: Tier 1 uses MCP for read-only operations

Given a Tier 1 agent has access to the Docker MCP server When it needs to check container health Then it uses read-only Docker MCP tools (list, inspect) but does not use restart or stop tools

Scenario: Tier 2 uses MCP for safe remediation

Given a Tier 2 agent has access to the Docker MCP server When it needs to restart an unhealthy container Then it MAY use the Docker MCP's restart tool as this is within Tier 2 permissions

Scenario: Browser automation requires Tier 2

Given browser automation via Chrome DevTools MCP is needed for credential rotation When a Tier 1 agent encounters an authentication failure Then it MUST NOT use Chrome DevTools MCP for rotation and MUST escalate to Tier 2

REQ-11: Baseline Backup and Restoration

The entrypoint script MUST save a backup of the baseline MCP config on the first run and restore it before each subsequent merge cycle. This ensures that removed repos or changed repo configs take effect on the next cycle rather than accumulating indefinitely.

Scenario: First run baseline backup

Given the system starts for the first time And /app/.claude/mcp.json.baseline does not exist When the entrypoint prepares to merge MCP configs Then it copies /app/.claude/mcp.json to /app/.claude/mcp.json.baseline

Scenario: Subsequent run baseline restoration

Given /app/.claude/mcp.json.baseline exists from a previous run When the entrypoint starts a new cycle Then it restores the baseline before merging, so the config reflects only current repos

Scenario: Removed repo config cleaned up

Given repo old-infra was previously mounted and contributed MCP servers And repo old-infra has been unmounted When the next cycle begins Then the baseline is restored (without old-infra's servers) and only currently mounted repos' configs are merged

References

ADR-0006: Use MCP Servers as Primary Infrastructure Access Layer
SPEC-0005: Mounted Repo Extension Model (for MCP config merging semantics)
Model Context Protocol specification
Claude Ops baseline MCP config

Overview​

Definitions​

Requirements​

REQ-1: MCP as Primary Infrastructure Access Layer​

Scenario: Agent uses Docker MCP for container inspection​

Scenario: Agent uses Fetch MCP for HTTP health checks​

Scenario: Fallback to Bash when MCP is insufficient​

REQ-2: Baseline MCP Server Set​

Scenario: All baseline servers present​

Scenario: Docker MCP provides container tools​

Scenario: PostgreSQL MCP provides database tools​

Scenario: Chrome DevTools MCP enables browser automation​

Scenario: Fetch MCP provides HTTP capabilities​

REQ-3: MCP Server Spawning via npx​

Scenario: MCP server started on demand​

Scenario: No pre-installation required​

REQ-4: MCP Server Environment Isolation​

Scenario: PostgreSQL credentials scoped to PostgreSQL MCP​

Scenario: Custom MCP server with its own credentials​

REQ-5: Repo-Provided MCP Server Extension​

Scenario: Repo adds custom MCP integration​

Scenario: Repo overrides baseline server​

Scenario: Merge happens before each cycle​

REQ-6: MCP Config Structure​

Scenario: Valid MCP server config​

Scenario: Config with environment variables​

Scenario: Config without environment variables​

REQ-7: Browser Sidecar for Chrome DevTools MCP​

Scenario: Browser sidecar active​

Scenario: Browser sidecar not active​

REQ-8: Typed Tool Schemas​

Scenario: Agent discovers available tools​

Scenario: Agent constructs valid invocations​

REQ-9: Bounded Tool Surface Area​

Scenario: Docker MCP exposes bounded operations​

Scenario: MCP preferred over unbounded shell​

REQ-10: MCP Usage Aligned with Permission Tiers​

Scenario: Tier 1 uses MCP for read-only operations​

Scenario: Tier 2 uses MCP for safe remediation​

Scenario: Browser automation requires Tier 2​

REQ-11: Baseline Backup and Restoration​

Scenario: First run baseline backup​

Scenario: Subsequent run baseline restoration​

Scenario: Removed repo config cleaned up​

References​