MCP Server Development Guide for Autonomous Agents

Note: This project is already fully compliant with MCP 2025-11-25. Reference this document only when making protocol-level changes.

Version: 1.0
MCP Protocol Revision: 2025-11-25
Purpose: Authoritative reference for development agents building MCP servers. Codifies protocol compliance requirements, production best practices, and common failure modes distilled from specification research, industry experience reports (Block, Docker, Philschmid/HuggingFace), and hands-on spec development.


1. Foundational Principle: MCP Is a User Interface for AI Agents

The single most important concept in MCP server design is that your user is an LLM, not a human developer. Every design decision flows from this.

A REST API is designed for a developer who reads docs once, writes integration code, debugs it, and deploys. An MCP server is designed for an agent that discovers tools at runtime, interprets schemas within a finite context window, selects and invokes tools turn-by-turn, and must self-correct from errors without human intervention.

This has concrete consequences:

  • Tool schemas compete for context window tokens. Every tool definition, every parameter description, every enum value — all of it occupies space that could otherwise hold user instructions, conversation history, or results from previous calls.
  • Agents hallucinate structure. Given a dict parameter, an LLM will invent key names. Given a free-form string where an enum would do, it will produce plausible but invalid values.
  • Multi-step orchestration is expensive and error-prone. Each chained tool call introduces latency, consumes tokens, and creates a potential failure point. Agents are improving at planning but remain unreliable beyond ~5 chained calls.
  • Error messages are the agent's primary recovery mechanism. A good error message is an instruction the agent can act on. A bad one is a dead end that wastes a retry.

2. Protocol Compliance (MCP 2025-11-25)

This section covers the non-negotiable structural requirements. Violating any of these will cause interoperability failures with conformant clients.

2.1 Lifecycle

The server MUST implement the full initialization handshake:

  1. Client sends initialize with its capabilities and protocolVersion.
  2. Server responds with its own capabilities, protocolVersion, serverInfo, and optionally instructions.
  3. Client sends initialized notification.
  4. Only after receiving initialized may the server process tools/call or tools/list requests. Implementations MUST gate on this. Handling tool calls before initialization is a protocol violation.
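The gating rule in step 4 can be sketched as a small state machine. This is a minimal illustration, not a prescribed implementation: the LifecycleState and LifecycleGate names are hypothetical, and answering ping in every state is an assumption made here so that liveness checks keep working during the handshake.

```python
from enum import Enum


class LifecycleState(Enum):
    AWAITING_INITIALIZE = "awaiting_initialize"
    AWAITING_INITIALIZED = "awaiting_initialized"
    READY = "ready"


class LifecycleGate:
    """Hypothetical helper that tracks handshake progress for one session."""

    def __init__(self) -> None:
        self.state = LifecycleState.AWAITING_INITIALIZE

    def allows(self, method: str) -> bool:
        """Return True if `method` may be handled in the current state."""
        if method == "ping":
            return True  # assumption: pings are answered in every state
        if method == "initialize":
            return self.state is LifecycleState.AWAITING_INITIALIZE
        if method == "notifications/initialized":
            return self.state is LifecycleState.AWAITING_INITIALIZED
        # Everything else (tools/list, tools/call, ...) waits for READY.
        return self.state is LifecycleState.READY

    def advance(self, method: str) -> None:
        """Move to the next state after a handshake message was handled."""
        if method == "initialize" and self.state is LifecycleState.AWAITING_INITIALIZE:
            self.state = LifecycleState.AWAITING_INITIALIZED
        elif (
            method == "notifications/initialized"
            and self.state is LifecycleState.AWAITING_INITIALIZED
        ):
            self.state = LifecycleState.READY
```

A dispatcher would call allows() before routing each message and advance() after handling the two handshake messages.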

The initialize response MUST include:

{
  "protocolVersion": "2025-11-25",
  "capabilities": {
    "tools": {
      "listChanged": false
    }
  },
  "serverInfo": {
    "name": "your-server-name",
    "version": "1.0.0",
    "description": "One-sentence description for humans."
  },
  "instructions": "LLM-targeted guidance on when/how to use this server's tools. This is injected into the system prompt by many hosts."
}

Key points:

  • serverInfo.name is the machine identifier. Keep it short, lowercase, hyphenated.
  • serverInfo.description is for humans (UI display).
  • instructions is for the LLM. Write it as you would write a system prompt section — direct, imperative, specific. Many hosts (Claude Desktop, Goose, etc.) inject this directly into the model's context.

2.2 Tool Definition Structure

Every tool exposed via tools/list is described by the following fields. Fields marked "Yes" MUST be present:

| Field | Required | Purpose |
| --- | --- | --- |
| name | Yes | Machine identifier. 1–128 chars, [a-zA-Z0-9_\-.] only. Case-sensitive. |
| description | Yes | LLM-targeted explanation of what the tool does, when to use it, and what it returns. |
| inputSchema | Yes | JSON Schema (2020-12 default) defining accepted parameters. MUST be type: "object". |
| title | Recommended | Human-readable display name. Distinct from name. |
| outputSchema | Recommended | JSON Schema defining the structure of structuredContent in the response. |
| annotations | Recommended | Behavioral hints: readOnlyHint, destructiveHint, idempotentHint, openWorldHint. |

2.3 Tool Naming (SEP-986)

  • Allowed characters: A-Z, a-z, 0-9, _, -, .
  • No spaces, commas, or special characters.
  • SHOULD be unique within the server.
  • For servers likely to run alongside others, use a service-prefixed pattern: {service}_{action}_{resource} (e.g., github_create_issue, slack_send_message). Some MCP clients auto-prefix with the server name, so avoid redundancy if you know your target host.
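The character, length, and uniqueness rules above can be checked at server startup. The sketch below is one way to do that with the standard library; the function names are hypothetical and not part of any MCP SDK.

```python
import re

# Character set and 1-128 length limit taken from the naming rules above.
TOOL_NAME_PATTERN = re.compile(r"[A-Za-z0-9_.\-]{1,128}")


def is_valid_tool_name(name: str) -> bool:
    """Return True if `name` uses only allowed characters and is 1-128 chars."""
    return TOOL_NAME_PATTERN.fullmatch(name) is not None


def find_duplicate_names(names: list[str]) -> list[str]:
    """Return names that appear more than once (comparison is case-sensitive)."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for name in names:
        if name in seen and name not in duplicates:
            duplicates.append(name)
        seen.add(name)
    return duplicates
```

Running both checks over the registered tool list before serving tools/list turns a naming mistake into a startup failure rather than a client-side interoperability bug.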

2.4 Response Structure: Dual Content

Every tools/call response MUST return a content array (for backward compatibility) and SHOULD return structuredContent (for typed consumption):

{
  "content": [
    {
      "type": "text",
      "text": "{\"key\": \"serialized JSON of structuredContent\"}"
    }
  ],
  "structuredContent": {
    "key": "typed object matching outputSchema"
  },
  "isError": false
}

Rules:

  • content contains TextContent blocks with the serialized JSON. This is what older clients see.
  • structuredContent is the typed object conforming to outputSchema. This is what modern clients validate.
  • If outputSchema is declared, structuredContent MUST conform to it.
  • For large responses where the serialized JSON exceeds a reasonable threshold (~20K characters), the content text block SHOULD contain a summary or truncation note referencing structuredContent, not the full serialization. This prevents context window overflow in hosts that inject content directly.
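A minimal sketch of these rules, assuming a hypothetical build_tool_result helper: it serializes structuredContent into the text block and swaps in a short summary once the serialization passes the ~20K character threshold. The summary wording is illustrative only.

```python
import json
from typing import Any

MAX_TEXT_CHARS = 20_000  # the ~20K character threshold described above


def build_tool_result(structured: dict[str, Any]) -> dict[str, Any]:
    """Build a successful tools/call result carrying both content forms."""
    serialized = json.dumps(structured, separators=(",", ":"), sort_keys=True)
    if len(serialized) > MAX_TEXT_CHARS:
        # Too large for hosts that inject content verbatim: summarize instead.
        text = (
            f"Result is {len(serialized)} characters when serialized; "
            "the full data is in structuredContent. "
            f"Top-level keys: {', '.join(sorted(structured))}."
        )
    else:
        text = serialized
    return {
        "content": [{"type": "text", "text": text}],
        "structuredContent": structured,
        "isError": False,
    }
```

Older clients still receive readable text in either case, while clients that understand structuredContent always get the complete typed object.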

2.5 Error Handling: Two Distinct Mechanisms

MCP distinguishes two error types. Getting this wrong is one of the most common implementation mistakes.

Protocol Errors — Standard JSON-RPC errors for structural/routing issues:

  • Unknown tool name → -32602 (Invalid params)
  • Malformed request → -32600 (Invalid request)
  • Server bug → -32603 (Internal error)
{
  "jsonrpc": "2.0",
  "id": 3,
  "error": {
    "code": -32602,
    "message": "Unknown tool: nonexistent_tool"
  }
}

Tool Execution Errors — Returned in the normal result envelope with isError: true:

  • Input validation failures (date in wrong format, value out of range)
  • Business logic errors (file not found, API rate limit)
  • Partial failures (some items processed, others failed)
{
  "content": [
    {
      "type": "text",
      "text": "File not found at /path/to/image.png. Verify the path exists and is an absolute path to a supported image format (PNG, JPEG, WebP)."
    }
  ],
  "isError": true
}

Critical: Input validation errors MUST be Tool Execution Errors, not Protocol Errors (SEP-1303). This is because Tool Execution Errors are reliably fed back to the LLM, enabling self-correction. Protocol Errors may be swallowed by the client or shown to the user instead. When a tool receives a valid JSON-RPC request with an invalid parameter value (e.g., k: -5), that's a Tool Execution Error.

Error responses MUST NOT include structuredContent. Only successful responses get structured output.
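The split between the two mechanisms can be sketched as a dispatch function. Everything here is illustrative: handle_tools_call, the handlers mapping, and the extract_palette handler are hypothetical names, and using ValueError as the signal for invalid input is an assumption of this sketch.

```python
import json
from typing import Any, Callable

ToolHandler = Callable[[dict[str, Any]], dict[str, Any]]


def protocol_error(request_id: int, code: int, message: str) -> dict[str, Any]:
    """JSON-RPC error object: used for structural or routing problems."""
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}


def tool_execution_error(request_id: int, message: str) -> dict[str, Any]:
    """Normal result envelope with isError: true and no structuredContent."""
    return {"jsonrpc": "2.0", "id": request_id,
            "result": {"content": [{"type": "text", "text": message}],
                       "isError": True}}


def extract_palette(arguments: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical handler that validates its own input values."""
    k = arguments.get("k", 8)
    if not isinstance(k, int) or not 2 <= k <= 32:
        raise ValueError(
            f"Parameter 'k' must be an integer between 2 and 32. Received: {k}. "
            "For most UI mockups, k=8 works well."
        )
    return {"k": k}


def handle_tools_call(request_id: int, name: str, arguments: dict[str, Any],
                      handlers: dict[str, ToolHandler]) -> dict[str, Any]:
    handler = handlers.get(name)
    if handler is None:
        # Routing problem: the tool does not exist, so this is a Protocol Error.
        return protocol_error(request_id, -32602, f"Unknown tool: {name}")
    try:
        structured = handler(arguments)
    except ValueError as exc:
        # Bad input value: report it where the LLM can see it and retry.
        return tool_execution_error(request_id, str(exc))
    return {"jsonrpc": "2.0", "id": request_id,
            "result": {"content": [{"type": "text", "text": json.dumps(structured)}],
                       "structuredContent": structured, "isError": False}}
```

Calling handle_tools_call(3, "nonexistent_tool", {}, handlers) yields a top-level error object, while an out-of-range k yields a result whose isError flag is true.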

2.6 JSON Schema Dialect

MCP 2025-11-25 establishes JSON Schema 2020-12 as the default dialect (SEP-1613). When inputSchema or outputSchema omit the $schema field, clients assume 2020-12. You MAY include $schema explicitly if you need a different dialect (e.g., draft-07), but 2020-12 is recommended.

2.7 Tool Annotations

Annotations are optional behavioral hints that clients use for UI decisions (e.g., auto-approving safe operations):

{
  "annotations": {
    "readOnlyHint": true,
    "destructiveHint": false,
    "idempotentHint": true,
    "openWorldHint": false
  }
}
  • readOnlyHint: true — Tool does not modify state. Hosts like Goose and Claude Code may auto-approve these.
  • destructiveHint — Only meaningful when readOnlyHint is false. Indicates potential for irreversible changes.
  • idempotentHint: true — Repeated calls with identical arguments produce the same result. Enables safe retries.
  • openWorldHint: true — Tool interacts with external entities (APIs, network). false means it operates on local/contained data only.

Set these accurately. Incorrect annotations can lead to auto-approval of dangerous operations or unnecessary confirmation prompts for safe ones.


3. Tool Design Best Practices

These practices are drawn from production experience at Block (60+ MCP servers), Docker (100+ catalog servers), and independent analysis. They are not protocol requirements — they are engineering recommendations that directly impact how well agents can use your tools.

3.1 Design for Outcomes, Not Operations

The mistake: Exposing one MCP tool per REST endpoint or database query.

The fix: Design each tool around a complete user goal.

If fulfilling a request requires calling GET /users, then GET /orders/{id}, then GET /shipments/{id}, expose a single track_order(email) tool that does all three internally and returns a synthesized result. The LLM should not need to orchestrate multi-step data fetching.

This doesn't mean every server has one tool. It means every tool delivers a complete, actionable result for one class of question. A well-designed server typically has 5–15 tools (see 3.7), each covering a distinct goal.

3.2 Flatten Your Arguments

The mistake: Accepting nested configuration objects.

// BAD: Agent must guess the nested structure
{
  "filters": {
    "status": "active",
    "date_range": { "start": "2025-01-01", "end": "2025-06-01" }
  }
}

The fix: Top-level primitives with constrained types.

// GOOD: Every parameter is visible, typed, and constrained
{
  "status": { "type": "string", "enum": ["active", "pending", "closed"] },
  "date_start": { "type": "string", "format": "date" },
  "date_end": { "type": "string", "format": "date" }
}

LLMs reliably produce flat key-value pairs. They hallucinate nested keys, invent dictionary structures, and miss required sub-fields. Use enum for every parameter that has a finite set of valid values.

Exception: Arrays of structured items (e.g., a list of bounding boxes) are sometimes unavoidable. In these cases, keep the inner object as flat as possible and document the structure exhaustively in the parameter description.

3.3 Use additionalProperties: false on All Input Schemas

This prevents agents from inventing extra parameters. Without it, an LLM might pass {"image_path": "/img.png", "quality": "high"} where quality doesn't exist, and the server silently ignores it — leading the agent to believe it influenced the output.

{
  "type": "object",
  "properties": { ... },
  "required": ["image_path"],
  "additionalProperties": false
}

Caveat: For tools that accept passthrough data (e.g., generate_design_tokens accepting palette items with extra fields from upstream), deliberately omit additionalProperties: false on the inner array item schema and document why.
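Because not every client enforces the schema, the same rule is worth repeating on the server. The sketch below checks only top-level keys against the declared properties; find_unknown_arguments is a hypothetical helper, not a replacement for a full JSON Schema validator.

```python
from typing import Any


def find_unknown_arguments(arguments: dict[str, Any],
                           input_schema: dict[str, Any]) -> list[str]:
    """Return top-level argument names the schema does not declare.

    Returns an empty list when the schema does not set
    additionalProperties to false, since extras are then permitted.
    """
    if input_schema.get("additionalProperties", True) is not False:
        return []
    allowed = set(input_schema.get("properties", {}))
    return sorted(key for key in arguments if key not in allowed)


def unknown_arguments_message(unknown: list[str],
                              input_schema: dict[str, Any]) -> str:
    """Build an actionable message listing what was rejected and what is valid."""
    valid = ", ".join(sorted(input_schema.get("properties", {})))
    return (f"Unknown parameter(s): {', '.join(unknown)}. "
            f"Valid parameters are: {valid}. Remove the unknown ones and retry.")
```

When the list is non-empty, the server would return the message as a Tool Execution Error so the agent learns that the extra parameter had no effect.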

3.4 Write Descriptions as LLM Instructions

Tool and parameter descriptions are consumed by the LLM as part of its context. They are not API docs for humans — they are instructions for an agent. Write them accordingly.

Bad: "description": "The algorithm to use for clustering."

Good: "description": "Clustering algorithm. Use 'k-means' (default) for predictable results with a known color count. Use 'median-cut' for deterministic output without seed dependency. Use 'dbscan' when the number of distinct colors is unknown — it auto-discovers clusters and treats k as a maximum cap, not a target."

For the tool-level description, specify three things:

  1. When to use it: "Use this tool when you need to extract a color palette from a UI mockup screenshot."
  2. What it returns: "Returns an array of dominant colors with hex values, CSS strings, percentage coverage, and semantic role hints."
  3. How to format arguments: "image_path must be an absolute filesystem path. The calling agent is responsible for saving remote images to disk first."

3.5 Make Error Messages Actionable

The agent sees your error message and must decide what to do next. The message is a recovery instruction.

Bad: "Access denied."

Good: "Cannot read file at /path/to/image.png. The file does not exist or is not readable. Verify the absolute path is correct and the file is a supported format (PNG, JPEG, WebP, BMP, TIFF). If the image was downloaded from a URL, ensure it was fully saved to disk before invoking this tool."

Bad: "Invalid parameter."

Good: "Parameter 'k' must be an integer between 2 and 32. Received: -1. For most UI mockups, k=8 works well. Use higher values (12-16) for complex multi-color designs."

Every error message should answer: what went wrong, what the valid state looks like, and what the agent should do differently on retry.

3.6 Guard the Token Budget

Tool responses consume context window space. A response that returns 50KB of JSON will cripple the agent's ability to process subsequent turns.

Tactics:

  • Cap output size. Set a max_results or max_components parameter with a sensible default. When the cap is hit, include a truncated: true flag and total_available count so the agent knows there's more.
  • Truncate text content. If the serialized JSON for content blocks exceeds ~20K characters, provide a summary string instead of the full serialization. The typed data is still available in structuredContent.
  • Paginate list operations. Return has_more, next_cursor, and total_count metadata. Never load unbounded result sets into memory.
  • Return only what's needed. If your backing API returns 50 fields per record, select the 5–8 that matter for the agent's task. Don't pass through raw API responses.
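The pagination tactic can be sketched as follows. The paginate function is hypothetical, and encoding the cursor as a stringified offset is an assumption made for brevity; a production server might prefer an opaque cursor.

```python
from typing import Any, Optional, Sequence


def paginate(items: Sequence[Any], cursor: Optional[str] = None,
             page_size: int = 25) -> dict[str, Any]:
    """Return one bounded page plus the metadata the agent needs to continue."""
    if page_size < 1:
        raise ValueError("page_size must be at least 1.")
    start = int(cursor) if cursor else 0  # assumption: cursor is an offset
    if start < 0 or start > len(items):
        raise ValueError(
            f"Cursor {cursor!r} is out of range. Omit the cursor to start "
            "from the first page."
        )
    end = min(start + page_size, len(items))
    has_more = end < len(items)
    return {
        "items": list(items[start:end]),
        "has_more": has_more,
        "next_cursor": str(end) if has_more else None,
        "total_count": len(items),
    }
```

The in-memory sequence keeps the example short; against a real data store, the offset and limit would be pushed into the query so the full result set is never loaded.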

3.7 Curate Ruthlessly

  • 5–15 tools per server. More than that and tool selection becomes unreliable — the agent spends tokens parsing descriptions it will never use.
  • One server, one domain. A "GitHub + Jira + Slack" server is three servers pretending to be one. Split them.
  • Delete unused tools. If telemetry shows a tool is never invoked, remove it. It's consuming description tokens on every request.
  • Separate read and write tools. This enables granular permission management. Users can "always allow" read tools while requiring confirmation for writes.

3.8 Prefer Determinism

Agents retry. They also compare outputs across calls. Non-determinism creates confusion.

  • Fix random seeds for any stochastic algorithm. Expose the seed as a parameter for reproducibility, but use a stable default (e.g., random_seed=42).
  • Document where non-determinism is unavoidable (e.g., DBSCAN cluster ordering, floating-point sensitivity across platforms).
  • Use stable sort orders for output arrays.
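Two of these points, a seeded random source with a stable default and a stable output order, can be illustrated with the standard library alone. The function names and the shape of the color records (hex and percentage keys) are assumptions for this sketch.

```python
import random
from typing import Any, Sequence


def sample_pixels(pixels: Sequence[Any], sample_size: int,
                  random_seed: int = 42) -> list[Any]:
    """Draw a reproducible sample: the same seed and input give the same output."""
    if sample_size >= len(pixels):
        return list(pixels)
    rng = random.Random(random_seed)  # private generator, no global state
    return rng.sample(list(pixels), sample_size)


def stable_palette_order(colors: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Sort by descending coverage, breaking ties by hex so order never flips."""
    return sorted(colors, key=lambda c: (-c["percentage"], c["hex"]))
```

Using a private random.Random instance rather than the module-level functions keeps one tool call from perturbing the random state seen by another.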

4. Schema Design Patterns

4.1 Dependent Parameters

When a group of parameters must all be provided together or not at all (e.g., region coordinates x, y, width, height):

Option A — JSON Schema dependentRequired (2020-12):

{
  "dependentRequired": {
    "region_x": ["region_y", "region_width", "region_height"],
    "region_y": ["region_x", "region_width", "region_height"],
    "region_width": ["region_x", "region_y", "region_height"],
    "region_height": ["region_x", "region_y", "region_width"]
  }
}

Option B — Runtime validation fallback: Not all clients validate dependentRequired. Always implement server-side validation and return a Tool Execution Error with a clear message: "If any region parameter is provided, all four (region_x, region_y, region_width, region_height) must be provided."

Use both. The schema catches it at the client level; the runtime validation catches it when the client doesn't validate.
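The runtime fallback from Option B might look like the sketch below. The check_region_arguments name is hypothetical, and treating an explicit null the same as an omitted parameter is an assumption of this example.

```python
from typing import Any, Optional

REGION_KEYS = ("region_x", "region_y", "region_width", "region_height")


def check_region_arguments(arguments: dict[str, Any]) -> Optional[str]:
    """Return an error message if the region group is partially provided."""
    provided = [key for key in REGION_KEYS if arguments.get(key) is not None]
    if not provided or len(provided) == len(REGION_KEYS):
        return None  # all-or-nothing rule is satisfied
    missing = [key for key in REGION_KEYS if key not in provided]
    return (
        "If any region parameter is provided, all four (region_x, region_y, "
        "region_width, region_height) must be provided. "
        f"Missing: {', '.join(missing)}."
    )
```

A non-None return value would be sent back as a Tool Execution Error, so the agent sees exactly which parameters to add on retry.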

4.2 Parameter Consistency Across Tools

If multiple tools in your server accept the same parameter (e.g., algorithm, color_space, max_resolution), the shared parameter MUST:

  • Use identical names, types, defaults, and enum values.
  • Share identical descriptions.

It SHOULD also appear in the same relative position in each schema. Position is not enforced, but it aids readability.

Inconsistency between sibling tools is a common gap. An agent that learns extract_palette accepts color_space will assume extract_components does too — and be confused when it doesn't.
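One way to catch definition drift is a small audit over the tool definitions before release. The sketch below is an assumption-laden illustration: find_inconsistent_parameters is a hypothetical helper, and it compares whole parameter schemas (type, default, enum, description) for exact equality. It does not detect a parameter that is missing from a sibling tool altogether.

```python
from typing import Any


def find_inconsistent_parameters(
    tools: list[dict[str, Any]],
) -> list[tuple[str, str, str]]:
    """Return (parameter, first_tool, conflicting_tool) for each mismatch."""
    first_seen: dict[str, tuple[str, dict[str, Any]]] = {}
    problems: list[tuple[str, str, str]] = []
    for tool in tools:
        properties = tool["inputSchema"].get("properties", {})
        for param_name, param_schema in properties.items():
            if param_name not in first_seen:
                first_seen[param_name] = (tool["name"], param_schema)
            elif first_seen[param_name][1] != param_schema:
                problems.append((param_name, first_seen[param_name][0], tool["name"]))
    return problems
```

An empty result means every shared parameter is defined identically wherever it appears.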

4.3 Coordinate Systems

If your server works with spatial data (images, maps, documents), state the coordinate system explicitly:

  • In the instructions field of the initialize response.
  • In the description of every coordinate parameter.
  • In the description of every coordinate field in the outputSchema.

Example: "All input coordinates and output coordinates are in original image pixels (top-left origin). The server handles internal scaling for preprocessing; callers never need to account for downscaling."

4.4 Enum vs. Free-Form String

Use enums (enum in JSON Schema) for every parameter with a known finite set of valid values. This includes:

  • Algorithm choices
  • Output formats
  • Sort orders
  • Status values
  • Mode selectors

If the set might expand in future versions, still use an enum for the current version and update it when values are added. An agent that receives a clear enum will always pick a valid value. An agent that receives "type": "string" will frequently produce plausible but invalid values.


5. Specification Writing for Implementation Agents

When writing specs that development agents will implement, these patterns prevent the most common classes of implementation bugs.

5.1 Eliminating Ambiguity

For every behavioral decision in your spec, verify it passes the "two reasonable implementations" test: could two independent implementers, reading only your spec, produce different behavior? If yes, the spec is underspecified.

Common ambiguity sources:

| Ambiguity | Example | Fix |
| --- | --- | --- |
| Unspecified algorithm | "Find the nearest color" | "Find the nearest CSS named color using CIEDE2000 Delta-E in CIELAB space against the 148 CSS Level 4 named colors." |
| Unspecified threshold | "Merge similar colors" | "Merge colors within ΔE ≤ 5.0 (CIEDE2000) AND with bounding boxes overlapping > 50% of the smaller region's area." |
| Unspecified nullability | "Background field" | "null when exclude_background is false, OR when true but no single color exceeds the background_threshold." |
| Unspecified format | "Generate Tailwind config" | Provide a complete output example with exact syntax. |
| Contradictory statements | "DBSCAN ignores k" vs "k is a cap" | Resolve to a single statement and use identical phrasing everywhere the concept appears. |

5.2 The Deduplication Problem

Any time your output schema contains semantically-named fields derived from classification (e.g., role_hint, category, type), and those names feed into downstream identifiers (e.g., CSS variable names, token names), you MUST specify deduplication rules.

Example: If two palette entries both receive role_hint: "neutral", and the token naming pattern is --{prefix}-{role_hint}, you get a collision: two tokens named --color-neutral.

Specify explicitly:

  • First occurrence: --color-neutral
  • Subsequent: --color-neutral-2, --color-neutral-3
  • If no role_hint is available, fall back to rank-based: --color-1, --color-2

5.3 Percentage Denominators

If your output includes percentage fields, always state the denominator:

  • "Percentages are relative to non-background pixels (when background is excluded) or total image pixels (when it is not). They sum to approximately 100%."
  • "Per-component percentages are relative to the component's bounding box pixel count, not the full image."

5.4 Complete Output Examples

For every output format your server produces, include a complete, valid example in the spec. Not a fragment — a complete, copy-paste-able block that an implementer can use as a test fixture.

This is especially critical for formats with precise syntax (CSS custom properties, SCSS variables, Tailwind config objects, W3C Design Token JSON).

5.5 Dependency Manifests

If recommending an implementation language, provide a concrete dependency manifest — not just package names but exact package identifiers and version constraints:

# pyproject.toml excerpt: dependencies is a key of the [project] table
[project]
dependencies = [
    "mcp>=1.9",
    "opencv-python-headless>=4.8",  # not opencv-python (avoids GUI deps)
    "scikit-learn>=1.3",
    "Pillow>=10.0",
    "pydantic>=2.0",
    "numpy>=1.24",
]

An agent choosing between opencv-python and opencv-python-headless will pick the wrong one without guidance.


6. Testing & Validation

6.1 MCP Inspector

The MCP Inspector (npx @modelcontextprotocol/inspector) is the canonical validation tool. Every server MUST pass the following checklist before release:

  1. Server starts and completes the initialize → initialized handshake.
  2. tools/list returns all tools with valid name, description, inputSchema.
  3. Each tool is callable with valid arguments and returns content + structuredContent.
  4. Each tool returns isError: true (not a protocol error) for invalid input values.
  5. Unknown tool names produce a -32602 protocol error.
  6. ping requests receive a response.
  7. No requests are processed before initialized.

6.2 Test Fixture Strategy

For servers that process input data (images, documents, datasets):

  • Synthetic fixtures: Programmatically generated inputs with exactly known expected outputs. These enable deterministic assertions. Example: a 100×100 image with four 50×50 solid-color quadrants should yield exactly 4 clusters with known hex values.
  • Edge cases: Empty input, minimum-size input, maximum-size input, corrupt input. Each should produce a specific, documented error.
  • Format coverage: If your server accepts PNG, JPEG, and WebP, test all three. JPEG compression introduces artifacts that affect clustering — your tests should verify this is handled.
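The synthetic fixture from the first bullet can be generated without any imaging library. This sketch builds the image as nested lists of RGB tuples; the function names and the four specific colors are assumptions chosen for illustration, and a real test would encode the pixels to PNG before passing them to the server.

```python
from collections import Counter

# Hypothetical quadrant colors, keyed by (row_half, column_half).
QUADRANT_COLORS = {
    (0, 0): (255, 0, 0),    # top-left: red
    (0, 1): (0, 255, 0),    # top-right: green
    (1, 0): (0, 0, 255),    # bottom-left: blue
    (1, 1): (255, 255, 0),  # bottom-right: yellow
}


def make_quadrant_image(size: int = 100) -> list[list[tuple[int, int, int]]]:
    """Build a size x size image made of four solid-color quadrants."""
    half = size // 2
    return [
        [QUADRANT_COLORS[(int(y >= half), int(x >= half))] for x in range(size)]
        for y in range(size)
    ]


def color_histogram(image: list[list[tuple[int, int, int]]]) -> Counter:
    """Count pixels per hex color; this is the expected output for assertions."""
    return Counter("#%02x%02x%02x" % pixel for row in image for pixel in row)
```

With the default size, the histogram holds exactly four hex keys of 2,500 pixels each, which gives the clustering test exact values to assert against.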

6.3 Unit vs. Integration Test Boundary

  • Unit tests cover individual modules in isolation: color conversion round-trips, clustering on synthetic data, schema validation, error formatting.
  • Integration tests cover the full MCP protocol flow: initialize → tools/list → tools/call → validate response against outputSchema.
  • Property tests (if applicable): verify invariants like "percentages sum to ~100%", "output coordinates are within original image bounds", "token names are unique".

7. Performance & Operations

7.1 Lazy Imports

For Python servers with heavy dependencies (OpenCV, scikit-learn, NumPy), use lazy imports to minimize startup latency on stdio transport:

def handle_extract_palette(params):
    import cv2               # ~200ms cold import
    import numpy as np        # ~100ms cold import
    from sklearn.cluster import KMeans
    # ... processing

This keeps the initialize → initialized handshake fast while deferring heavy library loading to first use.

7.2 Transport Selection

  • stdio: Maximum client compatibility. Use for single-user, locally-run servers. This is the default for most use cases.
  • Streamable HTTP: Use when you need networked access, horizontal scaling, or incremental results. Note: the older HTTP+SSE transport was deprecated in protocol revision 2025-03-26, which introduced Streamable HTTP as its replacement.

7.3 Prompt Prefix Caching

LLM providers offer significant latency and cost reductions for cached prompt prefixes. Your tool definitions and instructions are part of this prefix. To maximize cache hits:

  • Avoid injecting dynamic data (timestamps, live counts) into instructions or tool descriptions.
  • Keep tool definitions stable across sessions.
  • If you must include dynamic context, append it after the stable prefix.

8. Security Considerations

8.1 Input Validation

  • Validate ALL tool inputs server-side, even if the schema constrains them. Clients may not validate schemas.
  • For file paths: resolve to absolute paths, verify existence, check file type by magic bytes (not just extension), enforce size limits.
  • For string inputs that feed into shell commands or SQL: sanitize or parameterize. Never interpolate user-provided strings into commands.
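The file-path checks can be sketched with pathlib. Several details are assumptions for this example: the validate_image_path name, the 20 MB limit, the decision to reject relative paths outright, and the restriction to three formats. The magic-byte signatures themselves are the standard ones for PNG, JPEG, and WebP.

```python
from pathlib import Path

MAX_BYTES = 20 * 1024 * 1024  # hypothetical size limit
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
JPEG_SIGNATURE = b"\xff\xd8\xff"


def validate_image_path(raw_path: str) -> str:
    """Return the detected format, or raise ValueError with an actionable message."""
    path = Path(raw_path)
    if not path.is_absolute():
        raise ValueError(f"'{raw_path}' is not an absolute path. Pass an absolute path.")
    path = path.resolve()  # collapse symlinks and '..' segments
    if not path.is_file():
        raise ValueError(f"No readable file at {path}. Verify the path exists.")
    size = path.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"File is {size} bytes; the limit is {MAX_BYTES} bytes.")
    with path.open("rb") as handle:
        header = handle.read(12)
    # Identify the format from content, not from the file extension.
    if header.startswith(PNG_SIGNATURE):
        return "PNG"
    if header.startswith(JPEG_SIGNATURE):
        return "JPEG"
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "WebP"
    raise ValueError(
        f"{path} is not a supported image. Supported formats: PNG, JPEG, WebP."
    )
```

Each ValueError message is written to double as the text of a Tool Execution Error, in line with the guidance on actionable messages in 3.5.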

8.2 Output Sanitization

  • Never echo secrets, API keys, or credentials in tool results or error messages.
  • If your tool accesses authenticated APIs, strip authentication headers from any debug output.
  • Rate-limit tool invocations to prevent abuse in multi-agent scenarios.

8.3 Permissions Model

Design tools with a single risk level each:

  • Read-only tools: Query data, retrieve status, list items. Mark with readOnlyHint: true.
  • Write tools: Create, update, delete. Mark with readOnlyHint: false, and set destructiveHint appropriately.

Don't mix reads and writes in one tool — it prevents users from making informed permission decisions. If a workflow genuinely requires both, document it clearly and validate inputs aggressively.

8.4 Authentication

  • Authorization is optional in MCP. When an HTTP-based server supports it, follow the MCP authorization specification, which builds on OAuth 2.1.
  • Never store tokens in plaintext files. Use platform keyrings.
  • Request minimum necessary scopes.
  • Handle token refresh proactively.

9. Common Failure Modes (Anti-Pattern Catalog)

This section catalogs the most frequently observed failure modes, drawn from spec reviews and production experience. Use it as a pre-flight checklist.

| # | Anti-Pattern | Symptom | Fix |
| --- | --- | --- | --- |
| 1 | REST endpoint 1:1 mapping | Agent chains 3-5 calls for one goal | Consolidate into outcome-oriented tools |
| 2 | Nested input objects | Agent hallucinates keys, misses required sub-fields | Flatten to top-level primitives |
| 3 | Free-form strings where enums exist | Agent produces plausible but invalid values | Use JSON Schema enum |
| 4 | Human-facing error messages | Agent can't self-correct | Include what went wrong, what's valid, what to do differently |
| 5 | Validation errors as Protocol Errors | Agent never sees the error | Use Tool Execution Errors (isError: true) |
| 6 | Unbounded response sizes | Context window overflow | Cap, truncate, paginate |
| 7 | Missing additionalProperties: false | Agent invents parameters silently | Add to all input schemas |
| 8 | Inconsistent parameters across tools | Agent assumes capabilities that don't exist | Audit sibling tools for symmetry |
| 9 | Ambiguous coordinate systems | Off-by-factor-N positioning errors | Declare coordinate space in three places |
| 10 | No outputSchema | Agent must parse untyped JSON | Always declare output structure |
| 11 | No instructions in initialize | Agent lacks server-level context | Write LLM-targeted usage guidance |
| 12 | Processing requests before initialized | Protocol violation, undefined behavior | Gate on lifecycle state |
| 13 | Duplicate semantic identifiers | Token/key collisions in output | Specify deduplication rules |
| 14 | structuredContent in error responses | Schema validation failure on client | Error responses use content only |
| 15 | Dynamic data in tool descriptions | Cache invalidation, increased costs | Keep descriptions static |

10. Implementation Checklist

Use this checklist when implementing or reviewing an MCP server. Every item maps to a section in this guide.

Protocol Compliance

  • Server responds to initialize with protocolVersion, capabilities, serverInfo, instructions
  • Server gates tool processing on initialized notification
  • Server responds to ping
  • tools/list returns all tools with name, description, inputSchema
  • tools/call returns both content and structuredContent
  • Tool names comply with SEP-986 character/length rules
  • JSON Schema defaults to 2020-12 dialect
  • Input validation errors are Tool Execution Errors, not Protocol Errors
  • Error responses do not include structuredContent
  • Unknown tool names return Protocol Error -32602

Tool Design

  • Each tool delivers a complete outcome (no mandatory chaining)
  • All parameters are top-level primitives or constrained types
  • All finite-set parameters use enum
  • additionalProperties: false on input schemas (with documented exceptions)
  • Tool descriptions specify when to use, what's returned, how to format arguments
  • outputSchema declared for every tool
  • annotations set accurately (readOnlyHint, idempotentHint, etc.)
  • Consistent parameters across sibling tools

Error Handling

  • Every error message is actionable (what happened, what's valid, what to do)
  • File/path errors include format requirements
  • Parameter errors include valid ranges and defaults
  • Partial failures include what succeeded and what didn't

Output & Token Budget

  • Response sizes are bounded (max_results, max_components, or equivalent)
  • Truncation flags (truncated: true, total_available: N) are included when applicable
  • Text content blocks are summarized when full serialization exceeds ~20K characters
  • List operations support pagination metadata

Testing

  • MCP Inspector handshake passes
  • All tools callable with valid arguments
  • All tools return isError for invalid arguments
  • Unit tests cover core logic modules
  • Integration tests cover full protocol flow
  • Synthetic test fixtures with known expected outputs
  • Edge cases: empty input, minimum size, maximum size, corrupt input

Security

  • All inputs validated server-side
  • No secrets in tool results or error messages
  • File paths resolved and checked for existence/permissions/type
  • Tools separated by risk level (read vs. write)

Appendix A: Reference Links

| Resource | URL |
| --- | --- |
| MCP Specification (2025-11-25) | https://modelcontextprotocol.io/specification/2025-11-25 |
| MCP Tools Spec | https://modelcontextprotocol.io/specification/2025-11-25/server/tools |
| MCP Lifecycle Spec | https://modelcontextprotocol.io/specification/2025-11-25/basic/lifecycle |
| MCP Changelog (2025-11-25) | https://modelcontextprotocol.io/specification/2025-11-25/changelog |
| MCP Schema Reference | https://modelcontextprotocol.io/specification/2025-11-25/schema |
| MCP Security Best Practices | https://modelcontextprotocol.io/specification/2025-11-25/basic/security_best_practices |
| Block's MCP Playbook | https://engineering.block.xyz/blog/blocks-playbook-for-designing-mcp-servers |
| Docker MCP Best Practices | https://www.docker.com/blog/mcp-server-best-practices/ |
| Philschmid's MCP Guide | https://www.philschmid.de/mcp-best-practices |
| The New Stack: 15 Best Practices | https://thenewstack.io/15-best-practices-for-building-mcp-servers-in-production/ |

Appendix B: Glossary

| Term | Definition |
| --- | --- |
| Tool Execution Error | An error returned within the normal CallToolResult envelope with isError: true. Visible to the LLM for self-correction. |
| Protocol Error | A JSON-RPC error object at the transport level (error.code). May not reach the LLM. |
| structuredContent | Typed JSON object in tool results, validated against outputSchema. The machine-readable response. |
| content | Array of content blocks (TextContent, ImageContent, AudioContent, ResourceLink, or EmbeddedResource). The backward-compatible response. |
| annotations | Behavioral metadata on tools (readOnlyHint, destructiveHint, etc.) used by hosts for permission decisions. |
| instructions | LLM-targeted text in the initialize response, injected into the system prompt by many hosts. |
| SEP | Specification Enhancement Proposal: the formal process for changes to the MCP specification. |
| stdio transport | Communication via stdin/stdout. Server is launched as a child process by the host. Maximum compatibility. |
| Streamable HTTP | HTTP-based transport supporting long-running connections and incremental results. Replaces the deprecated HTTP+SSE transport. |