| Internet-Draft | MCP for Networks | November 2025 |
| Zeng, et al. | Expires 6 May 2026 | [Page] |
The Model Context Protocol (MCP) is an open standard that enables Large Language Model (LLM) applications to seamlessly integrate with external data sources and tools by exposing Resources, Prompts and Tools in a JSON-RPC 2.0 transport. This document describes a mapping of MCP roles, primitives and security model to the network management domain so that network devices act as MCP servers and network controllers act as MCP clients. This document also extends the model to Device-to-Device (D2D) collaboration, allowing network elements to perform distributed fault correlation when the controller is unreachable or when real-time cross-device data is required. The goal is to provide an intent-based, conversational and secure approach for automated network troubleshooting, configuration validation, and closed-loop remediation without inventing new protocols or device agents.¶
This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.¶
Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.¶
Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."¶
This Internet-Draft will expire on 6 May 2026.¶
Copyright (c) 2025 IETF Trust and the persons identified as the document authors. All rights reserved.¶
This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Revised BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Revised BSD License.¶
Network operators today face two converging demands: (1) reduce Mean Time to Repair (MTTR) while managing ever larger infrastructures, and (2) adopt intent-based interfaces that allow engineers to express high-level goals such as "verify reachability between Site-A and Site-B" instead of typing low-level CLI commands.¶
Simultaneously, Large Language Models (LLMs) have demonstrated utility in reasoning about semi-structured data such as device logs, configurations, and command outputs. However, safely exposing device-level actions and data to an LLM in real time remains an open problem. The Model Context Protocol (MCP), developed by Anthropic and published at https://modelcontextprotocol.io, provides a lightweight, capability-oriented RPC layer that already addresses this problem for general LLM applications.¶
This document specifies a deterministic mapping of MCP roles, primitives, and security workflows onto the network management plane so that:¶
While the star-shaped Controller-to-Device model covers most brown-field deployments, some faults are visible only when two or more devices compare live data in real time. To address this we extend MCP to support Device-to-Device (D2D) troubleshooting collaboration. In this mode network elements autonomously form a transient "collaboration domain", exchange YANG/JSON-RPC calls, correlate results with an on-box LLM, and produce a signed report that can be retrieved by the controller once connectivity is restored. Security is maintained through mutual TLS, short-lived device certificates, and a white-list of neighbour-callable capabilities.¶
The result is an intent-based, conversational and secure automation framework that re-uses existing agents (NETCONF/RESTCONF/YANG, SNMP, CLI, gNMI, etc.) already present on devices instead of requiring new firmware.¶
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.¶
MCP terms such as "Tool", "Resource", "Prompt", "Client", "Server", "Host", and "Capability" are used as defined in the MCP specification dated 2025-06-18.¶
Intent: A high-level, declarative statement of desired network behaviour, expressed in natural language or structured YAML/JSON, that the controller must translate into concrete device actions.¶
Resources are exposed under the URI scheme mcp://<device>/<yang-module>:<path>. Reading a resource returns YANG JSON-encoded data per [RFC7951]. Examples:¶
mcp://router1/ietf-interfaces:interfaces/interface=eth0¶
mcp://router1/openconfig-bgp:bgp/neighbors/neighbor=1.1.1.1¶
Servers SHOULD support the "content-id" header to enable E-tags for caching.¶
Tools are mapped to well-known RPC operations already exposed by devices via NETCONF/RESTCONF/YANG, gNMI, or CLI. A Tool is described by an OpenAPI 3.0 Operation Object and MUST be idempotent when possible.¶
Example Tool schema (ping):¶
{
"name": "ping",
"description": "Execute ICMP echo probe",
"inputSchema": {
"type": "object",
"properties": {
"destination": { "type": "string" },
"count": { "type": "integer", "default": 5 },
"source": { "type": "string" }
},
"required": ["destination"]
}
}
Prompts are reusable prompt templates stored on the device. They allow vendors or operators to encode golden troubleshooting workflows in natural language. A prompt MAY contain variable placeholders such as "{{interface}}" that the client fills in before sending to an LLM.¶
If the client advertises the "sampling" capability, the server MAY request LLM inference on behalf of the device. This is useful for recursive troubleshooting where the device needs to ask clarifying questions. All sampling requests MUST be approved by the human operator via explicit consent UI.¶
+----------------------------------+
| Human Operator (Chat UI, Web) |
+-----------------+----------------+
| Intent (natural lang.)
v
+----------------------------------+
| Controller / Orchestrator |
| (MCP Client + LLM Host) |
+-----------------+----------------+
| JSON-RPC 2.0 over TLS
v
+----------------------------------+
| Network Device |
| (MCP Server) |
+----------------------------------+
MCP Server: Runs on or proxied in front of the network element. Exposes:¶
MCP Client: Runs inside the controller. Maintains a persistent JSON-RPC 2.0 connection to each server. Optionally hosts an LLM that reasons about the data returned by servers.¶
On connection establishment, the client and server exchange capability objects as defined in MCP. Servers list their supported YANG modules [RFC8525], CLI command sets, and Tool schemas encoded in OpenAPI 3.0. Clients list optional features such as "sampling" (LLM recursion) or "roots" (URI scoping).¶
The following steps illustrate the flow:¶
ping (Tool) from router1 to 10.2.2.2show interfaces (Resource) on router2¶
/openconfig-bgp:bgp/neighbors/neighbor=1.1.1.1/state
-/ietf-interfaces:interfaces/interface=loopback0¶
Some root causes are scattered across several nodes (e.g., unidirectional fiber, one-way ACL, single-sided BFD Down, localized SRv6 SID failure). Although the controller can collect data centrally, the north-bound link may be impaired, time-synchronisation is costly, and uploading bulk data is expensive. Allowing nearby devices to form a transient "trusted collaboration domain" and exchange data, correlate and infer root causes locally can significantly shorten MTTR.¶
+-------------+ +-------------+ +-------------+
| Device A | | Device B | | Device C |
| MCP Client | | MCP Client | | MCP Client |
| + Server |<----->| + Server |<----->| + Server |
+-------------+ +-------------+ +-------------+
| | |
| MCP/JSON-RPC 2.0 | MCP/JSON-RPC 2.0 |
| over mutual-TLS | over mutual-TLS |
| | |
The following steps illustrate SRv6 packet-loss localization between three routers: PE-1 (head-end), P-2 (mid-point), and PE-3 (tail-end). The same pattern can be applied to any multi-box fault domain.¶
Step-0: Intent Injection¶
An operator types the following natural-language goal in the chat window that is served by PE-1:¶
Intent: "Verify packet loss on the SRv6 path from PE-1 to PE-3 via P-2"¶
PE-1 Collaboration Agent (CA) parses the Intent, extracts the SID list {PE-1, P-2, PE-3}, and starts the D2D workflow.¶
Step-1: Collaboration Domain Discovery¶
PE-1 CA sends an LLDP/IS-IS Extended TLV that contains:¶
+ mcp-d2d-port = 9514 + supported-capabilities = [ietf-srv6-ping, ietf-interface-counters] + cert-thumbprint = <SHA-256 of device certificate>¶
P-2 and PE-3 reply with their own TLV. All three nodes are now aware of each other's MCP endpoint and capabilities.¶
Step-2: Mutual TLS & Capability Negotiation¶
PE-1 opens a TLS 1.3 connection to P-2 and PE-3 on port 9514. Both sides send their X.509 device certificates and the MCP initialize message:¶
{ "jsonrpc": "2.0",
"method": "mcp/initialize",
"params": {
"protocolVersion": "2025-12",
"capabilities": [ "ietf-srv6-ping", "ietf-interface-counters" ]
},
"id": 1 }
¶
The responder echoes its own capability list. If the intersection is non-empty the session is marked authorised for those capabilities.¶
Step-3: Parallel Resource & Tool Calls¶
PE-1 CA schedules three operations in parallel:¶
Local call (no network RPC)¶
Tool: "ietf-srv6-ping"
Params: { Head=PE-1, Tail=PE-3, SID-List=[P-2, PE-3], Count=100 }
¶
RPC toward P-2¶
POST https://[P-2]:9514/mcp/tool/call
Body: { tool: "ietf-srv6-ping",
arguments: { Local=P-2, Tail=PE-3, Count=100 } }
¶
RPC toward PE-3¶
POST https://[PE-3]:9514/mcp/resource/read
Body: { uri: "urn:ietf:params:xml:ns:yang:ietf-interfaces/interfaces/interface=SID-Endpoint/statistics" }
¶
Each callee returns a JSON-RPC result plus an ed25519 signature covering the result body and a monotonic nonce. The nonce prevents replay if the controller later fetches the audit-log.¶
Step-4: Local Correlation & Inference¶
PE-1 CA feeds the three data sets into its on-box LLM with the prompt:¶
"Compare loss % from PE-1 vs P-2; if PE-1>0 and P-2=0, root cause is
in {PE-1→P-2 link, upstream ACL}; output a single sentence."
¶
Model output:¶
"Loss 3.2 % observed only on PE-1→P-2 direction; P-2→PE-1 0 %; suggest checking PE-1 egress ACL 2001."¶
The deterministic part of the reply (ACL 2001) is extracted as a structured fault hypothesis and stored in the candidate list.¶
Step-5: Roll-up Report & User Consent¶
PE-1 CA assembles the signed results, inference, and raw packets into an MCP resource:¶
URI: urn:ietf:params:xml:ns:yang:ietf-mcp/report/74e8
Body: { "creator": "PE-1",
"created": "2025-10-30T14:23:42Z",
"hypothesis": "PE-1 egress ACL 2001",
"evidence": [ <base64 encoded PCAP>, ... ],
"signatures": { "PE-1": <sig1>, "P-2": <sig2>, "PE-3": <sig3> } }
¶
If the operator is still on-line the CA presents the hypothesis and asks for explicit approval before any mitigating action (e.g., edit ACL) is executed. If the controller is reachable the report is pushed through the conventional star channel; otherwise it remains on-box until the next controller sync.¶
Failure Handling¶
If any D2D call times out or returns an authentication error, PE-1 CA marks that node "untrusted" and falls back to controller-based polling if available. All intermediate temporary states (e.g., cleared counters) are rolled back immediately to preserve atomicity.¶
JSON-RPC 2.0 is carried over TLS 1.3 [RFC8446] with TCP/443 or QUIC [RFC9000] as the underlying transport. Servers present X.509 device certificates; clients use mutual TLS with short-lived SPAKE2 [RFC9383] tokens to achieve zero-touch onboarding.¶
YANG data is encoded in JSON [RFC7951] unless the client explicitly requests XML.¶
Large tech-support files MAY be streamed via HTTP/2 chunked transfer and are integrity-protected by SHA-256 hashes delivered in the JSON-RPC result.¶
Servers MAY support the MCP "progress" notification to report percentage completion for long-running Tools such as "tracepath".¶
This section extends the security model defined in MCP with network-specific requirements.¶
All Tool invocations that change device state MUST be confirmed by an authenticated user via an out-of-band consent channel (e.g., click-through UI or signed JWT). The consent object includes:¶
Servers issue short-lived OAuth 2.0 [RFC6749] access tokens scoped to individual YANG subtrees or Tools. Tokens are bound to the mutual TLS channel to prevent replay.¶
When the controller hosts an LLM, the LLM is placed in a sandbox with no direct layer-3 reachability to devices. All interactions MUST traverse the MCP client to mitigate prompt-injection attacks.¶
Every JSON-RPC request and response is appended to an immutable audit trail (e.g., syslog [RFC5424] or IETF syslog over TLS). Servers include a "session-id" field to allow cross-device correlation.¶
This document requests IANA to register the following well-known URI:¶
Additionally, the YANG module "ietf-mcp" is requested to be added to the IETF YANG module registry with namespace "urn:ietf:params:xml:ns:yang:ietf-mcp".¶
--> {
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "ping",
"arguments": {
"destination": "10.2.2.2",
"count": 5,
"source": "192.0.2.1"
}
}
}
<-- {
"jsonrpc": "2.0",
"id": 1,
"result": {
"loss": 0,
"rtt": { "min": 2.1, "max": 2.3, "avg": 2.2 }
}
}
--> {
"jsonrpc": "2.0",
"id": 2,
"method": "resources/subscribe",
"params": {
"uri": "mcp://router1/ietf-interfaces:interfaces/interface=eth0",
"content-id": "etag-1234"
}
}
<-- {
"jsonrpc": "2.0",
"id": 2,
"result": {
"subscription-id": "sub-42",
"initial": { "admin-status": "up", "oper-status": "up" }
}
}