Giving Superpowers to Small Language Models with Model Context Protocol (MCP)

2025-02-26 in Artificial Intelligence tagged LLM / AI Agent / SLM / Model Context Protocol (MCP) / Kubernetes / OpenShift by Marc Nuri | Last updated: 2025-03-01

Introduction

Large Language Models (LLMs) have revolutionized how we interact with artificial intelligence (AI), but their intelligence comes at a cost. The best models handle complex operations seamlessly because they have deeply embedded knowledge of programming, APIs, and structured data formats. That same depth makes them expensive to run: they require significant computational resources, which translates into high operational costs.

What if we could give smaller, more efficient LLMs the same superpowers without requiring them to know everything upfront? Enter Model Context Protocol (MCP) servers, a game-changing approach that equips small language models (SLMs) with "superpowers" by abstracting away complexity.

The Challenge: Knowledge vs. Efficiency

When an LLM is asked to perform a task, it needs to understand the syntax, structure, and nuances of that task.

For example:

  • Creating a Spotify playlist requires knowledge of the Spotify API, the playlist creation process, and crafting the appropriate HTTP API request.
  • Downloading a file from Google Drive requires knowledge of the Google Drive API and the file download process.
  • Writing an SQL query requires knowledge of the database schema and the SQL query syntax.
  • Managing Kubernetes resources requires knowledge of the Kubernetes API and the resource configuration format (YAML).

Larger LLMs can handle many of these tasks because they've been trained on extensive datasets with detailed technical knowledge. However, this comes at the expense of high computational requirements, making them impractical for many use cases.

Smaller models, on the other hand, struggle because they lack the necessary embedded knowledge. They might not know the intricacies of crafting an HTTP request, specific product APIs, writing an optimized SQL query, or configuring a Kubernetes deployment. But what if we could bridge that gap?

MCP Servers: The Missing Piece

MCP servers act as middleware between the LLM and the real world. Instead of requiring the model to generate precise low-level instructions, MCP servers can provide tools that abstract complex actions.

These tools can operate at different levels of abstraction:

1. Low-level Tools

Provide basic building blocks for common tasks, such as sending an HTTP request or performing an SQL query. These still require the model to have detailed knowledge of the task at hand.

For example, an MCP server might provide a tool to perform a SELECT query on a database, but the model must know the SQL syntax and the database schema to use it correctly.

These tools are better suited for LLMs with some domain-specific knowledge.
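
To make this concrete, below is a minimal sketch in Go of what the handler behind such a low-level tool could look like (the function and its wiring are hypothetical and not tied to any particular MCP SDK). Note how the server only executes whatever SQL the model produces; all schema and syntax knowledge stays on the model's side:

query_tool.go
// A sketch of a low-level tool handler: the model must supply a
// complete, valid SQL statement, and the server merely runs it.
package tools

import (
	"context"
	"database/sql" // a driver (e.g. sqlite, postgres) must be registered by the caller
	"encoding/json"
)

// queryTool executes the model-generated SQL and returns the rows as JSON.
func queryTool(ctx context.Context, db *sql.DB, query string) (string, error) {
	rows, err := db.QueryContext(ctx, query)
	if err != nil {
		return "", err // e.g. the model produced invalid SQL
	}
	defer rows.Close()

	cols, err := rows.Columns()
	if err != nil {
		return "", err
	}
	var results []map[string]any
	for rows.Next() {
		values := make([]any, len(cols))
		ptrs := make([]any, len(cols))
		for i := range values {
			ptrs[i] = &values[i]
		}
		if err := rows.Scan(ptrs...); err != nil {
			return "", err
		}
		row := map[string]any{}
		for i, c := range cols {
			row[c] = values[i]
		}
		results = append(results, row)
	}
	if err := rows.Err(); err != nil {
		return "", err
	}
	out, err := json.Marshal(results)
	return string(out), err
}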

2. High-level Tools

Abstract away the complexity of a task, allowing the model to focus on the intent rather than the implementation.

For example, an MCP server might provide a tool to fetch a user's profile information from a database without requiring the model to know the SQL syntax or the database schema.

These tools are better suited for SLMs.
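
As a sketch of the high-level counterpart (the users table and its schema are hypothetical), note how the SQL and the schema now live in the server, and the model only expresses the intent "fetch this user's profile":

user_profile_tool.go
// A sketch of a high-level tool: the model supplies only a user ID;
// the SQL syntax and the (hypothetical) database schema are server-side.
package tools

import (
	"context"
	"database/sql"
	"encoding/json"
)

// UserProfile is the structured result handed back to the model.
type UserProfile struct {
	ID    int64  `json:"id"`
	Name  string `json:"name"`
	Email string `json:"email"`
}

// getUserProfile hides the query behind an intent-level interface.
func getUserProfile(ctx context.Context, db *sql.DB, userID int64) (string, error) {
	var p UserProfile
	err := db.QueryRowContext(ctx,
		"SELECT id, name, email FROM users WHERE id = ?", userID,
	).Scan(&p.ID, &p.Name, &p.Email)
	if err != nil {
		return "", err
	}
	out, err := json.Marshal(p)
	return string(out), err
}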

By offloading complexity to MCP servers, even smaller language models can achieve expert-level performance on tasks they were never explicitly trained for.

A Real-World Example: Deploying to Kubernetes

Let's consider a common scenario: deploying a containerized application to a Kubernetes cluster. According to the abstraction levels I defined earlier, there are two ways to approach this task with an MCP server:

1. Low-Level Approach

The MCP server can provide a low-level tool or function resources_create_or_update(yaml string) that lets the model create or update Kubernetes resources by providing their raw YAML manifest.

On the server side, the implementation of this tool is extremely simple:

resources_create_or_update.pseudo
// pseudo-code
func resources_create_or_update(yamlManifest string) {
  // Parse the raw manifest into Kubernetes resource objects
  resources := parseYAML(yamlManifest)
  // Apply the parsed resources to the cluster
  kubernetes.Apply(resources)
}

However, the model needs to fully understand the Kubernetes YAML syntax to use this tool effectively. It would need to know how to define a deployment, a service, and an ingress, as well as how to configure them correctly.
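
To make that gap concrete, here is roughly the input the model would have to generate to call the low-level tool, shown as a hypothetical minimal manifest embedded in a Go string. Every apiVersion, field name, and indentation level is the model's responsibility to get right:

model_input.go
package tools

// manifest is the raw argument the model must produce for
// resources_create_or_update: a complete, well-formed Kubernetes manifest.
const manifest = `apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:latest
          ports:
            - containerPort: 8080
`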

2. High-Level Approach

The MCP server provides a higher-level tool or function deploy_application(image string) that abstracts away the complexity of deploying an application to Kubernetes.

On the server side, the implementation of this tool is more involved:

deploy_application.pseudo
// pseudo-code
func deploy_application(image string) {
  // Fetch image details from a registry (exposed ports, environment variables, etc.)
  imageData := fetchImage(image)
  // Create a Kubernetes Deployment based on the retrieved image data
  deployment := createDeployment(imageData)
  // Create a Service if the image exposes a port
  service := createService(imageData)
  // Create an Ingress to route external traffic to the Service, if needed
  ingress := createIngress(imageData)
  // Apply all of the generated resources to the cluster
  applyResources(deployment, service, ingress)
}

In this scenario, a small model with zero knowledge of Kubernetes can still deploy applications effortlessly to a Kubernetes cluster. It's the MCP server that performs the heavy lifting, allowing the model to focus on the high-level task of deploying an application.
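
For completeness, here is a rough sketch of how such a high-level tool could be exposed over MCP from Go. It assumes the community mark3labs/mcp-go library (whose API may differ between versions), and the deployApplication stub stands in for the pseudo-code above:

mcp_server.go
// A sketch of an MCP server exposing deploy_application over stdio.
// Assumes github.com/mark3labs/mcp-go; the exact API may vary by version.
package main

import (
	"context"
	"fmt"

	"github.com/mark3labs/mcp-go/mcp"
	"github.com/mark3labs/mcp-go/server"
)

// deployApplication is a placeholder for the pseudo-code shown above.
func deployApplication(image string) error {
	// fetch image metadata, create Deployment/Service/Ingress, apply them...
	return nil
}

func main() {
	s := server.NewMCPServer("kubernetes-helper", "0.0.1")

	tool := mcp.NewTool("deploy_application",
		mcp.WithDescription("Deploy a container image to the Kubernetes cluster"),
		mcp.WithString("image",
			mcp.Required(),
			mcp.Description("Container image to deploy, e.g. nginx:latest"),
		),
	)

	s.AddTool(tool, func(ctx context.Context, req mcp.CallToolRequest) (*mcp.CallToolResult, error) {
		image, ok := req.Params.Arguments["image"].(string)
		if !ok {
			return mcp.NewToolResultError("image must be a string"), nil
		}
		if err := deployApplication(image); err != nil {
			return mcp.NewToolResultError(err.Error()), nil
		}
		return mcp.NewToolResultText(fmt.Sprintf("Deployed %s", image)), nil
	})

	// Serve over stdio so any MCP client can connect and call the tool
	if err := server.ServeStdio(s); err != nil {
		fmt.Println("Server error:", err)
	}
}

An SLM-backed agent connected to this server only needs to issue a tools/call for deploy_application with an image name; everything Kubernetes-specific happens server-side.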

Why This Matters

  • Reduces LLM training costs: By leveraging MCP servers, smaller models can achieve expert-level performance without extensive training on domain-specific knowledge.
  • Reduces operational costs: Smaller models are more cost-effective to run, making AI automation more accessible to a wider audience.
  • Makes AI more accessible: Even lightweight models can perform complex tasks, making powerful AI accessible in resource-constrained environments.
  • Improves efficiency: Instead of generating verbose code or structured data, models can interact with high-level tools, reducing error rates and execution time.

Tip

Considering the pricing (at the time of writing) of OpenAI GPT-4o vs. GPT-4o mini:

  • Input tokens:
    • 4o: $2.50/1M tokens
    • 4o-mini: $0.15/1M tokens
  • Output tokens:
    • 4o: $10.00/1M tokens
    • 4o-mini: $0.60/1M tokens

Per million tokens, the larger model costs 16.7x more for input ($2.50 / $0.15) and 16.7x more for output ($10.00 / $0.60), so running GPT-4o is almost 17x as expensive as GPT-4o mini.

Conclusion

MCP servers are transforming how we think about AI capabilities by bridging the gap between small and large language models. Instead of building ever-larger models with increasingly unsustainable costs, we can focus on making existing systems smarter through collaboration between LLMs and intelligent infrastructure like MCP servers.

As the MCP ecosystem evolves, the line between "small" and "large" models will blur. The real power won't come from the LLM itself but from the tools and context we provide. With MCP, even the smallest models can become AI superheroes.

You Might Also Like

  • Model Context Protocol (MCP) introduction
  • What is a Small Language Model (SLM)?
  • Introducing Goose, the on-machine AI agent
  • Kubernetes MCP Server: MCP server for Kubernetes that provides high-level and low-level tools for interacting with a Kubernetes or OpenShift cluster.