Tutorial · June 26, 2025 · 18 min watch

Extending LLMs with External Tools via Model Context Protocol

Learn how to extend Large Language Models with external tools using the Model Context Protocol (MCP). This comprehensive guide covers building MCP servers with FastMCP, integrating with LangChain agents, and creating a financial analysis bot for Slack with real-time data access.

MCP · Model Context Protocol · FastMCP · LangChain · LLM Extensions · External Tools · Python · FastAPI · Slack Integration · Financial Analysis · Yahoo Finance · SEC Filings

Large Language Models (LLMs) are powerful, but they often lack real-time information or the ability to interact with external systems. The Model Context Protocol (MCP) provides a standard way for LLMs to access external tools and resources through dedicated servers. This allows you to build applications where LLMs can fetch current data, perform actions, and integrate with other services.

The main point is that MCP enables LLMs to go beyond their training data by connecting them to custom functionalities exposed by MCP servers.

Key Takeaways

  • MCP servers act as bridges, exposing specific tools and resources that LLMs can call upon
  • Frameworks like FastMCP simplify building these servers and defining the tools and prompts they offer
  • Clients, such as LangChain agents using MultiServerMCPClient, can connect to MCP servers to utilize these external capabilities, enabling sophisticated applications like a financial analysis bot integrated with Slack

Why Connect LLMs to External Tools?

LLMs are excellent at understanding and generating text, but their knowledge is limited to their training data. They cannot, for example, tell you the current weather in a specific city or provide the latest stock earnings report unless that information happened to be in their training set, and even then it quickly goes out of date.

By using MCP, you can build servers that provide these missing capabilities. An LLM client, like Claude for Desktop or a custom agent, can then communicate with your MCP server to access real-time data or perform specific actions. This transforms a static LLM into a dynamic, interactive agent capable of much more.

We will look at how to build an MCP server and connect it to a client, using a financial analysis application integrated with Slack as a detailed example.

Building an MCP Server

An MCP server is essentially an application that exposes a set of tools and resources that an MCP client can discover and use. The protocol defines how client and server communicate, including transports such as streamable HTTP and Server-Sent Events (SSE) for streaming data.

A simple example is a weather server. This server could expose two tools:

  • get-alerts: To fetch severe weather alerts for a location
  • get-forecast: To get the weather forecast for a location

An LLM client, upon connecting to this server, would see these tools and understand how to call them based on the user's request (e.g., "What's the weather forecast for London tomorrow?").

Building an MCP server can be simplified using frameworks like FastMCP. FastMCP helps manage the definition and exposure of your tools, resources, and prompts.

Here's a basic example of creating a FastMCP server instance in Python:

python
from fastmcp import FastMCP

# Create a basic server instance
mcp = FastMCP(name="MyAssistantServer")

# You can also add instructions for how to interact with the server
mcp_with_instructions = FastMCP(
    name="HelpfulAssistant",
    instructions="""
    This server provides data analysis tools.
    Call get_average() to analyze numerical data.
    """,
)
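
To tie this back to the weather example above, here is a minimal sketch of how those two tools could be registered on a FastMCP server. The function bodies are placeholders rather than a real weather integration, so treat the signatures and return values as illustrative assumptions.

python
from fastmcp import FastMCP

mcp = FastMCP(name="WeatherServer")

@mcp.tool()
def get_alerts(location: str) -> str:
    """Fetch severe weather alerts for a location."""
    # Placeholder: a real implementation would query a weather API here.
    return f"No active alerts for {location}."

@mcp.tool()
def get_forecast(location: str, day: str = "today") -> str:
    """Get the weather forecast for a location."""
    # Placeholder forecast; swap in a real data source as needed.
    return f"Forecast for {location} ({day}): partly cloudy, 18°C."

if __name__ == "__main__":
    # By default mcp.run() serves the tools over stdio for local clients.
    mcp.run()

A connected client sees the tool names, their docstrings, and their typed parameters, which is what lets the LLM decide when and how to call them.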

A Financial Analysis Example

Let's dive into a more complex example: a financial analysis application. This application uses an MCP server to provide tools for fetching financial data from sources like Yahoo Finance and the SEC.

Defining Tools

The core of the financial server is a set of tools defined using the @mcp.tool() decorator from the fastmcp library. These tools are Python functions that perform specific tasks, like retrieving stock data or SEC filings.

Here are some examples of tools you might define:

python
import yfinance as yf
import requests
from bs4 import BeautifulSoup
from fastmcp import FastMCP

# The server instance that the tools below register against
mcp = FastMCP("FinancialMCP")

@mcp.tool()
def get_stock_summary(ticker: str) -> str:
    """ Get a basic stock summary using Yahoo Finance. """
    try:
        stock = yf.Ticker(ticker)
        hist = stock.history(period="5d")
        if hist.empty:
            return f"No recent data found for {ticker.upper()}."
        latest = hist.iloc[-1]
        # Format and return summary data
        return f"Summary for {ticker.upper()}: Close={latest['Close']:.2f}, Volume={latest['Volume']}"
    except Exception as e:
        return f"Error getting stock summary for {ticker.upper()}: {str(e)}"

@mcp.tool()
def get_sec_filings(ticker: str) -> list:
    """Get recent SEC filings for a ticker."""
    try:
        stock = yf.Ticker(ticker)
        filings = stock.get_news() # Using news as a proxy for filings in this simplified example
        sec_filings = [item for item in filings if 'sec filing' in item['title'].lower()]
        return sec_filings
    except Exception as e:
        return [{"error": str(e)}]

@mcp.tool()
def get_financial_statements(ticker: str) -> dict:
    """Get balance sheet, income statement, and cashflow for a ticker."""
    try:
        stock = yf.Ticker(ticker)
        return {
            "balance_sheet": stock.balance_sheet.to_dict() if hasattr(stock.balance_sheet, 'to_dict') else dict(stock.balance_sheet),
            "income_statement": stock.income_stmt.to_dict() if hasattr(stock.income_stmt, 'to_dict') else dict(stock.income_stmt),
            "cashflow": stock.cashflow.to_dict() if hasattr(stock.cashflow, 'to_dict') else dict(stock.cashflow)
        }
    except Exception as e:
        return {"error": str(e)}

@mcp.tool()
def summarize_filing(url: str) -> str:
    """Fetches an SEC filing from a URL, extracts main content, and summarizes."""
    try:
        res = requests.get(url)
        if res.status_code != 200:
            return f"Failed to fetch filing. HTTP status: {res.status_code}"
        # Using BeautifulSoup to extract text content
        soup = BeautifulSoup(res.content, 'html.parser')
        text_content = soup.get_text()
        # Simple summarization
        sentences = text_content.split('.')
        summary = '. '.join(sentences[:10]) if len(sentences) > 10 else text_content[:1000]
        return f"Summary of SEC filing at {url}:\n\n{summary.strip()}...\n\n(For a more detailed summary, consider using a larger language model.)"
    except Exception as e:
        return f"Error summarizing SEC filing from {url}: {str(e)}"

These functions encapsulate the logic for interacting with external APIs (like Yahoo Finance) or processing data (like summarizing a filing). The @mcp.tool() decorator makes them discoverable and callable by an MCP client.
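
Because the decorator registers each function on the server instance, you can exercise the tools in-process before wiring up a full agent. The snippet below is a quick sketch using FastMCP's bundled Client pointed directly at the mcp object defined above; the ticker is just an example value.

python
import asyncio
from fastmcp import Client

async def smoke_test():
    # Connect an in-memory client directly to the server instance defined above.
    async with Client(mcp) as client:
        tools = await client.list_tools()
        print("Exposed tools:", [tool.name for tool in tools])
        result = await client.call_tool("get_stock_summary", {"ticker": "AAPL"})
        print(result)

asyncio.run(smoke_test())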

Defining Prompts

MCP servers can also expose prompts using the @mcp.prompt() decorator. These are pre-defined messages or instructions that can guide the LLM's behavior or generate specific types of requests.

For the financial application, you might have prompts like:

python
# These prompts register against the same `mcp` instance defined earlier.
# FastMCP prompts can also return structured Message/PromptMessage objects;
# the examples below simply return plain strings.
from fastmcp.prompts.prompt import Message, PromptMessage, TextContent

@mcp.prompt()
def prompt_stock_summary(ticker: str) -> str:
    """Generates a user message requesting a stock summary for a given ticker."""
    return f"Please provide a comprehensive summary for the stock '{ticker}'. Include recent price, volume, and key financial metrics."

@mcp.prompt()
def slack_mrkdwn(user_query: str) -> str:
    """System prompt for Slack: instructs the agent to use Slack mrkdwn formatting."""
    return (
        "You are a helpful financial assistant responding in a Slack channel.\n"
        "All your responses must use Slack's mrkdwn formatting for clarity and readability.\n"
        "- Use *bold* for key figures and headings.\n"
        "- Use _italic_ for emphasis.\n"
        "- Use bullet points or numbered lists for lists.\n"
        "- Use code blocks (triple backticks) for tabular data or code.\n"
        "- Use inline code (single backticks) for short code or ticker symbols.\n"
        "- Use > for blockquotes when quoting text.\n"
        "- Format links as <https://example.com|display text>.\n"
        "- Never use HTML or non-Slack markdown.\n"
        "- Keep your answers concise and easy to read in Slack.\n\n"
        f"User query: {user_query}"
    )

The slack_mrkdwn prompt is particularly useful when integrating with platforms like Slack, ensuring the LLM formats its output correctly for that environment.

Running the Server

A FastMCP server can be run as a web application, often using a framework like FastAPI.

python
# server.py
import os
from fastmcp import FastMCP
from fastapi import FastAPI, Request
from starlette.responses import PlainTextResponse
import uvicorn

mcp = FastMCP("FinancialMCP") # Instantiate your FastMCP server

# Mount the MCP application onto a FastAPI app
mcp_app = mcp.http_app(path='/mcp')
app = FastAPI(lifespan=mcp_app.lifespan)
app.mount("/mcp-server", mcp_app) # Mount at a specific path

@mcp.custom_route("/health", methods=["GET"])
async def health_check(request: Request) -> PlainTextResponse:
    return PlainTextResponse("OK")

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 8005))
    uvicorn.run(app, host="0.0.0.0", port=port)

This sets up a web server (running on port 8005 by default) that exposes the MCP endpoints, allowing clients to connect and interact with the defined tools and prompts.
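
Once the server is running (python server.py), it is worth a quick sanity check before pointing an agent at it. A minimal sketch, assuming the default port and the /mcp-server mount path from server.py above:

python
import requests

# The /health route is registered on the mounted MCP app, so it lives under /mcp-server.
resp = requests.get("http://localhost:8005/mcp-server/health")
print(resp.status_code, resp.text)  # Expected: 200 OK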

Client Interaction with LangChain

An LLM application acts as the client that connects to the MCP server. Using a framework like LangChain simplifies building agents that can utilize these external tools. The langchain_mcp_adapters library provides the necessary components.

python
# client.py
from langchain_mcp_adapters.client import MultiServerMCPClient
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
import asyncio
import sys

# Update the URL and port to match your running server
MCP_SERVER_URL = "http://localhost:8005/mcp-server/mcp"

# Initialize the MultiServerMCPClient to connect to your server
client = MultiServerMCPClient({
    "FinancialMCP": {
        "url": MCP_SERVER_URL,
        "transport": "streamable_http",
    }
})

async def ask_agent(question: str, origin: str = "cli") -> str:
    # If origin is slack, append formatting instructions
    if origin == "slack":
        question = f"{question}\n\nPlease format your response for Slack using markdown formatting where appropriate (bold, italics, code blocks, etc.) and keep it concise and readable."

    async with client.session("FinancialMCP") as session:
        # Load tools exposed by the connected MCP server
        tools = await load_mcp_tools(session)
        # Create a LangChain agent with the loaded tools
        agent = create_react_agent("openai:gpt-4.1", tools)

        # Invoke the agent with the user's question
        prompt_messages = [{"role": "user", "content": question}]
        response = await agent.ainvoke({"messages": prompt_messages})

        # Extract the final AI message content
        if isinstance(response, dict) and "messages" in response:
            ai_messages = [msg for msg in response["messages"] if msg.__class__.__name__ == "AIMessage"]
            if ai_messages:
                return ai_messages[-1].content
        return str(response)

# For CLI testing
if __name__ == "__main__":
    question = " ".join(sys.argv[1:]) if len(sys.argv) > 1 else "Give me a summary of Apple and its latest earnings"
    print(asyncio.run(ask_agent(question)))

This client code connects to the "FinancialMCP" server, loads the tools it exposes, and then uses a LangChain agent powered by an LLM to process user queries. When a user asks a question that requires external data, the agent determines which tools to call, executes them via the MCP client, gets the results, and formulates a response.

Integrating with Slack

The financial analysis agent can be integrated into platforms like Slack. A user can interact with the agent directly in a Slack channel or direct message. The agent receives the user's message, processes it using the LangChain agent and the MCP server tools, and sends the formatted response back to Slack.

For example, a user might type in Slack: "Give me a rundown of the latest earnings reports from Salesforce, Apple, and Nvidia".

The ask_agent function, when called with origin="slack", appends instructions to format the response using Slack markdown. The agent then uses the get_sec_filings and potentially summarize_filing tools to retrieve and process the relevant earnings reports.
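
The Slack wiring itself can be done several ways; the sketch below uses Slack's Bolt for Python in Socket Mode and simply forwards each app mention to the ask_agent function from client.py. The handler name and environment variable names here are assumptions for illustration, not part of the original application.

python
# slack_app.py -- minimal sketch, assuming Bolt for Python running in Socket Mode
import asyncio
import os
import re

from slack_bolt.async_app import AsyncApp
from slack_bolt.adapter.socket_mode.async_handler import AsyncSocketModeHandler

from client import ask_agent  # the function defined in client.py above

app = AsyncApp(token=os.environ["SLACK_BOT_TOKEN"])

@app.event("app_mention")
async def handle_mention(event, say):
    # Strip the bot mention (e.g. "<@U12345>") before handing the text to the agent.
    question = re.sub(r"<@[^>]+>", "", event["text"]).strip()
    answer = await ask_agent(question, origin="slack")
    await say(text=answer, thread_ts=event.get("ts"))

async def main():
    handler = AsyncSocketModeHandler(app, os.environ["SLACK_APP_TOKEN"])
    await handler.start_async()

if __name__ == "__main__":
    asyncio.run(main())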

Monitoring Agent Activity

Tools like Prompt Circle can be used to trace the execution of the LangChain agent. This provides a visual view of the agent's LangGraph workflow, showing the steps the agent takes, including calls to the LLM and invocations of the tools exposed by the MCP server.

For example, a trace might show:

  • The initial user query
  • The LLM processing the query and deciding which tools to call
  • Calls to get_sec_filings for CRM, AAPL, and NVDA
  • Calls to summarize_filing for each retrieved document
  • The LLM processing the tool outputs to generate the final response

These traces are invaluable for debugging and understanding how the agent interacts with the MCP server and its tools.

Conclusion

The Model Context Protocol (MCP) provides a robust framework for extending the capabilities of Large Language Models by allowing them to interact with external tools and resources via dedicated servers.

By building MCP servers, for example using FastMCP, you can expose custom functionalities like fetching real-time data or integrating with third-party services. Clients, such as LangChain agents utilizing MultiServerMCPClient, can then connect to these servers and leverage these tools to fulfill complex user requests.

The financial analysis application integrated with Slack serves as a practical example, showcasing how an LLM agent can use MCP server tools to retrieve and process financial data from sources like Yahoo Finance and the SEC, delivering formatted results directly to a user in Slack. This architecture enables the creation of powerful, context-aware LLM applications that go far beyond the limitations of their initial training data.