Bridging the Gap: Understanding AI Agent Interoperability with A2A
Learn how to build interoperable AI agents using Google's Agent2Agent (A2A) protocol. This guide covers Agent Cards, Tasks, and Streaming, and walks through an example of building a financial research agent with Python and FastAPI.
Building AI agents is becoming increasingly common, but the rapid growth of different development frameworks has created a significant challenge: how do agents built using one framework communicate and collaborate with agents built using another? This lack of a standard way for agents to interact makes creating complex, multi-agent systems difficult.
Key Takeaways
- The proliferation of AI agent frameworks creates significant interoperability challenges
- Google's Agent2Agent (A2A) protocol provides a standardized way for agents to communicate
- Key components of A2A include Agent Cards (defining capabilities), Tasks (representing agent runs), and Streaming (for real-time updates)
Why Interoperability Matters
Imagine needing different specialized AI agents to work together on a complex problem, like a financial analyst agent needing data from a web scraping agent and insights from a market trend agent. If these agents are built on incompatible frameworks, getting them to share information and coordinate actions is a major hurdle.
A standard protocol for agent-to-agent communication is essential to unlock the full potential of AI agents and build more sophisticated applications. The A2A protocol aims to provide this standard, allowing agents to discover each other's capabilities and interact seamlessly.
Exploring the A2A Protocol
This guide will walk through the core concepts of the A2A protocol and demonstrate how to build a remote agent that can communicate using this standard.
Core Components of A2A
The A2A protocol defines several key components that enable agents to understand and interact with each other:
Agent Card
This is essentially a public profile for an agent. It describes the agent's identity, purpose, and capabilities. An Agent Card includes:
- Name: A human-readable name (e.g., "Investment Research Analyst")
- Description: A brief explanation of what the agent does
- URL: The endpoint where the agent can be accessed
- Skills: A list of specific functions or actions the agent can perform (e.g., "fund analysis", "stock analysis", "summarize filing")
- Capabilities: Information about how the agent communicates, such as whether it supports streaming or push notifications
Here's an example of what an Agent Card might contain:
```yaml
Name: Investment Research Analyst
Description: Researches stocks, funds and financial metrics
URL: http://financial-service-agent/v1/a2a
Skills:
  - fund analysis
  - stock analysis
  - company research
```
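To make the card's role concrete, here is a minimal sketch of how a client might inspect a card before delegating work. It uses a plain dictionary with simplified field names modeled on the example above, not the a2a library's types or the exact wire schema. `supports_skill` and `card_to_json` are hypothetical helpers introduced only for illustration.

```python
import json

# Illustrative card mirroring the example above. The field names are
# simplified assumptions, not the exact A2A wire schema.
AGENT_CARD = {
    "name": "Investment Research Analyst",
    "description": "Researches stocks, funds and financial metrics",
    "url": "http://financial-service-agent/v1/a2a",
    "skills": ["fund analysis", "stock analysis", "company research"],
    "capabilities": {"streaming": True, "pushNotifications": False},
}


def supports_skill(card: dict, skill: str) -> bool:
    """Return True if the card advertises the skill (case-insensitive)."""
    wanted = skill.strip().lower()
    return any(wanted == s.lower() for s in card.get("skills", []))


def card_to_json(card: dict) -> str:
    """Serialize the card as JSON, the form in which a server could publish it."""
    return json.dumps(card, indent=2)


if __name__ == "__main__":
    print(supports_skill(AGENT_CARD, "Stock Analysis"))
    print(supports_skill(AGENT_CARD, "weather forecast"))
```

A client that finds no matching skill can skip the agent without sending it a request, which is the point of publishing capabilities up front.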
Messages
Communication between agents and clients happens through structured messages. These include user messages (requests) and agent messages (responses, updates).
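As a rough illustration, a text message can be represented as a dictionary with a role, a list of parts, and a unique ID. The shape below follows the message payload used in the client code later in this guide. `make_text_message` and `message_text` are hypothetical helpers, not functions from the a2a library.

```python
from uuid import uuid4


def make_text_message(role: str, text: str) -> dict:
    """Build a message dict with a single text part."""
    if role not in ("user", "agent"):
        raise ValueError(f"unknown role: {role!r}")
    return {
        "role": role,
        "parts": [{"kind": "text", "text": text}],
        "messageId": uuid4().hex,
    }


def message_text(message: dict) -> str:
    """Concatenate the text parts of a message, ignoring other part kinds."""
    return "".join(
        part["text"] for part in message["parts"] if part.get("kind") == "text"
    )


if __name__ == "__main__":
    request = make_text_message("user", "Get the summary for AAPL.")
    print(request["role"], message_text(request))
```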
Tasks
Each time an agent is invoked to perform an action, it's treated as a distinct task. The A2A protocol helps manage the state and progress of these tasks.
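One way to picture task management is as a small state machine. The sketch below uses state names that, to the best of my knowledge, match those in the A2A specification, but the transition table is an assumption made for illustration and is not taken from the specification. The `Task` class is a hypothetical stand-in, not the a2a library's type.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(str, Enum):
    SUBMITTED = "submitted"
    WORKING = "working"
    INPUT_REQUIRED = "input-required"
    COMPLETED = "completed"
    FAILED = "failed"
    CANCELED = "canceled"


# Assumed transition table, for illustration only.
# States missing from this table are treated as terminal.
ALLOWED = {
    TaskState.SUBMITTED: {TaskState.WORKING, TaskState.CANCELED, TaskState.FAILED},
    TaskState.WORKING: {
        TaskState.INPUT_REQUIRED,
        TaskState.COMPLETED,
        TaskState.FAILED,
        TaskState.CANCELED,
    },
    TaskState.INPUT_REQUIRED: {TaskState.WORKING, TaskState.CANCELED},
}


@dataclass
class Task:
    task_id: str
    state: TaskState = TaskState.SUBMITTED
    history: list = field(default_factory=list)

    def transition(self, new_state: TaskState) -> None:
        """Move to new_state, recording the previous state in history."""
        if new_state not in ALLOWED.get(self.state, set()):
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.history.append(self.state)
        self.state = new_state


if __name__ == "__main__":
    task = Task("task-1")
    task.transition(TaskState.WORKING)
    task.transition(TaskState.COMPLETED)
    print(task.state.value, [s.value for s in task.history])
```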
Streaming
For long-running tasks or interactions where real-time feedback is important, A2A supports streaming, typically using Server-Sent Events (SSE). This allows the agent to send updates incrementally as it processes the request, providing a more responsive user experience and enabling human-in-the-loop scenarios.
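On the wire, an SSE stream is a sequence of text frames, each made of one or more `data:` lines followed by a blank line. The sketch below shows how a server could frame incremental updates this way. The payload shape (`state`, `content`) is an assumption for illustration, and `sse_frame` and `stream_updates` are hypothetical helpers, not part of the a2a library.

```python
import json
from collections.abc import Iterable, Iterator


def sse_frame(payload: dict) -> str:
    """Encode one update as a Server-Sent Events frame."""
    # A frame is a "data:" line followed by a blank line.
    return f"data: {json.dumps(payload)}\n\n"


def stream_updates(updates: Iterable[dict]) -> Iterator[str]:
    """Yield one SSE frame per update, in order."""
    for update in updates:
        yield sse_frame(update)


if __name__ == "__main__":
    updates = [
        {"state": "working", "content": "Fetching price history..."},
        {"state": "completed", "content": "AAPL summary ready."},
    ]
    for frame in stream_updates(updates):
        print(frame, end="")
```

Because each frame is self-delimiting, a client can display the "working" update as soon as it arrives instead of waiting for the final result.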
Building a Remote Agent with A2A
Let's look at how to build a remote agent that adheres to the A2A protocol using Python, FastAPI, and libraries like langgraph and yfinance. The example is an "Investment Research Analyst Agent".
Implementing Agent Tools
The agent needs to perform specific financial research tasks. These tasks are implemented as tools. Tools are functions that the agent can call to gather information or perform actions.
Here are some examples of tools for the Investment Research Analyst Agent, using the yfinance library to fetch financial data:
```python
import yfinance as yf
from langchain_core.tools import tool


@tool
def get_stock_summary(ticker: str) -> str:
    """Get a basic stock summary using Yahoo Finance.

    Args:
        ticker: The stock ticker symbol (e.g. "AAPL")
    """
    try:
        stock = yf.Ticker(ticker)
        # Fetch the last five days of history and report the most recent row.
        hist = stock.history(period="5d")
        if hist.empty:
            return f"No recent data found for {ticker.upper()}."
        latest = hist.iloc[-1]
        summary = (
            f"{ticker.upper()} Summary:\n"
            f"Close Price: ${latest['Close']:.2f}\n"
            f"Volume: {int(latest['Volume'])}\n"
            f"Date: {latest.name.date()}\n"
        )
        return summary
    except Exception as e:
        # Return the error as text so the agent can relay it to the user.
        return f"Error retrieving stock data for {ticker}: {str(e)}"


@tool
def get_sec_filings(ticker: str) -> list:
    """Get recent SEC filings for a ticker.

    Args:
        ticker: The stock ticker symbol (e.g., "AAPL").

    Returns:
        A list of SEC filings or an error message.
    """
    try:
        stock = yf.Ticker(ticker)
        filings = stock.get_sec_filings()
        # Normalize to a list of dicts whether a DataFrame or a plain sequence comes back.
        return filings.to_dict("records") if hasattr(filings, 'to_dict') else list(filings)
    except Exception as e:
        return [{"error": str(e)}]
```
Creating the Agent Class
The agent itself is implemented as a Python class, InvestmentResearchAnalystAgent. This class defines the agent's behavior, including its system instructions, the language model it uses (e.g., Google's Gemini or OpenAI), and the tools it has access to. It uses langgraph to orchestrate the agent's reasoning process.
```python
import os
from collections.abc import AsyncIterable
from typing import Any, Literal

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.prebuilt import create_react_agent
from pydantic import BaseModel

# In-memory checkpointer so each conversation thread keeps its history.
memory = MemorySaver()


class ResponseFormat(BaseModel):
    """Respond to the user in this format."""

    status: Literal['input_required', 'completed', 'error'] = 'input_required'
    message: str


class InvestmentResearchAnalystAgent:
    """InvestmentResearchAnalystAgent - a specialized assistant for investment research and financial analysis."""

    SYSTEM_INSTRUCTION = (
        'You are a specialized investment research analyst assistant focused on '
        'helping users analyze companies for investment decisions.\n'
        'You have access to various financial tools to gather comprehensive '
        'information about companies including:\n'
        'stock summaries, SEC filings, analyst recommendations, financial statements, and more.\n'
        'Your goal is to provide thorough, accurate financial analysis while '
        'maintaining professional objectivity.'
    )

    def __init__(self):
        # Choose the model provider from the environment; Google is the default.
        model_source = os.getenv("model_source", "google")
        if model_source == "google":
            self.model = ChatGoogleGenerativeAI(model='gemini-2.0-flash')
        else:
            self.model = ChatOpenAI(
                model=os.getenv("TOOL_LLM_NAME"),
                openai_api_key=os.getenv("API_KEY", "EMPTY"),
                openai_api_base=os.getenv("TOOL_LLM_URL"),
                temperature=0,
            )
        # The tools defined in the previous section.
        self.tools = [get_stock_summary, get_sec_filings]
        self.graph = create_react_agent(
            self.model,
            tools=self.tools,
            checkpointer=memory,
            prompt=self.SYSTEM_INSTRUCTION,
            response_format=ResponseFormat,
        )

    def process_stream_item(self, item: dict[str, Any]) -> dict[str, Any]:
        # Placeholder (an assumption; the original excerpt does not show this
        # method): report the latest message in the graph state as an update.
        message = item['messages'][-1]
        return {'is_task_complete': False, 'content': str(message.content)}

    async def stream(self, query, context_id) -> AsyncIterable[dict[str, Any]]:
        inputs = {'messages': [('user', query)]}
        config = {'configurable': {'thread_id': context_id}}
        # astream is the async counterpart of stream, as required by "async for".
        async for item in self.graph.astream(inputs, config, stream_mode='values'):
            # Handle streaming updates
            yield self.process_stream_item(item)
```
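The status field in ResponseFormat has to be translated into something an A2A client understands. The sketch below shows one possible mapping from the three status values to task states. The mapping, the `is_final` flag, and the `to_update` helper are all assumptions for illustration. The original example may handle this differently.

```python
from typing import Literal

Status = Literal["input_required", "completed", "error"]

# Assumed mapping from the agent's structured status to a task state name.
STATUS_TO_TASK_STATE = {
    "input_required": "input-required",
    "completed": "completed",
    "error": "failed",
}


def to_update(status: Status, message: str) -> dict:
    """Convert a structured agent response into a task update dict."""
    state = STATUS_TO_TASK_STATE[status]
    return {
        "state": state,
        # Only completed and failed end the task; input-required waits for the user.
        "is_final": state in ("completed", "failed"),
        "content": message,
    }


if __name__ == "__main__":
    print(to_update("input_required", "Which ticker should I analyze?"))
    print(to_update("completed", "AAPL closed higher today."))
```

Keeping "input-required" non-final is what lets the same task continue once the user answers the agent's question.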
Setting Up the FastAPI Server
The agent needs to be served so other applications or agents can interact with it. This is done using FastAPI with A2A protocol support:
```python
import os

import uvicorn
from fastapi import FastAPI

from a2a.server.apps import A2AStarletteApplication
from a2a.types import AgentCard, AgentCapabilities, AgentSkill

# Define agent skills
skills = [
    AgentSkill(
        id='get_stock_summary',
        name='Get Stock Summary',
        description='Gets a basic stock summary using Yahoo Finance.',
        tags=['stock', 'summary'],
        examples=['Get the summary for AAPL.'],
    ),
    AgentSkill(
        id='summarize_filing',
        name='Summarize SEC Filing',
        description='Fetches and summarizes an SEC filing from a given URL.',
        tags=['SEC', 'filing', 'summary'],
        examples=['Summarize this SEC filing: <URL>.'],
    ),
]

# Define agent capabilities
capabilities = AgentCapabilities(streaming=True, pushNotifications=True)

# Define the Agent Card
agent_card = AgentCard(
    name='Investment Research Analyst Agent',
    description='A specialized assistant for investment research and financial analysis',
    url='http://localhost:10000/',
    version='1.0.0',
    # Content types the agent accepts and returns (assumed to be plain text here).
    defaultInputModes=['text'],
    defaultOutputModes=['text'],
    capabilities=capabilities,
    skills=skills,
)

# Create FastAPI app and mount the A2A app.
# request_handler is not defined in this excerpt. It is the A2A request handler
# that connects incoming requests to the agent, and it must be created before this point.
app = FastAPI()
a2a_app = A2AStarletteApplication(agent_card=agent_card, http_handler=request_handler)
app.mount('/', a2a_app.build())

if __name__ == "__main__":
    port = int(os.environ.get("PORT", 10000))
    uvicorn.run("app.api_server:app", host="0.0.0.0", port=port, reload=True)
```
Interacting with an A2A Agent
A client application can interact with this remote agent using the A2A protocol. The A2A library provides client components like A2AClient and A2ACardResolver to simplify this process.
The client first needs to retrieve the agent's Agent Card to understand its capabilities and URL. Then, it can send messages to the agent and process the responses, including handling streamed updates.
```python
import os
from uuid import uuid4

import httpx

from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest


async def ask_agent(user_text: str) -> dict:
    base_url = os.environ.get("AGENT_SERVER_URL", "http://localhost:10000")
    async with httpx.AsyncClient() as httpx_client:
        # Resolve the agent card
        resolver = A2ACardResolver(
            httpx_client=httpx_client,
            base_url=base_url,
        )
        agent_card = await resolver.get_agent_card()

        # Create an A2A client
        client = A2AClient(httpx_client=httpx_client, agent_card=agent_card)

        # Prepare and send the message
        request = SendMessageRequest(
            id=str(uuid4()),
            params=MessageSendParams(
                message={
                    'role': 'user',
                    'parts': [{"kind": "text", "text": user_text}],
                    'messageId': uuid4().hex,
                }
            ),
        )
        response = await client.send_message(request)
        return {"response": str(response)}


# Example usage (requires an async context)
# import asyncio
# async def main():
#     response = await ask_agent("Get me Apple stock summary")
#     print(response)
# asyncio.run(main())
```
This client code demonstrates the basic steps:
- Resolving the agent card to understand capabilities
- Creating a client for communication
- Sending a message with the user's request
- Processing the response from the agent
For a streaming interaction, the client would need to handle the SSE stream from the server.
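To give a sense of what handling the SSE stream involves, here is a minimal sketch of a parser that turns raw SSE lines into decoded JSON payloads. It is independent of the a2a library, which may already provide its own streaming client. `parse_sse` is a hypothetical helper, and the payload shape in the usage example is an assumption.

```python
import json
from collections.abc import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one decoded JSON payload per SSE event.

    Only "data:" fields are handled. Comment lines (starting with ":")
    and other fields are ignored.
    """
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            # Strip the field name and one optional leading space.
            value = line[5:]
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data joins with newlines.
            yield json.loads("\n".join(data_lines))
            data_lines = []


if __name__ == "__main__":
    raw_stream = [
        'data: {"state": "working", "content": "Fetching data..."}',
        "",
        ": keep-alive",
        'data: {"state": "completed", "content": "Done."}',
        "",
    ]
    for event in parse_sse(raw_stream):
        print(event["state"], event["content"])
```

In a real client, the lines would come from the HTTP response body as it arrives, so each update can be shown to the user before the task finishes.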
The full code for this example is available on GitHub at https://github.com/hollaugo/tutorials in the agent2agent directory.
Conclusion
The rise of AI agents across various frameworks necessitates a standard approach to interoperability. The A2A protocol addresses this by providing a structured way for agents to:
- Define their capabilities (Agent Cards)
- Manage interactions (Tasks)
- Provide real-time feedback (Streaming)
Building agents that adhere to this protocol, as demonstrated with the Investment Research Analyst Agent example using Python, FastAPI, and the A2A library, is a crucial step towards creating a more connected and collaborative ecosystem of AI agents.
While the field is still evolving and documentation can sometimes be a challenge, adopting such protocols is key to building scalable and interoperable AI applications. The A2A standard provides a foundation for the future of multi-agent systems where specialized agents can work together seamlessly to solve complex problems.