Here's the hard truth: while we've made incredible progress building individual AI agents, the infrastructure that would allow them to collaborate effectively still doesn't exist. This is a massive blind spot. The real power of agents isn't in what they can do alone, but in how they can work together, just as humans do.

If you think about it, no single human expert can design and create a complex new material, diagnose and remedy a difficult medical condition, or build and deploy an enterprise app stack end-to-end. We collaborate with specialists. The same principle applies to AI agents, except we're missing the crucial "collaboration layer."

Without standardized ways for agents to discover each other, communicate, and coordinate, we're essentially trying to build the web without RPCs, HTTP, DNS, or TCP/IP. We're hitting a wall that isn't just slowing progress; it's fundamentally limiting what's possible. We need an Internet of Agents.

The four phases of multi-agent application development

After talking to hundreds of developers building multi-agent systems, we've identified four critical phases – Discover, Compose, Deploy and Evaluate – where standardized infrastructure is desperately needed.

Phase 1: Discover

When a developer wants to build multi-agent software, their first questions are:

Where can I find the best agents, both internal and third-party, for this workflow?
How do I know if they're effective and reputable?
Will they be compatible with each other?

Today, there's no good answer to these questions. There is no good agent discovery tool, no standardized way to enable or verify compatibility, and no reputation system. What developers need are components that let them describe agent capabilities in a common format and that make publishing and discovering agents simple and reliable.
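To make the idea of describing and discovering agent capabilities concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `AgentManifest` fields, the `AgentRegistry` class, the agent names, and the numeric `rating` used as a stand-in for reputation are hypothetical and do not come from any existing standard or product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AgentManifest:
    """Hypothetical capability description that an agent publishes."""
    name: str
    skills: frozenset[str]
    input_schema: str    # assumed label for the accepted input format
    output_schema: str   # assumed label for the produced output format
    rating: float = 0.0  # stand-in for a reputation score (0.0 to 5.0)


class AgentRegistry:
    """In-memory stand-in for an agent discovery service."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentManifest] = {}

    def publish(self, manifest: AgentManifest) -> None:
        # Publishing again under the same name replaces the old manifest.
        self._agents[manifest.name] = manifest

    def find(self, skill: str, min_rating: float = 0.0) -> list[AgentManifest]:
        # Return agents that advertise the skill, best rated first.
        matches = [
            agent for agent in self._agents.values()
            if skill in agent.skills and agent.rating >= min_rating
        ]
        return sorted(matches, key=lambda agent: agent.rating, reverse=True)


registry = AgentRegistry()
registry.publish(AgentManifest("summarizer-a", frozenset({"summarize"}), "documents", "summary", 4.2))
registry.publish(AgentManifest("summarizer-b", frozenset({"summarize", "translate"}), "documents", "summary", 4.7))
registry.publish(AgentManifest("summarizer-c", frozenset({"summarize"}), "documents", "summary", 3.1))

candidates = [agent.name for agent in registry.find("summarize", min_rating=4.0)]
```

A real discovery layer would also need to verify who published a manifest and whether its claims hold up, which this sketch leaves out.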

Phase 2: Compose

Once a developer has identified suitable agents, they need to stitch them together to solve for the use case at hand:

How do I create an agent call graph that will provide the outcomes I'm looking for?
How do I ensure each agent has the appropriate identity and access?
How do I ensure they can understand each other's inputs and outputs, especially when created on different development frameworks?
How do I interface with existing infrastructure and systems that pre-date AI agents?

This is where things get technically complex quickly. Developers need standard protocols for inter-agent communication across frameworks, function calls and tooling for accessing existing systems, and workflow servers for defining and managing agentic workflows, along with schema extensions that define how agents from multiple frameworks, vendors, and organizations interact.
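One of the composition questions above, whether agents can understand each other's inputs and outputs, can be sketched as a simple check over a call graph. This is an illustrative sketch only: the `AgentSpec` type, the schema labels, and the agent names are hypothetical, and schema compatibility is reduced to an exact match of labels, which is far cruder than what a real protocol would need.

```python
from typing import NamedTuple


class AgentSpec(NamedTuple):
    """Hypothetical input/output declaration for one agent."""
    input_schema: str
    output_schema: str


def incompatible_edges(agents, edges):
    """Return the (source, target) edges whose schemas do not line up.

    An edge is treated as compatible only when the source agent's output
    schema label equals the target agent's input schema label.
    """
    return [
        (source, target)
        for source, target in edges
        if agents[source].output_schema != agents[target].input_schema
    ]


# Hypothetical agents, possibly built on different frameworks.
agents = {
    "retriever": AgentSpec(input_schema="query", output_schema="documents"),
    "summarizer": AgentSpec(input_schema="documents", output_schema="summary"),
    "reviewer": AgentSpec(input_schema="report", output_schema="verdict"),
}

# The call graph as a list of directed edges.
call_graph = [("retriever", "summarizer"), ("summarizer", "reviewer")]

problems = incompatible_edges(agents, call_graph)
```

Here the check flags the edge from the summarizer to the reviewer, because the reviewer expects a "report" and receives a "summary". Catching that at composition time is cheaper than discovering it in production.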

Phase 3: Deploy

Now that the multi-agent app is ready, the developer needs to run the agentic workflow, which introduces another set of challenges:

How do I set up scalable, efficient, real-time connectivity for agent-to-agent, agent-to-model, agent-to-data, agent-to-infrastructure, and human-in-the-loop communication?
How do I ensure the underlying cloud (or on-premises) infrastructure and functions or tooling are set up appropriately?
How do I handle transformations between agent frameworks, input schemas, and probabilistic outcomes in real-time?

Developers cobble together custom solutions for each of these problems. What's needed are agent gateways that enable scalable agent-agent communication on existing infrastructure, and components for translating between different agentic frameworks and probabilistic outcomes.
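As a rough illustration of what a gateway that translates between frameworks might do, the sketch below routes a message to a target agent and converts the message format on the way when the sender and receiver differ. The `AgentGateway` class, the format names `framework_a` and `framework_b`, and the message fields are all hypothetical, and everything runs in a single process, so none of the scalability or real-time concerns are addressed.

```python
from typing import Any, Callable

Message = dict[str, Any]
Translator = Callable[[Message], Message]
Handler = Callable[[Message], Any]


class AgentGateway:
    """Hypothetical in-process gateway that routes and translates messages."""

    def __init__(self) -> None:
        self._translators: dict[tuple[str, str], Translator] = {}
        self._agents: dict[str, tuple[str, Handler]] = {}

    def register_translator(self, source_format: str, target_format: str, fn: Translator) -> None:
        self._translators[(source_format, target_format)] = fn

    def register_agent(self, name: str, message_format: str, handler: Handler) -> None:
        self._agents[name] = (message_format, handler)

    def send(self, target: str, message: Message, message_format: str) -> Any:
        target_format, handler = self._agents[target]
        if message_format != target_format:
            key = (message_format, target_format)
            if key not in self._translators:
                raise ValueError(f"no translator from {message_format} to {target_format}")
            message = self._translators[key](message)
        return handler(message)


gateway = AgentGateway()

# Assumed formats: framework_a uses a "content" field, framework_b uses "text".
gateway.register_translator("framework_a", "framework_b", lambda m: {"text": m["content"]})

# A toy agent that speaks framework_b and upper-cases whatever it receives.
gateway.register_agent("shouter", "framework_b", lambda m: m["text"].upper())

reply = gateway.send("shouter", {"content": "hello"}, message_format="framework_a")
```

Keeping translators keyed by a (source, target) pair means each agent only has to know its own format, at the cost of needing one translator per pair of formats that must interoperate.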

Phase 4: Evaluate

Finally, once the multi-agent app is running, the developer needs to evaluate and evolve the application:

How do I measure outcomes not just for each agent but for the entire call graph?
How do I identify and resolve bottlenecks in complex agentic workflows?
How do I fine-tune the agentic call graph based on real-world performance?

This requires new and extended definitions of observability and evaluation that cover entire agentic workflows, not just individual agents. Developers need an observability framework for monitoring and evaluating multi-agent systems, with feedback mechanisms to publish updated agent capabilities and improve performance.
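To show the difference between per-agent and whole-graph measurement, here is a small sketch that aggregates trace records from several runs of a call graph. The record layout (run id, agent name, latency in milliseconds, success flag) and the sample numbers are invented for illustration; a real observability framework would collect richer data and would not be limited to latency.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records: (run_id, agent, latency_ms, succeeded)
spans = [
    ("run-1", "retriever", 120, True),
    ("run-1", "summarizer", 900, True),
    ("run-2", "retriever", 100, True),
    ("run-2", "summarizer", 1100, False),
]


def summarize(spans):
    """Compute per-agent mean latency, the slowest agent, and run-level success."""
    latencies = defaultdict(list)
    run_succeeded = {}
    for run_id, agent, latency_ms, succeeded in spans:
        latencies[agent].append(latency_ms)
        # A run counts as successful only if every agent call in it succeeded.
        run_succeeded[run_id] = run_succeeded.get(run_id, True) and succeeded

    mean_latency = {agent: mean(values) for agent, values in latencies.items()}
    bottleneck = max(mean_latency, key=mean_latency.get)
    success_rate = sum(run_succeeded.values()) / len(run_succeeded)
    return mean_latency, bottleneck, success_rate


mean_latency, bottleneck, success_rate = summarize(spans)
```

In this sample the retriever succeeds on every call, yet only half of the runs succeed end to end. That kind of gap is only visible when outcomes are measured across the whole call graph.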

Today's pain points for builders

What's striking is how much time developers spend reinventing wheels. Each team builds custom solutions for agent coordination, often with significant limitations:

Tight coupling to specific frameworks: Most solutions only work with agents built on the same framework.
Limited discoverability: Finding the right agent for a specific task remains mostly manual and ad hoc, for both intra-enterprise and public agents.
Fragile deployment patterns: Connections between agents break easily when either agent changes.
Poor observability: Debugging issues across agent ensembles and call graphs is extremely difficult.
Security as an afterthought: Identity, authentication, and access control are often minimally implemented, especially across the multi-agent call graph.

These pain points show up consistently regardless of industry, use case, or company size. And they're holding back the entire ecosystem from reaching its potential.

Towards a common infrastructure

Without standardized infrastructure components, multi-agent systems will remain brittle, insecure, and limited. We need open source tools and protocols that address these challenges head-on.

The potential of multi-agent systems is enormous, but only if we build the right infrastructure. That's why we're a core member of AGNTCY.org, an open source collective for inter-agent collaboration where AI innovators and builders are coming together to advance agentic AI. Read more here.