You already know the meme: chatbots talk, agents act, multi-agent systems actually get stuff done.
If you’ve ever begged a bot to fix Intune and got a poem instead, this one’s for you. In this episode, we go full Netflix hands-on: you watch, you snack, I poke the dangerous Service Principal things so nobody nukes prod. We build a mini self-healing, governed multi-agent system using Azure AI Foundry + Semantic Kernel, wired into real enterprise surfaces:
- Intune
- Entra ID
- Microsoft Graph
- Azure Automation
- Log Analytics
We run one-agent vs multi-agent head-to-head on a real workflow: 12 minutes vs 3 minutes time-to-fix, with only my subscription credit on the line. You'll see why one agent stalls while teams fly, and how to ship this pattern safely in your own tenant.
🔥 What You'll Learn
1. Why a Single Agent Isn't Enough in the Enterprise
We start by tearing apart the "one giant agent" fantasy:
- Single agents are like gas-station Swiss Army knives: technically they have tools, practically they bend on the first real job.
- You stuff planning, reasoning, execution, approvals, and reporting into one prompt → context explodes, latency spikes, hallucinations creep in.
- One agent trying to:
  - Plan a change
  - Call Graph and Intune
  - Write remediation scripts
  - Request approvals
  - Verify results
  - Document everything
…is basically a help desk, change board, and postmortem crammed into one very tired intern. We break down what actually goes wrong:
- Context windows flooded with logs, policies, and MDM miscellany
- Important details get truncated or invented
- Token usage and costs balloon
- “Fix” attempts that quietly break other things (like deleting the resource instead of rotating a secret 😬)
Then we introduce the fix: Multi-agent = roles + boundaries + parallelism
- Planner focuses on intent & constraints
- Operator focuses on tools & execution
- Reviewer focuses on guardrails & approvals
Each agent gets a tight instruction set, minimal memory, and a focused toolset, passing around small structured messages, not a 50-page policy doc.
2. Multi-Agent Systems 101 (No Hype, Just The Pattern)
We map out a clear, shippable mental model: think digital team, not one big brain.
Roles:
- Planner — understands the goal, constraints, environment; outputs a stepwise plan with tool calls
- Operator — executes the plan via tools: Graph, Azure Automation, Functions, Logic Apps, etc.
- Reviewer — checks groundedness, scope, compliance, and safety before risky changes
- Messenger/Concierge — interacts with humans: approvals, status updates, and audit summaries
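To make "small structured messages" concrete, here's a minimal Python sketch (plain dataclasses, no particular SDK); the role prompts and field names are illustrative assumptions, not the exact schema from the episode:

```python
from dataclasses import dataclass

# Illustrative role prompts: short, role-specific, no shared 50-page context.
ROLE_INSTRUCTIONS = {
    "planner":   "Turn the goal into numbered steps, each with a named tool and parameters.",
    "operator":  "Execute exactly one plan step by calling the named tool. Return raw results only.",
    "reviewer":  "Check a proposed step against scope, compliance, and blast radius. Approve or reject.",
    "messenger": "Summarize status for humans and request approval when the reviewer asks for it.",
}

@dataclass
class PlanStep:
    """One unit of work handed from Planner to Operator."""
    step_id: int
    tool: str          # e.g. "graph.get_device" or "automation.run_runbook" (example names)
    params: dict       # strictly the arguments the tool schema allows
    rationale: str     # one sentence, not the whole reasoning trace

@dataclass
class ReviewVerdict:
    """Reviewer's answer before anything risky runs."""
    step_id: int
    approved: bool
    requires_human_approval: bool = False
    reason: str = ""

@dataclass
class StepResult:
    """Operator's compact report back; big payloads stay in external state."""
    step_id: int
    ok: bool
    summary: str
    evidence_ref: str = ""   # pointer (e.g. a Search doc id or blob path), not the raw logs
```

Each hop carries a handful of fields plus a pointer to external state, not the whole conversation history.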
Core concepts:
- Tools = hands
  - REST APIs (Graph, internal services)
  - Azure Automation runbooks (device scripts, remediation)
  - Azure Functions & Logic Apps (glue & approvals)
  - RAG via Azure AI Search (curated knowledge, not random web junk)
- Memory = budget, not magic
  - Minimize per-agent context
  - Use external state (Search, state store, thread metadata)
  - Only pass what's needed for the next decision
- Planning vs Execution (see the orchestration sketch after this list)
  - Planner decomposes → Operator calls tools → Reviewer checks → Messenger tells humans
  - This is where Semantic Kernel shines: planners, skills, function catalogs, retries, cancellation
- Safety by design (guardrail sketch below)
  - Managed Identities per agent
  - RBAC split into read vs manage
  - PIM for destructive operations
  - Tool calls logged to Log Analytics
  - Content Safety + prompt shields to block jailbreaks & indirect injection
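Here's the orchestration loop in sketch form: plain Python, not the actual Semantic Kernel planner API, with hypothetical tool names and stubbed tool logic standing in for Graph and Azure Automation calls:

```python
import json

# Hypothetical tool catalog: each tool declares a strict parameter schema.
# Real entries would wrap Graph calls, Automation runbooks, Functions, etc.
TOOLS = {
    "intune.get_device_compliance": {
        "schema": {"device_id": str},
        "destructive": False,
        "run": lambda device_id: {"device_id": device_id, "compliant": False},  # stub
    },
    "automation.rotate_secret": {
        "schema": {"app_id": str},
        "destructive": True,
        "run": lambda app_id: {"app_id": app_id, "rotated": True},  # stub
    },
}

def validate_params(tool_name: str, params: dict) -> None:
    """Reject any call that does not match the tool's declared schema."""
    schema = TOOLS[tool_name]["schema"]
    unexpected = set(params) - set(schema)
    if unexpected:
        raise ValueError(f"{tool_name}: unexpected params {sorted(unexpected)}")
    for key, expected_type in schema.items():
        if not isinstance(params.get(key), expected_type):
            raise ValueError(f"{tool_name}: {key} must be {expected_type.__name__}")

def run_workflow(goal: str, plan_fn, review_fn, notify_fn) -> list:
    """Planner decomposes -> Operator executes -> Reviewer gates -> Messenger reports.
    plan_fn / review_fn / notify_fn stand in for the model-backed agents."""
    results = []
    for step in plan_fn(goal):        # Planner: dicts like {"step_id", "tool", "params"}
        verdict = review_fn(step)     # Reviewer: approve or reject before anything runs
        if not verdict["approved"]:
            notify_fn(f"Step {step['step_id']} blocked: {verdict['reason']}")  # Messenger
            continue
        validate_params(step["tool"], step["params"])
        output = TOOLS[step["tool"]]["run"](**step["params"])                  # Operator
        results.append({"step_id": step["step_id"], "ok": True,
                        "summary": json.dumps(output)[:200]})  # keep forwarded context small
        notify_fn(f"Step {step['step_id']} done.")
    return results
```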
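And the guardrail sketch: the Reviewer's in-code gate plus the audit record per tool call. Managed Identities, RBAC, and PIM live in platform config, so this only shows the code side; the tool names and scopes are made up for the example.

```python
import logging
from datetime import datetime, timezone

audit_log = logging.getLogger("agent.toolcalls")

# Hypothetical scope split: what the Operator may do without a human in the loop.
READ_ONLY_TOOLS = {"intune.get_device_compliance", "graph.get_sign_in_logs"}
DESTRUCTIVE_TOOLS = {"automation.rotate_secret", "graph.delete_app"}

def reviewer_gate(step: dict, human_approved: bool = False) -> dict:
    """Reviewer policy: read-only tools pass, destructive tools need explicit approval."""
    tool = step["tool"]
    if tool in READ_ONLY_TOOLS:
        return {"approved": True, "reason": "read-only"}
    if tool in DESTRUCTIVE_TOOLS and human_approved:
        return {"approved": True, "reason": "human approval recorded"}
    return {"approved": False, "reason": f"{tool} requires PIM-elevated human approval"}

def audit_tool_call(agent: str, tool: str, params: dict, duration_ms: float, ok: bool) -> None:
    """One structured record per tool call; forward these to Log Analytics via your logging pipeline."""
    audit_log.info(
        "tool_call",
        extra={
            "agent": agent,
            "tool": tool,
            "params": {k: str(v)[:100] for k, v in params.items()},
            "duration_ms": duration_ms,
            "ok": ok,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        },
    )
```

reviewer_gate slots straight into run_workflow as review_fn; the audit records are the "who, what, where, how long" trail.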
3. How Azure AI Foundry Powers Multi-Agent Workflows
We then show how Azure AI Foundry becomes the control room. You'll see how to define agents with:
- Instructions — short, role-specific prompts
- Deployments — different models per role (GPT-4-class for planning, SLMs for extraction)
- Knowledge — Azure AI Search indexes, uploaded docs, optional web grounding
- Actions — OpenAPI tools, Graph, Logic Apps, Functions, Azure Automation, Code Interpreter
- Connected agents — yes, one agent can call another like a tool
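As a rough picture of what those definitions boil down to, here's the same setup as plain Python config. The keys, deployment names, and index names are placeholders, not the literal Foundry fields:

```python
# Placeholder agent definitions: in Foundry each of these becomes an agent with its
# own model deployment, knowledge sources, and actions. Names below are examples only.
AGENTS = {
    "planner": {
        "deployment": "reasoning-model",          # GPT-4-class deployment for planning
        "instructions": "Decompose the goal into tool-call steps within stated constraints.",
        "knowledge": ["ops-runbooks-index"],      # Azure AI Search index (example name)
        "actions": [],                            # the planner proposes, never executes
        "connected_agents": ["operator", "reviewer"],  # agents callable like tools
    },
    "operator": {
        "deployment": "workhorse-model",
        "instructions": "Call exactly the tool and parameters in the approved step.",
        "knowledge": [],
        "actions": ["graph-openapi", "automation-runbooks", "logic-app-approvals"],
        "connected_agents": [],
    },
    "reviewer": {
        "deployment": "reasoning-model",
        "instructions": "Check scope, groundedness, and compliance; block anything destructive without approval.",
        "knowledge": ["change-policy-index"],
        "actions": [],
        "connected_agents": [],
    },
    "messenger": {
        "deployment": "slm-summarizer",           # small-model deployment for summaries and status
        "instructions": "Summarize outcomes for humans and collect approvals.",
        "knowledge": [],
        "actions": ["teams-webhook"],
        "connected_agents": [],
    },
}
```

Notice where the power sits: instructions stay short, and the real capability lives in the actions and their schemas.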
Why this matters:
- Foundry handles threads, safety, tracing, and evaluations
- Semantic Kernel orchestrates the planner → operator → reviewer loop in code
- You keep prompts short and put power in tools with strict schemas
Model strategy:
- Reasoning models for planning and complex decisions
- Small models (SLMs) for extraction, classification, parameter shaping
- Mix serverless endpoints and managed compute depending on cost & residency needs
Safety & observability:
- Content Safety on inputs and outputs
- Prompt shields against jailbreak and indirect injection
- Full tracing of tool calls (who, what, where, how long)
- Application Insights + Log Analytics for performance & audit
- Built-in evaluation flows for groundedness, relevance, and fluency
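For the tracing piece, a sketch using the OpenTelemetry API; the span and attribute names are our own choices, and exporting to Application Insights is assumed to be configured separately (e.g. via the Azure Monitor OpenTelemetry distro):

```python
import functools
import time

from opentelemetry import trace

tracer = trace.get_tracer("multiagent.tools")

def traced_tool(agent_name: str):
    """Wrap a tool function so every call records who ran what, and how long it took."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            with tracer.start_as_current_span(f"tool.{fn.__name__}") as span:
                span.set_attribute("agent.name", agent_name)
                span.set_attribute("tool.name", fn.__name__)
                try:
                    result = fn(*args, **kwargs)
                    span.set_attribute("tool.ok", True)
                    return result
                except Exception as exc:
                    span.set_attribute("tool.ok", False)
                    span.record_exception(exc)
                    raise
                finally:
                    span.set_attribute("tool.duration_ms",
                                       (time.perf_counter() - start) * 1000)
        return wrapper
    return decorator

# Usage: decorate the functions behind your Graph / Automation tools.
@traced_tool(agent_name="operator")
def get_device_compliance(device_id: str) -> dict:
    # Placeholder for a real Microsoft Graph call.
    return {"device_id": device_id, "compliant": False}
```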
Become a supporter of this podcast: https://www.spreaker.com/podcast/m365-show-podcast--6704921/support.
Follow us on:
LinkedIn
Substack