
1
00:00:00,000 –> 00:00:03,340
Most organizations think more agents means more automation.
2
00:00:03,340 –> 00:00:04,060
They are wrong.
3
00:00:04,060 –> 00:00:05,660
Agent sprawl isn’t innovation.
4
00:00:05,660 –> 00:00:07,300
It’s unmanaged entropy.
5
00:00:07,300 –> 00:00:08,760
The minute you let every team publish
6
00:00:08,760 –> 00:00:11,420
their own little Copilot helper, you don’t get scale.
7
00:00:11,420 –> 00:00:13,300
You get a permissionless decision surface,
8
00:00:13,300 –> 00:00:15,400
and then you spend the next year trying to explain
9
00:00:15,400 –> 00:00:17,660
why the ROI can’t be proven.
10
00:00:17,660 –> 00:00:20,300
In this episode, you’ll learn the deployable architecture,
11
00:00:20,300 –> 00:00:22,180
a master agent as a control plane,
12
00:00:22,180 –> 00:00:24,640
and connected agents as governed services.
13
00:00:24,640 –> 00:00:26,300
Deterministic authority without pretending
14
00:00:26,300 –> 00:00:27,720
the model is explainable.
15
00:00:27,720 –> 00:00:30,180
Now let’s talk about what you’re actually building.
16
00:00:30,180 –> 00:00:31,580
The foundational misunderstanding:
17
00:00:31,580 –> 00:00:32,880
you think you’re building assistants.
18
00:00:32,880 –> 00:00:34,120
You’re building a decision engine,
19
00:00:34,120 –> 00:00:37,400
The marketing story says you’re deploying an assistant,
20
00:00:37,400 –> 00:00:39,480
a polite interface that helps people get answers
21
00:00:39,480 –> 00:00:40,440
and complete tasks.
22
00:00:40,440 –> 00:00:43,340
That framing is comforting because assistants sound optional.
23
00:00:43,340 –> 00:00:45,160
Assistants sound like user experience.
24
00:00:45,160 –> 00:00:46,760
And if the assistant behaves strangely,
25
00:00:46,760 –> 00:00:48,720
you shrug and say, AI is weird.
26
00:00:48,720 –> 00:00:50,840
In reality, you’re not deploying an assistant.
27
00:00:50,840 –> 00:00:52,880
You are deploying a distributed decision engine
28
00:00:52,880 –> 00:00:56,080
that sits in the middle of identity, data, tools, and action.
29
00:00:56,080 –> 00:00:58,480
It interprets intent, selects pathways,
30
00:00:58,480 –> 00:01:00,720
invokes capabilities and emits outcomes.
31
00:01:00,720 –> 00:01:02,960
That distinction matters because a decision engine
32
00:01:02,960 –> 00:01:04,440
is not judged by helpfulness.
33
00:01:04,440 –> 00:01:06,520
It’s judged by correctness, reproducibility,
34
00:01:06,520 –> 00:01:09,240
and the ability to prove it did what you intended.
35
00:01:09,240 –> 00:01:12,160
In enterprise systems, helpful is a non-requirement,
36
00:01:12,160 –> 00:01:13,280
correct is the requirement.
37
00:01:13,280 –> 00:01:16,360
And Copilot-style orchestration, multi-agent or not,
38
00:01:16,360 –> 00:01:18,480
will happily optimize for “looks correct,”
39
00:01:18,480 –> 00:01:21,440
unless you force it to optimize for “is allowed.”
40
00:01:21,440 –> 00:01:22,960
This is where organizations quietly
41
00:01:22,960 –> 00:01:24,640
collapse their own control model.
42
00:01:24,640 –> 00:01:26,040
They treat probabilistic reasoning
43
00:01:26,040 –> 00:01:27,680
as if it were deterministic workflow.
44
00:01:27,680 –> 00:01:29,880
They let natural language become the policy surface.
45
00:01:29,880 –> 00:01:32,360
They bury rules in prompts and call it governance.
46
00:01:32,360 –> 00:01:34,440
And then they act surprised when it drifts.
47
00:01:34,440 –> 00:01:36,040
Prompt embedded policy is not policy.
48
00:01:36,040 –> 00:01:38,000
It is a suggestion with a half-life.
49
00:01:38,000 –> 00:01:39,640
Because prompts are not compiled.
50
00:01:39,640 –> 00:01:40,960
They are interpreted.
51
00:01:40,960 –> 00:01:44,880
Every run is a fresh execution against a probabilistic system
52
00:01:44,880 –> 00:01:47,200
with variable context, variable routing,
53
00:01:47,200 –> 00:01:49,200
and variable tool selection pressure.
54
00:01:49,200 –> 00:01:51,600
Even if the model is stable, your environment is not.
55
00:01:51,600 –> 00:01:54,320
New documents appear, new agents get added,
56
00:01:54,320 –> 00:01:57,720
connectors change behavior, permission scopes evolve,
57
00:01:57,720 –> 00:01:59,280
and someone tweaks an instruction
58
00:01:59,280 –> 00:02:02,040
because a stakeholder wanted a friendlier tone.
59
00:02:02,040 –> 00:02:03,560
That is not an edge case.
60
00:02:03,560 –> 00:02:05,160
That is entropy doing its job.
61
00:02:05,160 –> 00:02:06,920
So the foundational misunderstanding is simple.
62
00:02:06,920 –> 00:02:09,400
You think you’re building a set of assistants that respond.
63
00:02:09,400 –> 00:02:11,080
But the system you’re operating behaves
64
00:02:11,080 –> 00:02:13,760
like an authorization compiler that emits actions.
65
00:02:13,760 –> 00:02:15,200
And the minute you let it emit actions
66
00:02:15,200 –> 00:02:16,520
without deterministic gates,
67
00:02:16,520 –> 00:02:18,520
you’ve moved from a deterministic security model
68
00:02:18,520 –> 00:02:19,960
to a probabilistic one.
69
00:02:19,960 –> 00:02:21,360
Not because the platform is broken,
70
00:02:21,360 –> 00:02:23,920
but because you handed the platform the authority to decide.
71
00:02:23,920 –> 00:02:25,680
This isn’t about smarter AI.
72
00:02:25,680 –> 00:02:28,440
It’s about who’s allowed to decide.
73
00:02:28,440 –> 00:02:29,560
Pause.
74
00:02:29,560 –> 00:02:32,400
Now let’s define what success means in this architecture
75
00:02:32,400 –> 00:02:34,400
because most teams can’t articulate it.
76
00:02:34,400 –> 00:02:35,400
They talk about adoption.
77
00:02:35,400 –> 00:02:37,040
They talk about time saved.
78
00:02:37,040 –> 00:02:39,600
They talk about how many agents they ship this quarter.
79
00:02:39,600 –> 00:02:41,280
None of that is a success criterion.
80
00:02:41,280 –> 00:02:42,480
Those are vanity metrics.
81
00:02:42,480 –> 00:02:45,600
Success criteria for enterprise multi-agent orchestration
82
00:02:45,600 –> 00:02:48,760
are boring, and that’s why they’re ignored.
83
00:02:48,760 –> 00:02:50,080
Predictability.
84
00:02:50,080 –> 00:02:51,360
Given the same request class,
85
00:02:51,360 –> 00:02:54,640
the system follows bounded paths and produces bounded outcomes.
86
00:02:54,640 –> 00:02:55,960
Auditability.
87
00:02:55,960 –> 00:02:58,520
You can reconstruct what happened, what data was used,
88
00:02:58,520 –> 00:03:01,120
what agent was invoked, what tool calls executed,
89
00:03:01,120 –> 00:03:03,000
and what approvals gated the action.
90
00:03:03,000 –> 00:03:03,800
Controllability.
91
00:03:03,800 –> 00:03:06,680
You can prevent categories of actions, not just discourage them.
92
00:03:06,680 –> 00:03:09,120
You can disable an agent, revoke a capability,
93
00:03:09,120 –> 00:03:12,320
or force human approval without rewriting half the ecosystem.
94
00:03:12,320 –> 00:03:13,400
Cost visibility.
95
00:03:13,400 –> 00:03:16,280
You can attribute consumption to workflows and capabilities.
96
00:03:16,280 –> 00:03:18,840
Not just see a token bill and guess who caused it.
97
00:03:18,840 –> 00:03:21,200
And there’s one more that people avoid saying out loud,
98
00:03:21,200 –> 00:03:22,480
decommissionability.
99
00:03:22,480 –> 00:03:24,760
If you can’t turn it off cleanly, you didn’t build a system.
100
00:03:24,760 –> 00:03:26,200
You built a dependency trap.
101
00:03:26,200 –> 00:03:28,160
Notice what’s missing, explainability.
102
00:03:28,160 –> 00:03:30,160
Not because explainability doesn’t matter,
103
00:03:30,160 –> 00:03:32,080
but because it’s the wrong control target.
104
00:03:32,080 –> 00:03:33,840
You don’t control probabilistic reasoning
105
00:03:33,840 –> 00:03:36,960
by demanding a perfect narrative of why it thought something.
106
00:03:36,960 –> 00:03:38,960
You control it by bounding what it can do next.
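Several of these criteria, cost visibility in particular, only become measurable when every run is tagged at the source. As a minimal sketch, assuming hypothetical trace events that carry a workflow and capability label plus per-run consumption (the field names and prices are illustrative, not a product schema):

from collections import defaultdict

# Hypothetical trace events; field names are illustrative, not a real Copilot API.
events = [
    {"workflow": "employee-onboarding", "capability": "hr-benefits-qa", "tool_calls": 2, "tokens": 1800},
    {"workflow": "employee-onboarding", "capability": "create-ticket", "tool_calls": 1, "tokens": 600},
    {"workflow": "invoice-review", "capability": "invoice-validation", "tool_calls": 4, "tokens": 5200},
]

def attribute_cost(events, token_price=0.000002):
    # Roll consumption up to (workflow, capability) instead of staring at one opaque bill.
    totals = defaultdict(lambda: {"tool_calls": 0, "tokens": 0, "cost": 0.0})
    for e in events:
        key = (e["workflow"], e["capability"])
        totals[key]["tool_calls"] += e["tool_calls"]
        totals[key]["tokens"] += e["tokens"]
        totals[key]["cost"] += e["tokens"] * token_price
    return dict(totals)

for (workflow, capability), t in attribute_cost(events).items():
    print(f"{workflow}/{capability}: {t['tool_calls']} tool calls, {t['tokens']} tokens, ~${t['cost']:.4f}")

If the events don’t carry those labels at emission time, no amount of after-the-fact analysis recovers attribution.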
107
00:03:38,960 –> 00:03:40,960
This reframing changes how you design.
108
00:03:40,960 –> 00:03:42,640
If you believe you’re building assistants,
109
00:03:42,640 –> 00:03:44,080
you optimize for coverage.
110
00:03:44,080 –> 00:03:45,800
You want the agent to handle anything.
111
00:03:45,800 –> 00:03:49,200
You add more tools, more knowledge, more connected agents,
112
00:03:49,200 –> 00:03:50,480
more capability.
113
00:03:50,480 –> 00:03:51,920
And you call that maturity.
114
00:03:51,920 –> 00:03:54,240
If you accept you’re building a decision engine,
115
00:03:54,240 –> 00:03:56,920
you optimize for determinism around execution.
116
00:03:56,920 –> 00:03:58,640
You separate reasoning from actuation.
117
00:03:58,640 –> 00:04:01,240
You treat agent selection as routing, not magic.
118
00:04:01,240 –> 00:04:04,560
You treat tools as privileged operations, not convenience features.
119
00:04:04,560 –> 00:04:07,040
And you design like every exception will become a precedent
120
00:04:07,040 –> 00:04:08,040
because it will.
121
00:04:08,040 –> 00:04:09,880
Once you see it this way, the default trajectory
122
00:04:09,880 –> 00:04:10,840
becomes obvious.
123
00:04:10,840 –> 00:04:12,640
Without a control plane, every new agent
124
00:04:12,640 –> 00:04:15,400
becomes another independent locus of policy, identity,
125
00:04:15,400 –> 00:04:16,560
and logic.
126
00:04:16,560 –> 00:04:19,240
Over time, the system stops reflecting intent
127
00:04:19,240 –> 00:04:22,280
and starts reflecting accumulated compromises.
128
00:04:22,280 –> 00:04:23,880
And that’s the setup for the next section
129
00:04:23,880 –> 00:04:25,360
because sprawl is not a byproduct.
130
00:04:25,360 –> 00:04:28,920
It is the default outcome when intent is not enforced by design.
131
00:04:28,920 –> 00:04:30,840
Agent sprawl: Power Automate sprawl,
132
00:04:30,840 –> 00:04:33,000
but with confidence and plausible lies.
133
00:04:33,000 –> 00:04:34,360
Agent sprawl is not new.
134
00:04:34,360 –> 00:04:37,360
Enterprises already lived through Power Automate sprawl.
135
00:04:37,360 –> 00:04:39,160
Flows built in personal environments,
136
00:04:39,160 –> 00:04:41,240
undocumented connectors, brittle dependencies,
137
00:04:41,240 –> 00:04:43,680
and temporary exceptions that turned into permanent business
138
00:04:43,680 –> 00:04:44,800
processes.
139
00:04:44,800 –> 00:04:48,040
Most teams survived it by pretending it was just technical debt.
140
00:04:48,040 –> 00:04:50,960
They were wrong then too, but at least the failure modes were legible.
141
00:04:50,960 –> 00:04:53,360
Agent sprawl is the same pattern, but weaponized.
142
00:04:53,360 –> 00:04:55,240
Because an agent is not just a workflow.
143
00:04:55,240 –> 00:04:57,320
It is a workflow plus language plus reasoning.
144
00:04:57,320 –> 00:04:58,360
It contains policy.
145
00:04:58,360 –> 00:04:59,480
It contains interpretation.
146
00:04:59,480 –> 00:05:01,280
It contains implied ownership.
147
00:05:01,280 –> 00:05:04,000
And it fails in ways that look believable to non-experts.
148
00:05:04,000 –> 00:05:06,720
The pattern starts the same way, decentralized logic.
149
00:05:06,720 –> 00:05:09,160
One team builds an agent to help with onboarding.
150
00:05:09,160 –> 00:05:12,360
Another team builds an agent to speed up access requests.
151
00:05:12,360 –> 00:05:15,400
A third builds an agent to summarize invoices.
152
00:05:15,400 –> 00:05:17,200
None of these agents share a contract,
153
00:05:17,200 –> 00:05:20,600
none share an audit model; they share a tenant and a user base.
154
00:05:20,600 –> 00:05:23,080
That’s not architecture, that’s co-tenancy.
155
00:05:23,080 –> 00:05:25,400
And then the security model becomes implicit.
156
00:05:25,400 –> 00:05:28,760
People assume that because an agent runs inside Microsoft 365,
157
00:05:28,760 –> 00:05:31,640
it inherits the safety properties of Microsoft 365.
158
00:05:31,640 –> 00:05:32,320
It doesn’t.
159
00:05:32,320 –> 00:05:35,200
It inherits your permissions model, your connector configuration,
160
00:05:35,200 –> 00:05:37,160
and your willingness to let tools execute.
161
00:05:37,160 –> 00:05:38,360
That’s it.
162
00:05:38,360 –> 00:05:39,760
Now here’s where most teams mess up.
163
00:05:39,760 –> 00:05:41,400
They treat overlap as harmless.
164
00:05:41,400 –> 00:05:45,200
If two agents can handle onboarding, leadership calls that redundancy.
165
00:05:45,200 –> 00:05:46,520
But in a decision engine,
166
00:05:46,520 –> 00:05:48,280
overlap is routing ambiguity.
167
00:05:48,280 –> 00:05:50,320
It creates non-deterministic delegation.
168
00:05:50,320 –> 00:05:53,120
You are no longer choosing which system executes policy.
169
00:05:53,120 –> 00:05:54,160
The model is choosing.
170
00:05:54,160 –> 00:05:56,240
And it will choose differently as context changes,
171
00:05:56,240 –> 00:05:58,520
different conversation history, different phrasing,
172
00:05:58,520 –> 00:06:01,240
different time of day, different knowledge results,
173
00:06:01,240 –> 00:06:03,800
different agent descriptions after someone edits them.
174
00:06:03,800 –> 00:06:06,200
This produces three predictable failure modes.
175
00:06:06,200 –> 00:06:08,600
First, duplicated business rules.
176
00:06:08,600 –> 00:06:10,480
Two agents implement the same policy,
177
00:06:10,480 –> 00:06:12,440
but with different wording, different assumptions,
178
00:06:12,440 –> 00:06:13,960
and different edge cases.
179
00:06:13,960 –> 00:06:16,000
Over time, one gets updated and the other doesn’t.
180
00:06:16,000 –> 00:06:17,760
Now you have policy divergence,
181
00:06:17,760 –> 00:06:20,040
and no one can tell you which one is the real one,
182
00:06:20,040 –> 00:06:21,960
because both can generate a confident answer
183
00:06:21,960 –> 00:06:23,320
that sounds compliant.
184
00:06:23,320 –> 00:06:24,600
Second, routing drift.
185
00:06:24,600 –> 00:06:27,920
The orchestrator routes to agent A today and agent B next week
186
00:06:27,920 –> 00:06:29,320
because the descriptions changed
187
00:06:29,320 –> 00:06:31,160
or the user asked the question differently.
188
00:06:31,160 –> 00:06:32,240
Nothing broke.
189
00:06:32,240 –> 00:06:33,680
The behavior just moved.
190
00:06:33,680 –> 00:06:35,160
This is the quietest kind of failure,
191
00:06:35,160 –> 00:06:37,440
because every individual run appears reasonable,
192
00:06:37,440 –> 00:06:40,160
but the system as a whole stops being reproducible.
193
00:06:40,160 –> 00:06:41,760
Third, hidden ownership.
194
00:06:41,760 –> 00:06:44,480
Agents get created in the same way flows got created.
195
00:06:44,480 –> 00:06:46,840
By whoever had the permissions and the motivation.
196
00:06:46,840 –> 00:06:49,080
A year later, the original author is gone,
197
00:06:49,080 –> 00:06:50,640
the business process depends on it
198
00:06:50,640 –> 00:06:52,080
and nobody wants to delete it,
199
00:06:52,080 –> 00:06:54,080
because nobody can prove what it will break.
200
00:06:54,080 –> 00:06:54,800
So it stays.
201
00:06:54,800 –> 00:06:57,480
Forever. That is architectural erosion with a UI.
202
00:06:57,480 –> 00:06:59,720
And now we arrive at the new operational hazard,
203
00:06:59,720 –> 00:07:00,920
the confident error.
204
00:07:00,920 –> 00:07:02,800
A classic automation failure is obvious.
205
00:07:02,800 –> 00:07:05,000
A connector fails, a flow times out,
206
00:07:05,000 –> 00:07:07,200
a job returns an error code, you get an incident,
207
00:07:07,200 –> 00:07:08,000
you fix it.
208
00:07:08,000 –> 00:07:09,520
A confident error is different.
209
00:07:09,520 –> 00:07:11,600
The system returns a coherent narrative
210
00:07:11,600 –> 00:07:14,680
with clean formatting and a sense of certainty while being wrong.
211
00:07:14,680 –> 00:07:17,120
Not maliciously wrong, just operationally wrong.
212
00:07:17,120 –> 00:07:19,960
It might skip an approval, misapply a policy exception
213
00:07:19,960 –> 00:07:21,760
or route to the wrong agent.
214
00:07:21,760 –> 00:07:23,640
And then it will explain the result in a way
215
00:07:23,640 –> 00:07:25,560
that satisfies the average reader.
216
00:07:25,560 –> 00:07:28,400
That means your incident response becomes philosophical.
217
00:07:28,400 –> 00:07:30,280
You are no longer debugging a failing step.
218
00:07:30,280 –> 00:07:32,640
You are arguing with an outcome that looks valid
219
00:07:32,640 –> 00:07:34,160
until you replay the trace.
220
00:07:34,160 –> 00:07:36,400
And if you don’t have a trace, you don’t have an incident.
221
00:07:36,400 –> 00:07:37,520
You have a rumor.
222
00:07:37,520 –> 00:07:40,480
This is why agent sprawl is worse than workflow sprawl.
223
00:07:40,480 –> 00:07:42,080
It doesn’t just create debt.
224
00:07:42,080 –> 00:07:45,160
It creates ambiguity. Missing policies create obvious gaps;
225
00:07:45,160 –> 00:07:47,000
drifting policies create ambiguity.
226
00:07:47,000 –> 00:07:49,440
Ambiguity is where auditors and attackers live.
227
00:07:49,440 –> 00:07:51,320
The final irony is political.
228
00:07:51,320 –> 00:07:53,480
Once agents proliferate, decommissioning
229
00:07:53,480 –> 00:07:55,400
becomes socially impossible.
230
00:07:55,400 –> 00:07:56,920
Every agent has a user.
231
00:07:56,920 –> 00:07:59,560
Every user has a story about how it saved them time.
232
00:07:59,560 –> 00:08:01,360
Nobody has a story about how it quietly
233
00:08:01,360 –> 00:08:03,600
violated an approval chain because those stories only
234
00:08:03,600 –> 00:08:05,400
surface when something breaks publicly.
235
00:08:05,400 –> 00:08:07,760
So sprawl grows and governance arrives late,
236
00:08:07,760 –> 00:08:09,480
carrying spreadsheets and good intentions.
237
00:08:09,480 –> 00:08:12,600
That’s not enough because the real cost of sprawl isn’t tokens.
238
00:08:12,600 –> 00:08:14,560
It’s governance debt that compounds
239
00:08:14,560 –> 00:08:18,120
until you can no longer prove what your system does.
240
00:08:18,120 –> 00:08:19,800
Why ROI collapses:
241
00:08:19,800 –> 00:08:22,880
If you can’t reproduce behavior, you can’t prove value.
242
00:08:22,880 –> 00:08:26,200
The ROI story collapses the moment you can’t reproduce behavior.
243
00:08:26,200 –> 00:08:26,880
Not explain it.
244
00:08:26,880 –> 00:08:29,720
Reproduce it because executives don’t fund vibes.
245
00:08:29,720 –> 00:08:31,840
They fund systems that create repeatable outcomes
246
00:08:31,840 –> 00:08:33,000
under known constraints.
247
00:08:33,000 –> 00:08:35,760
In a multi-agent environment without determinism,
248
00:08:35,760 –> 00:08:38,720
you get what looks like automation, but behaves like improvisation.
249
00:08:38,720 –> 00:08:41,960
It completes tasks, but it can’t demonstrate reliability across runs.
250
00:08:41,960 –> 00:08:44,720
And the second you try to scale it, more users,
251
00:08:44,720 –> 00:08:48,320
more use cases, more data sources, the variance becomes the product.
252
00:08:48,320 –> 00:08:51,280
This is what unaccountable automation looks like in practice.
253
00:08:51,280 –> 00:08:53,800
You can’t answer basic questions without hand waving.
254
00:08:53,800 –> 00:08:55,640
Why did this request route to that agent?
255
00:08:55,640 –> 00:08:57,200
Why did it invoke that connector?
256
00:08:57,200 –> 00:08:58,920
Why did it skip the normal approval?
257
00:08:58,920 –> 00:09:02,720
Why did the same request last week take two tool calls and today take nine?
258
00:09:02,720 –> 00:09:05,560
If the only answer is the model decided, you don’t have a system.
259
00:09:05,560 –> 00:09:07,520
You have a liability with a chat interface.
260
00:09:07,520 –> 00:09:11,080
And once behavior becomes irreproducible, value becomes unprovable.
261
00:09:11,080 –> 00:09:12,000
You can’t baseline.
262
00:09:12,000 –> 00:09:13,000
You can’t AB test.
263
00:09:13,000 –> 00:09:14,200
You can’t attribute savings.
264
00:09:14,200 –> 00:09:15,440
You can’t defend spend.
265
00:09:15,440 –> 00:09:16,960
All you can do is point at anecdotes
266
00:09:16,960 –> 00:09:19,320
and hope the budget committee is feeling generous.
267
00:09:19,320 –> 00:09:24,080
Cost opacity arrives next because the real cost in multi-agent orchestration is not a token bill.
268
00:09:24,080 –> 00:09:26,440
The real cost is unbounded execution pathways.
269
00:09:26,440 –> 00:09:27,960
You don’t pay for an agent.
270
00:09:27,960 –> 00:09:30,000
You pay for planning, routing, tool calls,
271
00:09:30,000 –> 00:09:33,160
retries and context growth distributed across agents
272
00:09:33,160 –> 00:09:36,880
that each have their own logic and their own tendency to over-explain.
273
00:09:36,880 –> 00:09:39,280
When an agent can choose between multiple tools
274
00:09:39,280 –> 00:09:42,840
and multiple agents can answer the same request class, you lose cost predictability.
275
00:09:42,840 –> 00:09:45,520
The same intent can fan out into different action graphs.
276
00:09:45,520 –> 00:09:47,360
So your finance model becomes a guess.
277
00:09:47,360 –> 00:09:51,520
And when finance can’t forecast, finance eventually blocks. Operational fragility follows,
278
00:09:51,520 –> 00:09:53,360
and it’s uglier than people expect.
279
00:09:53,360 –> 00:09:58,440
In deterministic automation, an incident review asks: what failed, where, and why?
280
00:09:58,440 –> 00:10:02,320
In probabilistic orchestration, the incident review asks: what did it do?
281
00:10:02,320 –> 00:10:05,200
What did it think it was doing and why did it do something else this time?
282
00:10:05,200 –> 00:10:10,240
You end up with run-to-run variance: the same input produces materially different behavior.
283
00:10:10,240 –> 00:10:12,800
Sometimes it’s harmless, sometimes it’s a policy breach.
284
00:10:12,800 –> 00:10:16,480
Either way, it destroys your ability to operate the system like enterprise software
285
00:10:16,480 –> 00:10:19,000
because enterprise software is supposed to be boring.
286
00:10:19,000 –> 00:10:22,080
Then compliance steps in, as it always does, late and unimpressed.
287
00:10:22,080 –> 00:10:26,880
Fragmented audit trails across agents and tools create the worst kind of governance problem.
288
00:10:26,880 –> 00:10:29,600
You can’t reconstruct the chain of custody for a decision.
289
00:10:29,600 –> 00:10:32,560
The user asks the question, the parent agent routed,
290
00:10:32,560 –> 00:10:37,200
a connected agent invoked a tool, another agent drafted a message. Somewhere in that chain,
291
00:10:37,200 –> 00:10:41,360
a permission boundary got crossed or an approval was implied instead of verified.
292
00:10:41,360 –> 00:10:46,080
And if you can’t produce a trace that shows intent, decision, action and outcome,
293
00:10:46,080 –> 00:10:48,080
cleanly, you don’t have an audit trail.
294
00:10:48,080 –> 00:10:49,040
You have a narrative.
295
00:10:49,040 –> 00:10:50,560
Auditors don’t certify narratives.
296
00:10:50,560 –> 00:10:52,240
Maintenance overhead is the slow death.
297
00:10:52,240 –> 00:10:55,600
Copy and paste reuse looks efficient until it forks logic.
298
00:10:55,600 –> 00:10:57,520
Then you have silent divergence again.
299
00:10:57,520 –> 00:11:00,240
One team fixes a bug in their agent instructions.
300
00:11:00,240 –> 00:11:04,800
Another team’s agent still contains the old rules. A third team embedded the logic inside
301
00:11:04,800 –> 00:11:06,880
a child agent and forgot it existed.
302
00:11:06,880 –> 00:11:09,440
Now you’re maintaining policy like it’s folklore.
303
00:11:09,440 –> 00:11:11,520
And this is where the real ROI collapse happens.
304
00:11:11,520 –> 00:11:13,040
Scaling decisions stop.
305
00:11:13,040 –> 00:11:17,680
Leaders stop funding expansion because every incremental deployment increases uncertainty.
306
00:11:17,680 –> 00:11:19,520
You can’t promise consistent behavior.
307
00:11:19,520 –> 00:11:20,640
You can’t forecast costs.
308
00:11:20,640 –> 00:11:21,920
You can’t defend compliance.
309
00:11:21,920 –> 00:11:23,200
You can’t decommission safely.
310
00:11:23,200 –> 00:11:25,360
So the initiative stalls at useful demo scale.
311
00:11:25,360 –> 00:11:26,720
Lots of activity.
312
00:11:26,720 –> 00:11:28,000
Lots of screenshots.
313
00:11:28,000 –> 00:11:29,200
No enterprise outcome.
314
00:11:30,160 –> 00:11:31,840
You can feel the reason emerging.
315
00:11:31,840 –> 00:11:33,920
The fix is not better prompts.
316
00:11:33,920 –> 00:11:35,360
Prompts do not create determinism.
317
00:11:35,360 –> 00:11:36,480
They create hope.
318
00:11:36,480 –> 00:11:37,760
The fix is a control plane.
319
00:11:37,760 –> 00:11:42,000
A system that decides explicitly what is allowed to execute in what order,
320
00:11:42,000 –> 00:11:45,120
under what identity, with what logging and with what kill switches.
321
00:11:45,120 –> 00:11:46,320
And that’s the pivot point.
322
00:11:46,320 –> 00:11:48,720
The question is not why the AI thought that.
323
00:11:48,720 –> 00:11:51,440
The question is what the system allowed it to do next.
324
00:11:51,440 –> 00:11:55,680
The deterministic core plus reasoned edge: the only deployable architecture.
325
00:11:55,680 –> 00:11:56,960
Here’s the uncomfortable truth.
326
00:11:56,960 –> 00:11:58,640
You do not get to govern intelligence.
327
00:11:59,120 –> 00:12:00,480
You govern execution.
328
00:12:00,480 –> 00:12:05,360
And the only architecture that survives enterprise scale is the deterministic core with a reasoned edge.
329
00:12:05,360 –> 00:12:07,120
This is not a philosophical preference.
330
00:12:07,120 –> 00:12:08,800
It is a mechanical necessity.
331
00:12:08,800 –> 00:12:09,680
AI can reason.
332
00:12:09,680 –> 00:12:10,160
It can draft.
333
00:12:10,160 –> 00:12:10,880
It can summarize.
334
00:12:10,880 –> 00:12:11,440
It can propose.
335
00:12:11,440 –> 00:12:12,400
It can even plan.
336
00:12:12,400 –> 00:12:15,440
But the moment you let probabilistic reasoning directly control actuation,
337
00:12:15,440 –> 00:12:18,960
tool calls, approvals, identity changes, financial operations,
338
00:12:18,960 –> 00:12:22,000
you have converted your control plane into a suggestion engine.
339
00:12:22,000 –> 00:12:24,080
And suggestion engines do not pass audits.
340
00:12:24,080 –> 00:12:25,280
So the rule is simple.
341
00:12:25,280 –> 00:12:26,720
Separate planning from execution.
342
00:12:27,360 –> 00:12:29,360
Let the model do what models are good at.
343
00:12:29,360 –> 00:12:33,680
Interpreting messy input, extracting intent, generating candidate plans,
344
00:12:33,680 –> 00:12:36,080
and handling ambiguity inside a bounded step.
345
00:12:36,080 –> 00:12:38,960
Then enforce execution through deterministic obligations.
346
00:12:38,960 –> 00:12:42,240
Explicit gates, ordered steps, mandatory checks, and logged outcomes.
347
00:12:42,240 –> 00:12:46,400
Determinism in this context doesn’t mean the model always says the same sentence.
348
00:12:46,400 –> 00:12:48,000
That is not the goal.
349
00:12:48,000 –> 00:12:50,080
And anyone selling that is selling fiction.
350
00:12:50,080 –> 00:12:52,640
Determinism means the action graph is bounded.
351
00:12:52,640 –> 00:12:54,320
The allowed transitions are known.
352
00:12:54,320 –> 00:12:55,360
The approvals are real.
353
00:12:55,360 –> 00:12:56,640
The tool calls are constrained.
354
00:12:56,640 –> 00:12:58,240
The ordering is enforced.
355
00:12:58,240 –> 00:13:00,720
And the outcome is reproducible at the level that matters.
356
00:13:00,720 –> 00:13:01,840
What the system did.
357
00:13:01,840 –> 00:13:04,480
So if you remember one technical definition, it’s this.
358
00:13:04,480 –> 00:13:07,520
Determinism is bounded pathways plus enforced gates.
359
00:13:07,520 –> 00:13:09,120
Everything else is aesthetic.
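To make that definition concrete, here is a minimal sketch, with illustrative states and gate names that are not tied to any specific platform: a bounded action graph is just an explicit transition table, and an enforced gate is a check the transition cannot skip.

# Illustrative only: a bounded action graph with enforced gates.
ALLOWED = {
    ("intake", "validated"): None,                 # no gate required for this transition
    ("validated", "approved"): "manager_approval", # gate that must be satisfied
    ("approved", "executed"): "tool_allowlisted",
}

def advance(state, target, satisfied_gates):
    gate = ALLOWED.get((state, target), "__missing__")
    if gate == "__missing__":
        raise ValueError(f"Transition {state} -> {target} is not a bounded pathway")
    if gate is not None and gate not in satisfied_gates:
        raise PermissionError(f"Gate '{gate}' not satisfied for {state} -> {target}")
    return target

state = "intake"
state = advance(state, "validated", satisfied_gates=set())
state = advance(state, "approved", satisfied_gates={"manager_approval"})
# advance(state, "executed", satisfied_gates=set())  # would raise: gate not satisfied

The model never appears in this table; it can propose which transition to attempt, but anything outside the table simply cannot happen.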
360
00:13:09,120 –> 00:13:10,640
Now most teams get this backwards.
361
00:13:10,640 –> 00:13:12,160
They build a brilliant reasoning layer
362
00:13:12,160 –> 00:13:14,640
and then bolt on execution as an afterthought.
363
00:13:14,640 –> 00:13:16,720
They treat tool access like integration.
364
00:13:16,720 –> 00:13:20,720
They assume that because the model is grounded, execution is safe.
365
00:13:20,720 –> 00:13:22,480
Grounding is not safety.
366
00:13:22,480 –> 00:13:25,280
Grounding is a reduction in hallucination probability.
367
00:13:25,280 –> 00:13:28,000
Safety is a reduction in unauthorized actuation.
368
00:13:28,000 –> 00:13:29,120
That distinction matters.
369
00:13:29,120 –> 00:13:32,880
The deterministic core is the part that owns authority.
370
00:13:32,880 –> 00:13:37,040
Policy enforcement, identity normalization, tool access and state progression.
371
00:13:37,040 –> 00:13:38,240
It is the control plane.
372
00:13:38,240 –> 00:13:39,440
It does not get creative.
373
00:13:39,440 –> 00:13:40,560
It does not invent steps.
374
00:13:40,560 –> 00:13:41,840
It does not infer approvals.
375
00:13:41,840 –> 00:13:44,240
It does not helpfully bypass segregation of duties
376
00:13:44,240 –> 00:13:45,920
because the user sounds urgent.
377
00:13:45,920 –> 00:13:46,800
It is boring.
378
00:13:46,800 –> 00:13:48,320
And boring is deployable.
379
00:13:48,320 –> 00:13:51,040
The reasoned edge is where you allow probabilistic behavior,
380
00:13:51,040 –> 00:13:53,040
but only inside controlled boxes.
381
00:13:53,040 –> 00:13:55,760
A box is a step where the model can produce an output.
382
00:13:55,760 –> 00:14:00,000
But that output does not directly execute privileged actions without validation.
383
00:14:00,000 –> 00:14:04,400
The model can draft an email, but a deterministic gate decides whether it gets sent.
384
00:14:04,400 –> 00:14:06,320
The model can propose an access package,
385
00:14:06,320 –> 00:14:09,360
but a deterministic gate decides whether the request meets policy
386
00:14:09,360 –> 00:14:10,960
and whether approvals are satisfied.
387
00:14:10,960 –> 00:14:12,880
The model can classify an invoice,
388
00:14:12,880 –> 00:14:16,080
but deterministic matching rules decide whether payment is allowed.
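A hedged sketch of that split, using the invoice example and a stand-in for the model call (propose_invoice_match is a placeholder, not a real API): the probabilistic step proposes, and a deterministic rule decides whether anything actuates.

# Illustrative reasoned-edge / deterministic-gate split.
def propose_invoice_match(model_output):
    # Stand-in for the probabilistic step: the model classifies and proposes.
    return model_output  # e.g. {"invoice_id": "INV-100", "po_id": "PO-7", "amount": 1250.0}

def payment_gate(proposal, purchase_orders):
    # Deterministic matching rule: payment is allowed only on an exact PO match.
    po = purchase_orders.get(proposal["po_id"])
    if po is None:
        return False, "unknown purchase order"
    if abs(po["amount"] - proposal["amount"]) > 0.01:
        return False, "amount mismatch"
    return True, "ok"

purchase_orders = {"PO-7": {"amount": 1250.0}}
proposal = propose_invoice_match({"invoice_id": "INV-100", "po_id": "PO-7", "amount": 1250.0})
allowed, reason = payment_gate(proposal, purchase_orders)
print("execute payment" if allowed else f"refused: {reason}")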
389
00:14:16,080 –> 00:14:20,080
This clicked for a lot of teams when they stopped thinking in terms of agents doing work
390
00:14:20,080 –> 00:14:22,560
and started thinking in terms of blast radius management.
391
00:14:22,560 –> 00:14:25,680
A probabilistic component must have a bounded blast radius.
392
00:14:25,680 –> 00:14:27,600
That means it has limited permissions.
393
00:14:27,600 –> 00:14:29,360
It has limited tool scope.
394
00:14:29,360 –> 00:14:30,960
It has limited context sharing.
395
00:14:30,960 –> 00:14:33,040
It produces structured outputs.
396
00:14:33,040 –> 00:14:36,880
And it hits deterministic checkpoints before anything irreversible happens.
397
00:14:36,880 –> 00:14:38,880
Once you design like this, the payoff is immediate.
398
00:14:38,880 –> 00:14:42,880
You can use AI where it actually provides leverage, judgment inside steps,
399
00:14:42,880 –> 00:14:46,320
while keeping the system legible, auditable, and controllable.
400
00:14:46,320 –> 00:14:49,360
And yes, this architecture still feels intelligent to end users.
401
00:14:49,360 –> 00:14:52,400
In fact, it feels more intelligent because it behaves consistently.
402
00:14:52,400 –> 00:14:54,400
The system doesn’t randomly escalate.
403
00:14:54,400 –> 00:14:55,680
It doesn’t contradict itself.
404
00:14:55,680 –> 00:14:59,680
It doesn’t alternate between overconfidence and paralysis depending on context.
405
00:14:59,680 –> 00:15:03,920
It follows a known process and it uses AI to reduce friction inside that process.
406
00:15:03,920 –> 00:15:05,440
This isn’t about smarter AI.
407
00:15:05,440 –> 00:15:07,680
It’s about who’s allowed to decide.
408
00:15:07,680 –> 00:15:09,280
Pause.
409
00:15:09,280 –> 00:15:11,040
Now, this is the part where people ask,
410
00:15:11,040 –> 00:15:12,800
“So where does orchestration live?”
411
00:15:12,800 –> 00:15:14,720
It lives in the deterministic core.
412
00:15:14,720 –> 00:15:16,480
Orchestration is not the model thinking.
413
00:15:16,480 –> 00:15:19,760
Orchestration is the system enforcing intent at scale,
414
00:15:19,760 –> 00:15:24,960
routing to the right capability, controlling context, sequencing calls, enforcing approvals,
415
00:15:24,960 –> 00:15:26,640
and emitting a trace you can replay.
416
00:15:26,640 –> 00:15:29,120
Reasoning can assist orchestration.
417
00:15:29,120 –> 00:15:33,760
It can’t replace it because the system cannot outsource authority to a probabilistic component
418
00:15:33,760 –> 00:15:35,600
and then pretend it still controls outcomes.
419
00:15:35,600 –> 00:15:38,720
Once you accept that, the next concept becomes obvious.
420
00:15:38,720 –> 00:15:40,400
You need a master agent.
421
00:15:40,400 –> 00:15:43,040
Not a protagonist, not a superbrain, a control plane.
422
00:15:43,040 –> 00:15:45,200
The master agent: control plane, not protagonist.
423
00:15:45,200 –> 00:15:47,840
A master agent isn’t smarter AI.
424
00:15:47,840 –> 00:15:51,600
It’s a control plane that prevents smart systems from doing stupid things.
425
00:15:51,600 –> 00:15:55,440
That sentence matters because most teams build the opposite, a hero agent,
426
00:15:55,440 –> 00:15:59,200
a single conversational endpoint stuffed with knowledge, tools, and permissions,
427
00:15:59,200 –> 00:16:01,280
expected to handle anything.
428
00:16:01,280 –> 00:16:02,560
It looks elegant in a demo.
429
00:16:02,560 –> 00:16:04,320
It becomes a disaster in production.
430
00:16:04,320 –> 00:16:06,240
The master agent is not there to be impressive.
431
00:16:06,240 –> 00:16:07,680
It is there to be accountable.
432
00:16:07,680 –> 00:16:12,240
Its job is to hold state, enforce gates, and route work to governed capabilities.
433
00:16:12,240 –> 00:16:16,240
It should behave like infrastructure, quiet, deterministic, predictable,
434
00:16:16,240 –> 00:16:20,480
auditable, and frankly boring enough that nobody argues about what it meant.
435
00:16:20,480 –> 00:16:22,240
So what does that actually mean in practice?
436
00:16:22,240 –> 00:16:26,160
First, the master agent owns workflow state, not conversation vibe state.
437
00:16:26,160 –> 00:16:27,280
Where are we in the process?
438
00:16:27,280 –> 00:16:28,400
What has been validated?
439
00:16:28,400 –> 00:16:29,440
What remains outstanding?
440
00:16:29,440 –> 00:16:30,640
What approvals exist?
441
00:16:30,640 –> 00:16:32,960
What identity context is in effect?
442
00:16:32,960 –> 00:16:36,320
If you can’t point to a state machine, you don’t have orchestration.
443
00:16:36,320 –> 00:16:38,240
You have improvisation with logging.
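A minimal sketch of what owning workflow state means in practice, with assumed field names: a record the control plane can point to at any moment, rather than a conversation history it has to re-read.

# Illustrative workflow-state record owned by the control plane.
from dataclasses import dataclass, field

@dataclass
class WorkflowState:
    workflow_id: str
    step: str                                      # where we are in the process
    validated: set = field(default_factory=set)    # checks already passed
    outstanding: set = field(default_factory=set)  # checks still required
    approvals: dict = field(default_factory=dict)  # approval name -> approver identity
    identity_context: str = ""                     # which identity is in effect

state = WorkflowState(
    workflow_id="onboarding-0042",
    step="access-request",
    validated={"employment_verified"},
    outstanding={"manager_approval"},
    identity_context="user:jdoe (delegated)",
)
print(state.outstanding)  # the control plane, not the model, knows what is missing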
444
00:16:38,240 –> 00:16:39,440
Second, it owns gating.
445
00:16:39,440 –> 00:16:42,640
Every privileged step, anything that changes a system of record,
446
00:16:42,640 –> 00:16:45,280
grants access, commits money, creates an account,
447
00:16:45,280 –> 00:16:48,720
sends an external message, must pass through deterministic gates
448
00:16:48,720 –> 00:16:50,640
that the master agent enforces.
449
00:16:50,640 –> 00:16:52,480
Not because the model can’t be trusted,
450
00:16:52,480 –> 00:16:54,640
but because trust is not a control mechanism.
451
00:16:54,640 –> 00:16:56,240
Gates are.
452
00:16:56,240 –> 00:16:57,760
Third, it owns tool control.
453
00:16:57,760 –> 00:17:01,920
The master agent decides which tools may be called when and under which conditions.
454
00:17:01,920 –> 00:17:04,240
It does not allow open-ended tool choice,
455
00:17:04,240 –> 00:17:06,240
just because a user asked politely.
456
00:17:06,240 –> 00:17:09,680
It enforces least privilege by design, not by documentation.
457
00:17:09,680 –> 00:17:12,960
If a workflow doesn’t require a connector, that connector is not available.
458
00:17:12,960 –> 00:17:17,520
If a capability requires elevated permission, the elevation is explicit, temporary, and logged.
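A sketch of least privilege by design rather than by documentation, with hypothetical workflow and tool names: the workflow definition carries its own allow-list, and anything outside it is simply not callable.

# Illustrative per-workflow tool allow-list enforced before any call is made.
WORKFLOW_TOOLS = {
    "hr-benefits-qa": {"search_hr_docs"},
    "access-request": {"create_access_request", "check_approval_status"},
}

def call_tool(workflow, tool, invoke):
    allowed = WORKFLOW_TOOLS.get(workflow, set())
    if tool not in allowed:
        # The connector is not merely discouraged; it is unavailable to this workflow.
        raise PermissionError(f"{tool} is not available to workflow '{workflow}'")
    return invoke()

print(call_tool("hr-benefits-qa", "search_hr_docs", lambda: "ok"))
# call_tool("hr-benefits-qa", "send_external_email", lambda: "ok")  # raises PermissionError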
459
00:17:17,520 –> 00:17:19,600
Fourth, it normalizes identity.
460
00:17:19,600 –> 00:17:22,960
In a multi-agent world, identity becomes fragmented fast.
461
00:17:22,960 –> 00:17:26,800
Different auth contexts, different connection owners, different scopes,
462
00:17:26,800 –> 00:17:28,320
different consent histories.
463
00:17:28,320 –> 00:17:31,520
The master agent must make identity a first-class object,
464
00:17:31,520 –> 00:17:34,480
which user initiated, which service principal executed,
465
00:17:34,480 –> 00:17:36,480
which delegated permissions applied,
466
00:17:36,480 –> 00:17:38,800
which approvals bound the action.
467
00:17:38,800 –> 00:17:41,920
Otherwise, you end up with “the agent did it” as your audit narrative.
468
00:17:41,920 –> 00:17:44,160
That is not a narrative, that is an admission.
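One way to make identity a first-class object, sketched with illustrative field names: every privileged action carries a record that answers who asked, what executed, under which permissions, and which approvals bound it.

# Illustrative identity context attached to every privileged action.
from dataclasses import dataclass

@dataclass(frozen=True)
class IdentityContext:
    initiating_user: str        # who asked
    executing_principal: str    # which service principal actually ran the call
    delegated_scopes: tuple     # which delegated permissions applied
    bound_approvals: tuple      # which approvals authorized the action

ctx = IdentityContext(
    initiating_user="jdoe@contoso.example",
    executing_principal="sp:onboarding-agent",
    delegated_scopes=("Tickets.ReadWrite",),
    bound_approvals=("manager_approval#8841",),
)
print(ctx)  # this record, not a story, is the audit answer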
469
00:17:44,160 –> 00:17:46,560
Fifth, it logs like a system, not like a chatbot.
470
00:17:46,560 –> 00:17:49,120
You need consistent structured traces.
471
00:17:49,120 –> 00:17:54,560
Intent classification, routing decision, agent invoked, tools called, parameters passed,
472
00:17:54,560 –> 00:17:59,040
outputs returned, gates evaluated, approval satisfied, and final outcome.
473
00:17:59,040 –> 00:18:00,880
This is not optional observability.
474
00:18:00,880 –> 00:18:04,400
This is the only way to make probabilistic reasoning operationally tolerable.
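As a sketch, with assumed field names that simply mirror the list above (not a product schema): one structured trace event per step, emitted by the control plane.

# Illustrative structured trace event.
import json, datetime

def trace_event(intent, routing_decision, agent, tools, gates, approvals, outcome):
    event = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "intent": intent,
        "routing_decision": routing_decision,
        "agent_invoked": agent,
        "tool_calls": tools,            # list of {tool, parameters, output}
        "gates_evaluated": gates,       # list of {gate, result}
        "approvals_satisfied": approvals,
        "outcome": outcome,
    }
    print(json.dumps(event))            # in practice: append to an immutable log store
    return event

trace_event(
    intent="request_software_access",
    routing_decision="access-request-agent",
    agent="access-request-agent",
    tools=[{"tool": "create_access_request", "parameters": {"app": "CRM"}, "output": "REQ-311"}],
    gates=[{"gate": "segregation_of_duties", "result": "pass"}],
    approvals=["manager_approval#8841"],
    outcome="request_created",
)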
475
00:18:04,400 –> 00:18:06,240
Now, here’s where most people rebuild the problem.
476
00:18:06,240 –> 00:18:08,320
They let the master agent do domain work.
477
00:18:08,320 –> 00:18:10,080
They let it draft the policy response.
478
00:18:10,080 –> 00:18:11,760
They let it infer the approval.
479
00:18:11,760 –> 00:18:13,280
They let it decide the exceptions.
480
00:18:13,280 –> 00:18:16,240
They let it helpfully bridge gaps because it seems capable.
481
00:18:16,240 –> 00:18:17,040
Don’t.
482
00:18:17,040 –> 00:18:18,720
The moment your master agent gets clever,
483
00:18:18,720 –> 00:18:19,840
you’ve rebuilt a monolith,
484
00:18:19,840 –> 00:18:23,120
except now it’s a monolith with probabilistic behavior at the center.
485
00:18:23,120 –> 00:18:25,760
You have taken the component that must be deterministic
486
00:18:25,760 –> 00:18:27,920
and turned it into another entropy generator,
487
00:18:27,920 –> 00:18:29,360
So the master agent stays thin.
488
00:18:29,360 –> 00:18:30,720
It does not contain business logic,
489
00:18:30,720 –> 00:18:34,320
it contains routing logic, gate logic, and control logic.
490
00:18:34,320 –> 00:18:38,240
It delegates domain work to specialized agents with explicit contracts,
491
00:18:38,240 –> 00:18:40,640
bounded permissions, and testable behaviors,
492
00:18:40,640 –> 00:18:42,160
and it should never do four things.
493
00:18:42,160 –> 00:18:44,320
One, it should never invent process steps.
494
00:18:44,320 –> 00:18:47,280
If the workflow requires an approval, it must request it.
495
00:18:47,280 –> 00:18:50,720
It does not infer it from tone, hierarchy, or urgency.
496
00:18:50,720 –> 00:18:53,520
Two, it should never bypass segregation of duties.
497
00:18:53,520 –> 00:18:55,360
If the same actor can request and approve
498
00:18:55,360 –> 00:18:57,920
through an AI shortcut, you just automated fraud.
499
00:18:57,920 –> 00:19:02,320
Three, it should never expand scope based on convenience.
500
00:19:02,320 –> 00:19:04,960
If the user asks for “and also,” that’s a new intent,
501
00:19:04,960 –> 00:19:07,200
new intent means new routing and new gates.
502
00:19:07,840 –> 00:19:10,720
Four, it should never hide uncertainty behind narrative.
503
00:19:10,720 –> 00:19:13,440
If a gate fails, it reports the gate failure,
504
00:19:13,440 –> 00:19:15,120
not a friendly alternative story.
505
00:19:15,120 –> 00:19:18,080
If you build it this way, the master agent becomes your enforcement point,
506
00:19:18,080 –> 00:19:20,080
your kill switch, your policy compiler,
507
00:19:20,080 –> 00:19:22,480
the thing you can show an auditor without embarrassment.
508
00:19:22,480 –> 00:19:23,920
And once you have a control plane,
509
00:19:23,920 –> 00:19:25,440
you need something worth calling.
510
00:19:25,440 –> 00:19:28,240
Governed services, explicit capability boundaries.
511
00:19:28,240 –> 00:19:30,240
Connected agents:
512
00:19:30,240 –> 00:19:32,320
managed services for enterprise capabilities.
513
00:19:32,320 –> 00:19:36,480
Connected agents are the part most teams misunderstand,
514
00:19:36,480 –> 00:19:38,480
because they hear agent and they think chat.
515
00:19:38,480 –> 00:19:41,840
That’s not what they are in an enterprise architecture.
516
00:19:41,840 –> 00:19:44,640
A connected agent is a managed capability surface.
517
00:19:44,640 –> 00:19:46,880
A service, an owned interface with a contract,
518
00:19:46,880 –> 00:19:48,480
and if you don’t treat it like a service,
519
00:19:48,480 –> 00:19:52,320
it will behave like every other unmanaged integration you’ve ever regretted.
520
00:19:52,320 –> 00:19:54,640
The point of a connected agent is not that it can speak.
521
00:19:54,640 –> 00:19:56,560
The point is that it can be called,
522
00:19:56,560 –> 00:19:59,760
called predictably, called with bounded permissions,
523
00:19:59,760 –> 00:20:01,840
called with a description that tells the orchestrator
524
00:20:01,840 –> 00:20:03,440
when to use it and when not to.
525
00:20:03,440 –> 00:20:06,400
This is the difference between a zoo of clever assistants
526
00:20:06,400 –> 00:20:07,840
and an internal platform.
527
00:20:07,840 –> 00:20:10,080
The master agent becomes your control plane,
528
00:20:10,080 –> 00:20:12,320
connected agents become your governed services.
529
00:20:12,320 –> 00:20:14,480
So let’s define the contract because that’s where
530
00:20:14,480 –> 00:20:15,840
determinism is won or lost.
531
00:20:15,840 –> 00:20:19,040
A connected agent needs an explicit capability boundary.
532
00:20:19,040 –> 00:20:20,880
“HR benefits questions” is a boundary.
533
00:20:20,880 –> 00:20:22,640
“Book vacation” is a boundary.
534
00:20:22,640 –> 00:20:24,560
“Create a ServiceNow ticket” is a boundary.
535
00:20:24,560 –> 00:20:26,480
“Look up invoice status” is a boundary.
536
00:20:26,480 –> 00:20:28,720
“General productivity helper” is not a boundary.
537
00:20:28,720 –> 00:20:29,920
It’s a cover story.
538
00:20:29,920 –> 00:20:31,040
Once you have that boundary,
539
00:20:31,040 –> 00:20:33,920
the agent gets a description that acts like an API definition
540
00:20:33,920 –> 00:20:35,040
for the orchestrator.
541
00:20:35,040 –> 00:20:37,760
This is where most teams stay vague because vague feels flexible.
542
00:20:37,760 –> 00:20:41,920
Vague is routing ambiguity, and routing ambiguity is non-determinism.
543
00:20:41,920 –> 00:20:44,000
So the description must contain three things.
544
00:20:44,000 –> 00:20:46,560
In plain language that still reads like a contract:
545
00:20:46,560 –> 00:20:48,240
what it does, what it does not do,
546
00:20:48,240 –> 00:20:50,640
and the prerequisites for safe invocation.
547
00:20:50,640 –> 00:20:54,000
If the agent can only operate on approved HR documents, say so.
548
00:20:54,000 –> 00:20:56,080
If it must never send an external email, say so.
549
00:20:56,080 –> 00:20:59,440
If it can write drafts, but not execute changes, say so.
550
00:20:59,440 –> 00:21:01,440
This is not documentation for humans.
551
00:21:01,440 –> 00:21:04,240
It is a routing signal for a distributed decision engine.
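A sketch of such a contract expressed as structured data rather than prose, with hypothetical fields and names (this is not a Copilot Studio schema): what it does, what it does not do, and the prerequisites for safe invocation, readable by both humans and the orchestrator.

# Illustrative capability manifest for a connected agent.
HR_BENEFITS_AGENT = {
    "name": "hr-benefits-qa",
    "owner": "hr-platform-team@contoso.example",
    "version": "1.3.0",
    "does": "Answers employee questions about benefits from the approved HR benefits knowledge source.",
    "does_not": [
        "create, modify, or approve employee records",
        "send external email",
        "perform access changes",
    ],
    "prerequisites": [
        "caller identity resolved to an active employee",
        "query classified as benefits-related",
    ],
    "enabled": True,   # the kill switch governance actually needs
}

def may_route(manifest, intent_class, prerequisites_met):
    # Deterministic pre-check before the orchestrator is even allowed to consider this agent.
    return manifest["enabled"] and intent_class == "benefits_question" and prerequisites_met

print(may_route(HR_BENEFITS_AGENT, "benefits_question", prerequisites_met=True))

The owner, version, and enabled fields are what give the capability a life cycle you can actually operate: publish, deprecate, roll back, or switch off without archaeology.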
552
00:21:04,240 –> 00:21:07,520
And because it’s a connected agent, it has its own life cycle.
553
00:21:07,520 –> 00:21:08,800
That’s the entire point.
554
00:21:08,800 –> 00:21:09,600
Versioning matters.
555
00:21:09,600 –> 00:21:12,240
You can publish V2 without silently mutating V1.
556
00:21:12,240 –> 00:21:13,040
You can deprecate.
557
00:21:13,040 –> 00:21:13,840
You can roll back.
558
00:21:13,840 –> 00:21:15,520
You can kill-switch the capability
559
00:21:15,520 –> 00:21:17,600
if it starts producing the wrong outcomes.
560
00:21:17,600 –> 00:21:19,920
You can put an owner on it, an accountable team,
561
00:21:19,920 –> 00:21:22,640
a mailbox, an on-call rotation if you’re serious.
562
00:21:22,640 –> 00:21:24,080
This is where governance becomes real
563
00:21:24,080 –> 00:21:27,680
because now you can govern capabilities instead of conversations.
564
00:21:27,680 –> 00:21:30,160
Connected agents also give you a security boundary
565
00:21:30,160 –> 00:21:32,480
that embedded agents can’t give you cleanly.
566
00:21:32,480 –> 00:21:34,400
Dedicated authentication configuration.
567
00:21:34,400 –> 00:21:37,600
That matters because OAuth is where most multi-agent fantasies
568
00:21:37,600 –> 00:21:38,560
die in production.
569
00:21:38,560 –> 00:21:40,320
You can scope permissions per capability.
570
00:21:40,320 –> 00:21:41,680
You can separate identities.
571
00:21:41,680 –> 00:21:44,320
You can decide whether the agent runs with the user’s context,
572
00:21:44,320 –> 00:21:47,280
with an application context, or with a constrained service identity.
573
00:21:47,280 –> 00:21:50,640
You can isolate blast radius by design, not by policy memo.
574
00:21:50,640 –> 00:21:53,600
And when someone asks who allowed this agent to do that,
575
00:21:53,600 –> 00:21:55,920
you can answer with something other than silence.
576
00:21:55,920 –> 00:21:58,320
Operationally, connected agents are also where you get
577
00:21:58,320 –> 00:22:00,480
reusability without copy-paste entropy.
578
00:22:00,480 –> 00:22:02,240
Approved once, reuse many times.
579
00:22:02,240 –> 00:22:04,400
The travel policy agent can serve HR onboarding,
580
00:22:04,400 –> 00:22:06,400
expense review, and a travel booking workflow
581
00:22:06,400 –> 00:22:09,040
without three teams re-implementing policy interpretation
582
00:22:09,040 –> 00:22:10,240
three different ways.
583
00:22:10,240 –> 00:22:13,280
The invoice validation agent can be called by procurement,
584
00:22:13,280 –> 00:22:15,120
finance, and vendor management,
585
00:22:15,120 –> 00:22:17,360
without cloning logic into separate helpers.
586
00:22:17,360 –> 00:22:18,640
This is what platform looks like.
587
00:22:18,640 –> 00:22:19,520
But there’s a constraint.
588
00:22:19,520 –> 00:22:21,760
You must keep connected agents bounded.
589
00:22:21,760 –> 00:22:23,280
The moment you stuff a connected agent
590
00:22:23,280 –> 00:22:25,840
with five unrelated capabilities for convenience,
591
00:22:25,840 –> 00:22:27,280
you’ve recreated a monolith.
592
00:22:27,280 –> 00:22:29,200
The orchestrator loses routing clarity.
593
00:22:29,200 –> 00:22:31,040
The agent becomes a second control plane
594
00:22:31,040 –> 00:22:32,800
with its own drift, its own exceptions,
595
00:22:32,800 –> 00:22:34,320
and its own political constituency.
596
00:22:34,320 –> 00:22:36,560
So you publish capabilities, not personalities,
597
00:22:36,560 –> 00:22:38,080
and you design for failure.
598
00:22:38,080 –> 00:22:40,720
Every connected agent must assume it will be invoked
599
00:22:40,720 –> 00:22:43,440
in weird contexts with partial information
600
00:22:43,440 –> 00:22:45,120
and with adversarial ambiguity.
601
00:22:45,120 –> 00:22:46,400
Not because users are malicious,
602
00:22:46,400 –> 00:22:49,280
because language is messy and systems amplify mess.
603
00:22:49,280 –> 00:22:51,840
So you enforce structured input and structured output
604
00:22:51,840 –> 00:22:52,560
wherever possible.
605
00:22:52,560 –> 00:22:54,880
You return machine-usable results, not just prose.
606
00:22:54,880 –> 00:22:56,480
You validate before actuation.
607
00:22:56,480 –> 00:22:58,320
You surface refusal states clearly.
608
00:22:58,320 –> 00:23:01,520
You log every tool call and every gate interaction
609
00:23:01,520 –> 00:23:03,840
as part of the trace chain the master agent owns.
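As a sketch, with an assumed result shape and field names chosen for illustration: a connected agent returns a machine-usable result that either carries validated structured data or an explicit refusal, never an improvised narrative.

# Illustrative structured result with an explicit refusal state.
REQUIRED_FIELDS = {"invoice_id", "vendor", "amount"}

def validate_output(raw):
    # Deterministic validation before anything downstream is allowed to actuate.
    missing = REQUIRED_FIELDS - set(raw)
    if missing:
        return {"status": "refused", "reason": f"missing fields: {sorted(missing)}"}
    if not isinstance(raw["amount"], (int, float)) or raw["amount"] <= 0:
        return {"status": "refused", "reason": "amount must be a positive number"}
    return {"status": "ok", "data": {k: raw[k] for k in REQUIRED_FIELDS}}

print(validate_output({"invoice_id": "INV-100", "vendor": "Fabrikam", "amount": 1250.0}))
print(validate_output({"invoice_id": "INV-101"}))  # surfaces a refusal, not a friendly story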
610
00:23:03,840 –> 00:23:06,720
If you do this right, the orchestrator becomes simple.
611
00:23:06,720 –> 00:23:08,240
It routes to services.
612
00:23:08,240 –> 00:23:10,320
It doesn’t negotiate with improvisation.
613
00:23:10,320 –> 00:23:11,360
And that’s the payoff,
614
00:23:11,360 –> 00:23:13,120
a catalog of governed capabilities
615
00:23:13,120 –> 00:23:14,560
that can scale across teams
616
00:23:14,560 –> 00:23:16,640
without turning your tenant into conversational chaos.
617
00:23:16,640 –> 00:23:17,760
Now, to make this deployable,
618
00:23:17,760 –> 00:23:19,840
you need to understand the coupling decision,
619
00:23:19,840 –> 00:23:22,080
because not every capability should be connected.
620
00:23:22,080 –> 00:23:23,520
Some should stay embedded,
621
00:23:23,520 –> 00:23:26,560
and that decision is the difference between architecture and sprawl.
622
00:23:27,200 –> 00:23:30,080
Embedded child agents versus connected agents:
623
00:23:30,080 –> 00:23:31,360
coupling is the real decision.
624
00:23:31,360 –> 00:23:35,200
Most teams treat this as an implementation choice.
625
00:23:35,200 –> 00:23:36,720
It isn’t. It’s a coupling decision,
626
00:23:36,720 –> 00:23:39,680
and coupling decisions are how architectures either scale or rot.
627
00:23:39,680 –> 00:23:42,400
An embedded child agent is an internal module.
628
00:23:42,400 –> 00:23:43,600
It lives inside the parent,
629
00:23:43,600 –> 00:23:45,840
it inherits configuration, it ships with the parent.
630
00:23:45,840 –> 00:23:47,920
It shares the same overall life cycle.
631
00:23:47,920 –> 00:23:50,720
That means it’s fast to build, fast to modify,
632
00:23:50,720 –> 00:23:53,440
and politically easy, because one team can own the whole thing.
633
00:23:53,440 –> 00:23:54,960
A connected agent is a service.
634
00:23:54,960 –> 00:23:56,400
It has its own life cycle.
635
00:23:56,400 –> 00:23:58,400
It can be consumed by multiple parents.
636
00:23:58,400 –> 00:23:59,920
It can be owned by a different team.
637
00:23:59,920 –> 00:24:01,600
It can have dedicated settings,
638
00:24:01,600 –> 00:24:03,520
including authentication configuration.
639
00:24:03,520 –> 00:24:05,120
That separation is not overhead.
640
00:24:05,120 –> 00:24:08,000
That separation is what prevents every new workflow
641
00:24:08,000 –> 00:24:10,560
from becoming a new fork of policy and logic.
642
00:24:10,560 –> 00:24:12,000
So, the decision is simple.
643
00:24:12,000 –> 00:24:13,200
Are you designing a workflow,
644
00:24:13,200 –> 00:24:14,640
or are you designing a capability?
645
00:24:14,640 –> 00:24:16,560
If it’s workflow-specific glue,
646
00:24:16,560 –> 00:24:18,640
context-shaping, intermediate formatting,
647
00:24:18,640 –> 00:24:20,560
step-local reasoning, embedded.
648
00:24:20,560 –> 00:24:22,800
If it is a reusable enterprise capability,
649
00:24:22,800 –> 00:24:24,480
HR policy interpretation,
650
00:24:24,480 –> 00:24:26,320
identity life cycle operations,
651
00:24:26,320 –> 00:24:30,080
invoice validation, ticket creation, vendor lookup, connected.
652
00:24:30,080 –> 00:24:33,760
The thing most people miss is that reuse is not just convenience.
653
00:24:33,760 –> 00:24:35,200
Reuse is governance leverage.
654
00:24:35,200 –> 00:24:37,760
If you want to approve and audit a capability once,
655
00:24:37,760 –> 00:24:39,040
you need it to exist once.
656
00:24:39,040 –> 00:24:40,320
That means connected.
657
00:24:40,320 –> 00:24:41,840
Now, here’s where most people mess up.
658
00:24:41,840 –> 00:24:43,920
They build embedded child agents because it’s easy,
659
00:24:43,920 –> 00:24:45,360
and then they copy the parent agent
660
00:24:45,360 –> 00:24:47,440
because another team wants the same thing.
661
00:24:47,440 –> 00:24:49,040
They tell themselves it’s temporary.
662
00:24:49,040 –> 00:24:49,920
It never is.
663
00:24:49,920 –> 00:24:51,760
That’s how hidden sprawl is manufactured.
664
00:24:51,760 –> 00:24:53,760
The sprawl moves from too many agents
665
00:24:53,760 –> 00:24:55,760
to too many near identical agents.
666
00:24:55,760 –> 00:24:58,000
Embedded agents are not inherently bad.
667
00:24:58,000 –> 00:25:00,800
They’re the right tool for bounded single-team systems.
668
00:25:00,800 –> 00:25:02,560
They let you decompose complexity
669
00:25:02,560 –> 00:25:04,720
without turning every module into a published service.
670
00:25:04,720 –> 00:25:06,000
They keep iteration tight.
671
00:25:06,000 –> 00:25:07,680
They reduce cross-team dependency.
672
00:25:07,680 –> 00:25:09,840
And when the whole workflow is owned by one team,
673
00:25:09,840 –> 00:25:11,440
this can be exactly what you want.
674
00:25:11,440 –> 00:25:14,000
But embedded agents have an architectural cost.
675
00:25:14,000 –> 00:25:16,640
They are coupled to the parent’s context and life cycle.
676
00:25:16,640 –> 00:25:18,000
That coupling becomes a problem
677
00:25:18,000 –> 00:25:20,080
the moment you need any of the following:
678
00:25:20,080 –> 00:25:22,080
Independent versioning, independent rollback,
679
00:25:22,080 –> 00:25:23,440
independent kill switch,
680
00:25:23,440 –> 00:25:25,200
independent authentication boundary,
681
00:25:25,200 –> 00:25:26,640
or independent ownership.
682
00:25:26,640 –> 00:25:28,160
And those needs are not exotic.
683
00:25:28,160 –> 00:25:29,920
They are guaranteed as soon as the workflow
684
00:25:29,920 –> 00:25:31,680
touches regulated domains.
685
00:25:31,680 –> 00:25:33,680
So there’s a rule for regulated systems.
686
00:25:33,680 –> 00:25:35,200
And it’s not negotiable.
687
00:25:35,200 –> 00:25:37,440
Default to connected agents for any touch point
688
00:25:37,440 –> 00:25:39,760
that changes identity, money, or employment state.
689
00:25:39,760 –> 00:25:42,000
HR, finance, IAM, procurement,
690
00:25:42,000 –> 00:25:44,400
and anything that can create or grant access.
691
00:25:44,400 –> 00:25:47,200
The reason is not that embedded agents are less capable.
692
00:25:47,200 –> 00:25:49,200
The reason is that embedded agents
693
00:25:49,200 –> 00:25:51,680
erase the boundary where governance should live.
694
00:25:51,680 –> 00:25:54,240
If the child agent shares the parent’s identity and tools
695
00:25:54,240 –> 00:25:56,480
then the boundary between reasoning and actuation
696
00:25:56,480 –> 00:25:57,760
is softer than you think.
697
00:25:57,760 –> 00:26:00,160
And soft boundaries turn into incident reports.
698
00:26:00,160 –> 00:26:02,160
Connected agents give you a harder boundary.
699
00:26:02,160 –> 00:26:04,000
They force you to define a contract.
700
00:26:04,000 –> 00:26:05,040
They force you to publish.
701
00:26:05,040 –> 00:26:06,800
They force you to name an owner.
702
00:26:06,800 –> 00:26:08,640
They force you to think about life cycle.
703
00:26:08,640 –> 00:26:10,080
And that friction is productive.
704
00:26:10,080 –> 00:26:11,760
It’s the same kind of friction
705
00:26:11,760 –> 00:26:13,200
that stops someone from deploying
706
00:26:13,200 –> 00:26:15,360
an unreviewed service directly into production.
707
00:26:15,360 –> 00:26:16,880
In other words, it’s the point.
708
00:26:16,880 –> 00:26:19,600
Now, there’s a second order effect that matters even more.
709
00:26:19,600 –> 00:26:21,280
Orchestration quality degrades
710
00:26:21,280 –> 00:26:23,280
as the choice set grows and overlaps.
711
00:26:23,280 –> 00:26:25,680
If you embed every child agent inside a parent,
712
00:26:25,680 –> 00:26:27,920
you’re building a private ecosystem inside that parent.
713
00:26:27,920 –> 00:26:29,040
The parent gets heavier.
714
00:26:29,040 –> 00:26:30,560
The routing problem gets fuzzier.
715
00:26:30,560 –> 00:26:33,520
The temptation to “just let the parent handle it” returns.
716
00:26:33,520 –> 00:26:35,200
Over time, you rebuild the monolith
717
00:26:35,200 –> 00:26:37,280
because the parent has too many responsibilities
718
00:26:37,280 –> 00:26:39,040
and too many internal variations.
719
00:26:39,040 –> 00:26:40,640
Connected agents reduce that pressure
720
00:26:40,640 –> 00:26:42,880
by moving capability ownership outward.
721
00:26:42,880 –> 00:26:44,480
They make reuse explicit.
722
00:26:44,480 –> 00:26:45,760
They make duplication harder.
723
00:26:45,760 –> 00:26:46,960
They make sprawl visible.
724
00:26:46,960 –> 00:26:49,360
So the deterministic decision rule is this.
725
00:26:49,360 –> 00:26:51,600
Reusable capability: connected.
726
00:26:51,600 –> 00:26:53,680
Workflow-specific logic: embedded.
727
00:26:53,680 –> 00:26:55,200
And if you’re tempted to call something
728
00:26:55,200 –> 00:26:57,760
workflow-specific because you don’t want to publish it,
729
00:26:57,760 –> 00:26:59,680
you’re not making an architectural decision.
730
00:26:59,680 –> 00:27:00,800
You’re avoiding governance.
731
00:27:00,800 –> 00:27:03,280
One more practical checkpoint.
732
00:27:03,280 –> 00:27:04,800
If two teams are going to depend on it,
733
00:27:04,800 –> 00:27:05,840
it must be connected.
734
00:27:05,840 –> 00:27:07,680
If it needs a dedicated auth configuration,
735
00:27:07,680 –> 00:27:08,640
it must be connected.
736
00:27:08,640 –> 00:27:10,800
If it needs a rollback plan, it must be connected.
737
00:27:10,800 –> 00:27:12,480
If it might need to be killed quickly
738
00:27:12,480 –> 00:27:14,000
without deleting the whole parent,
739
00:27:14,000 –> 00:27:15,040
it must be connected.
740
00:27:15,040 –> 00:27:16,320
Everything else can be embedded.
741
00:27:16,320 –> 00:27:17,600
Make the coupling decision early
742
00:27:17,600 –> 00:27:20,080
because changing it later is expensive.
743
00:27:20,080 –> 00:27:21,200
Not technically.
744
00:27:21,200 –> 00:27:22,320
Politically.
745
00:27:22,320 –> 00:27:24,320
And once you commit to the right coupling model,
746
00:27:24,320 –> 00:27:26,480
the next problem emerges immediately.
747
00:27:26,480 –> 00:27:29,360
Orchestration quality depends on routing signals.
748
00:27:29,360 –> 00:27:30,000
Not vibes.
749
00:27:30,000 –> 00:27:32,560
Routing determinism: descriptions,
750
00:27:32,560 –> 00:27:35,440
invocation rules and controlled context sharing.
751
00:27:35,440 –> 00:27:37,680
Routing determinism is the part everyone hand-waves
752
00:27:37,680 –> 00:27:39,680
because it looks like just descriptions.
753
00:27:39,680 –> 00:27:40,240
It is not.
754
00:27:40,240 –> 00:27:43,200
It is the selection logic for a distributed decision engine.
755
00:27:43,200 –> 00:27:45,120
And if you leave selection logic to vibes,
756
00:27:45,120 –> 00:27:46,960
you get non-deterministic execution.
757
00:27:46,960 –> 00:27:49,760
In Copilot Multi-Agent Orchestration,
758
00:27:49,760 –> 00:27:52,160
the orchestrator routes based on what it can infer
759
00:27:52,160 –> 00:27:53,840
from agent descriptions, instructions,
760
00:27:53,840 –> 00:27:54,960
and the current context.
761
00:27:54,960 –> 00:27:58,240
That sounds convenient until you remember what infer means.
762
00:27:58,240 –> 00:28:00,000
It means probabilistic classification.
763
00:28:00,000 –> 00:28:00,720
It means drift.
764
00:28:00,720 –> 00:28:01,920
It means the same
765
00:28:01,920 –> 00:28:03,920
request class can land in different places
766
00:28:03,920 –> 00:28:06,000
depending on phrasing, conversation history,
767
00:28:06,000 –> 00:28:08,800
and which agent description someone improved last week.
768
00:28:08,800 –> 00:28:11,520
So you design routing like you design APIs,
769
00:28:11,520 –> 00:28:13,760
explicit contracts, explicit invocation rules,
770
00:28:13,760 –> 00:28:15,360
and explicit constraints on context.
771
00:28:15,360 –> 00:28:16,320
Start with descriptions.
772
00:28:16,320 –> 00:28:18,480
The thing most people miss is that an agent description
773
00:28:18,480 –> 00:28:19,680
is not marketing copy.
774
00:28:19,680 –> 00:28:20,720
It is routing signal.
775
00:28:20,720 –> 00:28:22,320
It must include three elements.
776
00:28:22,320 –> 00:28:24,880
And if any one is missing, your orchestrator will guess.
777
00:28:24,880 –> 00:28:27,120
First, capability scope.
778
00:28:27,120 –> 00:28:28,720
What it does, stated narrowly.
779
00:28:28,720 –> 00:28:29,920
Not “helps with HR.”
780
00:28:29,920 –> 00:28:30,880
That’s junk.
781
00:28:30,880 –> 00:28:32,960
Answers employee questions about benefits based
782
00:28:32,960 –> 00:28:34,480
on the HR benefits knowledge source.
783
00:28:34,480 –> 00:28:37,600
That is scope. Second, exclusion scope,
784
00:28:37,600 –> 00:28:38,560
what it does not do,
785
00:28:38,560 –> 00:28:40,960
does not create, modify, or approve employee records,
786
00:28:40,960 –> 00:28:42,480
does not send external emails,
787
00:28:42,480 –> 00:28:44,160
does not perform access changes.
788
00:28:44,160 –> 00:28:47,120
That matters because the orchestrator needs negative space.
789
00:28:47,120 –> 00:28:48,800
Without it, overlap looks valid.
790
00:28:48,800 –> 00:28:50,160
Third, prerequisites.
791
00:28:50,160 –> 00:28:53,200
What input it requires to behave deterministically.
792
00:28:53,200 –> 00:28:55,040
Requires employee ID.
793
00:28:55,040 –> 00:28:56,640
Requires invoice number.
794
00:28:56,640 –> 00:28:58,240
Requires manager UPN.
795
00:28:58,240 –> 00:29:00,400
If the agent can’t operate without a key,
796
00:29:00,400 –> 00:29:02,080
the description must say so.
797
00:29:02,080 –> 00:29:04,240
Otherwise it will attempt to operate anyway
798
00:29:04,240 –> 00:29:06,320
and you’ll get confident nonsense.
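To make the three-element description concrete, here is a minimal sketch of what a routing-grade description could look like when captured as structured data rather than prose. The field names, the can_route check, and the HR benefits example are illustrative assumptions, not Copilot configuration.

```python
from dataclasses import dataclass, field

@dataclass
class AgentDescription:
    """Hypothetical routing contract: scope, exclusions, prerequisites."""
    name: str
    capability_scope: str                                     # what it does, stated narrowly
    exclusions: list[str] = field(default_factory=list)       # the negative space
    prerequisites: list[str] = field(default_factory=list)    # required keys

    def can_route(self, provided_inputs: set[str]) -> bool:
        # Routing is only deterministic if every prerequisite is present.
        return all(req in provided_inputs for req in self.prerequisites)

hr_benefits = AgentDescription(
    name="HR Benefits Q&A",
    capability_scope="Answers employee questions about benefits "
                     "from the HR benefits knowledge source.",
    exclusions=[
        "does not create, modify, or approve employee records",
        "does not send external emails",
        "does not perform access changes",
    ],
    prerequisites=["employee_id"],
)

print(hr_benefits.can_route({"employee_id"}))  # True
print(hr_benefits.can_route(set()))            # False -> stop, do not guess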
799
00:29:06,320 –> 00:29:08,640
Now move from descriptions to invocation rules.
800
00:29:08,640 –> 00:29:11,280
This is where you stop hoping and start enforcing.
801
00:29:11,280 –> 00:29:13,040
Invocation rules are simple sentences
802
00:29:13,040 –> 00:29:14,880
that reduce routing ambiguity.
803
00:29:14,880 –> 00:29:16,080
“Use this agent when…”
804
00:29:16,880 –> 00:29:19,040
And “do not use this agent when…”
805
00:29:19,040 –> 00:29:20,480
Write them like guardrails.
806
00:29:20,480 –> 00:29:21,600
Not aspirations.
807
00:29:21,600 –> 00:29:23,120
If you’re using connected agents,
808
00:29:23,120 –> 00:29:25,120
treat these rules as part of the contract,
809
00:29:25,120 –> 00:29:27,600
version them and change them deliberately.
810
00:29:27,600 –> 00:29:29,120
Random edits are entropy.
811
00:29:29,120 –> 00:29:31,280
A usable pattern is trigger phrases,
812
00:29:31,280 –> 00:29:33,280
trigger objects and trigger intent.
813
00:29:33,280 –> 00:29:34,880
Trigger phrases are the user’s words.
814
00:29:34,880 –> 00:29:37,680
Trigger objects are the identifiers that appear.
815
00:29:37,680 –> 00:29:39,440
Trigger intent is the action category.
816
00:29:39,440 –> 00:29:42,240
So you say use the invoice validation agent
817
00:29:42,240 –> 00:29:44,320
when the user provides an invoice number,
818
00:29:44,320 –> 00:29:48,000
asks about payment status, matching exceptions or vendor compliance.
819
00:29:48,000 –> 00:29:51,280
Do not use it for general finance policy questions.
820
00:29:51,280 –> 00:29:53,280
Route those to the finance policy agent.
821
00:29:53,280 –> 00:29:55,360
That’s not poetic, that’s deterministic.
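A sketch of how trigger phrases, trigger objects, and trigger intent can become a machine-checkable rule instead of an aspiration. The invoice-number pattern, the intent labels, and the rule shape are assumptions made up for illustration.

```python
import re

# Hypothetical invocation rule for an invoice validation agent.
INVOICE_RULE = {
    "agent": "invoice_validation",
    "trigger_objects": [re.compile(r"\bINV-?\d{3,}\b", re.I)],   # identifiers in the text
    "trigger_intents": {"payment_status", "matching_exception", "vendor_compliance"},
    "excluded_intents": {"finance_policy_question"},             # route elsewhere
}

def matches(rule: dict, text: str, intent: str) -> bool:
    """Deterministic check: object present, intent allowed, exclusion honored."""
    if intent in rule["excluded_intents"]:
        return False
    has_object = any(p.search(text) for p in rule["trigger_objects"])
    return has_object and intent in rule["trigger_intents"]

print(matches(INVOICE_RULE, "Check INV-1042 for payment", "payment_status"))            # True
print(matches(INVOICE_RULE, "What is our travel policy?", "finance_policy_question"))   # False
```

Because the rule is data, it can be versioned and changed deliberately rather than edited into a prompt.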
822
00:29:55,360 –> 00:29:57,120
And yes, this will feel rigid to people
823
00:29:57,120 –> 00:29:59,360
who want the system to just understand.
824
00:29:59,360 –> 00:30:00,880
They are optimizing for demo flow.
825
00:30:00,880 –> 00:30:02,800
You are optimizing for enterprise behavior.
826
00:30:02,800 –> 00:30:04,160
Now, context sharing.
827
00:30:04,160 –> 00:30:06,400
This is the quiet killer. Passing conversation history
828
00:30:06,400 –> 00:30:08,640
to a connected agent is not a harmless convenience.
829
00:30:08,640 –> 00:30:09,760
It is context injection.
830
00:30:09,760 –> 00:30:12,240
It changes behavior, it increases ambiguity.
831
00:30:12,240 –> 00:30:15,120
It can contaminate routing and cause the called agent
832
00:30:15,120 –> 00:30:17,280
to answer the wrong question confidently
833
00:30:17,280 –> 00:30:20,880
because it saw a prior topic and decided the user must mean that.
834
00:30:20,880 –> 00:30:22,640
So you treat context like privilege.
835
00:30:22,640 –> 00:30:25,680
You share the minimum required for the capability to execute.
836
00:30:25,680 –> 00:30:28,560
If the connected agent is performing a bounded operation,
837
00:30:28,560 –> 00:30:29,600
like looking up a record,
838
00:30:29,600 –> 00:30:31,280
creating a ticket, validating a policy,
839
00:30:31,280 –> 00:30:33,120
do not pass the entire conversation.
840
00:30:33,120 –> 00:30:35,920
Pass structured inputs, identifiers, normalized intent
841
00:30:35,920 –> 00:30:38,000
and only the fields required for that step.
842
00:30:38,000 –> 00:30:40,880
Give it the smallest possible window to improvise.
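A sketch of “context as privilege”: the parent builds a structured payload containing only the fields the step needs, and refuses to invoke if a prerequisite is missing. The operation names and field names are hypothetical.

```python
# Hypothetical allow-list: which fields each bounded operation may receive.
ALLOWED_FIELDS = {
    "ticket_lookup": {"ticket_id"},
    "invoice_check": {"invoice_id", "vendor_id"},
}

def build_payload(operation: str, extracted: dict) -> dict:
    """Pass only the minimum structured inputs, never the raw conversation."""
    allowed = ALLOWED_FIELDS[operation]
    missing = allowed - set(extracted)
    if missing:
        # Stop instead of improvising: the prerequisite is not met.
        raise ValueError(f"cannot invoke {operation}, missing {sorted(missing)}")
    return {k: v for k, v in extracted.items() if k in allowed}

# The full chat might mention HR topics, urgency, names...
extracted = {"invoice_id": "INV-1042", "vendor_id": "V-77", "chat_history": "..."}
print(build_payload("invoice_check", extracted))
# {'invoice_id': 'INV-1042', 'vendor_id': 'V-77'}  -- chat_history never crosses the boundary
```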
843
00:30:40,880 –> 00:30:43,760
If the connected agent is a knowledge Q&A capability,
844
00:30:43,760 –> 00:30:44,960
context can help,
845
00:30:44,960 –> 00:30:46,720
but only if you accept the trade-off.
846
00:30:46,720 –> 00:30:50,320
Better conversational continuity, lower reproducibility.
847
00:30:50,320 –> 00:30:54,240
When in doubt, isolate and never pass context across trust boundaries.
848
00:30:54,240 –> 00:30:56,720
HR context into finance is not helpful.
849
00:30:56,720 –> 00:30:59,520
It is a data leak waiting for a justification memo.
850
00:30:59,520 –> 00:31:02,880
Multi-intent queries are where orchestration either looks professional
851
00:31:02,880 –> 00:31:04,880
or collapses into conditional chaos.
852
00:31:04,880 –> 00:31:07,280
When a user asks “check invoice 1042
853
00:31:07,280 –> 00:31:08,880
and also update the vendor address
854
00:31:08,880 –> 00:31:11,520
and send a note to procurement,” that is not one request.
855
00:31:11,520 –> 00:31:12,560
That is three intents.
856
00:31:12,560 –> 00:31:15,360
If you let the orchestrator decide how to sequence that,
857
00:31:15,360 –> 00:31:17,440
you will eventually get out of order execution.
858
00:31:17,440 –> 00:31:18,880
The email sent before the update,
859
00:31:18,880 –> 00:31:21,120
the update attempted without validation,
860
00:31:21,120 –> 00:31:24,160
the wrong agent called because it latched onto the first noun.
861
00:31:24,160 –> 00:31:26,640
So the master agent must split work intentionally,
862
00:31:26,640 –> 00:31:29,040
detect multi-intent, decompose into ordered steps,
863
00:31:29,040 –> 00:31:30,560
apply gates per step,
864
00:31:30,560 –> 00:31:32,240
and only then invoke agents.
865
00:31:32,240 –> 00:31:35,680
Do not let one chat message become one execution graph.
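A minimal sketch of intentional decomposition: detect the separate intents, order them, and gate each step before any agent is invoked. The intent names, agent names, and gate labels are assumptions for illustration.

```python
# Hypothetical ordered plan for "check invoice 1042, update the vendor
# address, and send a note to procurement" -- three intents, not one request.
PLAN = [
    {"intent": "validate_invoice", "agent": "invoice_validation", "gate": None},
    {"intent": "update_vendor",    "agent": "vendor_master",      "gate": "validation_passed"},
    {"intent": "notify",           "agent": "communications",     "gate": "update_committed"},
]

def run(plan: list[dict]) -> None:
    state: set[str] = set()
    for step in plan:
        gate = step["gate"]
        if gate and gate not in state:
            print(f"STOP before {step['intent']}: gate '{gate}' not satisfied")
            return
        print(f"invoke {step['agent']} for {step['intent']}")
        # In real use the connected agent runs here; we only record the outcome.
        state.add({"validate_invoice": "validation_passed",
                   "update_vendor": "update_committed"}.get(step["intent"], "done"))

run(PLAN)  # validation -> update -> notify, never out of order
```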
866
00:31:35,680 –> 00:31:38,240
Practical guardrails are boring but effective.
867
00:31:38,240 –> 00:31:41,120
Explicit routing patterns in the master agent instructions,
868
00:31:41,120 –> 00:31:44,640
narrow agent descriptions, exclusion rules, and controlled context.
869
00:31:44,640 –> 00:31:46,720
It is not glamorous, it is deployable.
870
00:31:46,720 –> 00:31:48,720
Operational prerequisites: publish,
871
00:31:48,720 –> 00:31:51,120
connectable toggles, and the boring parts that fail first.
872
00:31:51,120 –> 00:31:53,600
This is where good architectures quietly fail.
873
00:31:53,600 –> 00:31:55,840
Not in the manifesto, not in the diagrams,
874
00:31:55,840 –> 00:31:58,240
in the toggles, the publish buttons,
875
00:31:58,240 –> 00:32:00,640
and the identity plumbing that nobody wants to own.
876
00:32:00,640 –> 00:32:02,880
Connected agents only function as governed services
877
00:32:02,880 –> 00:32:05,120
if you actually treat them like governed services.
878
00:32:05,120 –> 00:32:07,920
That starts with the prerequisites that feel beneath you
879
00:32:07,920 –> 00:32:09,360
but will still take you down.
880
00:32:09,360 –> 00:32:11,440
First, the agent must have a description,
881
00:32:11,440 –> 00:32:12,880
not because humans need it,
882
00:32:12,880 –> 00:32:14,720
because the orchestrator routes based on it.
883
00:32:14,720 –> 00:32:16,720
If you leave the description empty or vague,
884
00:32:16,720 –> 00:32:18,000
you’re not moving fast.
885
00:32:18,000 –> 00:32:21,520
You are making routing non-deterministic by design.
886
00:32:21,520 –> 00:32:23,040
You are instructing the system to guess.
887
00:32:23,040 –> 00:32:24,400
Pause.
888
00:32:24,400 –> 00:32:26,000
Guessing is not orchestration.
889
00:32:26,000 –> 00:32:28,160
Second, generative mode has to be enabled.
890
00:32:28,160 –> 00:32:29,680
This is not a philosophical setting.
891
00:32:29,680 –> 00:32:32,960
It’s a capability flag that determines whether the agent participates
892
00:32:32,960 –> 00:32:34,720
in this orchestration model at all.
893
00:32:34,720 –> 00:32:36,240
Organizations routinely miss this
894
00:32:36,240 –> 00:32:39,280
because they assume it’s an agent, therefore it’s agentic.
895
00:32:39,280 –> 00:32:40,800
No, it’s a configuration state.
896
00:32:40,800 –> 00:32:43,360
Third, the connected agent toggle must be enabled.
897
00:32:43,360 –> 00:32:45,440
“Let other agents connect to and use this one.”
898
00:32:45,440 –> 00:32:47,200
By default, it is often off.
899
00:32:47,200 –> 00:32:49,040
And the off-by-default posture is correct
900
00:32:49,040 –> 00:32:52,080
because reusable capability surface should not be accidental.
901
00:32:52,080 –> 00:32:54,560
But operationally, it means teams build the agent,
902
00:32:54,560 –> 00:32:55,840
demo it in isolation,
903
00:32:55,840 –> 00:32:57,760
and then wonder why the parent can’t see it.
904
00:32:57,760 –> 00:32:58,880
The system isn’t broken.
905
00:32:58,880 –> 00:33:01,520
Your design intent was never compiled into configuration.
906
00:33:01,520 –> 00:33:03,120
Fourth, the agent must be published.
907
00:33:03,120 –> 00:33:04,720
This is the part that irritates people
908
00:33:04,720 –> 00:33:06,480
because it introduces life cycle
909
00:33:06,480 –> 00:33:08,400
and life cycle introduces friction.
910
00:33:08,400 –> 00:33:11,680
Good, publish is the line between a draft and an enterprise surface.
911
00:33:11,680 –> 00:33:13,760
If you don’t publish, you don’t have a callable service.
912
00:33:13,760 –> 00:33:14,960
You have a personal experiment.
913
00:33:14,960 –> 00:33:15,840
Now, here’s the trap.
914
00:33:15,840 –> 00:33:18,080
Teams think these prerequisites are the work.
915
00:33:18,080 –> 00:33:18,880
They are not.
916
00:33:18,880 –> 00:33:20,240
They are admission to the work.
917
00:33:20,240 –> 00:33:21,600
Because once you publish,
918
00:33:21,600 –> 00:33:23,520
the operational reality arrives.
919
00:33:23,520 –> 00:33:24,640
Credentials and connections.
920
00:33:24,640 –> 00:33:28,880
Connected agents that call tools often require connection setup
921
00:33:28,880 –> 00:33:32,000
and those connections often need to be refreshed after publish.
922
00:33:32,000 –> 00:33:34,000
This shows up as the most humiliating failure mode
923
00:33:34,000 –> 00:33:35,280
in enterprise AI.
924
00:33:35,280 –> 00:33:36,640
The agent routes correctly.
925
00:33:36,640 –> 00:33:37,680
The logic is fine.
926
00:33:37,680 –> 00:33:39,120
The tool is correct.
927
00:33:39,120 –> 00:33:41,440
And execution fails because the connection manager
928
00:33:41,440 –> 00:33:43,840
wants you to click “sign in” again.
929
00:33:43,840 –> 00:33:44,560
Emphasis.
930
00:33:44,560 –> 00:33:45,760
That is not an edge case.
931
00:33:45,760 –> 00:33:48,880
That is the default state of loosely managed identities.
932
00:33:48,880 –> 00:33:49,920
So the rule is simple.
933
00:33:49,920 –> 00:33:53,040
Every connected capability must have a credential maintenance posture.
934
00:33:53,040 –> 00:33:53,600
Who owns it?
935
00:33:53,600 –> 00:33:54,400
How is it renewed?
936
00:33:54,400 –> 00:33:55,840
What happens when it expires?
937
00:33:55,840 –> 00:33:58,000
And what does the system do when it can’t execute?
938
00:33:58,000 –> 00:33:59,840
If the answer is “it’ll probably work,”
939
00:33:59,840 –> 00:34:00,640
then it won’t.
940
00:34:00,640 –> 00:34:01,520
Not at scale.
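A sketch of what a credential maintenance posture could look like as a record rather than a hope: an owner, a renewal window, and an explicit behavior on expiry. The capability name, owner, dates, and on_expiry value are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class CredentialPosture:
    """Hypothetical record: who owns the connection and what happens on expiry."""
    capability: str
    owner: str
    expires: date
    on_expiry: str = "disable_and_alert"   # never "probably fine"

    def check(self, today: date, warn_days: int = 14) -> str:
        if today >= self.expires:
            return f"{self.capability}: EXPIRED -> {self.on_expiry}"
        if today >= self.expires - timedelta(days=warn_days):
            return f"{self.capability}: renew soon, owner {self.owner}"
        return f"{self.capability}: healthy"

posture = CredentialPosture("vendor_validation", "finance-platform-team", date(2025, 7, 1))
print(posture.check(date(2025, 6, 25)))   # renew soon, owner finance-platform-team
print(posture.check(date(2025, 7, 2)))    # EXPIRED -> disable_and_alert
```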
941
00:34:01,520 –> 00:34:03,280
Naming is the next silent failure.
942
00:34:03,280 –> 00:34:05,280
People treat names as UI, not governance.
943
00:34:05,280 –> 00:34:06,400
Names are governance.
944
00:34:06,400 –> 00:34:08,080
Names survive longer than owners.
945
00:34:08,080 –> 00:34:10,480
Names end up in teams, in links, in documentation,
946
00:34:10,480 –> 00:34:13,120
in training, in screenshots, in executive decks.
947
00:34:13,120 –> 00:34:16,240
And renaming, as multiple demos have shown, is not always clean.
948
00:34:16,240 –> 00:34:19,120
So early naming mistakes become long-lived defects.
949
00:34:19,120 –> 00:34:21,200
This matters because your internal agent catalog
950
00:34:21,200 –> 00:34:22,880
becomes your control surface.
951
00:34:22,880 –> 00:34:25,520
If the names are unclear, the catalog becomes a rumor mill.
952
00:34:25,520 –> 00:34:27,040
Then there’s tenant reality.
953
00:34:27,040 –> 00:34:29,600
Approvals, propagation delays, and admin gating.
954
00:34:29,600 –> 00:34:32,640
Publishing an agent to teams or to organizational availability
955
00:34:32,640 –> 00:34:34,640
often requires admin approval.
956
00:34:34,640 –> 00:34:37,040
That means your deployment pipeline includes humans,
957
00:34:37,040 –> 00:34:38,640
humans introduce latency.
958
00:34:38,640 –> 00:34:40,640
Latency creates shadow deployments.
959
00:34:40,640 –> 00:34:43,440
People work around the process by sharing direct links,
960
00:34:43,440 –> 00:34:47,040
testing in private chats, and “just for now” enabling broad access.
961
00:34:47,040 –> 00:34:48,640
And every workaround becomes a precedent.
962
00:34:48,640 –> 00:34:51,680
This is why governance erodes, not because people hate governance,
963
00:34:51,680 –> 00:34:54,000
but because you designed governance as friction
964
00:34:54,000 –> 00:34:56,800
without giving them a deterministic path through it.
965
00:34:56,800 –> 00:34:58,400
So design around this friction.
966
00:34:58,400 –> 00:35:00,080
Stable endpoints matter.
967
00:35:00,080 –> 00:35:03,680
If your connected agent is a capability service, treat it like one.
968
00:35:03,680 –> 00:35:04,480
Version it.
969
00:35:04,480 –> 00:35:06,880
Publish v1, add v2, deprecate v1,
970
00:35:06,880 –> 00:35:10,080
don’t edit in place and pretend it’s harmless.
971
00:35:10,080 –> 00:35:12,960
Editing in place is how you create run-to-run variance across time
972
00:35:12,960 –> 00:35:16,080
and then you act surprised when last month’s process behaves differently
973
00:35:16,080 –> 00:35:16,960
this month.
974
00:35:16,960 –> 00:35:18,720
Rollout strategy matters.
975
00:35:18,720 –> 00:35:21,920
If an agent change can affect financial or identity workflows,
976
00:35:21,920 –> 00:35:24,160
it needs a rollout posture, limited audience,
977
00:35:24,160 –> 00:35:26,560
measured telemetry, and a rollback plan.
978
00:35:26,560 –> 00:35:29,920
Not because you’re paranoid, because you’re operating a decision engine.
979
00:35:29,920 –> 00:35:32,640
And yes, you need a kill switch mentality, not a delete button
980
00:35:32,640 –> 00:35:34,480
when the incident hits production.
981
00:35:34,480 –> 00:35:37,760
A deliberate ability to disable a capability surface quickly
982
00:35:37,760 –> 00:35:39,360
with known blast radius.
983
00:35:39,360 –> 00:35:41,680
That’s what makes the deterministic core credible.
984
00:35:41,680 –> 00:35:44,880
The uncomfortable truth is that these boring parts
985
00:35:44,880 –> 00:35:47,680
are where multi-agent orchestration becomes enterprise software
986
00:35:47,680 –> 00:35:50,080
or becomes another demo graveyard.
987
00:35:50,080 –> 00:35:52,240
You don’t earn ROI by shipping agents.
988
00:35:52,240 –> 00:35:54,800
You earn ROI by keeping them callable,
989
00:35:54,800 –> 00:35:57,120
governable, observable, and reversible.
990
00:35:58,240 –> 00:36:03,200
Case study: Joiner-Mover-Leaver (JML), the identity life cycle as an anti-hallucination test.
991
00:36:03,200 –> 00:36:05,520
Joiner-Mover-Leaver is the fastest way to find out
992
00:36:05,520 –> 00:36:07,680
whether your multi-agent architecture is real
993
00:36:07,680 –> 00:36:10,720
or whether you’ve just built a persuasive auto-complete layer
994
00:36:10,720 –> 00:36:12,640
on top of a fragile process.
995
00:36:12,640 –> 00:36:15,920
JML is brutal because it is audited, dependency heavy,
996
00:36:15,920 –> 00:36:18,240
and intolerant of creative interpretation.
997
00:36:18,240 –> 00:36:22,640
A Joiner event touches HR data, identity creation, access assignment, licensing,
998
00:36:22,640 –> 00:36:24,960
mailbox provisioning, device enrollment,
999
00:36:24,960 –> 00:36:28,160
application entitlements, and often privileged role eligibility.
1000
00:36:28,160 –> 00:36:32,160
A Mover event touches least privilege and role drift.
1001
00:36:32,160 –> 00:36:35,120
A Leaver event touches revocation, retention, legal hold,
1002
00:36:35,120 –> 00:36:37,360
and the operational reality that accounts
1003
00:36:37,360 –> 00:36:40,400
linger unless something forces them to stop existing.
1004
00:36:40,400 –> 00:36:42,640
This is why JML is an anti-hallucination test.
1005
00:36:42,640 –> 00:36:43,760
It punishes helpful.
1006
00:36:43,760 –> 00:36:46,080
If your system behaves probabilistically
1007
00:36:46,080 –> 00:36:48,720
at the execution layer, JML doesn’t fail loudly.
1008
00:36:48,720 –> 00:36:52,320
It fails quietly and quiet failures in identity are not bugs.
1009
00:36:52,320 –> 00:36:54,080
They are breaches with better grammar.
1010
00:36:54,080 –> 00:36:56,240
Here’s what non-deterministic failure looks like
1011
00:36:56,240 –> 00:36:58,000
and it’s always the same shape.
1012
00:36:58,000 –> 00:37:00,480
Out-of-order provisioning: the system creates the account
1013
00:37:00,480 –> 00:37:03,520
and assigns roles before a background check gate clears
1014
00:37:03,520 –> 00:37:06,800
or it assigns access before the manager approval exists
1015
00:37:06,800 –> 00:37:08,480
or it grants a default starter bundle
1016
00:37:08,480 –> 00:37:11,200
because the request sounded urgent and “we’ll fix it later.”
1017
00:37:11,200 –> 00:37:12,400
Later never arrives.
1018
00:37:12,400 –> 00:37:16,400
Skipped approvals: the model interprets an email thread
1019
00:37:16,400 –> 00:37:20,400
as implicit approval or it reads a team’s message as consent
1020
00:37:20,400 –> 00:37:23,840
or it sees please proceed from someone who isn’t an approver
1021
00:37:23,840 –> 00:37:26,960
and because it can produce a coherent explanation nobody notices
1022
00:37:26,960 –> 00:37:29,520
until an audit asks for the approval artifact.
1023
00:37:29,520 –> 00:37:32,480
Role guessing: the system infers job function from a title
1024
00:37:32,480 –> 00:37:35,280
and assigns access packages that were never requested.
1025
00:37:35,280 –> 00:37:37,200
This happens because titles are messy,
1026
00:37:37,200 –> 00:37:40,240
HR data is inconsistent and the model tries to be useful.
1027
00:37:40,240 –> 00:37:42,240
It is being useful, that’s the problem.
1028
00:37:42,240 –> 00:37:44,640
Lingering elevation: movers keep old access,
1029
00:37:44,640 –> 00:37:46,080
leavers keep dormant accounts,
1030
00:37:46,080 –> 00:37:47,920
privilege eligibility remains assigned
1031
00:37:47,920 –> 00:37:50,720
because the deprovisioning step ran most of the way
1032
00:37:50,720 –> 00:37:53,440
and the system reported success in narrative form.
1033
00:37:53,840 –> 00:37:56,880
Now take the worst failure mode, the slow, silent breach.
1034
00:37:56,880 –> 00:37:59,760
The system performs a non-compliant action
1035
00:37:59,760 –> 00:38:02,000
and then produces a compliant sounding narrative
1036
00:38:02,000 –> 00:38:04,640
about what it did. It writes the post-mortem for you
1037
00:38:04,640 –> 00:38:07,040
in advance while still being wrong. Pause.
1038
00:38:07,040 –> 00:38:11,600
This is where teams confuse observability with storytelling.
1039
00:38:11,600 –> 00:38:13,840
If your trace is prose, you don’t have a trace,
1040
00:38:13,840 –> 00:38:15,200
you have plausible deniability.
1041
00:38:15,200 –> 00:38:18,480
So what does deterministic orchestration look like for JML?
1042
00:38:18,480 –> 00:38:20,720
It looks like a boring path with enforced gates
1043
00:38:20,720 –> 00:38:22,480
and AI only inside the steps.
1044
00:38:23,120 –> 00:38:26,240
HR validation is first. Not “the user said they’re hired,”
1045
00:38:26,240 –> 00:38:28,160
but a validated record from the system of record.
1046
00:38:28,160 –> 00:38:30,480
If the HR system is missing data, the process stops.
1047
00:38:30,480 –> 00:38:31,600
It does not improvise.
1048
00:38:31,600 –> 00:38:34,240
Background check gate is second, if applicable.
1049
00:38:34,240 –> 00:38:36,480
If it’s not cleared, the process does not proceed
1050
00:38:36,480 –> 00:38:37,600
to identity creation.
1051
00:38:37,600 –> 00:38:40,240
No exceptions; exceptions become entitlements.
1052
00:38:40,240 –> 00:38:41,840
Manager approval gate is third.
1053
00:38:41,840 –> 00:38:43,680
Approval is an object, not a vibe.
1054
00:38:43,680 –> 00:38:45,760
If the manager hasn’t approved, the process stops.
1055
00:38:45,760 –> 00:38:47,440
The system can draft the approval request.
1056
00:38:47,440 –> 00:38:48,720
It cannot infer approval.
1057
00:38:48,720 –> 00:38:50,240
Create identity is fourth.
1058
00:38:50,240 –> 00:38:52,400
Only now does the system create the user,
1059
00:38:52,400 –> 00:38:55,440
apply baseline attributes, and establish the authoritative identifier.
1060
00:38:55,440 –> 00:38:57,360
This step is deterministic.
1061
00:38:57,360 –> 00:38:59,680
Same input, same output, same logs.
1062
00:38:59,680 –> 00:39:01,840
Mapping roles and access packages is fifth.
1063
00:39:01,840 –> 00:39:05,040
This is where teams get tempted to let the model assign what makes sense.
1064
00:39:05,040 –> 00:39:07,040
Don’t. You use a deterministic mapping:
1065
00:39:07,040 –> 00:39:08,560
job code to access package,
1066
00:39:08,560 –> 00:39:10,000
department to baseline groups,
1067
00:39:10,000 –> 00:39:11,200
location to licensing,
1068
00:39:11,200 –> 00:39:13,440
and explicit exceptions routed to a queue.
1069
00:39:13,440 –> 00:39:16,800
AI can help classify ambiguous requests into known categories,
1070
00:39:16,800 –> 00:39:19,120
but the final assignment must be policy driven,
1071
00:39:19,120 –> 00:39:20,320
not narrative driven.
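A sketch of policy-driven assignment: job code and department resolve through a fixed table, and anything unmapped goes to an exception queue instead of being guessed. The job codes, package names, and queue name are made up for illustration.

```python
# Hypothetical policy tables: deterministic, reviewable, versionable.
ACCESS_BY_JOB_CODE = {"FIN-ANALYST-2": "pkg_finance_core",
                      "HR-GENERALIST-1": "pkg_hr_baseline"}
GROUPS_BY_DEPARTMENT = {"Finance": ["grp_finance_all"], "HR": ["grp_hr_all"]}

def assign_access(job_code: str, department: str) -> dict:
    """Final assignment is a table lookup, not a narrative. Unknowns stop the flow."""
    if job_code not in ACCESS_BY_JOB_CODE:
        return {"status": "exception", "route_to": "access_review_queue",
                "reason": f"unmapped job code {job_code}"}
    return {"status": "assigned",
            "access_package": ACCESS_BY_JOB_CODE[job_code],
            "groups": GROUPS_BY_DEPARTMENT.get(department, [])}

print(assign_access("FIN-ANALYST-2", "Finance"))
print(assign_access("GROWTH-NINJA-7", "Marketing"))  # -> exception queue, no guessing
```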
1072
00:39:20,320 –> 00:39:21,680
Provision apps is sixth.
1073
00:39:21,680 –> 00:39:25,280
Each app provisioning call is a privileged action with its own tool scope.
1074
00:39:25,280 –> 00:39:27,200
The master agent sequences them and logs them.
1075
00:39:27,200 –> 00:39:31,120
Connected agents perform the bounded operations per system.
1076
00:39:31,120 –> 00:39:32,960
Failures are explicit and retryable.
1077
00:39:32,960 –> 00:39:34,640
No silent partial success.
1078
00:39:34,640 –> 00:39:36,400
Verify least privilege is seventh.
1079
00:39:36,400 –> 00:39:38,160
You don’t end JML at provisioned.
1080
00:39:38,160 –> 00:39:39,680
You end it at verified.
1081
00:39:39,680 –> 00:39:41,600
The system compares assigned entitlements
1082
00:39:41,600 –> 00:39:44,320
to the policy baseline and flags drift immediately.
1083
00:39:44,320 –> 00:39:46,160
That’s the control plane doing its job.
1084
00:39:46,160 –> 00:39:49,200
Log closure is eighth: intent, decision, action, outcome.
1085
00:39:49,200 –> 00:39:53,120
Every gate, every approval reference, every tool call, a trace you can replay.
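A compressed sketch of the gate sequence just described: each gate evaluates a fact about the run, the process halts at the first unmet gate, and every check lands in the trace. The record fields and gate names are assumptions, not a product schema.

```python
# Hypothetical joiner run: gates evaluate facts, never narratives.
def gate_hr_record(run):        return run.get("hr_record_validated") is True
def gate_background_check(run): return run.get("background_check") == "cleared"
def gate_manager_approval(run): return bool(run.get("approval_artifact_id"))

GATES = [("hr_validation", gate_hr_record),
         ("background_check", gate_background_check),
         ("manager_approval", gate_manager_approval)]

def run_joiner(run: dict) -> list[dict]:
    trace = []
    for name, check in GATES:
        passed = check(run)
        trace.append({"gate": name, "passed": passed})
        if not passed:
            trace.append({"action": "halt", "reason": f"{name} not satisfied"})
            return trace            # stop; do not improvise past a gate
    trace.append({"action": "create_identity", "outcome": "deterministic step runs"})
    return trace

incomplete = {"hr_record_validated": True, "background_check": "pending"}
for entry in run_joiner(incomplete):
    print(entry)   # halts at background_check, and the halt itself is evidence
```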
1086
00:39:53,120 –> 00:39:54,720
And this is where AI still helps, safely.
1087
00:39:54,720 –> 00:39:57,280
It extracts intent from messy HR tickets.
1088
00:39:57,280 –> 00:39:58,640
It summarizes what’s missing.
1089
00:39:58,640 –> 00:40:00,080
It drafts communications.
1090
00:40:00,080 –> 00:40:04,080
It proposes which access package a request might align with as a recommendation.
1091
00:40:04,080 –> 00:40:08,800
It triages exceptions and routes them to the right human queue with context.
1092
00:40:08,800 –> 00:40:10,960
But it does not execute privilege by inference.
1093
00:40:10,960 –> 00:40:13,840
Because JML is where close enough becomes unauthorized.
1094
00:40:13,840 –> 00:40:16,480
If your multi-agent system can pass JML repeatedly,
1095
00:40:16,480 –> 00:40:19,760
auditably, with bounded variance, then you have a deployable architecture.
1096
00:40:19,760 –> 00:40:21,440
If it can’t, you don’t need more agents.
1097
00:40:21,440 –> 00:40:23,440
You need authority enforced by design.
1098
00:40:23,440 –> 00:40:26,560
Alternate case study: invoice-to-pay,
1099
00:40:26,560 –> 00:40:29,040
three-way match and why helpful approves fraud.
1100
00:40:29,040 –> 00:40:33,600
Invoice-to-pay looks boring until you realize it’s one of the cleanest demonstrations
1101
00:40:33,600 –> 00:40:35,600
of why helpful is not a control model.
1102
00:40:35,600 –> 00:40:37,600
A three-way match exists because procurement,
1103
00:40:37,600 –> 00:40:40,320
receiving, and accounts payable do not trust each other’s inputs.
1104
00:40:40,320 –> 00:40:41,360
That isn’t dysfunction.
1105
00:40:41,360 –> 00:40:43,840
It’s segregation of duties expressed as process.
1106
00:40:43,840 –> 00:40:46,000
You match the purchase order, the goods receipt,
1107
00:40:46,000 –> 00:40:47,040
and the invoice.
1108
00:40:47,040 –> 00:40:49,440
If they align within defined tolerances, you pay;
1109
00:40:49,440 –> 00:40:51,840
if they don’t, you route to exception handling.
1110
00:40:51,840 –> 00:40:56,240
The system is allowed to be slow here because the system is preventing you from paying the wrong entity
1111
00:40:56,240 –> 00:40:58,560
for the wrong thing with the wrong authorization.
1112
00:40:58,560 –> 00:41:02,320
Now introduce a multi agent orchestration layer with helpful defaults
1113
00:41:02,320 –> 00:41:03,600
and watch what happens.
1114
00:41:03,600 –> 00:41:06,000
The first failure mode is tolerance bypass.
1115
00:41:06,000 –> 00:41:08,320
The user asks, can you just get this paid today?
1116
00:41:08,320 –> 00:41:11,600
The model sees urgency, sees a relationship, sees a supplier name,
1117
00:41:11,600 –> 00:41:13,040
and it tries to be useful.
1118
00:41:13,040 –> 00:41:14,880
It decides the mismatch is probably fine.
1119
00:41:14,880 –> 00:41:18,000
It drafts the justification, it routes around the exception queue
1120
00:41:18,000 –> 00:41:19,600
and because the narrative is coherent,
1121
00:41:19,600 –> 00:41:21,440
the breach looks like productivity.
1122
00:41:21,440 –> 00:41:23,840
The second failure mode is exception laundering.
1123
00:41:23,840 –> 00:41:26,880
In mature finance processes, exceptions are where policy lives.
1124
00:41:26,880 –> 00:41:30,800
Price variance, quantity variance, duplicate invoice detection,
1125
00:41:30,800 –> 00:41:33,760
supplier bank change verification, tax handling,
1126
00:41:33,760 –> 00:41:35,280
and spend category controls.
1127
00:41:35,280 –> 00:41:39,440
If an agent can resolve exceptions by improvisation, it will.
1128
00:41:39,440 –> 00:41:41,840
It will normalize, it will smooth, it will fix.
1129
00:41:41,840 –> 00:41:44,800
And what it’s actually doing is converting deterministic gates
1130
00:41:44,800 –> 00:41:46,480
into probabilistic persuasion.
1131
00:41:46,480 –> 00:41:48,480
The third failure mode is the most dangerous,
1132
00:41:48,480 –> 00:41:50,800
persuasive notes overriding policy.
1133
00:41:50,800 –> 00:41:53,200
The model outputs a convincing explanation
1134
00:41:53,200 –> 00:41:54,800
that sounds like it followed the process.
1135
00:41:54,800 –> 00:41:56,800
It might even reference the right documents,
1136
00:41:56,800 –> 00:41:59,920
but unless you can reproduce the exact evaluation path,
1137
00:41:59,920 –> 00:42:02,560
match results, thresholds, approvals, and tool calls
1138
00:42:02,560 –> 00:42:05,600
under audit, you cannot prove that payment was allowed.
1139
00:42:05,600 –> 00:42:07,680
And finance does not operate on trust me.
1140
00:42:07,680 –> 00:42:11,120
This is why reproducibility matters here more than almost anywhere else.
1141
00:42:11,120 –> 00:42:13,520
If an auditor asks why did you pay this invoice,
1142
00:42:13,520 –> 00:42:16,640
you must be able to rerun the decision against the same policy
1143
00:42:16,640 –> 00:42:18,000
and show the same outcome.
1144
00:42:18,000 –> 00:42:19,920
Not the same words, the same allowed action.
1145
00:42:19,920 –> 00:42:21,920
So the deterministic fixes are not exotic.
1146
00:42:21,920 –> 00:42:24,640
They’re just unpopular because they remove improvisation.
1147
00:42:24,640 –> 00:42:26,960
Mandatory matching rules: payment does not proceed
1148
00:42:26,960 –> 00:42:29,520
unless the three-way match passes within a defined tolerance
1149
00:42:29,520 –> 00:42:31,760
with the tolerance value stored as a policy object
1150
00:42:31,760 –> 00:42:32,880
not embedded in a prompt.
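A sketch of the tolerance as a policy object rather than prompt text: the match either passes within the stored tolerance or routes to an exception queue. The 2% figure, field names, and queue owner are illustrative assumptions.

```python
# Hypothetical policy object, stored and versioned outside any prompt.
POLICY = {"price_tolerance_pct": 2.0}

def three_way_match(po_amount: float, receipt_amount: float, invoice_amount: float) -> dict:
    """Pay only if PO, goods receipt, and invoice agree within tolerance."""
    tol = POLICY["price_tolerance_pct"] / 100.0
    baseline = po_amount
    deviations = [abs(receipt_amount - baseline), abs(invoice_amount - baseline)]
    within = all(d <= baseline * tol for d in deviations)
    if within:
        return {"disposition": "approved_for_payment",
                "tolerance_pct": POLICY["price_tolerance_pct"]}
    return {"disposition": "exception_queue", "owner": "accounts_payable_exceptions"}

print(three_way_match(10_000, 10_000, 10_150))  # within 2% -> approved
print(three_way_match(10_000, 10_000, 11_900))  # outside -> exception queue, no narrative override
```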
1151
00:42:32,880 –> 00:42:37,360
Exception queues: mismatches route to a queue with explicit ownership.
1152
00:42:37,360 –> 00:42:39,520
The system can summarize the discrepancy
1153
00:42:39,520 –> 00:42:41,520
and propose a likely resolution.
1154
00:42:41,520 –> 00:42:43,680
The system cannot resolve it by narrative.
1155
00:42:43,680 –> 00:42:44,800
Approval policies.
1156
00:42:44,800 –> 00:42:47,680
Where policy demands approval, approval is required.
1157
00:42:47,680 –> 00:42:51,360
Not inferred, not implied, not “the vendor emailed our controller.”
1158
00:42:51,360 –> 00:42:54,640
An approval artifact with identity, timestamp, and scope.
1159
00:42:54,640 –> 00:42:55,920
Logged outcomes.
1160
00:42:55,920 –> 00:42:57,840
Every run must emit a trace.
1161
00:42:57,840 –> 00:43:02,400
Invoice ID, PO ID, receipt ID, match result, variance categories,
1162
00:43:02,400 –> 00:43:06,240
thresholds applied, approvals referenced, tools called, and final disposition.
1163
00:43:06,240 –> 00:43:08,480
If the trace is missing, the workflow did not complete.
1164
00:43:08,480 –> 00:43:11,280
That is how you prevent silent partial automation.
1165
00:43:11,280 –> 00:43:14,880
Now, where multi-agent helps, when you keep authority deterministic,
1166
00:43:14,880 –> 00:43:17,280
is in specialization. You can have an extraction agent
1167
00:43:17,280 –> 00:43:20,000
that pulls structured fields from invoices reliably.
1168
00:43:20,000 –> 00:43:22,080
You can have a vendor validation agent
1169
00:43:22,080 –> 00:43:24,720
that checks supplier status and bank change history.
1170
00:43:24,720 –> 00:43:27,520
You can have a policy lookup agent that retrieves
1171
00:43:27,520 –> 00:43:29,920
the actual tolerance rules and approval matrix.
1172
00:43:29,920 –> 00:43:32,000
You can have a communications agent that drafts
1173
00:43:32,000 –> 00:43:34,400
exception notices to procurement or the vendor
1174
00:43:34,400 –> 00:43:36,000
using consistent templates.
1175
00:43:36,000 –> 00:43:39,280
You can even have a reconciliation agent that assembles the full context
1176
00:43:39,280 –> 00:43:42,000
for a human approver, what changed, what mismatched,
1177
00:43:42,000 –> 00:43:44,160
and what the recommended remediation is.
1178
00:43:44,160 –> 00:43:45,040
That’s leverage.
1179
00:43:45,040 –> 00:43:48,240
But none of those agents should be allowed to approve payment by being convincing.
1180
00:43:48,240 –> 00:43:50,400
This is where ROI stops being theoretical.
1181
00:43:50,400 –> 00:43:53,680
If you enforce deterministic gates, you get measurable outcomes.
1182
00:43:53,680 –> 00:43:58,000
Faster cycle time on clean matches, fewer escalations due to better triage,
1183
00:43:58,000 –> 00:44:00,720
reduced manual effort in exception preparation,
1184
00:44:00,720 –> 00:44:04,480
and a completion rate you can actually report without embarrassment.
1185
00:44:04,480 –> 00:44:07,440
You can attribute cost per run to specific capabilities.
1186
00:44:07,440 –> 00:44:11,280
You can quantify exception rates, you can show decreased time to resolution.
1187
00:44:11,280 –> 00:44:13,920
And when something goes wrong, which it will, you can isolate it,
1188
00:44:13,920 –> 00:44:15,760
you can disable the vendor validation agent
1189
00:44:15,760 –> 00:44:17,520
without killing invoice capture,
1190
00:44:17,520 –> 00:44:19,200
you can roll back a policy lookup change
1191
00:44:19,200 –> 00:44:22,160
without rewriting the entire workflow, you can prove what happened.
1192
00:44:22,160 –> 00:44:24,160
That’s the architecture paying for itself.
1193
00:44:24,160 –> 00:44:27,440
Because in finance, helpful is how fraud gets a signature.
1194
00:44:27,440 –> 00:44:29,600
Determinism is how you prevent it.
1195
00:44:29,600 –> 00:44:33,280
The black box: you don’t govern intelligence, you govern outcomes.
1196
00:44:33,280 –> 00:44:35,840
Enterprise AI doesn’t require transparent reasoning.
1197
00:44:35,840 –> 00:44:37,840
It requires deterministic authority.
1198
00:44:37,840 –> 00:44:39,280
The black box complaint is valid.
1199
00:44:39,280 –> 00:44:40,640
It’s also misaimed.
1200
00:44:40,640 –> 00:44:43,200
Most organizations ask, why did it think that?
1201
00:44:43,200 –> 00:44:45,520
Because that’s how humans interrogate other humans.
1202
00:44:45,520 –> 00:44:48,000
But an LLM is not a coworker with motives.
1203
00:44:48,000 –> 00:44:52,000
It is a probabilistic engine that generates plausible completions under constraints,
1204
00:44:52,000 –> 00:44:53,920
given context you often forgot you provided.
1205
00:44:53,920 –> 00:44:56,880
So when a multi-agent system produces the wrong outcome,
1206
00:44:56,880 –> 00:44:59,120
the question is not what was it thinking.
1207
00:44:59,120 –> 00:45:00,320
That’s a comforting question,
1208
00:45:00,320 –> 00:45:02,480
because it implies the answer will be understandable
1209
00:45:02,480 –> 00:45:03,760
and therefore fixable.
1210
00:45:03,760 –> 00:45:05,360
The uncomfortable question is simpler,
1211
00:45:05,360 –> 00:45:06,960
and it actually maps to control.
1212
00:45:06,960 –> 00:45:08,160
Why was it allowed to do that?
1213
00:45:08,160 –> 00:45:10,240
That distinction matters because why it thought
1214
00:45:10,240 –> 00:45:11,920
will never be fully reproducible.
1215
00:45:11,920 –> 00:45:14,880
The same prompt with the same data can still produce variance.
1216
00:45:14,880 –> 00:45:18,000
Routing can drift, retrieval can return different chunks,
1217
00:45:18,000 –> 00:45:20,000
and the model can take different reasoning paths
1218
00:45:20,000 –> 00:45:21,200
that still look coherent.
1219
00:45:21,200 –> 00:45:24,000
If you bet your governance posture on perfect explainability,
1220
00:45:24,000 –> 00:45:26,160
you’re building a compliance program on quicksand.
1221
00:45:26,160 –> 00:45:29,440
You don’t need explainability to deploy software safely.
1222
00:45:29,440 –> 00:45:32,160
You need enforceable constraints, observable actions,
1223
00:45:32,160 –> 00:45:34,560
and an audit trail that survives scrutiny.
1224
00:45:34,560 –> 00:45:37,040
That’s what enterprise architecture has always been about,
1225
00:45:37,040 –> 00:45:38,880
even when the marketing language changes.
1226
00:45:38,880 –> 00:45:40,320
So the trust model shifts.
1227
00:45:40,320 –> 00:45:41,600
You stop demanding mind reading
1228
00:45:41,600 –> 00:45:43,440
and you start demanding system control.
1229
00:45:43,440 –> 00:45:45,440
Intent must be captured explicitly.
1230
00:45:45,440 –> 00:45:46,960
Decision points must be recorded.
1231
00:45:46,960 –> 00:45:48,000
Actions must be gated.
1232
00:45:48,000 –> 00:45:49,520
Outcomes must be verifiable.
1233
00:45:49,520 –> 00:45:52,720
That’s the whole shape of governance in an agentic system
1234
00:45:52,720 –> 00:45:54,880
and it’s why the deterministic core exists.
1235
00:45:54,880 –> 00:45:57,200
Now, a lot of teams try to solve the black box
1236
00:45:57,200 –> 00:45:59,200
with better prompts and stricter instructions.
1237
00:45:59,200 –> 00:46:00,320
That’s not governance.
1238
00:46:00,320 –> 00:46:02,400
That’s wishful thinking with formatting.
1239
00:46:02,400 –> 00:46:04,240
Prompts can reduce certain failure modes.
1240
00:46:04,240 –> 00:46:05,440
They can tighten behavior.
1241
00:46:05,440 –> 00:46:06,960
They can improve grounding.
1242
00:46:06,960 –> 00:46:09,120
But prompts are not an enforcement mechanism.
1243
00:46:09,120 –> 00:46:11,600
They are a suggestion to a probabilistic engine.
1244
00:46:11,600 –> 00:46:13,120
And the moment the system is allowed
1245
00:46:13,120 –> 00:46:16,160
to execute privileged actions based on suggestions,
1246
00:46:16,160 –> 00:46:17,200
you have already lost.
1247
00:46:17,200 –> 00:46:18,960
So govern outcomes, not intelligence.
1248
00:46:18,960 –> 00:46:22,240
In practice, that means you define what allowed looks like.
1249
00:46:22,240 –> 00:46:24,880
Which tools can be used under which identities,
1250
00:46:24,880 –> 00:46:27,120
with which prerequisites and with which approvals.
1251
00:46:27,120 –> 00:46:29,280
You define what a valid state transition looks like.
1252
00:46:29,920 –> 00:46:32,880
You define what must be true before the next step can occur.
1253
00:46:32,880 –> 00:46:36,240
And you require structured outputs at boundaries
1254
00:46:36,240 –> 00:46:39,040
so that gates can evaluate facts, not vibes.
1255
00:46:39,040 –> 00:46:40,720
This is also where people get uncomfortable
1256
00:46:40,720 –> 00:46:42,160
about sandboxing reasoning.
1257
00:46:42,160 –> 00:46:42,640
Good.
1258
00:46:42,640 –> 00:46:44,880
Discomfort is usually a sign that you’re finally seeing
1259
00:46:44,880 –> 00:46:46,080
the real problem.
1260
00:46:46,080 –> 00:46:48,000
Sandboxing means the model can propose
1261
00:46:48,000 –> 00:46:49,360
but it cannot commit.
1262
00:46:49,360 –> 00:46:51,040
It can draft but it cannot send.
1263
00:46:51,040 –> 00:46:52,960
It can classify but it cannot authorize.
1264
00:46:52,960 –> 00:46:55,920
It can summarize but it cannot mutate a system of record.
1265
00:46:55,920 –> 00:46:58,320
The deterministic core receives the proposal,
1266
00:46:58,320 –> 00:47:01,040
validates it against policy, checks approvals,
1267
00:47:01,040 –> 00:47:02,880
and then decides whether execution is allowed.
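A minimal sketch of “propose but not commit”: the model’s output is treated as a proposal object, and only the deterministic core, after checking policy and approvals, performs the privileged call. The allowed actions, approval check, and proposal fields are hypothetical.

```python
# Hypothetical boundary between probabilistic proposal and deterministic authority.
ALLOWED_ACTIONS = {"draft_email", "create_ticket"}        # what policy permits at all
REQUIRES_APPROVAL = {"create_ticket"}                     # what needs an approval artifact

def validate(proposal: dict, approvals: set[str]) -> tuple[bool, str]:
    action = proposal["action"]
    if action not in ALLOWED_ACTIONS:
        return False, f"action '{action}' is not allowed"
    if action in REQUIRES_APPROVAL and proposal["id"] not in approvals:
        return False, "approval artifact missing"
    return True, "allowed"

def control_plane(proposal: dict, approvals: set[str]) -> str:
    ok, reason = validate(proposal, approvals)
    # Only the control plane executes; the model never touches the tool directly.
    return f"EXECUTE {proposal['action']}" if ok else f"BLOCK ({reason})"

print(control_plane({"id": "p1", "action": "grant_admin_role"}, set()))   # BLOCK
print(control_plane({"id": "p2", "action": "create_ticket"}, {"p2"}))     # EXECUTE
```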
1268
00:47:02,880 –> 00:47:05,200
The reason this works is boring.
1269
00:47:05,200 –> 00:47:06,720
The system becomes testable again.
1270
00:47:06,720 –> 00:47:08,560
You can replay a run and evaluate
1271
00:47:08,560 –> 00:47:10,880
whether the gates would have permitted the same actions.
1272
00:47:10,880 –> 00:47:12,320
You can measure exception rates.
1273
00:47:12,320 –> 00:47:14,400
You can isolate variance inside a step
1274
00:47:14,400 –> 00:47:17,120
without letting variance rewrite your environment.
1275
00:47:17,120 –> 00:47:19,280
And yes, you still accept that the reasoning itself
1276
00:47:19,280 –> 00:47:20,240
is probabilistic.
1277
00:47:20,240 –> 00:47:21,200
That’s not a defect.
1278
00:47:21,200 –> 00:47:21,840
That’s the product.
1279
00:47:21,840 –> 00:47:23,520
The mistake is letting probabilistic reasoning
1280
00:47:23,520 –> 00:47:25,120
become the authority layer.
1281
00:47:25,120 –> 00:47:27,280
This clicked for cloud architects years ago.
1282
00:47:27,280 –> 00:47:29,280
Even if they don’t call it the same thing.
1283
00:47:29,280 –> 00:47:31,040
Nobody secured cloud by understanding
1284
00:47:31,040 –> 00:47:32,480
every hypervisor instruction.
1285
00:47:32,480 –> 00:47:34,000
That was never the requirement.
1286
00:47:34,000 –> 00:47:36,080
What made cloud deployable was control planes,
1287
00:47:36,080 –> 00:47:38,720
policy engines, identity boundaries, and logs.
1288
00:47:38,720 –> 00:47:40,560
In other words, deterministic authority
1289
00:47:40,560 –> 00:47:42,240
around a complex substrate.
1290
00:47:42,240 –> 00:47:44,800
Agentic systems are the same category of problem.
1291
00:47:44,800 –> 00:47:47,360
And the black box objection often hides a deeper fear.
1292
00:47:47,360 –> 00:47:49,760
If I can’t explain it, I can’t be accountable for it.
1293
00:47:49,760 –> 00:47:50,960
That fear is rational.
1294
00:47:50,960 –> 00:47:53,280
The fix is not pretending the model is explainable.
1295
00:47:53,280 –> 00:47:54,960
The fix is building a system where
1296
00:47:54,960 –> 00:47:57,360
accountability attaches to what the system allowed,
1297
00:47:57,360 –> 00:47:58,640
not what the model generated.
1298
00:47:58,640 –> 00:48:01,200
So you implement boundaries, you log every decision,
1299
00:48:01,200 –> 00:48:03,200
you require approvals as objects,
1300
00:48:03,200 –> 00:48:05,600
you enforce least privilege, you design kill switches,
1301
00:48:05,600 –> 00:48:06,720
you measure outcomes.
1302
00:48:06,720 –> 00:48:08,480
And you accept that the model is probabilistic
1303
00:48:08,480 –> 00:48:10,000
because you’ve removed its ability
1304
00:48:10,000 –> 00:48:12,080
to silently expand blast radius.
1305
00:48:12,080 –> 00:48:12,960
That’s not surrender.
1306
00:48:12,960 –> 00:48:14,080
That’s architecture.
1307
00:48:14,080 –> 00:48:16,080
And when you do it, you get the only kind of trust
1308
00:48:16,080 –> 00:48:18,560
that matters in enterprise environments.
1309
00:48:18,560 –> 00:48:21,040
The trust that the system cannot do the wrong thing,
1310
00:48:21,040 –> 00:48:23,760
even when the model says something convincingly wrong.
1311
00:48:23,760 –> 00:48:26,800
Deployment posture: build a governed agent catalog, not a zoo.
1312
00:48:26,800 –> 00:48:30,400
The deployment posture that unlocks ROI is not ship more agents.
1313
00:48:30,400 –> 00:48:32,480
It is build a governed agent catalog.
1314
00:48:32,480 –> 00:48:34,640
A catalog is not a directory of clever toys.
1315
00:48:34,640 –> 00:48:36,960
It is an internal platform surface.
1316
00:48:36,960 –> 00:48:39,600
Owned capabilities, explicit contracts,
1317
00:48:39,600 –> 00:48:42,000
versioned change, measurable outcomes,
1318
00:48:42,000 –> 00:48:44,960
and the ability to disable something quickly when it drifts.
1319
00:48:44,960 –> 00:48:47,040
The difference between a catalog and a zoo is simple.
1320
00:48:47,040 –> 00:48:50,240
In a catalog, every capability has an owner and a life cycle.
1321
00:48:50,240 –> 00:48:52,720
In a zoo, everything has a demo and a graveyard.
1322
00:48:52,720 –> 00:48:56,080
Start by treating connected agents as enterprise services.
1323
00:48:56,080 –> 00:48:58,480
That means every connected capability must have:
1324
00:48:58,480 –> 00:49:01,840
An accountable owner, a description that acts as routing signal,
1325
00:49:01,840 –> 00:49:04,880
a defined security posture, and an operational runbook.
1326
00:49:04,880 –> 00:49:06,560
Not an aspirational wiki.
1327
00:49:06,560 –> 00:49:09,200
A real posture, how it authenticates, what it can do,
1328
00:49:09,200 –> 00:49:11,600
what it cannot do, and what happens when it fails.
1329
00:49:11,600 –> 00:49:14,320
If you can’t answer those questions, it is not a service.
1330
00:49:14,320 –> 00:49:16,400
It is an entropy generator with a name.
1331
00:49:16,400 –> 00:49:18,400
Then promote capabilities deliberately.
1332
00:49:18,400 –> 00:49:21,360
Early on, teams will build embedded agents because it’s fast.
1333
00:49:21,360 –> 00:49:24,640
That’s fine, as long as you treat it as incubation, not architecture.
1334
00:49:24,640 –> 00:49:26,800
When the same embedded logic appears twice,
1335
00:49:26,800 –> 00:49:29,600
promote it into a connected agent and kill the clones.
1336
00:49:29,600 –> 00:49:32,400
The goal is to converge on one approved capability surface
1337
00:49:32,400 –> 00:49:34,800
per enterprise function, not to celebrate reuse
1338
00:49:34,800 –> 00:49:36,800
while silently forking policy.
1339
00:49:36,800 –> 00:49:39,120
And the master agent must remain the control plane.
1340
00:49:39,120 –> 00:49:40,880
It routes, gates, logs, and sequences.
1341
00:49:40,880 –> 00:49:42,640
It does not contain the domain logic.
1342
00:49:42,640 –> 00:49:44,160
It does not become a second brain.
1343
00:49:44,160 –> 00:49:47,760
It remains boring because boring is what survives organizational drift.
1344
00:49:47,760 –> 00:49:51,840
Operationally, this means your catalog needs a few mandatory design requirements.
1345
00:49:51,840 –> 00:49:53,760
One, every agent has a contract.
1346
00:49:53,760 –> 00:49:56,800
Capability, exclusions, prerequisites. Not prose.
1347
00:49:56,800 –> 00:49:57,680
A contract.
1348
00:49:57,680 –> 00:50:00,560
If an agent can draft, but not act, that must be explicit.
1349
00:50:00,560 –> 00:50:03,280
If an agent can act only within a constrained scope,
1350
00:50:03,280 –> 00:50:04,960
that scope must be explicit.
1351
00:50:04,960 –> 00:50:08,160
This is how you stop helpful from becoming unauthorized.
1352
00:50:08,160 –> 00:50:10,000
Two, every agent has versioning.
1353
00:50:10,000 –> 00:50:12,240
The enterprise failure mode is edit-in-place.
1354
00:50:12,240 –> 00:50:14,560
That is how you create behavioral variants over time
1355
00:50:14,560 –> 00:50:16,560
while insisting the system is stable.
1356
00:50:16,560 –> 00:50:19,280
Publish v1, publish v2, deprecate v1.
1357
00:50:19,280 –> 00:50:22,960
If you need an emergency fix, publish a patched version and roll forward deliberately.
1358
00:50:22,960 –> 00:50:25,840
You do not mutate a capability surface and then act surprised
1359
00:50:25,840 –> 00:50:28,080
when last quarter’s controls no longer apply.
1360
00:50:28,080 –> 00:50:30,960
Three, every agent has a kill switch, not a delete button.
1361
00:50:30,960 –> 00:50:33,440
A kill switch, the ability to disable a capability
1362
00:50:33,440 –> 00:50:36,000
quickly without collapsing the entire workflow graph.
1363
00:50:36,000 –> 00:50:38,240
If a connected agent starts producing bad outcomes
1364
00:50:38,240 –> 00:50:39,920
or its credential posture fails,
1365
00:50:39,920 –> 00:50:43,040
you must be able to remove it from routing and contain blast radius.
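A sketch of a kill switch as a routing-registry flag rather than deletion: disabling removes the capability from selection while the rest of the graph keeps running. The registry shape and capability names are illustrative assumptions.

```python
# Hypothetical routing registry with a disable flag per capability.
REGISTRY = {
    "invoice_capture":   {"enabled": True},
    "vendor_validation": {"enabled": True},
}

def routable(capability: str) -> bool:
    entry = REGISTRY.get(capability)
    return bool(entry and entry["enabled"])

def kill_switch(capability: str, reason: str) -> None:
    """Disable quickly with a recorded reason; nothing is deleted."""
    REGISTRY[capability]["enabled"] = False
    REGISTRY[capability]["disabled_reason"] = reason

kill_switch("vendor_validation", "credential posture failed")
print(routable("vendor_validation"))  # False -> removed from routing
print(routable("invoice_capture"))    # True  -> invoice capture keeps working
```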
1366
00:50:43,040 –> 00:50:45,840
This is how you keep incidents from becoming existential debates
1367
00:50:45,840 –> 00:50:47,680
about the initiative.
1368
00:50:47,680 –> 00:50:50,320
Four, every workflow emits the same trace shape,
1369
00:50:50,320 –> 00:50:53,600
intent, decision, action, outcome, across the full graph.
1370
00:50:53,600 –> 00:50:54,880
If each agent logs differently,
1371
00:50:54,880 –> 00:50:56,960
you will never reconstruct a chain of custody.
1372
00:50:56,960 –> 00:50:59,760
So standardize observability at the platform layer,
1373
00:50:59,760 –> 00:51:01,840
not in each team’s personal style.
1374
00:51:01,840 –> 00:51:04,000
You’re not collecting logs, you’re collecting evidence.
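A sketch of one trace shape emitted at the platform layer; the four fields mirror intent, decision, action, and outcome, and everything else here (the helper name, run ID, and example values) is an assumption.

```python
import json
import datetime

def trace_event(run_id: str, intent: str, decision: str, action: str, outcome: str) -> str:
    """Every agent emits the same shape, so a chain of custody can be reconstructed."""
    return json.dumps({
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "run_id": run_id,
        "intent": intent,       # what was asked
        "decision": decision,   # which gate or policy allowed it
        "action": action,       # which tool was called, under which identity
        "outcome": outcome,     # what verifiably changed
    })

print(trace_event("run-118", "pay_invoice", "three_way_match=pass",
                  "erp.post_payment", "payment_posted"))
```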
1375
00:51:04,000 –> 00:51:07,120
Five, enforce change control.
1376
00:51:07,120 –> 00:51:09,920
You don’t need bureaucracy, you need determinism.
1377
00:51:09,920 –> 00:51:11,280
Changes to agent descriptions,
1378
00:51:11,280 –> 00:51:14,480
tool availability and gating logic are control plane changes.
1379
00:51:14,480 –> 00:51:16,160
Treat them like control plane changes,
1380
00:51:16,160 –> 00:51:18,080
review them, test them, roll them out in stages,
1381
00:51:18,080 –> 00:51:21,200
measure outcomes, roll back when necessary.
1382
00:51:21,200 –> 00:51:22,400
Governance is not a committee,
1383
00:51:22,400 –> 00:51:24,960
governance is a repeatable deployment discipline.
1384
00:51:24,960 –> 00:51:28,000
Now, if you want executive buy-in, stop selling agents,
1385
00:51:28,000 –> 00:51:30,240
sell measurable outcomes.
1386
00:51:30,240 –> 00:51:33,360
Define success metrics that map to business reality.
1387
00:51:33,360 –> 00:51:36,640
Completion rate, exception rate, time to resolution and cost per run.
1388
00:51:36,640 –> 00:51:38,000
Those metrics are not theory,
1389
00:51:38,000 –> 00:51:40,640
they are how you show that deterministic orchestration
1390
00:51:40,640 –> 00:51:43,440
converts probabilistic reasoning into enterprise automation
1391
00:51:43,440 –> 00:51:44,240
that can scale.
1392
00:51:44,240 –> 00:51:45,840
They also give you the power to say no.
1393
00:51:45,840 –> 00:51:48,080
If a new capability increases exception rate
1394
00:51:48,080 –> 00:51:50,720
or increases cost per run without improving outcomes,
1395
00:51:50,720 –> 00:51:51,520
it doesn’t ship.
1396
00:51:51,520 –> 00:51:52,720
That is what a platform does.
1397
00:51:52,720 –> 00:51:55,200
It enforces priorities when enthusiasm
1398
00:51:55,200 –> 00:51:56,720
would otherwise erode control.
1399
00:51:56,720 –> 00:51:58,640
And you have to be honest about what you’re building.
1400
00:51:58,640 –> 00:52:00,240
You are not deploying assistants.
1401
00:52:00,240 –> 00:52:02,640
You are deploying a distributed decision engine
1402
00:52:02,640 –> 00:52:03,800
with tool access.
1403
00:52:03,800 –> 00:52:06,200
So your deployment posture must assume drift.
1404
00:52:06,200 –> 00:52:09,080
New teams will clone things, descriptions will get edited
1405
00:52:09,080 –> 00:52:12,040
for clarity, permissions will broaden under pressure
1406
00:52:12,040 –> 00:52:14,920
and shortcuts will be justified as temporary.
1407
00:52:14,920 –> 00:52:16,360
Entropy is not a possibility.
1408
00:52:16,360 –> 00:52:17,640
It is the default trajectory.
1409
00:52:17,640 –> 00:52:20,680
The only response is to design enforcement into the architecture.
1410
00:52:20,680 –> 00:52:22,480
This is where the primary refrain belongs
1411
00:52:22,480 –> 00:52:25,240
because it resets the whole argument back to control.
1412
00:52:25,240 –> 00:52:26,680
This isn’t about smarter AI.
1413
00:52:26,680 –> 00:52:31,120
It’s about who’s allowed to decide. Pause. Build the catalog,
1414
00:52:31,120 –> 00:52:34,160
enforce the contracts, standardize the trace,
1415
00:52:34,160 –> 00:52:37,560
bound the blast radius, keep the master agent boring,
1416
00:52:37,560 –> 00:52:40,080
and you get something rare in enterprise AI,
1417
00:52:40,080 –> 00:52:43,880
a system that can grow without becoming unknowable.
1418
00:52:43,880 –> 00:52:44,800
Conclusion.
1419
00:52:44,800 –> 00:52:47,240
Determinism is what makes AI deployable.
1420
00:52:47,240 –> 00:52:49,840
Determinism is what makes AI deployable.
1421
00:52:49,840 –> 00:52:51,840
Let models reason inside steps,
1422
00:52:51,840 –> 00:52:53,640
but keep authority in a control plane
1423
00:52:53,640 –> 00:52:55,720
that gates execution and proves outcomes.
1424
00:52:55,720 –> 00:52:57,920
If you want the next layer, watch the follow-up
1425
00:52:57,920 –> 00:52:59,800
on building a master agent routing model
1426
00:52:59,800 –> 00:53:01,240
with connected agent contracts
1427
00:53:01,240 –> 00:53:04,720
because routing is where good designs quietly fail.
1428
00:53:04,720 –> 00:53:07,000
Subscribe if you want more uncomfortable truths
1429
00:53:07,000 –> 00:53:09,120
about Entra, Copilot, and why governance
1430
00:53:09,120 –> 00:53:12,520
always erodes unless you enforce it by design.