
1
00:00:00,000 –> 00:00:04,280
Most organizations hear more agents and assume more productivity.
2
00:00:04,280 –> 00:00:06,560
That assumption is comfortable. It’s also wrong.
3
00:00:06,560 –> 00:00:08,840
More agents usually means more unmanaged authority,
4
00:00:08,840 –> 00:00:10,120
more automated side effects,
5
00:00:10,120 –> 00:00:13,240
and more places where accountability quietly disappears.
6
00:00:13,240 –> 00:00:14,760
This isn’t an episode about features,
7
00:00:14,760 –> 00:00:17,040
it’s about scale, risk, and an operating model
8
00:00:17,040 –> 00:00:19,240
that survives contact with reality.
9
00:00:19,240 –> 00:00:20,760
You’ll leave with three failure modes
10
00:00:20,760 –> 00:00:21,880
that kill agent programs,
11
00:00:21,880 –> 00:00:24,040
the four-layer control plane that prevents drift,
12
00:00:24,040 –> 00:00:26,800
and the questions executives should demand answers to.
13
00:00:26,800 –> 00:00:28,280
Now, start with the misunderstanding
14
00:00:28,280 –> 00:00:29,600
that causes the chaos.
15
00:00:29,600 –> 00:00:33,240
The foundational misunderstanding: agents aren’t assistants.
16
00:00:33,240 –> 00:00:34,960
An assistant generates answers;
17
00:00:34,960 –> 00:00:36,200
an agent executes work.
18
00:00:36,200 –> 00:00:38,840
That distinction matters because execution creates state
19
00:00:38,840 –> 00:00:40,520
and state creates consequences.
20
00:00:40,520 –> 00:00:43,000
A copilot style assistant can write a summary.
21
00:00:43,000 –> 00:00:45,520
An agent can open a ticket, change a permission,
22
00:00:45,520 –> 00:00:48,480
update a record, trigger a flow, notify a manager,
23
00:00:48,480 –> 00:00:49,960
and then do it again tomorrow
24
00:00:49,960 –> 00:00:51,680
because you told it that’s the process.
25
00:00:51,680 –> 00:00:54,600
The system didn’t help, it acted.
26
00:00:54,600 –> 00:00:56,680
Most enterprises keep talking about agents
27
00:00:56,680 –> 00:00:58,800
like they’re chatbots with better prompts.
28
00:00:58,800 –> 00:00:59,640
They’re not.
29
00:00:59,640 –> 00:01:01,920
The agentic model is tools plus memory plus loops.
30
00:01:01,920 –> 00:01:03,520
It takes a goal, it calls something,
31
00:01:03,520 –> 00:01:05,840
it evaluates the result, it calls something else,
32
00:01:05,840 –> 00:01:08,520
and it keeps going until it decides it’s done.
33
00:01:08,520 –> 00:01:10,440
That loop is the real architectural shift.
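That goal, call, evaluate, repeat loop can be sketched in a few lines of Python. This is a hedged illustration of the pattern, not any platform’s real API; `run_agent`, `pick_tool`, `is_done`, and the sample tools are all hypothetical stand-ins:

```python
# Minimal sketch of the agentic loop: goal in, tool calls out, looping until
# the agent itself decides it is done. All names are illustrative; no real
# platform API is implied.

def run_agent(goal, tools, max_steps=10):
    memory = []                         # "memory": results of prior tool calls
    for _ in range(max_steps):          # hard step cap: a basic containment control
        name, tool = pick_tool(goal, memory, tools)  # choose the next action
        result = tool(goal)                          # execution creates state
        memory.append((name, result))                # state creates consequences
        if is_done(memory):             # the agent, not a human, ends the loop
            break
    return memory

# Hypothetical stand-ins for the model's decision points.
def pick_tool(goal, memory, tools):
    # naive policy: call the tools in order, one per step
    return tools[len(memory) % len(tools)]

def is_done(memory):
    # "decides it's done" after two actions, for illustration
    return len(memory) >= 2

tools = [("open_ticket", lambda g: f"ticket opened for {g}"),
         ("notify_manager", lambda g: f"manager notified about {g}")]
trace = run_agent("reset VPN access", tools)
print(trace)
```

Note that every iteration is a side effect, which is why the loop, not the chat surface, is the thing to govern.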
34
00:01:10,440 –> 00:01:11,920
Here’s the uncomfortable truth.
35
00:01:11,920 –> 00:01:15,360
Once you cross from answer generation to tool execution,
36
00:01:15,360 –> 00:01:17,480
you are no longer governing a conversation.
37
00:01:17,480 –> 00:01:19,440
You are governing a distributed decision engine
38
00:01:19,440 –> 00:01:20,960
that can mutate systems.
39
00:01:20,960 –> 00:01:23,760
Think about the things that were previously hard to do at scale.
40
00:01:23,760 –> 00:01:25,200
Not because they were technically hard,
41
00:01:25,200 –> 00:01:26,960
but because they required human friction.
42
00:01:26,960 –> 00:01:28,720
Humans hesitate, humans ask questions,
43
00:01:28,720 –> 00:01:31,400
humans get tired, humans escalate, agents don’t,
44
00:01:31,400 –> 00:01:33,840
agents are obedient, they don’t resist,
45
00:01:33,840 –> 00:01:36,880
they don’t get that gut feeling that something feels off.
46
00:01:36,880 –> 00:01:38,520
They just run the workflow you gave them,
47
00:01:38,520 –> 00:01:40,960
with whatever data and permissions you accidentally handed them
48
00:01:40,960 –> 00:01:44,080
along the way, and that’s why the “network of agents” narrative
49
00:01:44,080 –> 00:01:45,680
should make you nervous.
50
00:01:45,680 –> 00:01:48,840
The marketing version is coordination, agents collaborating,
51
00:01:48,840 –> 00:01:51,440
handing off tasks, accelerating outcomes.
52
00:01:51,440 –> 00:01:53,240
The operational version is chaining.
53
00:01:53,240 –> 00:01:57,000
Agent A calls tool X, which triggers system Y, which wakes agent B,
54
00:01:57,000 –> 00:01:59,760
which requests approval from a person who clicks approve
55
00:01:59,760 –> 00:02:01,600
because the card looks legitimate,
56
00:02:01,600 –> 00:02:04,280
which then causes agent C to run a write operation
57
00:02:04,280 –> 00:02:06,400
that nobody can easily unwind.
58
00:02:06,400 –> 00:02:09,320
Actions, triggers, approval loops, chaining.
59
00:02:09,320 –> 00:02:10,800
This is not a linear flow chart.
60
00:02:10,800 –> 00:02:13,320
It’s an authorization graph with automation attached.
61
00:02:13,320 –> 00:02:14,360
Now add scale.
62
00:02:14,360 –> 00:02:15,760
In the research you provided,
63
00:02:15,760 –> 00:02:17,640
the “billions of agents” prediction shows up.
64
00:02:17,640 –> 00:02:19,560
Fine, treat that as hype if you want.
65
00:02:19,560 –> 00:02:22,280
Enterprise reality doesn’t need billions to fail.
66
00:02:22,280 –> 00:02:23,240
It needs dozens.
67
00:02:23,240 –> 00:02:24,920
It needs one department shipping
68
00:02:24,920 –> 00:02:27,440
helpful agents faster than anyone can catalog.
69
00:02:27,440 –> 00:02:29,560
It needs one vendor adding a pre-built agent
70
00:02:29,560 –> 00:02:31,880
that quietly brings its own connector strategy.
71
00:02:31,880 –> 00:02:34,440
It needs one maker discovering they can share something
72
00:02:34,440 –> 00:02:35,480
“just to my team.”
73
00:02:35,480 –> 00:02:37,360
And then it spreads because it’s useful.
74
00:02:37,360 –> 00:02:38,920
Entropy doesn’t require malice.
75
00:02:38,920 –> 00:02:41,080
It requires convenience.
76
00:02:41,080 –> 00:02:42,560
The foundational mistake is assuming
77
00:02:42,560 –> 00:02:44,400
that an agent is a feature you deploy.
78
00:02:44,400 –> 00:02:46,720
In reality, an agent is a product you operate.
79
00:02:46,720 –> 00:02:49,040
It has an identity surface, a permission surface,
80
00:02:49,040 –> 00:02:52,280
a data surface, a tool surface, and a behavioral surface.
81
00:02:52,280 –> 00:02:54,400
And if you don’t define those surfaces deliberately,
82
00:02:54,400 –> 00:02:56,320
the platform will define them accidentally.
83
00:02:56,320 –> 00:02:59,040
Let’s break down the assistant versus agent difference
84
00:02:59,040 –> 00:03:01,160
in a way that makes governance obvious.
85
00:03:01,160 –> 00:03:02,960
Assistants mostly create text.
86
00:03:02,960 –> 00:03:05,560
The worst case is misinformation, tone issues,
87
00:03:05,560 –> 00:03:06,760
or a bad summary.
88
00:03:06,760 –> 00:03:08,880
That’s real damage, but it’s usually reversible.
89
00:03:08,880 –> 00:03:10,760
You correct the output, you move on.
90
00:03:10,760 –> 00:03:12,320
Agents create side effects.
91
00:03:12,320 –> 00:03:13,640
Side effects are the point.
92
00:03:13,640 –> 00:03:15,160
They exist to change something.
93
00:03:15,160 –> 00:03:18,160
A ticket, a document state, a SharePoint permission,
94
00:03:18,160 –> 00:03:20,320
a mailbox rule, a Teams membership,
95
00:03:20,320 –> 00:03:22,960
an Azure resource, a Power Platform environment,
96
00:03:22,960 –> 00:03:24,320
a purchase request.
97
00:03:24,320 –> 00:03:26,720
Side effects are also what auditors and incident responders
98
00:03:26,720 –> 00:03:29,200
care about because side effects are evidence of authority.
99
00:03:29,200 –> 00:03:31,760
So when leadership asks, are our agents accurate,
100
00:03:31,760 –> 00:03:34,560
the right answer is: accuracy is not the primary risk.
101
00:03:34,560 –> 00:03:37,600
Authority is. Nobody gets fired because a bot wrote an awkward
102
00:03:37,600 –> 00:03:38,120
paragraph.
103
00:03:38,120 –> 00:03:40,360
People get fired because the bot did something
104
00:03:40,360 –> 00:03:42,800
and nobody can prove who authorized it.
105
00:03:42,800 –> 00:03:44,840
And this is where the open loop matters.
106
00:03:44,840 –> 00:03:46,640
Everyone wants the agentic decade.
107
00:03:46,640 –> 00:03:48,160
They want a fleet of digital workers.
108
00:03:48,160 –> 00:03:49,160
They want autonomy.
109
00:03:49,160 –> 00:03:50,080
They want scale.
110
00:03:50,080 –> 00:03:52,960
But autonomy without governance doesn’t scale intelligence.
111
00:03:52,960 –> 00:03:54,160
It scales ambiguity.
112
00:03:54,160 –> 00:03:56,640
Over time, policies drift away from intent.
113
00:03:56,640 –> 00:03:57,880
Exceptions accumulate.
114
00:03:57,880 –> 00:04:00,320
Ownership changes, service accounts multiply,
115
00:04:00,320 –> 00:04:01,760
connectors proliferate.
116
00:04:01,760 –> 00:04:04,240
And every single one of those is an entropy generator.
117
00:04:04,240 –> 00:04:06,880
The program keeps moving, but the control plane erodes
118
00:04:06,880 –> 00:04:07,680
underneath it.
119
00:04:07,680 –> 00:04:09,800
So the first job in an agentic strategy
120
00:04:09,800 –> 00:04:11,160
isn’t building smarter agents.
121
00:04:11,160 –> 00:04:13,800
It’s deciding what the enterprise will treat as enforceable
122
00:04:13,800 –> 00:04:14,600
truth.
123
00:04:14,600 –> 00:04:16,320
If something can act, it must have
124
00:04:16,320 –> 00:04:17,000
an identity.
125
00:04:17,000 –> 00:04:20,160
If it can call a tool, the tool must be governed as infrastructure.
126
00:04:20,160 –> 00:04:23,280
If it can see data, the boundary must be defined and monitored.
127
00:04:23,280 –> 00:04:26,760
If it behaves strangely, containment must be fast and attributable.
128
00:04:26,760 –> 00:04:28,120
That’s the mental model shift.
129
00:04:28,120 –> 00:04:29,280
Agents aren’t assistants.
130
00:04:29,280 –> 00:04:32,040
They are automated actors inside your control plane.
131
00:04:32,040 –> 00:04:35,400
Next, define what agent sprawl actually means without the marketing
132
00:04:35,400 –> 00:04:38,720
fog and why it always shows up first as identity drift.
133
00:04:38,720 –> 00:04:41,760
Define agent sprawl without the marketing fog.
134
00:04:41,760 –> 00:04:44,160
Agent sprawl isn’t “we have a lot of agents.”
135
00:04:44,160 –> 00:04:45,640
That’s the poster version.
136
00:04:45,640 –> 00:04:47,520
The operational version is uglier.
137
00:04:47,520 –> 00:04:50,960
Uncontrolled growth across six things that are hard to inventory
138
00:04:50,960 –> 00:04:52,320
and even harder to reverse.
139
00:04:52,320 –> 00:04:55,400
Identities, tools, prompts, permissions, owners, versions.
140
00:04:55,400 –> 00:04:57,840
If you can’t name those six for every agent,
141
00:04:57,840 –> 00:04:59,280
you don’t have an ecosystem.
142
00:04:59,280 –> 00:05:00,000
You have a rumor.
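One way to turn those six dimensions from rumor into inventory is a per-agent record. The sketch below is illustrative only; every field name is an assumption, not any real registry schema:

```python
from dataclasses import dataclass, field

# Sketch of an agent inventory record covering the six sprawl dimensions
# named above: identity, tools, prompts, permissions, owner, version.
# Field names are hypothetical, not a product schema.

@dataclass
class AgentRecord:
    identity: str                  # stable non-human identity, never a shared user
    tools: list = field(default_factory=list)        # connectors, APIs, MCP tools
    prompt_version: str = "v0"     # prompts are logic: version them like code
    permissions: list = field(default_factory=list)  # explicit least-privilege scopes
    owner: str = ""                # maker who builds and maintains
    sponsor: str = ""              # business owner of outcome, risk, life cycle

    def is_accountable(self):
        # "If you can't name those six for every agent, you have a rumor."
        return all([self.identity, self.tools, self.prompt_version,
                    self.permissions, self.owner, self.sponsor])

rec = AgentRecord(identity="svc-desk-agent-01",
                  tools=["create_ticket"],
                  permissions=["Tickets.ReadWrite"],
                  owner="maker@contoso.example",
                  sponsor="it-director@contoso.example")
print(rec.is_accountable())
```

An agent with any dimension missing, say no sponsor, fails the check, which is exactly the gap sprawl exploits.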
143
00:05:00,000 –> 00:05:01,160
Start with identity.
144
00:05:01,160 –> 00:05:04,360
Every agent that can act needs some form of authentication path.
145
00:05:04,360 –> 00:05:07,720
If the enterprise can’t point to a stable, non-human identity
146
00:05:07,720 –> 00:05:11,520
for that agent, then every action it takes becomes a debate later.
147
00:05:11,520 –> 00:05:13,880
Identity sprawl is the first crack in the foundation,
148
00:05:13,880 –> 00:05:17,680
because everything else, tool calls, audits, containment,
149
00:05:17,680 –> 00:05:19,960
hangs off it. Then tools.
150
00:05:19,960 –> 00:05:22,000
Agents don’t exist as pure language.
151
00:05:22,000 –> 00:05:24,480
They exist as a bundle of tool invocations,
152
00:05:24,480 –> 00:05:27,080
graph calls, connector actions, custom APIs,
153
00:05:27,080 –> 00:05:30,160
MCP tools, flows, web retrieval, code interpreter,
154
00:05:30,160 –> 00:05:32,240
whatever the platform exposes this week.
155
00:05:32,240 –> 00:05:33,880
Tool sprawl means you stop knowing
156
00:05:33,880 –> 00:05:36,240
what capabilities are available, where they’re reused,
157
00:05:36,240 –> 00:05:38,160
and which ones became the unofficial standard
158
00:05:38,160 –> 00:05:39,840
because someone shipped them first.
159
00:05:39,840 –> 00:05:41,240
Then prompts and instructions.
160
00:05:41,240 –> 00:05:43,600
People treat prompts like documentation.
161
00:05:43,600 –> 00:05:44,440
They aren’t.
162
00:05:44,440 –> 00:05:45,760
Prompts are logic.
163
00:05:45,760 –> 00:05:47,880
They are policy expressed as text,
164
00:05:47,880 –> 00:05:50,000
usually without change control.
165
00:05:50,000 –> 00:05:52,160
Prompt sprawl means the enterprise ends up
166
00:05:52,160 –> 00:05:54,120
with dozens of slightly different interpretations
167
00:05:54,120 –> 00:05:55,040
of the same rule.
168
00:05:55,040 –> 00:05:57,840
What counts as sensitive, what counts as approval worthy,
169
00:05:57,840 –> 00:06:01,160
what counts as safe to share, what counts as authoritative.
170
00:06:01,160 –> 00:06:02,360
Drift isn’t a bug here.
171
00:06:02,360 –> 00:06:03,720
Drift is the default behavior.
172
00:06:03,720 –> 00:06:04,560
Then permissions.
173
00:06:04,560 –> 00:06:06,480
This is where sprawl becomes expensive.
174
00:06:06,480 –> 00:06:09,040
Agents get access through the same pathways humans do.
175
00:06:09,040 –> 00:06:11,920
App permissions, delegated access, connector credentials,
176
00:06:11,920 –> 00:06:14,680
service principals, environment roles, SharePoint permissions,
177
00:06:14,680 –> 00:06:17,080
and the quiet “just give it Contributor for now”
178
00:06:17,080 –> 00:06:19,200
moments that never get revisited.
179
00:06:19,200 –> 00:06:21,840
Permission sprawl is how a deterministic security model
180
00:06:21,840 –> 00:06:22,960
becomes probabilistic.
181
00:06:22,960 –> 00:06:24,040
One exception doesn’t hurt.
182
00:06:24,040 –> 00:06:26,320
A thousand exceptions become policy theater.
183
00:06:26,320 –> 00:06:27,160
Then owners.
184
00:06:27,160 –> 00:06:28,360
Someone has to be accountable.
185
00:06:28,360 –> 00:06:29,560
Not the maker who built it,
186
00:06:29,560 –> 00:06:31,960
but the sponsor who owns the business outcome,
187
00:06:31,960 –> 00:06:33,880
the risk and the life cycle.
188
00:06:33,880 –> 00:06:35,520
Owner sprawl looks like this.
189
00:06:35,520 –> 00:06:37,400
The agent is shared, it becomes popular,
190
00:06:37,400 –> 00:06:38,800
the original maker changes roles,
191
00:06:38,800 –> 00:06:41,200
and suddenly nobody can answer a basic question like,
192
00:06:41,200 –> 00:06:43,000
who approves changes to this thing?
193
00:06:43,000 –> 00:06:44,400
The agent stays live anyway.
194
00:06:44,400 –> 00:06:45,080
Of course it does.
195
00:06:45,080 –> 00:06:46,080
And finally versions.
196
00:06:46,080 –> 00:06:47,240
Every agent evolves.
197
00:06:47,240 –> 00:06:49,440
New instructions, new tools, new knowledge sources,
198
00:06:49,440 –> 00:06:51,360
new connectors, new models, new guardrails.
199
00:06:51,360 –> 00:06:54,800
Version sprawl is what turns troubleshooting into archaeology.
200
00:06:54,800 –> 00:06:56,440
You stop debugging behavior,
201
00:06:56,440 –> 00:06:59,760
you start guessing which variant of the agent a user even hit.
202
00:06:59,760 –> 00:07:02,280
Now how does this sprawl actually enter the tenant?
203
00:07:02,280 –> 00:07:04,560
It usually arrives through three vectors,
204
00:07:04,560 –> 00:07:06,280
and all three feel legitimate.
205
00:07:06,280 –> 00:07:08,920
Vector one is maker-led: the “I built a helpful agent
206
00:07:08,920 –> 00:07:09,840
for my team” story.
207
00:07:09,840 –> 00:07:11,640
It starts personal, then departmental,
208
00:07:11,640 –> 00:07:12,880
then someone forwards the link,
209
00:07:12,880 –> 00:07:14,200
then it becomes semi-official.
210
00:07:14,200 –> 00:07:15,880
And suddenly it’s in a critical workflow
211
00:07:15,880 –> 00:07:17,160
without a review gate.
212
00:07:17,160 –> 00:07:19,880
Maker-led sprawl is fast because the platform is designed
213
00:07:19,880 –> 00:07:20,600
to be easy.
214
00:07:20,600 –> 00:07:22,160
Vector two is vendor-provided.
215
00:07:22,160 –> 00:07:23,840
Microsoft ships agents.
216
00:07:23,840 –> 00:07:25,480
Third parties ship agents.
217
00:07:25,480 –> 00:07:27,480
Internal platform teams ship starter agents.
218
00:07:27,480 –> 00:07:29,320
They’re pre-built, they demo well,
219
00:07:29,320 –> 00:07:31,680
and they often arrive with assumptions baked in.
220
00:07:31,680 –> 00:07:34,640
Permissions, connectors, data paths, and tool contracts.
221
00:07:34,640 –> 00:07:37,320
Vendor-led sprawl is dangerous because it feels sanctioned.
222
00:07:37,320 –> 00:07:38,880
People stop asking what it can do
223
00:07:38,880 –> 00:07:40,800
and start asking why they can’t have it.
224
00:07:40,800 –> 00:07:43,800
Vector three is marketplace and third-party integrations.
225
00:07:43,800 –> 00:07:45,560
MCP makes tool onboarding easier.
226
00:07:45,560 –> 00:07:48,240
That’s the point, but the other point nobody wants to admit
227
00:07:48,240 –> 00:07:50,680
is that easier onboarding also means easier propagation
228
00:07:50,680 –> 00:07:51,440
of mistakes.
229
00:07:51,440 –> 00:07:53,240
One bad tool contract doesn’t stay local.
230
00:07:53,240 –> 00:07:54,440
It becomes reusable.
231
00:07:54,440 –> 00:07:55,960
It spreads.
232
00:07:55,960 –> 00:07:59,160
And here’s the hidden multiplier most organizations miss.
233
00:07:59,160 –> 00:08:02,120
Every agent adds surface area across Microsoft 365,
234
00:08:02,120 –> 00:08:04,640
Power Platform, and Azure at the same time.
235
00:08:04,640 –> 00:08:06,280
This isn’t three separate platforms.
236
00:08:06,280 –> 00:08:09,600
It’s one connected control plane with different admin centers
237
00:08:09,600 –> 00:08:11,480
pretending to be separate worlds.
238
00:08:11,480 –> 00:08:14,200
So agent sprawl doesn’t just grow your AI footprint.
239
00:08:14,200 –> 00:08:16,600
It grows your identity footprint, your connector footprint,
240
00:08:16,600 –> 00:08:18,920
your audit footprint, and your incident footprint.
241
00:08:18,920 –> 00:08:20,520
You’ll know you’re in sprawl early
242
00:08:20,520 –> 00:08:22,800
when a new question shows up in meetings.
243
00:08:22,800 –> 00:08:24,720
Which agent should I use?
244
00:08:24,720 –> 00:08:25,600
That sounds harmless.
245
00:08:25,600 –> 00:08:26,440
It isn’t.
246
00:08:26,440 –> 00:08:27,440
It’s a productivity tax.
247
00:08:27,440 –> 00:08:29,920
It means discoverability is failing, duplication is happening,
248
00:08:29,920 –> 00:08:31,080
and trust is fragmenting.
249
00:08:31,080 –> 00:08:33,400
People stop using the ecosystem because it’s noisy,
250
00:08:33,400 –> 00:08:35,280
inconsistent, and unpredictable.
251
00:08:35,280 –> 00:08:37,600
And once that happens, the next failure mode
252
00:08:37,600 –> 00:08:39,880
arrives on schedule: identity drift.
253
00:08:39,880 –> 00:08:41,640
Not because you meant to lose accountability,
254
00:08:41,640 –> 00:08:43,520
but because you scaled agents faster
255
00:08:43,520 –> 00:08:44,920
than you scaled attribution.
256
00:08:44,920 –> 00:08:46,280
That distinction matters.
257
00:08:46,280 –> 00:08:49,840
Risk one: identity drift, when accountability collapses.
258
00:08:49,840 –> 00:08:52,760
Identity drift is what happens when an agent can take action,
259
00:08:52,760 –> 00:08:54,760
but the enterprise can’t reliably prove
260
00:08:54,760 –> 00:08:57,400
which non-human identity performed that action
261
00:08:57,400 –> 00:08:59,560
under what authority and with whose approval
262
00:08:59,560 –> 00:09:01,560
and the collapse is silent at first.
263
00:09:01,560 –> 00:09:03,200
Because everything still works.
264
00:09:03,200 –> 00:09:04,640
Tickets still get created.
265
00:09:04,640 –> 00:09:05,720
Files still get moved.
266
00:09:05,720 –> 00:09:06,960
Permissions still get granted.
267
00:09:06,960 –> 00:09:08,120
People still get answers.
268
00:09:08,120 –> 00:09:10,960
The system stays productive right up until the moment someone
269
00:09:10,960 –> 00:09:13,640
asks the one question you can’t afford to fumble.
270
00:09:13,640 –> 00:09:14,280
Who did this?
271
00:09:14,280 –> 00:09:15,800
Not which user asked the question
272
00:09:15,800 –> 00:09:18,200
and not which department owns the workflow.
273
00:09:18,200 –> 00:09:20,600
Auditors don’t care about your intent story.
274
00:09:20,600 –> 00:09:23,240
They care about attribution, a stable identity,
275
00:09:23,240 –> 00:09:25,600
a permission chain, and an evidence trail.
276
00:09:25,600 –> 00:09:28,160
Identity drift shows up when actions get executed
277
00:09:28,160 –> 00:09:29,880
through anonymous pathways:
278
00:09:29,880 –> 00:09:31,120
shared credentials,
279
00:09:31,120 –> 00:09:34,040
maker connections, inherited delegated permissions,
280
00:09:34,040 –> 00:09:37,640
or tool calls that log as some generic application context
281
00:09:37,640 –> 00:09:41,120
that nobody can map back to an accountable sponsor.
282
00:09:41,120 –> 00:09:43,920
Over time, the organization stops seeing agent actions
283
00:09:43,920 –> 00:09:44,440
as actions.
284
00:09:44,440 –> 00:09:46,640
They start seeing them as automation.
285
00:09:46,640 –> 00:09:48,600
And automation historically has been treated
286
00:09:48,600 –> 00:09:49,760
like background noise.
287
00:09:49,760 –> 00:09:50,720
That’s a mistake.
288
00:09:50,720 –> 00:09:53,280
Because the risk isn’t that the agent makes a wrong decision.
289
00:09:53,280 –> 00:09:55,080
The risk is that the enterprise can’t prove
290
00:09:55,080 –> 00:09:56,440
it made any decision at all.
291
00:09:56,440 –> 00:09:59,520
Here’s what incident response looks like under identity drift.
292
00:09:59,520 –> 00:10:01,640
Something changes in SharePoint permissions.
293
00:10:01,640 –> 00:10:02,640
A mailbox rule appears.
294
00:10:02,640 –> 00:10:03,920
A Teams membership changes.
295
00:10:03,920 –> 00:10:05,040
A record gets updated.
296
00:10:05,040 –> 00:10:06,880
The user says, I didn’t do that.
297
00:10:06,880 –> 00:10:08,520
The admin says, we have logs.
298
00:10:08,520 –> 00:10:10,800
The security team pulls the event, and it points
299
00:10:10,800 –> 00:10:12,640
to a service account, or a connector,
300
00:10:12,640 –> 00:10:15,760
or a vague app identity used by five different agents,
301
00:10:15,760 –> 00:10:19,560
owned by nobody, sponsored by nobody, and documented nowhere.
302
00:10:19,560 –> 00:10:22,720
Now, the organization is no longer investigating a technical event.
303
00:10:22,720 –> 00:10:25,920
It’s negotiating a narrative, and narratives don’t survive audits.
304
00:10:25,920 –> 00:10:27,600
Auditors ask three basic questions.
305
00:10:27,600 –> 00:10:29,960
And identity drift makes all three painful.
306
00:10:29,960 –> 00:10:31,320
First, who performed the action?
307
00:10:31,320 –> 00:10:34,040
Second, under what authority did it have permission?
308
00:10:34,040 –> 00:10:36,440
Third, who approved that authority and when?
309
00:10:36,440 –> 00:10:39,560
If your answer sounds like, well, it was probably the service desk agent,
310
00:10:39,560 –> 00:10:40,480
you don’t have evidence.
311
00:10:40,480 –> 00:10:41,280
You have suspicion.
312
00:10:41,280 –> 00:10:44,480
And the moment the business hears, probably, the program freezes.
313
00:10:44,480 –> 00:10:47,280
Because that’s the real consequence of identity drift.
314
00:10:47,280 –> 00:10:51,120
One visible incident can pause the entire agent roll out across the company.
315
00:10:51,120 –> 00:10:53,160
The patterns are depressingly consistent.
316
00:10:53,160 –> 00:10:55,960
Shared credentials are the classic entropy generator.
317
00:10:55,960 –> 00:10:57,720
Someone creates a bot account,
318
00:10:57,720 –> 00:11:00,080
shares it with a few makers, and calls it progress.
319
00:11:00,080 –> 00:11:01,960
It isn’t. It’s identity debt.
320
00:11:01,960 –> 00:11:03,400
Maker connections are the next one.
321
00:11:03,400 –> 00:11:06,080
An agent gets built using a person’s delegated access,
322
00:11:06,080 –> 00:11:08,200
then shared, then used by others.
323
00:11:08,200 –> 00:11:10,520
The agent now acts with the maker’s shadow authority
324
00:11:10,520 –> 00:11:14,480
until the day that maker leaves, changes roles, or gets their access removed.
325
00:11:14,480 –> 00:11:18,680
Then the agent breaks, or worse, it keeps working in partial, unpredictable ways.
326
00:11:18,680 –> 00:11:21,040
Then there’s the service-principal-everywhere anti-pattern.
327
00:11:21,040 –> 00:11:23,760
A single app registration ends up powering multiple agents,
328
00:11:23,760 –> 00:11:25,480
multiple tools, multiple environments.
329
00:11:25,480 –> 00:11:26,800
It looks neat in a spreadsheet.
330
00:11:26,800 –> 00:11:29,280
In reality, it’s a blast radius amplifier.
331
00:11:29,280 –> 00:11:32,680
One compromise, one mis-scoped permission, one forgotten secret rotation,
332
00:11:32,680 –> 00:11:35,320
and you’ve attached a privileged identity to a fleet.
333
00:11:35,320 –> 00:11:37,560
And the weirdest one is anonymous tool calls,
334
00:11:37,560 –> 00:11:40,760
not anonymous in the literal sense, everything logs somewhere,
335
00:11:40,760 –> 00:11:42,720
but anonymous in the governance sense.
336
00:11:42,720 –> 00:11:46,920
The logs don’t map cleanly to an owned, life-cycle-managed agent identity.
337
00:11:46,920 –> 00:11:49,680
You can’t answer basic questions like which agent invoked
338
00:11:49,680 –> 00:11:53,200
which tool, on behalf of what workflow, with what approval state.
339
00:11:53,200 –> 00:11:55,760
So the operational consequence isn’t theoretical.
340
00:11:55,760 –> 00:11:57,640
It’s immediate. You can’t do containment.
341
00:11:57,640 –> 00:12:00,680
Because you can’t decide what to disable without collateral damage.
342
00:12:00,680 –> 00:12:04,880
If ten agents share one identity, blocking the identity blocks ten business workflows.
343
00:12:04,880 –> 00:12:06,320
You won’t do it. You’ll hesitate.
344
00:12:06,320 –> 00:12:09,440
And that hesitation is how incidents become outages.
345
00:12:09,440 –> 00:12:12,920
This is why identity drift is the first catastrophic failure mode.
346
00:12:12,920 –> 00:12:14,680
Not because identity is important,
347
00:12:14,680 –> 00:12:18,240
but because identity is the anchor that makes the rest of governance possible.
348
00:12:18,240 –> 00:12:21,440
No identity means no least privilege enforcement that holds over time.
349
00:12:21,440 –> 00:12:24,440
No identity means no reliable audit trail.
350
00:12:24,440 –> 00:12:28,520
No identity means Defender detections that can’t be tied to a specific actor.
351
00:12:28,520 –> 00:12:32,920
No identity means Purview exposure reports that can’t be attributed to a responsible owner.
352
00:12:32,920 –> 00:12:36,520
In other words, you don’t lose accountability in a dramatic moment.
353
00:12:36,520 –> 00:12:38,880
You lose it one convenient shortcut at a time.
354
00:12:38,880 –> 00:12:40,800
And when the shortcut becomes normal,
355
00:12:40,800 –> 00:12:43,120
the organization crosses an invisible line.
356
00:12:43,120 –> 00:12:47,040
It moves from a deterministic security model to a probabilistic one.
357
00:12:47,040 –> 00:12:48,520
That distinction matters.
358
00:12:48,520 –> 00:12:52,440
Next, map identity drift to Microsoft’s enforcement mechanism,
359
00:12:52,440 –> 00:12:58,400
Entra Agent ID, and why Entra needs to be treated as a distributed decision engine, not a directory.
360
00:12:58,400 –> 00:13:02,000
Control plane layer one: Entra Agent ID as the audit anchor.
361
00:13:02,000 –> 00:13:04,320
Entra Agent ID exists for one reason:
362
00:13:04,320 –> 00:13:08,280
to stop identity drift from becoming your default operating model.
363
00:13:08,280 –> 00:13:10,440
The rule is simple and it’s non-negotiable at scale.
364
00:13:10,440 –> 00:13:14,200
If it can act, it must have an identity, not a shared user, not whoever built it,
365
00:13:14,200 –> 00:13:15,800
not the connector account.
366
00:13:15,800 –> 00:13:20,840
An actual non-human identity that the enterprise can point to, scope, monitor, and retire,
367
00:13:20,840 –> 00:13:24,760
because the moment an agent can call tools, it is no longer a chat experience.
368
00:13:24,760 –> 00:13:26,400
It is an actor in your environment.
369
00:13:26,400 –> 00:13:30,400
Actors need identities, identities need ownership, ownership needs life cycle.
370
00:13:30,400 –> 00:13:32,320
That distinction matters.
371
00:13:32,320 –> 00:13:34,760
Most organizations treat Entra like a directory,
372
00:13:34,760 –> 00:13:38,800
users, groups, apps, and the occasional service principal nobody wants to own.
373
00:13:38,800 –> 00:13:40,120
That’s the comfortable framing.
374
00:13:40,120 –> 00:13:41,960
It’s also incomplete in architectural terms.
375
00:13:41,960 –> 00:13:43,720
Entra is a distributed decision engine.
376
00:13:43,720 –> 00:13:47,240
It sits in the middle of every authorization path continuously,
377
00:13:47,240 –> 00:13:50,520
and it enforces whatever you actually configured, not what you intended.
378
00:13:50,520 –> 00:13:54,000
And this is why Entra Agent ID is not another object in Entra.
379
00:13:54,000 –> 00:13:58,600
It is the audit anchor that lets you build an evidence chain around agent behavior.
380
00:13:58,600 –> 00:14:00,640
When an agent has a stable identity,
381
00:14:00,640 –> 00:14:04,560
the enterprise can answer the questions that stop programs from getting shut down,
382
00:14:04,560 –> 00:14:07,360
which agent did the action, which permissions enabled it,
383
00:14:07,360 –> 00:14:13,200
which tool invocation executed it, and which human approved it, if approval was required.
384
00:14:13,200 –> 00:14:16,720
Without that identity, every downstream control becomes hand-wavy.
385
00:14:16,720 –> 00:14:20,080
So what changes when you make Entra Agent ID a first principle?
386
00:14:20,080 –> 00:14:22,440
First, least privilege stops being optional.
387
00:14:22,440 –> 00:14:24,960
At small scale, people tolerate sloppy permissions
388
00:14:24,960 –> 00:14:27,360
because the blast radius feels contained.
389
00:14:27,360 –> 00:14:32,040
At agent scale, sloppy permissions turn into repeated automation of your worst assumptions.
390
00:14:32,040 –> 00:14:34,560
The agent doesn’t occasionally overreach.
391
00:14:34,560 –> 00:14:37,560
It overreaches consistently.
392
00:14:37,560 –> 00:14:41,680
Entra Agent ID forces you to define permission boundaries explicitly.
393
00:14:41,680 –> 00:14:45,040
What this agent can read, what it can write, and what it can never touch.
394
00:14:45,040 –> 00:14:48,040
And when teams ask for exceptions, you can finally see what they’re doing.
395
00:14:48,040 –> 00:14:49,640
They’re not requesting convenience.
396
00:14:49,640 –> 00:14:51,720
They’re manufacturing an entropy generator.
397
00:14:51,720 –> 00:14:56,480
Second, accountability becomes a configured property, not a social agreement.
398
00:14:56,480 –> 00:14:58,920
Every agent identity needs an owner and a sponsor.
399
00:14:58,920 –> 00:15:00,960
The owner handles the build and maintenance.
400
00:15:00,960 –> 00:15:04,520
The sponsor owns the business outcome, risk acceptance and funding.
401
00:15:04,520 –> 00:15:07,800
Those are different roles, and enterprises keep pretending they aren’t.
402
00:15:07,800 –> 00:15:09,600
This is where life cycle workflows matter.
403
00:15:09,600 –> 00:15:13,240
If the sponsor leaves, the agent can’t be allowed to drift into orphan status.
404
00:15:13,240 –> 00:15:16,560
Orphaned agents don’t retire, they rot. They keep their permissions.
405
00:15:16,560 –> 00:15:18,920
They keep being used, they keep producing incidents
406
00:15:18,920 –> 00:15:20,160
that nobody can explain.
407
00:15:20,160 –> 00:15:24,480
So the identity object needs life cycle controls: a creation gate, periodic review,
408
00:15:24,480 –> 00:15:26,400
reattestation and deprecation.
409
00:15:26,400 –> 00:15:29,760
If you don’t enforce that, agent products become zombie services.
410
00:15:29,760 –> 00:15:32,520
Third, incident response becomes containment, not politics.
411
00:15:32,520 –> 00:15:35,280
With Entra Agent ID, you can disable the agent identity
412
00:15:35,280 –> 00:15:37,240
without disabling the entire program.
413
00:15:37,240 –> 00:15:38,320
That sounds small.
414
00:15:38,320 –> 00:15:39,160
It isn’t.
415
00:15:39,160 –> 00:15:42,200
That is the difference between "we contained the incident"
416
00:15:42,200 –> 00:15:45,160
and "we paused all agents until further notice."
417
00:15:45,160 –> 00:15:48,480
Executives don’t fund programs that can’t be surgically contained.
418
00:15:48,480 –> 00:15:51,240
And now the subtle but critical part, Entra Agent ID
419
00:15:51,240 –> 00:15:52,760
doesn’t magically secure agents.
420
00:15:52,760 –> 00:15:54,920
It makes security enforceable.
421
00:15:54,920 –> 00:15:59,440
It’s the anchor that lets Purview exposure map to a specific actor.
422
00:15:59,440 –> 00:16:03,600
It’s the anchor that lets Defender detections turn into an action you can take.
423
00:16:03,600 –> 00:16:06,400
It’s the anchor that makes tool usage attributable.
424
00:16:06,400 –> 00:16:08,360
Identity is not the whole control plane.
425
00:16:08,360 –> 00:16:10,640
Identity is the thing the other layers attach to.
426
00:16:10,640 –> 00:16:11,600
Here’s the micro story.
427
00:16:11,600 –> 00:16:15,280
It’s one every audit team recognizes: a service desk agent updates a ticket queue.
428
00:16:15,280 –> 00:16:18,440
A week later, someone sees a set of tickets reassigned incorrectly.
429
00:16:18,440 –> 00:16:21,520
Under identity drift, you get "we think the bot did it."
430
00:16:21,520 –> 00:16:23,400
Under Entra Agent ID, you get
431
00:16:23,400 –> 00:16:25,840
agent identity X, tool action Y,
432
00:16:25,840 –> 00:16:29,560
executed at time Z, approved by person A, using permission set B.
433
00:16:29,560 –> 00:16:30,640
That’s not a better story.
434
00:16:30,640 –> 00:16:31,400
That’s evidence.
435
00:16:31,400 –> 00:16:33,240
And evidence is what keeps the program alive.
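That evidence chain can be sketched as a structured audit record. The field names and identifiers below are illustrative, not the actual Entra Agent ID log schema:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

# Hypothetical audit record -- field names are illustrative, not the
# actual Entra Agent ID log schema.
@dataclass(frozen=True)
class AgentActionRecord:
    agent_identity: str   # which agent identity acted
    tool_action: str      # which tool action it executed
    executed_at: str      # when (ISO 8601, UTC)
    approved_by: str      # which person approved the permission grant
    permission_set: str   # which permission set it ran under

record = AgentActionRecord(
    agent_identity="agent-servicedesk-01",          # hypothetical names
    tool_action="ticket.reassign",
    executed_at=datetime(2024, 5, 1, 9, 30, tzinfo=timezone.utc).isoformat(),
    approved_by="person.a@contoso.example",
    permission_set="servicedesk-write-v2",
)

# Every field answerable on demand is what separates
# "we think the bot did it" from evidence.
print(asdict(record))
```

If any of those five fields can come back empty during an investigation, you have a story, not evidence.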
436
00:16:33,240 –> 00:16:36,440
So, treat Entra Agent ID as the starting line, not the finish line.
437
00:16:36,440 –> 00:16:37,880
It gives you attribution.
438
00:16:37,880 –> 00:16:38,800
It gives you boundaries.
439
00:16:38,800 –> 00:16:40,280
It gives you life cycle hooks.
440
00:16:40,280 –> 00:16:44,200
It gives you the ability to disable one actor without burning the whole village down.
441
00:16:44,200 –> 00:16:46,480
But Identity alone doesn’t prevent damage.
442
00:16:46,480 –> 00:16:48,080
Identity tells you who acted.
443
00:16:48,080 –> 00:16:49,760
It doesn’t limit what happens when they act.
444
00:16:49,760 –> 00:16:51,040
That’s the next layer.
445
00:16:51,040 –> 00:16:53,480
And it’s where most enterprises lose control.
446
00:16:53,480 –> 00:16:56,520
Tools define blast radius.
447
00:16:56,520 –> 00:17:00,760
Next, MCP as the tool contract that turns integrations into infrastructure
448
00:17:00,760 –> 00:17:04,960
and why standardization can either save you or amplify mistakes at scale.
449
00:17:04,960 –> 00:17:08,280
Risk 2, data leakage via grounding plus tools.
450
00:17:08,280 –> 00:17:12,680
Data leakage in agent systems rarely looks like an attacker exfiltrating secrets.
451
00:17:12,680 –> 00:17:16,560
It looks like a helpful workflow doing exactly what it was configured to do
452
00:17:16,560 –> 00:17:20,720
with exactly the access it was granted using exactly the data it was allowed to retrieve.
453
00:17:20,720 –> 00:17:22,640
Agents don’t leak maliciously.
454
00:17:22,640 –> 00:17:24,240
They leak obediently.
455
00:17:24,240 –> 00:17:28,120
That distinction matters because it changes the post-incident conversation.
456
00:17:28,120 –> 00:17:31,440
If the organization believes leakage is an intent problem,
457
00:17:31,440 –> 00:17:36,080
it will keep buying more training, more awareness and more policies written in PowerPoint.
458
00:17:36,080 –> 00:17:37,040
The system won’t care.
459
00:17:37,040 –> 00:17:39,760
The system will keep following the configured pathways.
460
00:17:39,760 –> 00:17:43,280
Leakage happens at the boundary between grounding and tools.
461
00:17:43,280 –> 00:17:47,960
Grounding is where the agent pulls context: files, emails, chats, SharePoint pages,
462
00:17:47,960 –> 00:17:52,240
Confluence, tickets, CRM records, the knowledge base someone swore was clean.
463
00:17:52,240 –> 00:17:56,560
Tools are where the agent takes action, exporting, summarizing, sending, posting,
464
00:17:56,560 –> 00:17:58,160
creating, updating, attaching.
465
00:17:58,160 –> 00:18:01,720
If you overscope either one, the agent becomes a high-speed compliance incident.
466
00:18:01,720 –> 00:18:05,160
Here are the common leakage pathways and none of them require someone to be evil.
467
00:18:05,160 –> 00:18:07,600
First, overbroad retrieval.
468
00:18:07,600 –> 00:18:11,240
Someone grounds an agent on the whole department SharePoint because it’s convenient
469
00:18:11,240 –> 00:18:14,360
or all Teams messages because the agent feels smarter.
470
00:18:14,360 –> 00:18:16,920
Or they use a connector that returns more than the task needs
471
00:18:16,920 –> 00:18:19,440
because nobody enforced a minimal data set contract.
472
00:18:19,440 –> 00:18:24,240
The agent now has access to a blended pool of data with different sensitivity levels,
473
00:18:24,240 –> 00:18:26,720
different retention rules and different audiences.
474
00:18:26,720 –> 00:18:28,400
Then a user asks an innocent question.
475
00:18:28,400 –> 00:18:29,880
The agent answers correctly.
476
00:18:29,880 –> 00:18:33,680
And the correct answer contains something that was never meant to be surfaced in that context.
477
00:18:33,680 –> 00:18:35,080
The agent didn’t hack anything.
478
00:18:35,080 –> 00:18:36,960
It just retrieved what you let it retrieve.
479
00:18:36,960 –> 00:18:39,480
Second, cross-domain context reuse.
480
00:18:39,480 –> 00:18:40,720
This is the one people miss.
481
00:18:40,720 –> 00:18:43,840
Agents operate in loops and they maintain state across steps.
482
00:18:43,840 –> 00:18:47,520
That means content retrieved for one part of a workflow can bleed into another part,
483
00:18:47,520 –> 00:18:50,400
especially when you chain agents or reuse tool outputs.
484
00:18:50,400 –> 00:18:52,720
A procurement agent retrieves a contract clause.
485
00:18:52,720 –> 00:18:56,400
A separate write-the-email agent gets handed that clause as context.
486
00:18:56,400 –> 00:19:00,040
The email agent posts it into a Teams channel that includes external guests
487
00:19:00,040 –> 00:19:02,200
because that’s where requests are handled.
488
00:19:02,200 –> 00:19:04,040
Everyone claims they followed policy.
489
00:19:04,040 –> 00:19:05,240
The system still leaked.
490
00:19:05,240 –> 00:19:06,320
The leak wasn’t the model.
491
00:19:06,320 –> 00:19:08,000
The leak was the choreography.
492
00:19:08,000 –> 00:19:11,360
Third, tool outputs get treated as safe artifacts.
493
00:19:11,360 –> 00:19:13,000
Tools produce outputs.
494
00:19:13,000 –> 00:19:17,600
Summaries, exports, spreadsheets, tickets, cards, notifications, file attachments.
495
00:19:17,600 –> 00:19:20,520
Those outputs then get reused as inputs elsewhere.
496
00:19:20,520 –> 00:19:22,600
If you don’t carry provenance and sensitivity forward,
497
00:19:22,600 –> 00:19:26,120
the system quietly launders restricted information into unrestricted channels.
498
00:19:26,120 –> 00:19:29,360
This is where a helpful agent design turns into helpful breach design
499
00:19:29,360 –> 00:19:32,040
because the agent can do the dangerous thing with a polite tone.
500
00:19:32,040 –> 00:19:35,320
Now the executive level trap in all of this is what happens after the incident.
501
00:19:35,320 –> 00:19:37,200
Leadership asks, "why did the agent do that?"
502
00:19:37,200 –> 00:19:39,840
And the honest answer is "because it worked as designed."
503
00:19:39,840 –> 00:19:41,320
That’s when trust collapses.
504
00:19:41,320 –> 00:19:42,840
Not because the incident happened.
505
00:19:42,840 –> 00:19:45,400
Incidents always happen, but because the enterprise realizes
506
00:19:45,400 –> 00:19:47,920
it didn’t actually design enforceable boundaries.
507
00:19:47,920 –> 00:19:49,280
It designed aspirations.
508
00:19:49,280 –> 00:19:51,080
And aspirations don’t survive scale.
509
00:19:51,080 –> 00:19:53,920
The most uncomfortable version of this is when the agent answers
510
00:19:53,920 –> 00:19:55,400
from the wrong policy source.
511
00:19:55,400 –> 00:19:56,200
Not a wrong answer.
512
00:19:56,200 –> 00:19:58,560
A correct answer from a non-authoritative source.
513
00:19:58,560 –> 00:20:02,440
This happens constantly in enterprises because available and authoritative
514
00:20:02,440 –> 00:20:03,680
are not the same thing.
515
00:20:03,680 –> 00:20:07,000
The agent retrieves the first relevant looking document it can access.
516
00:20:07,000 –> 00:20:10,640
That document might be outdated, it might be a draft, it might be a local variant,
517
00:20:10,640 –> 00:20:13,520
it might be a wiki page someone updated during a fire drill,
518
00:20:13,520 –> 00:20:17,160
it might be a screenshot of a policy pasted into a Teams message.
519
00:20:17,160 –> 00:20:19,320
The agent then gives the answer with confidence.
520
00:20:19,320 –> 00:20:22,560
The user acts. Compliance inherits the blast radius.
521
00:20:22,560 –> 00:20:26,360
And when you investigate, you discover the agent never violated access controls.
522
00:20:26,360 –> 00:20:28,760
It had access, it retrieved, it responded.
523
00:20:28,760 –> 00:20:30,000
The failure was governance.
524
00:20:30,000 –> 00:20:33,640
You never told the system which sources are allowed to be truth.
525
00:20:33,640 –> 00:20:36,520
So when executives ask, "how do we stop data leakage?"
526
00:20:36,520 –> 00:20:39,080
The correct answer is not "make the model safer."
527
00:20:39,080 –> 00:20:43,000
The correct answer is enforce the data boundary before retrieval
528
00:20:43,000 –> 00:20:47,080
and enforce the tool boundary before action and keep provenance attached to outputs.
529
00:20:47,080 –> 00:20:49,960
If you don’t, you’re just accelerating the wrong behavior.
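"Enforce the data boundary before retrieval" can be made concrete with a minimal sketch: the agent may only ground on pre-approved sources, and content above its sensitivity ceiling is filtered out before it ever reaches the model context. The source names, labels, and thresholds are all illustrative assumptions.

```python
# Minimal sketch of boundary-before-retrieval. Source names and
# sensitivity labels are illustrative, not a Purview schema.
APPROVED_SOURCES = {"policy-repo", "public-kb"}
SENSITIVITY = {"public": 0, "internal": 1, "confidential": 2}

def retrieve(documents, source, max_label="internal"):
    if source not in APPROVED_SOURCES:
        raise PermissionError(f"source '{source}' is not approved for grounding")
    ceiling = SENSITIVITY[max_label]
    # Filter BEFORE retrieval results enter the context window,
    # rather than trying to sanitize the response afterwards.
    return [d for d in documents if SENSITIVITY[d["label"]] <= ceiling]

docs = [
    {"id": "p1", "label": "public"},
    {"id": "i1", "label": "internal"},
    {"id": "c1", "label": "confidential"},
]

print([d["id"] for d in retrieve(docs, "policy-repo")])  # ['p1', 'i1']
```

The confidential document never enters the context, so there is nothing downstream to block, log around, or launder.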
530
00:20:49,960 –> 00:20:51,960
And this is why a tool contract matters.
531
00:20:51,960 –> 00:20:56,160
Because bespoke connectors and one-off integrations are where boundary enforcement dies.
532
00:20:56,160 –> 00:20:59,120
Every custom integration expresses its own assumptions.
533
00:20:59,120 –> 00:21:00,840
Every shortcut creates a new pathway.
534
00:21:00,840 –> 00:21:02,920
Every pathway becomes a leak candidate.
535
00:21:02,920 –> 00:21:05,560
A standardized tool contract doesn’t eliminate mistakes.
536
00:21:05,560 –> 00:21:09,320
It makes mistakes visible, reviewable, and reusable in a controlled way.
537
00:21:09,320 –> 00:21:10,320
That’s the transition.
538
00:21:10,320 –> 00:21:12,320
Identity tells you who acted.
539
00:21:12,320 –> 00:21:14,080
But tools determine what they can break.
540
00:21:14,080 –> 00:21:16,960
Next, MCP, reframed as a tool contract.
541
00:21:16,960 –> 00:21:20,560
Integration becomes infrastructure and infrastructure is the only thing that scales
542
00:21:20,560 –> 00:21:22,520
without becoming conditional chaos.
543
00:21:22,520 –> 00:21:24,280
Control plane layer 2.
544
00:21:24,280 –> 00:21:26,240
MCP as the tool contract.
545
00:21:26,240 –> 00:21:28,440
MCP is not a new connector type.
546
00:21:28,440 –> 00:21:29,760
That’s the comfortable framing.
547
00:21:29,760 –> 00:21:32,880
In architectural terms, MCP is a tool contract,
548
00:21:32,880 –> 00:21:38,000
a standardized way for agents to discover capabilities, call them, and receive structured results.
549
00:21:38,000 –> 00:21:41,440
And that matters because tool sprawl is where governance goes to die.
550
00:21:41,440 –> 00:21:44,160
Every bespoke integration is a new snowflake.
551
00:21:44,160 –> 00:21:45,160
Different auth.
552
00:21:45,160 –> 00:21:46,160
Different logging.
553
00:21:46,160 –> 00:21:47,480
Different error behavior.
554
00:21:47,480 –> 00:21:49,600
Different assumptions about what safe means.
555
00:21:49,600 –> 00:21:51,440
At 10 agents, you can survive that.
556
00:21:51,440 –> 00:21:52,440
At 100, you can’t.
557
00:21:52,440 –> 00:21:55,160
At 1000, you’re not operating an ecosystem anymore.
558
00:21:55,160 –> 00:21:56,360
You’re operating an accident.
559
00:21:56,360 –> 00:21:58,960
So MCP turns integration into infrastructure.
560
00:21:58,960 –> 00:22:02,600
It makes tools something the platform team can treat like a shared service instead of
561
00:22:02,600 –> 00:22:04,160
a per agent science project.
562
00:22:04,160 –> 00:22:05,480
That is the actual value.
563
00:22:05,480 –> 00:22:06,800
Here’s what most people miss.
564
00:22:06,800 –> 00:22:08,920
Standard tool discovery isn’t about developer convenience.
565
00:22:08,920 –> 00:22:10,080
It’s about predictability.
566
00:22:10,080 –> 00:22:14,600
The system can’t enforce intent if every agent calls a different implementation of "create
567
00:22:14,600 –> 00:22:16,560
ticket" or "update record."
568
00:22:16,560 –> 00:22:17,800
You don’t just get duplication.
569
00:22:17,800 –> 00:22:21,960
You get divergent behavior and divergent behavior becomes inconsistent outcomes.
570
00:22:21,960 –> 00:22:23,320
In other words, decision debt.
571
00:22:23,320 –> 00:22:28,160
With MCP, the enterprise can standardize the verbs: create, read, update, approve, notify,
572
00:22:28,160 –> 00:22:31,720
export, and then govern those verbs like they are part of the control plane.
573
00:22:31,720 –> 00:22:33,160
Because they are.
574
00:22:33,160 –> 00:22:34,440
Now there’s a tension here.
575
00:22:34,440 –> 00:22:37,040
And pretending it doesn’t exist is how programs get hurt.
576
00:22:37,040 –> 00:22:38,680
MCP reduces bespoke chaos.
577
00:22:38,680 –> 00:22:40,920
It also makes mistakes propagate faster.
578
00:22:40,920 –> 00:22:44,320
When you build one custom connector wrong, you usually hurt one agent.
579
00:22:44,320 –> 00:22:48,480
When you publish one MCP tool contract wrong, you potentially hurt every agent that discovers
580
00:22:48,480 –> 00:22:49,480
and reuses it.
581
00:22:49,480 –> 00:22:51,040
That’s not a reason to avoid MCP.
582
00:22:51,040 –> 00:22:55,840
It’s a reason to treat MCP like production infrastructure: versioning, review, and blast
583
00:22:55,840 –> 00:22:56,840
radius thinking.
584
00:22:56,840 –> 00:22:59,200
So what does contract thinking actually mean?
585
00:22:59,200 –> 00:23:02,960
It means tools are enforceable interfaces, not developer shortcuts.
586
00:23:02,960 –> 00:23:05,280
An enforceable interface has three properties.
587
00:23:05,280 –> 00:23:07,200
One, it’s explicit about what it does.
588
00:23:07,200 –> 00:23:09,320
Not in marketing language, in operational language.
589
00:23:09,320 –> 00:23:10,320
Does it read?
590
00:23:10,320 –> 00:23:11,320
Does it write?
591
00:23:11,320 –> 00:23:12,320
Does it delete?
592
00:23:12,320 –> 00:23:14,320
Does it create side effects that can’t be undone?
593
00:23:14,320 –> 00:23:18,720
Two, it’s explicit about what inputs it accepts and what outputs it produces.
594
00:23:18,720 –> 00:23:21,480
If the output is free text, you just created a laundering pathway.
595
00:23:21,480 –> 00:23:24,280
The next agent can’t reliably know what it received.
596
00:23:24,280 –> 00:23:28,840
If the output is structured (fields, identifiers, source metadata), you can carry provenance
597
00:23:28,840 –> 00:23:31,080
forward and you can audit behavior later.
598
00:23:31,080 –> 00:23:33,120
Three, it’s explicit about authority.
599
00:23:33,120 –> 00:23:37,280
Which identity calls it, with which permission scope, against which system boundary.
600
00:23:37,280 –> 00:23:41,520
If the tool can do privileged actions, then privileged actions require an approval pattern.
601
00:23:41,520 –> 00:23:43,600
Not a policy document, a real gate.
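The three properties of an enforceable interface can be sketched as a declarative tool descriptor plus a gate. This mirrors the spirit of an MCP tool definition, but the actual MCP schema differs and every name here is illustrative.

```python
# Hedged sketch of "contract thinking" -- not the real MCP schema.
CREATE_TICKET = {
    "name": "create_ticket",
    # One: explicit about what it does, in operational language.
    "effects": {"reads": False, "writes": True, "deletes": False,
                "irreversible": False},
    # Two: explicit, structured inputs and outputs -- never free text.
    "input_schema": {"title": "str", "priority": "str"},
    "output_schema": {"ticket_id": "str", "actor_identity": "str",
                      "executed_at": "str"},
    # Three: explicit about authority.
    "required_scope": "servicedesk.write",
    "requires_approval": False,
}

def authorize(contract, granted_scopes, approved=False):
    """A real gate, not a policy document: the call is refused unless
    the caller holds the scope and any required approval exists."""
    if contract["required_scope"] not in granted_scopes:
        return False
    if contract["requires_approval"] and not approved:
        return False
    return True

print(authorize(CREATE_TICKET, {"servicedesk.write"}))  # True
print(authorize(CREATE_TICKET, {"servicedesk.read"}))   # False
```

Because the contract is data, a central review can diff it between versions, which is exactly what bespoke connectors never give you.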
602
00:23:43,600 –> 00:23:47,000
This is why MCP matters even when you already have connectors.
603
00:23:47,000 –> 00:23:48,560
Connectors get you "it works."
604
00:23:48,560 –> 00:23:52,280
MCP gets you "it works the same way everywhere," which is the only thing that scales.
605
00:23:52,280 –> 00:23:56,400
And this is also why MCP belongs in the control plane, not buried inside app teams.
606
00:23:56,400 –> 00:24:01,520
Tool contracts need central review because they define the blast radius of autonomous behavior.
607
00:24:01,520 –> 00:24:05,640
If you let every team publish MCP tools ad hoc, you didn’t decentralize innovation.
608
00:24:05,640 –> 00:24:07,240
You decentralized authority.
609
00:24:07,240 –> 00:24:09,200
That’s the same mistake, just with nicer JSON.
610
00:24:09,200 –> 00:24:11,360
So what should executives demand here?
611
00:24:11,360 –> 00:24:13,520
They should demand a curated tool catalog.
612
00:24:13,520 –> 00:24:17,800
Which MCP servers are allowed, which tools are exposed, and which actions are deliberately
613
00:24:17,800 –> 00:24:18,800
disabled.
614
00:24:18,800 –> 00:24:21,280
"We enabled MCP" is not a control, it’s a headline.
615
00:24:21,280 –> 00:24:24,000
They should demand version control and deprecation rules.
616
00:24:24,000 –> 00:24:27,960
Actors change, APIs change, workflows change. If the contract changes silently, agent behavior
617
00:24:27,960 –> 00:24:29,040
changes silently.
618
00:24:29,040 –> 00:24:33,480
And that’s how "it worked last week" turns into "we can’t explain what happened."
619
00:24:33,480 –> 00:24:36,680
And they should demand that every tool output includes provenance.
620
00:24:36,680 –> 00:24:40,600
If a tool returns policy guidance, it must return the policy source identifier, version,
621
00:24:40,600 –> 00:24:41,600
and timestamp.
622
00:24:41,600 –> 00:24:45,680
If it returns a ticket action, it must return the ticket ID and the actor identity that
623
00:24:45,680 –> 00:24:46,680
executed it.
624
00:24:46,680 –> 00:24:49,840
Otherwise, you’re building an ecosystem that can’t prove its own actions.
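The provenance demand can be sketched as a wrapper that refuses to emit a tool output without source, version, and timestamp attached, so downstream agents can never treat it as an unlabeled "safe artifact". Field names and values are illustrative.

```python
# Illustrative provenance wrapper -- field names are assumptions,
# not a real MCP or Purview schema.
def with_provenance(payload, *, source_id, version, timestamp):
    if not (source_id and version and timestamp):
        raise ValueError("refusing to emit tool output without provenance")
    return {
        "data": payload,
        "provenance": {
            "source_id": source_id,
            "version": version,
            "timestamp": timestamp,
        },
    }

result = with_provenance(
    {"guidance": "Expenses over 500 EUR need manager approval."},
    source_id="policy-repo/expenses.md",   # hypothetical source
    version="v7",
    timestamp="2024-05-01T09:30:00Z",
)
print(result["provenance"]["source_id"])  # policy-repo/expenses.md
```

The failure mode this closes is the laundering pathway: the next agent in the chain receives the sensitivity and origin along with the content, instead of bare text.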
625
00:24:49,840 –> 00:24:52,920
Now the boundary problem doesn’t go away just because tools are standardized.
626
00:24:52,920 –> 00:24:57,360
A well-governed tool can still retrieve the wrong data if the data boundary is weak.
627
00:24:57,360 –> 00:25:00,200
And a well-governed tool can still leak if the agent can read too much.
628
00:25:00,200 –> 00:25:02,000
So MCP reduces bespoke chaos.
629
00:25:02,000 –> 00:25:03,720
It does not prevent obedient leakage.
630
00:25:03,720 –> 00:25:05,520
That’s why the next layer exists.
631
00:25:05,520 –> 00:25:10,120
Purview DSPM for AI, not as a compliance dashboard, but as the data boundary that decides
632
00:25:10,120 –> 00:25:14,000
what agents may see and what they may never bring back.
633
00:25:14,000 –> 00:25:17,800
Control plane layer three, Purview DSPM for AI as the data boundary.
634
00:25:17,800 –> 00:25:21,800
Purview DSPM for AI exists for a reason most programs try to skip.
635
00:25:21,800 –> 00:25:24,280
Governance law: you can’t govern what you can’t see.
636
00:25:24,280 –> 00:25:25,920
Identity tells you who acted.
637
00:25:25,920 –> 00:25:28,120
MCP standardizes what they can do.
638
00:25:28,120 –> 00:25:32,160
But neither one answers the question that actually determines whether your agents become
639
00:25:32,160 –> 00:25:33,320
a leak factory.
640
00:25:33,320 –> 00:25:34,480
What data did they touch?
641
00:25:34,480 –> 00:25:35,560
What did they retrieve?
642
00:25:35,560 –> 00:25:37,080
And what did they return?
643
00:25:37,080 –> 00:25:40,760
At enterprise scale, "we think it only used approved data" is not a control.
644
00:25:40,760 –> 00:25:42,400
It’s wishful thinking with a dashboard.
645
00:25:42,400 –> 00:25:45,880
Purview is the layer that makes data boundaries observable.
646
00:25:45,880 –> 00:25:50,600
Not just in the classic DLP sense of "don’t share credit cards," but in the agentic sense:
647
00:25:50,600 –> 00:25:56,160
where is sensitive data exposed to AI systems, which agents are accessing it, and which interactions
648
00:25:56,160 –> 00:26:00,560
create risk signals you can act on before the incident becomes a headline?
649
00:26:00,560 –> 00:26:04,400
This is the foundational misunderstanding with data governance in the agent era.
650
00:26:04,400 –> 00:26:06,120
Most teams try to govern the response.
651
00:26:06,120 –> 00:26:09,440
They let the agent read whatever it can read, then they try to prevent it from saying the
652
00:26:09,440 –> 00:26:10,440
wrong thing.
653
00:26:10,440 –> 00:26:11,440
That’s backwards.
654
00:26:11,440 –> 00:26:14,000
If the agent is allowed to read it, the agent can leak it.
655
00:26:14,000 –> 00:26:17,440
Even if you block the output, you’ve still created an access pathway, an evidence trail,
656
00:26:17,440 –> 00:26:18,920
and an attack surface.
657
00:26:18,920 –> 00:26:20,600
The breach might not be in the response.
658
00:26:20,600 –> 00:26:24,840
It might be in the retrieval logs, the tool outputs, the intermediate state or the downstream
659
00:26:24,840 –> 00:26:26,880
workflow that reuses the data.
660
00:26:26,880 –> 00:26:29,040
Blocking the final sentence doesn’t undo that.
661
00:26:29,040 –> 00:26:34,080
So the boundary mindset is "prevent reading when you must," not "allow reading and hope you
662
00:26:34,080 –> 00:26:35,800
can sanitize later."
663
00:26:35,800 –> 00:26:38,200
This is where Purview becomes more than compliance theatre.
664
00:26:38,200 –> 00:26:40,720
It becomes the system that helps you define three things
665
00:26:40,720 –> 00:26:42,920
the business never defines cleanly on its own.
666
00:26:42,920 –> 00:26:46,920
First, what data is sensitive, where it lives and who is allowed to access it in the
667
00:26:46,920 –> 00:26:47,920
first place.
668
00:26:47,920 –> 00:26:50,720
The old problem is still unsolved in most tenants.
669
00:26:50,720 –> 00:26:54,120
Second, which data sources are allowed to be used as grounding for agents.
670
00:26:54,120 –> 00:26:55,760
This is the new problem.
671
00:26:55,760 –> 00:26:59,000
"Allowed for humans" is not the same as "allowed for agents."
672
00:26:59,000 –> 00:27:00,720
Agents scale retrieval.
673
00:27:00,720 –> 00:27:01,720
Humans don’t.
674
00:27:01,720 –> 00:27:04,360
A human opening a document is a single event.
675
00:27:04,360 –> 00:27:10,720
An agent can retrieve a hundred related documents in a loop because the goal required context.
676
00:27:10,720 –> 00:27:14,000
Same permissions, totally different behavior.
677
00:27:14,000 –> 00:27:17,040
Third, what constitutes authoritative versus available.
678
00:27:17,040 –> 00:27:19,040
And this is the part that collapses trust.
679
00:27:19,040 –> 00:27:23,080
If your agent pulls policy from the most accessible location instead of the authoritative
680
00:27:23,080 –> 00:27:27,200
repository, you’ve just automated misinformation with a professional tone.
681
00:27:27,200 –> 00:27:31,200
Purview helps by giving you visibility and exposure tracking across the content plane
682
00:27:31,200 –> 00:27:34,800
and by surfacing risk signals tied to AI usage patterns.
683
00:27:34,800 –> 00:27:37,240
It doesn’t magically fix your information architecture.
684
00:27:37,240 –> 00:27:39,840
It just stops you from pretending the architecture doesn’t matter.
685
00:27:39,840 –> 00:27:45,400
Now, executives usually want one thing from Purview: reporting that maps to business reality.
686
00:27:45,400 –> 00:27:47,360
They don’t want a heat map of labels.
687
00:27:47,360 –> 00:27:51,480
They want answers to questions like which agents access sensitive data, which users rely
688
00:27:51,480 –> 00:27:55,800
on those agents and where policy violations are accumulating over time.
689
00:27:55,800 –> 00:27:58,960
Because that’s what determines whether the program scales or gets paused.
690
00:27:58,960 –> 00:28:02,960
So the practical framing is DSPM for AI as an executive dashboard for adoption, exposure
691
00:28:02,960 –> 00:28:03,960
and violations.
692
00:28:03,960 –> 00:28:07,360
Not as a monthly compliance report, as an operational control plane signal.
693
00:28:07,360 –> 00:28:10,160
And here’s the important distinction to keep straight.
694
00:28:10,160 –> 00:28:11,480
Visibility is not enforcement.
695
00:28:11,480 –> 00:28:13,200
Purview can show you exposure.
696
00:28:13,200 –> 00:28:14,200
It can show you risk.
697
00:28:14,200 –> 00:28:16,200
It can help you classify and define boundaries.
698
00:28:16,200 –> 00:28:19,640
But if you stop there, you’ve built a better rear-view mirror, not a safer vehicle.
699
00:28:19,640 –> 00:28:22,320
This is why Purview has to connect back to the other layers.
700
00:28:22,320 –> 00:28:24,680
Entra agent ID gives you attribution.
701
00:28:24,680 –> 00:28:27,640
Which agent identity performed the access?
702
00:28:27,640 –> 00:28:29,680
MCP gives you the tool boundary.
703
00:28:29,680 –> 00:28:32,560
Which tool path retrieved or exported the data?
704
00:28:32,560 –> 00:28:34,360
Purview gives you the data boundary.
705
00:28:34,360 –> 00:28:37,240
Which content was sensitive, which sources were used, and
706
00:28:37,240 –> 00:28:39,560
whether the interaction crossed a policy line.
707
00:28:39,560 –> 00:28:42,680
And then you still need a layer that detects behavior drift over time.
708
00:28:42,680 –> 00:28:45,320
Because agents don’t fail only through bad design.
709
00:28:45,320 –> 00:28:47,120
They fail through evolving usage.
710
00:28:47,120 –> 00:28:47,960
People probe them.
711
00:28:47,960 –> 00:28:48,720
People push them.
712
00:28:48,720 –> 00:28:50,840
People accidentally feed them the wrong context.
713
00:28:50,840 –> 00:28:52,200
Attackers deliberately do it.
714
00:28:52,200 –> 00:28:56,560
And the normal behavior of an agent changes as the organization changes around it.
715
00:28:56,560 –> 00:28:58,440
That’s the part compliance teams hate.
716
00:28:58,440 –> 00:29:01,080
Because it means the boundary is not a one-time design.
717
00:29:01,080 –> 00:29:02,400
It’s ongoing entropy management.
718
00:29:02,400 –> 00:29:03,920
So the rule for this layer is simple.
719
00:29:03,920 –> 00:29:07,680
If you want agents at scale, you must treat data governance as a runtime boundary,
720
00:29:07,680 –> 00:29:09,480
not a documentation exercise.
721
00:29:09,480 –> 00:29:12,200
If you can’t see what agents are touching, you can’t defend it.
722
00:29:12,200 –> 00:29:14,960
And if you only try to govern what they say, you’re governing the wrong surface.
723
00:29:14,960 –> 00:29:18,520
Purview gives you the visibility and the boundary signals to stop guessing.
724
00:29:18,520 –> 00:29:20,240
But visibility isn’t defense.
725
00:29:20,240 –> 00:29:22,280
Behavior still drifts under pressure.
726
00:29:22,280 –> 00:29:23,880
Next layer, defender for AI.
727
00:29:23,880 –> 00:29:27,680
Because at scale, security becomes behavior based, whether you like it or not.
728
00:29:27,680 –> 00:29:29,480
Control plane layer four.
729
00:29:29,480 –> 00:29:32,200
Defender for AI as behavioral detection.
730
00:29:32,200 –> 00:29:37,800
Defender for AI is the layer that forces the enterprise to accept a basic reality.
731
00:29:37,800 –> 00:29:40,520
Agent security can’t be intent-based.
732
00:29:40,520 –> 00:29:43,280
Intent is what humans claim after something goes wrong.
733
00:29:43,280 –> 00:29:45,240
Behavior is what systems actually do.
734
00:29:45,240 –> 00:29:50,400
And once agents act through tools across data sources in loops, you’re no longer defending
735
00:29:50,400 –> 00:29:51,560
a chat interface.
736
00:29:51,560 –> 00:29:54,960
You’re defending an execution surface that changes shape every day.
737
00:29:54,960 –> 00:29:57,080
Because the organization keeps changing around it.
738
00:29:57,080 –> 00:29:59,600
That distinction matters.
739
00:29:59,600 –> 00:30:02,680
Most security programs are still built on a comfortable assumption.
740
00:30:02,680 –> 00:30:06,200
If you lock down identities and you label data, you’re covered.
741
00:30:06,200 –> 00:30:08,160
That was already optimistic before agents.
742
00:30:08,160 –> 00:30:10,080
In the agent era, it’s inadequate.
743
00:30:10,080 –> 00:30:12,560
Because the failure mode isn’t only unauthorized access.
744
00:30:12,560 –> 00:30:15,360
It’s authorized access used in unauthorized patterns.
745
00:30:15,360 –> 00:30:17,880
The same identity can be valid and still dangerous.
746
00:30:17,880 –> 00:30:20,440
The same tool can be approved and still abused.
747
00:30:20,440 –> 00:30:24,240
The same data source can be allowed and still produce a leak when combined with the wrong
748
00:30:24,240 –> 00:30:25,480
prompt at the wrong time.
749
00:30:25,480 –> 00:30:29,480
So Defender’s job in this control plane is not to judge whether an agent meant well.
750
00:30:29,480 –> 00:30:33,400
It’s to detect when an agent’s behavior stops matching the enterprise’s assumptions.
751
00:30:33,400 –> 00:30:35,960
That’s what behavioral detection actually means in practice.
752
00:30:35,960 –> 00:30:41,200
That means looking for patterns that indicate drift, coercion, or automation gone feral.
753
00:30:41,200 –> 00:30:45,960
Prompt injection attempts, tool abuse, anomalous access sequences, and "why is this agent suddenly
754
00:30:45,960 –> 00:30:47,280
doing that?"
755
00:30:47,280 –> 00:30:51,280
events that don’t show up in your design documents. Start with prompt injection.
756
00:30:51,280 –> 00:30:54,800
Because it’s the attack everyone pretends is an edge case until it’s in their incident
757
00:30:54,800 –> 00:30:55,800
report.
758
00:30:55,800 –> 00:30:57,200
Prompt injection isn’t magic.
759
00:30:57,200 –> 00:31:02,400
It’s just an attacker or a careless user smuggling instructions into the context stream.
760
00:31:02,400 –> 00:31:06,560
The agent reads it as part of the problem, then follows it as part of the solution.
761
00:31:06,560 –> 00:31:11,320
If the agent can call tools, the injection isn’t just about bad answers, it’s about bad actions.
762
00:31:11,320 –> 00:31:15,320
Defender’s role is to detect those attempts and raise signals you can act on.
763
00:31:15,320 –> 00:31:19,720
Suspicious instruction patterns, unexpected content shaping, and the characteristic "ignore
764
00:31:19,720 –> 00:31:23,560
previous instructions" style coercion that shows up in real attacks.
765
00:31:23,560 –> 00:31:24,800
Then tool abuse.
766
00:31:24,800 –> 00:31:29,440
Tool abuse looks like an agent calling a high impact tool at an unusual time, at an unusual
767
00:31:29,440 –> 00:31:31,840
volume or in an unusual sequence.
768
00:31:31,840 –> 00:31:35,880
The service desk agent that normally creates tickets suddenly starts updating existing
769
00:31:35,880 –> 00:31:36,880
tickets in bulk.
770
00:31:36,880 –> 00:31:40,880
A policy agent that normally retrieves one document starts pulling hundreds.
771
00:31:40,880 –> 00:31:44,400
An approvals agent that normally waits for humans suddenly chains actions without the
772
00:31:44,400 –> 00:31:45,720
expected checkpoints.
773
00:31:45,720 –> 00:31:47,120
The system isn’t broken.
774
00:31:47,120 –> 00:31:48,600
Your assumptions are.
775
00:31:48,600 –> 00:31:51,600
And those assumptions erode faster than your governance committees meet.
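The kind of drift signal described above can be approximated with a trivial baseline comparison. Real Defender for AI detections are far richer; this is only a sketch of the "why is this agent suddenly doing that?" idea, and the threshold and numbers are illustrative.

```python
from statistics import mean, stdev

# Toy behavioral detection: compare today's tool-call volume against
# the agent's own historical baseline via a z-score. The 3-sigma
# threshold is an illustrative assumption, not a Defender setting.
def is_anomalous(history, today, z_threshold=3.0):
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu
    return abs(today - mu) / sigma > z_threshold

# A policy agent that normally retrieves a handful of documents per day...
baseline = [3, 5, 4, 6, 4, 5, 3, 4]
print(is_anomalous(baseline, 5))     # a normal day
print(is_anomalous(baseline, 400))   # suddenly pulling hundreds
```

The useful property is that the baseline is per-agent and per-behavior, so "valid identity, approved tool, abnormal pattern" still produces a signal.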
776
00:31:51,600 –> 00:31:54,640
This is why defender for AI becomes mandatory at scale.
777
00:31:54,640 –> 00:31:56,160
Not because Microsoft says so.
778
00:31:56,160 –> 00:32:00,040
Because autonomous behavior plus human pressure surfaces design omissions
779
00:32:00,040 –> 00:32:01,200
on schedule.
780
00:32:01,200 –> 00:32:03,840
Users will ask agents to do things they shouldn’t.
781
00:32:03,840 –> 00:32:06,160
Makers will add tools they didn’t fully review.
782
00:32:06,160 –> 00:32:07,560
Vendors will ship updates.
783
00:32:07,560 –> 00:32:10,080
Somebody will paste the wrong thing into the wrong place.
784
00:32:10,080 –> 00:32:14,160
And the platform will keep operating. Security drift doesn’t announce itself, it accumulates.
785
00:32:14,160 –> 00:32:16,680
Defender gives you a way to see the accumulation as it happens.
786
00:32:16,680 –> 00:32:21,440
The practical outcome executives should care about is containment. Not containment of AI:
787
00:32:21,440 –> 00:32:25,360
containment of one agent identity, one tool pathway, one behavior pattern, without shutting
788
00:32:25,360 –> 00:32:26,360
down the whole program.
789
00:32:26,360 –> 00:32:27,880
That’s what keeps adoption alive.
790
00:32:27,880 –> 00:32:33,000
If the only response you have to suspicious behavior is "disable Copilot" or "pause all agents,"
791
00:32:33,000 –> 00:32:35,600
you’ve built a program that can’t survive scrutiny.
792
00:32:35,600 –> 00:32:38,880
You’ve also trained the business to treat AI as fragile.
793
00:32:38,880 –> 00:32:42,040
They will stop investing the moment it causes inconvenience.
794
00:32:42,040 –> 00:32:43,400
Defender changes the playbook.
795
00:32:43,400 –> 00:32:45,880
With identity in Entra, you can target the actor.
796
00:32:45,880 –> 00:32:48,720
With MCP tool contracts, you can target the capability.
797
00:32:48,720 –> 00:32:51,320
With Purview signals, you can target the exposure.
798
00:32:51,320 –> 00:32:53,800
With Defender, you can target the behavior in motion.
799
00:32:53,800 –> 00:32:56,840
So when something goes wrong, you don’t need a cross-functional war room to debate what
800
00:32:56,840 –> 00:32:57,840
happened for three days.
801
00:32:57,840 –> 00:33:02,160
You isolate the agent, you quarantine the tool, you roll back the permission boundary,
802
00:33:02,160 –> 00:33:06,080
you preserve the evidence trail, and you keep the rest of the ecosystem running.
803
00:33:06,080 –> 00:33:07,080
That’s the architectural win.
804
00:33:07,080 –> 00:33:09,800
You stop treating incidents as existential.
805
00:33:09,800 –> 00:33:10,960
Now there’s a trap here too.
806
00:33:10,960 –> 00:33:14,160
If you deploy Defender without attribution, you get alerts you can’t act on.
807
00:33:14,160 –> 00:33:18,320
You’ll see anomalous behavior involving agents, but you won’t have a clean identity chain
808
00:33:18,320 –> 00:33:19,320
to disable.
809
00:33:19,320 –> 00:33:22,600
If you deploy Defender without data boundaries, you’ll detect leaks after they already
810
00:33:22,600 –> 00:33:27,160
happened. If you deploy Defender without tool contracts, you’ll detect tool abuse without
811
00:33:27,160 –> 00:33:29,880
knowing which tool implementation was called.
812
00:33:29,880 –> 00:33:33,280
Detection without enforceability is just anxiety, so Defender isn’t the control plane
813
00:33:33,280 –> 00:33:34,280
by itself.
814
00:33:34,280 –> 00:33:36,680
It’s the last layer that makes the chain complete.
815
00:33:36,680 –> 00:33:40,720
Behavior signals tied to identities, tied to tools, tied to data boundaries, tied to actions
816
00:33:40,720 –> 00:33:42,320
you can actually take.
817
00:33:42,320 –> 00:33:45,120
And the reason it has to be the fourth layer is simple.
818
00:33:45,120 –> 00:33:47,160
Visibility isn’t enough.
819
00:33:47,160 –> 00:33:48,400
Boundaries aren’t enough.
820
00:33:48,400 –> 00:33:50,960
At scale, behavior drifts.
821
00:33:50,960 –> 00:33:55,280
Defender is how you see the drift early and how you contain it precisely before the drift
822
00:33:55,280 –> 00:33:58,880
becomes the forcing function that freezes the entire program.
823
00:33:58,880 –> 00:34:04,040
Next, connect the four layers into a single minimum viable control plane: Entra, MCP,
824
00:34:04,040 –> 00:34:08,640
Purview, and Defender, and why missing any one of them turns enforceable intelligence into
825
00:34:08,640 –> 00:34:09,880
conditional chaos.
826
00:34:09,880 –> 00:34:13,960
The minimum viable agent control plane, how the four layers interlock.
827
00:34:13,960 –> 00:34:18,120
Most organizations try to govern agents by buying one product, turning on one switch and
828
00:34:18,120 –> 00:34:19,120
declaring victory.
829
00:34:19,120 –> 00:34:20,120
That is not how this works.
830
00:34:20,120 –> 00:34:21,680
The control plane is not a product.
831
00:34:21,680 –> 00:34:23,080
It’s an enforcement chain.
832
00:34:23,080 –> 00:34:26,000
And chains fail at their weakest link, not their most expensive one.
833
00:34:26,000 –> 00:34:30,040
So the minimum viable agent control plane is four layers that interlock cleanly,
834
00:34:30,040 –> 00:34:31,800
with no gaps you have to explain later.
835
00:34:31,800 –> 00:34:36,840
Entra agent ID for who, MCP for what, Purview for what data, Defender for how it behaves.
836
00:34:36,840 –> 00:34:37,840
That’s it.
837
00:34:37,840 –> 00:34:40,920
Four layers: not because it’s elegant, but because anything less becomes probabilistic.
838
00:34:40,920 –> 00:34:43,240
Start with the interlock model as a mental map.
839
00:34:43,240 –> 00:34:44,880
Entra agent ID answers:
840
00:34:44,880 –> 00:34:45,880
Who is acting?
841
00:34:45,880 –> 00:34:47,040
Not who asked.
842
00:34:47,040 –> 00:34:48,160
Who executed?
843
00:34:48,160 –> 00:34:52,440
Which non-human identity performed the tool call, touched the data, and created the side
844
00:34:52,440 –> 00:34:53,440
effect?
845
00:34:53,440 –> 00:34:54,440
MCP answers:
846
00:34:54,440 –> 00:34:55,440
What is the agent allowed to do?
847
00:34:55,440 –> 00:34:57,560
Not in philosophical terms, in verbs.
848
00:34:57,560 –> 00:34:58,560
Which tools exist?
849
00:34:58,560 –> 00:34:59,800
Which actions are exposed?
850
00:34:59,800 –> 00:35:01,080
Which actions are disabled?
851
00:35:01,080 –> 00:35:02,240
Which outputs are structured?
852
00:35:02,240 –> 00:35:03,920
Which ones carry provenance?
853
00:35:03,920 –> 00:35:04,920
Purview answers:
854
00:35:04,920 –> 00:35:06,400
What data can the agent touch?
855
00:35:06,400 –> 00:35:07,880
And what data can it return?
856
00:35:07,880 –> 00:35:09,040
This is the boundary layer.
857
00:35:09,040 –> 00:35:12,360
It turns "we think it shouldn’t see that" into "it can’t read it."
858
00:35:12,360 –> 00:35:16,000
And it turns "we hope it didn’t use sensitive content" into exposure signals
859
00:35:16,000 –> 00:35:17,640
you can actually review.
860
00:35:17,640 –> 00:35:18,640
Defender answers:
861
00:35:18,640 –> 00:35:20,200
How is it behaving over time?
862
00:35:20,200 –> 00:35:25,560
Normal behavior becomes baseline, drift becomes detectable, abuse becomes containable, and
863
00:35:25,560 –> 00:35:30,000
the signals tie back to an identity you can disable, and a tool contract you can restrict.
864
00:35:30,000 –> 00:35:31,680
Now here’s what most people miss.
865
00:35:31,680 –> 00:35:33,040
These layers don’t just stack.
866
00:35:33,040 –> 00:35:34,200
They depend on each other.
867
00:35:34,200 –> 00:35:38,080
If Entra is missing, you can detect and monitor all day, but you can’t attribute.
868
00:35:38,080 –> 00:35:39,400
You can’t prove who acted.
869
00:35:39,400 –> 00:35:42,560
You can’t surgically disable one agent without collateral damage.
870
00:35:42,560 –> 00:35:45,320
You end up pausing the program because that’s the only lever left.
871
00:35:45,320 –> 00:35:49,920
If MCP discipline is missing, you can have perfect identity and perfect logs, but your integrations
872
00:35:49,920 –> 00:35:51,480
remain bespoke chaos.
873
00:35:51,480 –> 00:35:55,000
Every agent calls different endpoints in different ways, outputs different shapes, and
874
00:35:55,000 –> 00:35:56,000
leaks provenance.
875
00:35:56,000 –> 00:35:59,400
So your incident response becomes reverse engineering, not containment.
876
00:35:59,400 –> 00:36:02,960
If Purview is missing, you can know who acted and what tool they used, but you don’t know
877
00:36:02,960 –> 00:36:03,960
what they touched.
878
00:36:03,960 –> 00:36:07,320
You’ll discover exposure after the fact and your post-incident report will read like a
879
00:36:07,320 –> 00:36:08,320
guess.
880
00:36:08,320 –> 00:36:10,360
That’s the moment executives stop funding scale.
881
00:36:10,360 –> 00:36:13,840
If Defender is missing, you can label data, scope identities, and standardize tools,
882
00:36:13,840 –> 00:36:18,080
and still get burned by behavior drift because the environment changes, the prompts change,
883
00:36:18,080 –> 00:36:21,360
the agents change in new ways and your controls lag reality.
884
00:36:21,360 –> 00:36:23,040
Missing layers don’t create obvious gaps.
885
00:36:23,040 –> 00:36:26,240
They create ambiguity and ambiguity is where agent programs die.
886
00:36:26,240 –> 00:36:29,000
So what does minimum viable mean in operational terms?
887
00:36:29,000 –> 00:36:33,880
It means you can trace any agent action end-to-end as a single evidence chain.
888
00:36:33,880 –> 00:36:39,800
Identity, tool call, data access, behavior signal, outcome. Not screenshots, not a deck.
889
00:36:39,800 –> 00:36:40,800
Evidence.
890
00:36:40,800 –> 00:36:45,160
If a ticket gets created, you can show which agent identity created it, which MCP tool executed
891
00:36:45,160 –> 00:36:50,000
it, which data sources were read to ground it, and whether the behavior matched baseline.
892
00:36:50,000 –> 00:36:53,960
If a file gets shared, you can show the identity, the tool path, the data classification,
893
00:36:53,960 –> 00:36:56,320
and the behavioral context that led to the share.
894
00:36:56,320 –> 00:36:59,320
If a policy answer gets generated, you can show provenance.
895
00:36:59,320 –> 00:37:03,000
Which authoritative source was used and which non-authoritative sources were intentionally
896
00:37:03,000 –> 00:37:04,000
excluded.
897
00:37:04,000 –> 00:37:07,640
That is the control plane doing its job, enforcing intent at runtime.
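[Editor's note] The end-to-end evidence chain described here can be sketched as a single structured record. This is a minimal illustration in Python; the class and field names are hypothetical, not any actual Microsoft schema:

```python
from dataclasses import dataclass, field

# Hypothetical evidence-chain record: one row per agent action, linking
# identity, tool, data, and behavior signal into a single traceable unit.
@dataclass
class AgentActionEvidence:
    agent_id: str          # Entra agent identity that executed the action
    tool_call: str         # MCP tool that was invoked
    data_sources: list     # classified sources read to ground the action
    outcome: str           # the resulting side effect
    behavior_flags: list = field(default_factory=list)  # anomaly signals, if any

    def is_attributable(self) -> bool:
        # The chain is only evidence if every mandatory link is populated.
        return all([self.agent_id, self.tool_call, self.outcome])

record = AgentActionEvidence(
    agent_id="agent-servicedesk-triage-01",
    tool_call="create_incident",
    data_sources=["kb:network-runbook (General)"],
    outcome="ticket INC-4821 created",
)
print(record.is_attributable())  # True: every link in the chain is present
```

The point of the sketch is the shape, not the fields: any break in the chain makes the action non-attributable, which is exactly the gap the transcript warns about.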
898
00:37:07,640 –> 00:37:11,640
Now translate that into executive demands because this is where teams love to hide behind partial
899
00:37:11,640 –> 00:37:12,640
compliance.
900
00:37:12,640 –> 00:37:17,440
Executives should demand proof of enforcement end-to-end, not per component checklists.
901
00:37:17,440 –> 00:37:20,040
Not "we have Entra." Not "we turned on Purview."
902
00:37:20,040 –> 00:37:21,720
Not "we built an MCP server."
903
00:37:21,720 –> 00:37:23,040
Not "Defender is enabled."
904
00:37:23,040 –> 00:37:27,840
The demand is: show the chain for a real scenario. Pick a high-impact workflow, show the identity,
905
00:37:27,840 –> 00:37:31,720
show the tool contract, show the data boundary, show the detection and containment path.
906
00:37:31,720 –> 00:37:35,920
If any part of that is "we don’t have that yet," then the program is not ready to scale.
907
00:37:35,920 –> 00:37:39,520
It can still experiment, it can still learn, but it can’t industrialize.
908
00:37:39,520 –> 00:37:42,960
And one more uncomfortable truth: the control plane doesn’t reduce innovation.
909
00:37:42,960 –> 00:37:46,040
It prevents innovation from turning into unbounded authority.
910
00:37:46,040 –> 00:37:50,840
Because the moment you scale agents, you are scaling decisions, not just answers, decisions
911
00:37:50,840 –> 00:37:52,480
that mutate systems.
912
00:37:52,480 –> 00:37:57,080
So the minimum viable control plane is how you keep intelligence enforceable as it grows.
913
00:37:57,080 –> 00:37:58,520
Next this stops being theory.
914
00:37:58,520 –> 00:38:02,840
Apply the control plane to real scenarios, specific, concrete, and uncomfortable, starting
915
00:38:02,840 –> 00:38:05,840
with the one every enterprise thinks is harmless.
916
00:38:05,840 –> 00:38:11,280
An IT service desk agent that succeeds so quickly it triggers sprawl.
917
00:38:11,280 –> 00:38:15,360
Scenario one setup: IT service desk sprawl starts as a success.
918
00:38:15,360 –> 00:38:17,560
The first scenario always starts as a win.
919
00:38:17,560 –> 00:38:18,560
That’s why it spreads.
920
00:38:18,560 –> 00:38:23,000
A service desk team builds one triage agent. It sits in Copilot or Teams or a portal, doesn’t
921
00:38:23,000 –> 00:38:24,000
matter.
922
00:38:24,000 –> 00:38:29,400
Users type the same messy tickets they’ve always typed: VPN broken, can’t access SharePoint,
923
00:38:29,400 –> 00:38:33,200
new laptop, password reset, Outlook is haunted.
924
00:38:33,200 –> 00:38:37,480
The agent cleans it up, asks the two missing questions humans never asked consistently
925
00:38:37,480 –> 00:38:41,760
and routes the request to the right queue with the same category and a clear summary.
926
00:38:41,760 –> 00:38:42,760
Ticket noise drops.
927
00:38:42,760 –> 00:38:44,560
MTTR starts to move.
928
00:38:44,560 –> 00:38:48,280
Leadership gets a slide that says "AI reduced backlog" and everyone claps.
929
00:38:48,280 –> 00:38:50,640
The team that built it gets asked to do a road show.
930
00:38:50,640 –> 00:38:53,800
And that’s where the entropy begins because the next thing that happens isn’t a security
931
00:38:53,800 –> 00:38:55,720
incident, it’s success.
932
00:38:55,720 –> 00:38:59,440
The agent works well enough that other teams decide they need their own version.
933
00:38:59,440 –> 00:39:03,160
The endpoint team wants a triage agent, optimized for devices.
934
00:39:03,160 –> 00:39:06,440
The identity team wants one optimized for access requests.
935
00:39:06,440 –> 00:39:09,880
The network team wants one optimized for Wi-Fi and VPN.
936
00:39:09,880 –> 00:39:13,600
The application team wants one optimized for our line of business apps.
937
00:39:13,600 –> 00:39:15,400
And each of those requests sounds reasonable.
938
00:39:15,400 –> 00:39:17,400
Each team optimizes locally.
939
00:39:17,400 –> 00:39:18,800
That’s the core problem.
940
00:39:18,800 –> 00:39:21,440
Local optimization creates global inconsistency.
941
00:39:21,440 –> 00:39:25,760
So you end up with five triage agents that all claim to be the best way to open a ticket,
942
00:39:25,760 –> 00:39:30,480
each with a slightly different intake process, a slightly different definition of severity,
943
00:39:30,480 –> 00:39:34,800
a slightly different set of tool calls, and a slightly different idea of what resolved
944
00:39:34,800 –> 00:39:35,960
means.
945
00:39:35,960 –> 00:39:38,080
Now the user experience collapses quietly.
946
00:39:38,080 –> 00:39:40,440
Users start asking, "Which agent should I use?"
947
00:39:40,440 –> 00:39:42,280
That question sounds like curiosity.
948
00:39:42,280 –> 00:39:45,720
It’s actually the first sign that your service model has fragmented because if an ecosystem
949
00:39:45,720 –> 00:39:47,760
is healthy, users don’t choose agents.
950
00:39:47,760 –> 00:39:48,760
The system routes them.
951
00:39:48,760 –> 00:39:50,000
The system has intent.
952
00:39:50,000 –> 00:39:52,080
In sprawl, the user becomes the router.
953
00:39:52,080 –> 00:39:54,640
And users route based on convenience, not control.
954
00:39:54,640 –> 00:39:56,200
Now add the orchestration failure.
955
00:39:56,200 –> 00:39:58,600
This is the part nobody diagrams until it hurts.
956
00:39:58,600 –> 00:40:00,920
One agent routes a ticket to the network queue.
957
00:40:00,920 –> 00:40:03,040
Another agent, seeing the same symptoms,
958
00:40:03,040 –> 00:40:04,960
routes a similar ticket to identity.
959
00:40:04,960 –> 00:40:09,840
A third agent auto suggests a self-service fix that’s correct for one environment but wrong
960
00:40:09,840 –> 00:40:10,840
for another.
961
00:40:10,840 –> 00:40:11,840
Tickets bounce.
962
00:40:11,840 –> 00:40:13,680
Duplicate tickets get created.
963
00:40:13,680 –> 00:40:15,280
Humans re-triage the triage.
964
00:40:15,280 –> 00:40:19,080
And the thing you built to reduce noise starts generating noise in a new shape.
965
00:40:19,080 –> 00:40:21,400
The organization calls this "teething issues."
966
00:40:21,400 –> 00:40:22,400
It isn’t.
967
00:40:22,400 –> 00:40:25,640
It’s the system doing exactly what you designed.
968
00:40:25,640 –> 00:40:28,040
Decentralized authority without shared intent.
969
00:40:28,040 –> 00:40:31,440
And ownership collapses, because ownership always collapses in sprawl.
970
00:40:31,440 –> 00:40:33,240
The original service desk agent had an owner.
971
00:40:33,240 –> 00:40:34,240
It had a team.
972
00:40:34,240 –> 00:40:35,600
It had someone who could answer for it.
973
00:40:35,600 –> 00:40:39,720
But the moment other teams clone it, fork it or build their own, ownership becomes ambiguous.
974
00:40:39,720 –> 00:40:42,760
Who owns the user outcome when the wrong queue gets the ticket?
975
00:40:42,760 –> 00:40:46,360
Who owns the inconsistent answers when two agents contradict each other?
976
00:40:46,360 –> 00:40:51,680
Who owns the escalation path when the agent says "resolved" but the user still can’t work?
977
00:40:51,680 –> 00:40:52,920
Nobody owns the outcome.
978
00:40:52,920 –> 00:40:54,320
Everyone owns their component.
979
00:40:54,320 –> 00:40:56,120
That’s how enterprise systems fail.
980
00:40:56,120 –> 00:40:57,640
Now put a real failure moment on it.
981
00:40:57,640 –> 00:41:00,600
A user requests access to a restricted SharePoint site.
982
00:41:00,600 –> 00:41:04,960
The triage agent decides it’s an access request, calls the tool, and creates the ticket.
983
00:41:04,960 –> 00:41:07,400
Another team’s agent decides it can do better.
984
00:41:07,400 –> 00:41:12,440
It triggers an automated workflow that grants temporary access to unblock productivity,
985
00:41:12,440 –> 00:41:16,960
because someone added that capability during a sprint and never removed it.
986
00:41:16,960 –> 00:41:20,800
A manager gets an Adaptive Card approval in Teams, hits approve because it looks routine,
987
00:41:20,800 –> 00:41:22,520
and the user gets access.
988
00:41:22,520 –> 00:41:25,400
Except the site contains sensitive content the user shouldn’t have.
989
00:41:25,400 –> 00:41:27,880
Now you have an incident and it’s the worst kind.
990
00:41:27,880 –> 00:41:32,800
An incident created by helpful automation, backed by a legitimate approval, executed through
991
00:41:32,800 –> 00:41:35,640
a tool path nobody reviewed end to end.
992
00:41:35,640 –> 00:41:38,760
And when leadership asks what happened you get five different answers.
993
00:41:38,760 –> 00:41:40,960
Service desk says, "We didn’t build that workflow."
994
00:41:40,960 –> 00:41:43,760
Identity says, "We didn’t approve those permissions."
995
00:41:43,760 –> 00:41:47,560
The team that built the second agent says, "It was approved by a manager."
996
00:41:47,560 –> 00:41:51,120
Security says, "We can’t tell which agent executed the tool call."
997
00:41:51,120 –> 00:41:54,240
And the business says, "So are we pausing all agents?"
998
00:41:54,240 –> 00:41:55,520
This is the lesson.
999
00:41:55,520 –> 00:41:59,240
Scaling agents without shared intent creates contradictory intelligence.
1000
00:41:59,240 –> 00:42:04,040
Not because people are incompetent, but because the platform makes it easy to build local success
1001
00:42:04,040 –> 00:42:05,920
that erodes global control.
1002
00:42:05,920 –> 00:42:10,320
And once the service desk becomes the proving ground, sprawl becomes normalized.
1003
00:42:10,320 –> 00:42:12,120
Every department learns the wrong rule.
1004
00:42:12,120 –> 00:42:16,160
"If you need a capability, build an agent," not "if you need a capability, design it as a
1005
00:42:16,160 –> 00:42:17,160
governed service."
1006
00:42:17,160 –> 00:42:18,840
Next, the uncomfortable part.
1007
00:42:18,840 –> 00:42:23,240
What changes when you enforce identity, and why shared intent is non-negotiable.
1008
00:42:23,240 –> 00:42:27,560
Scenario one remediation: deterministic routing and accountable action.
1009
00:42:27,560 –> 00:42:31,560
Fixing the service desk sprawl problem doesn’t start with better prompts.
1010
00:42:31,560 –> 00:42:34,640
It starts with making routing and action deterministic again.
1011
00:42:34,640 –> 00:42:38,240
Not deterministic in the sense that every ticket looks the same.
1012
00:42:38,240 –> 00:42:41,720
Deterministic in the sense that the enterprise can prove "this agent did this thing, through
1013
00:42:41,720 –> 00:42:46,640
this tool, under this approval model, with this data boundary," every time.
1014
00:42:46,640 –> 00:42:51,840
So remediation begins with the simplest enforcement rule the organization avoided in the rollout.
1015
00:42:51,840 –> 00:42:55,000
The triage capability gets its own Entra agent ID.
1016
00:42:55,000 –> 00:42:56,800
Not one service desk bot account.
1017
00:42:56,800 –> 00:43:00,080
Not one shared service principal that powers five different agents.
1018
00:43:00,080 –> 00:43:01,840
One identity per agent product.
1019
00:43:01,840 –> 00:43:03,360
Scoped to what it actually does.
1020
00:43:03,360 –> 00:43:05,600
That’s what turns incident response back into engineering.
1021
00:43:05,600 –> 00:43:09,680
If the network triage agent starts doing something wrong, the containment action is obvious.
1022
00:43:09,680 –> 00:43:10,680
Disable that identity.
1023
00:43:10,680 –> 00:43:12,160
You don’t disable all automation.
1024
00:43:12,160 –> 00:43:13,480
You don’t pause Copilot.
1025
00:43:13,480 –> 00:43:16,560
You remove one actor from the system and you keep operating.
1026
00:43:16,560 –> 00:43:20,200
And Entra agent ID also forces the ownership model to stop being theater.
1027
00:43:20,200 –> 00:43:24,480
Each triage agent has an owner who can change configuration and a sponsor who owns the business
1028
00:43:24,480 –> 00:43:25,480
outcome.
1029
00:43:25,480 –> 00:43:29,760
If the sponsor changes roles, lifecycle workflows reassign sponsorship. If the agent becomes
1030
00:43:29,760 –> 00:43:31,680
orphaned, it doesn’t stay live by accident.
1031
00:43:31,680 –> 00:43:33,080
It gets reviewed or retired.
1032
00:43:33,080 –> 00:43:36,640
That is what "run agents as products" looks like in the service desk.
1033
00:43:36,640 –> 00:43:38,360
Next, tools.
1034
00:43:38,360 –> 00:43:42,200
The sprawl pattern happened because each team built its own tool path.
1035
00:43:42,200 –> 00:43:46,480
Different connectors, different flows, different "create ticket" implementations, different
1036
00:43:46,480 –> 00:43:48,280
routing logic.
1037
00:43:48,280 –> 00:43:51,080
The remediation requires MCP discipline.
1038
00:43:51,080 –> 00:43:54,480
Standard ticket actions published once, reused everywhere.
1039
00:43:54,480 –> 00:43:56,720
One MCP tool for "create incident."
1040
00:43:56,720 –> 00:43:59,400
One MCP tool for "create service request."
1041
00:43:59,400 –> 00:44:02,400
One MCP tool for "update ticket status."
1042
00:44:02,400 –> 00:44:04,760
One MCP tool for "assign to queue."
1043
00:44:04,760 –> 00:44:06,800
And those tools return structured results.
1044
00:44:06,800 –> 00:44:10,520
Ticket ID, queue ID, urgency classification, and the source that justified it.
1045
00:44:10,520 –> 00:44:11,520
Not free text.
1046
00:44:11,520 –> 00:44:12,520
Not vibes.
1047
00:44:12,520 –> 00:44:14,800
Outputs that can be audited and replayed.
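[Editor's note] The structured-result idea can be sketched as a tiny tool function. The name create_incident and the result fields are illustrative assumptions, not an actual MCP or ITSM schema:

```python
import json

# Hypothetical sketch of one published "create incident" tool contract,
# reused by every triage agent instead of bespoke per-team integrations.
def create_incident(symptom: str, impacted_service: str, requester: str) -> dict:
    # A real implementation would call the ticketing system; here we only
    # return the structured, auditable shape the contract promises.
    return {
        "ticket_id": "INC-0001",
        "queue_id": "network-l1",
        "urgency": "medium",
        "justification_source": "triage-rules:v3",  # provenance, not vibes
        "requester": requester,
    }

result = create_incident("VPN broken", "vpn", "user@contoso.com")
# A structured output can be logged, audited, and replayed verbatim.
print(json.dumps(result, indent=2))
```

Because every agent returns the same shape, audit and replay become a mechanical exercise instead of reverse engineering.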
1048
00:44:14,800 –> 00:44:16,720
This is the quiet architectural win.
1049
00:44:16,720 –> 00:44:20,040
The tool reuse is where cost and control become the same decision.
1050
00:44:20,040 –> 00:44:23,680
You reduce duplicated work and you reduce duplicated failure modes.
1051
00:44:23,680 –> 00:44:24,680
Now the routing problem.
1052
00:44:24,680 –> 00:44:29,680
The only scalable routing model is one intake, one intent model and explicit escalation
1053
00:44:29,680 –> 00:44:30,680
paths.
1054
00:44:30,680 –> 00:44:34,200
If you let every team invent its own intake, you guarantee contradictions.
1055
00:44:34,200 –> 00:44:37,040
So the enterprise picks an orchestrator pattern.
1056
00:44:37,040 –> 00:44:41,760
A single front door triage experience that decides which specialized triage agent to
1057
00:44:41,760 –> 00:44:42,760
call.
1058
00:44:42,760 –> 00:44:46,040
Not because the specialized agents are bad, but because the user can’t be a router.
1059
00:44:46,040 –> 00:44:47,320
The system has to be.
1060
00:44:47,320 –> 00:44:50,080
This is where deterministic routing shows up in practice.
1061
00:44:50,080 –> 00:44:54,400
The front door agent collects a minimum payload: symptom, impacted service, user identity,
1062
00:44:54,400 –> 00:44:57,840
and whether it’s an access request, an outage, or a "how do I" question.
1063
00:44:57,840 –> 00:45:00,480
Then it routes to exactly one downstream capability.
1064
00:45:00,480 –> 00:45:04,680
If the downstream capability can’t resolve it, it escalates to a human queue with the payload
1065
00:45:04,680 –> 00:45:05,680
attached.
1066
00:45:05,680 –> 00:45:10,200
No ping pong, no duplicate tickets, no conflicting answers from parallel agents.
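[Editor's note] The front-door pattern above reduces to a small deterministic dispatch table. The intent labels and agent names here are hypothetical:

```python
# Minimal sketch of the front-door router: one intake, one intent model,
# exactly one downstream route per request. The system routes, not the user.
INTENT_ROUTES = {
    "access_request": "identity-triage-agent",
    "outage": "network-triage-agent",
    "how_do_i": "knowledge-agent",
}

def route(payload: dict) -> str:
    intent = payload.get("intent")
    if intent in INTENT_ROUTES:
        return INTENT_ROUTES[intent]
    # Anything the model can't classify escalates to a human queue
    # with the payload attached, instead of bouncing between agents.
    return "human-triage-queue"

ticket = {"symptom": "can't reach SharePoint", "impacted_service": "sharepoint",
          "user": "user@contoso.com", "intent": "access_request"}
print(route(ticket))  # identity-triage-agent: exactly one downstream capability
```

The dispatch table is the shared intent the transcript keeps asking for: every team can own a specialized agent, but only the front door decides who gets the request.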
1067
00:45:10,200 –> 00:45:14,760
Now apply data boundaries because service desk agents love to help themselves into trouble.
1068
00:45:14,760 –> 00:45:18,920
Purview boundaries block access to sensitive queues and restricted knowledge sources.
1069
00:45:18,920 –> 00:45:24,200
The service desk triage agent doesn’t need HR case data, it doesn’t need legal investigations,
1070
00:45:24,200 –> 00:45:26,360
it doesn’t need executive mailbox metadata.
1071
00:45:26,360 –> 00:45:28,800
So it cannot retrieve them even if a user asks.
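[Editor's note] That boundary is a check, not a hope. A minimal sketch with illustrative label names (not actual Purview sensitivity labels):

```python
# The triage agent's retrieval layer refuses any source whose sensitivity
# label falls outside its allow-list, regardless of what the user asks for.
ALLOWED_LABELS = {"General", "Public"}

def can_retrieve(source_label: str) -> bool:
    # "It can't read it" enforced as code, not as policy prose.
    return source_label in ALLOWED_LABELS

print(can_retrieve("General"))          # True
print(can_retrieve("HR-Confidential"))  # False: blocked even if a user asks
```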
1072
00:45:28,800 –> 00:45:33,520
And this is where the organization has to stop lying to itself about least privilege.
1073
00:45:33,520 –> 00:45:36,360
Least privilege is not just a security best practice at agent scale.
1074
00:45:36,360 –> 00:45:40,520
It’s the only way to keep the blast radius finite. Finally, Defender, because even with identity
1075
00:45:40,520 –> 00:45:42,680
and tool contracts, behavior drifts.
1076
00:45:42,680 –> 00:45:47,560
So Defender for AI baselines the agent’s normal operating pattern: volume of ticket creation,
1077
00:45:47,560 –> 00:45:53,240
types of ticket actions, typical queues, typical time of day, typical user populations.
1078
00:45:53,240 –> 00:45:57,760
When the agent starts manipulating tickets in bulk or changing assignments unusually or
1079
00:45:57,760 –> 00:46:01,040
pulling sensitive artifacts, Defender raises a signal you can act on.
1080
00:46:01,040 –> 00:46:03,600
And crucially, you can act on it without destroying the program.
1081
00:46:03,600 –> 00:46:07,880
You block the agent identity, you quarantine the tool, you disable a single capability,
1082
00:46:07,880 –> 00:46:09,640
the rest of the ecosystem continues.
1083
00:46:09,640 –> 00:46:12,160
That’s what containment looks like when the control plane is real.
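[Editor's note] Behavioral baselining can be sketched as a simple per-tool volume check. The thresholds and signal format are illustrative assumptions; a real Defender deployment does far more than count calls:

```python
from collections import Counter

# Hypothetical per-tool baselines learned from the agent's normal pattern.
BASELINE_MAX_PER_HOUR = {"create_incident": 50, "update_ticket": 20}

def detect_drift(calls_last_hour: list) -> list:
    """Return one actionable signal per tool whose call volume drifted."""
    counts = Counter(calls_last_hour)
    signals = []
    for tool, n in counts.items():
        limit = BASELINE_MAX_PER_HOUR.get(tool, 0)  # unknown tools get no budget
        if n > limit:
            signals.append(f"{tool}: {n} calls exceeds baseline {limit}")
    return signals

# An agent that normally creates tickets suddenly bulk-updates them:
signals = detect_drift(["update_ticket"] * 200)
print(signals)  # one signal tied to one tool you can quarantine
```

Each signal names a specific tool, so the containment action stays surgical: quarantine that capability, keep the rest of the ecosystem running.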
1084
00:46:12,160 –> 00:46:16,000
So the remediation isn’t one big service desk agent that does everything.
1085
00:46:16,000 –> 00:46:21,560
It’s a small portfolio of agent products with enforced boundaries: identities you can disable,
1086
00:46:21,560 –> 00:46:26,400
tools you can govern, data sources you can restrict, and behaviors you can detect.
1087
00:46:26,400 –> 00:46:29,520
And when leadership asks, how do we scale this safely?
1088
00:46:29,520 –> 00:46:31,320
The answer is no longer a promise.
1089
00:46:31,320 –> 00:46:32,800
It’s a demonstration.
1090
00:46:32,800 –> 00:46:37,880
Show the evidence chain for one routed ticket: Entra identity, MCP tool call, Purview-bound
1091
00:46:37,880 –> 00:46:42,440
grounding, Defender behavioral baseline, and the human approval step if the action is
1092
00:46:42,440 –> 00:46:43,640
irreversible.
1093
00:46:43,640 –> 00:46:47,920
Because once that chain exists, the service desk stops being the origin of sprawl.
1094
00:46:47,920 –> 00:46:51,120
It becomes the template for enforceable automation.
1095
00:46:51,120 –> 00:46:56,480
Next, the part that still breaks even well-governed service desk agents: accuracy without provenance.
1096
00:46:56,480 –> 00:47:01,400
That’s where policy and ops knowledge turns into compliance fallout, even when the agent
1097
00:47:01,400 –> 00:47:04,120
is right. Scenario two setup:
1098
00:47:04,120 –> 00:47:08,200
MCP-backed policy plus ops agent, and the provenance problem.
1099
00:47:08,200 –> 00:47:12,520
The second scenario is the one that makes leaders hate agents because it looks like competence
1100
00:47:12,520 –> 00:47:15,080
right up until it becomes compliance fallout.
1101
00:47:15,080 –> 00:47:17,400
A team builds a policy and operations agent.
1102
00:47:17,400 –> 00:47:18,760
The goal is clean.
1103
00:47:18,760 –> 00:47:22,280
Stop wasting senior people’s time answering the same questions.
1104
00:47:22,280 –> 00:47:23,760
What’s the change window?
1105
00:47:23,760 –> 00:47:25,160
What’s the approved exception path?
1106
00:47:25,160 –> 00:47:28,680
What’s the retention requirement for this data set?
1107
00:47:28,680 –> 00:47:30,560
Which policy applies to contractors?
1108
00:47:30,560 –> 00:47:31,560
It’s all reasonable, repetitive,
1109
00:47:31,560 –> 00:47:35,000
and it’s all the kind of work that feels like it should be automatable.
1110
00:47:35,000 –> 00:47:36,600
So they do it properly.
1111
00:47:36,600 –> 00:47:38,600
They ground the agent on policy libraries.
1112
00:47:38,600 –> 00:47:43,720
They connect MCP tools to pull live operational context from systems of record: ticketing,
1113
00:47:43,720 –> 00:47:48,120
CMDB, maybe a configuration repo, maybe a risk register. They structure outputs, they add
1114
00:47:48,120 –> 00:47:52,280
guardrails, they roll it out to a pilot group, and the agent performs beautifully.
1115
00:47:52,280 –> 00:47:56,600
It answers quickly, it answers confidently, it sounds consistent, it even cites sources
1116
00:47:56,600 –> 00:47:58,680
because somebody knew that citations matter.
1117
00:47:58,680 –> 00:48:00,960
The organization relaxes. That’s the mistake.
1118
00:48:00,960 –> 00:48:03,680
Because the failure mode here isn’t that the agent makes things up.
1119
00:48:03,680 –> 00:48:06,600
The failure mode is that it retrieves the wrong truth from the right place.
1120
00:48:06,600 –> 00:48:07,600
Here’s the failure moment.
1121
00:48:07,600 –> 00:48:10,000
A user asks a question that looks simple.
1122
00:48:10,000 –> 00:48:13,640
Are we allowed to apply this policy exception for this scenario?
1123
00:48:13,640 –> 00:48:18,880
The agent pulls the relevant text, summarizes it, and replies: yes, here’s the condition,
1124
00:48:18,880 –> 00:48:22,080
here’s the justification, here’s the process. The user follows the guidance.
1125
00:48:22,080 –> 00:48:23,440
They implement the exception.
1126
00:48:23,440 –> 00:48:24,440
Work continues.
1127
00:48:24,440 –> 00:48:29,400
Two weeks later an audit or a compliance review happens, or an incident triggers discovery.
1128
00:48:29,400 –> 00:48:33,880
And someone notices the exception was granted under an outdated policy version, or under a
1129
00:48:33,880 –> 00:48:38,240
departmental policy that was never meant to be enterprise-wide, or under a draft that was
1130
00:48:38,240 –> 00:48:42,400
published to a SharePoint site for review and never marked as non-authoritative, or under
1131
00:48:42,400 –> 00:48:45,280
a policy that applies to internal employees, not contractors.
1132
00:48:45,280 –> 00:48:47,320
The agent’s answer was internally consistent.
1133
00:48:47,320 –> 00:48:49,200
It was also wrong in the only way that matters.
1134
00:48:49,200 –> 00:48:50,360
It wasn’t authoritative.
1135
00:48:50,360 –> 00:48:53,960
So now the organization has a governance problem that’s hard to explain because nobody can
1136
00:48:53,960 –> 00:48:57,160
point to malice, and nobody can point to a bug.
1137
00:48:57,160 –> 00:48:59,220
The agent did what it was built to do.
1138
00:48:59,220 –> 00:49:02,760
Retrieve content it was allowed to access and produce a high-quality summary.
1139
00:49:02,760 –> 00:49:06,360
This is why provenance is not a nice to have in the agent era.
1140
00:49:06,360 –> 00:49:08,280
Accuracy without provenance is still risk.
1141
00:49:08,280 –> 00:49:13,760
And enterprises confuse those two constantly because humans tend to evaluate answers by confidence
1142
00:49:13,760 –> 00:49:14,760
and fluency.
1143
00:49:14,760 –> 00:49:18,280
They don’t evaluate by source authority unless they’re already suspicious.
1144
00:49:18,280 –> 00:49:20,800
Agents amplify that bias because they speak like they know.
1145
00:49:20,800 –> 00:49:25,920
Now add MCP to the mix because MCP makes the blast radius bigger in a very specific way.
1146
00:49:25,920 –> 00:49:27,680
The agent doesn’t just answer from documents.
1147
00:49:27,680 –> 00:49:32,080
It can also call tools that return policy-like data from operational systems.
1148
00:49:32,080 –> 00:49:33,960
What’s configured in production?
1149
00:49:33,960 –> 00:49:35,800
What’s the current exception state?
1150
00:49:35,800 –> 00:49:37,480
What’s the last change record?
1151
00:49:37,480 –> 00:49:40,400
What’s the active retention policy in system X?
1152
00:49:40,400 –> 00:49:42,840
Those outputs look authoritative because they’re live.
1153
00:49:42,840 –> 00:49:44,040
But live doesn’t mean governed.
1154
00:49:44,040 –> 00:49:45,360
A CMDB record can be wrong.
1155
00:49:45,360 –> 00:49:46,760
A ticket can be mislabeled.
1156
00:49:46,760 –> 00:49:49,320
A config repo can contain legacy settings.
1157
00:49:49,320 –> 00:49:50,800
And MCP doesn’t validate meaning.
1158
00:49:50,800 –> 00:49:52,240
It just standardizes access.
1159
00:49:52,240 –> 00:49:56,600
So the agent becomes a synthesis layer that merges three categories of information that humans
1160
00:49:56,600 –> 00:49:58,280
normally keep separate.
1161
00:49:58,280 –> 00:50:03,320
Written policy, operational reality and tribal knowledge embedded in tool outputs.
1162
00:50:03,320 –> 00:50:04,920
That synthesis is useful.
1163
00:50:04,920 –> 00:50:08,960
It is also where compliance dies if you don’t enforce which sources are allowed to drive
1164
00:50:08,960 –> 00:50:12,200
decisions, because the user isn’t asking for a bibliography.
1165
00:50:12,200 –> 00:50:15,320
They’re asking for permission and the agent can accidentally grant it.
1166
00:50:15,320 –> 00:50:19,960
Now zoom out to the executive implication, because this is where "we’ll iterate" stops working.
1167
00:50:19,960 –> 00:50:22,000
Trust collapses faster than adoption grows.
1168
00:50:22,000 –> 00:50:26,200
You can spend six months building momentum, training users, rolling out capabilities and
1169
00:50:26,200 –> 00:50:27,480
getting teams comfortable.
1170
00:50:27,480 –> 00:50:30,280
One provenance failure resets the trust model overnight.
1171
00:50:30,280 –> 00:50:33,360
The business stops asking, how do we use this, and starts asking,
1172
00:50:33,360 –> 00:50:34,360
Can we rely on it?
1173
00:50:34,360 –> 00:50:36,600
Legal starts asking can we defend it?
1174
00:50:36,600 –> 00:50:39,200
Security starts asking, can we contain it?
1175
00:50:39,200 –> 00:50:43,840
And leadership starts asking, do we need to pause expansion until we can prove control?
1176
00:50:43,840 –> 00:50:48,360
This is the pattern that kills agent programs, not technical failure, but credibility failure.
1177
00:50:48,360 –> 00:50:52,360
A credibility failure happens when the system can’t prove that its answers came from the right
1178
00:50:52,360 –> 00:50:53,440
authority.
1179
00:50:53,440 –> 00:50:56,200
So in this scenario, the agent’s correctness is a trap.
1180
00:50:56,200 –> 00:50:59,000
It’s optimized for helpfulness, speed and clarity.
1181
00:50:59,000 –> 00:51:02,280
But the enterprise needs a different optimization target.
1182
00:51:02,280 –> 00:51:06,560
Authoritative knowledge, explicit provenance and enforceable boundaries that prevent non-authoritative
1183
00:51:06,560 –> 00:51:09,520
sources from being treated as truth.
1184
00:51:09,520 –> 00:51:11,040
That’s the open loop to keep in mind.
1185
00:51:11,040 –> 00:51:13,600
The remediation isn’t make it more accurate.
1186
00:51:13,600 –> 00:51:16,080
It’s make it reliably authoritative.
1187
00:51:16,080 –> 00:51:20,480
And that means the next rule has to become non-negotiable across the ecosystem.
1188
00:51:20,480 –> 00:51:24,360
The agent must prefer authoritative knowledge over maximum knowledge because maximum knowledge
1189
00:51:24,360 –> 00:51:26,400
is just everything it can read.
1190
00:51:26,400 –> 00:51:29,760
And everything it can read is where the compliance incident is hiding.
1191
00:51:29,760 –> 00:51:31,000
Scenario 2.
1192
00:51:31,000 –> 00:51:32,800
Remediation.
1193
00:51:32,800 –> 00:51:33,920
Authoritative knowledge.
1194
00:51:33,920 –> 00:51:35,600
Not maximum knowledge.
1195
00:51:35,600 –> 00:51:39,520
Fixing the provenance problem doesn’t start with better prompting.
1196
00:51:39,520 –> 00:51:42,480
It starts with admitting what the agent is actually doing.
1197
00:51:42,480 –> 00:51:43,480
It’s not searching.
1198
00:51:43,480 –> 00:51:45,440
It’s compiling an answer from whatever it can reach.
1199
00:51:45,440 –> 00:51:47,560
So the remediation rule is blunt.
1200
00:51:47,560 –> 00:51:51,800
The agent must prefer authoritative knowledge over maximum knowledge even when maximum knowledge
1201
00:51:51,800 –> 00:51:55,200
feels more helpful because helpful is what gets you sued.
1202
00:51:55,200 –> 00:52:00,440
The first move is Purview-first classification, and not in the abstract "we labeled some stuff"
1203
00:52:00,440 –> 00:52:01,600
sense.
1204
00:52:01,600 –> 00:52:06,040
The organization needs an explicit definition of what counts as authoritative for each domain.
1205
00:52:06,040 –> 00:52:11,680
HR policy, security policy, operational runbooks, change management, finance procedures, legal
1206
00:52:11,680 –> 00:52:12,680
templates.
1207
00:52:12,680 –> 00:52:17,280
Authoritative means owned, versioned, reviewed and intended to be relied on.
1208
00:52:17,280 –> 00:52:20,200
If the content doesn’t meet that bar, it can exist.
1209
00:52:20,200 –> 00:52:22,520
But it can’t be treated as truth by agents.
1210
00:52:22,520 –> 00:52:26,680
That distinction matters because the enterprise already has massive amounts of policy-shaped
1211
00:52:26,680 –> 00:52:28,600
content that is not policy.
1212
00:52:28,600 –> 00:52:33,080
Drafts, local copies, screenshots, meeting notes, wiki pages and email threads.
1213
00:52:33,080 –> 00:52:36,560
Humans can usually detect that difference because they recognize the context.
1214
00:52:36,560 –> 00:52:37,560
Agents can’t.
1215
00:52:37,560 –> 00:52:38,960
They detect relevance, not authority.
1216
00:52:38,960 –> 00:52:43,080
If you don’t enforce authority, the agent will optimize for most relevant text, which
1217
00:52:43,080 –> 00:52:45,000
is how a draft becomes a decision.
1218
00:52:45,000 –> 00:52:47,800
So you classify authoritative repositories as authoritative.
1219
00:52:47,800 –> 00:52:51,000
You mark non-authoritative repositories as non-authoritative.
1220
00:52:51,000 –> 00:52:53,960
And you stop pretending that "it’s in SharePoint" makes it official.
1221
00:52:53,960 –> 00:52:55,200
Next is boundary enforcement.
1222
00:52:55,200 –> 00:52:57,200
You don’t just tag content and hope.
1223
00:52:57,200 –> 00:52:59,800
You restrict what the agent is allowed to use as grounding.
1224
00:52:59,800 –> 00:53:04,200
That means the agent’s identity gets access to the authoritative sources and it gets denied
1225
00:53:04,200 –> 00:53:06,240
access to everything else by default.
1226
00:53:06,240 –> 00:53:09,920
Not "read everything and decide carefully." Denied by design.
1227
00:53:09,920 –> 00:53:13,960
Because if the agent can read non-authoritative content, it will occasionally use it.
1228
00:53:13,960 –> 00:53:15,880
Not because it’s reckless, because it is obedient.
1229
00:53:15,880 –> 00:53:18,440
Now, the second move is MCP discipline.
1230
00:53:18,440 –> 00:53:21,800
And this is where most teams accidentally reintroduce the problem.
1231
00:53:21,800 –> 00:53:24,680
Tools must return structured results with source identifiers.
1232
00:53:24,680 –> 00:53:29,920
If the agent calls an MCP tool to retrieve a policy or an operational rule, the tool’s response
1233
00:53:29,920 –> 00:53:33,200
can’t be a paragraph of text that looks authoritative.
1234
00:53:33,200 –> 00:53:38,760
It needs a schema, source system, document ID, version, effective date and classification.
1235
00:53:38,760 –> 00:53:42,160
If the result doesn’t contain those fields, the agent can’t prove provenance and you’re
1236
00:53:42,160 –> 00:53:43,920
back to "sounds right."
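As a concrete illustration, a structured tool result with those provenance fields could be sketched like this. The field names and the validation rule are assumptions for the example, not a Microsoft-defined MCP contract:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class PolicyResult:
    # Illustrative provenance schema -- these field names are assumptions,
    # not a standardized MCP contract.
    text: str
    source_system: str   # system of record that produced the content
    document_id: str
    version: str
    effective_date: date
    classification: str  # e.g. "authoritative" vs "draft"

def has_provenance(result: PolicyResult) -> bool:
    # Without every field, the answer can't be proven authoritative --
    # it's just text that sounds right.
    return all([result.source_system, result.document_id, result.version,
                result.effective_date, result.classification])
```

A result with an empty version or missing classification fails the check, which is exactly the point: the agent can refuse to ground on it.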
1237
00:53:43,920 –> 00:53:47,320
This is also where you make outputs carry their own chain of custody.
1238
00:53:47,320 –> 00:53:49,160
The policy answer isn’t just an answer.
1239
00:53:49,160 –> 00:53:50,760
It’s an answer plus a receipt.
1240
00:53:50,760 –> 00:53:52,680
The receipt is what makes it defensible.
1241
00:53:52,680 –> 00:53:55,960
And no, citations aren’t enough if citations can point to anything.
1242
00:53:55,960 –> 00:54:00,800
A link to a SharePoint page that got moved, duplicated or edited without governance is
1243
00:54:00,800 –> 00:54:01,800
not provenance.
1244
00:54:01,800 –> 00:54:06,320
Provenance means this came from the authoritative system of record, at this version, effective
1245
00:54:06,320 –> 00:54:07,560
on this date.
1246
00:54:07,560 –> 00:54:10,800
Then Entra enforcement turns that into a permission model that doesn’t drift.
1247
00:54:10,800 –> 00:54:14,520
The policy agent identity gets scoped to the policy repositories.
1248
00:54:14,520 –> 00:54:16,080
Not everything it can read.
1249
00:54:16,080 –> 00:54:20,000
If teams want broader access to make it smarter, they’re requesting the ability to blend
1250
00:54:20,000 –> 00:54:21,360
authority with noise.
1251
00:54:21,360 –> 00:54:22,360
That’s not intelligence.
1252
00:54:22,360 –> 00:54:25,000
That’s a compliance incident with better grammar.
1253
00:54:25,000 –> 00:54:28,160
And this is where the sponsor role becomes non-optional.
1254
00:54:28,160 –> 00:54:30,400
Someone in the business has to accept the trade off.
1255
00:54:30,400 –> 00:54:32,480
Narrower knowledge, higher reliability.
1256
00:54:32,480 –> 00:54:36,720
If the sponsor wants maximum knowledge, the sponsor is accepting maximum risk.
1257
00:54:36,720 –> 00:54:37,800
That’s the actual decision.
1258
00:54:37,800 –> 00:54:38,800
Make it explicit.
1259
00:54:38,800 –> 00:54:40,400
Now add Defender monitoring.
1260
00:54:40,400 –> 00:54:43,000
Because even with boundaries, people will still try to bend the system.
1261
00:54:43,000 –> 00:54:44,400
You watch for two patterns.
1262
00:54:44,400 –> 00:54:46,600
One, unusual policy queries.
1263
00:54:46,600 –> 00:54:51,400
If a policy agent suddenly starts pulling large volumes of content or probing unusual domains
1264
00:54:51,400 –> 00:54:55,560
or repeatedly querying exceptions and bypass terms, something changed.
1265
00:54:55,560 –> 00:54:59,400
Either a user is trying to game it, or the agent got new tool paths, or someone fed it poisoned
1266
00:54:59,400 –> 00:55:00,400
context.
1267
00:55:00,400 –> 00:55:01,920
Security doesn’t need to know which one first.
1268
00:55:01,920 –> 00:55:05,640
It needs to alert and contain. Two, mass retrieval.
1269
00:55:05,640 –> 00:55:09,840
The fastest way to turn a policy agent into an exfiltration tool is to ask it for all policies
1270
00:55:09,840 –> 00:55:12,080
related to X repeatedly and then export.
1271
00:55:12,080 –> 00:55:15,920
If you can detect that behavior early, you can block the agent identity, lock down the tool
1272
00:55:15,920 –> 00:55:18,000
and keep the rest of the ecosystem running.
1273
00:55:18,000 –> 00:55:19,560
And that’s the actual remediation story.
1274
00:55:19,560 –> 00:55:21,800
You don’t fix hallucinations.
1275
00:55:21,800 –> 00:55:23,000
You fix authority.
1276
00:55:23,000 –> 00:55:27,200
You reduce the agent’s reachable surface area to what you’re willing to defend.
1277
00:55:27,200 –> 00:55:31,800
You force every tool to return provenance in a structured way and you bind the whole thing
1278
00:55:31,800 –> 00:55:35,920
to an identity that can be contained when behavior drifts.
1279
00:55:35,920 –> 00:55:39,160
What this actually means is the agent stops being a search engine for your tenant and
1280
00:55:39,160 –> 00:55:41,480
becomes a governed interface to your official truth.
1281
00:55:41,480 –> 00:55:45,160
And once you do that, trust grows because trust has a mechanical basis.
1282
00:55:45,160 –> 00:55:47,320
The system can prove where it got the answer.
1283
00:55:47,320 –> 00:55:51,720
Next, the scalable pattern that makes this survivable across thousands of decisions
1284
00:55:51,720 –> 00:55:53,120
isn’t more restrictions.
1285
00:55:53,120 –> 00:55:57,800
It’s human approvals at the edge where irreversible actions and sensitive domains get a final
1286
00:55:57,800 –> 00:56:01,040
checkpoint without slowing everything to a crawl.
1287
00:56:01,040 –> 00:56:02,040
Scenario 3.
1288
00:56:02,040 –> 00:56:03,040
Set up.
1289
00:56:03,040 –> 00:56:05,480
Teams plus adaptive cards for approvals at scale.
1290
00:56:05,480 –> 00:56:10,120
Now the third scenario, because eventually every enterprise discovers the same limit.
1291
00:56:10,120 –> 00:56:14,600
You can’t scale agent autonomy into irreversible actions and pretend governance will catch up
1292
00:56:14,600 –> 00:56:15,600
later.
1293
00:56:15,600 –> 00:56:19,000
So the pattern that actually survives is boring, repeatable and enforceable.
1294
00:56:19,000 –> 00:56:21,560
Teams plus adaptive cards for approvals at scale.
1295
00:56:21,560 –> 00:56:25,600
Not because teams is magical, because it’s where work already happens and because the approval
1296
00:56:25,600 –> 00:56:29,840
surface is the only place you can reliably insert human accountability without sending
1297
00:56:29,840 –> 00:56:32,040
people into a separate portal
1298
00:56:32,040 –> 00:56:33,040
nobody opens.
1299
00:56:33,040 –> 00:56:34,720
Here’s the setup.
1300
00:56:34,720 –> 00:56:36,480
An agent doesn’t decide.
1301
00:56:36,480 –> 00:56:37,920
It prepares a decision.
1302
00:56:37,920 –> 00:56:38,920
It gathers context.
1303
00:56:38,920 –> 00:56:40,160
It normalizes input.
1304
00:56:40,160 –> 00:56:41,560
It checks policy boundaries.
1305
00:56:41,560 –> 00:56:42,680
It proposes an action.
1306
00:56:42,680 –> 00:56:43,680
And then it stops.
1307
00:56:43,680 –> 00:56:47,160
That stop is the entire point because the enterprise doesn’t need agents that can do anything.
1308
00:56:47,160 –> 00:56:51,200
It needs agents that can do bounded things quickly with humans approving outcomes when
1309
00:56:51,200 –> 00:56:55,280
the action is irreversible, sensitive or politically explosive.
1310
00:56:55,280 –> 00:56:56,640
So the workflow looks like this.
1311
00:56:56,640 –> 00:57:00,040
A user asks for something that has a real blast radius.
1312
00:57:00,040 –> 00:57:01,680
Access to a restricted site.
1313
00:57:01,680 –> 00:57:03,320
An exception to a standard control.
1314
00:57:03,320 –> 00:57:04,920
A new vendor onboarding.
1315
00:57:04,920 –> 00:57:06,160
A mailbox delegation.
1316
00:57:06,160 –> 00:57:07,160
A data export.
1317
00:57:07,160 –> 00:57:08,160
A policy waiver.
1318
00:57:08,160 –> 00:57:09,680
A routing override.
1319
00:57:09,680 –> 00:57:13,360
The agent collects the missing fields that humans always forget.
1320
00:57:13,360 –> 00:57:14,600
Business justification.
1321
00:57:14,600 –> 00:57:15,920
Duration.
1322
00:57:15,920 –> 00:57:17,560
Impacted systems.
1323
00:57:17,560 –> 00:57:21,640
A user identity.
1324
00:57:21,640 –> 00:57:24,400
Then the agent packages that into an adaptive card.
1325
00:57:24,400 –> 00:57:26,880
And the adaptive card is not just nice UI.
1326
00:57:26,880 –> 00:57:28,040
It’s a constrained mechanism.
1327
00:57:28,040 –> 00:57:29,640
It forces structured input.
1328
00:57:29,640 –> 00:57:33,560
It prevents the "sure, go ahead" approval that arrives as free text with no evidence.
1329
00:57:33,560 –> 00:57:37,640
It standardizes the decision shape, what was asked, what was recommended, what policy it
1330
00:57:37,640 –> 00:57:39,560
maps to and what the consequences are.
1331
00:57:39,560 –> 00:57:42,400
And critically, it keeps the approval in the flow of work.
1332
00:57:42,400 –> 00:57:43,400
No context switch.
1333
00:57:43,400 –> 00:57:44,800
No hunting for a link.
1334
00:57:44,800 –> 00:57:47,920
No separate approval portal with its own identity problems.
1335
00:57:47,920 –> 00:57:49,920
It shows up where they already live.
1336
00:57:49,920 –> 00:57:51,160
Teams.
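The fixed decision shape can be made concrete with a sketch like this. The layout is simplified and the fact names come from whatever the agent collected; real cards follow the Adaptive Cards JSON schema, so treat this as a shape, not a production payload:

```python
def approval_card(request: dict) -> dict:
    """Sketch of an adaptive card payload with a fixed decision shape:
    what was asked (the facts) and a bounded set of actions."""
    return {
        "type": "AdaptiveCard",
        "version": "1.4",
        "body": [
            {"type": "TextBlock", "text": "Exception request", "weight": "Bolder"},
            {"type": "FactSet",
             "facts": [{"title": k, "value": str(v)} for k, v in request.items()]},
        ],
        # Bounded options only -- no free-text "sure, go ahead".
        "actions": [
            {"type": "Action.Submit", "title": "Approve",
             "data": {"decision": "approve"}},
            {"type": "Action.Submit", "title": "Reject",
             "data": {"decision": "reject"}},
            {"type": "Action.Submit", "title": "Request clarification",
             "data": {"decision": "clarify"}},
        ],
    }
```

Because every card has the same three submit actions, the decision that comes back is structured data the workflow can act on, not prose someone has to interpret.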
1337
00:57:51,160 –> 00:57:53,920
Now the failure mode this avoids is subtle.
1338
00:57:53,920 –> 00:57:54,920
Approval theater.
1339
00:57:54,920 –> 00:57:57,080
Most organizations claim they have approvals.
1340
00:57:57,080 –> 00:58:02,080
But the approvals exist as emails or as screenshots or as "the manager said yes" in chat.
1341
00:58:02,080 –> 00:58:03,400
None of that is enforceable.
1342
00:58:03,400 –> 00:58:05,120
None of that is reliably auditable.
1343
00:58:05,120 –> 00:58:09,040
And none of that scales because it can’t be correlated back to an action chain.
1344
00:58:09,040 –> 00:58:13,480
Adaptive cards fix that by making approvals a first class event in the workflow, not
1345
00:58:13,480 –> 00:58:14,680
an afterthought.
1346
00:58:14,680 –> 00:58:18,960
The agent sends the card to the right approver based on a deterministic rule.
1347
00:58:18,960 –> 00:58:21,640
The approver sees a bounded set of options.
1348
00:58:21,640 –> 00:58:23,880
Approve, reject, request clarification.
1349
00:58:23,880 –> 00:58:27,960
If they request clarification, the agent collects the missing information and resubmits.
1350
00:58:27,960 –> 00:58:30,640
If they reject, the workflow ends with a reason captured.
1351
00:58:30,640 –> 00:58:35,200
If they approve, the agent executes only the bounded action it was designed for.
1352
00:58:35,200 –> 00:58:36,200
That’s how it scales.
1353
00:58:36,200 –> 00:58:37,200
Humans approve outcomes.
1354
00:58:37,200 –> 00:58:39,040
Agents execute bounded steps.
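That approve, reject, request-clarification loop is simple enough to sketch. The three callables are stand-ins for the Teams card, the clarification step and the one tool call this agent is allowed to make, so the names here are assumptions:

```python
def run_approval(request, decide, collect_missing, execute):
    """Minimal sketch of the bounded loop: the agent prepares, a human
    decides, and execution happens only on explicit approval."""
    while True:
        decision, detail = decide(request)
        if decision == "reject":
            # Workflow ends with the reason captured.
            return {"status": "rejected", "reason": detail}
        if decision == "clarify":
            # Fill the gap the approver flagged, then resubmit.
            request = collect_missing(request, detail)
            continue
        # Approve: execute only the bounded action this agent was built for.
        return {"status": "approved", "result": execute(request)}
```

Nothing in the loop lets the agent widen its own action; the only exit paths are a captured rejection or the single bounded execution.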
1355
00:58:39,040 –> 00:58:43,240
And the reason this scales operationally is that it reduces context switching in the exact
1356
00:58:43,240 –> 00:58:45,640
places where enterprises bleed time.
1357
00:58:45,640 –> 00:58:49,160
Approvers don’t want to read a paragraph, interpret it and then remember what system they
1358
00:58:49,160 –> 00:58:50,880
need to log into to do the thing.
1359
00:58:50,880 –> 00:58:55,360
They want a decision interface that is one screen, one minute and one accountable click.
1360
00:58:55,360 –> 00:58:57,800
Adaptive cards give you that micro experience.
1361
00:58:57,800 –> 00:59:03,440
Now the real payoff is the audit trail because this is where the control plane becomes visible.
1362
00:59:03,440 –> 00:59:07,920
With this approval pattern, an enterprise can produce a clean evidence chain per action.
1363
00:59:07,920 –> 00:59:12,260
The agent identity that prepared the request, the user identity that initiated it, the
1364
00:59:12,260 –> 00:59:16,700
approval identity that authorized it, the tool call that executed it and the timestamp
1365
00:59:16,700 –> 00:59:17,700
for each step.
1366
00:59:17,700 –> 00:59:22,340
Identity plus approval plus tool call plus timestamp, that becomes the default.
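That four-part record can be sketched as a per-step audit event. The field names are illustrative assumptions; the point is the fixed shape, identity plus approval plus tool call plus timestamp:

```python
from datetime import datetime, timezone

def evidence_event(agent_id, user_id, approver_id, tool_call, step):
    """One audit record per step, so every action correlates back to
    who prepared it, who asked, who authorized it and what ran."""
    return {
        "agent_identity": agent_id,
        "initiating_user": user_id,
        "approver": approver_id,     # None until a human authorizes
        "tool_call": tool_call,      # None until something executes
        "step": step,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

# A hypothetical chain for one approved action:
chain = [
    evidence_event("agent-hr-policy", "alice", None, None, "prepared"),
    evidence_event("agent-hr-policy", "alice", "bob", None, "approved"),
    evidence_event("agent-hr-policy", "alice", "bob", "grant_site_access", "executed"),
]
```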
1367
00:59:22,340 –> 00:59:25,420
And once that is the default, you stop arguing about whether the agent should have done
1368
00:59:25,420 –> 00:59:26,420
it.
1369
00:59:26,420 –> 00:59:29,020
You have the record, you have the boundary, you have the accountability.
1370
00:59:29,020 –> 00:59:33,500
This is also where the governance layers interlock without being abstract.
1371
00:59:33,500 –> 00:59:34,500
Entra agent ID.
1372
00:59:34,500 –> 00:59:38,300
The agent has a stable identity that prepared the request and executed the tool call.
1373
00:59:38,300 –> 00:59:39,300
MCP.
1374
00:59:39,300 –> 00:59:43,280
The execution happens through standardized tools with structured outputs that record what
1375
00:59:43,280 –> 00:59:44,280
happened.
1376
00:59:44,280 –> 00:59:48,460
Purview: the agent can’t include sensitive data in the approval context if the boundary
1377
00:59:48,460 –> 00:59:53,020
blocks it and it can’t ground on non-authoritative sources without being allowed.
1378
00:59:53,020 –> 00:59:54,020
Defender.
1379
00:59:54,020 –> 00:59:58,580
If the agent starts generating approvals at a weird rate or targeting unusual approvals
1380
00:59:58,580 –> 01:00:02,820
or chaining actions outside baseline, you detect it and contain it. So the approval
1381
01:00:02,820 –> 01:00:04,220
surface isn’t an extra.
1382
01:00:04,220 –> 01:00:06,540
It’s the seam where enforcement becomes real.
1383
01:00:06,540 –> 01:00:08,260
Now one more uncomfortable truth.
1384
01:00:08,260 –> 01:00:11,900
This pattern forces you to admit that humans remain the accountability layer.
1385
01:00:11,900 –> 01:00:15,700
The enterprise can’t outsource responsibility to a probabilistic system and then blame the
1386
01:00:15,700 –> 01:00:17,380
system when it acts.
1387
01:00:17,380 –> 01:00:20,300
Adaptive cards make that responsibility explicit and routable.
1388
01:00:20,300 –> 01:00:24,060
And the organizations that scale cleanly are the ones that embrace that operating model
1389
01:00:24,060 –> 01:00:25,460
shift early.
1390
01:00:25,460 –> 01:00:28,060
Humans lead, agents operate, systems learn.
1391
01:00:28,060 –> 01:00:31,980
Next, that operating model because the tooling doesn’t save you if you still treat agents
1392
01:00:31,980 –> 01:00:34,580
like side projects instead of products.
1393
01:00:34,580 –> 01:00:38,500
The operating model from build agents to run agent products.
1394
01:00:38,500 –> 01:00:41,580
Most enterprises fail at agent scale for the most boring reason.
1395
01:00:41,580 –> 01:00:43,060
They treat agents like projects.
1396
01:00:43,060 –> 01:00:47,620
The project has a start date, a finish date, a hand off and a slide that says delivered.
1397
01:00:47,620 –> 01:00:51,180
Then everyone moves on and the system starts drifting immediately.
1398
01:00:51,180 –> 01:00:53,780
Agents don’t work like that. An agent is not a build artifact.
1399
01:00:53,780 –> 01:00:55,020
It is a running product.
1400
01:00:55,020 –> 01:00:59,180
It has users, behaviors, dependencies and failure modes that evolve under pressure.
1401
01:00:59,180 –> 01:01:02,660
That distinction matters because the moment the organization ships an agent and walks
1402
01:01:02,660 –> 01:01:05,100
away, entropy starts writing the roadmap.
1403
01:01:05,100 –> 01:01:09,060
So the operating model has to change from build agents to run agent products.
1404
01:01:09,060 –> 01:01:10,260
A product has ownership.
1405
01:01:10,260 –> 01:01:11,260
It has a sponsor.
1406
01:01:11,260 –> 01:01:12,260
It has a backlog.
1407
01:01:12,260 –> 01:01:13,260
It has a deprecation plan.
1408
01:01:13,260 –> 01:01:16,740
It has an evidence trail and it has an explicit answer to the question executives will
1409
01:01:16,740 –> 01:01:18,660
ask after the first incident.
1410
01:01:18,660 –> 01:01:20,820
Who is accountable for this system’s behavior?
1411
01:01:20,820 –> 01:01:24,060
If you can’t answer that in one sentence, you don’t have a product.
1412
01:01:24,060 –> 01:01:25,540
You have a liability.
1413
01:01:25,540 –> 01:01:27,020
Start with the simplest rule.
1414
01:01:27,020 –> 01:01:29,900
Every enterprise grade agent needs an owner and a sponsor.
1415
01:01:29,900 –> 01:01:32,340
The owner is responsible for the technical shape.
1416
01:01:32,340 –> 01:01:37,100
Instructions, tool contracts, identity scope, monitoring and change control.
1417
01:01:37,100 –> 01:01:39,780
The sponsor is responsible for the business outcome.
1418
01:01:39,780 –> 01:01:41,380
Why the agent exists.
1419
01:01:41,380 –> 01:01:43,060
What decisions it influences.
1420
01:01:43,060 –> 01:01:46,180
What risk it is allowed to carry and what happens when it fails.
1421
01:01:46,180 –> 01:01:49,780
Without a sponsor, the agent will drift toward whatever users ask for because that’s the
1422
01:01:49,780 –> 01:01:51,740
path of least resistance.
1423
01:01:51,740 –> 01:01:56,260
And whatever users ask for is how you end up with agents that accidentally become policy
1424
01:01:56,260 –> 01:01:59,820
authorities, data brokers and shadow administrators.
1425
01:01:59,820 –> 01:02:05,700
Then define lifecycle workflows that treat agents like employees with badges, not like scripts.
1426
01:02:05,700 –> 01:02:10,300
Provisioning: identity is created, tools are approved, data boundaries are validated, monitoring
1427
01:02:10,300 –> 01:02:15,380
is enabled. Operation: usage is tracked, behaviors are baselined, incidents are contained, outputs
1428
01:02:15,380 –> 01:02:17,380
are reviewed for provenance failures.
1429
01:02:17,380 –> 01:02:22,180
Change: tool contracts are versioned, prompts are updated, model behavior shifts, connectors evolve
1430
01:02:22,180 –> 01:02:25,660
and someone signs off that the blast radius remains acceptable.
1431
01:02:25,660 –> 01:02:26,660
Retirement.
1432
01:02:26,660 –> 01:02:31,340
The agent is deprecated, access is removed, identities are disabled and the registry records
1433
01:02:31,340 –> 01:02:32,660
why it was shut down.
1434
01:02:32,660 –> 01:02:35,260
If any of those steps are optional, they won’t happen.
1435
01:02:35,260 –> 01:02:36,940
Optional controls become folklore.
1436
01:02:36,940 –> 01:02:38,100
Folklore doesn’t survive scale.
1437
01:02:38,100 –> 01:02:42,260
Now the roles, this is where organizations love to overcomplicate and call it governance.
1438
01:02:42,260 –> 01:02:45,220
The required roles are minimal and non-negotiable.
1439
01:02:45,220 –> 01:02:48,300
Platform, security, compliance and a business sponsor.
1440
01:02:48,300 –> 01:02:52,780
Platform owns the shared services, the curated tool catalog, MCP contract discipline, publishing
1441
01:02:52,780 –> 01:02:54,460
gates and the registry.
1442
01:02:54,460 –> 01:02:57,540
Security owns identity boundaries and behavioral detection.
1443
01:02:57,540 –> 01:03:02,300
The Entra identity model, conditional access decisions, Defender baselines and containment
1444
01:03:02,300 –> 01:03:03,820
playbooks.
1445
01:03:03,820 –> 01:03:08,540
Compliance owns the data boundary, Purview classification, authoritative source designation
1446
01:03:08,540 –> 01:03:09,980
and exposure reporting.
1447
01:03:09,980 –> 01:03:14,780
The business sponsor owns intent: which outcomes matter, what’s acceptable and what isn’t.
1448
01:03:14,780 –> 01:03:18,020
If any one of these roles is missing, you will still ship agents.
1449
01:03:18,020 –> 01:03:20,020
You will also ship contradictory authority.
1450
01:03:20,020 –> 01:03:21,940
It will feel fast, it will also be fragile.
1451
01:03:21,940 –> 01:03:26,300
Now comes the piece most organizations avoid because it forces transparency: a registry.
1452
01:03:26,300 –> 01:03:29,220
A registry isn’t a catalog that exists to look mature.
1453
01:03:29,220 –> 01:03:33,260
It is the system of record for why an agent exists, who owns it, what it can do, what
1454
01:03:33,260 –> 01:03:35,500
it can touch and which tools it can call.
1455
01:03:35,500 –> 01:03:38,660
It answers the user question, which agent should I use?
1456
01:03:38,660 –> 01:03:41,260
Before that question becomes a productivity tax.
1457
01:03:41,260 –> 01:03:46,460
It answers the auditor question, which identities exist and what are they authorized to do?
1458
01:03:46,460 –> 01:03:48,580
Before that question becomes a program freeze.
1459
01:03:48,580 –> 01:03:52,700
And it answers the architecture question, how many overlapping agents do we have that solve
1460
01:03:52,700 –> 01:03:54,700
the same problem differently?
1461
01:03:54,700 –> 01:03:56,420
Before that becomes permanent.
1462
01:03:56,420 –> 01:04:00,660
The registry mindset also forces the enterprise to define discoverability properly.
1463
01:04:00,660 –> 01:04:03,260
Agents shouldn’t be discovered by rumor and Teams links.
1464
01:04:03,260 –> 01:04:06,820
They should be discoverable by purpose, data domain and allowed actions.
1465
01:04:06,820 –> 01:04:11,340
And every agent entry should state one uncomfortable truth plainly: the agent’s boundary, what
1466
01:04:11,340 –> 01:04:14,380
it will not do, what it cannot access, what requires human approval.
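One way to picture a registry entry is as structured data rather than a wiki page. Every field here is an illustrative assumption, but the questions each field answers are the ones from the transcript: why it exists, who owns it, what it can touch, and where its boundary is:

```python
# Hypothetical registry entry -- names and values are illustrative.
REGISTRY_ENTRY = {
    "agent": "hr-policy-agent",
    "purpose": "Answer HR policy questions from the system of record",
    "owner": "platform-team",        # accountable for technical shape
    "sponsor": "hr-director",        # accountable for business outcome
    "data_domain": "HR policy",
    "allowed_tools": ["get_policy", "open_hr_ticket"],
    "boundary": {
        "will_not": ["grant exceptions", "modify policy"],
        "cannot_access": ["payroll records", "draft policies"],
        "requires_human_approval": ["any exception request"],
    },
}
```

Because the entry is structured, the user question (which agent should I use), the auditor question (what is it authorized to do) and the architecture question (what overlaps) are all queries, not investigations.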
1467
01:04:14,380 –> 01:04:18,940
Then comes the entropy law, because this is where governance becomes real: exceptions accumulate.
1468
01:04:18,940 –> 01:04:21,020
A team asks for just one more connector.
1469
01:04:21,020 –> 01:04:23,900
A leader wants just this one sensitive site.
1470
01:04:23,900 –> 01:04:26,980
A pilot needs temporary access that never gets removed.
1471
01:04:26,980 –> 01:04:32,260
A tool needs write permissions because it failed once and somebody unblocked it to hit a deadline.
1472
01:04:32,260 –> 01:04:34,260
Each exception is an entropy generator.
1473
01:04:34,260 –> 01:04:38,420
Over time, the system drifts away from the original intent, not because anyone chose chaos
1474
01:04:38,420 –> 01:04:43,020
but because the platform allowed intent to be negotiated instead of enforced.
1475
01:04:43,020 –> 01:04:46,980
So the operating model has one job, enforce assumptions at scale, not by telling people
1476
01:04:46,980 –> 01:04:51,220
to behave, by designing the control plane and the life cycle gates so that unsafe drift
1477
01:04:51,220 –> 01:04:52,940
is harder than safe iteration.
1478
01:04:52,940 –> 01:04:56,540
And when the organization does that, something surprising happens, innovation doesn’t die.
1479
01:04:56,540 –> 01:05:00,260
It gets routed into patterns that are reusable, auditable and containable, which is the only
1480
01:05:00,260 –> 01:05:03,460
kind of innovation that survives the agent decade.
1481
01:05:03,460 –> 01:05:07,300
Risk three: cost and decision debt, the slow-burn failure.
1482
01:05:07,300 –> 01:05:10,940
Cost is the risk nobody leads with because it sounds like procurement, but it’s the
1483
01:05:10,940 –> 01:05:15,340
forcing function that kills programs after the first wave of enthusiasm.
1484
01:05:15,340 –> 01:05:18,540
Security incidents pause you fast, cost pauses you permanently.
1485
01:05:18,540 –> 01:05:22,580
And the ugly part is that agent costs don’t show up where people expect.
1486
01:05:22,580 –> 01:05:28,060
They don’t show up as AI spend, they show up as a distributed tax across tokens, tool calls,
1487
01:05:28,060 –> 01:05:32,420
premium connectors, additional environments, extra monitoring and the one cost nobody budgets
1488
01:05:32,420 –> 01:05:33,420
for.
1489
01:05:33,420 –> 01:05:35,340
Human escalation when the agent can’t finish.
1490
01:05:35,340 –> 01:05:37,540
So the enterprise thinks it’s buying productivity.
1491
01:05:37,540 –> 01:05:39,780
What it’s actually buying is a new kind of runtime.
1492
01:05:39,780 –> 01:05:43,980
Every agent you add becomes a consumer of compute, data access and decision making bandwidth.
1493
01:05:43,980 –> 01:05:45,940
Not once, continuously.
1494
01:05:45,940 –> 01:05:50,500
And because agents encourage more usage because they make it easy, your program doesn’t scale
1495
01:05:50,500 –> 01:05:51,500
linearly.
1496
01:05:51,500 –> 01:05:55,180
It scales with demand you didn’t previously see because humans avoided the friction.
1497
01:05:55,180 –> 01:05:57,220
Remove the friction and the request volume spikes.
1498
01:05:57,220 –> 01:05:58,460
This is the token side.
1499
01:05:58,460 –> 01:06:00,700
Every conversation is not a chat.
1500
01:06:00,700 –> 01:06:04,460
It’s a chain: grounding retrieval, tool calls, summarization, follow-ups, retries and
1501
01:06:04,460 –> 01:06:06,260
sometimes multi agent chaining.
1502
01:06:06,260 –> 01:06:11,380
The user sees one answer; the platform sees a loop. Then the tool side.
1503
01:06:11,380 –> 01:06:17,340
Every tool call has a cost profile, API quotas, transaction charges, connector licensing,
1504
01:06:17,340 –> 01:06:19,540
downstream system load and audit storage.
1505
01:06:19,540 –> 01:06:22,660
If you let agents call tools freely, they will call tools freely.
1506
01:06:22,660 –> 01:06:26,220
Not because they’re wasteful, because they’re optimized to complete tasks and completion
1507
01:06:26,220 –> 01:06:29,260
often means try again with more context.
1508
01:06:29,260 –> 01:06:31,580
Retries aren’t free, they become the default.
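The retry dynamic can be made concrete. A minimal sketch of a per-task tool-call budget, assuming a hypothetical `ToolBudget` policy object; the cap and the escalation outcome are illustrative, not a platform API:

```python
# Sketch: a per-task tool-call budget so "try again with more context"
# can't loop indefinitely. Names and limits are illustrative, not a real API.

class ToolBudget:
    def __init__(self, max_calls=5):
        self.max_calls = max_calls   # hard cap per task, set by policy
        self.calls = 0

    def allow(self):
        """Return True if another tool call is permitted for this task."""
        if self.calls >= self.max_calls:
            return False
        self.calls += 1
        return True

def run_task(attempt_tool, budget):
    """Retry the tool until it succeeds or the budget is exhausted."""
    while budget.allow():
        if attempt_tool():
            return "completed"
    return "escalated"   # human escalation becomes an explicit, counted outcome
```

The design point: escalation stops being an invisible human cost and becomes a budgeted, measurable result.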
1509
01:06:31,580 –> 01:06:35,260
Now add premium connectors and licensing boundaries because this is where finance starts paying
1510
01:06:35,260 –> 01:06:36,220
attention.
1511
01:06:36,220 –> 01:06:40,500
Someone builds an agent in Copilot Studio, connects it to a premium system and nobody remembers
1512
01:06:40,500 –> 01:06:42,580
that one more connector has a bill.
1513
01:06:42,580 –> 01:06:48,140
Or worse, it has a per flow, per run, per capacity shape that only becomes obvious once
1514
01:06:48,140 –> 01:06:49,140
adoption grows.
1515
01:06:49,140 –> 01:06:50,660
The platform didn’t trick you.
1516
01:06:50,660 –> 01:06:54,740
You just didn’t model the cost per completed task and that’s the metric that matters.
1517
01:06:54,740 –> 01:06:55,940
Not cost per chat.
1518
01:06:55,940 –> 01:06:58,900
A chat that ends in escalation is not a completed task.
1519
01:06:58,900 –> 01:07:02,180
A chat that creates two duplicate tickets is not a completed task.
1520
01:07:02,180 –> 01:07:06,140
A chat that retrieves 10 documents and still answers from the wrong policy source is not
1521
01:07:06,140 –> 01:07:07,460
a completed task.
1522
01:07:07,460 –> 01:07:10,820
Counting chats makes dashboards look good, it does not keep the program funded.
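One way to see the difference is to compute both numbers from the same telemetry. A minimal sketch; the session fields (`cost`, `completed`) are assumptions standing in for your own billing and outcome data:

```python
# Sketch: cost per chat vs cost per completed task from the same sessions.
# Field names are illustrative; plug in your real telemetry.

def cost_metrics(sessions):
    """sessions: list of dicts with 'cost' (tokens + tools + connectors, $)
    and 'completed' (no escalation, no duplicate ticket, no rework)."""
    total_cost = sum(s["cost"] for s in sessions)
    completed = [s for s in sessions if s["completed"]]
    cost_per_chat = total_cost / len(sessions)
    # Incomplete chats still cost money; only completions divide the bill.
    cost_per_task = total_cost / len(completed) if completed else float("inf")
    return cost_per_chat, cost_per_task

sessions = [
    {"cost": 0.40, "completed": True},
    {"cost": 0.55, "completed": False},  # escalated to a human
    {"cost": 0.30, "completed": True},
    {"cost": 0.75, "completed": False},  # duplicate ticket, reworked
]
per_chat, per_task = cost_metrics(sessions)
# per_chat is $0.50 and looks flattering; per_task is $1.00 and tells the truth
```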
1523
01:07:10,820 –> 01:07:13,820
Now the second part of this risk is harder to see.
1524
01:07:13,820 –> 01:07:15,260
And it’s more corrosive.
1525
01:07:15,260 –> 01:07:16,260
Decision debt.
1526
01:07:16,260 –> 01:07:19,580
Decision debt is what happens when multiple agents solve the same problem differently at
1527
01:07:19,580 –> 01:07:22,620
the same time with different assumptions and different toolchains.
1528
01:07:22,620 –> 01:07:26,500
It’s the agent version of shadow IT except it doesn’t just create duplicate apps.
1529
01:07:26,500 –> 01:07:28,100
It creates duplicate answers.
1530
01:07:28,100 –> 01:07:29,820
And duplicate answers create rework.
1531
01:07:29,820 –> 01:07:32,140
A user asks one agent for the policy path.
1532
01:07:32,140 –> 01:07:33,700
Another agent gives a different path.
1533
01:07:33,700 –> 01:07:37,100
The user pings a human, the human picks one, then the agent gets blamed anyway.
1534
01:07:37,100 –> 01:07:38,900
That’s not AI unreliability.
1535
01:07:38,900 –> 01:07:40,900
That’s an inconsistent system design.
1536
01:07:40,900 –> 01:07:42,580
Decision debt compounds in three ways.
1537
01:07:42,580 –> 01:07:45,780
First, overlapping agents create inconsistent outcomes.
1538
01:07:45,780 –> 01:07:49,580
The organization spends time reconciling contradictions instead of completing work.
1539
01:07:49,580 –> 01:07:52,540
Second, duplicated tool contracts create duplicated maintenance.
1540
01:07:52,540 –> 01:07:56,020
Even when MCP exists, teams still rebuild because they can.
1541
01:07:56,020 –> 01:07:59,020
That doubles cost and triples failure surface.
1542
01:07:59,020 –> 01:08:03,340
Third, exception growth turns every special case into a custom path.
1543
01:08:03,340 –> 01:08:08,500
Each custom path requires more tokens, more tool calls, more approvals, more monitoring,
1544
01:08:08,500 –> 01:08:10,180
and more incident response.
1545
01:08:10,180 –> 01:08:11,900
Exceptions aren’t operational flexibility.
1546
01:08:11,900 –> 01:08:13,820
They’re recurring invoices.
1547
01:08:13,820 –> 01:08:17,260
This is why cost and governance aren’t separate conversations.
1548
01:08:17,260 –> 01:08:19,220
Cost is governance failing slowly.
1549
01:08:19,220 –> 01:08:23,460
And cost becomes visible when finance asks the only question that doesn’t care about
1550
01:08:23,460 –> 01:08:25,060
your architecture narrative.
1551
01:08:25,060 –> 01:08:28,260
What are we paying for and what are we getting back?
1552
01:08:28,260 –> 01:08:30,620
The predictable trajectory is always the same.
1553
01:08:30,620 –> 01:08:31,620
Phase one.
1554
01:08:31,620 –> 01:08:32,620
Invoices are small.
1555
01:08:32,620 –> 01:08:33,620
Adoption is exciting.
1556
01:08:33,620 –> 01:08:34,620
Leadership is optimistic.
1557
01:08:34,620 –> 01:08:35,620
Phase two.
1558
01:08:35,620 –> 01:08:36,620
Usage grows.
1559
01:08:36,620 –> 01:08:37,620
More agents appear.
1560
01:08:37,620 –> 01:08:38,620
Tool calls increase.
1561
01:08:38,620 –> 01:08:40,420
And the first cost spike hits.
1562
01:08:40,420 –> 01:08:41,420
Phase three.
1563
01:08:41,420 –> 01:08:42,420
Someone panics.
1564
01:08:42,420 –> 01:08:43,420
The program gets frozen.
1565
01:08:43,420 –> 01:08:48,180
Not because agents didn’t work, but because nobody can prove which spend created which outcome.
1566
01:08:48,180 –> 01:08:50,460
And nobody can guarantee the next quarter won’t be worse.
1567
01:08:50,460 –> 01:08:51,780
This is the slow burn failure.
1568
01:08:51,780 –> 01:08:53,340
It doesn’t come from one bad agent.
1569
01:08:53,340 –> 01:08:55,060
It comes from unmanaged multiplication.
1570
01:08:55,060 –> 01:08:59,620
And if you want a practical warning sign before the CFO does it for you, it’s this.
1571
01:08:59,620 –> 01:09:01,380
When teams can’t answer,
1572
01:09:01,380 –> 01:09:05,740
In one sentence, which agent owns which outcome, you already have decision debt.
1573
01:09:05,740 –> 01:09:09,820
When you can’t tie tool usage to completed tasks, you already have cost drift.
1574
01:09:09,820 –> 01:09:13,940
When you can’t deprecate agents cleanly, you’ve turned your ecosystem into a permanent subscription
1575
01:09:13,940 –> 01:09:14,940
to entropy.
1576
01:09:14,940 –> 01:09:18,500
Next, the metrics that stop this from becoming a budget driven shutdown.
1577
01:09:18,500 –> 01:09:22,380
Four metrics executives actually fund, because they translate enforceable intelligence
1578
01:09:22,380 –> 01:09:23,900
into defendable outcomes.
1579
01:09:23,900 –> 01:09:26,500
The four metrics executives actually fund.
1580
01:09:26,500 –> 01:09:28,580
Executives don’t fund agent platforms.
1581
01:09:28,580 –> 01:09:30,820
They fund outcomes that survive scrutiny.
1582
01:09:30,820 –> 01:09:34,420
So if you want the program to live past the first invoice spike, the first audit question
1583
01:09:34,420 –> 01:09:39,380
or the first incident review, you need four metrics that don’t collapse into marketing.
1584
01:09:39,380 –> 01:09:43,300
Four metrics that force an evidence chain, not adoption, not number of agents, not hours
1585
01:09:43,300 –> 01:09:44,940
saved from a survey.
1586
01:09:44,940 –> 01:09:46,700
Metrics that map to operating reality.
1587
01:09:46,700 –> 01:09:49,020
First, MTTR reduction: mean time to resolution.
1588
01:09:49,020 –> 01:09:51,740
This is the easiest executive win because it’s measurable.
1589
01:09:51,740 –> 01:09:55,180
It’s already tracked and service operations already feel the pain.
1590
01:09:55,180 –> 01:09:59,980
But the mistake teams make is treating MTTR as a tool problem instead of a routing problem.
1591
01:09:59,980 –> 01:10:03,940
Clean payloads reduce MTTR, deterministic routing reduces MTTR.
1592
01:10:03,940 –> 01:10:05,740
Standardized tool actions reduce MTTR.
1593
01:10:05,740 –> 01:10:08,500
And the control plane is what makes those three things repeatable.
1594
01:10:08,500 –> 01:10:13,300
A typical mature agent rollout that enforces routing and tool reuse sees MTTR drop in
1595
01:10:13,300 –> 01:10:16,380
the 20 to 35% range on targeted workflows.
1596
01:10:16,380 –> 01:10:17,380
That’s not magic.
1597
01:10:17,380 –> 01:10:22,580
It’s removing re-triage, removing duplication and preventing tickets from bouncing between queues.
1598
01:10:22,580 –> 01:10:27,220
And the executive story becomes simple, fewer handoffs, less noise, faster closure.
1599
01:10:27,220 –> 01:10:29,660
Second, request to decision time.
1600
01:10:29,660 –> 01:10:34,180
This is where agents matter outside IT because every enterprise has decision loops that waste
1601
01:10:34,180 –> 01:10:35,180
days.
1602
01:10:35,180 –> 01:10:39,380
Access approvals, exception handling, policy waivers, change requests, vendor onboarding,
1603
01:10:39,380 –> 01:10:40,580
procurement checks.
1604
01:10:40,580 –> 01:10:43,700
The work itself isn’t hard, the waiting is.
1605
01:10:43,700 –> 01:10:45,500
Agents don’t eliminate governance.
1606
01:10:45,500 –> 01:10:49,620
They compress the latency, they collect the missing fields, they package the context,
1607
01:10:49,620 –> 01:10:53,020
they route to the correct approver and they preserve the evidence.
1608
01:10:53,020 –> 01:10:55,020
So the measure isn’t how many chats.
1609
01:10:55,020 –> 01:10:58,580
It’s how long it takes to go from request created to decision recorded.
1610
01:10:58,580 –> 01:11:02,820
In well governed approval patterns, it’s normal to see a 40 to 60% reduction.
1611
01:11:02,820 –> 01:11:06,660
Days to hours, sometimes hours to minutes because humans stop doing the administrative
1612
01:11:06,660 –> 01:11:07,900
choreography.
1613
01:11:07,900 –> 01:11:11,180
They just approve or reject with context already assembled.
1614
01:11:11,180 –> 01:11:13,540
That’s the difference between an agent that helps.
1615
01:11:13,540 –> 01:11:16,340
And an agent ecosystem that actually changes throughput.
1616
01:11:16,340 –> 01:11:17,740
Third, auditability.
1617
01:11:17,740 –> 01:11:20,220
This is the metric nobody loves until they need it.
1618
01:11:20,220 –> 01:11:22,900
And then it becomes the only metric that matters.
1619
01:11:22,900 –> 01:11:23,900
Auditability is binary.
1620
01:11:23,900 –> 01:11:27,380
Either you can produce an evidence chain per action or you can’t.
1621
01:11:27,380 –> 01:11:29,820
Before, we think the bot did X.
1622
01:11:29,820 –> 01:11:34,540
After, here is the identity, the approval, the tool call and the timestamp.
1623
01:11:34,540 –> 01:11:37,020
That is what keeps the program alive during scrutiny.
1624
01:11:37,020 –> 01:11:39,580
Because audits don’t ask whether the answer sounded correct.
1625
01:11:39,580 –> 01:11:43,300
They ask who performed the action under what authority using what data with what approvals
1626
01:11:43,300 –> 01:11:45,580
and whether controls were bypassed.
1627
01:11:45,580 –> 01:11:49,380
Auditability is where an agent ID stops being an identity feature and becomes a survival
1628
01:11:49,380 –> 01:11:50,380
feature.
1629
01:11:50,380 –> 01:11:51,380
It gives you stable attribution.
1630
01:11:51,380 –> 01:11:53,620
Purview gives you exposure context.
1631
01:11:53,620 –> 01:11:55,340
Defender gives you behavioral signals.
1632
01:11:55,340 –> 01:11:58,540
MCP gives you the tool path that can be inspected and versioned.
1633
01:11:58,540 –> 01:12:02,260
So the audit story becomes a replay, not a narrative.
1634
01:12:02,260 –> 01:12:04,100
And executives understand this immediately.
1635
01:12:04,100 –> 01:12:06,500
If you can’t prove it, you can’t scale it.
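That evidence chain can be sketched as one structured record per action, so the audit really is a replay. Field names here are illustrative assumptions, not a Microsoft schema:

```python
# Sketch: the minimal evidence record per agent action.
# Every field name is an illustrative assumption.

from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass(frozen=True)
class ActionEvidence:
    agent_id: str       # stable identity, e.g. an Entra agent ID
    authority: str      # scope or role the action ran under
    approval_ref: str   # human approval reference, or "auto" where policy allows
    tool: str           # versioned tool contract that was invoked
    payload_hash: str   # hash of the structured payload, not the raw data
    timestamp: str      # UTC, so replays line up across systems

def record(agent_id, authority, approval_ref, tool, payload_hash):
    return ActionEvidence(
        agent_id, authority, approval_ref, tool, payload_hash,
        datetime.now(timezone.utc).isoformat(),
    )

ev = record("agent-hr-onboarding", "hr.tickets.write",
            "APPR-1042", "ticketing.create_ticket@v3", "sha256:ab12")
# asdict(ev) is what would be shipped to immutable audit storage
```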
1636
01:12:06,500 –> 01:12:08,180
Fourth, cost per completed task.
1637
01:12:08,180 –> 01:12:10,180
This is where most agent programs die.
1638
01:12:10,180 –> 01:12:14,060
Teams report cost per chat and finance asks the only question that matters.
1639
01:12:14,060 –> 01:12:15,740
What did the chat actually complete?
1640
01:12:15,740 –> 01:12:21,060
A completed task means an outcome that didn’t require a second agent, a human retriage,
1641
01:12:21,060 –> 01:12:24,020
a manual rework or an incident cleanup later.
1642
01:12:24,020 –> 01:12:26,780
If you don’t define completion, you can’t manage cost.
1643
01:12:26,780 –> 01:12:31,740
The organizations that standardize tool contracts and reuse them through MCP often see 25
1644
01:12:31,740 –> 01:12:36,580
to 50% lower cost per resolved request on mature workflows, not because tokens got
1645
01:12:36,580 –> 01:12:40,940
cheaper, but because the system stopped paying for duplicated integrations, duplicated retries
1646
01:12:40,940 –> 01:12:42,620
and duplicated human escalation.
1647
01:12:42,620 –> 01:12:45,060
And here’s the thing that makes all four metrics credible.
1648
01:12:45,060 –> 01:12:46,380
They aren’t independent.
1649
01:12:46,380 –> 01:12:50,340
MTTR drops when request to decision time drops because fewer things stall.
1650
01:12:50,340 –> 01:12:54,500
Auditability improves when tool reuse improves because evidence becomes structured.
1651
01:12:54,500 –> 01:12:58,740
Cost per completed task drops when routing becomes deterministic because the system stops
1652
01:12:58,740 –> 01:13:00,340
doing the same work twice.
1653
01:13:00,340 –> 01:13:02,340
So the goal isn’t to chase four dashboards.
1654
01:13:02,340 –> 01:13:06,140
The goal is to enforce the control plane so the metrics fall out as a consequence.
1655
01:13:06,140 –> 01:13:09,900
This is also why executives should demand that every agent product has a measurement contract
1656
01:13:09,900 –> 01:13:12,460
before it ships, not will measure later.
1657
01:13:12,460 –> 01:13:13,820
What is the completed task?
1658
01:13:13,820 –> 01:13:15,940
What is the expected MTTR change?
1659
01:13:15,940 –> 01:13:17,260
What decision loop does it compress?
1660
01:13:17,260 –> 01:13:19,660
What audit evidence does it produce by default?
1661
01:13:19,660 –> 01:13:21,980
What is the cost per completion baseline today?
1662
01:13:21,980 –> 01:13:25,140
If a team can’t answer those questions, they’re shipping an experiment.
1663
01:13:25,140 –> 01:13:28,940
Experiments are fine, but experiments don’t get enterprise rollout budgets.
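A measurement contract can be as simple as a required set of answered questions checked before publish. A sketch; the key names mirror the questions above and are assumptions:

```python
# Sketch: a measurement contract that must exist before an agent ships.
# Keys mirror the executive questions; the check is deliberately simple.

REQUIRED = [
    "completed_task_definition",
    "expected_mttr_change",
    "decision_loop_compressed",
    "audit_evidence_produced",
    "cost_per_completion_baseline",
]

def ready_to_ship(contract):
    """An agent with unanswered questions is an experiment, not a product."""
    missing = [k for k in REQUIRED if not contract.get(k)]
    return (len(missing) == 0, missing)

contract = {
    "completed_task_definition": "access request closed without re-triage",
    "expected_mttr_change": "-25% on the access-request workflow",
    "decision_loop_compressed": "manager access approval",
    "audit_evidence_produced": "identity + approval + tool call + timestamp",
    # "cost_per_completion_baseline" is missing: still an experiment
}
ok, missing = ready_to_ship(contract)
# ok is False until the baseline question is answered
```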
1664
01:13:28,940 –> 01:13:33,340
Next: governance gates that don’t kill innovation, because metrics without enforcement gates
1665
01:13:33,340 –> 01:13:35,900
just produce numbers that lie.
1666
01:13:35,900 –> 01:13:42,060
Governance gates sound like the place innovation goes to die and that’s not a branding problem.
1667
01:13:42,060 –> 01:13:44,020
That’s history.
1668
01:13:44,020 –> 01:13:48,060
Most enterprises implemented governance as a slow approval queue run by people who don’t
1669
01:13:48,060 –> 01:13:52,380
own outcomes, don’t understand the work and don’t carry the blast radius when something
1670
01:13:52,380 –> 01:13:53,540
ships badly.
1671
01:13:53,540 –> 01:13:56,860
So makers route around it, shadow systems appear, entropy wins.
1672
01:13:56,860 –> 01:13:58,300
So the goal isn’t to add gates.
1673
01:13:58,300 –> 01:14:02,860
The goal is to add zones with different enforcement and to make the path from experiment to enterprise
1674
01:14:02,860 –> 01:14:03,860
predictable.
1675
01:14:03,860 –> 01:14:06,620
The goal is to add zones: personal, departmental, enterprise.
1676
01:14:06,620 –> 01:14:10,980
Personal is where people explore, the agent can exist, it can help, it can even be useful,
1677
01:14:10,980 –> 01:14:14,860
but it can’t publish broadly, it can’t run autonomous triggers and it can’t be wired
1678
01:14:14,860 –> 01:14:16,380
into high impact tools.
1679
01:14:16,380 –> 01:14:18,780
That’s not punishment, that’s containment.
1680
01:14:18,780 –> 01:14:21,140
Personal experimentation stays personal.
1681
01:14:21,140 –> 01:14:23,900
Departmental is where teams solve real work for a defined audience.
1682
01:14:23,900 –> 01:14:25,780
This is where tool reuse starts to matter.
1683
01:14:25,780 –> 01:14:30,540
This is where you enforce that the agent has an owner, a scope and a clear purpose.
1684
01:14:30,540 –> 01:14:33,620
Departmental agents can connect to tools, but only through reviewed contracts.
1685
01:14:33,620 –> 01:14:37,300
They can read data, but only within defined boundaries and they can ship faster because
1686
01:14:37,300 –> 01:14:39,500
the controls are predefined.
1687
01:14:39,500 –> 01:14:42,260
Enterprise is where the agent becomes part of the operating model.
1688
01:14:42,260 –> 01:14:46,820
That means identity is non-negotiable, tool contracts are standardized, data boundaries
1689
01:14:46,820 –> 01:14:53,140
are enforced and Defender monitoring is enabled. Not optional, default.
1690
01:14:53,140 –> 01:14:57,300
That zoning model is the first gate, where the agent lives determines what it’s allowed
1691
01:14:57,300 –> 01:14:58,300
to do.
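The zoning rule reads naturally as policy data rather than prose. A sketch with illustrative capability names; a real platform would enforce these server-side:

```python
# Sketch: zones as policy data, so where an agent lives determines what it
# may do. Capability names and the departmental settings are assumptions.

ZONES = {
    "personal":     {"broad_publish": False, "autonomous_triggers": False,
                     "high_impact_tools": False},
    "departmental": {"broad_publish": False, "autonomous_triggers": True,
                     "high_impact_tools": False},
    "enterprise":   {"broad_publish": True,  "autonomous_triggers": True,
                     "high_impact_tools": True},
}

def allowed(zone, capability):
    """Deny by design: unknown zones and unknown capabilities are refused."""
    return ZONES.get(zone, {}).get(capability, False)

# Personal experimentation stays personal; enterprise gets the full surface
assert allowed("personal", "autonomous_triggers") is False
assert allowed("enterprise", "high_impact_tools") is True
assert allowed("unknown-zone", "broad_publish") is False
```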
1692
01:14:58,300 –> 01:15:01,940
Second gate is the publish gate, this is where most organizations overdo it.
1693
01:15:01,940 –> 01:15:04,780
The publish gate should be brutally short.
1694
01:15:04,780 –> 01:15:07,660
Four yes no checks that map to the control plane.
1695
01:15:07,660 –> 01:15:09,980
One, identity created.
1696
01:15:09,980 –> 01:15:14,900
Entra agent ID exists, ownership is defined, sponsor is defined, life cycle is attached.
1697
01:15:14,900 –> 01:15:18,500
If it can act, it has an identity, if it doesn’t, it stays in the sandbox.
1698
01:15:18,500 –> 01:15:20,340
Two, tool contract reviewed.
1699
01:15:20,340 –> 01:15:23,900
MCP tools are used where possible and the actions are explicit.
1700
01:15:23,900 –> 01:15:27,580
No mystery connectors, no bespoke endpoints that nobody can trace.
1701
01:15:27,580 –> 01:15:30,900
And irreversible actions don’t exist without an approval pattern.
1702
01:15:30,900 –> 01:15:36,300
Three, data boundary validated, Purview signals exist, authoritative sources are defined
1703
01:15:36,300 –> 01:15:39,940
and non-authoritative content is either excluded or marked as such.
1704
01:15:39,940 –> 01:15:43,820
The agent can’t become a search engine for your tenant by accident.
1705
01:15:43,820 –> 01:15:45,940
Four, monitoring enabled.
1706
01:15:45,940 –> 01:15:50,340
Defender for AI is watching behavior, baselines exist, containment actions are defined.
1707
01:15:50,340 –> 01:15:54,260
If something drifts, you can disable one agent identity without pausing the program. That’s
1708
01:15:54,260 –> 01:15:55,260
the publish gate.
1709
01:15:55,260 –> 01:15:57,980
It doesn’t ask for diagrams, it asks for enforceability.
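Those four yes/no checks can be encoded directly. A sketch; the agent record fields are hypothetical stand-ins for whatever your registry actually stores:

```python
# Sketch: the publish gate as four yes/no checks mapped to the control plane.
# Field names are illustrative assumptions, not a product schema.

GATE_CHECKS = {
    "identity_created":
        lambda a: bool(a.get("entra_agent_id")) and bool(a.get("owner")),
    "tool_contract_reviewed":
        lambda a: bool(a.get("tools_via_mcp")) and not a.get("mystery_connectors"),
    "data_boundary_validated":
        lambda a: bool(a.get("purview_labels")) and bool(a.get("authoritative_sources")),
    "monitoring_enabled":
        lambda a: bool(a.get("defender_for_ai")),
}

def publish_gate(agent):
    """Return (passes, list of failed checks). All four must be yes."""
    failures = [name for name, check in GATE_CHECKS.items() if not check(agent)]
    return (len(failures) == 0, failures)

agent = {
    "entra_agent_id": "uuid-1042", "owner": "it-service-desk",
    "tools_via_mcp": True, "mystery_connectors": False,
    "purview_labels": True, "authoritative_sources": ["policy-site"],
    "defender_for_ai": False,   # monitoring not yet wired up
}
ok, failures = publish_gate(agent)
# ok is False: it stays in the sandbox until monitoring is enabled
```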
1710
01:15:57,980 –> 01:15:59,860
Now here’s the uncomfortable truth.
1711
01:15:59,860 –> 01:16:03,460
Least privilege has to become the default, not the aspiration.
1712
01:16:03,460 –> 01:16:08,020
At small scale, people treat least privilege like a best practice they’ll implement later
1713
01:16:08,020 –> 01:16:09,940
after the pilot proves value.
1714
01:16:09,940 –> 01:16:14,900
At agent scale, later is where incidents come from, so the gate is deny by design.
1715
01:16:14,900 –> 01:16:19,740
Exceptions become explicit requests, time bound, where possible, and treated as entropy
1716
01:16:19,740 –> 01:16:20,740
generators.
1717
01:16:20,740 –> 01:16:24,260
Not because the organization is mean, because exceptions are the mechanism that converts
1718
01:16:24,260 –> 01:16:27,060
deterministic controls into probabilistic ones.
1719
01:16:27,060 –> 01:16:30,980
And once controls are probabilistic, you don’t get predictable outcomes, you get conditional
1720
01:16:30,980 –> 01:16:31,980
chaos.
1721
01:16:31,980 –> 01:16:35,220
Now the next gate, and this is where innovation usually complains, is human in the loop,
1722
01:16:35,220 –> 01:16:37,020
you don’t need approvals for everything.
1723
01:16:37,020 –> 01:16:40,660
You need approvals for irreversible actions and sensitive domains.
1724
01:16:40,660 –> 01:16:44,940
Anything that changes access, shares data, deletes records, updates authoritative systems,
1725
01:16:44,940 –> 01:16:48,300
or triggers downstream financial impact needs a human checkpoint.
1726
01:16:48,300 –> 01:16:51,660
And the trick is not making people jump through portals, it’s routing approvals into the
1727
01:16:51,660 –> 01:16:53,060
flow of work.
1728
01:16:53,060 –> 01:16:56,700
And the data arrives as structured payloads, so the decision can be fast and accountable.
1729
01:16:56,700 –> 01:16:58,100
That pattern scales.
1730
01:16:58,100 –> 01:17:01,900
It also makes audits boring, which is the highest compliment an enterprise can receive.
1731
01:17:01,900 –> 01:17:03,980
Now how does this avoid killing innovation?
1732
01:17:03,980 –> 01:17:07,500
Because gates don’t slow teams when the gates are pre-baked into the platform.
1733
01:17:07,500 –> 01:17:13,020
The platform team publishes the tool catalog, security publishes the identity patterns, compliance
1734
01:17:13,020 –> 01:17:17,260
publishes the authoritative sources, Defender publishes the baseline and containment playbooks,
1735
01:17:17,260 –> 01:17:21,580
then makers build inside those constraints without negotiating them every time.
1736
01:17:21,580 –> 01:17:24,260
This is the inversion most enterprises miss.
1737
01:17:24,260 –> 01:17:27,780
Innovation dies when every project has to invent its own governance.
1738
01:17:27,780 –> 01:17:31,940
Innovation survives when governance is a reusable service, and yes you still allow experimentation.
1739
01:17:31,940 –> 01:17:35,020
You just don’t let experimentation become production by accident.
1740
01:17:35,020 –> 01:17:37,020
So the governance model isn’t, no.
1741
01:17:37,020 –> 01:17:41,700
It’s, you can do almost anything in the personal zone, some things in the departmental zone,
1742
01:17:41,700 –> 01:17:44,140
and only enforceable things in the enterprise zone.
1743
01:17:44,140 –> 01:17:46,860
That’s how you scale intelligence without scaling chaos.
1744
01:17:46,860 –> 01:17:51,620
Next, the realistic 12 month adoption sequence because this doesn’t land by buying licenses.
1745
01:17:51,620 –> 01:17:56,460
It lands by moving in a deliberate order that keeps the program alive under scrutiny.
1746
01:17:56,460 –> 01:17:59,580
The 12 month adoption sequence for the agentic decade.
1747
01:17:59,580 –> 01:18:03,460
Enterprises keep asking for the agent roadmap, like it’s a feature checklist.
1748
01:18:03,460 –> 01:18:05,700
It isn’t, it’s an order of operations problem.
1749
01:18:05,700 –> 01:18:09,340
If you scale capability before you scale enforceability you get sprawl.
1750
01:18:09,340 –> 01:18:13,700
If you scale enforceability before you scale capability you get adoption that survives.
1751
01:18:13,700 –> 01:18:18,180
So here’s the 12 month sequence that actually works because it matches how systems decay.
1752
01:18:18,180 –> 01:18:19,980
Month 0 to 3.
1753
01:18:19,980 –> 01:18:21,140
Inventory and identity.
1754
01:18:21,140 –> 01:18:22,380
Not innovation theater.
1755
01:18:22,380 –> 01:18:25,180
The first three months are about admitting what already exists.
1756
01:18:25,180 –> 01:18:27,700
You already have agents: Copilot prompts.
1757
01:18:27,700 –> 01:18:29,260
Power Automate flows.
1758
01:18:29,260 –> 01:18:30,740
Scripted integrations.
1759
01:18:30,740 –> 01:18:33,060
Shared service accounts doing automation.
1760
01:18:33,060 –> 01:18:36,060
The agentic decade doesn’t start when you buy Agent 365.
1761
01:18:36,060 –> 01:18:39,420
It starts when you measure the non-human actors already touching your tenant.
1762
01:18:39,420 –> 01:18:42,340
So you inventory agents, tools, and automations by domain.
1763
01:18:42,340 –> 01:18:46,700
You create the catalog structure and then you mandate the one enforcement rule that prevents
1764
01:18:46,700 –> 01:18:49,500
identity drift from becoming your first scandal.
1765
01:18:49,500 –> 01:18:52,300
If it can act, it must have an Entra agent ID.
1766
01:18:52,300 –> 01:18:53,580
Not later, now.
1767
01:18:53,580 –> 01:18:57,420
And you attach ownership and sponsorship as part of creation, not as a cleanup project.
1768
01:18:57,420 –> 01:19:00,060
If you can’t name an owner and sponsor it stays personal.
1769
01:19:00,060 –> 01:19:02,540
It doesn’t scale.
1770
01:19:02,540 –> 01:19:04,180
Month 3 to 6.
1771
01:19:04,180 –> 01:19:05,180
Standardize tools.
1772
01:19:05,180 –> 01:19:07,700
Then remove the bespoke connectors you already regret.
1773
01:19:07,700 –> 01:19:11,220
This is where MCP stops being a developer topic and becomes a governance primitive.
1774
01:19:11,220 –> 01:19:13,300
You don’t let every team integrate.
1775
01:19:13,300 –> 01:19:15,420
You publish a tool contract catalog.
1776
01:19:15,420 –> 01:19:19,380
What actions exist, what the payload looks like, what the output looks like, and what
1777
01:19:19,380 –> 01:19:21,340
provenance must be included.
1778
01:19:21,340 –> 01:19:24,220
Then you do the unglamorous part: you kill bespoke connectors.
1779
01:19:24,220 –> 01:19:27,460
Not because they’re evil, because they multiply your failure modes.
1780
01:19:27,460 –> 01:19:32,060
Every bespoke toolpath is a separate incident response workflow you’ll have to run at 2am.
1781
01:19:32,060 –> 01:19:33,380
That is not innovation.
1782
01:19:33,380 –> 01:19:35,100
That’s architectural erosion.
1783
01:19:35,100 –> 01:19:39,540
At the same time you set data boundaries in Purview that match reality.
1784
01:19:39,540 –> 01:19:45,420
Define authoritative sources, identify sensitive repositories and make deny by design the default
1785
01:19:45,420 –> 01:19:47,900
posture for agent identities.
1786
01:19:47,900 –> 01:19:51,980
If a team wants broader data access they’re requesting a wider blast radius.
1787
01:19:51,980 –> 01:19:53,540
Make that explicit.
1788
01:19:53,540 –> 01:19:58,580
Month 6 to 9: baselines and containment, because drift starts as soon as adoption becomes normal.
1789
01:19:58,580 –> 01:20:02,380
By month 6 the ecosystem starts behaving like an ecosystem.
1790
01:20:02,380 –> 01:20:06,220
People reuse agents, chain agents, and ask agents to do more than they were designed
1791
01:20:06,220 –> 01:20:07,220
for.
1792
01:20:07,220 –> 01:20:08,220
The system doesn’t stay stable.
1793
01:20:08,220 –> 01:20:10,580
It accumulates pressure.
1794
01:20:10,580 –> 01:20:14,820
So this phase is defender for AI becoming operational, not just enabled.
1795
01:20:14,820 –> 01:20:20,020
You build baselines for normal behavior per high-impact agent: tool call rates, data access patterns,
1796
01:20:20,020 –> 01:20:23,980
unusual retrieval, unusual approvals, anomalous identity behavior.
1797
01:20:23,980 –> 01:20:28,460
Then you write playbooks that contain one agent without punishing the entire program.
1798
01:20:28,460 –> 01:20:33,900
Disable the agent identity, quarantine the tool, restrict the data boundary, force approvals,
1799
01:20:33,900 –> 01:20:35,060
and resume.
1800
01:20:35,060 –> 01:20:39,060
The goal is not perfect prevention, the goal is fast containment with evidence.
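The containment playbook reads naturally as an ordered list of steps run against one agent identity, with evidence collected at each step. A sketch; the step names and the `execute` callback are illustrative, not real admin APIs:

```python
# Sketch: containment as ordered steps against one agent identity, so you
# can stop one agent without freezing the program. Step names are assumptions.

CONTAINMENT_PLAYBOOK = [
    "disable_agent_identity",   # revoke the agent identity's sessions
    "quarantine_tool",          # pull the suspect tool contract from the catalog
    "restrict_data_boundary",   # tighten the agent's data scope
    "force_approvals",          # route remaining actions through a human
    "resume",                   # re-enable once the baseline is clean again
]

def contain(agent_id, execute):
    """Run each step in order, collecting evidence for the incident review."""
    evidence = []
    for step in CONTAINMENT_PLAYBOOK:
        evidence.append((step, execute(agent_id, step)))
    return evidence

# execute would call your real admin APIs; here it just acknowledges each step
log = contain("agent-finance-approvals",
              lambda agent, step: f"{step} ok for {agent}")
# log has five entries, in order, each tied to the one contained identity
```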
1801
01:20:39,060 –> 01:20:43,740
Month 9 to 12: scale the approvals pattern and start deprecation like you mean it.
1802
01:20:43,740 –> 01:20:47,900
This is where the ecosystem either matures or collapses under decision debt.
1803
01:20:47,900 –> 01:20:50,580
The difference is whether you can retire agents cleanly.
1804
01:20:50,580 –> 01:20:55,060
So you standardize the Teams plus Adaptive Cards approval surface for irreversible actions
1805
01:20:55,060 –> 01:20:56,180
and sensitive domains.
1806
01:20:56,180 –> 01:20:59,100
You make it a reusable pattern, not a one off workflow.
1807
01:20:59,100 –> 01:21:03,540
Humans approve outcomes, agents execute bounded steps, and then you do the part nobody wants
1808
01:21:03,540 –> 01:21:04,540
to do.
1809
01:21:04,540 –> 01:21:08,420
You deprecate redundant agents, because sprawl is not just too many agents.
1810
01:21:08,420 –> 01:21:09,860
It’s overlapping authority.
1811
01:21:09,860 –> 01:21:13,700
If two agents answer the same policy question differently, you don’t have optionality,
1812
01:21:13,700 –> 01:21:17,020
you have inconsistency and inconsistency becomes rework risk and cost.
1813
01:21:17,020 –> 01:21:21,460
So you collapse duplicates, route users to the authoritative agent product, and enforce
1814
01:21:21,460 –> 01:21:22,780
life cycle.
1815
01:21:22,780 –> 01:21:24,540
Build, publish, operate, retire.
1816
01:21:24,540 –> 01:21:28,940
No orphaned identities, no abandoned prompts, no “it still kind of works.”
1817
01:21:28,940 –> 01:21:34,300
That’s the 12 month sequence, identity first, tool contract second, data boundaries third,
1818
01:21:34,300 –> 01:21:38,180
behavioral detection fourth, approvals and deprecation as the scaling mechanism.
1819
01:21:38,180 –> 01:21:43,020
And once that order is in place, the organization stops asking how many agents it has.
1820
01:21:43,020 –> 01:21:45,020
Conclusion: the advantage is enforceable intelligence.
1821
01:21:45,020 –> 01:21:47,300
The agentic advantage isn’t more intelligence.
1822
01:21:47,300 –> 01:21:51,060
It’s intelligence that stays enforceable when it scales under audit, cost, pressure, and
1823
01:21:51,060 –> 01:21:52,340
incident response.
1824
01:21:52,340 –> 01:21:56,460
If you want the practical next step, watch the next episode on building an agent registry
1825
01:21:56,460 –> 01:21:59,700
that prevents sprawl before it becomes your operating model.






