Unlocking Autonomous Microsoft Enterprise

Mirko Peters | Podcasts


1
00:00:00,000 –> 00:00:07,680
Most organizations think "agents" means Copilot with extra steps: a nicer chat box, a few connectors, maybe some workflow buttons.

2
00:00:07,680 –> 00:00:15,840
They are wrong. Copilot speeds up a human. Autonomy replaces the human step entirely: planning, acting, verifying, and documenting without waiting for your approval.

3
00:00:15,840 –> 00:00:23,920
And that’s where the fear is rational. The moment a system can act, every missing policy, every sloppy permission, every undocumented exception turns into conditional chaos.

4
00:00:23,920 –> 00:00:27,920
The blast radius stops being theoretical because the system actually has hands.

5
00:00:27,920 –> 00:00:35,600
So this episode isn’t UI talk, it’s system behavior. We’re going to draw the line between suggestion and execution, define the contract that controls what an agent can touch,

6
00:00:35,600 –> 00:00:42,720
and then we’ll come back to the uncomfortable parts: identity debt, authorization sprawl, and why governance always arrives late.

7
00:00:42,720 –> 00:00:45,680
Because that’s where autonomy breaks in real tenants.

8
00:00:45,680 –> 00:00:48,480
Define the through line: the autonomy boundary.

9
00:00:48,480 –> 00:00:54,240
If there’s one idea to hold on to for the full episode, it’s this: autonomy fails at boundaries, not capabilities.

10
00:00:54,240 –> 00:00:59,280
Most people obsess over model quality. They ask whether the agent understands the task.

11
00:00:59,280 –> 00:01:02,880
That’s comforting, because it sounds like progress is a matter of smarter tokens.

12
00:01:02,880 –> 00:01:06,160
But in the Microsoft Enterprise, the model is rarely the limiting factor.

13
00:01:06,160 –> 00:01:11,520
The limiting factor is the moment the system transitions from “I suggest” to “I execute.”

14
00:01:11,520 –> 00:01:17,200
That transition is the autonomy boundary. The autonomy boundary is the explicit decision line between two modes of operation,

15
00:01:17,200 –> 00:01:18,720
recommendation and action.

16
00:01:18,720 –> 00:01:24,080
On one side, the agent produces text, options, summaries, and plans. On the other side, the agent changes the world.

17
00:01:24,080 –> 00:01:31,760
It makes Graph calls, edits configurations, closes tickets, revokes sessions, moves money, or sends communications that people will treat as official.

18
00:01:31,760 –> 00:01:37,040
That distinction matters because the boundary is where ownership moves. It’s where audit expectations change,

19
00:01:37,040 –> 00:01:39,840
it’s where helpful assistant becomes operator.

20
00:01:39,840 –> 00:01:42,800
And enterprises don’t struggle because the operator is incompetent.

21
00:01:42,800 –> 00:01:47,840
They struggle because nobody bothered to define, enforce, and continuously test the line where operation is allowed.

22
00:01:47,840 –> 00:01:51,600
To make that line enforceable, you need a second artifact, the execution contract.

23
00:01:51,600 –> 00:01:54,160
The execution contract is not a vibe, it is not a prompt.

24
00:01:54,160 –> 00:01:58,560
It is a concrete definition of what the agent is allowed to do and under what constraints.

25
00:01:58,560 –> 00:02:02,160
Think of it as a compiled interface between business intent and tool execution.

26
00:02:02,160 –> 00:02:06,400
It specifies at minimum five things. First, allowed tools.

27
00:02:06,400 –> 00:02:08,720
Not “it can use Microsoft Graph.”

28
00:02:08,720 –> 00:02:10,800
Which graph endpoints? Which actions?

29
00:02:10,800 –> 00:02:13,040
Read versus write is not a detail.

30
00:02:13,040 –> 00:02:15,520
It’s the difference between reporting and damage.

31
00:02:15,520 –> 00:02:20,400
Second, scopes and boundaries: tenant, subscription, resource group, site collection, mailbox,

32
00:02:20,400 –> 00:02:23,520
environment, whatever the containment unit is for the workload.

33
00:02:23,520 –> 00:02:26,640
The contract names the containment unit and makes it non-negotiable.

34
00:02:26,640 –> 00:02:28,560
Third, evidence requirements.

35
00:02:28,560 –> 00:02:30,960
What does the agent need to cite before it acts?

36
00:02:30,960 –> 00:02:35,680
A ticket ID, an alert correlation, a policy clause, an approval reference, a change record.

37
00:02:35,680 –> 00:02:39,200
Autonomy without evidence is just automated guessing with better grammar.

38
00:02:39,200 –> 00:02:43,680
Fourth, thresholds: confidence thresholds, anomaly thresholds, volume thresholds.

39
00:02:43,680 –> 00:02:47,840
The contract states what “safe enough” means and when the system must escalate.

40
00:02:47,840 –> 00:02:49,680
Fifth, escalation and kill behavior.

41
00:02:49,680 –> 00:02:50,560
Who does it wake up?

42
00:02:50,560 –> 00:02:51,280
Where does it post?

43
00:02:51,280 –> 00:02:52,400
What’s the rollback path?

44
00:02:52,400 –> 00:02:54,480
And this is the part everyone forgets.

45
00:02:54,480 –> 00:02:59,520
How do you stop it cleanly mid-flight without leaving half-applied changes across 10 systems?
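
The five clauses read naturally as data, not prose. A minimal sketch in Python of what such a contract could look like (the field names and tool identifiers are illustrative assumptions, not a Microsoft or Altera schema):

```python
from dataclasses import dataclass

# Hypothetical sketch: an execution contract as data, not a prompt.
# Field names and tool identifiers are illustrative, not any real schema.

@dataclass(frozen=True)
class ExecutionContract:
    # 1. Allowed tools: specific endpoint plus action; read vs write explicit.
    allowed_tools: frozenset          # e.g. {("graph:/users", "read")}
    # 2. Scopes: the named containment unit, non-negotiable.
    containment_unit: str
    # 3. Evidence the agent must cite before it acts.
    required_evidence: tuple          # e.g. ("ticket_id", "approval_ref")
    # 4. Thresholds: what "safe enough" means.
    min_confidence: float
    # 5. Escalation and kill behavior: who it wakes up, where it posts.
    escalation_channel: str

    def permits(self, tool, action, evidence, confidence):
        """An action crosses the autonomy boundary only if every clause holds."""
        return (
            (tool, action) in self.allowed_tools
            and all(k in evidence for k in self.required_evidence)
            and confidence >= self.min_confidence
        )

contract = ExecutionContract(
    allowed_tools=frozenset({("graph:/users", "read"), ("itsm:/tickets", "write")}),
    containment_unit="resource-group:rg-incident-triage",
    required_evidence=("ticket_id", "approval_ref"),
    min_confidence=0.9,
    escalation_channel="teams:#ops-escalations",
)

# A write without an approval reference is denied, however confident the model is.
print(contract.permits("itsm:/tickets", "write", {"ticket_id": "INC-42"}, 0.99))  # False
print(contract.permits("itsm:/tickets", "write",
                       {"ticket_id": "INC-42", "approval_ref": "CHG-7"}, 0.95))   # True
```

The point of the shape: the denial happens before any tool is invoked, which is exactly where the boundary has to live.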

46
00:02:59,520 –> 00:03:03,280
Now, here’s where Altera becomes useful as a concept without becoming marketing.

47
00:03:03,280 –> 00:03:07,680
In Microsoft terms, Altera represents the mechanism that operationalizes the autonomy boundary

48
00:03:07,680 –> 00:03:09,040
through an execution contract.

49
00:03:09,040 –> 00:03:12,960
It’s the layer that turns “we want autonomy” into enforceable constraints:

50
00:03:12,960 –> 00:03:16,640
tool routing, scoped identities, evidence capture, and predictable escalation.

51
00:03:16,640 –> 00:03:19,440
Not more chat, more closed-loop outcomes.

52
00:03:19,440 –> 00:03:22,880
And when the episode gets abstract and it will, this is the anchor.

53
00:03:22,880 –> 00:03:24,000
Come back to two questions.

54
00:03:24,000 –> 00:03:27,760
Where is the autonomy boundary and what does the execution contract require

55
00:03:27,760 –> 00:03:29,040
before the agent crosses it?

56
00:03:29,040 –> 00:03:33,760
Because every enterprise failure story in this space reduces to those two questions being answered

57
00:03:33,760 –> 00:03:36,720
informally once by the wrong person and then never revisited.

58
00:03:36,720 –> 00:03:37,680
The contract drifts.

59
00:03:37,680 –> 00:03:40,400
Exceptions get added, someone needs an urgent workaround.

60
00:03:40,400 –> 00:03:42,640
Someone else copies that workaround into another environment.

61
00:03:42,640 –> 00:03:46,080
And slowly, your deterministic intent becomes probabilistic behavior.

62
00:03:46,080 –> 00:03:50,240
We’ll come back to this later when we talk about identity debt because identity debt is what happens when

63
00:03:50,240 –> 00:03:54,400
execution contracts get multiplied across dozens of non-human operators

64
00:03:54,400 –> 00:03:56,400
and nobody remembers why they exist.

65
00:03:56,400 –> 00:04:00,960
But before we get to the debt, you need to understand why co-pilot can’t cross this boundary by design

66
00:04:00,960 –> 00:04:04,640
and why that limitation is the feature that keeps most tenants intact.

67
00:04:04,640 –> 00:04:08,400
Copilot versus autonomous execution: the non-negotiable difference.

68
00:04:08,400 –> 00:04:12,560
If a human must approve the final action, you are still buying labor, just faster labor.

69
00:04:12,560 –> 00:04:14,880
That’s not a moral judgment, it’s a systems description.

70
00:04:14,880 –> 00:04:18,880
Co-pilot is an interface layer that compresses the cost of thinking, drafting,

71
00:04:18,880 –> 00:04:20,160
searching and summarizing.

72
00:04:20,160 –> 00:04:24,480
It moves work from slow human keystrokes to fast human supervision.

73
00:04:24,480 –> 00:04:27,680
The human still owns the last mile.

74
00:04:27,680 –> 00:04:31,360
The click that changes state in Azure, the approval that closes the ticket,

75
00:04:31,360 –> 00:04:35,360
the decision that revokes the session, the email that becomes an official instruction.

76
00:04:35,360 –> 00:04:39,360
And because the human owns the last mile, the blast radius stays human-shaped.

77
00:04:39,360 –> 00:04:41,920
It’s bounded by attention, fatigue and time.

78
00:04:41,920 –> 00:04:43,600
That’s not great, but it’s legible.

79
00:04:43,600 –> 00:04:46,000
You can point to a person and say this was your decision.

80
00:04:46,000 –> 00:04:49,200
Autonomous execution is different. It is not a better chat experience,

81
00:04:49,200 –> 00:04:51,360
it is not “Copilot but with confidence.”

82
00:04:51,360 –> 00:04:54,000
Autonomy is goal pursuit under constraints.

83
00:04:54,000 –> 00:04:57,520
The system receives a signal, forms a plan, uses tools,

84
00:04:57,520 –> 00:05:01,360
tracks state over time and keeps going until it meets an outcome condition

85
00:05:01,360 –> 00:05:02,800
or hits an escalation boundary.

86
00:05:02,800 –> 00:05:06,800
That means autonomy has three properties Copilot doesn’t need. First,

87
00:05:06,800 –> 00:05:07,520
statefulness.

88
00:05:07,520 –> 00:05:10,080
It remembers what it tried, what failed, what changed,

89
00:05:10,080 –> 00:05:11,920
what evidence it gathered and what remains.

90
00:05:11,920 –> 00:05:15,120
Without state, you don’t have autonomy, you have looping suggestions.

91
00:05:15,120 –> 00:05:17,360
Second, tool ownership.

92
00:05:17,360 –> 00:05:21,280
Copilot can call tools, sure, but the human still authorizes them.

93
00:05:21,280 –> 00:05:24,160
Autonomy calls tools because tool calls are the work.

94
00:05:24,160 –> 00:05:27,120
Graph, Azure Resource Manager, ITSM APIs,

95
00:05:27,120 –> 00:05:28,720
Defender Action, Sentinel Playbooks,

96
00:05:28,720 –> 00:05:30,640
these aren’t integrations, they’re actuators.

97
00:05:31,360 –> 00:05:34,000
Third, multi-step execution with feedback.

98
00:05:34,000 –> 00:05:37,360
Autonomy doesn’t just perform an action, it verifies.

99
00:05:37,360 –> 00:05:39,360
It checks whether the service came back healthy,

100
00:05:39,360 –> 00:05:42,800
whether the config drift stopped, whether the incident scope shrank,

101
00:05:42,800 –> 00:05:46,080
whether the reconciliation balanced, whether the containment actually contained.

102
00:05:46,080 –> 00:05:47,440
If it didn’t, it iterates.
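
That act-verify-iterate cycle can be sketched as a loop. Everything below is a stand-in illustration (no real Microsoft APIs), but it shows the three properties together: state, tool calls as the work, and feedback:

```python
def run_until_verified(act, verify, max_attempts=3):
    """Autonomy as a loop: act, verify the world actually changed, and
    iterate or escalate. `act` and `verify` are stand-in callables."""
    state = {"attempts": 0, "history": []}   # statefulness: remember what was tried
    while state["attempts"] < max_attempts:
        state["attempts"] += 1
        result = act(state)                  # tool ownership: the tool call IS the work
        state["history"].append(result)
        if verify(result):                   # multi-step execution with feedback
            return {"outcome": "verified", **state}
    return {"outcome": "escalate", **state}  # hit the escalation boundary

# Example: a flaky remediation that only takes effect on the second try.
calls = []
def flaky_remediation(state):
    calls.append(state["attempts"])
    return {"healthy": state["attempts"] >= 2}

result = run_until_verified(flaky_remediation, lambda r: r["healthy"])
print(result["outcome"], result["attempts"])   # verified 2
```

Without the `verify` branch, this degrades back into what the episode calls looping suggestions: actions fired and assumed to have worked.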

103
00:05:47,440 –> 00:05:50,080
Now here’s where most organizations lie to themselves.

104
00:05:50,080 –> 00:05:53,840
They say they want autonomy, but they implement assistance with a longer leash.

105
00:05:53,840 –> 00:05:56,720
The agent drafts the change request and the engineer clicks approve.

106
00:05:56,720 –> 00:05:59,280
That’s still labor, faster labor.

107
00:05:59,280 –> 00:06:02,720
It can be worth doing, but don’t pretend you crossed the autonomy boundary.

108
00:06:02,720 –> 00:06:05,520
You just built a better router for human attention.

109
00:06:05,520 –> 00:06:09,040
And the reason this distinction matters isn’t philosophical, it’s operational.

110
00:06:09,040 –> 00:06:14,640
With Copilot, you manage model risk: hallucinations, missing context, bad summaries.

111
00:06:14,640 –> 00:06:17,440
With autonomy, you manage execution risk.

112
00:06:17,440 –> 00:06:19,520
Actual changes in production systems.

113
00:06:19,520 –> 00:06:23,040
The failure mode moves from wrong words to wrong actions.

114
00:06:23,040 –> 00:06:27,360
And at that point, the only question that matters is who owns the blast radius.

115
00:06:27,360 –> 00:06:31,600
In a deterministic security model, you can explain outcomes by configuration.

116
00:06:31,600 –> 00:06:34,960
The policy allowed it, the role permitted it, the audit log shows it.

117
00:06:34,960 –> 00:06:38,880
In a probabilistic model, outcomes emerge from a sequence of conditional decisions.

118
00:06:38,880 –> 00:06:42,000
Confidence thresholds, tool routing, exception paths,

119
00:06:42,000 –> 00:06:47,280
retries, partial failures, and whatever helpful fallback someone enabled in a hurry.

120
00:06:47,280 –> 00:06:50,560
That probabilistic drift is not caused by the model being random.

121
00:06:50,560 –> 00:06:52,720
It’s caused by the enterprise being inconsistent.

122
00:06:52,720 –> 00:06:54,080
The model just exposes it.

123
00:06:54,080 –> 00:06:55,680
This is the part people miss.

124
00:06:55,680 –> 00:06:58,400
Autonomy doesn’t create new governance problems.

125
00:06:58,400 –> 00:07:01,680
It simply turns your existing governance gaps into runtime behavior.

126
00:07:01,680 –> 00:07:05,200
And that’s why identity and authorization become the real cost center.

127
00:07:05,200 –> 00:07:08,720
Not tokens, not model routing, not whether the agent sounds smart.

128
00:07:08,720 –> 00:07:12,320
When you shift ownership of actions from humans to non-human operators,

129
00:07:12,320 –> 00:07:16,960
you are manufacturing new principals, new entitlements, new conditional access edges,

130
00:07:16,960 –> 00:07:19,600
new audit requirements, new incident pathways.

131
00:07:19,600 –> 00:07:23,360
We’ll come back to identity debt later because that’s where this breaks in real tenants.

132
00:07:23,360 –> 00:07:25,120
But for now, keep the frame simple.

133
00:07:25,120 –> 00:07:27,120
Copilot optimizes an individual.

134
00:07:27,120 –> 00:07:28,640
Autonomy optimizes a queue.

135
00:07:28,640 –> 00:07:31,120
Copilot makes one person faster at doing work.

136
00:07:31,120 –> 00:07:34,160
Autonomy makes work happen without that person being involved.

137
00:07:34,160 –> 00:07:38,080
Once you see that, Microsoft 365 stops looking like a suite of apps

138
00:07:38,080 –> 00:07:39,280
with a chat sidebar.

139
00:07:39,280 –> 00:07:43,200
It starts looking like an agent runtime with a massive tool surface area.

140
00:07:43,200 –> 00:07:46,800
Graph as the actuator bus, Teams as the coordination layer,

141
00:07:46,800 –> 00:07:49,280
Entra as the distributed decision engine,

142
00:07:49,280 –> 00:07:53,920
and Purview and Defender as the rails that decide whether the system stays deterministic

143
00:07:53,920 –> 00:07:56,240
or degrades into conditional chaos.

144
00:07:56,240 –> 00:07:58,960
And that’s why “Copilot can’t cross the boundary by design”

145
00:07:58,960 –> 00:08:00,080
isn’t a limitation.

146
00:08:00,080 –> 00:08:01,840
It’s a containment strategy.

147
00:08:01,840 –> 00:08:05,280
Microsoft’s direction: the agentic web is already here.

148
00:08:05,280 –> 00:08:10,000
Most enterprises still talk about agents like it’s a feature you can choose to enable later,

149
00:08:10,000 –> 00:08:13,600
once the pilot’s finished and the governance deck gets its annual refresh.

150
00:08:13,600 –> 00:08:14,480
They are wrong.

151
00:08:14,480 –> 00:08:16,240
The direction is already set.

152
00:08:16,240 –> 00:08:20,160
Microsoft is normalizing delegation to non-human operators across the stack,

153
00:08:20,160 –> 00:08:23,840
not as a sidebar but as the default unit of work. This is the uncomfortable truth.

154
00:08:23,840 –> 00:08:25,360
The agentic web is not coming.

155
00:08:25,360 –> 00:08:29,920
It is here, and it’s being built out as a set of runtimes, protocols, and identity surfaces

156
00:08:29,920 –> 00:08:32,080
that make autonomous execution feel ordinary.

157
00:08:32,080 –> 00:08:36,240
Look at the signals Microsoft chose to amplify at Build 2025.

158
00:08:36,240 –> 00:08:38,000
They didn’t lead with better chat.

159
00:08:38,000 –> 00:08:39,840
They led with task delegation.

160
00:08:39,840 –> 00:08:41,520
Assign an issue to an agent.

161
00:08:41,520 –> 00:08:42,880
Let it spin up compute.

162
00:08:42,880 –> 00:08:44,320
Make changes in a branch.

163
00:08:44,320 –> 00:08:46,000
Produce session logs, open a PR,

164
00:08:46,000 –> 00:08:48,880
and then let other agents review before merge.

165
00:08:48,880 –> 00:08:51,200
That is an operational pattern, not a UX pattern.

166
00:08:51,200 –> 00:08:53,760
It’s also a rehearsal for enterprise autonomy.

167
00:08:53,760 –> 00:08:56,640
Because if you can delegate software work end to end,

168
00:08:56,640 –> 00:08:59,920
you can delegate everything else that behaves like software: incident response,

169
00:08:59,920 –> 00:09:03,920
onboarding, access reviews, finance close workflows, security triage.

170
00:09:03,920 –> 00:09:06,640
These are all systems of queues, evidence, and actions.

171
00:09:06,640 –> 00:09:10,240
The substrate is the same and Microsoft is making that substrate explicit.

172
00:09:10,240 –> 00:09:14,160
Azure AI Foundry is being positioned like an app server for stateful agents,

173
00:09:14,160 –> 00:09:19,600
multi-model, multi-agent orchestration, production observability, and managed execution.

174
00:09:19,600 –> 00:09:22,240
That matters because autonomy doesn’t scale on prompts.

175
00:09:22,240 –> 00:09:23,440
It scales on runtimes.

176
00:09:23,440 –> 00:09:26,880
Runtimes give you consistent tool invocation, consistent memory patterns,

177
00:09:26,880 –> 00:09:29,200
consistent telemetry and predictable failure modes.

178
00:09:29,200 –> 00:09:33,440
Without a runtime, an agent is just a demo that stops working the moment the network blips

179
00:09:33,440 –> 00:09:34,640
or the API throttles.

180
00:09:34,640 –> 00:09:38,640
Then there’s Copilot Studio pushing multi-agent orchestration into low code,

181
00:09:38,640 –> 00:09:40,080
which is a polite way of saying,

182
00:09:40,080 –> 00:09:45,440
the people who least understand your control plane will soon be able to assemble autonomous workflows anyway.

183
00:09:45,440 –> 00:09:47,920
The platform doesn’t wait for architectural maturity.

184
00:09:47,920 –> 00:09:51,360
It routes around it. And Microsoft is also standardizing the wiring.

185
00:09:51,360 –> 00:09:53,920
MCP, the model context protocol, is the clearest example.

186
00:09:53,920 –> 00:09:57,600
Microsoft is treating MCP like a universal adapter between agents and tools,

187
00:09:57,600 –> 00:09:59,520
and that sounds developer-friendly and it is.

188
00:09:59,520 –> 00:10:03,600
But in enterprise terms, MCP is a force multiplier for both capability and risk,

189
00:10:03,600 –> 00:10:07,760
because it collapses the friction of adding just one more tool into an agent’s reach.

190
00:10:08,400 –> 00:10:10,480
Here’s the failure mode you need to anchor on.

191
00:10:10,480 –> 00:10:14,480
An agent accidentally gains the ability to delete what it should only read,

192
00:10:14,480 –> 00:10:16,000
not because the model went rogue,

193
00:10:16,000 –> 00:10:18,480
but because someone exposed a tool with a broad scope,

194
00:10:18,480 –> 00:10:21,280
or a server drifted, or a permission got inherited,

195
00:10:21,280 –> 00:10:23,840
or a temporary exception became permanent.

196
00:10:23,840 –> 00:10:25,600
MCP makes tool discovery easy.

197
00:10:25,600 –> 00:10:27,200
It does not make authorization safe.

198
00:10:27,200 –> 00:10:28,640
Discovery is not authorization.
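
One defensive reading of that line, sketched in Python: treat whatever a server advertises as untrusted input, and intersect it with the contract’s allow-list before anything becomes callable. The tool names are invented for illustration:

```python
# Sketch: discovery is not authorization. Whatever an MCP-style server
# advertises, only the explicit allow-list decides what becomes callable.
# Tool names are invented for illustration.

discovered = {                       # what the server advertises (untrusted)
    "files.read":   "read documents",
    "files.delete": "delete documents",   # broad scope that drifted in
    "mail.send":    "send mail",
}

allow_list = {"files.read"}          # what the execution contract authorizes

callable_tools = {
    name: desc for name, desc in discovered.items() if name in allow_list
}

blocked = sorted(set(discovered) - set(callable_tools))
print(callable_tools)   # {'files.read': 'read documents'}
print(blocked)          # ['files.delete', 'mail.send']
```

The failure mode the episode describes is exactly what happens when `discovered` is used directly as the callable set.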

199
00:10:28,640 –> 00:10:33,120
Microsoft is even pushing MCP down into Windows itself with a registry concept,

200
00:10:33,120 –> 00:10:37,040
user consent prompts, and a model where local capabilities become callable tools.

201
00:10:37,040 –> 00:10:38,640
That’s not a niche developer story.

202
00:10:38,640 –> 00:10:42,160
It’s Microsoft telling you that tool access is the new perimeter,

203
00:10:42,160 –> 00:10:44,480
and the perimeter now spans cloud and endpoint.

204
00:10:44,480 –> 00:10:47,280
At the same time, they’re doing something more consequential.

205
00:10:47,280 –> 00:10:50,080
Normalizing non-human identities at scale.

206
00:10:50,080 –> 00:10:54,160
In the keynote language, agents get their own identity and show up in Entra.

207
00:10:54,160 –> 00:10:55,120
That’s not cosmetic.

208
00:10:55,120 –> 00:10:57,360
That’s the beginning of an enterprise identity graph,

209
00:10:57,360 –> 00:10:59,520
where humans are no longer the only operators.

210
00:10:59,520 –> 00:11:03,600
Your tenant becomes a mixed ecology of people and principals acting with intent

211
00:11:03,600 –> 00:11:04,800
that someone once defined.

212
00:11:04,800 –> 00:11:07,920
And when that becomes normal, governance stops being a policy document

213
00:11:07,920 –> 00:11:09,440
and becomes a compiler problem.

214
00:11:09,440 –> 00:11:14,000
You are compiling intent into enforceable constraints across thousands of decisions per day,

215
00:11:14,000 –> 00:11:18,240
made by systems that don’t get tired and don’t use judgment the way humans do.

216
00:11:18,240 –> 00:11:21,600
So if you’re waiting for a clean, agent rollout moment,

217
00:11:21,600 –> 00:11:23,040
you’re already behind.

218
00:11:23,040 –> 00:11:25,040
The ecosystem is converging.

219
00:11:25,040 –> 00:11:27,760
GitHub task delegation is cultural proof.

220
00:11:27,760 –> 00:11:29,280
Foundry as runtime.

221
00:11:29,280 –> 00:11:31,840
Copilot Studio as distribution channel.

222
00:11:31,840 –> 00:11:33,680
Teams as coordination layer.

223
00:11:33,680 –> 00:11:35,440
Graph as actuator bus.

224
00:11:35,440 –> 00:11:39,360
And Entra as the decision engine that either enforces your intent

225
00:11:39,360 –> 00:11:43,440
or quietly accumulates exceptions until you’re running conditional chaos.

226
00:11:43,440 –> 00:11:44,880
And that sets up the next question.

227
00:11:44,880 –> 00:11:48,320
If this is Microsoft’s direction, what exactly is Altera in Microsoft terms

228
00:11:48,320 –> 00:11:50,160
without marketing, without mysticism,

229
00:11:50,160 –> 00:11:53,840
and without pretending the platform will save you from your own design debt?

230
00:11:53,840 –> 00:11:56,480
What Altera represents in Microsoft terms.

231
00:11:56,480 –> 00:11:59,280
Most people hear “Altera” and immediately hunt for the UI.

232
00:11:59,280 –> 00:12:00,880
They want to know where the chat box lives,

233
00:12:00,880 –> 00:12:04,400
what the agent looks like in Teams, how it shows up in Copilot.

234
00:12:04,400 –> 00:12:05,440
That’s the wrong axis.

235
00:12:05,440 –> 00:12:07,680
The interface is the least interesting part of autonomy

236
00:12:07,680 –> 00:12:09,520
because the interface doesn’t carry the risk.

237
00:12:09,520 –> 00:12:10,560
The system does.

238
00:12:10,560 –> 00:12:13,440
In Microsoft terms, Altera represents an execution layer

239
00:12:13,440 –> 00:12:17,120
that operationalizes the autonomy boundary through an execution contract.

240
00:12:17,120 –> 00:12:19,440
It sits above tools and below business intent.

241
00:12:19,440 –> 00:12:20,960
It is the part that takes a goal,

242
00:12:20,960 –> 00:12:23,520
a set of allowed actions, a set of required evidence,

243
00:12:23,520 –> 00:12:26,240
and turns that into a controlled sequence of tool calls

244
00:12:26,240 –> 00:12:28,960
that either completes the work or escalates cleanly.

245
00:12:28,960 –> 00:12:31,520
That distinction matters because Microsoft already gives you

246
00:12:31,520 –> 00:12:33,200
most of the raw ingredients.

247
00:12:33,200 –> 00:12:36,640
Graph, Azure Resource Manager, Defender Actions,

248
00:12:36,640 –> 00:12:41,200
Sentinel Playbooks, Copilot Studio orchestration, Foundry runtimes,

249
00:12:41,200 –> 00:12:42,960
Teams as a coordination surface.

250
00:12:42,960 –> 00:12:44,480
The enterprise does not lack tools.

251
00:12:44,480 –> 00:12:48,000
It lacks a mechanism that forces those tools to behave like a system.

252
00:12:48,000 –> 00:12:50,640
So the clean way to describe Altera is not “another agent.”

253
00:12:50,640 –> 00:12:54,640
It is the thing that makes an agent behave like an operator

254
00:12:54,640 –> 00:12:56,080
you’d be willing to put on call,

255
00:12:56,080 –> 00:13:00,000
constrained identity, explicit tool access, predictable escalation,

256
00:13:00,000 –> 00:13:01,440
and replayable evidence.

257
00:13:01,440 –> 00:13:03,440
And you can translate that into a mental model

258
00:13:03,440 –> 00:13:06,160
that enterprise people actually understand.

259
00:13:06,160 –> 00:13:08,640
Altera behaves like an authorization compiler.

260
00:13:08,640 –> 00:13:11,520
You provide intent: resolve these incident classes,

261
00:13:11,520 –> 00:13:14,720
reconcile these accounts, contain these alert types.

262
00:13:14,720 –> 00:13:17,680
You provide constraints: scopes, thresholds,

263
00:13:17,680 –> 00:13:20,000
evidence rules, and who owns escalation.

264
00:13:20,000 –> 00:13:22,880
And then that intent gets compiled into a runtime plan

265
00:13:22,880 –> 00:13:24,800
which tools can be invoked in which order,

266
00:13:24,800 –> 00:13:28,160
with which checks, under which identity, producing which artifacts.

267
00:13:28,160 –> 00:13:29,200
It is not magic.

268
00:13:29,200 –> 00:13:32,000
It is constraint enforcement under load.
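
The compiler framing can be made concrete. A hypothetical sketch, with invented names and no real Altera or Microsoft interface: intent plus constraints in, an ordered runtime plan out, and anything outside the contract refuses to compile:

```python
def compile_intent(intent, constraints):
    """Hypothetical 'authorization compiler': turn intent + constraints into
    an ordered runtime plan. Not a real Altera or Microsoft interface."""
    plan = []
    for step in intent["steps"]:
        if step["tool"] not in constraints["allowed_tools"]:
            # a step outside the contract fails at compile time, not at runtime
            raise ValueError(f"step '{step['tool']}' is outside the contract")
        plan.append({
            "tool": step["tool"],
            "identity": constraints["run_as"],         # which identity executes
            "checks": list(constraints["evidence"]),   # evidence captured per step
            "escalate_to": constraints["escalation_owner"],
        })
    return plan

intent = {"goal": "contain alert", "steps": [{"tool": "defender.isolate"},
                                             {"tool": "itsm.update_ticket"}]}
constraints = {
    "allowed_tools": {"defender.isolate", "itsm.update_ticket"},
    "run_as": "agent-containment-01",      # scoped, non-human principal
    "evidence": ("alert_id", "policy_clause"),
    "escalation_owner": "secops-oncall",
}

plan = compile_intent(intent, constraints)
print(len(plan), plan[0]["identity"], plan[0]["escalate_to"])
# 2 agent-containment-01 secops-oncall
```

The design choice worth noticing: the escalation owner and identity are stamped onto every step, so no step can execute anonymously.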

269
00:13:32,000 –> 00:13:34,080
Now, where does it sit in the Microsoft stack?

270
00:13:34,080 –> 00:13:37,440
It sits in the seam between the control plane and the execution plane.

271
00:13:37,440 –> 00:13:41,280
Entra, Purview, Defender, and your policy layer define what should be allowed.

272
00:13:41,280 –> 00:13:43,760
Graph, Azure, ITSM, ERP connectors,

273
00:13:43,760 –> 00:13:46,080
and endpoint actions are how work gets done.

274
00:13:46,080 –> 00:13:50,640
Altera lives between those worlds, translating “allowed” into “performed” without letting

275
00:13:50,640 –> 00:13:52,240
convenience rewrite intent.

276
00:13:52,240 –> 00:13:54,480
That’s why it can’t be just another prompt wrapper.

277
00:13:54,480 –> 00:13:56,240
Prompt wrappers make the demo feel good.

278
00:13:56,240 –> 00:13:57,600
They do not make the tenant safer.

279
00:13:57,600 –> 00:13:59,040
They don’t solve identity sprawl.

280
00:13:59,040 –> 00:14:00,560
They don’t solve tool scope drift.

281
00:14:00,560 –> 00:14:02,400
They don’t produce evidence you can replay.

282
00:14:02,400 –> 00:14:06,560
They don’t give you a kill switch that actually stops a multi-system run halfway through.

283
00:14:06,560 –> 00:14:09,440
They just produce better sentences about what might happen.
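
For contrast, a kill switch that actually works has a shape: every applied step registers its compensation, and a stop request unwinds them newest-first instead of abandoning half-applied state. A saga-style sketch with invented step names:

```python
# Sketch: a kill switch that unwinds a multi-system run instead of
# abandoning it half-applied. Each step registers its compensation;
# a stop request rolls back in reverse order. Step names are invented.

applied, rolled_back = [], []

steps = [
    ("disable_account", lambda: applied.append("disable_account"),
                        lambda: rolled_back.append("enable_account")),
    ("block_ip",        lambda: applied.append("block_ip"),
                        lambda: rolled_back.append("unblock_ip")),
    ("wipe_device",     lambda: applied.append("wipe_device"),
                        lambda: rolled_back.append("restore_device")),
]

def run(steps, stop_after=None):
    compensations = []
    for name, do, undo in steps:
        do()
        compensations.append(undo)
        if name == stop_after:                 # kill switch fired mid-flight
            for comp in reversed(compensations):
                comp()                         # unwind newest-first
            return "stopped_clean"
    return "completed"

outcome = run(steps, stop_after="block_ip")
print(outcome, applied, rolled_back)
# two steps applied, then unwound in reverse; the device wipe never runs
```

Without the compensation list, “stop” just means “freeze with two of ten systems changed,” which is the failure the episode is pointing at.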

284
00:14:09,440 –> 00:14:11,600
Altera, as we’re using it in this episode,

285
00:14:11,600 –> 00:14:14,240
represents the closed-loop outcome approach:

286
00:14:14,240 –> 00:14:20,480
Detect, decide, act, verify, and document as a single executable run.

287
00:14:20,480 –> 00:14:22,480
The output is not, here’s my reasoning.

288
00:14:22,480 –> 00:14:25,600
The output is, the incident is resolved, the reconciliation is balanced,

289
00:14:25,600 –> 00:14:28,800
the containment is applied, and here is the evidence trail that proves it.

290
00:14:28,800 –> 00:14:30,800
And this is the uncomfortable part for buyers.

291
00:14:30,800 –> 00:14:33,520
Altera’s value has almost nothing to do with model quality.

292
00:14:33,520 –> 00:14:36,480
Yes, you want decent reasoning, but model quality is not what determines

293
00:14:36,480 –> 00:14:38,320
whether autonomy works in production.

294
00:14:38,320 –> 00:14:39,920
Control plane maturity does.

295
00:14:39,920 –> 00:14:43,520
If your identity model is sloppy, autonomy accelerates the sloppiness.

296
00:14:43,520 –> 00:14:46,880
If your tool permissions are broad, autonomy turns them into a power tool.

297
00:14:46,880 –> 00:14:50,400
If your approvals are ambiguous, autonomy becomes a blame generator.

298
00:14:50,400 –> 00:14:52,160
If your audit surfaces are weak,

299
00:14:52,160 –> 00:14:54,560
autonomy becomes a storytelling engine.

300
00:14:54,560 –> 00:14:57,280
That’s why the promise isn’t “we’ll make the model smarter.”

301
00:14:57,280 –> 00:15:00,560
The promise is “we’ll make the system more deterministic.”

302
00:15:00,560 –> 00:15:03,520
And deterministic in this context doesn’t mean perfect.

303
00:15:03,520 –> 00:15:04,800
It means explainable.

304
00:15:04,800 –> 00:15:06,960
You can map an outcome back to a policy clause,

305
00:15:06,960 –> 00:15:09,920
an entitlement, an evidence artifact, and a bounded action set.

306
00:15:09,920 –> 00:15:11,520
So here’s what Altera is not.

307
00:15:11,520 –> 00:15:13,040
It is not a replacement for Entra.

308
00:15:13,040 –> 00:15:15,280
Entra is still the distributed decision engine.

309
00:15:15,280 –> 00:15:18,320
Altera is an execution layer that consumes those decisions.

310
00:15:18,320 –> 00:15:20,800
It is not a replacement for Purview or Defender.

311
00:15:20,800 –> 00:15:22,400
Those are your governance and threat rails.

312
00:15:22,400 –> 00:15:25,120
Altera produces the evidence and the action footprints

313
00:15:25,120 –> 00:15:26,720
those systems need to evaluate.

314
00:15:26,720 –> 00:15:28,960
It is not “Copilot, but autonomous.”

315
00:15:28,960 –> 00:15:31,600
Co-pilot is a human productivity interface.

316
00:15:31,600 –> 00:15:33,680
Altera is an operator runtime pattern.

317
00:15:33,680 –> 00:15:36,880
And if that feels like semantics, good. Semantics are where audits live.

318
00:15:36,880 –> 00:15:40,080
Because once you accept that Altera is essentially a mechanism

319
00:15:40,080 –> 00:15:42,560
for enforcing execution contracts at scale,

320
00:15:42,560 –> 00:15:44,320
the next question becomes obvious.

321
00:15:44,320 –> 00:15:46,880
Why do enterprises still get stuck at pilot forever?

322
00:15:46,880 –> 00:15:48,560
Not because autonomy is impossible,

323
00:15:48,560 –> 00:15:50,560
but because the first time you try to productionize it,

324
00:15:50,560 –> 00:15:53,840
you discover the tenant has no enforceable autonomy boundary at all.

325
00:15:53,840 –> 00:15:56,320
Why enterprises stall at pilot forever.

326
00:15:56,320 –> 00:15:58,560
The pattern is boring because it repeats.

327
00:15:58,560 –> 00:16:00,080
A team runs a proof of concept.

328
00:16:00,080 –> 00:16:01,040
It looks great.

329
00:16:01,040 –> 00:16:03,520
The agent summarizes tickets, drafts responses,

330
00:16:03,520 –> 00:16:04,960
maybe even proposes a fix.

331
00:16:04,960 –> 00:16:07,360
Everyone nods, then someone says the fatal sentence,

332
00:16:07,360 –> 00:16:09,200
“Okay, let’s roll this into production.”

333
00:16:09,200 –> 00:16:12,160
And production is where the tenant’s actual shape appears.

334
00:16:12,160 –> 00:16:14,560
Pilots succeed because they borrow certainty.

335
00:16:14,560 –> 00:16:16,240
They live in a narrow sandbox.

336
00:16:16,240 –> 00:16:19,920
A clean data set, a cooperative API, a friendly stakeholder,

337
00:16:19,920 –> 00:16:23,280
and permissions that quietly ignore how the enterprise actually works.

338
00:16:23,280 –> 00:16:25,680
Then the moment you connect the pilot to the real queues,

339
00:16:25,680 –> 00:16:28,720
real incidents, real approvals, real change control,

340
00:16:28,720 –> 00:16:31,600
the system hits friction you can’t prompt your way out of.

341
00:16:31,600 –> 00:16:33,280
The first friction point is permissions.

342
00:16:33,280 –> 00:16:36,000
In a pilot, people hand the agent broad access

343
00:16:36,000 –> 00:16:37,840
because they’re optimizing for speed.

344
00:16:37,840 –> 00:16:40,640
In production, broad access becomes a liability surface

345
00:16:40,640 –> 00:16:43,040
and suddenly everyone remembers segregation of duties.

346
00:16:43,040 –> 00:16:45,200
The same person who loved the demo now asks,

347
00:16:45,200 –> 00:16:47,040
“Wait, what identity is that running as?”

348
00:16:47,040 –> 00:16:48,880
And if you can’t answer in one sentence,

349
00:16:48,880 –> 00:16:50,480
what principal, what roles, what scopes,

350
00:16:50,480 –> 00:16:52,000
what conditional access constraints,

351
00:16:52,000 –> 00:16:53,040
you don’t have autonomy,

352
00:16:53,040 –> 00:16:55,200
you have a science project with admin rights.
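
That one-sentence answer can be checked mechanically. A hypothetical sketch (the manifest fields are illustrative, not an Entra schema): if any part of the identity story is missing, the agent doesn’t run:

```python
# Hypothetical check: an agent may run only if its identity can be stated
# completely -- principal, roles, scopes, conditional access constraints.
# Field names are illustrative, not an Entra schema.

REQUIRED = ("principal", "roles", "scopes", "conditional_access")

def identity_answer(manifest):
    missing = [f for f in REQUIRED if not manifest.get(f)]
    if missing:
        return None                      # can't answer in one sentence => no autonomy
    return (f"Runs as {manifest['principal']} with roles {manifest['roles']} "
            f"over scopes {manifest['scopes']} under {manifest['conditional_access']}.")

# The science project: broad rights, no stated scopes or constraints.
science_project = {"principal": "agent-x", "roles": ["Global Admin"]}
print(identity_answer(science_project))          # None

# The operator: every clause of the sentence is present.
operator = {
    "principal": "agent-triage-01",
    "roles": ["Ticket.Write"],
    "scopes": ["itsm:/tickets"],
    "conditional_access": "workload-identity policy CA-17",
}
print(identity_answer(operator))
```

The asymmetry is the point: the over-privileged agent fails the check not for having too much access but for being unable to state its access completely.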

353
00:16:55,200 –> 00:16:57,040
The second friction point is auditability.

354
00:16:57,040 –> 00:16:58,960
The demo says, “Here’s what I did.”

355
00:16:58,960 –> 00:17:02,160
The auditor says, “Prove it, replay it, show me the evidence chain.”

356
00:17:02,160 –> 00:17:05,360
Autonomy only counts as enterprise automation

357
00:17:05,360 –> 00:17:08,800
when it produces artifacts that survive hostile review.

358
00:17:08,800 –> 00:17:11,280
Time stamps, inputs, tool calls, approvals,

359
00:17:11,280 –> 00:17:13,280
and outcomes tied to policy.

360
00:17:13,280 –> 00:17:16,080
If your agent can’t produce evidence, it can’t be trusted.

361
00:17:16,080 –> 00:17:18,240
It can only be tolerated temporarily

362
00:17:18,240 –> 00:17:19,840
by people who haven’t been burned yet.

363
00:17:19,840 –> 00:17:22,480
The third friction point is incident ownership.

364
00:17:22,480 –> 00:17:24,080
Pilots have a hero, a champion,

365
00:17:24,080 –> 00:17:26,320
someone who owns the agent because they built it.

366
00:17:26,320 –> 00:17:28,080
In production, ownership means a pager:

367
00:17:28,080 –> 00:17:30,720
who gets woken up when the agent loops at 2am,

368
00:17:30,720 –> 00:17:33,280
who approves the rollback when it partially applied changes

369
00:17:33,280 –> 00:17:35,840
across Azure Graph and the ITSM system,

370
00:17:35,840 –> 00:17:38,240
who signs off when the agent’s action caused user impact

371
00:17:38,240 –> 00:17:40,560
but the model’s explanation sounds plausible.

372
00:17:40,560 –> 00:17:43,200
Enterprises don’t stall because they hate autonomy.

373
00:17:43,200 –> 00:17:44,960
They stall because nobody wants to inherit

374
00:17:44,960 –> 00:17:47,600
a new failure mode without a clear escalation contract.

375
00:17:47,600 –> 00:17:51,040
Then comes change control, the quiet killer of agent projects.

376
00:17:51,040 –> 00:17:53,920
Autonomy requires updating tools, policies, thresholds,

377
00:17:53,920 –> 00:17:56,080
and runbooks as the environment changes.

378
00:17:56,080 –> 00:17:58,960
But enterprises treat policy like a museum artifact,

379
00:17:58,960 –> 00:18:00,640
written once, rarely revisited,

380
00:18:00,640 –> 00:18:02,640
and only updated after an incident.

381
00:18:02,640 –> 00:18:05,360
So the agent drifts out of alignment with reality.

382
00:18:05,360 –> 00:18:08,800
APIs change, roles evolve, a new SaaS tool appears.

383
00:18:08,800 –> 00:18:11,360
An exception gets added just for this quarter.

384
00:18:11,360 –> 00:18:14,160
The pilot keeps running with assumptions that no longer hold.

385
00:18:14,160 –> 00:18:15,920
And when the first production incident happens,

386
00:18:15,920 –> 00:18:18,160
the organization responds predictably.

387
00:18:18,160 –> 00:18:19,280
“Pause for governance.”

388
00:18:19,280 –> 00:18:20,880
That phrase sounds responsible.

389
00:18:20,880 –> 00:18:22,480
It is usually a confession.

390
00:18:22,480 –> 00:18:25,680
It means the organization didn’t have an enforceable autonomy boundary.

391
00:18:25,680 –> 00:18:27,840
They had enthusiasm in a slide deck.

392
00:18:27,840 –> 00:18:30,480
Governance arrives late because it’s uncomfortable work.

393
00:18:30,480 –> 00:18:33,520
It forces you to make decisions about what the agent is allowed to do,

394
00:18:33,520 –> 00:18:37,280
who owns the consequences and what evidence is required before action.

395
00:18:37,280 –> 00:18:39,440
Most organizations avoid those decisions

396
00:18:39,440 –> 00:18:41,600
by keeping the agent in suggestion mode.

397
00:18:41,600 –> 00:18:44,240
Because suggestion mode keeps responsibility human-shaped.

398
00:18:44,240 –> 00:18:45,920
This is also where shadow AI shows up.

399
00:18:45,920 –> 00:18:48,000
Business units don’t wait for central IT.

400
00:18:48,000 –> 00:18:49,440
They build agents anyway.

401
00:18:49,440 –> 00:18:52,480
Copilot Studio here, a connector there, an MCP server

402
00:18:52,480 –> 00:18:53,520
someone found on GitHub,

403
00:18:53,520 –> 00:18:56,560
and suddenly actions happen outside the control plane’s visibility.

404
00:18:56,560 –> 00:18:58,000
Not because people are malicious,

405
00:18:58,000 –> 00:18:59,440
because queues never shrink,

406
00:18:59,440 –> 00:19:01,280
and someone always wants relief.

407
00:19:01,280 –> 00:19:03,280
The platform routes around your governance

408
00:19:03,280 –> 00:19:05,680
because the business routes around your delays.

409
00:19:05,680 –> 00:19:08,160
So the root cause isn’t that the enterprise is cautious.

410
00:19:08,160 –> 00:19:11,600
The root cause is that autonomy forces the tenant to become honest.

411
00:19:11,600 –> 00:19:13,280
It forces you to formalize intent.

412
00:19:13,280 –> 00:19:15,520
It forces you to define the execution contract.

413
00:19:15,520 –> 00:19:19,120
It forces you to treat exceptions as entropy generators, not as favors.

414
00:19:19,120 –> 00:19:21,040
And it forces you to align the control plane,

415
00:19:21,040 –> 00:19:23,680
identity, policy, evidence, with the execution plane,

416
00:19:23,680 –> 00:19:25,200
tools, actions, outcomes.

417
00:19:25,200 –> 00:19:27,600
Pilots avoid that alignment by staying small.

418
00:19:27,600 –> 00:19:29,600
Production demands it immediately.

419
00:19:29,600 –> 00:19:32,640
And that’s why “pilot forever” is not a maturity stage.

420
00:19:32,640 –> 00:19:33,920
It’s a stable equilibrium.

421
00:19:33,920 –> 00:19:35,600
Assistance feels useful and safe.

422
00:19:35,600 –> 00:19:39,280
Autonomy feels risky and political, therefore autonomy gets deferred

423
00:19:39,280 –> 00:19:40,240
until the next quarter.

424
00:19:40,240 –> 00:19:41,440
The quarter never ends.

425
00:19:41,440 –> 00:19:43,440
So the question isn’t how to do a better pilot.

426
00:19:43,440 –> 00:19:45,680
The question is how to design autonomy as a system,

427
00:19:45,680 –> 00:19:47,360
not a feature, because the moment you do,

428
00:19:47,360 –> 00:19:50,240
the stall pattern becomes predictable and solvable.

429
00:19:50,240 –> 00:19:52,800
The autonomy stack: event, reasoning,

430
00:19:52,800 –> 00:19:55,280
orchestration, action, evidence.

431
00:19:55,280 –> 00:19:57,600
Once you stop treating autonomy like a feature,

432
00:19:57,600 –> 00:19:58,720
you need a stack.

433
00:19:58,720 –> 00:20:01,120
Not a vendor diagram, a behavioral stack.

434
00:20:01,120 –> 00:20:02,400
How work enters the system,

435
00:20:02,400 –> 00:20:04,320
how decisions get made, how actions happen,

436
00:20:04,320 –> 00:20:06,560
and how you prove the system didn’t just improvise.

437
00:20:06,560 –> 00:20:09,760
This is the autonomy stack that actually survives production,

438
00:20:09,760 –> 00:20:12,560
event, reasoning, orchestration, action, evidence.

439
00:20:12,560 –> 00:20:15,200
Start with event. Autonomy doesn’t begin with a prompt.

440
00:20:15,200 –> 00:20:18,000
It begins with a signal that arrives, whether you’re ready or not.

441
00:20:18,000 –> 00:20:20,640
An alert fires, a ticket opens, a mailbox rule triggers,

442
00:20:20,640 –> 00:20:22,000
a scheduled job hits.

443
00:20:22,000 –> 00:20:23,760
A user reports something in Teams.

444
00:20:23,760 –> 00:20:25,600
A threshold crosses its limit.

445
00:20:25,600 –> 00:20:29,600
The key point is that events are external reality pushing into your system.

446
00:20:29,600 –> 00:20:31,760
And this is where people quietly cheat.

447
00:20:31,760 –> 00:20:35,120
They build an autonomous agent that only runs when a human asks it to.

448
00:20:35,120 –> 00:20:36,240
That’s still assistance.

449
00:20:36,240 –> 00:20:38,640
Autonomy starts when the system can wake itself up.

450
00:20:38,640 –> 00:20:41,840
But event ingestion has an architectural requirement, normalization.

451
00:20:41,840 –> 00:20:45,280
If your events arrive in 10 formats with 10 levels of fidelity,

452
00:20:45,280 –> 00:20:47,280
you don’t have an autonomy pipeline.

453
00:20:47,280 –> 00:20:48,560
You have a noisy inbox.

454
00:20:48,560 –> 00:20:52,000
So the first job is to translate raw signals into a consistent envelope.

455
00:20:52,000 –> 00:20:52,720
What happened?

456
00:20:52,720 –> 00:20:53,440
Where? To what?

457
00:20:53,440 –> 00:20:55,440
And what evidence exists that it actually happened?
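
That envelope can be sketched in code. This is a minimal illustration; the field names and the `normalize` helper are assumptions made for the example, not a Microsoft schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class EventEnvelope:
    source: str     # where the signal came from: monitor, ITSM, mailbox rule
    what: str       # normalized event class, e.g. "service_unhealthy"
    target: str     # the resource the event is about
    evidence: dict  # the raw payload proving it actually happened
    received_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def normalize(raw: dict, source: str) -> EventEnvelope:
    """Translate one source-specific payload into the consistent envelope."""
    return EventEnvelope(
        source=source,
        what=raw.get("alertType", "unknown"),
        target=raw.get("resourceId", "unknown"),
        evidence=raw,
    )

# Ten formats in, one envelope out: each source gets its own normalize call.
env = normalize({"alertType": "service_unhealthy", "resourceId": "app-01"}, "monitor")
```

The point of the sketch is that everything downstream reasons over one shape, no matter how many sources feed it.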

458
00:20:55,440 –> 00:20:56,640
Now reasoning.

459
00:20:56,640 –> 00:20:58,960
Reasoning is not the agent thinking.

460
00:20:58,960 –> 00:21:01,200
Reasoning is the system converting a signal

461
00:21:01,200 –> 00:21:03,680
into an intentful plan under constraints.

462
00:21:03,680 –> 00:21:07,200
That typically means classify the event, extract the goal,

463
00:21:07,200 –> 00:21:10,160
decompose into steps and decide whether action is allowed.

464
00:21:10,160 –> 00:21:11,760
And here’s the uncomfortable truth.

465
00:21:11,760 –> 00:21:14,240
Reasoning needs explicit stop conditions.

466
00:21:14,240 –> 00:21:15,760
Humans stop because they get tired.

467
00:21:15,760 –> 00:21:18,400
Agents stop only when you define done or not safe.

468
00:21:18,400 –> 00:21:20,320
And without that, they don’t become autonomous.

469
00:21:20,320 –> 00:21:21,360
They become persistent.

470
00:21:21,360 –> 00:21:22,960
So you need confidence thresholds,

471
00:21:22,960 –> 00:21:25,920
anomaly detection, and policy checks as part of reasoning.

472
00:21:25,920 –> 00:21:26,960
Not as an afterthought.

473
00:21:26,960 –> 00:21:30,400
The system has to decide upfront whether it should act, ask, or escalate.

474
00:21:30,400 –> 00:21:32,800
That decision is the autonomy boundary in motion.

475
00:21:32,800 –> 00:21:34,720
Suggestion versus execution.
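
That act-ask-escalate decision can be sketched as a single function; the 0.9 threshold and the argument names are invented values for illustration, not product settings:

```python
def decide(confidence: float, policy_allows: bool, anomaly: bool) -> str:
    """Return 'act', 'ask', or 'escalate' before any tool is touched."""
    if anomaly or not policy_allows:
        return "escalate"  # outside the execution contract: humans own it
    if confidence >= 0.9:  # threshold is an invented example value
        return "act"       # high confidence and policy permits: execute
    return "ask"           # allowed but uncertain: stay in suggestion mode

decide(0.95, policy_allows=True, anomaly=False)  # returns "act"
```

The design point is that the check runs upfront, as part of reasoning, rather than being bolted on after an action fails.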

476
00:21:34,720 –> 00:21:38,640
Then orchestration. Orchestration is where most people get seduced by complexity.

477
00:21:38,640 –> 00:21:42,080
Multi-agent this, planner that, tool router, memory store, fine.

478
00:21:42,080 –> 00:21:44,560
But the practical purpose of orchestration is simple.

479
00:21:44,560 –> 00:21:47,680
Route the work to the right capability in the right order

480
00:21:47,680 –> 00:21:49,920
with fallbacks that don’t become loopholes.

481
00:21:49,920 –> 00:21:51,920
Orchestration chooses tools and specialists

482
00:21:51,920 –> 00:21:53,680
the way a human operator does.

483
00:21:53,680 –> 00:21:54,880
I need more context.

484
00:21:54,880 –> 00:21:56,560
Go query the ticket system.

485
00:21:56,560 –> 00:21:57,840
I need to validate scope.

486
00:21:57,840 –> 00:21:59,360
Go check identity risk.

487
00:21:59,360 –> 00:22:01,840
I need to apply a change, use this runbook.

488
00:22:01,840 –> 00:22:04,400
The difference is that orchestration has to be deterministic

489
00:22:04,400 –> 00:22:06,560
about permissions and evidence collection.

490
00:22:06,560 –> 00:22:10,240
Otherwise, your fallback path becomes the real path because it’s easier.

491
00:22:10,240 –> 00:22:13,280
And orchestration must handle failure as a first class input.

492
00:22:13,280 –> 00:22:14,160
APIs throttle.

493
00:22:14,160 –> 00:22:15,440
Graph returns partial data.

494
00:22:15,440 –> 00:22:16,640
A device goes offline.

495
00:22:16,640 –> 00:22:17,920
A resource group locks.

497
00:22:17,920 –> 00:22:19,040
A connector breaks.

498
00:22:19,040 –> 00:22:20,480
The agent doesn’t get to pretend.

499
00:22:20,480 –> 00:22:22,400
Orchestration has to implement retries,

500
00:22:22,400 –> 00:22:24,560
backoff, alternate paths, and escalation rules

501
00:22:24,560 –> 00:22:26,880
that don’t spam your on-call rotation into quitting.
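
A minimal sketch of that failure handling, with invented names: retries back off exponentially, and escalation fires once at the end rather than paging on every attempt:

```python
import time

def with_retries(call, attempts=3, base_delay=1.0, escalate=print):
    """Retry a flaky tool call with backoff; escalate once, not per failure."""
    for i in range(attempts):
        try:
            return call()
        except Exception as exc:
            if i == attempts - 1:
                # One escalation at the end, not a page per retry.
                escalate(f"escalating after {attempts} attempts: {exc}")
                raise
            time.sleep(base_delay * 2 ** i)  # back off: 1s, 2s, 4s...
```

Throttled Graph calls, offline devices, and broken connectors all flow through the same path, so the escalation volume stays bounded by design.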

502
00:22:26,880 –> 00:22:27,920
Next is action.

503
00:22:28,800 –> 00:22:31,520
Action is the part everyone demos because it looks impressive.

504
00:22:31,520 –> 00:22:34,720
But action is where you either enforce the execution contract

505
00:22:34,720 –> 00:22:36,080
or you lie about having one.

506
00:22:36,080 –> 00:22:38,560
Actions are concrete tool calls,

507
00:22:38,560 –> 00:22:41,280
patching a service, updating a configuration,

508
00:22:41,280 –> 00:22:42,240
revoking a session,

509
00:22:42,240 –> 00:22:43,920
disabling a risky app consent,

510
00:22:43,920 –> 00:22:45,600
posting to a Teams channel,

511
00:22:45,600 –> 00:22:48,000
creating a change record, closing a ticket.

512
00:22:48,000 –> 00:22:50,480
And each action must run under a scoped identity

513
00:22:50,480 –> 00:22:51,920
with bounded permissions.

514
00:22:51,920 –> 00:22:55,040
This is where read versus write stops being theory.

515
00:22:55,040 –> 00:22:56,960
If the agent can write to the wrong plane,

516
00:22:56,960 –> 00:22:59,360
you’ve built a worm with good documentation.

517
00:22:59,360 –> 00:23:00,720
So action needs guardrails,

518
00:23:00,720 –> 00:23:02,960
quotas, rate limits, scope boundaries,

519
00:23:02,960 –> 00:23:05,600
and a kill switch that actually halts an in-flight run.

520
00:23:05,600 –> 00:23:07,520
An action must include verification.

521
00:23:07,520 –> 00:23:08,560
Not “I executed,”

522
00:23:08,560 –> 00:23:10,480
but verified outcomes:

523
00:23:10,480 –> 00:23:12,480
Service healthy, incident stopped paging,

524
00:23:12,480 –> 00:23:14,720
reconciliation balanced, containment took effect.

525
00:23:14,720 –> 00:23:16,800
If you don’t verify, you didn’t automate a result.

526
00:23:16,800 –> 00:23:17,840
You automated a guess.
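
Those guardrails can be sketched as a wrapper around each tool call. The class and its checks are hypothetical, shown only to make the quota, kill-switch, and verification ideas concrete:

```python
class Guardrail:
    """Hypothetical wrapper enforcing quota, kill switch, and verification."""

    def __init__(self, quota: int):
        self.quota = quota   # maximum actions this run may take
        self.used = 0
        self.killed = False  # flipping this halts the in-flight run

    def run(self, action, verify):
        if self.killed:
            raise RuntimeError("kill switch engaged: run halted")
        if self.used >= self.quota:
            raise RuntimeError("quota exhausted: action refused")
        self.used += 1
        result = action()
        if not verify(result):  # not "I executed" but a verified outcome
            raise RuntimeError("verification failed: outcome not confirmed")
        return result

g = Guardrail(quota=3)
g.run(lambda: "service restarted", lambda r: r == "service restarted")
```

Every action passes through one chokepoint, so the kill switch and quota hold even when the agent improvises its plan.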

527
00:23:17,840 –> 00:23:19,520
Finally, evidence. Evidence is the part

528
00:23:19,520 –> 00:23:21,840
that makes autonomy enterprise grade.

529
00:23:21,840 –> 00:23:23,520
Without it, you get “the agent said so,”

530
00:23:23,520 –> 00:23:26,400
which is just a new flavor of unaccountable change.

531
00:23:26,400 –> 00:23:28,400
Evidence means a replayable run.

532
00:23:28,400 –> 00:23:30,720
Inputs captured, the event payload stored,

533
00:23:30,720 –> 00:23:32,320
the reasoning decision recorded,

534
00:23:32,320 –> 00:23:34,480
the tool calls logged with parameters,

535
00:23:34,480 –> 00:23:36,800
the identities used, the approvals referenced,

536
00:23:36,800 –> 00:23:38,160
the outputs produced,

537
00:23:38,160 –> 00:23:40,800
and the verification checks that confirm success.
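
As a sketch, one replayable run record covering those elements might look like this; the key names and the digest step are illustrative choices, not an audit standard:

```python
import hashlib
import json
from datetime import datetime, timezone

def evidence_record(event, decision, tool_calls, identity, approvals,
                    outputs, checks) -> dict:
    """Assemble one replayable run record and seal it with a digest."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "event": event,            # the captured input payload
        "decision": decision,      # the reasoning decision that was recorded
        "tool_calls": tool_calls,  # each call logged with its parameters
        "identity": identity,      # the principal the actions ran as
        "approvals": approvals,    # approval references, not prose
        "outputs": outputs,        # what the run produced
        "verification": checks,    # the checks that confirm success
    }
    # A hash over the serialized record makes tampering detectable in review.
    record["digest"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record
```

Because the record is structured rather than narrative, drift detection becomes a query: group records by event class and diff the action paths.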

538
00:23:40,800 –> 00:23:42,080
This is not for curiosity.

539
00:23:42,080 –> 00:23:44,880
It’s for incident reviews, audits, and blame assignment.

540
00:23:44,880 –> 00:23:46,800
Because enterprises will do all three.

541
00:23:46,800 –> 00:23:49,040
Evidence is also how you detect drift.

542
00:23:49,040 –> 00:23:51,040
When the same event class suddenly produces

543
00:23:51,040 –> 00:23:52,400
different action paths, you know

544
00:23:52,400 –> 00:23:54,240
your contracts or entitlements eroded.

545
00:23:54,240 –> 00:23:57,120
So when someone asks what is autonomy architecturally,

546
00:23:57,120 –> 00:23:59,040
the answer isn’t an LLM with tools.

547
00:23:59,040 –> 00:24:01,520
It’s a closed loop system that ingests events,

548
00:24:01,520 –> 00:24:03,920
reasons under policy, orchestrates safely,

549
00:24:03,920 –> 00:24:06,240
acts with bounded identity and outputs evidence

550
00:24:06,240 –> 00:24:07,680
you can replay under hostility.

551
00:24:07,680 –> 00:24:11,040
Control plane versus execution plane,

552
00:24:11,040 –> 00:24:12,720
where governance actually lives.

553
00:24:12,720 –> 00:24:15,760
Now the stack is useful, but it hides the real fight.

554
00:24:15,760 –> 00:24:18,000
Governance doesn’t live in the agent.

555
00:24:18,000 –> 00:24:20,320
It lives in how you separate the control plane

556
00:24:20,320 –> 00:24:21,760
from the execution plane,

557
00:24:21,760 –> 00:24:23,840
and whether you keep that separation intact

558
00:24:23,840 –> 00:24:25,520
when someone asks for speed.

559
00:24:25,520 –> 00:24:28,320
The control plane is where you encode intent as constraints.

560
00:24:28,320 –> 00:24:30,400
It is identities, entitlements, policies,

561
00:24:30,400 –> 00:24:32,400
approvals, tool allow lists, evidence rules,

562
00:24:32,400 –> 00:24:34,080
and the ability to revoke any of those

563
00:24:34,080 –> 00:24:36,320
without negotiating with a dozen app teams.

564
00:24:36,320 –> 00:24:38,640
If you can’t change the rules without redeploying the agent,

565
00:24:38,640 –> 00:24:40,160
you don’t have a control plane.

566
00:24:40,160 –> 00:24:41,360
You have a fragile app.

567
00:24:41,360 –> 00:24:44,560
In Microsoft terms, the control plane is anchored in Entra,

568
00:24:44,560 –> 00:24:46,640
your policy layer, and your governance systems.

569
00:24:46,640 –> 00:24:48,880
The place where you decide what principals exist,

570
00:24:48,880 –> 00:24:50,560
what they can do under what conditions

571
00:24:50,560 –> 00:24:52,640
and what must be recorded when they do it.

572
00:24:52,640 –> 00:24:55,200
It’s also where you decide what “allowed” even means

573
00:24:55,200 –> 00:24:56,640
when the actor isn’t a person.

574
00:24:56,640 –> 00:24:58,640
The execution plane is where work happens.

575
00:24:58,640 –> 00:25:00,720
It is the runtime making graph calls,

576
00:25:00,720 –> 00:25:03,520
running runbooks, invoking sentinel playbooks,

577
00:25:03,520 –> 00:25:06,480
updating tickets, pushing messages into teams,

578
00:25:06,480 –> 00:25:09,520
touching SharePoint, writing back into the ERP

579
00:25:09,520 –> 00:25:12,960
or performing any other actuator move that changes state.

580
00:25:12,960 –> 00:25:15,760
Execution is the part that makes demos look impressive

581
00:25:15,760 –> 00:25:17,680
because it creates visible outcomes.

582
00:25:17,680 –> 00:25:20,480
It is also the part that turns small mistakes into incidents.

583
00:25:20,480 –> 00:25:21,520
That distinction matters

584
00:25:21,520 –> 00:25:23,440
because enterprises routinely invert them.

585
00:25:23,440 –> 00:25:24,720
They start with execution.

586
00:25:24,720 –> 00:25:26,080
We connected it to graph.

587
00:25:26,080 –> 00:25:27,360
We wired up the connector.

588
00:25:27,360 –> 00:25:28,960
It can restart the service.

589
00:25:28,960 –> 00:25:30,720
And then later they bolt on governance:

590
00:25:30,720 –> 00:25:32,160
a log file, a few approvals,

591
00:25:32,160 –> 00:25:34,080
a policy doc that nobody reads.

592
00:25:34,080 –> 00:25:36,160
Over time, convenience overrides intent.

593
00:25:36,160 –> 00:25:38,480
The execution plane becomes the real control plane

594
00:25:38,480 –> 00:25:40,000
because whoever owns the connector

595
00:25:40,000 –> 00:25:41,920
effectively owns the blast radius.

596
00:25:41,920 –> 00:25:43,440
This is the uncomfortable truth.

597
00:25:43,440 –> 00:25:46,160
Autonomy systems drift toward the fastest path

598
00:25:46,160 –> 00:25:48,400
unless you enforce separation by design.

599
00:25:48,400 –> 00:25:51,200
So what does separation look like in practice?

600
00:25:51,200 –> 00:25:53,200
First, control plane owns identity.

601
00:25:53,200 –> 00:25:55,760
Not the agent developer, not the workflow designer,

602
00:25:55,760 –> 00:25:58,320
not whoever has contributor in the subscription.

603
00:25:58,320 –> 00:26:00,240
The agent runs as a non-human principal

604
00:26:00,240 –> 00:26:02,240
with explicitly bounded roles.

605
00:26:02,240 –> 00:26:04,960
And those roles live in the same life cycle as human access,

606
00:26:04,960 –> 00:26:07,200
review, rotation and revocation.

607
00:26:07,200 –> 00:26:09,280
If a developer can quietly widen permissions

608
00:26:09,280 –> 00:26:11,920
to make the demo work, the system will inevitably

609
00:26:11,920 –> 00:26:13,360
ship with those permissions.

610
00:26:13,360 –> 00:26:16,080
Second, control plane owns tool availability.

611
00:26:16,080 –> 00:26:18,160
Not “the agent can use tools,”

612
00:26:18,160 –> 00:26:19,520
but which tools exist at all,

613
00:26:19,520 –> 00:26:21,120
which versions, and which ones are allowed

614
00:26:21,120 –> 00:26:21,840
in production.

615
00:26:21,840 –> 00:26:23,680
This is where MCP becomes dangerous

616
00:26:23,680 –> 00:26:25,600
if you don’t treat it like a perimeter.

617
00:26:25,600 –> 00:26:27,200
A tool registry is discovery.

618
00:26:27,200 –> 00:26:28,480
An allow-list is governance.

619
00:26:28,480 –> 00:26:30,880
If you don’t have both, you will wake up to tool sprawl

620
00:26:30,880 –> 00:26:31,920
and entitlement sprawl,

621
00:26:31,920 –> 00:26:34,160
and you won’t remember which one caused the incident.

622
00:26:34,160 –> 00:26:36,560
Third, control plane owns evidence requirements.

623
00:26:36,560 –> 00:26:39,120
You don’t let execution decide what counts as proof.

624
00:26:39,120 –> 00:26:41,840
The policy says, before you cross the autonomy boundary,

625
00:26:41,840 –> 00:26:44,560
you must have a ticket reference, correlated telemetry,

626
00:26:44,560 –> 00:26:46,640
and a policy clause that permits the action.

627
00:26:46,640 –> 00:26:50,160
And after the action, you must emit a replayable record.

628
00:26:50,160 –> 00:26:53,680
If you let the execution plane best effort its way through evidence,

629
00:26:53,680 –> 00:26:55,280
you’ll end up with polite narratives

630
00:26:55,280 –> 00:26:56,880
instead of audit artifacts.

631
00:26:56,880 –> 00:26:58,560
Now here’s the part everyone gets wrong.

632
00:26:58,560 –> 00:26:59,520
Exceptions.

633
00:26:59,520 –> 00:27:02,560
Most organizations think exceptions are operational flexibility.

634
00:27:02,560 –> 00:27:03,280
They are wrong.

635
00:27:03,280 –> 00:27:04,960
Exceptions are entropy generators.

636
00:27:04,960 –> 00:27:06,000
Every time someone says,

637
00:27:06,000 –> 00:27:07,600
“Just let the agent do it this one time.”

638
00:27:07,600 –> 00:27:09,520
They’re not making the system more useful.

639
00:27:09,520 –> 00:27:12,800
They’re making your deterministic security model probabilistic.

640
00:27:12,800 –> 00:27:14,560
Because the exception doesn’t live in a vacuum.

641
00:27:14,560 –> 00:27:16,240
It gets copied, reused, inherited

642
00:27:16,240 –> 00:27:17,840
and eventually treated as baseline.

643
00:27:17,840 –> 00:27:19,680
The system did exactly what you allowed.

644
00:27:19,680 –> 00:27:21,120
You just forgot you allowed it.

645
00:27:21,120 –> 00:27:23,200
And the hardest problem in this entire model

646
00:27:23,200 –> 00:27:25,680
isn’t starting an agent, it’s stopping one.

647
00:27:25,680 –> 00:27:27,600
Not “disable the app registration.”

648
00:27:27,600 –> 00:27:29,440
Stopping an in-flight run cleanly,

649
00:27:29,440 –> 00:27:31,600
mid execution across multiple systems

650
00:27:31,600 –> 00:27:33,840
with partial state changes and retries queued.

651
00:27:33,840 –> 00:27:36,240
If you don’t design kill behavior into the control plane,

652
00:27:36,240 –> 00:27:37,920
you’ll learn about it during an incident

653
00:27:37,920 –> 00:27:40,400
when the agent keeps helpfully reapplying

654
00:27:40,400 –> 00:27:42,400
the action you’re trying to roll back.

655
00:27:42,400 –> 00:27:43,600
So if you remember nothing else,

656
00:27:43,600 –> 00:27:45,200
governance lives in the control plane

657
00:27:45,200 –> 00:27:46,720
not in the agent’s prompt,

658
00:27:46,720 –> 00:27:49,200
The execution plane will always seek convenience.

659
00:27:49,200 –> 00:27:51,200
Your job is to make convenience impossible

660
00:27:51,200 –> 00:27:52,880
when it violates intent.

661
00:27:52,880 –> 00:27:54,080
The worth it test.

662
00:27:54,080 –> 00:27:56,160
When autonomy beats assistance.

663
00:27:56,160 –> 00:27:58,000
Autonomy is not better AI.

664
00:27:58,000 –> 00:27:59,280
It’s a different cost model.

665
00:27:59,280 –> 00:28:01,200
Assistance helps a person finish work.

666
00:28:01,200 –> 00:28:03,600
Autonomy finishes work and leaves you with an artifact.

667
00:28:03,600 –> 00:28:05,680
That means the only honest question

668
00:28:05,680 –> 00:28:07,280
is whether the overhead of building

669
00:28:07,280 –> 00:28:10,080
and governing autonomous execution pays for itself.

670
00:28:10,080 –> 00:28:13,120
And it only pays in a very specific shape of problem.

671
00:28:13,120 –> 00:28:15,520
Autonomy wins when the work has volume,

672
00:28:15,520 –> 00:28:17,760
repeatability and bounded consequences.

673
00:28:18,480 –> 00:28:20,320
Think of it like any other automation.

674
00:28:20,320 –> 00:28:23,440
If the decision is rare, ambiguous or politically sensitive,

675
00:28:23,440 –> 00:28:24,880
autonomy won’t save you.

676
00:28:24,880 –> 00:28:26,960
It will just give you a faster way to be wrong.

677
00:28:26,960 –> 00:28:28,320
So here’s the worth it test,

678
00:28:28,320 –> 00:28:30,640
stated the way an enterprise should state it.

679
00:28:30,640 –> 00:28:33,680
Autonomy beats assistance when it increases outcome throughput

680
00:28:33,680 –> 00:28:35,680
without increasing policy violations.

681
00:28:35,680 –> 00:28:37,680
Not when users like it,

682
00:28:37,680 –> 00:28:39,520
not when the demo is cool.

683
00:28:39,520 –> 00:28:42,240
When the system closes more outcomes per unit time,

684
00:28:42,240 –> 00:28:43,680
under enforced intent,

685
00:28:43,680 –> 00:28:45,920
and humans intervene less without losing control.

686
00:28:45,920 –> 00:28:47,600
That test has four components.

687
00:28:47,600 –> 00:28:49,600
First, throughput.

688
00:28:49,600 –> 00:28:50,880
Autonomy is a queue optimizer.

689
00:28:50,880 –> 00:28:52,480
If your queue depth never goes down,

690
00:28:52,480 –> 00:28:55,600
tickets churn, incidents churn, analysts become routers,

691
00:28:55,600 –> 00:28:58,160
then you have a throughput problem, not a skill problem.

692
00:28:58,160 –> 00:29:01,920
Autonomy earns its keep when it takes the low to medium complexity items

693
00:29:01,920 –> 00:29:04,560
off the queue entirely and keeps doing it at 2am

694
00:29:04,560 –> 00:29:07,120
on a weekend without waiting for someone to look at it.

695
00:29:07,120 –> 00:29:08,880
Second, consistency.

696
00:29:08,880 –> 00:29:10,560
Humans are inconsistent by design.

697
00:29:10,560 –> 00:29:12,080
They interpret runbooks differently.

698
00:29:12,080 –> 00:29:14,480
They skip documentation when the page is screaming.

699
00:29:14,480 –> 00:29:17,040
They make temporary changes and forget to reverse them.

700
00:29:17,040 –> 00:29:19,200
Autonomy, under an execution contract,

701
00:29:19,200 –> 00:29:21,280
does the same thing the same way every time.

702
00:29:21,280 –> 00:29:22,240
That is boring.

703
00:29:22,240 –> 00:29:23,600
Boring is the goal.

704
00:29:23,600 –> 00:29:26,240
Third, 24/7 execution.

705
00:29:26,240 –> 00:29:28,400
Assistance still bottlenecks on attention.

706
00:29:28,400 –> 00:29:30,800
Copilot can draft the incident report at midnight,

707
00:29:30,800 –> 00:29:32,720
but the incident still waits for the engineer

708
00:29:32,720 –> 00:29:35,680
who has to approve the change, run the fix, and document it.

709
00:29:35,680 –> 00:29:37,120
Autonomy doesn’t wait.

710
00:29:37,120 –> 00:29:39,600
It executes within its allowed action set,

711
00:29:39,600 –> 00:29:42,880
verifies and escalates only when the contract says it must.

712
00:29:42,880 –> 00:29:45,200
Fourth, reduced intervention rate.

713
00:29:45,200 –> 00:29:47,520
This is the metric most enterprises refuse to name

714
00:29:47,520 –> 00:29:49,120
because it forces accountability.

715
00:29:49,120 –> 00:29:51,840
What percentage of cases require a human to step in?

716
00:29:51,840 –> 00:29:53,840
With assistance, it’s basically all of them,

717
00:29:53,840 –> 00:29:56,240
because the human owns the last mile.

718
00:29:56,240 –> 00:29:58,880
With autonomy, you expect the intervention rate to drop,

719
00:29:58,880 –> 00:30:01,280
meaning the system handles the known knowns

720
00:30:01,280 –> 00:30:03,360
and punts the unknown unknowns to humans.

721
00:30:03,360 –> 00:30:04,480
Now those are the benefits.

722
00:30:04,480 –> 00:30:05,280
Here’s the gate.

723
00:30:05,280 –> 00:30:08,960
Autonomy only works when the decision environment is stable.

724
00:30:08,960 –> 00:30:11,360
That means the work items have recognizable patterns.

725
00:30:11,360 –> 00:30:13,440
The systems involved have reliable telemetry

726
00:30:13,440 –> 00:30:15,600
and the organization can define done

727
00:30:15,600 –> 00:30:18,080
in a way that can be validated automatically.

728
00:30:18,080 –> 00:30:21,040
If you can’t define done, you can’t automate outcomes.

729
00:30:21,040 –> 00:30:22,400
You can only automate motion.

730
00:30:22,400 –> 00:30:23,520
So what passes the test?

731
00:30:23,520 –> 00:30:26,160
High-volume, repeatable tasks with clear ownership

732
00:30:26,160 –> 00:30:27,760
and bounded scope.

733
00:30:27,760 –> 00:30:29,760
Common IT remediations.

734
00:30:29,760 –> 00:30:31,600
Known reconciliation patterns.

735
00:30:31,600 –> 00:30:33,760
Low-to-medium risk security responses

736
00:30:33,760 –> 00:30:36,800
where policy already defines what containment means.

737
00:30:36,800 –> 00:30:38,960
Autonomy thrives on operational repetition.

738
00:30:38,960 –> 00:30:40,000
What fails the test?

739
00:30:40,000 –> 00:30:41,440
Anything with ambiguous policy,

740
00:30:41,440 –> 00:30:44,560
sensitive consequences, weak telemetry or unclear ownership.

741
00:30:44,560 –> 00:30:46,800
If the action is “might impact executives,”

742
00:30:46,800 –> 00:30:48,560
you will end up with humans anyway.

743
00:30:48,560 –> 00:30:50,320
If the action involves money movement

744
00:30:50,320 –> 00:30:52,320
without a deterministic evidence chain,

745
00:30:52,320 –> 00:30:54,320
you will end up with auditors anyway.

746
00:30:54,320 –> 00:30:57,600
If the signal quality is low and the system spends its time guessing,

747
00:30:57,600 –> 00:31:00,160
you will end up with an expensive guessing machine.

748
00:31:00,160 –> 00:31:02,720
And the most common anti-case is the one nobody admits,

749
00:31:02,720 –> 00:31:04,160
unclear blast radius.

750
00:31:04,160 –> 00:31:07,200
If you can’t bound the scope of what the agent is allowed to touch,

751
00:31:07,200 –> 00:31:08,480
you shouldn’t let it touch anything.

752
00:31:08,480 –> 00:31:10,000
That’s not caution, that’s just math.

753
00:31:10,960 –> 00:31:14,000
Now the KPI framing, because this is where autonomy projects die

754
00:31:14,000 –> 00:31:14,960
in finance meetings.

755
00:31:14,960 –> 00:31:16,720
You don’t measure autonomy by token cost.

756
00:31:16,720 –> 00:31:18,880
You measure it by cost per closed outcome,

757
00:31:18,880 –> 00:31:21,600
cost per incident resolved, cost per ticket closed,

758
00:31:21,600 –> 00:31:23,360
cost per reconciliation balanced.

759
00:31:23,360 –> 00:31:25,360
cost per alert triaged to a real incident

760
00:31:25,360 –> 00:31:26,800
or dismissed with evidence.

761
00:31:26,800 –> 00:31:28,640
If autonomy lowers that number

762
00:31:28,640 –> 00:31:30,880
while holding policy compliance steady,

763
00:31:30,880 –> 00:31:31,680
it’s worth it.

764
00:31:31,680 –> 00:31:33,200
If it only makes people feel faster,

765
00:31:33,200 –> 00:31:34,560
it’s assistance with extra risk.

766
00:31:34,560 –> 00:31:35,760
So track four metrics

767
00:31:35,760 –> 00:31:37,600
and don’t negotiate with yourself about them.

768
00:31:37,600 –> 00:31:40,080
Time to close, from event to verified outcome.

769
00:31:40,080 –> 00:31:41,360
Human in the loop rate,

770
00:31:41,360 –> 00:31:43,120
what percentage required intervention,

771
00:31:43,120 –> 00:31:44,080
not review.

772
00:31:44,080 –> 00:31:47,120
Rollback frequency, how often did autonomy make a change

773
00:31:47,120 –> 00:31:48,320
that had to be undone?

774
00:31:48,320 –> 00:31:50,720
Policy compliance, how often did it cross a boundary

775
00:31:50,720 –> 00:31:51,600
it shouldn’t have crossed?

776
00:31:51,600 –> 00:31:54,160
And if you want one more that cuts through the noise,

777
00:31:54,160 –> 00:31:55,440
intervention histogram,

778
00:31:55,440 –> 00:31:56,960
not averages, the distribution,

779
00:31:56,960 –> 00:31:59,120
because the long tail is where incidents live.
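The four metrics plus the intervention histogram can be sketched as a small tracker. A minimal sketch; the record shape and histogram buckets are illustrative, not a product schema:

```python
from dataclasses import dataclass, field

@dataclass
class AutonomyMetrics:
    """Per-run records for the four KPIs plus the intervention histogram."""
    runs: list = field(default_factory=list)

    def record(self, *, close_seconds, needed_intervention, rolled_back, policy_violation):
        self.runs.append(dict(close_seconds=close_seconds,
                              needed_intervention=needed_intervention,
                              rolled_back=rolled_back,
                              policy_violation=policy_violation))

    def human_in_loop_rate(self):
        # What percentage required intervention, not review.
        return sum(r["needed_intervention"] for r in self.runs) / len(self.runs)

    def rollback_frequency(self):
        # How often a change had to be undone.
        return sum(r["rolled_back"] for r in self.runs) / len(self.runs)

    def policy_compliance(self):
        # How often the agent stayed inside its boundaries.
        return 1 - sum(r["policy_violation"] for r in self.runs) / len(self.runs)

    def close_time_histogram(self, buckets=(60, 300, 1800)):
        # A distribution, not an average: the long tail is where incidents live.
        hist = [0] * (len(buckets) + 1)
        for r in self.runs:
            hist[sum(r["close_seconds"] > b for b in buckets)] += 1
        return hist
```
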

780
00:31:59,120 –> 00:32:00,400
If the worth-it test passes,

781
00:32:00,400 –> 00:32:02,160
autonomy becomes an engineering project.

782
00:32:02,160 –> 00:32:04,400
If it fails, keep it as assistance

783
00:32:04,400 –> 00:32:06,640
and be honest that you’re buying faster labor.

784
00:32:06,640 –> 00:32:08,160
Now we can get concrete

785
00:32:08,160 –> 00:32:11,280
because the first scenario, autonomous IT remediation,

786
00:32:11,280 –> 00:32:13,200
exposes control plane immaturity

787
00:32:13,200 –> 00:32:16,080
faster than any governance workshop ever will.

788
00:32:16,080 –> 00:32:20,400
Scenario one, setup, autonomous IT remediation at scale.

789
00:32:20,400 –> 00:32:22,560
IT remediation is where autonomy stops

790
00:32:22,560 –> 00:32:25,600
being a philosophy and becomes a liability calculation.

791
00:32:25,600 –> 00:32:28,160
Because the pain is real, the volume is relentless

792
00:32:28,160 –> 00:32:31,440
and the work is mostly the same handful of moves repeated

793
00:32:31,440 –> 00:32:33,680
forever by tired humans who swear

794
00:32:33,680 –> 00:32:35,280
they’ll document it next time.

795
00:32:35,280 –> 00:32:37,680
The typical enterprise starts here, alert fatigue.

796
00:32:37,680 –> 00:32:40,400
Monitoring fires, someone triages, they assign it,

797
00:32:40,400 –> 00:32:42,080
the assignee asks for context,

798
00:32:42,080 –> 00:32:43,680
then you get the escalation loop.

799
00:32:43,680 –> 00:32:45,120
The ticket bounces between teams

800
00:32:45,120 –> 00:32:46,880
because nobody owns the whole path.

801
00:32:46,880 –> 00:32:49,040
Meanwhile, users keep reporting the symptom,

802
00:32:49,040 –> 00:32:50,720
not the cause, so the queue gets heavier

803
00:32:50,720 –> 00:32:52,160
while the service gets worse.

804
00:32:52,160 –> 00:32:53,920
And buried inside that mess is a simple truth.

805
00:32:53,920 –> 00:32:55,760
Most of the incidents aren’t mysterious.

806
00:32:55,760 –> 00:32:56,800
They’re just known.

807
00:32:56,800 –> 00:32:59,200
They’re restart-worthy, rollback-worthy,

808
00:32:59,200 –> 00:33:02,080
apply-the-known-fix-and-verify-worthy.

809
00:33:02,080 –> 00:33:03,840
But because humans are the bottleneck,

810
00:33:03,840 –> 00:33:04,880
everything queues.

811
00:33:04,880 –> 00:33:06,080
Work doesn’t close.

812
00:33:06,080 –> 00:33:06,880
It churns.

813
00:33:07,840 –> 00:33:10,800
So the baseline flow most organizations run looks like this.

814
00:33:10,800 –> 00:33:13,600
Detect, triage, assign, remediate, document.

815
00:33:13,600 –> 00:33:14,480
It sounds orderly.

816
00:33:14,480 –> 00:33:16,240
In practice, it’s a game of telephone.

817
00:33:16,240 –> 00:33:18,240
Detect is an alert with weak context.

818
00:33:18,240 –> 00:33:22,080
Triage is a person reconstructing context across three portals.

819
00:33:22,080 –> 00:33:23,600
Assign is guessing who’s least busy.

820
00:33:23,600 –> 00:33:26,000
Remediate is someone doing the same command sequence

821
00:33:26,000 –> 00:33:26,880
they did last week.

822
00:33:26,880 –> 00:33:28,640
Document is either an afterthought

823
00:33:28,640 –> 00:33:30,160
or a copy-pasted narrative

824
00:33:30,160 –> 00:33:32,480
written to satisfy process, not truth.

825
00:33:32,480 –> 00:33:33,680
Now the autonomy version,

826
00:33:33,680 –> 00:33:36,240
the one worth doing, changes the shape of the work.

827
00:33:36,240 –> 00:33:37,600
The agentic flow is,

828
00:33:37,600 –> 00:33:41,360
detect, diagnose, remediate, verify, close, report.

829
00:33:41,360 –> 00:33:42,320
Notice what’s missing.

830
00:33:42,320 –> 00:33:44,640
There’s no assign step

831
00:33:44,640 –> 00:33:46,640
because the system doesn’t need to find a human.

832
00:33:46,640 –> 00:33:48,560
And there’s no “document later”

833
00:33:48,560 –> 00:33:50,240
because evidence is part of the run,

834
00:33:50,240 –> 00:33:52,320
not a chore you hope somebody remembers.

835
00:33:52,320 –> 00:33:54,800
But this only works if you take ownership seriously.

836
00:33:54,800 –> 00:33:56,320
Autonomy doesn’t eliminate ownership.

837
00:33:56,320 –> 00:33:57,200
It just moves it.

838
00:33:57,200 –> 00:33:58,640
Someone still carries the pager.

839
00:33:58,640 –> 00:34:00,160
Someone still owns rollback.

840
00:34:00,160 –> 00:34:01,680
Someone still owns the change record

841
00:34:01,680 –> 00:34:03,520
when the agent makes a configuration update

842
00:34:03,520 –> 00:34:05,200
that technically counts as a change.

843
00:34:05,200 –> 00:34:06,960
Even if nobody typed the command.

844
00:34:06,960 –> 00:34:09,120
So before you let an agent touch remediation,

845
00:34:09,120 –> 00:34:10,560
you need to answer the questions

846
00:34:10,560 –> 00:34:13,200
most pilot teams avoid because they’re inconvenient.

847
00:34:13,200 –> 00:34:14,720
What incident classes are in scope?

848
00:34:14,720 –> 00:34:16,320
What systems are allowed to be changed?

849
00:34:16,320 –> 00:34:17,760
What is the containment unit?

850
00:34:17,760 –> 00:34:19,040
Subscription, resource group,

851
00:34:19,040 –> 00:34:20,960
specific service, specific environment?

852
00:34:20,960 –> 00:34:24,000
What evidence is required before the agent is allowed to act?

853
00:34:24,000 –> 00:34:25,840
And what does verified mean for each fix?

854
00:34:25,840 –> 00:34:27,040
Because in IT remediation,

855
00:34:27,040 –> 00:34:28,480
the action is usually trivial.

856
00:34:28,480 –> 00:34:29,920
The blast radius is not.

857
00:34:29,920 –> 00:34:31,840
Restarting a service sounds harmless

858
00:34:31,840 –> 00:34:33,440
until it restarts the wrong tier,

859
00:34:33,440 –> 00:34:35,360
drops connections and triggers a cascade

860
00:34:35,360 –> 00:34:36,800
that looks like an outage.

861
00:34:36,800 –> 00:34:38,320
Rolling back a config sounds safe

862
00:34:38,320 –> 00:34:40,560
until the known good state is from three months ago

863
00:34:40,560 –> 00:34:42,560
and today’s dependencies are different.

864
00:34:42,560 –> 00:34:43,920
Patching sounds responsible

865
00:34:43,920 –> 00:34:46,400
until the patch triggers a reboot during business hours

866
00:34:46,400 –> 00:34:48,800
because someone forgot to encode a maintenance window.

867
00:34:48,800 –> 00:34:50,480
This is why autonomy in remediation

868
00:34:50,480 –> 00:34:52,080
is the fastest way to expose

869
00:34:52,080 –> 00:34:53,840
whether your control plane is real.

870
00:34:53,840 –> 00:34:55,840
If you can’t express a remediation action

871
00:34:55,840 –> 00:34:58,640
as a bounded, auditable, reversible operation,

872
00:34:58,640 –> 00:34:59,840
you shouldn’t automate it.

873
00:34:59,840 –> 00:35:01,440
Not because automation is scary.

874
00:35:01,440 –> 00:35:02,880
Because automation is honest,

875
00:35:02,880 –> 00:35:05,680
it executes what you allow repeatedly at machine speed.

876
00:35:05,680 –> 00:35:06,800
So in this scenario,

877
00:35:06,800 –> 00:35:09,200
the objective isn’t “let the agent fix everything.”

878
00:35:09,200 –> 00:35:11,600
The objective is narrower and more defensible.

879
00:35:11,600 –> 00:35:13,920
Let the agent close the predictable incidents

880
00:35:13,920 –> 00:35:16,080
that already have deterministic runbooks

881
00:35:16,080 –> 00:35:18,720
with explicit thresholds and clean escalation.

882
00:35:18,720 –> 00:35:21,120
Think memory leaks with known mitigations.

883
00:35:21,120 –> 00:35:22,560
Stuck queue processors.

884
00:35:22,560 –> 00:35:23,920
Certificates approaching expiry

885
00:35:23,920 –> 00:35:25,520
where rotation is already scripted.

886
00:35:25,520 –> 00:35:28,320
Disk-space remediation where a cleanup is defined and bounded,

887
00:35:28,320 –> 00:35:30,800
service restarts where the verification checks are clear

888
00:35:30,800 –> 00:35:32,000
and the rollback is

889
00:35:32,000 –> 00:35:33,760
bring it back up and page a human

890
00:35:33,760 –> 00:35:35,280
if the health probe doesn’t recover.

891
00:35:35,280 –> 00:35:36,080
And if you’re thinking,

892
00:35:36,080 –> 00:35:38,000
“Okay, that’s just automation,” good.

893
00:35:38,000 –> 00:35:40,320
Autonomy is automation with three added requirements.

894
00:35:40,320 –> 00:35:42,880
It chooses the runbook, it proves why it chose it,

895
00:35:42,880 –> 00:35:45,920
and it verifies the outcome under an execution contract.

896
00:35:45,920 –> 00:35:48,160
Now here’s the payoff signal you should hold onto.

897
00:35:48,160 –> 00:35:49,440
Closing the ticket is easy.

898
00:35:49,440 –> 00:35:52,240
Producing evidence and bounding the blast radius is the work.

899
00:35:52,240 –> 00:35:54,960
That’s why this scenario is perfect as the first deep dive.

900
00:35:54,960 –> 00:35:57,120
It forces you to confront the autonomy boundary

901
00:35:57,120 –> 00:35:59,840
in a domain where outcomes are measurable and failure is loud.

902
00:35:59,840 –> 00:36:02,640
If the agent can’t show its evidence trail, you won’t trust it.

903
00:36:02,640 –> 00:36:04,960
If it can’t be stopped mid-flight, you’ll fear it.

904
00:36:04,960 –> 00:36:07,040
If you can’t name who wakes up when it fails,

905
00:36:07,040 –> 00:36:08,320
you’re not doing autonomy.

906
00:36:08,320 –> 00:36:09,360
You’re doing a demo.

907
00:36:09,360 –> 00:36:13,360
So the next thing is to map the flow across the real enterprise surfaces

908
00:36:13,360 –> 00:36:14,720
Azure for the resources,

909
00:36:14,720 –> 00:36:17,840
Graph for identity-adjacent actions and communications,

910
00:36:17,840 –> 00:36:20,800
the ITSM system for tickets and change records

911
00:36:20,800 –> 00:36:23,760
and policy gates that decide when execution is allowed.

912
00:36:23,760 –> 00:36:25,600
That’s where most implementations collapse,

913
00:36:25,600 –> 00:36:27,760
not in the model but in permissions and scope.

914
00:36:27,760 –> 00:36:33,680
Scenario one, system flow: Azure plus Graph plus ITSM plus policy gates.

915
00:36:33,680 –> 00:36:37,360
Start with the reality: the agent can’t remediate an incident.

916
00:36:37,360 –> 00:36:39,680
It can only move through systems that already exist.

917
00:36:39,680 –> 00:36:40,960
Azure for the workload,

918
00:36:40,960 –> 00:36:44,240
Microsoft Graph for identity-adjacent actions and communication,

919
00:36:44,240 –> 00:36:46,640
the ITSM platform for the record of truth,

920
00:36:46,640 –> 00:36:47,920
and then policy gates,

921
00:36:47,920 –> 00:36:53,200
Entra approvals and evidence rules that decide whether the agent is allowed to touch anything at all.

922
00:36:53,200 –> 00:36:54,960
So the flow begins at ingestion.

923
00:36:54,960 –> 00:36:59,760
A signal arrives from Azure Monitor, Log Analytics, Defender for Cloud, Service Health,

924
00:36:59,760 –> 00:37:01,840
or a ticket event from your ITSM tool.

925
00:37:01,840 –> 00:37:06,400
The first job is to normalize that signal into something the autonomy stack can reason over.

926
00:37:06,400 –> 00:37:09,520
Incident class, impacted resource,

927
00:37:09,520 –> 00:37:12,160
environment tag, customer impact signals,

928
00:37:12,160 –> 00:37:14,320
and any known runbook mapping keys.

929
00:37:14,320 –> 00:37:16,240
If the event payload can’t be mapped to scope,

930
00:37:16,240 –> 00:37:18,320
the system should not try harder.

931
00:37:18,320 –> 00:37:19,280
It should escalate,

932
00:37:19,280 –> 00:37:21,200
autonomy doesn’t earn trust by guessing,

933
00:37:21,200 –> 00:37:24,160
it earns trust by refusing to act without containment.
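That normalize-or-escalate rule can be sketched in a few lines. All field names here (alertRule, resourceId, env tag) are illustrative assumptions, not a Microsoft schema:

```python
REQUIRED_SCOPE_FIELDS = ("incident_class", "resource_id", "environment")

def normalize_signal(raw: dict) -> dict:
    """Map a raw monitoring payload into the fields the autonomy
    stack reasons over; refuse to act if scope can't be bounded."""
    normalized = {
        "incident_class": raw.get("alertRule"),
        "resource_id": raw.get("resourceId"),
        "environment": (raw.get("tags") or {}).get("env"),
        "runbook_key": raw.get("alertRule"),  # known runbook mapping key
    }
    if any(normalized[f] is None for f in REQUIRED_SCOPE_FIELDS):
        # Can't map the payload to scope -> don't try harder, escalate.
        return {"action": "escalate", "reason": "unmappable scope", "raw": raw}
    return {"action": "proceed", **normalized}
```
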

934
00:37:24,160 –> 00:37:25,840
Next is correlation and diagnosis.

935
00:37:25,840 –> 00:37:28,400
The agent pulls additional context from Azure.

936
00:37:28,400 –> 00:37:30,800
Recent deployments, configuration changes,

937
00:37:30,800 –> 00:37:33,200
scaling events, health probes, dependency failures,

938
00:37:33,200 –> 00:37:36,720
and whatever telemetry exists that can confirm this isn’t a phantom alert.

939
00:37:36,720 –> 00:37:40,480
This is where the execution contract’s evidence requirements become mechanical.

940
00:37:40,480 –> 00:37:42,960
If the contract says two independent signals,

941
00:37:42,960 –> 00:37:44,400
the system must collect them.

942
00:37:44,400 –> 00:37:47,520
A failing synthetic test plus a spike in error rate, for example.

943
00:37:47,520 –> 00:37:48,560
If it can’t, it stops.

944
00:37:48,560 –> 00:37:50,480
That’s the autonomy boundary doing its job.
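The mechanical version of that evidence rule is short. A sketch, with an assumed contract shape; "independent" is approximated as distinct confirmed signal kinds:

```python
def evidence_satisfied(contract: dict, signals: list) -> bool:
    """Check the contract's evidence rule before any write action:
    e.g. two independent confirmed signals (a failing synthetic test
    plus an error-rate spike). If it can't collect them, it stops."""
    kinds = {s["kind"] for s in signals if s.get("confirmed")}
    return len(kinds) >= contract.get("min_independent_signals", 2)
```
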

945
00:37:50,480 –> 00:37:53,920
Now the system decides whether the incident is in an autonomous class.

946
00:37:53,920 –> 00:37:55,600
That classification shouldn’t live in a prompt.

947
00:37:55,600 –> 00:37:57,040
It should live in policy,

948
00:37:57,040 –> 00:37:58,400
a list of incident types,

949
00:37:58,400 –> 00:38:00,400
environments, and severity levels

950
00:38:00,400 –> 00:38:02,160
that are eligible for automatic action.

951
00:38:02,160 –> 00:38:04,560
Production Sev-1 with unknown blast radius?

952
00:38:04,560 –> 00:38:05,200
No.

953
00:38:05,200 –> 00:38:08,080
Non-prod queue processor wedged for 30 minutes with a known fix?

954
00:38:08,080 –> 00:38:08,560
Yes.

955
00:38:08,560 –> 00:38:09,920
The goal is not heroics.

956
00:38:09,920 –> 00:38:11,520
The goal is predictable closure.
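Classification living in policy, not in a prompt, can look like a plain data table a reviewer can diff. The class names, environments, and severity convention (1 = most severe) are illustrative:

```python
# Eligibility lives in reviewable data, not in a prompt.
AUTONOMY_POLICY = [
    # (incident_class, environment, max_severity_allowed)
    ("queue_processor_wedged", "nonprod", 3),
    ("cert_near_expiry",       "prod",    3),
]

def eligible_for_autonomy(incident_class: str, environment: str, severity: int) -> bool:
    """True only for explicitly listed (class, environment) pairs
    at or below the allowed severity (1 = most severe)."""
    for cls, env, max_sev in AUTONOMY_POLICY:
        if cls == incident_class and env == environment and severity >= max_sev:
            return True
    return False
```
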

957
00:38:11,520 –> 00:38:12,880
Once the incident is eligible,

958
00:38:12,880 –> 00:38:15,280
orchestration selects the remediation pathway.

959
00:38:15,280 –> 00:38:16,400
In enterprise terms,

960
00:38:16,400 –> 00:38:18,960
this is runbook selection with preconditions.

961
00:38:18,960 –> 00:38:21,120
The agent chooses restart service,

962
00:38:21,120 –> 00:38:22,000
scale out,

963
00:38:22,000 –> 00:38:23,520
rollback last deployment,

964
00:38:23,520 –> 00:38:24,800
clear poison queue,

965
00:38:24,800 –> 00:38:26,400
rotate certificate,

966
00:38:26,400 –> 00:38:27,520
whatever you’ve defined.

967
00:38:27,520 –> 00:38:30,960
But each pathway has to include two extra things humans often skip,

968
00:38:30,960 –> 00:38:32,080
a rollback plan,

969
00:38:32,080 –> 00:38:33,520
and a verification plan.

970
00:38:33,520 –> 00:38:36,080
Rollback is what happens if the action makes it worse.

971
00:38:36,080 –> 00:38:38,320
Verification is what proves the action worked

972
00:38:38,320 –> 00:38:40,720
without a human saying “looks fine.”
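A remediation pathway with those two extra parts can be modeled as one structure, so a runbook without a rollback and verification plan simply doesn't typecheck. A hypothetical sketch:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Runbook:
    """A pathway is only complete with all three parts."""
    name: str
    act: Callable[[], None]       # the fix itself
    verify: Callable[[], bool]    # proves it worked, no "looks fine"
    rollback: Callable[[], None]  # what happens if it made things worse

def execute(runbook: Runbook) -> str:
    runbook.act()
    if runbook.verify():
        return "closed"
    runbook.rollback()
    return "rolled_back_and_escalated"
```
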

973
00:38:40,720 –> 00:38:42,080
Now we hit the policy gates.

974
00:38:42,080 –> 00:38:43,520
Before any write action,

975
00:38:43,520 –> 00:38:46,240
the agent should cross-check identity and authorization.

976
00:38:46,240 –> 00:38:48,080
What principal is executing?

977
00:38:48,080 –> 00:38:49,360
What roles are active?

978
00:38:49,360 –> 00:38:51,040
And whether the current context

979
00:38:51,040 –> 00:38:52,800
satisfies conditional access

980
00:38:52,800 –> 00:38:54,800
and whatever risk conditions you enforce.

981
00:38:54,800 –> 00:38:56,640
And yes, if you’re doing this properly,

982
00:38:56,640 –> 00:38:58,800
you’ll end up with something PIM-like in spirit,

983
00:38:58,800 –> 00:39:00,640
even if the implementation differs,

984
00:39:00,640 –> 00:39:03,360
a constrained elevation model for specific actions,

985
00:39:03,360 –> 00:39:04,800
time-bounded, scope-bounded,

986
00:39:04,800 –> 00:39:07,200
and logged as an event that can be audited.

987
00:39:07,200 –> 00:39:09,920
At the same time, the ITSM system becomes a gate,

988
00:39:09,920 –> 00:39:11,120
not a bystander.

989
00:39:11,120 –> 00:39:14,000
The agent should either create or update a ticket with

990
00:39:14,000 –> 00:39:15,200
the detected signal,

991
00:39:15,200 –> 00:39:16,560
the evidence collected,

992
00:39:16,560 –> 00:39:17,840
the planned action sequence,

993
00:39:17,840 –> 00:39:20,240
and the policy clause that authorizes execution.

994
00:39:20,240 –> 00:39:22,080
If change control matters in your org,

995
00:39:22,080 –> 00:39:23,840
the agent should also create a change record,

996
00:39:23,840 –> 00:39:27,360
because “the agent did it” is not an exemption from your own process.

997
00:39:27,360 –> 00:39:29,920
It just means the process must be machine readable,

998
00:39:29,920 –> 00:39:31,840
then the action execution happens in Azure.

999
00:39:31,840 –> 00:39:33,440
This is where people get sloppy.

1000
00:39:33,440 –> 00:39:35,040
Restart the service must be implemented

1001
00:39:35,040 –> 00:39:36,640
as a scoped operation.

1002
00:39:36,640 –> 00:39:39,280
Target resource IDs explicitly, restrict subscription

1003
00:39:39,280 –> 00:39:40,560
and resource group boundaries

1004
00:39:40,560 –> 00:39:41,680
and enforce rate limits,

1005
00:39:41,680 –> 00:39:43,680
so the agent can’t restart the entire fleet

1006
00:39:43,680 –> 00:39:45,680
because it saw the same symptom twice.
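"Restart the service" as a scoped operation, with an explicit containment boundary and a rate limit, can be sketched like this. The prefix, limit, and clock injection are illustrative assumptions:

```python
import time

# Containment unit: one subscription / resource group (example value).
ALLOWED_PREFIX = "/subscriptions/S1/resourceGroups/rg-payments/"
MAX_RESTARTS_PER_HOUR = 3
_restart_log: list = []

def restart_service(resource_id: str, now=None) -> str:
    """Scoped restart: explicit resource ID, enforced boundary, and a
    rate limit so seeing the same symptom twice can't restart the fleet."""
    now = time.time() if now is None else now
    if not resource_id.startswith(ALLOWED_PREFIX):
        return "refused: outside containment unit"
    recent = [t for t in _restart_log if now - t < 3600]
    if len(recent) >= MAX_RESTARTS_PER_HOUR:
        return "refused: rate limit, escalating"
    _restart_log.append(now)
    return "restarted"
```
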

1007
00:39:45,680 –> 00:39:48,240
If the remediation involves deployment rollback,

1008
00:39:48,240 –> 00:39:50,160
it must pin to a specific version

1009
00:39:50,160 –> 00:39:51,840
and validate dependency drift.

1010
00:39:51,840 –> 00:39:52,960
If it involves patching,

1011
00:39:52,960 –> 00:39:54,800
it must honor maintenance windows.

1012
00:39:54,800 –> 00:39:56,960
Autonomy doesn’t erase operational discipline.

1013
00:39:56,960 –> 00:39:58,240
It weaponizes it,

1014
00:39:58,240 –> 00:40:00,080
either in your favor or against you.

1015
00:40:00,080 –> 00:40:02,880
Now Graph shows up for two things.

1016
00:40:02,880 –> 00:40:04,640
Coordination and containment.

1017
00:40:04,640 –> 00:40:06,560
Coordination means notifications,

1018
00:40:06,560 –> 00:40:08,240
posting to the right teams channel,

1019
00:40:08,240 –> 00:40:09,200
updating the ticket,

1020
00:40:09,200 –> 00:40:11,440
emailing impacted stakeholders if that’s your norm.

1021
00:40:11,440 –> 00:40:13,680
Containment means identity-adjacent actions

1022
00:40:13,680 –> 00:40:15,200
when the incident demands it,

1023
00:40:15,200 –> 00:40:17,440
disabling a compromised app registration,

1024
00:40:17,440 –> 00:40:18,640
revoking sessions,

1025
00:40:18,640 –> 00:40:19,760
rotating secrets,

1026
00:40:19,760 –> 00:40:20,720
or pulling access.

1027
00:40:20,720 –> 00:40:22,160
But those actions are higher risk,

1028
00:40:22,160 –> 00:40:24,080
so they should sit behind stricter gates,

1029
00:40:24,080 –> 00:40:25,440
stronger evidence requirements,

1030
00:40:25,440 –> 00:40:27,680
tighter scopes, and lower confidence tolerance.

1031
00:40:27,680 –> 00:40:29,760
Finally, verification and closure.

1032
00:40:29,760 –> 00:40:32,720
The agent re-queries telemetry,

1033
00:40:32,720 –> 00:40:33,920
health probes green,

1034
00:40:33,920 –> 00:40:35,200
error rates normal,

1035
00:40:35,200 –> 00:40:36,720
queue depth trending down.

1036
00:40:36,720 –> 00:40:38,400
User impact signals resolved.

1037
00:40:38,400 –> 00:40:39,520
If verification fails,

1038
00:40:39,520 –> 00:40:41,040
it either rolls back or escalates

1039
00:40:41,040 –> 00:40:42,240
depending on the contract.

1040
00:40:42,240 –> 00:40:42,960
And when it closes,

1041
00:40:42,960 –> 00:40:44,400
it doesn’t just close the ticket.

1042
00:40:44,400 –> 00:40:45,680
It writes the evidence bundle,

1043
00:40:45,680 –> 00:40:46,640
inputs, decisions,

1044
00:40:46,640 –> 00:40:47,600
toolcalls, approvals,

1045
00:40:47,600 –> 00:40:50,160
and verification results linked to the ITSM record.

1046
00:40:50,160 –> 00:40:51,360
That bundle is the product.

1047
00:40:51,360 –> 00:40:53,120
Without it, you don’t have autonomy.

1048
00:40:53,120 –> 00:40:53,920
You have fast,

1049
00:40:53,920 –> 00:40:54,960
unreviewable change.
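"Closing isn't just closing" can be made literal: closure is refused unless the evidence bundle is complete. The required field names are a sketch:

```python
REQUIRED = ("inputs", "decisions", "tool_calls", "approvals", "verification")

def close_with_evidence(ticket_id: str, bundle: dict) -> dict:
    """The bundle, not the status change, is the product: an incomplete
    bundle blocks closure and names what's missing."""
    missing = [k for k in REQUIRED if not bundle.get(k)]
    if missing:
        return {"ticket": ticket_id, "status": "open", "blocked_on": missing}
    return {"ticket": ticket_id, "status": "closed", "evidence": bundle}
```
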

1050
00:40:54,960 –> 00:40:56,720
Scenario one,

1051
00:40:56,720 –> 00:40:57,920
governance: least privilege,

1052
00:40:57,920 –> 00:40:59,040
or it becomes a worm.

1053
00:40:59,040 –> 00:41:00,160
Now we talk about governance,

1054
00:41:00,160 –> 00:41:01,840
because this is where the remediation story

1055
00:41:01,840 –> 00:41:03,360
stops being an engineering win

1056
00:41:03,360 –> 00:41:06,400
and starts being an enterprise incident waiting to happen.

1057
00:41:06,400 –> 00:41:09,120
Autonomous remediation has a simple security truth.

1058
00:41:09,120 –> 00:41:10,560
If the agent can do anything,

1059
00:41:10,560 –> 00:41:12,240
it will eventually do everything.

1060
00:41:12,240 –> 00:41:13,200
Not out of malice,

1061
00:41:13,200 –> 00:41:15,360
out of pathfinding, tools try to succeed,

1062
00:41:15,360 –> 00:41:16,560
retries try to recover,

1063
00:41:16,560 –> 00:41:17,680
fallbacks try to help.

1064
00:41:17,680 –> 00:41:19,360
And if you gave the system broad rights,

1065
00:41:19,360 –> 00:41:21,040
you built a self-propelled operator

1066
00:41:21,040 –> 00:41:22,640
with no meaningful containment.

1067
00:41:22,640 –> 00:41:24,240
That’s a worm, just with nicer logs.

1068
00:41:24,240 –> 00:41:26,640
So governance for this scenario is not a checklist.

1069
00:41:26,640 –> 00:41:29,120
It is least privilege, expressed as an execution contract

1070
00:41:29,120 –> 00:41:30,960
that the runtime cannot negotiate with.

1071
00:41:30,960 –> 00:41:32,560
Start with the agent identity.

1072
00:41:32,560 –> 00:41:34,240
This cannot be a service account.

1073
00:41:34,240 –> 00:41:36,960
It cannot be “my automation app registration.”

1074
00:41:36,960 –> 00:41:38,160
It’s a non-human principal

1075
00:41:38,160 –> 00:41:39,200
with a single purpose,

1076
00:41:39,200 –> 00:41:40,080
a narrow scope,

1077
00:41:40,080 –> 00:41:41,600
and a life cycle you actually manage.

1078
00:41:41,600 –> 00:41:43,280
It needs explicit role boundaries,

1079
00:41:43,280 –> 00:41:44,080
what it can read,

1080
00:41:44,080 –> 00:41:44,880
what it can write,

1081
00:41:44,880 –> 00:41:46,480
and more importantly, where.

1082
00:41:46,480 –> 00:41:48,000
Subscription, resource group,

1083
00:41:48,000 –> 00:41:49,280
specific resource types,

1084
00:41:49,280 –> 00:41:50,320
specific environments.

1085
00:41:50,320 –> 00:41:52,160
The containment unit needs to be explicit

1086
00:41:52,160 –> 00:41:54,960
because remediation is always tempted to expand scope.

1087
00:41:54,960 –> 00:41:55,920
I saw the issue here,

1088
00:41:55,920 –> 00:41:57,520
so I’ll go look over there.

1089
00:41:57,520 –> 00:41:59,360
No, it stays where you told it to stay.

1090
00:41:59,360 –> 00:42:00,480
Then you enforce it in

1091
00:42:00,480 –> 00:42:02,000
Entra and Azure authorization,

1092
00:42:02,000 –> 00:42:02,720
not in a prompt.

1093
00:42:02,720 –> 00:42:04,240
The easiest way to lie to yourself

1094
00:42:04,240 –> 00:42:05,840
is to implement least privilege

1095
00:42:05,840 –> 00:42:07,200
in the orchestration logic

1096
00:42:07,200 –> 00:42:09,360
while the principal still has Contributor.

1097
00:42:09,360 –> 00:42:10,880
The system will behave until it doesn’t,

1098
00:42:10,880 –> 00:42:11,680
and when it doesn’t,

1099
00:42:11,680 –> 00:42:14,320
the logs will faithfully record the outcome you allowed.

1100
00:42:14,320 –> 00:42:15,280
So you need a pattern

1101
00:42:15,280 –> 00:42:17,520
where the baseline identity can observe

1102
00:42:17,520 –> 00:42:18,880
broadly enough to diagnose,

1103
00:42:18,880 –> 00:42:21,360
but act narrowly enough to not create a blast radius.

1104
00:42:21,360 –> 00:42:23,920
And if you require elevation for certain actions,

1105
00:42:23,920 –> 00:42:25,680
you make that elevation time-bounded,

1106
00:42:25,680 –> 00:42:27,360
scope-bounded, and auditable.

1107
00:42:27,360 –> 00:42:28,240
Call it PIM-like,

1108
00:42:28,240 –> 00:42:29,120
call it just in time,

1109
00:42:29,120 –> 00:42:30,160
call it whatever you want.

1110
00:42:30,160 –> 00:42:31,440
The mechanism isn’t the point.

1111
00:42:31,440 –> 00:42:33,280
The point is that write access

1112
00:42:33,280 –> 00:42:34,880
is a temporary capability,

1113
00:42:34,880 –> 00:42:36,400
not a permanent property.
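A PIM-like elevation, in spirit, is just a time-bounded, scope-bounded token plus an audit event. A sketch only, not the Entra PIM API; every name here is hypothetical:

```python
import time

def grant_elevation(action_class, scope, ttl_seconds, audit_log, now=None):
    """Write access as a temporary capability: bounded in time and
    scope, and logged as an auditable event."""
    now = time.time() if now is None else now
    token = {"action_class": action_class, "scope": scope,
             "expires_at": now + ttl_seconds}
    audit_log.append({"event": "elevation_granted", **token})
    return token

def elevation_valid(token, action_class, resource_id, now=None):
    """A token only authorizes its own action class, inside its own
    scope, before its expiry."""
    now = time.time() if now is None else now
    return (token["action_class"] == action_class
            and resource_id.startswith(token["scope"])
            and now < token["expires_at"])
```
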

1114
00:42:36,400 –> 00:42:38,320
Next, permission granularity.

1115
00:42:38,320 –> 00:42:42,160
Most orgs treat remediation as a single permission set.

1116
00:42:42,160 –> 00:42:43,680
The agent can remediate.

1117
00:42:43,680 –> 00:42:45,040
That’s how you end up with an agent

1118
00:42:45,040 –> 00:42:46,320
that can restart a service

1119
00:42:46,320 –> 00:42:47,760
and also reconfigure networking

1120
00:42:47,760 –> 00:42:49,760
because both are operations.

1121
00:42:49,760 –> 00:42:51,120
They are not symmetrical.

1122
00:42:51,120 –> 00:42:54,400
Restart one App Service instance is an operational nudge.

1123
00:42:54,400 –> 00:42:57,120
Modify NSG rules is infrastructure surgery.

1124
00:42:57,120 –> 00:42:59,280
Rollback a deployment is reversible.

1125
00:42:59,280 –> 00:43:02,080
Rotate secrets across dependencies is cross-system coupling.

1126
00:43:02,080 –> 00:43:03,600
So you define action classes

1127
00:43:03,600 –> 00:43:05,520
and you bind privileges to those classes.

1128
00:43:05,520 –> 00:43:06,720
You do not grant write

1129
00:43:06,720 –> 00:43:08,000
and hope policy saves you,

1130
00:43:08,000 –> 00:43:09,360
policy doesn’t save you.

1131
00:43:09,360 –> 00:43:10,480
It records your mistakes.
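Binding privileges to action classes instead of one "can remediate" grant can be a small lookup. The class names and the per-agent grant are illustrative:

```python
# 'Remediate' is not one permission set; operations are not symmetrical.
ACTION_CLASSES = {
    "operational_nudge": {"restart_instance", "clear_poison_queue"},
    "reversible_change": {"rollback_deployment"},
    "infra_surgery":     {"modify_nsg_rules", "rotate_cross_system_secrets"},
}

# This agent gets the low-risk classes only; surgery stays with humans.
GRANTED_CLASSES = {"operational_nudge", "reversible_change"}

def authorized(operation: str) -> bool:
    """True only if the operation belongs to a class this agent holds."""
    return any(operation in ops
               for cls, ops in ACTION_CLASSES.items()
               if cls in GRANTED_CLASSES)
```
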

1132
00:43:10,480 –> 00:43:12,080
Now, guardrails,

1133
00:43:12,080 –> 00:43:14,560
because permissions alone don’t prevent failure loops.

1134
00:43:14,560 –> 00:43:15,920
You need kill switches.

1135
00:43:15,920 –> 00:43:16,560
Real ones.

1136
00:43:16,560 –> 00:43:19,920
A kill switch is not “disable the app.”

1137
00:43:19,920 –> 00:43:21,760
A kill switch is a control plane decision

1138
00:43:21,760 –> 00:43:23,200
that stops new runs from starting

1139
00:43:23,200 –> 00:43:25,840
and also terminates in-flight runs cleanly.

1140
00:43:25,840 –> 00:43:27,440
Cancel queued tool calls,

1141
00:43:27,440 –> 00:43:28,480
prevent retries,

1142
00:43:28,480 –> 00:43:30,560
and leave a clear halted state

1143
00:43:30,560 –> 00:43:33,600
that humans can resume from or roll back from.
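A real kill switch, as described, is a control-plane decision with three effects. A minimal sketch with a hypothetical run shape:

```python
class KillSwitch:
    """Blocks new runs, cancels queued tool calls, prevents retries,
    and leaves a clean 'halted' state humans can resume or roll back."""
    def __init__(self):
        self.engaged = False

    def engage(self, runs: list) -> list:
        self.engaged = True
        halted = []
        for run in runs:
            run["queued_tool_calls"] = []   # cancel queued work
            run["retries_allowed"] = False  # prevent retries
            run["state"] = "halted"         # clean, resumable state
            halted.append(run["id"])
        return halted

    def allow_new_run(self) -> bool:
        return not self.engaged
```
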

1144
00:43:33,600 –> 00:43:34,480
Without that,

1145
00:43:34,480 –> 00:43:36,080
your incident response will include

1146
00:43:36,080 –> 00:43:37,600
fighting your own automation

1147
00:43:37,600 –> 00:43:39,360
while it keeps trying to help.

1148
00:43:39,360 –> 00:43:41,840
Then you need quotas. Action quotas per run.

1149
00:43:41,840 –> 00:43:43,440
Action quotas per hour.

1150
00:43:43,440 –> 00:43:44,640
Resource quotas per scope.

1151
00:43:44,640 –> 00:43:46,560
If the agent sees 500 alerts

1152
00:43:46,560 –> 00:43:48,400
and decides to remediate all of them,

1153
00:43:48,400 –> 00:43:49,760
that’s not productivity.

1154
00:43:49,760 –> 00:43:51,600
That’s a denial of service you paid for.

1155
00:43:51,600 –> 00:43:53,360
Quoters force the system to batch,

1156
00:43:53,360 –> 00:43:54,160
to prioritize,

1157
00:43:54,160 –> 00:43:56,480
and to escalate when it hits its allowed limit.
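The batch-prioritize-escalate behavior a quota forces is one function. A sketch; alerts are assumed pre-sorted by priority and the quota value is illustrative:

```python
def plan_batch(alerts: list, per_run_quota: int) -> dict:
    """Quotas force batching: remediate up to the quota, escalate the
    rest instead of remediating 500 alerts at once."""
    return {
        "remediate_now": alerts[:per_run_quota],
        "escalate": alerts[per_run_quota:],  # over-quota work goes to humans
    }
```
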

1158
00:43:56,480 –> 00:43:58,000
And you need confidence thresholds

1159
00:43:58,000 –> 00:43:59,360
that actually mean something.

1160
00:43:59,360 –> 00:44:01,120
Not a single confidence score number

1161
00:44:01,120 –> 00:44:03,120
that gets tuned until the system acts.

1162
00:44:03,120 –> 00:44:05,200
You define what constitutes sufficient evidence

1163
00:44:05,200 –> 00:44:06,400
for the class of incident.

1164
00:44:06,400 –> 00:44:07,520
Two independent signals,

1165
00:44:07,520 –> 00:44:08,320
a known signature,

1166
00:44:08,320 –> 00:44:09,680
a validated precondition.

1167
00:44:09,680 –> 00:44:10,720
If those aren’t met,

1168
00:44:10,720 –> 00:44:13,120
the agent escalates with the evidence it has and stops.

1169
00:44:13,120 –> 00:44:14,400
That’s how you keep autonomy

1170
00:44:14,400 –> 00:44:16,640
from becoming probabilistic improvisation.

1171
00:44:16,640 –> 00:44:18,480
Finally, the escalation contract.

1172
00:44:18,480 –> 00:44:20,080
When it can’t act, where does it go?

1173
00:44:20,080 –> 00:44:21,520
ITSM ticket assignment,

1174
00:44:21,520 –> 00:44:22,400
Teams channel,

1175
00:44:22,400 –> 00:44:23,520
on call paging.

1176
00:44:23,520 –> 00:44:24,480
And what does it include?

1177
00:44:24,480 –> 00:44:25,760
It includes the evidence bundle

1178
00:44:25,760 –> 00:44:27,120
and the proposed next action.

1179
00:44:27,120 –> 00:44:28,240
Not a vague summary.

1180
00:44:28,240 –> 00:44:30,240
The goal is to turn human in the loop

1181
00:44:30,240 –> 00:44:31,920
into human as exception handler,

1182
00:44:31,920 –> 00:44:33,760
not human as the default executor.

1183
00:44:33,760 –> 00:44:34,960
And you measure all of this

1184
00:44:34,960 –> 00:44:36,880
because governance without measurement is theatre.

1185
00:44:36,880 –> 00:44:38,640
Track MTTR delta, sure.

1186
00:44:38,640 –> 00:44:40,480
But also track human in loop rate,

1187
00:44:40,480 –> 00:44:41,440
rollback frequency,

1188
00:44:41,440 –> 00:44:43,600
and the number of times the kill switch gets used.

1189
00:44:43,600 –> 00:44:44,800
If rollbacks are frequent,

1190
00:44:44,800 –> 00:44:46,800
your execution contract is too permissive

1191
00:44:46,800 –> 00:44:48,160
or your verification is weak.

1192
00:44:48,160 –> 00:44:49,440
If the kill switch gets used often,

1193
00:44:49,440 –> 00:44:50,640
you have a drift problem.

1194
00:44:50,640 –> 00:44:52,320
If the human in loop rate never drops,

1195
00:44:52,320 –> 00:44:54,560
you built assistance and called it autonomy.

1196
00:44:54,560 –> 00:44:57,200
So the governance rule for scenario one is brutal and simple.

1197
00:44:57,200 –> 00:44:59,120
Either remediation is least privileged

1198
00:44:59,120 –> 00:45:00,400
with enforceable boundaries

1199
00:45:00,400 –> 00:45:02,720
or it becomes a worm with change control paperwork.

1200
00:45:02,720 –> 00:45:04,320
There is no third state.

1201
00:45:04,320 –> 00:45:06,160
Scenario two, setup.

1202
00:45:06,160 –> 00:45:08,720
Finance reconciliation and close support.

1203
00:45:08,720 –> 00:45:10,720
Finance is where autonomy stops being

1204
00:45:10,720 –> 00:45:13,600
ops automation and turns into institutional trust.

1205
00:45:13,600 –> 00:45:15,920
Because reconciliation isn’t a convenience task,

1206
00:45:15,920 –> 00:45:17,840
it’s the thing standing between your organization

1207
00:45:17,840 –> 00:45:20,880
and an audit finding that ruins someone’s quarter.

1208
00:45:20,880 –> 00:45:22,320
The pain pattern is predictable.

1209
00:45:22,320 –> 00:45:25,120
Close arrives, everyone becomes a human join engine

1210
00:45:25,120 –> 00:45:27,280
and the spreadsheet layer metastasizes.

1211
00:45:27,280 –> 00:45:29,680
People pull exports from the ERP, bank feeds,

1212
00:45:29,680 –> 00:45:31,680
expense platforms, procurement systems,

1213
00:45:31,680 –> 00:45:33,760
and whatever temporary tracker someone made

1214
00:45:33,760 –> 00:45:35,840
because the official system was slow.

1215
00:45:35,840 –> 00:45:37,680
Then they spend days matching line items,

1216
00:45:37,680 –> 00:45:39,200
chasing missing references,

1217
00:45:39,200 –> 00:45:41,360
and writing explanations that sound plausible enough

1218
00:45:41,360 –> 00:45:42,240
to survive review.

1219
00:45:42,240 –> 00:45:44,960
And the thing most people miss is that reconciliation work

1220
00:45:44,960 –> 00:45:47,200
has two outputs, not one.

1221
00:45:47,200 –> 00:45:48,960
Yes, you want the numbers to balance.

1222
00:45:48,960 –> 00:45:50,960
But the real product is the rationale.

1223
00:45:50,960 –> 00:45:52,960
Why this transaction matches that one?

1224
00:45:52,960 –> 00:45:54,560
Why this variance exists?

1225
00:45:54,560 –> 00:45:56,560
What policy clause allows the adjustment?

1226
00:45:56,560 –> 00:45:58,160
And who approved the exception?

1227
00:45:58,160 –> 00:45:59,840
Finance doesn’t just need an answer.

1228
00:45:59,840 –> 00:46:02,960
It needs an answer that can be re-performed under scrutiny.

1229
00:46:02,960 –> 00:46:04,800
That’s why assistance hits a ceiling here.

1230
00:46:04,800 –> 00:46:07,200
Copilot can draft a variance narrative faster.

1231
00:46:07,200 –> 00:46:08,480
It can summarize a spreadsheet.

1232
00:46:08,480 –> 00:46:10,560
It can help a controller write an email.

1233
00:46:10,560 –> 00:46:13,520
But it can’t, by itself, create an evidence chain

1234
00:46:13,520 –> 00:46:15,840
that an auditor can replay end to end.

1235
00:46:15,840 –> 00:46:17,200
And without that evidence chain,

1236
00:46:17,200 –> 00:46:18,960
autonomy is not automation.

1237
00:46:18,960 –> 00:46:20,000
It’s liability.

1238
00:46:20,000 –> 00:46:21,840
So the autonomy boundary in finance

1239
00:46:21,840 –> 00:46:23,840
has to be drawn differently than in IT.

1240
00:46:23,840 –> 00:46:26,320
In IT remediation, the boundary is usually:

1241
00:46:26,320 –> 00:46:28,800
can the agent execute the runbook safely

1242
00:46:28,800 –> 00:46:30,640
and verify service health.

1243
00:46:30,640 –> 00:46:32,400
In finance, the boundary is,

1244
00:46:32,400 –> 00:46:34,000
can the agent justify the action

1245
00:46:34,000 –> 00:46:36,480
with grounded source references and policy alignment

1246
00:46:36,480 –> 00:46:38,960
before it touches anything that affects a ledger?

1247
00:46:38,960 –> 00:46:40,720
Because finance failures are quiet.

1248
00:46:40,720 –> 00:46:42,320
They don’t page you at 2 a.m.

1249
00:46:42,320 –> 00:46:44,320
and they show up months later in a room with lawyers.

1250
00:46:44,320 –> 00:46:46,640
The baseline close workflow looks like this.

1251
00:46:46,640 –> 00:46:49,600
Extract data, reconcile, resolve exceptions,

1252
00:46:49,600 –> 00:46:52,240
document rationale, get approvals,

1253
00:46:52,240 –> 00:46:54,720
post adjustments, report.

1254
00:46:54,720 –> 00:46:56,800
Humans act as translators between systems

1255
00:46:56,800 –> 00:46:59,120
that don’t agree on identifiers, timestamps,

1256
00:46:59,120 –> 00:47:01,200
currencies, or the meaning of settled.

1257
00:47:01,200 –> 00:47:03,120
They also act as policy interpreters

1258
00:47:03,120 –> 00:47:05,840
because exception handling is where the judgment lives.

1259
00:47:05,840 –> 00:47:08,880
The agentic target outcome is not replace accountants.

1260
00:47:08,880 –> 00:47:11,440
The agentic target is shrink the exception queue

1261
00:47:11,440 –> 00:47:14,320
and turn routine matching into a deterministic pipeline.

1262
00:47:14,320 –> 00:47:16,320
That means the agent does three things well.

1263
00:47:16,320 –> 00:47:19,440
First, automated matching across known patterns.

1264
00:47:19,440 –> 00:47:22,000
Same vendor, same invoice ID, same amount,

1265
00:47:22,000 –> 00:47:23,440
predictable timing offsets.

1266
00:47:23,440 –> 00:47:25,280
This is boring work, but it’s high volume

1267
00:47:25,280 –> 00:47:26,880
and it’s where humans burn time

1268
00:47:26,880 –> 00:47:28,720
that should be spent on the weird cases.

1269
00:47:28,720 –> 00:47:31,360
Second, anomaly servicing with real triage.

1270
00:47:31,360 –> 00:47:33,520
Not “here are 500 variances.”

1271
00:47:33,520 –> 00:47:35,120
But “here are the 12 that matter,”

1272
00:47:35,120 –> 00:47:36,160
with clustering.

1273
00:47:36,160 –> 00:47:38,720
Duplicates, currency conversion discrepancies,

1274
00:47:38,720 –> 00:47:40,880
partial shipments, late postings,

1275
00:47:40,880 –> 00:47:42,720
missing purchase order references.

1276
00:47:42,720 –> 00:47:44,480
The value is not finding anomalies.

1277
00:47:44,480 –> 00:47:46,480
The value is reducing the search space.

1278
00:47:46,480 –> 00:47:49,120
Third, auto-resolution for known mismatch classes,

1279
00:47:49,120 –> 00:47:51,840
but only when the execution contract permits it.

1280
00:47:51,840 –> 00:47:53,920
For example, reclassifying transactions

1281
00:47:53,920 –> 00:47:55,760
that meet explicit criteria,

1282
00:47:55,760 –> 00:47:57,120
generating correcting entries

1283
00:47:57,120 –> 00:47:58,960
that are pre-approved under policy

1284
00:47:58,960 –> 00:48:00,720
or preparing a journal entry package

1285
00:48:00,720 –> 00:48:02,960
that is complete and ready for human approval

1286
00:48:02,960 –> 00:48:06,080
when the action crosses a sensitivity threshold.
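The three-tier logic just described — auto-resolve under explicit criteria, generate pre-approved correcting entries, or hand a complete package to a human past a sensitivity threshold — can be sketched as a simple contract check. This is an illustrative sketch only; the mismatch classes, threshold value, and policy clause names are invented for the example.

```python
from dataclasses import dataclass

# Hypothetical execution contract: which mismatch classes the agent may
# auto-resolve, and the materiality threshold above which every action
# goes to a human approver regardless of class.
CONTRACT = {
    "auto_resolvable_classes": {"timing_offset", "fx_rounding"},
    "materiality_threshold": 500.00,  # currency units; illustrative
}

@dataclass
class Adjustment:
    mismatch_class: str
    amount: float
    policy_clause: str  # the clause that authorizes the treatment

def route(adj: Adjustment) -> str:
    """Auto-resolve only when class and materiality both permit it;
    otherwise prepare a complete package for human approval."""
    if adj.amount > CONTRACT["materiality_threshold"]:
        return "escalate_with_package"      # crosses sensitivity threshold
    if adj.mismatch_class in CONTRACT["auto_resolvable_classes"]:
        return "auto_resolve"               # explicit, pre-approved criteria
    return "escalate_with_package"          # unknown class: propose, don't post

print(route(Adjustment("fx_rounding", 12.40, "FIN-POL-4.2")))   # auto_resolve
print(route(Adjustment("fx_rounding", 9000.0, "FIN-POL-4.2")))  # escalate_with_package
```

The point of the table being data, not code, is that the autonomy boundary stays inspectable and changeable without redeploying the agent.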

1287
00:48:06,080 –> 00:48:08,880
And the blunt line for this section needs to land cleanly:

1288
00:48:08,880 –> 00:48:11,840
autonomy that can’t explain itself is not automation,

1289
00:48:11,840 –> 00:48:12,960
it’s liability.

1290
00:48:12,960 –> 00:48:15,200
Because a finance agent that says trust me

1291
00:48:15,200 –> 00:48:18,240
is just a faster way to create untraceable adjustments.

1292
00:48:18,240 –> 00:48:20,560
The agent must behave like a disciplined analyst.

1293
00:48:20,560 –> 00:48:22,160
Every number tied to a source.

1294
00:48:22,160 –> 00:48:23,760
Every transformation documented,

1295
00:48:23,760 –> 00:48:25,200
every decision bound to policy

1296
00:48:25,200 –> 00:48:27,280
and every action gated by approval

1297
00:48:27,280 –> 00:48:29,840
when the consequences exceed the autonomy boundary.

1298
00:48:29,840 –> 00:48:31,920
Now, let’s be precise about systems

1299
00:48:31,920 –> 00:48:34,240
touched without pretending we’re doing a product tour.

1300
00:48:34,240 –> 00:48:36,480
Finance reconciliation in a Microsoft enterprise

1301
00:48:36,480 –> 00:48:38,320
will touch at least three surfaces.

1302
00:48:38,320 –> 00:48:40,320
The system of record, the collaboration layer,

1303
00:48:40,320 –> 00:48:41,680
and identity context.

1304
00:48:41,680 –> 00:48:44,880
The system of record is your ERP and its satellites,

1305
00:48:44,880 –> 00:48:48,000
where transactions live and where adjustments ultimately land.

1306
00:48:48,000 –> 00:48:50,960
The collaboration layer is Microsoft 365.

1307
00:48:50,960 –> 00:48:53,680
Excel files, SharePoint or OneDrive stores,

1308
00:48:53,680 –> 00:48:56,080
Teams conversations, email threads that become

1309
00:48:56,080 –> 00:48:58,400
approvals in practice even when they shouldn’t.

1310
00:48:58,400 –> 00:49:00,320
And identity context is Entra:

1311
00:49:00,320 –> 00:49:01,840
who is authorized to view,

1312
00:49:01,840 –> 00:49:03,360
who is authorized to propose,

1313
00:49:03,360 –> 00:49:04,480
who is authorized to post,

1314
00:49:04,480 –> 00:49:07,440
and what segregation of duties rules must remain true,

1315
00:49:07,440 –> 00:49:09,360
even when an agent is doing the legwork.

1316
00:49:09,360 –> 00:49:11,920
And this is where the autonomy stack becomes unavoidable.

1317
00:49:11,920 –> 00:49:13,280
Events are the close calendar,

1318
00:49:13,280 –> 00:49:14,320
the arrival of feeds,

1319
00:49:14,320 –> 00:49:16,560
the detection of variances beyond tolerance.

1320
00:49:16,560 –> 00:49:18,800
Reasoning is classification and policy mapping.

1321
00:49:18,800 –> 00:49:21,360
Orchestration is dispatching specialized matches

1322
00:49:21,360 –> 00:49:22,560
and anomaly agents.

1323
00:49:22,560 –> 00:49:24,560
Action is creating the adjustment package

1324
00:49:24,560 –> 00:49:26,880
or posting within a constrained scope if allowed.

1325
00:49:26,880 –> 00:49:29,280
Evidence is the entire point.

1326
00:49:29,280 –> 00:49:33,280
A replayable reconciliation run that survives hostile review.

1327
00:49:33,280 –> 00:49:35,680
So scenario two sets up the real enterprise question.

1328
00:49:35,680 –> 00:49:37,200
IT autonomy fails loudly.

1329
00:49:37,200 –> 00:49:39,200
Finance autonomy fails quietly.

1330
00:49:39,200 –> 00:49:40,400
And that’s why in the next section,

1331
00:49:40,400 –> 00:49:42,160
the product we design isn’t the agent.

1332
00:49:42,160 –> 00:49:43,520
It’s the audit trail.

1333
00:49:43,520 –> 00:49:45,920
Scenario two, evidence first design.

1334
00:49:45,920 –> 00:49:47,760
Audit trails as the product.

1335
00:49:47,760 –> 00:49:50,880
Finance autonomy only works when the evidence trail is treated

1336
00:49:50,880 –> 00:49:53,600
as a first class deliverable, not a side effect.

1337
00:49:53,600 –> 00:49:55,280
Most implementations do the opposite.

1338
00:49:55,280 –> 00:49:56,880
They build the reconciliation logic,

1339
00:49:56,880 –> 00:49:58,160
they wire up the connectors,

1340
00:49:58,160 –> 00:50:00,080
they generate a looks right summary,

1341
00:50:00,080 –> 00:50:01,280
and then someone says,

1342
00:50:01,280 –> 00:50:02,800
“We’ll add audit later.”

1343
00:50:02,800 –> 00:50:04,720
Audit later is how you end up with an agent

1344
00:50:04,720 –> 00:50:06,800
that can move numbers without leaving fingerprints.

1345
00:50:06,800 –> 00:50:07,920
That is not innovation,

1346
00:50:07,920 –> 00:50:09,920
that is a governance incident with better branding.

1347
00:50:09,920 –> 00:50:12,560
So in this scenario, the product is the audit trail.

1348
00:50:12,560 –> 00:50:14,800
The reconciliation result is just the byproduct

1349
00:50:14,800 –> 00:50:16,080
that makes finance care.

1350
00:50:16,080 –> 00:50:17,920
Start with the required artifacts

1351
00:50:17,920 –> 00:50:20,080
because finance doesn’t accept vibes as proof.

1352
00:50:20,080 –> 00:50:24,160
Every matched or adjusted item needs source references,

1353
00:50:24,160 –> 00:50:25,440
the transformations applied,

1354
00:50:25,440 –> 00:50:27,200
the rationale and the approval context,

1355
00:50:27,200 –> 00:50:30,080
not as prose, but as structured, linkable objects:

1356
00:50:30,080 –> 00:50:31,360
a bank line item ID,

1357
00:50:31,360 –> 00:50:32,800
an ERP document ID,

1358
00:50:32,800 –> 00:50:34,800
a file hash or SharePoint version ID

1359
00:50:34,800 –> 00:50:36,160
for the supporting schedule,

1360
00:50:36,160 –> 00:50:37,760
and a pointer to the policy clause

1361
00:50:37,760 –> 00:50:39,280
that authorizes the treatment.
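Those linkable objects could be as small as a frozen record like the one below. All IDs, versions, and the clause pointer are invented for illustration; the shape is what matters — references, not prose.

```python
from dataclasses import dataclass, asdict

@dataclass(frozen=True)
class EvidenceLink:
    """One matched or adjusted item, expressed as linkable references
    rather than narrative. Every field is illustrative."""
    bank_line_item_id: str
    erp_document_id: str
    supporting_doc_version: str   # file hash or SharePoint version ID
    policy_clause: str            # the clause authorizing the treatment
    transformations: tuple        # ordered steps applied to the data
    rationale: str

link = EvidenceLink(
    bank_line_item_id="BANK-2024-000123",
    erp_document_id="ERP-INV-88421",
    supporting_doc_version="sha256:ab12cd34",
    policy_clause="FIN-POL-4.2",
    transformations=("currency_normalize:EUR", "date_align:+2d"),
    rationale="Timing offset within tolerance per FIN-POL-4.2.",
)
print(asdict(link)["erp_document_id"])  # ERP-INV-88421
```

Because the record is frozen and every field is an identifier, a reviewer can resolve each reference independently instead of trusting the narrative.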

1362
00:50:39,280 –> 00:50:41,840
If the agent can’t point back to exactly what it used,

1363
00:50:41,840 –> 00:50:43,680
it can’t claim it reconciled anything.

1364
00:50:43,680 –> 00:50:46,160
It just predicted what reconciliation might look like.

1365
00:50:46,160 –> 00:50:47,200
That distinction matters

1366
00:50:47,200 –> 00:50:50,320
because large language models are inherently probabilistic.

1367
00:50:50,320 –> 00:50:52,000
They generate plausible explanations

1368
00:50:52,000 –> 00:50:55,040
unless you force them to operate under grounding constraints.

1369
00:50:55,040 –> 00:50:57,200
In finance, plausible is the enemy.

1370
00:50:57,200 –> 00:50:59,360
So grounding discipline becomes non-negotiable.

1371
00:50:59,360 –> 00:51:00,800
This is not a web search problem.

1372
00:51:00,800 –> 00:51:04,080
This is not “let’s ask the internet what GAAP says.”

1373
00:51:04,080 –> 00:51:07,440
The agent must operate on controlled enterprise data sources

1374
00:51:07,440 –> 00:51:09,200
with deterministic access boundaries.

1375
00:51:09,200 –> 00:51:10,960
If the system of record is the ERP,

1376
00:51:10,960 –> 00:51:12,640
then the agent reads the ERP.

1377
00:51:12,640 –> 00:51:15,280
If supporting documentation lives in SharePoint,

1378
00:51:15,280 –> 00:51:16,960
then it reads specific libraries

1379
00:51:16,960 –> 00:51:19,360
with specific labels under specific scopes.

1380
00:51:19,360 –> 00:51:20,720
And when it produces a narrative,

1381
00:51:20,720 –> 00:51:21,760
it cites those sources

1382
00:51:21,760 –> 00:51:23,360
as if a hostile reviewer will check them,

1383
00:51:23,360 –> 00:51:24,160
because they will.

1384
00:51:24,160 –> 00:51:27,200
Now, orchestration. Finance reconciliation looks like one workflow,

1385
00:51:27,200 –> 00:51:29,440
but it’s really a set of specialist behaviors

1386
00:51:29,440 –> 00:51:31,200
coordinated under a strict contract.

1387
00:51:31,200 –> 00:51:33,840
You typically want at least three conceptual agents,

1388
00:51:33,840 –> 00:51:37,120
even if they’re implemented as one service. A matching specialist.

1389
00:51:37,120 –> 00:51:40,160
It performs deterministic joins and pattern matches

1390
00:51:40,160 –> 00:51:42,640
with tolerances and rules that are explicit.

1391
00:51:42,640 –> 00:51:44,480
It should prefer boring, explainable logic

1392
00:51:44,480 –> 00:51:46,000
over reasoning whenever possible

1393
00:51:46,000 –> 00:51:49,040
because deterministic matching produces auditability by default.
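The “boring, explainable logic” the matching specialist prefers can be this literal: a keyed join with an explicit tolerance, where every match records the rule that produced it. Field names and the tolerance value are assumptions for the sketch.

```python
# Deterministic matcher: join bank lines to ERP documents on
# (vendor, invoice_id), accepting a small amount tolerance.
# Field names and the 0.01 tolerance are illustrative assumptions.
TOLERANCE = 0.01

def match(bank_lines, erp_docs):
    erp_index = {(d["vendor"], d["invoice_id"]): d for d in erp_docs}
    matched, unmatched = [], []
    for line in bank_lines:
        doc = erp_index.get((line["vendor"], line["invoice_id"]))
        if doc and abs(line["amount"] - doc["amount"]) <= TOLERANCE:
            # Record the rule that produced the match: auditability by default.
            matched.append((line["id"], doc["id"], "exact_key+amount_tolerance"))
        else:
            unmatched.append(line["id"])  # goes to the anomaly specialist
    return matched, unmatched

bank = [{"id": "B1", "vendor": "Acme", "invoice_id": "INV-7", "amount": 100.00}]
erp = [{"id": "E9", "vendor": "Acme", "invoice_id": "INV-7", "amount": 100.005}]
print(match(bank, erp))
```

Nothing here needs a model: the rule name in each match tuple is the explanation, which is exactly why deterministic matching audits itself.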

1394
00:51:49,040 –> 00:51:51,360
An anomaly specialist.

1395
00:51:51,360 –> 00:51:54,000
It clusters exceptions into known classes,

1396
00:51:54,000 –> 00:51:56,560
prioritizes by materiality and risk,

1397
00:51:56,560 –> 00:51:59,440
and flags what cannot be resolved automatically.

1398
00:51:59,440 –> 00:52:02,160
The goal is not to generate a longer exception list.

1399
00:52:02,160 –> 00:52:04,880
The goal is to reduce the controller’s search space.

1400
00:52:04,880 –> 00:52:06,080
A policy specialist.

1401
00:52:06,080 –> 00:52:08,560
It maps proposed adjustments to policy.

1402
00:52:08,560 –> 00:52:10,720
Segregation of duties, approval thresholds,

1403
00:52:10,720 –> 00:52:13,600
materiality rules, and whatever your organization enforces.

1404
00:52:13,600 –> 00:52:15,520
This is where the autonomy boundary lives.

1405
00:52:15,520 –> 00:52:18,000
In finance, the system can propose broadly,

1406
00:52:18,000 –> 00:52:20,000
but it can only execute narrowly

1407
00:52:20,000 –> 00:52:22,480
and only with the approvals the policy requires.

1408
00:52:22,480 –> 00:52:24,080
Then a coordinator ties them together

1409
00:52:24,080 –> 00:52:25,440
and produces a run artifact,

1410
00:52:25,440 –> 00:52:27,360
and that run artifact has to be replayable.

1411
00:52:27,360 –> 00:52:29,360
Replayability is the thing most teams skip

1412
00:52:29,360 –> 00:52:30,800
because it feels like extra work.

1413
00:52:30,800 –> 00:52:31,760
It is not extra work.

1414
00:52:31,760 –> 00:52:34,640
It is the only mechanism that converts agent output

1415
00:52:34,640 –> 00:52:37,040
into operationally defensible automation.

1416
00:52:37,040 –> 00:52:39,440
Replay means you can take the same inputs,

1417
00:52:39,440 –> 00:52:40,800
the same source extracts,

1418
00:52:40,800 –> 00:52:43,600
the same versions of files, the same policy rule set,

1419
00:52:43,600 –> 00:52:46,160
and rerun the logic to get the same outcome.

1420
00:52:46,160 –> 00:52:48,960
Or if the outcome changes, you can prove why.

1421
00:52:48,960 –> 00:52:51,600
A data change, a policy change, or a tool version change.

1422
00:52:51,600 –> 00:52:53,840
Without replay, post-mortems become storytelling.
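One lightweight way to make replay checkable is to fingerprint everything a run depends on — source extract versions, the policy rule set version, tool versions — so a later rerun can prove which input changed. A minimal sketch, with invented version labels:

```python
import hashlib
import json

def run_fingerprint(source_versions: dict, policy_version: str,
                    tool_versions: dict) -> str:
    """Stable hash of everything a rerun depends on. If a rerun's
    fingerprint differs, you can name which input changed instead
    of telling a story."""
    payload = json.dumps(
        {"sources": source_versions, "policy": policy_version,
         "tools": tool_versions},
        sort_keys=True,  # key order must not change the hash
    )
    return hashlib.sha256(payload.encode()).hexdigest()

fp1 = run_fingerprint({"bank_feed": "v42"}, "FIN-POL-2024-06", {"matcher": "1.3.0"})
fp2 = run_fingerprint({"bank_feed": "v43"}, "FIN-POL-2024-06", {"matcher": "1.3.0"})
print(fp1 != fp2)  # True: a data change, not a mystery
```

Storing the fingerprint in the run artifact means a post-mortem starts with a diff of inputs, not a debate about memory.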

1423
00:52:53,840 –> 00:52:55,440
Finance doesn’t tolerate storytelling.

1424
00:52:55,440 –> 00:52:56,720
So what does the agent produce?

1425
00:52:56,720 –> 00:52:59,200
It produces variance packs and exception queues

1426
00:52:59,200 –> 00:53:01,840
that look like finance work product, not AI output.

1427
00:53:01,840 –> 00:53:04,560
A variance pack that includes the matched sets,

1428
00:53:04,560 –> 00:53:08,080
the unmatched sets, the transformation steps, and the rationale.

1429
00:53:08,080 –> 00:53:10,640
An exception queue that includes reason codes,

1430
00:53:10,640 –> 00:53:12,160
suggested remediation steps,

1431
00:53:12,160 –> 00:53:14,720
and the minimum approval required to resolve it.

1432
00:53:14,720 –> 00:53:16,800
And it produces controller-ready narratives

1433
00:53:16,800 –> 00:53:18,080
that are grounded.

1434
00:53:18,080 –> 00:53:20,240
Every claim backed by a linked source reference.

1435
00:53:20,240 –> 00:53:23,200
Now metrics because you’ll be asked to justify this.

1436
00:53:23,200 –> 00:53:25,600
Time to close for the reconciliation cycle is obvious.

1437
00:53:25,600 –> 00:53:26,400
But it’s not enough.

1438
00:53:26,400 –> 00:53:28,960
You track error rate versus human baselines,

1439
00:53:28,960 –> 00:53:31,760
because autonomy that is faster but wrong is not autonomy.

1440
00:53:31,760 –> 00:53:33,440
You track exception backlog aging

1441
00:53:33,440 –> 00:53:35,440
because the goal is to shrink the long tail

1442
00:53:35,440 –> 00:53:37,600
that drags close past the calendar.

1443
00:53:37,600 –> 00:53:39,200
And you track intervention rate.

1444
00:53:39,200 –> 00:53:41,520
How often did humans have to rewrite the rationale?

1445
00:53:41,520 –> 00:53:42,720
Not just approve the package.

1446
00:53:42,720 –> 00:53:44,240
Because if humans keep rewriting it,

1447
00:53:44,240 –> 00:53:46,000
you didn’t automate reconciliation.

1448
00:53:46,000 –> 00:53:47,680
You automated draft generation.
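The metrics above fall straight out of run records if the agent logs them. A sketch with hypothetical record shapes — the field names and dates are invented:

```python
from datetime import date

# Hypothetical exception records: when each was opened, whether it is
# resolved, and whether a human had to rewrite the agent's rationale
# (not merely approve the package).
exceptions = [
    {"opened": date(2024, 6, 1),  "resolved": True,  "rationale_rewritten": False},
    {"opened": date(2024, 6, 3),  "resolved": False, "rationale_rewritten": True},
    {"opened": date(2024, 6, 20), "resolved": False, "rationale_rewritten": False},
]
today = date(2024, 6, 30)

# Intervention rate: if humans keep rewriting, you automated draft generation.
intervention_rate = sum(e["rationale_rewritten"] for e in exceptions) / len(exceptions)

# Backlog aging: the long tail that drags close past the calendar.
open_ages = [(today - e["opened"]).days for e in exceptions if not e["resolved"]]

print(round(intervention_rate, 2), max(open_ages))  # 0.33 27
```

Tracking the rewrite flag separately from the approval flag is the design choice that makes the metric honest.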

1449
00:53:47,680 –> 00:53:50,080
And once you build evidence first,

1450
00:53:50,080 –> 00:53:51,600
you also get a hidden benefit.

1451
00:53:51,600 –> 00:53:52,960
Blast radius containment.

1452
00:53:52,960 –> 00:53:54,800
If every action is tied to a policy clause

1453
00:53:54,800 –> 00:53:56,080
and an approval state,

1454
00:53:56,080 –> 00:53:58,880
the system can’t quietly just post the entry.

1455
00:53:58,880 –> 00:54:00,480
It either has the authority and evidence

1456
00:54:00,480 –> 00:54:02,320
or it escalates with a complete package.

1457
00:54:02,320 –> 00:54:04,960
That’s the autonomy boundary, but finance flavoured.

1458
00:54:04,960 –> 00:54:07,600
And it’s the only version that survives audit season.

1459
00:54:07,600 –> 00:54:09,120
Scenario three, setup.

1460
00:54:09,120 –> 00:54:11,680
Security incident triage without SOC collapse.

1461
00:54:11,680 –> 00:54:14,240
Security is where autonomy stops being a throughput discussion

1462
00:54:14,240 –> 00:54:16,000
and becomes an adversarial one.

1463
00:54:16,000 –> 00:54:18,400
IT remediation fights entropy.

1464
00:54:18,400 –> 00:54:19,920
Finance fights scrutiny.

1465
00:54:19,920 –> 00:54:22,080
Security fights an opponent that adapts.

1466
00:54:22,080 –> 00:54:26,160
And that’s why SOC collapse is the most honest autonomy use case you can pick.

1467
00:54:26,160 –> 00:54:29,520
Because the baseline operating model is already broken in most enterprises,

1468
00:54:29,520 –> 00:54:32,080
alert volume grows faster than analyst headcount.

1469
00:54:32,080 –> 00:54:33,520
Fidelity stays mediocre.

1470
00:54:33,520 –> 00:54:37,200
And every new tool adds another stream of signals that mostly become noise.

1471
00:54:37,200 –> 00:54:40,160
So analysts spend their day routing, enriching,

1472
00:54:40,160 –> 00:54:42,960
and writing summaries that don’t prevent the next incident.

1473
00:54:42,960 –> 00:54:43,920
The queue doesn’t shrink.

1474
00:54:43,920 –> 00:54:45,280
It churns.

1475
00:54:45,280 –> 00:54:46,640
Defender produces alerts.

1476
00:54:46,640 –> 00:54:48,080
Sentinel produces incidents.

1477
00:54:48,080 –> 00:54:50,000
Identity produces risk events.

1478
00:54:50,000 –> 00:54:51,840
Endpoint telemetry produces anomalies.

1479
00:54:51,840 –> 00:54:53,280
Cloud produces activity logs.

1480
00:54:53,280 –> 00:54:54,800
None of those are inherently wrong.

1481
00:54:54,800 –> 00:54:57,120
The failure is the human bottleneck in the middle.

1482
00:54:57,120 –> 00:55:00,640
A small team forced to do correlation and enrichment manually

1483
00:55:00,640 –> 00:55:04,240
at the exact moment the environment requires speed and consistency.

1484
00:55:04,240 –> 00:55:06,640
So the baseline workflow looks like this.

1485
00:55:06,640 –> 00:55:10,320
Triage, enrich, correlate, decide, contain, document.

1486
00:55:10,320 –> 00:55:12,080
And everyone pretends it’s a linear process.

1487
00:55:12,080 –> 00:55:12,560
It isn’t.

1488
00:55:12,560 –> 00:55:13,360
It’s a loop.

1489
00:55:13,360 –> 00:55:16,400
Analysts bounce between portals, copy identifiers,

1490
00:55:16,400 –> 00:55:20,320
search for context, and rebuild the same mental model of what happened every time.

1491
00:55:20,320 –> 00:55:22,320
The attacker gets parallelism.

1492
00:55:22,320 –> 00:55:24,000
The defenders get a ticketing queue.

1493
00:55:24,000 –> 00:55:25,360
That asymmetry is the point.

1494
00:55:25,360 –> 00:55:27,600
So when people ask what autonomy is good for,

1495
00:55:27,600 –> 00:55:29,440
security has the cleanest answer.

1496
00:55:29,440 –> 00:55:32,400
Autonomy buys you parallelism under policy.

1497
00:55:32,400 –> 00:55:34,720
It lets you do the mechanical work at machine speed,

1498
00:55:34,720 –> 00:55:37,520
correlation, enrichment, scoping, and low risk containment.

1499
00:55:37,520 –> 00:55:42,080
So humans spend their limited attention on the weird cases that actually require judgment.

1500
00:55:42,080 –> 00:55:44,800
But the autonomy boundary here is brutally non-negotiable.

1501
00:55:44,800 –> 00:55:47,200
A security agent doesn’t get to improvise containment.

1502
00:55:47,200 –> 00:55:48,640
It doesn’t get to try something.

1503
00:55:48,640 –> 00:55:52,240
It doesn’t get to block identities or isolate devices because it feels right.

1504
00:55:52,240 –> 00:55:55,200
It acts only under policy with pre-approved actions,

1505
00:55:55,200 –> 00:55:58,720
bounded scopes, and evidence thresholds that are defined ahead of time.

1506
00:55:58,720 –> 00:56:01,280
Otherwise, you build the most dangerous thing possible.

1507
00:56:01,280 –> 00:56:04,320
An actor in your tenant with the power to disrupt business operations

1508
00:56:04,320 –> 00:56:06,960
guided by probabilistic reasoning during high stress.

1509
00:56:06,960 –> 00:56:10,160
So the agentic objective in this scenario is narrow by design.

1510
00:56:10,160 –> 00:56:12,640
Correlate alerts into coherent narratives.

1511
00:56:12,640 –> 00:56:15,760
Assess blast radius with real signals, not vibes.

1512
00:56:15,760 –> 00:56:19,440
Contain low to medium risk incidents where policy already defines the response

1513
00:56:19,440 –> 00:56:22,640
and generate investigation summaries that humans can trust and replay.

1514
00:56:22,640 –> 00:56:25,040
That means the agent becomes a triage engine

1515
00:56:25,040 –> 00:56:28,480
and a response executor for the boring repeatable cases.

1516
00:56:28,480 –> 00:56:31,600
Suspicious sign-ins with clear identity risk signals,

1517
00:56:31,600 –> 00:56:35,200
commodity malware on endpoints where isolation is already standard,

1518
00:56:35,200 –> 00:56:38,160
impossible travel combined with high-confidence phishing,

1519
00:56:38,160 –> 00:56:41,280
known bad tokens, known bad device posture.

1520
00:56:41,280 –> 00:56:44,640
It handles the class of incidents where humans currently waste time

1521
00:56:44,640 –> 00:56:47,120
doing the same steps and it escalates everything else.

1522
00:56:47,120 –> 00:56:50,320
Now the thing most people miss is that security autonomy fails first

1523
00:56:50,320 –> 00:56:52,080
when identity is an afterthought,

1524
00:56:52,080 –> 00:56:55,360
because containment is mostly identity and access control actions.

1525
00:56:55,360 –> 00:56:58,560
Revoke sessions, reset passwords, disable accounts,

1526
00:56:58,560 –> 00:57:01,040
block tokens, tighten conditional access,

1527
00:57:01,040 –> 00:57:03,040
remove risky app consent.

1528
00:57:03,040 –> 00:57:05,280
If you can’t express those actions as bounded,

1529
00:57:05,280 –> 00:57:08,480
auditable operations under explicit identity constraints,

1530
00:57:08,480 –> 00:57:10,080
you don’t have autonomous response.

1531
00:57:10,080 –> 00:57:11,440
You have automated self-harm.

1532
00:57:11,440 –> 00:57:13,600
So you need a hard boundary.

1533
00:57:13,600 –> 00:57:15,280
The agent can recommend broadly,

1534
00:57:15,280 –> 00:57:18,480
but it can only execute in the lanes you’ve made deterministic.

1535
00:57:18,480 –> 00:57:22,240
And the evidence requirement must be higher than “the model is confident.”

1536
00:57:22,240 –> 00:57:24,800
It has to be: these signals match this response class

1537
00:57:24,800 –> 00:57:27,280
under this policy clause within this scope.

1538
00:57:27,280 –> 00:57:29,200
And the payoff signal for the audience is simple.

1539
00:57:29,200 –> 00:57:31,200
The problem isn’t building the containment action.

1540
00:57:31,200 –> 00:57:32,480
Microsoft gives you actions.

1541
00:57:32,480 –> 00:57:35,600
The problem is deciding when the system is allowed to execute them,

1542
00:57:35,600 –> 00:57:38,400
under which identity and how you prove it didn’t overreach.

1543
00:57:38,400 –> 00:57:42,000
Because the SOC doesn’t get judged by how fast it can generate a summary,

1544
00:57:42,000 –> 00:57:44,000
it gets judged by whether it contained the right thing

1545
00:57:44,000 –> 00:57:45,120
without breaking the business.

1546
00:57:45,120 –> 00:57:49,040
So this scenario is where the autonomy stack becomes visibly real.

1547
00:57:49,040 –> 00:57:51,280
Event ingestion is alerts and incidents,

1548
00:57:51,280 –> 00:57:54,240
reasoning is correlation and classification under policy.

1549
00:57:54,240 –> 00:57:58,160
Orchestration is tool routing across Defender, Sentinel, and Entra,

1550
00:57:58,160 –> 00:58:00,560
action is containment with bounded permissions,

1551
00:58:00,560 –> 00:58:02,720
and evidence is the investigation record

1552
00:58:02,720 –> 00:58:05,280
that ties every step back to signals and policy.

1553
00:58:05,280 –> 00:58:08,000
And in the next section, we map it as an enforcement graph.

1554
00:58:08,000 –> 00:58:10,560
Defender detects, Sentinel correlates,

1555
00:58:10,560 –> 00:58:12,240
Entra enforces.

1556
00:58:12,240 –> 00:58:14,800
If those three aren’t wired into a coherent control plane,

1557
00:58:14,800 –> 00:58:17,760
autonomy won’t save the SOC, it will just accelerate the chaos.

1558
00:58:17,760 –> 00:58:21,520
Scenario three, system flow, Defender plus Sentinel

1559
00:58:21,520 –> 00:58:23,120
plus Entra as enforcement graph.

1560
00:58:23,120 –> 00:58:24,880
If scenario three is going to work,

1561
00:58:24,880 –> 00:58:26,560
it needs a real system flow.

1562
00:58:26,560 –> 00:58:28,160
Not “the agent checks Defender,”

1563
00:58:28,160 –> 00:58:29,680
not “it uses Sentinel.”

1564
00:58:29,680 –> 00:58:32,880
A flow where each product plays its actual role in the enterprise,

1565
00:58:32,880 –> 00:58:36,320
Defender as a signal source, Sentinel as correlation and case management,

1566
00:58:36,320 –> 00:58:40,400
Entra as the enforcement graph that turns decisions into bounded actions.

1567
00:58:40,400 –> 00:58:41,520
Start with ingestion.

1568
00:58:41,520 –> 00:58:46,880
Defender for Endpoint and Defender for Office 365 generate alerts with raw artifacts:

1569
00:58:46,880 –> 00:58:51,120
device IDs, user principals, process hashes, URLs, mailbox activity,

1570
00:58:51,120 –> 00:58:53,360
and whatever else the detection contains.

1571
00:58:53,360 –> 00:58:57,280
Sentinel ingests those alerts and also brings in everything defender doesn’t own.

1572
00:58:57,280 –> 00:59:00,480
Cloud activity logs, firewall events, identity risk events,

1573
00:59:00,480 –> 00:59:02,400
and third party sources if you have them.

1574
00:59:02,400 –> 00:59:05,040
The agent doesn’t treat this as “more data.”

1575
00:59:05,040 –> 00:59:06,800
It treats it as a graph problem,

1576
00:59:06,800 –> 00:59:09,280
which entities are involved, what relationships exist,

1577
00:59:09,280 –> 00:59:10,640
and what changed recently.

1578
00:59:10,640 –> 00:59:13,360
So the first move in the flow is normalization into entities.

1579
00:59:13,360 –> 00:59:16,640
User, device, app, mailbox, IP, token, session, tenant, resource.

1580
00:59:16,640 –> 00:59:19,200
If the system can’t map the alert to entities,

1581
00:59:19,200 –> 00:59:20,720
it should not execute anything.

1582
00:59:20,720 –> 00:59:23,840
It should escalate for human triage because it can’t bound scope.

1583
00:59:23,840 –> 00:59:25,840
Containment without scope is just disruption.
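That rule — no entities mapped means no execution — is easy to make mechanical. A minimal sketch; the entity keys mirror the list above, and the alert field names are assumptions:

```python
# Entity keys the normalizer looks for; mirrors the entity list above.
ENTITY_KEYS = ("user", "device", "app", "mailbox", "ip",
               "token", "session", "tenant", "resource")

def normalize(alert: dict) -> dict:
    """Extract only the entities the alert actually names."""
    return {k: alert[k] for k in ENTITY_KEYS if alert.get(k)}

def triage(alert: dict) -> str:
    entities = normalize(alert)
    if not entities:
        # No entities means no bounded scope. Containment without
        # scope is just disruption, so this goes to a human.
        return "escalate_human_triage"
    return "eligible_for_policy_check"

print(triage({"user": "alice@contoso.example", "device": "DEV-123"}))
print(triage({"raw_payload": "opaque blob"}))
```

The gate runs before any policy check, so an alert that can’t be scoped never even reaches the execution decision.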

1584
00:59:25,840 –> 00:59:30,080
Then comes reasoning: correlation and blast radius estimation.

1585
00:59:30,080 –> 00:59:31,600
This is where Sentinel earns its role.

1586
00:59:31,600 –> 00:59:34,400
Sentinel already builds incidents and correlates signals.

1587
00:59:34,400 –> 00:59:36,800
The agent’s job is to query that correlation layer,

1588
00:59:36,800 –> 00:59:38,480
not to reinvent it with reasoning.

1589
00:59:38,480 –> 00:59:41,040
It should pull the incident graph,

1590
00:59:41,040 –> 00:59:45,040
related alerts, linked entities, timeline, known tactics,

1591
00:59:45,040 –> 00:59:46,640
and severity context.

1592
00:59:46,640 –> 00:59:48,960
Then it applies an execution contract decision.

1593
00:59:48,960 –> 00:59:52,000
Does this incident class have an approved autonomous response path?

1594
00:59:52,000 –> 00:59:54,480
That decision is not a vibe check, it’s policy.

1595
00:59:54,480 –> 00:59:59,840
Low to medium risk classes with clear response playbooks can be eligible.

1596
00:59:59,840 –> 01:00:02,800
Revoke sessions for a confirmed risky sign-in,

1597
01:00:02,800 –> 01:00:05,680
isolate a device for a high-confidence malware alert,

1598
01:00:05,680 –> 01:00:09,120
block a known malicious URL through your existing controls,

1599
01:00:09,120 –> 01:00:12,960
disable a specific OAuth consent that matches a known bad pattern.

1600
01:00:12,960 –> 01:00:17,440
High risk or ambiguous cases get escalated with a complete evidence bundle.
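The execution-contract decision — does this incident class have an approved autonomous response path — is a lookup against policy, not a confidence score. A sketch; every class name and action label here is invented for illustration:

```python
# Pre-approved response paths per incident class. Anything absent from
# this table escalates with its evidence bundle. All names illustrative.
APPROVED_RESPONSES = {
    "confirmed_risky_signin": "revoke_sessions",
    "high_confidence_malware": "isolate_device",
    "known_malicious_url": "block_url",
    "known_bad_oauth_consent": "disable_app_consent",
}

def decide(incident_class: str, severity: str):
    if severity in ("high", "critical"):
        return ("escalate", "evidence_bundle")   # never autonomous
    action = APPROVED_RESPONSES.get(incident_class)
    if action is None:
        return ("escalate", "evidence_bundle")   # ambiguous class
    return ("execute", action)                    # policy, not a vibe check

print(decide("confirmed_risky_signin", "medium"))
print(decide("novel_lateral_movement", "medium"))
```

Because the table is the boundary, widening autonomy is a reviewed policy change rather than a prompt tweak.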

1601
01:00:17,440 –> 01:00:19,760
Now orchestration tool routing.

1602
01:00:19,760 –> 01:00:23,680
This is the part that separates agent as chat from agent as system.

1603
01:00:23,680 –> 01:00:27,680
The agent routes work across a set of tools that already exist.

1604
01:00:27,680 –> 01:00:30,480
Defender APIs for endpoint and email actions,

1605
01:00:30,480 –> 01:00:33,440
Sentinel automation rules or playbooks for workflow,

1606
01:00:33,440 –> 01:00:35,280
Entra for identity enforcement,

1607
01:00:35,280 –> 01:00:37,600
and Graph for communications and ticketing.

1608
01:00:37,600 –> 01:00:39,920
The key is that orchestration must be deterministic

1609
01:00:39,920 –> 01:00:42,560
about which tool is authoritative for which action.

1610
01:00:42,560 –> 01:00:45,280
You don’t revoke sessions through a random connector

1611
01:00:45,280 –> 01:00:46,880
if Entra is the enforcement point.

1612
01:00:46,880 –> 01:00:49,280
You don’t isolate devices through a custom script

1613
01:00:49,280 –> 01:00:52,720
if Defender already provides the actuator and the audit trail.

1614
01:00:52,720 –> 01:00:54,960
Orchestration chooses the canonical actuator

1615
01:00:54,960 –> 01:00:57,120
because that’s how you get predictable logs

1616
01:00:57,120 –> 01:00:58,480
and predictable rollback.
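Choosing the canonical actuator can be a literal table: one authoritative tool per action class, and a refusal for anything unregistered. The mapping below is an illustration of the idea, not an API reference:

```python
# One authoritative actuator per action class, so logs and rollback
# are predictable. The mapping is illustrative.
CANONICAL_ACTUATOR = {
    "revoke_sessions": "entra",
    "isolate_device": "defender",
    "update_incident": "sentinel",
    "notify_soc_channel": "graph",
}

def actuator_for(action: str) -> str:
    try:
        return CANONICAL_ACTUATOR[action]
    except KeyError:
        # No canonical actuator registered: refuse rather than improvise
        # through a random connector or a custom script.
        raise ValueError(f"no canonical actuator for {action!r}") from None

print(actuator_for("isolate_device"))  # defender
```

The refusal branch is the point: an action with no registered actuator fails closed instead of finding an uncontrolled path.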

1617
01:00:58,480 –> 01:01:01,040
Then we hit action and action should come in two tiers.

1618
01:01:01,040 –> 01:01:02,880
Containment and coordination.

1619
01:01:02,880 –> 01:01:05,760
Containment actions are the hard ones, session revoke,

1620
01:01:05,760 –> 01:01:09,920
password reset initiation, user disablement in narrow conditions,

1621
01:01:09,920 –> 01:01:13,680
device isolation, token blocking, OAuth app consent removal

1622
01:01:13,680 –> 01:01:16,000
and conditional access response patterns.

1623
01:01:16,000 –> 01:01:19,040
Coordination actions are everything that keeps humans aligned.

1624
01:01:19,040 –> 01:01:21,120
Create or update the Sentinel incident,

1625
01:01:21,120 –> 01:01:23,600
open the ITSM ticket if that’s your process,

1626
01:01:23,600 –> 01:01:25,680
notify the SOC channel in Teams,

1627
01:01:25,680 –> 01:01:28,240
and ping an on-call human only when thresholds say

1628
01:01:28,240 –> 01:01:29,840
the agent can’t close the loop.

1629
01:01:29,840 –> 01:01:32,240
Now the enforcement graph, Entra as the choke point,

1630
01:01:32,240 –> 01:01:34,320
this is where people get comfortable and then get hurt.

1631
01:01:34,320 –> 01:01:37,200
They treat Entra as identity, meaning login and users.

1632
01:01:37,200 –> 01:01:40,400
In reality, it is the decision engine for access across the tenant.

1633
01:01:40,400 –> 01:01:42,080
When the agent takes action, it should do it

1634
01:01:42,080 –> 01:01:43,760
through Entra controlled mechanisms,

1635
01:01:43,760 –> 01:01:47,200
revoking sessions, blocking sign-ins through conditional access,

1636
01:01:47,200 –> 01:01:50,400
where appropriate, adjusting entitlements through scoped roles

1637
01:01:50,400 –> 01:01:53,600
and ensuring the agent identity itself remains constrained.

1638
01:01:53,600 –> 01:01:56,160
And every action must run as a non-human principal

1639
01:01:56,160 –> 01:01:58,320
with explicit permissions, not global admin,

1640
01:01:58,320 –> 01:02:00,880
not security administrator because it was easier.

1641
01:02:00,880 –> 01:02:03,520
The system should have separate execution identities

1642
01:02:03,520 –> 01:02:05,280
for separate action classes,

1643
01:02:05,280 –> 01:02:07,440
because the moment one identity can do everything,

1644
01:02:07,440 –> 01:02:09,760
the blast radius becomes the entire tenant.

1645
01:02:09,760 –> 01:02:12,160
Again, worm mechanics, just in a blazer.
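
A minimal sketch of what separate execution identities per action class looks like; the service principal names and action sets are invented for illustration:

```python
# Sketch: one execution identity per action class, so no single principal
# can do everything and blast radius stays bounded. Names are hypothetical.

IDENTITY_FOR_CLASS = {
    "containment": "agent-containment-sp",    # session revoke, device isolation
    "coordination": "agent-coordination-sp",  # tickets, notifications
}

ALLOWED_ACTIONS = {
    "agent-containment-sp": {"revoke_session", "isolate_device"},
    "agent-coordination-sp": {"update_incident", "notify_soc"},
}

def identity_may(identity: str, action: str) -> bool:
    """True only if this execution identity's class covers the action."""
    return action in ALLOWED_ACTIONS.get(identity, set())
```

If an action shows up in both sets, that is a design smell: the whole value of the split is that compromising one identity never yields the full tenant.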

1646
01:02:12,160 –> 01:02:13,280
Finally, evidence.

1647
01:02:13,280 –> 01:02:16,720
Every run produces a replayable record.

1648
01:02:16,720 –> 01:02:19,360
The alert IDs, incident IDs, entity graph,

1649
01:02:19,360 –> 01:02:21,520
the policy clause that authorised the action,

1650
01:02:21,520 –> 01:02:24,240
the exact tool calls, the parameters, the identity used,

1651
01:02:24,240 –> 01:02:27,360
the verification checks, and the final state change.

1652
01:02:27,360 –> 01:02:28,960
And verification matters here.

1653
01:02:28,960 –> 01:02:32,720
Session revoked, confirmed, device isolation state confirmed,

1654
01:02:32,720 –> 01:02:36,400
sign-in risk reduced, confirmed, incident status updated, confirmed.

1655
01:02:36,400 –> 01:02:40,160
If verification fails, the system doesn’t try harder indefinitely.

1656
01:02:40,160 –> 01:02:42,640
It escalates with the evidence bundle and it stops.
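
That verify-then-stop rule can be sketched concretely. This is a minimal illustration, not a real orchestration framework; the callables stand in for whatever actuator, check, and escalation path you actually wire up:

```python
# Sketch of the verify-or-escalate loop: run the action, run its declared
# verification, and on repeated failure escalate with an evidence bundle
# and STOP, rather than retrying indefinitely. All names are illustrative.

def run_action(action, verify, escalate, max_retries=2):
    for attempt in range(1, max_retries + 1):
        result = action()
        if verify(result):
            return {"status": "verified", "attempts": attempt, "result": result}
    # Verification never passed: hand the evidence to a human and halt.
    evidence = {"action": getattr(action, "__name__", "action"),
                "attempts": max_retries}
    escalate(evidence)
    return {"status": "escalated", "attempts": max_retries, "evidence": evidence}
```

The design choice worth noting: the escalation path receives structured evidence, not a prose apology, which is what makes the later run-ledger section possible.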

1657
01:02:42,640 –> 01:02:45,840
So the system flow is simple to say, but hard to implement cleanly.

1658
01:02:45,840 –> 01:02:48,160
Defender detects, Sentinel correlates,

1659
01:02:48,160 –> 01:02:52,480
Entra enforces, and the agent sits in the middle as an orchestrator under contract.

1660
01:02:52,480 –> 01:02:54,880
If you can’t draw that graph and name the boundaries,

1661
01:02:54,880 –> 01:02:56,320
you don’t have autonomous triage,

1662
01:02:56,320 –> 01:02:58,560
you have conditional chaos with security branding.

1663
01:02:58,560 –> 01:03:03,440
The limiting factor, identity debt and authorisation sprawl.

1664
01:03:03,440 –> 01:03:06,720
All three scenarios hit the same wall and it’s not model quality,

1665
01:03:06,720 –> 01:03:09,600
it’s not agent memory, it’s not orchestration patterns,

1666
01:03:09,600 –> 01:03:12,800
it’s identity debt. Identity debt is the inevitable accumulation

1667
01:03:12,800 –> 01:03:14,720
of non-human operators and entitlements

1668
01:03:14,720 –> 01:03:16,960
that your organization cannot explain anymore,

1669
01:03:16,960 –> 01:03:18,880
but still depends on to function.

1670
01:03:18,880 –> 01:03:22,080
Service principals, managed identities, app registrations,

1671
01:03:22,080 –> 01:03:24,240
connector identities, delegated permissions,

1672
01:03:24,240 –> 01:03:26,960
certificates, secrets, conditional access exceptions,

1673
01:03:26,960 –> 01:03:30,960
break glass accounts, temporary admin roles that never got removed.

1674
01:03:30,960 –> 01:03:34,000
This clicked for a lot of architects when agents showed up

1675
01:03:34,000 –> 01:03:35,840
because agents don’t just consume permissions,

1676
01:03:35,840 –> 01:03:37,040
they operationalize them.

1677
01:03:37,040 –> 01:03:39,920
A human with broad access is a risk,

1678
01:03:39,920 –> 01:03:43,440
but it’s a bounded risk: attention, fatigue, and work hours

1679
01:03:43,440 –> 01:03:44,720
limit blast radius.

1680
01:03:44,720 –> 01:03:47,520
An autonomous executor with broad access is different,

1681
01:03:47,520 –> 01:03:50,560
it can apply that access continuously in parallel

1682
01:03:50,560 –> 01:03:53,680
and without the psychological friction that makes humans hesitate,

1683
01:03:53,680 –> 01:03:56,480
so identity debt is not accidental, it is guaranteed.

1684
01:03:56,480 –> 01:04:00,400
Autonomy makes it visible because it forces you to name the actor.

1685
01:04:00,400 –> 01:04:03,760
Every time you build an agent that does things,

1686
01:04:03,760 –> 01:04:05,280
you must pick an identity,

1687
01:04:05,280 –> 01:04:08,080
and every identity you add expands the authorisation graph,

1688
01:04:08,080 –> 01:04:11,280
new assignments, new scopes, new conditional logic, new exceptions,

1689
01:04:11,280 –> 01:04:12,720
these pathways accumulate.

1690
01:04:13,360 –> 01:04:15,440
This is the foundational misunderstanding.

1691
01:04:15,440 –> 01:04:18,560
Most organizations still treat Entra as an identity provider.

1692
01:04:18,560 –> 01:04:19,520
They are wrong.

1693
01:04:19,520 –> 01:04:22,960
In architectural terms, Entra is a distributed decision engine.

1694
01:04:22,960 –> 01:04:25,600
It continuously compiles policy, role assignments,

1695
01:04:25,600 –> 01:04:27,840
device posture, risk signals, token claims,

1696
01:04:27,840 –> 01:04:30,960
and application constraints into real-time authorisation outcomes.

1697
01:04:30,960 –> 01:04:32,480
And once you introduce agents,

1698
01:04:32,480 –> 01:04:35,440
you’re feeding that engine a new species of principal,

1699
01:04:35,440 –> 01:04:38,400
non-human actors that behave like staff but scale like software.

1700
01:04:38,400 –> 01:04:41,280
That distinction matters because the enterprise typically governs

1701
01:04:41,280 –> 01:04:43,680
human identities with social process.

1702
01:04:43,680 –> 01:04:47,200
Onboarding, role changes, manager approvals, quarterly reviews.

1703
01:04:47,200 –> 01:04:50,640
It governs app identities with whatever happened during the project.

1704
01:04:50,640 –> 01:04:52,160
That’s where identity debt comes from,

1705
01:04:52,160 –> 01:04:54,800
not misconfiguration, design omission.

1706
01:04:54,800 –> 01:04:57,600
Now add authorisation sprawl.

1707
01:04:57,600 –> 01:04:59,760
Autonomous work is rarely one permission.

1708
01:04:59,760 –> 01:05:00,960
It’s a multi-step chain,

1709
01:05:00,960 –> 01:05:04,560
read telemetry, update a ticket, pull a file, call an API,

1710
01:05:04,560 –> 01:05:08,000
write a config change, post a notification, verify health.

1711
01:05:08,000 –> 01:05:09,520
Each step has a permission surface,

1712
01:05:09,520 –> 01:05:11,680
and you have to grant enough capability for the agent

1713
01:05:11,680 –> 01:05:12,960
to complete the chain.

1714
01:05:12,960 –> 01:05:16,400
Over time, the safest path becomes just give it a bigger role.

1715
01:05:16,400 –> 01:05:18,400
And that’s where RBAC starts lying to you.

1716
01:05:18,400 –> 01:05:22,240
RBAC roles tend to be static bundles designed around human job functions.

1717
01:05:22,240 –> 01:05:23,680
Agents don’t have job functions.

1718
01:05:23,680 –> 01:05:26,480
They have task graphs. A task graph crosses roles.

1719
01:05:26,480 –> 01:05:28,960
It crosses systems, it crosses environments.

1720
01:05:28,960 –> 01:05:32,560
It also changes over time because the easiest way to evolve an agent

1721
01:05:32,560 –> 01:05:35,360
is to add one more tool and one more action.

1722
01:05:35,360 –> 01:05:38,800
So you end up with a mismatch, static roles versus dynamic execution.

1723
01:05:38,800 –> 01:05:41,760
The organisation tries to solve that mismatch with exceptions.

1724
01:05:41,760 –> 01:05:45,120
Conditional access excludes the agent because the run broke.

1725
01:05:45,120 –> 01:05:47,120
A resource group gets a broader role assignment

1726
01:05:47,120 –> 01:05:49,120
because a remediation failed at 2am.

1727
01:05:49,120 –> 01:05:52,560
A connector gets tenant-wide read because a dataset wasn’t available.

1728
01:05:52,560 –> 01:05:54,000
Each exception feels small.

1729
01:05:54,000 –> 01:05:56,000
Each exception is an entropy generator.

1730
01:05:56,000 –> 01:05:58,160
And the real danger isn’t the obvious gap.

1731
01:05:58,160 –> 01:06:01,360
It’s the ambiguity you create when the same agent behaves differently

1732
01:06:01,360 –> 01:06:04,160
across context because policy drift has accumulated.

1733
01:06:04,160 –> 01:06:07,120
Deterministic intent becomes probabilistic behaviour.

1734
01:06:07,120 –> 01:06:10,240
You can’t predict what the agent can do anymore because the authorisation graph

1735
01:06:10,240 –> 01:06:12,560
has become a patchwork of historical compromises.

1736
01:06:12,560 –> 01:06:16,000
This is why identity debt unwinds slower than it accrues.

1737
01:06:16,000 –> 01:06:20,720
It accrues at project speed, one sprint, one fix, one temporary permission.

1738
01:06:20,720 –> 01:06:25,120
It unwinds at audit speed, inventory, review, re-approval, remediation

1739
01:06:25,120 –> 01:06:29,360
and political negotiation with every team that depends on the thing you’re trying to remove.

1740
01:06:29,360 –> 01:06:32,240
And in an agentic enterprise, the identities don’t just sit there.

1741
01:06:32,240 –> 01:06:35,040
They execute, they touch data, they change state,

1742
01:06:35,040 –> 01:06:39,360
they create evidence trails that ironically prove the access is being used

1743
01:06:39,360 –> 01:06:42,320
which makes it harder to decommission because now it’s critical.

1744
01:06:42,320 –> 01:06:46,000
So the limiting factor in autonomy isn’t whether the agent can plan.

1745
01:06:46,000 –> 01:06:50,000
It’s whether you can constrain execution without collapsing the workflow.

1746
01:06:50,000 –> 01:06:55,200
If you can’t express least privilege as an execution contract that maps to actual entitlements,

1747
01:06:55,200 –> 01:06:58,160
the agent either fails constantly so people widen permissions

1748
01:06:58,160 –> 01:06:59,760
or it succeeds unsafely.

1749
01:06:59,760 –> 01:07:02,560
So you accumulate risk until something breaks loudly.

1750
01:07:02,560 –> 01:07:03,920
That’s the identity debt trap.

1751
01:07:03,920 –> 01:07:06,880
Either you accept failure and keep humans in the loop forever

1752
01:07:06,880 –> 01:07:09,440
or you accept sprawl and pretend you can govern it later.

1753
01:07:09,440 –> 01:07:10,400
You can’t.

1754
01:07:10,400 –> 01:07:12,560
So when someone asks, what does autonomy really change?

1755
01:07:12,560 –> 01:07:17,120
The honest answer is this, it forces the enterprise to operationalize the autonomy boundary

1756
01:07:17,120 –> 01:07:21,200
as an identity and authorization problem, not a UX problem, not a chat problem.

1757
01:07:21,200 –> 01:07:24,960
And once you see that, the next limiting factor becomes obvious.

1758
01:07:24,960 –> 01:07:30,800
Tool access is the new perimeter and MCP makes that perimeter easier to adopt and easier to lose control of.

1759
01:07:30,800 –> 01:07:33,840
MCP and tool access, one protocol, many new ways to fail.

1760
01:07:33,840 –> 01:07:36,240
MCP is going to feel like progress because it is.

1761
01:07:36,240 –> 01:07:37,680
It standardizes tool access.

1762
01:07:37,680 –> 01:07:42,080
It makes “agent can call a tool” stop being a bespoke integration project.

1763
01:07:42,080 –> 01:07:44,400
It turns every SaaS system, every internal service,

1764
01:07:44,400 –> 01:07:48,400
every local capability into something an agent runtime can discover and invoke

1765
01:07:48,400 –> 01:07:51,920
without your developers reinventing glue code for the thousandth time.

1766
01:07:51,920 –> 01:07:53,040
And that’s the trap.

1767
01:07:53,040 –> 01:07:54,960
Standardization doesn’t reduce risk.

1768
01:07:54,960 –> 01:07:56,240
It reduces friction.

1769
01:07:56,240 –> 01:07:59,760
Risk scales with adoption and MCP is designed to accelerate adoption.

1770
01:07:59,760 –> 01:08:05,360
So if you treat MCP as just a protocol, you will wake up with a tool surface area that outgrew your governance model.

1771
01:08:05,360 –> 01:08:09,200
Here is the failure mode to anchor on because it’s the one that will actually happen.

1772
01:08:09,200 –> 01:08:13,040
An agent accidentally gains the ability to delete what it should only read,

1773
01:08:13,040 –> 01:08:16,240
not because someone flipped an evil setting, because tool scopes drift,

1774
01:08:16,240 –> 01:08:19,600
because a connector gets reused, because a server gets upgraded,

1775
01:08:19,600 –> 01:08:22,880
because someone adds one method to solve a legitimate business need.

1776
01:08:22,880 –> 01:08:26,560
And the permission model doesn’t force reauthorization with the same seriousness

1777
01:08:26,560 –> 01:08:28,080
as adding a new human admin.

1778
01:08:28,080 –> 01:08:31,920
MCP makes tool capabilities composable. Composability is how you get outcomes.

1779
01:08:31,920 –> 01:08:36,000
Composability is also how you get privilege escalation with a paper trail.

1780
01:08:36,000 –> 01:08:41,360
The thing most people miss is that MCP collapses the psychological boundary between data access

1781
01:08:41,360 –> 01:08:43,120
and action execution.

1782
01:08:43,120 –> 01:08:47,440
In a pre-agent world, a connector that reads SharePoint feels like a data integration.

1783
01:08:47,440 –> 01:08:51,040
A connector that changes Entra roles feels like administration.

1784
01:08:51,040 –> 01:08:53,520
Different teams, different approvals, different audits.

1785
01:08:53,520 –> 01:08:56,080
MCP puts them in the same shape, a tool call.

1786
01:08:56,080 –> 01:09:00,400
That distinction matters because your organization’s current control model relies on friction.

1787
01:09:00,400 –> 01:09:03,440
Separate portals, separate owners, separate change boards.

1788
01:09:03,440 –> 01:09:08,480
MCP removes that friction, therefore your design has to replace it with enforceable intent.

1789
01:09:08,480 –> 01:09:09,600
So what breaks first?

1790
01:09:09,600 –> 01:09:10,480
Tools sprawl.

1791
01:09:10,480 –> 01:09:12,720
Every product team will ship an MCP server.

1792
01:09:12,720 –> 01:09:14,480
Every vendor will ship an MCP server.

1793
01:09:14,480 –> 01:09:18,240
Every internal platform team will expose helpful MCP endpoints,

1794
01:09:18,240 –> 01:09:19,920
because it’s easier than building a UI.

1795
01:09:19,920 –> 01:09:22,880
And suddenly your agent runtime isn’t talking to five systems,

1796
01:09:22,880 –> 01:09:24,400
it’s talking to 50.

1797
01:09:24,400 –> 01:09:29,600
And each one comes with its own auth model, its own scope semantics, its own notion of read,

1798
01:09:29,600 –> 01:09:31,280
and its own logging quality.

1799
01:09:31,280 –> 01:09:32,640
That is not interoperability.

1800
01:09:32,640 –> 01:09:34,640
That is an authorization expansion pack.

1801
01:09:34,640 –> 01:09:38,480
Then you get entitlement multiplication, a single business workflow that used to require

1802
01:09:38,480 –> 01:09:43,120
one person with three roles now requires an agent identity with tool access across multiple servers.

1803
01:09:43,120 –> 01:09:47,360
Each server wants credentials, tokens, delegated permissions, app roles,

1804
01:09:47,360 –> 01:09:50,880
secrets, certificates, managed identities, pick your poison.

1805
01:09:50,880 –> 01:09:53,280
And because agents are expected to work end to end,

1806
01:09:53,280 –> 01:09:57,600
the easiest path is to grant broad access so the workflow doesn’t get stuck.

1807
01:09:57,600 –> 01:10:01,760
That’s how delete permissions show up in a read scenario, not maliciously, inevitably.

1808
01:10:01,760 –> 01:10:06,640
So you need two separate control concepts and enterprises keep blending them until nothing is

1809
01:10:06,640 –> 01:10:09,360
controlled. Discovery is not authorization.

1810
01:10:09,360 –> 01:10:13,440
A registry that lets an agent find MCP servers is not a permission system.

1811
01:10:13,440 –> 01:10:14,480
It’s an index.

1812
01:10:14,480 –> 01:10:17,600
It answers what exists, not what is allowed.

1813
01:10:17,600 –> 01:10:21,520
If you confuse those, you’ve built an ecosystem where “it showed up in the registry”

1814
01:10:21,520 –> 01:10:24,000
becomes the justification for “the agent used it.”

1815
01:10:24,000 –> 01:10:25,200
That’s backwards.

1816
01:10:25,200 –> 01:10:27,040
Authorization must be explicit.

1817
01:10:27,040 –> 01:10:30,240
Per agent identity, per tool, per method, per scope,

1818
01:10:30,240 –> 01:10:32,560
with evidence requirements that can be audited.

1819
01:10:32,560 –> 01:10:35,440
And the allowlist has to be enforced in the control plane,

1820
01:10:35,440 –> 01:10:37,680
not politely suggested in runtime code.

1821
01:10:37,680 –> 01:10:40,880
Because runtime code drifts, control planes are supposed to be the thing that doesn’t.
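
The discovery-versus-authorization split above can be made concrete in a few lines. The registry, grant tuples, and names are all hypothetical, a sketch of the shape rather than any real MCP control plane:

```python
# Sketch: discovery answers "what exists"; authorization answers "what is
# allowed" per agent identity, per tool, per method, per scope. Being in
# the registry is never sufficient on its own. All names are illustrative.

REGISTRY = {"sharepoint-mcp", "entra-mcp", "itsm-mcp"}  # discovery: an index

GRANTS = {
    ("triage-agent", "sharepoint-mcp", "read_file", "site:finance"),
    ("triage-agent", "itsm-mcp", "update_ticket", "queue:soc"),
}

def authorized(agent: str, server: str, method: str, scope: str) -> bool:
    """Explicit allowlist check: registry membership AND a matching grant."""
    return server in REGISTRY and (agent, server, method, scope) in GRANTS
```

Note the default: an unknown combination is denied. Deny-by-default is the property runtime code tends to lose and control planes are supposed to keep.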

1822
01:10:40,880 –> 01:10:43,920
Now add versioning, because MCP servers won’t sit still.

1823
01:10:43,920 –> 01:10:47,840
Servers get new capabilities, methods get renamed, default scopes get widened

1824
01:10:47,840 –> 01:10:50,000
because a vendor wants fewer support tickets.

1825
01:10:50,000 –> 01:10:52,800
Breaking changes don’t always break the integration.

1826
01:10:52,800 –> 01:10:54,880
Sometimes they break your safety assumptions.

1827
01:10:54,880 –> 01:10:57,120
That’s why tool allowlisting can’t be

1828
01:10:57,120 –> 01:10:58,800
“this server is approved.”

1829
01:10:58,800 –> 01:11:01,760
It has to be this server, this version,

1830
01:11:01,760 –> 01:11:04,320
these methods, these scopes, in this environment,

1831
01:11:04,320 –> 01:11:05,920
anything else is trust by branding.
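
A pinned allowlist entry of that shape, server, version, methods, scopes, environment, can be sketched as a small frozen record; field names are illustrative:

```python
# Sketch of a version-pinned tool approval: the approval names the server,
# the exact version, the methods, the scopes, and the environment. A server
# upgrade that widens capabilities falls outside the approval and must be
# re-approved. Field names are illustrative, not a real MCP schema.

from dataclasses import dataclass

@dataclass(frozen=True)
class ToolApproval:
    server: str
    version: str           # pinned, never "latest"
    methods: frozenset
    scopes: frozenset
    environment: str

def call_allowed(approval, server, version, method, scope, env) -> bool:
    """A call is allowed only if every pinned dimension matches exactly."""
    return (approval.server == server
            and approval.version == version
            and method in approval.methods
            and scope in approval.scopes
            and approval.environment == env)
```

The frozen dataclass is deliberate: an approval object that code can mutate at runtime is just drift with extra steps.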

1832
01:11:05,920 –> 01:11:10,480
And the ugly part is that MCP encourages exactly the behavior that creates drift.

1833
01:11:10,480 –> 01:11:11,680
Rapid composition.

1834
01:11:11,680 –> 01:11:13,920
You build an agent, you add a tool, you get a win, you ship.

1835
01:11:13,920 –> 01:11:16,320
Over time, the tool graph becomes the real perimeter,

1836
01:11:16,320 –> 01:11:18,960
because it defines what the agent can touch.

1837
01:11:18,960 –> 01:11:22,000
So MCP doesn’t replace identity debt, it accelerates it.

1838
01:11:22,000 –> 01:11:25,600
Every MCP server you add is another place where entitlements can sprawl

1839
01:11:25,600 –> 01:11:27,520
another place where evidence can be lost,

1840
01:11:27,520 –> 01:11:32,640
another place where temporary access becomes permanent because it unblocked the workflow.

1841
01:11:32,640 –> 01:11:35,040
And yes, Microsoft is leaning into MCP hard,

1842
01:11:35,040 –> 01:11:37,840
Teams AI library, agent platforms, Windows registries,

1843
01:11:37,840 –> 01:11:39,840
that’s not a warning that MCP is bad.

1844
01:11:39,840 –> 01:11:42,080
It’s a warning that MCP will be everywhere,

1845
01:11:42,080 –> 01:11:46,160
therefore your enterprise needs to treat tool access like production infrastructure.

1846
01:11:46,160 –> 01:11:49,040
Because in an autonomous enterprise, tools are actuators,

1847
01:11:49,040 –> 01:11:51,200
and actuators are weapons if you don’t constrain them.

1848
01:11:51,200 –> 01:11:53,840
So if you remember one rule from this section, make it this.

1849
01:11:53,840 –> 01:11:56,080
MCP makes action cheap.

1850
01:11:56,080 –> 01:11:58,640
Governance has to make unsafe action impossible,

1851
01:11:58,640 –> 01:12:00,880
otherwise you didn’t build an agent platform.

1852
01:12:00,880 –> 01:12:04,320
You build a fast path to conditional chaos with standardized APIs.

1853
01:12:04,320 –> 01:12:08,320
Observability and replayability, the only cure for “the agent said so.”

1854
01:12:08,320 –> 01:12:12,960
MCP makes action cheap; that means the enterprise has to make accountability unavoidable.

1855
01:12:12,960 –> 01:12:16,640
Because once agents start acting, the failure mode isn’t the model was wrong.

1856
01:12:16,640 –> 01:12:20,560
The failure mode is that nobody can prove what happened in what order,

1857
01:12:20,560 –> 01:12:22,720
under which permissions, and based on which inputs.

1858
01:12:22,720 –> 01:12:25,280
That’s how you end up in the worst possible incident review,

1859
01:12:25,280 –> 01:12:29,840
a room full of senior people reconstructing reality from screenshots and vibes.

1860
01:12:29,840 –> 01:12:33,520
Without observability, autonomy degenerates into “the agent said so.”

1861
01:12:33,520 –> 01:12:36,560
“The agent said so” is not evidence, it’s a resignation letter.

1862
01:12:36,560 –> 01:12:40,560
So the core requirement for an autonomous enterprise is not better prompting.

1863
01:12:40,560 –> 01:12:43,840
It’s a telemetry model that treats every run like a production change,

1864
01:12:43,840 –> 01:12:45,840
recorded, attributable, and replayable.

1865
01:12:45,840 –> 01:12:48,560
Start with what has to be captured, not optionally,

1866
01:12:48,560 –> 01:12:52,000
but by contract. Inputs: the event payloads, ticket fields, alert IDs,

1867
01:12:52,000 –> 01:12:56,320
file versions, data extracts, and the exact prompt instructions that shape decisions.

1868
01:12:56,320 –> 01:13:00,000
If the agent used a SharePoint file, you need the file identity and version.

1869
01:13:00,000 –> 01:13:03,040
If it used a Sentinel incident, you need the incident ID,

1870
01:13:03,040 –> 01:13:05,200
and the related entity graph snapshot.

1871
01:13:05,200 –> 01:13:10,080
If it used work data, you need the scope that defined what work data meant at that moment.

1872
01:13:10,080 –> 01:13:13,760
Then decisions, the branching points, what class did it assign the incident to?

1873
01:13:13,760 –> 01:13:15,200
Which policy clause did it map to?

1874
01:13:15,200 –> 01:13:17,760
Which confidence threshold did it claim it met?

1875
01:13:17,760 –> 01:13:19,680
Which evidence requirement did it satisfy?

1876
01:13:19,680 –> 01:13:20,160
And how?

1877
01:13:20,160 –> 01:13:23,440
The thing most people miss is that decisions are more important than outputs.

1878
01:13:23,440 –> 01:13:24,880
Outputs are easy to store.

1879
01:13:24,880 –> 01:13:26,880
Decisions are where accountability lives.

1880
01:13:26,880 –> 01:13:29,840
Then tool calls: every tool invocation with parameters.

1881
01:13:29,840 –> 01:13:30,880
Which API endpoint?

1882
01:13:30,880 –> 01:13:31,520
Which method?

1883
01:13:31,520 –> 01:13:32,080
Which scope?

1884
01:13:32,080 –> 01:13:32,880
Which identity?

1885
01:13:32,880 –> 01:13:34,240
Which resource IDs?

1886
01:13:34,240 –> 01:13:35,120
And the response?

1887
01:13:35,120 –> 01:13:37,040
If an agent restarts a service,

1888
01:13:37,040 –> 01:13:40,320
you log the resource ID, the operation ID, and the result state.

1889
01:13:40,320 –> 01:13:42,880
If it revokes a session, you log the principal,

1890
01:13:42,880 –> 01:13:46,480
the token or session identifiers, if available, and the confirmation.

1891
01:13:46,480 –> 01:13:49,120
This has to be structured data, not a chat transcript,

1892
01:13:49,120 –> 01:13:51,600
chat transcripts are theater, tool calls are facts.
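
What “structured data, not a chat transcript” looks like for a single tool call can be sketched as one ledger line; the field names and example values are illustrative, not a real audit schema:

```python
# Sketch of a structured tool-call record: endpoint, method, scope, identity,
# resource IDs, and response, serialized as one JSON line a run ledger can
# store and replay later. Field names are illustrative.

import json
import datetime

def record_tool_call(endpoint, method, scope, identity, resource_ids, response):
    """Serialize one tool invocation as a ledger line (facts, not prose)."""
    entry = {
        "timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "endpoint": endpoint,
        "method": method,
        "scope": scope,
        "identity": identity,
        "resource_ids": resource_ids,
        "response": response,
    }
    return json.dumps(entry)  # append this line to the run ledger
```

Because every field is data rather than narrative, the later replay and audit steps can diff two runs mechanically instead of arguing about what the agent meant.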

1893
01:13:51,600 –> 01:13:54,080
Then actions and state changes, what changed in Azure,

1894
01:13:54,080 –> 01:13:56,640
what changed in Entra, what changed in the ITSM record,

1895
01:13:56,640 –> 01:13:57,840
what messages were sent,

1896
01:13:57,840 –> 01:14:01,280
and critically, what verification checks were executed after the action.

1897
01:14:01,280 –> 01:14:05,520
If the contract says “verify health probe green,” show the probe result.

1898
01:14:05,520 –> 01:14:08,240
Not the sentence “verified successfully.”

1899
01:14:08,240 –> 01:14:10,560
Finally, outputs, the human facing artifact,

1900
01:14:10,560 –> 01:14:12,640
the incident report, the reconciliation pack,

1901
01:14:12,640 –> 01:14:14,640
the investigation summary, those are important,

1902
01:14:14,640 –> 01:14:15,840
but they are downstream.

1903
01:14:15,840 –> 01:14:17,680
They should be generated from the run record,

1904
01:14:17,680 –> 01:14:19,920
not written as free form narrative that drifts away

1905
01:14:19,920 –> 01:14:21,200
from what actually happened.

1906
01:14:21,200 –> 01:14:24,320
Now auditability. Audit does not care that an agent is clever;

1907
01:14:24,320 –> 01:14:27,120
audit cares that identity and action are linkable.

1908
01:14:27,120 –> 01:14:30,080
Who or what took the action and under what authorization?

1909
01:14:30,080 –> 01:14:32,560
That means your run record must tie to the non-human principal,

1910
01:14:32,560 –> 01:14:34,320
the role assignments active at the time

1911
01:14:34,320 –> 01:14:36,640
and any approval objects that were required.

1912
01:14:36,640 –> 01:14:39,920
If you can’t link action to authorization deterministically,

1913
01:14:39,920 –> 01:14:43,360
you didn’t automate work, you automated liability.

1914
01:14:43,360 –> 01:14:44,880
Cost controls also live here,

1915
01:14:44,880 –> 01:14:47,840
and this is where most teams accidentally build infinite loops

1916
01:14:47,840 –> 01:14:48,480
with a budget.

1917
01:14:48,480 –> 01:14:51,520
You need to track token usage, tool usage, action volume,

1918
01:14:51,520 –> 01:14:53,520
retries and failure loops per run,

1919
01:14:53,520 –> 01:14:54,800
not to optimize the model,

1920
01:14:54,800 –> 01:14:57,520
to enforce blast radius on compute and on action.

1921
01:14:57,520 –> 01:15:00,560
If an agent gets stuck and calls the same tool 50 times,

1922
01:15:00,560 –> 01:15:02,080
that’s not persistence.

1923
01:15:02,080 –> 01:15:04,880
That’s a runaway process. Observability is how you detect it.

1924
01:15:04,880 –> 01:15:06,560
Control plane limits are how you stop it.
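
A control-plane limit of that kind can be sketched as a per-run action budget; the thresholds here are invented for illustration:

```python
# Sketch of a per-run action budget: the same tool called past a threshold
# is a runaway process, not persistence. The limits are illustrative; real
# values belong in the execution contract, not in code.

from collections import Counter

class ActionBudget:
    def __init__(self, max_calls_per_tool=10, max_total_actions=50):
        self.per_tool = Counter()
        self.max_calls_per_tool = max_calls_per_tool
        self.max_total_actions = max_total_actions

    def charge(self, tool: str) -> None:
        """Record one tool call; raise when the run exceeds its blast radius."""
        self.per_tool[tool] += 1
        if self.per_tool[tool] > self.max_calls_per_tool:
            raise RuntimeError(f"Runaway: {tool} exceeded per-tool limit")
        if sum(self.per_tool.values()) > self.max_total_actions:
            raise RuntimeError("Runaway: run exceeded total action limit")
```

Observability tells you the agent called the same tool fifty times; this is the piece that makes the fifty-first call impossible.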

1925
01:15:06,560 –> 01:15:08,400
And now the real point, replayability.

1926
01:15:08,400 –> 01:15:10,560
Replayability means you can re-execute the run

1927
01:15:10,560 –> 01:15:11,840
in a controlled environment

1928
01:15:11,840 –> 01:15:14,320
and see the same decisions with the same inputs.

1929
01:15:14,320 –> 01:15:15,600
Or if something differs,

1930
01:15:15,600 –> 01:15:17,040
you can point to the exact delta,

1931
01:15:17,040 –> 01:15:18,000
different data version,

1932
01:15:18,000 –> 01:15:19,040
different policy version,

1933
01:15:19,040 –> 01:15:21,120
different tool version, different model version.

1934
01:15:21,120 –> 01:15:24,000
That is how you do post mortems without mythology

1935
01:15:24,000 –> 01:15:25,680
because without replay incident review

1936
01:15:25,680 –> 01:15:26,800
becomes storytelling.

1937
01:15:26,800 –> 01:15:28,960
Humans fill gaps, teams protect themselves.

1938
01:15:28,960 –> 01:15:31,360
People argue about what the agent meant.

1939
01:15:31,360 –> 01:15:33,360
None of that matters. The system did what it did.

1940
01:15:33,360 –> 01:15:36,240
Replay is how you stop debating and start fixing.

1941
01:15:36,240 –> 01:15:38,720
And replayability changes governance behavior.

1942
01:15:38,720 –> 01:15:41,200
It forces you to version your execution contracts.

1943
01:15:41,200 –> 01:15:43,440
It forces you to treat tool scopes like code.

1944
01:15:43,440 –> 01:15:45,120
It forces you to notice drift.

1945
01:15:45,120 –> 01:15:47,920
When a server update widens capabilities, replay breaks,

1946
01:15:47,920 –> 01:15:50,160
therefore someone has to re-approve the new behavior.

1947
01:15:50,160 –> 01:15:51,760
That is the point.

1948
01:15:51,760 –> 01:15:54,480
So the cure for “the agent said so” is a run ledger,

1949
01:15:54,480 –> 01:15:56,080
immutable enough to trust,

1950
01:15:56,080 –> 01:15:59,600
detailed enough to diagnose and structured enough to audit.

1951
01:15:59,600 –> 01:16:01,920
If you don’t build that, autonomy won’t scale

1952
01:16:01,920 –> 01:16:03,280
because trust won’t scale.

1953
01:16:03,280 –> 01:16:05,440
Now, once you can observe and replay,

1954
01:16:05,440 –> 01:16:07,040
you can do the next uncomfortable thing.

1955
01:16:07,040 –> 01:16:09,520
You can compute ROI without fantasy

1956
01:16:09,520 –> 01:16:11,920
because you can finally count outcomes,

1957
01:16:11,920 –> 01:16:14,000
interventions, rollbacks,

1958
01:16:14,000 –> 01:16:16,320
and policy violations as data,

1959
01:16:16,320 –> 01:16:17,280
not opinions.

1960
01:16:17,280 –> 01:16:20,640
ROI without fantasy:

1961
01:16:20,640 –> 01:16:23,040
cost, speed, and risk in one equation.

1962
01:16:23,040 –> 01:16:24,640
Once you can observe and replay,

1963
01:16:24,640 –> 01:16:27,200
you can finally talk about ROI without lying to yourself.

1964
01:16:27,920 –> 01:16:30,880
Most agent ROI decks are token math and vibes.

1965
01:16:30,880 –> 01:16:33,040
Tokens are cheap, therefore we saved money,

1966
01:16:33,040 –> 01:16:34,640
or time saved per employee,

1967
01:16:34,640 –> 01:16:36,320
therefore we gained capacity.

1968
01:16:36,320 –> 01:16:37,440
That’s assistance logic.

1969
01:16:37,440 –> 01:16:38,480
It’s fine for co-pilot.

1970
01:16:38,480 –> 01:16:40,800
It’s the wrong accounting model for autonomy.

1971
01:16:40,800 –> 01:16:43,040
Because autonomy doesn’t sell you better sentences.

1972
01:16:43,040 –> 01:16:44,320
It sells you closed loops.

1973
01:16:44,320 –> 01:16:46,480
So the unit of value isn’t cost per chat.

1974
01:16:46,480 –> 01:16:47,760
It’s cost per outcome.

1975
01:16:47,760 –> 01:16:49,200
Cost per resolved incident,

1976
01:16:49,200 –> 01:16:51,280
cost per reconciled variance pack,

1977
01:16:51,280 –> 01:16:54,080
cost per contained low-risk security incident,

1978
01:16:54,080 –> 01:16:56,400
with an evidence bundle that survives review.
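
Cost per outcome is a simple equation once interventions are counted as labor; this sketch uses invented rates purely for illustration:

```python
# Sketch of cost-per-outcome accounting: compute cost plus the labor cost
# of every human intervention, divided by closed outcomes. The rates and
# times below are invented; real numbers come from your run ledger.

def cost_per_outcome(compute_cost, interventions, hourly_rate,
                     minutes_per_intervention, outcomes_closed):
    """Total cost of the loop divided by outcomes it actually closed."""
    if outcomes_closed == 0:
        return float("inf")  # no closed loops means no ROI to claim
    labor = interventions * (minutes_per_intervention / 60) * hourly_rate
    return (compute_cost + labor) / outcomes_closed
```

For example, $100 of compute plus 10 escalations at 30 minutes each for a $120/hour responder, across 50 closed incidents, is $14 per outcome, and the labor term dominates, which is the section’s point: interventions, not tokens, drive the cost.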

1979
01:16:56,400 –> 01:16:58,400
If you can’t measure cost per outcome,

1980
01:16:58,400 –> 01:17:00,080
you are not doing ROI.

1981
01:17:00,080 –> 01:17:01,600
You’re doing procurement theater.

1982
01:17:01,600 –> 01:17:04,320
Start with cost, but define it like an operator.

1983
01:17:04,320 –> 01:17:06,240
Direct compute is the easy part.

1984
01:17:06,240 –> 01:17:09,440
Model calls, orchestration runtime, tool call overhead.

1985
01:17:09,440 –> 01:17:10,320
You should measure that,

1986
01:17:10,320 –> 01:17:12,480
but you should treat it as a marginal cost

1987
01:17:12,480 –> 01:17:15,280
on top of the real cost driver, human intervention.

1988
01:17:15,280 –> 01:17:16,800
Every time the agent escalates,

1989
01:17:16,800 –> 01:17:19,440
pauses, asks for approval or fails verification,

1990
01:17:19,440 –> 01:17:20,800
and needs a human to clean up,

1991
01:17:20,800 –> 01:17:23,120
that’s labor cost injected back into the loop.

1992
01:17:23,120 –> 01:17:24,640
And it’s not just the time spent.

1993
01:17:24,640 –> 01:17:25,760
It’s the context switch.

1994
01:17:25,760 –> 01:17:27,040
It’s the seniority tax,

1995
01:17:27,040 –> 01:17:28,800
because exceptions tend to land

1996
01:17:28,800 –> 01:17:30,960
on the most expensive humans you have.

1997
01:17:30,960 –> 01:17:33,440
If an agent creates more exceptions than it resolves,

1998
01:17:33,440 –> 01:17:36,400
congratulations, you automated the worst part of the job.

1999
01:17:36,400 –> 01:17:38,080
Now speed, and this is where enterprises

2000
01:17:38,080 –> 01:17:39,600
keep using the wrong metric.

2001
01:17:39,600 –> 01:17:41,760
Speed isn’t how fast did it respond.

2002
01:17:41,760 –> 01:17:43,040
Speed is queue behavior.

2003
01:17:43,040 –> 01:17:45,840
Queue depth never goes down when the system can’t close.

2004
01:17:45,840 –> 01:17:47,280
Tickets churn, not close.

2005
01:17:47,280 –> 01:17:48,560
Analysts become routers.

2006
01:17:48,560 –> 01:17:50,800
Controllers become spreadsheet traffic cops.

2007
01:17:50,800 –> 01:17:53,680
Autonomy wins when it reduces queue depth over time,

2008
01:17:53,680 –> 01:17:55,840
not when it generates a faster first reply.

2009
01:17:55,840 –> 01:17:58,240
So measure time to close, not time to first action.

2010
01:17:58,240 –> 01:17:59,680
Measure throughput under load.

2011
01:17:59,680 –> 01:18:02,160
How many incidents closed per day at peak volume

2012
01:18:02,160 –> 01:18:03,520
with the same head count?

2013
01:18:03,520 –> 01:18:04,800
Measure backlog aging.

2014
01:18:04,800 –> 01:18:08,080
How long do exceptions sit before a human touches them?

2015
01:18:08,080 –> 01:18:10,000
And measure the shape of the distribution,

2016
01:18:10,000 –> 01:18:12,400
not just the average, because the average hides your long tail.

2017
01:18:12,400 –> 01:18:13,840
The long tail is where trust dies.
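
To make that concrete, here’s a minimal sketch of measuring queue shape instead of response speed. The ticket timestamps below are invented for illustration; the useful property is that the p95 and the backlog aging tell the story the average hides.

```python
from datetime import datetime, timedelta
from statistics import mean

# Hypothetical ticket records (opened, closed-or-None); all timestamps are invented.
now = datetime(2024, 1, 15)
tickets = [
    (now - timedelta(hours=3),  now - timedelta(hours=1)),   # closed in 2h
    (now - timedelta(hours=5),  now - timedelta(hours=2)),   # closed in 3h
    (now - timedelta(hours=9),  now - timedelta(hours=5)),   # closed in 4h
    (now - timedelta(hours=11), now - timedelta(hours=6)),   # closed in 5h
    (now - timedelta(hours=10), now - timedelta(hours=4)),   # closed in 6h
    (now - timedelta(hours=45), now - timedelta(hours=5)),   # closed in 40h: the long tail
    (now - timedelta(hours=100), None),                      # still open: queue depth
    (now - timedelta(hours=400), None),                      # open and aging: where trust dies
]

# Time to close (hours) for closed tickets; backlog aging for the open ones.
close_h   = sorted((c - o).total_seconds() / 3600 for o, c in tickets if c)
backlog_h = [(now - o).total_seconds() / 3600 for o, c in tickets if c is None]

avg = mean(close_h)                                           # the average hides the tail
p95 = close_h[min(len(close_h) - 1, int(0.95 * len(close_h)))]
print(f"avg close {avg:.1f}h, p95 close {p95:.1f}h, "
      f"queue depth {len(backlog_h)}, oldest open {max(backlog_h):.0f}h")
```

On this data the average close time looks fine at 10 hours while the p95 sits at 40 hours and two exceptions keep aging, which is exactly the distribution shape the episode says to measure.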

2018
01:18:13,840 –> 01:18:16,160
Now risk, because this is the part that turns ROI

2019
01:18:16,160 –> 01:18:17,760
into a real enterprise conversation.

2020
01:18:17,760 –> 01:18:19,120
Risk isn’t a moral concept.

2021
01:18:19,120 –> 01:18:21,040
It’s an operational metric: intervention rate,

2022
01:18:21,040 –> 01:18:24,240
rollback rate, policy violations, and audit exceptions.

2023
01:18:24,240 –> 01:18:26,400
Those are the things that make your autonomy program

2024
01:18:26,400 –> 01:18:28,080
politically unsustainable.

2025
01:18:28,080 –> 01:18:30,880
If intervention rate is high, you didn’t build autonomy.

2026
01:18:30,880 –> 01:18:33,200
You built a noisy assistant that still needs a person

2027
01:18:33,200 –> 01:18:34,320
to finish the job.

2028
01:18:34,320 –> 01:18:36,720
If rollback rate is high, your verification is weak

2029
01:18:36,720 –> 01:18:39,120
or your execution contract is too permissive.

2030
01:18:39,120 –> 01:18:42,240
If policy violations occur, your control plane is ornamental.

2031
01:18:42,240 –> 01:18:44,240
And if audit exceptions appear,

2032
01:18:44,240 –> 01:18:46,080
finance and security will shut you down,

2033
01:18:46,080 –> 01:18:48,640
regardless of how productive it felt in a demo.

2034
01:18:48,640 –> 01:18:49,840
This is the uncomfortable truth.

2035
01:18:49,840 –> 01:18:51,680
In autonomy, risk has a cost curve.

2036
01:18:51,680 –> 01:18:53,920
The first policy breach costs your credibility.

2037
01:18:53,920 –> 01:18:56,480
The second costs your budget, the third costs you the program.

2038
01:18:56,480 –> 01:18:59,280
So the equation you should run is brutally simple.

2039
01:18:59,280 –> 01:19:02,080
Cost per outcome equals compute plus tool usage,

2040
01:19:02,080 –> 01:19:05,360
plus human intervention, plus remediation overhead from failures.

2041
01:19:05,360 –> 01:19:08,400
Speed equals outcomes per unit time under real load,

2042
01:19:08,400 –> 01:19:11,680
reflected as reduced queue depth and reduced backlog aging.

2043
01:19:11,680 –> 01:19:15,040
Risk equals the rate at which outcomes required rollback,

2044
01:19:15,040 –> 01:19:18,560
violated policy, or produced evidence that didn’t pass review.
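
Those three definitions reduce to a back-of-the-envelope calculation. The dollar figures below are invented; the shape of the equation is the point: identical compute, very different cost per outcome once human intervention and remediation are added back in.

```python
# A minimal sketch of the cost-per-outcome equation from the episode.
# All dollar figures and run counts are made-up illustrative inputs.
def cost_per_outcome(compute, tool_usage, human_intervention, remediation, outcomes):
    """Cost = (compute + tool usage + human intervention + remediation) / verified outcomes."""
    if outcomes == 0:
        return float("inf")  # no verified outcomes means no ROI, only spend
    return (compute + tool_usage + human_intervention + remediation) / outcomes

# Two scenarios with identical compute but different intervention load.
quiet_agent = cost_per_outcome(compute=500, tool_usage=100,
                               human_intervention=200, remediation=50, outcomes=100)
noisy_agent = cost_per_outcome(compute=500, tool_usage=100,
                               human_intervention=4000, remediation=900, outcomes=100)
print(f"quiet: ${quiet_agent:.2f}/outcome, noisy: ${noisy_agent:.2f}/outcome")
```

The noisy agent costs more than six times as much per outcome on the same model spend, which is why direct compute is the marginal cost, not the driver.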

2045
01:19:18,560 –> 01:19:20,480
And you don’t get to optimize one in isolation.

2046
01:19:20,480 –> 01:19:22,960
If you reduce cost by lowering evidence requirements,

2047
01:19:22,960 –> 01:19:24,000
you increase risk.

2048
01:19:24,000 –> 01:19:25,920
If you increase speed by widening permissions,

2049
01:19:25,920 –> 01:19:27,440
you increase blast radius.

2050
01:19:27,440 –> 01:19:30,800
If you reduce risk by forcing human approvals everywhere,

2051
01:19:30,800 –> 01:19:33,440
you collapse autonomy back into faster labor.

2052
01:19:33,440 –> 01:19:35,600
That distinction matters because executives will ask,

2053
01:19:35,600 –> 01:19:37,680
“Should we just buy more Copilot seats?”

2054
01:19:37,680 –> 01:19:40,000
And the honest answer is Copilot boosts individuals,

2055
01:19:40,000 –> 01:19:41,680
autonomy boosts system throughput.

2056
01:19:41,680 –> 01:19:44,320
Copilot makes one analyst faster at triage.

2057
01:19:44,320 –> 01:19:47,600
Autonomy makes the queue smaller even when the analyst isn’t there.

2058
01:19:47,600 –> 01:19:50,640
And that’s the only kind of ROI that survives budget season,

2059
01:19:50,640 –> 01:19:52,480
because it shows up as fewer open tickets,

2060
01:19:52,480 –> 01:19:54,640
faster close cycles and fewer policy incidents,

2061
01:19:54,640 –> 01:19:55,920
not happier anecdotes.

2062
01:19:55,920 –> 01:19:58,720
So if you want one practical test before you show a single number,

2063
01:19:58,720 –> 01:19:59,600
it’s this.

2064
01:19:59,600 –> 01:20:01,520
Pick a workflow where the queue never shrinks.

2065
01:20:01,520 –> 01:20:02,800
The tickets keep coming.

2066
01:20:02,800 –> 01:20:04,800
The team keeps working hard.

2067
01:20:04,800 –> 01:20:06,400
And yet the backlog ages anyway.

2068
01:20:06,400 –> 01:20:08,240
If autonomy can’t change that queue shape,

2069
01:20:08,240 –> 01:20:09,520
it’s not an autonomy investment.

2070
01:20:09,520 –> 01:20:11,200
It’s a chat interface with ambition.

2071
01:20:11,200 –> 01:20:13,840
And now that you can define ROI like an adult,

2072
01:20:13,840 –> 01:20:16,720
you can do the next thing most organizations avoid.

2073
01:20:16,720 –> 01:20:20,080
Decide when autonomy is worth it and when the correct answer is no.

2074
01:20:20,080 –> 01:20:21,440
Decision framework.

2075
01:20:21,440 –> 01:20:23,840
When autonomy is worth it, when to say no.

2076
01:20:23,840 –> 01:20:26,880
Here’s the decision framework executives keep asking for

2077
01:20:26,880 –> 01:20:30,000
and architects keep avoiding because it forces a real answer.

2078
01:20:30,000 –> 01:20:32,400
Autonomy is worth it when the work is repeatable.

2079
01:20:32,400 –> 01:20:34,880
The ownership is explicit and the system already

2080
01:20:34,880 –> 01:20:37,200
emits enough telemetry to verify success.

2081
01:20:37,200 –> 01:20:38,400
Not “logs exist.”

2082
01:20:38,400 –> 01:20:41,280
Telemetry that can prove the outcome

2083
01:20:41,280 –> 01:20:43,280
without a human squinting at a dashboard.

2084
01:20:43,280 –> 01:20:44,480
That’s the first gate.

2085
01:20:44,480 –> 01:20:46,480
Second gate, the action surface is enforceable.

2086
01:20:46,480 –> 01:20:49,360
You can name the tools, scopes and identities involved.

2087
01:20:49,360 –> 01:20:50,960
You can write an execution contract

2088
01:20:50,960 –> 01:20:52,640
that the runtime can’t negotiate with.

2089
01:20:52,640 –> 01:20:54,800
If you can’t, you’re not evaluating autonomy.

2090
01:20:54,800 –> 01:20:56,320
You’re evaluating optimism.

2091
01:20:56,320 –> 01:21:00,640
Third gate, you can define escalation contracts in advance.

2092
01:21:00,640 –> 01:21:02,240
When the agent hits ambiguity,

2093
01:21:02,240 –> 01:21:04,640
it doesn’t stall silently and it doesn’t improvise.

2094
01:21:04,640 –> 01:21:06,640
It routes to a human with the evidence bundle

2095
01:21:06,640 –> 01:21:08,080
and a proposed next action.

2096
01:21:08,080 –> 01:21:09,600
Humans become exception handlers.

2097
01:21:09,600 –> 01:21:11,600
If humans are still the default executor,

2098
01:21:11,600 –> 01:21:12,960
you bought faster labor.
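
As a sketch, an escalation contract can be as small as a typed payload the agent must fill before it is allowed to hand off. The field names here are illustrative, not a product schema; the design point is that a hand-off without evidence and a proposal is refused outright.

```python
from dataclasses import dataclass

# Hypothetical escalation payload; field names are invented for illustration.
@dataclass
class Escalation:
    run_id: str
    reason: str                 # why the agent stopped: ambiguity, low confidence, policy
    evidence_bundle: list[str]  # references to captured inputs, tool calls, verification
    proposed_next_action: str   # the agent proposes; the human decides
    owner: str                  # a named human, never a distribution list

def escalate(run_id: str, reason: str, evidence: list[str],
             proposal: str, owner: str) -> Escalation:
    # Refuse to escalate without evidence: a bare "help me" ticket
    # makes the human the default executor again.
    if not evidence:
        raise ValueError("escalation without an evidence bundle is just a new ticket")
    return Escalation(run_id, reason, evidence, proposal, owner)

e = escalate("run-042", "low confidence on account match",
             ["input-snapshot.json", "toolcall-07.log"],
             "hold payment, confirm vendor ID",
             owner="ap-controller@contoso.example")
print(e.proposed_next_action)
```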

2099
01:21:12,960 –> 01:21:15,920
Now the no criteria because mature teams say no early

2100
01:21:15,920 –> 01:21:17,760
and save themselves a year of politics.

2101
01:17:17,760 –> 01:17:19,360
Say no when approvals are ambiguous.

2102
01:21:19,360 –> 01:21:21,680
If you can’t express who is allowed to approve what

2103
01:17:21,680 –> 01:17:23,280
as machine-readable policy,

2104
01:21:23,280 –> 01:21:25,680
autonomy will either freeze or bypass the process.

2105
01:21:25,680 –> 01:21:28,400
Both outcomes are failures, just with different paperwork.

2106
01:21:28,400 –> 01:21:30,480
Say no when data boundaries are unclear.

2107
01:21:30,480 –> 01:21:33,040
If your org can’t name which systems are authoritative

2108
01:21:33,040 –> 01:21:35,120
and which are just spreadsheets people trust,

2109
01:21:35,120 –> 01:21:37,360
you’re going to ground the agent on the wrong truth

2110
01:21:37,360 –> 01:21:39,120
and then argue about it for months.

2111
01:21:39,120 –> 01:21:42,160
Finance and security will not tolerate that ambiguity.

2112
01:21:42,160 –> 01:21:44,400
Say no when the audit surface doesn’t exist.

2113
01:21:44,400 –> 01:21:46,400
If you cannot capture inputs, decisions,

2114
01:21:46,400 –> 01:21:49,520
tool calls and verification as a replayable run record,

2115
01:21:49,520 –> 01:21:52,720
you will eventually end up with “the agent said so” in front of leadership.

2116
01:21:52,720 –> 01:21:54,160
That’s the end of the program

2117
01:21:54,160 –> 01:21:57,360
And say no when nobody owns the pager. Autonomy shifts ownership.

2118
01:21:57,360 –> 01:21:58,480
It doesn’t remove it.

2119
01:21:58,480 –> 01:22:01,120
If the failure mode is “everyone is responsible,”

2120
01:22:01,120 –> 01:22:03,840
then no one will fix the control plane when it drifts

2121
01:22:03,840 –> 01:22:06,480
and the agent will inherit a decaying policy model.

2122
01:22:06,480 –> 01:22:08,160
So the maturity gates are simple.

2123
01:22:08,160 –> 01:22:09,520
Identity readiness.

2124
01:22:09,520 –> 01:22:12,720
Can you issue non-human principals with narrow scopes

2125
01:22:12,720 –> 01:22:14,400
and life cycle controls?

2126
01:22:14,400 –> 01:22:15,760
Tool registry readiness.

2127
01:22:15,760 –> 01:22:19,600
Can you enumerate and allow-list what exists versus what’s allowed?

2128
01:22:19,600 –> 01:22:20,560
Evidence readiness.

2129
01:22:20,560 –> 01:22:24,560
Can you produce replayable runs that survive post mortems and audits?
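
Those three gates reduce to a go/no-go check. The values below are illustrative; the useful property is that a single failed gate blocks the program and names the blocker.

```python
# Illustrative gate check: go/no-go is the AND of the three readiness gates named above.
gates = {
    "identity": True,   # scoped non-human principals with lifecycle controls?
    "registry": True,   # tools enumerated and allow-listed (exists vs. allowed)?
    "evidence": False,  # replayable runs that survive post-mortems and audits?
}

ready = all(gates.values())
blocking = [name for name, ok in gates.items() if not ok]
print("go" if ready else f"no-go, blocked on: {', '.join(blocking)}")
```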

2130
01:22:24,560 –> 01:22:27,200
Now human-in-the-loop design. This isn’t about feelings,

2131
01:22:27,200 –> 01:22:28,400
it’s about thresholds.

2132
01:22:28,400 –> 01:22:31,200
Define explicit confidence thresholds per action class.

2133
01:22:31,200 –> 01:22:33,600
Define evidence requirements per incident class.

2134
01:22:33,600 –> 01:22:35,520
Define what triggers elevation,

2135
01:22:35,520 –> 01:22:38,480
what triggers approval and what triggers escalation.

2136
01:22:38,480 –> 01:22:41,120
Don’t let human-in-the-loop become a permanent crutch

2137
01:22:41,120 –> 01:22:43,520
and don’t let full autonomy become a marketing goal.

2138
01:22:43,520 –> 01:22:46,400
The autonomy boundary is a control surface. Treat it like one.

2139
01:22:46,400 –> 01:22:48,880
And the operating model is the part most orgs skip

2140
01:22:48,880 –> 01:22:50,400
because it’s boring and political.

2141
01:22:50,400 –> 01:22:52,400
Who owns agent failures? Not the dev team,

2142
01:22:52,400 –> 01:22:54,640
not AI. The business owner of the workflow.

2143
01:22:54,640 –> 01:22:56,160
Who owns policy changes?

2144
01:22:56,160 –> 01:22:57,920
The team that owns the control plane

2145
01:22:57,920 –> 01:23:00,320
with change control, like any other enforcement system.

2146
01:23:00,320 –> 01:23:01,440
Who owns the tool scopes?

2147
01:23:01,440 –> 01:23:02,240
The tool owners,

2148
01:23:02,240 –> 01:23:04,800
with versioning and re-approval when capabilities change.

2149
01:23:04,800 –> 01:23:06,880
That’s the framework. If it sounds strict, good.

2150
01:23:06,880 –> 01:23:08,880
Autonomy is strictness automated.

2151
01:23:08,880 –> 01:23:10,160
Implementation payoff.

2152
01:23:10,160 –> 01:23:12,880
The 30-day autonomy pilot that doesn’t embarrass you.

2153
01:23:12,880 –> 01:23:15,920
If you want a 30-day pilot that survives contact with reality,

2154
01:23:15,920 –> 01:23:16,960
pick one domain.

2155
01:23:16,960 –> 01:23:18,960
IT remediation or security triage.

2156
01:23:18,960 –> 01:23:20,640
Don’t run three pilots in parallel

2157
01:23:20,640 –> 01:23:22,400
and call the confusion learning.

2158
01:23:22,400 –> 01:23:24,160
Write three policies on day one:

2159
01:23:24,160 –> 01:23:26,160
allowed actions, evidence requirements

2160
01:23:26,160 –> 01:23:28,560
with confidence thresholds and escalation paths

2161
01:23:28,560 –> 01:23:30,000
with named owners.
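
Here’s a hedged sketch of what those three day-one policies might look like as machine-readable data. Action names, thresholds, and owner addresses are invented for illustration; the design point is deny-by-default.

```python
# Sketch of the three day-one policies as data. Everything named here is hypothetical.
pilot_policy = {
    "allowed_actions": ["restart_service", "revoke_session", "open_ticket"],
    "evidence_requirements": {
        "restart_service": {"min_confidence": 0.90, "require": ["health_check", "change_record"]},
        "revoke_session":  {"min_confidence": 0.95, "require": ["signin_logs", "risk_score"]},
    },
    "escalation_paths": {
        "default":        "itops-oncall@contoso.example",  # a named owner, not "the team"
        "revoke_session": "secops-lead@contoso.example",
    },
}

def may_execute(action: str, confidence: float) -> bool:
    # Deny by default: actions outside the allow list never run,
    # and an allowed action with no evidence requirement is also denied.
    if action not in pilot_policy["allowed_actions"]:
        return False
    req = pilot_policy["evidence_requirements"].get(action)
    return req is not None and confidence >= req["min_confidence"]

print(may_execute("revoke_session", 0.97))  # allowed: on the list, above threshold
print(may_execute("delete_tenant", 0.99))   # denied: not on the allow list
```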

2162
01:23:30,000 –> 01:23:32,000
Stand up evidence before you stand up autonomy.

2163
01:23:32,000 –> 01:23:33,760
Action logs, tool call capture

2164
01:23:33,760 –> 01:23:37,040
and replayable run records mapped to your audit expectations.

2165
01:23:37,040 –> 01:23:39,360
If you can’t replay a run, you can’t defend it.

2166
01:23:39,360 –> 01:23:41,120
Then measure the only metrics that matter.

2167
01:23:41,120 –> 01:23:41,920
Time to close.

2168
01:23:41,920 –> 01:23:44,160
MTTR delta, human-in-the-loop rate,

2169
01:23:44,160 –> 01:23:46,080
rollback rate and policy violations.

2170
01:23:46,080 –> 01:23:47,360
If those don’t move, stop.

2171
01:23:47,360 –> 01:23:48,400
Don’t rebrand.
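
The stop/go decision itself can be mechanical. Baseline and pilot numbers below are made up, and the thresholds are assumptions rather than guidance; the point is that “those metrics moved” is a computable claim, not a vibe.

```python
# Illustrative stop/go check on the pilot metrics named above (values are invented).
baseline = {"time_to_close_h": 20.0, "hitl_rate": 1.00, "rollback_rate": 0.00, "violations": 0}
pilot    = {"time_to_close_h": 11.0, "hitl_rate": 0.35, "rollback_rate": 0.02, "violations": 0}

# Did the metrics move: faster close AND fewer human touches per outcome?
moved = (pilot["time_to_close_h"] < baseline["time_to_close_h"]
         and pilot["hitl_rate"] < baseline["hitl_rate"])
# Is it safe: rollback rate under an assumed 5% ceiling and zero policy violations?
safe = pilot["rollback_rate"] <= 0.05 and pilot["violations"] == 0

print("continue" if moved and safe else "stop, don't rebrand")
```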

2172
01:23:48,400 –> 01:23:51,600
Autonomy becomes safe only when it’s enforced by design

2173
01:23:51,600 –> 01:23:52,880
through the autonomy boundary

2174
01:23:52,880 –> 01:23:54,000
and execution contract,

2175
01:23:54,000 –> 01:23:55,440
not by intent or good luck.

2176
01:23:55,440 –> 01:23:57,360
If you want to test readiness this week,

2177
01:23:57,360 –> 01:23:58,320
do one thing.

2178
01:23:58,320 –> 01:24:00,400
Remove one human step from a workflow

2179
01:24:00,400 –> 01:24:01,760
where the queue never shrinks

2180
01:24:01,760 –> 01:24:03,280
but add one hard boundary

2181
01:24:03,280 –> 01:24:04,800
that the agent cannot cross

2182
01:24:04,800 –> 01:24:06,400
without evidence and policy.

2183
01:24:06,400 –> 01:24:09,120
And here’s the line that should end the discussion fast.

2184
01:24:09,120 –> 01:24:13,040
If you can’t name who wakes up at 2am when the agent fails,

2185
01:24:13,040 –> 01:24:14,400
you’re not ready for autonomy.

2186
01:24:14,400 –> 01:24:16,320
If you’ve got a workflow where tickets churn

2187
01:24:16,320 –> 01:24:18,720
and nobody can close the loop, put it in the comments.

2188
01:24:18,720 –> 01:24:19,920
And watch the next episode

2189
01:24:19,920 –> 01:24:22,320
because we’ll go deeper on agent identities,

2190
01:24:22,320 –> 01:24:23,520
MCP entitlements

2191
01:24:23,520 –> 01:24:26,720
and how to stop conditional chaos before it becomes policy drift.




