
1
00:00:00,000 –> 00:00:02,260
Most organizations think governance is documentation.
2
00:00:02,260 –> 00:00:02,960
They are wrong.
3
00:00:02,960 –> 00:00:04,680
Documentation is what you write down
4
00:00:04,680 –> 00:00:07,680
after the platform has already decided what it will allow.
5
00:00:07,680 –> 00:00:11,480
Governance is control, enforced intent at scale.
6
00:00:11,480 –> 00:00:12,960
Because once you have dozens of teams
7
00:00:12,960 –> 00:00:15,360
and hundreds of subscriptions, your blast radius
8
00:00:15,360 –> 00:00:16,880
stops being a bad deployment
9
00:00:16,880 –> 00:00:19,400
and starts being a bad operating model.
10
00:00:19,400 –> 00:00:21,600
That’s when audits turn into emergencies,
11
00:00:21,600 –> 00:00:23,280
costs leak quietly for months,
12
00:00:23,280 –> 00:00:26,320
and security becomes a collection of exceptions nobody owns.
13
00:00:26,320 –> 00:00:28,440
This episode isn’t an Azure features tour.
14
00:00:28,440 –> 00:00:29,620
It’s the operating system.
15
00:00:29,620 –> 00:00:31,200
Landing zones, management groups,
16
00:00:31,200 –> 00:00:34,880
RBAC with PIM, Azure Policy as real guardrails,
17
00:00:34,880 –> 00:00:37,800
and the feedback loops that keep it from degrading.
18
00:00:37,800 –> 00:00:41,500
The enterprise failure mode: policy drift becomes normal.
19
00:00:41,500 –> 00:00:43,820
Here’s what most enterprises don’t admit out loud.
20
00:00:43,820 –> 00:00:45,340
Governance doesn’t usually fail
21
00:00:45,340 –> 00:00:46,940
because controls are missing.
22
00:00:46,940 –> 00:00:49,660
It fails because controls drift. It starts clean.
23
00:00:49,660 –> 00:00:51,940
There’s a baseline, there’s a naming standard,
24
00:00:51,940 –> 00:00:53,340
there’s a policy initiative,
25
00:00:53,340 –> 00:00:55,980
there are owner assignments that are temporary,
26
00:00:55,980 –> 00:00:58,140
there’s a spreadsheet somebody calls a RACI.
27
00:00:58,140 –> 00:00:59,700
Everyone feels responsible.
28
00:00:59,700 –> 00:01:01,740
Then the first exception request shows up.
29
00:01:01,740 –> 00:01:03,340
It’s always reasonable, it’s always urgent,
30
00:01:03,340 –> 00:01:05,780
it’s always just for this one workload.
31
00:01:05,780 –> 00:01:07,540
The platform team makes a choice.
32
00:01:07,540 –> 00:01:09,300
Block the business and be hated
33
00:01:09,300 –> 00:01:11,900
or approve the exception and be pragmatic.
34
00:01:11,900 –> 00:01:14,020
They approve it because humans optimize
35
00:01:14,020 –> 00:01:16,140
for reducing conflict in the moment.
36
00:01:16,140 –> 00:01:18,100
That exception becomes an entropy generator
37
00:01:18,100 –> 00:01:20,140
and the enterprise mistake is thinking
38
00:01:20,140 –> 00:01:22,300
entropy generators are self-cleaning.
39
00:01:22,300 –> 00:01:23,140
They aren’t.
40
00:01:23,140 –> 00:01:25,140
Most organizations never remove exceptions.
41
00:01:25,140 –> 00:01:26,660
They don’t even track them properly.
42
00:01:26,660 –> 00:01:29,100
They just accumulate them until the baseline
43
00:01:29,100 –> 00:01:30,380
is no longer the baseline.
44
00:01:30,380 –> 00:01:33,660
It’s a loose suggestion with historical artifacts attached.
45
00:01:33,660 –> 00:01:35,820
Now, there are three distinct failure types
46
00:01:35,820 –> 00:01:38,740
that get conflated as “we need better governance.”
47
00:01:38,740 –> 00:01:41,020
First, missing controls. That’s simple.
48
00:01:41,020 –> 00:01:42,580
You never created the guardrail,
49
00:01:42,580 –> 00:01:44,660
you never assigned the initiative.
50
00:01:44,660 –> 00:01:47,660
You never enabled logging, you never restricted regions.
51
00:01:47,660 –> 00:01:49,340
This is immature, but honest,
52
00:01:49,340 –> 00:01:51,220
you can fix it by building the control.
53
00:01:51,220 –> 00:01:53,860
Second, drifting controls, this is the real enterprise disease.
54
00:01:53,860 –> 00:01:54,700
You had the guardrail,
55
00:01:54,700 –> 00:01:56,820
but you allowed incremental deviations
56
00:01:56,820 –> 00:01:58,980
until the guardrail no longer defines reality.
57
00:01:58,980 –> 00:02:01,020
In other words, the policy exists,
58
00:02:01,020 –> 00:02:04,380
but the organization has learned how to route around it.
59
00:02:04,380 –> 00:02:05,780
Third, conflicting controls.
60
00:02:05,780 –> 00:02:08,060
This is where multiple teams create their own baselines,
61
00:02:08,060 –> 00:02:11,580
each correct in isolation, but incompatible in combination.
62
00:02:11,580 –> 00:02:13,180
One team denies public endpoints.
63
00:02:13,180 –> 00:02:14,860
Another team deploys managed services
64
00:02:14,860 –> 00:02:17,300
that assume public endpoints during provisioning.
65
00:02:17,300 –> 00:02:20,180
A third team builds a pipeline that auto-remediates tags
66
00:02:20,180 –> 00:02:21,700
but breaks Terraform state.
67
00:02:21,700 –> 00:02:23,300
Everyone is doing governance.
68
00:02:23,300 –> 00:02:25,100
The platform is doing conditional chaos.
69
00:02:25,100 –> 00:02:27,140
That distinction matters because enterprises treat
70
00:02:27,140 –> 00:02:28,660
all three as a tooling problem.
71
00:02:28,660 –> 00:02:30,660
They buy something, they deploy dashboards,
72
00:02:30,660 –> 00:02:33,940
they measure compliance scores, they create more documentation.
73
00:02:33,940 –> 00:02:36,340
And none of it stops drift because drift is not a knowledge problem.
74
00:02:36,340 –> 00:02:38,380
It’s a decision distribution problem.
75
00:02:38,380 –> 00:02:40,900
In Azure, decision making is inherently distributed.
76
00:02:40,900 –> 00:02:43,780
ARM will accept deployments from portals, pipelines,
77
00:02:43,780 –> 00:02:45,780
service principals, managed identities,
78
00:02:45,780 –> 00:02:47,940
and whatever else you allow into the tenant.
79
00:02:47,940 –> 00:02:50,340
Every team makes thousands of micro decisions.
80
00:02:50,340 –> 00:02:53,260
Regions, SKUs, network exposure, identity assignments,
81
00:02:53,260 –> 00:02:55,060
logging, encryption, tags.
82
00:02:55,060 –> 00:02:57,900
If you don’t enforce constraints, you don’t have governance,
83
00:02:57,900 –> 00:02:59,300
you have opinions.
84
00:02:59,300 –> 00:03:00,980
And here’s the uncomfortable truth.
85
00:03:00,980 –> 00:03:03,100
Even good teams create chaos at scale
86
00:03:03,100 –> 00:03:05,820
because good doesn’t survive unbounded choice.
87
00:03:05,820 –> 00:03:08,900
People rotate, projects get handed over, contractors show up,
88
00:03:08,900 –> 00:03:11,180
deadlines compress, teams optimize locally.
89
00:03:11,180 –> 00:03:12,980
Over time, the platform becomes a museum
90
00:03:12,980 –> 00:03:14,820
of half-enforced intentions.
91
00:03:14,820 –> 00:03:17,180
This is why the platform team becomes a ticket queue.
92
00:03:17,180 –> 00:03:18,580
Not because they’re incompetent,
93
00:03:18,580 –> 00:03:21,820
because the system is asking them to be the runtime authorization engine
94
00:03:21,820 –> 00:03:23,380
for the entire enterprise.
95
00:03:23,380 –> 00:03:25,700
Every exception is a manual compile step.
96
00:03:25,700 –> 00:03:28,140
Every quick approval is a new branch of behavior
97
00:03:28,140 –> 00:03:29,740
that someone must remember forever.
98
00:03:29,740 –> 00:03:31,180
Then audit season arrives.
99
00:03:31,180 –> 00:03:33,220
And suddenly, the organization discovers
100
00:03:33,220 –> 00:03:35,100
it can’t prove what it thinks it enforces.
101
00:03:35,100 –> 00:03:36,820
The spreadsheet says public access is blocked,
102
00:03:36,820 –> 00:03:39,340
but the tenant contains a set of exemptions nobody can explain.
103
00:03:39,340 –> 00:03:40,940
Secure score looks fine,
104
00:03:40,940 –> 00:03:44,180
but only because the loudest issues were muted through waivers.
105
00:03:44,180 –> 00:03:46,780
Logging exists, but not consistently, because DeployIfNotExists
106
00:03:46,780 –> 00:03:50,220
was never remediated for legacy resources.
107
00:03:50,220 –> 00:03:52,780
Costs can’t be allocated because tags were recommended,
108
00:03:52,780 –> 00:03:54,060
not required.
109
00:03:54,060 –> 00:03:57,300
So the audit becomes a scramble, export policies,
110
00:03:57,300 –> 00:04:00,820
screenshot dashboards, manually map controls,
111
00:04:00,820 –> 00:04:03,060
and hope the auditor doesn’t ask the one question
112
00:04:03,060 –> 00:04:04,260
that exposes the drift.
113
00:04:04,260 –> 00:04:06,420
Incidents are worse.
114
00:04:06,420 –> 00:04:07,340
When something goes wrong,
115
00:04:07,340 –> 00:04:09,940
the post-incident review doesn’t say we lacked policy.
116
00:04:09,940 –> 00:04:12,340
It says we didn’t realize this path existed.
117
00:04:12,340 –> 00:04:14,900
That path exists because drift created it.
118
00:04:14,900 –> 00:04:17,060
An owner assignment that never expired,
119
00:04:17,060 –> 00:04:19,900
a subscription moved into a different management group,
120
00:04:19,900 –> 00:04:21,660
an exemption without an end date,
121
00:04:21,660 –> 00:04:23,820
a resource type allowed temporarily
122
00:04:23,820 –> 00:04:25,860
and now the blast radius is real.
123
00:04:25,860 –> 00:04:27,780
This is where most enterprises break things.
124
00:04:27,780 –> 00:04:30,500
They confuse autonomy with absence of constraints.
125
00:04:30,500 –> 00:04:33,780
Autonomy only scales when boundaries are explicit and enforced.
126
00:04:33,780 –> 00:04:36,380
If you remember nothing else from this section, remember this.
127
00:04:36,380 –> 00:04:38,140
Exceptions are not special cases.
128
00:04:38,140 –> 00:04:40,260
They are permanent forks in system behavior
129
00:04:40,260 –> 00:04:41,780
unless you design them to expire.
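To make the expiry point concrete, here is a minimal sketch of what "designed to expire" can look like. The `Exemption` record and its field names are hypothetical and only illustrate the idea; they are not the Azure Policy exemption schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Exemption:
    """Hypothetical record of an approved exception (illustrative only)."""
    scope: str
    policy: str
    reason: str
    expires_on: Optional[date]  # None models an exemption with no end date


def is_active(exemption: Exemption, today: date) -> bool:
    """An exemption without an end date never stops applying."""
    return exemption.expires_on is None or today <= exemption.expires_on


def permanent_forks(exemptions: list[Exemption]) -> list[Exemption]:
    """Exemptions that will never expire on their own."""
    return [e for e in exemptions if e.expires_on is None]
```

Under this assumption, a review job only has to list `permanent_forks(...)` to find the exceptions that will otherwise live forever.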
130
00:04:41,780 –> 00:04:44,500
And that’s why the only sustainable fix is governance by design.
131
00:04:44,500 –> 00:04:47,820
Not more meetings, not more documentation. Design.
132
00:04:47,820 –> 00:04:50,260
Governance by design: deterministic guardrails
133
00:04:50,260 –> 00:04:52,140
versus probabilistic security.
134
00:04:52,140 –> 00:04:55,820
Governance by design means the platform enforces intent,
135
00:04:55,820 –> 00:04:56,820
not people.
136
00:04:56,820 –> 00:04:58,260
It stops being a set of guidelines
137
00:04:58,260 –> 00:05:00,820
and becomes a machine that compiles your enterprise assumptions
138
00:05:00,820 –> 00:05:01,980
into allowed reality.
139
00:05:01,980 –> 00:05:04,780
In architectural terms, Azure governance is an authorization
140
00:05:04,780 –> 00:05:07,340
and compliance compiler sitting on top of Azure Resource
141
00:05:07,340 –> 00:05:09,780
Manager. ARM is the control plane.
142
00:05:09,780 –> 00:05:11,300
Everything becomes a request.
143
00:05:11,300 –> 00:05:13,940
Create, update, delete.
144
00:05:13,940 –> 00:05:17,020
The only question that matters is what the control plane will accept.
145
00:05:17,020 –> 00:05:20,020
Most organizations treat that acceptance as a human process,
146
00:05:20,020 –> 00:05:22,660
tickets, reviews, approvals, tribal knowledge.
147
00:05:22,660 –> 00:05:23,740
That’s the comfortable model.
148
00:05:23,740 –> 00:05:24,820
And it doesn’t scale.
149
00:05:24,820 –> 00:05:27,580
The alternative is a deterministic model.
150
00:05:27,580 –> 00:05:30,020
What must be true for the resource to exist at all?
151
00:05:30,020 –> 00:05:31,380
Deterministic doesn’t mean perfect.
152
00:05:31,380 –> 00:05:32,660
It means predictable.
153
00:05:32,660 –> 00:05:35,300
It means the same request gets the same outcome every time,
154
00:05:35,300 –> 00:05:37,700
regardless of who clicked the button, which pipeline ran
155
00:05:37,700 –> 00:05:39,900
or which team is under pressure this week.
156
00:05:39,900 –> 00:05:41,180
That is the foundational difference
157
00:05:41,180 –> 00:05:43,060
between governance and governance theater.
158
00:05:43,060 –> 00:05:45,940
A deterministic guardrail is something like resources
159
00:05:45,940 –> 00:05:48,060
can only exist in approved regions.
160
00:05:48,060 –> 00:05:50,500
Storage accounts must use secure transfer.
161
00:05:50,500 –> 00:05:52,540
Diagnostics must go to a known workspace.
162
00:05:52,540 –> 00:05:55,100
Public endpoints are denied unless explicitly allowed
163
00:05:55,100 –> 00:05:56,700
through a controlled path.
164
00:05:56,700 –> 00:05:58,740
If the condition isn’t met, the deployment fails,
165
00:05:58,740 –> 00:06:01,500
not later, not after a report at the boundary.
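As a rough illustration of "the same request gets the same outcome every time," here is a small sketch of a deterministic check at the boundary. Everything in it is an assumption made for the example: the `Request` shape, the region list, and the rule names are hypothetical, and this is not how Azure Policy is actually implemented.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration; a real baseline defines its own.
APPROVED_REGIONS = {"westeurope", "northeurope"}


@dataclass
class Request:
    """Simplified stand-in for a create/update request to the control plane."""
    resource_type: str
    region: str
    properties: dict = field(default_factory=dict)


def evaluate(request: Request) -> list[str]:
    """Return the violated guardrails; an empty list means the request is allowed."""
    violations = []
    if request.region not in APPROVED_REGIONS:
        violations.append("region-not-approved")
    if request.resource_type == "storageAccount":
        if not request.properties.get("secure_transfer", False):
            violations.append("secure-transfer-required")
    if request.properties.get("public_endpoint", False):
        violations.append("public-endpoint-denied")
    return violations


def deploy(request: Request) -> str:
    """Fail at the boundary, before anything exists."""
    violations = evaluate(request)
    if violations:
        raise PermissionError(", ".join(violations))
    return "accepted"
```

The outcome depends only on the request and the rules, not on who submitted it, which is the property the deterministic model is after.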
166
00:06:01,500 –> 00:06:04,380
Now contrast that with the probabilistic model most enterprises
167
00:06:04,380 –> 00:06:05,700
drift into.
168
00:06:05,700 –> 00:06:08,500
Probabilistic security is the world of “should be true, unless”:
169
00:06:08,500 –> 00:06:10,660
audit-only controls, recommended tags,
170
00:06:10,660 –> 00:06:12,540
optional encryption, and a policy baseline
171
00:06:12,540 –> 00:06:13,980
with exemptions sprinkled everywhere
172
00:06:13,980 –> 00:06:17,220
because delivery pressure always wins eventually.
173
00:06:17,220 –> 00:06:20,260
In a probabilistic system, security becomes a set of odds.
174
00:06:20,260 –> 00:06:21,740
Most resources comply.
175
00:06:21,740 –> 00:06:23,180
Most teams do the right thing.
176
00:06:23,180 –> 00:06:24,780
Most of the time nothing bad happens.
177
00:06:24,780 –> 00:06:25,700
That’s not governance.
178
00:06:25,700 –> 00:06:27,580
That’s wishful thinking with a dashboard.
179
00:06:27,580 –> 00:06:28,420
And here’s the trap.
180
00:06:28,420 –> 00:06:30,740
Probabilistic systems feel productive.
181
00:06:30,740 –> 00:06:31,700
They don’t block anyone.
182
00:06:31,700 –> 00:06:33,140
They don’t cause deployment failures.
183
00:06:33,140 –> 00:06:35,900
They minimize friction, but friction doesn’t disappear.
184
00:06:35,900 –> 00:06:36,820
It moves.
185
00:06:36,820 –> 00:06:39,180
It moves into incident response, audit preparation,
186
00:06:39,180 –> 00:06:40,220
and cost cleanup.
187
00:06:40,220 –> 00:06:41,820
It becomes delayed pain with interest.
188
00:06:41,820 –> 00:06:43,780
The enterprise goal isn’t to centralize control.
189
00:06:43,780 –> 00:06:45,980
It’s to enable autonomy without turning the platform
190
00:06:45,980 –> 00:06:47,900
team into a permanent approval board.
191
00:06:47,900 –> 00:06:49,940
Governance by design is how that happens.
192
00:06:49,940 –> 00:06:51,900
You define the minimum non-negotiables.
193
00:06:51,900 –> 00:06:53,620
You enforce them automatically, and you
194
00:06:53,620 –> 00:06:55,420
let teams innovate inside the box.
195
00:06:55,420 –> 00:06:57,460
This is where most architects make the wrong trade-off.
196
00:06:57,460 –> 00:07:00,500
They think enforcing guardrails kills velocity.
197
00:07:00,500 –> 00:07:02,220
What kills velocity is inconsistency.
198
00:07:02,220 –> 00:07:04,740
Every team reinventing patterns, every deployment
199
00:07:04,740 –> 00:07:07,860
producing a new variant, every exception becoming a bespoke
200
00:07:07,860 –> 00:07:09,740
snowflake that breaks the next automation.
201
00:07:09,740 –> 00:07:12,580
The weird part is the platform doesn’t care about your org chart.
202
00:07:12,580 –> 00:07:15,140
ARM doesn’t know what a critical workload is.
203
00:07:15,140 –> 00:07:16,940
It doesn’t know what temporary means.
204
00:07:16,940 –> 00:07:18,900
It doesn’t know that a contractor is deploying something
205
00:07:18,900 –> 00:07:20,540
you’ll inherit for five years.
206
00:07:20,540 –> 00:07:23,060
It just evaluates requests against the rules you actually
207
00:07:23,060 –> 00:07:24,060
implemented.
208
00:07:24,060 –> 00:07:27,060
So the design question becomes, what must always be true
209
00:07:27,060 –> 00:07:28,420
and at what scope?
210
00:07:28,420 –> 00:07:30,980
Because scope is where determinism is won or lost.
211
00:07:30,980 –> 00:07:33,020
If you enforce guardrails at the wrong level,
212
00:07:33,020 –> 00:07:35,420
you get either chaos or gridlock. Too high,
213
00:07:35,420 –> 00:07:37,740
and you block legitimate variation. Too low,
214
00:07:37,740 –> 00:07:39,420
and you create drift because every team
215
00:07:39,420 –> 00:07:41,340
creates its own version of policy.
216
00:07:41,340 –> 00:07:43,500
This is why governance starts with structure,
217
00:07:43,500 –> 00:07:46,220
a hierarchy that matches intent, and guardrails
218
00:07:46,220 –> 00:07:47,500
attached to that hierarchy,
219
00:07:47,500 –> 00:07:49,940
so inheritance does real work.
220
00:07:49,940 –> 00:07:51,980
And there’s one more uncomfortable truth.
221
00:07:51,980 –> 00:07:55,460
A deterministic model requires you to say no in advance
222
00:07:55,460 –> 00:07:57,500
in code before the request arrives.
223
00:07:57,500 –> 00:08:00,420
That means leaders must sponsor it, not approve it emotionally,
224
00:08:00,420 –> 00:08:02,060
sponsor it operationally.
225
00:08:02,060 –> 00:08:04,460
Because the first time a deployment fails in production
226
00:08:04,460 –> 00:08:06,740
due to a deny policy, somebody will call it a governance
227
00:08:06,740 –> 00:08:07,460
outage.
228
00:08:07,460 –> 00:08:08,900
They will demand an exception.
229
00:08:08,900 –> 00:08:11,460
And if leadership treats exceptions as negotiation instead
230
00:08:11,460 –> 00:08:14,300
of risk decisions, you’re back to probabilistic security
231
00:08:14,300 –> 00:08:15,020
within a month.
232
00:08:15,020 –> 00:08:17,340
So governance by design is not a policy project.
233
00:08:17,340 –> 00:08:18,580
It’s an operating stance.
234
00:08:18,580 –> 00:08:21,180
Define the boundaries, enforce them centrally,
235
00:08:21,180 –> 00:08:23,180
and make deviations expensive enough
236
00:08:23,180 –> 00:08:25,220
that teams only ask when the risk is real.
237
00:08:25,220 –> 00:08:26,820
Now, the obvious question is where to start.
238
00:08:26,820 –> 00:08:28,100
You start with the foundation that
239
00:08:28,100 –> 00:08:30,820
makes inheritance and boundaries possible at scale.
240
00:08:30,820 –> 00:08:33,380
Enterprise landing zones and management groups.
241
00:08:33,380 –> 00:08:36,100
Landing zones and management groups: where scale either works
242
00:08:36,100 –> 00:08:36,980
or doesn’t.
243
00:08:36,980 –> 00:08:40,220
An enterprise landing zone is not a template you deploy and forget.
244
00:08:40,220 –> 00:08:41,940
It’s the set of prerequisites that make
245
00:08:41,940 –> 00:08:43,740
every future workload boring.
246
00:08:43,740 –> 00:08:47,020
Identity boundaries, network boundaries, logging destinations,
247
00:08:47,020 –> 00:08:50,060
policy inheritance, and ownership models already in place
248
00:08:50,060 –> 00:08:52,060
before the first team shows up with a deadline.
249
00:08:52,060 –> 00:08:53,420
Most orgs do this backwards.
250
00:08:53,420 –> 00:08:54,820
They migrate the workload.
251
00:08:54,820 –> 00:08:57,020
Then they ask the platform team to govern it.
252
00:08:57,020 –> 00:08:59,580
That’s like building a city, then arguing about roads
253
00:08:59,580 –> 00:09:01,260
and water after people moved in.
254
00:09:01,260 –> 00:09:02,900
The platform will accept the chaos.
255
00:09:02,900 –> 00:09:04,140
The auditors will not.
256
00:09:04,140 –> 00:09:07,100
Landing zones are the pre-work that makes autonomy possible.
257
00:09:07,100 –> 00:09:09,100
They standardize what must be standardized
258
00:09:09,100 –> 00:09:11,540
and they leave everything else to the workload teams.
259
00:09:11,540 –> 00:09:13,500
The hierarchy is the enforcement surface.
260
00:09:13,500 –> 00:09:15,060
And if you get the hierarchy wrong,
261
00:09:15,060 –> 00:09:16,740
every control becomes expensive.
262
00:09:16,740 –> 00:09:18,260
At the top is the tenant root group.
263
00:09:18,260 –> 00:09:19,500
Under that are management groups.
264
00:09:19,500 –> 00:09:21,580
Under management groups are subscriptions.
265
00:09:21,580 –> 00:09:23,500
Under subscriptions are resource groups.
266
00:09:23,500 –> 00:09:25,220
Under resource groups are the resources.
267
00:09:25,220 –> 00:09:27,420
That chain matters because inheritance matters.
268
00:09:27,420 –> 00:09:29,420
Azure policy inherits down that tree.
269
00:09:29,420 –> 00:09:31,060
RBAC inherits down that tree.
270
00:09:31,060 –> 00:09:32,860
If your tree is messy, your governance
271
00:09:32,860 –> 00:09:34,780
becomes ClickOps archaeology.
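One way to picture inheritance down the tree is a walk from a scope up to the root, collecting whatever was assigned along the way. The scope names and policy names below are hypothetical, and the model is deliberately simplified (no exclusions, no exemptions, no resource groups).

```python
from typing import Optional

# Hypothetical hierarchy: child scope -> parent scope.
PARENT = {
    "mg-platform": "tenant-root",
    "mg-workloads": "tenant-root",
    "mg-prod": "mg-workloads",
    "sub-payments": "mg-prod",
}

# Hypothetical policy assignments made directly at each scope.
ASSIGNED = {
    "tenant-root": ["allowed-regions"],
    "mg-workloads": ["require-tags"],
    "mg-prod": ["deny-public-endpoints"],
}


def effective_policies(scope: str) -> list[tuple[str, str]]:
    """Walk from the scope up to the root, collecting (policy, assigned_at)."""
    result = []
    current: Optional[str] = scope
    while current is not None:
        for policy in ASSIGNED.get(current, []):
            result.append((policy, current))
        current = PARENT.get(current)
    return result
```

In this sketch, `effective_policies("sub-payments")` answers the incident-time question of where a given rule comes from, and a deeper tree simply means a longer walk to reason about.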
272
00:09:34,780 –> 00:09:36,060
This is where people get cute.
273
00:09:36,060 –> 00:09:37,980
They create management group hierarchies
274
00:09:37,980 –> 00:09:39,460
that look like org charts.
275
00:09:39,460 –> 00:09:42,060
Region, business unit, environment, application,
276
00:09:42,060 –> 00:09:44,380
team, project, fiscal year, the moon phase.
277
00:09:44,380 –> 00:09:45,340
It feels structured.
278
00:09:45,340 –> 00:09:46,820
It’s also unmanageable.
279
00:09:46,820 –> 00:09:49,340
A management group hierarchy should be shallow enough
280
00:09:49,340 –> 00:09:51,460
that humans can reason about it during an incident.
281
00:09:51,460 –> 00:09:53,580
Three to four levels is usually the practical ceiling,
282
00:09:53,580 –> 00:09:54,900
not because Microsoft says so.
283
00:09:54,900 –> 00:09:57,060
Because every additional level becomes another place
284
00:09:57,060 –> 00:10:00,180
for conflicting policy assignments, RBAC sprawl,
285
00:10:00,180 –> 00:10:03,180
and “why does the subscription inherit that deny policy?”
286
00:10:03,180 –> 00:10:05,380
Depth creates ambiguity.
287
00:10:05,380 –> 00:10:06,940
Breadth creates delegation.
288
00:10:06,940 –> 00:10:08,220
You want breadth.
289
00:10:08,220 –> 00:10:11,260
The governing principle is simple: separate by intent.
290
00:10:11,260 –> 00:10:13,100
Platform intent: shared services
291
00:10:13,100 –> 00:10:16,380
and foundational components that should be tightly controlled.
292
00:10:16,380 –> 00:10:19,540
Workload intent: applications that need freedom inside guardrails.
293
00:10:19,540 –> 00:10:23,660
Sandbox intent: experimentation, where you accept risk,
294
00:10:23,660 –> 00:10:25,140
but you contain it.
295
00:10:25,140 –> 00:10:28,380
Production intent: workloads that must meet the baseline
296
00:10:28,380 –> 00:10:30,300
with tight exception handling.
297
00:10:30,300 –> 00:10:32,220
Regulated intent: data boundaries where
298
00:10:32,220 –> 00:10:35,460
you’re not negotiating encryption, logging, or exposure.
299
00:10:35,460 –> 00:10:37,420
Those are boundary decisions.
300
00:10:37,420 –> 00:10:39,580
And boundary decisions are how you avoid pretending
301
00:10:39,580 –> 00:10:41,980
a single tenant is one uniform risk domain.
302
00:10:41,980 –> 00:10:43,980
This is also why management group design is not
303
00:10:43,980 –> 00:10:45,100
an Azure feature.
304
00:10:45,100 –> 00:10:47,140
It’s how you encode your enterprise risk model
305
00:10:47,140 –> 00:10:48,420
into the control plane.
306
00:10:48,420 –> 00:10:50,180
A common pattern that survives reality
307
00:10:50,180 –> 00:10:52,700
is a top-level split between platform and workloads.
308
00:10:52,700 –> 00:10:54,300
Platform gets its own management group
309
00:10:54,300 –> 00:10:57,180
for identity, network, security, tooling, and logging.
310
00:10:57,180 –> 00:10:59,740
It typically hosts subscriptions for shared services,
311
00:10:59,740 –> 00:11:02,940
hub networking, central monitoring, identity-related services,
312
00:11:02,940 –> 00:11:06,180
and anything that should not be modified by app teams.
313
00:11:06,180 –> 00:11:08,540
Workloads sit under their own management groups
314
00:11:08,540 –> 00:11:10,660
segmented by environment and risk.
315
00:11:10,660 –> 00:11:13,260
Production and non-production should not share inheritance
316
00:11:13,260 –> 00:11:16,020
unless you enjoy explaining why dev deployments
317
00:11:16,020 –> 00:11:18,460
can bypass guardrails that prod must obey.
318
00:11:18,460 –> 00:11:20,780
Then you isolate sandboxes. Not to punish teams.
319
00:11:20,780 –> 00:11:24,220
To contain novelty. Sandboxes are where teams learn, prototype,
320
00:11:24,220 –> 00:11:25,380
and break things.
321
00:11:25,380 –> 00:11:27,900
The enterprise just refuses to let those breaks propagate
322
00:11:27,900 –> 00:11:30,460
into regulated or production boundaries.
323
00:11:30,460 –> 00:11:32,260
Now here’s the part most people miss.
324
00:11:32,260 –> 00:11:34,260
A landing zone is not only structure.
325
00:11:34,260 –> 00:11:35,100
It’s a contract.
326
00:11:35,100 –> 00:11:37,220
If a subscription lives in this management group,
327
00:11:37,220 –> 00:11:39,660
it inherits these policies, these role assignments,
328
00:11:39,660 –> 00:11:41,340
and these logging requirements.
329
00:11:41,340 –> 00:11:43,260
That contract is what makes subscription
330
00:11:43,260 –> 00:11:45,500
vending and self-service possible later.
331
00:11:45,500 –> 00:11:47,260
Without that contract, every new subscription
332
00:11:47,260 –> 00:11:49,060
becomes a bespoke negotiation.
333
00:11:49,060 –> 00:11:50,900
You lose the ability to scale.
334
00:11:50,900 –> 00:11:52,300
And once you have that contract,
335
00:11:52,300 –> 00:11:54,380
you can keep the hierarchy understandable
336
00:11:54,380 –> 00:11:55,980
by avoiding ornamental layers.
337
00:11:55,980 –> 00:11:58,020
If a management group level doesn’t change policy,
338
00:11:58,020 –> 00:12:00,820
RBAC, or logging posture, it’s probably just decoration.
339
00:12:00,820 –> 00:12:02,500
Decoration becomes entropy.
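The "ornamental layer" test can be sketched as a simple tree check: flag any non-root level that assigns no policy, no role, and no logging destination of its own. The `ManagementGroup` class below is a hypothetical in-memory model for illustration, not an Azure SDK type.

```python
from dataclasses import dataclass, field


@dataclass
class ManagementGroup:
    """Hypothetical in-memory model of one level in the hierarchy."""
    name: str
    policies: list[str] = field(default_factory=list)
    role_assignments: list[str] = field(default_factory=list)
    log_destinations: list[str] = field(default_factory=list)
    children: list["ManagementGroup"] = field(default_factory=list)


def ornamental_layers(group: ManagementGroup, is_root: bool = True) -> list[str]:
    """Names of non-root groups that change neither policy, RBAC, nor logging."""
    found = []
    changes_something = bool(
        group.policies or group.role_assignments or group.log_destinations
    )
    if not is_root and not changes_something:
        found.append(group.name)
    for child in group.children:
        found.extend(ornamental_layers(child, is_root=False))
    return found
```

Anything this returns is a candidate for removal, assuming the level really has no other purpose such as a planned future assignment.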
340
00:12:02,500 –> 00:12:04,340
So the objective is a hierarchy
341
00:12:04,340 –> 00:12:06,820
where moving a subscription is a meaningful action.
342
00:12:06,820 –> 00:12:07,780
It changes the rules.
343
00:12:07,780 –> 00:12:09,100
It changes the blast radius.
344
00:12:09,100 –> 00:12:10,220
It changes who can do what.
345
00:12:10,220 –> 00:12:12,540
That’s how you know your structure is doing real work.
346
00:12:12,540 –> 00:12:14,700
In the next section, the conversation gets even more
347
00:12:14,700 –> 00:12:16,540
uncomfortable because subscriptions
348
00:12:16,540 –> 00:12:18,020
are where governance becomes expensive
349
00:12:18,020 –> 00:12:20,020
if you pretend they’re only billing containers.
350
00:12:20,020 –> 00:12:20,860
They are not.
351
00:12:20,860 –> 00:12:22,780
They are your primary boundary for cost,
352
00:12:22,780 –> 00:12:25,940
access, policy inheritance, and incident containment.
353
00:12:25,940 –> 00:12:27,580
Subscription strategy: billing boundary,
354
00:12:27,580 –> 00:12:29,820
security boundary, blast radius boundary.
355
00:12:29,820 –> 00:12:32,100
Subscriptions are not where you put workloads.
356
00:12:32,100 –> 00:12:33,340
They are where you draw boundaries
357
00:12:33,340 –> 00:12:34,980
the enterprise can actually enforce.
358
00:12:34,980 –> 00:12:36,660
Billing boundary is the obvious one.
359
00:12:36,660 –> 00:12:38,580
One subscription, one cost container,
360
00:12:38,580 –> 00:12:40,580
you can budget, alert, and attribute.
361
00:12:40,580 –> 00:12:42,580
But if you stop there, you’ll build a tenant
362
00:12:42,580 –> 00:12:46,540
that looks tidy in finance reports and chaotic everywhere else.
363
00:12:46,540 –> 00:12:48,660
Because subscriptions are also security boundaries.
364
00:12:48,660 –> 00:12:50,820
They are the scope where RBAC assignments become
365
00:12:50,820 –> 00:12:53,940
survivable, where policy inheritance becomes predictable,
366
00:12:53,940 –> 00:12:55,780
and where incidents become containable.
367
00:12:55,780 –> 00:12:57,820
When something goes wrong, you want the failure domain
368
00:12:57,820 –> 00:12:59,940
to be a subscription, not the entire tenant
369
00:12:59,940 –> 00:13:02,180
because everyone is Contributor at root.
370
00:13:02,180 –> 00:13:03,540
This is a boundary decision.
371
00:13:03,540 –> 00:13:06,260
And boundary decisions are how you keep blast radius
372
00:13:06,260 –> 00:13:08,900
from turning into organizational trauma.
373
00:13:08,900 –> 00:13:10,460
Start with the principle.
374
00:13:10,460 –> 00:13:12,020
Group subscriptions under management groups
375
00:13:12,020 –> 00:13:14,820
based on shared governance needs, not based on org charts.
376
00:13:14,820 –> 00:13:17,540
If two subscriptions need different deny policies,
377
00:13:17,540 –> 00:13:19,860
different logging destinations, different network models
378
00:13:19,860 –> 00:13:21,860
or different access patterns, they should not
379
00:13:21,860 –> 00:13:23,580
share the same inheritance surface.
380
00:13:23,580 –> 00:13:25,300
Every time you pretend they’re the same,
381
00:13:25,300 –> 00:13:27,380
you create policy exceptions later.
382
00:13:27,380 –> 00:13:29,420
Those exceptions become permanent forks.
383
00:13:29,420 –> 00:13:32,260
A common enterprise pattern that works is environment separation,
384
00:13:32,260 –> 00:13:34,540
dev, test, and prod in separate subscriptions.
385
00:13:34,540 –> 00:13:36,180
Not because Microsoft requires it,
386
00:13:36,180 –> 00:13:37,980
because it gives you three useful properties,
387
00:13:37,980 –> 00:13:39,780
different access, different policy strictness
388
00:13:39,780 –> 00:13:41,420
and different incident containment.
389
00:13:41,420 –> 00:13:44,460
Dev can tolerate broader experimentation; prod cannot.
390
00:13:44,460 –> 00:13:45,620
And when you put them together,
391
00:13:45,620 –> 00:13:48,100
you inevitably drift prod down to dev behavior
392
00:13:48,100 –> 00:13:49,980
through temporary waivers.
393
00:13:49,980 –> 00:13:52,100
Another pattern is business unit separation,
394
00:13:52,100 –> 00:13:54,420
but only when business units are truly autonomous
395
00:13:54,420 –> 00:13:55,620
and need isolation.
396
00:13:55,620 –> 00:13:58,980
If business units share platform services and security posture,
397
00:13:58,980 –> 00:14:00,820
splitting subscriptions per business unit
398
00:14:00,820 –> 00:14:03,780
can create redundant work and inconsistent standards.
399
00:14:03,780 –> 00:14:05,700
Separation is not automatically governance.
400
00:14:05,700 –> 00:14:06,940
It’s just multiplication.
401
00:14:06,940 –> 00:14:10,140
Use it only when it reduces risk or clarifies ownership.
402
00:14:10,140 –> 00:14:12,420
Regulated workloads are the clearest case.
403
00:14:12,420 –> 00:14:14,940
If a workload has data residency constraints,
404
00:14:14,940 –> 00:14:17,940
higher audit requirements, or strict network exposure rules,
405
00:14:17,940 –> 00:14:20,140
it needs an isolated subscription boundary.
406
00:14:20,140 –> 00:14:22,460
Otherwise, the regulated baseline becomes optional
407
00:14:22,460 –> 00:14:24,860
because it competes with non-regulated delivery pressure
408
00:14:24,860 –> 00:14:26,380
in the same inheritance tree.
409
00:14:26,380 –> 00:14:28,300
Now the anti-patterns. They are painfully common.
410
00:14:28,300 –> 00:14:29,940
First, one subscription for everything.
411
00:14:29,940 –> 00:14:33,140
This feels efficient until the first time you try to delegate.
412
00:14:33,140 –> 00:14:35,780
Then you either grant broad rights to unblock teams
413
00:14:35,780 –> 00:14:37,740
or you centralize everything through tickets.
414
00:14:37,740 –> 00:14:38,980
Either way, you lose.
415
00:14:38,980 –> 00:14:41,140
One big subscription becomes a shared blast radius
416
00:14:41,140 –> 00:14:42,700
and a shared blame domain.
417
00:14:42,700 –> 00:14:44,620
Second, random per-team subscriptions
418
00:14:44,620 –> 00:14:46,140
with no inheritance strategy.
419
00:14:46,140 –> 00:14:47,300
Teams get autonomy,
420
00:14:47,300 –> 00:14:49,060
but the enterprise gets entropy.
421
00:14:49,060 –> 00:14:50,420
Policies are inconsistent,
422
00:14:50,420 –> 00:14:52,100
RBAC differs per team,
423
00:14:52,100 –> 00:14:53,940
logging ends up in multiple workspaces
424
00:14:53,940 –> 00:14:55,980
and your SOC spends its time correlating
425
00:14:55,980 –> 00:14:57,540
across fractured telemetry.
426
00:14:57,540 –> 00:14:59,180
This is how you end up with cloud sprawl
427
00:14:59,180 –> 00:15:01,420
and no credible compliance story.
428
00:15:01,420 –> 00:15:03,580
Third, subscriptions created ad hoc
429
00:15:03,580 –> 00:15:05,540
with no subscription-vending model.
430
00:15:05,540 –> 00:15:07,300
If a subscription is born through a portal
431
00:15:07,300 –> 00:15:09,580
click, it will inherit whatever defaults happened
432
00:15:09,580 –> 00:15:11,540
to exist that day, and those defaults change.
433
00:15:11,540 –> 00:15:12,900
That means your governance baseline
434
00:15:12,900 –> 00:15:15,460
becomes a matter of timing, not intent.
435
00:15:15,460 –> 00:15:18,220
That’s an unacceptable property in an enterprise system.
436
00:15:18,220 –> 00:15:20,540
A proper subscription strategy answers four questions
437
00:15:20,540 –> 00:15:21,380
up front.
438
00:15:21,380 –> 00:15:22,700
Who owns this subscription?
439
00:15:22,700 –> 00:15:23,820
Not who pays for it.
440
00:15:23,820 –> 00:15:26,300
Who’s accountable for what exists inside it?
441
00:15:26,300 –> 00:15:27,580
What baseline applies?
442
00:15:27,580 –> 00:15:28,820
Which initiatives are assigned?
443
00:15:28,820 –> 00:15:30,460
Which denies are non-negotiable?
444
00:15:30,460 –> 00:15:31,900
Which controls are audit first?
445
00:15:31,900 –> 00:15:33,020
What access model applies?
446
00:15:33,020 –> 00:15:34,100
Which groups get contributor?
447
00:15:34,100 –> 00:15:35,940
Which groups are eligible for elevation?
448
00:15:35,940 –> 00:15:37,460
And which roles are forbidden?
449
00:15:37,460 –> 00:15:39,260
And what is the blast radius expectation?
450
00:15:39,260 –> 00:15:41,820
If this subscription is compromised or misconfigured,
451
00:15:41,820 –> 00:15:44,260
what is the maximum damage it can do by design?
452
00:15:44,260 –> 00:15:46,740
If you can’t answer those, don’t create the subscription yet.
453
00:15:46,740 –> 00:15:48,260
You’re not creating capacity.
454
00:15:48,260 –> 00:15:50,220
You’re creating future incident scope.
455
00:15:50,220 –> 00:15:51,460
And here’s the uncomfortable truth.
456
00:15:51,460 –> 00:15:54,620
Subscriptions are where enterprises try to avoid saying no
457
00:15:54,620 –> 00:15:56,100
and then they pay for it later.
458
00:15:56,100 –> 00:15:57,580
They create a shared subscription
459
00:15:57,580 –> 00:15:59,140
because it’s easier today.
460
00:15:59,140 –> 00:16:01,500
Then they spend years untangling access, policies,
461
00:16:01,500 –> 00:16:02,620
and billing allocations.
462
00:16:02,620 –> 00:16:04,580
The platform didn’t become complicated.
463
00:16:04,580 –> 00:16:05,980
The boundary choices did.
464
00:16:05,980 –> 00:16:07,580
Once you have subscription boundaries
465
00:16:07,580 –> 00:16:10,540
aligned to intent, the next entropy source is access.
466
00:16:10,540 –> 00:16:12,580
Because even with perfect hierarchy,
467
00:16:12,580 –> 00:16:15,540
one careless owner assignment can punch through all your design
468
00:16:15,540 –> 00:16:16,300
assumptions.
469
00:16:16,300 –> 00:16:18,700
Identity and RBAC: stop assigning people,
470
00:16:18,700 –> 00:16:19,980
start assigning intent.
471
00:16:19,980 –> 00:16:21,900
Once subscription boundaries exist,
472
00:16:21,900 –> 00:16:24,260
identity becomes the fastest way to destroy them.
473
00:16:24,260 –> 00:16:26,500
Azure RBAC is simple on paper: who can do what,
474
00:16:26,500 –> 00:16:29,020
where. Principal, role, scope.
475
00:16:29,020 –> 00:16:31,260
In reality, it becomes an authorization graph
476
00:16:31,260 –> 00:16:34,220
that quietly sprawls until nobody can tell you why
477
00:16:34,220 –> 00:16:36,700
an intern can delete a production firewall.
478
00:16:36,700 –> 00:16:39,460
The foundational mistake is treating RBAC as HR,
479
00:16:39,460 –> 00:16:42,220
assigning access to named humans because it’s fast.
480
00:16:42,220 –> 00:16:43,500
Humans are not stable.
481
00:16:43,500 –> 00:16:45,780
Teams change, vendors rotate, people go on leave,
482
00:16:45,780 –> 00:16:47,340
and accounts get compromised.
483
00:16:47,340 –> 00:16:49,460
If your governance model depends on individuals
484
00:16:49,460 –> 00:16:51,300
being careful forever, you’ve already lost.
485
00:16:51,300 –> 00:16:54,620
RBAC needs to express intent, not personalities.
486
00:16:54,620 –> 00:16:55,900
Intent is stable.
487
00:16:55,900 –> 00:16:58,100
Workload operators can restart resources.
488
00:16:58,100 –> 00:17:00,060
Platform engineers can manage network.
489
00:17:00,060 –> 00:17:01,540
Security can read posture.
490
00:17:01,540 –> 00:17:04,060
Automation can deploy, but not assign roles.
491
00:17:04,060 –> 00:17:05,180
Those are durable statements.
492
00:17:05,180 –> 00:17:06,500
So the enterprise law is this.
493
00:17:06,500 –> 00:17:09,020
Assign roles to groups, not users.
494
00:17:09,020 –> 00:17:10,380
Not because it’s fashionable,
495
00:17:10,380 –> 00:17:13,140
because it’s the only way off-boarding works at scale.
496
00:17:13,140 –> 00:17:14,860
If a person leaves and you have to search
497
00:17:14,860 –> 00:17:17,420
for their direct assignments across subscriptions,
498
00:17:17,420 –> 00:17:19,580
you’ve built a breach persistence mechanism.
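A minimal sketch of what hunting for direct assignments looks like, assuming you have exported role assignments as a list of records. The field names (`principalType`, `principalName`, `roleDefinitionName`, `scope`) follow the shape the Azure CLI prints for role assignments, but treat them as an assumption and check your own export. The sample data, names, and subscription IDs are made up.

```python
from collections import defaultdict


def direct_user_assignments(assignments):
    """Keep only assignments granted to a named user, not a group or workload identity."""
    return [a for a in assignments if a.get("principalType") == "User"]


def assignments_by_user(assignments):
    """Map each user to the (role, scope) pairs they hold directly."""
    by_user = defaultdict(list)
    for a in direct_user_assignments(assignments):
        by_user[a["principalName"]].append((a["roleDefinitionName"], a["scope"]))
    return dict(by_user)


# Hypothetical sample data for illustration only.
SAMPLE = [
    {"principalType": "Group", "principalName": "sub-prod-payments-operators",
     "roleDefinitionName": "Contributor", "scope": "/subscriptions/0000-prod"},
    {"principalType": "User", "principalName": "alex@example.com",
     "roleDefinitionName": "Owner", "scope": "/subscriptions/0000-prod"},
    {"principalType": "User", "principalName": "alex@example.com",
     "roleDefinitionName": "Reader", "scope": "/subscriptions/0000-dev"},
    {"principalType": "ServicePrincipal", "principalName": "deploy-pipeline",
     "roleDefinitionName": "Contributor", "scope": "/subscriptions/0000-dev"},
]

if __name__ == "__main__":
    for user, grants in assignments_by_user(SAMPLE).items():
        print(user, grants)
```

Every entry this returns is something offboarding has to find by hand. With group-based assignments the list should be empty.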
499
00:17:19,580 –> 00:17:21,300
Group membership is the control surface.
500
00:17:21,300 –> 00:17:23,780
The identity platform already knows how to manage groups.
501
00:17:23,780 –> 00:17:27,060
Your job is to make group design match boundary design.
502
00:17:27,060 –> 00:17:28,860
That means you don’t create one group called
503
00:17:28,860 –> 00:17:30,940
Azure admins and call it governance.
504
00:17:30,940 –> 00:17:33,660
You create groups that encode scope and purpose.
505
00:17:33,660 –> 00:17:35,580
Not elegant names, useful names.
506
00:17:35,580 –> 00:17:38,580
A group should imply exactly where it has access and why.
507
00:17:38,580 –> 00:17:40,060
And then there’s scope discipline.
508
00:17:40,060 –> 00:17:42,060
Most people think scope is about convenience.
509
00:17:42,060 –> 00:17:43,340
It’s about blast radius.
510
00:17:43,340 –> 00:17:46,140
Azure gives you scopes in descending order of danger.
511
00:17:46,140 –> 00:17:49,180
Management group, subscription, resource group, resource.
512
00:17:49,180 –> 00:17:52,060
The higher you assign, the more inheritance you create.
513
00:17:52,060 –> 00:17:54,380
Inheritance is powerful, and it is also how
514
00:17:54,380 –> 00:17:56,820
privilege spreads when nobody is paying attention.
515
00:17:56,820 –> 00:17:59,060
The rule is assign at the highest scope
516
00:17:59,060 –> 00:18:00,860
that meets the requirement but no higher.
517
00:18:00,860 –> 00:18:03,140
That sounds contradictory until you understand the goal.
518
00:18:03,140 –> 00:18:06,300
You want minimal assignments, but you want minimal blast radius.
519
00:18:06,300 –> 00:18:08,140
If a team truly owns a subscription,
520
00:18:08,140 –> 00:18:09,700
then assign at subscription scope.
521
00:18:09,700 –> 00:18:13,060
If they own one application, assign at that resource group.
522
00:18:13,060 –> 00:18:14,980
If they only need access to one key vault,
523
00:18:14,980 –> 00:18:16,980
do not give them contributor at the resource group
524
00:18:16,980 –> 00:18:17,900
because you’re tired.
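A rough sketch of scope discipline as something you can check, assuming role assignments are available as records with a `scope` string. The scope formats are the standard Azure resource ID shapes for the four levels just named. The helper names and the sample IDs are hypothetical.

```python
import re

# Ordered from broadest to narrowest, matching the four scopes named above.
SCOPE_PATTERNS = [
    ("management group",
     re.compile(r"^/providers/Microsoft\.Management/managementGroups/[^/]+$", re.I)),
    ("subscription",
     re.compile(r"^/subscriptions/[^/]+$", re.I)),
    ("resource group",
     re.compile(r"^/subscriptions/[^/]+/resourceGroups/[^/]+$", re.I)),
    ("resource",
     re.compile(r"^/subscriptions/[^/]+/resourceGroups/[^/]+/providers/.+$", re.I)),
]
LEVELS = [name for name, _ in SCOPE_PATTERNS]


def scope_level(scope):
    """Classify a scope string into one of the four levels, or 'unknown'."""
    for level, pattern in SCOPE_PATTERNS:
        if pattern.match(scope):
            return level
    return "unknown"


def broadest_first(assignments):
    """Sort assignments so the widest blast radius is reviewed first."""
    def rank(a):
        level = scope_level(a["scope"])
        return LEVELS.index(level) if level in LEVELS else len(LEVELS)
    return sorted(assignments, key=rank)
```

Sorting a role assignment export this way puts management group and subscription grants at the top of the review, which is where a tired shortcut does the most damage.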
525
00:18:17,900 –> 00:18:19,940
This is also where people misuse contributor.
526
00:18:19,940 –> 00:18:21,260
Contributor is not developer.
527
00:18:21,260 –> 00:18:23,620
Contributor is “can change almost everything.”
528
00:18:23,620 –> 00:18:27,100
If your developers can change network, identity-related resources,
529
00:18:27,100 –> 00:18:29,340
policy assignments, or logging configuration,
530
00:18:29,340 –> 00:18:31,780
you’ve given them the ability to erase the guardrails
531
00:18:31,780 –> 00:18:32,980
that made them safe.
532
00:18:32,980 –> 00:18:33,820
And Owner is worse.
533
00:18:33,820 –> 00:18:35,380
Owner is not a convenience role.
534
00:18:35,380 –> 00:18:37,420
Owner is a breach multiplier because it includes
535
00:18:37,420 –> 00:18:39,140
the ability to assign roles.
536
00:18:39,140 –> 00:18:41,020
When an identity is compromised,
537
00:18:41,020 –> 00:18:43,580
owner turns that compromise into persistence.
538
00:18:43,580 –> 00:18:45,740
Attackers don’t just deploy resources.
539
00:18:45,740 –> 00:18:49,660
They grant themselves access that survives password resets.
540
00:18:49,660 –> 00:18:51,900
Microsoft guidance and least-privilege patterns
541
00:18:51,900 –> 00:18:55,140
commonly advises keeping subscription owners extremely limited.
542
00:18:55,140 –> 00:18:58,580
In practice, the right number is as few as you can operate with
543
00:18:58,580 –> 00:19:01,060
and it should be monitored like a toxic asset.
544
00:19:01,060 –> 00:19:02,740
Now the part most enterprises avoid
545
00:19:02,740 –> 00:19:05,540
because it creates conflict: separating duties.
546
00:19:05,540 –> 00:19:07,940
A platform team should not be the same identity cohort
547
00:19:07,940 –> 00:19:09,340
as workload operators.
548
00:19:09,340 –> 00:19:12,060
Security readers should not also be deployment writers.
549
00:19:12,060 –> 00:19:14,620
Auditors should not be troubleshooters with contributor.
550
00:19:14,620 –> 00:19:17,380
When you combine duties, you don’t just increase risk.
551
00:19:17,380 –> 00:19:19,020
You erase accountability.
552
00:19:19,020 –> 00:19:21,140
Every incident becomes someone with broad access
553
00:19:21,140 –> 00:19:23,020
did something and you can’t prove intent.
554
00:19:23,020 –> 00:19:24,860
So define the roles as intent lanes.
555
00:19:24,860 –> 00:19:28,420
Platform lane: manages shared services, network, identity
556
00:19:28,420 –> 00:19:30,780
integrations, logging destinations,
557
00:19:30,780 –> 00:19:32,540
rarely touches workload resources.
558
00:19:32,540 –> 00:19:34,900
Workload lane: deploys and operates applications
559
00:19:34,900 –> 00:19:36,460
inside the subscription boundary,
560
00:19:36,460 –> 00:19:38,020
but cannot rewrite the boundary.
561
00:19:38,020 –> 00:19:40,500
Security lane: reads posture, reads logs,
562
00:19:40,500 –> 00:19:42,420
can request elevation for investigations
563
00:19:42,420 –> 00:19:45,020
but does not own production mutations by default.
564
00:19:45,020 –> 00:19:47,220
Automation lane: service principals
565
00:19:47,220 –> 00:19:49,900
and managed identities that deploy the approved patterns
566
00:19:49,900 –> 00:19:51,020
and nothing else.
567
00:19:51,020 –> 00:19:52,020
If you do this right,
568
00:19:52,020 –> 00:19:53,940
RBAC stops being an access spreadsheet
569
00:19:53,940 –> 00:19:55,860
and becomes a boundary enforcement tool.
570
00:19:55,860 –> 00:19:58,980
Teams can move faster because they know what they’re allowed to do
571
00:19:58,980 –> 00:20:00,180
without asking.
572
00:20:00,180 –> 00:20:02,180
The platform team stops being a permission desk
573
00:20:02,180 –> 00:20:04,100
because permissions express design.
574
00:20:04,100 –> 00:20:05,900
But RBAC still degrades over time
575
00:20:05,900 –> 00:20:07,780
because standing privilege accumulates.
576
00:20:07,780 –> 00:20:09,900
People get temporary access and never lose it.
577
00:20:09,900 –> 00:20:12,220
Emergency fixes become permanent roles
578
00:20:12,220 –> 00:20:14,740
and eventually, least privilege becomes a slogan.
579
00:20:14,740 –> 00:20:16,300
That’s why RBAC alone is not enough.
580
00:20:16,300 –> 00:20:18,620
You need time-bound elevation as the default behavior
581
00:20:18,620 –> 00:20:21,140
and that means privileged identity management.
582
00:20:21,140 –> 00:20:22,660
Privileged Identity Management:
583
00:20:22,660 –> 00:20:25,580
standing privilege is just deferred incident response.
584
00:20:25,580 –> 00:20:28,260
Most enterprises say they believe in least privilege.
585
00:20:28,260 –> 00:20:30,220
Then they hand out permanent admin rights
586
00:20:30,220 –> 00:20:32,700
because people need to do their jobs.
587
00:20:32,700 –> 00:20:34,180
That isn’t least privilege.
588
00:20:34,180 –> 00:20:36,100
That’s pre-approved incident response.
589
00:20:36,100 –> 00:20:39,220
Standing privilege is just risk you haven’t been forced to pay for yet.
590
00:20:39,220 –> 00:20:41,820
PIM exists because RBAC assignments rot.
591
00:20:41,820 –> 00:20:43,860
They rot for the same reason everything else does.
592
00:20:43,860 –> 00:20:45,860
Humans don’t come back later to remove access
593
00:20:45,860 –> 00:20:46,860
they no longer need.
594
00:20:46,860 –> 00:20:49,140
The urgent work finishes, the ticket closes,
595
00:20:49,140 –> 00:20:52,860
and the elevated role quietly becomes part of someone’s identity for years.
596
00:20:52,860 –> 00:20:54,620
And in Azure, that’s not a small mistake.
597
00:20:54,620 –> 00:20:57,700
A single persistent Owner or User Access Administrator assignment
598
00:20:57,700 –> 00:20:59,940
is a persistence mechanism for attackers,
599
00:20:59,940 –> 00:21:02,380
a compliance finding waiting to happen,
600
00:21:02,380 –> 00:21:05,020
and a guaranteed “we didn’t know they still had that” moment
601
00:21:05,020 –> 00:21:06,580
during an incident review.
602
00:21:06,580 –> 00:21:09,100
Privileged identity management flips the model.
603
00:21:09,100 –> 00:21:11,100
It separates entitlement from activation.
604
00:21:11,100 –> 00:21:14,100
Eligible means the identity is allowed to request elevation.
605
00:21:14,100 –> 00:21:15,900
Active means it is elevated right now.
606
00:21:15,900 –> 00:21:18,420
That distinction matters because it turns privileged access
607
00:21:18,420 –> 00:21:20,700
from a default state into an event.
608
00:21:20,700 –> 00:21:24,060
Events can be audited, events can be constrained, events can expire.
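To make the eligible versus active split concrete, here is a toy model. It is not the PIM API. Every class and function name is hypothetical, and it only illustrates the idea that an activation is a record with a justification and an end time.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Eligibility:
    """The entitlement: this principal may request this role at this scope."""
    principal: str
    role: str
    scope: str
    max_duration: timedelta


@dataclass(frozen=True)
class Activation:
    """The event: an elevation with a stated reason and a hard end time."""
    eligibility: Eligibility
    justification: str
    starts_at: datetime
    expires_at: datetime

    def is_active(self, now):
        return self.starts_at <= now < self.expires_at


def activate(eligibility, justification, duration, now):
    """Create a time-bound activation, refusing requests with no reason or no end."""
    if not justification.strip():
        raise ValueError("activation requires a justification tied to a task")
    if duration <= timedelta(0) or duration > eligibility.max_duration:
        raise ValueError("duration must be positive and within the allowed maximum")
    return Activation(eligibility, justification, now, now + duration)


if __name__ == "__main__":
    elig = Eligibility("sec-prod-investigators", "Contributor",
                       "/subscriptions/0000-prod", timedelta(hours=4))
    start = datetime(2025, 1, 1, 9, 0, tzinfo=timezone.utc)
    act = activate(elig, "INC-1234: investigate firewall change", timedelta(hours=2), start)
    print(act.is_active(start + timedelta(hours=1)))
    print(act.is_active(start + timedelta(hours=3)))
```

The point of the model is that there is no way to construct an activation without an expiry, so "until further notice" cannot be expressed.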
609
00:21:24,060 –> 00:21:28,220
In other words, PIM makes privilege behave like a controlled resource
610
00:21:28,220 –> 00:21:30,060
instead of a personal benefit.
611
00:21:30,060 –> 00:21:31,060
Here’s the rule of thumb.
612
00:21:31,060 –> 00:21:33,140
If someone’s job requires admin rights all day,
613
00:21:33,140 –> 00:21:34,820
every day, the job is poorly designed.
614
00:21:34,820 –> 00:21:36,100
Admin is not a job function.
615
00:21:36,100 –> 00:21:37,300
It’s an escalation path.
616
00:21:37,300 –> 00:21:40,980
So the operating model becomes: everyone runs with normal RBAC roles
617
00:21:40,980 –> 00:21:43,780
most of the time and the few actions that require high privilege
618
00:21:43,780 –> 00:21:48,460
are done through time-bound activation with friction that forces intent.
619
00:21:48,460 –> 00:21:51,980
And yes, the friction is the point. That friction is where most organizations
620
00:21:51,980 –> 00:21:55,580
break things because leaders treat it like a negotiation with developers
621
00:21:55,580 –> 00:21:58,020
instead of a safety mechanism for the enterprise.
622
00:21:58,020 –> 00:22:00,580
PIM only works when leadership mandates it as a standard,
623
00:22:00,580 –> 00:22:03,220
not when a platform team tries to encourage it.
624
00:22:03,220 –> 00:22:06,580
The first week you enforce it, someone will complain that approvals slow them down.
625
00:22:06,580 –> 00:22:09,540
They will ask for permanent access just for this project.
626
00:22:09,540 –> 00:22:11,540
If that request succeeds, you don’t have PIM.
627
00:22:11,540 –> 00:22:12,700
You have a bypass process.
628
00:22:12,700 –> 00:22:14,140
So you need a few non-negotiables.
629
00:22:14,140 –> 00:22:17,300
First, time limits. Privileged roles expire.
630
00:22:17,300 –> 00:22:18,300
Always.
631
00:22:18,300 –> 00:22:20,620
There is no such thing as until further notice.
632
00:22:20,620 –> 00:22:24,100
If you can’t set an end time, you are approving a permanent exception
633
00:22:24,100 –> 00:22:25,660
and we already covered how that ends.
634
00:22:25,660 –> 00:22:29,140
Second, justification. Not a paragraph of theater.
635
00:22:29,140 –> 00:22:32,860
A real reason tied to a task, a ticket, a change request, an incident.
636
00:22:32,860 –> 00:22:35,700
If the request can’t name what it’s doing, it shouldn’t be elevated.
637
00:22:35,700 –> 00:22:36,820
Third, MFA.
638
00:22:36,820 –> 00:22:41,300
Privileged activation without strong authentication is just convenience layered on top of risk.
639
00:22:41,300 –> 00:22:44,420
Fourth, approvals for the roles that can change boundaries.
640
00:22:44,420 –> 00:22:47,220
Some roles should be self-activated for operational speed.
641
00:22:47,220 –> 00:22:50,740
Others should require a second set of eyes because they can rewrite the system.
642
00:22:50,740 –> 00:22:53,300
And the system-defining roles are predictable.
643
00:22:53,300 –> 00:22:55,700
Owner and User Access Administrator.
644
00:22:55,700 –> 00:22:58,180
Anything that can assign roles or change permissions
645
00:22:58,180 –> 00:22:59,780
is not an operational role.
646
00:22:59,780 –> 00:23:00,860
It’s a governance role.
647
00:23:00,860 –> 00:23:01,700
Treat it like one.
648
00:23:01,700 –> 00:23:04,460
Now, eligible versus active is only the mechanics.
649
00:23:04,460 –> 00:23:06,620
The real design is separation of duties.
650
00:23:06,620 –> 00:23:10,060
Platform owners are the people who define and maintain the guardrails.
651
00:23:10,060 –> 00:23:13,500
Policy assignments, management group structure, subscription vending,
652
00:23:13,500 –> 00:23:15,860
logging destinations, network baselines.
653
00:23:15,860 –> 00:23:18,980
Workload operators run applications inside those boundaries.
654
00:23:18,980 –> 00:23:21,780
Deploy, scale, patch, troubleshoot.
655
00:23:21,780 –> 00:23:24,540
Auditors and security teams need visibility and evidence,
656
00:23:24,540 –> 00:23:29,820
read access, compliance posture, logs and the ability to request temporary elevation for investigations.
657
00:23:29,820 –> 00:23:32,300
But when you blur those roles, PIM becomes cosmetic.
658
00:23:32,300 –> 00:23:35,100
People just activate everything because they might need it.
659
00:23:35,100 –> 00:23:37,940
And the enterprise ends up with privilege concurrency.
660
00:23:37,940 –> 00:23:43,060
Dozens of admins active all day, every day, with justifications that say “work.”
661
00:23:43,060 –> 00:23:43,780
That’s not governance.
662
00:23:43,780 –> 00:23:45,220
That’s paperwork around power.
663
00:23:45,220 –> 00:23:48,740
A good PIM model keeps the eligible set small and the active set smaller.
664
00:23:48,740 –> 00:23:51,180
It makes privilege the exception, not the default.
665
00:23:51,180 –> 00:23:53,540
It also makes emergency access explicit.
666
00:23:53,540 –> 00:23:58,620
Break-glass accounts are not daily drivers, and their use should be loud, logged, and reviewed.
667
00:23:58,620 –> 00:24:00,380
And here’s the part that makes it survivable.
668
00:24:00,380 –> 00:24:01,860
PIM with boundaries.
669
00:24:01,860 –> 00:24:06,060
If you’ve designed subscriptions and management groups as real blast radius containers,
670
00:24:06,060 –> 00:24:07,980
then PIM activation has a meaningful scope.
671
00:24:07,980 –> 00:24:11,820
Someone can be eligible for contributor in a specific production subscription,
672
00:24:11,820 –> 00:24:12,980
not across everything.
673
00:24:12,980 –> 00:24:16,220
That keeps incident impact bounded even when elevation happens.
674
00:24:16,220 –> 00:24:18,140
So PIM is not extra security.
675
00:24:18,140 –> 00:24:22,980
It’s the enforcement mechanism that keeps RBAC from collapsing into permanent admin culture.
676
00:24:22,980 –> 00:24:26,660
Access controls decide who can do what, but they don’t decide what is allowed to exist.
677
00:24:26,660 –> 00:24:31,220
That’s the next layer: Azure Policy. Azure Policy system: resource state enforcement,
678
00:24:31,220 –> 00:24:32,300
not guidance.
679
00:24:32,300 –> 00:24:36,500
Azure Policy is where most enterprises reveal what they actually believe, because RBAC controls
680
00:24:36,500 –> 00:24:40,860
who can act, policy controls what is allowed to exist, and policy does not care who you are.
681
00:24:40,860 –> 00:24:43,340
It doesn’t care that you’re a global admin having a bad day.
682
00:24:43,340 –> 00:24:46,380
It doesn’t care that the deployment came from a trusted pipeline.
683
00:24:46,380 –> 00:24:49,660
It evaluates the resource state and decides whether ARM will accept it.
684
00:24:49,660 –> 00:24:52,700
That distinction matters.
685
00:24:52,700 –> 00:24:56,860
Most organizations treat policy like guardrails on a bowling lane, helpful suggestions that
686
00:24:56,860 –> 00:24:59,500
stop beginners from throwing the ball into the gutter.
687
00:24:59,500 –> 00:25:01,020
That is the wrong mental model.
688
00:25:01,020 –> 00:25:05,980
Azure Policy is a gate in the control plane, a resource state gate.
689
00:25:05,980 –> 00:25:09,860
It sits in the deployment path and says this request is valid or it is not.
690
00:25:09,860 –> 00:25:14,780
If you want deterministic governance, policy is how you get it, not by writing standards documents
691
00:25:14,780 –> 00:25:16,660
and hoping people remember them.
692
00:25:16,660 –> 00:25:21,780
Now structurally, Azure Policy is three things: definitions, initiatives, and assignments.
693
00:25:21,780 –> 00:25:27,020
Definitions are the individual rules: what to check and what to do when the check fails.
694
00:25:27,020 –> 00:25:29,140
Initiatives are bundles of definitions.
695
00:25:29,140 –> 00:25:32,900
Your baseline packaged into something you can apply repeatedly.
696
00:25:32,900 –> 00:25:36,820
Assignments are where you attach a definition or initiative to a scope in your hierarchy.
697
00:25:36,820 –> 00:25:40,060
Management group, subscription, resource group, sometimes a resource.
698
00:25:40,060 –> 00:25:41,900
That scope part is the entire game.
699
00:25:41,900 –> 00:25:44,700
If you assign policies at the wrong scope, you create drift.
700
00:25:44,700 –> 00:25:47,620
If you assign them too narrowly, every subscription becomes a snowflake.
701
00:25:47,620 –> 00:25:51,660
If you assign them too broadly without thinking, you block legitimate variation and
702
00:25:51,660 –> 00:25:53,140
create an exception culture.
703
00:25:53,140 –> 00:25:57,220
So the enterprise pattern is: define baselines centrally, assign them high enough that inheritance
704
00:25:57,220 –> 00:26:00,860
does the work, and only allow deviations through controlled exception paths.
705
00:26:00,860 –> 00:26:02,900
Now the part people get wrong, the effect model.
706
00:26:02,900 –> 00:26:04,340
Azure Policy isn’t one thing.
707
00:26:04,340 –> 00:26:06,180
It’s a set of enforcement behaviors.
708
00:26:06,180 –> 00:26:10,580
The important ones for enterprise design are deny, audit, modify and deploy if not exists.
709
00:26:10,580 –> 00:26:11,860
Deny is the obvious one.
710
00:26:11,860 –> 00:26:15,500
Deny means the resource can’t be created or updated if it violates the rule.
711
00:26:15,500 –> 00:26:16,820
This is preventive control.
712
00:26:16,820 –> 00:26:17,820
It’s deterministic.
713
00:26:17,820 –> 00:26:19,220
It stops drift at the boundary.
714
00:26:19,220 –> 00:26:20,700
Audit is the other common one.
715
00:26:20,700 –> 00:26:23,340
It records non-compliance, but lets the deployment happen.
716
00:26:23,340 –> 00:26:24,860
This is a detective control.
717
00:26:24,860 –> 00:26:27,780
It’s useful during rollout and discovery, but it’s not a guardrail.
718
00:26:27,780 –> 00:26:29,420
Audit doesn’t prevent anything.
719
00:26:29,420 –> 00:26:32,300
It just produces evidence that something happened.
720
00:26:32,300 –> 00:26:35,140
Modify is where policy starts behaving like a compiler.
721
00:26:35,140 –> 00:26:38,580
Modify can change a deployment or resource configuration to meet your rule.
722
00:26:38,580 –> 00:26:41,700
Add tags, enforce settings, normalize configurations.
723
00:26:41,700 –> 00:26:45,500
This reduces friction because teams don’t have to remember everything, but it also creates
724
00:26:45,500 –> 00:26:48,660
hidden behavior if you don’t communicate it clearly.
725
00:26:48,660 –> 00:26:51,380
Deploy if not exists is the most powerful and the most abused.
726
00:26:51,380 –> 00:26:56,460
It says: if a required configuration or related resource is missing, deploy it automatically.
727
00:26:56,460 –> 00:27:01,820
This is how you enforce diagnostics, monitoring agents, security extensions, and “every resource
728
00:27:01,820 –> 00:27:04,260
must send logs to the right place.”
729
00:27:04,260 –> 00:27:08,220
It’s also how you end up with remediation tasks you never ran and therefore a tenant
730
00:27:08,220 –> 00:27:10,820
full of legacy resources that never got fixed.
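A toy evaluator to show the difference between deny and audit. This is a sketch and not the Azure Policy engine. The rule dictionaries borrow the `if` / `then` / `effect` shape and the `field`, `in`, and `notIn` keys from policy definition JSON, but the evaluator supports only that tiny subset, and the function names and region list are assumptions for illustration. Modify and deploy if not exists are not modeled.

```python
def violates(rule, resource):
    """True when the resource matches the rule's `if` condition, meaning it is non-compliant."""
    cond = rule["if"]
    value = resource.get(cond["field"])
    if "in" in cond:
        return value in cond["in"]
    if "notIn" in cond:
        return value not in cond["notIn"]
    raise ValueError("condition not supported by this sketch")


def evaluate(rule, resource):
    """Return (accepted, compliant) for a create or update request."""
    if not violates(rule, resource):
        return True, True
    effect = rule["then"]["effect"].lower()
    if effect == "deny":
        return False, False   # preventive: the request is rejected
    if effect == "audit":
        return True, False    # detective: accepted, recorded as non-compliant
    raise ValueError("effect not supported by this sketch")


# Hypothetical allowed-locations rule, once as deny and once as audit.
CONDITION = {"field": "location", "notIn": ["westeurope", "northeurope"]}
DENY_RULE = {"if": CONDITION, "then": {"effect": "deny"}}
AUDIT_RULE = {"if": CONDITION, "then": {"effect": "audit"}}

if __name__ == "__main__":
    request = {"type": "Microsoft.Storage/storageAccounts", "location": "eastus"}
    print(evaluate(DENY_RULE, request))
    print(evaluate(AUDIT_RULE, request))
```

The same condition produces two different outcomes: under deny the non-compliant resource never exists, under audit it exists and shows up in a report.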
731
00:27:10,820 –> 00:27:12,180
Here’s the rule of thumb
732
00:27:12,180 –> 00:27:14,500
most enterprises need and rarely follow.
733
00:27:14,500 –> 00:27:15,820
Use deny for boundaries
734
00:27:15,820 –> 00:27:17,620
you are not negotiating.
735
00:27:17,620 –> 00:27:22,540
Allowed locations, disallowed resource types, prohibited exposure patterns, and “this must never exist
736
00:27:22,540 –> 00:27:23,940
in this environment.”
737
00:27:23,940 –> 00:27:27,020
Use audit for learning and rollout, not as a destination.
738
00:27:27,020 –> 00:27:31,780
If something matters, it eventually becomes deny, modify, or deploy if not exists.
739
00:27:31,780 –> 00:27:37,580
Use modify for hygiene, tags, standard settings, consistency that should be automatic.
740
00:27:37,580 –> 00:27:43,060
Use deploy if not exists for platform requirements that must be present for governance to function.
741
00:27:43,060 –> 00:27:44,260
Diagnostics,
742
00:27:44,260 –> 00:27:47,740
log destinations, baseline monitoring, and security controls that are systemic.
743
00:27:47,740 –> 00:27:50,860
If you remember nothing else from this section, remember this.
744
00:27:50,860 –> 00:27:55,620
If you deploy policies but don’t remediate existing resources, you’re running a split-brain
745
00:27:55,620 –> 00:27:56,620
environment.
746
00:27:56,620 –> 00:27:57,860
New resources follow the rules.
747
00:27:57,860 –> 00:27:59,260
Old resources don’t.
748
00:27:59,260 –> 00:28:00,260
That’s not governance.
749
00:28:00,260 –> 00:28:02,100
That’s two realities in one tenant.
750
00:28:02,100 –> 00:28:05,340
Now initiatives are how you stop reinventing baselines.
751
00:28:05,340 –> 00:28:08,300
You don’t want 50 separate policy assignments per subscription.
752
00:28:08,300 –> 00:28:12,780
You want a few standardized baselines, versioned, repeatable and assigned consistently.
753
00:28:12,780 –> 00:28:14,780
This is where enterprises break things again.
754
00:28:14,780 –> 00:28:19,060
They create too many initiatives, each owned by a different team, each overlapping, each
755
00:28:19,060 –> 00:28:20,380
slightly different.
756
00:28:20,380 –> 00:28:23,500
Then they spend their lives managing exemptions and conflicts.
757
00:28:23,500 –> 00:28:24,940
The better model is boring.
758
00:28:24,940 –> 00:28:29,340
A small number of enterprise initiatives aligned to intent: one baseline for production, one
759
00:28:29,340 –> 00:28:33,540
baseline for non-production, one baseline for regulated, one baseline for platform.
760
00:28:33,540 –> 00:28:37,420
If you need more than that, the hierarchy is probably wrong or your baseline is trying
761
00:28:37,420 –> 00:28:40,020
to encode every opinion the company has ever had.
762
00:28:40,020 –> 00:28:45,460
And yes, you will still need exemptions when necessary. Legacy exists, migrations exist,
763
00:28:45,460 –> 00:28:49,340
some services have awkward provisioning paths, some workloads have real constraints.
764
00:28:49,340 –> 00:28:54,020
But exemptions are also entropy generators and they must be treated like radioactive material,
765
00:28:54,020 –> 00:28:56,300
documented, time-bound and reviewed.
766
00:28:56,300 –> 00:28:58,300
There are two kinds of exceptions that matter.
767
00:28:58,300 –> 00:29:03,380
A waiver is “we are non-compliant, we accept the risk temporarily.” That must have an expiry,
768
00:29:03,380 –> 00:29:07,500
an owner, and a reason that an auditor could read without laughing.
769
00:29:07,500 –> 00:29:12,380
A mitigated exception is “we are deviating, but we have compensating controls.”
770
00:29:12,380 –> 00:29:15,660
That still needs ownership and review because mitigations decay too.
771
00:29:15,660 –> 00:29:19,660
The minute the mitigation disappears, you’re just non-compliant with extra confidence.
772
00:29:19,660 –> 00:29:21,100
And here’s the quiet failure.
773
00:29:21,100 –> 00:29:24,660
Deleting or changing policy assignments can orphan exemptions,
774
00:29:24,660 –> 00:29:28,300
leaving behind compliance gaps nobody sees unless they actively hunt them.
775
00:29:28,300 –> 00:29:31,060
That’s not a tooling issue, that’s a lifecycle issue.
776
00:29:31,060 –> 00:29:33,460
Policy needs the same discipline as code.
777
00:29:33,460 –> 00:29:36,220
Versioning, change control and cleanup.
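A minimal sketch of exemption cleanup as a recurring check, assuming you have exported exemptions as records and know which policy assignment IDs still exist. The `exemptionCategory`, `policyAssignmentId`, and `expiresOn` field names mirror the properties on Azure policy exemption resources, but the export shape, the timestamp format with an explicit `+00:00` offset, the function name, and all sample values are assumptions.

```python
from datetime import datetime, timezone


def exemption_findings(exemptions, live_assignment_ids, now):
    """Flag exemptions that are orphaned, have no expiry, or have already expired."""
    findings = []
    for ex in exemptions:
        name = ex["name"]
        expires_on = ex.get("expiresOn")
        if ex["policyAssignmentId"] not in live_assignment_ids:
            findings.append((name, "orphaned: assignment no longer exists"))
        if expires_on is None:
            findings.append((name, "no expiry set"))
        elif datetime.fromisoformat(expires_on) <= now:
            findings.append((name, "expired"))
    return findings


# Hypothetical export; IDs are shortened placeholders, not real resource IDs.
EXEMPTIONS = [
    {"name": "legacy-sql-public", "exemptionCategory": "Waiver",
     "policyAssignmentId": "assign-prod-baseline",
     "expiresOn": "2024-06-30T00:00:00+00:00"},
    {"name": "partner-vnet", "exemptionCategory": "Mitigated",
     "policyAssignmentId": "assign-prod-baseline", "expiresOn": None},
    {"name": "old-migration", "exemptionCategory": "Waiver",
     "policyAssignmentId": "assign-deleted",
     "expiresOn": "2030-01-01T00:00:00+00:00"},
]
LIVE_ASSIGNMENTS = {"assign-prod-baseline"}

if __name__ == "__main__":
    now = datetime(2025, 1, 1, tzinfo=timezone.utc)
    for name, problem in exemption_findings(EXEMPTIONS, LIVE_ASSIGNMENTS, now):
        print(name, "->", problem)
```

Run on a schedule, a check like this turns exemption review from something people remember to do into something the platform reports.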
778
00:29:36,220 –> 00:29:40,300
The practical part: the first guardrails that pay rent. Enterprises don’t need 500 policies
779
00:29:40,300 –> 00:29:43,780
on day one. They need 10 to 15 that prevent obvious damage.
780
00:29:43,780 –> 00:29:46,660
Require tags that enable ownership and cost allocation.
781
00:29:46,660 –> 00:29:50,340
Enforce allowed locations to avoid sovereignty and latency surprises.
782
00:29:50,340 –> 00:29:54,100
Restrict unapproved SKUs and resource types so teams can’t accidentally deploy a billing
783
00:29:54,100 –> 00:29:55,100
incident.
784
00:29:55,100 –> 00:29:58,060
Require encryption and secure transfer, where applicable.
785
00:29:58,060 –> 00:30:01,420
Enforce diagnostics, so logs exist where your SOC expects them.
786
00:30:01,420 –> 00:30:06,100
Deny public endpoints in production unless explicitly routed through an approved design.
787
00:30:06,100 –> 00:30:09,900
These are boundary decisions implemented as deterministic controls.
788
00:30:09,900 –> 00:30:13,180
And when teams complain, the response isn’t “we’re doing governance.”
789
00:30:13,180 –> 00:30:16,180
The response is “the platform only allows safe autonomy.”
790
00:30:16,180 –> 00:30:20,260
Now once policy is working as the gate, you still need feedback.
791
00:30:20,260 –> 00:30:23,500
Because even a perfect policy baseline doesn’t tell you where you’re weak, it tells you
792
00:30:23,500 –> 00:30:25,060
where you’re in violation.
793
00:30:25,060 –> 00:30:27,660
That’s the difference between enforcement and posture.
794
00:30:27,660 –> 00:30:32,060
And that’s where Defender for Cloud and continuous compliance signals enter the picture.
795
00:30:32,060 –> 00:30:35,980
Security posture and continuous compliance: signals, not dashboards.
796
00:30:35,980 –> 00:30:38,700
Defender for Cloud is not governance, it’s the smoke alarm.
797
00:30:38,700 –> 00:30:42,940
And like every smoke alarm, it’s only useful if you wired it into a building that can actually
798
00:30:42,940 –> 00:30:44,100
contain a fire.
799
00:30:44,100 –> 00:30:46,940
Most enterprises treat Defender for Cloud like a report card.
800
00:30:46,940 –> 00:30:48,700
They chase Secure Score because it’s measurable,
801
00:30:48,700 –> 00:30:52,780
it looks good in steering committees, and it creates the illusion that security is improving.
802
00:30:52,780 –> 00:30:53,780
That’s the wrong use.
803
00:30:53,780 –> 00:30:55,820
Secure Score is a prioritization signal.
804
00:30:55,820 –> 00:30:57,060
It’s a backlog generator.
805
00:30:57,060 –> 00:31:00,660
It’s the system pointing at the most common and most impactful misconfigurations it can
806
00:31:00,660 –> 00:31:01,660
see.
807
00:31:01,660 –> 00:31:04,820
If you treat it like a trophy, you will do what enterprises always do:
808
00:31:04,820 –> 00:31:06,420
optimize the number instead of the risk.
809
00:31:06,420 –> 00:31:08,820
You’ll disable recommendations you don’t want to explain.
810
00:31:08,820 –> 00:31:12,820
You’ll accept waivers because they are politically convenient and eventually the score becomes
811
00:31:12,820 –> 00:31:15,460
managed while the attack surface stays real.
812
00:31:15,460 –> 00:31:16,460
Here’s the rule of thumb.
813
00:31:16,460 –> 00:31:18,820
Governance is what prevents unsafe states.
814
00:31:18,820 –> 00:31:22,500
Posture management is how you detect the states that slip through, the states you haven’t
815
00:31:22,500 –> 00:31:25,820
governed yet, and the toxic combinations you didn’t anticipate.
816
00:31:25,820 –> 00:31:30,100
This is where CSPM earns its keep, not because it shows you a dashboard, but because it surfaces
817
00:31:30,100 –> 00:31:31,100
attack paths.
818
00:31:31,100 –> 00:31:32,860
It surfaces exposure patterns.
819
00:31:32,860 –> 00:31:38,940
It surfaces the uncomfortable adjacency, public endpoint plus weak identity plus missing logs.
820
00:31:38,940 –> 00:31:41,220
The platform doesn’t fail because one control is missing.
821
00:31:41,220 –> 00:31:43,780
It fails because several minor gaps line up.
822
00:31:43,780 –> 00:31:46,500
Defender for Cloud is useful when it changes what you fix next.
823
00:31:46,500 –> 00:31:50,780
That means the operating model must treat it as a feed, a continuous set of signals that
824
00:31:50,780 –> 00:31:55,260
either map back to policy gaps, RBAC gaps, or operating discipline gaps.
825
00:31:55,260 –> 00:31:59,700
If Defender tells you storage accounts allow public access, the outcome shouldn’t be: we’ll
826
00:31:59,700 –> 00:32:00,940
tell teams to be careful.
827
00:32:00,940 –> 00:32:06,380
The outcome should be a decision.
828
00:32:06,380 –> 00:32:10,100
Should this be denied by policy in production, modified automatically, or permitted only through
829
00:32:10,100 –> 00:32:11,580
a controlled exception path?
830
00:32:11,580 –> 00:32:15,380
If Defender tells you you’re missing diagnostics, the outcome is not a ticket storm.
831
00:32:15,380 –> 00:32:16,620
The outcome is DeployIfNotExists,
832
00:32:16,620 –> 00:32:21,100
plus remediation tasks, wired into subscription vending, so new subscriptions
833
00:32:21,100 –> 00:32:23,380
inherit the logging baseline from day zero.
834
00:32:23,380 –> 00:32:28,180
If Defender tells you identities are overprivileged, the outcome is not an annual access review
835
00:32:28,180 –> 00:32:29,180
PowerPoint.
836
00:32:29,180 –> 00:32:33,380
The outcome is tightening scopes, reducing owners, pushing privilege into PIM and monitoring
837
00:32:33,380 –> 00:32:35,540
activation events like production changes.
838
00:32:35,540 –> 00:32:36,940
This is the feedback loop.
839
00:32:36,940 –> 00:32:39,820
Signals should end as enforced guardrails or they remain noise.
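As a rough illustration of that loop, here is a minimal Python sketch that routes a posture finding to a guardrail outcome instead of a reminder. The category names, the `Finding` type and the outcome strings are hypothetical, made up for this example. They are not the Defender for Cloud schema or API.

```python
from dataclasses import dataclass

# Hypothetical finding categories mapped to the outcomes described above.
OUTCOMES = {
    "public_storage_access": "decide: deny in production, modify, or controlled exception",
    "missing_diagnostics": "DeployIfNotExists plus remediation tasks at subscription vending",
    "overprivileged_identity": "tighten scopes, reduce owners, move privilege into PIM",
}


@dataclass(frozen=True)
class Finding:
    category: str
    resource_id: str


def route(finding: Finding) -> str:
    """Return the guardrail outcome for a finding, or mark it as unmapped."""
    return OUTCOMES.get(finding.category, "unmapped: assign an owner or it stays noise")
```

The point of the default branch is that a signal with no mapped outcome is made visible rather than silently dropped.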
840
00:32:39,820 –> 00:32:41,340
Now compliance.
841
00:32:41,340 –> 00:32:45,140
Most enterprises think compliance is an audit artifact, a spreadsheet you fill in, a set
842
00:32:45,140 –> 00:32:49,580
of controls you map once a year, and a fire drill where everyone produces screenshots.
843
00:32:49,580 –> 00:32:52,740
That model fails in cloud because cloud changes every day.
844
00:32:52,740 –> 00:32:56,100
So the only compliance model that survives is continuous compliance.
845
00:32:56,100 –> 00:33:01,700
The ability to show at any point what is enforced, what is compliant, what is exempted, and
846
00:33:01,700 –> 00:33:03,460
who approved the deviations.
847
00:33:03,460 –> 00:33:07,820
This is where Compliance Manager is useful, but again, not as a tool tour.
848
00:33:07,820 –> 00:33:10,620
Compliance Manager matters because it forces traceability.
849
00:33:10,620 –> 00:33:15,060
It takes requirements from frameworks and turns them into assessable items, improvement actions,
850
00:33:15,060 –> 00:33:16,140
evidence and progress.
851
00:33:16,140 –> 00:33:20,860
It gives you a place to link your “we enforce this” claim to something real.
852
00:33:20,860 –> 00:33:25,740
Policy initiatives, logging standards, access controls and operational processes.
853
00:33:25,740 –> 00:33:29,340
But the critical part isn’t the UI, the critical part is the mapping philosophy.
854
00:33:29,340 –> 00:33:33,300
You don’t start with HIPAA or PCI and then go hunting for Azure features.
855
00:33:33,300 –> 00:33:37,820
You start with your enforced baselines and map them to frameworks through control families.
856
00:33:37,820 –> 00:33:42,180
Many organizations use a common pivot, like NIST-style control domains, to reduce duplication
857
00:33:42,180 –> 00:33:45,780
across frameworks because the overlap is real even if the language differs.
858
00:33:45,780 –> 00:33:49,300
This is also where the Microsoft Cloud Security benchmark fits.
859
00:33:49,300 –> 00:33:52,100
It’s not magic and it’s not complete for every regulation.
860
00:33:52,100 –> 00:33:56,700
But it gives you a baseline mapping that you can assign through policy initiatives and measure consistently.
861
00:33:56,700 –> 00:34:00,060
Treat it as a starting point, then add what your industry actually requires.
862
00:34:00,060 –> 00:34:02,900
And when auditors ask for evidence, you don’t hand them dashboards.
863
00:34:02,900 –> 00:34:07,060
You hand them enforced intent, the management group structure, the initiatives assigned at each
864
00:34:07,060 –> 00:34:11,940
tier, the PIM settings for privileged roles, the policy exemption register with expiry,
865
00:34:11,940 –> 00:34:14,860
and the logging paths that prove you can investigate.
866
00:34:14,860 –> 00:34:18,900
That’s how audit prep drops from months to days because you’re not assembling evidence.
867
00:34:18,900 –> 00:34:20,540
You’re operating it continuously.
868
00:34:20,540 –> 00:34:24,700
Now the last connection in this section, if you can’t link governance to accountability,
869
00:34:24,700 –> 00:34:26,220
the system will rot.
870
00:34:26,220 –> 00:34:29,660
Security posture without accountability becomes security theater.
871
00:34:29,660 –> 00:34:32,660
Compliance without accountability becomes documentation theater.
872
00:34:32,660 –> 00:34:36,260
And that brings in the most ignored governance domain in Azure enterprises:
873
00:34:36,260 –> 00:34:37,260
Cost.
874
00:34:37,260 –> 00:34:39,500
Because spend is not just an accounting problem.
875
00:34:39,500 –> 00:34:41,700
It’s an authorization problem with a price tag.
876
00:34:41,700 –> 00:34:42,700
FinOps guardrails:
877
00:34:42,700 –> 00:34:44,700
Cost is governance, not accounting.
878
00:34:44,700 –> 00:34:48,300
Cost is not a finance problem that shows up after the architecture is done.
879
00:34:48,300 –> 00:34:49,500
Cost is governance.
880
00:34:50,500 –> 00:34:56,740
Because spend is just another way the platform expresses what was allowed to exist.
881
00:34:56,740 –> 00:35:01,420
If a team can deploy any SKU in any region with any redundancy setting, you didn’t build
882
00:35:01,420 –> 00:35:02,420
a cloud platform.
883
00:35:02,420 –> 00:35:05,700
You built an unlimited purchasing system with an API.
884
00:35:05,700 –> 00:35:08,700
This is the part enterprises love to pretend is separate.
885
00:35:08,700 –> 00:35:13,140
Security handles security, finance handles cost, platform handles availability, and then
886
00:35:13,140 –> 00:35:17,700
everyone acts surprised when a breach and a budget incident are the same story.
887
00:35:17,700 –> 00:35:21,420
Too much permission, too few constraints, and no ownership signal.
888
00:35:21,420 –> 00:35:24,540
FinOps in Azure starts with the most boring truth in cloud.
889
00:35:24,540 –> 00:35:26,180
Visibility is not automatic.
890
00:35:26,180 –> 00:35:28,220
Visibility requires metadata.
891
00:35:28,220 –> 00:35:30,060
Metadata requires standards.
892
00:35:30,060 –> 00:35:31,220
Standards require enforcement.
893
00:35:31,220 –> 00:35:34,300
So if you want cost accountability, the first dependency is tagging.
894
00:35:34,300 –> 00:35:37,060
Not “recommended tagging,” not “we have a wiki page.”
895
00:35:37,060 –> 00:35:38,220
Enforced tagging.
896
00:35:38,220 –> 00:35:42,460
And you already know what enforces it: Azure Policy with deny or modify.
897
00:35:42,460 –> 00:35:47,100
Deny when the tag is mandatory to allocate spend, modify when you can safely inherit or
898
00:35:47,100 –> 00:35:48,980
append values without breaking meaning.
899
00:35:48,980 –> 00:35:51,420
Either way, you’re not asking teams to remember.
900
00:35:51,420 –> 00:35:53,460
You’re making the platform refuse ambiguity.
901
00:35:53,460 –> 00:35:56,460
The tags that matter aren’t dozens of creative labels.
902
00:35:56,460 –> 00:36:01,300
They are a small set that lets the enterprise answer basic questions without archaeology.
903
00:36:01,300 –> 00:36:02,300
Who owns this?
904
00:36:02,300 –> 00:36:03,300
What environment is it?
905
00:36:03,300 –> 00:36:05,180
What product or cost center pays for it?
906
00:36:05,180 –> 00:36:07,860
And what data sensitivity tier it belongs to?
907
00:36:07,860 –> 00:36:11,020
If you can’t answer those, you can’t govern cost and you can’t govern risk because you
908
00:36:11,020 –> 00:36:13,180
don’t know who to call when something is exposed.
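To make “enforced tagging” concrete, here is a minimal Python sketch that builds the rule portion of an Azure Policy definition denying any resource that is missing one of the required tags. The four tag keys are assumptions chosen to match the four questions above (the episode names the questions, not the keys), and the helper name `require_tags_rule` is made up for this example.

```python
import json

# Assumed tag keys, one per question: owner, environment, payer, sensitivity.
REQUIRED_TAGS = ["owner", "environment", "costCenter", "dataSensitivity"]


def require_tags_rule(tag_names):
    """Build a policyRule that denies a resource missing any listed tag."""
    return {
        "if": {
            "anyOf": [
                {"field": f"tags['{name}']", "exists": "false"}
                for name in tag_names
            ]
        },
        "then": {"effect": "deny"},
    }


if __name__ == "__main__":
    # Print the rule as JSON so it can be pasted into a policy definition.
    print(json.dumps(require_tags_rule(REQUIRED_TAGS), indent=2))
```

This is only the `policyRule` fragment. A full definition would also need a mode (tag policies typically use `Indexed`), a display name and an assignment scope, and a `modify` variant would be a separate rule.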
909
00:36:13,180 –> 00:36:14,940
Now budgets.
910
00:36:14,940 –> 00:36:20,420
Most organizations deploy budgets like a personal finance app: alerts so we don’t overspend.
911
00:36:20,420 –> 00:36:21,740
That’s the wrong framing.
912
00:36:21,740 –> 00:36:26,100
Budgets exist to surface accountability early while changes are still reversible.
913
00:36:26,100 –> 00:36:28,100
A budget alert is a governance signal.
914
00:36:28,100 –> 00:36:31,620
This subscription is behaving outside expectation. That might be waste.
915
00:36:31,620 –> 00:36:32,620
It might be a surge.
916
00:36:32,620 –> 00:36:33,780
It might be a migration.
917
00:36:33,780 –> 00:36:35,580
It might be an attacker spinning up resources.
918
00:36:35,580 –> 00:36:38,620
The point is you find out while it’s still small enough to stop.
919
00:36:38,620 –> 00:36:40,740
So budgets need to map to boundaries.
920
00:36:40,740 –> 00:36:44,920
If subscriptions are your governance containers, budgets attach to subscriptions.
921
00:36:44,920 –> 00:36:49,300
If your products span multiple subscriptions, you need a consistent tag model and cost analysis
922
00:36:49,300 –> 00:36:50,820
views to aggregate by tag.
923
00:36:50,820 –> 00:36:53,480
Either way, budgets without boundaries are just noise.
924
00:36:53,480 –> 00:36:55,560
And then there’s showback and chargeback.
925
00:36:55,560 –> 00:36:57,840
Enterprises argue about this like it’s a financial policy debate.
926
00:36:57,840 –> 00:37:00,120
It’s not. Chargeback is enforcement.
927
00:37:00,120 –> 00:37:01,560
Showback is cultural pressure.
928
00:37:01,560 –> 00:37:05,040
Both are mechanisms that make teams feel the consequences of choices they’re allowed
929
00:37:05,040 –> 00:37:06,040
to make.
930
00:37:06,040 –> 00:37:09,800
If teams can deploy anything but never see the bill, they will optimize for speed, not
931
00:37:09,800 –> 00:37:10,800
sustainability.
932
00:37:10,800 –> 00:37:11,800
That isn’t malice.
933
00:37:11,800 –> 00:37:15,480
It’s behavior. Humans optimize for what is measured and felt.
934
00:37:15,480 –> 00:37:20,440
So the operating model has to make cost visible to the decision makers, not to finance after
935
00:37:20,440 –> 00:37:21,440
the fact.
936
00:37:21,440 –> 00:37:24,700
That means regular reports by owner, environment and application.
937
00:37:24,700 –> 00:37:28,360
It means tying subscriptions and tags to real teams with real escalation paths.
938
00:37:28,360 –> 00:37:31,160
Now the guardrails that prevent obvious damage.
939
00:37:31,160 –> 00:37:33,480
High-risk cost drivers are predictable.
940
00:37:33,480 –> 00:37:38,720
Premium SKUs, multi-region replication, unbounded data egress and resource types that teams
941
00:37:38,720 –> 00:37:40,200
try out and forget.
942
00:37:40,200 –> 00:37:41,520
You don’t solve that with education.
943
00:37:41,520 –> 00:37:46,580
You solve it with allowed SKUs and allowed resource types in the environments where experimentation
944
00:37:46,580 –> 00:37:47,580
isn’t acceptable.
945
00:37:47,580 –> 00:37:51,920
This is the same deny versus audit philosophy from policy applied to money.
946
00:37:51,920 –> 00:37:55,660
In production, you deny the SKUs that can explode cost without a review.
947
00:37:55,660 –> 00:37:59,660
In non-production, you can allow more but you still contain it with budgets and alerts.
948
00:37:59,660 –> 00:38:04,000
In sandbox, you allow the weirdness but you cap the blast radius with tight spending limits
949
00:38:04,000 –> 00:38:05,240
and isolation.
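The tiered response described here can be sketched as a small decision function. This is an illustration under stated assumptions: the SKU names are placeholders rather than real Azure SKUs, and the returned strings are labels for the platform response, not Azure Policy effect names.

```python
# Placeholder SKU names, invented for this example.
EXPENSIVE_SKUS = {"example-premium-sku", "example-multi-region-sku"}


def sku_decision(environment: str, sku: str) -> str:
    """Return how the platform responds to a SKU request in a given tier."""
    if environment not in {"production", "non-production", "sandbox"}:
        raise ValueError(f"unknown environment: {environment}")
    if sku not in EXPENSIVE_SKUS:
        return "allow"
    if environment == "production":
        # Costly SKUs need a reviewed exception before they can exist.
        return "deny"
    if environment == "non-production":
        # Allowed, but contained by budgets and alerts.
        return "allow-with-budget-alerts"
    # Sandbox: experimentation is fine inside a tight spending limit.
    return "allow-with-spending-limit"
```

In practice the production branch would be an allowed-SKUs deny policy assigned at the production management group, with the other tiers relying on budgets and isolation.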
950
00:38:05,240 –> 00:38:06,640
And here’s the real point.
951
00:38:06,640 –> 00:38:08,840
FinOps guardrails are not about saving money.
952
00:38:08,840 –> 00:38:11,120
They are about enforcing intentionality.
953
00:38:11,120 –> 00:38:16,360
When a workload requests an expensive configuration, the platform should force a moment of decision.
954
00:38:16,360 –> 00:38:17,800
Is this required?
955
00:38:17,800 –> 00:38:18,800
Who approves it?
956
00:38:18,800 –> 00:38:20,440
And what is the expected outcome?
957
00:38:20,440 –> 00:38:23,800
If you can’t answer that, the platform shouldn’t allow it by default.
958
00:38:23,800 –> 00:38:25,440
Because otherwise you don’t have cloud governance.
959
00:38:25,440 –> 00:38:27,040
You have cloud entropy with invoices.
960
00:38:27,040 –> 00:38:28,840
Now none of this runs itself.
961
00:38:28,840 –> 00:38:32,800
Tag enforcement, budgets, chargeback models, SKU restrictions, exception handling, these are
962
00:38:32,800 –> 00:38:33,800
not features.
963
00:38:33,800 –> 00:38:34,960
They are operating disciplines.
964
00:38:34,960 –> 00:38:37,480
And that means the next piece has to exist.
965
00:38:37,480 –> 00:38:41,680
A governance operating model that survives org changes, survives pressure and doesn’t
966
00:38:41,680 –> 00:38:44,360
collapse into ticket-based permission vending.
967
00:38:44,360 –> 00:38:48,160
The operating model: RACI, escalation, and the end of ticket-based governance.
968
00:38:48,160 –> 00:38:51,680
Here is the part nobody wants to do because it sounds like process.
969
00:38:51,680 –> 00:38:55,600
But without an operating model, everything you’ve built so far becomes optional the moment
970
00:38:55,600 –> 00:38:57,720
the right person complains loudly enough.
971
00:38:57,720 –> 00:39:01,920
Azure governance isn’t sustained by policies. It’s sustained by decision rights: who is allowed
972
00:39:01,920 –> 00:39:05,800
to decide, who is allowed to override and who has to clean up the mess when reality
973
00:39:05,800 –> 00:39:06,800
disagrees.
974
00:39:06,800 –> 00:39:08,680
The first artifact isn’t another initiative.
975
00:39:08,680 –> 00:39:11,360
It’s a RACI that names owners in plain language.
976
00:39:11,360 –> 00:39:16,400
Platform team owns the landing zone, the management group hierarchy, subscription vending,
977
00:39:16,400 –> 00:39:22,360
shared networking, central logging destinations and the policy baselines that define non-negotiables.
978
00:39:22,360 –> 00:39:26,840
Security owns the security requirements, the risk acceptance criteria, the review of
979
00:39:26,840 –> 00:39:33,560
exceptions that change exposure and the monitoring expectations that prove controls are working.
980
00:39:33,560 –> 00:39:37,880
Application teams own the workloads inside the boundaries, including data classification inside their
981
00:39:37,880 –> 00:39:42,200
apps, operational reliability and compliance with the baseline.
982
00:39:42,200 –> 00:39:44,120
They do not own the baseline itself.
983
00:39:44,120 –> 00:39:48,400
FinOps or finance owns tagging standards for chargeback and the budget model, plus escalation
984
00:39:48,400 –> 00:39:50,600
when spend violates expectations.
985
00:39:50,600 –> 00:39:54,280
Audit and risk owns evidence requirements, the cadence of reviews and the definition of
986
00:39:54,280 –> 00:39:57,120
what acceptable looks like for regulated workloads.
987
00:39:57,120 –> 00:40:00,520
If you can’t point at an owner, you don’t have governance. You have distributed blame.
988
00:40:00,520 –> 00:40:02,000
Now you define decision paths.
989
00:40:02,000 –> 00:40:06,360
This is where most enterprises break things because they default to open a ticket.
990
00:40:06,360 –> 00:40:09,000
Ticket-based governance is governance theater.
991
00:40:09,000 –> 00:40:10,720
It centralizes friction, not control.
992
00:40:10,720 –> 00:40:11,880
The right model is simple.
993
00:40:11,880 –> 00:40:12,880
Three lanes.
994
00:40:12,880 –> 00:40:13,880
Self-serve lane.
995
00:40:13,880 –> 00:40:17,640
If it’s within baseline and within budget, teams deploy without asking.
996
00:40:17,640 –> 00:40:22,600
That includes standard resource types, approved regions, approved SKUs and approved network
997
00:40:22,600 –> 00:40:23,600
patterns.
998
00:40:23,600 –> 00:40:25,640
The platform team does not approve normal work.
999
00:40:25,640 –> 00:40:26,840
Approval lane.
1000
00:40:26,840 –> 00:40:30,640
If it increases risk or cost meaningfully, it requires approval.
1001
00:40:30,640 –> 00:40:33,480
Not by the platform team, but by the correct owner.
1002
00:40:33,480 –> 00:40:35,400
Cost exceptions go to Finops.
1003
00:40:35,400 –> 00:40:37,520
Exposure exceptions go to security.
1004
00:40:37,520 –> 00:40:42,240
Architecture deviations go to the platform team if they affect shared services or boundaries.
1005
00:40:42,240 –> 00:40:43,240
Denied lane.
1006
00:40:43,240 –> 00:40:44,640
Some things are just not allowed.
1007
00:40:44,640 –> 00:40:47,400
Public endpoints in production if your model forbids them.
1008
00:40:47,400 –> 00:40:49,360
Random regions for data residency reasons.
1009
00:40:49,360 –> 00:40:51,560
Broad role assignments at the tenant root.
1010
00:40:51,560 –> 00:40:56,160
If something belongs in the denied lane, you encode it as deny, not as a policy document.
1011
00:40:56,160 –> 00:40:58,240
And yes, this will cause deployment failures.
1012
00:40:58,240 –> 00:40:59,640
That is not a failure of governance.
1013
00:40:59,640 –> 00:41:01,040
That is governance functioning.
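The three lanes can be expressed as a small classifier. This is a sketch, not a prescribed implementation: the `Request` fields, the `exception_kind` values and the `APPROVERS` table are hypothetical names that mirror the routing described above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Lane(Enum):
    SELF_SERVE = "self-serve"
    APPROVAL = "approval"
    DENIED = "denied"


@dataclass(frozen=True)
class Request:
    within_baseline: bool
    within_budget: bool
    forbidden: bool = False
    exception_kind: str = ""  # "cost", "exposure" or "architecture"


# Each exception type goes to the owner who carries that risk.
APPROVERS = {"cost": "FinOps", "exposure": "Security", "architecture": "Platform team"}


def classify(req: Request) -> Tuple[Lane, Optional[str]]:
    """Return the lane for a request and, for approvals, the correct owner."""
    if req.forbidden:
        return (Lane.DENIED, None)
    if req.within_baseline and req.within_budget:
        return (Lane.SELF_SERVE, None)
    return (Lane.APPROVAL, APPROVERS.get(req.exception_kind, "unassigned"))
```

Note that the self-serve branch returns no approver at all, which is the design intent: normal work never enters a queue.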
1014
00:41:01,040 –> 00:41:04,720
Now the workflow for exceptions, because exceptions are the entropy gateway.
1015
00:41:04,720 –> 00:41:09,560
If you don’t design the workflow, teams will invent one by messaging whoever they know.
1016
00:41:09,560 –> 00:41:11,040
The workflow is:
1017
00:41:11,040 –> 00:41:12,040
Request. Review.
1018
00:41:12,040 –> 00:41:13,040
Approve.
1019
00:41:13,040 –> 00:41:14,040
Expire.
1020
00:41:14,040 –> 00:41:15,040
Revalidate.
1021
00:41:15,040 –> 00:41:16,040
Remove.
1022
00:41:16,040 –> 00:41:17,560
Request includes scope, duration, reason and a link to a ticket.
1023
00:41:17,560 –> 00:41:18,960
No ticket, no exception.
1024
00:41:18,960 –> 00:41:21,360
Because “we need it” is not traceable intent.
1025
00:41:21,360 –> 00:41:23,480
Review includes risk classification.
1026
00:41:23,480 –> 00:41:25,440
Is this a waiver or is it mitigated?
1027
00:41:25,440 –> 00:41:27,080
What compensating controls exist?
1028
00:41:27,080 –> 00:41:28,400
Who owns those controls?
1029
00:41:28,400 –> 00:41:30,760
Approve includes explicit expiry.
1030
00:41:30,760 –> 00:41:32,720
Even mitigations get a review date.
1031
00:41:32,720 –> 00:41:35,000
Nothing stays forever without re-approval.
1032
00:41:35,000 –> 00:41:37,360
Expire means the system forces the question again.
1033
00:41:37,360 –> 00:41:41,920
If the business still needs it, they ask again and the owner re-accepts the risk.
1034
00:41:41,920 –> 00:41:44,000
If they don’t ask again, the exception dies.
1035
00:41:44,000 –> 00:41:46,440
That’s how you prevent exception permanence.
1036
00:41:46,440 –> 00:41:49,920
Revalidate means you periodically audit active exemptions and remove the ones that don’t
1037
00:41:49,920 –> 00:41:51,440
have a current business reason.
1038
00:41:51,440 –> 00:41:52,440
Not annually.
1039
00:41:52,440 –> 00:41:54,200
Continuously in small batches.
1040
00:41:54,200 –> 00:41:57,040
Remove means cleanup is part of the process, not a future hope.
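A minimal Python sketch of that lifecycle follows, assuming a hypothetical `Exemption` record kept alongside your policy code. The field names and helper functions are illustrative and are not the Azure Policy exemption API.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class Exemption:
    scope: str
    reason: str
    ticket: str
    category: str  # "waiver" or "mitigated"
    expires_on: date


def validate_request(e: Exemption) -> None:
    """Reject requests that lack traceable intent or a risk classification."""
    if not e.ticket:
        raise ValueError("no ticket, no exception")
    if e.category not in {"waiver", "mitigated"}:
        raise ValueError("category must be 'waiver' or 'mitigated'")


def is_active(e: Exemption, today: date) -> bool:
    """An exemption is active up to and including its expiry date."""
    return today <= e.expires_on


def expired(exemptions: List[Exemption], today: date) -> List[Exemption]:
    """Return the exemptions that should be removed or re-requested."""
    return [e for e in exemptions if not is_active(e, today)]
```

Running `expired` on a schedule, in small batches, is one way to make the revalidate and remove steps routine rather than an annual cleanup.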
1041
00:41:57,040 –> 00:42:00,520
Now make policy behave like code, because policy is code.
1042
00:42:00,520 –> 00:42:02,160
Version your initiatives and assignments.
1043
00:42:02,160 –> 00:42:03,640
Use pull requests.
1044
00:42:03,640 –> 00:42:06,440
Require review from the owners who will carry the risk.
1045
00:42:06,440 –> 00:42:07,960
Deploy in rollout rings.
1046
00:42:07,960 –> 00:42:09,360
Dev management group first.
1047
00:42:09,360 –> 00:42:10,200
Then non-prod.
1048
00:42:10,200 –> 00:42:11,200
Then prod.
1049
00:42:11,200 –> 00:42:14,760
That distinction matters because policy changes are production changes.
1050
00:42:14,760 –> 00:42:17,440
Also, treat exemptions as code-adjacent artifacts.
1051
00:42:17,440 –> 00:42:20,960
If your policy lifecycle doesn’t track them, you’ll end up with orphaned exceptions and
1052
00:42:20,960 –> 00:42:22,600
mystery compliance gaps.
1053
00:42:22,600 –> 00:42:25,920
Finally, stop making the platform team the help desk.
1054
00:42:25,920 –> 00:42:29,160
The platform team’s job is to maintain the guardrails and the paved roads, not to click
1055
00:42:29,160 –> 00:42:30,880
approve on every deployment.
1056
00:42:30,880 –> 00:42:35,160
If your governance model requires a ticket for normal work, teams will route around you.
1057
00:42:35,160 –> 00:42:39,000
They’ll use shadow subscriptions, alternate tenants or unmanaged identities.
1058
00:42:39,000 –> 00:42:40,120
The system will still run.
1059
00:42:40,120 –> 00:42:41,600
It will just run without your control.
1060
00:42:41,600 –> 00:42:44,840
So the operating model is the mechanism that keeps autonomy safe.
1061
00:42:44,840 –> 00:42:49,400
Clear owners, clear lanes, controlled exceptions and policy-as-code lifecycle discipline
1062
00:42:49,400 –> 00:42:51,680
that survives people changes.
1063
00:42:51,680 –> 00:42:53,120
Three enterprise scenarios:
1064
00:42:53,120 –> 00:42:54,960
What “works at scale” looks like.
1065
00:42:54,960 –> 00:42:58,440
Now apply all of that to reality because this is where governance either proves itself
1066
00:42:58,440 –> 00:43:00,400
or collapses into theory.
1067
00:43:00,400 –> 00:43:02,040
Scenario one: M&A cloud onboarding.
1068
00:43:02,040 –> 00:43:05,760
An acquired company shows up with Azure subscriptions that were built under a different
1069
00:43:05,760 –> 00:43:09,680
threat model, a different cost model and usually a different level of discipline.
1070
00:43:09,680 –> 00:43:14,200
The naive enterprise move is to treat onboarding as an assessment project.
1071
00:43:14,200 –> 00:43:19,560
Weeks of meetings, manual reviews, spreadsheets and “we’ll migrate them into our standard later.”
1072
00:43:19,560 –> 00:43:20,600
Later never arrives.
1073
00:43:20,600 –> 00:43:23,800
The scalable move is to treat onboarding as a boundary action.
1074
00:43:23,800 –> 00:43:28,080
Those subscriptions get placed under a dedicated management group that represents the acquisition
1075
00:43:28,080 –> 00:43:29,160
landing area.
1076
00:43:29,160 –> 00:43:34,280
That management group has a known policy baseline, logging destinations, region constraints,
1077
00:43:34,280 –> 00:43:38,120
required tags and the minimum deny policies that stop obvious damage.
1078
00:43:38,120 –> 00:43:41,760
RBAC inheritance is applied at that management group using groups, not people, so the access
1079
00:43:41,760 –> 00:43:43,800
model is immediately consistent.
1080
00:43:43,800 –> 00:43:47,600
Privilege is time bound through PIM from day one because acquired admins are the highest
1081
00:43:47,600 –> 00:43:49,640
risk identities you will inherit.
1082
00:43:49,640 –> 00:43:52,920
Then the platform team does not manually approve every workload.
1083
00:43:52,920 –> 00:43:54,320
They let inheritance do the work.
1084
00:43:54,320 –> 00:43:58,040
If the workload is compliant, it deploys. If it is not, it fails, and the failure message is
1085
00:43:58,040 –> 00:44:00,040
the governance interface.
1086
00:44:00,040 –> 00:44:03,040
Exceptions are allowed, but they go through the same workflow.
1087
00:44:03,040 –> 00:44:05,960
Request, expiry and revalidation.
1088
00:44:05,960 –> 00:44:09,120
That is how you get the 70% reduction in onboarding time.
1089
00:44:09,120 –> 00:44:13,520
You replace bespoke reviews with deterministic guardrails and predictable inheritance.
1090
00:44:13,520 –> 00:44:15,080
And you also get the real win.
1091
00:44:15,080 –> 00:44:18,960
Zero manual security reviews for normal work because you encoded the review into the control
1092
00:44:18,960 –> 00:44:19,960
plane.
1093
00:44:19,960 –> 00:44:24,760
Scenario two: regulated industry rollout. Finance and healthcare don’t fail compliance because
1094
00:44:24,760 –> 00:44:26,520
they lack policy documents.
1095
00:44:26,520 –> 00:44:30,760
They fail because they can’t prove control consistently across hundreds of teams and thousands
1096
00:44:30,760 –> 00:44:31,960
of resources.
1097
00:44:31,960 –> 00:44:36,840
The working model is a regulated management group tier with non-negotiable baseline policies.
1098
00:44:36,840 –> 00:44:39,040
Encryption isn’t a recommendation.
1099
00:44:39,040 –> 00:44:40,640
Diagnostics aren’t “when we get to it.”
1100
00:44:40,640 –> 00:44:42,760
Network exposure isn’t a per team preference.
1101
00:44:42,760 –> 00:44:47,120
The baseline is assigned as an initiative, versioned and rolled out like code.
1102
00:44:47,120 –> 00:44:51,680
Exemptions are time-bound, owned and documented as waivers or mitigations with evidence.
1103
00:44:51,680 –> 00:44:55,560
PIM is mandatory for boundary-changing roles and activation events are part of the audit
1104
00:44:55,560 –> 00:44:57,040
story.
1105
00:44:57,040 –> 00:44:59,080
Then continuous compliance becomes simple.
1106
00:44:59,080 –> 00:45:02,720
The organization can show what is enforced, what is compliant and what is exempted at any
1107
00:45:02,720 –> 00:45:04,240
point in time.
1108
00:45:04,240 –> 00:45:07,800
Audit preparation shrinks from months to days because the evidence is not assembled.
1109
00:45:07,800 –> 00:45:12,120
It is produced by design and over time findings reduce because the system stops relying on
1110
00:45:12,120 –> 00:45:14,600
human memory to enforce controls.
1111
00:45:14,600 –> 00:45:19,080
Scenario three: multi-team, multi-tenant governance inside a single Azure tenant.
1112
00:45:19,080 –> 00:45:20,760
This is where most enterprises die slowly.
1113
00:45:20,760 –> 00:45:23,320
Hundreds of application teams want autonomy.
1114
00:45:23,320 –> 00:45:24,320
Security wants control.
1115
00:45:24,320 –> 00:45:26,720
The platform team wants to avoid becoming a bottleneck.
1116
00:45:26,720 –> 00:45:29,640
The default outcome is either chaos or bureaucracy.
1117
00:45:29,640 –> 00:45:34,840
The model that survives is intentional boundary design plus self-service within guardrails.
1118
00:45:34,840 –> 00:45:38,000
Subscriptions are vended through a standard process, placed into management groups that
1119
00:45:38,000 –> 00:45:42,000
reflect environment and risk and inherit the baseline automatically.
1120
00:45:42,000 –> 00:45:44,520
RBAC is group-based, scoped and boring.
1121
00:45:44,520 –> 00:45:45,520
Owner is scarce.
1122
00:45:45,520 –> 00:45:47,880
PIM is mandatory for elevated roles.
1123
00:45:47,880 –> 00:45:49,520
Azure policy is not guidance.
1124
00:45:49,520 –> 00:45:53,560
It is a gate and the operating model makes the ticket queue disappear by design.
1125
00:45:53,560 –> 00:45:55,080
Normal work is self-serve.
1126
00:45:55,080 –> 00:45:57,680
Risk-increasing work requires approvals from the correct owner.
1127
00:45:57,680 –> 00:46:00,840
Forbidden work is denied by policy, not debated in meetings.
1128
00:46:00,840 –> 00:46:02,320
Exceptions expire by default.
1129
00:46:02,320 –> 00:46:04,880
And revalidation is routine, not heroic.
1130
00:46:04,880 –> 00:46:08,400
That’s how you get no production outages due to over-permissioned access.
1131
00:46:08,400 –> 00:46:12,200
Not because nobody makes mistakes, but because mistakes can’t cross boundaries as easily.
1132
00:46:12,200 –> 00:46:13,640
The blast radius is smaller.
1133
00:46:13,640 –> 00:46:15,160
The identity surface is narrower.
1134
00:46:15,160 –> 00:46:17,880
The policies catch the obvious failures early.
1135
00:46:17,880 –> 00:46:21,280
And the org stops depending on tribal knowledge to keep production alive.
1136
00:46:21,280 –> 00:46:22,760
These three scenarios look different.
1137
00:46:22,760 –> 00:46:25,920
They are the same system, a hierarchy that encodes intent.
1138
00:46:25,920 –> 00:46:29,600
Access that assigns roles to groups and uses PIM to prevent privilege permanence.
1139
00:46:29,600 –> 00:46:31,880
Policy that enforces what is allowed to exist.
1140
00:46:31,880 –> 00:46:34,960
Signals from posture and compliance that feed back into guardrails.
1141
00:46:34,960 –> 00:46:40,000
And an operating model that turns governance into a product, not a ticket desk.
1142
00:46:40,000 –> 00:46:42,640
Governance is enforced intent, or it’s entropy.
1143
00:46:42,640 –> 00:46:45,080
Governance is not what the enterprise says it believes.
1144
00:46:45,080 –> 00:46:47,240
It’s what the control plane refuses to allow.
1145
00:46:47,240 –> 00:46:52,280
If you do nothing else, lock down identity with group-based RBAC and PIM, define management
1146
00:46:52,280 –> 00:46:57,240
groups that reflect risk and deploy five deny policies that prevent obvious damage.
1147
00:46:57,240 –> 00:47:00,880
Subscribe for the next episode on building a landing zone operating model with policy
1148
00:47:00,880 –> 00:47:05,040
as code, rollout rings and exception handling that doesn’t decay into conditional chaos.