
1
00:00:00,000 –> 00:00:02,040
Most organizations believe Fabric plus Copilot
2
00:00:02,040 –> 00:00:03,760
made data engineering easier.
3
00:00:03,760 –> 00:00:04,560
They are wrong.
4
00:00:04,560 –> 00:00:07,200
Fabric removed the ceremony, not the responsibility.
5
00:00:07,200 –> 00:00:09,720
And Copilot removed the typing, not the consequences.
6
00:00:09,720 –> 00:00:13,160
So yes, pipelines run faster, notebooks feel friendlier,
7
00:00:13,160 –> 00:00:16,240
SQL appears out of thin air, and Power BI lights up
8
00:00:16,240 –> 00:00:18,200
before anyone has written a real contract.
9
00:00:18,200 –> 00:00:19,480
But the thing nobody’s mentioning is
10
00:00:19,480 –> 00:00:21,040
what speed does to ambiguity.
11
00:00:21,040 –> 00:00:21,960
It ships it.
12
00:00:21,960 –> 00:00:23,280
And it ships it at machine speed.
13
00:00:23,280 –> 00:00:25,520
Here’s the comfortable belief I keep hearing in incident
14
00:00:25,520 –> 00:00:28,520
reviews and “why are we over budget” meetings:
15
00:00:28,520 –> 00:00:30,120
We consolidated the stack.
16
00:00:30,120 –> 00:00:31,480
We finally have one platform.
17
00:00:31,480 –> 00:00:32,720
The teams are moving faster.
18
00:00:32,720 –> 00:00:34,080
This should be simpler.
19
00:00:34,080 –> 00:00:35,720
That’s the marketing narrative too.
20
00:00:35,720 –> 00:00:40,000
OneLake, one experience, one workspace, one integrated story.
21
00:00:40,000 –> 00:00:42,000
And if you stop the analysis at the UI layer,
22
00:00:42,000 –> 00:00:43,280
that narrative seems true.
23
00:00:43,280 –> 00:00:45,640
You can create a workspace, land files into OneLake,
24
00:00:45,640 –> 00:00:47,760
click a button, generate a semantic model,
25
00:00:47,760 –> 00:00:49,160
and render a report in minutes.
26
00:00:49,160 –> 00:00:50,000
That demo is real.
27
00:00:50,000 –> 00:00:51,880
The problem is what that demo trained people
28
00:00:51,880 –> 00:00:53,880
to believe: that the platform removed the need
29
00:00:53,880 –> 00:00:55,160
for deliberate architecture.
30
00:00:55,160 –> 00:00:56,040
It did not.
31
00:00:56,040 –> 00:00:58,040
In architectural terms, Fabric is not a tool.
32
00:00:58,040 –> 00:01:00,480
It’s platform physics, a single SaaS control plane
33
00:01:00,480 –> 00:01:02,720
that lets every persona touch the same assets
34
00:01:02,720 –> 00:01:06,040
from multiple angles, engineering, warehousing,
35
00:01:06,040 –> 00:01:10,760
BI, real-time, AI, inside one capacity envelope.
36
00:01:10,760 –> 00:01:12,040
That’s not easier.
37
00:01:12,040 –> 00:01:14,800
That’s less friction between intent and impact.
38
00:01:14,800 –> 00:01:17,960
And when your intent is unclear, that reduction in friction
39
00:01:17,960 –> 00:01:19,240
doesn’t produce simplicity.
40
00:01:19,240 –> 00:01:20,880
It produces architectural erosion.
41
00:01:20,880 –> 00:01:22,960
You’ve seen this pattern before, just not this fast.
42
00:01:22,960 –> 00:01:25,360
In older stacks, ambiguity had a tax.
43
00:01:25,360 –> 00:01:28,960
It had to cross boundaries: ADF to Synapse, dbt to warehouse,
44
00:01:28,960 –> 00:01:31,600
warehouse to Power BI, Power BI to the app.
45
00:01:31,600 –> 00:01:33,840
Each handoff forced someone to make a decision,
46
00:01:33,840 –> 00:01:35,120
or at least document one.
47
00:01:35,120 –> 00:01:37,960
It was annoying, slow, and expensive,
48
00:01:37,960 –> 00:01:39,480
but it did something useful.
49
00:01:39,480 –> 00:01:41,320
It slowed bad decisions down.
50
00:01:41,320 –> 00:01:43,280
It also gave you time to notice ownership gaps
51
00:01:43,280 –> 00:01:45,080
because change rate was limited by friction.
52
00:01:45,080 –> 00:01:46,680
Fabric removes that padding.
53
00:01:46,680 –> 00:01:50,000
So the failure modes surface earlier, and they surface louder.
54
00:01:50,000 –> 00:01:52,560
That’s why I’m positioning this episode a very specific way.
55
00:01:52,560 –> 00:01:54,080
These failures happened in Fabric,
56
00:01:54,080 –> 00:01:55,560
but they’re not Fabric-specific.
57
00:01:55,560 –> 00:01:58,000
Fabric just removed the padding that used to hide them.
58
00:01:58,000 –> 00:02:00,480
If you’ve operated Snowflake, Databricks, Synapse,
59
00:02:00,480 –> 00:02:03,320
BigQuery, pick your religion, you’ve seen the same entropy.
60
00:02:03,320 –> 00:02:05,160
Fabric just makes the drift visible sooner
61
00:02:05,160 –> 00:02:08,400
because the boundaries blur by default, not by exception.
62
00:02:08,400 –> 00:02:10,280
And that distinction matters, especially if you’re one
63
00:02:10,280 –> 00:02:13,120
of the people inheriting a Fabric estate right now.
64
00:02:13,120 –> 00:02:17,280
This episode is for senior data engineers, analytics engineers,
65
00:02:17,280 –> 00:02:19,920
data platform owners, architects who just got handed
66
00:02:19,920 –> 00:02:22,520
a bunch of workspaces, and leaders asking the only
67
00:02:22,520 –> 00:02:25,400
honest question in the room, why is this still hard?
68
00:02:25,400 –> 00:02:27,000
Because the sales pitch said it wouldn’t be.
69
00:02:27,000 –> 00:02:29,360
It’s still hard because data engineering was never hard
70
00:02:29,360 –> 00:02:30,720
due to typing speed.
71
00:02:30,720 –> 00:02:32,960
It was hard because of responsibility, contracts, cost,
72
00:02:32,960 –> 00:02:34,080
and control.
73
00:02:34,080 –> 00:02:37,080
Fabric accelerates delivery, therefore it accelerates drift.
74
00:02:37,080 –> 00:02:39,280
Copilot accelerates authoring, therefore it accelerates
75
00:02:39,280 –> 00:02:40,080
wrongness.
76
00:02:40,080 –> 00:02:42,320
And the first signals you’ll see are not always failures
77
00:02:42,320 –> 00:02:43,240
in the classic sense.
78
00:02:43,240 –> 00:02:44,600
You won’t get a red pipeline.
79
00:02:44,600 –> 00:02:46,120
You’ll get a capacity spike.
80
00:02:46,120 –> 00:02:48,000
You’ll get two dashboards that disagree.
81
00:02:48,000 –> 00:02:50,320
You’ll get an audit question that makes everyone stare
82
00:02:50,320 –> 00:02:50,960
at the floor.
83
00:02:50,960 –> 00:02:52,040
The platform works.
84
00:02:52,040 –> 00:02:53,720
The system doesn’t.
85
00:02:53,720 –> 00:02:55,240
Now here’s the part that kills me.
86
00:02:55,240 –> 00:02:57,800
Teams interpret that as “Fabric is immature” or “Copilot
87
00:02:57,800 –> 00:03:00,320
is unreliable” or “we need better training.”
88
00:03:00,320 –> 00:03:03,080
And sure, training helps people click the right buttons.
89
00:03:03,080 –> 00:03:04,720
But training doesn’t enforce intent.
90
00:03:04,720 –> 00:03:06,040
Labels don’t enforce intent.
91
00:03:06,040 –> 00:03:07,600
Documentation doesn’t enforce intent.
92
00:03:07,600 –> 00:03:09,520
Workspace RBAC doesn’t enforce intent.
93
00:03:09,520 –> 00:03:10,400
Only design does.
94
00:03:10,400 –> 00:03:11,200
Only boundaries do.
95
00:03:11,200 –> 00:03:12,480
Only gates do.
96
00:03:12,480 –> 00:03:14,120
So the promise for this episode is simple.
97
00:03:14,120 –> 00:03:15,040
It’s not a tutorial.
98
00:03:15,040 –> 00:03:16,160
It’s not a feature tour.
99
00:03:16,160 –> 00:03:18,360
It’s an explanation of why teams feel faster and more
100
00:03:18,360 –> 00:03:20,280
out of control at the exact same time.
101
00:03:20,280 –> 00:03:23,320
And what operating rules actually survive this platform.
102
00:03:23,320 –> 00:03:25,200
We’re going to walk through the old mental model
103
00:03:25,200 –> 00:03:26,160
that Fabric replaced,
104
00:03:26,160 –> 00:03:29,880
what Fabric actually changed under the hood, and why one platform
105
00:03:29,880 –> 00:03:31,600
quietly converts your governance model
106
00:03:31,600 –> 00:03:35,320
from deterministic to probabilistic unless you fight back.
107
00:03:35,320 –> 00:03:37,320
Then we’ll talk about Copilot, not as a novelty,
108
00:03:37,320 –> 00:03:39,520
but as an acceleration layer that optimizes
109
00:03:39,520 –> 00:03:41,080
for completion over consequence.
110
00:03:41,080 –> 00:03:43,280
And I’m not going to ask you to trust vibes.
111
00:03:43,280 –> 00:03:45,240
The only artifacts that matter in this world
112
00:03:45,240 –> 00:03:48,840
are execution plans, capacity metrics, and violation
113
00:03:48,840 –> 00:03:49,440
counts.
114
00:03:49,440 –> 00:03:51,800
If it doesn’t show up in a plan, a cost report,
115
00:03:51,800 –> 00:03:54,560
or a violation count, it’s not governance, it’s hope.
116
00:03:54,560 –> 00:03:56,520
By the end, you’ll have a mental model for Fabric
117
00:03:56,520 –> 00:03:59,360
that’s honest and an operating model that assumes decay
118
00:03:59,360 –> 00:04:00,520
unless enforced.
119
00:04:00,520 –> 00:04:02,280
Because that’s the only model that survives
120
00:04:02,280 –> 00:04:04,360
a platform designed to make everything easy,
121
00:04:04,360 –> 00:04:05,920
right up until it’s expensive.
122
00:04:05,920 –> 00:04:08,080
Now let’s rewind to the old world for a second
123
00:04:08,080 –> 00:04:10,240
because you need to remember what friction used to do
124
00:04:10,240 –> 00:04:11,200
for you.
125
00:04:11,200 –> 00:04:13,640
The old mental model: pipelines as the product.
126
00:04:13,640 –> 00:04:16,120
Before Fabric, most data teams treated the pipeline
127
00:04:16,120 –> 00:04:18,880
as the product, not the dataset, not the semantic model,
128
00:04:18,880 –> 00:04:21,120
not the KPI definition, the pipeline.
129
00:04:21,120 –> 00:04:23,600
Success meant the job ran, the schedule didn’t slip,
130
00:04:23,600 –> 00:04:25,640
and nothing paged you at 2am.
131
00:04:25,640 –> 00:04:28,120
The pipeline graph was the artifact you showed leadership
132
00:04:28,120 –> 00:04:29,040
to prove progress.
133
00:04:29,040 –> 00:04:30,960
Look, the boxes connect, the arrows flow,
134
00:04:30,960 –> 00:04:32,320
the runtime is green.
135
00:04:32,320 –> 00:04:34,280
And honestly, in that era, it made sense
136
00:04:34,280 –> 00:04:36,240
because infrastructure was the bottleneck.
137
00:04:36,240 –> 00:04:39,040
You didn’t wake up and casually provision an estate.
138
00:04:39,040 –> 00:04:41,680
You requested clusters, you negotiated gateways,
139
00:04:41,680 –> 00:04:44,760
you waited on networking, you argued over service limits,
140
00:04:44,760 –> 00:04:46,440
you begged for firewall holes.
141
00:04:46,440 –> 00:04:49,000
And because all of that took time, the organization learned
142
00:04:49,000 –> 00:04:52,560
a behavior: treat delivery as a construction project,
143
00:04:52,560 –> 00:04:55,640
engineers as builders, environments as hard boundaries,
144
00:04:55,640 –> 00:04:57,520
releases as events.
145
00:04:57,520 –> 00:04:59,920
SQL was the interface because SQL was the only thing
146
00:04:59,920 –> 00:05:01,320
that survived handoffs.
147
00:05:01,320 –> 00:05:03,840
Data Factory to Synapse, Synapse to a warehouse,
148
00:05:03,840 –> 00:05:06,280
warehouse to Power BI, each layer forced you
149
00:05:06,280 –> 00:05:07,800
to speak a shared language.
150
00:05:07,800 –> 00:05:10,480
That language wasn’t always pretty, but it was explicit.
151
00:05:10,480 –> 00:05:13,360
If a transformation mattered, it got written down somewhere.
152
00:05:13,360 –> 00:05:17,240
In SQL, in stored procedures, in dbt models, in views.
153
00:05:17,240 –> 00:05:19,720
And if it didn’t get written down, it usually didn’t ship
154
00:05:19,720 –> 00:05:22,400
because the next tool in the chain demanded a decision.
155
00:05:22,400 –> 00:05:25,600
Tool sprawl was an expected tax, and it came with a hidden benefit.
156
00:05:25,600 –> 00:05:27,920
When you had ADF over here, a warehouse over there,
157
00:05:27,920 –> 00:05:29,800
and Power BI in a different portal,
158
00:05:29,800 –> 00:05:31,760
you couldn’t pretend the boundaries weren’t real.
159
00:05:31,760 –> 00:05:33,440
Security lived in multiple places,
160
00:05:33,440 –> 00:05:35,080
compute lived in multiple places,
161
00:05:35,080 –> 00:05:36,560
storage lived in multiple places.
162
00:05:36,560 –> 00:05:38,640
That forced explicit ownership conversations,
163
00:05:38,640 –> 00:05:40,080
even if they were miserable.
164
00:05:40,080 –> 00:05:41,040
Who owns the lake?
165
00:05:41,040 –> 00:05:42,640
Who owns the warehouse schema?
166
00:05:42,640 –> 00:05:44,320
Who owns the semantic model?
167
00:05:44,320 –> 00:05:45,440
Who is allowed to publish?
168
00:05:45,440 –> 00:05:46,920
Who approves refresh schedules?
169
00:05:46,920 –> 00:05:50,000
You didn’t always like the answers, but the system demanded you ask.
170
00:05:50,000 –> 00:05:52,200
Deployment friction was also accidental governance.
171
00:05:52,200 –> 00:05:54,600
PRs, approvals, promotion between dev and prod,
172
00:05:54,600 –> 00:05:56,560
long run times, limited concurrency,
173
00:05:56,560 –> 00:05:58,680
none of that made engineers happy.
174
00:05:58,680 –> 00:05:59,840
But it created a throttle.
175
00:05:59,840 –> 00:06:01,880
A bad decision had to survive a review cycle.
176
00:06:01,880 –> 00:06:04,600
A schema change had to survive someone noticing it.
177
00:06:04,600 –> 00:06:07,080
A new pipeline had to survive an environment boundary.
178
00:06:07,080 –> 00:06:09,520
You could still make mistakes, but you couldn’t make them
179
00:06:09,520 –> 00:06:11,680
every five minutes across every surface.
180
00:06:11,680 –> 00:06:14,920
And that mattered because failure used to look like an outage.
181
00:06:14,920 –> 00:06:16,800
A pipeline failed, a job didn’t run,
182
00:06:16,800 –> 00:06:18,320
a dataset didn’t refresh.
183
00:06:18,320 –> 00:06:19,640
Something was visibly broken.
184
00:06:19,640 –> 00:06:21,760
What didn’t happen as often was quietly wrong.
185
00:06:21,760 –> 00:06:23,840
You didn’t usually get a perfect green check mark attached
186
00:06:23,840 –> 00:06:25,040
to a perfectly wrong dashboard
187
00:06:25,040 –> 00:06:27,440
that refreshed on time and lied consistently.
188
00:06:27,440 –> 00:06:29,440
Not because the old world was morally superior
189
00:06:29,440 –> 00:06:30,480
because it was slower.
190
00:06:30,480 –> 00:06:32,640
Slow systems leak less ambiguity per hour.
191
00:06:32,640 –> 00:06:34,560
Here’s the key insight to keep in your head:
192
00:06:34,560 –> 00:06:36,560
friction did two things at once.
193
00:06:36,560 –> 00:06:38,280
It slowed bad decisions down.
194
00:06:38,280 –> 00:06:41,080
And it hid ownership gaps by reducing the rate of change.
195
00:06:41,080 –> 00:06:44,120
So when leaders say, “Why did Fabric make everything feel out of control?”
196
00:06:44,120 –> 00:06:46,520
The answer isn’t that Fabric broke governance.
197
00:06:46,520 –> 00:06:49,320
The old stack just rationed chaos.
198
00:06:49,320 –> 00:06:52,600
Fabric’s real rewrite: from pipeline graph to platform physics.
199
00:06:52,600 –> 00:06:55,360
Fabric’s real rewrite isn’t that it gave you nicer tooling.
200
00:06:55,360 –> 00:06:57,080
It rewired where decisions happen.
201
00:06:57,080 –> 00:07:00,360
In the old model, your pipeline graph sat on top of separate systems.
202
00:07:00,360 –> 00:07:02,440
The graph orchestrated movement between places
203
00:07:02,440 –> 00:07:04,360
that were owned differently, built differently,
204
00:07:04,360 –> 00:07:07,000
secured differently, and optimized differently.
205
00:07:07,000 –> 00:07:08,360
That forced you to think in boundaries
206
00:07:08,360 –> 00:07:11,000
because the platform forced you to pay for boundaries.
207
00:07:11,000 –> 00:07:14,360
Fabric collapses that whole experience into one SaaS control plane
208
00:07:14,360 –> 00:07:16,600
and the first consequence is that the workspace
209
00:07:16,600 –> 00:07:18,200
becomes the unit of reality.
210
00:07:18,200 –> 00:07:20,840
Not the database, not the subscription, not the cluster,
211
00:07:20,840 –> 00:07:21,800
the workspace.
212
00:07:21,800 –> 00:07:24,600
Microsoft markets this as one experience,
213
00:07:24,600 –> 00:07:25,800
and that’s accurate,
214
00:07:25,800 –> 00:07:27,880
but the technical meaning is more interesting.
215
00:07:27,880 –> 00:07:30,200
Fabric gives every workload,
216
00:07:30,200 –> 00:07:32,920
Lakehouse, warehouse, Data Factory, notebooks,
217
00:07:32,920 –> 00:07:36,520
Power BI, Real-Time Intelligence, data agents,
218
00:07:36,520 –> 00:07:38,520
one shared containment model.
219
00:07:38,520 –> 00:07:41,400
Same surface, same identity plane, same capacity envelope,
220
00:07:41,400 –> 00:07:43,240
same place where people click share.
221
00:07:43,240 –> 00:07:46,120
So the platform stops behaving like a chain of systems.
222
00:07:46,120 –> 00:07:48,520
It behaves like a single authorization graph
223
00:07:48,520 –> 00:07:50,680
attached to a single compute meter.
224
00:07:50,680 –> 00:07:53,080
That distinction matters because once you consolidate
225
00:07:53,080 –> 00:07:55,320
compute, storage, and publishing into one place,
226
00:07:55,320 –> 00:07:57,080
you no longer have handoffs.
227
00:07:57,080 –> 00:07:58,280
You have lateral movement.
228
00:07:58,280 –> 00:08:00,360
Engineers stop handing work off across tools
229
00:08:00,360 –> 00:08:02,360
and start handing it around inside the workspace.
230
00:08:02,360 –> 00:08:04,120
And because the mechanics feel frictionless,
231
00:08:04,120 –> 00:08:06,600
teams interpret that as architectural progress.
232
00:08:06,600 –> 00:08:09,960
But the real story is what got removed, the walls between responsibilities.
233
00:08:09,960 –> 00:08:11,960
Fabric collapses the old stack in four ways
234
00:08:11,960 –> 00:08:14,360
that are convenient in demos and brutal in real estates.
235
00:08:14,360 –> 00:08:17,720
First, storage gets pulled under a single narrative: OneLake.
236
00:08:17,720 –> 00:08:20,040
The “OneDrive for data” analogy is catchy,
237
00:08:20,040 –> 00:08:22,760
and it’s not wrong in the sense that you get a unified lake layer
238
00:08:22,760 –> 00:08:23,800
and a unified catalog,
239
00:08:23,800 –> 00:08:26,760
but the actual behavior that matters for engineering
240
00:08:26,760 –> 00:08:27,880
isn’t the metaphor.
241
00:08:27,880 –> 00:08:28,680
It’s gravity.
242
00:08:28,680 –> 00:08:31,080
OneLake becomes the default landing zone for everything,
243
00:08:31,080 –> 00:08:32,200
and the moment that happens,
244
00:08:32,200 –> 00:08:35,160
people stop asking who owns the data because it’s in the lake.
245
00:08:35,160 –> 00:08:36,840
Ownership turns into location.
246
00:08:36,840 –> 00:08:37,880
That’s not ownership.
247
00:08:37,880 –> 00:08:41,000
Second, compute gets abstracted into capacity.
248
00:08:41,000 –> 00:08:43,080
You don’t pay per engine the way you used to.
249
00:08:43,080 –> 00:08:45,880
You pay for the shared meter that every engine draws from.
250
00:08:45,880 –> 00:08:48,120
Spark sessions, warehouse queries, model refreshes,
251
00:08:48,120 –> 00:08:49,800
interactive exploration, same pool.
252
00:08:49,800 –> 00:08:51,160
That sounds like simplification
253
00:08:51,160 –> 00:08:53,640
until you realize what it does to accountability.
254
00:08:53,640 –> 00:08:55,640
When the meter is shared, everyone believes
255
00:08:55,640 –> 00:08:57,960
someone else is responsible for the spike.
256
00:08:57,960 –> 00:08:59,160
And because it’s a SaaS platform,
257
00:08:59,160 –> 00:09:01,240
the infrastructure isn’t yours to tune,
258
00:09:01,240 –> 00:09:03,400
so teams search for comfort in the UI,
259
00:09:03,400 –> 00:09:05,960
instead of certainty in the workload behavior.
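One way to make the shared meter attributable again is to total consumption per owning team. This is a minimal sketch, not a Fabric API: the record shape, the field names (owner, kind, cu_seconds), and the numbers are all hypothetical stand-ins for whatever your capacity report exports.

```python
from collections import defaultdict

# Hypothetical usage records; field names and values are illustrative only,
# not the schema of any real Fabric export.
USAGE = [
    {"owner": "sales-analytics", "kind": "semantic model refresh", "cu_seconds": 1200.0},
    {"owner": "data-engineering", "kind": "Spark session", "cu_seconds": 2400.0},
    {"owner": "sales-analytics", "kind": "warehouse query", "cu_seconds": 400.0},
]


def consumption_by_owner(records):
    """Sum consumption per owner; return (owner, total, share) rows, largest first."""
    totals = defaultdict(float)
    for record in records:
        totals[record["owner"]] += record["cu_seconds"]
    grand_total = sum(totals.values())
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [
        (owner, total, total / grand_total if grand_total else 0.0)
        for owner, total in ranked
    ]


if __name__ == "__main__":
    for owner, total, share in consumption_by_owner(USAGE):
        print(f"{owner}: {total:.0f} CU-seconds ({share:.0%})")
```

The point of the sketch is the grouping key: once every record carries an owner, the spike stops being someone else’s problem.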
260
00:09:05,960 –> 00:09:08,520
Third, security baselines get normalized.
261
00:09:08,520 –> 00:09:10,200
Workspaces come with a simple role model.
262
00:09:10,200 –> 00:09:11,160
Great for adoption.
263
00:09:11,160 –> 00:09:12,440
Dangerous for intent.
264
00:09:12,440 –> 00:09:15,080
Because workspace roles are not a data security model.
265
00:09:15,080 –> 00:09:16,360
They’re a collaboration model.
266
00:09:16,360 –> 00:09:17,560
They answer who can work here,
267
00:09:17,560 –> 00:09:18,840
not who can see this table,
268
00:09:18,840 –> 00:09:20,120
not who can query this view,
269
00:09:20,120 –> 00:09:22,360
not who is allowed to infer this sensitive attribute
270
00:09:22,360 –> 00:09:23,240
through a join.
271
00:09:23,240 –> 00:09:25,640
But when everything lives behind the same workspace boundary,
272
00:09:25,640 –> 00:09:28,200
people treat that boundary as if it were a database boundary.
273
00:09:28,200 –> 00:09:31,400
It is not. Fourth, integration paths become internal.
274
00:09:31,400 –> 00:09:34,360
In the old world, a new pipeline meant choosing connectors,
275
00:09:34,360 –> 00:09:36,760
runtimes, service principals, landing zones,
276
00:09:36,760 –> 00:09:39,080
and a destination pattern. That was ceremony.
277
00:09:39,080 –> 00:09:41,640
Fabric collapses that into item creation.
278
00:09:41,640 –> 00:09:44,760
Dataflow, pipeline, notebook, shortcut, mirror,
279
00:09:44,760 –> 00:09:47,000
warehouse table, semantic model, click, click,
280
00:09:47,000 –> 00:09:47,640
done.
281
00:09:47,640 –> 00:09:49,480
And because it’s all inside one surface,
282
00:09:49,480 –> 00:09:51,480
every path looks equally endorsed.
283
00:09:51,480 –> 00:09:52,840
Shortcuts look like ownership.
284
00:09:52,840 –> 00:09:54,280
Copies look like lineage.
285
00:09:54,280 –> 00:09:55,960
Semantic reuse looks like governance.
286
00:09:55,960 –> 00:09:58,360
Meanwhile, the actual physics never changed.
287
00:09:58,360 –> 00:10:01,240
Data still has three realities you can’t negotiate with.
288
00:10:01,240 –> 00:10:03,640
Cost, contracts, and control.
289
00:10:03,640 –> 00:10:06,520
Cost still exists, but now it’s surfaced as capacity behavior.
290
00:10:06,520 –> 00:10:09,160
Contracts still exist, but now they’re optional,
291
00:10:09,160 –> 00:10:10,440
unless you enforce them.
292
00:10:10,440 –> 00:10:13,880
Control still exists, but now it’s easy to confuse access
293
00:10:13,880 –> 00:10:16,040
with intent.
294
00:10:16,040 –> 00:10:18,120
This is the foundational misunderstanding.
295
00:10:18,120 –> 00:10:20,680
Fabric removed the walls, not the physics.
296
00:10:20,680 –> 00:10:23,560
When Microsoft says workloads become experiences,
297
00:10:23,560 –> 00:10:26,280
what that means in practice is that workloads become modes.
298
00:10:26,280 –> 00:10:28,600
A lakehouse isn’t a separate product you provision.
299
00:10:28,600 –> 00:10:30,920
It’s an item inside the workspace.
300
00:10:30,920 –> 00:10:33,480
A warehouse isn’t a separate environment you protect.
301
00:10:33,480 –> 00:10:36,200
It’s an item inside the same permission model.
302
00:10:36,200 –> 00:10:37,640
Power BI isn’t downstream.
303
00:10:37,640 –> 00:10:40,040
It’s in the same place, pointed at the same assets,
304
00:10:40,040 –> 00:10:42,440
built by the same people, sometimes in the same afternoon.
305
00:10:42,440 –> 00:10:44,840
So the integration story becomes dangerously simple.
306
00:10:44,840 –> 00:10:47,000
If someone can see it, they can use it.
307
00:10:47,000 –> 00:10:48,280
If they can use it, they will.
308
00:10:48,280 –> 00:10:49,960
If they can publish it, they’ll publish it.
309
00:10:49,960 –> 00:10:52,120
And once something gets published, it becomes production.
310
00:10:52,120 –> 00:10:54,200
Because executives don’t care how it was made.
311
00:10:54,200 –> 00:10:55,400
They care that it exists.
312
00:10:55,400 –> 00:10:58,360
That’s why Fabric feels like velocity and chaos at the same time.
313
00:10:58,360 –> 00:11:00,520
Because you’re no longer fighting tools.
314
00:11:00,520 –> 00:11:03,640
You’re fighting entropy in a single shared control plane.
315
00:11:03,640 –> 00:11:06,040
And now we need to talk about what speed does
316
00:11:06,040 –> 00:11:07,800
when determinism isn’t enforced.
317
00:11:07,800 –> 00:11:11,080
Because this is where one platform turns into one blast radius.
318
00:11:11,080 –> 00:11:15,160
Speed without determinism: why faster feels like less safe.
319
00:11:15,160 –> 00:11:18,200
Most teams confuse pipeline speed with decision speed.
320
00:11:18,200 –> 00:11:20,840
Fabric absolutely makes execution faster.
321
00:11:20,840 –> 00:11:21,960
Ingestion is simpler.
322
00:11:21,960 –> 00:11:23,560
Notebooks are one click away.
323
00:11:23,560 –> 00:11:25,880
Direct Lake lights up visuals, and the platform
324
00:11:25,880 –> 00:11:27,640
removes a lot of the old drag.
325
00:11:27,640 –> 00:11:29,160
But the hidden variable is that
326
00:11:29,160 –> 00:11:32,760
the rate of change goes up and with it, the rate of unreviewed decisions.
327
00:11:32,760 –> 00:11:34,040
You didn’t just speed up jobs.
328
00:11:34,040 –> 00:11:35,640
You sped up policy mistakes.
329
00:11:35,640 –> 00:11:37,240
And the moment the change rate goes up,
330
00:11:37,240 –> 00:11:39,320
ambiguity stops being a documentation problem.
331
00:11:39,320 –> 00:11:40,840
It becomes a production behavior.
332
00:11:40,840 –> 00:11:43,800
Because ambiguity in data engineering isn’t philosophical.
333
00:11:43,800 –> 00:11:46,360
It’s concrete: schema drift, naming drift,
334
00:11:46,360 –> 00:11:48,360
semantic drift, access drift.
335
00:11:48,360 –> 00:11:50,120
In the old stack, drift still happened,
336
00:11:50,120 –> 00:11:52,120
but it had to crawl through boundaries.
337
00:11:52,120 –> 00:11:54,680
Now it moves at the same speed as a refresh schedule
338
00:11:54,680 –> 00:11:56,680
and the same speed as a Copilot suggestion.
339
00:11:56,680 –> 00:11:58,840
So faster starts to feel like less safe.
340
00:11:58,840 –> 00:12:00,520
Not because Fabric is unsafe,
341
00:12:00,520 –> 00:12:03,320
but because determinism isn’t enforced by default.
342
00:12:03,320 –> 00:12:05,000
You’re operating a high-speed platform
343
00:12:05,000 –> 00:12:07,800
with low-friction publishing and mostly social contracts.
344
00:12:07,800 –> 00:12:09,880
Social contracts do not survive scale.
345
00:12:09,880 –> 00:12:11,320
Here’s the real mechanism.
346
00:12:11,320 –> 00:12:13,480
Default behaviors become architecture.
347
00:12:13,480 –> 00:12:15,720
Not in a poetic way, in an operational way.
348
00:12:15,720 –> 00:12:17,480
If a workspace has no naming conventions,
349
00:12:17,480 –> 00:12:20,120
then whatever the first team did becomes the convention.
350
00:12:20,120 –> 00:12:22,520
If the lakehouse lands files with inconsistent types,
351
00:12:22,520 –> 00:12:26,120
then the warehouse consumes that inconsistency unless you reject it.
352
00:12:26,120 –> 00:12:28,920
If the semantic model gets built directly off raw tables
353
00:12:28,920 –> 00:12:30,120
because it’s convenient,
354
00:12:30,120 –> 00:12:32,040
that convenience becomes a dependency graph.
355
00:12:32,040 –> 00:12:34,440
And once dashboards exist, nobody wants to touch the source
356
00:12:34,440 –> 00:12:36,200
because now you’re breaking the business.
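To make “unless you reject it” concrete, here is a minimal sketch of a contract gate. It assumes a hypothetical contract written as a column-to-type mapping and rows already read into dictionaries; ORDERS_CONTRACT, contract_violations, and gate are illustrative names, not anything Fabric provides.

```python
from typing import Any

# Hypothetical contract: column name -> required Python type.
ORDERS_CONTRACT: dict[str, type] = {"order_id": int, "amount": float, "country": str}


def contract_violations(rows: list[dict[str, Any]], contract: dict[str, type]) -> list[str]:
    """Return one message per violation; an empty list means the batch passes."""
    violations = []
    for index, row in enumerate(rows):
        for column, expected in contract.items():
            if column not in row:
                violations.append(f"row {index}: missing column '{column}'")
            elif not isinstance(row[column], expected):
                actual = type(row[column]).__name__
                violations.append(
                    f"row {index}: '{column}' is {actual}, expected {expected.__name__}"
                )
    return violations


def gate(rows: list[dict[str, Any]], contract: dict[str, type]) -> list[dict[str, Any]]:
    """Raise instead of loading when the batch breaks the contract."""
    found = contract_violations(rows, contract)
    if found:
        raise ValueError(f"{len(found)} contract violation(s): {found}")
    return rows
```

The length of the list that contract_violations returns is a violation count you can track over time, and gate is the part that actually stops the load.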
357
00:12:36,200 –> 00:12:39,400
So you end up with workspace conventions replacing contracts
358
00:12:39,400 –> 00:12:41,080
and shortcuts replacing ownership
359
00:12:41,080 –> 00:12:43,320
and refreshes replacing correctness.
360
00:12:43,320 –> 00:12:45,640
That distinction matters because Fabric makes it easy
361
00:12:45,640 –> 00:12:47,400
to create the downstream artifacts
362
00:12:47,400 –> 00:12:49,720
before you’ve locked the upstream invariants.
363
00:12:49,720 –> 00:12:51,240
And when downstream exists first,
364
00:12:51,240 –> 00:12:53,080
the organization starts managing backwards,
365
00:12:53,080 –> 00:12:54,920
patch the semantic layer, patch the DAX,
366
00:12:54,920 –> 00:12:57,320
patch the report, patch the refresh schedule.
367
00:12:57,320 –> 00:12:59,080
The data layer becomes a landfill of
368
00:12:59,080 –> 00:13:01,400
“we’ll clean it later.” Later never comes.
369
00:13:01,400 –> 00:13:03,080
Now add capacity pricing to this
370
00:13:03,080 –> 00:13:05,400
because this is where Fabric gets brutally honest.
371
00:13:05,400 –> 00:13:07,080
In the old world, you paid per tool.
372
00:13:07,080 –> 00:13:09,560
You could hide inefficiency inside somebody else’s bill,
373
00:13:09,560 –> 00:13:12,360
the warehouse bill, the spark bill, the BI bill.
374
00:13:12,360 –> 00:13:13,880
Fabric makes the chaos shared.
375
00:13:13,880 –> 00:13:15,400
Everything draws from one meter.
376
00:13:15,400 –> 00:13:18,520
So the cost of ambiguity shows up as compute consumption.
377
00:13:18,520 –> 00:13:21,400
Spikes, throttling, degraded interactivity,
378
00:13:21,400 –> 00:13:22,680
refresh contention,
379
00:13:22,680 –> 00:13:26,040
and a monthly bill that suddenly has a personality.
380
00:13:26,040 –> 00:13:28,440
And leadership doesn’t care that the SQL was only used
381
00:13:28,440 –> 00:13:29,720
for a small result set.
382
00:13:29,720 –> 00:13:31,400
The platform still paid for the scan.
383
00:13:31,400 –> 00:13:33,160
This is why Fabric becomes the first place
384
00:13:33,160 –> 00:13:35,240
many organizations experience cost incidents
385
00:13:35,240 –> 00:13:36,440
as their primary signal.
386
00:13:36,440 –> 00:13:39,960
Not outages, not failures, cost, capacity saturation,
387
00:13:39,960 –> 00:13:41,960
performance degradation, the system functions,
388
00:13:41,960 –> 00:13:44,120
but it does so expensively and unpredictably,
389
00:13:44,120 –> 00:13:45,640
which is just another way of saying
390
00:13:45,640 –> 00:13:47,320
your assumptions are no longer free.
391
00:13:47,320 –> 00:13:49,240
You can’t hope your way into stable spend.
392
00:13:49,240 –> 00:13:52,200
You can’t train your way into deterministic access boundaries.
393
00:13:52,200 –> 00:13:54,760
You can’t label your way into a schema contract.
394
00:13:54,760 –> 00:13:56,840
Fabric doesn’t break. Your assumptions do.
395
00:13:56,840 –> 00:13:59,960
And once you see that, you realize why the platform feels
396
00:13:59,960 –> 00:14:01,400
like it’s tightening around you
397
00:14:01,400 –> 00:14:03,720
because you’re no longer amortizing mistakes over time.
398
00:14:03,720 –> 00:14:05,480
You’re paying for them immediately at scale
399
00:14:05,480 –> 00:14:07,960
on a shared meter with downstream artifacts
400
00:14:07,960 –> 00:14:10,440
that create political resistance to fixing the source.
401
00:14:10,440 –> 00:14:11,640
So yes, Fabric is faster.
402
00:14:11,640 –> 00:14:14,440
But speed without gates is just higher frequency failure.
403
00:14:14,440 –> 00:14:17,240
And now Copilot becomes the acceleration layer on top of that,
404
00:14:17,240 –> 00:14:19,160
because it doesn’t just speed up execution.
405
00:14:19,160 –> 00:14:21,880
It speeds up the creation of plausible, unbounded decisions.
406
00:14:22,040 –> 00:14:24,760
Copilot’s real impact: completion over consequence.
407
00:14:24,760 –> 00:14:27,640
Copilot for Microsoft Fabric is the part everyone wants to talk about
408
00:14:27,640 –> 00:14:29,480
because it looks like free velocity.
409
00:14:29,480 –> 00:14:31,480
You type a sentence and it writes the pipeline,
410
00:14:31,480 –> 00:14:33,800
it writes the notebook cell, it writes the SQL,
411
00:14:33,800 –> 00:14:36,120
it writes the KQL, it suggests the transformation,
412
00:14:36,120 –> 00:14:39,080
it will even explain the error message it helped create.
413
00:14:39,080 –> 00:14:41,480
And the uncomfortable truth is that it does exactly
414
00:14:41,480 –> 00:14:42,680
what it’s designed to do.
415
00:14:42,680 –> 00:14:44,280
It helps you complete a task,
416
00:14:44,280 –> 00:14:46,360
but Copilot doesn’t live in your incident timeline.
417
00:14:46,360 –> 00:14:47,640
It doesn’t live in your budget,
418
00:14:47,640 –> 00:14:49,560
it doesn’t live in your audit findings.
419
00:14:49,560 –> 00:14:53,400
So when you ask, “What does Copilot actually change in data engineering?”
420
00:14:53,400 –> 00:14:55,400
The answer is not “it replaces engineers”
421
00:14:55,400 –> 00:14:57,640
or “it eliminates complexity.”
422
00:14:57,640 –> 00:15:00,040
It changes the failure rate of ambiguity.
423
00:15:00,040 –> 00:15:02,280
Because the optimization target is completion,
424
00:15:02,280 –> 00:15:03,880
plausible output fast,
425
00:15:03,880 –> 00:15:05,000
not deterministic cost,
426
00:15:05,000 –> 00:15:06,280
not governance intent,
427
00:15:06,280 –> 00:15:07,800
not long-term correctness.
428
00:15:07,800 –> 00:15:09,240
Copilot writes answers.
429
00:15:09,240 –> 00:15:11,320
Data engineering is about consequences.
430
00:15:11,320 –> 00:15:12,680
Here’s what most teams miss.
431
00:15:12,680 –> 00:15:14,840
Copilot doesn’t generate work in a vacuum.
432
00:15:14,840 –> 00:15:16,680
It generates work inside a platform
433
00:15:16,680 –> 00:15:19,480
where the easiest path is also the widest blast radius.
434
00:15:19,480 –> 00:15:21,560
Fabric gives you immediate downstream surfaces,
435
00:15:21,560 –> 00:15:24,280
Direct Lake, semantic models, reports, data agents.
436
00:15:24,280 –> 00:15:27,480
So whatever Copilot produces can become real, fast.
437
00:15:27,480 –> 00:15:29,400
And once it becomes real, it becomes defended.
438
00:15:29,400 –> 00:15:32,440
People build slides on it, they send links, they make decisions,
439
00:15:32,440 –> 00:15:33,800
then you, the inheriting architect,
440
00:15:33,800 –> 00:15:35,480
get asked why the numbers changed.
441
00:15:35,480 –> 00:15:37,640
Copilot didn’t break anything.
442
00:15:37,640 –> 00:15:39,480
It just made it easy to ship an assumption
443
00:15:39,480 –> 00:15:41,160
before you encoded it as a contract.
444
00:15:41,160 –> 00:15:42,920
And you can see the pattern across workloads.
445
00:15:42,920 –> 00:15:45,480
In a warehouse, Copilot will happily generate a query
446
00:15:45,480 –> 00:15:47,480
that returns the right-looking columns
447
00:15:47,480 –> 00:15:50,120
with the right-looking joins for the right-looking question.
448
00:15:50,120 –> 00:15:52,920
It will also happily do it without a bounded time predicate,
449
00:15:52,920 –> 00:15:54,600
without respecting partitioning intent,
450
00:15:54,600 –> 00:15:57,720
and without any awareness of what your capacity meter will do
451
00:15:57,720 –> 00:16:00,600
when someone runs it at 9 a.m. alongside three refreshes
452
00:16:00,600 –> 00:16:03,160
and 10 other analysts trying to “just check something.”
453
00:16:03,160 –> 00:16:05,320
In a notebook, it will generate transformations
454
00:16:05,320 –> 00:16:07,000
that look reasonable and run.
455
00:16:07,000 –> 00:16:09,880
It won’t stop and ask you if the lakehouse is schema-on-read
456
00:16:09,880 –> 00:16:11,960
and therefore silently accepting drift.
457
00:16:11,960 –> 00:16:13,720
It won’t insist on validation gates.
458
00:16:13,720 –> 00:16:16,680
It won’t create quarantine tables unless you tell it to.
459
00:16:16,680 –> 00:16:19,000
And if you don’t tell it to, you don’t have quality enforcement.
460
00:16:19,000 –> 00:16:21,000
You have optimism with a refresh schedule.
461
00:16:21,000 –> 00:16:22,680
In Data Factory, it will generate pipelines
462
00:16:22,680 –> 00:16:24,520
that connect and move data.
463
00:16:24,520 –> 00:16:26,840
It won’t force you to define ownership boundaries
464
00:16:26,840 –> 00:16:28,040
or consumption contracts.
465
00:16:28,040 –> 00:16:31,560
It will create a path, and paths at scale become dependencies.
466
00:16:31,560 –> 00:16:32,920
And then there are data agents.
467
00:16:32,920 –> 00:16:34,680
This is where the illusion gets dangerous.
468
00:16:34,680 –> 00:16:36,680
Because now the natural language layer is pointed
469
00:16:36,680 –> 00:16:37,800
at your models and tables
470
00:16:37,800 –> 00:16:40,680
and people interpret conversational access as governed access.
471
00:16:40,680 –> 00:16:43,800
But the agent can only operate within what the caller can already see.
472
00:16:43,800 –> 00:16:45,880
That means Copilot doesn’t inherit your intent.
473
00:16:45,880 –> 00:16:47,560
It inherits your permissions.
474
00:16:47,560 –> 00:16:48,920
If your permissions are broad
475
00:16:48,920 –> 00:16:51,960
because workspace roles became the whole security strategy,
476
00:16:51,960 –> 00:16:53,640
Copilot becomes a really efficient way
477
00:16:53,640 –> 00:16:55,720
to explore things you didn’t mean to expose.
478
00:16:55,720 –> 00:16:58,200
This is why “it worked” becomes “it’s correct”
479
00:16:58,200 –> 00:17:00,280
in Fabric estates that lean on Copilot.
480
00:17:00,280 –> 00:17:02,040
The result renders, the dashboard refreshes,
481
00:17:02,040 –> 00:17:04,040
the agent answers, nobody gets an error.
482
00:17:04,040 –> 00:17:07,560
And in modern organizations, “no error” is treated as proof of correctness.
483
00:17:07,560 –> 00:17:10,200
It isn’t. Copilot makes two kinds of teams faster.
484
00:17:10,200 –> 00:17:11,960
The teams with enforcement become faster
485
00:17:11,960 –> 00:17:14,680
at implementing decisions they already made deliberately.
486
00:17:14,680 –> 00:17:16,760
Teams without enforcement become faster
487
00:17:16,760 –> 00:17:19,080
at producing artifacts that look like decisions.
488
00:17:19,080 –> 00:17:20,200
Those are not the same thing.
489
00:17:20,200 –> 00:17:21,880
So the real governance limitation
490
00:17:21,880 –> 00:17:23,400
isn’t that Copilot hallucinates.
491
00:17:23,400 –> 00:17:27,240
The deeper limitation is that Copilot has no native concept of your invariants
492
00:17:27,240 –> 00:17:29,160
unless you encode them into the system.
493
00:17:29,160 –> 00:17:31,880
Schemas, views, procedures, roles,
494
00:17:31,880 –> 00:17:33,240
CI/CD gates,
495
00:17:33,240 –> 00:17:36,920
and acceptance criteria like execution plans and violation counts.
496
00:17:36,920 –> 00:17:38,760
If you don’t enforce those surfaces,
497
00:17:38,760 –> 00:17:40,840
Copilot doesn’t accelerate engineering.
498
00:17:40,840 –> 00:17:42,200
It accelerates entropy.
499
00:17:42,200 –> 00:17:45,480
And now we can stop theorizing because the failure modes aren’t abstract.
500
00:17:45,480 –> 00:17:48,600
They show up as cost incidents, correctness incidents,
501
00:17:48,600 –> 00:17:51,080
and access incidents, often with no obvious break.
502
00:17:51,080 –> 00:17:54,680
Case one, setup: the cost drift that looked like a ghost.
503
00:17:54,680 –> 00:17:58,520
The first Fabric estate failure that shows up for most organizations
504
00:17:58,520 –> 00:18:00,200
isn’t a security breach,
505
00:18:00,200 –> 00:18:02,440
and it isn’t a data quality scandal.
506
00:18:02,440 –> 00:18:03,480
It’s a bill.
507
00:18:03,480 –> 00:18:06,360
Or more accurately, a capacity graph that looks haunted.
508
00:18:06,360 –> 00:18:08,520
Here’s the symptom pattern. You’re in a steady state.
509
00:18:08,520 –> 00:18:09,800
The business feels good.
510
00:18:09,800 –> 00:18:11,000
Dashboards refresh.
511
00:18:11,000 –> 00:18:12,200
Nobody deployed anything.
512
00:18:12,200 –> 00:18:13,800
There wasn’t a release weekend.
513
00:18:13,800 –> 00:18:15,000
No big new data set.
514
00:18:15,000 –> 00:18:17,000
No “we migrated to a new model.”
515
00:18:17,000 –> 00:18:19,240
And then the capacity metrics spike hard,
516
00:18:19,240 –> 00:18:21,080
usually right in the middle of the workday,
517
00:18:21,080 –> 00:18:24,440
and usually in bursts that don’t line up with your pipeline schedule.
518
00:18:24,440 –> 00:18:26,600
Leadership asks the worst question in the world.
519
00:18:26,600 –> 00:18:27,560
What changed?
520
00:18:27,560 –> 00:18:30,680
And you end up in that familiar meeting where five people say “nothing”
521
00:18:30,680 –> 00:18:32,120
with total sincerity.
522
00:18:32,120 –> 00:18:34,200
Because from their perspective, nothing changed.
523
00:18:34,200 –> 00:18:35,320
The reports still load.
524
00:18:35,320 –> 00:18:36,520
The refreshes still complete.
525
00:18:36,520 –> 00:18:38,600
The pipeline success rate is still green.
526
00:18:38,600 –> 00:18:41,560
So the organization interprets the spike as billing weirdness
527
00:18:41,560 –> 00:18:43,080
or a Fabric capacity glitch,
528
00:18:43,080 –> 00:18:45,320
or “Microsoft did something in the service.”
529
00:18:45,320 –> 00:18:45,960
Plot twist.
530
00:18:45,960 –> 00:18:47,400
It’s almost never a platform ghost.
531
00:18:47,400 –> 00:18:48,920
It’s a query-shaped problem.
532
00:18:48,920 –> 00:18:51,160
Fabric just made cost the first incident signal
533
00:18:51,160 –> 00:18:53,400
because cost is the only thing the platform can’t hide
534
00:18:53,400 –> 00:18:55,320
when everything shares the same meter.
535
00:18:55,320 –> 00:18:56,840
You can mask correctness for weeks.
536
00:18:56,840 –> 00:18:58,520
You can mask access drift for months.
537
00:18:58,520 –> 00:19:00,200
But you cannot mask capacity pressure
538
00:19:00,200 –> 00:19:01,560
when 10 people run wide,
539
00:19:01,560 –> 00:19:03,400
unbounded queries at the same time
540
00:19:03,400 –> 00:19:05,080
the semantic model is refreshing.
541
00:19:05,080 –> 00:19:06,680
So in Fabric, the first question isn’t,
542
00:19:06,680 –> 00:19:07,560
did it fail?
543
00:19:07,560 –> 00:19:09,240
It’s what did it do to the meter?
544
00:19:09,240 –> 00:19:10,600
And that’s why this case matters.
545
00:19:10,600 –> 00:19:11,960
It forces a new discipline.
546
00:19:11,960 –> 00:19:14,600
You have to treat cost as an engineered property,
547
00:19:14,600 –> 00:19:16,200
not as an after-the-fact report.
548
00:19:16,200 –> 00:19:18,680
If cost is undefined, governance is imaginary.
549
00:19:18,680 –> 00:19:21,080
Because a platform that charges you for runtime
550
00:19:21,080 –> 00:19:23,400
doesn’t care whether your result set was small.
551
00:19:23,400 –> 00:19:25,400
It charges you for what you asked the engine to do.
552
00:19:25,400 –> 00:19:27,880
Now, to debug this without lying to yourself,
553
00:19:27,880 –> 00:19:29,720
you need ground-truth artifacts.
554
00:19:29,720 –> 00:19:32,600
Not screenshots, not guesses, not “it feels slower.”
555
00:19:32,600 –> 00:19:33,880
Ground truth means three things.
556
00:19:33,880 –> 00:19:35,160
One, execution plans.
557
00:19:35,160 –> 00:19:36,120
Not the query text.
558
00:19:36,120 –> 00:19:36,920
The plan.
559
00:19:36,920 –> 00:19:37,880
The join types.
560
00:19:37,880 –> 00:19:38,520
The scans.
561
00:19:38,520 –> 00:19:39,160
The sorts.
562
00:19:39,160 –> 00:19:39,800
The spills.
563
00:19:39,800 –> 00:19:40,760
The operator costs.
564
00:19:40,760 –> 00:19:42,280
The shape of the work.
565
00:19:42,280 –> 00:19:44,760
Two, scanned rows versus returned rows.
566
00:19:44,760 –> 00:19:47,160
That ratio tells you whether you’re paying for a seek
567
00:19:47,160 –> 00:19:48,520
or funding a full-world scan
568
00:19:48,520 –> 00:19:50,600
to retrieve 15 rows for a visual.
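A minimal sketch of that ratio check, assuming you can already pull scanned and returned row counts from your own query telemetry. The function names and the 10,000 threshold are hypothetical, chosen for illustration rather than taken from any Fabric API.

```python
def scan_amplification(scanned_rows: int, returned_rows: int) -> float:
    """Rows the engine read per row it handed back to the caller."""
    # Treat an empty result as one row so the ratio stays finite.
    return scanned_rows / max(returned_rows, 1)


def looks_unbounded(scanned_rows: int, returned_rows: int,
                    threshold: float = 10_000.0) -> bool:
    """Flag queries whose amplification exceeds a hypothetical threshold."""
    return scan_amplification(scanned_rows, returned_rows) > threshold


if __name__ == "__main__":
    # Illustrative numbers only: 40 million rows read to return 15.
    print(scan_amplification(40_000_000, 15))
    print(looks_unbounded(40_000_000, 15))
```

The threshold is a policy decision, not a constant. The point is that the ratio becomes a number you can review, not a feeling.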
569
00:19:50,600 –> 00:19:53,880
Three, Fabric capacity metrics aligned to time windows.
570
00:19:53,880 –> 00:19:55,800
You correlate the spike with the query window
571
00:19:55,800 –> 00:19:57,000
and the refresh window.
572
00:19:57,000 –> 00:19:58,120
You don’t start with blame.
573
00:19:58,120 –> 00:19:59,480
You start with correlation.
574
00:19:59,480 –> 00:20:00,280
And once you do that,
575
00:20:00,280 –> 00:20:02,360
you’ll notice the same pattern over and over.
576
00:20:02,360 –> 00:20:04,760
Interactive bursts plus refresh concurrency
577
00:20:04,760 –> 00:20:06,520
plus one or two helpful queries
578
00:20:06,520 –> 00:20:08,120
that look innocent in a chat window
579
00:20:08,120 –> 00:20:10,680
and behave like a denial-of-wallet attack in the engine.
580
00:20:10,680 –> 00:20:11,960
That’s the setup.
581
00:20:11,960 –> 00:20:13,320
Now we can talk root cause
582
00:20:13,320 –> 00:20:15,160
because the plan will tell on you every time.
583
00:20:15,160 –> 00:20:18,040
Case one, root cause: AI SQL that scanned the world.
584
00:20:18,040 –> 00:20:20,440
The root cause usually isn’t “someone ran a query.”
585
00:20:20,440 –> 00:20:22,040
People always ran queries.
586
00:20:22,040 –> 00:20:24,680
The root cause is that Copilot made it socially acceptable
587
00:20:24,680 –> 00:20:27,160
to run warehouse-scale SQL with no discipline
588
00:20:27,160 –> 00:20:29,240
because it looked professional enough to trust
589
00:20:29,240 –> 00:20:31,000
and fast enough to repeat.
590
00:20:31,000 –> 00:20:33,240
So what does the pattern look like in practice?
591
00:20:33,240 –> 00:20:34,600
First, non-sargable predicates.
592
00:20:34,600 –> 00:20:36,440
Copilot loves convenience syntax.
593
00:20:36,440 –> 00:20:39,320
It’ll give you things like filtering on a computed expression,
594
00:20:39,320 –> 00:20:41,400
wrapping date columns in functions,
595
00:20:41,400 –> 00:20:43,320
building conditions that read well to humans
596
00:20:43,320 –> 00:20:45,160
and destroy index usage for engines.
597
00:20:45,160 –> 00:20:46,840
The user sees a clean WHERE clause.
598
00:20:46,840 –> 00:20:49,000
The optimizer sees “cool, I can’t seek.”
599
00:20:49,000 –> 00:20:50,120
And so it scans.
600
00:20:50,120 –> 00:20:52,760
And on a big warehouse table, scan is not a detail.
601
00:20:52,760 –> 00:20:53,480
It’s the bill.
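Here is a small sketch of the sargability problem, using SQLite from Python as a stand-in engine. That is an assumption for illustration: Fabric Warehouse has a different engine and different plan output, and the table and index names here are made up. The shape of the problem is the same, though. Wrap the date column in a function and the engine can no longer use the ordered structure on that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (order_date TEXT, amount REAL)")
conn.execute("CREATE INDEX ix_fact_sales_date ON fact_sales (order_date)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?)",
    [(f"2024-{m:02d}-15", 10.0 * m) for m in range(1, 13)],
)


def plan(sql: str) -> str:
    """Return SQLite's plan text for a query, one step per line."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return "\n".join(row[3] for row in rows)


# Reads well to a human, but the function call hides the column from the index.
wrapped = plan(
    "SELECT amount FROM fact_sales "
    "WHERE strftime('%Y-%m', order_date) = '2024-03'"
)

# Same question, written as a half-open range on the bare column.
ranged = plan(
    "SELECT amount FROM fact_sales "
    "WHERE order_date >= '2024-03-01' AND order_date < '2024-04-01'"
)

print(wrapped)  # a SCAN step: every row is read
print(ranged)   # a SEARCH step: the index narrows the read
```

Both queries return the same rows. Only the plan tells you which one you can afford to run under concurrency.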
602
00:20:53,480 –> 00:20:56,600
Second, missing time filters.
603
00:20:56,600 –> 00:20:58,760
This one is almost comical
604
00:20:58,760 –> 00:21:01,560
because the question people ask is always time bound.
605
00:21:01,560 –> 00:21:03,640
Last month, this quarter,
606
00:21:03,640 –> 00:21:05,240
since the campaign started.
607
00:21:05,240 –> 00:21:07,080
But Copilot tends to generate queries
608
00:21:07,080 –> 00:21:10,520
that are semantically plausible without being operationally bounded.
609
00:21:10,520 –> 00:21:11,800
It’ll happily join fact tables
610
00:21:11,800 –> 00:21:14,120
and return the right columns without enforcing a window.
611
00:21:14,120 –> 00:21:15,800
The result set can still be tiny
612
00:21:15,800 –> 00:21:18,280
because the final visual only needs 10 rows.
613
00:21:18,280 –> 00:21:19,880
But the engine had to read everything
614
00:21:19,880 –> 00:21:21,160
to discover those 10 rows.
615
00:21:21,160 –> 00:21:24,120
Third, SELECT * against large tables.
616
00:21:24,120 –> 00:21:26,840
It happens because Copilot optimizes for completion.
617
00:21:26,840 –> 00:21:28,520
SELECT * returns something.
618
00:21:28,520 –> 00:21:30,120
The user doesn’t have to think.
619
00:21:30,120 –> 00:21:31,640
And once you SELECT *
620
00:21:31,640 –> 00:21:35,320
in a warehouse table, you’ve made two decisions you didn’t intend to make.
621
00:21:35,320 –> 00:21:37,480
You’ve committed to a wider I/O footprint
622
00:21:37,480 –> 00:21:38,920
and you’ve committed to schema drift
623
00:21:38,920 –> 00:21:42,040
because now every new column is automatically in scope
624
00:21:42,040 –> 00:21:44,040
for every downstream consumer.
625
00:21:44,040 –> 00:21:45,400
You didn’t just write a query,
626
00:21:45,400 –> 00:21:47,320
you wrote a contract you never reviewed.
627
00:21:47,320 –> 00:21:49,000
And here’s where people get fooled.
628
00:21:49,000 –> 00:21:50,520
But the report is simple.
629
00:21:50,520 –> 00:21:52,040
But the result is small.
630
00:21:52,040 –> 00:21:53,480
But it finishes quickly.
631
00:21:53,480 –> 00:21:54,600
None of that matters.
632
00:21:54,600 –> 00:21:56,600
What matters is what the plan did.
633
00:21:56,600 –> 00:21:59,240
In execution plans, the tell is consistent.
634
00:21:59,240 –> 00:22:01,400
Large scans feeding hash joins.
635
00:22:01,400 –> 00:22:02,680
Followed by big sorts.
636
00:22:02,680 –> 00:22:03,800
And then you see spills.
637
00:22:03,800 –> 00:22:05,720
Spills are the platform politely admitting
638
00:22:05,720 –> 00:22:09,000
it ran out of memory and decided to rent more time from your capacity.
639
00:22:09,000 –> 00:22:10,360
That’s not a performance issue.
640
00:22:10,360 –> 00:22:12,040
That’s a cost policy failure.
641
00:22:12,040 –> 00:22:14,760
You can also see it in scanned rows versus returned rows.
642
00:22:14,760 –> 00:22:16,920
When that ratio is absurd,
643
00:22:16,920 –> 00:22:19,160
millions scanned, dozens returned,
644
00:22:19,160 –> 00:22:21,320
you’re not looking at someone exploring.
645
00:22:21,320 –> 00:22:23,800
You’re looking at a lack of bounded query surfaces.
646
00:22:23,800 –> 00:22:25,800
Now add Fabric’s abstraction layer
647
00:22:25,800 –> 00:22:27,480
and you get the second illusion.
648
00:22:27,480 –> 00:22:30,040
Semantic model refreshes still succeed.
649
00:22:30,040 –> 00:22:31,800
Direct Lake still renders visuals.
650
00:22:31,800 –> 00:22:33,640
The user’s experience stays smooth enough
651
00:22:33,640 –> 00:22:35,480
that nobody treats it as an incident.
652
00:22:35,480 –> 00:22:38,360
The platform eats the cost, the dashboards keep refreshing
653
00:22:38,360 –> 00:22:40,680
and the only thing that screams is the capacity meter.
654
00:22:40,680 –> 00:22:43,240
So the spikes correlate to interactive bursts
655
00:22:43,240 –> 00:22:45,240
and refresh concurrency, not deployments.
656
00:22:45,240 –> 00:22:46,520
That’s why it feels like a ghost.
657
00:22:46,520 –> 00:22:47,800
Nothing in Git changed.
658
00:22:47,800 –> 00:22:50,360
But user behavior changed, and Copilot made that behavior
659
00:22:50,360 –> 00:22:51,640
easier to produce at scale.
660
00:22:51,640 –> 00:22:53,320
This is also why this problem
661
00:22:53,320 –> 00:22:54,920
shows up first in Fabric estates
662
00:22:54,920 –> 00:22:57,240
that democratize access early.
663
00:22:57,240 –> 00:22:58,600
You give broad read permissions
664
00:22:58,600 –> 00:22:59,880
because you want adoption.
665
00:22:59,880 –> 00:23:01,560
You don’t lock down query surfaces
666
00:23:01,560 –> 00:23:03,560
because you don’t want to slow people down.
667
00:23:03,560 –> 00:23:05,480
You let analysts query raw tables
668
00:23:05,480 –> 00:23:07,160
because it’s “just read-only.”
669
00:23:07,160 –> 00:23:08,520
Then Copilot shows up
670
00:23:08,520 –> 00:23:10,360
and now the casual analyst can generate
671
00:23:10,360 –> 00:23:12,200
warehouse-grade SQL in 10 seconds
672
00:23:12,200 –> 00:23:14,840
without understanding sargability, partition elimination,
673
00:23:14,840 –> 00:23:17,240
join strategy, or the difference between “works”
674
00:23:17,240 –> 00:23:19,960
and “works efficiently under concurrency.”
675
00:23:19,960 –> 00:23:21,880
And before anyone gets comfortable blaming Copilot,
676
00:23:21,880 –> 00:23:23,800
remember, Copilot isn’t the villain.
677
00:23:23,800 –> 00:23:25,320
Copilot is the amplifier.
678
00:23:25,320 –> 00:23:27,560
The actual design omission is letting raw tables
679
00:23:27,560 –> 00:23:29,320
become the consumption interface.
680
00:23:29,320 –> 00:23:31,240
If raw tables are queryable by default,
681
00:23:31,240 –> 00:23:32,920
then your cost model is probabilistic.
682
00:23:32,920 –> 00:23:35,880
You’re gambling that every consumer will behave
683
00:23:35,880 –> 00:23:38,360
like a senior engineer with an execution plan open.
684
00:23:38,360 –> 00:23:38,920
They won’t.
685
00:23:38,920 –> 00:23:40,200
So here’s the anchor line that matters
686
00:23:40,200 –> 00:23:41,720
for the rest of the episode.
687
00:23:41,720 –> 00:23:43,400
If it doesn’t show up in a plan,
688
00:23:43,400 –> 00:23:45,800
a cost report, or a violation count,
689
00:23:45,800 –> 00:23:47,640
it’s not governance, it’s hope.
690
00:23:47,640 –> 00:23:49,240
In this case, the plan is the confession.
691
00:23:49,240 –> 00:23:50,680
And it always confesses.
692
00:23:50,680 –> 00:23:51,960
Case one, fix.
693
00:23:51,960 –> 00:23:53,480
Views as query surface.
694
00:23:53,480 –> 00:23:55,400
Plans as acceptance criteria.
695
00:23:55,400 –> 00:23:58,200
So the fix isn’t “tell people to write better SQL.”
696
00:23:58,200 –> 00:24:00,600
That’s education, and education decays.
697
00:24:00,600 –> 00:24:02,280
The fix is to change the query surface
698
00:24:02,280 –> 00:24:04,840
so the platform can’t be helpfully expensive by default.
699
00:24:04,840 –> 00:24:08,440
In other words, you stop letting raw tables be a public API.
700
00:24:08,440 –> 00:24:11,160
You make the warehouse behave like a system with boundaries,
701
00:24:11,160 –> 00:24:12,840
not a playground with billing.
702
00:24:12,840 –> 00:24:15,560
The first enforcement move is simple and unpopular.
703
00:24:15,560 –> 00:24:17,080
Views and stored procedures
704
00:24:17,080 –> 00:24:19,320
become the only supported consumption interface.
705
00:24:19,320 –> 00:24:21,800
Not preferred, not recommended, the only one.
706
00:24:21,800 –> 00:24:23,640
If analysts, reports, data agents,
707
00:24:23,640 –> 00:24:26,520
and ad hoc explorers can hit raw fact tables directly,
708
00:24:26,520 –> 00:24:29,080
then you’ve already accepted that cost is a shared gamble.
709
00:24:29,080 –> 00:24:30,360
You didn’t implement governance.
710
00:24:30,360 –> 00:24:32,440
You implemented hope with a refresh schedule.
711
00:24:32,440 –> 00:24:33,480
So you lock it down.
712
00:24:33,480 –> 00:24:36,760
You create a serving schema, call it something boring like serving,
713
00:24:36,760 –> 00:24:37,880
or consume.
714
00:24:37,880 –> 00:24:40,440
And that is where every queryable object lives.
715
00:24:40,440 –> 00:24:43,480
Views expose only the columns you intend to support,
716
00:24:43,480 –> 00:24:45,800
and only with filters you intend to pay for.
717
00:24:45,800 –> 00:24:49,320
Stored procedures become the path for parameterized access.
718
00:24:49,320 –> 00:24:51,400
Date windows, entity scopes,
719
00:24:51,400 –> 00:24:53,400
and “give me the last N days,”
720
00:24:53,400 –> 00:24:56,440
patterns that don’t require every consumer to rediscover
721
00:24:56,440 –> 00:24:58,120
sargability the hard way.
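A rough sketch of that serving surface, again using SQLite from Python as a stand-in. SQLite has no schemas or stored procedures in the warehouse sense, so this assumes a `serving_` name prefix in place of a serving schema and a Python function in place of a procedure. Every object name and the 90-day cap are hypothetical.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
# Raw table: wide, messy, not meant for consumers.
conn.execute(
    "CREATE TABLE raw_fact_sales ("
    "order_date TEXT, customer_key TEXT, amount REAL, internal_batch_id TEXT)"
)
# Serving view: named columns only, so new raw columns never leak downstream.
conn.execute(
    "CREATE VIEW serving_sales AS "
    "SELECT order_date, customer_key, amount FROM raw_fact_sales"
)


def sales_last_n_days(n_days: int, today: date, max_days: int = 90):
    """Stand-in for a stored procedure: bounded, parameterized access."""
    if not 1 <= n_days <= max_days:
        raise ValueError(f"n_days must be between 1 and {max_days}")
    start = (today - timedelta(days=n_days)).isoformat()
    # Range predicate on the bare column keeps the filter sargable.
    return conn.execute(
        "SELECT order_date, customer_key, amount FROM serving_sales "
        "WHERE order_date >= ? AND order_date <= ? ORDER BY order_date",
        (start, today.isoformat()),
    ).fetchall()


conn.executemany(
    "INSERT INTO raw_fact_sales VALUES (?, ?, ?, ?)",
    [
        ("2024-01-05", "C1", 100.0, "b-001"),
        ("2024-03-10", "C2", 250.0, "b-002"),
        ("2024-03-14", "C1", 75.0, "b-003"),
    ],
)
print(sales_last_n_days(7, today=date(2024, 3, 15)))
```

The consumer never writes the date predicate, so the consumer never gets it wrong. The window cap is where the cost decision lives.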
722
00:24:58,120 –> 00:25:00,120
That’s also where you enforce naming discipline.
723
00:25:00,120 –> 00:25:03,160
You don’t expose fact_sales_total_23_v2_final_final.
724
00:25:03,160 –> 00:25:06,760
You expose sales.FactSales through a view that hides the underlying mess
725
00:25:06,760 –> 00:25:08,120
until you refactor it.
726
00:25:08,120 –> 00:25:09,720
The consumer gets stability.
727
00:25:09,720 –> 00:25:14,200
You get freedom to change internals without a political incident every time a column gets renamed.
728
00:25:14,200 –> 00:25:16,760
Second, execution plans become acceptance criteria,
729
00:25:16,760 –> 00:25:20,040
not for every query in the estate, don’t be theatrical.
730
00:25:20,040 –> 00:25:22,680
For critical paths: semantic model refresh queries,
731
00:25:22,680 –> 00:25:27,560
top interactive report queries, and anything that hits large fact tables or runs under concurrency,
732
00:25:27,560 –> 00:25:29,880
you treat the plan the same way you treat a security review,
733
00:25:29,880 –> 00:25:31,560
it’s a gate, not a suggestion.
734
00:25:31,560 –> 00:25:34,440
Here are the non-negotiables you enforce in review.
735
00:25:34,440 –> 00:25:35,560
Bounded predicates.
736
00:25:35,560 –> 00:25:37,880
If the query can read the entire table, it will.
737
00:25:37,880 –> 00:25:41,560
So you require time windows, partition elimination, and parameterization.
738
00:25:41,560 –> 00:25:43,720
You ban non-sargable predicates in those paths.
739
00:25:43,720 –> 00:25:45,160
You also ban SELECT *.
740
00:25:45,160 –> 00:25:47,960
If you don’t name the columns, you don’t understand the contract you’re creating,
741
00:25:47,960 –> 00:25:49,800
and you’re also guaranteeing drift.
742
00:25:49,800 –> 00:25:52,600
And you explicitly look for the usual engine taxes.
743
00:25:52,600 –> 00:25:56,920
Scans where you expected seeks, hash joins where you expected a narrow data set,
744
00:25:56,920 –> 00:26:00,200
big sorts, and spills. Spills are treated as a defect.
745
00:26:00,200 –> 00:26:02,360
Not a performance defect, a governance defect,
746
00:26:02,360 –> 00:26:07,080
because a spilled query under concurrency is a cost incident waiting for a calendar invite.
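As a sketch of what the text-level part of that review gate could look like, here is a crude lint that checks a query string for three of those non-negotiables. It is an assumption-heavy illustration: the rule names and regular expressions are made up, and a text lint cannot see scans, sorts, or spills. The plan is still the acceptance criterion. This only catches the obvious cases before a human opens the plan.

```python
import re
from typing import List

# Hypothetical rules for a pre-review gate.
SELECT_STAR = re.compile(r"select\s+(\w+\.)?\*", re.IGNORECASE)
FUNCTION_IN_WHERE = re.compile(
    r"where\b.*?\b(year|month|cast|convert|datepart)\s*\(",
    re.IGNORECASE | re.DOTALL,
)


def review_query(sql: str, time_column: str) -> List[str]:
    """Return the names of the rules this query text violates."""
    violations = []
    if SELECT_STAR.search(sql):
        violations.append("select-star")
    # Everything after the first WHERE, if there is one.
    parts = re.split(r"\bwhere\b", sql, maxsplit=1, flags=re.IGNORECASE)
    if len(parts) < 2 or time_column.lower() not in parts[1].lower():
        violations.append("no-time-window")
    if FUNCTION_IN_WHERE.search(sql):
        violations.append("function-wrapped-predicate")
    return violations


if __name__ == "__main__":
    print(review_query(
        "SELECT * FROM sales.FactSales WHERE YEAR(OrderDate) = 2024",
        time_column="OrderDate",
    ))
```

In a CI/CD gate, a non-empty list would fail the check. The violation count is the artifact you keep.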
747
00:26:07,080 –> 00:26:10,440
Third, you turn capacity metrics into a feedback loop,
748
00:26:10,440 –> 00:26:12,440
not a post-mortem artifact.
749
00:26:12,440 –> 00:26:14,600
You establish a stable cost baseline.
750
00:26:14,600 –> 00:26:17,800
What normal looks like by time of day, by workload, by refresh window,
751
00:26:17,800 –> 00:26:21,400
then you tag and track deviations to query patterns, not to teams.
752
00:26:21,400 –> 00:26:22,520
The goal isn’t blame.
753
00:26:22,520 –> 00:26:24,440
The goal is making cost predictable.
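A minimal sketch of that baseline, assuming you have exported capacity usage as (hour of day, capacity units) samples from whatever metrics source you use. The function names, the three-sigma rule, and the minimum spread are hypothetical choices, not something the platform prescribes.

```python
from collections import defaultdict
from statistics import mean, pstdev


def build_baseline(samples):
    """samples: iterable of (hour_of_day, capacity_units).

    Returns a dict mapping hour -> (mean, population standard deviation).
    """
    by_hour = defaultdict(list)
    for hour, units in samples:
        by_hour[hour].append(units)
    return {h: (mean(v), pstdev(v)) for h, v in by_hour.items()}


def is_deviation(baseline, hour, units, sigmas=3.0, min_spread=1.0):
    """True when usage sits more than `sigmas` spreads above that hour's mean."""
    if hour not in baseline:
        return True  # No history for this hour: surface it rather than hide it.
    avg, spread = baseline[hour]
    return units > avg + sigmas * max(spread, min_spread)


if __name__ == "__main__":
    history = [(9, 100), (9, 110), (9, 90), (14, 40), (14, 50)]
    baseline = build_baseline(history)
    print(is_deviation(baseline, 9, 300))
    print(is_deviation(baseline, 9, 105))
```

Once a deviation is flagged, the next step is the correlation work: which query shapes and which refreshes overlapped in that window.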
754
00:26:24,440 –> 00:26:27,080
When a spike happens, you correlate which query shapes ran,
755
00:26:27,080 –> 00:26:31,160
which objects they hit, which concurrency windows existed, which refreshes overlapped.
756
00:26:31,160 –> 00:26:33,960
You build a library of “this pattern causes that spike.”
757
00:26:33,960 –> 00:26:35,400
Then you enforce that library.
758
00:26:35,400 –> 00:26:37,240
Over time, something weird happens.
759
00:26:37,240 –> 00:26:39,080
The random alerts stop being random.
760
00:26:39,080 –> 00:26:40,920
The platform becomes legible again.
761
00:26:40,920 –> 00:26:44,520
And once the cost surface becomes predictable, everything else gets easier.
762
00:26:44,520 –> 00:26:48,840
Because you can now talk about governance like an engineered property with artifacts.
763
00:26:48,840 –> 00:26:51,160
Not an aspirational poster in a wiki.
764
00:26:51,960 –> 00:26:55,480
That’s the actual outcome you want from this fix, not cheaper queries,
765
00:26:55,480 –> 00:26:56,600
deterministic behavior.
766
00:26:56,600 –> 00:27:01,160
Now we can move to the next failure mode, because once cost is controlled,
767
00:27:01,160 –> 00:27:02,680
the next thing that breaks is truth.
768
00:27:02,680 –> 00:27:04,040
Case 2.
769
00:27:04,040 –> 00:27:07,480
Setup: lakehouse-to-warehouse contract collapse.
770
00:27:07,480 –> 00:27:10,280
Once you stop bleeding money, you notice the next failure mode.
771
00:27:10,280 –> 00:27:11,880
Nobody agrees on what’s true.
772
00:27:11,880 –> 00:27:14,360
This one shows up as “Power BI is wrong,”
773
00:27:14,360 –> 00:27:17,080
which is always a useful sentence because it’s never specific.
774
00:27:17,080 –> 00:27:21,000
What they mean is two reports disagree, two teams have two official KPIs
775
00:27:21,000 –> 00:27:25,160
and the executive dashboard mysteriously changes when someone improves a model.
776
00:27:25,160 –> 00:27:26,680
The platform keeps refreshing.
777
00:27:26,680 –> 00:27:28,360
Nothing is red. The argument still happens.
778
00:27:28,360 –> 00:27:31,400
And it happens because the contract never existed.
779
00:27:31,400 –> 00:27:34,040
In Fabric estates, this usually starts the same way.
780
00:27:34,040 –> 00:27:36,680
A team lands data into a lakehouse because it’s fast,
781
00:27:36,680 –> 00:27:38,040
because notebooks are right there,
782
00:27:38,040 –> 00:27:40,120
because Delta tables feel like progress,
783
00:27:40,120 –> 00:27:42,520
and because Direct Lake makes Power BI light up
784
00:27:42,520 –> 00:27:45,400
without the traditional import and warehouse ceremony.
785
00:27:45,400 –> 00:27:47,320
So the early win is real. Data shows up,
786
00:27:47,320 –> 00:27:49,160
visuals render, people feel unblocked.
787
00:27:49,160 –> 00:27:51,400
But the lakehouse mental model is schema-on-read.
788
00:27:51,400 –> 00:27:52,600
It’s permissive by design.
789
00:27:52,600 –> 00:27:55,080
It will accept drift unless you force it not to.
790
00:27:55,080 –> 00:27:57,640
Files arrive with slightly different column types.
791
00:27:57,640 –> 00:28:01,000
New columns appear because the source system team just added a flag.
792
00:28:01,000 –> 00:28:03,160
Strings show up where integers used to be.
793
00:28:03,160 –> 00:28:05,400
Timestamps move formats, and everyone shrugs
794
00:28:05,400 –> 00:28:08,280
because the notebook still runs after one more cast.
795
00:28:08,280 –> 00:28:10,520
Then the organization does the next predictable thing.
796
00:28:10,520 –> 00:28:13,000
It mirrors that lakehouse shape into a warehouse
797
00:28:13,000 –> 00:28:14,680
or builds a warehouse on top of it,
798
00:28:14,680 –> 00:28:17,000
expecting the warehouse to provide structure.
799
00:28:17,000 –> 00:28:20,040
But the warehouse can’t enforce structure you never defined.
800
00:28:20,040 –> 00:28:23,160
If the ingestion pipeline treats whatever showed up as acceptable,
801
00:28:23,160 –> 00:28:25,480
the warehouse layer becomes a reflection of ambiguity
802
00:28:25,480 –> 00:28:26,920
not a rejection boundary.
803
00:28:26,920 –> 00:28:28,920
The schema looks cleaner because it’s SQL,
804
00:28:28,920 –> 00:28:31,080
therefore leadership assumes it’s controlled.
805
00:28:31,080 –> 00:28:33,160
Meanwhile, the real behavior is that the warehouse
806
00:28:33,160 –> 00:28:35,560
is normalizing drift as if it were a feature.
807
00:28:35,560 –> 00:28:38,840
And once that happens, the semantic layer becomes a patch bay.
808
00:28:38,840 –> 00:28:41,000
Analytics engineers start solving data problems
809
00:28:41,000 –> 00:28:42,440
with DAX and Power Query,
810
00:28:42,440 –> 00:28:43,960
because it’s the only place they can move
811
00:28:43,960 –> 00:28:45,560
without waiting on an upstream fix.
812
00:28:45,560 –> 00:28:47,000
Measures get more complex.
813
00:28:47,000 –> 00:28:49,560
Calculated columns appear to reconcile type mismatches.
814
00:28:49,560 –> 00:28:50,840
Relationships get weird.
815
00:28:50,840 –> 00:28:52,760
Fix it in the model becomes the operating model
816
00:28:52,760 –> 00:28:54,520
because it’s fast and politically safe.
817
00:28:54,520 –> 00:28:56,280
But it’s also where truth fragments
818
00:28:56,280 –> 00:28:58,680
you end up with multiple correct KPIs.
819
00:28:58,680 –> 00:29:00,680
Finance has one definition in a report.
820
00:29:00,680 –> 00:29:03,080
Sales has another in a different semantic model.
821
00:29:03,080 –> 00:29:05,640
And the same metric name starts meaning different filters
822
00:29:05,640 –> 00:29:06,520
and different grain.
823
00:29:06,520 –> 00:29:08,440
Everyone’s dashboard refreshes on time.
824
00:29:08,440 –> 00:29:10,120
Everyone’s wrong in a different way.
825
00:29:10,120 –> 00:29:11,640
Co-pilot accelerates this collapse
826
00:29:11,640 –> 00:29:13,480
because it rewards plausibility.
827
00:29:13,480 –> 00:29:15,240
It will happily generate transformations
828
00:29:15,240 –> 00:29:17,720
and KPI logic that compile and render,
829
00:29:17,720 –> 00:29:19,720
even if they encode a hidden assumption
830
00:29:19,720 –> 00:29:23,400
about grain, keys, deduplication, or late-arriving facts.
831
00:29:23,400 –> 00:29:25,480
It doesn’t ask you what invariant you’re enforcing.
832
00:29:25,480 –> 00:29:27,160
It asks you what output you want.
833
00:29:27,160 –> 00:29:29,400
So the deeper principle for this case is simple.
834
00:29:29,400 –> 00:29:30,920
The earlier you enforce shape,
835
00:29:30,920 –> 00:29:33,880
the fewer semantic patch jobs you fund downstream.
836
00:29:33,880 –> 00:29:36,600
Task flows and medallion visuals don’t create contracts.
837
00:29:36,600 –> 00:29:37,880
They create diagrams.
838
00:29:37,880 –> 00:29:39,560
If your bronze to silver to gold layers
839
00:29:39,560 –> 00:29:42,280
aren’t backed by schema enforcement and validation gates,
840
00:29:42,280 –> 00:29:43,560
you don’t have architecture.
841
00:29:43,560 –> 00:29:45,480
You have a faster way to ship ambiguity.
842
00:29:45,480 –> 00:29:49,320
Case 2, fix: enforced schemas, validation gates, quarantine tables.
843
00:29:49,320 –> 00:29:51,480
The fix is not “get better at modeling.”
844
00:29:51,480 –> 00:29:53,240
The fix is to put an enforcement boundary
845
00:29:53,240 –> 00:29:54,760
where ambiguity enters the estate
846
00:29:54,760 –> 00:29:56,360
and then refuse to negotiate with it.
847
00:29:56,360 –> 00:29:59,320
In Fabric, that boundary is the lakehouse-to-warehouse seam.
848
00:29:59,320 –> 00:30:01,560
Treat the lakehouse as an intake zone.
849
00:30:01,560 –> 00:30:05,080
Fast landing, cheap iteration, messy reality.
850
00:30:05,080 –> 00:30:07,560
Treat the warehouse as the contract zone.
851
00:30:07,560 –> 00:30:11,080
Shaped, typed, keyed, and intentionally consumable.
852
00:30:11,880 –> 00:30:14,040
The warehouse isn’t another place to query.
853
00:30:14,040 –> 00:30:15,880
It’s the point where the organization
854
00:30:15,880 –> 00:30:17,880
finally commits to what the data is.
855
00:30:17,880 –> 00:30:19,880
So you start with explicit warehouse schemas,
856
00:30:19,880 –> 00:30:21,240
not dbo and vibes.
857
00:30:21,240 –> 00:30:24,680
Real schemas that encode domain ownership and consumption intent:
858
00:30:24,680 –> 00:30:28,520
sales, finance, HR, serving, whatever matches your estate model.
859
00:30:28,520 –> 00:30:30,840
This matters because schema is how you stop location
860
00:30:30,840 –> 00:30:32,680
from masquerading as ownership.
861
00:30:32,680 –> 00:30:35,640
If the table lives in finance, finance owns the contract.
862
00:30:35,640 –> 00:30:37,480
If it lives in serving, the platform team
863
00:30:37,480 –> 00:30:38,920
owns the consumption interface.
864
00:30:38,920 –> 00:30:40,760
The name becomes a control surface.
865
00:30:40,760 –> 00:30:42,360
Then you define shape deliberately.
866
00:30:42,360 –> 00:30:45,320
Columns, types, nullability, keys, and invariants.
867
00:30:45,320 –> 00:30:47,320
And yes, invariants is a real word here.
868
00:30:47,320 –> 00:30:49,720
It means statements the business believes are always true
869
00:30:49,720 –> 00:30:51,560
and therefore the system must enforce.
870
00:30:51,560 –> 00:30:53,000
An order has an order date.
871
00:30:53,000 –> 00:30:54,760
A customer key cannot be empty.
872
00:30:54,760 –> 00:30:56,600
A currency code is from a known set.
873
00:30:56,600 –> 00:30:58,360
A fact table grain is stable.
874
00:30:58,360 –> 00:31:01,160
And late-arriving updates have a declared strategy.
875
00:31:01,160 –> 00:31:02,920
Without those, you don’t have a contract.
876
00:31:02,920 –> 00:31:04,920
You have a file that happened to pass today.
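One way to picture a contract like that is as data the system can check, not a paragraph in a wiki. The sketch below is only an illustration: the `Column` class, the `ORDERS_CONTRACT` table definition, and the currency set are hypothetical names invented for this example, not anything Fabric provides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: str
    type_: type
    nullable: bool = False
    allowed: Optional[frozenset] = None  # closed set of legal values, if any


# Hypothetical contract for an orders table; every name here is illustrative.
ORDERS_CONTRACT = (
    Column("order_id", int),
    Column("order_date", date),
    Column("customer_key", str),
    Column("currency_code", str, allowed=frozenset({"USD", "EUR", "GBP"})),
    Column("discount_code", str, nullable=True),
)


def violations(row: dict, contract=ORDERS_CONTRACT) -> list:
    """Return the invariants this row breaks; an empty list means it conforms."""
    problems = []
    for col in contract:
        value = row.get(col.name)
        if value is None or value == "":
            if not col.nullable:
                problems.append(f"{col.name}: required value is missing")
            continue
        if not isinstance(value, col.type_):
            # No casting: a string that merely looks like an int is a violation.
            problems.append(
                f"{col.name}: expected {col.type_.__name__}, got {type(value).__name__}"
            )
        elif col.allowed is not None and value not in col.allowed:
            problems.append(f"{col.name}: {value!r} is not in the known set")
    return problems
```

The design choice that matters is that the check reports instead of repairing. Grain stability and the late-arriving strategy need batch-level checks that this row-level sketch leaves out.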
877
00:31:04,920 –> 00:31:06,440
Now here’s the uncomfortable move.
878
00:31:06,440 –> 00:31:09,000
You stop silently coercing data.
879
00:31:09,000 –> 00:31:12,280
Most lakehouse pipelines fix drift by casting strings
880
00:31:12,280 –> 00:31:14,520
to ints, trimming columns, defaulting missing values,
881
00:31:14,520 –> 00:31:15,320
and moving on.
882
00:31:15,320 –> 00:31:16,600
That keeps the pipeline green.
883
00:31:16,600 –> 00:31:19,640
It also turns unknown behavior into accepted behavior.
884
00:31:19,640 –> 00:31:21,720
So instead, you add validation gates at ingestion
885
00:31:21,720 –> 00:31:23,080
and transform boundaries.
886
00:31:23,080 –> 00:31:26,520
And you make failure an operational signal, not an embarrassment.
887
00:31:26,520 –> 00:31:28,520
Validation gates are simple in concept.
888
00:31:28,520 –> 00:31:31,960
Before data crosses into the contract zone, it gets checked.
889
00:31:31,960 –> 00:31:35,480
Schema matches, types match, required fields exist.
890
00:31:35,480 –> 00:31:37,600
Keys behave the way you claim they behave.
891
00:31:37,600 –> 00:31:40,240
Duplicates are handled according to a rule you can explain.
892
00:31:40,240 –> 00:31:41,520
If the checks pass, you load.
893
00:31:41,520 –> 00:31:45,040
If they fail, you do not “best effort” the data into production
894
00:31:45,040 –> 00:31:47,360
and hope the semantic layer can reconcile it.
895
00:31:47,360 –> 00:31:48,400
You quarantine it.
896
00:31:48,400 –> 00:31:50,640
Quarantine tables are the part most teams avoid
897
00:31:50,640 –> 00:31:52,920
because it feels like admitting imperfection.
898
00:31:52,920 –> 00:31:55,120
But quarantine is how you keep the estate honest.
899
00:31:55,120 –> 00:31:57,440
When violations occur, you land the bad rows
900
00:31:57,440 –> 00:31:59,600
in a quarantine schema with metadata.
901
00:31:59,600 –> 00:32:03,600
Source system, load time, violation type, offending values.
902
00:32:03,600 –> 00:32:06,880
And you publish the count as a metric, not hidden in a notebook cell.
903
00:32:06,880 –> 00:32:08,640
A real metric that leadership can see
904
00:32:08,640 –> 00:32:11,280
if they insist on shipping data without contracts.
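As a rough sketch of that gate, assuming a rule function that returns (violation type, offending value) pairs: rows either pass and load, or land in quarantine with the metadata just described. The names `run_gate`, `check_order`, and `quarantine_counts` are made up for this illustration, and the two rules are placeholders for whatever your contract actually says.

```python
from collections import Counter
from datetime import datetime, timezone
from typing import Callable, Iterable


def run_gate(rows: Iterable[dict], check: Callable[[dict], list], source_system: str):
    """Split a batch into rows to load and quarantine records with metadata."""
    load_time = datetime.now(timezone.utc).isoformat()
    accepted, quarantined = [], []
    for row in rows:
        problems = check(row)
        if not problems:
            accepted.append(row)
            continue
        # One quarantine record per violation, so counts are per rule broken.
        for violation_type, offending_value in problems:
            quarantined.append({
                "source_system": source_system,
                "load_time": load_time,
                "violation_type": violation_type,
                "offending_value": repr(offending_value),
                "raw_row": row,
            })
    return accepted, quarantined


def check_order(row: dict) -> list:
    """Hypothetical rule set: returns (violation_type, offending_value) pairs."""
    problems = []
    if not row.get("customer_key"):
        problems.append(("missing_customer_key", row.get("customer_key")))
    if not isinstance(row.get("amount"), (int, float)):
        problems.append(("amount_not_numeric", row.get("amount")))
    return problems


def quarantine_counts(quarantined: list) -> dict:
    """Count violations by type: the number you publish as a metric."""
    return dict(Counter(record["violation_type"] for record in quarantined))
```

A row that breaks two rules produces two quarantine records here, which keeps the published count honest about how many invariants failed rather than how many rows did. Either convention works as long as it is declared.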
905
00:32:11,280 –> 00:32:12,960
This changes behavior fast.
906
00:32:12,960 –> 00:32:16,480
Because now schema drift is no longer a silent downstream argument.
907
00:32:16,480 –> 00:32:18,320
It becomes a visible upstream event.
908
00:32:18,320 –> 00:32:20,320
And once it’s visible, you can assign responsibility,
909
00:32:20,320 –> 00:32:23,040
either the source system changed, the ingestion logic failed,
910
00:32:23,040 –> 00:32:25,360
or the contract needs an intentional version bump.
911
00:32:25,360 –> 00:32:26,800
Those are the only three truths.
912
00:32:26,800 –> 00:32:28,000
Everything else is denial.
913
00:32:28,000 –> 00:32:29,600
And you’ll notice a second order effect
914
00:32:29,600 –> 00:32:31,600
that everyone likes once it happens.
915
00:32:31,600 –> 00:32:33,360
Downstream logic collapses.
916
00:32:33,360 –> 00:32:34,800
In a good way.
917
00:32:34,800 –> 00:32:36,480
When the warehouse enforces shape,
918
00:32:36,480 –> 00:32:38,880
the semantic model stops being a patch bay.
919
00:32:38,880 –> 00:32:41,120
DAX measures get simpler because they don’t have
920
00:32:41,120 –> 00:32:42,960
to guard against type chaos.
921
00:32:42,960 –> 00:32:46,160
Relationships stabilize because keys behave consistently.
922
00:32:46,160 –> 00:32:48,240
“Two correct KPIs” becomes harder to sustain
923
00:32:48,240 –> 00:32:50,960
because the raw truth is no longer malleable per workspace.
924
00:32:50,960 –> 00:32:52,480
You can still build multiple measures,
925
00:32:52,480 –> 00:32:54,160
but you’re doing it on top of a stable,
926
00:32:54,160 –> 00:32:56,080
typed contract, not on top of a shifting,
927
00:32:56,080 –> 00:32:57,440
lake of assumptions.
928
00:32:57,440 –> 00:33:00,000
This is also where Copilot becomes useful again.
929
00:33:00,000 –> 00:33:02,080
With enforced schemas and clear contracts,
930
00:33:02,080 –> 00:33:04,320
Copilot can generate transformations and measures
931
00:33:04,320 –> 00:33:05,680
that are constrained by reality.
932
00:33:05,680 –> 00:33:07,040
You’ve reduced the search space.
933
00:33:07,040 –> 00:33:09,200
You’ve moved from plausible to bounded.
934
00:33:09,200 –> 00:33:10,240
But that’s the whole trick.
935
00:33:10,240 –> 00:33:12,720
Don’t ask AI to be your governance model.
936
00:33:12,720 –> 00:33:15,680
Make governance the environment AI operates inside.
937
00:33:15,680 –> 00:33:18,400
And when someone asks why you’re slowing things down with gates,
938
00:33:18,400 –> 00:33:19,760
the answer is simple.
939
00:33:19,760 –> 00:33:21,120
You’re not slowing delivery.
940
00:33:21,120 –> 00:33:23,120
You’re stopping unreviewed ambiguity
941
00:33:23,120 –> 00:33:25,120
from becoming production truth.
942
00:33:25,120 –> 00:33:26,800
Case 3, setup plus fix:
943
00:33:26,800 –> 00:33:29,600
Workspace-only security and the ownership vacuum.
944
00:33:29,600 –> 00:33:32,160
Once cost is predictable and truth is enforced,
945
00:33:32,160 –> 00:33:34,480
the next failure mode is the one nobody wants to talk about
946
00:33:34,480 –> 00:33:36,000
because it isn’t a performance graph.
947
00:33:36,000 –> 00:33:37,680
It’s an audit question.
948
00:33:37,680 –> 00:33:39,680
The symptom pattern is always the same.
949
00:33:39,680 –> 00:33:41,680
You find service principals with broad access
950
00:33:41,680 –> 00:33:43,600
because the pipeline needed it.
951
00:33:43,600 –> 00:33:45,440
You find analysts who can see raw tables
952
00:33:45,440 –> 00:33:47,360
because they were helping validate.
953
00:33:47,360 –> 00:33:49,920
You find multiple workspaces with the same data set
954
00:33:49,920 –> 00:33:52,960
copied three different ways because someone needed autonomy.
955
00:33:52,960 –> 00:33:56,000
And when the security team asks who can see this table,
956
00:33:56,000 –> 00:33:58,320
the answer is a long pause followed by,
957
00:33:58,320 –> 00:33:59,600
we think only AI cuts.
958
00:33:59,600 –> 00:34:01,040
That’s not a security posture.
959
00:34:01,040 –> 00:34:01,920
That’s a rumor.
960
00:34:01,920 –> 00:34:03,200
This is where Fabric’s convenience
961
00:34:03,200 –> 00:34:04,560
becomes architectural erosion
962
00:34:04,560 –> 00:34:06,560
because workspaces feel like security boundaries
963
00:34:06,560 –> 00:34:08,000
but they aren’t data boundaries.
964
00:34:08,000 –> 00:34:09,360
They are collaboration containers.
965
00:34:09,360 –> 00:34:11,600
A workspace role answers who can contribute here,
966
00:34:11,600 –> 00:34:13,600
not who can access this column,
967
00:34:13,600 –> 00:34:15,280
not who can join these two tables
968
00:34:15,280 –> 00:34:16,560
and infer something sensitive,
969
00:34:16,560 –> 00:34:18,880
not who can run an expensive query surface
970
00:34:18,880 –> 00:34:20,400
that becomes a side channel.
971
00:34:20,400 –> 00:34:23,360
And if you treat workspace roles as your whole strategy,
972
00:34:23,360 –> 00:34:24,800
you create an ownership vacuum
973
00:34:24,800 –> 00:34:26,480
because now nobody owns the access model
974
00:34:26,480 –> 00:34:27,840
at the data engine layer.
975
00:34:27,840 –> 00:34:29,360
Nobody owns the table level intent.
976
00:34:29,360 –> 00:34:31,280
Nobody owns the deny by default posture.
977
00:34:31,280 –> 00:34:33,280
Everyone assumes the workspace boundary is enough,
978
00:34:33,280 –> 00:34:35,360
therefore nobody builds real boundaries inside it.
979
00:34:35,360 –> 00:34:36,480
The platform works.
980
00:34:36,480 –> 00:34:37,360
The system doesn’t.
981
00:34:37,360 –> 00:34:41,440
Copilot makes this worse in a very specific way.
982
00:34:41,440 –> 00:34:44,320
It changes how people discover and interact with data.
983
00:34:44,320 –> 00:34:46,640
In the old world, consumers had to know where to look.
984
00:34:46,640 –> 00:34:48,480
A report, a data set,
985
00:34:48,480 –> 00:34:50,480
maybe a documented SQL endpoint.
986
00:34:50,480 –> 00:34:53,440
In Fabric, Copilot and agents make exploration conversational
987
00:34:53,440 –> 00:34:56,800
and cross-surface. They surface tables, they suggest joins.
988
00:34:56,800 –> 00:34:58,640
They help people just query it.
989
00:34:58,640 –> 00:35:01,520
And if you gave broad read access because you wanted adoption,
990
00:35:01,520 –> 00:35:03,360
Copilot becomes the fastest path
991
00:35:03,360 –> 00:35:06,720
to exploring raw assets you never intended as consumption paths.
992
00:35:06,720 –> 00:35:08,480
It doesn’t bypass permissions.
993
00:35:08,480 –> 00:35:10,400
It bypasses your assumptions.
994
00:35:10,400 –> 00:35:13,200
So the root cause isn’t “Copilot exposed data.”
995
00:35:13,200 –> 00:35:16,560
The root cause is that you never encoded your consumption surfaces
996
00:35:16,560 –> 00:35:18,960
and you never encoded your intent at the engine layer.
997
00:35:18,960 –> 00:35:22,160
You left a vacuum and the platform filled it with default behavior,
998
00:35:22,160 –> 00:35:24,880
broad access, raw tables as APIs,
999
00:35:24,880 –> 00:35:27,120
and service principals that look like owners.
1000
00:35:27,120 –> 00:35:28,480
Now here’s the uncomfortable truth.
1001
00:35:28,480 –> 00:35:30,480
Most Fabric estates don’t get hacked.
1002
00:35:30,480 –> 00:35:31,760
They get drifted.
1003
00:35:31,760 –> 00:35:34,880
Access expands because every exception feels justified at the time.
1004
00:35:34,880 –> 00:35:37,520
Give the SP owner for now, we’ll tighten later.
1005
00:35:37,520 –> 00:35:40,000
Add them as Member, they need to publish.
1006
00:35:40,000 –> 00:35:41,280
Just let them read the lakehouse.
1007
00:35:41,280 –> 00:35:42,640
It’s not that sensitive.
1008
00:35:42,640 –> 00:35:44,240
Then later becomes never,
1009
00:35:44,240 –> 00:35:45,920
and the estate becomes unauditable.
1010
00:35:45,920 –> 00:35:47,280
Not because anyone was malicious,
1011
00:35:47,280 –> 00:35:49,840
but because no one was responsible for the intent model.
1012
00:35:49,840 –> 00:35:52,000
So the fix is not more training.
1013
00:35:52,000 –> 00:35:52,960
It’s not more labels.
1014
00:35:52,960 –> 00:35:54,560
It’s not put it in a wiki.
1015
00:35:54,560 –> 00:35:56,000
You need enforceable boundaries
1016
00:35:56,000 –> 00:35:57,520
that survive convenience.
1017
00:35:57,520 –> 00:36:00,080
First, define schema-based security boundaries
1018
00:36:00,080 –> 00:36:01,520
in the warehouse as the contract zone.
1019
00:36:01,520 –> 00:36:03,920
If you use the previous section correctly,
1020
00:36:03,920 –> 00:36:06,720
you already have real schemas that encode ownership.
1021
00:36:06,720 –> 00:36:09,600
Domain schemas, serving schemas, quarantine schemas.
1022
00:36:09,600 –> 00:36:11,360
Now you attach security to that structure.
1023
00:36:11,360 –> 00:36:14,000
You create database roles that map to business intent,
1024
00:36:14,000 –> 00:36:17,440
consumer, steward, engineer, automation, whatever matches your org,
1025
00:36:17,440 –> 00:36:19,360
and you start from deny by default.
1026
00:36:19,360 –> 00:36:21,840
No implicit access because someone is in the workspace.
1027
00:36:21,840 –> 00:36:24,320
Second, you restrict raw table access.
1028
00:36:24,320 –> 00:36:26,880
Tables become internal implementation details.
1029
00:36:26,880 –> 00:36:29,840
Views and procedures become the controlled access paths.
1030
00:36:29,840 –> 00:36:31,840
This is where performance and security align.
1031
00:36:31,840 –> 00:36:33,920
You expose what you intend people to query,
1032
00:36:33,920 –> 00:36:36,800
and you can enforce both column level and row level constraints
1033
00:36:36,800 –> 00:36:37,600
where appropriate.
1034
00:36:37,600 –> 00:36:40,160
More importantly, you can enforce that agents
1035
00:36:40,160 –> 00:36:43,120
and ad hoc queries hit the same surfaces your reports hit.
1036
00:36:43,120 –> 00:36:46,080
One contract, one path, one place to secure and tune.
1037
00:36:46,080 –> 00:36:49,280
Third, you treat service principals as identities with blast radius,
1038
00:36:49,280 –> 00:36:50,880
not as pipeline glue.
1039
00:36:50,880 –> 00:36:52,880
Every SP has an explicit role.
1040
00:36:52,880 –> 00:36:55,440
Scoped to the minimum schema and objects it needs.
1041
00:36:55,440 –> 00:36:59,040
You stop granting workspace roles as a substitute for data permissions.
1042
00:36:59,040 –> 00:37:02,720
You also stop mixing can deploy artifacts with can read data.
1043
00:37:02,720 –> 00:37:03,920
Those are different rights.
1044
00:37:03,920 –> 00:37:05,520
When you collapse them into member,
1045
00:37:05,520 –> 00:37:08,320
you create an estate where publishing equals reading
1046
00:37:08,320 –> 00:37:11,200
and reading equals exploring and exploring equals inference.
1047
00:37:11,200 –> 00:37:13,920
Finally, you make auditability a first class artifact,
1048
00:37:13,920 –> 00:37:14,960
not a yearly scramble.
1049
00:37:14,960 –> 00:37:17,600
You should be able to answer three questions without a meeting.
1050
00:37:17,600 –> 00:37:20,560
Who can see it, who can change it, and through what surface.
1051
00:37:20,560 –> 00:37:23,520
If you can’t answer those questions from roles, grants,
1052
00:37:23,520 –> 00:37:25,920
and controlled interfaces, you don’t have governance.
1053
00:37:25,920 –> 00:37:26,960
You have a story.
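To make those three questions concrete, here is a small sketch of an intent model as plain data, assuming explicit grants and deny by default. Everything in it is hypothetical: the role names echo the ones mentioned above, while the schemas, objects, and principals are invented for the example. A real warehouse would answer the same questions from its own role and permission metadata.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    role: str       # e.g. consumer, steward, engineer, automation
    schema: str
    obj: str        # the surface: a view, a procedure, or a table
    privilege: str  # e.g. "SELECT", "EXECUTE", "ALTER"


# Hypothetical grants and role memberships, for illustration only.
GRANTS = [
    Grant("consumer", "serving", "v_sales_summary", "SELECT"),
    Grant("steward", "finance", "v_ledger", "SELECT"),
    Grant("engineer", "finance", "ledger", "ALTER"),
    Grant("automation", "finance", "usp_load_ledger", "EXECUTE"),
]
MEMBERS = {
    "consumer": {"ana", "raj"},
    "steward": {"li"},
    "engineer": {"sam"},
    "automation": {"sp-ingest"},
}


def is_allowed(principal: str, schema: str, obj: str, privilege: str) -> bool:
    """Deny by default: access exists only if an explicit grant says so."""
    return any(
        g.schema == schema and g.obj == obj and g.privilege == privilege
        and principal in MEMBERS.get(g.role, set())
        for g in GRANTS
    )


def audit(schema: str, privileges: set) -> dict:
    """Map each surface in a schema to the principals holding those privileges."""
    result = {}
    for g in GRANTS:
        if g.schema == schema and g.privilege in privileges:
            result.setdefault(g.obj, set()).update(MEMBERS.get(g.role, set()))
    return result
```

Calling `audit("finance", {"SELECT"})` answers who can see it and through what surface, and `audit("finance", {"ALTER"})` answers who can change it. Workspace membership appears nowhere in the model, which is the point.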
1054
00:37:26,960 –> 00:37:28,800
And this is the big outcome of this fix.
1055
00:37:28,800 –> 00:37:32,320
You can become audit-defensible without re-architecting the whole estate.
1056
00:37:32,320 –> 00:37:33,760
You don’t need a new platform.
1057
00:37:33,760 –> 00:37:35,200
You need a real intent model,
1058
00:37:35,200 –> 00:37:37,120
enforced inside the data engine,
1059
00:37:37,120 –> 00:37:40,000
with workspaces treated as collaboration, not containment.
1060
00:37:40,000 –> 00:37:42,800
Because in fabric, the workspace is where people click,
1061
00:37:42,800 –> 00:37:44,320
but the warehouse is where you enforce.
1062
00:37:44,320 –> 00:37:47,600
New failure modes in the Fabric plus Copilot era.
1063
00:37:47,600 –> 00:37:50,240
Now, if those three cases felt familiar, good.
1064
00:37:50,240 –> 00:37:51,200
They’re not edge cases.
1065
00:37:51,200 –> 00:37:53,200
They’re the new baseline failure modes
1066
00:37:53,200 –> 00:37:56,800
in a Fabric estate where speed exists before enforcement.
1067
00:37:56,800 –> 00:37:58,800
And what’s different in this era is that failure
1068
00:37:58,800 –> 00:38:00,320
doesn’t require a visible break.
1069
00:38:00,320 –> 00:38:01,760
The platform keeps running.
1070
00:38:01,760 –> 00:38:03,360
Refreshes keep finishing.
1071
00:38:03,360 –> 00:38:04,320
Users keep clicking.
1072
00:38:04,320 –> 00:38:05,600
Copilot keeps responding.
1073
00:38:05,600 –> 00:38:08,640
The estate degrades anyway because the failure modes are systemic,
1074
00:38:08,640 –> 00:38:11,360
boundary blur, cost drift, and the ownership vacuum.
1075
00:38:11,360 –> 00:38:13,520
And they don’t show up as one big explosion.
1076
00:38:13,520 –> 00:38:16,720
They show up as a slow replacement of intent with convenience.
1077
00:38:16,720 –> 00:38:19,120
Failure mode 1 is boundary blur across layers.
1078
00:38:19,120 –> 00:38:20,880
In Fabric, the lakehouse feeds the warehouse,
1079
00:38:20,880 –> 00:38:22,480
the warehouse feeds the semantic model,
1080
00:38:22,480 –> 00:38:25,440
the semantic model feeds reports and reports feed apps.
1081
00:38:25,440 –> 00:38:28,160
But the boundaries between those layers are mostly social
1082
00:38:28,160 –> 00:38:30,000
unless you make them physical.
1083
00:38:30,000 –> 00:38:33,440
So what happens in real estates is that each layer starts fixing
1084
00:38:33,440 –> 00:38:35,440
what it doesn’t like about the layer before it.
1085
00:38:35,440 –> 00:38:37,680
The lakehouse lands messy data.
1086
00:38:37,680 –> 00:38:40,000
The warehouse normalizes it without rejecting it.
1087
00:38:40,000 –> 00:38:43,520
The semantic model compensates with calculated columns and relationships.
1088
00:38:43,520 –> 00:38:45,920
And the report compensates with DAX and filters.
1089
00:38:45,920 –> 00:38:47,680
Every local fix feels rational,
1090
00:38:47,680 –> 00:38:49,920
but every local fix also creates drift
1091
00:38:49,920 –> 00:38:52,480
because the logic now lives in multiple places
1092
00:38:52,480 –> 00:38:56,000
and nobody can describe the system as one coherent contract.
1093
00:38:56,000 –> 00:38:59,360
And because Fabric makes it easy to create a new semantic model
1094
00:38:59,360 –> 00:39:01,040
or a new report in minutes,
1095
00:39:01,040 –> 00:39:03,120
teams fork truth instead of repairing it.
1096
00:39:03,120 –> 00:39:04,720
A temporary model becomes the model.
1097
00:39:04,720 –> 00:39:06,640
A quick report becomes operational.
1098
00:39:06,640 –> 00:39:09,520
Apps become distribution channels for inconsistencies
1099
00:39:09,520 –> 00:39:11,600
and then you get the weirdest kind of outage.
1100
00:39:11,600 –> 00:39:14,240
The business still runs, but nobody trusts the numbers.
1101
00:39:14,240 –> 00:39:15,200
That’s boundary blur.
1102
00:39:15,920 –> 00:39:20,080
Failure mode 2 is cost drift driven by AI-shaped interaction patterns.
1103
00:39:20,080 –> 00:39:22,080
This isn’t just bad SQL.
1104
00:39:22,080 –> 00:39:25,120
It’s the combination of Copilot making query authoring cheap
1105
00:39:25,120 –> 00:39:27,840
and Fabric making query execution shared.
1106
00:39:27,840 –> 00:39:31,280
When a platform charges you for runtime under concurrency,
1107
00:39:31,280 –> 00:39:34,320
every unbounded scan becomes an incident generator.
1108
00:39:34,320 –> 00:39:37,840
But the trick is that cost drift doesn’t necessarily correlate to deployments.
1109
00:39:37,840 –> 00:39:40,160
So traditional change management doesn’t catch it.
1110
00:39:40,160 –> 00:39:41,680
You can have perfect CI/CD
1111
00:39:41,680 –> 00:39:43,760
and still get wrecked at 9:05 AM
1112
00:39:43,760 –> 00:39:46,080
because a well-meaning analyst asked Copilot a question
1113
00:39:46,080 –> 00:39:47,840
that produced a non-sargable predicate,
1114
00:39:47,840 –> 00:39:50,800
hit a large table and ran alongside refresh concurrency.
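If "non-sargable" is unfamiliar, here is a tiny illustration using SQLite from the Python standard library as a stand-in. It is not the Fabric warehouse engine, and the `orders` table and index are invented for the example. The only point is the shape of the predicate: wrap the column in a function and the engine reads everything, compare the bare column to a range and it can seek.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, order_date TEXT, amount REAL)"
)
conn.execute("CREATE INDEX idx_orders_date ON orders (order_date)")


def plan(sql: str) -> str:
    """Return SQLite's query plan text for a statement (fourth column of each row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the column: the index on order_date cannot be used to seek.
non_sargable = "SELECT amount FROM orders WHERE substr(order_date, 1, 4) = '2024'"

# Same intent expressed as a range on the bare column: the index can be used.
sargable = (
    "SELECT amount FROM orders "
    "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'"
)

non_sargable_plan = plan(non_sargable)
sargable_plan = plan(sargable)
print(non_sargable_plan)
print(sargable_plan)
```

The exact plan wording varies by SQLite version, but the first query reports a scan and the second a search on the index. Both return the same rows, which is why a plausible generated query can look correct and still be the expensive one.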
1115
00:39:50,800 –> 00:39:52,560
So the incident taxonomy shifts.
1116
00:39:52,560 –> 00:39:54,800
Your first signal becomes capacity pressure,
1117
00:39:54,800 –> 00:39:56,320
not pipeline failure.
1118
00:39:56,320 –> 00:39:58,400
Your debugging artifact becomes the plan,
1119
00:39:58,400 –> 00:39:59,280
not the log.
1120
00:39:59,280 –> 00:40:01,840
Your prevention mechanism becomes query surfaces
1121
00:40:01,840 –> 00:40:04,480
and acceptance criteria, not best practices.
1122
00:40:04,480 –> 00:40:05,760
And if you don’t accept that shift,
1123
00:40:05,760 –> 00:40:07,600
you keep treating spend as a billing problem
1124
00:40:07,600 –> 00:40:09,520
when it’s actually an architecture problem.
1125
00:40:09,520 –> 00:40:11,680
Failure mode 3 is the ownership vacuum
1126
00:40:11,680 –> 00:40:13,440
that looks like everything is fine.
1127
00:40:13,440 –> 00:40:14,640
This is the most corrosive one
1128
00:40:14,640 –> 00:40:17,200
because it hides behind success metrics, pipelines run,
1129
00:40:17,200 –> 00:40:19,200
dashboards refresh, people deliver,
1130
00:40:19,200 –> 00:40:21,040
but nobody owns semantics end to end.
1131
00:40:21,040 –> 00:40:22,640
Nobody owns the contract boundary,
1132
00:40:22,640 –> 00:40:24,400
nobody owns the access intent model.
1133
00:40:24,400 –> 00:40:26,080
And because Fabric centralizes everything
1134
00:40:26,080 –> 00:40:27,200
into a workspace experience,
1135
00:40:27,200 –> 00:40:28,640
the organization starts confusing
1136
00:40:28,640 –> 00:40:31,440
someone has access with someone has responsibility.
1137
00:40:31,440 –> 00:40:32,480
Those are not the same.
1138
00:40:32,480 –> 00:40:33,920
An ownership vacuum forms
1139
00:40:33,920 –> 00:40:36,640
when the system can function without an explicit owner.
1140
00:40:36,640 –> 00:40:37,600
Fabric lets you do that.
1141
00:40:37,600 –> 00:40:40,240
It’s a platform designed for speed and collaboration,
1142
00:40:40,240 –> 00:40:41,520
so it will happily keep working
1143
00:40:41,520 –> 00:40:43,120
while your governance model erodes.
1144
00:40:43,120 –> 00:40:44,800
Copilot accelerates the erosion
1145
00:40:44,800 –> 00:40:46,480
because it makes it easier for more people
1146
00:40:46,480 –> 00:40:49,520
to create more artifacts, pipelines, notebooks,
1147
00:40:49,520 –> 00:40:51,920
transformations, models and answers.
1148
00:40:51,920 –> 00:40:54,000
More artifacts means more implicit contracts.
1149
00:40:54,000 –> 00:40:55,840
More implicit contracts means more drift.
1150
00:40:55,840 –> 00:40:58,400
And drift without an owner is not a technical issue.
1151
00:40:58,400 –> 00:40:59,760
It’s an entropy management failure.
1152
00:40:59,760 –> 00:41:03,760
This is why each exception becomes an entropy generator.
1153
00:41:03,760 –> 00:41:05,440
You add one temporary shortcut,
1154
00:41:05,440 –> 00:41:07,200
one just for now workspace role.
1155
00:41:07,200 –> 00:41:08,880
One quick semantic model
1156
00:41:08,880 –> 00:41:10,800
built directly on raw tables.
1157
00:41:10,800 –> 00:41:12,880
One fix in DAX because upstream is slow,
1158
00:41:12,880 –> 00:41:15,200
one Copilot-generated query saved as a dataset
1159
00:41:15,200 –> 00:41:16,480
because it worked.
1160
00:41:16,480 –> 00:41:18,640
Each one is defensible in isolation.
1161
00:41:18,640 –> 00:41:21,600
Together, they form a system where intent can’t be proven.
1162
00:41:21,600 –> 00:41:24,000
And the real consequence is that you end up with incidents
1163
00:41:24,000 –> 00:41:25,600
that aren’t outages,
1164
00:41:25,600 –> 00:41:27,600
cost incidents, correctness incidents,
1165
00:41:27,600 –> 00:41:28,640
access incidents.
1166
00:41:28,640 –> 00:41:30,080
They don’t always show up as red lights.
1167
00:41:30,080 –> 00:41:31,920
They show up as executive distrust,
1168
00:41:31,920 –> 00:41:33,280
auditor discomfort,
1169
00:41:33,280 –> 00:41:36,480
and a capacity meter that behaves like a random number generator.
1170
00:41:36,480 –> 00:41:39,040
So if you’re inheriting a Fabric estate and thinking,
1171
00:41:39,040 –> 00:41:41,040
why does this feel harder than it should be,
1172
00:41:41,040 –> 00:41:41,760
here’s the answer.
1173
00:41:41,760 –> 00:41:42,560
The platform works.
1174
00:41:42,560 –> 00:41:43,680
The system doesn’t.
1175
00:41:43,680 –> 00:41:45,760
And the only way out is to stop treating governance
1176
00:41:45,760 –> 00:41:48,640
as education and start treating it as enforced design.
1177
00:41:48,640 –> 00:41:50,640
Because fabric doesn’t slow you down anymore,
1178
00:41:50,640 –> 00:41:52,720
which means you have to choose where friction belongs.
1179
00:41:52,720 –> 00:41:55,520
The role collapse, what shrunk, what didn’t.
1180
00:41:55,520 –> 00:41:57,120
Here’s the part people keep getting wrong
1181
00:41:57,120 –> 00:41:58,560
and it’s why the hiring conversations
1182
00:41:58,560 –> 00:42:00,560
in Fabric estates are weird right now.
1183
00:42:00,560 –> 00:42:03,360
They see Copilot generating notebooks and SQL
1184
00:42:03,360 –> 00:42:05,680
and they conclude the data engineer role is shrinking.
1185
00:42:05,680 –> 00:42:06,720
It is.
1186
00:42:06,720 –> 00:42:09,600
But only the parts that were never the job in the first place.
1187
00:42:09,600 –> 00:42:11,840
What shrank is the visible labor:
1188
00:42:11,840 –> 00:42:13,840
Handwriting the pipeline scaffolding,
1189
00:42:13,840 –> 00:42:15,200
stitching connectors together,
1190
00:42:15,200 –> 00:42:16,080
doing repetitive
1191
00:42:16,080 –> 00:42:19,280
SQL transforms that look impressive in a commit history,
1192
00:42:19,280 –> 00:42:22,160
and maintaining glue code that exists purely
1193
00:42:22,160 –> 00:42:24,320
because tools used to be separate.
1194
00:42:24,320 –> 00:42:25,760
Fabric collapsed a lot of that work
1195
00:42:25,760 –> 00:42:28,480
into item creation and shared experiences,
1196
00:42:28,480 –> 00:42:30,880
and Copilot collapses even more into auto-complete
1197
00:42:30,880 –> 00:42:32,080
with extra steps.
1198
00:42:32,080 –> 00:42:34,480
So yes, the estate can produce artifacts faster,
1199
00:42:34,480 –> 00:42:37,040
but the trap is that teams confuse artifact velocity
1200
00:42:37,040 –> 00:42:38,080
with system integrity.
1201
00:42:38,080 –> 00:42:41,040
In the old world, you had to earn every artifact
1202
00:42:41,040 –> 00:42:42,960
by suffering through provisioned infrastructure,
1203
00:42:42,960 –> 00:42:44,480
separate portals, long-run times,
1204
00:42:44,480 –> 00:42:46,240
and annoying deployment friction.
1205
00:42:46,240 –> 00:42:48,480
That friction acted like a tax on casual changes.
1206
00:42:48,480 –> 00:42:49,760
It slowed bad decisions.
1207
00:42:49,760 –> 00:42:50,960
It also slowed good ones.
1208
00:42:50,960 –> 00:42:54,080
But the net effect was that fewer people touched the system,
1209
00:42:54,080 –> 00:42:55,280
fewer times per day,
1210
00:42:55,280 –> 00:42:58,480
with fewer opportunities to accidentally publish a new truth.
1211
00:42:58,480 –> 00:42:59,760
Fabric removes that tax.
1212
00:42:59,760 –> 00:43:03,120
So if your organization’s identity for a data engineer
1213
00:43:03,120 –> 00:43:04,720
was the person who makes pipelines,
1214
00:43:04,720 –> 00:43:06,880
of course it feels like the role is disappearing.
1215
00:43:06,880 –> 00:43:09,040
Pipelines are easier, notebooks are easier,
1216
00:43:09,040 –> 00:43:11,040
even the semantic model path is faster.
1217
00:43:11,040 –> 00:43:12,800
The visible assembly work shrinks,
1218
00:43:12,800 –> 00:43:14,160
and now the uncomfortable part.
1219
00:43:14,160 –> 00:43:15,920
The responsibilities that didn’t shrink
1220
00:43:15,920 –> 00:43:18,720
are the ones that actually determine whether the estate survives,
1221
00:43:18,720 –> 00:43:22,320
data contracts, schema enforcement, cost predictability,
1222
00:43:22,320 –> 00:43:25,040
security boundaries, and ownership clarity.
1223
00:43:25,040 –> 00:43:26,320
Those did not get automated.
1224
00:43:26,320 –> 00:43:27,840
They got exposed.
1225
00:43:27,840 –> 00:43:29,920
Because once the tool friction goes away,
1226
00:43:29,920 –> 00:43:32,240
the system will ship whatever you allow it to ship.
1227
00:43:32,240 –> 00:43:33,760
If you didn’t encode invariants,
1228
00:43:33,760 –> 00:43:35,520
you don’t get flexibility.
1229
00:43:35,520 –> 00:43:37,680
You get drift at a higher refresh frequency.
1230
00:43:38,400 –> 00:43:40,000
If you didn’t encode cost intent,
1231
00:43:40,000 –> 00:43:41,520
you don’t get self-service,
1232
00:43:41,520 –> 00:43:43,120
you get shared meter contention,
1233
00:43:43,120 –> 00:43:45,280
and a budget that behaves like weather.
1234
00:43:45,280 –> 00:43:46,320
This is the role collapse,
1235
00:43:46,320 –> 00:43:50,000
the platform turned craftsmanship into a smaller percentage of the work,
1236
00:43:50,000 –> 00:43:52,720
and turned governance into the defining bottleneck.
1237
00:43:52,720 –> 00:43:55,440
When tooling gets easier, discipline becomes the bottleneck,
1238
00:43:55,440 –> 00:43:57,840
not because discipline is morally good,
1239
00:43:57,840 –> 00:43:59,360
but because discipline is the only thing
1240
00:43:59,360 –> 00:44:01,280
that produces deterministic outcomes
1241
00:44:01,280 –> 00:44:03,680
in a high-speed low-friction platform.
1242
00:44:03,680 –> 00:44:07,200
This is also why so many teams feel personally attacked by Fabric.
1243
00:44:07,200 –> 00:44:09,120
The old world let you be a hero with effort.
1244
00:44:09,120 –> 00:44:10,960
You could brute force a solution with time
1245
00:44:10,960 –> 00:44:12,480
and the friction of the system
1246
00:44:12,480 –> 00:44:14,400
made that effort feel like engineering.
1247
00:44:14,400 –> 00:44:15,920
Fabric removes the heroism layer,
1248
00:44:15,920 –> 00:44:18,560
it rewards design, it punishes improvisation.
1249
00:44:18,560 –> 00:44:19,920
And if you came up in an environment
1250
00:44:19,920 –> 00:44:22,160
where making it work was the primary skill,
1251
00:44:22,160 –> 00:44:24,160
Fabric makes you faster at making it work,
1252
00:44:24,160 –> 00:44:26,960
and therefore faster at creating the next incident class.
1253
00:44:26,960 –> 00:44:28,160
So the job moved up the stack
1254
00:44:28,160 –> 00:44:29,440
and the modern data engineer
1255
00:44:29,440 –> 00:44:31,360
isn’t primarily a tool operator anymore.
1256
00:44:31,360 –> 00:44:32,960
They’re boundary enforcers.
1257
00:44:32,960 –> 00:44:36,000
Contracts, schemas, roles, plans, and gates.
1258
00:44:36,000 –> 00:44:37,760
They decide which paths are allowed,
1259
00:44:37,760 –> 00:44:39,600
which behaviors are acceptable,
1260
00:44:39,600 –> 00:44:41,280
and which quick fixes are rejected
1261
00:44:41,280 –> 00:44:42,880
because they generate long-term drift.
1262
00:44:42,880 –> 00:44:46,000
Analytics engineers feel the same squeeze from the other side.
1263
00:44:46,000 –> 00:44:47,440
If logic lives in five tools,
1264
00:44:47,440 –> 00:44:48,960
Lakehouse transforms, warehouse views,
1265
00:44:48,960 –> 00:44:51,200
semantic model measures, report-level filters,
1266
00:44:51,200 –> 00:44:52,400
and app-level business rules,
1267
00:44:52,400 –> 00:44:54,320
then the platform didn’t give you agility.
1268
00:44:54,320 –> 00:44:56,720
It gave you five places for truth to fragment.
1269
00:44:56,720 –> 00:44:59,040
Platform owners also get a new kind of accountability.
1270
00:44:59,040 –> 00:45:00,160
You can’t hide behind,
1271
00:45:00,160 –> 00:45:03,200
we size the cluster wrong, or we need more nodes.
1272
00:45:03,200 –> 00:45:05,120
In Fabric, if plans are unstable,
1273
00:45:05,120 –> 00:45:06,160
cost is undefined,
1274
00:45:06,160 –> 00:45:08,560
and you will be the person asked why the meter spikes.
1275
00:45:08,560 –> 00:45:09,600
Not because it’s your fault,
1276
00:45:09,600 –> 00:45:12,080
but because the platform centralised the blast radius
1277
00:45:12,080 –> 00:45:13,920
into one capacity envelope.
1278
00:45:13,920 –> 00:45:15,760
And architects inheriting Fabric estates
1279
00:45:15,760 –> 00:45:17,520
have the worst version of this.
1280
00:45:17,520 –> 00:45:20,080
They inherit a pile of artefacts that all work
1281
00:45:20,080 –> 00:45:22,400
with no explicit boundaries, no clear owners,
1282
00:45:22,400 –> 00:45:23,680
and a history of exceptions
1283
00:45:23,680 –> 00:45:25,600
that became the actual operating model.
1284
00:45:25,600 –> 00:45:27,040
They’re asked to make it reliable
1285
00:45:27,040 –> 00:45:28,560
without slowing the business down,
1286
00:45:28,560 –> 00:45:29,760
which is corporate code for
1287
00:45:29,760 –> 00:45:31,760
“please enforce discipline without creating conflict.”
1288
00:45:31,760 –> 00:45:32,560
But that’s the shift.
1289
00:45:32,560 –> 00:45:33,680
The role didn’t get smaller.
1290
00:45:33,680 –> 00:45:35,760
The role got less visible and more consequential.
1291
00:45:35,760 –> 00:45:37,680
Fabric didn’t eliminate data engineering.
1292
00:45:37,680 –> 00:45:39,920
It eliminated the parts that used to distract people
1293
00:45:39,920 –> 00:45:40,960
from the real job,
1294
00:45:40,960 –> 00:45:42,560
enforcing intent at scale.
1295
00:45:42,560 –> 00:45:44,560
What the modern data engineer is now.
1296
00:45:44,560 –> 00:45:47,120
Most organisations still describe the modern data engineer
1297
00:45:47,120 –> 00:45:49,120
as a faster version of the old one.
1298
00:45:49,120 –> 00:45:50,720
Same job, new tooling.
1299
00:45:50,720 –> 00:45:51,920
That belief is comfortable.
1300
00:45:51,920 –> 00:45:52,880
And it’s also wrong.
1301
00:45:52,880 –> 00:45:54,960
The modern data engineer in a Fabric estate
1302
00:45:54,960 –> 00:45:56,240
is not a pipeline writer,
1303
00:45:56,240 –> 00:45:57,360
not a tool operator,
1304
00:45:57,360 –> 00:45:58,880
and not a SQL typist.
1305
00:45:58,880 –> 00:46:00,000
Those activities still exist,
1306
00:46:00,000 –> 00:46:01,920
but they are not the centre of gravity anymore.
1307
00:46:02,640 –> 00:46:04,480
Fabric and Copilot shoved the work
1308
00:46:04,480 –> 00:46:07,200
up the stack into a place most teams avoided.
1309
00:46:07,200 –> 00:46:08,480
Explicit intent.
1310
00:46:08,480 –> 00:46:10,480
And this is the part that makes people uncomfortable,
1311
00:46:10,480 –> 00:46:12,880
because intent can’t be improvised.
1312
00:46:12,880 –> 00:46:15,040
It has to be designed, enforced, and defended.
1313
00:46:15,040 –> 00:46:16,080
So what is the job now?
1314
00:46:16,080 –> 00:46:17,600
First, contract designer.
1315
00:46:17,600 –> 00:46:18,960
A contract is not a wiki page,
1316
00:46:18,960 –> 00:46:20,240
and it’s not a diagram.
1317
00:46:20,240 –> 00:46:21,680
A contract is an enforcement boundary
1318
00:46:21,680 –> 00:46:23,840
that makes ambiguity expensive to ship.
1319
00:46:23,840 –> 00:46:26,000
It defines grain, keys, types,
1320
00:46:26,000 –> 00:46:28,160
nullability, freshness expectations,
1321
00:46:28,160 –> 00:46:29,280
and failure behaviour.
1322
00:46:29,280 –> 00:46:31,360
It encodes what the business thinks is true
1323
00:46:31,360 –> 00:46:33,040
into something the platform can enforce
1324
00:46:33,040 –> 00:46:34,720
without asking permission every time.
1325
00:46:34,720 –> 00:46:37,280
When you skip this, the estate will still move fast.
1326
00:46:37,280 –> 00:46:39,200
It will just move fast toward drift.
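A contract like the one just described can be sketched in a few lines. This is a minimal illustration, not a Fabric feature: the `Contract`, `Column`, `enforce`, and `violation` names, the `orders` example, and the quarantine-instead-of-fail behaviour are all assumptions made for the sketch. It assumes key columns and the timestamp column are declared as non-nullable columns in the contract.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Column:
    name: str
    dtype: type
    nullable: bool = False


@dataclass(frozen=True)
class Contract:
    keys: tuple           # columns that define the grain (one row per key)
    columns: tuple        # Column definitions: types and nullability
    max_age: timedelta    # freshness expectation
    ts_column: str        # column holding the load timestamp


def violation(contract, row, seen, now):
    """Return the first contract violation for a row, or None if it conforms."""
    for col in contract.columns:
        value = row.get(col.name)
        if value is None:
            if not col.nullable:
                return f"null in non-nullable column {col.name}"
        elif not isinstance(value, col.dtype):
            return f"wrong type for {col.name}"
    if tuple(row[k] for k in contract.keys) in seen:
        return "duplicate key (grain violation)"
    if now - row[contract.ts_column] > contract.max_age:
        return "stale row"
    return None


def enforce(contract, rows, now):
    """Failure behaviour chosen here: quarantine bad rows, keep the reason."""
    accepted, quarantined, seen = [], [], set()
    for row in rows:
        reason = violation(contract, row, seen, now)
        if reason is None:
            seen.add(tuple(row[k] for k in contract.keys))
            accepted.append(row)
        else:
            quarantined.append((row, reason))
    return accepted, quarantined


# Hypothetical contract and sample rows, for illustration only.
orders = Contract(
    keys=("order_id",),
    columns=(
        Column("order_id", int),
        Column("amount", float),
        Column("note", str, nullable=True),
        Column("loaded_at", datetime),
    ),
    max_age=timedelta(hours=24),
    ts_column="loaded_at",
)
now = datetime(2024, 1, 2, tzinfo=timezone.utc)
rows = [
    {"order_id": 1, "amount": 10.0, "note": None, "loaded_at": now},
    {"order_id": 1, "amount": 12.0, "note": "dup", "loaded_at": now},
    {"order_id": 2, "amount": None, "note": None, "loaded_at": now},
    {"order_id": 3, "amount": 5.0, "loaded_at": now - timedelta(days=3)},
]
accepted, quarantined = enforce(orders, rows, now)
reasons = [reason for _, reason in quarantined]
```

The point of the sketch is that every rejected row carries a reason, so the quarantine count becomes something you can measure rather than something you argue about.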
1327
00:46:39,200 –> 00:46:40,960
Second, boundary enforcer.
1328
00:46:40,960 –> 00:46:43,280
This is the skill that separates
1329
00:46:43,280 –> 00:46:46,640
“we adopted Fabric” from “we operate Fabric.”
1330
00:46:46,640 –> 00:46:48,560
Boundaries in Fabric are not the marketing ones.
1331
00:46:48,560 –> 00:46:50,320
They’re not lakehouse versus warehouse
1332
00:46:50,320 –> 00:46:51,360
as a product menu.
1333
00:46:51,360 –> 00:46:53,120
They’re architectural seams.
1334
00:46:53,120 –> 00:46:54,320
Intake versus contract,
1335
00:46:54,320 –> 00:46:55,440
contract versus consumption,
1336
00:46:55,440 –> 00:46:57,120
consumption versus presentation.
1337
00:46:57,120 –> 00:46:59,440
The modern data engineer makes those seams physical
1338
00:46:59,440 –> 00:47:02,240
with views, procedures, schemas, and controlled surfaces.
1339
00:47:02,240 –> 00:47:04,480
And they also make those seams social
1340
00:47:04,480 –> 00:47:06,160
by refusing to accept quick fixes
1341
00:47:06,160 –> 00:47:07,440
that bypass the boundary.
1342
00:47:07,440 –> 00:47:11,040
That refusal matters more than any new feature announcement.
1343
00:47:11,040 –> 00:47:12,160
Third, cost governor.
1344
00:47:12,160 –> 00:47:15,680
If you’re uncomfortable calling cost a governance problem,
1345
00:47:15,680 –> 00:47:16,960
Fabric will fix that for you.
1346
00:47:16,960 –> 00:47:18,880
In Fabric, cost is not finance’s problem.
1347
00:47:18,880 –> 00:47:21,440
Cost is an execution property of your decisions.
1348
00:47:21,440 –> 00:47:23,120
Query surfaces, concurrency,
1349
00:47:23,120 –> 00:47:24,480
refresh design, and the difference
1350
00:47:24,480 –> 00:47:26,960
between bounded access and raw exploration.
1351
00:47:26,960 –> 00:47:29,200
The modern data engineer treats execution plans
1352
00:47:29,200 –> 00:47:30,640
like policy artifacts.
1353
00:47:30,640 –> 00:47:32,640
They enforce SARGability expectations,
1354
00:47:32,640 –> 00:47:34,880
bounded predicates, and predictable query shapes
1355
00:47:34,880 –> 00:47:35,760
for critical paths,
1356
00:47:35,760 –> 00:47:37,520
not because they enjoy gatekeeping
1357
00:47:37,520 –> 00:47:40,640
but because deterministic spend requires deterministic behavior.
1358
00:47:40,640 –> 00:47:42,640
Fourth, failure mode anticipator.
1359
00:47:42,640 –> 00:47:46,320
Old school data engineering treated failure as outages.
1360
00:47:46,320 –> 00:47:49,280
A job failed, a pipeline stopped, a connector broke.
1361
00:47:49,280 –> 00:47:52,720
In Fabric, the more dangerous failures are the quiet ones.
1362
00:47:52,720 –> 00:47:54,880
It refreshed, but it drifted.
1363
00:47:54,880 –> 00:47:56,960
It answered, but it exposed. It worked,
1364
00:47:56,960 –> 00:47:58,560
but it burned the capacity meter.
1365
00:47:58,960 –> 00:48:01,360
The modern data engineer thinks in incident classes,
1366
00:48:01,360 –> 00:48:03,200
cost incidents, correctness incidents,
1367
00:48:03,200 –> 00:48:05,600
access incidents, and they design for detection.
1368
00:48:05,600 –> 00:48:07,200
Plans, scans versus returns,
1369
00:48:07,200 –> 00:48:08,800
violations, quarantines, lineage,
1370
00:48:08,800 –> 00:48:11,120
and auditability aren’t optional extras.
1371
00:48:11,120 –> 00:48:13,360
They are the sensors that tell you the estate is decaying
1372
00:48:13,360 –> 00:48:15,040
before the business notices.
1373
00:48:15,040 –> 00:48:16,480
That distinction matters
1374
00:48:16,480 –> 00:48:18,640
because you can’t manage what you can’t observe
1375
00:48:18,640 –> 00:48:21,120
and you can’t govern what you can’t prove.
1376
00:48:21,120 –> 00:48:22,800
Now tie this back to the T-SQL mindset
1377
00:48:22,800 –> 00:48:25,120
because this is where people underestimate what’s happening.
1378
00:48:25,120 –> 00:48:26,800
Most teams treat T-SQL,
1379
00:48:26,800 –> 00:48:30,160
schemas, roles, and plans as implementation details.
1380
00:48:30,160 –> 00:48:32,560
In a Fabric estate, those are control surfaces.
1381
00:48:32,560 –> 00:48:33,840
A schema is not a folder.
1382
00:48:33,840 –> 00:48:34,960
It’s an ownership boundary.
1383
00:48:34,960 –> 00:48:36,240
A role is not a convenience.
1384
00:48:36,240 –> 00:48:37,520
It’s an intent declaration.
1385
00:48:37,520 –> 00:48:38,720
A view is not a shortcut.
1386
00:48:38,720 –> 00:48:39,760
It’s a contract wrapper.
1387
00:48:39,760 –> 00:48:42,720
An execution plan is not a performance troubleshooting tool.
1388
00:48:42,720 –> 00:48:44,640
It’s the cost policy reality check.
1389
00:48:44,640 –> 00:48:46,320
And once you see those as control surfaces,
1390
00:48:46,320 –> 00:48:48,000
you stop asking Copilot to be smarter.
1391
00:48:48,000 –> 00:48:49,680
You start making the system stricter.
1392
00:48:49,680 –> 00:48:51,280
That is the only sustainable move.
1393
00:48:51,280 –> 00:48:54,160
Now, because this episode is for multiple audiences,
1394
00:48:54,160 –> 00:48:56,000
the role shift needs to be explicit
1395
00:48:56,000 –> 00:48:59,360
because each group tends to outsource the hard part to someone else.
1396
00:48:59,360 –> 00:49:02,640
Data engineer: if you don’t enforce schema and contracts,
1397
00:49:02,640 –> 00:49:04,240
you’re not engineering, you’re relaying,
1398
00:49:04,240 –> 00:49:07,600
you’re moving ambiguity from source to report at higher speed.
1399
00:49:07,600 –> 00:49:10,320
Analytics engineer: if logic lives in five tools,
1400
00:49:10,320 –> 00:49:12,320
you own the drift. Not emotionally,
1401
00:49:12,320 –> 00:49:15,040
operationally. You can’t demand a single source of truth
1402
00:49:15,040 –> 00:49:17,920
while implementing five competing sources of logic.
1403
00:49:17,920 –> 00:49:20,880
Platform owner: if plans are unstable, cost is undefined.
1404
00:49:20,880 –> 00:49:23,120
You don’t get to argue “it’s just usage”
1405
00:49:23,120 –> 00:49:24,800
when the query surface allows the scan.
1406
00:49:25,520 –> 00:49:27,760
Architect: if boundaries aren’t explicit,
1407
00:49:27,760 –> 00:49:29,360
integration will decide them for you.
1408
00:49:29,360 –> 00:49:32,240
And integration always picks the path of least resistance,
1409
00:49:32,240 –> 00:49:33,520
not the path of least risk.
1410
00:49:33,520 –> 00:49:37,680
Executive: speed without control always bills you later.
1411
00:49:37,680 –> 00:49:40,720
Sometimes as money, sometimes as trust, sometimes as audit pain,
1412
00:49:40,720 –> 00:49:41,760
always as rework.
1413
00:49:41,760 –> 00:49:44,160
This is the modern identity.
1414
00:49:44,160 –> 00:49:46,240
Enforce intent at scale,
1415
00:49:46,240 –> 00:49:49,200
in a platform that will otherwise enforce entropy for you.
1416
00:49:49,200 –> 00:49:52,640
Operating model for Fabric plus Copilot:
1417
00:49:52,640 –> 00:49:54,400
enforcement, not education.
1418
00:49:54,400 –> 00:49:56,800
So here’s the operating model and it is not inspiring.
1419
00:49:56,800 –> 00:49:58,080
It is enforceable.
1420
00:49:58,080 –> 00:50:01,120
First rule, AI drafts, humans approve.
1421
00:50:01,120 –> 00:50:03,600
Generation is typing; review is engineering.
1422
00:50:03,600 –> 00:50:06,240
You don’t merge Copilot output because it compiled.
1423
00:50:06,240 –> 00:50:08,400
You merge it because it passed acceptance criteria
1424
00:50:08,400 –> 00:50:09,680
you can defend later.
1425
00:50:09,680 –> 00:50:12,720
And that means you define acceptance criteria that aren’t vibes.
1426
00:50:12,720 –> 00:50:15,200
Schema checks, constraint checks, execution plans,
1427
00:50:15,200 –> 00:50:16,560
and security intent.
1428
00:50:16,560 –> 00:50:18,320
If it can’t be validated mechanically,
1429
00:50:18,320 –> 00:50:20,160
it will be argued socially.
1430
00:50:20,160 –> 00:50:22,640
Social governance collapses under schedule pressure.
1431
00:50:22,640 –> 00:50:24,080
Mechanical governance survives.
1432
00:50:24,720 –> 00:50:26,960
Second rule, contracts before convenience.
1433
00:50:26,960 –> 00:50:28,640
The lakehouse is not your truth layer.
1434
00:50:28,640 –> 00:50:29,760
It’s your intake layer.
1435
00:50:29,760 –> 00:50:31,680
The warehouse is where you commit to shape.
1436
00:50:31,680 –> 00:50:34,720
That distinction matters because schema on read is a drift engine.
1437
00:50:34,720 –> 00:50:37,440
If you let bronze drift into silver and call it agile,
1438
00:50:37,440 –> 00:50:38,560
you’re not shipping fast.
1439
00:50:38,560 –> 00:50:41,040
You’re spreading ambiguity across more downstream surfaces
1440
00:50:41,040 –> 00:50:42,640
at a higher refresh rate.
1441
00:50:42,640 –> 00:50:44,240
So you declare the primary boundary,
1442
00:50:44,240 –> 00:50:45,280
lakehouse to warehouse.
1443
00:50:45,280 –> 00:50:48,240
That boundary gets gates, not documentation, gates.
1444
00:50:48,240 –> 00:50:51,040
Third rule, execution plans are cost policy,
1445
00:50:51,040 –> 00:50:52,320
not an optimization hobby.
1446
00:50:52,320 –> 00:50:55,280
If you want deterministic spend on a shared capacity,
1447
00:50:55,280 –> 00:50:58,000
you treat plan stability like you treat change control.
1448
00:50:58,000 –> 00:51:00,000
Critical paths must have bounded predicates.
1449
00:51:00,000 –> 00:51:01,440
They must have selective filters.
1450
00:51:01,440 –> 00:51:03,360
They must avoid non-SARGable predicates.
1451
00:51:03,360 –> 00:51:05,200
They must avoid select star.
1452
00:51:05,200 –> 00:51:07,840
And they must not spill under expected concurrency.
1453
00:51:07,840 –> 00:51:09,840
If they do, you don’t monitor it.
1454
00:51:09,840 –> 00:51:11,360
You redesign the surface.
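Some of those rules can be checked mechanically before a query ever reaches the capacity. What follows is a rough sketch under stated assumptions: it uses regular-expression heuristics rather than a real SQL parser, the `lint` function and its finding labels are hypothetical, and it will miss or misjudge complex queries. It cannot see spills at all; those need the actual execution plan.

```python
import re

# Heuristic patterns only; a production gate would parse the SQL properly.
SELECT_STAR = re.compile(r"\bselect\s+(?:\w+\.)?\*", re.IGNORECASE)
WHERE_CLAUSE = re.compile(r"\bwhere\b(.*)", re.IGNORECASE | re.DOTALL)
# A function call sitting on the left of a comparison, e.g. YEAR(order_date) = 2024.
# Wrapping the column in a function usually prevents the engine from seeking on it.
FUNC_ON_COLUMN = re.compile(
    r"\b(?!(?:and|or|not|in|exists)\b)[A-Za-z_]\w*\s*\([^()]*\)\s*(?:<>|<=|>=|=|<|>)",
    re.IGNORECASE,
)
# LIKE '%...' cannot use an ordered range either.
LEADING_WILDCARD = re.compile(r"\blike\s+N?'%", re.IGNORECASE)


def lint(sql):
    """Return a list of findings for one query; an empty list means it passed."""
    findings = []
    if SELECT_STAR.search(sql):
        findings.append("select star")
    match = WHERE_CLAUSE.search(sql)
    if match is None:
        findings.append("no bounding predicate")
        return findings
    predicate = match.group(1)
    if FUNC_ON_COLUMN.search(predicate):
        findings.append("function applied to column in predicate")
    if LEADING_WILDCARD.search(predicate):
        findings.append("leading wildcard in LIKE")
    return findings


# Hypothetical queries against a made-up sales.orders table.
unbounded = lint("SELECT * FROM sales.orders")
wrapped = lint("SELECT order_id FROM sales.orders WHERE YEAR(order_date) = 2024")
bounded = lint(
    "SELECT order_id FROM sales.orders "
    "WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01'"
)
wildcard = lint("SELECT order_id FROM sales.orders WHERE customer LIKE '%smith'")
```

The third query expresses the same intent as the second, but as a range on the bare column, which is the shape the rule is asking for.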
1455
00:51:11,360 –> 00:51:13,680
Fourth rule, views and procedures over raw tables.
1456
00:51:13,680 –> 00:51:16,080
Always. This is the consumption API.
1457
00:51:16,080 –> 00:51:19,040
It protects cost, correctness, and security in one move.
1458
00:51:19,040 –> 00:51:22,080
Views give you stable contracts even when internals change.
1459
00:51:22,080 –> 00:51:23,840
Procedures give you parametrization,
1460
00:51:23,840 –> 00:51:26,400
so consumers don’t invent their own query shapes.
1461
00:51:26,400 –> 00:51:28,640
And because everything points at the same surface,
1462
00:51:28,640 –> 00:51:31,200
you have one place to tune, one place to secure,
1463
00:51:31,200 –> 00:51:32,720
and one place to audit.
1464
00:51:32,720 –> 00:51:34,720
If you let raw tables be queryable by default,
1465
00:51:34,720 –> 00:51:37,440
you have accepted uncontrolled query shapes.
1466
00:51:37,440 –> 00:51:39,680
You’ve also accepted schema drift as a breaking change
1467
00:51:39,680 –> 00:51:40,960
you’ll discover downstream.
1468
00:51:40,960 –> 00:51:41,920
That’s not self-service.
1469
00:51:41,920 –> 00:51:43,840
That’s unmanaged blast radius.
1470
00:51:43,840 –> 00:51:47,520
Fifth rule, CI/CD gates for schema and logic changes.
1471
00:51:47,520 –> 00:51:49,920
Fabric is fast enough that drift accumulates faster
1472
00:51:49,920 –> 00:51:52,800
than your team’s ability to remember why things were designed
1473
00:51:52,800 –> 00:51:53,600
the way they were.
1474
00:51:53,600 –> 00:51:55,120
That’s what entropy looks like.
1475
00:51:55,120 –> 00:51:57,360
Yesterday’s exception becomes today’s dependency.
1476
00:51:57,360 –> 00:52:01,200
So you put friction where it belongs: in promotion,
1477
00:52:01,200 –> 00:52:02,720
not in exploration.
1478
00:52:02,720 –> 00:52:03,920
Exploration can be fast.
1479
00:52:03,920 –> 00:52:05,920
Production has gates.
1480
00:52:05,920 –> 00:52:07,200
Those gates should be boring.
1481
00:52:07,200 –> 00:52:09,920
Schema diffs require review, contract changes require
1482
00:52:09,920 –> 00:52:12,800
versioning, security changes require explicit intent,
1483
00:52:12,800 –> 00:52:15,520
and critical query paths require plan review.
1484
00:52:15,520 –> 00:52:17,120
Not because you love process,
1485
00:52:17,120 –> 00:52:19,520
but because without gates, your estate becomes a pile of
1486
00:52:19,520 –> 00:52:22,400
artifacts that work until the day leadership asks
1487
00:52:22,400 –> 00:52:24,960
why a number changed and nobody can prove anything.
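A boring gate of this kind might look like the sketch below. It is an assumption-laden illustration, not a Fabric or CI/CD product API: the `diff_schema` and `gate` functions are hypothetical, schemas are plain dictionaries of column name to (type name, nullable), and the policy that breaking changes need a major version bump is one possible rule, chosen here for the example.

```python
def diff_schema(old, new):
    """Compare two schemas given as {column: (type_name, nullable)}.

    Classification is an assumed policy: removals, type changes, and
    columns that become nullable break consumers; new columns are additive.
    """
    changes = []
    for name, (dtype, nullable) in old.items():
        if name not in new:
            changes.append(("breaking", f"column removed: {name}"))
            continue
        new_dtype, new_nullable = new[name]
        if new_dtype != dtype:
            changes.append(("breaking", f"type changed: {name} {dtype} -> {new_dtype}"))
        if new_nullable and not nullable:
            changes.append(("breaking", f"became nullable: {name}"))
    for name in new:
        if name not in old:
            changes.append(("additive", f"column added: {name}"))
    return changes


def gate(old, new, old_version, new_version):
    """Allow promotion only if breaking changes come with a major version bump.

    Versions are (major, minor) tuples. Returns (allowed, changes) so the
    reviewer sees exactly what the decision was based on.
    """
    changes = diff_schema(old, new)
    has_breaking = any(kind == "breaking" for kind, _ in changes)
    major_bump = new_version[0] > old_version[0]
    return (not has_breaking) or major_bump, changes


# Hypothetical contract versions for a made-up orders table.
v1 = {"order_id": ("int", False), "amount": ("decimal", False)}
v2 = {
    "order_id": ("bigint", False),
    "amount": ("decimal", True),
    "channel": ("string", True),
}
blocked, changes = gate(v1, v2, (1, 4), (1, 5))
allowed, _ = gate(v1, v2, (1, 4), (2, 0))
```

Returning the change list alongside the decision is deliberate: it is the record you point to when someone later asks why a number changed.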
1488
00:52:24,960 –> 00:52:28,080
Sixth rule, every layer assumes decay unless enforced.
1489
00:52:28,080 –> 00:52:30,320
Workspaces drift, permissions drift,
1490
00:52:30,320 –> 00:52:32,720
semantic models drift, definitions drift,
1491
00:52:32,720 –> 00:52:34,560
people leave, context disappears,
1492
00:52:34,560 –> 00:52:36,800
Copilot produces plausible artifacts.
1493
00:52:36,800 –> 00:52:38,880
Over time, policies drift away from intent.
1494
00:52:38,880 –> 00:52:40,240
That is not a Fabric problem.
1495
00:52:40,240 –> 00:52:42,960
That is the default state of any platform at scale.
1496
00:52:42,960 –> 00:52:44,400
So you design for proof.
1497
00:52:44,400 –> 00:52:46,000
Violation counts, quarantine counts,
1498
00:52:46,000 –> 00:52:48,560
plan baselines, lineage visibility, and audit trails.
1499
00:52:48,560 –> 00:52:49,840
You don’t need a perfect system.
1500
00:52:49,840 –> 00:52:52,800
You need a system that tells you when it is becoming less true.
1501
00:52:52,800 –> 00:52:55,440
And the strongest line to keep in your head when someone tries to
1502
00:52:55,440 –> 00:52:58,720
negotiate governance into guidance is this.
1503
00:52:58,720 –> 00:53:00,480
Fabric moves data fast.
1504
00:53:00,480 –> 00:53:02,800
Governance decides whether it stays true.
1505
00:53:02,800 –> 00:53:05,440
So the final takeaway is simple, and it’s not flattering.
1506
00:53:05,440 –> 00:53:07,440
Fabric didn’t simplify data engineering.
1507
00:53:07,440 –> 00:53:08,960
It simplified the mechanics.
1508
00:53:08,960 –> 00:53:11,200
It removed the ceremony that used to slow you down,
1509
00:53:11,200 –> 00:53:13,840
and it removed the padding that used to hide design omissions.
1510
00:53:13,840 –> 00:53:16,400
That’s why these failures showed up in Fabric first.
1511
00:53:16,400 –> 00:53:17,680
But they aren’t Fabric-specific.
1512
00:53:17,680 –> 00:53:19,840
They are the inevitable failure modes of any platform
1513
00:53:19,840 –> 00:53:22,400
that collapses ingestion, transformation, storage,
1514
00:53:22,400 –> 00:53:26,400
semantics, and consumption into a single fast surface.
1515
00:53:26,400 –> 00:53:28,080
When you can move data at machine speed,
1516
00:53:28,080 –> 00:53:30,320
you can also ship ambiguity at machine speed.
1517
00:53:30,320 –> 00:53:33,280
And once ambiguity ships, it becomes somebody’s dashboard,
1518
00:53:33,280 –> 00:53:34,560
somebody’s executive summary,
1519
00:53:34,560 –> 00:53:36,000
somebody’s single source of truth
1520
00:53:36,000 –> 00:53:38,160
that exists in exactly one workspace
1521
00:53:38,160 –> 00:53:41,040
because it was the fastest place to fix the problem.
1522
00:53:41,040 –> 00:53:42,720
If you’re a senior data engineer,
1523
00:53:42,720 –> 00:53:45,040
the job now is to stop building artifacts
1524
00:53:45,040 –> 00:53:46,720
and start enforcing invariants.
1525
00:53:46,720 –> 00:53:48,080
If you’re an analytics engineer,
1526
00:53:48,080 –> 00:53:50,960
the job is to stop distributing logic across five tools
1527
00:53:50,960 –> 00:53:52,960
and then acting surprised when truth fragments.
1528
00:53:52,960 –> 00:53:56,400
If you own the platform, the job is to stop treating capacity spikes
1529
00:53:56,400 –> 00:53:59,200
like billing weirdness and start treating query surfaces
1530
00:53:59,200 –> 00:54:01,040
and execution plans as policy.
1531
00:54:01,040 –> 00:54:02,720
If you’re the inheriting architect,
1532
00:54:02,720 –> 00:54:04,880
the job is to make boundaries explicit
1533
00:54:04,880 –> 00:54:06,880
before integration decides them for you.
1534
00:54:06,880 –> 00:54:09,520
And if you’re a leader, the job is to stop buying speed
1535
00:54:09,520 –> 00:54:11,760
and then acting shocked when control erodes.
1536
00:54:11,760 –> 00:54:13,440
Now, if you want a concrete next step,
1537
00:54:13,440 –> 00:54:16,240
here’s the real prompt I want you to answer for your own estate.
1538
00:54:16,240 –> 00:54:19,120
What artifact do you trust most when something feels wrong?
1539
00:54:19,120 –> 00:54:21,200
Execution plan, capacity spike trace,
1540
00:54:21,200 –> 00:54:23,360
quarantine count, lineage view, audit log?
1541
00:54:23,360 –> 00:54:24,320
Because whatever you pick,
1542
00:54:24,320 –> 00:54:26,480
that’s what your governance model is actually built on,
1543
00:54:26,480 –> 00:54:27,760
whether you admit it or not.
1544
00:54:27,760 –> 00:54:31,440
Next episode is how to design Fabric data contracts
1545
00:54:31,440 –> 00:54:32,960
that survive Copilot.
1546
00:54:32,960 –> 00:54:33,960
Not a diagram.
1547
00:54:33,960 –> 00:54:35,280
A contract that blocks drift,
1548
00:54:35,280 –> 00:54:36,480
survives workforce churn
1549
00:54:36,480 –> 00:54:38,400
and still lets teams move fast
1550
00:54:38,400 –> 00:54:41,840
without turning your capacity into a random number generator.
1551
00:54:41,840 –> 00:54:43,640
Fabric is a speed multiplier
1552
00:54:43,640 –> 00:54:45,440
and it multiplies your governance debt
1553
00:54:45,440 –> 00:54:48,160
with the same enthusiasm it multiplies your delivery.
1554
00:54:48,160 –> 00:54:50,800
Drop a comment with the single artifact you trust most.
1555
00:54:50,800 –> 00:54:53,200
Execution plan, capacity metrics,
1556
00:54:53,200 –> 00:54:55,040
violation count or lineage,
1557
00:54:55,040 –> 00:54:57,280
and tell me which failure mode hit you hardest,
1558
00:54:57,280 –> 00:54:59,760
cost, contracts or security.
1559
00:54:59,760 –> 00:55:01,760
Subscribe for the data contract episode.