
1
00:00:00,000 –> 00:00:02,880
Most organizations treat ESG reporting like a narrative.
2
00:00:02,880 –> 00:00:04,800
Auditors treat it like evidence.
3
00:00:04,800 –> 00:00:06,680
And evidence has rules: origin, integrity,
4
00:00:06,680 –> 00:00:08,800
repeatability, and access control.
5
00:00:08,800 –> 00:00:12,360
If your number exists because someone edited a spreadsheet,
6
00:00:12,360 –> 00:00:13,800
your stack isn’t a stack.
7
00:00:13,800 –> 00:00:14,720
It’s a story.
8
00:00:14,720 –> 00:00:17,120
In this episode, this is what gets built.
9
00:00:17,120 –> 00:00:21,560
A minimal, auditable OESG architecture on Microsoft Cloud
10
00:00:21,560 –> 00:00:24,280
that you can replicate: identity, immutability,
11
00:00:24,280 –> 00:00:26,400
governed calculations, lineage,
12
00:00:26,400 –> 00:00:29,280
and a reporting layer that doesn’t rewrite history.
13
00:00:29,280 –> 00:00:31,520
And there’s one reason dashboards are the fastest path
14
00:00:31,520 –> 00:00:32,480
to audit failure.
15
00:00:32,480 –> 00:00:33,360
Coming up.
16
00:00:33,360 –> 00:00:35,040
The foundational misunderstanding.
17
00:00:35,040 –> 00:00:36,320
ESG isn’t a report.
18
00:00:36,320 –> 00:00:37,840
It’s a system of record.
19
00:00:37,840 –> 00:00:39,880
The core misconception is comforting.
20
00:00:39,880 –> 00:00:42,760
ESG is a document, a disclosure, a set of charts,
21
00:00:42,760 –> 00:00:45,040
a few paragraphs that say we’re improving.
22
00:00:45,040 –> 00:00:47,560
That framing works right up until someone asks for proof.
23
00:00:47,560 –> 00:00:49,800
In architectural terms, ESG is not a report.
24
00:00:49,800 –> 00:00:51,040
It is a system of record.
25
00:00:51,040 –> 00:00:54,160
That distinction matters because a report is an output artifact.
26
00:00:54,160 –> 00:00:56,200
It can be produced by almost any workflow,
27
00:00:56,200 –> 00:00:58,840
including workflows that should never survive contact
28
00:00:58,840 –> 00:00:59,760
with assurance.
29
00:00:59,760 –> 00:01:01,600
A system of record is different.
30
00:01:01,600 –> 00:01:04,640
It is a controlled environment where inputs, transformations
31
00:01:04,640 –> 00:01:07,680
and outputs are all tracked, replayable, and attributable.
32
00:01:07,680 –> 00:01:10,880
And OESG, operational ESG, is simply
33
00:01:10,880 –> 00:01:14,080
the adult version of ESG: measurable, decision-ready,
34
00:01:14,080 –> 00:01:15,000
and auditable.
35
00:01:15,000 –> 00:01:18,200
If ESG is going to be used in corporate disclosures, regulatory
36
00:01:18,200 –> 00:01:20,120
submissions or investor communications,
37
00:01:20,120 –> 00:01:22,880
the underlying system has to behave like financial reporting
38
00:01:22,880 –> 00:01:26,160
systems, not in theme, in mechanics.
39
00:01:26,160 –> 00:01:28,800
So the anatomy starts with a control system model.
40
00:01:28,800 –> 00:01:31,840
Inputs, transformations, outputs, and attestations.
41
00:01:31,840 –> 00:01:33,520
Inputs are the operational facts.
42
00:01:33,520 –> 00:01:36,000
Energy consumption, fuel use, travel, procurement
43
00:01:36,000 –> 00:01:39,000
line items, workforce counts, water usage.
44
00:01:39,000 –> 00:01:40,840
Transformations are the governed processes
45
00:01:40,840 –> 00:01:43,920
that normalize units, map to organizational structure,
46
00:01:43,920 –> 00:01:46,240
apply emission factors, and compute KPIs.
47
00:01:46,240 –> 00:01:49,440
Outputs are the period-specific KPI tables and disclosures.
48
00:01:49,440 –> 00:01:52,600
Attestations are the approvals, sign-offs, and audit artifacts
49
00:01:52,600 –> 00:01:54,840
that prove the outputs were produced under control.
50
00:01:54,840 –> 00:01:57,360
Most organizations skip straight to outputs.
51
00:01:57,360 –> 00:01:59,480
They build dashboards, they produce slides,
52
00:01:59,480 –> 00:02:02,560
they call it reporting, but they never build the chain of custody.
53
00:02:02,560 –> 00:02:05,280
Chain of custody is the real product, not the pretty chart,
54
00:02:05,280 –> 00:02:07,080
because assurance doesn’t audit your chart.
55
00:02:07,080 –> 00:02:10,080
It audits whether the number behind the chart is defensible,
56
00:02:10,080 –> 00:02:12,160
where it came from, who touched it, how it changed,
57
00:02:12,160 –> 00:02:14,240
which logic produced it, and whether anyone could have
58
00:02:14,240 –> 00:02:16,120
quietly altered it after close.
59
00:02:16,120 –> 00:02:19,360
This is where deterministic versus probabilistic ESG shows up.
60
00:02:19,360 –> 00:02:21,760
Deterministic ESG is boring and boring is good.
61
00:02:21,760 –> 00:02:24,280
Given the same raw inputs, the same factor versions,
62
00:02:24,280 –> 00:02:26,560
and the same calculation logic, the system produces
63
00:02:26,560 –> 00:02:28,080
the same outputs every time.
64
00:02:28,080 –> 00:02:31,120
Re-run last year in two years, and you get the same result.
65
00:02:31,120 –> 00:02:33,840
That’s what auditors expect, even if they don’t say the word,
66
00:02:33,840 –> 00:02:35,040
deterministic.
67
00:02:35,040 –> 00:02:37,640
Probabilistic ESG is what you get when human edits
68
00:02:37,640 –> 00:02:39,960
are allowed to masquerade as process.
69
00:02:39,960 –> 00:02:42,360
Numbers drift because someone fixed a file,
70
00:02:42,360 –> 00:02:45,840
optimized a model, or updated a mapping.
71
00:02:45,840 –> 00:02:47,760
The system still produces an output,
72
00:02:47,760 –> 00:02:49,680
but it can’t reproduce its own past.
73
00:02:49,680 –> 00:02:50,720
It can’t explain itself.
74
00:02:50,720 –> 00:02:52,080
It can’t prove integrity.
75
00:02:52,080 –> 00:02:54,440
And once you can’t reproduce, you can’t assure.
76
00:02:54,440 –> 00:02:55,840
Here’s the thing most people miss.
77
00:02:55,840 –> 00:02:57,360
Auditors don’t need perfection.
78
00:02:57,360 –> 00:02:58,640
They need controllability.
79
00:02:58,640 –> 00:03:00,840
They need you to show that changes are visible,
80
00:03:00,840 –> 00:03:02,520
bounded, approved, and attributable.
81
00:03:02,520 –> 00:03:05,120
When you can’t do that, every number becomes a debate.
82
00:03:05,120 –> 00:03:06,360
And debates are expensive.
83
00:03:06,360 –> 00:03:09,600
So let’s talk about the audit questions that break weak stacks.
84
00:03:09,600 –> 00:03:12,320
Not because auditors are evil, because this is their job.
85
00:03:12,320 –> 00:03:13,920
Who changed it? When did they change it?
86
00:03:13,920 –> 00:03:15,080
Why did they change it?
87
00:03:15,080 –> 00:03:16,360
What approval existed?
88
00:03:16,360 –> 00:03:18,240
What version of the factors did you use?
89
00:03:18,240 –> 00:03:20,120
What version of the calculation logic did you use?
90
00:03:20,120 –> 00:03:22,280
What inputs were in scope for the period close?
91
00:03:22,280 –> 00:03:24,560
Who had access to alter raw data, curated data,
92
00:03:24,560 –> 00:03:25,480
and reported outputs?
93
00:03:25,480 –> 00:03:28,240
Can you show lineage from the KPI back to the source record
94
00:03:28,240 –> 00:03:29,840
without reconstructing a PowerPoint?
95
00:03:29,840 –> 00:03:31,640
If your answer to any of these is “we think,”
96
00:03:31,640 –> 00:03:33,120
you don’t have OESG.
97
00:03:33,120 –> 00:03:34,240
You have a narrative.
98
00:03:34,240 –> 00:03:36,080
Now here’s where it gets uncomfortable.
99
00:03:36,080 –> 00:03:38,720
Most ESG programs treat the sustainability team
100
00:03:38,720 –> 00:03:40,040
like the owner of truth.
101
00:03:40,040 –> 00:03:41,720
But systems don’t care about job titles,
102
00:03:41,720 –> 00:03:44,240
systems care about permissions and pathways.
103
00:03:44,240 –> 00:03:47,160
If a single person can both submit data and adjust
104
00:03:47,160 –> 00:03:48,920
the calculation and publish the dashboard,
105
00:03:48,920 –> 00:03:50,000
you don’t have governance.
106
00:03:50,000 –> 00:03:52,160
You have conditional chaos.
107
00:03:52,160 –> 00:03:53,680
And it accumulates.
108
00:03:53,680 –> 00:03:56,000
Every exception becomes an entropy generator.
109
00:03:56,000 –> 00:03:57,840
One more undocumented pathway for numbers
110
00:03:57,840 –> 00:03:59,920
to change without leaving a clean trail.
111
00:03:59,920 –> 00:04:01,880
This clicked for a lot of teams when
112
00:04:01,880 –> 00:04:03,960
assurance started asking for evidence packs,
113
00:04:03,960 –> 00:04:06,960
not just the final KPI, but the supporting documents,
114
00:04:06,960 –> 00:04:09,720
the factor library provenance, the ingestion logs,
115
00:04:09,720 –> 00:04:12,200
the validation results, and the approvals.
116
00:04:12,200 –> 00:04:14,360
Suddenly the ESG report wasn’t the deliverable.
117
00:04:14,360 –> 00:04:17,000
The deliverable was the ability to prove the report.
118
00:04:17,000 –> 00:04:18,760
So the architecture has a simple objective:
119
00:04:18,760 –> 00:04:20,720
enforce chain of custody at scale.
120
00:04:20,720 –> 00:04:23,080
That means every ESG number has to be traceable
121
00:04:23,080 –> 00:04:24,720
through four properties.
122
00:04:24,720 –> 00:04:25,560
Origin.
123
00:04:25,560 –> 00:04:27,560
The system can identify the source record
124
00:04:27,560 –> 00:04:28,800
and the source system.
125
00:04:28,800 –> 00:04:29,640
Transformation.
126
00:04:29,640 –> 00:04:31,880
The system can show which pipeline and which logic
127
00:04:31,880 –> 00:04:33,560
produced the derived record.
128
00:04:33,560 –> 00:04:34,400
Integrity.
129
00:04:34,400 –> 00:04:37,560
The system can show the data wasn’t overwritten post-close.
130
00:04:37,560 –> 00:04:40,600
And changes are recorded as new versions or adjustments,
131
00:04:40,600 –> 00:04:42,280
not silent edits.
132
00:04:42,280 –> 00:04:45,280
Access. The system can show who could touch what and when.
133
00:04:45,280 –> 00:04:47,440
Once you accept ESG as a system of record,
134
00:04:47,440 –> 00:04:48,920
everything else becomes obvious.
135
00:04:48,920 –> 00:04:51,600
Dashboards become presentation, not computation.
136
00:04:51,600 –> 00:04:53,520
Spreadsheets become controlled submissions,
137
00:04:53,520 –> 00:04:55,040
not a source of truth.
138
00:04:55,040 –> 00:04:57,840
One-off fixes become formal adjustments with approvals.
139
00:04:57,840 –> 00:05:00,000
And every component you choose in Microsoft Cloud
140
00:05:00,000 –> 00:05:02,760
starts mapping to a property the auditor will eventually
141
00:05:02,760 –> 00:05:03,280
demand.
142
00:05:03,280 –> 00:05:05,560
Now before we go further, you need a working definition
143
00:05:05,560 –> 00:05:09,320
of auditable in system terms, because it’s not a checkbox
144
00:05:09,320 –> 00:05:11,440
and it’s definitely not a screenshot.
145
00:05:11,440 –> 00:05:12,400
That comes next.
146
00:05:12,400 –> 00:05:15,040
The audit-grade requirements: immutability, reproducibility,
147
00:05:15,040 –> 00:05:16,800
lineage, separation of duties.
148
00:05:16,800 –> 00:05:19,400
So what does auditable mean when it stops being a vibe
149
00:05:19,400 –> 00:05:21,400
and starts being a system property?
150
00:05:21,400 –> 00:05:22,960
It collapses into four requirements,
151
00:05:22,960 –> 00:05:25,600
not because Microsoft says so, because auditors behave
152
00:05:25,600 –> 00:05:28,280
predictably and systems either withstand that pressure
153
00:05:28,280 –> 00:05:29,360
or they don’t.
154
00:05:29,360 –> 00:05:32,120
Immutability, reproducibility, lineage,
155
00:05:32,120 –> 00:05:33,840
and separation of duties.
156
00:05:33,840 –> 00:05:35,960
First, immutability.
157
00:05:35,960 –> 00:05:38,840
Immutability is not “we promise we won’t change it.”
158
00:05:38,840 –> 00:05:41,640
Immutability is “the platform will not let you change it.”
159
00:05:41,640 –> 00:05:42,960
That’s the entire point.
160
00:05:42,960 –> 00:05:45,640
On Microsoft Cloud, that shows up as write-once,
161
00:05:45,640 –> 00:05:47,800
read-many behavior on Azure Blob Storage
162
00:05:47,800 –> 00:05:51,240
or ADLS Gen 2 through immutable storage policies.
163
00:05:51,240 –> 00:05:54,080
After period close, your raw evidence and your period outputs
164
00:05:54,080 –> 00:05:56,960
have to stop being mutable objects and become records.
165
00:05:56,960 –> 00:05:59,200
That distinction matters because most ESG programs
166
00:05:59,200 –> 00:06:01,240
close a period socially, not technically.
167
00:06:01,240 –> 00:06:03,320
People agree it’s closed, but the storage layer
168
00:06:03,320 –> 00:06:04,520
still allows overwrites.
169
00:06:04,520 –> 00:06:06,800
So someone fixes a typo, reruns a pipeline,
170
00:06:06,800 –> 00:06:09,040
uploads a corrected file, and now the evidence
171
00:06:09,040 –> 00:06:10,840
for the closed period silently changes.
172
00:06:10,840 –> 00:06:12,560
Your process still feels controlled,
173
00:06:12,560 –> 00:06:14,120
but the system behavior is not.
174
00:06:14,120 –> 00:06:15,560
Auditors don’t audit feelings.
175
00:06:15,560 –> 00:06:17,560
They audit whether changes were possible.
176
00:06:17,560 –> 00:06:20,080
Time-based retention is the operational version
177
00:06:20,080 –> 00:06:21,440
of immutability.
178
00:06:21,440 –> 00:06:23,320
You lock data for a defined interval,
179
00:06:23,320 –> 00:06:25,320
so it can’t be modified or deleted.
180
00:06:25,320 –> 00:06:27,400
Legal hold is the litigation version.
181
00:06:27,400 –> 00:06:30,120
It stays locked until someone with authority clears it.
182
00:06:30,120 –> 00:06:31,760
The consequence is the same either way.
183
00:06:31,760 –> 00:06:33,760
Overwrites become illegal.
184
00:06:33,760 –> 00:06:37,080
Which means your pipeline design has to evolve from replace
185
00:06:37,080 –> 00:06:38,640
to publish a new version.
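A rough sketch of what “publish a new version” can look like in pipeline code, as opposed to “replace.” It uses a local folder and Python's exclusive-create file mode as a stand-in for a write-once store; the function name `publish_version`, the folder layout, and the `.vNNNN` suffix are assumptions for illustration, not a Microsoft API.

```python
from pathlib import Path


def publish_version(zone: Path, period: str, name: str, payload: bytes) -> Path:
    """Write payload as the next version of an artifact; never overwrite."""
    period_dir = zone / period
    period_dir.mkdir(parents=True, exist_ok=True)
    # Count the versions already published to pick the next number.
    existing = sorted(period_dir.glob(f"{name}.v*"))
    target = period_dir / f"{name}.v{len(existing) + 1:04d}"
    # Mode "xb" is exclusive creation: it raises FileExistsError if the
    # target already exists, which mimics write-once behavior locally.
    with open(target, "xb") as handle:
        handle.write(payload)
    return target
```

A corrected file then lands next to the original as a higher version, and the earlier evidence stays exactly as it was.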
186
00:06:38,640 –> 00:06:40,640
Second, reproducibility.
187
00:06:40,640 –> 00:06:44,280
Reproducibility is the ability to rerun FY in FY+2
188
00:06:44,280 –> 00:06:45,480
and get the same result.
189
00:06:45,480 –> 00:06:47,560
Not similar, not close.
190
00:06:47,560 –> 00:06:48,640
The same.
191
00:06:48,640 –> 00:06:51,200
That means three things must be frozen per period.
192
00:06:51,200 –> 00:06:53,560
Inputs, factors, and logic.
193
00:06:53,560 –> 00:06:55,040
Most people only freeze inputs,
194
00:06:55,040 –> 00:06:56,680
and even that is usually wishful thinking.
195
00:06:56,680 –> 00:06:58,920
The system needs to freeze the factor library versions
196
00:06:58,920 –> 00:07:01,360
used for that period and freeze the calculation artifacts
197
00:07:01,360 –> 00:07:02,200
that reference them.
198
00:07:02,200 –> 00:07:04,880
If you rerun with latest factors or latest code,
199
00:07:04,880 –> 00:07:06,320
you’re not reproducing history.
200
00:07:06,320 –> 00:07:07,040
You’re rewriting it.
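One way to picture “inputs, factors, and logic frozen per period” is a run fingerprint plus a factor lookup pinned by version. This is a minimal sketch: `FACTOR_LIBRARY`, the version labels, and the factor values are made-up placeholders (not real emission factors), and the function names are hypothetical.

```python
import hashlib
import json

# Hypothetical factor library keyed by version; values are placeholders.
FACTOR_LIBRARY = {
    "2024.1": {"diesel_l": 2.0, "grid_kwh": 0.5},
    "2025.1": {"diesel_l": 2.1, "grid_kwh": 0.4},
}


def run_fingerprint(inputs, factor_version, logic_version):
    """Hash everything that determines the result of a period run."""
    manifest = {
        # Sort records so input order does not change the fingerprint.
        "inputs": sorted(inputs, key=lambda r: json.dumps(r, sort_keys=True)),
        "factor_version": factor_version,
        "logic_version": logic_version,
    }
    encoded = json.dumps(manifest, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def emissions_kg(inputs, factors):
    """Activity amount times the pinned factor, summed over all records."""
    return sum(r["amount"] * factors[r["activity"]] for r in inputs)
```

Rerunning a closed period means calling `emissions_kg` with the factor version recorded for that period, never with whatever is latest, and checking that the fingerprint matches the one stored at close.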
201
00:07:07,040 –> 00:07:09,880
Reproducibility is why dashboard math is an audit trap.
202
00:07:09,880 –> 00:07:12,320
You can’t prove what logic produced last year’s number
203
00:07:12,320 –> 00:07:15,080
if the logic lives in a constantly edited semantic model.
204
00:07:15,080 –> 00:07:17,280
Even if the code is technically visible,
205
00:07:17,280 –> 00:07:19,200
it’s not governed like a calculation engine.
206
00:07:19,200 –> 00:07:21,120
There’s no concept of a period-bound release,
207
00:07:21,120 –> 00:07:23,120
an approved version, and a locked output.
208
00:07:23,120 –> 00:07:25,040
Auditors don’t need your DAX to be clever.
209
00:07:25,040 –> 00:07:26,560
They need it to stop moving.
210
00:07:26,560 –> 00:07:28,040
Third, lineage.
211
00:07:28,040 –> 00:07:31,280
Lineage is the answer to a single question
212
00:07:31,280 –> 00:07:32,880
that destroys weak stacks.
213
00:07:32,880 –> 00:07:34,440
Where did this number come from?
214
00:07:34,440 –> 00:07:36,840
Not philosophically, mechanically.
215
00:07:36,840 –> 00:07:39,680
Lineage is origin to transformation, to consumption,
216
00:07:39,680 –> 00:07:42,640
source system record, to ingested file or table
217
00:07:42,640 –> 00:07:45,080
through transformations, into curated models,
218
00:07:45,080 –> 00:07:48,680
into reported outputs, into the data set that Power BI reads.
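To make “mechanically” concrete, here is a small sketch that walks a lineage graph from a reported dataset back to its source records. The graph is a plain dictionary standing in for whatever a metadata catalog would hold; the node names and the `trace_to_sources` helper are hypothetical.

```python
# Hypothetical lineage: each node maps to the nodes it was derived from.
LINEAGE = {
    "powerbi.dataset": ["reported.kpi_fy"],
    "reported.kpi_fy": ["curated.emissions"],
    "curated.emissions": ["raw.fuel_file", "raw.meter_file"],
    "raw.fuel_file": ["erp.fuel_purchases"],
    "raw.meter_file": ["meters.site_a"],
}


def trace_to_sources(upstream, node):
    """Return the root sources reachable by walking upstream from node."""
    sources, stack, seen = [], [node], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parents = upstream.get(current, [])
        # A node with no recorded parents is treated as an origin.
        if not parents:
            sources.append(current)
        stack.extend(parents)
    return sorted(sources)
```

If the walk cannot reach a source system, that gap is the undocumented pathway an auditor will ask about.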
219
00:07:48,680 –> 00:07:51,200
If you can’t trace it quickly, you will trace it slowly.
220
00:07:51,200 –> 00:07:53,800
And slowly means meetings, screenshots,
221
00:07:53,800 –> 00:07:55,400
and spreadsheet archaeology.
222
00:07:55,400 –> 00:07:58,040
That is not assurance. That is theater.
223
00:07:58,040 –> 00:08:01,520
Microsoft Purview exists because human memory does not scale.
224
00:08:01,520 –> 00:08:03,280
It’s the metadata system that turns
225
00:08:03,280 –> 00:08:06,200
“we think this is how it flows” into “here is the graph.”
226
00:08:06,200 –> 00:08:08,360
It also becomes your change management weapon.
227
00:08:08,360 –> 00:08:10,520
Before you change a pipeline or a calculation,
228
00:08:10,520 –> 00:08:12,080
you can see downstream impact.
229
00:08:12,080 –> 00:08:14,440
Without lineage, every change is a blind deployment
230
00:08:14,440 –> 00:08:16,280
into your own reporting boundary.
231
00:08:16,280 –> 00:08:18,040
And yes, product capabilities evolve.
232
00:08:18,040 –> 00:08:18,880
That’s normal.
233
00:08:18,880 –> 00:08:20,040
Your requirement does not evolve.
234
00:08:20,040 –> 00:08:22,200
The requirement is explainability under pressure.
235
00:08:22,200 –> 00:08:24,360
Fourth, separation of duties.
236
00:08:24,360 –> 00:08:27,280
This one is where most ESG programs quietly fail
237
00:08:27,280 –> 00:08:28,640
because it’s inconvenient.
238
00:08:28,640 –> 00:08:29,960
But the logic is simple.
239
00:08:29,960 –> 00:08:32,880
The person who submits data cannot be the person who approves it,
240
00:08:32,880 –> 00:08:34,400
and the person who changes logic
241
00:08:34,400 –> 00:08:37,120
cannot be the person who publishes the reported outputs.
242
00:08:37,120 –> 00:08:39,080
You need role separation across data entry,
243
00:08:39,080 –> 00:08:41,440
validation, calculation, approval, and reporting.
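A separation-of-duties rule like this can be checked mechanically against exported role assignments. The sketch below assumes a simple mapping of identity to role names; the role labels, the `CONFLICTING_ROLES` pairs, and the `sod_violations` helper are illustrative assumptions, not an Entra ID API.

```python
# Pairs of roles that one identity should never hold together,
# taken from the two rules stated above.
CONFLICTING_ROLES = [
    ("submitter", "approver"),
    ("logic_editor", "publisher"),
]


def sod_violations(assignments):
    """List (identity, role_a, role_b) for every conflicting pair held."""
    violations = []
    for identity, roles in sorted(assignments.items()):
        for left, right in CONFLICTING_ROLES:
            if left in roles and right in roles:
                violations.append((identity, left, right))
    return violations
```

Run against a period's assignments, an empty result is the evidence; a “hero admin” holding every role shows up once per conflicting pair.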
244
00:08:41,440 –> 00:08:42,320
In Microsoft terms,
245
00:08:42,320 –> 00:08:43,960
Entra ID is not a diagram.
246
00:08:43,960 –> 00:08:45,600
It is the enforcement mechanism.
247
00:08:45,600 –> 00:08:48,040
Group membership, role assignments, access reviews,
248
00:08:48,040 –> 00:08:49,640
audit logs: these are evidence.
249
00:08:49,640 –> 00:08:51,120
And you don’t get evidence by saying
250
00:08:51,120 –> 00:08:53,320
only the sustainability team has access.
251
00:08:53,320 –> 00:08:55,520
You get evidence by proving which identities
252
00:08:55,520 –> 00:08:57,240
had which permissions during the period
253
00:08:57,240 –> 00:09:00,320
and showing that privileged access was bounded and reviewable.
254
00:09:00,320 –> 00:09:02,520
Most organizations end up with a hero admin
255
00:09:02,520 –> 00:09:03,560
because it’s faster.
256
00:09:03,560 –> 00:09:04,560
It is not governance.
257
00:09:04,560 –> 00:09:06,640
It is a single point of audit failure.
258
00:09:06,640 –> 00:09:08,800
So those four properties define your architecture.
259
00:09:08,800 –> 00:09:10,640
Immutability prevents silent edits,
260
00:09:10,640 –> 00:09:14,360
reproducibility prevents drift, lineage prevents archaeology.
261
00:09:14,360 –> 00:09:16,720
Separation of duties prevents conflict of interest
262
00:09:16,720 –> 00:09:17,640
and invisible power.
263
00:09:17,640 –> 00:09:18,480
And here’s the payoff.
264
00:09:18,480 –> 00:09:21,120
Once these exist, your ESG stack stops
265
00:09:21,120 –> 00:09:23,880
being a collection of tools and becomes a control plane.
266
00:09:23,880 –> 00:09:25,280
Now the uncomfortable part.
267
00:09:25,280 –> 00:09:29,000
These requirements map directly to specific Microsoft services.
268
00:09:29,000 –> 00:09:30,360
Some are non-negotiable.
269
00:09:30,360 –> 00:09:32,680
The rest are optional until scale and regulation
270
00:09:32,680 –> 00:09:34,240
make them mandatory.
271
00:09:34,240 –> 00:09:37,120
Microsoft stack map: non-negotiable versus optional.
272
00:09:37,120 –> 00:09:39,160
Now we map those four audit grade requirements
273
00:09:39,160 –> 00:09:41,360
to Microsoft services, not as a shopping list,
274
00:09:41,360 –> 00:09:42,680
as a chain of enforcement.
275
00:09:42,680 –> 00:09:44,640
Because the system doesn’t become auditable
276
00:09:44,640 –> 00:09:45,520
when you buy tools.
277
00:09:45,520 –> 00:09:47,760
It becomes auditable when every requirement
278
00:09:47,760 –> 00:09:50,360
has an implementation that removes human discretion.
279
00:09:50,360 –> 00:09:52,160
Start with the non-negotiables.
280
00:09:52,160 –> 00:09:54,760
These are the components auditors implicitly expect
281
00:09:54,760 –> 00:09:56,840
even if they never say Microsoft out loud.
282
00:09:56,840 –> 00:09:58,120
First, identity and access.
283
00:09:58,120 –> 00:10:01,040
Microsoft Entra ID. Entra is not single sign-on.
284
00:10:01,040 –> 00:10:03,360
Architecturally, it’s the distributed decision engine
285
00:10:03,360 –> 00:10:05,640
that decides who can submit, who can transform,
286
00:10:05,640 –> 00:10:07,440
who can approve and who can publish.
287
00:10:07,440 –> 00:10:09,640
And it produces logs. Logs are evidence.
288
00:10:09,640 –> 00:10:12,000
If role separation is one of your requirements,
289
00:10:12,000 –> 00:10:15,120
Entra is where it either happens or it doesn’t.
290
00:10:15,120 –> 00:10:17,240
Second, storage with immutability.
291
00:10:17,240 –> 00:10:19,000
Azure Data Lake Storage Gen 2,
292
00:10:19,000 –> 00:10:20,840
with immutable storage policies for the zones
293
00:10:20,840 –> 00:10:23,920
that become evidence: raw and period-closed reported outputs,
294
00:10:23,920 –> 00:10:26,800
plus any evidence vault you keep for supporting documents.
295
00:10:26,800 –> 00:10:29,160
This is the part everyone tries to negotiate away
296
00:10:29,160 –> 00:10:31,320
because it forces pipeline discipline.
297
00:10:31,320 –> 00:10:33,680
But WORM isn’t a feature, it’s a behavior change.
298
00:10:33,680 –> 00:10:37,400
Once you enable immutability, overwrites are no longer a quick fix.
299
00:10:37,400 –> 00:10:39,160
They are an audit event you can’t perform.
300
00:10:39,160 –> 00:10:40,720
That constraint is the entire point.
301
00:10:40,720 –> 00:10:42,360
Third, a governed calculation zone.
302
00:10:42,360 –> 00:10:44,680
Fabric Lakehouse or Azure Synapse Analytics.
303
00:10:44,680 –> 00:10:47,120
Pick one and treat it like an accounting engine.
304
00:10:47,120 –> 00:10:50,600
Versioned artifacts, controlled deployments, and period-bound releases.
305
00:10:50,600 –> 00:10:51,800
Your calculations need to live
306
00:10:51,800 –> 00:10:53,400
where they can be tested, reviewed
307
00:10:53,400 –> 00:10:55,760
and rerun against frozen inputs and frozen factors.
308
00:10:55,760 –> 00:10:58,360
If your KPI logic lives in Power BI measures,
309
00:10:58,360 –> 00:10:59,840
you didn’t build a calculation zone.
310
00:10:59,840 –> 00:11:02,200
You built a dashboard that quietly rewrites history.
311
00:11:02,200 –> 00:11:03,640
Fourth, governance
312
00:11:03,640 –> 00:11:05,640
and lineage: Microsoft Purview.
313
00:11:05,640 –> 00:11:08,520
Purview is the difference between
314
00:11:08,520 –> 00:11:12,080
“we can probably explain this” and “here is the lineage graph,
315
00:11:12,080 –> 00:11:14,760
here are the owners, here are the transformations.”
316
00:11:14,760 –> 00:11:17,920
Under assurance pressure, that difference becomes the whole game.
317
00:11:17,920 –> 00:11:20,840
Purview is also how you scale governance beyond tribal knowledge.
318
00:11:20,840 –> 00:11:22,720
People leave, your metadata can’t.
319
00:11:22,720 –> 00:11:24,640
Fifth, reporting as a thin layer.
320
00:11:24,640 –> 00:11:27,520
Power BI. Power BI is allowed. Power BI is useful.
321
00:11:27,520 –> 00:11:30,480
Power BI is also where most teams destroy auditability
322
00:11:30,480 –> 00:11:33,160
by turning the semantic model into the calculation engine.
323
00:11:33,160 –> 00:11:34,360
So the rule is brutal.
324
00:11:34,360 –> 00:11:37,200
Power BI consumes reported period-close tables.
325
00:11:37,200 –> 00:11:39,240
Measures are presentation and aggregation,
326
00:11:39,240 –> 00:11:40,560
not emissions accounting.
327
00:11:40,560 –> 00:11:42,800
You want auditors to argue about your visuals,
328
00:11:42,800 –> 00:11:43,600
not your logic.
329
00:11:43,600 –> 00:11:45,280
So that’s the non-negotiable baseline.
330
00:11:45,280 –> 00:11:47,840
Entra, ADLS Gen 2 with immutability,
331
00:11:47,840 –> 00:11:51,080
Fabric or Synapse for calculations, Purview for lineage,
332
00:11:51,080 –> 00:11:54,440
and Power BI as the last-mile presentation layer.
333
00:11:54,440 –> 00:11:57,640
Now the optional components. Optional does not mean irrelevant.
334
00:11:57,640 –> 00:12:01,600
It means not required until scale, regulation and complexity corner you.
335
00:12:02,680 –> 00:12:06,000
Microsoft Sustainability Manager sits in that category.
336
00:12:06,000 –> 00:12:08,680
It’s optional when you already have mature emissions logic,
337
00:12:08,680 –> 00:12:11,000
a controlled factor library, and the willpower
338
00:12:11,000 –> 00:12:13,640
to build transparent pipelines and models yourself.
339
00:12:13,640 –> 00:12:16,720
It becomes valuable when you need faster onboarding to frameworks,
340
00:12:16,720 –> 00:12:18,400
faster Scope 3 workflows,
341
00:12:18,400 –> 00:12:21,200
or you simply don’t have internal emissions domain depth.
342
00:12:21,200 –> 00:12:23,320
The platform has audit trail capabilities
343
00:12:23,320 –> 00:12:25,040
and data trail reporting features,
344
00:12:25,040 –> 00:12:26,800
but it doesn’t absolve you from architecture.
345
00:12:26,800 –> 00:12:29,160
If you treat it as a black box that spits out numbers,
346
00:12:29,160 –> 00:12:30,840
you’re just outsourcing your audit risk
347
00:12:30,840 –> 00:12:32,400
to a product configuration.
348
00:12:32,400 –> 00:12:34,480
Azure Data Factory is also optional,
349
00:12:34,480 –> 00:12:37,120
but only if your ingestion needs stay simple.
350
00:12:37,120 –> 00:12:40,000
If Fabric-native ingestion covers your source systems, fine.
351
00:12:40,000 –> 00:12:42,120
But when you have real ERP integration,
352
00:12:42,120 –> 00:12:45,160
IoT telemetry coordination, multi-step API dependencies
353
00:12:45,160 –> 00:12:46,760
and cross-system timing constraints,
354
00:12:46,760 –> 00:12:49,120
Data Factory becomes the orchestration layer
355
00:12:49,120 –> 00:12:51,240
that keeps ingestion deterministic.
356
00:12:51,240 –> 00:12:53,320
Just remember the immutability constraint.
357
00:12:53,320 –> 00:12:56,320
Data Factory pipelines that overwrite files collide
358
00:12:56,320 –> 00:12:59,120
with WORM policies and fail. That failure isn’t a bug.
359
00:12:59,120 –> 00:13:00,840
It’s your architecture revealing itself.
360
00:13:00,840 –> 00:13:03,680
Azure Machine Learning is optional in the purest sense.
361
00:13:03,680 –> 00:13:05,960
Use it for forecasting, anomaly detection
362
00:13:05,960 –> 00:13:07,200
and scenario modeling.
363
00:13:07,200 –> 00:13:08,520
Never for baseline numbers.
364
00:13:08,520 –> 00:13:11,000
Model outputs are estimates and estimates
365
00:13:11,000 –> 00:13:12,640
need labeling, provenance and governance
366
00:13:12,640 –> 00:13:13,680
like any other input.
367
00:13:13,680 –> 00:13:16,600
Otherwise, your AI insights become untraceable logic changes
368
00:13:16,600 –> 00:13:17,920
with a brand name.
369
00:13:17,920 –> 00:13:19,120
Here’s the short warning.
370
00:13:19,120 –> 00:13:21,720
Every optional tool becomes mandatory
371
00:13:21,720 –> 00:13:24,400
the moment you use it to create or modify numbers
372
00:13:24,400 –> 00:13:25,960
inside your reporting boundary.
373
00:13:25,960 –> 00:13:27,840
The system doesn’t care that it was a pilot.
374
00:13:27,840 –> 00:13:30,240
If it touched the number, it’s part of the evidence chain.
375
00:13:30,240 –> 00:13:31,640
So you now have the map.
376
00:13:31,640 –> 00:13:33,880
Non-negotiables enforce the four requirements.
377
00:13:33,880 –> 00:13:37,520
Optional tools add capability, but also add pathways for entropy.
378
00:13:37,520 –> 00:13:38,920
Next, you start at the edge,
379
00:13:38,920 –> 00:13:40,880
because the first place truth gets corrupted
380
00:13:40,880 –> 00:13:43,400
is always the first place data enters your system.
381
00:13:43,400 –> 00:13:46,800
Operational data sources: where OESG actually comes from.
382
00:13:46,800 –> 00:13:49,000
OESG doesn’t come from your sustainability team.
383
00:13:49,000 –> 00:13:51,800
That team usually collects it, begs for it, cleans it up
384
00:13:51,800 –> 00:13:52,840
and tries to defend it.
385
00:13:52,840 –> 00:13:55,720
But the data originates somewhere else.
386
00:13:55,720 –> 00:13:58,240
Operational systems that were never designed
387
00:13:58,240 –> 00:13:59,920
to be audited for carbon math.
388
00:13:59,920 –> 00:14:01,400
That’s the first architectural truth.
389
00:14:01,400 –> 00:14:03,880
If you don’t treat the source systems as part of the reporting
390
00:14:03,880 –> 00:14:06,440
boundary, you’ll spend your life explaining downstream numbers
391
00:14:06,440 –> 00:14:08,480
while upstream inputs keep changing.
392
00:14:08,480 –> 00:14:11,480
So let’s name the real sources and the real damage they can do.
393
00:14:11,480 –> 00:14:14,680
Start with ERP: SAP, Dynamics, whatever you’ve standardized
394
00:14:14,680 –> 00:14:18,080
on. ERP is where activity data becomes audit-friendly
395
00:14:18,080 –> 00:14:19,800
because it already has controls.
396
00:14:19,800 –> 00:14:22,240
Transactions, approvals, posting periods,
397
00:14:22,240 –> 00:14:24,360
master data, organizational structure.
398
00:14:24,360 –> 00:14:27,480
But the trap is that teams try to use the ERP outputs
399
00:14:27,480 –> 00:14:30,040
as already reported sustainability data.
400
00:14:30,040 –> 00:14:30,560
They shouldn’t.
401
00:14:30,560 –> 00:14:33,600
You want the activity data, fuel purchases, freight costs,
402
00:14:33,600 –> 00:14:36,400
inventory movement, utility invoices, travel expenses,
403
00:14:36,400 –> 00:14:37,800
procurement line items.
404
00:14:37,800 –> 00:14:40,160
The thing most people miss is that ERP is better
405
00:14:40,160 –> 00:14:42,480
as an evidence source than as a calculation engine.
406
00:14:42,480 –> 00:14:44,200
It’s good at capturing business events
407
00:14:44,200 –> 00:14:45,680
with identity and timestamps.
408
00:14:45,680 –> 00:14:48,240
It is not good at emissions factors, allocation logic
409
00:14:48,240 –> 00:14:50,600
or multi-scope reconciliation unless you deliberately
410
00:14:50,600 –> 00:14:51,280
build it that way.
411
00:14:51,280 –> 00:14:54,440
So ERP is a source of facts, not a source of finished ESG
412
00:14:54,440 –> 00:14:56,960
truth. Next, energy meters and IoT telemetry.
413
00:14:56,960 –> 00:14:59,680
This is where teams get excited about granularity
414
00:14:59,680 –> 00:15:01,760
and then quietly drown.
415
00:15:01,760 –> 00:15:05,000
Telemetry is high volume, high frequency and low forgiveness.
416
00:15:05,000 –> 00:15:07,600
You can collect a million readings and still fail assurance
417
00:15:07,600 –> 00:15:10,080
because you can’t explain context: which facility,
418
00:15:10,080 –> 00:15:12,240
which meter, what unit, which time zone,
419
00:15:12,240 –> 00:15:15,080
what calibration assumptions, what mapping from meter
420
00:15:15,080 –> 00:15:16,840
to asset to business unit.
421
00:15:16,840 –> 00:15:18,560
Telemetry without context is not data.
422
00:15:18,560 –> 00:15:20,360
It’s noise with audit liability.
423
00:15:20,360 –> 00:15:22,920
And because IoT pipelines often involve gateways,
424
00:15:22,920 –> 00:15:25,360
edge buffering, retries, and late-arriving events,
425
00:15:25,360 –> 00:15:26,840
you need to design for time.
426
00:15:26,840 –> 00:15:28,760
Event time versus ingestion time
427
00:15:28,760 –> 00:15:30,960
and what happens when the real reading shows up
428
00:15:30,960 –> 00:15:32,360
after period close.
429
00:15:32,360 –> 00:15:34,920
If you don’t decide that early, your close process becomes
430
00:15:34,920 –> 00:15:36,960
a permanent argument with your own sensors.
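One way to make that decision explicit is a small rule in code. This is a minimal sketch, not the episode's implementation: the Reading type, the classify function, and the three outcome labels are assumptions made up for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    meter_id: str
    event_time: datetime   # when the meter took the measurement
    ingested_at: datetime  # when the platform received it
    value_kwh: float


def classify(reading: Reading, period_start: datetime,
             period_end: datetime, closed_at: datetime) -> str:
    """Decide how a reading relates to a reporting period.

    Event time decides which period the reading belongs to.
    Ingestion time decides whether it made the close or has to be
    handled as a post-close adjustment.
    """
    if not (period_start <= reading.event_time < period_end):
        return "other_period"
    if reading.ingested_at <= closed_at:
        return "in_period"
    return "late_adjustment"


UTC = timezone.utc
q1_start = datetime(2024, 1, 1, tzinfo=UTC)
q1_end = datetime(2024, 4, 1, tzinfo=UTC)
q1_closed = datetime(2024, 4, 10, tzinfo=UTC)  # hypothetical close date

on_time = Reading("meter-001", datetime(2024, 3, 31, 23, 0, tzinfo=UTC),
                  datetime(2024, 4, 1, 0, 5, tzinfo=UTC), 12.5)
late = Reading("meter-001", datetime(2024, 3, 31, 23, 15, tzinfo=UTC),
               datetime(2024, 4, 12, 8, 0, tzinfo=UTC), 11.9)

print(classify(on_time, q1_start, q1_end, q1_closed))
print(classify(late, q1_start, q1_end, q1_closed))
```

The point of the sketch is only that both timestamps are stored and the rule is written down before the first close, so a late reading has a defined destination.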
431
00:15:36,960 –> 00:15:39,600
Now HR systems.
432
00:15:39,600 –> 00:15:42,160
Workforce metrics sound simple until you try
433
00:15:42,160 –> 00:15:43,800
to define them consistently.
434
00:15:43,800 –> 00:15:46,960
Headcount, turnover, diversity, health and safety incidents,
435
00:15:46,960 –> 00:15:49,000
training hours, these are all HR managed
436
00:15:49,000 –> 00:15:50,160
and they are sensitive.
437
00:15:50,160 –> 00:15:53,320
That creates two constraints, access control and aggregation.
438
00:15:53,320 –> 00:15:54,840
You don’t want raw employee records
439
00:15:54,840 –> 00:15:56,600
wandering into analytics workspaces
440
00:15:56,600 –> 00:15:58,080
because someone wanted a dashboard.
441
00:15:58,080 –> 00:16:01,320
For ESG, HR systems should feed controlled aggregates
442
00:16:01,320 –> 00:16:03,280
with documented definitions and a stable
443
00:16:03,280 –> 00:16:05,360
organizational hierarchy mapping.
444
00:16:05,360 –> 00:16:07,600
Otherwise you get denominator drift.
445
00:16:07,600 –> 00:16:09,200
The metric stays the same name,
446
00:16:09,200 –> 00:16:11,160
but the population changes silently.
447
00:16:11,160 –> 00:16:13,000
Auditors don’t need to see personal data.
448
00:16:13,000 –> 00:16:15,720
They need to see that the metric definition didn’t mutate.
449
00:16:15,720 –> 00:16:18,040
Then procurement and suppliers, which is where Scope 3
450
00:16:18,040 –> 00:16:20,680
stops being theory and becomes operational humiliation.
451
00:16:20,680 –> 00:16:23,680
Supplier data comes through surveys, portals, partner feeds,
452
00:16:23,680 –> 00:16:25,840
invoices and sometimes email attachments
453
00:16:25,840 –> 00:16:28,480
that should never be admitted into an evidence chain.
454
00:16:28,480 –> 00:16:29,840
The variability is the point.
455
00:16:29,840 –> 00:16:32,040
Suppliers don’t share the same systems,
456
00:16:32,040 –> 00:16:34,160
the same data quality or the same incentives.
457
00:16:34,160 –> 00:16:36,400
So you need to capture two things from day one.
458
00:16:36,400 –> 00:16:37,880
Coverage and confidence.
459
00:16:37,880 –> 00:16:39,760
What percentage of spend or categories
460
00:16:39,760 –> 00:16:42,760
have supplier provided data and what percentage is estimated?
461
00:16:42,760 –> 00:16:44,160
Those flags aren’t nice to have.
462
00:16:44,160 –> 00:16:46,120
They’re the only honest way to survive questions
463
00:16:46,120 –> 00:16:47,120
about completeness.
464
00:16:47,120 –> 00:16:49,200
And if you don’t store supplier submissions
465
00:16:49,200 –> 00:16:51,920
as evidence artifacts with identity, timestamps
466
00:16:51,920 –> 00:16:54,360
and versioning, you will not be able to prove what was known
467
00:16:54,360 –> 00:16:55,800
at the time of reporting.
468
00:16:55,800 –> 00:16:59,480
Now the radioactive source, spreadsheets and CSV files,
469
00:16:59,480 –> 00:17:00,320
they’re allowed.
470
00:17:00,320 –> 00:17:03,400
They’re also the birthplace of final_v7.csv,
471
00:17:03,400 –> 00:17:06,600
which is the universal symbol of uncontrolled modification.
472
00:17:06,600 –> 00:17:07,920
Spreadsheets are not evil.
473
00:17:07,920 –> 00:17:09,160
They’re just not control systems.
474
00:17:09,160 –> 00:17:11,520
They don’t preserve chain of custody by default
475
00:17:11,520 –> 00:17:13,400
and they make it trivial to change history
476
00:17:13,400 –> 00:17:14,800
without leaving a durable trail.
477
00:17:14,800 –> 00:17:17,800
So in an audit grade stack, spreadsheets are treated
478
00:17:17,800 –> 00:17:19,160
as controlled submissions.
479
00:17:19,160 –> 00:17:21,360
Metadata captured, schema validated,
480
00:17:21,360 –> 00:17:24,240
approvals recorded, and then the content gets ingested
481
00:17:24,240 –> 00:17:26,640
into the raw zone as append only evidence.
482
00:17:26,640 –> 00:17:29,160
The spreadsheet itself becomes supporting documentation
483
00:17:29,160 –> 00:17:31,720
in the evidence vault, not the system of record.
484
00:17:31,720 –> 00:17:32,680
Here’s the checkpoint.
485
00:17:32,680 –> 00:17:35,880
Every source system has its own native controls, gaps
486
00:17:35,880 –> 00:17:37,320
and failure modes.
487
00:17:37,320 –> 00:17:40,480
ERP brings structure, but tempts you with reported outputs.
488
00:17:40,480 –> 00:17:42,880
IoT brings volume, but lacks business context.
489
00:17:42,880 –> 00:17:45,720
HR brings sensitivity and definition drift.
490
00:17:45,720 –> 00:17:48,400
Suppliers bring variability and partial coverage.
491
00:17:48,400 –> 00:17:50,120
Spreadsheets bring speed and entropy.
492
00:17:50,120 –> 00:17:53,360
Once you accept that, the next step becomes obvious.
493
00:17:53,360 –> 00:17:55,120
Ingestion is where truth gets corrupted
494
00:17:55,120 –> 00:17:56,960
because ingestion is where humans still believe
495
00:17:56,960 –> 00:17:58,320
overwriting is a feature.
496
00:17:58,320 –> 00:18:00,040
Ingestion patterns: controlled pipelines
497
00:18:00,040 –> 00:18:02,120
versus human-driven upload rituals.
498
00:18:02,120 –> 00:18:04,840
Ingestion is where ESG dies in real life
499
00:18:04,840 –> 00:18:07,160
because ingestion is where teams still confuse
500
00:18:07,160 –> 00:18:09,480
getting data in with getting evidence in.
501
00:18:09,480 –> 00:18:10,640
Those are not the same.
502
00:18:10,640 –> 00:18:12,120
The design rule is simple.
503
00:18:12,120 –> 00:18:14,120
Ingestion must be append first.
504
00:18:14,120 –> 00:18:15,800
Overwrites are audit poison.
505
00:18:15,800 –> 00:18:18,560
The moment your process allows replace the file,
506
00:18:18,560 –> 00:18:20,600
you’ve created an invisible edit pathway
507
00:18:20,600 –> 00:18:22,120
inside your reporting boundary.
508
00:18:22,120 –> 00:18:24,240
And auditors don’t need to prove you used it.
509
00:18:24,240 –> 00:18:25,640
They only need to prove you could.
510
00:18:25,640 –> 00:18:28,520
Append first means every load becomes a new object
511
00:18:28,520 –> 00:18:31,400
or a new version with a load identifier that never repeats.
512
00:18:31,400 –> 00:18:33,800
If you want to correct something, you don’t edit history.
513
00:18:33,800 –> 00:18:36,400
You publish an adjustment with rationale and approval
514
00:18:36,400 –> 00:18:38,760
and you keep the original as evidence.
515
00:18:38,760 –> 00:18:40,160
Now here’s where most people mess up.
516
00:18:40,160 –> 00:18:42,280
They treat ingestion as a user interface problem.
517
00:18:42,280 –> 00:18:43,520
They build an upload folder.
518
00:18:43,520 –> 00:18:45,520
They write drop files here instructions.
519
00:18:45,520 –> 00:18:46,880
They call it a pipeline.
520
00:18:46,880 –> 00:18:49,320
Then the first late file arrives and someone
521
00:18:49,320 –> 00:18:50,960
overwrites the last one.
522
00:18:50,960 –> 00:18:53,880
Because the business wanted the dashboard to be right.
523
00:18:53,880 –> 00:18:55,200
That’s not ingestion.
524
00:18:55,200 –> 00:18:56,480
That’s ritual.
525
00:18:56,480 –> 00:18:59,800
A controlled ingestion pattern has three non-negotiable behaviors.
526
00:18:59,800 –> 00:19:02,560
Orchestrated movement, validation gates, and telemetry
527
00:19:02,560 –> 00:19:05,040
that can be handed to an auditor without translation.
528
00:19:05,040 –> 00:19:07,680
Let’s talk tooling because Microsoft gives you multiple ways
529
00:19:07,680 –> 00:19:10,480
to ingest and none of them magically make you auditable.
530
00:19:10,480 –> 00:19:12,480
Fabric native ingestion is convenient
531
00:19:12,480 –> 00:19:14,400
when your sources are straightforward.
532
00:19:14,400 –> 00:19:17,640
Files, tables, common connectors, predictable schedules,
533
00:19:17,640 –> 00:19:19,840
and you can keep the orchestration simple.
534
00:19:19,840 –> 00:19:21,400
The benefit is proximity.
535
00:19:21,400 –> 00:19:23,240
You’re already in the Lake House world.
536
00:19:23,240 –> 00:19:26,160
And you can land data close to where it will be processed.
537
00:19:26,160 –> 00:19:28,080
The failure mode is also proximity.
538
00:19:28,080 –> 00:19:30,920
Teams let convenience become a substitute for control
539
00:19:30,920 –> 00:19:33,800
and they stop capturing the metadata that proves what happened.
540
00:19:33,800 –> 00:19:36,400
Azure Data Factory exists for the unglamorous reality,
541
00:19:36,400 –> 00:19:39,120
complex ERP integration, IoT coordination,
542
00:19:39,120 –> 00:19:42,200
multi-step API polls, dependencies, retries, and sequencing
543
00:19:42,200 –> 00:19:44,320
that can’t be trusted to just run.
544
00:19:44,320 –> 00:19:47,800
It also has the operational surface area for governance.
545
00:19:47,800 –> 00:19:49,720
Parameterized pipelines, run history,
546
00:19:49,720 –> 00:19:51,320
integration runtime behavior,
547
00:19:51,320 –> 00:19:53,640
and consistent patterns across many sources.
548
00:19:53,640 –> 00:19:55,240
But the constraint is brutal.
549
00:19:55,240 –> 00:19:58,400
Immutability will punish sloppy ADF designs.
550
00:19:58,400 –> 00:20:01,440
When ADF tries to overwrite a file in an immutable container,
551
00:20:01,440 –> 00:20:02,440
it fails.
552
00:20:02,440 –> 00:20:03,440
That’s not Microsoft being difficult.
553
00:20:03,440 –> 00:20:05,480
That’s your system proving that it was designed
554
00:20:05,480 –> 00:20:07,000
to rewrite evidence.
555
00:20:07,000 –> 00:20:09,960
So the pattern becomes: write to a mutable staging area,
556
00:20:09,960 –> 00:20:13,120
validate, then publish into the immutable raw archive.
557
00:20:13,120 –> 00:20:17,240
Mutable staging, immutable archive: two zones, two behaviors.
558
00:20:17,240 –> 00:20:19,520
Now validation gates.
559
00:20:19,520 –> 00:20:22,480
This is the part that separates ingestion from data dumping.
560
00:20:22,480 –> 00:20:24,240
Every load needs to be validated before it
561
00:20:24,240 –> 00:20:26,720
becomes evidence inside your system of record.
562
00:20:26,720 –> 00:20:29,360
And validation isn’t just did the pipeline run.
563
00:20:29,360 –> 00:20:31,680
It’s, did the data meet minimum standards
564
00:20:31,680 –> 00:20:33,080
to be considered in scope?
565
00:20:33,080 –> 00:20:36,400
The practical gates are boring and therefore effective.
566
00:20:36,400 –> 00:20:40,440
Schema checks, column names, types, required fields,
567
00:20:40,440 –> 00:20:42,080
and allowed null behavior.
568
00:20:42,080 –> 00:20:45,440
Unit normalization, kWh versus MWh,
569
00:20:45,440 –> 00:20:47,960
liters versus cubic meters, distance units,
570
00:20:47,960 –> 00:20:52,720
currency, time zones, required dimensions, site, region, period,
571
00:20:52,720 –> 00:20:55,680
scope category, source system identifier.
572
00:20:55,680 –> 00:20:57,840
And the system must record the outcome.
573
00:20:57,840 –> 00:21:01,320
Pass, fail, quarantine, or accepted with exceptions.
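Those gates can be sketched in a few lines of Python. This is an illustration under assumptions, not a reference implementation: the field names, the TO_KWH table, and the rule that an unknown unit is quarantined while a negative value is accepted with an exception are all choices invented for the example.

```python
# Hypothetical required dimensions, following the list above.
REQUIRED_FIELDS = ("site", "region", "period", "scope_category",
                   "source_system", "value", "unit")
# Conversion factors to a single energy unit (kWh).
TO_KWH = {"kWh": 1.0, "MWh": 1000.0}


def validate_row(row: dict) -> dict:
    """Run one row through the gates and record the outcome."""
    missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
    if missing:
        return {"outcome": "fail",
                "issues": ["missing: " + ", ".join(missing)], "row": None}

    try:
        value = float(row["value"])
    except (TypeError, ValueError):
        return {"outcome": "fail",
                "issues": ["value is not numeric"], "row": None}

    factor = TO_KWH.get(row["unit"])
    if factor is None:
        # Unknown unit: hold the row for review instead of guessing.
        return {"outcome": "quarantine",
                "issues": ["unknown unit: " + str(row["unit"])], "row": None}

    # Keep the original value and unit next to the normalized one.
    normalized = dict(row, value_kwh=value * factor,
                      original_value=row["value"], original_unit=row["unit"])
    issues = []
    if value < 0:
        issues.append("negative value flagged for review")
    outcome = "accepted_with_exceptions" if issues else "pass"
    return {"outcome": outcome, "issues": issues, "row": normalized}


good = {"site": "S1", "region": "EU", "period": "2024-03",
        "scope_category": "scope2", "source_system": "erp",
        "value": "2.5", "unit": "MWh"}
print(validate_row(good)["outcome"])
print(validate_row(dict(good, unit="BTU"))["outcome"])
```

The same function would run for an API feed and for a CSV submission, which is the whole point: one gate, no privileged path.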
574
00:21:01,320 –> 00:21:03,800
This is where you stop pretending spreadsheets are harmless.
575
00:21:03,800 –> 00:21:06,200
A controlled CSV submission is allowed only
576
00:21:06,200 –> 00:21:09,240
if it passes the same validation gates as an API feed.
577
00:21:09,240 –> 00:21:11,000
Otherwise, you’ve created a privileged path
578
00:21:11,000 –> 00:21:13,200
for human supplied nonsense to enter the raw zone.
579
00:21:13,200 –> 00:21:15,640
Now, log everything, not for observability dashboards,
580
00:21:15,640 –> 00:21:16,720
for chain of custody.
581
00:21:16,720 –> 00:21:18,800
Every load should produce a durable record
582
00:21:18,800 –> 00:21:22,480
that includes a load ID, source system, extract window,
583
00:21:22,480 –> 00:21:24,640
ingestion timestamp, submitter identity
584
00:21:24,640 –> 00:21:28,200
where applicable, file name or object path validation results,
585
00:21:28,200 –> 00:21:30,440
and the pipeline version that performed the load.
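As a rough sketch, that durable record could look like the following. The LoadRecord type and the record_load helper are hypothetical names. The content hash is an extra assumption added here for integrity checking, not something listed above.

```python
import hashlib
import json
import uuid
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)  # frozen: a record cannot be edited after creation
class LoadRecord:
    load_id: str
    source_system: str
    extract_window_start: str
    extract_window_end: str
    ingested_at: str
    submitter_id: Optional[str]  # authenticated identity, where applicable
    object_path: str
    validation_outcome: str
    pipeline_version: str
    content_sha256: str  # assumption: hash of the landed bytes


def record_load(content: bytes, *, source_system: str,
                extract_window: Tuple[str, str], object_path: str,
                validation_outcome: str, pipeline_version: str,
                submitter_id: Optional[str] = None) -> LoadRecord:
    """Build the chain-of-custody record for one load."""
    return LoadRecord(
        load_id=uuid.uuid4().hex,  # never repeats in practice
        source_system=source_system,
        extract_window_start=extract_window[0],
        extract_window_end=extract_window[1],
        ingested_at=datetime.now(timezone.utc).isoformat(),
        submitter_id=submitter_id,
        object_path=object_path,
        validation_outcome=validation_outcome,
        pipeline_version=pipeline_version,
        content_sha256=hashlib.sha256(content).hexdigest(),
    )


payload = b"site,kwh\nS1,120\n"
rec = record_load(payload, source_system="utility-portal",
                  extract_window=("2024-03-01", "2024-03-31"),
                  object_path="raw/utility-portal/2024-03/example.csv",
                  validation_outcome="pass", pipeline_version="1.4.0")
print(json.dumps(asdict(rec), indent=2))
```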
586
00:21:30,440 –> 00:21:32,200
This is where Entra shows up again.
587
00:21:32,200 –> 00:21:34,240
Submitter identity is not a name in an email.
588
00:21:34,240 –> 00:21:37,200
It’s an authenticated identity tied to the submission event.
589
00:21:37,200 –> 00:21:39,800
If you can’t attribute a submission to a real identity,
590
00:21:39,800 –> 00:21:41,840
you can’t prove separation of duties.
591
00:21:41,840 –> 00:21:44,000
And if you can’t prove separation of duties,
592
00:21:44,000 –> 00:21:47,080
you will eventually be asked why you believe your own data.
593
00:21:47,080 –> 00:21:49,640
There’s also a subtle requirement most teams miss.
594
00:21:49,640 –> 00:21:51,520
Ingestion needs to be replayable.
595
00:21:51,520 –> 00:21:54,120
When someone asks, what did we know on March 31st?
596
00:21:54,120 –> 00:21:56,400
You can’t respond with the current state of the lake.
597
00:21:56,400 –> 00:21:58,960
You need to be able to point to the exact load artifacts
598
00:21:58,960 –> 00:22:00,360
that were in scope at close.
599
00:22:00,360 –> 00:22:01,640
So ingestion isn’t a pipe.
600
00:22:01,640 –> 00:22:02,640
It’s a ledger.
601
00:22:02,640 –> 00:22:04,920
And once you build it that way, the downstream system
602
00:22:04,920 –> 00:22:05,800
gets easier.
603
00:22:05,800 –> 00:22:08,400
Curated models get cleaner inputs, calculations
604
00:22:08,400 –> 00:22:11,520
become stable and period close becomes an actual event,
605
00:22:11,520 –> 00:22:13,200
not a calendar reminder.
606
00:22:13,200 –> 00:22:16,080
Next, the data has to be stored like it will be subpoenaed
607
00:22:16,080 –> 00:22:17,440
because it can be.
608
00:22:17,440 –> 00:22:21,200
Storage anatomy: raw, curated, reported,
609
00:22:21,200 –> 00:22:22,640
plus an evidence vault.
610
00:22:22,640 –> 00:22:25,040
Storage is where good intentions go to die,
611
00:22:25,040 –> 00:22:27,640
because most teams store ESG data the same way
612
00:22:27,640 –> 00:22:29,280
they store project files.
613
00:22:29,280 –> 00:22:31,920
Whatever folder exists, whatever naming convention
614
00:22:31,920 –> 00:22:35,680
someone remembers, and whatever overwrites still work.
615
00:22:35,680 –> 00:22:37,160
That’s not storage architecture.
616
00:22:37,160 –> 00:22:39,240
That’s entropy management without the management.
617
00:22:39,240 –> 00:22:42,040
An auditable ESG stack needs storage anatomy.
618
00:22:42,040 –> 00:22:43,800
Distinct layers with distinct rules
619
00:22:43,800 –> 00:22:46,360
because different data states have different liabilities.
620
00:22:46,360 –> 00:22:48,280
You’re not organizing data for convenience.
621
00:22:48,280 –> 00:22:50,840
You’re organizing it so the system can prove what happened.
622
00:22:50,840 –> 00:22:53,760
So the baseline pattern is three zones, raw, curated,
623
00:22:53,760 –> 00:22:54,760
and reported.
624
00:22:54,760 –> 00:22:58,080
And then a fourth thing most stacks forget: an evidence vault.
625
00:22:58,080 –> 00:23:00,640
Raw is the closest to source landing zone.
626
00:23:00,640 –> 00:23:01,600
Append only.
627
00:23:01,600 –> 00:23:02,640
Minimal transformation.
628
00:23:02,640 –> 00:23:04,280
The goal is not usability.
629
00:23:04,280 –> 00:23:05,720
The goal is preservation.
630
00:23:05,720 –> 00:23:07,360
Raw data answers one question.
631
00:23:07,360 –> 00:23:09,680
What did we receive from where and when?
632
00:23:09,680 –> 00:23:12,320
That means raw objects need stable identifiers
633
00:23:12,320 –> 00:23:14,480
and immutable behavior after close.
634
00:23:14,480 –> 00:23:17,960
If you normalize units in raw, you’ve already destroyed provenance
635
00:23:17,960 –> 00:23:19,840
unless you also store the original.
636
00:23:19,840 –> 00:23:22,240
So raw keeps the original representation.
637
00:23:22,240 –> 00:23:24,600
The meter reading payload, the invoice extract,
638
00:23:24,600 –> 00:23:27,320
the supplier submission file, the export from ERP.
639
00:23:27,320 –> 00:23:29,040
You can add metadata alongside it.
640
00:23:29,040 –> 00:23:30,920
You do not fix it in place.
641
00:23:30,920 –> 00:23:32,920
Curated is where the data becomes usable.
642
00:23:32,920 –> 00:23:36,000
This is where you standardize, conform, and model.
643
00:23:36,000 –> 00:23:38,080
Curated data answers a different question.
644
00:23:38,080 –> 00:23:40,040
What does this mean in our organization?
645
00:23:40,040 –> 00:23:42,480
This is where you map source specific fields
646
00:23:42,480 –> 00:23:43,640
into a common schema.
647
00:23:43,640 –> 00:23:45,760
Standardize units, apply reference data
648
00:23:45,760 –> 00:23:48,800
like organizational hierarchies and attach quality flags.
649
00:23:48,800 –> 00:23:51,000
The curated zone is where you deal with the reality
650
00:23:51,000 –> 00:23:54,280
that one system calls it plant, another calls it site,
651
00:23:54,280 –> 00:23:56,120
and a third calls it location.
652
00:23:56,120 –> 00:23:58,160
And none of them agree on identifiers.
653
00:23:58,160 –> 00:24:01,360
You resolve that here explicitly in versioned transformations
654
00:24:01,360 –> 00:24:02,160
you can explain.
655
00:24:02,160 –> 00:24:05,200
Curated is also where you keep truth with scars.
656
00:24:05,200 –> 00:24:06,920
You don’t hide data quality issues.
657
00:24:06,920 –> 00:24:08,040
You mark them.
658
00:24:08,040 –> 00:24:12,000
Late-arriving data, missing dimensions, suspect values,
659
00:24:12,000 –> 00:24:13,840
all of that becomes flags and exceptions
660
00:24:13,840 –> 00:24:16,360
because clean data with no record of cleaning
661
00:24:16,360 –> 00:24:18,200
is just manipulated data.
662
00:24:18,200 –> 00:24:20,480
Reported is the period-closed output zone.
663
00:24:20,480 –> 00:24:22,320
This is where KPIs become records.
664
00:24:22,320 –> 00:24:24,600
Reported answers the only question assurance really
665
00:24:24,600 –> 00:24:27,120
cares about: what did you report for this period,
666
00:24:27,120 –> 00:24:29,480
under which logic using which inputs and factors.
667
00:24:29,480 –> 00:24:30,960
Reported data must be stable.
668
00:24:30,960 –> 00:24:33,920
Once the period closes, reported outputs do not change.
669
00:24:33,920 –> 00:24:35,160
If something needs correction,
670
00:24:35,160 –> 00:24:37,200
you don’t overwrite reported tables.
671
00:24:37,200 –> 00:24:39,320
You publish an adjustment entry with references,
672
00:24:39,320 –> 00:24:42,120
what changed, why, and which approval allowed it.
673
00:24:42,120 –> 00:24:43,600
That’s how financial systems work.
674
00:24:43,600 –> 00:24:45,360
And ESG doesn’t get a special exemption
675
00:24:45,360 –> 00:24:46,880
just because it feels newer.
676
00:24:46,880 –> 00:24:48,280
Now here’s the thing most people miss.
677
00:24:48,280 –> 00:24:50,600
These three zones are not only about data shape,
678
00:24:50,600 –> 00:24:52,200
they’re about access boundaries.
679
00:24:52,200 –> 00:24:54,840
Raw is restricted because it contains direct extracts
680
00:24:54,840 –> 00:24:56,520
and sometimes sensitive fields.
681
00:24:56,520 –> 00:24:57,960
Curated is restricted differently
682
00:24:57,960 –> 00:25:00,040
because it represents standardized enterprise data
683
00:25:00,040 –> 00:25:01,600
that can be widely misused.
684
00:25:01,600 –> 00:25:04,120
Reported is restricted because it’s the official record.
685
00:25:04,120 –> 00:25:06,040
Different audiences, different permissions,
686
00:25:06,040 –> 00:25:07,760
same Entra enforcement model.
687
00:25:07,760 –> 00:25:09,200
And then there’s the evidence vault.
688
00:25:09,200 –> 00:25:12,280
The evidence vault is not a folder called supporting docs.
689
00:25:12,280 –> 00:25:14,040
It’s a controlled repository for everything
690
00:25:14,040 –> 00:25:15,160
that proves the numbers.
691
00:25:15,160 –> 00:25:16,960
Supplier submissions, invoices,
692
00:25:16,960 –> 00:25:19,480
meter calibration records, calculation approvals,
693
00:25:19,480 –> 00:25:21,760
factor library provenance, mapping decisions,
694
00:25:21,760 –> 00:25:23,840
and period-close attestations.
695
00:25:23,840 –> 00:25:26,800
This vault matters because ESG is not purely quantitative.
696
00:25:26,800 –> 00:25:28,480
Even when the KPIs are numbers,
697
00:25:28,480 –> 00:25:30,880
the justification often involves documents.
698
00:25:30,880 –> 00:25:32,840
The vault is where you store those artifacts
699
00:25:32,840 –> 00:25:35,960
with the same chain of custody expectations as raw data,
700
00:25:35,960 –> 00:25:39,960
who submitted it, when, which KPI or period it supports,
701
00:25:39,960 –> 00:25:42,040
and whether it was locked after close.
702
00:25:42,040 –> 00:25:44,040
If the supporting evidence lives in Teams chats
703
00:25:44,040 –> 00:25:46,920
and someone’s mailbox, it doesn’t exist in audit terms.
704
00:25:46,920 –> 00:25:48,400
It exists as a future argument.
705
00:25:48,400 –> 00:25:51,080
Now naming and versioning, because mystery tables
706
00:25:51,080 –> 00:25:53,960
are an architectural failure, not a documentation failure,
707
00:25:53,960 –> 00:25:56,880
every object needs a predictable name that encodes,
708
00:25:56,880 –> 00:25:59,720
domain, source, period, and version.
709
00:25:59,720 –> 00:26:01,560
Not because auditors love naming conventions,
710
00:26:01,560 –> 00:26:04,120
but because humans do, you want an engineer to look at a path
711
00:26:04,120 –> 00:26:06,160
and understand whether it’s raw or reported,
712
00:26:06,160 –> 00:26:07,760
whether it’s preliminary or closed,
713
00:26:07,760 –> 00:26:09,600
and which period it belongs to.
714
00:26:09,600 –> 00:26:11,200
Versioning needs to be explicit.
715
00:26:11,200 –> 00:26:12,480
Latest is not a version.
716
00:26:12,480 –> 00:26:13,920
Final is not a version.
717
00:26:13,920 –> 00:26:15,560
Final_final_2 is a confession.
718
00:26:15,560 –> 00:26:17,960
So the storage anatomy creates a set of invariants.
719
00:26:17,960 –> 00:26:19,880
Raw preserves, curated standardizes,
720
00:26:19,880 –> 00:26:22,160
reported freezes, and the evidence vault proves.
721
00:26:22,160 –> 00:26:25,280
Once you have that, immutability stops being a storage checkbox
722
00:26:25,280 –> 00:26:27,520
and becomes a design constraint that your pipelines
723
00:26:27,520 –> 00:26:28,600
can actually survive.
724
00:26:28,600 –> 00:26:29,680
That’s next.
725
00:26:29,680 –> 00:26:31,400
Immutability, WORM.
726
00:26:31,400 –> 00:26:34,280
How to lock evidence without breaking your pipelines?
727
00:26:34,280 –> 00:26:36,080
Immutability is where good architecture
728
00:26:36,080 –> 00:26:38,400
stops being aspirational and starts being inconvenient,
729
00:26:38,400 –> 00:26:39,360
which is why it works.
730
00:26:39,360 –> 00:26:41,200
In Azure terms, this is write once,
731
00:26:41,200 –> 00:26:43,120
read many: immutable storage policies
732
00:26:43,120 –> 00:26:45,520
on blob storage or ADLS Gen2
733
00:26:45,520 –> 00:26:48,440
that prevent modification or deletion for a defined period.
734
00:26:48,440 –> 00:26:51,040
Time-based retention locks data for a set interval.
735
00:26:51,040 –> 00:26:54,760
Legal hold locks it until someone explicitly clears it.
736
00:26:54,760 –> 00:26:55,760
Different intent.
737
00:26:55,760 –> 00:26:56,600
Same effect.
738
00:26:56,600 –> 00:26:59,040
You can create and read, but you can’t rewrite the past.
739
00:26:59,040 –> 00:27:00,040
That’s the point.
740
00:27:00,040 –> 00:27:02,280
The mistake teams make is treating immutability
741
00:27:02,280 –> 00:27:05,000
as a storage toggle you enable later.
742
00:27:05,000 –> 00:27:07,240
But immutability isn’t a feature you add.
743
00:27:07,240 –> 00:27:09,840
It’s a constraint that changes pipeline behavior,
744
00:27:09,840 –> 00:27:13,280
deployment patterns, and how humans negotiate fixes.
745
00:27:13,280 –> 00:27:15,720
So let’s be explicit about what changes operationally
746
00:27:15,720 –> 00:27:17,920
the moment you lock a container.
747
00:27:17,920 –> 00:27:19,920
Overwrites become illegal.
748
00:27:19,920 –> 00:27:23,160
Re-run the job becomes, publish a new version.
749
00:27:23,160 –> 00:27:25,600
Re-run the job becomes, post an adjustment.
750
00:27:25,600 –> 00:27:27,720
And any pipeline that assumes it can land
751
00:27:27,720 –> 00:27:30,000
to the same path twice will fail loudly.
752
00:27:30,000 –> 00:27:31,880
Azure Storage will enforce the policy
753
00:27:31,880 –> 00:27:34,880
and your orchestration will surface it as an error.
754
00:27:34,880 –> 00:27:36,360
In Data Factory, you’ll see failures
755
00:27:36,360 –> 00:27:37,800
like path immutable due to policy
756
00:27:37,800 –> 00:27:39,680
when a copy activity attempts to overwrite
757
00:27:39,680 –> 00:27:41,160
or modify a protected path.
758
00:27:41,160 –> 00:27:42,880
That error is not a platform defect.
759
00:27:42,880 –> 00:27:45,640
It is the system preventing evidence tampering, accidental
760
00:27:45,640 –> 00:27:46,480
or otherwise.
761
00:27:46,480 –> 00:27:48,160
This is the foundational misunderstanding:
762
00:27:48,160 –> 00:27:50,600
people think immutability is about security.
763
00:27:50,600 –> 00:27:51,440
It’s not.
764
00:27:51,440 –> 00:27:52,280
It’s about time.
765
00:27:52,280 –> 00:27:54,520
Making period close real in the storage layer
766
00:27:54,520 –> 00:27:56,240
not just in a calendar invite.
767
00:27:56,240 –> 00:27:57,760
Now, there are two workable patterns
768
00:27:57,760 –> 00:28:00,040
that don’t destroy your operations.
769
00:28:00,040 –> 00:28:02,320
The first pattern is immutable by design.
770
00:28:02,320 –> 00:28:06,120
Every ingestion writes to a unique, never-reused object name,
771
00:28:06,120 –> 00:28:07,880
and you never need to overwrite anything.
772
00:28:07,880 –> 00:28:10,400
That means a path that includes a load identifier
773
00:28:10,400 –> 00:28:13,440
plus a deterministic partitioning scheme: source system,
774
00:28:13,440 –> 00:28:16,320
date, and maybe hour if you’re dealing with telemetry.
775
00:28:16,320 –> 00:28:18,520
Each run produces a new object set.
776
00:28:18,520 –> 00:28:20,840
If the data arrives late, it lands as a new object set
777
00:28:20,840 –> 00:28:22,440
with a later load ID.
778
00:28:22,440 –> 00:28:24,560
You can still compute the same reported outputs
779
00:28:24,560 –> 00:28:27,800
because your close process selects which load IDs are in scope.
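A path scheme like that can be sketched as below. The folder layout, the raw_object_path and publish names, and the in-memory dictionary standing in for an immutable container are all assumptions for illustration. Nothing here calls a real Azure API.

```python
from datetime import datetime, timezone


def raw_object_path(source_system: str, event_hour: datetime,
                    load_id: str, filename: str) -> str:
    """Build a never-reused raw-zone path: partitions plus load ID."""
    if event_hour.tzinfo is None:
        raise ValueError("event_hour must be timezone-aware")
    ts = event_hour.astimezone(timezone.utc)
    return (f"raw/source={source_system}/date={ts:%Y-%m-%d}/"
            f"hour={ts:%H}/load_id={load_id}/{filename}")


def publish(archive: dict, path: str, content: bytes) -> None:
    """Append-only write: refuse to land on an existing path."""
    if path in archive:
        raise FileExistsError(f"refusing to overwrite {path}")
    archive[path] = content


archive = {}  # stand-in for the immutable container
hour = datetime(2024, 3, 31, 23, tzinfo=timezone.utc)

first = raw_object_path("iot-gateway", hour, "load-0001", "readings.json")
late = raw_object_path("iot-gateway", hour, "load-0002", "readings.json")

publish(archive, first, b'{"kwh": 12.5}')
publish(archive, late, b'{"kwh": 11.9}')  # late data, new load ID, new path
print(first)
print(late)
```

Because the load ID is part of the path, a late file for the same hour never collides with the earlier one, and the close process can pick load IDs rather than paths that changed underneath it.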
780
00:28:27,800 –> 00:28:30,360
The second pattern is the one most organizations actually
781
00:28:30,360 –> 00:28:30,680
need.
782
00:28:30,680 –> 00:28:34,080
Mutable staging, validate, publish, immutable archive.
783
00:28:34,080 –> 00:28:35,120
Here’s how it works.
784
00:28:35,120 –> 00:28:38,160
You ingest into a staging area that is intentionally mutable.
785
00:28:38,160 –> 00:28:40,440
You can rerun pipelines there, fix mapping bugs,
786
00:28:40,440 –> 00:28:42,320
and iterate without fighting WORM.
787
00:28:42,320 –> 00:28:43,560
Then you run validation gates.
788
00:28:43,560 –> 00:28:45,480
And only after validation succeeds
789
00:28:45,480 –> 00:28:48,480
do you publish to the raw evidence zone, which is immutable.
790
00:28:48,480 –> 00:28:50,280
Publish is not copy and delete.
791
00:28:50,280 –> 00:28:52,160
Publish is a one-way promotion.
792
00:28:52,160 –> 00:28:55,080
New immutable objects written with a stable naming convention
793
00:28:55,080 –> 00:28:57,880
plus a metadata record that binds them to a load ID,
794
00:28:57,880 –> 00:29:01,360
pipeline version, submitter identity, and validation results.
795
00:29:01,360 –> 00:29:02,880
If you’re using Azure Data Factory,
796
00:29:02,880 –> 00:29:05,720
this pattern becomes mandatory in some transformation
797
00:29:05,720 –> 00:29:07,680
scenarios because certain activities rely
798
00:29:07,680 –> 00:29:10,360
on temporary files during processing.
799
00:29:10,360 –> 00:29:12,480
Immutability policies prevent those temporary writes
800
00:29:12,480 –> 00:29:13,480
and cleanup operations.
801
00:29:13,480 –> 00:29:15,600
So you write to non-immutable storage first,
802
00:29:15,600 –> 00:29:18,080
then use a copy activity to move the finalized outputs
803
00:29:18,080 –> 00:29:19,600
into the immutable container.
804
00:29:19,600 –> 00:29:23,200
Again, inconvenient, predictable, correct.
805
00:29:23,200 –> 00:29:26,760
Now the subtle part, immutability doesn’t just apply to raw,
806
00:29:26,760 –> 00:29:28,960
it applies to anything you will later claim as evidence.
807
00:29:28,960 –> 00:29:31,240
That includes factor libraries for a closed period,
808
00:29:31,240 –> 00:29:33,680
the period-closed reported KPI outputs,
809
00:29:33,680 –> 00:29:36,280
and the evidence vault documents that support disclosures.
810
00:29:36,280 –> 00:29:38,080
If those objects remain mutable,
811
00:29:38,080 –> 00:29:39,880
you can’t prove historical integrity.
812
00:29:39,880 –> 00:29:41,080
You can only promise it.
813
00:29:41,080 –> 00:29:43,000
Auditors don’t accept promises as controls.
814
00:29:43,000 –> 00:29:46,080
So you need a close process that includes a storage lock step.
815
00:29:46,080 –> 00:29:48,200
At period close, you freeze the selection of inputs,
816
00:29:48,200 –> 00:29:49,880
which load IDs are included.
817
00:29:49,880 –> 00:29:51,200
You freeze factor versions,
818
00:29:51,200 –> 00:29:53,360
you freeze the calculation logic reference,
819
00:29:53,360 –> 00:29:55,160
then you publish the reported outputs
820
00:29:55,160 –> 00:29:58,280
and apply immutability to the reported zone for that period.
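One way to picture the freeze is a close manifest, sketched here under assumptions. The build_close_manifest name, the field names, and the idea of hashing the manifest are illustrative choices, not part of the episode's stack. Applying the actual storage lock is left out.

```python
import hashlib
import json


def build_close_manifest(period: str, load_ids, factor_version: str,
                         logic_ref: str) -> dict:
    """Freeze what went into a period: inputs, factors, and logic."""
    manifest = {
        "period": period,
        "load_ids": sorted(load_ids),      # which loads are in scope
        "factor_version": factor_version,  # frozen factor library version
        "logic_ref": logic_ref,            # e.g. a release tag or commit
    }
    # Canonical JSON so the same selection always hashes the same way.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    manifest["manifest_sha256"] = hashlib.sha256(
        canonical.encode("utf-8")).hexdigest()
    return manifest


m1 = build_close_manifest("2024-Q1", ["load-0002", "load-0001"],
                          "factors-2024.1", "calc-release-3.2.0")
m2 = build_close_manifest("2024-Q1", ["load-0001", "load-0002"],
                          "factors-2024.1", "calc-release-3.2.0")
print(m1["manifest_sha256"] == m2["manifest_sha256"])
```

The manifest itself would then be written to the reported zone next to the outputs, so the lock covers both the numbers and the statement of what produced them.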
821
00:29:58,280 –> 00:30:00,320
You’re not locking the entire lake forever.
822
00:30:00,320 –> 00:30:02,000
You’re locking the slices that represent
823
00:30:02,000 –> 00:30:03,480
what we knew and reported.
824
00:30:03,480 –> 00:30:05,200
That distinction matters because you still need
825
00:30:05,200 –> 00:30:06,640
to operate next month.
826
00:30:06,640 –> 00:30:09,120
One more uncomfortable truth: immutability forces you
827
00:30:09,120 –> 00:30:10,280
to design for corrections.
828
00:30:10,280 –> 00:30:11,960
Corrections can’t be overwrites,
829
00:30:11,960 –> 00:30:13,880
so they become adjustment entries.
830
00:30:13,880 –> 00:30:16,200
Additive records that reference the original,
831
00:30:16,200 –> 00:30:18,480
carry a rationale and require approval.
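A minimal sketch of that adjustment pattern, with hypothetical names throughout (Adjustment, post_adjustment, effective_value): the original value is never touched, and the corrected figure is derived from the original plus the approved deltas.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Adjustment:
    adjustment_id: str
    original_record_id: str  # reference to the untouched original
    delta: float             # additive correction
    rationale: str
    approved_by: str         # authenticated approver identity


def post_adjustment(ledger: List[Adjustment], adj: Adjustment) -> None:
    """Append an adjustment; reject it without rationale or approval."""
    if not adj.rationale.strip() or not adj.approved_by.strip():
        raise ValueError("adjustment needs a rationale and an approver")
    ledger.append(adj)


def effective_value(original_value: float, record_id: str,
                    ledger: List[Adjustment]) -> float:
    """Original value plus every adjustment that references it."""
    return original_value + sum(
        a.delta for a in ledger if a.original_record_id == record_id)


ledger: List[Adjustment] = []
post_adjustment(ledger, Adjustment(
    "adj-001", "kpi-2024Q1-scope2", -20.0,
    "supplier resubmitted March invoice", "approver@example.com"))

print(effective_value(1200.0, "kpi-2024Q1-scope2", ledger))
```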
832
00:30:18,480 –> 00:30:19,640
If you do this well,
833
00:30:19,640 –> 00:30:22,480
you end up with something auditors understand immediately.
834
00:30:22,480 –> 00:30:24,400
The original evidence remains intact
835
00:30:24,400 –> 00:30:26,880
and the adjustment trail is visible and attributable.
836
00:30:26,880 –> 00:30:27,880
If you do this poorly,
837
00:30:27,880 –> 00:30:30,360
people will attempt to bypass the system.
838
00:30:30,360 –> 00:30:31,680
They’ll hunt for a mutable folder.
839
00:30:31,680 –> 00:30:33,120
They’ll ask for exceptions.
840
00:30:33,120 –> 00:30:34,440
They’ll demand admin rights.
841
00:30:34,440 –> 00:30:35,680
That’s not a people problem.
842
00:30:35,680 –> 00:30:37,440
That’s you failing to design the only thing
843
00:30:37,440 –> 00:30:38,800
that survives entropy,
844
00:30:38,800 –> 00:30:41,080
an architecture that makes the right behavior easier
845
00:30:41,080 –> 00:30:42,280
than the wrong one.
846
00:30:42,280 –> 00:30:43,920
Next up is the hard part.
847
00:30:43,920 –> 00:30:45,600
Calculations that don’t drift,
848
00:30:45,600 –> 00:30:48,720
even when everyone keeps improving the logic.
849
00:30:48,720 –> 00:30:50,360
The governed calculation zone,
850
00:30:50,360 –> 00:30:51,960
Fabric Lakehouse or Synapse,
851
00:30:51,960 –> 00:30:53,280
not dashboard math.
852
00:30:53,280 –> 00:30:55,600
This is where most ESG stacks quietly rot,
853
00:30:55,600 –> 00:30:56,680
the calculation layer,
854
00:30:56,680 –> 00:30:58,000
not because people can’t do math
855
00:30:58,000 –> 00:30:59,400
but because they put math in places
856
00:30:59,400 –> 00:31:01,800
that can’t be governed like an accounting system.
857
00:31:01,800 –> 00:31:03,520
Power BI is a presentation tool.
858
00:31:03,520 –> 00:31:06,120
It is not an audit-grade calculation engine.
859
00:31:06,120 –> 00:31:07,360
The moment your emissions logic
860
00:31:07,360 –> 00:31:09,320
lives primarily in DAX measures,
861
00:31:09,320 –> 00:31:11,080
you’ve made your numbers dependent on a file
862
00:31:11,080 –> 00:31:13,040
that changes whenever someone wants a new visual.
863
00:31:13,040 –> 00:31:14,120
That’s not control.
864
00:31:14,120 –> 00:31:16,280
That’s drift with a user interface.
865
00:31:16,280 –> 00:31:17,640
So the rule is blunt.
866
00:31:17,640 –> 00:31:19,640
Calculations live in a governed zone,
867
00:31:19,640 –> 00:31:22,640
Fabric Lakehouse or Azure Synapse Analytics.
868
00:31:22,640 –> 00:31:24,040
Pick one.
869
00:31:24,040 –> 00:31:26,440
Then treat it like a finance system.
870
00:31:26,440 –> 00:31:29,680
Versioned logic, controlled releases, testability,
871
00:31:29,680 –> 00:31:31,320
and reproducibility per period.
872
00:31:31,320 –> 00:31:34,080
Why this matters shows up the first time a stakeholder asks,
873
00:31:34,080 –> 00:31:36,120
“Why did last year’s number change?”
874
00:31:36,120 –> 00:31:38,920
And you discover the answer is “someone edited a measure.”
875
00:31:38,920 –> 00:31:40,680
That answer will not survive assurance.
876
00:31:40,680 –> 00:31:42,120
In a governed calculation zone,
877
00:31:42,120 –> 00:31:44,880
the primary artifacts are explicit and inspectable:
878
00:31:44,880 –> 00:31:47,320
SQL views, stored procedures, notebooks
879
00:31:47,320 –> 00:31:49,400
and tables that represent outputs.
880
00:31:49,400 –> 00:31:51,960
You choose the artifact type based on what you can
881
00:31:51,960 –> 00:31:54,280
govern consistently, not on what your favorite
882
00:31:54,280 –> 00:31:55,680
engineer likes this week.
883
00:31:55,680 –> 00:31:58,080
SQL views can be clean for transparency.
884
00:31:58,080 –> 00:32:01,320
The logic is readable, diffable and can be reviewed.
885
00:32:01,320 –> 00:32:03,840
Stored procedures can enforce parameterization
886
00:32:03,840 –> 00:32:06,120
and encapsulate controlled transformations,
887
00:32:06,120 –> 00:32:08,080
but they can also become opaque
888
00:32:08,080 –> 00:32:11,640
if people start hiding business logic inside procedural code.
889
00:32:11,640 –> 00:32:14,240
Notebooks are powerful for complex transformations
890
00:32:14,240 –> 00:32:17,800
and factor application, but they demand discipline: source
891
00:32:17,800 –> 00:32:21,280
control, approved releases and consistent execution
892
00:32:21,280 –> 00:32:22,360
environments.
893
00:32:22,360 –> 00:32:24,880
Choose one dominant pattern for KPI computation
894
00:32:24,880 –> 00:32:26,920
and enforce it, because mixed paradigms
895
00:32:26,920 –> 00:32:28,720
are how you lose reproducibility.
896
00:32:28,720 –> 00:32:30,920
And this is the checkpoint most people ignore:
897
00:32:30,920 –> 00:32:33,240
Unit consistency and dimensionality.
898
00:32:33,240 –> 00:32:36,120
Emissions calculations are not just multiplication.
899
00:32:36,120 –> 00:32:38,080
They are multiplication under constraints:
900
00:32:38,080 –> 00:32:40,880
Site, region, period, source system, scope category,
901
00:32:40,880 –> 00:32:43,080
activity type, unit and factor version.
902
00:32:43,080 –> 00:32:45,480
If any of those dimensions are missing or ambiguous,
903
00:32:45,480 –> 00:32:47,440
you will produce numbers that look plausible
904
00:32:47,440 –> 00:32:48,840
and fail under interrogation.
905
00:32:48,840 –> 00:32:51,320
So the governed zone has a job beyond computing.
906
00:32:51,320 –> 00:32:53,080
It enforces dimensional completeness.
907
00:32:53,080 –> 00:32:56,120
Every record must be joinable to organizational structure.
908
00:32:56,120 –> 00:32:59,400
Every activity record must carry a unit that can be normalized.
909
00:32:59,400 –> 00:33:02,200
Every computed record must carry the factor version key used,
910
00:33:02,200 –> 00:33:04,320
not “DEFRA,” not “EPA.”
911
00:33:04,320 –> 00:33:07,920
A version key that binds the output to a specific factor
912
00:33:07,920 –> 00:33:09,280
library snapshot.
913
00:33:09,280 –> 00:33:12,000
Now here’s where period close mechanics stop being a meeting
914
00:33:12,000 –> 00:33:13,720
and become an implementation.
915
00:33:13,720 –> 00:33:16,960
For a close to be auditable, three things must freeze together.
916
00:33:16,960 –> 00:33:18,960
Inputs, factors and logic reference.
917
00:33:18,960 –> 00:33:21,640
“Freeze inputs” means the system records which load IDs
918
00:33:21,640 –> 00:33:24,120
or partitions are in scope for the period.
919
00:33:24,120 –> 00:33:26,080
You don’t just have “March data.”
920
00:33:26,080 –> 00:33:28,920
You have these ingestion runs: validated, approved,
921
00:33:28,920 –> 00:33:29,880
included.
922
00:33:29,880 –> 00:33:32,200
Later arrivals don’t overwrite anything.
923
00:33:32,200 –> 00:33:33,880
They become late arrivals,
924
00:33:33,880 –> 00:33:36,520
explicitly excluded or treated as adjustments.
925
00:33:36,520 –> 00:33:38,160
Freeze factors means the factor library
926
00:33:38,160 –> 00:33:40,720
used for that period is published with a version key
927
00:33:40,720 –> 00:33:42,120
and then locked as evidence.
928
00:33:42,120 –> 00:33:44,880
If your calculation queries join to “latest,”
929
00:33:44,880 –> 00:33:45,880
you’ve already failed.
930
00:33:45,880 –> 00:33:47,360
You’re not calculating a period.
931
00:33:47,360 –> 00:33:49,600
You’re calculating today’s opinion of the past.
932
00:33:49,600 –> 00:33:52,080
Freeze logic reference means the exact calculation
933
00:33:52,080 –> 00:33:54,880
artifacts used are versioned and identifiable.
934
00:33:54,880 –> 00:33:56,880
A Git commit, a released notebook package,
935
00:33:56,880 –> 00:34:00,120
a view definition version, something durable.
936
00:34:00,120 –> 00:34:01,760
“The current notebook” is not a version.
937
00:34:01,760 –> 00:34:03,040
It’s a moving target.
938
00:34:03,040 –> 00:34:05,440
Once those three are frozen, the reported outputs
939
00:34:05,440 –> 00:34:07,040
can be generated deterministically
940
00:34:07,040 –> 00:34:08,760
and published into the reported zone.
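One way to make "three things freeze together" concrete is a close manifest. This is a sketch under assumptions: PeriodClose, require_in_scope, and the field names are invented for illustration, and the fingerprint is a plain SHA-256 over a canonical JSON form.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodClose:
    period: str
    load_ids: tuple              # ingestion runs in scope for the period
    factor_library_version: str  # published factor snapshot
    logic_ref: str               # e.g. a Git commit hash

    def fingerprint(self) -> str:
        # Canonical form: sorted load IDs and sorted keys, so the same
        # close always hashes to the same value.
        payload = json.dumps(
            {
                "period": self.period,
                "load_ids": sorted(self.load_ids),
                "factor_library_version": self.factor_library_version,
                "logic_ref": self.logic_ref,
            },
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def require_in_scope(close, requested_load_ids):
    """Refuse to compute from loads that were not part of the close."""
    extra = set(requested_load_ids) - set(close.load_ids)
    if extra:
        raise ValueError(f"load IDs not in the closed set: {sorted(extra)}")
```

Store the fingerprint next to the reported outputs. If inputs, factors, or logic reference differ on a rerun, the fingerprint differs, and you know you are no longer reproducing the close.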
941
00:34:08,760 –> 00:34:10,920
And Power BI consumes those outputs.
942
00:34:10,920 –> 00:34:11,720
That’s the boundary.
943
00:34:11,720 –> 00:34:15,160
Power BI doesn’t get to help by recomputing core emissions
944
00:34:15,160 –> 00:34:15,680
logic.
945
00:34:15,680 –> 00:34:18,160
Now, a common objection is, “But we need flexibility.”
946
00:34:18,160 –> 00:34:19,440
No, you need controlled change.
947
00:34:19,440 –> 00:34:20,320
You can change the model.
948
00:34:20,320 –> 00:34:21,400
You can improve mapping.
949
00:34:21,400 –> 00:34:22,520
You can add new factors.
950
00:34:22,520 –> 00:34:24,480
You can refine Scope 3 categories.
951
00:34:24,480 –> 00:34:27,360
But every change becomes a new version with a new effective date
952
00:34:27,360 –> 00:34:29,640
and a clear statement of what periods it impacts.
953
00:34:29,640 –> 00:34:30,720
That’s not bureaucracy.
954
00:34:30,720 –> 00:34:32,920
That’s how you stop rewriting history by accident.
955
00:34:32,920 –> 00:34:35,320
If you remember nothing else, the governed calculation zone
956
00:34:35,320 –> 00:34:37,400
is where ESG becomes deterministic.
957
00:34:37,400 –> 00:34:39,920
Dashboards are where ESG becomes arguable.
958
00:34:39,920 –> 00:34:42,360
And once you enforce deterministic computation,
959
00:34:42,360 –> 00:34:44,320
the next dependency becomes obvious.
960
00:34:44,320 –> 00:34:47,040
Emissions logic lives and dies on factor management.
961
00:34:47,040 –> 00:34:49,960
Emissions factors: versioning, or you’re rewriting history.
962
00:34:49,960 –> 00:34:52,080
Emissions factors are where most ESG stacks
963
00:34:52,080 –> 00:34:53,560
commit their quietest fraud.
964
00:34:53,560 –> 00:34:56,200
They treat reference data like a convenience file,
965
00:34:56,200 –> 00:34:58,400
a spreadsheet attachment, something you update
966
00:34:58,400 –> 00:34:59,880
when the new one comes out.
967
00:34:59,880 –> 00:35:02,280
That behavior rewrites history.
968
00:35:02,280 –> 00:35:04,440
Because an emission factor is not a number.
969
00:35:04,440 –> 00:35:07,080
It’s a controlled assumption that converts activity
970
00:35:07,080 –> 00:35:08,120
into emissions.
971
00:35:08,120 –> 00:35:10,800
Change the assumption and you change the outcome.
972
00:35:10,800 –> 00:35:13,440
Which means if you can’t prove which factor set applied
973
00:35:13,440 –> 00:35:16,160
to FY1, your FY1 numbers aren’t evidence.
974
00:35:16,160 –> 00:35:18,200
They’re a current interpretation of the past.
975
00:35:18,200 –> 00:35:20,360
Auditors don’t assure interpretations.
976
00:35:20,360 –> 00:35:22,200
They assure records.
977
00:35:22,200 –> 00:35:23,480
So the rule is simple.
978
00:35:23,480 –> 00:35:25,640
Emissions factors are controlled reference data
979
00:35:25,640 –> 00:35:28,160
with versioning, provenance and effective dates.
980
00:35:28,160 –> 00:35:30,920
And once a period closes, the specific factor set
981
00:35:30,920 –> 00:35:33,520
used for that period becomes immutable evidence.
982
00:35:33,520 –> 00:35:35,400
This is the part everyone tries to shortcut with,
983
00:35:35,400 –> 00:35:37,400
“we use DEFRA” or “we use EPA.”
984
00:35:37,400 –> 00:35:38,160
That’s not evidence.
985
00:35:38,160 –> 00:35:39,200
That’s a brand label.
986
00:35:39,200 –> 00:35:41,400
What matters is the specific library version,
987
00:35:41,400 –> 00:35:43,440
the effective date range, the geography mapping
988
00:35:43,440 –> 00:35:45,240
and the category mapping you applied.
989
00:35:45,240 –> 00:35:46,600
And here’s the thing most people miss.
990
00:35:46,600 –> 00:35:48,560
Factor management isn’t one table.
991
00:35:48,560 –> 00:35:49,600
It’s a small system.
992
00:35:49,600 –> 00:35:51,760
You need at least four concepts in your model.
993
00:35:51,760 –> 00:35:53,480
One, a factor library entity.
994
00:35:53,480 –> 00:35:56,360
This represents a published set you can refer to as a unit.
995
00:35:56,360 –> 00:35:58,840
It has a name, a publisher, a published date,
996
00:35:58,840 –> 00:36:01,600
and a status like draft, approved, and archived.
997
00:36:01,600 –> 00:36:05,160
Two, the factor records themselves: the actual conversion values
998
00:36:05,160 –> 00:36:07,360
with units, gas type where applicable,
999
00:36:07,360 –> 00:36:10,200
and any classification fields you rely on in joins.
1000
00:36:10,200 –> 00:36:14,760
Three, applicability metadata: geography, sector,
1001
00:36:14,760 –> 00:36:17,960
activity type mapping, and effective date range.
1002
00:36:17,960 –> 00:36:20,120
If a factor is only valid for a country
1003
00:36:20,120 –> 00:36:22,040
or only valid from a certain date,
1004
00:36:22,040 –> 00:36:24,200
the model needs to carry that explicitly.
1005
00:36:24,200 –> 00:36:26,880
Four, provenance artifacts: where it came from
1006
00:36:26,880 –> 00:36:28,400
and how it entered your system.
1007
00:36:28,400 –> 00:36:30,400
That can be a link to an evidence document
1008
00:36:30,400 –> 00:36:33,880
in your evidence vault or, at minimum, a stored reference
1009
00:36:33,880 –> 00:36:35,640
that can be produced during assurance.
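Those four concepts can be sketched as four small types. This is an illustrative shape only; the class and field names are assumptions, and in Fabric or Synapse they would be tables rather than Python classes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FactorLibrary:
    """One: a published set you can refer to as a unit."""
    library_version_id: str
    name: str
    publisher: str
    published_on: date
    status: str  # "draft", "approved", or "archived"


@dataclass(frozen=True)
class FactorRecord:
    """Two: the actual conversion value."""
    factor_id: str
    library_version_id: str
    value: float
    unit: str                 # e.g. "kgCO2e/kWh"
    gas_type: Optional[str]   # only where applicable


@dataclass(frozen=True)
class Applicability:
    """Three: where and when a factor is valid."""
    factor_id: str
    geography: str
    sector: str
    activity_type: str
    valid_from: date
    valid_to: date

    def covers(self, geography: str, activity_type: str, on: date) -> bool:
        return (self.geography == geography
                and self.activity_type == activity_type
                and self.valid_from <= on <= self.valid_to)


@dataclass(frozen=True)
class Provenance:
    """Four: where the library came from and how it entered the system."""
    library_version_id: str
    evidence_ref: str  # pointer into the evidence vault
    ingested_by: str
```

Keeping applicability separate from the value is the point: a factor that is only valid for one country or one date range says so in data, not in someone's memory.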
1010
00:36:35,640 –> 00:36:37,280
Now the failure mode is predictable.
1011
00:36:37,280 –> 00:36:40,160
Teams store factors in a table called emission factors
1012
00:36:40,160 –> 00:36:42,800
with a column called factor value and no version key.
1013
00:36:42,800 –> 00:36:45,080
Then calculations join to it using a natural key
1014
00:36:45,080 –> 00:36:48,840
like activity type and country, and they default to latest.
1015
00:36:48,840 –> 00:36:50,720
And it works until the factor table updates,
1016
00:36:50,720 –> 00:36:53,120
then rerunning last year produces different results.
1017
00:36:53,120 –> 00:36:54,600
And the team calls it an update.
1018
00:36:54,600 –> 00:36:55,680
It is not an update.
1019
00:36:55,680 –> 00:36:57,480
It is a restatement without governance.
1020
00:36:57,480 –> 00:36:59,680
So the enforcement pattern is also predictable:
1021
00:36:59,680 –> 00:37:01,440
Factor-to-period binding.
1022
00:37:01,440 –> 00:37:04,440
Every computed emission record must carry a factor version key,
1023
00:37:04,440 –> 00:37:06,480
not a textual label, a key that ties back
1024
00:37:06,480 –> 00:37:08,880
to a specific published library snapshot.
1025
00:37:08,880 –> 00:37:10,960
And your calculation logic must require it.
1026
00:37:10,960 –> 00:37:13,040
If the pipeline can run without specifying
1027
00:37:13,040 –> 00:37:14,800
the factor version, you’ve built a machine
1028
00:37:14,800 –> 00:37:16,440
that can rewrite its own past.
1029
00:37:16,440 –> 00:37:17,960
This is where systems beat policy.
1030
00:37:17,960 –> 00:37:19,960
Don’t tell people, “Don’t use latest.”
1031
00:37:19,960 –> 00:37:22,880
Make “latest” unusable in period-close processing.
1032
00:37:22,880 –> 00:37:24,520
Use it only in exploratory analysis
1033
00:37:24,520 –> 00:37:27,640
where you explicitly label the output as non-reportable.
1034
00:37:27,640 –> 00:37:29,200
Then you build the publish workflow.
1035
00:37:29,200 –> 00:37:32,440
Factors do not appear in production tables as ad hoc edits.
1036
00:37:32,440 –> 00:37:34,000
They move through a life cycle.
1037
00:37:34,000 –> 00:37:36,240
Draft factors exist in a working area.
1038
00:37:36,240 –> 00:37:38,360
Someone reviews them, someone approves them,
1039
00:37:38,360 –> 00:37:40,040
and then you publish a new library version.
1040
00:37:40,040 –> 00:37:41,440
Once it’s published, you lock it.
1041
00:37:41,440 –> 00:37:43,240
That’s where immutability enters again.
1042
00:37:43,240 –> 00:37:46,000
The published factor library for a period becomes evidence.
1043
00:37:46,000 –> 00:37:47,640
So it becomes WORM-protected.
1044
00:37:47,640 –> 00:37:49,400
And yes, you can still correct factor data.
1045
00:37:49,400 –> 00:37:51,440
You just can’t pretend it was always that way.
1046
00:37:51,440 –> 00:37:52,920
Corrections become a new version
1047
00:37:52,920 –> 00:37:56,600
with a clear statement of impact, which future periods use it,
1048
00:37:56,600 –> 00:37:59,480
and whether prior periods require an adjustment entry.
1049
00:37:59,480 –> 00:38:01,600
Now, how does this land in a Microsoft stack
1050
00:38:01,600 –> 00:38:03,880
without turning into another governance slide deck?
1051
00:38:03,880 –> 00:38:06,400
In Fabric or Synapse, you implement factor libraries
1052
00:38:06,400 –> 00:38:09,480
as tables with explicit version keys and effective dating.
1053
00:38:09,480 –> 00:38:11,680
In your calculation views or notebooks,
1054
00:38:11,680 –> 00:38:13,680
you join activity data to factors
1055
00:38:13,680 –> 00:38:17,080
using activity classification, geography, and period date.
1056
00:38:17,080 –> 00:38:18,600
But you don’t let the join float.
1057
00:38:18,600 –> 00:38:22,640
You require an input parameter for factor library version ID
1058
00:38:22,640 –> 00:38:24,400
when producing reported outputs.
1059
00:38:24,400 –> 00:38:26,680
Or you bind the version through a period
1060
00:38:26,680 –> 00:38:29,200
close configuration table that is itself locked
1061
00:38:29,200 –> 00:38:30,280
after close.
1062
00:38:30,280 –> 00:38:32,720
Either way, the output row carries the version ID.
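Here is a small sketch of a join that cannot float. It assumes plain dictionaries for activity and factor rows, and the function and field names are hypothetical. The shape to notice is that the version ID is a required argument and is stamped on every output row.

```python
def compute_emissions(activities, factors, factor_library_version_id):
    """Join activity rows to factors from exactly one library version."""
    if not factor_library_version_id:
        raise ValueError("factor_library_version_id is required for reported outputs")

    # Only factors from the requested version are visible to the join.
    lookup = {
        (f["activity_type"], f["country"]): f
        for f in factors
        if f["library_version_id"] == factor_library_version_id
    }
    if not lookup:
        raise ValueError(f"unknown factor library version: {factor_library_version_id}")

    outputs = []
    for a in activities:
        key = (a["activity_type"], a["country"])
        if key not in lookup:
            # Fail loudly instead of silently falling back to another factor.
            raise LookupError(f"no factor for {key} in {factor_library_version_id}")
        outputs.append({
            **a,
            "emissions_kgco2e": a["quantity"] * lookup[key]["value"],
            "factor_library_version_id": factor_library_version_id,
        })
    return outputs
```

Rerunning a closed period with the same version ID gives the same rows. Rerunning with a different one gives rows that say so.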
1063
00:38:32,720 –> 00:38:35,680
And you treat the factor library publish as a formal release.
1064
00:38:35,680 –> 00:38:37,200
It’s an artifact with approvals.
1065
00:38:37,200 –> 00:38:39,920
It’s registered in Purview, and it can be traced.
1066
00:38:39,920 –> 00:38:42,200
That’s how you answer the audit question in one sentence.
1067
00:38:42,200 –> 00:38:45,600
FY1 used factor library version X, published on Y,
1068
00:38:45,600 –> 00:38:47,520
approved by Z, and locked on close.
1069
00:38:47,520 –> 00:38:50,120
Without that, you’ll end up in the classic assurance failure.
1070
00:38:50,120 –> 00:38:52,360
Someone asks why FY1 changed,
1071
00:38:52,360 –> 00:38:55,520
and you respond with, “Because the factors were updated.”
1072
00:38:55,520 –> 00:38:58,000
That response admits you don’t have reproducibility.
1073
00:38:58,000 –> 00:38:59,480
And if you don’t have reproducibility,
1074
00:38:59,480 –> 00:39:01,080
you don’t have audit grade ESG.
1075
00:39:01,080 –> 00:39:02,600
Once factor versioning is real,
1076
00:39:02,600 –> 00:39:04,440
KPI modeling stops being guesswork
1077
00:39:04,440 –> 00:39:06,240
and starts being constrained engineering.
1078
00:39:06,240 –> 00:39:07,080
That’s next.
1079
00:39:07,080 –> 00:39:09,880
KPI modeling: Scope 1 to 3, energy, water,
1080
00:39:09,880 –> 00:39:12,120
workforce metrics, supplier coverage.
1081
00:39:12,120 –> 00:39:14,240
Once factors are versioned, KPI modeling stops
1082
00:39:14,240 –> 00:39:15,720
being a creative writing exercise
1083
00:39:15,720 –> 00:39:17,760
and becomes what it always should have been.
1084
00:39:17,760 –> 00:39:19,680
Constraints encoded as data.
1085
00:39:19,680 –> 00:39:21,960
Most ESG teams model KPIs like labels:
1086
00:39:21,960 –> 00:39:26,280
Scope 1, Scope 2, Scope 3, water, diversity, supplier coverage.
1087
00:39:26,280 –> 00:39:28,720
Then they build a dashboard and assume the definitions
1088
00:39:28,720 –> 00:39:31,040
will stay stable because everyone agreed.
1089
00:39:31,040 –> 00:39:31,720
They won’t.
1090
00:39:31,720 –> 00:39:33,680
So KPI modeling has one job.
1091
00:39:33,680 –> 00:39:36,280
Make the definition enforceable and make drift visible
1092
00:39:36,280 –> 00:39:37,680
when someone tries to change it.
1093
00:39:37,680 –> 00:39:39,280
Start with Scope 1, 2, and 3.
1094
00:39:39,280 –> 00:39:41,200
These aren’t tags you slap on at the end.
1095
00:39:41,200 –> 00:39:42,320
They’re structural constraints
1096
00:39:42,320 –> 00:39:45,520
that determine what data qualifies, which factors are valid,
1097
00:39:45,520 –> 00:39:46,880
and what boundaries apply.
1098
00:39:46,880 –> 00:39:50,520
Scope 1 is direct emissions from owned or controlled sources.
1099
00:39:50,520 –> 00:39:52,800
In system terms, Scope 1 activity records
1100
00:39:52,800 –> 00:39:54,520
must bind to assets you control.
1101
00:39:54,520 –> 00:39:57,480
Boilers, generators, company vehicles, refrigerants.
1102
00:39:57,480 –> 00:39:59,960
That means your data model needs an asset dimension
1103
00:39:59,960 –> 00:40:03,000
or at least an owned/controlled attribute you can prove,
1104
00:40:03,000 –> 00:40:04,040
not infer later.
1105
00:40:04,040 –> 00:40:06,760
If you can’t tie the activity record to the controlled
1106
00:40:06,760 –> 00:40:08,880
asset set that existed during the period,
1107
00:40:08,880 –> 00:40:10,200
you’re back to narrative.
1108
00:40:10,200 –> 00:40:14,080
Scope 2 is purchased electricity, heat, steam, and cooling.
1109
00:40:14,080 –> 00:40:17,000
In modeling terms, Scope 2 requires a clean separation
1110
00:40:17,000 –> 00:40:18,760
between consumption and factor application
1111
00:40:18,760 –> 00:40:22,000
because electricity data can arrive as meter readings,
1112
00:40:22,000 –> 00:40:23,960
invoices, or allocations.
1113
00:40:23,960 –> 00:40:26,480
Your model must preserve the original consumption units
1114
00:40:26,480 –> 00:40:28,000
and the conversion path.
1115
00:40:28,000 –> 00:40:31,440
And the output must carry which factor version applied,
1116
00:40:31,440 –> 00:40:33,720
plus the geography and supplier mapping
1117
00:40:33,720 –> 00:40:34,600
that justified it.
1118
00:40:34,600 –> 00:40:37,240
Otherwise, you’ll end up with global average factors
1119
00:40:37,240 –> 00:40:39,800
quietly covering gaps and then spend months pretending
1120
00:40:39,800 –> 00:40:40,960
it was intentional.
1121
00:40:40,960 –> 00:40:43,960
Scope 3 is where the system either becomes honest or collapses.
1122
00:40:43,960 –> 00:40:45,760
Scope 3 is a value-chain problem, which
1123
00:40:45,760 –> 00:40:48,400
means the model needs to handle mixed evidence quality:
1124
00:40:48,400 –> 00:40:51,080
supplier-provided data, spend-based estimates,
1125
00:40:51,080 –> 00:40:53,640
activity-based proxies, and hybrid methods.
1126
00:40:53,640 –> 00:40:55,880
The common failure is forcing all of that into one column
1127
00:40:55,880 –> 00:40:58,480
called emissions and calling it complete.
1128
00:40:58,480 –> 00:41:02,800
So the rule is: every Scope 3 KPI must carry two flags,
1129
00:41:02,800 –> 00:41:06,600
measured versus estimated, and coverage scope. Measured means
1130
00:41:06,600 –> 00:41:09,200
supplier-provided or directly sourced activity
1131
00:41:09,200 –> 00:41:10,760
with traceable factors.
1132
00:41:10,760 –> 00:41:14,160
Estimated means proxy logic with estimation factors
1133
00:41:14,160 –> 00:41:16,960
treated as controlled inputs just like emission factors.
1134
00:41:16,960 –> 00:41:19,920
Coverage scope means what part of the category this KPI
1135
00:41:19,920 –> 00:41:21,080
represents.
1136
00:41:21,080 –> 00:41:23,640
Percent of spend covered, percent of suppliers covered,
1137
00:41:23,640 –> 00:41:25,080
percent of sites covered.
1138
00:41:25,080 –> 00:41:27,480
Without those, your Scope 3 number is just a confidence
1139
00:41:27,480 –> 00:41:28,040
trick.
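As a sketch of the two-flag rule, the record below refuses to exist without a method flag and a coverage statement. The names (Method, Scope3Kpi, coverage_basis) and the tCO2e unit are assumptions for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Method(Enum):
    MEASURED = "measured"    # supplier-provided or directly sourced activity
    ESTIMATED = "estimated"  # proxy logic with controlled estimation factors


COVERAGE_BASES = {"spend", "suppliers", "sites"}


@dataclass(frozen=True)
class Scope3Kpi:
    category: str
    emissions_tco2e: float
    method: Method
    coverage_basis: str   # what the percentage is a percentage of
    coverage_pct: float   # 0 to 100

    def __post_init__(self):
        if not isinstance(self.method, Method):
            raise TypeError("method must be a Method value")
        if self.coverage_basis not in COVERAGE_BASES:
            raise ValueError(f"coverage_basis must be one of {sorted(COVERAGE_BASES)}")
        if not 0.0 <= self.coverage_pct <= 100.0:
            raise ValueError("coverage_pct must be between 0 and 100")


def totals_by_method(kpis):
    """Keep measured and estimated emissions in separate totals."""
    totals = {m: 0.0 for m in Method}
    for k in kpis:
        totals[k.method] += k.emissions_tco2e
    return totals
```

Reporting the two totals separately, each with its coverage, is what stops one "emissions" column from claiming more than the evidence supports.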
1140
00:41:28,040 –> 00:41:30,320
Now energy and water, because these KPIs
1141
00:41:30,320 –> 00:41:32,640
attract the most casual denominator abuse.
1142
00:41:32,640 –> 00:41:35,680
Consumption is easy; intensity is where you get audited.
1143
00:41:35,680 –> 00:41:38,480
Energy intensity metrics require a denominator.
1144
00:41:38,480 –> 00:41:41,600
Revenue, production volume, floor area, headcount,
1145
00:41:41,600 –> 00:41:42,640
output units.
1146
00:41:42,640 –> 00:41:44,640
Denominators drift because someone changes
1147
00:41:44,640 –> 00:41:46,480
the definition of revenue or switches
1148
00:41:46,480 –> 00:41:48,600
the production metric mid-year or updates
1149
00:41:48,600 –> 00:41:50,440
organizational structure mappings.
1150
00:41:50,440 –> 00:41:53,640
So your KPI model needs to treat denominators as governed data,
1151
00:41:53,640 –> 00:41:55,200
not as a measure in a dashboard.
1152
00:41:55,200 –> 00:41:58,040
That means store denominators as tables with source, period,
1153
00:41:58,040 –> 00:42:00,160
org unit, and definition version.
1154
00:42:00,160 –> 00:42:02,600
Then compute intensity in the governed calculation zone
1155
00:42:02,600 –> 00:42:04,320
and publish it like any other KPI.
1156
00:42:04,320 –> 00:42:06,280
If someone wants a new denominator, fine,
1157
00:42:06,280 –> 00:42:09,000
they get a new KPI variant with a new definition,
1158
00:42:09,000 –> 00:42:11,000
not a silent rewrite of the old one.
1159
00:42:11,000 –> 00:42:14,400
Water works the same way, but with more traps: local units,
1160
00:42:14,400 –> 00:42:17,680
local reporting boundaries, and data that often arrives late.
1161
00:42:17,680 –> 00:42:20,960
So the model needs quality flags: estimated, missing context,
1162
00:42:20,960 –> 00:42:23,080
suspect spikes, late arriving.
1163
00:42:23,080 –> 00:42:24,120
Don’t hide those.
1164
00:42:24,120 –> 00:42:25,960
Put them in the data set so the dashboard
1165
00:42:25,960 –> 00:42:28,520
can surface confidence, not just totals.
1166
00:42:28,520 –> 00:42:30,760
Workforce metrics are the quiet governance test.
1167
00:42:30,760 –> 00:42:33,840
Headcount, turnover, training hours, safety incident rates:
1168
00:42:33,840 –> 00:42:35,560
these are definition landmines.
1169
00:42:35,560 –> 00:42:38,080
The calculation often depends on what counts as an employee,
1170
00:42:38,080 –> 00:42:40,800
what counts as a contractor, which geographies are in scope,
1171
00:42:40,800 –> 00:42:43,800
and how organizational units map to legal entities.
1172
00:42:43,800 –> 00:42:46,240
If the HR team changes the underlying definition,
1173
00:42:46,240 –> 00:42:48,280
your KPI changes without a code change.
1174
00:42:48,280 –> 00:42:50,280
So KPI modeling for workforce metrics
1175
00:42:50,280 –> 00:42:52,280
must include definition binding:
1176
00:42:52,280 –> 00:42:55,560
a versioned definition record that states the inclusion rules,
1177
00:42:55,560 –> 00:42:57,880
the denominator, and the aggregation level.
1178
00:42:57,880 –> 00:43:00,480
Then the outputs carry that definition version key, again,
1179
00:43:00,480 –> 00:43:01,960
not a label, a key.
1180
00:43:01,960 –> 00:43:05,680
And supplier coverage needs to be modeled explicitly
1181
00:43:05,680 –> 00:43:08,120
because it’s the only way to prevent vanity percentages.
1182
00:43:08,120 –> 00:43:09,520
“Covered” must have a definition:
1183
00:43:09,520 –> 00:43:11,120
covered by survey response,
1184
00:43:11,120 –> 00:43:14,000
covered by verified activity, covered by modeled estimates.
1185
00:43:14,000 –> 00:43:15,440
Each is a different confidence level.
1186
00:43:15,440 –> 00:43:17,920
So store coverage as its own KPI family
1187
00:43:17,920 –> 00:43:20,240
with numerator and denominator definitions
1188
00:43:20,240 –> 00:43:22,320
and treat it like a first-class metric.
1189
00:43:22,320 –> 00:43:24,160
Otherwise, you’ll end up with a green dashboard
1190
00:43:24,160 –> 00:43:25,960
that can’t explain its own scope.
1191
00:43:25,960 –> 00:43:28,920
Once your KPI model encodes scope, definitions,
1192
00:43:28,920 –> 00:43:31,320
estimation flags, and denominator governance,
1193
00:43:31,320 –> 00:43:32,880
the architecture can produce numbers
1194
00:43:32,880 –> 00:43:34,520
that survive interrogation.
1195
00:43:34,520 –> 00:43:36,280
And now we can talk about the failure modes
1196
00:43:36,280 –> 00:43:37,680
because they’re not random.
1197
00:43:37,680 –> 00:43:40,080
They’re designed in. Failure mode one:
1198
00:43:40,080 –> 00:43:42,040
manual CSV overrides.
1199
00:43:42,040 –> 00:43:43,360
If there’s a single failure mode
1200
00:43:43,360 –> 00:43:46,160
that shows up in almost every ESG program, it’s this.
1201
00:43:46,160 –> 00:43:48,040
Someone fixes the number with a file.
1202
00:43:48,040 –> 00:43:49,840
It always sounds reasonable in the moment,
1203
00:43:49,840 –> 00:43:51,360
the meter export was wrong.
1204
00:43:51,360 –> 00:43:52,920
The facility sent a correction late,
1205
00:43:52,920 –> 00:43:54,800
the supplier portal didn’t respond.
1206
00:43:54,800 –> 00:43:56,200
The CFO wants the dashboard
1207
00:43:56,200 –> 00:43:58,200
to match what finance believes is true.
1208
00:43:58,200 –> 00:44:00,200
So a spreadsheet appears, then a CSV,
1209
00:44:00,200 –> 00:44:01,600
then a folder called uploads,
1210
00:44:01,600 –> 00:44:03,760
then a file name that admits the whole control model
1211
00:44:03,760 –> 00:44:06,320
is imaginary: final_v7.csv.
1212
00:44:06,320 –> 00:44:08,520
Here’s what goes wrong mechanically.
1213
00:44:08,520 –> 00:44:11,320
A manual override has no inherent chain of custody.
1214
00:44:11,320 –> 00:44:13,160
It doesn’t preserve who changed the value,
1215
00:44:13,160 –> 00:44:15,600
what the previous value was, what justification existed,
1216
00:44:15,600 –> 00:44:16,800
which approval covered it,
1217
00:44:16,800 –> 00:44:18,880
and whether the period was already closed.
1218
00:44:18,880 –> 00:44:20,480
A CSV is a blob of claims.
1219
00:44:20,480 –> 00:44:23,400
Unless the system forces metadata capture and approval,
1220
00:44:23,400 –> 00:44:25,840
the file becomes a silent rewrite of evidence.
1221
00:44:25,840 –> 00:44:28,680
And once that pathway exists, it gets used for everything.
1222
00:44:28,680 –> 00:44:31,040
First, it’s just this one facility.
1223
00:44:31,040 –> 00:44:32,440
Then it’s just this one month.
1224
00:44:32,440 –> 00:44:34,240
Then it’s just this supplier.
1225
00:44:34,240 –> 00:44:36,080
Then it becomes the default operating model
1226
00:44:36,080 –> 00:44:38,760
because it’s faster than fixing ingestion, modeling,
1227
00:44:38,760 –> 00:44:39,800
or validation.
1228
00:44:39,800 –> 00:44:42,120
Entropy loves convenience.
1229
00:44:42,120 –> 00:44:43,680
Auditors hate it for one reason.
1230
00:44:43,680 –> 00:44:46,160
It creates an uncontrolled modification pathway
1231
00:44:46,160 –> 00:44:47,600
inside the reporting boundary.
1232
00:44:47,600 –> 00:44:49,320
That phrase matters because it doesn’t matter
1233
00:44:49,320 –> 00:44:51,200
whether the sustainability team is honest.
1234
00:44:51,200 –> 00:44:54,160
It matters whether the system allows undetectable change.
1235
00:44:54,160 –> 00:44:57,160
When a CSV can be uploaded and override prior data,
1236
00:44:57,160 –> 00:44:59,640
you have created a pathway where numbers can change
1237
00:44:59,640 –> 00:45:00,960
without a durable trail.
1238
00:45:00,960 –> 00:45:02,440
That’s the definition of weak control.
1239
00:45:02,440 –> 00:45:04,280
The classic symptoms are always the same.
1240
00:45:04,280 –> 00:45:05,480
No reviewer trace.
1241
00:45:05,480 –> 00:45:07,960
One person edits, one person uploads,
1242
00:45:07,960 –> 00:45:10,720
and the approval is a Teams message.
1243
00:45:10,720 –> 00:45:11,600
No locked period.
1244
00:45:11,600 –> 00:45:13,280
The organization says the month is closed,
1245
00:45:13,280 –> 00:45:15,200
but the storage and tables are still writable,
1246
00:45:15,200 –> 00:45:17,480
so the month is closed in conversation only.
1247
00:45:17,480 –> 00:45:18,400
No checksum.
1248
00:45:18,400 –> 00:45:19,640
No content fingerprint.
1249
00:45:19,640 –> 00:45:22,000
You can’t even prove the file you showed the auditor
1250
00:45:22,000 –> 00:45:23,760
is the file that produced the KPI.
1251
00:45:23,760 –> 00:45:25,680
No binding to calculation logic.
1252
00:45:25,680 –> 00:45:28,680
The override becomes the truth by brute force.
1253
00:45:28,680 –> 00:45:31,120
Not because it passed through governed computation
1254
00:45:31,120 –> 00:45:33,240
with known factors and known code.
1255
00:45:33,240 –> 00:45:35,680
Now, the countermeasure is not “ban spreadsheets.”
1256
00:45:35,680 –> 00:45:37,360
That’s how you force shadow processes.
1257
00:45:37,360 –> 00:45:39,240
The countermeasure is controlled submissions.
1258
00:45:39,240 –> 00:45:40,960
If the business needs a manual pathway,
1259
00:45:40,960 –> 00:45:42,960
you give them one that behaves like a ledger.
1260
00:45:42,960 –> 00:45:45,560
Authenticated submitter identity, required schema,
1261
00:45:45,560 –> 00:45:47,720
validation gates, and an approval workflow
1262
00:45:47,720 –> 00:45:50,080
that results in an immutable publish.
1263
00:45:50,080 –> 00:45:53,440
The CSV becomes a source artifact, not a rewrite tool.
1264
00:45:53,440 –> 00:45:55,000
So the pattern looks like this.
1265
00:45:55,000 –> 00:45:57,400
A submission is ingested into a staging area
1266
00:45:57,400 –> 00:45:59,680
and tagged with metadata: who submitted it,
1267
00:45:59,680 –> 00:46:04,120
when, for which site, for which period, and under which submission type.
1268
00:46:04,120 –> 00:46:07,880
Then the system runs validation: schema checks, unit checks,
1269
00:46:07,880 –> 00:46:10,680
required dimensions, and basic sanity thresholds.
1270
00:46:10,680 –> 00:46:13,200
If it fails, it gets rejected or quarantined,
1271
00:46:13,200 –> 00:46:16,480
not fixed later, quarantined with a recorded reason.
1272
00:46:16,480 –> 00:46:19,000
If it passes, it does not override anything.
1273
00:46:19,000 –> 00:46:22,040
It gets published as a new versioned object into the raw zone,
1274
00:46:22,040 –> 00:46:24,440
append first, always.
1275
00:46:24,440 –> 00:46:27,120
Then the important part: adjustments, not edits.
1276
00:46:27,120 –> 00:46:29,760
If the period is still open, you can allow the new submission
1277
00:46:29,760 –> 00:46:32,160
to become the latest accepted load for that period,
1278
00:46:32,160 –> 00:46:34,720
but you still keep the older load as evidence.
1279
00:46:34,720 –> 00:46:38,480
If the period is closed, the submission cannot mutate the closed outputs.
1280
00:46:38,480 –> 00:46:40,240
It can only create an adjustment entry
1281
00:46:40,240 –> 00:46:43,600
that is explicitly labeled as post-close, includes a rationale,
1282
00:46:43,600 –> 00:46:46,640
and requires approval from a different role than the submitter.
1283
00:46:46,640 –> 00:46:49,920
Separation of duties stops being an ideal and becomes enforced behavior,
1284
00:46:49,920 –> 00:46:52,240
and yes, you keep the supporting evidence.
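The post-close rule can be sketched in a few lines. This is an illustration under stated assumptions: `record_adjustment` and `AdjustmentRejected` are hypothetical names, and separation of duties is simplified to "approver identity differs from submitter identity" where a real system would compare roles.

```python
from dataclasses import dataclass


class AdjustmentRejected(Exception):
    """Raised when a post-close adjustment breaks a control rule."""


@dataclass(frozen=True)
class Adjustment:
    period: str
    delta: float
    rationale: str
    submitted_by: str
    approved_by: str
    label: str = "post-close"


def record_adjustment(ledger, closed_periods, period, delta,
                      rationale, submitted_by, approved_by):
    """Append an adjustment entry; never mutate the closed outputs."""
    if period not in closed_periods:
        raise AdjustmentRejected("period is open; submit a new load instead")
    if not rationale.strip():
        raise AdjustmentRejected("a rationale is required")
    if approved_by == submitted_by:
        raise AdjustmentRejected("approver must differ from submitter")
    entry = Adjustment(period, delta, rationale, submitted_by, approved_by)
    ledger.append(entry)
    return entry
```

The ledger only grows. A rejected request leaves no entry, and an accepted one carries its label, rationale, and both identities.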
1285
00:46:52,240 –> 00:46:54,480
The CSV file itself goes into the evidence vault
1286
00:46:54,480 –> 00:46:57,280
with its metadata and, ideally, a stored fingerprint
1287
00:46:57,280 –> 00:46:58,760
so you can prove integrity.
1288
00:46:58,760 –> 00:47:00,320
The approval record goes into the vault,
1289
00:47:00,320 –> 00:47:02,120
the validation report goes into the vault.
1290
00:47:02,120 –> 00:47:03,760
This is how you build an evidence pack
1291
00:47:03,760 –> 00:47:05,320
without rebuilding your memory.
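A stored fingerprint can be as simple as a SHA-256 digest kept next to the file's metadata. The sketch below uses Python's standard `hashlib`; the record layout and the function names `evidence_record` and `verify` are assumptions for illustration.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the file contents."""
    return hashlib.sha256(data).hexdigest()


def evidence_record(filename: str, data: bytes, submitter: str) -> dict:
    """Build the metadata entry stored alongside the file in the vault."""
    return {
        "filename": filename,
        "sha256": fingerprint(data),
        "size_bytes": len(data),
        "submitter": submitter,
        "stored_at": datetime.now(timezone.utc).isoformat(),
    }


def verify(record: dict, data: bytes) -> bool:
    """Check that the bytes on hand still match the stored fingerprint."""
    return fingerprint(data) == record["sha256"]
```

If a single byte of the CSV changes after storage, `verify` returns `False`, which is the integrity proof an auditor is asking for.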
1292
00:47:05,320 –> 00:47:07,240
Now, here’s the uncomfortable truth.
1293
00:47:07,240 –> 00:47:10,840
The business will still demand "just change the number," though.
1294
00:47:10,840 –> 00:47:12,080
They always do.
1295
00:47:12,080 –> 00:47:14,480
Your job is to make the only available change pathway
1296
00:47:14,480 –> 00:47:17,600
one that leaves scars: a new version, a recorded justification,
1297
00:47:17,600 –> 00:47:19,280
and a visible approval trail.
1298
00:47:19,280 –> 00:47:22,160
When people complain that it’s slower, that’s the control working.
1299
00:47:22,160 –> 00:47:24,280
And once you solve "final_v7.csv,"
1300
00:47:24,280 –> 00:47:26,600
you’ll notice the next failure mode is subtler.
1301
00:47:26,600 –> 00:47:29,360
The dashboard itself starts rewriting history.
1302
00:47:29,360 –> 00:47:30,600
Failure mode two.
1303
00:47:30,600 –> 00:47:32,360
Calculation drift in Power BI.
1304
00:47:32,360 –> 00:47:35,160
Calculation drift is the audit failure that feels like progress.
1305
00:47:35,160 –> 00:47:38,600
Someone opens the Power BI model, sees a measure that looks inefficient,
1306
00:47:38,600 –> 00:47:40,480
rewrites it, the visuals load faster,
1307
00:47:40,480 –> 00:47:42,080
and everyone calls it an improvement.
1308
00:47:42,080 –> 00:47:45,240
But the system didn’t just get faster, it got less accountable.
1309
00:47:45,240 –> 00:47:47,920
Because Power BI is designed for interactive analysis,
1310
00:47:47,920 –> 00:47:49,440
not period-bound computation.
1311
00:47:49,440 –> 00:47:54,400
Its superpower is flexibility, and flexibility is the enemy of reproducibility
1312
00:47:54,400 –> 00:47:56,040
when you’re inside a reporting boundary.
1313
00:47:56,040 –> 00:47:57,880
Here’s what goes wrong.
1314
00:47:57,880 –> 00:48:00,880
The ESG team builds core logic in DAX.
1315
00:48:00,880 –> 00:48:04,560
Emission conversions, allocations, scope categorization,
1316
00:48:04,560 –> 00:48:07,880
intensity denominators, it starts small.
1317
00:48:07,880 –> 00:48:09,800
Then new requirements arrive.
1318
00:48:09,800 –> 00:48:13,840
New sites, a new supplier category, a new framework question,
1319
00:48:13,840 –> 00:48:16,600
a minor change to how renewable energy is treated.
1320
00:48:16,600 –> 00:48:17,760
So they update the model.
1321
00:48:17,760 –> 00:48:19,760
And historic values silently recompute.
1322
00:48:19,760 –> 00:48:22,560
That’s the difference between a calculation engine and a dashboard.
1323
00:48:22,560 –> 00:48:25,520
A calculation engine produces outputs that become records.
1324
00:48:25,520 –> 00:48:27,720
A dashboard recomputes every time it refreshes.
1325
00:48:27,720 –> 00:48:30,320
When you change the logic, you don’t just change future numbers.
1326
00:48:30,320 –> 00:48:31,800
You change last year’s numbers.
1327
00:48:31,800 –> 00:48:34,360
This clicked for a lot of architects the first time someone asked
1328
00:48:34,360 –> 00:48:36,280
for a prior year reconciliation,
1329
00:48:36,280 –> 00:48:38,280
and the answer was the model changed.
1330
00:48:38,280 –> 00:48:39,920
Not because anyone acted maliciously,
1331
00:48:39,920 –> 00:48:42,040
because the platform makes logic edits trivial
1332
00:48:42,040 –> 00:48:43,960
and makes version binding optional.
1333
00:48:43,960 –> 00:48:45,840
Optional controls aren’t controls.
1334
00:48:45,840 –> 00:48:48,600
Now the deeper problem is that Power BI doesn’t naturally behave
1335
00:48:48,600 –> 00:48:50,560
like a governed release artifact.
1336
00:48:50,560 –> 00:48:52,000
Yes, you can manage workspaces.
1337
00:48:52,000 –> 00:48:53,920
Yes, you can use deployment pipelines.
1338
00:48:53,920 –> 00:48:55,760
Yes, you can restrict who can publish.
1339
00:48:55,760 –> 00:48:58,400
But the model itself remains a moving object
1340
00:48:58,400 –> 00:49:02,120
unless you design a release process that treats it like code
1341
00:49:02,120 –> 00:49:03,760
that impacts financial statements.
1342
00:49:03,760 –> 00:49:04,920
Most organizations don’t.
1343
00:49:04,920 –> 00:49:06,160
They treat it like a report.
1344
00:49:06,160 –> 00:49:07,840
So you end up with the classic drift pattern,
1345
00:49:07,840 –> 00:49:10,640
a developer optimizes a measure for performance.
1346
00:49:10,640 –> 00:49:12,680
The measure changes, the outputs change.
1347
00:49:12,680 –> 00:49:15,240
Nobody notices until a regulator, an auditor,
1348
00:49:15,240 –> 00:49:17,160
or finance compares this year’s report
1349
00:49:17,160 –> 00:49:19,120
to a saved PDF from last year.
1350
00:49:19,120 –> 00:49:20,800
And now you’re in restatement territory
1351
00:49:20,800 –> 00:49:22,920
except you don’t have restatement mechanics.
1352
00:49:22,920 –> 00:49:24,720
You have a dashboard that rewrote history
1353
00:49:24,720 –> 00:49:26,160
without leaving an obvious scar.
1354
00:49:26,160 –> 00:49:29,560
That’s why auditors increasingly flag logic heavy DAX models.
1355
00:49:29,560 –> 00:49:30,840
Not because DAX is wrong,
1356
00:49:30,840 –> 00:49:32,560
because DAX is too easy to change
1357
00:49:32,560 –> 00:49:33,880
without the control ceremony
1358
00:49:33,880 –> 00:49:36,320
that should accompany changes to reported numbers.
1359
00:49:36,320 –> 00:49:38,680
The architecture rule that stops this is brutal.
1360
00:49:38,680 –> 00:49:42,280
Power BI is a thin semantic layer over reported tables only.
1361
00:49:42,280 –> 00:49:44,920
That means the calculation zone produces the KPI tables.
1362
00:49:44,920 –> 00:49:46,960
Those KPI tables are period closed outputs.
1363
00:49:46,960 –> 00:49:50,040
Power BI reads them. Power BI can aggregate and format them.
1364
00:49:50,040 –> 00:49:51,160
It can create visuals.
1365
00:49:51,160 –> 00:49:52,680
It can create drill paths.
1366
00:49:52,680 –> 00:49:54,400
It can even create convenience measures
1367
00:49:54,400 –> 00:49:57,400
that don’t affect the underlying accounting of emissions.
1368
00:49:57,400 –> 00:49:59,800
But it cannot be the place where emissions accounting lives.
1369
00:49:59,800 –> 00:50:02,400
If you remember nothing else, DAX measures should be formatting,
1370
00:50:02,400 –> 00:50:03,360
not accounting.
1371
00:50:03,360 –> 00:50:06,440
Now, the system level countermeasure is equally blunt.
1372
00:50:06,440 –> 00:50:09,920
KPI outputs are tables, not measures.
1373
00:50:09,920 –> 00:50:11,400
Instead of total Scope 2 emissions
1374
00:50:11,400 –> 00:50:13,920
being a measure that depends on five other measures,
1375
00:50:13,920 –> 00:50:16,080
it becomes a column in a reported KPI table
1376
00:50:16,080 –> 00:50:17,840
produced by fabric or synapse
1377
00:50:17,840 –> 00:50:22,640
with keys for period, org unit, scope category, method, and factor version.
1378
00:50:22,640 –> 00:50:24,200
Power BI then displays it.
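To make "tables, not measures" concrete, here is a hedged sketch of a reported KPI table keyed the way the text describes. `KpiKey` and `ReportedKpiTable` are hypothetical names, and the in-memory dictionary stands in for a table produced by the calculation zone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KpiKey:
    period: str
    org_unit: str
    scope_category: str
    method: str
    factor_version: str


class ReportedKpiTable:
    """In-memory stand-in for a period-closed KPI table."""

    def __init__(self):
        self._rows = {}

    def write(self, key: KpiKey, value_tco2e: float) -> None:
        # A reported value is a record: it is written once per key.
        if key in self._rows:
            raise ValueError(f"{key} already reported; use an adjustment")
        self._rows[key] = value_tco2e

    def read(self, key: KpiKey) -> float:
        return self._rows[key]
```

A new factor version produces a new key and therefore a new row. The old row stays where it was.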
1379
00:50:24,200 –> 00:50:26,280
When someone asks, why did it change?
1380
00:50:26,280 –> 00:50:28,640
You have an answer rooted in artifacts,
1381
00:50:28,640 –> 00:50:32,720
input load IDs, factor versions and calculation release version.
1382
00:50:32,720 –> 00:50:35,120
Not "because someone updated the report."
1383
00:50:35,120 –> 00:50:37,640
A practical control pattern looks like this.
1384
00:50:37,640 –> 00:50:39,960
You publish a KPI data set that is certified
1385
00:50:39,960 –> 00:50:42,440
and only refreshes from the reported zone.
1386
00:50:42,440 –> 00:50:45,320
You separate two dashboard classes, assurance dashboards
1387
00:50:45,320 –> 00:50:47,440
that only read period closed outputs
1388
00:50:47,440 –> 00:50:51,440
and management dashboards that can read operational or provisional data.
1389
00:50:51,440 –> 00:50:53,160
That distinction matters because management
1390
00:50:53,160 –> 00:50:57,160
wants speed and iteration, assurance wants stability and traceability,
1391
00:50:57,160 –> 00:50:59,600
mixing them guarantees you’ll optimize for convenience
1392
00:50:59,600 –> 00:51:01,280
and later pretend it was governance.
1393
00:51:01,280 –> 00:51:02,640
And you still keep snapshots.
1394
00:51:02,640 –> 00:51:06,200
At close, you export and store the period close report outputs
1395
00:51:06,200 –> 00:51:08,320
as artifacts in your evidence vault.
1396
00:51:08,320 –> 00:51:10,240
PDF for human readable records
1397
00:51:10,240 –> 00:51:14,040
and CSV or data set extracts for machine traceability.
1398
00:51:14,040 –> 00:51:16,320
Those snapshots don’t replace the reported tables.
1399
00:51:16,320 –> 00:51:17,320
They complement them.
1400
00:51:17,320 –> 00:51:19,120
They prove what was presented at the time.
1401
00:51:19,120 –> 00:51:20,400
Now here’s the cynical truth.
1402
00:51:20,400 –> 00:51:23,720
People will try to sneak logic back into Power BI because it’s faster.
1403
00:51:23,720 –> 00:51:25,560
They’ll say, it’s just a small adjustment.
1404
00:51:25,560 –> 00:51:27,320
They’ll say, we can do it as a measure.
1405
00:51:27,320 –> 00:51:29,360
They’ll say, it’s only for this visual.
1406
00:51:29,360 –> 00:51:32,840
And then six months later, you discover the visual became the source of truth
1407
00:51:32,840 –> 00:51:34,680
because it was the thing executives looked at.
1408
00:51:34,680 –> 00:51:35,760
That’s how drift wins.
1409
00:51:35,760 –> 00:51:38,840
So, enforce the boundary: calculations in the governed zone.
1410
00:51:38,840 –> 00:51:41,960
Outputs in reported tables, Power BI as presentation.
1411
00:51:41,960 –> 00:51:44,000
And once you move logic out of the dashboard,
1412
00:51:44,000 –> 00:51:45,440
you’ll hit the next failure mode.
1413
00:51:45,440 –> 00:51:47,040
Even with SQL-based logic,
1414
00:51:47,040 –> 00:51:48,440
you still can’t reproduce history
1415
00:51:48,440 –> 00:51:49,880
if your emission factors float.
1416
00:51:49,880 –> 00:51:52,680
Failure mode three, missing factor versioning.
1417
00:51:52,680 –> 00:51:54,600
Missing factor versioning is the failure mode
1418
00:51:54,600 –> 00:51:57,000
that makes every other control look decorative.
1419
00:51:57,000 –> 00:51:58,560
You can have immutable storage.
1420
00:51:58,560 –> 00:51:59,800
You can have lineage.
1421
00:51:59,800 –> 00:52:01,720
You can even have a governed calculation zone.
1422
00:52:01,720 –> 00:52:04,560
But if your emission factors behave like current truth,
1423
00:52:04,560 –> 00:52:07,120
you’ve built a system that recalculates the past
1424
00:52:07,120 –> 00:52:08,080
with today’s assumptions.
1425
00:52:08,080 –> 00:52:08,920
That’s not reporting.
1426
00:52:08,920 –> 00:52:10,640
That’s revisionism with a SQL engine.
1427
00:52:10,640 –> 00:52:12,600
Here’s how it happens in real architectures.
1428
00:52:12,600 –> 00:52:15,080
A team centralizes emission factors in a table.
1429
00:52:15,080 –> 00:52:17,880
They add a column for source and maybe year.
1430
00:52:17,880 –> 00:52:21,360
They join activity data to factors by geography and activity type.
1431
00:52:21,360 –> 00:52:24,400
Then, because nobody wants to pass parameters around,
1432
00:52:24,400 –> 00:52:27,800
they add a view called something like vw_emission_factor_latest.
1433
00:52:27,800 –> 00:52:29,760
And every calculation joins to latest.
1434
00:52:29,760 –> 00:52:31,800
It works until the factor set updates.
1435
00:52:31,800 –> 00:52:34,800
Then you rerun FY-1 and FY-2 and the numbers change.
1436
00:52:34,800 –> 00:52:36,560
Not because the activity changed.
1437
00:52:36,560 –> 00:52:37,880
Not because the logic changed.
1438
00:52:37,880 –> 00:52:40,320
Because the factor table did what tables do,
1439
00:52:40,320 –> 00:52:42,080
it reflected the current state.
1440
00:52:42,080 –> 00:52:44,480
And your system quietly rewrote history.
1441
00:52:44,480 –> 00:52:47,000
This is why "we use DEFRA" isn’t evidence.
1442
00:52:47,000 –> 00:52:48,720
It’s a marketing label for a library.
1443
00:52:48,720 –> 00:52:49,360
Which version?
1444
00:52:49,360 –> 00:52:50,200
Which published date?
1445
00:52:50,200 –> 00:52:51,160
Which effective dates?
1446
00:52:51,160 –> 00:52:52,160
Which geography mapping?
1447
00:52:52,160 –> 00:52:53,080
Which category mapping?
1448
00:52:53,080 –> 00:52:55,040
If your answer is the one in the table,
1449
00:52:55,040 –> 00:52:57,320
you are admitting you can’t reproduce a prior close.
1450
00:52:57,320 –> 00:53:00,400
And reproducibility is one of the audit grade requirements
1451
00:53:00,400 –> 00:53:01,760
you claimed you met.
1452
00:53:01,760 –> 00:53:03,960
Now, what does assurance actually do with this?
1453
00:53:03,960 –> 00:53:05,880
They ask the simplest question on earth.
1454
00:53:05,880 –> 00:53:06,920
Prove the number.
1455
00:53:06,920 –> 00:53:07,880
Not tell a story.
1456
00:53:07,880 –> 00:53:09,000
Prove it.
1457
00:53:09,000 –> 00:53:10,040
They want to see the chain.
1458
00:53:10,040 –> 00:53:12,320
Activity record, factor record, and the logic
1459
00:53:12,320 –> 00:53:13,360
that multiplied them.
1460
00:53:13,360 –> 00:53:15,560
If the factor record is not pinned to the period,
1461
00:53:15,560 –> 00:53:17,680
you can’t prove what it was at the time.
1462
00:53:17,680 –> 00:53:19,240
You can only show what it is now.
1463
00:53:19,240 –> 00:53:20,960
That becomes ambiguity and ambiguity
1464
00:53:20,960 –> 00:53:23,960
becomes qualifications, restatements, or a scope limitation
1465
00:53:23,960 –> 00:53:26,600
in an assurance report depending on how bad it is.
1466
00:53:26,600 –> 00:53:29,640
The most common real world trigger is annual factor updates.
1467
00:53:29,640 –> 00:53:31,040
A new factor library comes in.
1468
00:53:31,040 –> 00:53:32,040
Someone imports it.
1469
00:53:32,040 –> 00:53:35,000
They overwrite last year’s rows because it’s an update.
1470
00:53:35,000 –> 00:53:37,800
Then they rerun calculations to validate the new year.
1471
00:53:37,800 –> 00:53:39,440
And suddenly last year changes.
1472
00:53:39,440 –> 00:53:40,960
The dashboard still looks reasonable,
1473
00:53:40,960 –> 00:53:43,400
so nobody notices until finance compares numbers
1474
00:53:43,400 –> 00:53:46,120
to last year’s submission or an auditor requests
1475
00:53:46,120 –> 00:53:48,280
a re-performance for the prior period.
1476
00:53:48,280 –> 00:53:50,360
And then the organization learns a painful lesson.
1477
00:53:50,360 –> 00:53:52,840
Factor drift is indistinguishable from manipulation
1478
00:53:52,840 –> 00:53:54,560
unless you can prove version binding.
1479
00:53:54,560 –> 00:53:57,360
So the architecture countermeasure has to be enforcement,
1480
00:53:57,360 –> 00:53:58,200
not guidance.
1481
00:53:58,200 –> 00:54:00,760
First, you build factor libraries as version sets.
1482
00:54:00,760 –> 00:54:02,520
A library has a unique version key.
1483
00:54:02,520 –> 00:54:04,080
The factor records carry that key.
1484
00:54:04,080 –> 00:54:05,880
And you never update an existing version.
1485
00:54:05,880 –> 00:54:07,640
You publish a new version, always.
1486
00:54:07,640 –> 00:54:10,400
Second, you bind factors to periods explicitly.
1487
00:54:10,400 –> 00:54:13,120
That can be done through a period close configuration table
1488
00:54:13,120 –> 00:54:15,200
that records per reporting period.
1489
00:54:15,200 –> 00:54:18,000
The factor library version IDs used for each domain.
1490
00:54:18,000 –> 00:54:21,400
Electricity, fuel, travel, freight, whatever you model.
1491
00:54:21,400 –> 00:54:24,000
That configuration becomes part of the close package
1492
00:54:24,000 –> 00:54:26,600
and gets locked after close because it’s the switchboard
1493
00:54:26,600 –> 00:54:28,360
that defines reproducibility.
1494
00:54:28,360 –> 00:54:31,400
Third, you make the pipeline fail without a factor version key.
1495
00:54:31,400 –> 00:54:34,480
This is the part teams avoid because it feels strict.
1496
00:54:34,480 –> 00:54:35,640
Good, it should.
1497
00:54:35,640 –> 00:54:39,000
In your SQL views, stored procedures or notebooks,
1498
00:54:39,000 –> 00:54:41,480
the join to factors must include the version key.
1499
00:54:41,480 –> 00:54:43,480
If the caller doesn’t supply it, the job fails.
1500
00:54:43,480 –> 00:54:45,560
If the configuration table doesn’t have a version
1501
00:54:45,560 –> 00:54:47,160
for that period, the job fails.
1502
00:54:47,160 –> 00:54:48,480
No silent fallbacks.
1503
00:54:48,480 –> 00:54:50,560
No latest, no convenience.
1504
00:54:50,560 –> 00:54:53,640
Because latest is an entropy generator disguised as a default.
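One way to express "no version key, no calculation" is sketched below. The period close configuration is a plain dictionary here, and every name (`PERIOD_FACTOR_CONFIG`, `resolve_factor_version`, `apply_factors`, the version strings) is hypothetical. A real implementation would read these from governed tables.

```python
class MissingFactorVersion(Exception):
    """Raised when a calculation has no pinned factor version."""


# Hypothetical period close configuration:
# (period, domain) -> factor library version key
PERIOD_FACTOR_CONFIG = {
    ("FY2023", "electricity"): "elec-2023.1",
    ("FY2023", "fuel"): "fuel-2023.2",
}


def resolve_factor_version(period, domain, config=PERIOD_FACTOR_CONFIG):
    """Return the pinned version key, or fail. There is no 'latest'."""
    try:
        return config[(period, domain)]
    except KeyError:
        raise MissingFactorVersion(
            f"no factor version pinned for {period}/{domain}"
        ) from None


def apply_factors(activities, factors, period, domain,
                  config=PERIOD_FACTOR_CONFIG):
    """Multiply activity quantities by factors from the pinned version only."""
    version = resolve_factor_version(period, domain, config)
    results = []
    for act in activities:
        key = (version, act["activity_type"])
        if key not in factors:
            raise MissingFactorVersion(f"{key} not found in factor library")
        results.append({
            **act,
            "factor_version": version,  # carried into the output as evidence
            "emissions_kg": act["quantity"] * factors[key],
        })
    return results
```

The output rows carry the version key with them, so a later rerun can be checked against the same inputs instead of whatever the factor table holds today.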
1505
00:54:53,640 –> 00:54:56,240
Once you enforce that, you can do an audit-ready rerun.
1506
00:54:56,240 –> 00:54:58,240
You can take FY-1 activity data.
1507
00:54:58,240 –> 00:55:01,000
You can select the load IDs that were in scope at close.
1508
00:55:01,000 –> 00:55:03,920
You can select the factor library versions recorded for that close.
1509
00:55:03,920 –> 00:55:07,600
You can run the calculation artifacts tied to the released logic version.
1510
00:55:07,600 –> 00:55:08,800
And you get the same result.
1511
00:55:08,800 –> 00:55:10,520
That’s what reproducibility means.
1512
00:55:10,520 –> 00:55:11,520
Not close enough.
1513
00:55:11,520 –> 00:55:13,120
The same.
1514
00:55:13,120 –> 00:55:14,320
Now the subtle trap.
1515
00:55:14,320 –> 00:55:16,440
Effective dates and geography mapping.
1516
00:55:16,440 –> 00:55:19,200
Even with version keys, teams still mess up by storing factors
1517
00:55:19,200 –> 00:55:20,720
without applicability constraints
1518
00:55:20,720 –> 00:55:22,520
then joining based on best match.
1519
00:55:22,520 –> 00:55:24,920
That produces probabilistic factor selection.
1520
00:55:24,920 –> 00:55:26,840
So the factor model needs effective dates,
1521
00:55:26,840 –> 00:55:28,840
geography codes and classification keys
1522
00:55:28,840 –> 00:55:30,600
that make selection deterministic.
1523
00:55:30,600 –> 00:55:33,960
If multiple factors match, the pipeline should fail and force resolution,
1524
00:55:33,960 –> 00:55:36,000
not pick one and pretend it was intentional.
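Deterministic selection can be sketched as "exactly one match or fail." The field names (`version`, `geography`, `activity_type`, `effective_from`, `effective_to`) and the exception classes are assumptions made for this example.

```python
from datetime import date


class NoFactor(Exception):
    """No factor applies to the requested combination."""


class AmbiguousFactor(Exception):
    """More than one factor applies; a human must resolve it."""


def select_factor(factors, version, geography, activity_type, on_date):
    """Return the single applicable factor record, or raise."""
    matches = [
        f for f in factors
        if f["version"] == version
        and f["geography"] == geography
        and f["activity_type"] == activity_type
        and f["effective_from"] <= on_date <= f["effective_to"]
    ]
    if not matches:
        raise NoFactor(f"no factor for {version}/{geography}/{activity_type}")
    if len(matches) > 1:
        # Do not pick one: overlapping applicability is a data defect.
        raise AmbiguousFactor(f"{len(matches)} factors match; resolve overlap")
    return matches[0]
```

There is deliberately no ranking or "best match" fallback. Overlapping effective dates stop the run instead of producing a number nobody chose.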
1525
00:55:36,000 –> 00:55:37,560
And the final piece is evidence.
1526
00:55:37,560 –> 00:55:40,600
When you publish a factor library version, store provenance.
1527
00:55:40,600 –> 00:55:42,680
Where it came from and when it was approved.
1528
00:55:42,680 –> 00:55:46,280
In Microsoft Sustainability Manager, factors can live in factor libraries.
1529
00:55:46,280 –> 00:55:49,240
In Fabric or Synapse, you’ll model them as tables.
1530
00:55:49,240 –> 00:55:51,880
Either way, the published version used for a close period
1531
00:55:51,880 –> 00:55:53,720
becomes part of the evidence chain.
1532
00:55:53,720 –> 00:55:56,480
So it gets locked and registered in Purview for lineage.
1533
00:55:56,480 –> 00:55:59,240
Because the moment you can’t prove which factors were used,
1534
00:55:59,240 –> 00:56:01,000
you’re no longer defending a number,
1535
00:56:01,000 –> 00:56:02,800
you’re defending a belief about a number
1536
00:56:02,800 –> 00:56:05,200
and auditors don’t assure beliefs.
1537
00:56:05,200 –> 00:56:08,880
Purview: lineage as your only defense against "prove it" moments.
1538
00:56:08,880 –> 00:56:10,640
At some point, someone will ask the question
1539
00:56:10,640 –> 00:56:12,360
that ends the fun part of ESG.
1540
00:56:12,360 –> 00:56:13,160
Prove it.
1541
00:56:13,160 –> 00:56:14,760
Not explain it, not summarize it.
1542
00:56:14,760 –> 00:56:16,840
Prove that this KPI came from these sources
1543
00:56:16,840 –> 00:56:19,480
went through these transformations and landed in this report
1544
00:56:19,480 –> 00:56:23,160
without being casually rewritten by whoever had access on a Tuesday.
1545
00:56:23,160 –> 00:56:25,080
This is where most ESG stacks collapse
1546
00:56:25,080 –> 00:56:27,280
because they rely on human memory and slide decks.
1547
00:56:27,280 –> 00:56:30,000
That’s not governance, that’s folklore.
1548
00:56:30,000 –> 00:56:33,080
Microsoft Purview is the mechanism that turns your folklore
1549
00:56:33,080 –> 00:56:34,480
into queryable metadata.
1550
00:56:34,480 –> 00:56:37,200
And that distinction matters because lineage isn’t a diagram
1551
00:56:37,200 –> 00:56:37,960
you draw once.
1552
00:56:37,960 –> 00:56:40,760
It’s an operational record of how data moved, changed shape,
1553
00:56:40,760 –> 00:56:43,560
and became something the business now claims is true.
1554
00:56:43,560 –> 00:56:45,560
Lineage in plain system terms is origin
1555
00:56:45,560 –> 00:56:47,880
to transformation to consumption.
1556
00:56:47,880 –> 00:56:49,800
Origin is where the data came from.
1557
00:56:49,800 –> 00:56:53,640
ERP extracts, meter feeds, supplier submissions, HR aggregates.
1558
00:56:53,640 –> 00:56:55,280
Transformation is what you did to it.
1559
00:56:55,280 –> 00:56:58,240
Validation, standardization, mapping, factor application, KPI
1560
00:56:58,240 –> 00:57:00,560
computation. Consumption is where it shows up.
1561
00:57:00,560 –> 00:57:04,360
Reported tables, semantic models, Power BI reports, exports.
1562
00:57:04,360 –> 00:57:07,800
If any link in that chain is "someone knows," you don’t have lineage.
1563
00:57:07,800 –> 00:57:09,200
You have a future incident.
1564
00:57:09,200 –> 00:57:11,080
Here’s what you actually register in Purview
1565
00:57:11,080 –> 00:57:13,520
if you want it to be useful under audit pressure.
1566
00:57:13,520 –> 00:57:18,200
You register the storage assets, lake houses, ADLS paths,
1567
00:57:18,200 –> 00:57:21,320
containers that correspond to raw, curated, reported,
1568
00:57:21,320 –> 00:57:22,720
and the evidence vault.
1569
00:57:22,720 –> 00:57:24,720
You register the processing assets.
1570
00:57:24,720 –> 00:57:28,640
Pipelines, notebooks, SQL endpoints, whatever actually
1571
00:57:28,640 –> 00:57:31,160
performs transformations and publishes outputs.
1572
00:57:31,160 –> 00:57:33,120
And you register the analytics assets.
1573
00:57:33,120 –> 00:57:34,920
The data sets and reports people use,
1574
00:57:34,920 –> 00:57:37,160
including the certified data sets that represent
1575
00:57:37,160 –> 00:57:38,200
the assurance layer.
1576
00:57:38,200 –> 00:57:41,000
And you assign ownership, real ownership, not the team,
1577
00:57:41,000 –> 00:57:42,520
a named role with accountability.
1578
00:57:42,520 –> 00:57:45,520
The thing most people miss is that auditors don’t only ask
1579
00:57:45,520 –> 00:57:46,760
where did the number come from.
1580
00:57:46,760 –> 00:57:49,280
They ask who is responsible for this asset.
1581
00:57:49,280 –> 00:57:52,440
Purview is where you make that answer deterministic instead of social.
1582
00:57:52,440 –> 00:57:54,160
Now, what does this look like in practice
1583
00:57:54,160 –> 00:57:56,040
in the moment you’re under scrutiny?
1584
00:57:56,040 –> 00:57:58,280
A stakeholder points at Scope 2 for a region
1585
00:57:58,280 –> 00:58:00,040
and says, this seems high.
1586
00:58:00,040 –> 00:58:02,840
If you have lineage, you can trace from the Power BI visual
1587
00:58:02,840 –> 00:58:04,600
back to the reported KPI table,
1588
00:58:04,600 –> 00:58:06,680
back to the calculation view or notebook,
1589
00:58:06,680 –> 00:58:08,560
back to the curated consumption table,
1590
00:58:08,560 –> 00:58:11,200
back to the raw invoice extract or meter feed.
1591
00:58:11,200 –> 00:58:15,240
And you can identify the load IDs and factor library version key used.
1592
00:58:15,240 –> 00:58:16,600
You can do it in minutes.
1593
00:58:16,600 –> 00:58:19,240
Without lineage, you do PowerPoint archaeology.
1594
00:58:19,240 –> 00:58:20,240
You open old emails.
1595
00:58:20,240 –> 00:58:21,720
You ask someone who left the company.
1596
00:58:21,720 –> 00:58:24,600
You rebuild the path from memory and hope it matches reality.
1597
00:58:24,600 –> 00:58:25,360
It won’t.
1598
00:58:25,360 –> 00:58:26,760
Lineage isn’t only for auditors.
1599
00:58:26,760 –> 00:58:31,200
It’s also the fastest way to find where data quality issues actually entered the system.
1600
00:58:31,200 –> 00:58:34,040
When a KPI looks wrong, teams usually blame the calculation.
1601
00:58:34,040 –> 00:58:37,080
Half the time the calculation is fine and the input mapping is wrong
1602
00:58:37,080 –> 00:58:39,040
or the organizational hierarchy changed.
1603
00:58:39,040 –> 00:58:40,200
Or a unit got misread.
1604
00:58:40,200 –> 00:58:41,640
Lineage gives you a breadcrumb trail,
1605
00:58:41,640 –> 00:58:43,880
so root cause analysis becomes mechanical.
1606
00:58:43,880 –> 00:58:46,520
Find the upstream change point, not the downstream symptom.
1607
00:58:46,520 –> 00:58:48,840
And here’s the other use case nobody budgets for.
1608
00:58:48,840 –> 00:58:49,880
Impact analysis.
1609
00:58:49,880 –> 00:58:52,840
Every time you change a pipeline, a mapping or a factor library,
1610
00:58:52,840 –> 00:58:55,000
you are changing a graph of dependencies.
1611
00:58:55,000 –> 00:58:58,600
Without lineage, you don’t know what you’ll break until something breaks.
1612
00:58:58,600 –> 00:59:02,240
With lineage, you can see downstream consumers before you ship the change.
1613
00:59:02,240 –> 00:59:05,680
That’s how you stop small improvements from becoming multi-year restatements.
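Root cause analysis and impact analysis are the same graph walked in opposite directions. The sketch below models lineage as a plain dictionary of edges. The asset names are invented, and this is not the Purview API, only an illustration of the traversal.

```python
from collections import deque

# Hypothetical lineage edges: upstream asset -> downstream assets
LINEAGE = {
    "raw.meter_feed": ["curated.consumption"],
    "factors.elec-2023.1": ["calc.scope2_notebook"],
    "curated.consumption": ["calc.scope2_notebook"],
    "calc.scope2_notebook": ["reported.kpi_scope2"],
    "reported.kpi_scope2": ["powerbi.assurance_report"],
}


def downstream(graph, start):
    """Impact analysis: everything that depends on `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def upstream(graph, target):
    """Root cause analysis: everything `target` was derived from."""
    reverse = {}
    for parent, children in graph.items():
        for child in children:
            reverse.setdefault(child, []).append(parent)
    return downstream(reverse, target)
```

Before changing a factor library, `downstream(LINEAGE, "factors.elec-2023.1")` lists what will be affected. When a KPI looks wrong, `upstream` lists where to look.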
1614
00:59:05,680 –> 00:59:06,880
Now there’s a reality check.
1615
00:59:06,880 –> 00:59:10,160
Purview capabilities evolve, integrations change,
1616
00:59:10,160 –> 00:59:13,560
some sustainability specific solutions in the Microsoft ecosystem,
1617
00:59:13,560 –> 00:59:15,760
show up in preview states and then move.
1618
00:59:15,760 –> 00:59:17,520
That is not a reason to avoid governance.
1619
00:59:17,520 –> 00:59:21,240
It’s the reason to avoid hard coding your governance into documentation.
1620
00:59:21,240 –> 00:59:23,360
Your architecture has to tolerate product drift
1621
00:59:23,360 –> 00:59:26,480
by treating lineage as a first class system behavior.
1622
00:59:26,480 –> 00:59:29,680
Register assets consistently, enforce naming conventions,
1623
00:59:29,680 –> 00:59:33,840
keep ownership current and make lineage review part of release management.
1624
00:59:33,840 –> 00:59:34,960
And yes, there’s setup.
1625
00:59:34,960 –> 00:59:37,800
You will configure connectors, you will manage identities,
1626
00:59:37,800 –> 00:59:40,640
you will decide which assets get scanned and how often.
1627
00:59:40,640 –> 00:59:44,240
You will deal with the fact that not everything stitches perfectly on day one.
1628
00:59:44,240 –> 00:59:45,760
But you’re not doing this for aesthetics.
1629
00:59:45,760 –> 00:59:48,880
You’re doing it because prove it moments don’t arrive on your schedule.
1630
00:59:48,880 –> 00:59:50,320
They arrive when the board is watching.
1631
00:59:50,320 –> 00:59:53,080
So Purview becomes your only defensible posture.
1632
00:59:53,080 –> 00:59:57,920
A way to demonstrate end to end that your ESG numbers are products of controlled systems,
1633
00:59:57,920 –> 01:00:00,240
not a collection of best effort narratives.
1634
01:00:00,240 –> 01:00:03,320
And once you accept that, the next dependency becomes obvious.
1635
01:00:03,320 –> 01:00:05,040
Governance without identity is theater.
1636
01:00:05,040 –> 01:00:07,720
Identity is what turns metadata into enforcement.
1637
01:00:07,720 –> 01:00:09,560
Entra ID plus role separation.
1638
01:00:09,560 –> 01:00:11,520
Stop letting everyone be everyone.
1639
01:00:11,520 –> 01:00:13,920
Most organizations say they have governance.
1640
01:00:13,920 –> 01:00:16,800
Then you look at their permissions and realize they have hope.
1641
01:00:16,800 –> 01:00:19,800
They treat Microsoft Entra ID like a login system.
1642
01:00:19,800 –> 01:00:21,120
Not what it actually is.
1643
01:00:21,120 –> 01:00:23,440
The control plane for who can touch evidence,
1644
01:00:23,440 –> 01:00:28,280
who can change logic and who can publish numbers that will later be defended in an assurance room.
1645
01:00:28,280 –> 01:00:32,240
That distinction matters because ESG fails when identity becomes optional.
1646
01:00:32,240 –> 01:00:33,920
Role separation is not bureaucracy.
1647
01:00:33,920 –> 01:00:39,040
It is the only reason an auditor believes your system didn’t quietly rewrite itself under deadline pressure.
1648
01:00:39,040 –> 01:00:41,920
So the model is simple and it’s intentionally boring.
1649
01:00:41,920 –> 01:00:43,600
Submitter, validator,
1650
01:00:43,600 –> 01:00:46,320
calculator, approver, report publisher.
1651
01:00:46,320 –> 01:00:48,000
A submitter can provide data.
1652
01:00:48,000 –> 01:00:49,560
They can’t edit raw archives.
1653
01:00:49,560 –> 01:00:50,800
They can’t change mappings.
1654
01:00:50,800 –> 01:00:52,480
They can’t adjust reported outputs.
1655
01:00:52,480 –> 01:00:56,360
A validator can review ingestion results and data quality exceptions.
1656
01:00:56,360 –> 01:00:58,560
They can quarantine or accept with exception.
1657
01:00:58,560 –> 01:00:59,680
They can’t publish factors.
1658
01:00:59,680 –> 01:01:01,560
They can’t deploy calculation code.
1659
01:01:01,560 –> 01:01:03,920
A calculator can run the governed compute process.
1660
01:01:03,920 –> 01:01:05,400
They can’t alter raw evidence.
1661
01:01:05,400 –> 01:01:07,120
They can’t approve their own changes.
1662
01:01:07,120 –> 01:01:09,000
They can’t publish the final report.
1663
01:01:09,000 –> 01:01:11,440
An approver can sign off on period close,
1664
01:01:11,440 –> 01:01:14,200
factor library versions and post-close adjustments.
1665
01:01:14,200 –> 01:01:16,320
They don’t need broad data engineering access.
1666
01:01:16,320 –> 01:01:17,960
They need explicit rights to approve
1667
01:01:17,960 –> 01:01:20,400
and their approvals need to be recorded as evidence.
1668
01:01:20,400 –> 01:01:24,240
A report publisher can publish certified data sets and reports
1669
01:01:24,240 –> 01:01:26,040
that consume reported outputs.
1670
01:01:26,040 –> 01:01:30,960
They cannot modify the calculation logic or the data inputs that produce those outputs.
1671
01:01:30,960 –> 01:01:33,680
If you collapse those roles into the sustainability team,
1672
01:01:33,680 –> 01:01:38,080
you’ve built a system where the same identity can create data, change data,
1673
01:01:38,080 –> 01:01:40,080
compute results and approve results.
1674
01:01:40,080 –> 01:01:41,400
That is not control.
1675
01:01:41,400 –> 01:01:43,160
That is conditional chaos.
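The five-role model can be written down as data and checked mechanically. This is a sketch: the action names and the list of conflicting role pairs are assumptions drawn from the descriptions above, not Entra ID role definitions.

```python
# Hypothetical action names per role, following the descriptions above.
ROLE_PERMISSIONS = {
    "submitter": {"submit_data"},
    "validator": {"review_ingestion", "quarantine", "accept_with_exception"},
    "calculator": {"run_calculation"},
    "approver": {"approve_close", "approve_factor_version",
                 "approve_adjustment"},
    "report_publisher": {"publish_report"},
}

# Role pairs one identity should not hold together (an assumption).
CONFLICTING_ROLES = [
    ("submitter", "approver"),
    ("calculator", "approver"),
    ("calculator", "report_publisher"),
]


def is_allowed(role: str, action: str) -> bool:
    """True only if the action is explicitly granted to the role."""
    return action in ROLE_PERMISSIONS.get(role, set())


def separation_violations(assignments: dict) -> dict:
    """Map each user to the conflicting role pairs they hold."""
    violations = {}
    for user, roles in assignments.items():
        clashes = [p for p in CONFLICTING_ROLES if set(p) <= set(roles)]
        if clashes:
            violations[user] = clashes
    return violations
```

Running `separation_violations` over exported group memberships turns "we have separation of duties" into something that can be checked on a schedule and stored as evidence.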
1676
01:01:43,160 –> 01:01:48,240
Now here’s the part everyone tries to dodge.
1677
01:01:48,240 –> 01:01:49,920
Access boundaries by zone.
1678
01:01:49,920 –> 01:01:53,160
Raw, curated and reported are not just storage partitions.
1679
01:01:53,160 –> 01:01:54,560
They are permission boundaries.
1680
01:01:54,560 –> 01:01:57,400
Raw should be readable by the people who need to trace provenance
1681
01:01:57,400 –> 01:01:58,720
and resolve ingestion issues,
1682
01:01:58,720 –> 01:02:04,360
but writable only by controlled ingestion identities, typically service principals executing pipelines.
1683
01:02:04,360 –> 01:02:06,400
Humans don’t get write access to raw evidence.
1684
01:02:06,400 –> 01:02:09,760
They get a submission mechanism that produces new immutable artifacts.
1685
01:02:09,760 –> 01:02:11,000
That’s a different thing.
1686
01:02:11,000 –> 01:02:14,840
Curated should be writable only by the transformation process identities
1687
01:02:14,840 –> 01:02:16,960
and the engineers responsible for the model.
1688
01:02:16,960 –> 01:02:21,040
Broader read access is fine, but write access is a scalpel, not a group membership.
1689
01:02:21,040 –> 01:02:22,760
Reported should be locked down hardest.
1690
01:02:22,760 –> 01:02:27,200
Write access only for the close process and controlled adjustment workflows.
1691
01:02:27,200 –> 01:02:31,520
Read access for reporting, finance, internal audit and whoever consumes the KPIs.
1692
01:02:31,520 –> 01:02:35,760
But nobody should be casually updating reported tables because it’s just a fix.
1693
01:02:35,760 –> 01:02:36,880
Fixes are adjustments.
1694
01:02:36,880 –> 01:02:38,240
Adjustments have approvals.
1695
01:02:38,240 –> 01:02:40,440
Approvals have identity separation.
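The zone boundaries above can be sketched as a deny-by-default write matrix. This is only an illustration: the identity kinds are hypothetical labels, and in a real deployment the enforcement lives in role assignments on storage and compute, not in application code.

```python
# Hypothetical identity kinds; in practice these would map to Entra groups,
# service principals, or managed identities.
ZONE_WRITERS = {
    "raw": {"ingestion_pipeline"},
    "curated": {"transformation_pipeline", "model_engineer"},
    "reported": {"close_process", "adjustment_workflow"},
}


def can_write(identity_kind: str, zone: str) -> bool:
    """Deny by default: only identity kinds listed for a zone may write to it."""
    return identity_kind in ZONE_WRITERS.get(zone, set())
```

A matrix like this is mostly useful as a test fixture: compare it against the exported role assignments and flag anything that grants more than the table allows.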
1696
01:02:40,440 –> 01:02:44,920
And yes, all of this is enforced with Entra groups, service principals, managed identities,
1697
01:02:44,920 –> 01:02:49,200
and RBAC assignments at the storage and compute layers, not in Visio but in the system.
1698
01:02:49,200 –> 01:02:51,960
Now, evidence of control matters as much as control itself.
1699
01:02:51,960 –> 01:02:53,360
You don’t just need permissions.
1700
01:02:53,360 –> 01:02:54,600
You need proof of permissions.
1701
01:02:54,600 –> 01:02:57,760
So you treat Entra assignments, role memberships and privilege changes
1702
01:02:57,760 –> 01:02:59,560
as part of the assurance package.
1703
01:02:59,560 –> 01:03:02,200
When the auditor asks, who could have changed this?
1704
01:03:02,200 –> 01:03:03,640
You don’t answer with a meeting.
1705
01:03:03,640 –> 01:03:08,240
You answer with access history, role definitions and audit logs.
1706
01:03:08,240 –> 01:03:12,920
Which brings us to the most common ESG security anti-pattern, the hero admin.
1707
01:03:12,920 –> 01:03:15,920
The hero admin shows up when the pipeline breaks, the close date is near
1708
01:03:15,920 –> 01:03:18,520
and somebody says, just give me contributor for a minute.
1709
01:03:18,520 –> 01:03:20,840
Temporary elevation becomes permanent.
1710
01:03:20,840 –> 01:03:22,320
Exceptions become normal.
1711
01:03:22,320 –> 01:03:25,640
And then months later you discover your separation of duties is a myth
1712
01:03:25,640 –> 01:03:28,280
because everyone has been operating as everyone.
1713
01:03:28,280 –> 01:03:30,480
The countermeasure isn’t trust people more.
1714
01:03:30,480 –> 01:03:34,440
It’s making elevation visible and costly. If you must allow privileged access,
1715
01:03:34,440 –> 01:03:37,400
you make it time-bound, explicitly approved, and logged.
1716
01:03:37,400 –> 01:03:40,320
You treat it as an incident artifact, not a convenience.
1717
01:03:40,320 –> 01:03:43,680
Because every exception is an entropy generator that will be reused.
1718
01:03:43,680 –> 01:03:45,000
And here is the awkward truth.
1719
01:03:45,000 –> 01:03:47,640
The sustainability organization will fight you on this.
1720
01:03:47,640 –> 01:03:48,760
They’ll say it slows them down.
1721
01:03:48,760 –> 01:03:49,320
They’re correct.
1722
01:03:49,320 –> 01:03:51,240
Controls always slow down change.
1723
01:03:51,240 –> 01:03:52,160
That’s the trade.
1724
01:03:52,160 –> 01:03:55,680
If you want audit grade reporting, you don’t optimize for speed of edits.
1725
01:03:55,680 –> 01:03:57,600
You optimize for survivability under scrutiny.
1726
01:03:57,600 –> 01:04:00,680
So you design the workflow so the compliant path is the easy path.
1727
01:04:00,680 –> 01:04:03,000
Controlled submissions instead of shared folders,
1728
01:04:03,000 –> 01:04:06,440
approved factor publishing instead of spreadsheet swaps, period close gates
1729
01:04:06,440 –> 01:04:10,520
that lock storage and reports that only read from reported outputs.
1730
01:04:10,520 –> 01:04:12,320
Entra is what makes all of that enforceable.
1731
01:04:12,320 –> 01:04:16,040
Without it, Purview shows you lineage of data that anyone could have altered.
1732
01:04:16,040 –> 01:04:17,040
That’s not governance.
1733
01:04:17,040 –> 01:04:19,320
That’s cataloging your own uncertainty.
1734
01:04:19,320 –> 01:04:23,480
Next, we talk about where organizations reintroduce entropy for convenience.
1735
01:04:23,480 –> 01:04:24,840
The reporting layer.
1736
01:04:24,840 –> 01:04:28,000
Reporting layer: Power BI as presentation, not truth.
1737
01:04:28,000 –> 01:04:31,120
Reporting is where most teams undo everything they just built
1738
01:04:31,120 –> 01:04:33,480
because Power BI makes it easy to be helpful.
1739
01:04:33,480 –> 01:04:35,280
Helpful is not a control objective.
1740
01:04:35,280 –> 01:04:38,520
In an auditable ESG stack, Power BI is a presentation layer.
1741
01:04:38,520 –> 01:04:41,080
A thin semantic layer over reported outputs.
1742
01:04:41,080 –> 01:04:42,120
It does not own the math.
1743
01:04:42,120 –> 01:04:43,720
It does not fix missing data.
1744
01:04:43,720 –> 01:04:47,760
And it does not quietly restate history because someone wanted a cleaner chart.
1745
01:04:47,760 –> 01:04:49,840
The system behavior you want is simple.
1746
01:04:49,840 –> 01:04:53,520
Fabric or Synapse produces period-closed KPI tables in the reported zone.
1747
01:04:53,520 –> 01:04:56,520
Power BI reads those tables through certified data sets.
1748
01:04:56,520 –> 01:04:58,480
The report is a window, not a calculator.
1749
01:04:58,480 –> 01:05:00,920
That distinction matters because the report is the thing
1750
01:05:00,920 –> 01:05:06,360
executives screenshot, regulators request, and auditors reconcile against the evidence package.
1751
01:05:06,360 –> 01:05:09,720
If the report can change without a corresponding change in the reported tables,
1752
01:05:09,720 –> 01:05:11,360
you’ve created a second truth.
1753
01:05:11,360 –> 01:05:13,640
And you will spend the next year arguing with yourself.
1754
01:05:13,640 –> 01:05:16,000
So you build two classes of dashboards on purpose.
1755
01:05:16,000 –> 01:05:18,280
The first class is regulatory and assurance reporting.
1756
01:05:18,280 –> 01:05:20,160
It reads only from the reported zone.
1757
01:05:20,160 –> 01:05:21,840
It refreshes on controlled schedules.
1758
01:05:21,840 –> 01:05:23,400
It uses certified data sets.
1759
01:05:23,400 –> 01:05:27,360
It has locked definitions and a release process that looks boring on purpose.
1760
01:05:27,360 –> 01:05:29,880
The second class is management and operations reporting.
1761
01:05:29,880 –> 01:05:32,680
It can read curated and even operational data.
1762
01:05:32,680 –> 01:05:33,600
It can move quickly.
1763
01:05:33,600 –> 01:05:35,720
It can support “what’s happening right now”
1764
01:05:35,720 –> 01:05:36,720
questions.
1765
01:05:36,720 –> 01:05:39,440
But it is explicitly labeled as operational, not reportable.
1766
01:05:39,440 –> 01:05:42,640
Different audience, different expectations, different tolerance for drift.
1767
01:05:42,640 –> 01:05:46,920
If you collapse those into one dashboard, you’ll optimize for executive convenience and
1768
01:05:46,920 –> 01:05:48,720
accidentally publish it as evidence.
1769
01:05:48,720 –> 01:05:52,440
Now, even in a thin semantic layer, you still need governance because semantics are where
1770
01:05:52,440 –> 01:05:53,440
definitions drift.
1771
01:05:53,440 –> 01:05:59,680
Use certified data sets, not personal workspaces and not an analyst’s final pbix.
1772
01:05:59,680 –> 01:06:03,560
Certification is the signal that the data set is backed by controlled sources, has an owner,
1773
01:06:03,560 –> 01:06:05,520
and is part of the assurance boundary.
1774
01:06:05,520 –> 01:06:06,520
Promoted is not enough.
1775
01:06:06,520 –> 01:06:09,520
Promoted is a social tag, certified is a control decision.
1776
01:06:09,520 –> 01:06:10,520
Control the refresh path.
1777
01:06:10,520 –> 01:06:15,240
If the data set refreshes from curated tables, someone will eventually change a curated
1778
01:06:15,240 –> 01:06:20,280
transformation and unintentionally shift a number that was treated as stable.
1779
01:06:20,280 –> 01:06:22,760
Assurance data sets refresh from reported tables only.
1780
01:06:22,760 –> 01:06:23,760
That’s the rule.
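As a rough sketch, that rule can be checked mechanically before a data set is certified. The `reported.` prefix is an assumed naming convention, and the function names are hypothetical; the list of source tables would come from whatever metadata export you already have.

```python
def sources_outside_reported(source_tables: list[str], reported_prefix: str = "reported.") -> list[str]:
    """Return every source table that does not live in the reported zone."""
    return [t for t in source_tables if not t.startswith(reported_prefix)]


def assert_assurance_refresh_path(dataset_name: str, source_tables: list[str]) -> None:
    """Fail loudly if an assurance data set reads from anything outside the reported zone."""
    offenders = sources_outside_reported(source_tables)
    if offenders:
        raise ValueError(f"{dataset_name} reads from non-reported sources: {offenders}")
```

The point of raising instead of warning is that a curated source in an assurance data set is a release blocker, not a note for later.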
1781
01:06:23,760 –> 01:06:27,240
Then you design the visuals that actually survive audit scrutiny.
1782
01:06:27,240 –> 01:06:29,160
Auditors don’t care about your color palette.
1783
01:06:29,160 –> 01:06:30,920
They care about your ability to explain.
1784
01:06:30,920 –> 01:06:35,200
So the mandatory visuals are the ones that surface control-relevant context: targets versus
1785
01:06:35,200 –> 01:06:37,880
actuals, yes, but also confidence indicators.
1786
01:06:37,880 –> 01:06:39,760
Measured versus estimated split.
1787
01:06:39,760 –> 01:06:41,360
Coverage metrics alongside totals.
1788
01:06:41,360 –> 01:06:44,040
And explicit period labels tied to close status.
1789
01:06:44,040 –> 01:06:48,000
A Scope 3 total without a coverage indicator is not a KPI.
1790
01:06:48,000 –> 01:06:49,720
It’s a mood.
1791
01:06:49,720 –> 01:06:52,040
You also enforce drill path discipline.
1792
01:06:52,040 –> 01:06:55,960
The drill path needs to be deterministic: group to region, to site, to source record
1793
01:06:55,960 –> 01:06:59,360
identifiers.
1794
01:06:59,360 –> 01:07:03,400
When a number gets challenged, the report must let you drill to the reported record grain,
1795
01:07:03,400 –> 01:07:07,280
then provide the keys that let an engineer trace lineage back through Purview: period,
1796
01:07:07,280 –> 01:07:10,480
org unit, load ID, and factor library version key.
1797
01:07:10,480 –> 01:07:16,080
If the drill stops at an aggregated chart, your report is a poster, not an audit artifact.
1798
01:07:16,080 –> 01:07:17,960
Now the part everyone ignores.
1799
01:07:17,960 –> 01:07:18,960
Export strategy.
1800
01:07:18,960 –> 01:07:21,120
At period close, you snapshot the outputs.
1801
01:07:21,120 –> 01:07:24,280
Not because Power BI is unreliable, but because people are.
1802
01:07:24,280 –> 01:07:28,560
Auditors often want exactly what was presented at close, and they want it reproducible even
1803
01:07:28,560 –> 01:07:31,840
if someone changes a report later for internal reasons.
1804
01:07:31,840 –> 01:07:37,640
So you export close packages: a PDF snapshot for human-readable continuity, plus a data extract
1805
01:07:37,640 –> 01:07:41,040
that matches the reported KPI tables for machine comparison.
1806
01:07:41,040 –> 01:07:45,680
Store both in the evidence vault with the close metadata, period, data set version,
1807
01:07:45,680 –> 01:07:47,280
report version, and approval reference.
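A close package manifest along those lines might look like the following sketch. The metadata fields are the ones just listed. The SHA-256 digests are an added assumption, included so the stored PDF and extract can later be compared byte for byte against what was filed; the function and field names are hypothetical.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest used to prove a stored artifact has not changed since close."""
    return hashlib.sha256(data).hexdigest()


def build_close_manifest(
    period: str,
    dataset_version: str,
    report_version: str,
    approval_reference: str,
    pdf_bytes: bytes,
    extract_bytes: bytes,
) -> dict:
    """Bundle close metadata with digests of the PDF snapshot and the data extract."""
    return {
        "period": period,
        "dataset_version": dataset_version,
        "report_version": report_version,
        "approval_reference": approval_reference,
        "pdf_sha256": sha256_hex(pdf_bytes),
        "extract_sha256": sha256_hex(extract_bytes),
    }
```

The manifest itself would be written to the evidence vault next to the two files, so the package can be verified without opening Power BI at all.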
1808
01:07:47,280 –> 01:07:48,680
This is not redundant.
1809
01:07:48,680 –> 01:07:53,720
It is defense against “we changed the report layout” becoming “we can’t reproduce what
1810
01:07:53,720 –> 01:07:54,720
we filed.”
1811
01:07:54,720 –> 01:07:56,200
A practical warning.
1812
01:07:56,200 –> 01:08:01,000
The easiest way to reintroduce calculation drift is to allow just one measure to creep in.
1813
01:08:01,000 –> 01:08:04,840
Someone will say the reported tables don’t include a ratio they want, or the business
1814
01:08:04,840 –> 01:08:09,280
wants a different intensity denominator in a visual, or they want to adjust a mapping
1815
01:08:09,280 –> 01:08:12,880
in the report because it’s faster than waiting for the next pipeline run.
1816
01:08:12,880 –> 01:08:17,640
If you allow that, Power BI becomes the calculation engine again, slowly, one convenience
1817
01:08:17,640 –> 01:08:18,640
at a time.
1818
01:08:18,640 –> 01:08:22,880
So the rule stays harsh: the only math allowed in Power BI is presentation math.
1819
01:08:22,880 –> 01:08:26,920
Formatting, simple aggregations that don’t change accounting semantics, and convenience measures
1820
01:08:26,920 –> 01:08:28,800
that do not become the source of truth.
1821
01:08:28,800 –> 01:08:32,200
Anything that changes the meaning of a KPI belongs in the governed calculation zone, gets
1822
01:08:32,200 –> 01:08:35,040
versioned and gets published into the reported tables.
1823
01:08:35,040 –> 01:08:38,200
Because the only thing worse than having no ESG story is having two.
1824
01:08:38,200 –> 01:08:39,200
Optional components.
1825
01:08:39,200 –> 01:08:42,520
Sustainability Manager, ADF, Azure ML: where they fit.
1826
01:08:42,520 –> 01:08:43,680
Optional doesn’t mean irrelevant.
1827
01:08:43,680 –> 01:08:47,880
It means the component is not part of the minimum control surface required to survive assurance.
1828
01:08:47,880 –> 01:08:51,880
You add it when it reduces audit risk or operational friction, not when it makes the demo
1829
01:08:51,880 –> 01:08:52,880
prettier.
1830
01:08:52,880 –> 01:08:55,040
Start with Microsoft Sustainability Manager.
1831
01:08:55,040 –> 01:08:59,160
Microsoft Sustainability Manager is useful when the organization needs structured workflows
1832
01:08:59,160 –> 01:09:03,600
and a sustainability-focused data model without inventing everything from scratch.
1833
01:09:03,600 –> 01:09:07,120
It positions itself around record, report, and reduce.
1834
01:09:07,120 –> 01:09:08,120
That’s not marketing fluff.
1835
01:09:08,120 –> 01:09:09,320
It’s a workflow boundary.
1836
01:09:09,320 –> 01:09:14,680
It can unify siloed data, run emissions calculations, and support reporting modules.
1837
01:09:14,680 –> 01:09:17,320
But the architectural question isn’t, is it good?
1838
01:09:17,320 –> 01:09:20,000
The question is does it help you enforce intent?
1839
01:09:20,000 –> 01:09:23,760
Where it helps is governed data collection and auditability inside its domain.
1840
01:09:23,760 –> 01:09:25,520
The platform can track data changes.
1841
01:09:25,520 –> 01:09:28,840
It can enable auditing for sustainability tables in Dataverse.
1842
01:09:28,840 –> 01:09:33,840
It also has data trail report capabilities described as preview, producing traceability
1843
01:09:33,840 –> 01:09:36,960
across inputs, calculation models, logs, and outputs.
1844
01:09:36,960 –> 01:09:41,680
That can be valuable when your current state is uncontrolled spreadsheets and tribal knowledge.
1845
01:09:41,680 –> 01:09:44,880
Because it gives you a default control story you can actually show.
1846
01:09:44,880 –> 01:09:48,040
Where it doesn’t help is when you treat it as the system of record for everything and
1847
01:09:48,040 –> 01:09:51,560
stop caring about reproducibility at the platform boundary.
1848
01:09:51,560 –> 01:09:55,720
If you already have mature emissions logic, strong factor governance, and a deterministic
1849
01:09:55,720 –> 01:09:59,520
lakehouse model, Sustainability Manager becomes optional.
1850
01:09:59,520 –> 01:10:03,040
You might still use it for workflow and data collection features, but you don’t outsource
1851
01:10:03,040 –> 01:10:05,080
your assurance posture to an app.
1852
01:10:05,080 –> 01:10:08,080
And if you do adopt it, be honest about constraints.
1853
01:10:08,080 –> 01:10:11,120
Auditing configuration is blunt through the standard interface.
1854
01:10:11,120 –> 01:10:15,600
It’s all or nothing for sustainability tables unless you use the Power Platform Web API
1855
01:10:15,600 –> 01:10:17,000
for more granular control.
1856
01:10:17,000 –> 01:10:18,000
That’s manageable.
1857
01:10:18,000 –> 01:10:21,940
It’s also a reminder that audit ready still requires engineering.
1858
01:10:21,940 –> 01:10:23,880
Next, Azure Data Factory.
1859
01:10:23,880 –> 01:10:28,160
If you’re all in on Fabric-native ingestion and your source landscape is simple, ADF is
1860
01:10:28,160 –> 01:10:29,160
optional.
1861
01:10:29,160 –> 01:10:30,520
Fabric can ingest.
1862
01:10:30,520 –> 01:10:31,920
Fabric can orchestrate.
1863
01:10:31,920 –> 01:10:33,680
And for many organizations, that’s enough.
1864
01:10:33,680 –> 01:10:36,720
But ADF remains valuable when reality shows up.
1865
01:10:36,720 –> 01:10:42,520
Complex ERP extraction, IoT fan in, API rate limits, cross system dependencies, and multi-step
1866
01:10:42,520 –> 01:10:46,280
orchestration that spans networks and security boundaries.
1867
01:10:46,280 –> 01:10:50,240
ADF is the thing you use when you need the pipeline to behave like an integration system,
1868
01:10:50,240 –> 01:10:52,000
not like a notebook with optimism.
1869
01:10:52,000 –> 01:10:54,920
Just remember the immutability constraint you already accepted.
1870
01:10:54,920 –> 01:10:57,880
ADF will fail when you try to overwrite immutable paths.
1871
01:10:57,880 –> 01:11:01,240
You’ll see errors like path immutable due to policy because the storage layer is doing
1872
01:11:01,240 –> 01:11:02,240
its job.
1873
01:11:02,240 –> 01:11:06,160
And for certain transformation patterns, ADF data flows can’t write directly to immutable
1874
01:11:06,160 –> 01:11:08,680
containers because they rely on temporary files.
1875
01:11:08,680 –> 01:11:10,360
The pattern stays the same.
1876
01:11:10,360 –> 01:11:14,720
Write to a mutable staging destination, then copy finalized outputs into the immutable
1877
01:11:14,720 –> 01:11:15,720
evidence zone.
1878
01:11:15,720 –> 01:11:17,280
ADF isn’t more enterprise.
1879
01:11:17,280 –> 01:11:18,760
It’s more orchestration.
1880
01:11:18,760 –> 01:11:19,760
That’s different.
1881
01:11:19,760 –> 01:11:23,680
Now, Azure Machine Learning. It’s optional because it doesn’t produce baseline numbers.
1882
01:11:23,680 –> 01:11:26,520
If it does, you’re building a probabilistic accounting system.
1883
01:11:26,520 –> 01:11:28,800
You can’t audit a model’s intuition.
1884
01:11:28,800 –> 01:11:31,440
Use Azure ML for three things only.
1885
01:11:31,440 –> 01:11:35,160
Forecasting, anomaly detection, and scenario modeling.
1886
01:11:35,160 –> 01:11:36,160
Forecasting helps planning.
1887
01:11:36,160 –> 01:11:38,280
Anomaly detection helps data quality.
1888
01:11:38,280 –> 01:11:40,120
Scenario modeling helps reduction strategy.
1889
01:11:40,120 –> 01:11:42,280
None of those are the reported KPI baseline.
1890
01:11:42,280 –> 01:11:43,560
They are overlays.
1891
01:11:43,560 –> 01:11:47,720
And they must be labeled as overlays in the data model and in the reports.
1892
01:11:47,720 –> 01:11:53,800
Model outputs should carry model version, training data window, run timestamp, and clear classification
1893
01:11:53,800 –> 01:11:55,760
as estimated or forecast.
1894
01:11:55,760 –> 01:11:59,960
Otherwise, you’ll inevitably promote a forecast to a fact because it looks clean on
1895
01:11:59,960 –> 01:12:00,960
a slide.
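One way to keep an overlay from being promoted by accident is to make the label part of the record itself. The sketch below is an assumption about shape, not a feature of Azure ML: the class name and fields are hypothetical and simply mirror the metadata just described.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Overlays may only ever be one of these; "reported" is deliberately absent.
ALLOWED_CLASSIFICATIONS = {"estimated", "forecast"}


@dataclass(frozen=True)
class ModelOverlay:
    kpi: str
    value: float
    model_version: str
    training_window_start: date
    training_window_end: date
    run_timestamp: datetime
    classification: str

    def __post_init__(self) -> None:
        # Reject records that could pass for reported facts.
        if self.classification not in ALLOWED_CLASSIFICATIONS:
            raise ValueError(f"overlay classification must be one of {sorted(ALLOWED_CLASSIFICATIONS)}")
        if not self.model_version:
            raise ValueError("model_version is required on every overlay record")
        if self.training_window_start > self.training_window_end:
            raise ValueError("training window start must not be after its end")
```

Because the record is frozen and the classification is validated at construction, a forecast cannot quietly lose its label on the way to a report.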
1896
01:12:00,960 –> 01:12:02,960
So the decision rule is harsh.
1897
01:12:02,960 –> 01:12:08,800
Add optional tooling only when it reduces audit risk, not when it reduces effort.
1898
01:12:08,800 –> 01:12:11,920
Sustainability Manager reduces chaos when you need structured collection and built-in
1899
01:12:11,920 –> 01:12:13,800
sustainability workflows.
1900
01:12:13,800 –> 01:12:17,880
ADF reduces fragility when integration complexity exceeds what Fabric orchestration can
1901
01:12:17,880 –> 01:12:19,080
realistically manage.
1902
01:12:19,080 –> 01:12:22,840
Azure ML adds intelligence, but only if you keep it out of the accounting path.
1903
01:12:22,840 –> 01:12:25,040
Optional components don’t replace the fundamentals.
1904
01:12:25,040 –> 01:12:29,720
They either reinforce them or they accelerate your failure in a more expensive way.
1905
01:12:29,720 –> 01:12:32,760
Next, a short comparison, because every stack can calculate emissions.
1906
01:12:32,760 –> 01:12:35,160
Very few can prove them end to end.
1907
01:12:35,160 –> 01:12:36,840
The short comparison.
1908
01:12:36,840 –> 01:12:39,560
Microsoft versus Snowflake, Databricks, GCP.
1909
01:12:39,560 –> 01:12:43,800
At this point, someone always asks the same question usually with a budget spreadsheet open.
1910
01:12:43,800 –> 01:12:44,800
Why Microsoft?
1911
01:12:44,800 –> 01:12:45,800
Why not Snowflake?
1912
01:12:45,800 –> 01:12:46,800
Why not Databricks?
1913
01:12:46,800 –> 01:12:48,160
Why not just do this on GCP?
1914
01:12:48,160 –> 01:12:51,280
And the answer is not that those stacks can’t calculate emissions.
1915
01:12:51,280 –> 01:12:53,280
They can.
1916
01:12:53,280 –> 01:12:58,120
Any competent data platform can ingest activity data, join it to factor tables and output
1917
01:12:58,120 –> 01:12:59,840
a number labeled Scope 2.
1918
01:12:59,840 –> 01:13:00,920
That part is not rare.
1919
01:13:00,920 –> 01:13:02,240
It’s table stakes.
1920
01:13:02,240 –> 01:13:05,120
The problem is that assurance doesn’t reward computation.
1921
01:13:05,120 –> 01:13:06,560
Assurance rewards proof.
1922
01:13:06,560 –> 01:13:09,160
So the comparison only matters on three axes.
1923
01:13:09,160 –> 01:13:13,000
Identity and access control, lineage and governance and audit evidence as a first class
1924
01:13:13,000 –> 01:13:14,000
output.
1925
01:13:14,000 –> 01:13:15,000
Everything else is noise.
1926
01:13:15,000 –> 01:13:16,560
Start with identity and access.
1927
01:13:16,560 –> 01:13:18,520
Microsoft’s advantage is not that Entra exists.
1928
01:13:18,520 –> 01:13:19,520
Every cloud has IAM.
1929
01:13:19,520 –> 01:13:23,720
The advantage is that the identity plane is already the enterprise default for most organizations
1930
01:13:23,720 –> 01:13:28,600
running Microsoft 365, Azure, and Power Platform, and it reaches into the services you’re
1931
01:13:28,600 –> 01:13:33,080
using for ESG, storage, compute, BI and workflow.
1932
01:13:33,080 –> 01:13:35,480
That matters because role separation isn’t a concept.
1933
01:13:35,480 –> 01:13:38,400
It’s a continuous enforcement problem across the entire stack.
1934
01:13:38,400 –> 01:13:42,960
In Microsoft land, you can actually make submitter, validator, calculator, approver, publisher
1935
01:13:42,960 –> 01:13:47,760
map to real groups and real permissions that propagate into the services doing the work.
1936
01:13:47,760 –> 01:13:53,120
In many non-Microsoft stacks, identity becomes an assembly task, not impossible, just assembled.
1937
01:13:53,120 –> 01:13:57,080
And assembled identity is where temporary access becomes permanent, where service accounts
1938
01:13:57,080 –> 01:14:02,000
become shared and where your separation of duties quietly turns into conditional chaos.
1939
01:14:02,000 –> 01:14:03,480
The platform didn’t fail you.
1940
01:14:03,480 –> 01:14:04,760
Your architecture did.
1941
01:14:04,760 –> 01:14:06,840
But the platform determines how hard it is to fail.
1942
01:14:06,840 –> 01:14:08,520
Now lineage and governance.
1943
01:14:08,520 –> 01:14:13,640
This is the axis where most ESG teams discover the difference between we have data and we can
1944
01:14:13,640 –> 01:14:15,240
explain data.
1945
01:14:15,240 –> 01:14:16,560
Microsoft Purview is not magic.
1946
01:14:16,560 –> 01:14:19,520
It’s just a governance plane that is designed to be a governance plane.
1947
01:14:19,520 –> 01:14:23,800
You register assets, you scan, you capture lineage, you assign owners, you query metadata,
1948
01:14:23,800 –> 01:14:28,000
you walk into an audit room with something more defensible than a diagram in confluence.
1949
01:14:28,000 –> 01:14:32,520
And because Purview integrates into common Microsoft data services, you can build lineage
1950
01:14:32,520 –> 01:14:37,520
that spans ingestion artifacts, lakehouse or warehouse objects, and Power BI consumption
1951
01:14:37,520 –> 01:14:40,160
in a way that is operationally achievable.
1952
01:14:40,160 –> 01:14:43,560
In other ecosystems, governance is usually a product stack you bolt on.
1953
01:14:43,560 –> 01:14:46,640
Databricks has Unity Catalog and lineage capabilities in its ecosystem.
1954
01:14:46,640 –> 01:14:48,640
Snowflake has governance features and partners.
1955
01:14:48,640 –> 01:14:51,360
GCP has data catalog and governance tooling.
1956
01:14:51,360 –> 01:14:52,360
All of that can work.
1957
01:14:52,360 –> 01:14:56,440
But the single pane of glass story becomes single pane of glass after integration work,
1958
01:14:56,440 –> 01:15:00,480
which means it competes with everything else for time, budget and political attention.
1959
01:15:00,480 –> 01:15:04,280
Over time, governance loses those fights, not because people are lazy, but because they get
1960
01:15:04,280 –> 01:15:06,560
measured on delivery, not survivability.
1961
01:15:06,560 –> 01:15:10,480
So Microsoft’s practical advantage is not perfection, it’s friction.
1962
01:15:10,480 –> 01:15:14,400
Less friction to do governance well means it’s more likely to happen and more likely to
1963
01:15:14,400 –> 01:15:16,320
stay current when the system evolves.
1964
01:15:16,320 –> 01:15:17,880
That is what auditors actually experience.
1965
01:15:17,880 –> 01:15:22,040
Now the third axis: audit evidence. This is the point that makes most platform comparisons
1966
01:15:22,040 –> 01:15:23,040
meaningless.
1967
01:15:23,040 –> 01:15:28,840
Audit-grade ESG is evidence management: immutable raw inputs, versioned factors, versioned logic,
1968
01:15:28,840 –> 01:15:30,440
period close configuration.
1969
01:15:30,440 –> 01:15:35,760
All adjustments, snapshots, access logs, approval trails, reproducible reruns.
1970
01:15:35,760 –> 01:15:36,760
That’s the system.
1971
01:15:36,760 –> 01:15:38,760
The report is just an output.
1972
01:15:38,760 –> 01:15:42,160
Microsoft doesn’t automatically give you this either, but the architecture aligns with
1973
01:15:42,160 –> 01:15:45,920
it because the components map cleanly to the evidence life cycle.
1974
01:15:45,920 –> 01:15:48,480
Entra gives you the enforcement surface for role separation.
1975
01:15:48,480 –> 01:15:52,520
ADLS Gen2 with immutability gives you the evidence vault behavior.
1976
01:15:52,520 –> 01:15:56,360
Fabric or Synapse gives you the governed compute surface, where you can implement deterministic
1977
01:15:56,360 –> 01:15:57,960
calculation artifacts.
1978
01:15:57,960 –> 01:16:01,240
Purview gives you lineage as queryable metadata instead of oral tradition.
1979
01:16:01,240 –> 01:16:06,760
Power BI gives you presentation without forcing you to mix computation and visualization.
1980
01:16:06,760 –> 01:16:10,960
That collection forms an integrated control plane story. Not a vendor story, a control
1981
01:16:10,960 –> 01:16:14,600
story. On other stacks you can absolutely build the same control story.
1982
01:16:14,600 –> 01:16:18,480
But you build it, you assemble the identity controls across tools, you assemble lineage
1983
01:16:18,480 –> 01:16:23,120
across transformation engines and BI, you assemble immutability patterns across storage
1984
01:16:23,120 –> 01:16:27,920
and pipeline behavior, you assemble evidence packs as a discipline, not a platform feature.
1985
01:16:27,920 –> 01:16:32,200
And every assembly point becomes a place where policy erodes because policy always erodes
1986
01:16:32,200 –> 01:16:34,680
when intent isn’t enforced by design.
1987
01:16:34,680 –> 01:16:36,400
That’s the uncomfortable truth.
1988
01:16:36,400 –> 01:16:39,440
Architecture is what remains after your governance committee stops meeting.
1989
01:16:39,440 –> 01:16:44,120
Now, to be fair, there are reasons teams choose Snowflake, Databricks, or GCP for ESG.
1990
01:16:44,120 –> 01:16:46,880
They might already run their entire analytics estate there.
1991
01:16:46,880 –> 01:16:49,400
They might have stronger internal skills on that stack.
1992
01:16:49,400 –> 01:16:53,000
They might have vendor constraints or data gravity that makes Microsoft the wrong place
1993
01:16:53,000 –> 01:16:54,000
to compute.
1994
01:16:54,000 –> 01:16:57,720
None of that is invalid, but if they choose those platforms, they still have to answer
1995
01:16:57,720 –> 01:17:02,320
the same assurance questions and they still have to build the same non-negotiables: immutability,
1996
01:17:02,320 –> 01:17:04,960
reproducibility, lineage, separation of duties.
1997
01:17:04,960 –> 01:17:07,160
The stack changes, the physics don’t.
1998
01:17:07,160 –> 01:17:10,200
So the short verdict is this: all stacks can calculate emissions.
1999
01:17:10,200 –> 01:17:14,480
Very few stacks can prove them end-to-end without deliberate architecture that prioritizes
2000
01:17:14,480 –> 01:17:16,320
evidence over convenience.
2001
01:17:16,320 –> 01:17:20,120
Microsoft’s advantage is that it gives you a coherent set of primitives that align
2002
01:17:20,120 –> 01:17:24,560
with audit survivability, especially in organizations already living inside Entra,
2003
01:17:24,560 –> 01:17:26,920
Microsoft 365, and Azure.
2004
01:17:26,920 –> 01:17:29,000
And this episode was never about vendor fandom.
2005
01:17:29,000 –> 01:17:32,920
It was about building something that survives contact with assurance, which is why the next
2006
01:17:32,920 –> 01:17:35,240
section matters more than the comparison.
2007
01:17:35,240 –> 01:17:40,200
The minimal viable auditable ESG architecture is the part you can actually replicate.
2008
01:17:40,200 –> 01:17:43,800
Minimal viable auditable ESG architecture, the replicable blueprint.
2009
01:17:43,800 –> 01:17:47,600
Here’s the part people pretend they want until it forces decisions.
2010
01:17:47,600 –> 01:17:52,320
A minimal viable auditable ESG architecture isn’t minimal because it’s cheap or quick.
2011
01:17:52,320 –> 01:17:56,660
It’s minimal because it contains the smallest set of components and artifacts that can
2012
01:17:56,660 –> 01:18:00,500
survive assurance without turning your team into full-time historians.
2013
01:18:00,500 –> 01:18:04,340
So define the environs clearly: boundaries, components, and produced evidence.
2014
01:18:04,340 –> 01:18:09,220
The boundaries are four zones: raw, curated, reported, and an evidence vault.
2015
01:18:09,220 –> 01:18:13,380
Not because medallion architecture is fashionable, but because it’s the cleanest way to separate
2016
01:18:13,380 –> 01:18:16,540
what happened from what you did to it from what you claim.
2017
01:18:16,540 –> 01:18:18,140
The components are five.
2018
01:18:18,140 –> 01:18:24,060
Entra ID, ADLS Gen2 with immutability, Fabric or Synapse for governed compute, Purview
2019
01:18:24,060 –> 01:18:28,020
for lineage, and Power BI as a thin presentation layer.
2020
01:18:28,020 –> 01:18:32,060
Everything else is optional, and optional means it’s allowed to be absent without collapsing
2021
01:18:32,060 –> 01:18:33,860
audit survivability.
2022
01:18:33,860 –> 01:18:36,220
The artifacts are what make it auditable.
2023
01:18:36,220 –> 01:18:40,900
Load IDs and ingestion logs, immutable raw objects, governed calculation artifacts with
2024
01:18:40,900 –> 01:18:47,260
version control, factor library versions with approval records, period close configuration,
2025
01:18:47,260 –> 01:18:51,020
reported KPI tables, and a close package snapshot.
2026
01:18:51,020 –> 01:18:54,180
If you don’t produce those artifacts, you didn’t build an auditable system.
2027
01:18:54,180 –> 01:18:55,180
You built a dashboard.
2028
01:18:55,180 –> 01:19:00,380
Now walk through one KPI end-to-end because architecture without a trace is just a diagram.
2029
01:19:00,380 –> 01:19:01,940
Pick Scope 2 emissions.
2030
01:19:01,940 –> 01:19:06,620
It’s common enough, and it’s where drift and factor ambiguity show up fast.
2031
01:19:06,620 –> 01:19:12,260
The source is activity data, electricity consumption by site and period from invoices or meters.
2032
01:19:12,260 –> 01:19:15,540
Ingestion lands it in the raw zone as append-only objects.
2033
01:19:15,540 –> 01:19:19,820
Each load gets a load ID, timestamp, source identifier, and submitter identity.
2034
01:19:19,820 –> 01:19:24,620
If the ingestion came from a human submission, it still lands as a new versioned object.
2035
01:19:24,620 –> 01:19:26,660
No overwrite, ever.
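A minimal sketch of that append-only rule, assuming an in-memory stand-in for the raw zone. The class and field names (`AppendOnlyRawZone`, `LoadRecord`) are hypothetical illustrations, not an ADLS API; in a real build the storage layer would enforce the no-overwrite guarantee.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LoadRecord:
    """One ingestion log entry (hypothetical shape)."""
    load_id: str
    loaded_at: str
    source_id: str
    submitter: str
    sha256: str


class AppendOnlyRawZone:
    """In-memory stand-in for an append-only raw zone (illustration only)."""

    def __init__(self):
        self._objects = {}  # load_id -> payload bytes
        self._log = []      # ingestion log, in arrival order

    def ingest(self, payload: bytes, source_id: str, submitter: str) -> LoadRecord:
        # Every submission gets a fresh load ID, so an existing object
        # is never replaced, even when the same source resubmits.
        record = LoadRecord(
            load_id=str(uuid.uuid4()),
            loaded_at=datetime.now(timezone.utc).isoformat(),
            source_id=source_id,
            submitter=submitter,
            sha256=hashlib.sha256(payload).hexdigest(),
        )
        self._objects[record.load_id] = payload
        self._log.append(record)
        return record

    def read(self, load_id: str) -> bytes:
        return self._objects[load_id]

    def log(self):
        return tuple(self._log)
```

A corrected submission from the same person becomes a second load with its own ID; the first one stays retrievable.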
2036
01:19:26,660 –> 01:19:29,500
Validation runs before anything becomes curated.
2037
01:19:29,500 –> 01:19:34,380
Schema checks, unit checks, required dimensions like site, period, and measurement type.
2038
01:19:34,380 –> 01:19:37,500
The validation output is not a log line in the pipeline run.
2039
01:19:37,500 –> 01:19:39,700
It’s an artifact you can retrieve later.
2040
01:19:39,700 –> 01:19:43,980
Pass, fail, warnings, and what was corrected or normalized.
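To make "validation output as an artifact" concrete, here is a rough sketch under stated assumptions: the required field names, the allowed units, and the `ValidationArtifact` shape are all hypothetical. The point is only that the result is a serializable record tied to a load ID, not a line in a run log.

```python
import json
from dataclasses import dataclass, field, asdict

# Assumed required dimensions and units; adjust to your own schema.
REQUIRED = ("site", "period", "measurement_type", "unit", "value")
ALLOWED_UNITS = {"kWh", "MWh"}


@dataclass
class ValidationArtifact:
    load_id: str
    status: str = "pass"
    failures: list = field(default_factory=list)
    warnings: list = field(default_factory=list)
    normalizations: list = field(default_factory=list)


def validate(load_id, rows):
    """Check rows and return a retrievable artifact describing the outcome."""
    art = ValidationArtifact(load_id=load_id)
    for i, row in enumerate(rows):
        missing = [k for k in REQUIRED if row.get(k) in (None, "")]
        if missing:
            art.failures.append({"row": i, "missing": missing})
            continue
        if row["unit"] not in ALLOWED_UNITS:
            art.failures.append({"row": i, "bad_unit": row["unit"]})
        elif row["unit"] == "MWh":
            # Only records that a unit conversion is needed; it does not apply it.
            art.normalizations.append({"row": i, "from": "MWh", "to": "kWh"})
        if row["value"] == 0:
            art.warnings.append({"row": i, "note": "zero consumption"})
    if art.failures:
        art.status = "fail"
    return art


def to_json(art):
    """Serialize the artifact so it can be stored next to the load."""
    return json.dumps(asdict(art), sort_keys=True)
```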
2041
01:19:43,980 –> 01:19:46,060
Then curated: you standardize the shape.
2042
01:19:46,060 –> 01:19:47,460
Sites map to org units.
2043
01:19:47,460 –> 01:19:51,220
Units normalize. Missing dimensions get flagged, not silently filled.
2044
01:19:51,220 –> 01:19:54,220
This is also where you enforce controlled vocab.
2045
01:19:54,220 –> 01:19:57,380
Electricity type, supply identifiers, region codes.
2046
01:19:57,380 –> 01:20:02,340
The curated tables carry quality flags forward because clean data that hides uncertainty is
2047
01:20:02,340 –> 01:20:03,500
a liability.
2048
01:20:03,500 –> 01:20:06,780
Then the calculation zone: Fabric Lakehouse or Synapse.
2049
01:20:06,780 –> 01:20:11,340
This is where Scope 2 emissions get computed using versioned logic, not DAX.
2050
01:20:11,340 –> 01:20:13,820
Not the report, a governed artifact.
2051
01:20:13,820 –> 01:20:16,340
The computation binds to two things explicitly.
2052
01:20:16,340 –> 01:20:19,940
The activity load IDs and the factor library version key.
2053
01:20:19,940 –> 01:20:23,180
If the job doesn’t have a factor version key, it fails.
2054
01:20:23,180 –> 01:20:26,780
If multiple factors match due to sloppy mappings, it fails.
2055
01:20:26,780 –> 01:20:28,140
Deterministic selection or no selection.
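The fail-closed factor binding can be sketched like this. It is an illustration under assumptions: the `Factor` shape, the region-based match, and the row field names are hypothetical, and any factor values you pass in are your own, not values from a real factor library.

```python
from dataclasses import dataclass


class FactorSelectionError(Exception):
    """Raised when factor selection is not deterministic."""


@dataclass(frozen=True)
class Factor:
    version_key: str
    region: str
    kg_co2e_per_kwh: float


def select_factor(factors, version_key, region):
    # No version key: the job fails instead of falling back to "latest".
    if not version_key:
        raise FactorSelectionError("no factor version key supplied")
    matches = [
        f for f in factors
        if f.version_key == version_key and f.region == region
    ]
    # Zero matches or several matches both fail: exactly one, or nothing.
    if len(matches) != 1:
        raise FactorSelectionError(
            f"expected exactly 1 factor for {region}, found {len(matches)}"
        )
    return matches[0]


def scope2_emissions(activity_rows, factors, version_key):
    """Compute emissions per row, carrying load ID and factor version forward."""
    results = []
    for row in activity_rows:
        factor = select_factor(factors, version_key, row["region"])
        results.append({
            "load_id": row["load_id"],
            "site": row["site"],
            "kg_co2e": row["kwh"] * factor.kg_co2e_per_kwh,
            "factor_version_key": factor.version_key,
        })
    return results
```

Every output row names the load it came from and the factor version it used, which is what makes the number traceable later.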
2056
01:20:28,140 –> 01:20:29,460
Now period close.
2057
01:20:29,460 –> 01:20:31,540
Period close is not a calendar event.
2058
01:20:31,540 –> 01:20:34,860
It’s a state change that freezes your ability to rewrite the past.
2059
01:20:34,860 –> 01:20:38,580
You freeze the inputs by selecting the accepted load IDs for the period.
2060
01:20:38,580 –> 01:20:43,220
You freeze the factors by binding the approved factor library version IDs to that period.
2061
01:20:43,220 –> 01:20:47,260
You freeze the logic by referencing the released calculation artifact version.
2062
01:20:47,260 –> 01:20:51,460
Then you publish the reported outputs: the KPI tables for that period, with keys for period,
2063
01:20:51,460 –> 01:20:55,740
org unit, method, measured-versus-estimated flags, and the factor version key used.
2064
01:20:55,740 –> 01:21:00,780
Then you lock. ADLS immutability applies to the raw evidence and to the published close
2065
01:21:00,780 –> 01:21:02,140
artifacts:
2066
01:21:02,140 –> 01:21:06,220
Factor library snapshot, close configuration, and reported outputs as needed for your
2067
01:21:06,220 –> 01:21:07,540
evidence strategy.
2068
01:21:07,540 –> 01:21:10,220
You don’t have to make every table immutable forever.
2069
01:21:10,220 –> 01:21:12,380
You do have to make the close package immutable.
2070
01:21:12,380 –> 01:21:13,380
That’s the point.
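One way to picture the close package is as a single canonical document plus a digest, sketched here with hypothetical field names. The hash only detects later changes; actually preventing them is the job of the storage-level immutability policy, which this sketch does not model.

```python
import hashlib
import json


def build_close_package(period, load_ids, factor_version_key, calc_version, kpi_rows):
    """Assemble the frozen inputs, factors, logic version, and outputs."""
    package = {
        "period": period,
        "accepted_load_ids": sorted(load_ids),
        "factor_version_key": factor_version_key,
        "calculation_artifact_version": calc_version,
        "reported_kpis": kpi_rows,
    }
    # Canonical JSON (sorted keys, fixed separators) so the same content
    # always yields the same bytes and therefore the same digest.
    canonical = json.dumps(package, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return canonical, digest


def verify_close_package(canonical, expected_digest):
    """Return True only if the stored package still matches its digest."""
    actual = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return actual == expected_digest
```

Storing the digest alongside the sign-off record gives you a cheap check that the package an auditor opens is the one that was closed.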
2071
01:21:13,380 –> 01:21:14,380
Then Power BI.
2072
01:21:14,380 –> 01:21:16,260
Power BI reads reported tables only.
2073
01:21:16,260 –> 01:21:17,540
The dataset is certified.
2074
01:21:17,540 –> 01:21:18,860
The refresh is controlled.
2075
01:21:18,860 –> 01:21:23,500
The visuals include the confidence context, measured versus estimated, coverage indicators,
2076
01:21:23,500 –> 01:21:26,180
and the drill path down to record identifiers.
2077
01:21:26,180 –> 01:21:29,300
When someone challenges the KPI, you don’t debate, you drill.
2078
01:21:29,300 –> 01:21:30,980
And Purview stitches the whole path.
2079
01:21:30,980 –> 01:21:33,940
Power BI report to dataset, dataset to reported tables.
2080
01:21:33,940 –> 01:21:36,100
Reported tables to calculation artifacts.
2081
01:21:36,100 –> 01:21:39,900
Calculation artifacts to curated inputs, curated inputs to raw loads, raw loads to sources
2082
01:21:39,900 –> 01:21:41,220
and submission identities.
2083
01:21:41,220 –> 01:21:43,220
That lineage is not for beauty. It’s for
2084
01:21:43,220 –> 01:21:46,980
the day someone asks, “prove it,” while your calendar is already on fire.
2085
01:21:46,980 –> 01:21:48,460
Finally, sequencing.
2086
01:21:48,460 –> 01:21:51,860
Because this is where most teams implode by trying to boil the ocean.
2087
01:21:51,860 –> 01:21:54,540
Week one, pick one KPI and one data source.
2088
01:21:54,540 –> 01:21:57,860
Build ingestion into raw with load IDs and validation artifacts.
2089
01:21:57,860 –> 01:22:04,180
Week two, build the curated model for that KPI, including quality flags and controlled dimensions.
2090
01:22:04,180 –> 01:22:08,740
Week three, implement the governed calculation zone with factor version binding and a reported
2091
01:22:08,740 –> 01:22:09,740
output table.
2092
01:22:09,740 –> 01:22:14,620
Week four, register assets in Purview and build a thin Power BI report that drills to record
2093
01:22:14,620 –> 01:22:16,660
IDs and shows factor version keys.
2094
01:22:16,660 –> 01:22:17,860
That’s done.
2095
01:22:17,860 –> 01:22:20,340
Not perfect, done in the only sense that matters.
2096
01:22:20,340 –> 01:22:24,660
The auditor’s questions are answerable from systems, not from meetings.
2097
01:22:24,660 –> 01:22:29,660
Auditable ESG in Microsoft isn’t about dashboards, it’s about immutable data, versioned calculations
2098
01:22:29,660 –> 01:22:32,740
and lineage you can explain to an auditor without PowerPoint.
2099
01:22:32,740 –> 01:22:37,380
If you want the next layer, the ESG data model itself, raw versus curated versus reported
2100
01:22:37,380 –> 01:22:40,660
and how to enforce period close, watch the next episode and subscribe.