The Anatomy of an Auditable ESG Stack

Mirko Peters · Podcasts


1
00:00:00,000 –> 00:00:02,880
Most organizations treat ESG reporting like a narrative.

2
00:00:02,880 –> 00:00:04,800
Auditors treat it like evidence.

3
00:00:04,800 –> 00:00:06,680
And evidence has rules: origin, integrity,

4
00:00:06,680 –> 00:00:08,800
repeatability, and access control.

5
00:00:08,800 –> 00:00:12,360
If your number exists because someone edited a spreadsheet,

6
00:00:12,360 –> 00:00:13,800
your stack isn’t a stack.

7
00:00:13,800 –> 00:00:14,720
It’s a story.

8
00:00:14,720 –> 00:00:17,120
In this episode, this is what gets built.

9
00:00:17,120 –> 00:00:21,560
A minimal, auditable OESG architecture on Microsoft Cloud

10
00:00:21,560 –> 00:00:24,280
that you can replicate: identity, immutability,

11
00:00:24,280 –> 00:00:26,400
governed calculations, lineage,

12
00:00:26,400 –> 00:00:29,280
and a reporting layer that doesn’t rewrite history.

13
00:00:29,280 –> 00:00:31,520
And there’s one reason dashboards are the fastest path

14
00:00:31,520 –> 00:00:32,480
to audit failure.

15
00:00:32,480 –> 00:00:33,360
Coming up.

16
00:00:33,360 –> 00:00:35,040
The foundational misunderstanding.

17
00:00:35,040 –> 00:00:36,320
ESG isn’t a report.

18
00:00:36,320 –> 00:00:37,840
It’s a system of record.

19
00:00:37,840 –> 00:00:39,880
The core misconception is comforting.

20
00:00:39,880 –> 00:00:42,760
ESG is a document, a disclosure, a set of charts,

21
00:00:42,760 –> 00:00:45,040
a few paragraphs that say we’re improving.

22
00:00:45,040 –> 00:00:47,560
That framing works right up until someone asks for proof.

23
00:00:47,560 –> 00:00:49,800
In architectural terms, ESG is not a report.

24
00:00:49,800 –> 00:00:51,040
It is a system of record.

25
00:00:51,040 –> 00:00:54,160
That distinction matters because a report is an output artifact.

26
00:00:54,160 –> 00:00:56,200
It can be produced by almost any workflow,

27
00:00:56,200 –> 00:00:58,840
including workflows that should never survive contact

28
00:00:58,840 –> 00:00:59,760
with assurance.

29
00:00:59,760 –> 00:01:01,600
A system of record is different.

30
00:01:01,600 –> 00:01:04,640
It is a controlled environment where inputs, transformations

31
00:01:04,640 –> 00:01:07,680
and outputs are all tracked, replayable, and attributable.

32
00:01:07,680 –> 00:01:10,880
And OESG, operational ESG, is simply

33
00:01:10,880 –> 00:01:14,080
the adult version of ESG, measurable, decision-ready,

34
00:01:14,080 –> 00:01:15,000
and auditable.

35
00:01:15,000 –> 00:01:18,200
If ESG is going to be used in corporate disclosures, regulatory

36
00:01:18,200 –> 00:01:20,120
submissions or investor communications,

37
00:01:20,120 –> 00:01:22,880
the underlying system has to behave like financial reporting

38
00:01:22,880 –> 00:01:26,160
systems, not in theme, in mechanics.

39
00:01:26,160 –> 00:01:28,800
So the anatomy starts with a control system model.

40
00:01:28,800 –> 00:01:31,840
Inputs, transformations, outputs, and attestations.

41
00:01:31,840 –> 00:01:33,520
Inputs are the operational facts.

42
00:01:33,520 –> 00:01:36,000
Energy consumption, fuel use, travel, procurement,

43
00:01:36,000 –> 00:01:39,000
line items, workforce counts, water usage.

44
00:01:39,000 –> 00:01:40,840
Transformations are the governed processes

45
00:01:40,840 –> 00:01:43,920
that normalize units, map to organizational structure,

46
00:01:43,920 –> 00:01:46,240
apply emission factors, and compute KPIs.

47
00:01:46,240 –> 00:01:49,440
Outputs are the period-specific KPI tables and disclosures.

48
00:01:49,440 –> 00:01:52,600
Attestations are the approvals, sign-offs, and audit artifacts

49
00:01:52,600 –> 00:01:54,840
that prove the outputs were produced under control.
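
To make the inputs-to-outputs chain concrete, here is a minimal sketch of one transformation step: normalize units, then apply an emission factor. The record fields, the `TO_KWH` table, and the factor value are illustrative assumptions, not figures from the episode or from any real factor library.

```python
from dataclasses import dataclass

# Unit conversions to kWh (1 MWh = 1000 kWh).
TO_KWH = {"kWh": 1.0, "MWh": 1000.0}

# Hypothetical emission factor in kg CO2e per kWh, for illustration only.
FACTOR_KG_PER_KWH = 0.5


@dataclass(frozen=True)
class ActivityRecord:
    """An operational fact as it arrives from a source system."""
    source_system: str
    facility: str
    quantity: float
    unit: str


def normalize_to_kwh(record: ActivityRecord) -> float:
    """Transformation step 1: bring every quantity to a common unit."""
    return record.quantity * TO_KWH[record.unit]


def emissions_kg(record: ActivityRecord, factor: float = FACTOR_KG_PER_KWH) -> float:
    """Transformation step 2: apply the emission factor to the normalized quantity."""
    return normalize_to_kwh(record) * factor


records = [
    ActivityRecord("erp", "plant-a", 2.0, "MWh"),
    ActivityRecord("erp", "plant-a", 500.0, "kWh"),
]
total_kg = sum(emissions_kg(r) for r in records)
```

The point of the sketch is that the output is a pure function of the inputs and the factor, so the same records and the same factor always give the same total.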

50
00:01:54,840 –> 00:01:57,360
Most organizations skip straight to outputs.

51
00:01:57,360 –> 00:01:59,480
They build dashboards, they produce slides,

52
00:01:59,480 –> 00:02:02,560
they call it reporting, but they never build the chain of custody.

53
00:02:02,560 –> 00:02:05,280
Chain of custody is the real product, not the pretty chart,

54
00:02:05,280 –> 00:02:07,080
because assurance doesn’t audit your chart.

55
00:02:07,080 –> 00:02:10,080
It audits whether the number behind the chart is defensible,

56
00:02:10,080 –> 00:02:12,160
where it came from, who touched it, how it changed,

57
00:02:12,160 –> 00:02:14,240
which logic produced it, and whether anyone could have

58
00:02:14,240 –> 00:02:16,120
quietly altered it after close.

59
00:02:16,120 –> 00:02:19,360
This is where deterministic versus probabilistic ESG shows up.

60
00:02:19,360 –> 00:02:21,760
Deterministic ESG is boring and boring is good.

61
00:02:21,760 –> 00:02:24,280
Given the same raw inputs, the same factor versions,

62
00:02:24,280 –> 00:02:26,560
and the same calculation logic, the system produces

63
00:02:26,560 –> 00:02:28,080
the same outputs every time.

64
00:02:28,080 –> 00:02:31,120
Re-run last year two years from now, and you get the same result.

65
00:02:31,120 –> 00:02:33,840
That’s what auditors expect, even if they don’t say the word,

66
00:02:33,840 –> 00:02:35,040
deterministic.
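
One way to check that property, sketched under assumptions: fingerprint everything that determines a result (inputs, factor version, logic version) and compare fingerprints across runs. The function name `run_fingerprint` and the version strings are hypothetical.

```python
import hashlib
import json


def run_fingerprint(inputs, factor_version, logic_version):
    """Hash everything that determines a result, serialized in a canonical order."""
    payload = json.dumps(
        {
            "inputs": inputs,
            "factor_version": factor_version,
            "logic_version": logic_version,
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


inputs = [{"meter": "m-1", "kwh": 1200.0}, {"meter": "m-2", "kwh": 800.0}]

original = run_fingerprint(inputs, "2023.1", "v1")
replay = run_fingerprint(inputs, "2023.1", "v1")   # same everything
drifted = run_fingerprint(inputs, "2024.1", "v1")  # factor version changed
```

If `original` and `replay` match, the run is a replay. If the fingerprint differs, something in the chain moved, and the difference is visible instead of silent.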

67
00:02:35,040 –> 00:02:37,640
Probabilistic ESG is what you get when human edits

68
00:02:37,640 –> 00:02:39,960
are allowed to masquerade as process.

69
00:02:39,960 –> 00:02:42,360
Numbers drift because someone fixed a file,

70
00:02:42,360 –> 00:02:45,840
optimized a model, or updated a mapping.

71
00:02:45,840 –> 00:02:47,760
The system still produces an output,

72
00:02:47,760 –> 00:02:49,680
but it can’t reproduce its own past.

73
00:02:49,680 –> 00:02:50,720
It can’t explain itself.

74
00:02:50,720 –> 00:02:52,080
It can’t prove integrity.

75
00:02:52,080 –> 00:02:54,440
And once you can’t reproduce, you can’t assure.

76
00:02:54,440 –> 00:02:55,840
Here’s the thing most people miss.

77
00:02:55,840 –> 00:02:57,360
Auditors don’t need perfection.

78
00:02:57,360 –> 00:02:58,640
They need controllability.

79
00:02:58,640 –> 00:03:00,840
They need you to show that changes are visible,

80
00:03:00,840 –> 00:03:02,520
bounded, approved, and attributable.

81
00:03:02,520 –> 00:03:05,120
When you can’t do that, every number becomes a debate.

82
00:03:05,120 –> 00:03:06,360
And debates are expensive.

83
00:03:06,360 –> 00:03:09,600
So let’s talk about the audit questions that break weak stacks.

84
00:03:09,600 –> 00:03:12,320
Not because auditors are evil, because this is their job.

85
00:03:12,320 –> 00:03:13,920
Who changed it? When did they change it?

86
00:03:13,920 –> 00:03:15,080
Why did they change it?

87
00:03:15,080 –> 00:03:16,360
What approval existed?

88
00:03:16,360 –> 00:03:18,240
What version of the factors did you use?

89
00:03:18,240 –> 00:03:20,120
What version of the calculation logic did you use?

90
00:03:20,120 –> 00:03:22,280
What inputs were in scope for the period close?

91
00:03:22,280 –> 00:03:24,560
Who had access to alter raw data, curated data,

92
00:03:24,560 –> 00:03:25,480
and reported outputs?

93
00:03:25,480 –> 00:03:28,240
Can you show lineage from the KPI back to the source record

94
00:03:28,240 –> 00:03:29,840
without reconstructing a PowerPoint?

95
00:03:29,840 –> 00:03:31,640
If your answer to any of these is “we think,”

96
00:03:31,640 –> 00:03:33,120
you don’t have OESG.

97
00:03:33,120 –> 00:03:34,240
You have a narrative.

98
00:03:34,240 –> 00:03:36,080
Now here’s where it gets uncomfortable.

99
00:03:36,080 –> 00:03:38,720
Most ESG programs treat the sustainability team

100
00:03:38,720 –> 00:03:40,040
like the owner of truth.

101
00:03:40,040 –> 00:03:41,720
But systems don’t care about job titles,

102
00:03:41,720 –> 00:03:44,240
systems care about permissions and pathways.

103
00:03:44,240 –> 00:03:47,160
If a single person can both submit data and adjust

104
00:03:47,160 –> 00:03:48,920
the calculation and publish the dashboard,

105
00:03:48,920 –> 00:03:50,000
you don’t have governance.

106
00:03:50,000 –> 00:03:52,160
You have conditional chaos.

107
00:03:52,160 –> 00:03:53,680
And it accumulates.

108
00:03:53,680 –> 00:03:56,000
Every exception becomes an entropy generator.

109
00:03:56,000 –> 00:03:57,840
One more undocumented pathway for numbers

110
00:03:57,840 –> 00:03:59,920
to change without leaving a clean trail.

111
00:03:59,920 –> 00:04:01,880
This clicked for a lot of teams when

112
00:04:01,880 –> 00:04:03,960
assurance started asking for evidence packs,

113
00:04:03,960 –> 00:04:06,960
not just the final KPI, but the supporting documents,

114
00:04:06,960 –> 00:04:09,720
the factor library provenance, the ingestion logs,

115
00:04:09,720 –> 00:04:12,200
the validation results, and the approvals.

116
00:04:12,200 –> 00:04:14,360
Suddenly the ESG report wasn’t the deliverable.

117
00:04:14,360 –> 00:04:17,000
The deliverable was the ability to prove the report.

118
00:04:17,000 –> 00:04:18,760
So the architecture has a simple objective:

119
00:04:18,760 –> 00:04:20,720
enforce chain of custody at scale.

120
00:04:20,720 –> 00:04:23,080
That means every ESG number has to be traceable

121
00:04:23,080 –> 00:04:24,720
through four properties.

122
00:04:24,720 –> 00:04:25,560
Origin:

123
00:04:25,560 –> 00:04:27,560
The system can identify the source record

124
00:04:27,560 –> 00:04:28,800
and the source system.

125
00:04:28,800 –> 00:04:29,640
Transformation:

126
00:04:29,640 –> 00:04:31,880
The system can show which pipeline and which logic

127
00:04:31,880 –> 00:04:33,560
produced the derived record.

128
00:04:33,560 –> 00:04:34,400
Integrity:

129
00:04:34,400 –> 00:04:37,560
The system can show the data wasn’t overwritten post-close.

130
00:04:37,560 –> 00:04:40,600
And changes are recorded as new versions or adjustments,

131
00:04:40,600 –> 00:04:42,280
not silent edits.

132
00:04:42,280 –> 00:04:45,280
Access: the system can show who could touch what and when.
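
As a rough sketch, those four properties can be carried as fields on every reported number, with a check that none of them is blank. The `CustodyRecord` type and all field names and values are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustodyRecord:
    kpi: str
    value: float
    source_system: str      # origin: where the fact came from
    source_record_id: str   # origin: which record in that system
    pipeline: str           # transformation: which pipeline produced it
    logic_version: str      # transformation: which logic release was used
    content_sha256: str     # integrity: hash of the stored content
    version: int            # integrity: corrections become new versions
    written_by: str         # access: identity that wrote the record


def missing_custody_fields(record: CustodyRecord) -> list:
    """Return the names of custody fields that are empty or None."""
    return [name for name, value in vars(record).items() if value in ("", None)]


complete = CustodyRecord(
    kpi="scope2_kg", value=1250.0,
    source_system="erp", source_record_id="INV-001",
    pipeline="ingest_utility_invoices", logic_version="v1",
    content_sha256="0" * 64, version=1, written_by="svc-pipeline",
)
orphan = CustodyRecord(
    kpi="scope2_kg", value=1250.0,
    source_system="erp", source_record_id="",
    pipeline="ingest_utility_invoices", logic_version="v1",
    content_sha256="0" * 64, version=1, written_by="svc-pipeline",
)
```

A number like `orphan`, with no source record behind it, is exactly the kind that turns into a debate during assurance.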

133
00:04:45,280 –> 00:04:47,440
Once you accept ESG as a system of record,

134
00:04:47,440 –> 00:04:48,920
everything else becomes obvious.

135
00:04:48,920 –> 00:04:51,600
Dashboards become presentation, not computation.

136
00:04:51,600 –> 00:04:53,520
Spreadsheets become controlled submissions,

137
00:04:53,520 –> 00:04:55,040
not a source of truth.

138
00:04:55,040 –> 00:04:57,840
One-off fixes become formal adjustments with approvals.

139
00:04:57,840 –> 00:05:00,000
And every component you choose in Microsoft Cloud

140
00:05:00,000 –> 00:05:02,760
starts mapping to a property the auditor will eventually

141
00:05:02,760 –> 00:05:03,280
demand.

142
00:05:03,280 –> 00:05:05,560
Now before we go further, you need a working definition

143
00:05:05,560 –> 00:05:09,320
of auditable in system terms, because it’s not a checkbox

144
00:05:09,320 –> 00:05:11,440
and it’s definitely not a screenshot.

145
00:05:11,440 –> 00:05:12,400
That comes next.

146
00:05:12,400 –> 00:05:15,040
The audit-grade requirements: immutability, reproducibility,

147
00:05:15,040 –> 00:05:16,800
lineage, separation of duties.

148
00:05:16,800 –> 00:05:19,400
So what does auditable mean when it stops being a vibe

149
00:05:19,400 –> 00:05:21,400
and starts being a system property?

150
00:05:21,400 –> 00:05:22,960
It collapses into four requirements,

151
00:05:22,960 –> 00:05:25,600
not because Microsoft says so, because auditors behave

152
00:05:25,600 –> 00:05:28,280
predictably and systems either withstand that pressure

153
00:05:28,280 –> 00:05:29,360
or they don’t.

154
00:05:29,360 –> 00:05:32,120
Immutability, reproducibility, lineage,

155
00:05:32,120 –> 00:05:33,840
and separation of duties.

156
00:05:33,840 –> 00:05:35,960
First, immutability.

157
00:05:35,960 –> 00:05:38,840
Immutability is not, we promise we won’t change it.

158
00:05:38,840 –> 00:05:41,640
Immutability is, the platform will not let you change it.

159
00:05:41,640 –> 00:05:42,960
That’s the entire point.

160
00:05:42,960 –> 00:05:45,640
On Microsoft Cloud, that shows up as write-once,

161
00:05:45,640 –> 00:05:47,800
read-many (WORM) behavior on Azure Blob Storage

162
00:05:47,800 –> 00:05:51,240
or ADLS Gen 2 through immutable storage policies.

163
00:05:51,240 –> 00:05:54,080
After period close, your raw evidence and your period outputs

164
00:05:54,080 –> 00:05:56,960
have to stop being mutable objects and become records.

165
00:05:56,960 –> 00:05:59,200
That distinction matters because most ESG programs

166
00:05:59,200 –> 00:06:01,240
close a period socially, not technically.

167
00:06:01,240 –> 00:06:03,320
People agree it’s closed, but the storage layer

168
00:06:03,320 –> 00:06:04,520
still allows overwrites.

169
00:06:04,520 –> 00:06:06,800
So someone fixes a typo, reruns a pipeline,

170
00:06:06,800 –> 00:06:09,040
uploads a corrected file, and now the evidence

171
00:06:09,040 –> 00:06:10,840
for the closed period silently changes.

172
00:06:10,840 –> 00:06:12,560
Your process still feels controlled,

173
00:06:12,560 –> 00:06:14,120
but the system behavior is not.

174
00:06:14,120 –> 00:06:15,560
Auditors don’t audit feelings.

175
00:06:15,560 –> 00:06:17,560
They audit whether changes were possible.

176
00:06:17,560 –> 00:06:20,080
Time-based retention is the operational version

177
00:06:20,080 –> 00:06:21,440
of immutability.

178
00:06:21,440 –> 00:06:23,320
You lock data for a defined interval,

179
00:06:23,320 –> 00:06:25,320
so it can’t be modified or deleted.

180
00:06:25,320 –> 00:06:27,400
Legal hold is the litigation version.

181
00:06:27,400 –> 00:06:30,120
It stays locked until someone with authority clears it.

182
00:06:30,120 –> 00:06:31,760
The consequence is the same either way.

183
00:06:31,760 –> 00:06:33,760
Overwrites become illegal.

184
00:06:33,760 –> 00:06:37,080
Which means your pipeline design has to evolve from replace

185
00:06:37,080 –> 00:06:38,640
to publish a new version.
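
The shift from replace to publish-a-new-version can be sketched with a small in-memory stand-in. This is not the Azure storage API; the `AppendOnlyStore` class and its key layout are hypothetical, and it only mimics the write-once behavior a WORM policy would enforce.

```python
class AppendOnlyStore:
    """In-memory stand-in for write-once storage: existing keys cannot be rewritten."""

    def __init__(self):
        self._objects = {}

    def put(self, key: str, data: bytes) -> None:
        # Mimic a WORM policy: any write to an existing key is refused.
        if key in self._objects:
            raise PermissionError(f"{key} is immutable and cannot be overwritten")
        self._objects[key] = data

    def publish(self, name: str, data: bytes) -> str:
        """Write data as the next version of `name` and return the new key."""
        existing = sum(1 for k in self._objects if k.startswith(f"{name}/v"))
        key = f"{name}/v{existing + 1}"
        self.put(key, data)
        return key


store = AppendOnlyStore()
first = store.publish("reported/fy2023/kpi.csv", b"scope2_kg,1250\n")
second = store.publish("reported/fy2023/kpi.csv", b"scope2_kg,1260\n")
```

A pipeline written against `publish` never needs an overwrite, so it would not collide with an immutability policy. A pipeline written against "replace the file" fails the first time it tries.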

186
00:06:38,640 –> 00:06:40,640
Second, reproducibility.

187
00:06:40,640 –> 00:06:44,280
Reproducibility is the ability to rerun FY in FY+2

188
00:06:44,280 –> 00:06:45,480
and get the same result.

189
00:06:45,480 –> 00:06:47,560
Not similar, not close.

190
00:06:47,560 –> 00:06:48,640
The same.

191
00:06:48,640 –> 00:06:51,200
That means three things must be frozen per period.

192
00:06:51,200 –> 00:06:53,560
Inputs, factors, and logic.

193
00:06:53,560 –> 00:06:55,040
Most people only freeze inputs,

194
00:06:55,040 –> 00:06:56,680
and even that is usually wishful thinking.

195
00:06:56,680 –> 00:06:58,920
The system needs to freeze the factor library versions

196
00:06:58,920 –> 00:07:01,360
used for that period and freeze the calculation artifacts

197
00:07:01,360 –> 00:07:02,200
that reference them.

198
00:07:02,200 –> 00:07:04,880
If you rerun with latest factors or latest code,

199
00:07:04,880 –> 00:07:06,320
you’re not reproducing history.

200
00:07:06,320 –> 00:07:07,040
You’re rewriting it.
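
A minimal sketch of what freezing factors and logic per period could look like: each period carries a manifest that pins versions, and a rerun resolves through the manifest instead of reaching for "latest". The factor values, version labels, and names such as `PeriodManifest` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical factor library keyed by version; values are illustrative only.
FACTOR_LIBRARY = {
    "2023.1": {"electricity_kg_per_kwh": 0.5},
    "2024.1": {"electricity_kg_per_kwh": 0.25},
}


def calc_v1(kwh: float, factors: dict) -> float:
    """Released calculation logic, version v1."""
    return kwh * factors["electricity_kg_per_kwh"]


# Released logic keyed by version, so old releases stay callable.
LOGIC_RELEASES = {"v1": calc_v1}


@dataclass(frozen=True)
class PeriodManifest:
    """What was frozen at period close."""
    period: str
    factor_version: str
    logic_version: str


def rerun(manifest: PeriodManifest, kwh: float) -> float:
    """Recompute a period using only the versions pinned in its manifest."""
    factors = FACTOR_LIBRARY[manifest.factor_version]
    logic = LOGIC_RELEASES[manifest.logic_version]
    return logic(kwh, factors)


fy2023 = PeriodManifest(period="FY2023", factor_version="2023.1", logic_version="v1")
result = rerun(fy2023, 1000.0)
```

Even though a newer factor version exists in the library, the FY2023 rerun still uses `2023.1`, which is the behavior the reproducibility requirement asks for.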

201
00:07:07,040 –> 00:07:09,880
Reproducibility is why dashboard math is an audit trap.

202
00:07:09,880 –> 00:07:12,320
You can’t prove what logic produced last year’s number

203
00:07:12,320 –> 00:07:15,080
if the logic lives in a constantly edited semantic model.

204
00:07:15,080 –> 00:07:17,280
Even if the code is technically visible,

205
00:07:17,280 –> 00:07:19,200
it’s not governed like a calculation engine.

206
00:07:19,200 –> 00:07:21,120
There’s no concept of a period-bound release,

207
00:07:21,120 –> 00:07:23,120
an approved version and a locked output.

208
00:07:23,120 –> 00:07:25,040
Auditors don’t need your DAX to be clever.

209
00:07:25,040 –> 00:07:26,560
They need it to stop moving.

210
00:07:26,560 –> 00:07:28,040
Third, lineage.

211
00:07:28,040 –> 00:07:31,280
Lineage is the answer to a single question

212
00:07:31,280 –> 00:07:32,880
that destroys weak stacks.

213
00:07:32,880 –> 00:07:34,440
Where did this number come from?

214
00:07:34,440 –> 00:07:36,840
Not philosophically, mechanically.

215
00:07:36,840 –> 00:07:39,680
Lineage is origin to transformation, to consumption,

216
00:07:39,680 –> 00:07:42,640
source system record, to ingested file or table

217
00:07:42,640 –> 00:07:45,080
through transformations, into curated models,

218
00:07:45,080 –> 00:07:48,680
into reported outputs, into the data set that Power BI reads.

219
00:07:48,680 –> 00:07:51,200
If you can’t trace it quickly, you will trace it slowly.

220
00:07:51,200 –> 00:07:53,800
And slowly means meetings, screenshots,

221
00:07:53,800 –> 00:07:55,400
and spreadsheet archaeology.

222
00:07:55,400 –> 00:07:58,040
That is not assurance. That is theater.

223
00:07:58,040 –> 00:08:01,520
Microsoft Purview exists because human memory does not scale.

224
00:08:01,520 –> 00:08:03,280
It’s the metadata system that turns

225
00:08:03,280 –> 00:08:06,200
“we think this is how it flows” into “here is the graph.”

226
00:08:06,200 –> 00:08:08,360
It also becomes your change management weapon.

227
00:08:08,360 –> 00:08:10,520
Before you change a pipeline or a calculation,

228
00:08:10,520 –> 00:08:12,080
you can see downstream impact.
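
The lineage idea can be sketched as a small graph, independent of any product. The node names and the `UPSTREAM` map are made up for illustration, and this is not how Purview stores or exposes lineage. One function walks back to the sources; the other lists what a change would touch.

```python
# Each asset maps to the assets it is built from (hypothetical names).
UPSTREAM = {
    "powerbi_dataset": ["reported.kpi_fy2023"],
    "reported.kpi_fy2023": ["curated.emissions"],
    "curated.emissions": ["raw.utility_invoices", "reference.factor_library"],
    "raw.utility_invoices": ["erp"],
}


def trace_to_sources(node: str, upstream: dict = UPSTREAM) -> set:
    """Follow lineage backwards until reaching assets with no recorded parents."""
    parents = upstream.get(node, [])
    if not parents:
        return {node}
    sources = set()
    for parent in parents:
        sources |= trace_to_sources(parent, upstream)
    return sources


def downstream_of(node: str, upstream: dict = UPSTREAM) -> set:
    """List every asset that directly or indirectly depends on `node`."""
    affected = {child for child, parents in upstream.items() if node in parents}
    for child in list(affected):
        affected |= downstream_of(child, upstream)
    return affected


sources = trace_to_sources("powerbi_dataset")
impact = downstream_of("curated.emissions")
```

Here `sources` answers "where did this number come from", and `impact` answers "what breaks if I change this table".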

229
00:08:12,080 –> 00:08:14,440
Without lineage, every change is a blind deployment

230
00:08:14,440 –> 00:08:16,280
into your own reporting boundary.

231
00:08:16,280 –> 00:08:18,040
And yes, product capabilities evolve.

232
00:08:18,040 –> 00:08:18,880
That’s normal.

233
00:08:18,880 –> 00:08:20,040
Your requirement does not evolve.

234
00:08:20,040 –> 00:08:22,200
The requirement is explainability under pressure.

235
00:08:22,200 –> 00:08:24,360
Fourth, separation of duties.

236
00:08:24,360 –> 00:08:27,280
This one is where most ESG programs quietly fail

237
00:08:27,280 –> 00:08:28,640
because it’s inconvenient.

238
00:08:28,640 –> 00:08:29,960
But the logic is simple.

239
00:08:29,960 –> 00:08:32,880
The person who submits data cannot be the person who approves it,

240
00:08:32,880 –> 00:08:34,400
and the person who changes logic

241
00:08:34,400 –> 00:08:37,120
cannot be the person who publishes the reported outputs.

242
00:08:37,120 –> 00:08:39,080
You need role separation across data entry,

243
00:08:39,080 –> 00:08:41,440
validation, calculation, approval, and reporting.
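
A hedged sketch of how that rule could be checked against an export of role assignments. The duty names, the conflict pairs, and the identities are assumptions; in practice the assignments would come from your identity provider's group and role data.

```python
# Pairs of duties that one identity must not hold together (from the rule above).
CONFLICTING_DUTIES = [
    ("submit", "approve"),
    ("change_logic", "publish"),
]


def sod_violations(assignments: dict) -> list:
    """Return (identity, duty_a, duty_b) for every conflicting pair held by one identity."""
    violations = []
    for identity, duties in sorted(assignments.items()):
        for duty_a, duty_b in CONFLICTING_DUTIES:
            if duty_a in duties and duty_b in duties:
                violations.append((identity, duty_a, duty_b))
    return violations


assignments = {
    "alice": {"submit"},
    "bob": {"approve", "publish"},
    "hero-admin": {"submit", "approve", "change_logic", "publish"},
}
found = sod_violations(assignments)
```

Run on a schedule and stored with the period, the result of a check like this is itself a piece of evidence.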

244
00:08:41,440 –> 00:08:42,320
In Microsoft terms,

245
00:08:42,320 –> 00:08:43,960
Entra ID is not a diagram.

246
00:08:43,960 –> 00:08:45,600
It is the enforcement mechanism.

247
00:08:45,600 –> 00:08:48,040
Group membership, role assignments, access reviews,

248
00:08:48,040 –> 00:08:49,640
audit logs: these are evidence.

249
00:08:49,640 –> 00:08:51,120
And you don’t get evidence by saying

250
00:08:51,120 –> 00:08:53,320
only the sustainability team has access.

251
00:08:53,320 –> 00:08:55,520
You get evidence by proving which identities

252
00:08:55,520 –> 00:08:57,240
had which permissions during the period

253
00:08:57,240 –> 00:09:00,320
and showing that privileged access was bounded and reviewable.

254
00:09:00,320 –> 00:09:02,520
Most organizations end up with a hero admin

255
00:09:02,520 –> 00:09:03,560
because it’s faster.

256
00:09:03,560 –> 00:09:04,560
It is not governance.

257
00:09:04,560 –> 00:09:06,640
It is a single point of audit failure.

258
00:09:06,640 –> 00:09:08,800
So those four properties define your architecture.

259
00:09:08,800 –> 00:09:10,640
Immutability prevents silent edits,

260
00:09:10,640 –> 00:09:14,360
reproducibility prevents drift, lineage prevents archaeology.

261
00:09:14,360 –> 00:09:16,720
Separation of duties prevents conflict of interest

262
00:09:16,720 –> 00:09:17,640
and invisible power.

263
00:09:17,640 –> 00:09:18,480
And here’s the payoff.

264
00:09:18,480 –> 00:09:21,120
Once these exist, your ESG stack stops

265
00:09:21,120 –> 00:09:23,880
being a collection of tools and becomes a control plane.

266
00:09:23,880 –> 00:09:25,280
Now the uncomfortable part.

267
00:09:25,280 –> 00:09:29,000
These requirements map directly to specific Microsoft services.

268
00:09:29,000 –> 00:09:30,360
Some are non-negotiable.

269
00:09:30,360 –> 00:09:32,680
The rest are optional until scale and regulation

270
00:09:32,680 –> 00:09:34,240
make them mandatory.

271
00:09:34,240 –> 00:09:37,120
Microsoft stack map: non-negotiable versus optional.

272
00:09:37,120 –> 00:09:39,160
Now we map those four audit grade requirements

273
00:09:39,160 –> 00:09:41,360
to Microsoft services, not as a shopping list,

274
00:09:41,360 –> 00:09:42,680
as a chain of enforcement.

275
00:09:42,680 –> 00:09:44,640
Because the system doesn’t become auditable

276
00:09:44,640 –> 00:09:45,520
when you buy tools.

277
00:09:45,520 –> 00:09:47,760
It becomes auditable when every requirement

278
00:09:47,760 –> 00:09:50,360
has an implementation that removes human discretion.

279
00:09:50,360 –> 00:09:52,160
Start with the non-negotiables.

280
00:09:52,160 –> 00:09:54,760
These are the components auditors implicitly expect

281
00:09:54,760 –> 00:09:56,840
even if they never say Microsoft out loud.

282
00:09:56,840 –> 00:09:58,120
First, identity and access:

283
00:09:58,120 –> 00:10:01,040
Microsoft Entra ID. Entra is not single sign-on.

284
00:10:01,040 –> 00:10:03,360
Architecturally, it’s the distributed decision engine

285
00:10:03,360 –> 00:10:05,640
that decides who can submit, who can transform,

286
00:10:05,640 –> 00:10:07,440
who can approve and who can publish.

287
00:10:07,440 –> 00:10:09,640
And it produces logs. Logs are evidence.

288
00:10:09,640 –> 00:10:12,000
If role separation is one of your requirements,

289
00:10:12,000 –> 00:10:15,120
Entra is where it either happens or it doesn’t.

290
00:10:15,120 –> 00:10:17,240
Second, storage with immutability.

291
00:10:17,240 –> 00:10:19,000
Azure Data Lake Storage Gen 2,

292
00:10:19,000 –> 00:10:20,840
with immutable storage policies for the zones

293
00:10:20,840 –> 00:10:23,920
that become evidence, raw and period closed reported outputs,

294
00:10:23,920 –> 00:10:26,800
plus any evidence vault you keep for supporting documents.

295
00:10:26,800 –> 00:10:29,160
This is the part everyone tries to negotiate away

296
00:10:29,160 –> 00:10:31,320
because it forces pipeline discipline.

297
00:10:31,320 –> 00:10:33,680
But WORM isn’t a feature. It’s a behavior change.

298
00:10:33,680 –> 00:10:37,400
Once you enable immutability, overwrites are no longer a quick fix.

299
00:10:37,400 –> 00:10:39,160
They are an audit event you can’t perform.

300
00:10:39,160 –> 00:10:40,720
That constraint is the entire point.

301
00:10:40,720 –> 00:10:42,360
Third, a governed calculation zone.

302
00:10:42,360 –> 00:10:44,680
Fabric Lakehouse or Azure Synapse Analytics.

303
00:10:44,680 –> 00:10:47,120
Pick one and treat it like an accounting engine.

304
00:10:47,120 –> 00:10:50,600
Versioned artifacts, controlled deployments, and period-bound releases.

305
00:10:50,600 –> 00:10:51,800
Your calculations need to live

306
00:10:51,800 –> 00:10:53,400
where they can be tested, reviewed

307
00:10:53,400 –> 00:10:55,760
and rerun against frozen inputs and frozen factors.

308
00:10:55,760 –> 00:10:58,360
If your KPI logic lives in Power BI measures,

309
00:10:58,360 –> 00:10:59,840
you didn’t build a calculation zone.

310
00:10:59,840 –> 00:11:02,200
You built a dashboard that quietly rewrites history.

311
00:11:02,200 –> 00:11:03,640
Fourth, governance

312
00:11:03,640 –> 00:11:05,640
and lineage: Microsoft Purview.

313
00:11:05,640 –> 00:11:08,520
Purview is the difference between

314
00:11:08,520 –> 00:11:12,080
“we can probably explain this” and “here is the lineage graph,

315
00:11:12,080 –> 00:11:14,760
here are the owners, here are the transformations.”

316
00:11:14,760 –> 00:11:17,920
Under assurance pressure, that difference becomes the whole game.

317
00:11:17,920 –> 00:11:20,840
Purview is also how you scale governance beyond tribal knowledge.

318
00:11:20,840 –> 00:11:22,720
People leave, your metadata can’t.

319
00:11:22,720 –> 00:11:24,640
Fifth, reporting as a thin layer.

320
00:11:24,640 –> 00:11:27,520
Power BI. Power BI is allowed. Power BI is useful.

321
00:11:27,520 –> 00:11:30,480
Power BI is also where most teams destroy auditability

322
00:11:30,480 –> 00:11:33,160
by turning the semantic model into the calculation engine.

323
00:11:33,160 –> 00:11:34,360
So the rule is brutal.

324
00:11:34,360 –> 00:11:37,200
Power BI consumes reported period-close tables.

325
00:11:37,200 –> 00:11:39,240
Measures are presentation and aggregation,

326
00:11:39,240 –> 00:11:40,560
not emissions accounting.

327
00:11:40,560 –> 00:11:42,800
You want auditors to argue about your visuals,

328
00:11:42,800 –> 00:11:43,600
not your logic.

329
00:11:43,600 –> 00:11:45,280
So that’s the non-negotiable baseline.

330
00:11:45,280 –> 00:11:47,840
Entra, ADLS Gen 2 with immutability,

331
00:11:47,840 –> 00:11:51,080
Fabric or Synapse for calculations, Purview for lineage,

332
00:11:51,080 –> 00:11:54,440
and Power BI as the last-mile presentation layer.

333
00:11:54,440 –> 00:11:57,640
Now the optional components. Optional does not mean irrelevant.

334
00:11:57,640 –> 00:12:01,600
It means not required until scale, regulation and complexity corner you.

335
00:12:02,680 –> 00:12:06,000
Microsoft Sustainability Manager sits in that category.

336
00:12:06,000 –> 00:12:08,680
It’s optional when you already have mature emissions logic,

337
00:12:08,680 –> 00:12:11,000
a controlled factor library, and the willpower

338
00:12:11,000 –> 00:12:13,640
to build transparent pipelines and models yourself.

339
00:12:13,640 –> 00:12:16,720
It becomes valuable when you need faster onboarding to frameworks,

340
00:12:16,720 –> 00:12:18,400
faster Scope 3 workflows,

341
00:12:18,400 –> 00:12:21,200
or you simply don’t have internal emissions domain depth.

342
00:12:21,200 –> 00:12:23,320
The platform has audit trail capabilities

343
00:12:23,320 –> 00:12:25,040
and data trail reporting features,

344
00:12:25,040 –> 00:12:26,800
but it doesn’t absolve you from architecture.

345
00:12:26,800 –> 00:12:29,160
If you treat it as a black box that spits out numbers,

346
00:12:29,160 –> 00:12:30,840
you’re just outsourcing your audit risk

347
00:12:30,840 –> 00:12:32,400
to a product configuration.

348
00:12:32,400 –> 00:12:34,480
Azure Data Factory is also optional,

349
00:12:34,480 –> 00:12:37,120
but only if your ingestion needs stay simple.

350
00:12:37,120 –> 00:12:40,000
If Fabric-native ingestion covers your source systems, fine.

351
00:12:40,000 –> 00:12:42,120
But when you have real ERP integration,

352
00:12:42,120 –> 00:12:45,160
IoT telemetry coordination, multi-step API dependencies

353
00:12:45,160 –> 00:12:46,760
and cross-system timing constraints,

354
00:12:46,760 –> 00:12:49,120
Data Factory becomes the orchestration layer

355
00:12:49,120 –> 00:12:51,240
that keeps ingestion deterministic.

356
00:12:51,240 –> 00:12:53,320
Just remember the immutability constraint.

357
00:12:53,320 –> 00:12:56,320
Data Factory pipelines that overwrite files collide

358
00:12:56,320 –> 00:12:59,120
with WORM policies and fail. That failure isn’t a bug.

359
00:12:59,120 –> 00:13:00,840
It’s your architecture revealing itself.

360
00:13:00,840 –> 00:13:03,680
Azure Machine Learning is optional in the purest sense.

361
00:13:03,680 –> 00:13:05,960
Use it for forecasting, anomaly detection

362
00:13:05,960 –> 00:13:07,200
and scenario modeling.

363
00:13:07,200 –> 00:13:08,520
Never for baseline numbers.

364
00:13:08,520 –> 00:13:11,000
Model outputs are estimates and estimates

365
00:13:11,000 –> 00:13:12,640
need labeling, provenance and governance

366
00:13:12,640 –> 00:13:13,680
like any other input.

367
00:13:13,680 –> 00:13:16,600
Otherwise, your AI insights become untraceable logic changes

368
00:13:16,600 –> 00:13:17,920
with a brand name.

369
00:13:17,920 –> 00:13:19,120
Here’s the short warning.

370
00:13:19,120 –> 00:13:21,720
Every optional tool becomes mandatory

371
00:13:21,720 –> 00:13:24,400
the moment you use it to create or modify numbers

372
00:13:24,400 –> 00:13:25,960
inside your reporting boundary.

373
00:13:25,960 –> 00:13:27,840
The system doesn’t care that it was a pilot.

374
00:13:27,840 –> 00:13:30,240
If it touched the number, it’s part of the evidence chain.

375
00:13:30,240 –> 00:13:31,640
So you now have the map.

376
00:13:31,640 –> 00:13:33,880
Non-negotiables enforce the four requirements.

377
00:13:33,880 –> 00:13:37,520
Optional tools add capability, but also add pathways for entropy.

378
00:13:37,520 –> 00:13:38,920
Next, you start at the edge,

379
00:13:38,920 –> 00:13:40,880
because the first place truth gets corrupted

380
00:13:40,880 –> 00:13:43,400
is always the first place data enters your system.

381
00:13:43,400 –> 00:13:46,800
Operational data sources: where OESG actually comes from.

382
00:13:46,800 –> 00:13:49,000
OESG doesn’t come from your sustainability team.

383
00:13:49,000 –> 00:13:51,800
That team usually collects it, begs for it, cleans it up,

384
00:13:51,800 –> 00:13:52,840
and tries to defend it.

385
00:13:52,840 –> 00:13:55,720
But the data originates somewhere else.

386
00:13:55,720 –> 00:13:58,240
Operational systems that were never designed

387
00:13:58,240 –> 00:13:59,920
to be audited for carbon math.

388
00:13:59,920 –> 00:14:01,400
That’s the first architectural truth.

389
00:14:01,400 –> 00:14:03,880
If you don’t treat the source systems as part of the reporting

390
00:14:03,880 –> 00:14:06,440
boundary, you’ll spend your life explaining downstream numbers

391
00:14:06,440 –> 00:14:08,480
while upstream inputs keep changing.

392
00:14:08,480 –> 00:14:11,480
So let’s name the real sources and the real damage they can do.

393
00:14:11,480 –> 00:14:14,680
Start with ERP: SAP, Dynamics, whatever you’ve standardized

394
00:14:14,680 –> 00:14:18,080
on. ERP is where activity data becomes audit-friendly

395
00:14:18,080 –> 00:14:19,800
because it already has controls.

396
00:14:19,800 –> 00:14:22,240
Transactions, approvals, posting periods,

397
00:14:22,240 –> 00:14:24,360
master data, organizational structure.

398
00:14:24,360 –> 00:14:27,480
But the trap is that teams try to use the ERP outputs

399
00:14:27,480 –> 00:14:30,040
as already reported sustainability data.

400
00:14:30,040 –> 00:14:30,560
They shouldn’t.

401
00:14:30,560 –> 00:14:33,600
You want the activity data, fuel purchases, freight costs,

402
00:14:33,600 –> 00:14:36,400
inventory movement, utility invoices, travel expenses,

403
00:14:36,400 –> 00:14:37,800
procurement line items.

404
00:14:37,800 –> 00:14:40,160
The thing most people miss is that ERP is better

405
00:14:40,160 –> 00:14:42,480
as an evidence source than as a calculation engine.

406
00:14:42,480 –> 00:14:44,200
It’s good at capturing business events

407
00:14:44,200 –> 00:14:45,680
with identity and timestamps.

408
00:14:45,680 –> 00:14:48,240
It is not good at emissions factors, allocation logic

409
00:14:48,240 –> 00:14:50,600
or multi-scope reconciliation unless you deliberately

410
00:14:50,600 –> 00:14:51,280
build it that way.

411
00:14:51,280 –> 00:14:54,440
So ERP is a source of facts, not a source of finished ESG

412
00:14:54,440 –> 00:14:56,960
truth. Next, energy meters and IoT telemetry.

413
00:14:56,960 –> 00:14:59,680
This is where teams get excited about granularity

414
00:14:59,680 –> 00:15:01,760
and then quietly drown.

415
00:15:01,760 –> 00:15:05,000
Telemetry is high volume, high frequency and low forgiveness.

416
00:15:05,000 –> 00:15:07,600
You can collect a million readings and still fail assurance

417
00:15:07,600 –> 00:15:10,080
because you can’t explain context, which facility,

418
00:15:10,080 –> 00:15:12,240
which meter, what unit, which time zone,

419
00:15:12,240 –> 00:15:15,080
what calibration assumptions, what mapping from meter

420
00:15:15,080 –> 00:15:16,840
to asset to business unit.

421
00:15:16,840 –> 00:15:18,560
Telemetry without context is not data.

422
00:15:18,560 –> 00:15:20,360
It’s noise with audit liability.

423
00:15:20,360 –> 00:15:22,920
And because IoT pipelines often involve gateways,

424
00:15:22,920 –> 00:15:25,360
edge buffering, retries and late-arriving events,

425
00:15:25,360 –> 00:15:26,840
you need to design for time.

426
00:15:26,840 –> 00:15:28,760
Event time versus ingestion time

427
00:15:28,760 –> 00:15:30,960
and what happens when the real reading shows up

428
00:15:30,960 –> 00:15:32,360
after period close.

429
00:15:32,360 –> 00:15:34,920
If you don’t decide that early, your close process becomes

430
00:15:34,920 –> 00:15:36,960
a permanent argument with your own sensors.
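
One way to make that decision explicit is to classify every reading against both clocks. The sketch below is only an illustration: the `Reading` fields, the `classify` function and the three labels are hypothetical names, not part of any Microsoft service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Reading:
    meter_id: str
    event_time: datetime      # when the meter measured the value
    ingestion_time: datetime  # when the platform received the value
    kwh: float


def classify(reading, period_start, period_end, close_time):
    """Relate one reading to a reporting period and to that period's close."""
    if not (period_start <= reading.event_time < period_end):
        return "other_period"
    if reading.ingestion_time <= close_time:
        return "in_scope_at_close"
    # Measured inside the period but received after close:
    # a candidate for an adjustment, never a silent edit.
    return "late_after_close"
```

The rule itself (what counts as late, and what happens to late readings) is a policy choice; the point is that it is written down once and applied the same way every period.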

431
00:15:36,960 –> 00:15:39,600
Now HR systems.

432
00:15:39,600 –> 00:15:42,160
Workforce metrics sound simple until you try

433
00:15:42,160 –> 00:15:43,800
to define them consistently.

434
00:15:43,800 –> 00:15:46,960
Headcount, turnover, diversity, health and safety incidents,

435
00:15:46,960 –> 00:15:49,000
training hours. These are all HR-managed

436
00:15:49,000 –> 00:15:50,160
and they are sensitive.

437
00:15:50,160 –> 00:15:53,320
That creates two constraints: access control and aggregation.

438
00:15:53,320 –> 00:15:54,840
You don’t want raw employee records

439
00:15:54,840 –> 00:15:56,600
wandering into analytics workspaces

440
00:15:56,600 –> 00:15:58,080
because someone wanted a dashboard.

441
00:15:58,080 –> 00:16:01,320
For ESG, HR systems should feed controlled aggregates

442
00:16:01,320 –> 00:16:03,280
with documented definitions and a stable

443
00:16:03,280 –> 00:16:05,360
organizational hierarchy mapping.

444
00:16:05,360 –> 00:16:07,600
Otherwise you get denominator drift.

445
00:16:07,600 –> 00:16:09,200
The metric stays the same name,

446
00:16:09,200 –> 00:16:11,160
but the population changes silently.

447
00:16:11,160 –> 00:16:13,000
Auditors don’t need to see personal data.

448
00:16:13,000 –> 00:16:15,720
They need to see that the metric definition didn’t mutate.

449
00:16:15,720 –> 00:16:18,040
Then procurement and suppliers, which is where Scope 3

450
00:16:18,040 –> 00:16:20,680
stops being theory and becomes operational humiliation.

451
00:16:20,680 –> 00:16:23,680
Supplier data comes through surveys, portals, partner feeds,

452
00:16:23,680 –> 00:16:25,840
invoices and sometimes email attachments

453
00:16:25,840 –> 00:16:28,480
that should never be admitted into an evidence chain.

454
00:16:28,480 –> 00:16:29,840
The variability is the point.

455
00:16:29,840 –> 00:16:32,040
Suppliers don’t share the same systems,

456
00:16:32,040 –> 00:16:34,160
the same data quality or the same incentives.

457
00:16:34,160 –> 00:16:36,400
So you need to capture two things from day one.

458
00:16:36,400 –> 00:16:37,880
Coverage and confidence.

459
00:16:37,880 –> 00:16:39,760
What percentage of spend or categories

460
00:16:39,760 –> 00:16:42,760
have supplier provided data and what percentage is estimated?

461
00:16:42,760 –> 00:16:44,160
Those flags aren’t nice to have.

462
00:16:44,160 –> 00:16:46,120
They’re the only honest way to survive questions

463
00:16:46,120 –> 00:16:47,120
about completeness.
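
As a rough sketch of what a coverage flag can look like in practice, assume each procurement line carries a spend amount and a basis label. The labels `supplier_provided` and `estimated` and the function name are assumptions made for this example.

```python
def coverage_by_spend(lines):
    """Return the share of total spend per data basis, in percent.

    `lines` is an iterable of (spend, basis) pairs, where basis is a
    label such as "supplier_provided" or "estimated".
    """
    lines = list(lines)
    total = sum(spend for spend, _ in lines)
    if total == 0:
        return {}
    by_basis = {}
    for spend, basis in lines:
        by_basis[basis] = by_basis.get(basis, 0.0) + spend
    return {basis: round(100 * amount / total, 1) for basis, amount in by_basis.items()}
```

Storing the result next to the KPI for each period lets the completeness question be answered from the record instead of from memory.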

464
00:16:47,120 –> 00:16:49,200
And if you don’t store supplier submissions

465
00:16:49,200 –> 00:16:51,920
as evidence artifacts with identity, timestamps

466
00:16:51,920 –> 00:16:54,360
and versioning, you will not be able to prove what was known

467
00:16:54,360 –> 00:16:55,800
at the time of reporting.

468
00:16:55,800 –> 00:16:59,480
Now the radioactive source, spreadsheets and CSV files,

469
00:16:59,480 –> 00:17:00,320
they’re allowed.

470
00:17:00,320 –> 00:17:03,400
They’re also the birthplace of “final_v7.csv,”

471
00:17:03,400 –> 00:17:06,600
which is the universal symbol of uncontrolled modification.

472
00:17:06,600 –> 00:17:07,920
Spreadsheets are not evil.

473
00:17:07,920 –> 00:17:09,160
They’re just not control systems.

474
00:17:09,160 –> 00:17:11,520
They don’t preserve chain of custody by default

475
00:17:11,520 –> 00:17:13,400
and they make it trivial to change history

476
00:17:13,400 –> 00:17:14,800
without leaving a durable trail.

477
00:17:14,800 –> 00:17:17,800
So in an audit-grade stack, spreadsheets are treated

478
00:17:17,800 –> 00:17:19,160
as controlled submissions.

479
00:17:19,160 –> 00:17:21,360
Metadata captured, schema validated,

480
00:17:21,360 –> 00:17:24,240
approvals recorded, and then the content gets ingested

481
00:17:24,240 –> 00:17:26,640
into the raw zone as append-only evidence.

482
00:17:26,640 –> 00:17:29,160
The spreadsheet itself becomes supporting documentation

483
00:17:29,160 –> 00:17:31,720
in the evidence vault, not the system of record.

484
00:17:31,720 –> 00:17:32,680
Here’s the checkpoint.

485
00:17:32,680 –> 00:17:35,880
Every source system has its own native controls, gaps

486
00:17:35,880 –> 00:17:37,320
and failure modes.

487
00:17:37,320 –> 00:17:40,480
ERP brings structure, but tempts teams to treat its outputs as reported data.

488
00:17:40,480 –> 00:17:42,880
IoT brings volume, but lacks business context.

489
00:17:42,880 –> 00:17:45,720
HR brings sensitivity and definition drift.

490
00:17:45,720 –> 00:17:48,400
Suppliers bring variability and partial coverage.

491
00:17:48,400 –> 00:17:50,120
Spreadsheets bring speed and entropy.

492
00:17:50,120 –> 00:17:53,360
Once you accept that, the next step becomes obvious.

493
00:17:53,360 –> 00:17:55,120
Ingestion is where truth gets corrupted

494
00:17:55,120 –> 00:17:56,960
because ingestion is where humans still believe

495
00:17:56,960 –> 00:17:58,320
overwriting is a feature.

496
00:17:58,320 –> 00:18:00,040
Ingestion patterns: controlled pipelines

497
00:18:00,040 –> 00:18:02,120
versus human-driven upload rituals.

498
00:18:02,120 –> 00:18:04,840
Ingestion is where ESG dies in real life

499
00:18:04,840 –> 00:18:07,160
because ingestion is where teams still confuse

500
00:18:07,160 –> 00:18:09,480
getting data in with getting evidence in.

501
00:18:09,480 –> 00:18:10,640
Those are not the same.

502
00:18:10,640 –> 00:18:12,120
The design rule is simple.

503
00:18:12,120 –> 00:18:14,120
Ingestion must be append-first.

504
00:18:14,120 –> 00:18:15,800
Overwrites are audit poison.

505
00:18:15,800 –> 00:18:18,560
The moment your process allows “replace the file,”

506
00:18:18,560 –> 00:18:20,600
you’ve created an invisible edit pathway

507
00:18:20,600 –> 00:18:22,120
inside your reporting boundary.

508
00:18:22,120 –> 00:18:24,240
And auditors don’t need to prove you used it.

509
00:18:24,240 –> 00:18:25,640
They only need to prove you could.

510
00:18:25,640 –> 00:18:28,520
Append-first means every load becomes a new object

511
00:18:28,520 –> 00:18:31,400
or a new version with a load identifier that never repeats.

512
00:18:31,400 –> 00:18:33,800
If you want to correct something, you don’t edit history.

513
00:18:33,800 –> 00:18:36,400
You publish an adjustment with rationale and approval

514
00:18:36,400 –> 00:18:38,760
and you keep the original as evidence.
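
A minimal sketch of an append-first landing step, using a local folder as a stand-in for the raw zone. The function name, the `load_id=` folder convention and the `payload.bin` file name are assumptions for illustration; a real deployment would write to cloud storage rather than a local path.

```python
import uuid
from pathlib import Path


def land_append_only(raw_root: Path, source: str, payload: bytes) -> Path:
    """Write one load as a brand-new object; never touch an existing one."""
    load_id = uuid.uuid4().hex  # generated per load, so paths are not reused
    target = raw_root / source / f"load_id={load_id}" / "payload.bin"
    # exist_ok=False: refuse to reuse a load folder that is already there.
    target.parent.mkdir(parents=True, exist_ok=False)
    # Mode "x" is exclusive creation: it raises instead of overwriting.
    with open(target, "xb") as handle:
        handle.write(payload)
    return target
```

Two loads from the same source land side by side. A correction would be a third object plus an adjustment record, not an edit to either of the first two.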

515
00:18:38,760 –> 00:18:40,160
Now here’s where most people mess up.

516
00:18:40,160 –> 00:18:42,280
They treat ingestion as a user interface problem.

517
00:18:42,280 –> 00:18:43,520
They build an upload folder.

518
00:18:43,520 –> 00:18:45,520
They write “drop files here” instructions.

519
00:18:45,520 –> 00:18:46,880
They call it a pipeline.

520
00:18:46,880 –> 00:18:49,320
Then the first late file arrives and someone

521
00:18:49,320 –> 00:18:50,960
overwrites the last one.

522
00:18:50,960 –> 00:18:53,880
Because the business wanted the dashboard to be right.

523
00:18:53,880 –> 00:18:55,200
That’s not ingestion.

524
00:18:55,200 –> 00:18:56,480
That’s ritual.

525
00:18:56,480 –> 00:18:59,800
A controlled ingestion pattern has three non-negotiable behaviors.

526
00:18:59,800 –> 00:19:02,560
Orchestrated movement, validation gates, and telemetry

527
00:19:02,560 –> 00:19:05,040
that can be handed to an auditor without translation.

528
00:19:05,040 –> 00:19:07,680
Let’s talk tooling because Microsoft gives you multiple ways

529
00:19:07,680 –> 00:19:10,480
to ingest and none of them magically make you auditable.

530
00:19:10,480 –> 00:19:12,480
Fabric-native ingestion is convenient

531
00:19:12,480 –> 00:19:14,400
when your sources are straightforward.

532
00:19:14,400 –> 00:19:17,640
Files, tables, common connectors, predictable schedules,

533
00:19:17,640 –> 00:19:19,840
and you can keep the orchestration simple.

534
00:19:19,840 –> 00:19:21,400
The benefit is proximity.

535
00:19:21,400 –> 00:19:23,240
You’re already in the lakehouse world.

536
00:19:23,240 –> 00:19:26,160
And you can land data close to where it will be processed.

537
00:19:26,160 –> 00:19:28,080
The failure mode is also proximity.

538
00:19:28,080 –> 00:19:30,920
Teams let convenience become a substitute for control

539
00:19:30,920 –> 00:19:33,800
and they stop capturing the metadata that proves what happened.

540
00:19:33,800 –> 00:19:36,400
Azure Data Factory exists for the unglamorous reality:

541
00:19:36,400 –> 00:19:39,120
complex ERP integration, IoT coordination,

542
00:19:39,120 –> 00:19:42,200
multi-step API polls, dependencies, retries, and sequencing

543
00:19:42,200 –> 00:19:44,320
that can’t be trusted to just run.

544
00:19:44,320 –> 00:19:47,800
It also has the operational surface area for governance.

545
00:19:47,800 –> 00:19:49,720
Parameterized pipelines, run history,

546
00:19:49,720 –> 00:19:51,320
integration runtime behavior,

547
00:19:51,320 –> 00:19:53,640
and consistent patterns across many sources.

548
00:19:53,640 –> 00:19:55,240
But the constraint is brutal.

549
00:19:55,240 –> 00:19:58,400
Immutability will punish sloppy ADF designs.

550
00:19:58,400 –> 00:20:01,440
When ADF tries to overwrite a file in an immutable container,

551
00:20:01,440 –> 00:20:02,440
it fails.

552
00:20:02,440 –> 00:20:03,440
That’s not Microsoft being difficult.

553
00:20:03,440 –> 00:20:05,480
That’s your system proving that it was designed

554
00:20:05,480 –> 00:20:07,000
to rewrite evidence.

555
00:20:07,000 –> 00:20:09,960
So the pattern becomes: write to a mutable staging area,

556
00:20:09,960 –> 00:20:13,120
validate, then publish into the immutable raw archive.

557
00:20:13,120 –> 00:20:17,240
Mutable staging, immutable archive. Two zones, two behaviors.

558
00:20:17,240 –> 00:20:19,520
Now validation gates.

559
00:20:19,520 –> 00:20:22,480
This is the part that separates ingestion from data dumping.

560
00:20:22,480 –> 00:20:24,240
Every load needs to be validated before it

561
00:20:24,240 –> 00:20:26,720
becomes evidence inside your system of record.

562
00:20:26,720 –> 00:20:29,360
And validation isn’t just “did the pipeline run?”

563
00:20:29,360 –> 00:20:31,680
It’s “did the data meet minimum standards

564
00:20:31,680 –> 00:20:33,080
to be considered in scope?”

565
00:20:33,080 –> 00:20:36,400
The practical gates are boring and therefore effective.

566
00:20:36,400 –> 00:20:40,440
Schema checks: column names, types, required fields,

567
00:20:40,440 –> 00:20:42,080
and allowed null behavior.

568
00:20:42,080 –> 00:20:45,440
Unit normalization: kWh versus MWh,

569
00:20:45,440 –> 00:20:47,960
liters versus cubic meters, distance units,

570
00:20:47,960 –> 00:20:52,720
currency, time zones. Required dimensions: site, region, period,

571
00:20:52,720 –> 00:20:55,680
scope category, source system identifier.

572
00:20:55,680 –> 00:20:57,840
And the system must record the outcome.

573
00:20:57,840 –> 00:21:01,320
Pass, fail, quarantine, or accepted with exceptions.
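
The gates can be sketched as a single function that returns a recorded outcome per row. Everything here is an assumption for illustration: the field names, the tiny unit table, and the choice to fail on missing fields but quarantine on unknown units. The “accepted with exceptions” outcome is left out to keep the sketch short.

```python
REQUIRED_FIELDS = ("site", "region", "period", "scope_category",
                   "source_system", "quantity", "unit")

# Conversion factors into kWh; a real factor table would be versioned.
KWH_PER_UNIT = {"kWh": 1.0, "MWh": 1000.0}


def validate_row(row):
    """Return (outcome, normalized_row_or_None, reasons) for one input row."""
    missing = [name for name in REQUIRED_FIELDS if row.get(name) in (None, "")]
    if missing:
        return "fail", None, [f"missing field: {name}" for name in missing]
    factor = KWH_PER_UNIT.get(row["unit"])
    if factor is None:
        # Structurally valid but not interpretable yet: hold it, don't drop it.
        return "quarantine", None, [f"unknown unit: {row['unit']}"]
    # Keep the original quantity and unit; add the normalized value beside them.
    normalized = dict(row, quantity_kwh=float(row["quantity"]) * factor)
    return "pass", normalized, []
```

The same function would be applied to a CSV submission and to an API feed, so neither gets a privileged path into the raw zone.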

574
00:21:01,320 –> 00:21:03,800
This is where you stop pretending spreadsheets are harmless.

575
00:21:03,800 –> 00:21:06,200
A controlled CSV submission is allowed only

576
00:21:06,200 –> 00:21:09,240
if it passes the same validation gates as an API feed.

577
00:21:09,240 –> 00:21:11,000
Otherwise, you’ve created a privileged path

578
00:21:11,000 –> 00:21:13,200
for human supplied nonsense to enter the raw zone.

579
00:21:13,200 –> 00:21:15,640
Now, log everything. Not for observability dashboards,

580
00:21:15,640 –> 00:21:16,720
for chain of custody.

581
00:21:16,720 –> 00:21:18,800
Every load should produce a durable record

582
00:21:18,800 –> 00:21:22,480
that includes a load ID, source system, extract window,

583
00:21:22,480 –> 00:21:24,640
ingestion timestamp, submitter identity

584
00:21:24,640 –> 00:21:28,200
where applicable, file name or object path, validation results,

585
00:21:28,200 –> 00:21:30,440
and the pipeline version that performed the load.

586
00:21:30,440 –> 00:21:32,200
This is where Entra shows up again.

587
00:21:32,200 –> 00:21:34,240
Submitter identity is not a name in an email.

588
00:21:34,240 –> 00:21:37,200
It’s an authenticated identity tied to the submission event.

589
00:21:37,200 –> 00:21:39,800
If you can’t attribute a submission to a real identity,

590
00:21:39,800 –> 00:21:41,840
you can’t prove separation of duties.

591
00:21:41,840 –> 00:21:44,000
And if you can’t prove separation of duties,

592
00:21:44,000 –> 00:21:47,080
you will eventually be asked why you believe your own data.

593
00:21:47,080 –> 00:21:49,640
There’s also a subtle requirement most teams miss.

594
00:21:49,640 –> 00:21:51,520
Ingestion needs to be replayable.

595
00:21:51,520 –> 00:21:54,120
When someone asks, “What did we know on March 31st?”

596
00:21:54,120 –> 00:21:56,400
you can’t respond with the current state of the lake.

597
00:21:56,400 –> 00:21:58,960
You need to be able to point to the exact load artifacts

598
00:21:58,960 –> 00:22:00,360
that were in scope at close.
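
A hedged sketch of how a load ledger answers that question. The record carries a subset of the fields listed earlier, and the `LoadRecord` and `known_as_of` names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LoadRecord:
    load_id: str
    source_system: str
    ingested_at: datetime
    submitter: str           # authenticated identity, not a display name
    object_path: str
    validation_outcome: str  # e.g. "pass", "fail", "quarantine"
    pipeline_version: str


def known_as_of(ledger, as_of):
    """Return the loads that had been ingested and had passed validation by `as_of`."""
    return [record for record in ledger
            if record.ingested_at <= as_of and record.validation_outcome == "pass"]
```

Because the ledger is append-only, running the same query next year returns the same list, which is what “replayable” means here.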

599
00:22:00,360 –> 00:22:01,640
So ingestion isn’t a pipe.

600
00:22:01,640 –> 00:22:02,640
It’s a ledger.

601
00:22:02,640 –> 00:22:04,920
And once you build it that way, the downstream system

602
00:22:04,920 –> 00:22:05,800
gets easier.

603
00:22:05,800 –> 00:22:08,400
Curated models get cleaner inputs, calculations

604
00:22:08,400 –> 00:22:11,520
become stable and period close becomes an actual event,

605
00:22:11,520 –> 00:22:13,200
not a calendar reminder.

606
00:22:13,200 –> 00:22:16,080
Next, the data has to be stored like it will be subpoenaed

607
00:22:16,080 –> 00:22:17,440
because it can be.

608
00:22:17,440 –> 00:22:21,200
Storage anatomy: raw, curated, and reported,

609
00:22:21,200 –> 00:22:22,640
plus an evidence vault.

610
00:22:22,640 –> 00:22:25,040
Storage is where good intentions go to die,

611
00:22:25,040 –> 00:22:27,640
because most teams store ESG data the same way

612
00:22:27,640 –> 00:22:29,280
they store project files.

613
00:22:29,280 –> 00:22:31,920
Whatever folder exists, whatever naming convention

614
00:22:31,920 –> 00:22:35,680
someone remembers, and whatever overwrites still work.

615
00:22:35,680 –> 00:22:37,160
That’s not storage architecture.

616
00:22:37,160 –> 00:22:39,240
That’s entropy management without the management.

617
00:22:39,240 –> 00:22:42,040
An auditable ESG stack needs storage anatomy.

618
00:22:42,040 –> 00:22:43,800
Distinct layers with distinct rules

619
00:22:43,800 –> 00:22:46,360
because different data states have different liabilities.

620
00:22:46,360 –> 00:22:48,280
You’re not organizing data for convenience.

621
00:22:48,280 –> 00:22:50,840
You’re organizing it so the system can prove what happened.

622
00:22:50,840 –> 00:22:53,760
So the baseline pattern is three zones, raw, curated,

623
00:22:53,760 –> 00:22:54,760
and reported.

624
00:22:54,760 –> 00:22:58,080
And then a fourth thing most stacks forget: an evidence vault.

625
00:22:58,080 –> 00:23:00,640
Raw is the closest-to-source landing zone.

626
00:23:00,640 –> 00:23:01,600
Append-only.

627
00:23:01,600 –> 00:23:02,640
Minimal transformation.

628
00:23:02,640 –> 00:23:04,280
The goal is not usability.

629
00:23:04,280 –> 00:23:05,720
The goal is preservation.

630
00:23:05,720 –> 00:23:07,360
Raw data answers one question.

631
00:23:07,360 –> 00:23:09,680
What did we receive from where and when?

632
00:23:09,680 –> 00:23:12,320
That means raw objects need stable identifiers

633
00:23:12,320 –> 00:23:14,480
and immutable behavior after close.

634
00:23:14,480 –> 00:23:17,960
If you normalize units in raw, you’ve already destroyed provenance

635
00:23:17,960 –> 00:23:19,840
unless you also store the original.

636
00:23:19,840 –> 00:23:22,240
So raw keeps the original representation.

637
00:23:22,240 –> 00:23:24,600
The meter reading payload, the invoice extract,

638
00:23:24,600 –> 00:23:27,320
the supplier submission file, the export from ERP.

639
00:23:27,320 –> 00:23:29,040
You can add metadata alongside it.

640
00:23:29,040 –> 00:23:30,920
You do not fix it in place.

641
00:23:30,920 –> 00:23:32,920
Curated is where the data becomes usable.

642
00:23:32,920 –> 00:23:36,000
This is where you standardize, conform, and model.

643
00:23:36,000 –> 00:23:38,080
Curated data answers a different question.

644
00:23:38,080 –> 00:23:40,040
What does this mean in our organization?

645
00:23:40,040 –> 00:23:42,480
This is where you map source-specific fields

646
00:23:42,480 –> 00:23:43,640
into a common schema,

647
00:23:43,640 –> 00:23:45,760
standardize units, apply reference data

648
00:23:45,760 –> 00:23:48,800
like organizational hierarchies, and attach quality flags.

649
00:23:48,800 –> 00:23:51,000
The curated zone is where you deal with the reality

650
00:23:51,000 –> 00:23:54,280
that one system calls it plant, another calls it site,

651
00:23:54,280 –> 00:23:56,120
and a third calls it location.

652
00:23:56,120 –> 00:23:58,160
And none of them agree on identifiers.

653
00:23:58,160 –> 00:24:01,360
You resolve that here explicitly in versioned transformations

654
00:24:01,360 –> 00:24:02,160
you can explain.

655
00:24:02,160 –> 00:24:05,200
Curated is also where you keep truth with scars.

656
00:24:05,200 –> 00:24:06,920
You don’t hide data quality issues.

657
00:24:06,920 –> 00:24:08,040
You mark them.

658
00:24:08,040 –> 00:24:12,000
Late-arriving data, missing dimensions, suspect values:

659
00:24:12,000 –> 00:24:13,840
all of that becomes flags and exceptions

660
00:24:13,840 –> 00:24:16,360
because clean data with no record of cleaning

661
00:24:16,360 –> 00:24:18,200
is just manipulated data.

662
00:24:18,200 –> 00:24:20,480
Reported is the period-closed output zone.

663
00:24:20,480 –> 00:24:22,320
This is where KPIs become records.

664
00:24:22,320 –> 00:24:24,600
Reported answers the only question assurance really

665
00:24:24,600 –> 00:24:27,120
cares about: what did you report for this period,

666
00:24:27,120 –> 00:24:29,480
under which logic, using which inputs and factors?

667
00:24:29,480 –> 00:24:30,960
Reported data must be stable.

668
00:24:30,960 –> 00:24:33,920
Once the period closes, reported outputs do not change.

669
00:24:33,920 –> 00:24:35,160
If something needs correction,

670
00:24:35,160 –> 00:24:37,200
you don’t overwrite reported tables.

671
00:24:37,200 –> 00:24:39,320
You publish an adjustment entry with references,

672
00:24:39,320 –> 00:24:42,120
what changed, why, and which approval allowed it.

673
00:24:42,120 –> 00:24:43,600
That’s how financial systems work.

674
00:24:43,600 –> 00:24:45,360
And ESG doesn’t get a special exemption

675
00:24:45,360 –> 00:24:46,880
just because it feels newer.

676
00:24:46,880 –> 00:24:48,280
Now here’s the thing most people miss.

677
00:24:48,280 –> 00:24:50,600
These three zones are not only about data shape,

678
00:24:50,600 –> 00:24:52,200
they’re about access boundaries.

679
00:24:52,200 –> 00:24:54,840
Raw is restricted because it contains direct extracts

680
00:24:54,840 –> 00:24:56,520
and sometimes sensitive fields.

681
00:24:56,520 –> 00:24:57,960
Curated is restricted differently

682
00:24:57,960 –> 00:25:00,040
because it represents standardized enterprise data

683
00:25:00,040 –> 00:25:01,600
that can be widely misused.

684
00:25:01,600 –> 00:25:04,120
Reported is restricted because it’s the official record.

685
00:25:04,120 –> 00:25:06,040
Different audiences, different permissions,

686
00:25:06,040 –> 00:25:07,760
same Entra enforcement model.

687
00:25:07,760 –> 00:25:09,200
And then there’s the evidence vault.

688
00:25:09,200 –> 00:25:12,280
The evidence vault is not a folder called supporting docs.

689
00:25:12,280 –> 00:25:14,040
It’s a controlled repository for everything

690
00:25:14,040 –> 00:25:15,160
that proves the numbers.

691
00:25:15,160 –> 00:25:16,960
Supplier submissions, invoices,

692
00:25:16,960 –> 00:25:19,480
meter calibration records, calculation approvals,

693
00:25:19,480 –> 00:25:21,760
factor library provenance, mapping decisions,

694
00:25:21,760 –> 00:25:23,840
and period-close attestations.

695
00:25:23,840 –> 00:25:26,800
This vault matters because ESG is not purely quantitative.

696
00:25:26,800 –> 00:25:28,480
Even when the KPIs are numbers,

697
00:25:28,480 –> 00:25:30,880
the justification often involves documents.

698
00:25:30,880 –> 00:25:32,840
The vault is where you store those artifacts

699
00:25:32,840 –> 00:25:35,960
with the same chain of custody expectations as raw data,

700
00:25:35,960 –> 00:25:39,960
who submitted it, when, which KPI or period it supports,

701
00:25:39,960 –> 00:25:42,040
and whether it was locked after close.

702
00:25:42,040 –> 00:25:44,040
If the supporting evidence lives in Teams chats

703
00:25:44,040 –> 00:25:46,920
and someone’s mailbox, it doesn’t exist in audit terms.

704
00:25:46,920 –> 00:25:48,400
It exists as a future argument.

705
00:25:48,400 –> 00:25:51,080
Now naming and versioning, because mystery tables

706
00:25:51,080 –> 00:25:53,960
are an architectural failure, not a documentation failure.

707
00:25:53,960 –> 00:25:56,880
Every object needs a predictable name that encodes

708
00:25:56,880 –> 00:25:59,720
domain, source, period, and version.

709
00:25:59,720 –> 00:26:01,560
Not because auditors love naming conventions,

710
00:26:01,560 –> 00:26:04,120
but because humans do. You want an engineer to look at a path

711
00:26:04,120 –> 00:26:06,160
and understand whether it’s raw or reported,

712
00:26:06,160 –> 00:26:07,760
whether it’s preliminary or closed,

713
00:26:07,760 –> 00:26:09,600
and which period it belongs to.

714
00:26:09,600 –> 00:26:11,200
Versioning needs to be explicit.

715
00:26:11,200 –> 00:26:12,480
Latest is not a version.

716
00:26:12,480 –> 00:26:13,920
Final is not a version.

717
00:26:13,920 –> 00:26:15,560
Final final two is a confession.
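
One possible convention, sketched as a path builder that refuses anything that is not an explicit version. The layout `zone/domain/source/period=YYYY-MM/vNNN` is an assumption for this example, not a Fabric or ADLS requirement.

```python
import re

ZONES = {"raw", "curated", "reported"}


def object_path(zone, domain, source, period, version):
    """Build a predictable path that encodes zone, domain, source, period and version."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone!r}")
    if not re.fullmatch(r"\d{4}-(0[1-9]|1[0-2])", period):
        raise ValueError(f"period must look like YYYY-MM, got {period!r}")
    # "latest" and "final" are rejected here: a version is a positive integer.
    if not isinstance(version, int) or version < 1:
        raise ValueError(f"version must be a positive integer, got {version!r}")
    return f"{zone}/{domain}/{source}/period={period}/v{version:03d}"
```

An engineer reading `reported/energy/erp/period=2024-03/v002` can tell the zone, the period and the version without opening the object.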

718
00:26:15,560 –> 00:26:17,960
So the storage anatomy creates a set of invariants.

719
00:26:17,960 –> 00:26:19,880
Raw preserves, curated standardizes,

720
00:26:19,880 –> 00:26:22,160
reported freezes, and the evidence vault proves.

721
00:26:22,160 –> 00:26:25,280
Once you have that, immutability stops being a storage checkbox

722
00:26:25,280 –> 00:26:27,520
and becomes a design constraint that your pipelines

723
00:26:27,520 –> 00:26:28,600
can actually survive.

724
00:26:28,600 –> 00:26:29,680
That’s next.

725
00:26:29,680 –> 00:26:31,400
Immutability (WORM):

726
00:26:31,400 –> 00:26:34,280
how to lock evidence without breaking your pipelines.

727
00:26:34,280 –> 00:26:36,080
Immutability is where good architecture

728
00:26:36,080 –> 00:26:38,400
stops being aspirational and starts being inconvenient,

729
00:26:38,400 –> 00:26:39,360
which is why it works.

730
00:26:39,360 –> 00:26:41,200
In Azure terms, this is write once,

731
00:26:41,200 –> 00:26:43,120
read many (WORM): immutable storage policies

732
00:26:43,120 –> 00:26:45,520
on Blob Storage or ADLS Gen2

733
00:26:45,520 –> 00:26:48,440
that prevent modification or deletion for a defined period.

734
00:26:48,440 –> 00:26:51,040
Time-based retention locks data for a set interval.

735
00:26:51,040 –> 00:26:54,760
Legal hold locks it until someone explicitly clears it.

736
00:26:54,760 –> 00:26:55,760
Different intent.

737
00:26:55,760 –> 00:26:56,600
Same effect.

738
00:26:56,600 –> 00:26:59,040
You can create and read, but you can’t rewrite the past.

739
00:26:59,040 –> 00:27:00,040
That’s the point.

740
00:27:00,040 –> 00:27:02,280
The mistake teams make is treating immutability

741
00:27:02,280 –> 00:27:05,000
as a storage toggle you enable later.

742
00:27:05,000 –> 00:27:07,240
But immutability isn’t a feature you add.

743
00:27:07,240 –> 00:27:09,840
It’s a constraint that changes pipeline behavior,

744
00:27:09,840 –> 00:27:13,280
deployment patterns, and how humans negotiate fixes.

745
00:27:13,280 –> 00:27:15,720
So let’s be explicit about what changes operationally

746
00:27:15,720 –> 00:27:17,920
the moment you lock a container.

747
00:27:17,920 –> 00:27:19,920
Overwrites become illegal.

748
00:27:19,920 –> 00:27:23,160
“Re-run the job” becomes “publish a new version.”

749
00:27:23,160 –> 00:27:25,600
“Re-run the job” becomes “post an adjustment.”

750
00:27:25,600 –> 00:27:27,720
And any pipeline that assumes it can land

751
00:27:27,720 –> 00:27:30,000
to the same path twice will fail loudly.

752
00:27:30,000 –> 00:27:31,880
Azure Storage will enforce the policy

753
00:27:31,880 –> 00:27:34,880
and your orchestration will surface it as an error.

754
00:27:34,880 –> 00:27:36,360
In Data Factory, you’ll see failures

755
00:27:36,360 –> 00:27:37,800
like “path immutable due to policy”

756
00:27:37,800 –> 00:27:39,680
when a copy activity attempts to overwrite

757
00:27:39,680 –> 00:27:41,160
or modify a protected path.

758
00:27:41,160 –> 00:27:42,880
That error is not a platform defect.

759
00:27:42,880 –> 00:27:45,640
It is the system preventing evidence tampering, accidental

760
00:27:45,640 –> 00:27:46,480
or otherwise.

761
00:27:46,480 –> 00:27:48,160
This is the foundational misunderstanding:

762
00:27:48,160 –> 00:27:50,600
people think immutability is about security.

763
00:27:50,600 –> 00:27:51,440
It’s not.

764
00:27:51,440 –> 00:27:52,280
It’s about time.

765
00:27:52,280 –> 00:27:54,520
Making period close real in the storage layer

766
00:27:54,520 –> 00:27:56,240
not just in a calendar invite.

767
00:27:56,240 –> 00:27:57,760
Now, there are two workable patterns

768
00:27:57,760 –> 00:28:00,040
that don’t destroy your operations.

769
00:28:00,040 –> 00:28:02,320
The first pattern is immutable by design.

770
00:28:02,320 –> 00:28:06,120
Every ingestion writes to a unique, never-reused object name,

771
00:28:06,120 –> 00:28:07,880
and you never need to overwrite anything.

772
00:28:07,880 –> 00:28:10,400
That means a path that includes a load identifier

773
00:28:10,400 –> 00:28:13,440
plus a deterministic partitioning scheme: source system,

774
00:28:13,440 –> 00:28:16,320
date, and maybe hour if you’re dealing with telemetry.

775
00:28:16,320 –> 00:28:18,520
Each run produces a new object set.

776
00:28:18,520 –> 00:28:20,840
If the data arrives late, it lands as a new object set

777
00:28:20,840 –> 00:28:22,440
with a later load ID.

778
00:28:22,440 –> 00:28:24,560
You can still compute the same reported outputs

779
00:28:24,560 –> 00:28:27,800
because your close process selects which load IDs are in scope.

780
00:28:27,800 –> 00:28:30,360
The second pattern is the one most organizations actually

781
00:28:30,360 –> 00:28:30,680
need.

782
00:28:30,680 –> 00:28:34,080
Mutable staging, validate, publish, immutable archive.

783
00:28:34,080 –> 00:28:35,120
Here’s how it works.

784
00:28:35,120 –> 00:28:38,160
You ingest into a staging area that is intentionally mutable.

785
00:28:38,160 –> 00:28:40,440
You can rerun pipelines there, fix mapping bugs,

786
00:28:40,440 –> 00:28:42,320
and iterate without fighting WORM.

787
00:28:42,320 –> 00:28:43,560
Then you run validation gates.

788
00:28:43,560 –> 00:28:45,480
And only after validation succeeds

789
00:28:45,480 –> 00:28:48,480
do you publish to the raw evidence zone, which is immutable.

790
00:28:48,480 –> 00:28:50,280
Publish is not copy and delete.

791
00:28:50,280 –> 00:28:52,160
Publish is a one-way promotion.

792
00:28:52,160 –> 00:28:55,080
New immutable objects written with a stable naming convention

793
00:28:55,080 –> 00:28:57,880
plus a metadata record that binds them to a load ID,

794
00:28:57,880 –> 00:29:01,360
pipeline version, submitter identity, and validation results.

795
00:29:01,360 –> 00:29:02,880
If you’re using Azure Data Factory,

796
00:29:02,880 –> 00:29:05,720
this pattern becomes mandatory in some transformation

797
00:29:05,720 –> 00:29:07,680
scenarios because certain activities rely

798
00:29:07,680 –> 00:29:10,360
on temporary files during processing.

799
00:29:10,360 –> 00:29:12,480
Immutability policies prevent those temporary writes

800
00:29:12,480 –> 00:29:13,480
and cleanup operations.

801
00:29:13,480 –> 00:29:15,600
So you write to non-immutable storage first,

802
00:29:15,600 –> 00:29:18,080
then use a copy activity to move the finalized outputs

803
00:29:18,080 –> 00:29:19,600
into the immutable container.

804
00:29:19,600 –> 00:29:23,200
Again, inconvenient, predictable, correct.

805
00:29:23,200 –> 00:29:26,760
Now the subtle part: immutability doesn’t just apply to raw,

806
00:29:26,760 –> 00:29:28,960
it applies to anything you will later claim as evidence.

807
00:29:28,960 –> 00:29:31,240
That includes factor libraries for a closed period,

808
00:29:31,240 –> 00:29:33,680
the period-close reported KPI outputs,

809
00:29:33,680 –> 00:29:36,280
and the evidence vault documents that support disclosures.

810
00:29:36,280 –> 00:29:38,080
If those objects remain mutable,

811
00:29:38,080 –> 00:29:39,880
you can’t prove historical integrity.

812
00:29:39,880 –> 00:29:41,080
You can only promise it.

813
00:29:41,080 –> 00:29:43,000
Auditors don’t accept promises as controls.

814
00:29:43,000 –> 00:29:46,080
So you need a close process that includes a storage lock step.

815
00:29:46,080 –> 00:29:48,200
At period close, you freeze the selection of inputs,

816
00:29:48,200 –> 00:29:49,880
which load IDs are included.

817
00:29:49,880 –> 00:29:51,200
You freeze factor versions,

818
00:29:51,200 –> 00:29:53,360
you freeze the calculation logic reference,

819
00:29:53,360 –> 00:29:55,160
then you publish the reported outputs

820
00:29:55,160 –> 00:29:58,280
and apply immutability to the reported zone for that period.
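
The freeze itself can be sketched as a close manifest: one small record that pins the inputs, the factor version and the logic reference, plus a digest that makes later changes detectable. The field names and the use of SHA-256 are assumptions for illustration. Applying the actual storage lock is a separate platform step that this sketch does not perform.

```python
import hashlib
import json


def close_manifest(period, load_ids, factor_version, logic_ref):
    """Freeze what went into a period and return (manifest, sha256 hex digest)."""
    manifest = {
        "period": period,
        "load_ids": sorted(load_ids),      # the loads in scope at close
        "factor_version": factor_version,  # e.g. a factor library version tag
        "logic_ref": logic_ref,            # e.g. a commit hash of the calculation code
    }
    # Canonical JSON so the same content always produces the same digest.
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return manifest, digest
```

Writing the manifest and its digest into the reported zone alongside the KPI outputs gives an auditor one object that states what was in scope for the period.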

821
00:29:58,280 –> 00:30:00,320
You’re not locking the entire lake forever.

822
00:30:00,320 –> 00:30:02,000
You’re locking the slices that represent

823
00:30:02,000 –> 00:30:03,480
what we knew and reported.

824
00:30:03,480 –> 00:30:05,200
That distinction matters because you still need

825
00:30:05,200 –> 00:30:06,640
to operate next month.

826
00:30:06,640 –> 00:30:09,120
One more uncomfortable truth: immutability forces you

827
00:30:09,120 –> 00:30:10,280
to design for corrections.

828
00:30:10,280 –> 00:30:11,960
Corrections can’t be overwrites,

829
00:30:11,960 –> 00:30:13,880
so they become adjustment entries.

830
00:30:13,880 –> 00:30:16,200
Additive records that reference the original,

831
00:30:16,200 –> 00:30:18,480
carry a rationale and require approval.
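
A minimal sketch of an adjustment entry, assuming a simple additive model in which the corrected figure is the original plus approved deltas. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Adjustment:
    original_id: str   # reference to the untouched original record
    delta: float       # signed correction, in the same unit as the original
    rationale: str
    approved_by: str   # authenticated approver identity

    def __post_init__(self):
        # An adjustment without a reason or an approver is not accepted.
        if not self.rationale.strip():
            raise ValueError("adjustment requires a rationale")
        if not self.approved_by.strip():
            raise ValueError("adjustment requires an approver")


def effective_value(original_id, original_value, adjustments):
    """Original value plus every adjustment that references it."""
    return original_value + sum(
        a.delta for a in adjustments if a.original_id == original_id
    )
```

The original record is never modified. The reported figure can be recomputed with or without adjustments, and every difference is attributable to a named approver.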

832
00:30:18,480 –> 00:30:19,640
If you do this well,

833
00:30:19,640 –> 00:30:22,480
you end up with something auditors understand immediately.

834
00:30:22,480 –> 00:30:24,400
The original evidence remains intact

835
00:30:24,400 –> 00:30:26,880
and the adjustment trail is visible and attributable.

836
00:30:26,880 –> 00:30:27,880
If you do this poorly,

837
00:30:27,880 –> 00:30:30,360
people will attempt to bypass the system.

838
00:30:30,360 –> 00:30:31,680
They’ll hunt for a mutable folder.

839
00:30:31,680 –> 00:30:33,120
They’ll ask for exceptions.

840
00:30:33,120 –> 00:30:34,440
They’ll demand admin rights.

841
00:30:34,440 –> 00:30:35,680
That’s not a people problem.

842
00:30:35,680 –> 00:30:37,440
That’s you failing to design the only thing

843
00:30:37,440 –> 00:30:38,800
that survives entropy,

844
00:30:38,800 –> 00:30:41,080
an architecture that makes the right behavior easier

845
00:30:41,080 –> 00:30:42,280
than the wrong one.

846
00:30:42,280 –> 00:30:43,920
Next up is the hard part.

847
00:30:43,920 –> 00:30:45,600
Calculations that don’t drift,

848
00:30:45,600 –> 00:30:48,720
even when everyone keeps improving the logic.

849
00:30:48,720 –> 00:30:50,360
The governed calculation zone,

850
00:30:50,360 –> 00:30:51,960
Fabric Lakehouse or Synapse,

851
00:30:51,960 –> 00:30:53,280
not dashboard math.

852
00:30:53,280 –> 00:30:55,600
This is where most ESG stacks quietly rot,

853
00:30:55,600 –> 00:30:56,680
the calculation layer,

854
00:30:56,680 –> 00:30:58,000
not because people can’t do math

855
00:30:58,000 –> 00:30:59,400
but because they put math in places

856
00:30:59,400 –> 00:31:01,800
that can’t be governed like an accounting system.

857
00:31:01,800 –> 00:31:03,520
Power BI is a presentation tool.

858
00:31:03,520 –> 00:31:06,120
It is not an audit-grade calculation engine.

859
00:31:06,120 –> 00:31:07,360
The moment your emissions logic

860
00:31:07,360 –> 00:31:09,320
lives primarily in DAX measures,

861
00:31:09,320 –> 00:31:11,080
you’ve made your numbers dependent on a file

862
00:31:11,080 –> 00:31:13,040
that changes whenever someone wants a new visual.

863
00:31:13,040 –> 00:31:14,120
That’s not control.

864
00:31:14,120 –> 00:31:16,280
That’s drift with a user interface.

865
00:31:16,280 –> 00:31:17,640
So the rule is blunt.

866
00:31:17,640 –> 00:31:19,640
Calculations live in a governed zone,

867
00:31:19,640 –> 00:31:22,640
Fabric Lakehouse or Azure Synapse Analytics.

868
00:31:22,640 –> 00:31:24,040
Pick one.

869
00:31:24,040 –> 00:31:26,440
Then treat it like a finance system.

870
00:31:26,440 –> 00:31:29,680
Versioned logic, controlled releases, testability

871
00:31:29,680 –> 00:31:31,320
and reproducibility per period.

872
00:31:31,320 –> 00:31:34,080
Why this matters shows up the first time a stakeholder asks,

873
00:31:34,080 –> 00:31:36,120
why did last year’s number change?

874
00:31:36,120 –> 00:31:38,920
And you discover the answer is someone edited a measure.

875
00:31:38,920 –> 00:31:40,680
That answer will not survive assurance.

876
00:31:40,680 –> 00:31:42,120
In a governed calculation zone,

877
00:31:42,120 –> 00:31:44,880
the primary artifacts are explicit and inspectable.

878
00:31:44,880 –> 00:31:47,320
SQL views, stored procedures, notebooks

879
00:31:47,320 –> 00:31:49,400
and tables that represent outputs.

880
00:31:49,400 –> 00:31:51,960
You choose the artifact type based on what you can

881
00:31:51,960 –> 00:31:54,280
govern consistently, not on what your favorite

882
00:31:54,280 –> 00:31:55,680
engineer likes this week.

883
00:31:55,680 –> 00:31:58,080
SQL views can be clean for transparency.

884
00:31:58,080 –> 00:32:01,320
The logic is readable, diffable and can be reviewed.

885
00:32:01,320 –> 00:32:03,840
Stored procedures can enforce parameterization

886
00:32:03,840 –> 00:32:06,120
and encapsulate controlled transformations,

887
00:32:06,120 –> 00:32:08,080
but they can also become opaque

888
00:32:08,080 –> 00:32:11,640
if people start hiding business logic inside procedural code.

889
00:32:11,640 –> 00:32:14,240
Notebooks are powerful for complex transformations

890
00:32:14,240 –> 00:32:17,800
and factor application, but they demand discipline, source

891
00:32:17,800 –> 00:32:21,280
control, approved releases and consistent execution

892
00:32:21,280 –> 00:32:22,360
environments.

893
00:32:22,360 –> 00:32:24,880
Choose one dominant pattern for KPI computation

894
00:32:24,880 –> 00:32:26,920
and enforce it, because mixed paradigms

895
00:32:26,920 –> 00:32:28,720
are how you lose reproducibility.

896
00:32:28,720 –> 00:32:30,920
And this is the checkpoint most people ignore.

897
00:32:30,920 –> 00:32:33,240
Unit consistency and dimensionality.

898
00:32:33,240 –> 00:32:36,120
Emissions calculations are not just multiplication.

899
00:32:36,120 –> 00:32:38,080
They are multiplication under constraints.

900
00:32:38,080 –> 00:32:40,880
Site, region, period, source system, scope category,

901
00:32:40,880 –> 00:32:43,080
activity type, unit and factor version.

902
00:32:43,080 –> 00:32:45,480
If any of those dimensions are missing or ambiguous,

903
00:32:45,480 –> 00:32:47,440
you will produce numbers that look plausible

904
00:32:47,440 –> 00:32:48,840
and fail under interrogation.

905
00:32:48,840 –> 00:32:51,320
So the governed zone has a job beyond computing.

906
00:32:51,320 –> 00:32:53,080
It enforces dimensional completeness.

907
00:32:53,080 –> 00:32:56,120
Every record must be joinable to organizational structure.

908
00:32:56,120 –> 00:32:59,400
Every activity record must carry a unit that can be normalized.

909
00:32:59,400 –> 00:33:02,200
Every computed record must carry the factor version key used,

910
00:33:02,200 –> 00:33:04,320
not DEFRA, not EPA.

911
00:33:04,320 –> 00:33:07,920
A version key that binds the output to a specific factor

912
00:33:07,920 –> 00:33:09,280
library snapshot.
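
As a rough illustration of dimensional completeness, here is a Python sketch that rejects any record missing one of the dimensions listed above. The field names in `REQUIRED_DIMENSIONS` and both function names are assumptions made for this example; a real stack would more likely enforce this as constraints or validation views in the governed zone.

```python
from typing import Dict, List, Tuple

# Hypothetical column names for the dimensions named in the episode.
REQUIRED_DIMENSIONS = (
    "site", "region", "period", "source_system",
    "scope_category", "activity_type", "unit", "factor_version_key",
)


def missing_dimensions(record: Dict[str, object]) -> List[str]:
    """Return the required dimensions that are absent or empty."""
    return [dim for dim in REQUIRED_DIMENSIONS
            if record.get(dim) in (None, "")]


def split_by_completeness(
    records: List[Dict[str, object]],
) -> Tuple[List[Dict[str, object]], List[Tuple[Dict[str, object], List[str]]]]:
    """Separate computable records from rejected ones, keeping the reasons."""
    complete, rejected = [], []
    for record in records:
        gaps = missing_dimensions(record)
        if gaps:
            rejected.append((record, gaps))
        else:
            complete.append(record)
    return complete, rejected
```

Rejected records keep their list of gaps, so the failure is explainable later instead of silently dropped.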

913
00:33:09,280 –> 00:33:12,000
Now here’s where period close mechanics stop being a meeting

914
00:33:12,000 –> 00:33:13,720
and become an implementation.

915
00:33:13,720 –> 00:33:16,960
For a close to be auditable, three things must freeze together.

916
00:33:16,960 –> 00:33:18,960
Inputs, factors and logic reference.

917
00:33:18,960 –> 00:33:21,640
Freeze inputs means the system records, which load IDs

918
00:33:21,640 –> 00:33:24,120
or partitions are in scope for the period.

919
00:33:24,120 –> 00:33:26,080
You don’t just have “March data.”

920
00:33:26,080 –> 00:33:28,920
You have these ingestion runs validated, approved,

921
00:33:28,920 –> 00:33:29,880
included.

922
00:33:29,880 –> 00:33:32,200
Late arrivals don’t overwrite anything.

923
00:33:32,200 –> 00:33:33,880
They become late arrivals.

924
00:33:33,880 –> 00:33:36,520
Explicitly excluded or treated as adjustments.

925
00:33:36,520 –> 00:33:38,160
Freeze factors means the factor library

926
00:33:38,160 –> 00:33:40,720
used for that period is published with a version key

927
00:33:40,720 –> 00:33:42,120
and then locked as evidence.

928
00:33:42,120 –> 00:33:44,880
If your calculation queries join to latest,

929
00:33:44,880 –> 00:33:45,880
you’ve already failed.

930
00:33:45,880 –> 00:33:47,360
You’re not calculating a period.

931
00:33:47,360 –> 00:33:49,600
You’re calculating today’s opinion of the past.

932
00:33:49,600 –> 00:33:52,080
Freeze logic reference means the exact calculation

933
00:33:52,080 –> 00:33:54,880
artifacts used are versioned and identifiable.

934
00:33:54,880 –> 00:33:56,880
A Git commit, a released notebook package,

935
00:33:56,880 –> 00:34:00,120
a view definition version, something durable.

936
00:34:00,120 –> 00:34:01,760
The current notebook is not a version.

937
00:34:01,760 –> 00:34:03,040
It’s a moving target.

938
00:34:03,040 –> 00:34:05,440
Once those three are frozen, the reported outputs

939
00:34:05,440 –> 00:34:07,040
can be generated deterministically

940
00:34:07,040 –> 00:34:08,760
and published into the reported zone.

941
00:34:08,760 –> 00:34:10,920
And Power BI consumes those outputs.

942
00:34:10,920 –> 00:34:11,720
That’s the boundary.

943
00:34:11,720 –> 00:34:15,160
Power BI doesn’t get to help by recomputing core emissions

944
00:34:15,160 –> 00:34:15,680
logic.
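
One way to picture "inputs, factors, and logic reference frozen together" is a close manifest with a deterministic fingerprint. This is a Python sketch under stated assumptions: `CloseManifest` and `manifest_fingerprint` are hypothetical names, and the logic reference is assumed to be something durable such as a Git commit hash.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CloseManifest:
    period: str
    load_ids: Tuple[str, ...]       # ingestion runs in scope for the period
    factor_library_version: str     # pinned factor library version key
    logic_reference: str            # e.g. a Git commit hash or release tag


def manifest_fingerprint(manifest: CloseManifest) -> str:
    """SHA-256 over a canonical JSON form of the manifest.

    Load IDs are sorted so the same selection always hashes the same way.
    """
    payload = {
        "period": manifest.period,
        "load_ids": sorted(manifest.load_ids),
        "factor_library_version": manifest.factor_library_version,
        "logic_reference": manifest.logic_reference,
    }
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing the fingerprint next to the reported outputs gives you a cheap check that a rerun used the same three frozen ingredients.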

945
00:34:15,680 –> 00:34:18,160
Now, a common objection is, but we need flexibility.

946
00:34:18,160 –> 00:34:19,440
No, you need controlled change.

947
00:34:19,440 –> 00:34:20,320
You can change the model.

948
00:34:20,320 –> 00:34:21,400
You can improve mapping.

949
00:34:21,400 –> 00:34:22,520
You can add new factors.

950
00:34:22,520 –> 00:34:24,480
You can refine Scope 3 categories.

951
00:34:24,480 –> 00:34:27,360
But every change becomes a new version with a new effective date

952
00:34:27,360 –> 00:34:29,640
and a clear statement of what periods it impacts.

953
00:34:29,640 –> 00:34:30,720
That’s not bureaucracy.

954
00:34:30,720 –> 00:34:32,920
That’s how you stop rewriting history by accident.

955
00:34:32,920 –> 00:34:35,320
If you remember nothing else, the governed calculation zone

956
00:34:35,320 –> 00:34:37,400
is where ESG becomes deterministic.

957
00:34:37,400 –> 00:34:39,920
Dashboards are where ESG becomes arguable.

958
00:34:39,920 –> 00:34:42,360
And once you enforce deterministic computation,

959
00:34:42,360 –> 00:34:44,320
the next dependency becomes obvious.

960
00:34:44,320 –> 00:34:47,040
Emissions logic lives and dies on factor management.

961
00:34:47,040 –> 00:34:49,960
Emissions factors: versioning, or you’re rewriting history.

962
00:34:49,960 –> 00:34:52,080
Emissions factors are where most ESG stacks

963
00:34:52,080 –> 00:34:53,560
commit their quietest fraud.

964
00:34:53,560 –> 00:34:56,200
They treat reference data like a convenience file,

965
00:34:56,200 –> 00:34:58,400
a spreadsheet attachment, something you update

966
00:34:58,400 –> 00:34:59,880
when the new one comes out.

967
00:34:59,880 –> 00:35:02,280
That behavior rewrites history.

968
00:35:02,280 –> 00:35:04,440
Because an emission factor is not a number.

969
00:35:04,440 –> 00:35:07,080
It’s a controlled assumption that converts activity

970
00:35:07,080 –> 00:35:08,120
into emissions.

971
00:35:08,120 –> 00:35:10,800
Change the assumption and you change the outcome.

972
00:35:10,800 –> 00:35:13,440
Which means if you can’t prove which factor set applied

973
00:35:13,440 –> 00:35:16,160
to FY1, your FY1 numbers aren’t evidence.

974
00:35:16,160 –> 00:35:18,200
They’re a current interpretation of the past.

975
00:35:18,200 –> 00:35:20,360
Auditors don’t assure interpretations.

976
00:35:20,360 –> 00:35:22,200
They assure records.

977
00:35:22,200 –> 00:35:23,480
So the rule is simple.

978
00:35:23,480 –> 00:35:25,640
Emissions factors are controlled reference data

979
00:35:25,640 –> 00:35:28,160
with versioning, provenance and effective dates.

980
00:35:28,160 –> 00:35:30,920
And once a period closes, the specific factor set

981
00:35:30,920 –> 00:35:33,520
used for that period becomes immutable evidence.

982
00:35:33,520 –> 00:35:35,400
This is the part everyone tries to shortcut with,

983
00:35:35,400 –> 00:35:37,400
we use DEFRA or we use EPA.

984
00:35:37,400 –> 00:35:38,160
That’s not evidence.

985
00:35:38,160 –> 00:35:39,200
That’s a brand label.

986
00:35:39,200 –> 00:35:41,400
What matters is the specific library version,

987
00:35:41,400 –> 00:35:43,440
the effective date range, the geography mapping

988
00:35:43,440 –> 00:35:45,240
and the category mapping you applied.

989
00:35:45,240 –> 00:35:46,600
And here’s the thing most people miss.

990
00:35:46,600 –> 00:35:48,560
Factor management isn’t one table.

991
00:35:48,560 –> 00:35:49,600
It’s a small system.

992
00:35:49,600 –> 00:35:51,760
You need at least four concepts in your model.

993
00:35:51,760 –> 00:35:53,480
One, a factor library entity.

994
00:35:53,480 –> 00:35:56,360
This represents a published set you can refer to as a unit.

995
00:35:56,360 –> 00:35:58,840
It has a name, a publisher, a published date,

996
00:35:58,840 –> 00:36:01,600
and a status like draft, approved, and archived.

997
00:36:01,600 –> 00:36:05,160
Two, the factor records themselves, the actual conversion values

998
00:36:05,160 –> 00:36:07,360
with units, gas type where applicable,

999
00:36:07,360 –> 00:36:10,200
and any classification fields you rely on in joins.

1000
00:36:10,200 –> 00:36:14,760
Three, applicability metadata, geography, sector,

1001
00:36:14,760 –> 00:36:17,960
activity type mapping, and effective date range.

1002
00:36:17,960 –> 00:36:20,120
If a factor is only valid for a country

1003
00:36:20,120 –> 00:36:22,040
or only valid from a certain date,

1004
00:36:22,040 –> 00:36:24,200
the model needs to carry that explicitly.

1005
00:36:24,200 –> 00:36:26,880
Four, provenance artifacts, where it came from

1006
00:36:26,880 –> 00:36:28,400
and how it entered your system.

1007
00:36:28,400 –> 00:36:30,400
That can be a link to an evidence document

1008
00:36:30,400 –> 00:36:33,880
in your evidence vault or at minimum, a stored reference

1009
00:36:33,880 –> 00:36:35,640
that can be produced during assurance.
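
To make the four concepts concrete, here is a minimal Python sketch of the model. All class and field names are hypothetical; in Fabric or Synapse these would be tables, and the dataclasses only show which attributes each concept needs to carry.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional


class LibraryStatus(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    ARCHIVED = "archived"


@dataclass(frozen=True)
class FactorLibrary:                 # 1. the published set, referenced as a unit
    version_key: str
    name: str
    publisher: str
    published_date: date
    status: LibraryStatus


@dataclass(frozen=True)
class FactorRecord:                  # 2. the actual conversion value
    factor_id: str
    library_version_key: str
    activity_type: str
    value: float
    unit: str                        # e.g. "kgCO2e/kWh"
    gas_type: Optional[str] = None


@dataclass(frozen=True)
class Applicability:                 # 3. where and when the factor is valid
    factor_id: str
    geography: str
    valid_from: date
    valid_to: Optional[date] = None  # None means open-ended
    sector: Optional[str] = None

    def covers(self, geography: str, on_date: date) -> bool:
        in_range = self.valid_from <= on_date and (
            self.valid_to is None or on_date <= self.valid_to
        )
        return self.geography == geography and in_range


@dataclass(frozen=True)
class Provenance:                    # 4. where it came from, how it got in
    library_version_key: str
    evidence_reference: str          # pointer into the evidence vault
    ingested_by: str
```

Keeping applicability separate from the value is what lets a join fail loudly when a factor is used outside its country or date range.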

1010
00:36:35,640 –> 00:36:37,280
Now the failure mode is predictable.

1011
00:36:37,280 –> 00:36:40,160
Teams store factors in a table called emission factors

1012
00:36:40,160 –> 00:36:42,800
with a column called factor value and no version key.

1013
00:36:42,800 –> 00:36:45,080
Then calculations join to it using a natural key

1014
00:36:45,080 –> 00:36:48,840
like activity type and country, and they default to latest.

1015
00:36:48,840 –> 00:36:50,720
And it works until the factor table updates,

1016
00:36:50,720 –> 00:36:53,120
then rerunning last year produces different results.

1017
00:36:53,120 –> 00:36:54,600
And the team calls it an update.

1018
00:36:54,600 –> 00:36:55,680
It is not an update.

1019
00:36:55,680 –> 00:36:57,480
It is a restatement without governance.

1020
00:36:57,480 –> 00:36:59,680
So the enforcement pattern is also predictable.

1021
00:36:59,680 –> 00:37:01,440
Factor-to-period binding.

1022
00:37:01,440 –> 00:37:04,440
Every computed emission record must carry a factor version key,

1023
00:37:04,440 –> 00:37:06,480
not a textual label, a key that ties back

1024
00:37:06,480 –> 00:37:08,880
to a specific published library snapshot.

1025
00:37:08,880 –> 00:37:10,960
And your calculation logic must require it.

1026
00:37:10,960 –> 00:37:13,040
If the pipeline can run without specifying

1027
00:37:13,040 –> 00:37:14,800
the factor version, you’ve built a machine

1028
00:37:14,800 –> 00:37:16,440
that can rewrite its own past.

1029
00:37:16,440 –> 00:37:17,960
This is where systems beat policy.

1030
00:37:17,960 –> 00:37:19,960
Don’t tell people “don’t use latest.”

1031
00:37:19,960 –> 00:37:22,880
Make latest unusable in period close processing.

1032
00:37:22,880 –> 00:37:24,520
Use it only in exploratory analysis

1033
00:37:24,520 –> 00:37:27,640
where you explicitly label the output as non-reportable.
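
A small Python sketch of "make latest unusable," assuming the pipeline resolves its factor version through one function. `resolve_factor_version` and `UnpinnedFactorVersionError` are hypothetical names for this example; the idea is that a reportable run cannot start without an explicit version key.

```python
from typing import Optional


class UnpinnedFactorVersionError(ValueError):
    """Raised when a reportable run does not pin a factor library version."""


def resolve_factor_version(requested: Optional[str], *,
                           reportable: bool,
                           latest_version: str) -> str:
    """Return the factor version to use for a run.

    Reportable runs must name an explicit version. Exploratory runs may
    fall back to the latest version, and their output should be labeled
    non-reportable by the caller.
    """
    unpinned = requested is None or requested.strip().lower() in ("", "latest")
    if unpinned:
        if reportable:
            raise UnpinnedFactorVersionError(
                "reportable runs must specify an explicit factor version key"
            )
        return latest_version
    return requested
```

The policy lives in code, so nobody has to remember it at period close.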

1034
00:37:27,640 –> 00:37:29,200
Then you build the publish workflow.

1035
00:37:29,200 –> 00:37:32,440
Factors do not appear in production tables as ad hoc edits.

1036
00:37:32,440 –> 00:37:34,000
They move through a life cycle.

1037
00:37:34,000 –> 00:37:36,240
Draft factors exist in a working area.

1038
00:37:36,240 –> 00:37:38,360
Someone reviews them, someone approves them,

1039
00:37:38,360 –> 00:37:40,040
and then you publish a new library version.

1040
00:37:40,040 –> 00:37:41,440
After it’s published, you lock it.

1041
00:37:41,440 –> 00:37:43,240
That’s where immutability enters again.

1042
00:37:43,240 –> 00:37:46,000
The published factor library for a period becomes evidence.

1043
00:37:46,000 –> 00:37:47,640
So it becomes WORM-protected.

1044
00:37:47,640 –> 00:37:49,400
And yes, you can still correct factor data.

1045
00:37:49,400 –> 00:37:51,440
You just can’t pretend it was always that way.

1046
00:37:51,440 –> 00:37:52,920
Corrections become a new version

1047
00:37:52,920 –> 00:37:56,600
with a clear statement of impact, which future periods use it,

1048
00:37:56,600 –> 00:37:59,480
and whether prior periods require an adjustment entry.
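
The publish workflow can be sketched as a tiny state machine. This Python example is illustrative only: the status names and the functions `transition` and `open_correction` are assumptions, and real locking would come from storage-level immutability, not from application code.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Allowed moves between lifecycle states; "published" has no way out.
ALLOWED_TRANSITIONS = {
    "draft": {"reviewed"},
    "reviewed": {"approved", "draft"},
    "approved": {"published"},
    "published": set(),
}


@dataclass(frozen=True)
class FactorLibraryVersion:
    version_key: str
    status: str = "draft"
    supersedes: Optional[str] = None  # version this one corrects, if any


def transition(lib: FactorLibraryVersion, new_status: str) -> FactorLibraryVersion:
    """Move a library version forward, rejecting any move not in the table."""
    if new_status not in ALLOWED_TRANSITIONS[lib.status]:
        raise ValueError(f"cannot move from {lib.status!r} to {new_status!r}")
    return replace(lib, status=new_status)


def open_correction(published: FactorLibraryVersion,
                    new_version_key: str) -> FactorLibraryVersion:
    """A correction is a new draft version that points at the one it fixes."""
    if published.status != "published":
        raise ValueError("only published versions are corrected via a new version")
    return FactorLibraryVersion(version_key=new_version_key,
                                supersedes=published.version_key)
```

A published version can never go back to draft; the only route to a fix is a new version that names what it supersedes.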

1049
00:37:59,480 –> 00:38:01,600
Now, how does this land in a Microsoft stack

1050
00:38:01,600 –> 00:38:03,880
without turning into another governance slide deck?

1051
00:38:03,880 –> 00:38:06,400
In Fabric or Synapse, you implement factor libraries

1052
00:38:06,400 –> 00:38:09,480
as tables with explicit version keys and effective dating.

1053
00:38:09,480 –> 00:38:11,680
In your calculation views or notebooks,

1054
00:38:11,680 –> 00:38:13,680
you join activity data to factors

1055
00:38:13,680 –> 00:38:17,080
using activity classification, geography, and period date.

1056
00:38:17,080 –> 00:38:18,600
But you don’t let the join float.

1057
00:38:18,600 –> 00:38:22,640
You require an input parameter for factor library version ID

1058
00:38:22,640 –> 00:38:24,400
when producing reported outputs.

1059
00:38:24,400 –> 00:38:26,680
Or you bind the version through a period

1060
00:38:26,680 –> 00:38:29,200
close configuration table that is itself locked

1061
00:38:29,200 –> 00:38:30,280
after close.

1062
00:38:30,280 –> 00:38:32,720
Either way, the output row carries the version ID.
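
Here is a Python sketch of the second option, binding the version through a period close configuration. The function name, the dictionary shapes, and the factor values are all made up for illustration; in practice this would be a SQL view or notebook in the governed zone, but the join conditions are the same.

```python
from datetime import date
from typing import Dict, List


def compute_reported_emissions(period: str,
                               activities: List[dict],
                               factors: List[dict],
                               close_config: Dict[str, str]) -> List[dict]:
    """Join activities to factors using the version pinned for the period."""
    if period not in close_config:
        raise KeyError(f"no close configuration for period {period!r}")
    version = close_config[period]

    results = []
    for act in activities:
        matches = [
            f for f in factors
            if f["version_key"] == version
            and f["activity_type"] == act["activity_type"]
            and f["country"] == act["country"]
            and f["valid_from"] <= act["date"] <= f["valid_to"]
        ]
        # Zero or several matches means the join is ambiguous: fail loudly.
        if len(matches) != 1:
            raise LookupError(
                f"expected exactly one factor for {act['activity_type']!r} "
                f"in {act['country']!r}, found {len(matches)}"
            )
        results.append({
            **act,
            "emissions_kgco2e": act["quantity"] * matches[0]["value"],
            "factor_version_key": version,  # every output row carries the key
        })
    return results
```

Because the version comes from the close configuration and not from "whatever is newest," rerunning the period after a factor update returns the same rows.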

1063
00:38:32,720 –> 00:38:35,680
And you treat the factor library publish as a formal release.

1064
00:38:35,680 –> 00:38:37,200
It’s an artifact with approvals.

1065
00:38:37,200 –> 00:38:39,920
It’s registered in Purview, and it can be traced.

1066
00:38:39,920 –> 00:38:42,200
That’s how you answer the audit question in one sentence.

1067
00:38:42,200 –> 00:38:45,600
FY1 used factor library version X, published on Y,

1068
00:38:45,600 –> 00:38:47,520
approved by Z, and locked on close.

1069
00:38:47,520 –> 00:38:50,120
Without that, you’ll end up in the classic assurance failure.

1070
00:38:50,120 –> 00:38:52,360
Someone asks why FY1 changed,

1071
00:38:52,360 –> 00:38:55,520
And you respond with, because the factors were updated.

1072
00:38:55,520 –> 00:38:58,000
That response admits you don’t have reproducibility.

1073
00:38:58,000 –> 00:38:59,480
And if you don’t have reproducibility,

1074
00:38:59,480 –> 00:39:01,080
you don’t have audit-grade ESG.

1075
00:39:01,080 –> 00:39:02,600
Once factor versioning is real,

1076
00:39:02,600 –> 00:39:04,440
KPI modeling stops being guesswork

1077
00:39:04,440 –> 00:39:06,240
and starts being constrained engineering.

1078
00:39:06,240 –> 00:39:07,080
That’s next.

1079
00:39:07,080 –> 00:39:09,880
KPI modeling: Scope 1 to 3, energy, water,

1080
00:39:09,880 –> 00:39:12,120
workforce metrics, supplier coverage.

1081
00:39:12,120 –> 00:39:14,240
Once factors are versioned, KPI modeling stops

1082
00:39:14,240 –> 00:39:15,720
being a creative writing exercise

1083
00:39:15,720 –> 00:39:17,760
and becomes what it always should have been.

1084
00:39:17,760 –> 00:39:19,680
Constraints encoded as data.

1085
00:39:19,680 –> 00:39:21,960
Most ESG teams model KPIs like labels:

1086
00:39:21,960 –> 00:39:26,280
Scope 1, Scope 2, Scope 3, water, diversity, supplier coverage.

1087
00:39:26,280 –> 00:39:28,720
Then they build a dashboard and assume the definitions

1088
00:39:28,720 –> 00:39:31,040
will stay stable because everyone agreed.

1089
00:39:31,040 –> 00:39:31,720
They won’t.

1090
00:39:31,720 –> 00:39:33,680
So KPI modeling has one job.

1091
00:39:33,680 –> 00:39:36,280
Make the definition enforceable and make drift visible

1092
00:39:36,280 –> 00:39:37,680
when someone tries to change it.

1093
00:39:37,680 –> 00:39:39,280
Start with Scope 1, 2, and 3.

1094
00:39:39,280 –> 00:39:41,200
These aren’t tags you slap on at the end.

1095
00:39:41,200 –> 00:39:42,320
They’re structural constraints

1096
00:39:42,320 –> 00:39:45,520
that determine what data qualifies, which factors are valid,

1097
00:39:45,520 –> 00:39:46,880
and what boundaries apply.

1098
00:39:46,880 –> 00:39:50,520
Scope 1 is direct emissions from owned or controlled sources.

1099
00:39:50,520 –> 00:39:52,800
In system terms, Scope 1 activity records

1100
00:39:52,800 –> 00:39:54,520
must bind to assets you control.

1101
00:39:54,520 –> 00:39:57,480
Boilers, generators, company vehicles, refrigerants.

1102
00:39:57,480 –> 00:39:59,960
That means your data model needs an asset dimension

1103
00:39:59,960 –> 00:40:03,000
or at least an owned-or-controlled attribute you can prove,

1104
00:40:03,000 –> 00:40:04,040
not infer later.

1105
00:40:04,040 –> 00:40:06,760
If you can’t tie the activity record to the controlled

1106
00:40:06,760 –> 00:40:08,880
asset set that existed during the period,

1107
00:40:08,880 –> 00:40:10,200
you’re back to narrative.

1108
00:40:10,200 –> 00:40:14,080
Scope 2 is purchased electricity, heat, steam, and cooling.

1109
00:40:14,080 –> 00:40:17,000
In modeling terms, Scope 2 requires a clean separation

1110
00:40:17,000 –> 00:40:18,760
between consumption and factor application

1111
00:40:18,760 –> 00:40:22,000
because electricity data can arrive as meter readings,

1112
00:40:22,000 –> 00:40:23,960
invoices, or allocations.

1113
00:40:23,960 –> 00:40:26,480
Your model must preserve the original consumption units

1114
00:40:26,480 –> 00:40:28,000
and the conversion path.

1115
00:40:28,000 –> 00:40:31,440
And the output must carry which factor version applied,

1116
00:40:31,440 –> 00:40:33,720
plus the geography and supplier mapping

1117
00:40:33,720 –> 00:40:34,600
that justified it.

1118
00:40:34,600 –> 00:40:37,240
Otherwise, you’ll end up with global average factors

1119
00:40:37,240 –> 00:40:39,800
quietly covering gaps and then spend months pretending

1120
00:40:39,800 –> 00:40:40,960
it was intentional.

1121
00:40:40,960 –> 00:40:43,960
Scope 3 is where the system either becomes honest or collapses.

1122
00:40:43,960 –> 00:40:45,760
Scope 3 is a value chain problem, which

1123
00:40:45,760 –> 00:40:48,400
means the model needs to handle mixed evidence quality.

1124
00:40:48,400 –> 00:40:51,080
Supplier-provided data, spend-based estimates,

1125
00:40:51,080 –> 00:40:53,640
activity-based proxies, and hybrid methods.

1126
00:40:53,640 –> 00:40:55,880
The common failure is forcing all of that into one column

1127
00:40:55,880 –> 00:40:58,480
called emissions and calling it complete.

1128
00:40:58,480 –> 00:41:02,800
So the rule is: every Scope 3 KPI must carry two flags,

1129
00:41:02,800 –> 00:41:06,600
measured versus estimated, and coverage scope. Measured means

1130
00:41:06,600 –> 00:41:09,200
supplier provided or directly sourced activity

1131
00:41:09,200 –> 00:41:10,760
with traceable factors.

1132
00:41:10,760 –> 00:41:14,160
Estimated means proxy logic with estimation factors

1133
00:41:14,160 –> 00:41:16,960
treated as controlled inputs just like emission factors.

1134
00:41:16,960 –> 00:41:19,920
Coverage scope means what part of the category this KPI

1135
00:41:19,920 –> 00:41:21,080
represents.

1136
00:41:21,080 –> 00:41:23,640
Percent of spend covered, percent of suppliers covered,

1137
00:41:23,640 –> 00:41:25,080
percent of sites covered.
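
A Python sketch of those two flags, assuming coverage is expressed as percent of spend. `Method`, `Scope3Line`, and `totals_by_method` are hypothetical names; the design choice shown is that measured and estimated emissions are never summed into one undifferentiated column.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable


class Method(Enum):
    MEASURED = "measured"    # supplier-provided or directly sourced activity
    ESTIMATED = "estimated"  # proxy logic with controlled estimation factors


@dataclass(frozen=True)
class Scope3Line:
    category: str
    method: Method
    emissions_tco2e: float
    covered_spend: float     # spend represented by this line
    total_spend: float       # total spend in the category

    @property
    def coverage_pct(self) -> float:
        """Share of the category's spend that this line actually represents."""
        if self.total_spend <= 0:
            raise ValueError("total_spend must be positive")
        return 100.0 * self.covered_spend / self.total_spend


def totals_by_method(lines: Iterable[Scope3Line]) -> Dict[str, float]:
    """Sum emissions separately for measured and estimated lines."""
    totals = {method.value: 0.0 for method in Method}
    for line in lines:
        totals[line.method.value] += line.emissions_tco2e
    return totals
```

A dashboard built on this can show the total next to how much of it is measured and how much of the category it covers.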

1138
00:41:25,080 –> 00:41:27,480
Without those, your Scope 3 number is just a confidence

1139
00:41:27,480 –> 00:41:28,040
trick.

1140
00:41:28,040 –> 00:41:30,320
Now energy and water, because these KPIs

1141
00:41:30,320 –> 00:41:32,640
attract the most casual denominator abuse.

1142
00:41:32,640 –> 00:41:35,680
Consumption is easy. Intensity is where you get audited.

1143
00:41:35,680 –> 00:41:38,480
Energy intensity metrics require a denominator.

1144
00:41:38,480 –> 00:41:41,600
Revenue, production volume, floor area, headcount,

1145
00:41:41,600 –> 00:41:42,640
output units.

1146
00:41:42,640 –> 00:41:44,640
Denominators drift because someone changes

1147
00:41:44,640 –> 00:41:46,480
the definition of revenue or switches

1148
00:41:46,480 –> 00:41:48,600
the production metric mid-year or updates

1149
00:41:48,600 –> 00:41:50,440
organizational structure mappings.

1150
00:41:50,440 –> 00:41:53,640
So your KPI model needs to treat denominators as governed data,

1151
00:41:53,640 –> 00:41:55,200
not as a measure in a dashboard.

1152
00:41:55,200 –> 00:41:58,040
That means store denominators as tables with source, period,

1153
00:41:58,040 –> 00:42:00,160
org unit, and definition version.

1154
00:42:00,160 –> 00:42:02,600
Then compute intensity in the governed calculation zone

1155
00:42:02,600 –> 00:42:04,320
and publish it like any other KPI.

1156
00:42:04,320 –> 00:42:06,280
If someone wants a new denominator, fine,

1157
00:42:06,280 –> 00:42:09,000
they get a new KPI variant with a new definition,

1158
00:42:09,000 –> 00:42:11,000
not a silent rewrite of the old one.
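
To show what a governed denominator might look like, here is a Python sketch. The `Denominator` fields follow the attributes named above (source, period, org unit, definition version); the class, the function `energy_intensity`, and the output shape are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class Denominator:
    kind: str                # e.g. "floor_area_m2" or "production_units"
    source: str              # system the value came from
    period: str
    org_unit: str
    definition_version: str  # which definition of the denominator applied
    value: float


def energy_intensity(energy_kwh: float,
                     denominator: Denominator) -> Dict[str, Union[str, float]]:
    """Compute intensity and keep the denominator's identity on the output."""
    if denominator.value <= 0:
        raise ValueError("denominator must be positive")
    return {
        "period": denominator.period,
        "org_unit": denominator.org_unit,
        "intensity": energy_kwh / denominator.value,
        "denominator_kind": denominator.kind,
        "denominator_definition_version": denominator.definition_version,
    }
```

A new denominator definition produces a row with a different `denominator_definition_version`, which makes it a visible KPI variant instead of a silent rewrite.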

1159
00:42:11,000 –> 00:42:14,400
Water works the same way, but with more traps, local units,

1160
00:42:14,400 –> 00:42:17,680
local reporting boundaries, and data that often arrives late.

1161
00:42:17,680 –> 00:42:20,960
So the model needs quality flags: estimated, missing context,

1162
00:42:20,960 –> 00:42:23,080
suspect spikes, late-arriving.

1163
00:42:23,080 –> 00:42:24,120
Don’t hide those.

1164
00:42:24,120 –> 00:42:25,960
Put them in the data set so the dashboard

1165
00:42:25,960 –> 00:42:28,520
can surface confidence, not just totals.

1166
00:42:28,520 –> 00:42:30,760
Workforce metrics are the quiet governance test.

1167
00:42:30,760 –> 00:42:33,840
Headcount, turnover, training hours, safety incident rates,

1168
00:42:33,840 –> 00:42:35,560
these are definition landmines.

1169
00:42:35,560 –> 00:42:38,080
The calculation often depends on what counts as an employee,

1170
00:42:38,080 –> 00:42:40,800
what counts as a contractor, which geographies are in scope,

1171
00:42:40,800 –> 00:42:43,800
and how organizational units map to legal entities.

1172
00:42:43,800 –> 00:42:46,240
If the HR team changes the underlying definition,

1173
00:42:46,240 –> 00:42:48,280
your KPI changes without a code change.

1174
00:42:48,280 –> 00:42:50,280
So KPI modeling for workforce metrics

1175
00:42:50,280 –> 00:42:52,280
must include definition binding.

1176
00:42:52,280 –> 00:42:55,560
A versioned definition record that states the inclusion rules,

1177
00:42:55,560 –> 00:42:57,880
the denominator, and the aggregation level.

1178
00:42:57,880 –> 00:43:00,480
Then the outputs carry that definition version key, again,

1179
00:43:00,480 –> 00:43:01,960
not a label, a key.

1180
00:43:01,960 –> 00:43:05,680
And supplier coverage needs to be modeled explicitly

1181
00:43:05,680 –> 00:43:08,120
because it’s the only way to prevent vanity percentages.

1182
00:43:08,120 –> 00:43:09,520
Covered must have a definition,

1183
00:43:09,520 –> 00:43:11,120
covered by survey response,

1184
00:43:11,120 –> 00:43:14,000
covered by verified activity, covered by modeled estimates,

1185
00:43:14,000 –> 00:43:15,440
each is a different confidence level.

1186
00:43:15,440 –> 00:43:17,920
So store coverage as its own KPI family

1187
00:43:17,920 –> 00:43:20,240
with numerator and denominator definitions

1188
00:43:20,240 –> 00:43:22,320
and treat it like a first class metric.

1189
00:43:22,320 –> 00:43:24,160
Otherwise, you’ll end up with a green dashboard

1190
00:43:24,160 –> 00:43:25,960
that can’t explain its own scope.

1191
00:43:25,960 –> 00:43:28,920
Once your KPI model encodes scope, definitions,

1192
00:43:28,920 –> 00:43:31,320
estimation flags, and denominator governance,

1193
00:43:31,320 –> 00:43:32,880
the architecture can produce numbers

1194
00:43:32,880 –> 00:43:34,520
that survive interrogation.

1195
00:43:34,520 –> 00:43:36,280
And now we can talk about the failure modes

1196
00:43:36,280 –> 00:43:37,680
because they’re not random.

1197
00:43:37,680 –> 00:43:40,080
They’re designed in. Failure mode one:

1198
00:43:40,080 –> 00:43:42,040
manual CSV overrides.

1199
00:43:42,040 –> 00:43:43,360
If there’s a single failure mode

1200
00:43:43,360 –> 00:43:46,160
that shows up in almost every ESG program, it’s this.

1201
00:43:46,160 –> 00:43:48,040
Someone fixes the number with a file.

1202
00:43:48,040 –> 00:43:49,840
It always sounds reasonable in the moment,

1203
00:43:49,840 –> 00:43:51,360
the meter export was wrong.

1204
00:43:51,360 –> 00:43:52,920
The facility sent a correction late,

1205
00:43:52,920 –> 00:43:54,800
the supplier portal didn’t respond.

1206
00:43:54,800 –> 00:43:56,200
The CFO wants the dashboard

1207
00:43:56,200 –> 00:43:58,200
to match what finance believes is true.

1208
00:43:58,200 –> 00:44:00,200
So a spreadsheet appears, then a CSV,

1209
00:44:00,200 –> 00:44:01,600
then a folder called uploads,

1210
00:44:01,600 –> 00:44:03,760
then a file name that admits the whole control model

1211
00:44:03,760 –> 00:44:06,320
is imaginary: final_v7.csv.

1212
00:44:06,320 –> 00:44:08,520
Here’s what goes wrong mechanically.

1213
00:44:08,520 –> 00:44:11,320
A manual override has no inherent chain of custody.

1214
00:44:11,320 –> 00:44:13,160
It doesn’t preserve who changed the value,

1215
00:44:13,160 –> 00:44:15,600
what the previous value was, what justification existed,

1216
00:44:15,600 –> 00:44:16,800
which approval covered it,

1217
00:44:16,800 –> 00:44:18,880
and whether the period was already closed.

1218
00:44:18,880 –> 00:44:20,480
A CSV is a blob of claims.

1219
00:44:20,480 –> 00:44:23,400
Unless the system forces metadata capture and approval,

1220
00:44:23,400 –> 00:44:25,840
the file becomes a silent rewrite of evidence.

1221
00:44:25,840 –> 00:44:28,680
And once that pathway exists, it gets used for everything.

1222
00:44:28,680 –> 00:44:31,040
First, it’s just this one facility.

1223
00:44:31,040 –> 00:44:32,440
Then it’s just this one month.

1224
00:44:32,440 –> 00:44:34,240
Then it’s just this supplier.

1225
00:44:34,240 –> 00:44:36,080
Then it becomes the default operating model

1226
00:44:36,080 –> 00:44:38,760
because it’s faster than fixing ingestion, modeling,

1227
00:44:38,760 –> 00:44:39,800
or validation.

1228
00:44:39,800 –> 00:44:42,120
Entropy loves convenience.

1229
00:44:42,120 –> 00:44:43,680
Auditors hate it for one reason.

1230
00:44:43,680 –> 00:44:46,160
It creates an uncontrolled modification pathway

1231
00:44:46,160 –> 00:44:47,600
inside the reporting boundary.

1232
00:44:47,600 –> 00:44:49,320
That phrase matters because it doesn’t matter

1233
00:44:49,320 –> 00:44:51,200
whether the sustainability team is honest.

1234
00:44:51,200 –> 00:44:54,160
It matters whether the system allows undetectable change.

1235
00:44:54,160 –> 00:44:57,160
When a CSV can be uploaded and override prior data,

1236
00:44:57,160 –> 00:44:59,640
you have created a pathway where numbers can change

1237
00:44:59,640 –> 00:45:00,960
without a durable trail.

1238
00:45:00,960 –> 00:45:02,440
That’s the definition of weak control.

1239
00:45:02,440 –> 00:45:04,280
The classic symptoms are always the same.

1240
00:45:04,280 –> 00:45:05,480
No reviewer trace.

1241
00:45:05,480 –> 00:45:07,960
One person edits, one person uploads,

1242
00:45:07,960 –> 00:45:10,720
and the approval is a Teams message.

1243
00:45:10,720 –> 00:45:11,600
No locked period.

1244
00:45:11,600 –> 00:45:13,280
The organization says the month is closed,

1245
00:45:13,280 –> 00:45:15,200
but the storage and tables are still writable,

1246
00:45:15,200 –> 00:45:17,480
so the month is closed in conversation only.

1247
00:45:17,480 –> 00:45:18,400
No checksum.

1248
00:45:18,400 –> 00:45:19,640
No content fingerprint.

1249
00:45:19,640 –> 00:45:22,000
You can’t even prove the file you showed the auditor

1250
00:45:22,000 –> 00:45:23,760
is the file that produced the KPI.

1251
00:45:23,760 –> 00:45:25,680
No binding to calculation logic.

1252
00:45:25,680 –> 00:45:28,680
The override becomes the truth by brute force.

1253
00:45:28,680 –> 00:45:31,120
Not because it passed through governed computation

1254
00:45:31,120 –> 00:45:33,240
with known factors and known code.
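
The missing checksum is the cheapest of these to fix. A minimal Python sketch, assuming SHA-256 as the content fingerprint and that the fingerprint is recorded at ingestion time; the function names are hypothetical.

```python
import hashlib
import io
from typing import BinaryIO


def fingerprint_stream(stream: BinaryIO, chunk_size: int = 65536) -> str:
    """SHA-256 hex digest of a binary stream, read in chunks."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def fingerprint_bytes(data: bytes) -> str:
    """Convenience wrapper for content that is already in memory."""
    return fingerprint_stream(io.BytesIO(data))


def matches_recorded(data: bytes, recorded_fingerprint: str) -> bool:
    """True if the content shown now is the content fingerprinted at ingestion."""
    return fingerprint_bytes(data) == recorded_fingerprint
```

For a file on disk you would pass `open(path, "rb")` to `fingerprint_stream`. If the fingerprint stored with the load matches the file shown to the auditor, the question of whether it is the same file has a provable answer.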

1255
00:45:33,240 –> 00:45:35,680
Now, the countermeasure is not “ban spreadsheets.”

1256
00:45:35,680 –> 00:45:37,360
That’s how you force shadow processes.

1257
00:45:37,360 –> 00:45:39,240
The countermeasure is controlled submissions.

1258
00:45:39,240 –> 00:45:40,960
If the business needs a manual pathway,

1259
00:45:40,960 –> 00:45:42,960
you give them one that behaves like a ledger.

1260
00:45:42,960 –> 00:45:45,560
Authenticated submitter identity, required schema,

1261
00:45:45,560 –> 00:45:47,720
validation gates, and an approval workflow

1262
00:45:47,720 –> 00:45:50,080
that results in an immutable publish.

1263
00:45:50,080 –> 00:45:53,440
The CSV becomes a source artifact, not a rewrite tool,

1264
00:45:53,440 –> 00:45:55,000
so the pattern looks like this.

1265
00:45:55,000 –> 00:45:57,400
A submission is ingested into a staging area

1266
00:45:57,400 –> 00:45:59,680
and tagged with metadata, who submitted it,

1267
00:45:59,680 –> 00:46:04,120
when, for which site, for which period, and under which submission type.

1268
00:46:04,120 –> 00:46:07,880
Then the system runs validation, schema checks, unit checks,

1269
00:46:07,880 –> 00:46:10,680
required dimensions, and basic sanity thresholds.

1270
00:46:10,680 –> 00:46:13,200
If it fails, it gets rejected or quarantined,

1271
00:46:13,200 –> 00:46:16,480
not fixed later, quarantined with a recorded reason.

1272
00:46:16,480 –> 00:46:19,000
If it passes, it does not override anything.

1273
00:46:19,000 –> 00:46:22,040
It gets published as a new versioned object into the raw zone,

1274
00:46:22,040 –> 00:46:24,440
append first, always.
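
The submission pattern above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the column names, the unit list, the sanity threshold, and the in-memory `raw_zone` and `quarantine` lists are all assumptions standing in for a real schema and real storage.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Assumed schema and unit list; a real deployment would define its own.
REQUIRED_COLUMNS = {"site", "period", "activity_type", "quantity", "unit"}
ALLOWED_UNITS = {"kWh", "L", "km"}

@dataclass
class Submission:
    submitter: str
    site: str
    period: str
    submission_type: str
    rows: list
    received_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def validate(sub: Submission) -> list:
    """Return failure reasons; an empty list means the submission passed."""
    reasons = []
    for i, row in enumerate(sub.rows):
        missing = REQUIRED_COLUMNS - set(row)
        if missing:
            reasons.append(f"row {i}: missing columns {sorted(missing)}")
            continue
        if row["unit"] not in ALLOWED_UNITS:
            reasons.append(f"row {i}: unknown unit {row['unit']!r}")
        qty = row["quantity"]
        if not (isinstance(qty, (int, float)) and 0 <= qty < 1e9):
            reasons.append(f"row {i}: quantity outside sanity threshold")
    return reasons

raw_zone = []     # append-only stand-in for the raw zone
quarantine = []   # rejected submissions, kept with their recorded reasons

def ingest(sub: Submission) -> str:
    reasons = validate(sub)
    if reasons:
        quarantine.append({"submission": sub, "reasons": reasons})
        return "quarantined"
    # Never overwrite: each accepted submission becomes the next version
    # for its (site, period) pair.
    prior = sum(
        1 for r in raw_zone if (r["site"], r["period"]) == (sub.site, sub.period)
    )
    raw_zone.append(
        {"site": sub.site, "period": sub.period, "version": prior + 1, "submission": sub}
    )
    return f"published v{prior + 1}"
```

Submitting the same site and period twice yields two versions in `raw_zone`; nothing is replaced, and a failed submission lands in `quarantine` with its reasons attached.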

1275
00:46:24,440 –> 00:46:27,120
Then the important part, adjustments, not edits.

1276
00:46:27,120 –> 00:46:29,760
If the period is still open, you can allow the new submission

1277
00:46:29,760 –> 00:46:32,160
to become the latest accepted load for that period,

1278
00:46:32,160 –> 00:46:34,720
but you still keep the older load as evidence.

1279
00:46:34,720 –> 00:46:38,480
If the period is closed, the submission cannot mutate the closed outputs.

1280
00:46:38,480 –> 00:46:40,240
It can only create an adjustment entry

1281
00:46:40,240 –> 00:46:43,600
that is explicitly labeled as post-close, includes a rationale,

1282
00:46:43,600 –> 00:46:46,640
and requires approval from a different role than the submitter.
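
A rough sketch of the open-versus-closed rule, assuming hypothetical names (`accept_load`, `post_close_adjustment`) and in-memory collections in place of governed tables. The period identifiers and role names are illustrative only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Adjustment:
    period: str
    delta_tco2e: float
    rationale: str
    submitter_role: str
    approver_role: str
    label: str = "post-close"

closed_periods = {"2023-12"}   # hypothetical closed period
accepted_loads = {}            # period -> load IDs, oldest first, never deleted
adjustments = []               # append-only adjustment entries

def accept_load(period: str, load_id: str) -> str:
    """Open periods: the new load becomes the latest; older loads stay as evidence."""
    if period in closed_periods:
        raise PermissionError(f"{period} is closed; create an adjustment instead")
    accepted_loads.setdefault(period, []).append(load_id)
    return accepted_loads[period][-1]

def post_close_adjustment(period, delta_tco2e, rationale, submitter_role, approver_role):
    """Closed periods: only a labeled, justified, separately approved entry is allowed."""
    if period not in closed_periods:
        raise ValueError(f"{period} is still open; submit a new load instead")
    if not rationale.strip():
        raise ValueError("a rationale is required")
    if submitter_role == approver_role:
        raise PermissionError("approval must come from a different role")
    adj = Adjustment(period, delta_tco2e, rationale, submitter_role, approver_role)
    adjustments.append(adj)
    return adj
```

The point of the design is that there is no function that edits a closed output. The only callable path for a closed period produces a new, labeled record.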

1283
00:46:46,640 –> 00:46:49,920
Separation of duties stops being an ideal and becomes enforced behavior,

1284
00:46:49,920 –> 00:46:52,240
and yes, you keep the supporting evidence.

1285
00:46:52,240 –> 00:46:54,480
The CSV file itself goes into the evidence vault

1286
00:46:54,480 –> 00:46:57,280
with its metadata and, ideally, a stored fingerprint

1287
00:46:57,280 –> 00:46:58,760
so you can prove integrity.
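
One common way to implement a stored fingerprint is a SHA-256 digest of the file bytes, recorded at ingestion and recomputed on demand. The sketch below assumes that choice; the episode does not name a specific hash, and the function names are hypothetical.

```python
import hashlib

def fingerprint(content: bytes) -> str:
    """SHA-256 hex digest of the submitted file's bytes."""
    return hashlib.sha256(content).hexdigest()

def matches_stored(content: bytes, stored_fingerprint: str) -> bool:
    """True only if the file shown today is byte-identical to the one ingested."""
    return fingerprint(content) == stored_fingerprint

# At ingestion: store the digest next to the file's metadata in the vault.
original = b"site,period,quantity,unit\nS1,2024-01,120.0,kWh\n"
stored = fingerprint(original)

# Under audit: recompute and compare.
tampered = original.replace(b"120.0", b"100.0")
```

Here `matches_stored(original, stored)` holds and `matches_stored(tampered, stored)` does not, which is the whole proof: the file that produced the KPI is the file in the vault.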

1288
00:46:58,760 –> 00:47:00,320
The approval record goes into the vault,

1289
00:47:00,320 –> 00:47:02,120
the validation report goes into the vault.

1290
00:47:02,120 –> 00:47:03,760
This is how you build an evidence pack

1291
00:47:03,760 –> 00:47:05,320
without rebuilding your memory.

1292
00:47:05,320 –> 00:47:07,240
Now, here’s the uncomfortable truth.

1293
00:47:07,240 –> 00:47:10,840
The business will still demand “just change the number,” though.

1294
00:47:10,840 –> 00:47:12,080
They always do.

1295
00:47:12,080 –> 00:47:14,480
Your job is to make the only available change pathway

1296
00:47:14,480 –> 00:47:17,600
one that leaves scars: a new version, a recorded justification,

1297
00:47:17,600 –> 00:47:19,280
and a visible approval trail.

1298
00:47:19,280 –> 00:47:22,160
When people complain that it’s slower, that’s the control working.

1299
00:47:22,160 –> 00:47:24,280
And once you solve Final_v7.csv,

1300
00:47:24,280 –> 00:47:26,600
you’ll notice the next failure mode is subtler.

1301
00:47:26,600 –> 00:47:29,360
The dashboard itself starts rewriting history.

1302
00:47:29,360 –> 00:47:30,600
Failure mode 2.

1303
00:47:30,600 –> 00:47:32,360
Calculation drift in Power BI.

1304
00:47:32,360 –> 00:47:35,160
Calculation drift is the audit failure that feels like progress.

1305
00:47:35,160 –> 00:47:38,600
Someone opens the Power BI model, sees a measure that looks inefficient,

1306
00:47:38,600 –> 00:47:40,480
rewrites it, the visuals load faster,

1307
00:47:40,480 –> 00:47:42,080
and everyone calls it an improvement.

1308
00:47:42,080 –> 00:47:45,240
But the system didn’t just get faster, it got less accountable.

1309
00:47:45,240 –> 00:47:47,920
Because Power BI is designed for interactive analysis,

1310
00:47:47,920 –> 00:47:49,440
not period-bound computation.

1311
00:47:49,440 –> 00:47:54,400
Its superpower is flexibility, and flexibility is the enemy of reproducibility

1312
00:47:54,400 –> 00:47:56,040
when you’re inside a reporting boundary.

1313
00:47:56,040 –> 00:47:57,880
Here’s what goes wrong.

1314
00:47:57,880 –> 00:48:00,880
The ESG team builds core logic in DAX.

1315
00:48:00,880 –> 00:48:04,560
Emission conversions, allocations, scope categorization,

1316
00:48:04,560 –> 00:48:07,880
intensity denominators, it starts small.

1317
00:48:07,880 –> 00:48:09,800
Then new requirements arrive.

1318
00:48:09,800 –> 00:48:13,840
New sites, a new supplier category, a new framework question,

1319
00:48:13,840 –> 00:48:16,600
a minor change to how renewable energy is treated.

1320
00:48:16,600 –> 00:48:17,760
So they update the model.

1321
00:48:17,760 –> 00:48:19,760
And historic values silently recompute.

1322
00:48:19,760 –> 00:48:22,560
That’s the difference between a calculation engine and a dashboard.

1323
00:48:22,560 –> 00:48:25,520
A calculation engine produces outputs that become records.

1324
00:48:25,520 –> 00:48:27,720
A dashboard recomputes every time it refreshes.

1325
00:48:27,720 –> 00:48:30,320
When you change the logic, you don’t just change future numbers.

1326
00:48:30,320 –> 00:48:31,800
You change last year’s numbers.

1327
00:48:31,800 –> 00:48:34,360
This clicked for a lot of architects the first time someone asked

1328
00:48:34,360 –> 00:48:36,280
for a prior year reconciliation,

1329
00:48:36,280 –> 00:48:38,280
and the answer was the model changed.

1330
00:48:38,280 –> 00:48:39,920
Not because anyone acted maliciously,

1331
00:48:39,920 –> 00:48:42,040
because the platform makes logic edits trivial

1332
00:48:42,040 –> 00:48:43,960
and makes version binding optional.

1333
00:48:43,960 –> 00:48:45,840
Optional controls aren’t controls.

1334
00:48:45,840 –> 00:48:48,600
Now the deeper problem is that Power BI doesn’t naturally behave

1335
00:48:48,600 –> 00:48:50,560
like a governed release artifact.

1336
00:48:50,560 –> 00:48:52,000
Yes, you can manage workspaces.

1337
00:48:52,000 –> 00:48:53,920
Yes, you can use deployment pipelines.

1338
00:48:53,920 –> 00:48:55,760
Yes, you can restrict who can publish.

1339
00:48:55,760 –> 00:48:58,400
But the model itself remains a moving object

1340
00:48:58,400 –> 00:49:02,120
unless you design a release process that treats it like code

1341
00:49:02,120 –> 00:49:03,760
that impacts financial statements.

1342
00:49:03,760 –> 00:49:04,920
Most organizations don’t.

1343
00:49:04,920 –> 00:49:06,160
They treat it like a report.

1344
00:49:06,160 –> 00:49:07,840
So you end up with the classic drift pattern,

1345
00:49:07,840 –> 00:49:10,640
a developer optimizes a measure for performance.

1346
00:49:10,640 –> 00:49:12,680
The measure changes, the outputs change.

1347
00:49:12,680 –> 00:49:15,240
Nobody notices until a regulator, an auditor,

1348
00:49:15,240 –> 00:49:17,160
or finance compares this year’s report

1349
00:49:17,160 –> 00:49:19,120
to a saved PDF from last year.

1350
00:49:19,120 –> 00:49:20,800
And now you’re in restatement territory

1351
00:49:20,800 –> 00:49:22,920
except you don’t have restatement mechanics.

1352
00:49:22,920 –> 00:49:24,720
You have a dashboard that rewrote history

1353
00:49:24,720 –> 00:49:26,160
without leaving an obvious scar.

1354
00:49:26,160 –> 00:49:29,560
That’s why auditors increasingly flag logic heavy DAX models.

1355
00:49:29,560 –> 00:49:30,840
Not because DAX is wrong,

1356
00:49:30,840 –> 00:49:32,560
because DAX is too easy to change

1357
00:49:32,560 –> 00:49:33,880
without the control ceremony

1358
00:49:33,880 –> 00:49:36,320
that should accompany changes to reported numbers.

1359
00:49:36,320 –> 00:49:38,680
The architecture rule that stops this is brutal.

1360
00:49:38,680 –> 00:49:42,280
Power BI is a thin semantic layer over reported tables only.

1361
00:49:42,280 –> 00:49:44,920
That means the calculation zone produces the KPI tables.

1362
00:49:44,920 –> 00:49:46,960
Those KPI tables are period closed outputs.

1363
00:49:46,960 –> 00:49:50,040
Power BI reads them. Power BI can aggregate and format them.

1364
00:49:50,040 –> 00:49:51,160
It can create visuals.

1365
00:49:51,160 –> 00:49:52,680
It can create drill paths.

1366
00:49:52,680 –> 00:49:54,400
It can even create convenience measures

1367
00:49:54,400 –> 00:49:57,400
that don’t affect the underlying accounting of emissions.

1368
00:49:57,400 –> 00:49:59,800
But it cannot be the place where emissions accounting lives.

1369
00:49:59,800 –> 00:50:02,400
If you remember nothing else, DAX measures should be formatting,

1370
00:50:02,400 –> 00:50:03,360
not accounting.

1371
00:50:03,360 –> 00:50:06,440
Now, the system level countermeasure is equally blunt.

1372
00:50:06,440 –> 00:50:09,920
KPI outputs are tables, not measures.

1373
00:50:09,920 –> 00:50:11,400
Instead of total Scope 2 emissions

1374
00:50:11,400 –> 00:50:13,920
being a measure that depends on five other measures,

1375
00:50:13,920 –> 00:50:16,080
it becomes a column in a reported KPI table

1376
00:50:16,080 –> 00:50:17,840
produced by fabric or synapse

1377
00:50:17,840 –> 00:50:22,640
with keys for period, org unit, scope category, method, and factor version.

1378
00:50:22,640 –> 00:50:24,200
Power BI then displays it.

1379
00:50:24,200 –> 00:50:26,280
When someone asks, why did it change?

1380
00:50:26,280 –> 00:50:28,640
You have an answer rooted in artifacts,

1381
00:50:28,640 –> 00:50:32,720
input load IDs, factor versions and calculation release version.

1382
00:50:32,720 –> 00:50:35,120
Not “because someone updated the report.”
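
As a sketch of what a KPI-as-table row might carry, the record below holds the keys and provenance fields named above. The field names, the `explain_change` helper, and every value are hypothetical; in Fabric or Synapse this would be a table, not a Python object.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class KpiRow:
    period: str
    org_unit: str
    scope_category: str
    method: str
    factor_version: str
    calc_release: str
    input_load_ids: tuple
    value_tco2e: float

# The artifacts that can legitimately explain a difference between two outputs.
PROVENANCE_FIELDS = ("input_load_ids", "factor_version", "calc_release")

def explain_change(old: KpiRow, new: KpiRow) -> dict:
    """Return the provenance fields that differ, as (old, new) pairs."""
    return {
        name: (getattr(old, name), getattr(new, name))
        for name in PROVENANCE_FIELDS
        if getattr(old, name) != getattr(new, name)
    }

# Illustrative values only.
at_close = KpiRow("FY2023", "EMEA", "scope2", "location-based",
                  "elec-2023.1", "calc-1.4.0", ("load-001", "load-002"), 812.5)
restated = KpiRow("FY2023", "EMEA", "scope2", "location-based",
                  "elec-2023.1", "calc-1.4.0", ("load-001", "load-002", "load-007"), 830.0)
```

Calling `explain_change(at_close, restated)` returns only the `input_load_ids` entry, so the answer to “why did it change?” is a specific load, not a guess about who edited the report.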

1383
00:50:35,120 –> 00:50:37,640
A practical control pattern looks like this.

1384
00:50:37,640 –> 00:50:39,960
You publish a KPI data set that is certified

1385
00:50:39,960 –> 00:50:42,440
and only refreshes from the reported zone.

1386
00:50:42,440 –> 00:50:45,320
You separate two dashboard classes, assurance dashboards

1387
00:50:45,320 –> 00:50:47,440
that only read period closed outputs

1388
00:50:47,440 –> 00:50:51,440
and management dashboards that can read operational or provisional data.

1389
00:50:51,440 –> 00:50:53,160
That distinction matters because management

1390
00:50:53,160 –> 00:50:57,160
wants speed and iteration, assurance wants stability and traceability,

1391
00:50:57,160 –> 00:50:59,600
mixing them guarantees you’ll optimize for convenience

1392
00:50:59,600 –> 00:51:01,280
and later pretend it was governance.

1393
00:51:01,280 –> 00:51:02,640
And you still keep snapshots.

1394
00:51:02,640 –> 00:51:06,200
At close, you export and store the period close report outputs

1395
00:51:06,200 –> 00:51:08,320
as artifacts in your evidence vault.

1396
00:51:08,320 –> 00:51:10,240
PDF for human readable records

1397
00:51:10,240 –> 00:51:14,040
and CSV or data set extracts for machine traceability.

1398
00:51:14,040 –> 00:51:16,320
Those snapshots don’t replace the reported tables.

1399
00:51:16,320 –> 00:51:17,320
They complement them.

1400
00:51:17,320 –> 00:51:19,120
They prove what was presented at the time.

1401
00:51:19,120 –> 00:51:20,400
Now here’s the cynical truth.

1402
00:51:20,400 –> 00:51:23,720
People will try to sneak logic back into Power BI because it’s faster.

1403
00:51:23,720 –> 00:51:25,560
They’ll say, it’s just a small adjustment.

1404
00:51:25,560 –> 00:51:27,320
They’ll say, we can do it as a measure.

1405
00:51:27,320 –> 00:51:29,360
They’ll say, it’s only for this visual.

1406
00:51:29,360 –> 00:51:32,840
And then six months later, you discover the visual became the source of truth

1407
00:51:32,840 –> 00:51:34,680
because it was the thing executives looked at.

1408
00:51:34,680 –> 00:51:35,760
That’s how drift wins.

1409
00:51:35,760 –> 00:51:38,840
So, enforce the boundary calculations in the governed zone.

1410
00:51:38,840 –> 00:51:41,960
Outputs in reported tables, Power BI as presentation.

1411
00:51:41,960 –> 00:51:44,000
And once you move logic out of the dashboard,

1412
00:51:44,000 –> 00:51:45,440
you’ll hit the next failure mode.

1413
00:51:45,440 –> 00:51:47,040
Even with SQL-based logic,

1414
00:51:47,040 –> 00:51:48,440
you still can’t reproduce history

1415
00:51:48,440 –> 00:51:49,880
if your emission factors float.

1416
00:51:49,880 –> 00:51:52,680
Failure mode three, missing factor versioning.

1417
00:51:52,680 –> 00:51:54,600
Missing factor versioning is the failure mode

1418
00:51:54,600 –> 00:51:57,000
that makes every other control look decorative.

1419
00:51:57,000 –> 00:51:58,560
You can have immutable storage.

1420
00:51:58,560 –> 00:51:59,800
You can have lineage.

1421
00:51:59,800 –> 00:52:01,720
You can even have a governed calculation zone.

1422
00:52:01,720 –> 00:52:04,560
But if your emission factors behave like current truth,

1423
00:52:04,560 –> 00:52:07,120
you’ve built a system that recalculates the past

1424
00:52:07,120 –> 00:52:08,080
with today’s assumptions.

1425
00:52:08,080 –> 00:52:08,920
That’s not reporting.

1426
00:52:08,920 –> 00:52:10,640
That’s revisionism with a SQL engine.

1427
00:52:10,640 –> 00:52:12,600
Here’s how it happens in real architectures.

1428
00:52:12,600 –> 00:52:15,080
A team centralizes emission factors in a table.

1429
00:52:15,080 –> 00:52:17,880
They add a column for source and maybe year.

1430
00:52:17,880 –> 00:52:21,360
They join activity data to factors by geography and activity type.

1431
00:52:21,360 –> 00:52:24,400
Then, because nobody wants to pass parameters around,

1432
00:52:24,400 –> 00:52:27,800
they add a view called something like vw_emission_factor_latest.

1433
00:52:27,800 –> 00:52:29,760
And every calculation joins to latest,

1434
00:52:29,760 –> 00:52:31,800
it works until the factor set updates.

1435
00:52:31,800 –> 00:52:34,800
Then you rerun FY-1 and FY-2 and the numbers change.

1436
00:52:34,800 –> 00:52:36,560
Not because the activity changed.

1437
00:52:36,560 –> 00:52:37,880
Not because the logic changed.

1438
00:52:37,880 –> 00:52:40,320
Because the factor table did what tables do,

1439
00:52:40,320 –> 00:52:42,080
it reflected the current state.

1440
00:52:42,080 –> 00:52:44,480
And your system quietly rewrote history.

1441
00:52:44,480 –> 00:52:47,000
This is why “we use DEFRA” isn’t evidence.

1442
00:52:47,000 –> 00:52:48,720
It’s a marketing label for a library.

1443
00:52:48,720 –> 00:52:49,360
Which version?

1444
00:52:49,360 –> 00:52:50,200
Which published date?

1445
00:52:50,200 –> 00:52:51,160
Which effective dates?

1446
00:52:51,160 –> 00:52:52,160
Which geography mapping?

1447
00:52:52,160 –> 00:52:53,080
Which category mapping?

1448
00:52:53,080 –> 00:52:55,040
If your answer is the one in the table,

1449
00:52:55,040 –> 00:52:57,320
you are admitting you can’t reproduce a prior close.

1450
00:52:57,320 –> 00:53:00,400
And reproducibility is one of the audit grade requirements

1451
00:53:00,400 –> 00:53:01,760
you claimed you met.

1452
00:53:01,760 –> 00:53:03,960
Now, what does assurance actually do with this?

1453
00:53:03,960 –> 00:53:05,880
They ask the simplest question on earth.

1454
00:53:05,880 –> 00:53:06,920
Prove the number.

1455
00:53:06,920 –> 00:53:07,880
Not tell a story.

1456
00:53:07,880 –> 00:53:09,000
Prove it.

1457
00:53:09,000 –> 00:53:10,040
They want to see the chain.

1458
00:53:10,040 –> 00:53:12,320
Activity record, factor record, and the logic

1459
00:53:12,320 –> 00:53:13,360
that multiplied them.

1460
00:53:13,360 –> 00:53:15,560
If the factor record is not pinned to the period,

1461
00:53:15,560 –> 00:53:17,680
you can’t prove what it was at the time.

1462
00:53:17,680 –> 00:53:19,240
You can only show what it is now.

1463
00:53:19,240 –> 00:53:20,960
That becomes ambiguity and ambiguity

1464
00:53:20,960 –> 00:53:23,960
becomes qualifications, restatements, or a scope limitation

1465
00:53:23,960 –> 00:53:26,600
in an assurance report depending on how bad it is.

1466
00:53:26,600 –> 00:53:29,640
The most common real world trigger is annual factor updates.

1467
00:53:29,640 –> 00:53:31,040
A new factor library comes in.

1468
00:53:31,040 –> 00:53:32,040
Someone imports it.

1469
00:53:32,040 –> 00:53:35,000
They overwrite last year’s rows because it’s an update.

1470
00:53:35,000 –> 00:53:37,800
Then they rerun calculations to validate the new year.

1471
00:53:37,800 –> 00:53:39,440
And suddenly last year changes.

1472
00:53:39,440 –> 00:53:40,960
The dashboard still looks reasonable,

1473
00:53:40,960 –> 00:53:43,400
so nobody notices until finance compares numbers

1474
00:53:43,400 –> 00:53:46,120
to last year’s submission or an auditor requests

1475
00:53:46,120 –> 00:53:48,280
a re-performance for the prior period.

1476
00:53:48,280 –> 00:53:50,360
And then the organization learns a painful lesson.

1477
00:53:50,360 –> 00:53:52,840
Factor drift is indistinguishable from manipulation

1478
00:53:52,840 –> 00:53:54,560
unless you can prove version binding.

1479
00:53:54,560 –> 00:53:57,360
So the architecture countermeasure has to be enforcement,

1480
00:53:57,360 –> 00:53:58,200
not guidance.

1481
00:53:58,200 –> 00:54:00,760
First, you build factor libraries as version sets.

1482
00:54:00,760 –> 00:54:02,520
A library has a unique version key.

1483
00:54:02,520 –> 00:54:04,080
The factor records carry that key.

1484
00:54:04,080 –> 00:54:05,880
And you never update an existing version.

1485
00:54:05,880 –> 00:54:07,640
You publish a new version, always.

1486
00:54:07,640 –> 00:54:10,400
Second, you bind factors to periods explicitly.

1487
00:54:10,400 –> 00:54:13,120
That can be done through a period close configuration table

1488
00:54:13,120 –> 00:54:15,200
that records per reporting period.

1489
00:54:15,200 –> 00:54:18,000
The factor library version IDs used for each domain.

1490
00:54:18,000 –> 00:54:21,400
Electricity, fuel, travel, freight, whatever you model.

1491
00:54:21,400 –> 00:54:24,000
That configuration becomes part of the close package

1492
00:54:24,000 –> 00:54:26,600
and gets locked after close because it’s the switchboard

1493
00:54:26,600 –> 00:54:28,360
that defines reproducibility.

1494
00:54:28,360 –> 00:54:31,400
Third, you make the pipeline fail without a factor version key.

1495
00:54:31,400 –> 00:54:34,480
This is the part teams avoid because it feels strict.

1496
00:54:34,480 –> 00:54:35,640
Good, it should.

1497
00:54:35,640 –> 00:54:39,000
In your SQL views, stored procedures or notebooks,

1498
00:54:39,000 –> 00:54:41,480
the join to factors must include the version key.

1499
00:54:41,480 –> 00:54:43,480
If the caller doesn’t supply it, the job fails.

1500
00:54:43,480 –> 00:54:45,560
If the configuration table doesn’t have a version

1501
00:54:45,560 –> 00:54:47,160
for that period, the job fails.

1502
00:54:47,160 –> 00:54:48,480
No silent fallbacks.

1503
00:54:48,480 –> 00:54:50,560
No latest, no convenience.

1504
00:54:50,560 –> 00:54:53,640
Because latest is an entropy generator disguised as a default.
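
The three rules above (versioned sets, explicit period binding, hard failure) can be sketched as follows. The dictionaries stand in for the factor table and the period close configuration table; the version IDs and factor values are made up for illustration and are not real published factors.

```python
class MissingFactorVersion(Exception):
    """Raised instead of silently falling back to a 'latest' factor set."""

# Stand-in for the period close configuration table:
# (reporting period, domain) -> factor library version ID. Locked after close.
period_close_config = {
    ("FY2023", "electricity"): "elec-2023.1",
}

# Stand-in for the factor table. Rows carry the version key and are never
# updated in place; a new library is published as a new version.
factors = {
    ("elec-2023.1", "electricity"): 0.5,   # illustrative value
    ("elec-2024.1", "electricity"): 0.25,  # illustrative value
}

def resolve_version(period: str, domain: str) -> str:
    try:
        return period_close_config[(period, domain)]
    except KeyError:
        raise MissingFactorVersion(
            f"no factor version bound for {domain} in {period}"
        ) from None

def emissions(activity_qty: float, period: str, domain: str):
    """Multiply activity by the factor pinned to the period, never by 'latest'."""
    version = resolve_version(period, domain)
    return activity_qty * factors[(version, domain)], version
```

Publishing `elec-2024.1` does not move FY2023, because FY2023 is bound to `elec-2023.1`. A period with no binding fails loudly, which is the behavior the pipeline should have.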

1505
00:54:53,640 –> 00:54:56,240
Once you enforce that, you can do an audit-ready rerun.

1506
00:54:56,240 –> 00:54:58,240
You can take FY-1 activity data.

1507
00:54:58,240 –> 00:55:01,000
You can select the load IDs that were in scope at close.

1508
00:55:01,000 –> 00:55:03,920
You can select the factor library versions recorded for that close.

1509
00:55:03,920 –> 00:55:07,600
You can run the calculation artifacts tied to the released logic version.

1510
00:55:07,600 –> 00:55:08,800
And you get the same result.

1511
00:55:08,800 –> 00:55:10,520
That’s what reproducibility means.

1512
00:55:10,520 –> 00:55:11,520
Not close enough.

1513
00:55:11,520 –> 00:55:13,120
The same.

1514
00:55:13,120 –> 00:55:14,320
Now the subtle trap.

1515
00:55:14,320 –> 00:55:16,440
Effective dates and geography mapping.

1516
00:55:16,440 –> 00:55:19,200
Even with version keys, teams still mess up by storing factors

1517
00:55:19,200 –> 00:55:20,720
without applicability constraints

1518
00:55:20,720 –> 00:55:22,520
then joining based on best match.

1519
00:55:22,520 –> 00:55:24,920
That produces probabilistic factor selection.

1520
00:55:24,920 –> 00:55:26,840
So the factor model needs effective dates,

1521
00:55:26,840 –> 00:55:28,840
geography codes and classification keys

1522
00:55:28,840 –> 00:55:30,600
that make selection deterministic.

1523
00:55:30,600 –> 00:55:33,960
If multiple factors match, the pipeline should fail and force resolution,

1524
00:55:33,960 –> 00:55:36,000
not pick one and pretend it was intentional.
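
A minimal sketch of deterministic selection, assuming a hypothetical `Factor` record with an inclusive effective date range. The rule is simply that anything other than exactly one match is an error.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Factor:
    version: str
    geography: str
    activity_type: str
    effective_from: date
    effective_to: date   # inclusive, by assumption
    value: float

def select_factor(factors, version, geography, activity_type, on: date) -> Factor:
    """Return the single applicable factor, or fail; never pick a 'best match'."""
    matches = [
        f for f in factors
        if f.version == version
        and f.geography == geography
        and f.activity_type == activity_type
        and f.effective_from <= on <= f.effective_to
    ]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one factor for {geography}/{activity_type} "
            f"on {on} in {version}, found {len(matches)}"
        )
    return matches[0]

# Illustrative rows; the values are not real published factors.
library = [
    Factor("elec-2023.1", "GB", "electricity", date(2023, 1, 1), date(2023, 12, 31), 0.5),
    Factor("elec-2023.1", "DE", "electricity", date(2023, 1, 1), date(2023, 12, 31), 0.75),
]
```

If someone later loads a second `GB` row with an overlapping date range into the same version, `select_factor` raises instead of choosing, and the overlap gets resolved by a person with a recorded decision.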

1525
00:55:36,000 –> 00:55:37,560
And the final piece is evidence.

1526
00:55:37,560 –> 00:55:40,600
When you publish a factor library version, store provenance.

1527
00:55:40,600 –> 00:55:42,680
Where it came from and when it was approved.

1528
00:55:42,680 –> 00:55:46,280
In Microsoft Sustainability Manager, factors can live in factor libraries.

1529
00:55:46,280 –> 00:55:49,240
In Fabric or Synapse, you’ll model them as tables.

1530
00:55:49,240 –> 00:55:51,880
Either way, the published version used for a close period

1531
00:55:51,880 –> 00:55:53,720
becomes part of the evidence chain.

1532
00:55:53,720 –> 00:55:56,480
So it gets locked and registered in Purview for lineage.

1533
00:55:56,480 –> 00:55:59,240
Because the moment you can’t prove which factors were used,

1534
00:55:59,240 –> 00:56:01,000
you’re no longer defending a number,

1535
00:56:01,000 –> 00:56:02,800
you’re defending a belief about a number

1536
00:56:02,800 –> 00:56:05,200
and auditors don’t assure beliefs.

1537
00:56:05,200 –> 00:56:08,880
Purview: lineage as your only defense against “prove it” moments.

1538
00:56:08,880 –> 00:56:10,640
At some point, someone will ask the question

1539
00:56:10,640 –> 00:56:12,360
that ends the fun part of ESG.

1540
00:56:12,360 –> 00:56:13,160
Prove it.

1541
00:56:13,160 –> 00:56:14,760
Not explain it, not summarize it.

1542
00:56:14,760 –> 00:56:16,840
Prove that this KPI came from these sources

1543
00:56:16,840 –> 00:56:19,480
went through these transformations and landed in this report

1544
00:56:19,480 –> 00:56:23,160
without being casually rewritten by whoever had access on a Tuesday.

1545
00:56:23,160 –> 00:56:25,080
This is where most ESG stacks collapse

1546
00:56:25,080 –> 00:56:27,280
because they rely on human memory and slide decks.

1547
00:56:27,280 –> 00:56:30,000
That’s not governance, that’s folklore.

1548
00:56:30,000 –> 00:56:33,080
Microsoft Purview is the mechanism that turns your folklore

1549
00:56:33,080 –> 00:56:34,480
into queryable metadata.

1550
00:56:34,480 –> 00:56:37,200
And that distinction matters because lineage isn’t a diagram

1551
00:56:37,200 –> 00:56:37,960
you draw once.

1552
00:56:37,960 –> 00:56:40,760
It’s an operational record of how data moved, changed shape,

1553
00:56:40,760 –> 00:56:43,560
and became something the business now claims is true.

1554
00:56:43,560 –> 00:56:45,560
Lineage in plain system terms is origin

1555
00:56:45,560 –> 00:56:47,880
to transformation to consumption.

1556
00:56:47,880 –> 00:56:49,800
Origin is where the data came from.

1557
00:56:49,800 –> 00:56:53,640
ERP extracts, meter feeds, supplier submissions, HR aggregates.

1558
00:56:53,640 –> 00:56:55,280
Transformation is what you did to it.

1559
00:56:55,280 –> 00:56:58,240
Validation, standardization, mapping, factor application, KPI

1560
00:56:58,240 –> 00:57:00,560
computation. Consumption is where it shows up.

1561
00:57:00,560 –> 00:57:04,360
Reported tables, semantic models, Power BI reports, exports.

1562
00:57:04,360 –> 00:57:07,800
If any link in that chain is someone knows, you don’t have lineage.

1563
00:57:07,800 –> 00:57:09,200
You have a future incident.

1564
00:57:09,200 –> 00:57:11,080
Here’s what you actually register in Purview

1565
00:57:11,080 –> 00:57:13,520
if you want it to be useful under audit pressure.

1566
00:57:13,520 –> 00:57:18,200
You register the storage assets, lake houses, ADLS paths,

1567
00:57:18,200 –> 00:57:21,320
containers that correspond to raw, curated, reported,

1568
00:57:21,320 –> 00:57:22,720
and the evidence vault.

1569
00:57:22,720 –> 00:57:24,720
You register the processing assets.

1570
00:57:24,720 –> 00:57:28,640
Pipelines, notebooks, SQL endpoints, whatever actually

1571
00:57:28,640 –> 00:57:31,160
performs transformations and publishes outputs.

1572
00:57:31,160 –> 00:57:33,120
And you register the analytics assets.

1573
00:57:33,120 –> 00:57:34,920
The data sets and reports people use,

1574
00:57:34,920 –> 00:57:37,160
including the certified data sets that represent

1575
00:57:37,160 –> 00:57:38,200
the assurance layer.

1576
00:57:38,200 –> 00:57:41,000
And you assign ownership, real ownership, not the team,

1577
00:57:41,000 –> 00:57:42,520
a named role with accountability.

1578
00:57:42,520 –> 00:57:45,520
The thing most people miss is that auditors don’t only ask

1579
00:57:45,520 –> 00:57:46,760
where did the number come from.

1580
00:57:46,760 –> 00:57:49,280
They ask who is responsible for this asset.

1581
00:57:49,280 –> 00:57:52,440
Purview is where you make that answer deterministic instead of social.

1582
00:57:52,440 –> 00:57:54,160
Now, what does this look like in practice

1583
00:57:54,160 –> 00:57:56,040
in the moment you’re under scrutiny?

1584
00:57:56,040 –> 00:57:58,280
A stakeholder points at Scope 2 for a region

1585
00:57:58,280 –> 00:58:00,040
and says, this seems high.

1586
00:58:00,040 –> 00:58:02,840
If you have lineage, you can trace from the Power BI visual

1587
00:58:02,840 –> 00:58:04,600
back to the reported KPI table,

1588
00:58:04,600 –> 00:58:06,680
back to the calculation view or notebook,

1589
00:58:06,680 –> 00:58:08,560
back to the curated consumption table,

1590
00:58:08,560 –> 00:58:11,200
back to the raw invoice extract or meter feed.

1591
00:58:11,200 –> 00:58:15,240
And you can identify the load IDs and factor library version key used.

1592
00:58:15,240 –> 00:58:16,600
You can do it in minutes.
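
Conceptually, that trace is a walk over a dependency graph. The sketch below uses a plain dictionary with hypothetical asset names; it does not call any Purview API and is only meant to show the shape of the question being answered.

```python
# Each asset maps to the assets it reads from directly (hypothetical names).
reads_from = {
    "pbi.scope2_visual": ["reported.kpi_scope2"],
    "reported.kpi_scope2": ["calc.nb_scope2"],
    "calc.nb_scope2": ["curated.consumption", "factors.elec_2023_1"],
    "curated.consumption": ["raw.meter_feed", "raw.invoice_extract"],
}

def trace_upstream(asset: str, graph: dict) -> list:
    """Return the asset and everything upstream of it, each listed once."""
    seen, stack, order = set(), [asset], []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(graph.get(node, []))
    return order

path = trace_upstream("pbi.scope2_visual", reads_from)
```

Starting from the visual, `path` reaches the reported table, the notebook, the curated table, the factor library version, and both raw sources, which is the chain an auditor asks to see.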

1593
00:58:16,600 –> 00:58:19,240
Without lineage, you do PowerPoint archaeology.

1594
00:58:19,240 –> 00:58:20,240
You open old emails.

1595
00:58:20,240 –> 00:58:21,720
You ask someone who left the company.

1596
00:58:21,720 –> 00:58:24,600
You rebuild the path from memory and hope it matches reality.

1597
00:58:24,600 –> 00:58:25,360
It won’t.

1598
00:58:25,360 –> 00:58:26,760
Lineage isn’t only for auditors.

1599
00:58:26,760 –> 00:58:31,200
It’s also the fastest way to find where data quality issues actually entered the system.

1600
00:58:31,200 –> 00:58:34,040
When a KPI looks wrong, teams usually blame the calculation.

1601
00:58:34,040 –> 00:58:37,080
Half the time the calculation is fine and the input mapping is wrong

1602
00:58:37,080 –> 00:58:39,040
or the organizational hierarchy changed.

1603
00:58:39,040 –> 00:58:40,200
Or a unit got misread.

1604
00:58:40,200 –> 00:58:41,640
Lineage gives you a breadcrumb trail,

1605
00:58:41,640 –> 00:58:43,880
so root cause analysis becomes mechanical.

1606
00:58:43,880 –> 00:58:46,520
Find the upstream change point, not the downstream symptom.

1607
00:58:46,520 –> 00:58:48,840
And here’s the other use case nobody budgets for.

1608
00:58:48,840 –> 00:58:49,880
Impact analysis.

1609
00:58:49,880 –> 00:58:52,840
Every time you change a pipeline, a mapping or a factor library,

1610
00:58:52,840 –> 00:58:55,000
you are changing a graph of dependencies.

1611
00:58:55,000 –> 00:58:58,600
Without lineage, you don’t know what you’ll break until something breaks.

1612
00:58:58,600 –> 00:59:02,240
With lineage, you can see downstream consumers before you ship the change.

1613
00:59:02,240 –> 00:59:05,680
That’s how you stop small improvements from becoming multi-year restatements.
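
Impact analysis is the same graph read in the other direction. This is a sketch over a hypothetical in-memory dependency map, not a call into Purview; the asset names are invented for illustration.

```python
from collections import defaultdict, deque

# Each asset maps to the assets it reads from directly (hypothetical names).
reads_from = {
    "pbi.scope2_visual": ["reported.kpi_scope2"],
    "reported.kpi_scope2": ["calc.nb_scope2"],
    "calc.nb_scope2": ["curated.consumption", "factors.elec_2023_1"],
    "curated.consumption": ["raw.meter_feed"],
}

def downstream_consumers(changed: str, graph: dict) -> list:
    """List every asset that directly or indirectly consumes `changed`."""
    feeds = defaultdict(list)   # inverted edges: source -> direct consumers
    for consumer, sources in graph.items():
        for source in sources:
            feeds[source].append(consumer)
    impacted, seen, queue = [], {changed}, deque([changed])
    while queue:
        node = queue.popleft()
        for consumer in feeds[node]:
            if consumer not in seen:
                seen.add(consumer)
                impacted.append(consumer)
                queue.append(consumer)
    return impacted

impacted = downstream_consumers("factors.elec_2023_1", reads_from)
```

Before shipping a change to the factor library, `impacted` names the notebook, the reported table, and the visual that depend on it, so the review happens before the release rather than after the restatement.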

1614
00:59:05,680 –> 00:59:06,880
Now there’s a reality check.

1615
00:59:06,880 –> 00:59:10,160
Purview capabilities evolve, integrations change,

1616
00:59:10,160 –> 00:59:13,560
some sustainability specific solutions in the Microsoft ecosystem,

1617
00:59:13,560 –> 00:59:15,760
show up in preview states and then move.

1618
00:59:15,760 –> 00:59:17,520
That is not a reason to avoid governance.

1619
00:59:17,520 –> 00:59:21,240
It’s the reason to avoid hard coding your governance into documentation.

1620
00:59:21,240 –> 00:59:23,360
Your architecture has to tolerate product drift

1621
00:59:23,360 –> 00:59:26,480
by treating lineage as a first class system behavior.

1622
00:59:26,480 –> 00:59:29,680
Register assets consistently, enforce naming conventions,

1623
00:59:29,680 –> 00:59:33,840
keep ownership current and make lineage review part of release management.

1624
00:59:33,840 –> 00:59:34,960
And yes, there’s setup.

1625
00:59:34,960 –> 00:59:37,800
You will configure connectors, you will manage identities,

1626
00:59:37,800 –> 00:59:40,640
you will decide which assets get scanned and how often.

1627
00:59:40,640 –> 00:59:44,240
You will deal with the fact that not everything stitches perfectly on day one.

1628
00:59:44,240 –> 00:59:45,760
But you’re not doing this for aesthetics.

1629
00:59:45,760 –> 00:59:48,880
You’re doing it because prove it moments don’t arrive on your schedule.

1630
00:59:48,880 –> 00:59:50,320
They arrive when the board is watching.

1631
00:59:50,320 –> 00:59:53,080
So Purview becomes your only defensible posture.

1632
00:59:53,080 –> 00:59:57,920
A way to demonstrate end to end that your ESG numbers are products of controlled systems,

1633
00:59:57,920 –> 01:00:00,240
not a collection of best effort narratives.

1634
01:00:00,240 –> 01:00:03,320
And once you accept that, the next dependency becomes obvious.

1635
01:00:03,320 –> 01:00:05,040
Governance without identity is theater.

1636
01:00:05,040 –> 01:00:07,720
Identity is what turns metadata into enforcement.

1637
01:00:07,720 –> 01:00:09,560
Entra ID plus role separation.

1638
01:00:09,560 –> 01:00:11,520
Stop letting everyone be everyone.

1639
01:00:11,520 –> 01:00:13,920
Most organizations say they have governance.

1640
01:00:13,920 –> 01:00:16,800
Then you look at their permissions and realize they have hope.

1641
01:00:16,800 –> 01:00:19,800
They treat Microsoft Entra ID like a login system.

1642
01:00:19,800 –> 01:00:21,120
Not what it actually is.

1643
01:00:21,120 –> 01:00:23,440
The control plane for who can touch evidence,

1644
01:00:23,440 –> 01:00:28,280
who can change logic and who can publish numbers that will later be defended in an assurance room.

1645
01:00:28,280 –> 01:00:32,240
That distinction matters because ESG fails when identity becomes optional.

1646
01:00:32,240 –> 01:00:33,920
Role separation is not bureaucracy.

1647
01:00:33,920 –> 01:00:39,040
It is the only reason an auditor believes your system didn’t quietly rewrite itself under deadline pressure.

1648
01:00:39,040 –> 01:00:41,920
So the model is simple and it’s intentionally boring.

1649
01:00:41,920 –> 01:00:43,600
Submitter, validator,

1650
01:00:43,600 –> 01:00:46,320
calculator, approver, report publisher.

1651
01:00:46,320 –> 01:00:48,000
A submitter can provide data.

1652
01:00:48,000 –> 01:00:49,560
They can’t edit raw archives.

1653
01:00:49,560 –> 01:00:50,800
They can’t change mappings.

1654
01:00:50,800 –> 01:00:52,480
They can’t adjust reported outputs.

1655
01:00:52,480 –> 01:00:56,360
A validator can review ingestion results and data quality exceptions.

1656
01:00:56,360 –> 01:00:58,560
They can quarantine or accept with exception.

1657
01:00:58,560 –> 01:00:59,680
They can’t publish factors.

1658
01:00:59,680 –> 01:01:01,560
They can’t deploy calculation code.

1659
01:01:01,560 –> 01:01:03,920
A calculator can run the governed compute process.

1660
01:01:03,920 –> 01:01:05,400
They can’t alter raw evidence.

1661
01:01:05,400 –> 01:01:07,120
They can’t approve their own changes.

1662
01:01:07,120 –> 01:01:09,000
They can’t publish the final report.

1663
01:01:09,000 –> 01:01:11,440
An approver can sign off on period close,

1664
01:01:11,440 –> 01:01:14,200
factor library versions and post-close adjustments.

1665
01:01:14,200 –> 01:01:16,320
They don’t need broad data engineering access.

1666
01:01:16,320 –> 01:01:17,960
They need explicit rights to approve

1667
01:01:17,960 –> 01:01:20,400
and their approvals need to be recorded as evidence.

1668
01:01:20,400 –> 01:01:24,240
A report publisher can publish certified data sets and reports

1669
01:01:24,240 –> 01:01:26,040
that consume reported outputs.

1670
01:01:26,040 –> 01:01:30,960
They cannot modify the calculation logic or the data inputs that produce those outputs.

1671
01:01:30,960 –> 01:01:33,680
If you collapse those roles into the sustainability team,

1672
01:01:33,680 –> 01:01:38,080
you’ve built a system where the same identity can create data, change data,

1673
01:01:38,080 –> 01:01:40,080
compute results and approve results.

1674
01:01:40,080 –> 01:01:41,400
That is not control.

1675
01:01:41,400 –> 01:01:43,160
That is conditional chaos.
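As a rough illustration of the five-role model described above, here is a minimal Python sketch. The role names come from the episode; the action names, the permission matrix, and the helper functions are hypothetical and only show the shape of a separation-of-duties check, not how Entra ID itself is configured.

```python
from __future__ import annotations

from enum import Enum, auto


class Action(Enum):
    # Hypothetical action names, invented for this sketch.
    SUBMIT_DATA = auto()
    REVIEW_INGESTION = auto()
    QUARANTINE = auto()
    RUN_CALCULATION = auto()
    APPROVE = auto()
    PUBLISH_REPORT = auto()


# Each role holds only the actions it needs; everything else is denied.
ROLE_ACTIONS: dict[str, set[Action]] = {
    "submitter": {Action.SUBMIT_DATA},
    "validator": {Action.REVIEW_INGESTION, Action.QUARANTINE},
    "calculator": {Action.RUN_CALCULATION},
    "approver": {Action.APPROVE},
    "report_publisher": {Action.PUBLISH_REPORT},
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True only if the role explicitly holds the action."""
    return action in ROLE_ACTIONS.get(role, set())


def duty_conflicts(memberships: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return every identity that holds more than one role."""
    return {who: roles for who, roles in memberships.items() if len(roles) > 1}
```

Running `duty_conflicts` over an export of group memberships is one way to spot the collapsed "sustainability team" identity before an auditor does.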

1676
01:01:43,160 –> 01:01:48,240
Now here’s the part everyone tries to dodge.

1677
01:01:48,240 –> 01:01:49,920
Access boundaries by zone.

1678
01:01:49,920 –> 01:01:53,160
Raw, curated and reported are not just storage partitions.

1679
01:01:53,160 –> 01:01:54,560
They are permission boundaries.

1680
01:01:54,560 –> 01:01:57,400
Raw should be readable by the people who need to trace provenance

1681
01:01:57,400 –> 01:01:58,720
and resolve ingestion issues,

1682
01:01:58,720 –> 01:02:04,360
but writable only by controlled ingestion identities, typically service principals executing pipelines.

1683
01:02:04,360 –> 01:02:06,400
Humans don’t get write access to raw evidence.

1684
01:02:06,400 –> 01:02:09,760
They get a submission mechanism that produces new immutable artifacts.

1685
01:02:09,760 –> 01:02:11,000
That’s a different thing.

1686
01:02:11,000 –> 01:02:14,840
Curated should be writable only by the transformation process identities

1687
01:02:14,840 –> 01:02:16,960
and the engineers responsible for the model.

1688
01:02:16,960 –> 01:02:21,040
Broader read access is fine, but write access is a scalpel, not a group membership.

1689
01:02:21,040 –> 01:02:22,760
Reported should be locked down hardest.

1690
01:02:22,760 –> 01:02:27,200
Write access only for the close process and controlled adjustment workflows.

1691
01:02:27,200 –> 01:02:31,520
Read access for reporting, finance, internal audit and whoever consumes the KPIs.

1692
01:02:31,520 –> 01:02:35,760
But nobody should be casually updating reported tables because it’s just a fix.

1693
01:02:35,760 –> 01:02:36,880
Fixes are adjustments.

1694
01:02:36,880 –> 01:02:38,240
Adjustments have approvals.

1695
01:02:38,240 –> 01:02:40,440
Approvals have identity separation.
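The zone boundaries above can be written down as data and checked mechanically. The following Python sketch assumes hypothetical identity names and a simple in-memory policy table; in a real deployment these rules would live in RBAC assignments and ACLs at the storage and compute layers, not in a dictionary.

```python
from __future__ import annotations

# Hypothetical identities that are pipelines or workflows, not people.
SERVICE_IDENTITIES = {
    "ingestion_pipeline",
    "transform_pipeline",
    "close_process",
    "adjustment_workflow",
}

# Who may write and read in each zone, following the episode's description.
ZONE_POLICY: dict[str, dict[str, set[str]]] = {
    "raw": {
        "write": {"ingestion_pipeline"},
        "read": {"ingestion_pipeline", "validator", "internal_audit"},
    },
    "curated": {
        "write": {"transform_pipeline", "model_engineer"},
        "read": {"transform_pipeline", "model_engineer", "validator", "calculator"},
    },
    "reported": {
        "write": {"close_process", "adjustment_workflow"},
        "read": {"report_publisher", "finance", "internal_audit"},
    },
}


def can(identity: str, verb: str, zone: str) -> bool:
    """Check one identity against the policy for a zone ('read' or 'write')."""
    return identity in ZONE_POLICY[zone][verb]


def human_writers(
    policy: dict[str, dict[str, set[str]]] = ZONE_POLICY,
    services: set[str] = SERVICE_IDENTITIES,
) -> dict[str, set[str]]:
    """List zones where a non-service identity holds write access."""
    result = {}
    for zone, rules in policy.items():
        humans = rules["write"] - services
        if humans:
            result[zone] = humans
    return result
```

In this sketch only the curated zone should ever show up in `human_writers()`, because the model engineers are the one human group the episode allows to write there.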

1696
01:02:40,440 –> 01:02:44,920
And yes, all of this is enforced with Entra groups, service principals, managed identities

1697
01:02:44,920 –> 01:02:49,200
and RBAC assignments at the storage and compute layers, not in Visio, but in the system.

1698
01:02:49,200 –> 01:02:51,960
Now, evidence of control matters as much as control itself.

1699
01:02:51,960 –> 01:02:53,360
You don’t just need permissions.

1700
01:02:53,360 –> 01:02:54,600
You need proof of permissions.

1701
01:02:54,600 –> 01:02:57,760
So you treat Entra assignments, role memberships and privilege changes

1702
01:02:57,760 –> 01:02:59,560
as part of the assurance package.

1703
01:02:59,560 –> 01:03:02,200
When the auditor asks, who could have changed this?

1704
01:03:02,200 –> 01:03:03,640
You don’t answer with a meeting.

1705
01:03:03,640 –> 01:03:08,240
You answer with access history, role definitions and audit logs.

1706
01:03:08,240 –> 01:03:12,920
Which brings us to the most common ESG security anti-pattern, the hero admin.

1707
01:03:12,920 –> 01:03:15,920
The hero admin shows up when the pipeline breaks, the close date is near

1708
01:03:15,920 –> 01:03:18,520
and somebody says, just give me contributor for a minute.

1709
01:03:18,520 –> 01:03:20,840
Temporary elevation becomes permanent.

1710
01:03:20,840 –> 01:03:22,320
Exceptions become normal.

1711
01:03:22,320 –> 01:03:25,640
And then months later you discover your separation of duties is a myth

1712
01:03:25,640 –> 01:03:28,280
because everyone has been operating as everyone.

1713
01:03:28,280 –> 01:03:30,480
The countermeasure isn’t trust people more.

1714
01:03:30,480 –> 01:03:34,440
It’s making elevation visible and costly. If you must allow privileged access,

1715
01:03:34,440 –> 01:03:37,400
you make it time-bound, explicitly approved and logged.

1716
01:03:37,400 –> 01:03:40,320
You treat it as an incident artifact, not a convenience.

1717
01:03:40,320 –> 01:03:43,680
Because every exception is an entropy generator that will be reused.
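A time-bound, explicitly approved elevation can be modeled as a small record. This Python sketch is an assumption-laden illustration: the `ElevationGrant` class and its fields are hypothetical, and it shows only the three properties named above (time-bound, approved by someone else, recorded with a reason).

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ElevationGrant:
    """Hypothetical record of one privileged-access exception."""

    requester: str
    approver: str
    role: str
    reason: str
    granted_at: datetime
    duration: timedelta

    def __post_init__(self) -> None:
        # Separation of duties: nobody approves their own elevation.
        if self.requester == self.approver:
            raise ValueError("requester cannot approve their own elevation")
        # Time-bound: an open-ended grant is not allowed.
        if self.duration <= timedelta(0):
            raise ValueError("elevation must have a positive duration")

    @property
    def expires_at(self) -> datetime:
        return self.granted_at + self.duration

    def is_active(self, now: datetime) -> bool:
        """True only inside the approved window (expiry is exclusive)."""
        return self.granted_at <= now < self.expires_at


t0 = datetime(2025, 1, 31, 9, 0, tzinfo=timezone.utc)
grant = ElevationGrant(
    requester="dana",
    approver="lee",
    role="Contributor",
    reason="pipeline failure before close",
    granted_at=t0,
    duration=timedelta(hours=1),
)
```

Because the record is frozen and carries its own reason and approver, it can be stored with the close evidence rather than reconstructed from memory later.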

1718
01:03:43,680 –> 01:03:45,000
And here is the awkward truth.

1719
01:03:45,000 –> 01:03:47,640
The sustainability organization will fight you on this.

1720
01:03:47,640 –> 01:03:48,760
They’ll say it slows them down.

1721
01:03:48,760 –> 01:03:49,320
They’re correct.

1722
01:03:49,320 –> 01:03:51,240
Controls always slow down change.

1723
01:03:51,240 –> 01:03:52,160
That’s the trade.

1724
01:03:52,160 –> 01:03:55,680
If you want audit grade reporting, you don’t optimize for speed of edits.

1725
01:03:55,680 –> 01:03:57,600
You optimize for survivability under scrutiny.

1726
01:03:57,600 –> 01:04:00,680
So you design the workflow so the compliant path is the easy path.

1727
01:04:00,680 –> 01:04:03,000
Controlled submissions instead of shared folders,

1728
01:04:03,000 –> 01:04:06,440
approved factor publishing instead of spreadsheet swaps, period close gates

1729
01:04:06,440 –> 01:04:10,520
that lock storage and reports that only read from reported outputs.

1730
01:04:10,520 –> 01:04:12,320
Entra is what makes all of that enforceable.

1731
01:04:12,320 –> 01:04:16,040
Without it, Purview shows you lineage of data that anyone could have altered.

1732
01:04:16,040 –> 01:04:17,040
That’s not governance.

1733
01:04:17,040 –> 01:04:19,320
That’s cataloging your own uncertainty.

1734
01:04:19,320 –> 01:04:23,480
Next, we talk about where organizations reintroduce entropy for convenience.

1735
01:04:23,480 –> 01:04:24,840
The reporting layer.

1736
01:04:24,840 –> 01:04:28,000
Reporting layer: Power BI as presentation, not truth.

1737
01:04:28,000 –> 01:04:31,120
Reporting is where most teams undo everything they just built

1738
01:04:31,120 –> 01:04:33,480
because Power BI makes it easy to be helpful.

1739
01:04:33,480 –> 01:04:35,280
Helpful is not a control objective.

1740
01:04:35,280 –> 01:04:38,520
In an auditable ESG stack, Power BI is a presentation layer.

1741
01:04:38,520 –> 01:04:41,080
A thin semantic layer over reported outputs.

1742
01:04:41,080 –> 01:04:42,120
It does not own the math.

1743
01:04:42,120 –> 01:04:43,720
It does not fix missing data.

1744
01:04:43,720 –> 01:04:47,760
And it does not quietly restate history because someone wanted a cleaner chart.

1745
01:04:47,760 –> 01:04:49,840
The system behavior you want is simple.

1746
01:04:49,840 –> 01:04:53,520
Fabric or Synapse produces period closed KPI tables in the reported zone.

1747
01:04:53,520 –> 01:04:56,520
Power BI reads those tables through certified data sets.

1748
01:04:56,520 –> 01:04:58,480
The report is a window, not a calculator.

1749
01:04:58,480 –> 01:05:00,920
That distinction matters because the report is the thing

1750
01:05:00,920 –> 01:05:06,360
executives screenshot, regulators request, and auditors reconcile against the evidence package.

1751
01:05:06,360 –> 01:05:09,720
If the report can change without a corresponding change in the reported tables,

1752
01:05:09,720 –> 01:05:11,360
you’ve created a second truth.

1753
01:05:11,360 –> 01:05:13,640
And you will spend the next year arguing with yourself.

1754
01:05:13,640 –> 01:05:16,000
So you build two classes of dashboards on purpose.

1755
01:05:16,000 –> 01:05:18,280
The first class is regulatory and assurance reporting.

1756
01:05:18,280 –> 01:05:20,160
It reads only from the reported zone.

1757
01:05:20,160 –> 01:05:21,840
It refreshes on controlled schedules.

1758
01:05:21,840 –> 01:05:23,400
It uses certified data sets.

1759
01:05:23,400 –> 01:05:27,360
It has locked definitions and a release process that looks boring on purpose.

1760
01:05:27,360 –> 01:05:29,880
The second class is management and operations reporting.

1761
01:05:29,880 –> 01:05:32,680
It can read curated and even operational data.

1762
01:05:32,680 –> 01:05:33,600
It can move quickly.

1763
01:05:33,600 –> 01:05:35,720
It can support “what’s happening right now”

1764
01:05:35,720 –> 01:05:36,720
questions.

1765
01:05:36,720 –> 01:05:39,440
But it is explicitly labeled as operational, not reportable.

1766
01:05:39,440 –> 01:05:42,640
Different audience, different expectations, different tolerance for drift.

1767
01:05:42,640 –> 01:05:46,920
If you collapse those into one dashboard, you’ll optimize for executive convenience and

1768
01:05:46,920 –> 01:05:48,720
accidentally publish it as evidence.

1769
01:05:48,720 –> 01:05:52,440
Now, even in a thin semantic layer, you still need governance because semantics are where

1770
01:05:52,440 –> 01:05:53,440
definitions drift.

1771
01:05:53,440 –> 01:05:59,680
Use certified data sets, not personal workspaces and not an analyst’s final pbix.

1772
01:05:59,680 –> 01:06:03,560
Certification is the signal that the data set is backed by controlled sources, has an owner,

1773
01:06:03,560 –> 01:06:05,520
and is part of the assurance boundary.

1774
01:06:05,520 –> 01:06:06,520
Promoted is not enough.

1775
01:06:06,520 –> 01:06:09,520
Promoted is a social tag, certified is a control decision.

1776
01:06:09,520 –> 01:06:10,520
Control the refresh path.

1777
01:06:10,520 –> 01:06:15,240
If the data set refreshes from curated tables, someone will eventually change a curated

1778
01:06:15,240 –> 01:06:20,280
transformation and unintentionally shift a number that was treated as stable.

1779
01:06:20,280 –> 01:06:22,760
Assurance data sets refresh from reported tables only.

1780
01:06:22,760 –> 01:06:23,760
That’s the rule.
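The refresh-path rule lends itself to an automated check. This Python sketch assumes a hypothetical naming convention where every source table is written as `zone.table`, and an inventory of data set sources that you would have to export yourself; it does not call any Power BI API.

```python
from __future__ import annotations


def violating_sources(
    dataset_sources: dict[str, list[str]],
    allowed_zone: str = "reported",
) -> dict[str, list[str]]:
    """Map each data set to the source tables that sit outside the allowed zone.

    Assumes table names follow a 'zone.table' convention (hypothetical).
    """
    violations: dict[str, list[str]] = {}
    for dataset, tables in dataset_sources.items():
        bad = [t for t in tables if t.split(".", 1)[0] != allowed_zone]
        if bad:
            violations[dataset] = bad
    return violations


# Example inventory: one clean assurance data set, one that reaches into curated.
inventory = {
    "assurance_kpis": ["reported.kpi_scope2", "reported.kpi_scope3"],
    "exec_view": ["reported.kpi_scope2", "curated.site_activity"],
}
```

An empty result for every assurance data set is the condition you would want to hold before a release; anything else is a second truth waiting to happen.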

1781
01:06:23,760 –> 01:06:27,240
Then you design the visuals that actually survive audit scrutiny.

1782
01:06:27,240 –> 01:06:29,160
Auditors don’t care about your color palette.

1783
01:06:29,160 –> 01:06:30,920
They care about your ability to explain.

1784
01:06:30,920 –> 01:06:35,200
So the mandatory visuals are the ones that surface control-relevant context: targets versus

1785
01:06:35,200 –> 01:06:37,880
actuals, yes, but also confidence indicators.

1786
01:06:37,880 –> 01:06:39,760
Measured versus estimated split.

1787
01:06:39,760 –> 01:06:41,360
Coverage metrics alongside totals.

1788
01:06:41,360 –> 01:06:44,040
And explicit period labels tied to close status.

1789
01:06:44,040 –> 01:06:48,000
A scope three total without a coverage indicator is not a KPI.

1790
01:06:48,000 –> 01:06:49,720
It’s a mood.
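To make the point concrete, here is a minimal Python sketch of a KPI that carries its own context. The `SiteEmission` record, the field names, and the definition of coverage (share of expected sites with any reported value) are assumptions for illustration; your reported tables may define coverage differently, for example by spend or by activity volume.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SiteEmission:
    site: str
    tco2e: float | None  # None means no data was received for the site
    method: str          # "measured" or "estimated"


def kpi_with_context(rows: list[SiteEmission], expected_sites: set[str]) -> dict[str, float]:
    """Return the total together with measured share and site coverage."""
    if not expected_sites:
        raise ValueError("expected_sites must not be empty")
    reported = [r for r in rows if r.tco2e is not None]
    total = sum(r.tco2e for r in reported)
    measured = sum(r.tco2e for r in reported if r.method == "measured")
    covered = {r.site for r in reported} & expected_sites
    return {
        "total_tco2e": total,
        "measured_share": measured / total if total else 0.0,
        "coverage": len(covered) / len(expected_sites),
    }


rows = [
    SiteEmission("A", 75.0, "measured"),
    SiteEmission("B", 25.0, "estimated"),
    SiteEmission("C", None, "estimated"),
]
result = kpi_with_context(rows, expected_sites={"A", "B", "C", "D"})
```

Here the total of 100 tCO2e only covers half of the expected sites, which is exactly the context a bare total hides.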

1791
01:06:49,720 –> 01:06:52,040
You also enforce drill path discipline.

1792
01:06:52,040 –> 01:06:55,960
The drill path needs to be deterministic: group to region, to site, to source record

1793
01:06:55,960 –> 01:06:59,360
identifiers.

1794
01:06:59,360 –> 01:07:03,400
When a number gets challenged, the report must let you drill to the reported record grain,

1795
01:07:03,400 –> 01:07:07,280
then provide the keys that let an engineer trace lineage back through Purview: period,

1796
01:07:07,280 –> 01:07:10,480
org unit, load ID, and factor library version key.

1797
01:07:10,480 –> 01:07:16,080
If the drill stops at an aggregated chart, your report is a poster, not an audit artifact.

1798
01:07:16,080 –> 01:07:17,960
Now the part everyone ignores.

1799
01:07:17,960 –> 01:07:18,960
Export strategy.

1800
01:07:18,960 –> 01:07:21,120
At period close, you snapshot the outputs.

1801
01:07:21,120 –> 01:07:24,280
Not because Power BI is unreliable, but because people are.

1802
01:07:24,280 –> 01:07:28,560
Auditors often want exactly what was presented at close, and they want it reproducible even

1803
01:07:28,560 –> 01:07:31,840
if someone changes a report later for internal reasons.

1804
01:07:31,840 –> 01:07:37,640
So you export close packages, a PDF snapshot for human readable continuity, plus a data extract

1805
01:07:37,640 –> 01:07:41,040
that matches the reported KPI tables for machine comparison.

1806
01:07:41,040 –> 01:07:45,680
Store both in the evidence vault with the close metadata, period, data set version,

1807
01:07:45,680 –> 01:07:47,280
report version, and approval reference.
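One way to make a close package verifiable later is to store a manifest next to it. The Python sketch below is an assumption, not something the episode prescribes: it adds SHA-256 digests (via the standard `hashlib` module) to the close metadata listed above, and the function and field names are hypothetical.

```python
from __future__ import annotations

import hashlib


def _digests(files: dict[str, bytes]) -> dict[str, str]:
    """SHA-256 digest per file name."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in files.items()}


def build_close_manifest(
    period: str,
    dataset_version: str,
    report_version: str,
    approval_ref: str,
    files: dict[str, bytes],
) -> dict:
    """Bundle the close metadata with a digest for every exported file."""
    return {
        "period": period,
        "dataset_version": dataset_version,
        "report_version": report_version,
        "approval_ref": approval_ref,
        "files": _digests(files),
    }


def matches_manifest(manifest: dict, files: dict[str, bytes]) -> bool:
    """True if the files on hand are byte-identical to what was closed."""
    return _digests(files) == manifest["files"]


package = {
    "report.pdf": b"%PDF-1.7 placeholder bytes",
    "kpi_extract.csv": b"kpi,period,value\nscope2,2024,100.0\n",
}
manifest = build_close_manifest("FY2024", "ds-v12", "rpt-v7", "APR-0042", package)
```

Storing the manifest in the evidence vault lets anyone later confirm that the PDF and the extract on hand are the ones that were approved.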

1808
01:07:47,280 –> 01:07:48,680
This is not redundant.

1809
01:07:48,680 –> 01:07:53,720
It is defense against “we changed the report layout” becoming “we can’t reproduce what

1810
01:07:53,720 –> 01:07:54,720
we filed.”

1811
01:07:54,720 –> 01:07:56,200
A practical warning.

1812
01:07:56,200 –> 01:08:01,000
The easiest way to reintroduce calculation drift is to allow just one measure to creep in.

1813
01:08:01,000 –> 01:08:04,840
Someone will say the reported tables don’t include a ratio they want, or the business

1814
01:08:04,840 –> 01:08:09,280
wants a different intensity denominator in a visual, or they want to adjust a mapping

1815
01:08:09,280 –> 01:08:12,880
in the report because it’s faster than waiting for the next pipeline run.

1816
01:08:12,880 –> 01:08:17,640
If you allow that, Power BI becomes the calculation engine again, slowly, one convenience

1817
01:08:17,640 –> 01:08:18,640
at a time.

1818
01:08:18,640 –> 01:08:22,880
So the rule stays harsh: the only math allowed in Power BI is presentation math.

1819
01:08:22,880 –> 01:08:26,920
Formatting, simple aggregations that don’t change accounting semantics, and convenience measures

1820
01:08:26,920 –> 01:08:28,800
that do not become the source of truth.

1821
01:08:28,800 –> 01:08:32,200
Anything that changes the meaning of a KPI belongs in the governed calculation zone, gets

1822
01:08:32,200 –> 01:08:35,040
versioned and gets published into the reported tables.

1823
01:08:35,040 –> 01:08:38,200
Because the only thing worse than having no ESG story is having two.

1824
01:08:38,200 –> 01:08:39,200
Optional components.

1825
01:08:39,200 –> 01:08:42,520
Sustainability Manager, ADF, Azure ML: where they fit.

1826
01:08:42,520 –> 01:08:43,680
Optional doesn’t mean irrelevant.

1827
01:08:43,680 –> 01:08:47,880
It means the component is not part of the minimum control surface required to survive assurance.

1828
01:08:47,880 –> 01:08:51,880
You add it when it reduces audit risk or operational friction, not when it makes the demo

1829
01:08:51,880 –> 01:08:52,880
prettier.

1830
01:08:52,880 –> 01:08:55,040
Start with Microsoft Sustainability Manager.

1831
01:08:55,040 –> 01:08:59,160
Microsoft Sustainability Manager is useful when the organization needs structured workflows

1832
01:08:59,160 –> 01:09:03,600
and a sustainability-focused data model without inventing everything from scratch.

1833
01:09:03,600 –> 01:09:07,120
It positions itself around record, report, and reduce.

1834
01:09:07,120 –> 01:09:08,120
That’s not marketing fluff.

1835
01:09:08,120 –> 01:09:09,320
It’s a workflow boundary.

1836
01:09:09,320 –> 01:09:14,680
It can unify siloed data, run emissions calculations and support reporting modules.

1837
01:09:14,680 –> 01:09:17,320
But the architectural question isn’t, is it good?

1838
01:09:17,320 –> 01:09:20,000
The question is does it help you enforce intent?

1839
01:09:20,000 –> 01:09:23,760
Where it helps is governed data collection and auditability inside its domain.

1840
01:09:23,760 –> 01:09:25,520
The platform can track data changes.

1841
01:09:25,520 –> 01:09:28,840
It can enable auditing for sustainability tables in Dataverse.

1842
01:09:28,840 –> 01:09:33,840
It also has data trail report capabilities described as preview, producing traceability

1843
01:09:33,840 –> 01:09:36,960
across inputs, calculation models, logs, and outputs.

1844
01:09:36,960 –> 01:09:41,680
That can be valuable when your current state is uncontrolled spreadsheets and tribal knowledge.

1845
01:09:41,680 –> 01:09:44,880
Because it gives you a default control story you can actually show.

1846
01:09:44,880 –> 01:09:48,040
Where it doesn’t help is when you treat it as the system of record for everything and

1847
01:09:48,040 –> 01:09:51,560
stop caring about reproducibility at the platform boundary.

1848
01:09:51,560 –> 01:09:55,720
If you already have mature emissions logic, strong factor governance, and a deterministic

1849
01:09:55,720 –> 01:09:59,520
lakehouse model, Sustainability Manager becomes optional.

1850
01:09:59,520 –> 01:10:03,040
You might still use it for workflow and data collection features, but you don’t outsource

1851
01:10:03,040 –> 01:10:05,080
your assurance posture to an app.

1852
01:10:05,080 –> 01:10:08,080
And if you do adopt it, be honest about constraints.

1853
01:10:08,080 –> 01:10:11,120
Auditing configuration is blunt through the standard interface.

1854
01:10:11,120 –> 01:10:15,600
It’s all or nothing for sustainability tables unless you use the Power Platform web API

1855
01:10:15,600 –> 01:10:17,000
for more granular control.

1856
01:10:17,000 –> 01:10:18,000
That’s manageable.

1857
01:10:18,000 –> 01:10:21,940
It’s also a reminder that audit ready still requires engineering.

1858
01:10:21,940 –> 01:10:23,880
Next, Azure Data Factory.

1859
01:10:23,880 –> 01:10:28,160
If you’re all in on Fabric-native ingestion and your source landscape is simple, ADF is

1860
01:10:28,160 –> 01:10:29,160
optional.

1861
01:10:29,160 –> 01:10:30,520
Fabric can ingest.

1862
01:10:30,520 –> 01:10:31,920
Fabric can orchestrate.

1863
01:10:31,920 –> 01:10:33,680
And for many organizations, that’s enough.

1864
01:10:33,680 –> 01:10:36,720
But ADF remains valuable when reality shows up.

1865
01:10:36,720 –> 01:10:42,520
Complex ERP extraction, IoT fan in, API rate limits, cross system dependencies, and multi-step

1866
01:10:42,520 –> 01:10:46,280
orchestration that spans networks and security boundaries.

1867
01:10:46,280 –> 01:10:50,240
ADF is the thing you use when you need the pipeline to behave like an integration system,

1868
01:10:50,240 –> 01:10:52,000
not like a notebook with optimism.

1869
01:10:52,000 –> 01:10:54,920
Just remember the immutability constraint you already accepted.

1870
01:10:54,920 –> 01:10:57,880
ADF will fail when you try to overwrite immutable paths.

1871
01:10:57,880 –> 01:11:01,240
You’ll see errors like path immutable due to policy because the storage layer is doing

1872
01:11:01,240 –> 01:11:02,240
its job.

1873
01:11:02,240 –> 01:11:06,160
And for certain transformation patterns, ADF data flows can’t write directly to immutable

1874
01:11:06,160 –> 01:11:08,680
containers because they rely on temporary files.

1875
01:11:08,680 –> 01:11:10,360
The pattern stays the same.

1876
01:11:10,360 –> 01:11:14,720
Write to a mutable staging destination, then copy finalized outputs into the immutable

1877
01:11:14,720 –> 01:11:15,720
evidence zone.
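The stage-then-copy pattern can be sketched with nothing but the local filesystem. This Python example is a stand-in, not ADF or Azure Storage code: the directory layout and the `publish_to_evidence` helper are hypothetical, and exclusive-create mode (`"xb"`) only imitates the refusal to overwrite that an immutability policy enforces on the real storage layer.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def publish_to_evidence(staged_file: Path, evidence_dir: Path) -> Path:
    """Copy a finalized file from mutable staging into the evidence zone.

    The target is opened with exclusive create, so an existing evidence
    object is never overwritten: a second publish raises FileExistsError.
    """
    evidence_dir.mkdir(parents=True, exist_ok=True)
    target = evidence_dir / staged_file.name
    with staged_file.open("rb") as src, target.open("xb") as dst:
        shutil.copyfileobj(src, dst)
    return target
```

Temporary files and retries stay in staging, where they are harmless; the evidence zone only ever sees one finished write per object.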

1878
01:11:15,720 –> 01:11:17,280
ADF isn’t more enterprise.

1879
01:11:17,280 –> 01:11:18,760
It’s more orchestration.

1880
01:11:18,760 –> 01:11:19,760
That’s different.

1881
01:11:19,760 –> 01:11:23,680
Now, Azure Machine Learning. It’s optional because it doesn’t produce baseline numbers.

1882
01:11:23,680 –> 01:11:26,520
If it does, you’re building a probabilistic accounting system.

1883
01:11:26,520 –> 01:11:28,800
You can’t audit a model’s intuition.

1884
01:11:28,800 –> 01:11:31,440
Use Azure ML for three things only.

1885
01:11:31,440 –> 01:11:35,160
Forecasting, anomaly detection, and scenario modeling.

1886
01:11:35,160 –> 01:11:36,160
Forecasting helps planning.

1887
01:11:36,160 –> 01:11:38,280
Anomaly detection helps data quality.

1888
01:11:38,280 –> 01:11:40,120
Scenario modeling helps reduction strategy.

1889
01:11:40,120 –> 01:11:42,280
None of those are the reported KPI baseline.

1890
01:11:42,280 –> 01:11:43,560
They are overlays.

1891
01:11:43,560 –> 01:11:47,720
And they must be labeled as overlays in the data model and in the reports.

1892
01:11:47,720 –> 01:11:53,800
Model outputs should carry model version, training data window, run timestamp, and a clear classification

1893
01:11:53,800 –> 01:11:55,760
as estimated or forecast.

1894
01:11:55,760 –> 01:11:59,960
Otherwise, you’ll inevitably promote a forecast to a fact because it looks clean on

1895
01:11:59,960 –> 01:12:00,960
a slide.
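A simple way to keep overlays out of the accounting path is to give them their own record type. The Python sketch below uses hypothetical class and field names; it carries the metadata listed above on every model output and refuses to append anything but governed KPI records to a reported table.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, datetime, timezone
from enum import Enum


class Classification(Enum):
    ESTIMATED = "estimated"
    FORECAST = "forecast"


@dataclass(frozen=True)
class ReportedKpi:
    """A governed, period-closed number."""
    metric: str
    period: str
    value: float


@dataclass(frozen=True)
class ModelOverlay:
    """A model output; never a reported fact."""
    metric: str
    value: float
    model_version: str
    training_window: tuple[date, date]
    run_timestamp: datetime
    classification: Classification


def append_to_reported(table: list[ReportedKpi], record: object) -> None:
    """Accept governed KPI records only; overlays are rejected by type."""
    if not isinstance(record, ReportedKpi):
        raise TypeError("only governed KPI records may enter the reported zone")
    table.append(record)


forecast = ModelOverlay(
    metric="scope2_tco2e",
    value=96.5,
    model_version="fc-1.3.0",
    training_window=(date(2021, 1, 1), date(2024, 12, 31)),
    run_timestamp=datetime(2025, 2, 1, 6, 0, tzinfo=timezone.utc),
    classification=Classification.FORECAST,
)
```

Keeping the two types separate means a forecast cannot become a fact by accident; promoting it would take a deliberate, visible code change.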

1896
01:12:00,960 –> 01:12:02,960
So the decision rule is harsh.

1897
01:12:02,960 –> 01:12:08,800
Add optional tooling only when it reduces audit risk, not when it reduces effort.

1898
01:12:08,800 –> 01:12:11,920
Sustainability Manager reduces chaos when you need structured collection and built-in

1899
01:12:11,920 –> 01:12:13,800
sustainability workflows.

1900
01:12:13,800 –> 01:12:17,880
ADF reduces fragility when integration complexity exceeds what Fabric orchestration can

1901
01:12:17,880 –> 01:12:19,080
realistically manage.

1902
01:12:19,080 –> 01:12:22,840
Azure ML adds intelligence, but only if you keep it out of the accounting path.

1903
01:12:22,840 –> 01:12:25,040
Optional components don’t replace the fundamentals.

1904
01:12:25,040 –> 01:12:29,720
They either reinforce them or they accelerate your failure in a more expensive way.

1905
01:12:29,720 –> 01:12:32,760
Next a short comparison because every stack can calculate emissions.

1906
01:12:32,760 –> 01:12:35,160
Very few can prove them end to end.

1907
01:12:35,160 –> 01:12:36,840
The short comparison.

1908
01:12:36,840 –> 01:12:39,560
Microsoft versus Snowflake, Databricks, GCP.

1909
01:12:39,560 –> 01:12:43,800
At this point, someone always asks the same question usually with a budget spreadsheet open.

1910
01:12:43,800 –> 01:12:44,800
Why Microsoft?

1911
01:12:44,800 –> 01:12:45,800
Why not Snowflake?

1912
01:12:45,800 –> 01:12:46,800
Why not Databricks?

1913
01:12:46,800 –> 01:12:48,160
Why not just do this on GCP?

1914
01:12:48,160 –> 01:12:51,280
And the answer is not that those stacks can’t calculate emissions.

1915
01:12:51,280 –> 01:12:53,280
They can.

1916
01:12:53,280 –> 01:12:58,120
Any competent data platform can ingest activity data, join it to factor tables and output

1917
01:12:58,120 –> 01:12:59,840
a number labeled scope two.

1918
01:12:59,840 –> 01:13:00,920
That part is not rare.

1919
01:13:00,920 –> 01:13:02,240
It’s table stakes.

1920
01:13:02,240 –> 01:13:05,120
The problem is that assurance doesn’t reward computation.

1921
01:13:05,120 –> 01:13:06,560
Assurance rewards proof.

1922
01:13:06,560 –> 01:13:09,160
So the comparison only matters on three axes.

1923
01:13:09,160 –> 01:13:13,000
Identity and access control, lineage and governance and audit evidence as a first class

1924
01:13:13,000 –> 01:13:14,000
output.

1925
01:13:14,000 –> 01:13:15,000
Everything else is noise.

1926
01:13:15,000 –> 01:13:16,560
Start with identity and access.

1927
01:13:16,560 –> 01:13:18,520
Microsoft’s advantage is not that Entra exists.

1928
01:13:18,520 –> 01:13:19,520
Every cloud has IAM.

1929
01:13:19,520 –> 01:13:23,720
The advantage is that the identity plane is already the enterprise default for most organizations

1930
01:13:23,720 –> 01:13:28,600
running Microsoft 365 Azure and Power Platform and it reaches into the services you’re

1931
01:13:28,600 –> 01:13:33,080
using for ESG, storage, compute, BI and workflow.

1932
01:13:33,080 –> 01:13:35,480
That matters because role separation isn’t a concept.

1933
01:13:35,480 –> 01:13:38,400
It’s a continuous enforcement problem across the entire stack.

1934
01:13:38,400 –> 01:13:42,960
In Microsoft land, you can actually make submitter, validator, calculator, approver, publisher

1935
01:13:42,960 –> 01:13:47,760
map to real groups and real permissions that propagate into the services doing the work.

1936
01:13:47,760 –> 01:13:53,120
In many non-Microsoft stacks, identity becomes an assembly task, not impossible, just assembled.

1937
01:13:53,120 –> 01:13:57,080
And assembled identity is where temporary access becomes permanent, where service accounts

1938
01:13:57,080 –> 01:14:02,000
become shared and where your separation of duties quietly turns into conditional chaos.

1939
01:14:02,000 –> 01:14:03,480
The platform didn’t fail you.

1940
01:14:03,480 –> 01:14:04,760
Your architecture did.

1941
01:14:04,760 –> 01:14:06,840
But the platform determines how hard it is to fail.

1942
01:14:06,840 –> 01:14:08,520
Now lineage and governance.

1943
01:14:08,520 –> 01:14:13,640
This is the axis where most ESG teams discover the difference between we have data and we can

1944
01:14:13,640 –> 01:14:15,240
explain data.

1945
01:14:15,240 –> 01:14:16,560
Microsoft Purview is not magic.

1946
01:14:16,560 –> 01:14:19,520
It’s just a governance plane that is designed to be a governance plane.

1947
01:14:19,520 –> 01:14:23,800
You register assets, you scan, you capture lineage, you assign owners, you query metadata,

1948
01:14:23,800 –> 01:14:28,000
you walk into an audit room with something more defensible than a diagram in Confluence.

1949
01:14:28,000 –> 01:14:32,520
And because Purview integrates into common Microsoft data services, you can build lineage

1950
01:14:32,520 –> 01:14:37,520
that spans ingestion artifacts, lakehouse or warehouse objects and Power BI consumption

1951
01:14:37,520 –> 01:14:40,160
in a way that is operationally achievable.

1952
01:14:40,160 –> 01:14:43,560
In other ecosystems, governance is usually a product stack you bolt on.

1953
01:14:43,560 –> 01:14:46,640
Databricks has Unity Catalog and lineage capabilities in its ecosystem.

1954
01:14:46,640 –> 01:14:48,640
Snowflake has governance features and partners.

1955
01:14:48,640 –> 01:14:51,360
GCP has data catalog and governance tooling.

1956
01:14:51,360 –> 01:14:52,360
All of that can work.

1957
01:14:52,360 –> 01:14:56,440
But the single pane of glass story becomes single pane of glass after integration work,

1958
01:14:56,440 –> 01:15:00,480
which means it competes with everything else for time, budget and political attention.

1959
01:15:00,480 –> 01:15:04,280
Over time, governance loses those fights, not because people are lazy, but because they get

1960
01:15:04,280 –> 01:15:06,560
measured on delivery, not survivability.

1961
01:15:06,560 –> 01:15:10,480
So Microsoft’s practical advantage is not perfection, it’s friction.

1962
01:15:10,480 –> 01:15:14,400
Less friction to do governance well means it’s more likely to happen and more likely to

1963
01:15:14,400 –> 01:15:16,320
stay current when the system evolves.

1964
01:15:16,320 –> 01:15:17,880
That is what auditors actually experience.

1965
01:15:17,880 –> 01:15:22,040
Now the third axis: audit evidence. This is the point that makes most platform comparisons

1966
01:15:22,040 –> 01:15:23,040
meaningless.

1967
01:15:23,040 –> 01:15:28,840
Audit-grade ESG is evidence management: immutable raw inputs, versioned factors, versioned logic,

1968
01:15:28,840 –> 01:15:30,440
period close configuration.

1969
01:15:30,440 –> 01:15:35,760
All adjustments, snapshots, access logs, approval trails, reproducible reruns.

1970
01:15:35,760 –> 01:15:36,760
That’s the system.

1971
01:15:36,760 –> 01:15:38,760
The report is just an output.

1972
01:15:38,760 –> 01:15:42,160
Microsoft doesn’t automatically give you this either, but the architecture aligns with

1973
01:15:42,160 –> 01:15:45,920
it because the components map cleanly to the evidence life cycle.

1974
01:15:45,920 –> 01:15:48,480
Entra gives you the enforcement surface for role separation.

1975
01:15:48,480 –> 01:15:52,520
ADLS Gen2 with immutability gives you the evidence vault behavior.

1976
01:15:52,520 –> 01:15:56,360
Fabric or Synapse gives you the governed compute surface, where you can implement deterministic

1977
01:15:56,360 –> 01:15:57,960
calculation artifacts.

1978
01:15:57,960 –> 01:16:01,240
Purview gives you lineage as queryable metadata instead of oral tradition.

1979
01:16:01,240 –> 01:16:06,760
Power BI gives you presentation without forcing you to mix computation and visualization.

1980
01:16:06,760 –> 01:16:10,960
That collection forms an integrated control plane story, not a vendor story, a control

1981
01:16:10,960 –> 01:16:14,600
story. In other stacks, you can absolutely build the same control story.

1982
01:16:14,600 –> 01:16:18,480
But you build it, you assemble the identity controls across tools, you assemble lineage

1983
01:16:18,480 –> 01:16:23,120
across transformation engines and BI, you assemble immutability patterns across storage

1984
01:16:23,120 –> 01:16:27,920
and pipeline behavior, you assemble evidence packs as a discipline, not a platform feature.

1985
01:16:27,920 –> 01:16:32,200
And every assembly point becomes a place where policy erodes because policy always erodes

1986
01:16:32,200 –> 01:16:34,680
when intent isn’t enforced by design.

1987
01:16:34,680 –> 01:16:36,400
That’s the uncomfortable truth.

1988
01:16:36,400 –> 01:16:39,440
Architecture is what remains after your governance committee stops meeting.

1989
01:16:39,440 –> 01:16:44,120
Now, to be fair, there are reasons teams choose Snowflake, Databricks, or GCP for ESG.

1990
01:16:44,120 –> 01:16:46,880
They might already run their entire analytics estate there.

1991
01:16:46,880 –> 01:16:49,400
They might have stronger internal skills on that stack.

1992
01:16:49,400 –> 01:16:53,000
They might have vendor constraints or data gravity that makes Microsoft the wrong place

1993
01:16:53,000 –> 01:16:54,000
to compute.

1994
01:16:54,000 –> 01:16:57,720
None of that is invalid, but if they choose those platforms, they still have to answer

1995
01:16:57,720 –> 01:17:02,320
the same assurance questions and they still have to build the same non-negotiables: immutability,

1996
01:17:02,320 –> 01:17:04,960
reproducibility, lineage, separation of duties.

1997
01:17:04,960 –> 01:17:07,160
The stack changes, the physics don’t.

1998
01:17:07,160 –> 01:17:10,200
So the short verdict is this, all stacks can calculate emissions.

1999
01:17:10,200 –> 01:17:14,480
Very few stacks can prove them end-to-end without deliberate architecture that prioritizes

2000
01:17:14,480 –> 01:17:16,320
evidence over convenience.

2001
01:17:16,320 –> 01:17:20,120
Microsoft’s advantage is that it gives you a coherent set of primitives that align

2002
01:17:20,120 –> 01:17:24,560
with audit survivability, especially in organizations already living inside Entra,

2003
01:17:24,560 –> 01:17:26,920
Microsoft 365, and Azure.

2004
01:17:26,920 –> 01:17:29,000
And this episode was never about vendor fandom.

2005
01:17:29,000 –> 01:17:32,920
It was about building something that survives contact with assurance, which is why the next

2006
01:17:32,920 –> 01:17:35,240
section matters more than the comparison.

2007
01:17:35,240 –> 01:17:40,200
The minimal viable auditable ESG architecture is the part you can actually replicate.

2008
01:17:40,200 –> 01:17:43,800
Minimal viable auditable ESG architecture, the replicable blueprint.

2009
01:17:43,800 –> 01:17:47,600
Here’s the part people pretend they want until it forces decisions.

2010
01:17:47,600 –> 01:17:52,320
A minimal viable auditable ESG architecture isn’t minimal because it’s cheap or quick.

2011
01:17:52,320 –> 01:17:56,660
It’s minimal because it contains the smallest set of components and artifacts that can

2012
01:17:56,660 –> 01:18:00,500
survive assurance without turning your team into full-time historians.

2013
01:18:00,500 –> 01:18:04,340
So define the environs clearly: boundaries, components, and produced evidence.

2014
01:18:04,340 –> 01:18:09,220
The boundaries are four zones: raw, curated, reported, and an evidence vault.

2015
01:18:09,220 –> 01:18:13,380
Not because medallion architecture is fashionable, but because it’s the cleanest way to separate

2016
01:18:13,380 –> 01:18:16,540
what happened from what you did to it from what you claim.

2017
01:18:16,540 –> 01:18:18,140
The components are five.

2018
01:18:18,140 –> 01:18:24,060
Entra ID, ADLS Gen2 with immutability, Fabric or Synapse for governed compute, Purview

2019
01:18:24,060 –> 01:18:28,020
for lineage, and Power BI as a thin presentation layer.

2020
01:18:28,020 –> 01:18:32,060
Everything else is optional, and optional means it’s allowed to be absent without collapsing

2021
01:18:32,060 –> 01:18:33,860
audit survivability.

2022
01:18:33,860 –> 01:18:36,220
The artifacts are what make it auditable.

2023
01:18:36,220 –> 01:18:40,900
Load IDs and ingestion logs, immutable raw objects, governed calculation artifacts with

2024
01:18:40,900 –> 01:18:47,260
version control, factor library versions with approval records, period close configuration,

2025
01:18:47,260 –> 01:18:51,020
reported KPI tables, and a close package snapshot.

2026
01:18:51,020 –> 01:18:54,180
If you don’t produce those artifacts, you didn’t build an auditable system.

2027
01:18:54,180 –> 01:18:55,180
You built a dashboard.

2028
01:18:55,180 –> 01:19:00,380
Now walk through one KPI end-to-end because architecture without a trace is just a diagram.

2029
01:19:00,380 –> 01:19:01,940
Pick Scope 2 emissions.

2030
01:19:01,940 –> 01:19:06,620
It’s common enough, and it’s where drift and factor ambiguity show up fast.

2031
01:19:06,620 –> 01:19:12,260
The source is activity data, electricity consumption by site and period from invoices or meters.

2032
01:19:12,260 –> 01:19:15,540
Ingestion lands it in the raw zone as append-only objects.

2033
01:19:15,540 –> 01:19:19,820
Each load gets a load ID, timestamp, source identifier, and submitter identity.

2034
01:19:19,820 –> 01:19:24,620
If the ingestion came from a human submission, it still lands as a new versioned object.

2035
01:19:24,620 –> 01:19:26,660
No overwrite, ever.
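
A minimal sketch of that ingestion rule, assuming an in-memory stand-in for the raw zone. The names RawLoad, RawZone, and ingest are hypothetical and are not part of any Azure SDK; in a real deployment the append-only guarantee would come from ADLS immutability policies, not from application code.

```python
import hashlib
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class RawLoad:
    """One immutable raw object plus the metadata the episode lists."""
    load_id: str
    loaded_at: str
    source_id: str
    submitter: str
    sha256: str
    payload: bytes


class RawZone:
    """Hypothetical in-memory raw zone: loads can be added, never replaced."""

    def __init__(self):
        self._loads = {}

    def ingest(self, payload: bytes, source_id: str, submitter: str) -> RawLoad:
        load = RawLoad(
            load_id=str(uuid.uuid4()),
            loaded_at=datetime.now(timezone.utc).isoformat(),
            source_id=source_id,
            submitter=submitter,
            sha256=hashlib.sha256(payload).hexdigest(),
            payload=payload,
        )
        # Refuse to replace an existing entry, even in the unlikely
        # event of an ID collision.
        if load.load_id in self._loads:
            raise RuntimeError("load ID collision; refusing to overwrite")
        self._loads[load.load_id] = load
        return load

    def loads_for(self, source_id: str) -> list:
        return [l for l in self._loads.values() if l.source_id == source_id]


zone = RawZone()
first = zone.ingest(b"site=BER-01,kwh=100", "meter-feed", "alice@example.com")
# A correction is a new load with its own ID, not an edit of the first one.
second = zone.ingest(b"site=BER-01,kwh=120", "meter-feed", "alice@example.com")
```

Both loads stay retrievable, so the question "which submission was accepted for the period" is answered later at close, not by deleting the earlier one.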

2036
01:19:26,660 –> 01:19:29,500
Validation runs before anything becomes curated.

2037
01:19:29,500 –> 01:19:34,380
Schema checks, unit checks, required dimensions like site, period, and measurement type.

2038
01:19:34,380 –> 01:19:37,500
The validation output is not a log line in the pipeline run.

2039
01:19:37,500 –> 01:19:39,700
It’s an artifact you can retrieve later.

2040
01:19:39,700 –> 01:19:43,980
Pass, fail, warnings, and what was corrected or normalized.
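
One way to read "an artifact you can retrieve later" is a structured record per load that can be stored next to the data. The sketch below is an assumption about shape only: the field names, the required-dimension list, and the unit table are illustrative, not a schema from the episode.

```python
import json

# Hypothetical required dimensions and accepted units.
REQUIRED_FIELDS = ("site", "period", "measurement_type", "value", "unit")
ACCEPTED_UNITS = {"kWh", "MWh"}


def validate_load(load_id, rows):
    """Return a validation artifact (a plain dict) for one raw load."""
    errors, warnings, normalized = [], [], []
    for index, row in enumerate(rows):
        missing = [f for f in REQUIRED_FIELDS if row.get(f) in (None, "")]
        if missing:
            errors.append({"row": index, "issue": "missing fields", "fields": missing})
            continue
        if row["unit"] not in ACCEPTED_UNITS:
            errors.append({"row": index, "issue": "unknown unit", "unit": row["unit"]})
            continue
        if row["unit"] == "MWh":
            # Record what will be normalized instead of doing it silently.
            normalized.append({"row": index, "from": "MWh", "to": "kWh"})
        if row["value"] == 0:
            warnings.append({"row": index, "issue": "zero consumption"})
    return {
        "load_id": load_id,
        "status": "fail" if errors else "pass",
        "errors": errors,
        "warnings": warnings,
        "normalized": normalized,
    }


artifact = validate_load(
    "load-001",
    [
        {"site": "BER-01", "period": "2024-01", "measurement_type": "electricity",
         "value": 1.5, "unit": "MWh"},
        {"site": "BER-02", "period": "2024-01", "measurement_type": "electricity",
         "value": 0, "unit": "kWh"},
        {"site": "BER-03", "period": "", "measurement_type": "electricity",
         "value": 10, "unit": "kWh"},
    ],
)
# Serialized form that could be written to storage alongside the load.
artifact_json = json.dumps(artifact, sort_keys=True)
```

Keeping the artifact keyed by load ID is what lets it be pulled up months later without rerunning the pipeline.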

2041
01:19:43,980 –> 01:19:46,060
Then curated: you standardize the shape.

2042
01:19:46,060 –> 01:19:47,460
Sites map to org units.

2043
01:19:47,460 –> 01:19:51,220
Units get normalized. Missing dimensions get flagged, not silently filled.

2044
01:19:51,220 –> 01:19:54,220
This is also where you enforce controlled vocab.

2045
01:19:54,220 –> 01:19:57,380
Electricity type, supply identifiers, region codes.

2046
01:19:57,380 –> 01:20:02,340
The curated tables carry quality flags forward because clean data that hides uncertainty is

2047
01:20:02,340 –> 01:20:03,500
a liability.
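
A small sketch of "flagged, not silently filled", under the assumption that curation is a per-row function. The site mapping, region codes, and flag names are hypothetical placeholders.

```python
# Hypothetical controlled vocabularies for the curated zone.
SITE_TO_ORG_UNIT = {"BER-01": "EMEA-Manufacturing", "NYC-01": "AMER-Offices"}
REGION_CODES = {"DE", "US"}


def curate(row):
    """Map a validated row into the curated shape, carrying quality flags."""
    flags = []
    org_unit = SITE_TO_ORG_UNIT.get(row.get("site"))
    if org_unit is None:
        flags.append("UNMAPPED_SITE")
    if row.get("region") not in REGION_CODES:
        flags.append("UNKNOWN_REGION")
    # Unknown values stay None and flagged; nothing is defaulted or guessed.
    return {**row, "org_unit": org_unit, "quality_flags": flags}


clean = curate({"site": "BER-01", "region": "DE", "kwh": 100.0})
flagged = curate({"site": "XXX-99", "region": None, "kwh": 50.0})
```

The flagged row still reaches the curated table, so downstream coverage indicators can count it rather than pretend it never existed.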

2048
01:20:03,500 –> 01:20:06,780
Then the calculation zone: Fabric Lakehouse or Synapse.

2049
01:20:06,780 –> 01:20:11,340
This is where Scope 2 emissions get computed using versioned logic, not DAX.

2050
01:20:11,340 –> 01:20:13,820
Not the report, a governed artifact.

2051
01:20:13,820 –> 01:20:16,340
The computation binds to two things explicitly.

2052
01:20:16,340 –> 01:20:19,940
The activity load IDs and the factor library version key.

2053
01:20:19,940 –> 01:20:23,180
If the job doesn’t have a factor version key, it fails.

2054
01:20:23,180 –> 01:20:26,780
If multiple factors match due to sloppy mappings, it fails.

2055
01:20:26,780 –> 01:20:28,140
Deterministic selection or no selection.
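
The fail-closed rule can be sketched as a lookup that accepts exactly one match. Everything here is an assumption for illustration: the Factor fields, the matching keys (region and year), and the factor value 0.5, which is a placeholder and not a real emission factor.

```python
from dataclasses import dataclass


class FactorSelectionError(Exception):
    """Raised when factor selection is not deterministic."""


@dataclass(frozen=True)
class Factor:
    version_key: str
    region: str
    year: int
    kg_co2e_per_kwh: float


def select_factor(factors, version_key, region, year):
    if not version_key:
        raise FactorSelectionError("no factor version key supplied")
    matches = [
        f for f in factors
        if f.version_key == version_key and f.region == region and f.year == year
    ]
    if len(matches) != 1:
        raise FactorSelectionError(
            f"expected exactly 1 factor for {region}/{year} in {version_key}, "
            f"found {len(matches)}"
        )
    return matches[0]


def compute_scope2(activity_rows, factors, version_key):
    """Bind each result to its activity load ID and factor version key."""
    results = []
    for row in activity_rows:
        factor = select_factor(factors, version_key, row["region"], row["year"])
        results.append({
            "load_id": row["load_id"],
            "factor_version_key": factor.version_key,
            "kg_co2e": row["kwh"] * factor.kg_co2e_per_kwh,
        })
    return results


library = [Factor("factors-2024.1", "DE", 2024, 0.5)]  # placeholder value
rows = [{"load_id": "load-001", "region": "DE", "year": 2024, "kwh": 1000.0}]
results = compute_scope2(rows, library, "factors-2024.1")
```

Raising instead of picking "the first match" is the design choice: an ambiguous mapping becomes a failed job someone has to fix, not a number nobody can explain.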

2056
01:20:28,140 –> 01:20:29,460
Now period close.

2057
01:20:29,460 –> 01:20:31,540
Period close is not a calendar event.

2058
01:20:31,540 –> 01:20:34,860
It’s a state change that freezes your ability to rewrite the past.

2059
01:20:34,860 –> 01:20:38,580
You freeze the inputs by selecting the accepted load IDs for the period.

2060
01:20:38,580 –> 01:20:43,220
You freeze the factors by binding the approved factor library version IDs to that period.

2061
01:20:43,220 –> 01:20:47,260
You freeze the logic by referencing the released calculation artifact version.

2062
01:20:47,260 –> 01:20:51,460
Then you publish the reported outputs: the KPI tables for that period, with keys for period,

2063
01:20:51,460 –> 01:20:55,740
org unit, method, measured-versus-estimated flags, and the factor version key used.

2064
01:20:55,740 –> 01:21:00,780
Then you lock. ADLS immutability applies to the raw evidence and to the published close

2065
01:21:00,780 –> 01:21:02,140
artifacts:

2066
01:21:02,140 –> 01:21:06,220
factor library snapshot, close configuration, and reported outputs, as needed for your

2067
01:21:06,220 –> 01:21:07,540
evidence strategy.

2068
01:21:07,540 –> 01:21:10,220
You don’t have to make every table immutable forever.

2069
01:21:10,220 –> 01:21:12,380
You do have to make the close package immutable.

2070
01:21:12,380 –> 01:21:13,380
That’s the point.
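
A close package can be pictured as one manifest that names the frozen inputs, factors, and logic, plus a content hash. This is a sketch under assumptions: the manifest field names are hypothetical, and the hash only detects later changes. Preventing them is the job of the storage-level immutability policy.

```python
import hashlib
import json


def _canonical(manifest):
    # Stable serialization so the same content always hashes the same.
    return json.dumps(manifest, sort_keys=True, separators=(",", ":"))


def build_close_package(period, load_ids, factor_version_key,
                        calc_artifact_version, kpi_rows):
    manifest = {
        "period": period,
        "accepted_load_ids": sorted(load_ids),
        "factor_version_key": factor_version_key,
        "calculation_artifact_version": calc_artifact_version,
        "reported_kpis": kpi_rows,
    }
    digest = hashlib.sha256(_canonical(manifest).encode("utf-8")).hexdigest()
    return {"manifest": manifest, "sha256": digest}


def verify_close_package(package):
    """True if the manifest still matches the hash taken at close."""
    expected = hashlib.sha256(
        _canonical(package["manifest"]).encode("utf-8")
    ).hexdigest()
    return expected == package["sha256"]


kpis = [{"org_unit": "EMEA-Manufacturing", "kg_co2e": 500.0, "method": "measured"}]
package = build_close_package(
    "2024-Q1", ["load-002", "load-001"], "factors-2024.1", "scope2-calc-1.2.0", kpis
)
```

Writing this manifest to an immutable container at close is what turns "we think these were the inputs" into something an auditor can check byte for byte.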

2071
01:21:13,380 –> 01:21:14,380
Then Power BI.

2072
01:21:14,380 –> 01:21:16,260
Power BI reads reported tables only.

2073
01:21:16,260 –> 01:21:17,540
The dataset is certified.

2074
01:21:17,540 –> 01:21:18,860
The refresh is controlled.

2075
01:21:18,860 –> 01:21:23,500
The visuals include the confidence context, measured versus estimated, coverage indicators,

2076
01:21:23,500 –> 01:21:26,180
and the drill path down to record identifiers.

2077
01:21:26,180 –> 01:21:29,300
When someone challenges the KPI, you don’t debate, you drill.

2078
01:21:29,300 –> 01:21:30,980
And Purview stitches the whole path.

2079
01:21:30,980 –> 01:21:33,940
Power BI report to dataset, dataset to reported tables.

2080
01:21:33,940 –> 01:21:36,100
Reported tables to calculation artifacts.

2081
01:21:36,100 –> 01:21:39,900
Calculation artifacts to curated inputs, curated inputs to raw loads, raw loads to sources

2082
01:21:39,900 –> 01:21:41,220
and submission identities.

2083
01:21:41,220 –> 01:21:43,220
That lineage is not for beauty. It’s for

2084
01:21:43,220 –> 01:21:46,980
the day someone asks, “prove it,” while your calendar is already on fire.
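
The chain just described is a directed graph from the report back to sources. The sketch below walks such a graph with plain Python. It does not use or imitate the Purview API, and every node name in it is a hypothetical example.

```python
# Hypothetical lineage edges: each asset points at what it was derived from.
LINEAGE = {
    "report:esg-scope2": ["dataset:esg-certified"],
    "dataset:esg-certified": ["table:reported.scope2_kpi"],
    "table:reported.scope2_kpi": ["calc:scope2-calc-1.2.0"],
    "calc:scope2-calc-1.2.0": ["table:curated.electricity"],
    "table:curated.electricity": ["load:load-001", "load:load-002"],
    "load:load-001": ["source:meter-feed"],
    "load:load-002": ["source:invoice-upload"],
}


def trace_upstream(node, graph):
    """Depth-first walk from an asset to everything it depends on."""
    seen, order, stack = set(), [], [node]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        # Reverse so upstream assets are visited in their listed order.
        stack.extend(reversed(graph.get(current, [])))
    return order


path = trace_upstream("report:esg-scope2", LINEAGE)
sources = [n for n in path if n.startswith("source:")]
```

If any hop in that walk is missing from the catalog, the drill path stops there, which is exactly the gap an auditor will find.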

2085
01:21:46,980 –> 01:21:48,460
Finally, sequencing.

2086
01:21:48,460 –> 01:21:51,860
Because this is where most teams implode by trying to boil the ocean.

2087
01:21:51,860 –> 01:21:54,540
Week one, pick one KPI and one data source.

2088
01:21:54,540 –> 01:21:57,860
Build ingestion into raw with load IDs and validation artifacts.

2089
01:21:57,860 –> 01:22:04,180
Week two, build the curated model for that KPI, including quality flags and controlled dimensions.

2090
01:22:04,180 –> 01:22:08,740
Week three, implement the governed calculation zone with factor version binding and a reported

2091
01:22:08,740 –> 01:22:09,740
output table.

2092
01:22:09,740 –> 01:22:14,620
Week four, register assets in Purview and build a thin Power BI report that drills to record

2093
01:22:14,620 –> 01:22:16,660
IDs and shows factor version keys.

2094
01:22:16,660 –> 01:22:17,860
That’s done.

2095
01:22:17,860 –> 01:22:20,340
Not perfect, done in the only sense that matters.

2096
01:22:20,340 –> 01:22:24,660
The auditor’s questions are answerable from systems, not from meetings.

2097
01:22:24,660 –> 01:22:29,660
Auditable ESG in Microsoft isn’t about dashboards, it’s about immutable data, versioned calculations

2098
01:22:29,660 –> 01:22:32,740
and lineage you can explain to an auditor without PowerPoint.

2099
01:22:32,740 –> 01:22:37,380
If you want the next layer, the ESG data model itself, raw versus curated versus reported

2100
01:22:37,380 –> 01:22:40,660
and how to enforce period close, watch the next episode and subscribe.




