When You Need Your Own AI — and When You Don’t

Mirko Peters, Podcasts


1
00:00:00,000 –> 00:00:02,200
The night was thick with static.

2
00:00:02,200 –> 00:00:06,200
Your tenant humming, files stacked like rusted steel.

3
00:00:06,200 –> 00:00:10,040
You want answers fast, but not guesses.

4
00:00:10,040 –> 00:00:12,080
Copilot is quick, friendly.

5
00:00:12,080 –> 00:00:16,000
It skims your M365 streets and hands you a summary.

6
00:00:16,000 –> 00:00:21,120
Good enough for small talk, not for policy, not for risk.

7
00:00:21,120 –> 00:00:22,760
RAG cuts deeper.

8
00:00:22,760 –> 00:00:27,960
It drags truth from your own stack, cites it, stands by it.

9
00:00:27,960 –> 00:00:30,920
So here’s the map: when Copilot is enough,

10
00:00:30,920 –> 00:00:34,040
when you need your own pipeline, and why teams blow this call,

11
00:00:34,040 –> 00:00:36,680
then pay for it in rework, tickets and trust.

12
00:00:36,680 –> 00:00:37,800
Stay sharp.

13
00:00:37,800 –> 00:00:40,840
There’s a secret step that makes this 10x easier.

14
00:00:40,840 –> 00:00:43,440
We’ll get there.

15
00:00:43,440 –> 00:00:48,720
Now, we define the players. Defining the players:

16
00:00:48,720 –> 00:00:51,920
what are Copilot and LLMs?

17
00:00:51,920 –> 00:00:54,680
Start with the engine, large language models.

18
00:00:54,680 –> 00:00:59,280
They speak like us because they are trained on oceans of public text.

19
00:00:59,280 –> 00:01:02,440
Patterns, tokens, next-word bets. They don’t know.

20
00:01:02,440 –> 00:01:04,720
They predict. That prediction is powerful.

21
00:01:04,720 –> 00:01:10,800
Drafts, summaries, code sketches, meeting notes cleaned and sorted.

22
00:01:10,800 –> 00:01:13,120
Fast.

23
00:01:13,120 –> 00:01:17,520
But down here, your world is narrow, specific, messy.

24
00:01:17,520 –> 00:01:21,000
HR policies with last year’s date, a procurement form

25
00:01:21,000 –> 00:01:24,800
that changed last month, a device standard buried in a PDF

26
00:01:24,800 –> 00:01:27,040
on a forgotten SharePoint stack.

27
00:01:27,040 –> 00:01:30,000
A plain LLM won’t see it because in this city,

28
00:01:30,000 –> 00:01:33,640
the model only knows what you feed it now, not what you hid back then,

29
00:01:33,640 –> 00:01:35,600
not what changed yesterday.

30
00:01:35,600 –> 00:01:36,520
Enter Copilot.

31
00:01:36,520 –> 00:01:40,160
Think of it like a streetwise guide inside Microsoft 365.

32
00:01:40,160 –> 00:01:43,440
It can walk Outlook alleys, Teams corridors, SharePoint towers,

33
00:01:43,440 –> 00:01:44,840
OneDrive back rooms.

34
00:01:44,840 –> 00:01:46,120
It reads what you can read.

35
00:01:46,120 –> 00:01:48,440
It stays in bounds with your permissions.

36
00:01:48,440 –> 00:01:51,800
It drafts replies, writes meeting recaps,

37
00:01:51,800 –> 00:01:55,280
pulls related files you already have rights to.

38
00:01:55,280 –> 00:01:59,240
It’s good at what’s in my lane right now.

39
00:01:59,240 –> 00:02:03,240
It’s safe, governed, and fast because the terrain is familiar.

40
00:02:03,240 –> 00:02:05,120
Your identity controls the gates.

41
00:02:05,120 –> 00:02:07,160
Your data doesn’t leave the precinct.

42
00:02:07,160 –> 00:02:08,720
Where does Copilot shine?

43
00:02:08,720 –> 00:02:10,960
Everyday flow. You’re buried in email,

44
00:02:10,960 –> 00:02:12,240
you need a clean summary.

45
00:02:12,240 –> 00:02:16,200
You want a quick brief for a meeting using files from your team site.

46
00:02:16,200 –> 00:02:18,120
You want to rephrase a doc in your voice.

47
00:02:18,120 –> 00:02:21,520
You’re staying inside M365, no custom data pipelines,

48
00:02:21,520 –> 00:02:26,680
no special retrieval logic, no extra tooling, straight utility.

49
00:02:26,680 –> 00:02:29,400
But here’s where most people mess up.

50
00:02:29,400 –> 00:02:35,520
They expect Copilot to know the factory floor SOP, the onboarding maze,

51
00:02:35,520 –> 00:02:40,280
the device compliance footnote from a PDF that never made it to the right library.

52
00:02:40,280 –> 00:02:45,560
They ask it to cross check ERP fields or explain a CRM status code

53
00:02:45,560 –> 00:02:49,080
that lives outside the M365 city limits.

54
00:02:49,080 –> 00:02:51,920
Then they blame the model when the answer leans generic.

55
00:02:51,920 –> 00:02:53,160
We know better.

56
00:02:53,160 –> 00:02:54,760
It’s not a mind reader.

57
00:02:54,760 –> 00:02:57,160
It’s a runner working a single district.

58
00:02:57,160 –> 00:02:58,720
So what’s missing?

59
00:02:58,720 –> 00:03:02,520
Retrieval, controlled, precise.

60
00:03:02,520 –> 00:03:05,640
You need a librarian who knows where the bodies are buried.

61
00:03:05,640 –> 00:03:08,240
A way to turn your PDFs, web pages,

62
00:03:08,240 –> 00:03:11,560
wikis and databases into fast, relevant context

63
00:03:11,560 –> 00:03:13,360
at the exact moment of the question.

64
00:03:13,360 –> 00:03:15,920
That’s retrieval-augmented generation.

65
00:03:15,920 –> 00:03:18,080
RAG. It’s not a model trick.

66
00:03:18,080 –> 00:03:19,880
It’s an information supply chain.

67
00:03:19,880 –> 00:03:21,520
The reason this works is simple.

68
00:03:21,520 –> 00:03:25,200
The model’s memory is short, prompts are finite.

69
00:03:25,200 –> 00:03:28,600
But you can fetch just the right chunks at query time.

70
00:03:28,600 –> 00:03:31,440
Feed them in, ask the model to answer only from those cites.

71
00:03:31,440 –> 00:03:34,080
You get grounded output, you get proof.

72
00:03:34,080 –> 00:03:36,320
And when your data shifts, you re-index.

73
00:03:36,320 –> 00:03:40,800
No retraining, no long cycles, just fresher truth.

74
00:03:40,800 –> 00:03:42,000
Now let’s be clear.

75
00:03:42,000 –> 00:03:44,560
Copilot can already surface some of your files

76
00:03:44,560 –> 00:03:47,080
if they live in M365 and you have access.

77
00:03:47,080 –> 00:03:50,640
It’s handy, but it won’t build you a custom index

78
00:03:50,640 –> 00:03:54,440
across SharePoint, file shares, websites, and line

79
00:03:54,440 –> 00:03:56,000
of business systems.

80
00:03:56,000 –> 00:03:59,760
It won’t let you tune chunk sizes for a gnarly SOP.

81
00:03:59,760 –> 00:04:03,480
It won’t force citations, run retrieval evaluations,

82
00:04:03,480 –> 00:04:05,720
or give you a custom tool to hit an API

83
00:04:05,720 –> 00:04:07,640
and pull a live value mid-answer.

84
00:04:07,640 –> 00:04:08,960
That’s outside its beat.

85
00:04:08,960 –> 00:04:10,320
Think constraints.

86
00:04:10,320 –> 00:04:13,400
Copilot is bounded by your tenant’s native graph

87
00:04:13,400 –> 00:04:15,080
and its own product surface.

88
00:04:15,080 –> 00:04:18,240
That’s good for speed, great for governance.

89
00:04:18,240 –> 00:04:21,960
But if you need cross-system truth, strict grounding,

90
00:04:21,960 –> 00:04:25,520
or repeatable answers tied to versioned sources,

91
00:04:25,520 –> 00:04:27,640
you’ll feel the walls closing in.

92
00:04:27,640 –> 00:04:29,800
This clicked for me when a team asked Copilot

93
00:04:29,800 –> 00:04:32,480
to untangle a device hardening policy.

94
00:04:32,480 –> 00:04:34,960
The doc was split across three PDFs.

95
00:04:34,960 –> 00:04:36,000
One was stale.

96
00:04:36,000 –> 00:04:37,880
One lived on a file server.

97
00:04:37,880 –> 00:04:40,520
One had the only correct baseline.

98
00:04:40,520 –> 00:04:43,320
Copilot did its best with what it could see.

99
00:04:43,320 –> 00:04:44,720
The answer sounded right.

100
00:04:44,720 –> 00:04:45,760
It wasn’t.

101
00:04:45,760 –> 00:04:47,960
Service desk tickets spiked.

102
00:04:47,960 –> 00:04:49,200
Minutes wasted.

103
00:04:49,200 –> 00:04:51,120
Trust bled.

104
00:04:51,120 –> 00:04:52,920
With RAG, you don’t pray.

105
00:04:52,920 –> 00:04:55,920
You prepare, you ingest, you chunk, you tag.

106
00:04:55,920 –> 00:04:59,320
You index with vectors, so meaning survives paraphrase.

107
00:04:59,320 –> 00:05:01,400
You fetch the closest chunks.

108
00:05:01,400 –> 00:05:02,640
You show citations.

109
00:05:02,640 –> 00:05:04,320
You add a hard rule.

110
00:05:04,320 –> 00:05:06,920
If nothing fits, say you don’t know.

111
00:05:06,920 –> 00:05:08,880
Hallucinations drop.

112
00:05:08,880 –> 00:05:11,200
Confidence climbs.

113
00:05:11,200 –> 00:05:15,720
If you remember nothing else, Copilot is your inbox partner.

114
00:05:15,720 –> 00:05:17,400
RAG is your knowledge pipeline.

115
00:05:17,400 –> 00:05:20,200
Use the guide when you’re inside the district.

116
00:05:20,200 –> 00:05:23,840
Build the pipeline when the stakes demand proof.

117
00:05:23,840 –> 00:05:25,320
Defining the players.

118
00:05:25,320 –> 00:05:26,280
What is RAG?

119
00:05:26,280 –> 00:05:28,680
Retrieval-augmented generation.

120
00:05:28,680 –> 00:05:30,320
RAG isn’t magic.

121
00:05:30,320 –> 00:05:33,440
It’s plumbing, cold pipes, hot truth.

122
00:05:33,440 –> 00:05:35,040
Three moving parts.

123
00:05:35,040 –> 00:05:38,080
Retrieval, augmentation, generation.

124
00:05:38,080 –> 00:05:39,200
Retrieval first.

125
00:05:39,200 –> 00:05:40,960
You build a private library.

126
00:05:40,960 –> 00:05:43,320
Not glossy, brutal.

127
00:05:43,320 –> 00:05:44,800
Your PDFs.

128
00:05:44,800 –> 00:05:45,840
Wikis.

129
00:05:45,840 –> 00:05:46,840
Pages.

130
00:05:46,840 –> 00:05:47,840
Tables.

131
00:05:47,840 –> 00:05:48,840
Tickets.

132
00:05:48,840 –> 00:05:50,360
Change logs.

133
00:05:50,360 –> 00:05:53,680
SOP binders that smell like dust and denial.

134
00:05:53,680 –> 00:05:55,120
You don’t throw them at a model.

135
00:05:55,120 –> 00:05:56,040
You process them.

136
00:05:56,040 –> 00:05:58,760
You slice them into small, useful pieces.

137
00:05:58,760 –> 00:05:59,600
Chunks.

138
00:05:59,600 –> 00:06:02,440
Then you tag them with metadata so a machine can smell

139
00:06:02,440 –> 00:06:04,400
context like a bloodhound.

140
00:06:04,400 –> 00:06:08,480
You vectorize the chunks so meaning holds when the words don’t match.

141
00:06:08,480 –> 00:06:09,800
That’s the search fuel.
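
A minimal sketch of the slice-and-tag step described here, assuming plain text input and a word-window chunker with overlap. The `Chunk` class, the metadata keys, and the default sizes are illustrative choices, not something the episode specifies, and the embedding call is left out on purpose.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    """Split text into windows of `size` words; neighbors share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    words = text.split()
    step = size - overlap
    windows = []
    for start in range(0, len(words), step):
        windows.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return windows


def make_chunks(text: str, source: str, version: str, owner: str, **kw) -> list[Chunk]:
    """Attach illustrative metadata (source, version, owner, position) to each window."""
    return [
        Chunk(text=piece, metadata={"source": source, "version": version,
                                    "owner": owner, "position": i})
        for i, piece in enumerate(chunk_words(text, **kw))
    ]
```

Each chunk would then be sent to whatever embedding model the pipeline uses, with the metadata stored next to the vector so later filters have something to grip.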

142
00:06:09,800 –> 00:06:11,320
Augmentation next.

143
00:06:11,320 –> 00:06:13,480
A question walks in.

144
00:06:13,480 –> 00:06:14,520
Plain clothes.

145
00:06:14,520 –> 00:06:16,720
You convert the question into a vector.

146
00:06:16,720 –> 00:06:19,400
You hunt the nearest chunks in your index.

147
00:06:19,400 –> 00:06:21,480
You pull back the top few that matter.

148
00:06:21,480 –> 00:06:22,920
You package them as context.

149
00:06:22,920 –> 00:06:23,800
Not all your data.

150
00:06:23,800 –> 00:06:25,320
Just the right shards.

151
00:06:25,320 –> 00:06:28,040
Tight, relevant, dated, sourced.
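
The nearest-chunk hunt can be sketched as a cosine-similarity ranking. This assumes the question and the chunks have already been embedded into plain lists of floats; a real index would use an approximate nearest-neighbor structure rather than this brute-force loop, and the function names are illustrative.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=3):
    """index: list of (chunk_id, vector). Returns the k best (chunk_id, score) pairs."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in index]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```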

152
00:06:28,040 –> 00:06:29,640
You add instructions.

153
00:06:29,640 –> 00:06:32,760
Answer only from these cites.

154
00:06:32,760 –> 00:06:34,560
Quote the source.

155
00:06:34,560 –> 00:06:36,800
If it’s not here, say you don’t know.

156
00:06:36,800 –> 00:06:39,360
That’s the leash. Generation last.

157
00:06:39,360 –> 00:06:41,280
Now the model speaks.

158
00:06:41,280 –> 00:06:42,480
But it’s grounded.

159
00:06:42,480 –> 00:06:44,320
It’s standing on your sources.

160
00:06:44,320 –> 00:06:45,800
It doesn’t riff from memory.

161
00:06:45,800 –> 00:06:48,840
It reasons with the pages you fed it seconds ago.

162
00:06:48,840 –> 00:06:51,280
The answer lands with receipts.

163
00:06:51,280 –> 00:06:52,280
Citations.

164
00:06:52,280 –> 00:06:53,800
No bluffing.
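
The leash can be sketched as a prompt builder. The passage schema (`source`, `version`, `text`) and the exact rule wording are assumptions for illustration; the episode only says the model must answer from the retrieved passages, quote the source, and admit when it doesn’t know.

```python
def build_prompt(question, passages):
    """passages: list of dicts with 'source', 'version', and 'text' keys (illustrative schema).

    Returns None when nothing was retrieved, so the caller can answer
    "I don't know" without calling the model at all.
    """
    if not passages:
        return None
    numbered = [
        f"[{i}] ({p['source']}, version {p['version']})\n{p['text']}"
        for i, p in enumerate(passages, start=1)
    ]
    rules = (
        "Answer only from the numbered passages below.\n"
        "Cite the passage number for every claim, like [1].\n"
        "If the passages do not contain the answer, reply exactly: I don't know."
    )
    return f"{rules}\n\n" + "\n\n".join(numbered) + f"\n\nQuestion: {question}"
```

Numbering the passages gives the model something concrete to cite, and gives you something to check the answer against afterwards.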

165
00:06:53,800 –> 00:06:55,880
The thing most people miss.

166
00:06:55,880 –> 00:06:58,440
RAG isn’t about shoving PDFs into a hungry mouth.

167
00:06:58,440 –> 00:06:59,680
It’s a supply chain.

168
00:06:59,680 –> 00:07:00,680
Data in.

169
00:07:00,680 –> 00:07:02,040
Chunks clean.

170
00:07:02,040 –> 00:07:03,600
Indexes tuned.

171
00:07:03,600 –> 00:07:05,160
Queries tight.

172
00:07:05,160 –> 00:07:06,680
Evaluation constant.

173
00:07:06,680 –> 00:07:08,040
Break any link.

174
00:07:08,040 –> 00:07:09,600
And the outputs rot.

175
00:07:09,600 –> 00:07:12,840
Why does this beat fine-tuning for business?

176
00:07:12,840 –> 00:07:14,960
Because policies move.

177
00:07:14,960 –> 00:07:16,960
S-O-P’s shift.

178
00:07:16,960 –> 00:07:18,560
Fields change.

179
00:07:18,560 –> 00:07:23,040
You don’t want to retrain a model every time procurement updates a line.

180
00:07:23,040 –> 00:07:25,120
With RAG, you just fix the library.

181
00:07:25,120 –> 00:07:26,120
Re-index.

182
00:07:26,120 –> 00:07:27,480
You keep the same engine.

183
00:07:27,480 –> 00:07:28,760
You change the fuel.

184
00:07:28,760 –> 00:07:31,320
Now, how does this flow in Azure streets?

185
00:07:31,320 –> 00:07:34,880
Azure AI Foundry gives you the scaffolding.

186
00:07:34,880 –> 00:07:40,440
You ingest from SharePoint stacks, web crawls, file shares, maybe databases if you map exports.

187
00:07:40,440 –> 00:07:43,480
You chunk with strategies that match the form.

188
00:07:43,480 –> 00:07:45,800
Headings matter for SOPs.

189
00:07:45,800 –> 00:07:48,320
Tables need careful parsing.

190
00:07:48,320 –> 00:07:53,400
You add metadata: version, owner, date, system. Then you vectorize.

191
00:07:53,400 –> 00:07:57,600
Embeddings turn text into numbers that remember intent.

192
00:07:57,600 –> 00:08:03,080
You store those vectors in Azure AI Search or a vector store that plays nice.

193
00:08:03,080 –> 00:08:04,080
That’s your index.

194
00:08:04,080 –> 00:08:05,080
Fast.

195
00:08:05,080 –> 00:08:06,080
Searchable.

196
00:08:06,080 –> 00:08:07,680
Ready when the question hits.

197
00:08:07,680 –> 00:08:09,920
When the question hits, the retriever goes to work.

198
00:08:09,920 –> 00:08:13,840
It finds the closest matches by meaning, not just keywords.

199
00:08:13,840 –> 00:08:16,040
You can do hybrid search too.

200
00:08:16,040 –> 00:08:19,960
Semantics plus text, because in this city, precision is survival.
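
One common way to merge a keyword ranking with a vector ranking is reciprocal rank fusion. The sketch below assumes each search has already returned its own ranked list of chunk ids; the episode does not say which fusion method its pipeline uses, and the constant `k=60` is a conventional default rather than a tuned value.

```python
def rrf_fuse(rankings, k=60):
    """rankings: list of ranked lists of chunk ids, best first.

    Each id scores 1 / (k + rank) per list it appears in; ids are
    returned ordered by their summed score, highest first.
    """
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda cid: scores[cid], reverse=True)
```

A chunk that both searches like rises above a chunk that only one search ranks first, which is the point of going hybrid.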

201
00:08:19,960 –> 00:08:21,760
You set strictness.

202
00:08:21,760 –> 00:08:23,120
Loose finds more.

203
00:08:23,120 –> 00:08:24,440
Risks noise.

204
00:08:24,440 –> 00:08:25,440
Tight finds less.

205
00:08:25,440 –> 00:08:26,440
Boosts trust.
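
The loose-versus-tight trade can be sketched as a score cutoff. This assumes strictness means a minimum similarity score; a managed service may expose it as a different kind of setting, so treat `min_score` as a hypothetical knob.

```python
def apply_strictness(scored, min_score):
    """Keep only (chunk_id, score) pairs at or above min_score."""
    return [(cid, score) for cid, score in scored if score >= min_score]


candidates = [("a", 0.91), ("b", 0.74), ("c", 0.52)]
loose = apply_strictness(candidates, 0.50)   # keeps all three, noise included
tight = apply_strictness(candidates, 0.80)   # keeps only the strongest match
```

When the tight setting returns nothing, that empty list is the signal to answer "I don't know" instead of guessing.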

206
00:08:26,440 –> 00:08:29,800
Dial it to your risk. Then you augment the prompt.

207
00:08:29,800 –> 00:08:33,240
You inject the retrieved chunks, clean and labeled.

208
00:08:33,240 –> 00:08:39,400
You set rules: cite sources, stay within content, no inventing.

209
00:08:39,400 –> 00:08:45,160
You pass that to the model you deployed. It doesn’t need to be exotic, just consistent.

210
00:08:45,160 –> 00:08:46,800
Now guardrails.

211
00:08:46,800 –> 00:08:48,720
You add “don’t know” behavior.

212
00:08:48,720 –> 00:08:50,240
You cap answer length.

213
00:08:50,240 –> 00:08:53,360
You require citations to render with the output.

214
00:08:53,360 –> 00:08:55,160
You log which chunks were used.

215
00:08:55,160 –> 00:09:00,800
You track latency, hit rates and nulls because a pipeline you can’t measure is a pipeline

216
00:09:00,800 –> 00:09:02,280
you can’t trust.
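
A minimal sketch of the measuring side, assuming one record per answered question. The `PipelineStats` class and its field names are illustrative; a production setup would more likely push these numbers to an existing logging or monitoring service.

```python
from dataclasses import dataclass, field


@dataclass
class PipelineStats:
    latencies_ms: list = field(default_factory=list)
    used_chunks: list = field(default_factory=list)  # chunk ids per query, for audit
    retrieval_hits: int = 0  # queries where at least one chunk was retrieved
    nulls: int = 0           # queries answered with "don't know"

    def record(self, latency_ms, chunk_ids, answered):
        self.latencies_ms.append(latency_ms)
        self.used_chunks.append(list(chunk_ids))
        if chunk_ids:
            self.retrieval_hits += 1
        if not answered:
            self.nulls += 1

    def summary(self):
        total = len(self.latencies_ms)
        if total == 0:
            return {"queries": 0}
        return {
            "queries": total,
            "avg_latency_ms": sum(self.latencies_ms) / total,
            "hit_rate": self.retrieval_hits / total,
            "null_rate": self.nulls / total,
        }
```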

217
00:09:02,280 –> 00:09:04,480
Common traps down here.

218
00:09:04,480 –> 00:09:06,560
Chunks too big.

219
00:09:06,560 –> 00:09:09,400
Model gets lost in the sprawl.

220
00:09:09,400 –> 00:09:13,520
Chunks too small: context shatters. No metadata:

221
00:09:13,520 –> 00:09:20,200
you can’t filter stale from fresh. Wrong embeddings for your language or domain:

222
00:09:20,200 –> 00:09:22,560
Retrieval returns pretty but wrong passages.

223
00:09:22,560 –> 00:09:24,160
No evaluation loop.

224
00:09:24,160 –> 00:09:26,800
Nobody checks if the top five actually answer the question.
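
Checking whether the top five actually answer the question can be sketched as a hit rate over a small labeled set. This assumes someone has written down, per test question, which chunk ids count as relevant; `retrieve` stands in for whatever retriever the pipeline exposes and is a hypothetical callable.

```python
def hit_rate_at_k(eval_set, retrieve, k=5):
    """eval_set: list of (question, set of relevant chunk ids).
    retrieve(question, k) must return a ranked list of chunk ids.

    Returns the share of questions with at least one relevant chunk in the top k.
    """
    if not eval_set:
        return 0.0
    hits = 0
    for question, relevant in eval_set:
        if relevant & set(retrieve(question, k)):
            hits += 1
    return hits / len(eval_set)
```

Run it after every re-index; a falling number points at the chunks or the embeddings before users start complaining.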

225
00:09:26,800 –> 00:09:29,280
The game changer nobody talks about.

226
00:09:29,280 –> 00:09:30,280
Feedback.

227
00:09:30,280 –> 00:09:32,120
You let users flag bad answers.

228
00:09:32,120 –> 00:09:33,920
You fix the chunk or the source.

229
00:09:33,920 –> 00:09:35,520
You re-index.

230
00:09:35,520 –> 00:09:36,840
Quality rises.

231
00:09:36,840 –> 00:09:38,080
Trust follows.

232
00:09:38,080 –> 00:09:41,240
If you remember nothing else, remember this.

233
00:09:41,240 –> 00:09:42,480
RAG makes the model local.

234
00:09:42,480 –> 00:09:43,800
It speaks in your dialect.

235
00:09:43,800 –> 00:09:45,200
It cites your law.

236
00:09:45,200 –> 00:09:47,400
It stops pretending.

237
00:09:47,400 –> 00:09:52,520
Because in this city, answers without sources are just noise in the rain.

238
00:09:52,520 –> 00:09:55,640
The Copilot advantage.

239
00:09:55,640 –> 00:09:57,640
General knowledge and speed.

240
00:09:57,640 –> 00:09:59,200
Copilot moves fast.

241
00:09:59,200 –> 00:10:00,880
That’s the point.

242
00:10:00,880 –> 00:10:02,760
You’re buried in noise.

243
00:10:02,760 –> 00:10:05,000
Mail flooding your Outlook alleys.

244
00:10:05,000 –> 00:10:07,360
Teams threads stacked like crates.

245
00:10:07,360 –> 00:10:08,360
Files you can see.

246
00:10:08,360 –> 00:10:10,120
Files you’re allowed to see.

247
00:10:10,120 –> 00:10:11,560
Copilot walks that beat with you.

248
00:10:11,560 –> 00:10:13,280
It reads the room.

249
00:10:13,280 –> 00:10:15,920
Drafts a reply that sounds like you.

250
00:10:15,920 –> 00:10:19,120
Pulls three relevant docs from your team site.

251
00:10:19,120 –> 00:10:21,440
Builds a meeting brief in seconds.

252
00:10:21,440 –> 00:10:24,160
Summarizes a chat war into clean bullet lines.

253
00:10:24,160 –> 00:10:25,480
You don’t hunt.

254
00:10:25,480 –> 00:10:26,480
You don’t stitch.

255
00:10:26,480 –> 00:10:28,080
You just ship.

256
00:10:28,080 –> 00:10:30,600
Because in this city, time kills.

257
00:10:30,600 –> 00:10:32,560
Copilot saves minutes per move.

258
00:10:32,560 –> 00:10:34,480
Add that up across a week.

259
00:10:34,480 –> 00:10:35,680
Across a team.

260
00:10:35,680 –> 00:10:37,160
Across a quarter. You feel the lift.

261
00:10:37,160 –> 00:10:39,120
Now, the reason it’s smooth.

262
00:10:39,120 –> 00:10:40,960
It’s identity-aware. It wears your badge.

263
00:10:40,960 –> 00:10:42,280
It respects your scope.

264
00:10:42,280 –> 00:10:44,120
It doesn’t break out of the precinct.

265
00:10:44,120 –> 00:10:46,040
No awkward permissions chase.

266
00:10:46,040 –> 00:10:47,680
No custom pipes to maintain.

267
00:10:47,680 –> 00:10:49,200
No embeddings to generate.

268
00:10:49,200 –> 00:10:52,120
It rides the Microsoft Graph like a subway map.

269
00:10:52,120 –> 00:10:53,200
Predictable.

270
00:10:53,200 –> 00:10:54,160
Governed.

271
00:10:54,160 –> 00:10:55,800
Quietly efficient.

272
00:10:55,800 –> 00:10:58,200
Drafting is where it shines.

273
00:10:58,200 –> 00:11:00,200
Cold email to warm intro.

274
00:11:00,200 –> 00:11:01,880
Rough notes to clean minutes.

275
00:11:01,880 –> 00:11:04,120
A messy deck turned tight.

276
00:11:04,120 –> 00:11:05,680
Rewrite in your tone.

277
00:11:05,680 –> 00:11:06,600
Fix spelling.

278
00:11:06,600 –> 00:11:07,720
Strip fluff.

279
00:11:07,720 –> 00:11:09,480
That’s breakfast work for Copilot.

280
00:11:09,480 –> 00:11:11,280
It’s also a decent scout.

281
00:11:11,280 –> 00:11:13,560
Show me related docs for this meeting.

282
00:11:13,560 –> 00:11:16,040
It maps your OneDrive and SharePoint lanes.

283
00:11:16,040 –> 00:11:18,120
It surfaces what’s already in reach.

284
00:11:18,120 –> 00:11:20,400
You pick, you move.

285
00:11:20,400 –> 00:11:22,680
And here’s the truth the tourists miss.

286
00:11:22,680 –> 00:11:26,520
Sometimes you just need good enough, a passable draft,

287
00:11:26,520 –> 00:11:28,400
a summary that gets you oriented,

288
00:11:28,400 –> 00:11:31,400
a quick check of what’s changed in a folder you own.

289
00:11:31,400 –> 00:11:32,720
These aren’t court cases.

290
00:11:32,720 –> 00:11:34,320
They’re errands.

291
00:11:34,320 –> 00:11:36,880
Copilot eats errands.

292
00:11:36,880 –> 00:11:39,520
Now, boundaries.

293
00:11:39,520 –> 00:11:43,560
Because down here in the undernet, speed can blind you.

294
00:11:43,560 –> 00:11:45,640
Copilot won’t require your knowledge.

295
00:11:45,640 –> 00:11:48,600
It won’t cross the fences into ERP vaults,

296
00:11:48,600 –> 00:11:53,440
or that legacy file share the last admin sealed with tape.

297
00:11:53,440 –> 00:11:56,800
It won’t enforce “answer only with citations” on your command.

298
00:11:56,800 –> 00:12:00,800
It won’t let you tune chunk sizes or run retrieval evaluations.

299
00:12:00,800 –> 00:12:04,480
It can pull what’s visible in your M365 lanes.

300
00:12:04,480 –> 00:12:06,840
Useful, but not surgical.

301
00:12:06,840 –> 00:12:08,520
So when do you stay with it?

302
00:12:08,520 –> 00:12:13,800
When the task lives in Outlook, Teams, SharePoint, OneDrive.

303
00:12:13,800 –> 00:12:18,560
When the answer is a draft, a summary, a rewrite, a quick list.

304
00:12:18,560 –> 00:12:21,000
When governance and simplicity matter,

305
00:12:21,000 –> 00:12:23,400
more than custom reach.

306
00:12:23,400 –> 00:12:27,240
When you don’t need strict grounding or cross-system joins.

307
00:12:27,240 –> 00:12:30,240
I watched a PM use it to prep a vendor call.

308
00:12:30,240 –> 00:12:33,000
30 messages, four files.

309
00:12:33,000 –> 00:12:37,360
She asked for a one-page brief with open issues and decisions.

310
00:12:37,360 –> 00:12:39,520
Copilot spat it out in under a minute.

311
00:12:39,520 –> 00:12:41,440
She tweaked three lines.

312
00:12:41,440 –> 00:12:42,240
Done.

313
00:12:42,240 –> 00:12:44,240
That’s the lane.

314
00:12:44,240 –> 00:12:47,880
The mistake is trying to make it a judge, a compliance oracle,

315
00:12:47,880 –> 00:12:49,560
a cross-system agent.

316
00:12:49,560 –> 00:12:53,840
You ask it about a policy that changed last month in a PDF it can’t see.

317
00:12:53,840 –> 00:12:56,480
It answers smooth, generic, and wrong.

318
00:12:56,480 –> 00:12:59,600
You won’t spot the fracture until the ticket queue swells.

319
00:12:59,600 –> 00:13:00,680
We’ve seen that movie.

320
00:13:00,680 –> 00:13:02,280
Use the runner for what it is.

321
00:13:02,280 –> 00:13:05,520
Fast, local, polite with your time.

322
00:13:05,520 –> 00:13:08,080
Once you nail that, everything else clicks.

323
00:13:08,080 –> 00:13:09,480
You don’t overreach.

324
00:13:09,480 –> 00:13:11,240
You don’t overtrust.

325
00:13:11,240 –> 00:13:14,160
You keep the errands light and the stakes low.

326
00:13:14,160 –> 00:13:18,360
And when the question demands proof, you switch tools.

327
00:13:18,360 –> 00:13:20,800
Because in this city, speed matters.

328
00:13:20,800 –> 00:13:22,440
But truth wins.

329
00:13:22,440 –> 00:13:27,440
The RAG necessity: when proprietary data is king.

330
00:13:27,440 –> 00:13:29,640
Some questions wear badges.

331
00:13:29,640 –> 00:13:33,440
Proprietary, high stakes, no guesses allowed.

332
00:13:33,440 –> 00:13:36,280
That’s when the librarian steps in.

333
00:13:36,280 –> 00:13:37,400
RAG.

334
00:13:37,400 –> 00:13:40,640
You’ve got policies outside the M365 glow.

335
00:13:40,640 –> 00:13:44,280
Device baselines buried in stale PDFs.

336
00:13:44,280 –> 00:13:48,360
Onboarding rules half in SharePoint, half on a file server.

337
00:13:48,360 –> 00:13:51,160
SOPs that live as Word, wiki, and rumor.

338
00:13:51,160 –> 00:13:53,440
Copilot can’t patrol those alleys.

339
00:13:53,440 –> 00:13:55,080
RAG can.

340
00:13:55,080 –> 00:13:56,960
You build the pipeline.

341
00:13:56,960 –> 00:13:58,600
Ingest the mess.

342
00:13:58,600 –> 00:14:01,560
Chunk the docs to match how people ask.

343
00:14:01,560 –> 00:14:03,040
Headings with steps.

344
00:14:03,040 –> 00:14:05,360
Tables preserved, not mangled.

345
00:14:05,360 –> 00:14:08,120
Metadata stamped: owner, version, date, system, sensitivity.

346
00:14:08,120 –> 00:14:12,960
Then vectors. Embeddings turn language into coordinates; meaning

347
00:14:12,960 –> 00:14:15,560
survives paraphrase.

348
00:14:15,560 –> 00:14:20,720
Azure AI Search holds the map: fast nearest-neighbor, hybrid

349
00:14:20,720 –> 00:14:23,560
with semantics when keywords help.

350
00:14:23,560 –> 00:14:26,840
Now the question hits: which device hardening baseline

351
00:14:26,840 –> 00:14:30,560
applies to contractors on macOS, Q3 revision?

352
00:14:30,560 –> 00:14:34,120
The retriever hunts nearest chunks by meaning, filters

353
00:14:34,120 –> 00:14:39,480
by version equals Q3, owner equals security, region equals global,

354
00:14:39,480 –> 00:14:41,560
strictness tuned to avoid noise.
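
The filter half of that query can be sketched as an exact match on metadata before any ranking happens. The chunk layout and the key names (`version`, `owner`, `region`) mirror the example in the episode but are otherwise assumptions; a hosted search service would express the same thing in its own filter syntax.

```python
def filter_chunks(chunks, **required):
    """Keep chunks whose metadata matches every required key/value pair."""
    return [
        chunk for chunk in chunks
        if all(chunk["metadata"].get(key) == value for key, value in required.items())
    ]


chunks = [
    {"id": "base-q3", "metadata": {"version": "Q3", "owner": "security", "region": "global"}},
    {"id": "base-q2", "metadata": {"version": "Q2", "owner": "security", "region": "global"}},
    {"id": "travel-q3", "metadata": {"version": "Q3", "owner": "finance", "region": "global"}},
]
in_scope = filter_chunks(chunks, version="Q3", owner="security", region="global")
```

Filtering first means the stale Q2 baseline never gets the chance to out-rank the current one on wording alone.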

355
00:14:41,560 –> 00:14:43,560
Three passages come home.

356
00:14:43,560 –> 00:14:44,840
You package them.

357
00:14:44,840 –> 00:14:47,760
You say: answer only from these cites.

358
00:14:47,760 –> 00:14:52,480
If missing, say you don’t know. Receipts required.

359
00:14:52,480 –> 00:14:54,480
The model speaks, grounded.

360
00:14:54,480 –> 00:14:55,960
It quotes the clause.

361
00:14:55,960 –> 00:14:57,040
It links the source.

362
00:14:57,040 –> 00:14:58,680
It names the revision.

363
00:14:58,680 –> 00:15:01,000
No riff, just law.

364
00:15:01,000 –> 00:15:04,360
Policy and compliance Q&A is built for this.

365
00:15:04,360 –> 00:15:06,480
Employees stop guessing.

366
00:15:06,480 –> 00:15:09,960
They stop pinging the desk for the same 12 questions.

367
00:15:09,960 –> 00:15:12,160
Citations build trust.

368
00:15:12,160 –> 00:15:14,560
If a doc is wrong, you fix the source.

369
00:15:14,560 –> 00:15:17,280
Re-index. The answer changes tomorrow.

370
00:15:17,280 –> 00:15:18,440
No retraining loop.

371
00:15:18,440 –> 00:15:19,680
That’s power.

372
00:15:19,680 –> 00:15:24,840
SOPs next: manufacturing, IT operations, HR workflows.

373
00:15:24,840 –> 00:15:26,240
These aren’t poems.

374
00:15:26,240 –> 00:15:28,040
They’re sequences.

375
00:15:28,040 –> 00:15:30,720
RAG turns them into step-by-step guidance.

376
00:15:30,720 –> 00:15:32,640
Chunk by heading and step number.

377
00:15:32,640 –> 00:15:34,280
Preserve warnings.

378
00:15:34,280 –> 00:15:36,720
Include preconditions.

379
00:15:36,720 –> 00:15:41,680
At query time, retrieve the exact step and its guardrails.

380
00:15:41,680 –> 00:15:44,560
Ask the model to render a checklist, not a story.
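
Chunking by heading and step number can be sketched like this, assuming the SOP has been exported to plain text with `#`-style headings and numbered steps such as `1.` or `1)`. Real SOPs in Word or PDF need a parsing stage first, and both the heading marker and the output keys are illustrative.

```python
import re

STEP = re.compile(r"^(\d+)[.)]\s+(.*)$")


def chunk_sop(text):
    """Split a plain-text SOP into one chunk per numbered step, tagged with its heading."""
    chunks, heading = [], None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            heading = line.lstrip("#").strip()
            continue
        match = STEP.match(line)
        if match:
            chunks.append({"heading": heading, "step": int(match.group(1)),
                           "text": match.group(2)})
        elif chunks and chunks[-1]["heading"] == heading:
            # Continuation lines such as warnings stay with the step they follow.
            chunks[-1]["text"] += " " + line
        else:
            # Text before the first step (for example preconditions) becomes step 0.
            chunks.append({"heading": heading, "step": 0, "text": line})
    return chunks
```

Keeping the warning inside the step it belongs to is what lets the retriever bring back the step and its guardrail together.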

381
00:15:44,560 –> 00:15:45,920
You get action, not vibes.

382
00:15:45,920 –> 00:15:49,960
Then CRM and ERP context: Dynamics, SAP, Salesforce.

383
00:15:49,960 –> 00:15:52,560
Copilot can’t reach the transaction guts.

384
00:15:52,560 –> 00:15:55,360
RAG can unify the narrative.

385
00:15:55,360 –> 00:15:58,200
Embed release notes, field dictionaries, integration

386
00:15:58,200 –> 00:16:02,520
wikis. Add tools for live lookups: read-only APIs,

387
00:16:02,520 –> 00:16:05,160
status checks, inventory pulls.

388
00:16:05,160 –> 00:16:07,960
The model retrieves the spec, calls the tool,

389
00:16:07,960 –> 00:16:10,120
and explains the result with cites.

390
00:16:10,120 –> 00:16:12,000
Now the agent doesn’t invent.

391
00:16:12,000 –> 00:16:13,720
It confirms.
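
A minimal sketch of a read-only tool lane, assuming tools are plain functions registered on an allowlist. The `inventory_level` function, its stub data, and the SKU are hypothetical stand-ins for a real ERP lookup; no actual Dynamics, SAP, or Salesforce API is being called here.

```python
READ_ONLY_TOOLS = {}


def tool(name):
    """Register a function on the read-only allowlist under the given name."""
    def register(fn):
        READ_ONLY_TOOLS[name] = fn
        return fn
    return register


@tool("inventory_level")
def inventory_level(sku):
    # Hypothetical stub: a real version would call a read-only ERP endpoint.
    stock = {"BOLT-M8": 1200}
    return {"sku": sku, "on_hand": stock.get(sku), "source": "erp:inventory (stub)"}


def call_tool(name, **args):
    """Run an allowlisted tool; anything not registered is refused."""
    if name not in READ_ONLY_TOOLS:
        raise KeyError(f"unknown or non-allowlisted tool: {name}")
    return READ_ONLY_TOOLS[name](**args)
```

Returning a `source` field alongside the value lets the answer cite the live lookup the same way it cites a document.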

392
00:16:13,720 –> 00:16:16,320
This is where proprietary data rules.

393
00:16:16,320 –> 00:16:17,600
You need control.

394
00:16:17,600 –> 00:16:21,600
Control of chunk sizes and overlap, so meaning holds.

395
00:16:21,600 –> 00:16:24,800
Control of retrieval filters to lock scope.

396
00:16:24,800 –> 00:16:27,720
Control of grounding to force citations.

397
00:16:27,720 –> 00:16:32,960
Control of tools to fetch live truth. And governance.

398
00:16:32,960 –> 00:16:35,080
Foundry gives you safe lanes.

399
00:16:35,080 –> 00:16:36,680
Data boundaries.

400
00:16:36,680 –> 00:16:38,840
Role-based access.

401
00:16:38,840 –> 00:16:40,680
Versioned indexes.

402
00:16:40,680 –> 00:16:42,280
Monitored runs.

403
00:16:42,280 –> 00:16:47,520
Responsible AI hooks so you can trace why an answer said what it said.

404
00:16:47,520 –> 00:16:51,160
Leaders sleep better when the chain of custody is clear.

405
00:16:51,160 –> 00:16:54,920
Cost and complexity: know the shape.

406
00:16:54,920 –> 00:17:00,160
Azure AI Search carries the index, tiered by traffic;

407
00:17:00,160 –> 00:17:02,720
hybrid search helps accuracy.

408
00:17:02,720 –> 00:17:05,400
Embeddings cost per thousand tokens.

409
00:17:05,400 –> 00:17:07,320
Batch at ingestion.

410
00:17:07,320 –> 00:17:10,240
Re-embed only changed chunks.
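
Re-embedding only what changed can be sketched with content hashes. This assumes each chunk has a stable id and that the hash from the previous run is stored next to its vector; the function names are illustrative.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a chunk's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_reembedding(chunks, stored_hashes):
    """chunks: {chunk_id: text}; stored_hashes: {chunk_id: hash from the last run}.

    Returns (ids to embed, ids to delete from the index).
    """
    to_embed = [cid for cid, text in chunks.items()
                if stored_hashes.get(cid) != content_hash(text)]
    to_delete = [cid for cid in stored_hashes if cid not in chunks]
    return to_embed, to_delete
```

Unchanged chunks keep their old vectors, so the embedding bill tracks how much the library moved rather than how big it is.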

411
00:17:10,240 –> 00:17:14,280
Model hosting depends on traffic and context size.

412
00:17:14,280 –> 00:17:16,080
Keep prompts tight.

413
00:17:16,080 –> 00:17:18,360
Cite only what’s needed.

414
00:17:18,360 –> 00:17:19,840
Storage is cheap.

415
00:17:19,840 –> 00:17:21,760
Bad indexing isn’t.

416
00:17:21,760 –> 00:17:25,160
Plan your fields, plan your filters.

417
00:17:25,160 –> 00:17:27,440
When is RAG not optional?

418
00:17:27,440 –> 00:17:29,840
When correctness beats speed.

419
00:17:29,840 –> 00:17:33,000
When answers must cite chapter and verse.

420
00:17:33,000 –> 00:17:36,800
When knowledge lives beyond M365.

421
00:17:36,800 –> 00:17:40,880
When workflows require tools to act, not just speak.

422
00:17:40,880 –> 00:17:46,040
When you need repeatability, same question, same answer, same source.

423
00:17:46,040 –> 00:17:49,680
I walked a tenant that was bleeding data: policy scattered,

424
00:17:49,680 –> 00:17:53,320
dupes everywhere. Teams asked Copilot for clarity.

425
00:17:53,320 –> 00:17:57,440
It smiled and guessed. Good tone, bad facts.

426
00:17:57,440 –> 00:18:00,200
Tickets stacked like bodies in the alley.

427
00:18:00,200 –> 00:18:04,480
We built the pipeline. Indexed across SharePoint and file servers,

428
00:18:04,480 –> 00:18:08,360
trashed the dupes, tagged the truth, forced citations,

429
00:18:08,360 –> 00:18:10,840
set “don’t know” as a badge of honor.

430
00:18:10,840 –> 00:18:13,920
Service desk load dropped, trust climbed.

431
00:18:13,920 –> 00:18:18,080
Not because the model got smarter, because the library did.

432
00:18:18,080 –> 00:18:19,920
And this one matters.

433
00:18:19,920 –> 00:18:22,960
RAG is not a feature you toggle on Tuesdays.

434
00:18:22,960 –> 00:18:27,320
It’s a discipline: sources owned, pipelines monitored,

435
00:18:27,320 –> 00:18:30,840
evaluations weekly, users in the loop.

436
00:18:30,840 –> 00:18:34,520
You measure retrieval hit rate, you inspect top-k quality,

437
00:18:34,520 –> 00:18:36,720
you track “don’t know” and fix the gap.

438
00:18:36,720 –> 00:18:37,920
Quality is a habit.

439
00:18:37,920 –> 00:18:41,480
So when proprietary data runs the show, you pick the librarian,

440
00:18:41,480 –> 00:18:44,520
you build the pipes, you demand receipts.

441
00:18:44,520 –> 00:18:47,800
Because in this city, your knowledge is the currency.

442
00:18:47,800 –> 00:18:53,120
Guard it, index it, retrieve it clean, then let the model speak,

443
00:18:53,120 –> 00:18:57,880
and stand by it. Case study: global manufacturing company,

444
00:18:57,880 –> 00:19:01,000
anonymized. The tenant was humming.

445
00:19:01,000 –> 00:19:04,920
A global manufacturer, plants on three continents,

446
00:19:04,920 –> 00:19:09,320
policies stacked like sheet metal. They wanted truth on demand.

447
00:19:09,320 –> 00:19:12,080
Not vibes, not guesses.

448
00:19:12,080 –> 00:19:14,840
The service desk was drowning in repeat questions,

449
00:19:14,840 –> 00:19:17,760
compliance was a rumor, documents fought each other

450
00:19:17,760 –> 00:19:21,520
in the dark. They tried going faster with generic tools:

451
00:19:21,520 –> 00:19:25,320
speed without ground. It backfired.

452
00:19:25,320 –> 00:19:30,720
So we built a librarian: private, quiet, Azure streets,

453
00:19:30,720 –> 00:19:35,720
RAG as the spine, indexes with teeth, citations mandatory,

454
00:19:35,720 –> 00:19:41,040
a Teams doorway. Ask, get the clause, see the source.

455
00:19:41,040 –> 00:19:44,240
Confidence returned, tickets fell,

456
00:19:44,240 –> 00:19:47,640
leadership finally saw the shape of their own rules,

457
00:19:47,640 –> 00:19:51,920
and believed them. Before, without RAG: the pain points.

458
00:19:51,920 –> 00:19:53,760
It started ugly.

459
00:19:53,760 –> 00:19:57,240
4,800 policy files scattered like rust,

460
00:19:57,240 –> 00:20:02,000
SharePoint towers, old file servers, email attachments,

461
00:20:02,000 –> 00:20:07,000
masquerading as truth, unlabeled, duplicated, stale.

462
00:20:07,000 –> 00:20:11,040
Employees walked in with the same 12 questions,

463
00:20:11,040 –> 00:20:16,040
security, devices, onboarding, travel allowances,

464
00:20:16,040 –> 00:20:21,040
12 to 15 hits a day on the desk every day.

465
00:20:21,040 –> 00:20:26,040
Each one costing five to seven minutes of hunt-and-peck search,

466
00:20:26,040 –> 00:20:31,040
keyword roulette, open a PDF, skim, hope the date isn’t lying,

467
00:20:31,040 –> 00:20:35,040
open the twin, different wording, which one wins?

468
00:20:35,040 –> 00:20:36,040
Nobody knew.

469
00:20:36,040 –> 00:20:38,040
Copilot helped in the shallow lanes.

470
00:20:38,040 –> 00:20:42,040
It could find what the employee already had rights to in M365,

471
00:20:42,040 –> 00:20:45,040
it summarized, it drafted, it saved seconds,

472
00:20:45,040 –> 00:20:49,040
but down here the signal lived outside the glow,

473
00:20:49,040 –> 00:20:52,040
the correct baseline sat in a PDF on a file share,

474
00:20:52,040 –> 00:20:55,040
the update lived in a wiki the team forgot to publish,

475
00:20:55,040 –> 00:20:59,040
a meeting note contradicted both, people asked,

476
00:20:59,040 –> 00:21:03,040
the system guessed, nice tone, bad facts,

477
00:21:03,040 –> 00:21:08,040
the fallout, errors in the field, wrong device hardening steps,

478
00:21:08,040 –> 00:21:13,040
onboarding detours, policy exceptions issued on the wrong revision,

479
00:21:13,040 –> 00:21:16,040
the service desk became referee and archaeologist,

480
00:21:16,040 –> 00:21:20,040
trust bled out in small cuts, the cost wasn’t just minutes,

481
00:21:20,040 –> 00:21:23,040
it was rework, repeat tickets, and risk,

482
00:21:23,040 –> 00:21:26,040
and every fresh hire learned a bad truth,

483
00:21:26,040 –> 00:21:29,040
finding policy was slower than ignoring it,

484
00:21:29,040 –> 00:21:32,040
that’s how tenants bleed quietly in the paperwork alleys,

485
00:21:32,040 –> 00:21:37,040
no scandal, just drag. After, with the Azure RAG solution:

486
00:21:37,040 –> 00:21:40,040
the transformation. We turned on a light,

487
00:21:40,040 –> 00:21:44,040
all policies and SOPs flowed into Azure AI Search,

488
00:21:44,040 –> 00:21:49,040
no magic, just discipline: crawl SharePoint,

489
00:21:49,040 –> 00:21:52,040
sweep the file servers, stage the sources,

490
00:21:52,040 –> 00:21:56,040
chunk each document by heading and clause, preserve tables,

491
00:21:56,040 –> 00:22:01,040
tag every shard with owner, version, effective date,

492
00:22:01,040 –> 00:22:06,040
system, sensitivity, then embeddings,

493
00:22:06,040 –> 00:22:10,040
vectors that remember meaning when words change,

494
00:22:10,040 –> 00:22:14,040
hybrid search, wired for speed and precision,
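
The chunk-and-tag step described above might look something like the sketch below. It assumes a simplified plain-text layout that the episode does not specify: headings are lines starting with `# ` and clauses start with a dotted number such as `4.3`. The `Chunk` class and `chunk_policy` function are hypothetical names, and table preservation is not handled here.

```python
import re
from dataclasses import dataclass

# Assumed clause marker: a dotted number at the start of a line, e.g. "4.3 ..."
CLAUSE = re.compile(r"^(\d+(?:\.\d+)+)\s+(.*)$")


@dataclass
class Chunk:
    heading: str
    clause: str
    text: str
    owner: str
    version: str
    effective_date: str
    system: str
    sensitivity: str


def chunk_policy(doc, **tags):
    """Split a document into one chunk per clause and stamp every chunk
    with the same metadata tags (owner, version, effective date, ...)."""
    chunks = []
    heading = ""
    current = None
    for raw in doc.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# "):
            heading = line[2:]
            current = None          # a new heading closes the open clause
            continue
        match = CLAUSE.match(line)
        if match:
            current = Chunk(heading=heading, clause=match.group(1),
                            text=match.group(2), **tags)
            chunks.append(current)
        elif current is not None:
            current.text += " " + line   # continuation of the open clause
        # text that appears before any clause is dropped in this sketch
    return chunks
```

Keeping the clause number on every chunk is what later lets an answer point at a specific clause and revision rather than at a whole file.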

495
00:22:14,040 –> 00:22:18,040
the librarian woke up, a Teams agent became the doorway,

496
00:22:18,040 –> 00:22:21,040
employees asked the same questions,

497
00:22:21,040 –> 00:22:25,040
the retriever hunted by meaning, then filtered by version and owner,

498
00:22:25,040 –> 00:22:28,040
top passages returned with receipts,
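
A toy version of that retrieval step, combining a keyword ranking with a vector ranking and restricting results by version and owner, is sketched below. It is not the Azure AI Search API: the chunk dictionaries, the tiny hand-made vectors, and the function names are assumptions for illustration, and reciprocal rank fusion is used here simply as one common way to merge two rankings. For simplicity the metadata filter is applied before ranking rather than after.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Number of distinct query words that also occur in the text."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked id lists; earlier positions contribute more."""
    scores = {}
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] = scores.get(chunk_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def hybrid_search(query, query_vec, chunks, owner, version, top_k=3):
    """Return the ids of the best chunks for the given owner and version."""
    allowed = [c for c in chunks
               if c["owner"] == owner and c["version"] == version]
    by_keyword = [c["id"] for c in sorted(
        allowed, key=lambda c: keyword_score(query, c["text"]), reverse=True)]
    by_vector = [c["id"] for c in sorted(
        allowed, key=lambda c: cosine(query_vec, c["vector"]), reverse=True)]
    return reciprocal_rank_fusion([by_keyword, by_vector])[:top_k]
```

In a real pipeline the query vector would come from the same embedding model used at indexing time; here it is passed in directly so the example stays self-contained.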

499
00:22:28,040 –> 00:22:31,040
we wrapped the prompt with hard rules,

500
00:22:31,040 –> 00:22:33,040
answer only from these cites,

501
00:22:33,040 –> 00:22:37,040
quote the source, if missing, say you don’t know,
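
Those three hard rules can be wrapped around the retrieved passages with a small prompt builder. The sketch below is one possible shape, under assumptions: the exact rule wording, the `[S1]` citation format, and the `source` and `text` keys on each passage are invented for the example and are not quoted from the project.

```python
def build_grounded_prompt(question, passages):
    """Assemble a prompt that restricts the model to the given passages.

    Each passage is a dict with hypothetical keys 'source' and 'text'.
    """
    rules = (
        "Answer only from the numbered sources below.\n"
        "Quote the source id for every claim, like [S1].\n"
        "If the sources do not contain the answer, reply exactly: I don't know."
    )
    if passages:
        sources = "\n".join(
            f"[S{i}] ({p['source']}) {p['text']}"
            for i, p in enumerate(passages, start=1)
        )
    else:
        # Nothing retrieved: say so explicitly so the opt-out rule applies
        sources = "(no sources retrieved)"
    return f"{rules}\n\nSources:\n{sources}\n\nQuestion: {question}\nAnswer:"
```

Putting the opt-out sentence in the rules, and stating plainly when nothing was retrieved, gives the model a clear path to "I don't know" instead of a guess.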

502
00:22:37,040 –> 00:22:40,040
the model spoke like a clerk with a case file,

503
00:22:40,040 –> 00:22:43,040
concise, grounded. Two seconds, not seven minutes,

504
00:22:43,040 –> 00:22:45,040
load on the desk dropped by a third,

505
00:22:45,040 –> 00:22:47,040
not because answers were flashy,

506
00:22:47,040 –> 00:22:49,040
because they were consistent,

507
00:22:49,040 –> 00:22:51,040
contradictions surfaced as alerts,

508
00:22:51,040 –> 00:22:54,040
two PDFs claiming different baselines,

509
00:22:54,040 –> 00:22:58,040
flagged, owners notified, fix the library,

510
00:22:58,040 –> 00:23:02,040
reindex, tomorrow’s answers aligned,
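
The contradiction alert described here can be illustrated with a very small check. The sketch assumes the hard part is already done: each document has been reduced to claims with a normalized `topic` and `value`. Those field names, and the `find_contradictions` helper, are hypothetical; how the real system extracted comparable claims is not stated in the episode.

```python
from collections import defaultdict


def find_contradictions(claims):
    """Group claims by topic and flag topics with more than one distinct value.

    Each claim is a dict with assumed keys: topic, value, source, owner.
    """
    by_topic = defaultdict(list)
    for claim in claims:
        by_topic[claim["topic"]].append(claim)

    alerts = []
    for topic, group in by_topic.items():
        if len({c["value"] for c in group}) > 1:
            alerts.append({
                "topic": topic,
                "sources": sorted(c["source"] for c in group),
                "owners": sorted({c["owner"] for c in group}),  # who to notify
            })
    return alerts
```

Once an owner corrects or retires the stale file and the index is rebuilt, the same check over the new claims returns no alert for that topic.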

511
00:23:02,040 –> 00:23:06,040
no retraining loop, no waiting on model updates,

512
00:23:06,040 –> 00:23:09,040
just fresher truth, people trusted the machine again,

513
00:23:09,040 –> 00:23:12,040
not because it was smart, because it was verifiable,

514
00:23:12,040 –> 00:23:14,040
every answer carried a source,

515
00:23:14,040 –> 00:23:17,040
the agent didn’t bluff, it opted out when blind,

516
00:23:17,040 –> 00:23:20,040
that small honesty turned users into partners,

517
00:23:20,040 –> 00:23:23,040
they reported gaps, we patched sources,

518
00:23:23,040 –> 00:23:27,040
the librarian got sharper, the city got quieter,

519
00:23:27,040 –> 00:23:31,040
Credibility boosters: why RAG wins on trust and accuracy.

520
00:23:31,040 –> 00:23:33,040
here’s the thing most leaders miss,

521
00:23:33,040 –> 00:23:36,040
speed without proof is theatre,

522
00:23:36,040 –> 00:23:39,040
in policy work, tone isn’t truth,

523
00:23:39,040 –> 00:23:44,040
RAG forces receipts, citations aren’t a nice-to-have,

524
00:23:44,040 –> 00:23:46,040
they’re the contract,

525
00:23:46,040 –> 00:23:49,040
when the answer links to clause 4.3,

526
00:23:49,040 –> 00:23:52,040
revision Q3 owned by security,

527
00:23:52,040 –> 00:23:56,040
the debate ends, people stop arguing with each other,

528
00:23:56,040 –> 00:24:00,040
they argue with the source, and that’s fixable,

529
00:24:00,040 –> 00:24:03,040
the biggest win wasn’t speed, it was accuracy,

530
00:24:03,040 –> 00:24:06,040
you’ll hear that line from the floor,

531
00:24:06,040 –> 00:24:08,040
because once the librarian stands up,

532
00:24:08,040 –> 00:24:11,040
employees stop second-guessing the clerk at the window,

533
00:24:11,040 –> 00:24:13,040
they click the source, they see the date,

534
00:24:13,040 –> 00:24:16,040
they move with confidence, that’s how you erase

535
00:24:16,040 –> 00:24:18,040
the quiet drag that kills quarters,

536
00:24:18,040 –> 00:24:22,040
users trusted the answers more because citations were mandatory,

537
00:24:22,040 –> 00:24:24,040
trust isn’t about personality,

538
00:24:24,040 –> 00:24:26,040
it’s about auditability,

539
00:24:26,040 –> 00:24:30,040
mandatory citations make every response traceable,

540
00:24:30,040 –> 00:24:32,040
it also makes QA measurable,

541
00:24:32,040 –> 00:24:36,040
you can test retrieval, did the top passages actually answer the question?

542
00:24:36,040 –> 00:24:39,040
If not, fix chunks or tags,

543
00:24:39,040 –> 00:24:43,040
evaluate again, quality climbs,

544
00:24:43,040 –> 00:24:47,040
the IT department didn’t need to retrain a single model,

545
00:24:47,040 –> 00:24:50,040
just structured their data,

546
00:24:50,040 –> 00:24:54,040
that line matters to budgets, fine-tuning sounds heroic,

547
00:24:54,040 –> 00:24:57,040
it’s also slow and brittle for policy work,

548
00:24:57,040 –> 00:25:00,040
policies evolve, SOPs shift,

549
00:25:00,040 –> 00:25:02,040
with RAG the engine stays put,

550
00:25:02,040 –> 00:25:04,040
the fuel changes,

551
00:25:04,040 –> 00:25:06,040
reindex changed chunks,

552
00:25:06,040 –> 00:25:08,040
keep embeddings current,

553
00:25:08,040 –> 00:25:10,040
no six-week model cycles,

554
00:25:10,040 –> 00:25:13,040
no vendor lock to a training pipeline,

555
00:25:13,040 –> 00:25:15,040
you can’t control,

556
00:25:15,040 –> 00:25:17,040
and governance rocks steady.

557
00:25:17,040 –> 00:25:21,040
Azure AI Foundry gives you lanes.

558
00:25:21,040 –> 00:25:23,040
Identity through Entra,

559
00:25:23,040 –> 00:25:25,040
role-based access,

560
00:25:25,040 –> 00:25:27,040
data stays in the tenant’s shadow,

561
00:25:27,040 –> 00:25:29,040
versioned indexes,

562
00:25:29,040 –> 00:25:31,040
monitoring on latency,

563
00:25:31,040 –> 00:25:33,040
hit rate, nulls, citations,

564
00:25:33,040 –> 00:25:37,040
you can show a chain of custody from question to source,

565
00:25:37,040 –> 00:25:42,040
responsible AI hooks carry the paperwork you need when someone asks,

566
00:25:42,040 –> 00:25:44,040
why did it say that?
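
One way to picture that chain of custody is a per-answer audit record that ties the question to the index version, the passages retrieved, and the sources cited. The sketch below is an illustration only: `AuditRecord`, its fields, and the helper functions are hypothetical and do not represent an Azure AI Foundry schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AuditRecord:
    """Everything needed to answer 'why did it say that?' (assumed fields)."""
    user: str
    question: str
    index_version: str
    retrieved_ids: list   # chunk ids handed to the model
    cited_ids: list       # chunk ids the answer actually cited
    answer: str
    latency_ms: float


def to_log_line(record):
    """Serialize the record as one JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


def unsupported_citations(record):
    """Cited ids that were never retrieved: a sign the answer is not grounded."""
    retrieved = set(record.retrieved_ids)
    return [cid for cid in record.cited_ids if cid not in retrieved]
```

Counting records with an empty `cited_ids` list or a non-empty `unsupported_citations` result is a simple way to monitor the citation and null rates mentioned above.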

567
00:25:44,040 –> 00:25:48,040
In short, RAG doesn’t pretend to know; it proves what it knows,

568
00:25:48,040 –> 00:25:50,040
that’s why it wins.

569
00:25:50,040 –> 00:25:52,040
Choosing your AI strategy,

570
00:25:52,040 –> 00:25:54,040
here’s the map in one line,

571
00:25:54,040 –> 00:25:58,040
Copilot is the runner for your M365 streets,

572
00:25:58,040 –> 00:26:01,040
RAG is the librarian for your law,

573
00:26:01,040 –> 00:26:03,040
use the runner for drafts, summaries,

574
00:26:03,040 –> 00:26:06,040
and quick pulls inside the district.

575
00:26:06,040 –> 00:26:09,040
Bring the librarian when correctness, citations,

576
00:26:09,040 –> 00:26:11,040
and cross-system truth matter.

577
00:26:11,040 –> 00:26:14,040
If you’re ready to build that pipeline, subscribe,

578
00:26:14,040 –> 00:26:18,040
then watch the next episode where we blueprint a minimal RAG flow,

579
00:26:18,040 –> 00:26:20,040
costs and guardrails.

580
00:26:20,040 –> 00:26:22,040
Make the call, pick the lane, move.




