What To Do When Your AI’s Guardrails Fail

I want to talk about the Microsoft 365 Copilot bug. Not because it was exceptional, but because what it exposed should change how every organization architects AI governance. For weeks at the beginning of the year, Microsoft 365 Copilot read and summarized confidential emails despite sensitivity labels and Data Loss Prevention policies being correctly configured to block that behavior. The bug, tracked as CW1226324, affected emails in users’ Sent Items and Drafts folders. Legal communications. Business agreements. Protected health information. All processed by an AI that organizational policies explicitly stated should never touch them.

Microsoft’s response was that users only accessed information they were already authorized to see. This may be technically accurate; after all, Copilot operates within the user’s mailbox context. However, the sensitivity labels weren’t there to stop users from reading their own email; they were there to stop the AI from processing confidential content. The AI processed it anyway.

A lack of safety net

The incident made it painfully clear that every control designed to keep Copilot away from confidential data (sensitivity labels, DLP policies, access restrictions) lived inside the same platform as Copilot itself. So when a code error hit, all the controls failed at once. There was no independent layer to catch it, no second chance, no safety net.

We wouldn’t ever design physical security this way. Nobody would build a vault where the door lock, alarm, and surveillance cameras all run through a single circuit breaker. But that’s what happened here. Microsoft was the AI provider, the security control provider, and the only entity with visibility into whether those controls were working.

When the platform broke, organizations had no independent way to detect the failure. They simply found out weeks later. Not through their own monitoring, but through Microsoft’s service advisory.

This is an industry architecture problem

To be clear, I’m not writing this to single out Microsoft. Copilot is a powerful tool, and code bugs happen to every vendor. Plus, the team identified the issue and rolled out a fix quickly. The problem isn’t that Microsoft had a bug, but that the architecture turned that single bug into a complete governance failure with no independent detection for weeks.

Whether it’s Copilot, Google Gemini for Workspace, Salesforce Einstein, or any other enterprise AI tool, the typical model is the same. The AI platform provides the governance controls, and organizations trust those controls to work. When they don’t, there’s nothing underneath.

The World Economic Forum’s 2026 Global Cybersecurity Outlook quantified this gap: data leaks through generative AI are now the top cybersecurity concern of CEOs, cited by almost a third. According to prior WEF surveys, the concern ranks even higher among cybersecurity professionals. Yet roughly one-third of organizations still have no process to validate AI security before deployment.

The WEF report also warned that without strong governance, AI agents can accumulate excessive privileges or propagate errors at scale. It therefore recommended continuous verification, audit trails, and zero-trust principles that treat every AI interaction as untrusted by default. The Copilot incident certainly backs that up.

These aren’t theoretical concerns

Of course, the ramifications of such a bug can be huge. If Copilot processes emails containing protected health information, organizations may need to assess whether this constitutes a reportable breach under HIPAA or similar laws. The question isn’t whether the user was authorized; it’s whether the AI’s processing was authorized under the business associate agreement. Microsoft’s public statement doesn’t resolve that analysis.

Under GDPR in Europe, Article 32 requires appropriate technical measures for the security of processing. If an organization’s sole measure was a vendor’s sensitivity labels that failed for weeks, that’s a difficult argument with regulators. The EU AI Act’s Article 12 adds another layer: it requires automatic event logging for high-risk AI systems. If the only records of what the AI accessed come from the same vendor that had the failure, organizations face a documentation gap that weakens their compliance posture. These aren’t theoretical concerns. They’re assessments that any organization that fears it may have been affected needs to run now.
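To make the documentation point concrete, here is a minimal sketch of an audit log that the organization keeps for itself and can check for tampering. It is an illustration under stated assumptions, not something the regulations or any vendor prescribe: the function names `append_event` and `verify_chain` and the event fields are hypothetical, and hash chaining is just one common way to make a log tamper-evident.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash for the first entry (an assumption)


def append_event(log: list, event: dict) -> dict:
    """Append an event whose hash also covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    payload = json.dumps(event, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
    entry = {"event": event, "prev_hash": prev_hash, "hash": entry_hash}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited entry or one removed mid-chain fails."""
    prev_hash = GENESIS_HASH
    for entry in log:
        payload = json.dumps(entry["event"], sort_keys=True)
        expected = hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return False
        prev_hash = entry["hash"]
    return True


# Hypothetical events; the field names are illustrative only.
log = []
append_event(log, {"actor": "copilot", "action": "summarize", "label": "confidential"})
append_event(log, {"actor": "copilot", "action": "summarize", "label": "internal"})
chain_ok = verify_chain(log)
```

A chain like this does not detect entries dropped from the end of the log, so in practice the latest hash would also need to be stored somewhere the AI vendor cannot reach.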

Stop trusting AI platforms

The answer to this isn’t to stop using AI. Enterprise AI tools deliver impactful productivity gains. The answer is to stop giving AI platforms carte blanche to govern themselves.

Defense in depth is not a new concept. We’ve applied it to network security for decades through firewalls, intrusion detection, endpoint protection, and network segmentation. These are independent layers, each capable of catching what the others miss. Nobody argues anymore that a firewall alone is sufficient.

But for AI governance, we’ve been operating with a single layer for too long, and the Copilot bug proved that a single layer can fail silently for weeks.

Defense in depth for AI governance means an additional independent data layer between the AI platforms and the sensitive content. The AI doesn’t have direct access to repositories; it authenticates through an external governance layer that enforces policies independently. That layer provides purpose binding that restricts which data classifications the AI can access, least-privilege controls, continuous verification, and audit trails that the organization controls.
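As a rough illustration of what such a layer does, the sketch below shows a gate that sits in front of the data, denies by default, binds each AI agent and purpose to the classifications it may read, and records every decision in a log the organization owns. Everything named here (`GovernanceGate`, `AccessRequest`, the agent name, the label names) is hypothetical and chosen for the example; it is not any vendor’s API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessRequest:
    agent: str           # hypothetical AI client identifier
    purpose: str         # declared purpose of the request
    classification: str  # sensitivity label on the requested item


@dataclass
class GovernanceGate:
    # Purpose binding: each (agent, purpose) pair maps to the labels it may read.
    allowed: dict
    audit_log: list = field(default_factory=list)

    def authorize(self, request: AccessRequest) -> bool:
        # Deny by default: unknown agents, purposes, or labels get nothing.
        permitted = self.allowed.get((request.agent, request.purpose), frozenset())
        decision = request.classification in permitted
        # Record every decision, allowed or denied, in an organization-owned log.
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "agent": request.agent,
            "purpose": request.purpose,
            "classification": request.classification,
            "allowed": decision,
        })
        return decision


# Example policy (assumed labels): summarization may read public and internal mail only.
gate = GovernanceGate(
    allowed={("copilot", "summarize"): frozenset({"public", "internal"})}
)
ok = gate.authorize(AccessRequest("copilot", "summarize", "internal"))
blocked = gate.authorize(AccessRequest("copilot", "summarize", "confidential"))
```

The point of the design is that the decision and the log both live outside the AI platform, so a bug inside the platform cannot quietly change either.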

If Copilot had to authenticate through an independent governance layer before accessing email data, the bug inside the platform wouldn’t have been able to bypass those controls. Purpose binding would have held regardless, anomaly detection would have flagged the processing of confidential content, and the organization’s own audit trails would have captured what happened in real time — not weeks later.
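The detection side can be very simple. The sketch below assumes the organization has its own access records with an actor type and a sensitivity label on each; those field names, the label names, and the `find_violations` helper are all hypothetical. It flags any record where an AI actor processed restricted content, which is the kind of check that could run continuously rather than waiting for a vendor advisory.

```python
# Assumed label names; substitute the organization's own taxonomy.
RESTRICTED_LABELS = frozenset({"confidential", "highly_confidential"})


def find_violations(records, restricted=RESTRICTED_LABELS):
    """Return audit records where an AI actor processed restricted content."""
    return [
        r for r in records
        if r.get("actor_type") == "ai" and r.get("label") in restricted
    ]


# Illustrative records only; real ones would come from the organization's own log.
records = [
    {"actor": "alice", "actor_type": "human", "label": "confidential", "folder": "Sent Items"},
    {"actor": "copilot", "actor_type": "ai", "label": "internal", "folder": "Inbox"},
    {"actor": "copilot", "actor_type": "ai", "label": "confidential", "folder": "Drafts"},
]
violations = find_violations(records)
```

Here the human reading her own confidential email is not flagged, while the AI processing a confidential draft is, which mirrors the distinction the sensitivity labels were meant to enforce.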

It is time to decide

Every major technology shift creates a moment when organizations must decide whether to bolt on security as an afterthought or build it into the architecture from the start. We saw it with cloud migration, then we saw it with remote working. Now we’re seeing it with AI.

The organizations that treat this bug as a wake-up call and build independent AI governance at the data layer will be the ones that can scale AI adoption with confidence. They’ll also satisfy regulators with independent evidence. Plus, they’ll protect sensitive data not through trust in vendor controls, but through architecture that doesn’t depend on trust at all.

The labels were in place, and the policies were configured. Yet the AI still read the confidential emails. When this happens with your AI tools — and it will — make sure you have an independent governance layer that catches it, rather than relying on the vendor to hold its hands up, potentially weeks later. The architecture you choose today will prove vital.

Tim Freestone

Tim Freestone, the chief strategy officer at Kiteworks, is a senior leader with more than 17 years of expertise in marketing leadership, brand strategy, and process and organizational optimization. Since joining Kiteworks in 2021, he has played a pivotal role in shaping the global landscape of content governance, compliance, and protection.

The opinions expressed in this post belong to the individual contributors and do not necessarily reflect the views of Information Security Buzz.
