Dynamics 365 Business Central and Azure Front Door issue.

Yesterday I received (and read) many messages from partners and customers about the Dynamics 365 Business Central SaaS service being unavailable for a period of time.

The issue did not originate in Dynamics 365 Business Central itself. The cause of the outage was a different service: Azure Front Door.

Azure Front Door is like a global delivery service for cloud services that makes them faster, more reliable, and safer for people around the world.

Here’s how it works in simple terms:

Imagine you run an online store with your main server in New York. When someone in Australia wants to visit your site, the data has to travel across oceans and continents, which makes everything slow.

Azure Front Door fixes this by:

  1. Having “mini-servers” everywhere: It uses over 100 locations worldwide (called “edge locations”) strategically placed in major cities
  2. Smart routing: When someone visits your site, Front Door automatically sends them to the closest edge location instead of your main server
  3. Content caching: It stores copies of your website’s images, videos, and other content at these edge locations, so they load instantly
  4. Load balancing: If you have multiple servers, Front Door distributes traffic evenly across them
  5. Built-in security: It protects against cyber attacks, blocks suspicious traffic, and provides encryption
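To make the "smart routing" idea concrete, here is a minimal sketch of picking the closest edge location for a user. It is only an illustration: the `EDGES` table, its coordinates, and the `nearest_edge` helper are assumptions made up for this example, and plain geographic distance stands in for whatever network-level measure the real service uses.

```python
from math import asin, cos, radians, sin, sqrt

# Hypothetical edge locations (name -> latitude, longitude in degrees).
EDGES = {
    "New York": (40.71, -74.01),
    "London": (51.51, -0.13),
    "Sydney": (-33.87, 151.21),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))  # 6371 km = mean Earth radius


def nearest_edge(user_location):
    """Return the name of the edge location closest to the user."""
    return min(EDGES, key=lambda name: distance_km(user_location, EDGES[name]))


melbourne = (-37.81, 144.96)
print(nearest_edge(melbourne))  # Sydney
```

A user in Melbourne is served from the Sydney edge instead of crossing the ocean to reach New York.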

Think of it like a global chain of coffee shops😉. Instead of everyone going to one central café in New York, Front Door has small coffee shops everywhere. When you want coffee, you go to the nearest shop and get it much faster, without waiting for it to be shipped from New York.

Key benefits:

  • ⚡ 3x faster loading times for users worldwide
  • 🛡 Automatic protection against attacks and threats
  • 🔄 99.99% uptime reliability
  • 🌍 Global scale that grows with your business
  • 💰 Cost savings by reducing server load and bandwidth

Azure Front Door supports multi-region backend pools and health probes. When a region (for example East US) goes down, Front Door automatically reroutes traffic to another region (for example West Europe) without any DNS TTL delays. This ensures high availability and global resilience.
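The failover behaviour can be sketched in a few lines. This is a simplified model, not Front Door's actual implementation: the `Origin` class, the `priority` field, and the `pick_origin` function are hypothetical names chosen for the example, and the failed health probe is simulated by flipping a flag.

```python
from dataclasses import dataclass


@dataclass
class Origin:
    name: str
    priority: int         # lower value = preferred region
    healthy: bool = True  # result of the latest (simulated) health probe


def pick_origin(origins):
    """Return the healthy origin with the lowest priority value, or None."""
    healthy = [o for o in origins if o.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda o: o.priority)


origins = [Origin("East US", 1), Origin("West Europe", 2)]
before = pick_origin(origins).name

origins[0].healthy = False  # simulate East US failing its health probe
after = pick_origin(origins).name

print(before, "->", after)  # East US -> West Europe
```

Because the decision is made per request at the routing layer, clients keep using the same hostname and do not have to wait for a DNS record to expire.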

Azure Front Door also supports path-based routing. You can define routing rules that forward /api/ to your API service, /admin/ to an internal backend with IP restrictions, and /images/ to a static blob storage. No need for a gateway or reverse proxy layer in your app.
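As a rough illustration of path-based routing, the sketch below matches a request path against a table of prefixes and picks the longest match. The backend names and the `DEFAULT_BACKEND` fallback are assumptions for the example. In Front Door itself these rules are defined as configuration, not as application code.

```python
# Hypothetical routing table: path prefix -> backend name.
ROUTES = {
    "/api/": "api-service",
    "/admin/": "internal-admin-backend",
    "/images/": "static-blob-storage",
}
DEFAULT_BACKEND = "web-frontend"  # assumed fallback for unmatched paths


def route(path):
    """Return the backend for the longest matching prefix, else the default."""
    matches = [prefix for prefix in ROUTES if path.startswith(prefix)]
    if not matches:
        return DEFAULT_BACKEND
    return ROUTES[max(matches, key=len)]


print(route("/api/orders/42"))    # api-service
print(route("/images/logo.png"))  # static-blob-storage
print(route("/index.html"))       # web-frontend
```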

Azure Front Door supports geo-filtering and geo-routing. You can define rules such that users from EU countries are routed to EU-hosted backends, ensuring compliance without adding latency or complexity.
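A geo rule boils down to a lookup on the caller's country code. The following sketch assumes the country has already been resolved from the client IP. The country sets, the backend names, and the `route_by_country` function are purely illustrative.

```python
# Illustrative subset of EU country codes (ISO 3166-1 alpha-2).
EU_COUNTRIES = {"AT", "BE", "DE", "ES", "FR", "IE", "IT", "NL"}
# Placeholder for countries a geo-filtering rule would block.
BLOCKED_COUNTRIES = {"XX"}


def route_by_country(country_code):
    """Return the backend for a country, or None when the request is blocked."""
    code = country_code.upper()
    if code in BLOCKED_COUNTRIES:
        return None  # geo-filtering: reject the request
    if code in EU_COUNTRIES:
        return "eu-backend"  # geo-routing: keep EU traffic on EU-hosted backends
    return "global-backend"


print(route_by_country("it"))  # eu-backend
print(route_by_country("US"))  # global-backend
print(route_by_country("XX"))  # None
```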

These are all features that a cloud service like Dynamics 365 Business Central also relies on.

What happened yesterday?

The Outage Timeline

  • Start: 15:45 UTC on October 29, 2025
  • End: 00:05 UTC on October 30, 2025
  • Duration: Approximately 8 hours and 20 minutes
  • Impact: Widespread disruption affecting millions of users
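The duration follows directly from the two timestamps above. A quick check with Python's standard library:

```python
from datetime import datetime, timezone

# Start and end of the outage as listed in the timeline (UTC).
start = datetime(2025, 10, 29, 15, 45, tzinfo=timezone.utc)
end = datetime(2025, 10, 30, 0, 5, tzinfo=timezone.utc)

duration = end - start
hours, remainder = divmod(int(duration.total_seconds()), 3600)
minutes = remainder // 60

print(f"{hours} h {minutes} min")  # 8 h 20 min
```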

What Went Wrong – Root Cause Analysis

The outage was triggered by an inadvertent tenant configuration change within Azure Front Door (AFD), Microsoft’s primary content delivery and global load balancing service. Here’s what happened:

  1. Faulty Configuration Deployment: A software defect in the validation mechanisms allowed an erroneous configuration to bypass safety checks during deployment.
  2. Cascading Failure: The invalid configuration caused a significant number of AFD nodes to fail loading properly, leading to:
     • Increased latencies and timeouts
     • Connection errors for downstream services
     • Imbalanced traffic distribution as unhealthy nodes dropped out
  3. Global Impact: The issue propagated across Microsoft’s global edge network, affecting services worldwide.

Affected Azure Services

The outage impacted numerous critical Azure services, including but not limited to:

Core Infrastructure:

  • Azure App Service
  • Azure SQL Database
  • Azure Virtual Desktop
  • Container Registry

Identity & Security:

  • Azure Active Directory B2C
  • Microsoft Entra ID (Identity & Access Management)
  • Microsoft Defender External Attack Surface Management

Developer Tools & AI:

  • Azure Databricks
  • Microsoft Copilot for Security
  • Azure Communication Services

Business Applications:

  • Azure Portal
  • Azure Maps
  • Azure Healthcare APIs
  • Media Services
  • Microsoft Purview
  • Microsoft Sentinel (Threat Intelligence)
  • Video Indexer

Resolution Process

Microsoft’s response involved multiple phases:

  1. Immediate Containment: Blocked all further configuration changes to prevent additional propagation
  2. Rollback Strategy: Deployed the “last known good” configuration across the global fleet
  3. Phased Recovery: Gradually reloaded configurations and rebalanced traffic to avoid overload
  4. Monitoring: Continued monitoring for any remaining issues with updates via Azure Service Health
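The containment and rollback steps can be modelled with a very small sketch. This is not how Microsoft's deployment pipeline is built. The `ConfigStore` class, its methods, and the always-passing `buggy_validator` are hypothetical and only illustrate the idea of freezing changes and restoring a "last known good" configuration after a faulty one slipped through validation.

```python
class ConfigStore:
    """Toy model of a configuration store with freeze and rollback."""

    def __init__(self, initial):
        self.active = initial
        self.last_known_good = initial
        self.frozen = False

    def deploy(self, config, validate):
        """Activate a configuration if changes are allowed and validation passes."""
        if self.frozen:
            raise RuntimeError("configuration changes are blocked")
        if not validate(config):
            return False
        self.active = config
        return True

    def mark_healthy(self):
        """Record the active configuration as the last known good one."""
        self.last_known_good = self.active

    def freeze(self):
        """Containment: block any further configuration changes."""
        self.frozen = True

    def rollback(self):
        """Restore the last known good configuration."""
        self.active = self.last_known_good


store = ConfigStore({"version": 1, "routes": ["/api/"]})
store.mark_healthy()


def buggy_validator(config):
    return True  # simulated defect: every configuration passes validation


store.deploy({"version": 2, "routes": None}, buggy_validator)
bad_version = store.active["version"]  # the faulty config is now live

store.freeze()    # step 1: block further changes
store.rollback()  # step 2: restore the last known good configuration

print(bad_version, "->", store.active["version"])  # 2 -> 1
```

In a real fleet the rollback is rolled out gradually, which matches the phased recovery step listed above.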

Lessons Learned & Improvements

Following the incident, Microsoft implemented:

  • Enhanced Validation Controls: Additional safeguards to prevent similar configuration errors
  • Improved Rollback Mechanisms: Better automated recovery processes
  • Strengthened Monitoring: Enhanced detection and response capabilities

The issue is now mitigated and all services have been running normally for several hours.

Broader Implications

This outage highlights the interconnected nature of modern cloud infrastructure and the challenges of managing global-scale services. Azure Front Door serves as a critical routing layer for numerous Microsoft services, making its stability paramount for the entire Azure ecosystem.

The incident underscores the importance of:

  • Robust configuration validation
  • Automated rollback capabilities
  • Comprehensive monitoring and alerting
  • Transparent communication during outages
  • Business continuity planning for cloud-dependent organizations

Microsoft reacted to this issue immediately (despite how it may have appeared from the outside, I can confirm that internally there was a lot of work and exchange of messages across teams). I hope that this will not happen again.

Always remember to check the Azure Status page when a service is not reachable, and remember that Dynamics 365 Business Central also relies on other Azure cloud services to work.

Original Post https://demiliani.com/2025/10/30/dynamics-365-business-central-and-azure-front-door-issue/
