Last week, a major Azure Active Directory authentication issue affected users worldwide. Another Exchange / Outlook issue later that week affected European and Indian Office 365 / Microsoft 365 customers. This week, Microsoft's cloud service issues are continuing, affecting a number of Exchange, Outlook, Teams and SharePoint users.
Microsoft has been warning some Office 365 / Microsoft 365 customers about several potential Exchange / Outlook issues that started this week, including access and management problems between Outlook on mobile and on desktops. I asked Microsoft whether these issues were related to last week’s Azure Active Directory authentication issues, but I was told the company had no comment. (I hear the issues are probably unrelated, for what it’s worth.)
On the afternoon of October 7 (ET), users, mostly in the US, began reporting problems accessing their admin center dashboards. Around 2:30 p.m. ET, users took to Twitter and other social channels to report that they could not access Microsoft 365 services, including Teams, Exchange Online, Outlook.com, SharePoint Online, and OneDrive for Business. At the same time, issue alerts for Azure Active Directory and Azure Networking appeared on the Azure status page.
Around 4:00 p.m. ET, some Office 365 / Microsoft 365 customers started reporting that their services were recovering. (For my part, I still cannot access the M365 admin center as of 5:00 p.m. ET.)
The Azure team also published a preliminary root cause analysis at about the same time users were having problems accessing Microsoft or Azure services. In that report, Microsoft said that between about 2 p.m. ET and 3:40 p.m. ET, a subset of customers encountered issues connecting to resources that relied on Azure network infrastructure across regions. (Resources with local dependencies in the same region should not have been affected, according to company officials.)
Microsoft identified “a recent change (implemented) in WAN (Wide Area Networking) resources causing connectivity delays or domain failures” as the cause. To mitigate, the Azure team rolled back the recent change to a healthy configuration.
On October 7, the Azure team also noted that a subset of customers saw traffic routed to “unhealthy backends” with Azure Front Door. Microsoft attributed the problem to a “configuration change (which) developed causing incorrect traffic routing” and rolled back the change to fix the problem.
The Microsoft 365 team, for its part, is attributing the inability to access services to a “network infrastructure change” that may have affected many Microsoft 365 services, including Groups, Outlook, SharePoint, OneDrive for Business, and Outlook.com. The same team also said that this afternoon it added capacity to handle “an observed increase in traffic in the management center caused by actions to mitigate a previous incident with a similar impact”.
Coming right after last week’s Azure AD issue, which was caused by a mishandled change combined with a failed rollback, this week’s outage is not a good look for Microsoft’s cloud.