Microsoft accidentally caused its own Azure container problems
- July 18, 2023
It can happen to the best of us. A programming error at Microsoft left Azure Container Apps users unable to access their log data for hours.
The outage in Azure Container Apps began during the night of July 6 to 7 and was not resolved until late morning. The SIEM platform Sentinel was also affected. Microsoft has now published a detailed review of the incident and openly takes the blame. According to the company, an internal coding error caused the problems, with unpleasant consequences for users.
Microsoft deployed new code for the container platform in Azure on July 3. What should have been a routine procedure went wrong. A misconfiguration in the code sent the bootstrap service into a loop in which it restarted every five to ten seconds. As a result, the telemetry panel was also reconfigured every ten seconds.
The developers noticed the bug on July 6, but by then the capacity of the telemetry control plane was exhausted. Requests from other applications were rejected or delayed, which prevented them from starting. After a delay of a few days, the impact of the coding error finally reached Azure Container Apps users.
Admitting a mistake is one thing; learning from it is another. Microsoft promises measures to limit the impact of internal bugs should they recur. In addition to adding capacity to the telemetry control plane, Microsoft will also add caching and throttling. More information is available in this blog.
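To make the two announced measures concrete, here is a minimal sketch of how caching and throttling can protect a shared service from a flood of repeated requests. It is an illustration only and not Microsoft's implementation: the names `TokenBucket`, `CachedConfigService`, `load_config` and `agent_id` are hypothetical, and the token bucket is just one common throttling technique chosen for the example.

```python
import time


class TokenBucket:
    """Throttle: allow about `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the behavior can be tested
        self.tokens = capacity      # start with a full bucket
        self.last = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class CachedConfigService:
    """Hypothetical front end: answer from cache first, throttle cache misses."""

    def __init__(self, load_config, bucket):
        self.load_config = load_config  # assumed expensive backend call
        self.bucket = bucket
        self.cache = {}

    def get(self, agent_id):
        if agent_id in self.cache:
            return self.cache[agent_id]  # repeated request, no backend load
        if not self.bucket.allow():
            return None                  # rejected; the caller retries later
        value = self.load_config(agent_id)
        self.cache[agent_id] = value
        return value
```

In this sketch, a client that restarts every few seconds and asks for the same configuration again is served from the cache, while the throttle caps how many new requests reach the backend per second, so one misbehaving component cannot exhaust the capacity that others depend on.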
The first week of July was turbulent for Microsoft Azure. A violent storm in the Netherlands broke a fiber optic cable, causing cloud outages in western Europe. There, too, Microsoft intervened quickly and shared a comprehensive incident report.
Source: IT Daily
As an experienced journalist and author, Mary has been reporting on the latest news and trends for over 5 years. With a passion for uncovering the stories behind the headlines, Mary has earned a reputation as a trusted voice in the world of journalism. Her writing style is insightful, engaging and thought-provoking, as she takes a deep dive into the most pressing issues of our time.