Giotsas, Vasileios and Dietzel, Christoph and Smaragdakis, Georgios and Feldmann, Anja and Berger, Arthur and Aben, Emile (2017) Detecting peering infrastructure outages in the wild. In: SIGCOMM 2017 - Proceedings of the 2017 Conference of the ACM Special Interest Group on Data Communication :. Association for Computing Machinery, Inc, USA, pp. 446-459. ISBN 9781450346535
SIGCOMM2017.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.
Download (1MB)
Abstract
Peering infrastructures, namely, colocation facilities and Internet exchange points, are located in every major city, have hundreds of network members, and support hundreds of thousands of interconnections around the globe. These infrastructures are well provisioned and managed, but outages have to be expected, e.g., due to power failures, human errors, attacks, and natural disasters. However, little is known about the frequency and impact of outages at these critical infrastructures with high peering concentration. In this paper, we develop a novel and lightweight methodology for detecting peering infrastructure outages. Our methodology relies on the observation that BGP communities, announced with routing updates, are an excellent and yet unexplored source of information allowing us to pinpoint outage locations with high accuracy. We build and operate a system that can locate the epicenter of infrastructure outages at the level of a building and track the reaction of networks in near real-time. Our analysis unveils four times as many outages as compared to those publicly reported over the past five years. Moreover, we show that such outages have significant impact on remote networks and peering infrastructures. Our study provides a unique view of the Internet's behavior under stress that often goes unreported.