Tails in the cloud: a survey and taxonomy of straggler management within large‑scale cloud data centres

Singh Gill, Sukhpal and Ouyang, Xue and Garraghan, Peter (2020) Tails in the cloud: a survey and taxonomy of straggler management within large‑scale cloud data centres. Journal of Supercomputing. ISSN 0920-8542

Text (SUPE-D-20-00042.R1)
SUPE_D_20_00042.R1.pdf - Accepted Version
Restricted to Repository staff only until 12 March 2021.
Available under License Creative Commons Attribution-NonCommercial.


Abstract

Cloud computing systems split compute- and data-intensive jobs into smaller tasks that execute in parallel across clusters to reduce execution time. However, as such systems scale, they become increasingly exposed to stragglers: abnormally slow tasks within a job that substantially delay job completion. Such stragglers are a direct threat to attaining fast execution of data-intensive jobs within cloud computing. Researchers have proposed an assortment of mechanisms, frameworks, and management techniques to detect and mitigate stragglers both proactively and reactively. In this paper, we present a comprehensive review of straggler management techniques within large-scale cloud data centres. We provide a detailed taxonomy of straggler causes, as well as of proposed management and mitigation techniques based on straggler characteristics and properties. From this systematic review, we outline several outstanding challenges and potential directions for future work in straggler research.

Item Type:
Journal Article
Journal or Publication Title:
Journal of Supercomputing
Additional Information:
The final publication is available at Springer via http://dx.doi.org/10.1007/s11227-020-03241-x
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/1700/1710
ID Code:
142329
Deposited On:
25 Mar 2020 13:55
Refereed?:
Yes
Published?:
Published
Last Modified:
14 Jul 2020 10:40