Dean, Paul and Porter, Barry (2019) Emergent Scheduling of Distributed Execution Frameworks. In: Doctoral Symposium at the International Conference on Self-Adaptive and Self-Organizing Systems :. IEEE, pp. 240-242. ISBN 9781728124070
SASO_Doctoral_Symp_19_CR.pdf - Published Version
Available under License Unspecified.
Download (204kB)
Abstract
Distributed execution Frameworks (DEFs) provide a platform for handling the increasing volume of data available to distributed computational processes, forming the creation and usage of a large number of DEFs for performing distributed computations. For example, sorting and analyzing large data sets through map and reduce operations, performing a set of operations across points in a data stream to provide near real-time analysis, and the training and testing of machine learning models for varying methods of learning, such as, supervised, unsupervised and reinforcement learning, exploiting the vast amounts of data available. Leading to varying DEFs becoming optimal for either fine or coarse grained computations, for example Apache Spark provides a framework for coarse grained data parallel processes providing data locality adding latency to scheduling decisions which would hinder performance of fine-grained computation. Whereas Ray and Apache Flink provide solutions to avoid the latency incurred by the scheduling method used by apache Spark while potentially incurring longer job completion times as data locality is no longer a priority. Therefore, this PhD will focus on overcoming the issue of trading performance for differing workloads by exploiting the capabilities presented by emergent software systems which learn how to assemble and re-assemble themselves in response to their current deployment conditions and input pattern. This allows the creation of a component based DEF capable of altering both the local behaviour of a DEF (i.e. Local Schedulers and placement polices within a centralised scheduler) to potentially improve the performance of single DEF as well as global behaviour of a DEF, for example the adaptation of a centralised to two-level scheduler.