Power, Alexander and Kotonya, Gerald (2019) Providing Fault Tolerance via Complex Event Processing and Machine Learning for IoT Systems. In: Proceedings of the 9th International Conference on the Internet of Things, IoT 2019. ACM, New York. ISBN 9781450372077
pft.pdf - Accepted Version
Available under License Creative Commons Attribution-NonCommercial.
Abstract
Fault-tolerance (FT) support is a key challenge in building dependable Internet of Things (IoT) systems. Many existing FT-support mechanisms in IoT are static, tightly coupled, and inflexible, so they struggle to adapt to dynamic IoT environments. This paper proposes Complex Patterns of Failure (CPoF), an approach that provides reactive and proactive FT using Complex Event Processing (CEP) and Machine Learning (ML). Error-detection strategies are defined as nondeterministic finite automata (NFA) and implemented via CEP systems. The reactive-FT support is monitored, and the observations are used to train ML models that proactively handle imminent occurrences of known errors. We evaluated CPoF on an indoor agriculture system, with experiments that used time and error correlations to preempt battery-depletion failures. We trained predictive models to learn from reactive-FT support and provide preemptive error recovery.
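To make the idea of an NFA-based error-detection strategy more concrete, the following is a minimal sketch in Python. It is not the CPoF implementation and does not use a real CEP engine. The `Event` fields, the `detect` function, the battery thresholds, and the time window are all hypothetical, chosen only to illustrate how a sequence pattern over an event stream can be matched while several partial matches are tracked at once.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Event:
    """A sensor reading. The fields are illustrative, not taken from the paper."""
    device: str
    timestamp: float  # seconds
    battery: float    # percent


# One predicate per NFA transition; the list order is the required event order.
Predicate = Callable[[Event], bool]


def detect(events: List[Event], stages: List[Predicate],
           window: float) -> List[Tuple[Event, ...]]:
    """Return each event sequence that satisfies `stages` in order within `window` seconds.

    A matching event extends a copy of a partial match while the original
    partial match stays active, so several runs are explored in parallel
    (the nondeterministic part). Non-matching events are skipped.
    """
    runs: List[Tuple[Event, ...]] = []      # active partial matches
    matches: List[Tuple[Event, ...]] = []   # completed matches
    for ev in events:
        # Discard partial matches whose first event is older than the window.
        runs = [r for r in runs if ev.timestamp - r[0].timestamp <= window]
        extended = [
            run + (ev,)
            for run in runs
            if run[-1].device == ev.device and stages[len(run)](ev)
        ]
        if stages[0](ev):
            extended.append((ev,))          # this event may also start a new run
        for run in extended:
            if len(run) == len(stages):
                matches.append(run)
            else:
                runs.append(run)
    return matches


# Hypothetical pattern: battery drops below 30%, then 20%, then 10%.
STAGES: List[Predicate] = [
    lambda e: e.battery < 30,
    lambda e: e.battery < 20,
    lambda e: e.battery < 10,
]

stream = [
    Event("A", 0, 28),
    Event("A", 100, 18),
    Event("B", 150, 5),
    Event("A", 200, 9),
]

matches = detect(stream, STAGES, window=600)
for m in matches:
    print([e.battery for e in m])
```

In this sketch a completed match would be the trigger for a reactive recovery action. Logging such matches with their timestamps is one plausible way to build the training data that a predictive model would later learn from, although the abstract does not specify the paper's data format.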