Angelov, Plamen (2014) Outside the box : an alternative data analytics frame-work. Journal of Automation, Mobile Robotics and Intelligent Systems, 8 (2). pp. 29-35. ISSN 1897-8649
Full text not available from this repository.Abstract
In this paper, an alternative framework for data analytics is proposed which is based on the spatially-aware concepts of eccentricity and typicality which represent the density and proximity in the data space. This approach is statistical, but differs from the traditional probability theory which is frequentist in nature. It also differs from the belief and possibility-based approaches as well as from the deterministic first principles approaches, although it can be seen as deterministic in the sense that it provides exactly the same result for the same data. It also differs from the subjective expert-based approaches such as fuzzy sets. It can be used to detect anomalies, faults, form clusters, classes, predictive models, controllers. The main motivation for introducing the new typicality- and eccentricity-based data analytics (TEDA) is the fact that real processes which are of interest for data analytics, such as climate, economic and financial, electro-mechanical, biological, social and psychological etc., are often complex, uncertain and poorly known, but not purely random. Unlike, purely random processes, such as throwing dices, tossing coins, choosing coloured balls from bowls and other games, real life processes of interest do violate the main assumptions which the traditional probability theory requires. At the same time they are seldom deterministic (more precisely, have always uncertainty/noise component which is nondeterministic), creating expert and belief-based possibilistic models is cumbersome and subjective. Despite this, different groups of researchers and practitioners favour and do use one of the above approaches with probability theory being (perhaps) the most widely used one. The proposed new framework TEDA is a systematic methodology which does not require prior assumptions and can be used for development of a range of methods for anomalies and fault detection, image processing, clustering, classification, prediction, control, filtering, regression, etc. In this paper due to the space limitations, only few illustrative examples are provided aiming proof of concept.