Crone, S (2009) Mining the past to forecast the future: comments. International Journal of Forecasting, 25 (3). 456–460.
Abstract
In forecasting, data mining is frequently perceived as a distinct technological discipline without immediate relevance to the challenges of time series prediction. However, Hand (2009) postulates that when the large cross-sectional datasets of data mining and the high-frequency time series of forecasting converge, common problems and opportunities are created for the two disciplines. This commentary attempts to establish the relationship between data mining and forecasting via the dataset properties of aggregate and disaggregate modelling, in order to identify areas where research in data mining may contribute to current forecasting challenges, and vice versa. To forecasting, data mining offers insights on how to handle large, sparse datasets with many binary variables, in feature and instance selection. Furthermore, data mining and related disciplines may stimulate research into how to overcome selectivity bias using reject inference on observational datasets and, through the use of experimental time series data, how to extend the utility and costs of errors beyond measuring performance, and how to find suitable time series benchmarks to evaluate computer-intensive algorithms. Equally, data mining can profit from forecasting’s expertise in handling nonstationary data to counter the out-of-date-data problem, and in developing empirical evidence beyond the fine-tuning of algorithms, leading to a number of potential synergies and stimulating research in both data mining and forecasting.