A new online clustering approach for data in arbitrary shaped clusters

Hyde, Richard and Angelov, Plamen (2015) A new online clustering approach for data in arbitrary shaped clusters. In: Cybernetics (CYBCONF), 2015 IEEE 2nd International Conference on. IEEE, POL, pp. 228-233. ISBN 9781479983209

CYBCONF2015_CODAS.pdf - Accepted Version
Available under License Creative Commons Attribution.

In this paper we demonstrate CODAS, a new density-based technique for online clustering of streaming data into arbitrarily shaped clusters. CODAS is a two-stage process: a simple local density measure initiates micro-clusters, which are then combined into clusters. Memory efficiency is gained by not storing or re-using any data. Computational efficiency is gained by using hyper-spherical micro-clusters, which make the micro-cluster joining technique dimensionally independent and therefore fast. The micro-clusters divide the data space into sub-spaces, each with a core region and a non-core region. Intersecting core regions define the clusters. A threshold value is used to distinguish outlier micro-clusters from small clusters of unusual data. The cluster information is fully maintained online. We compare CODAS with ELM, DEC, Chameleon, DBSCAN and DenStream and demonstrate that CODAS achieves comparable results, but in a fully online and dimensionally scalable manner.
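
The two stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation; the abstract gives no parameter values or update rules, so the following are all assumptions made for illustration: the names `MicroCluster`, `update` and `find_clusters`, a core region of half the micro-cluster radius, a centre that moves towards each core sample with step 1/count, and a `min_count` threshold for treating a micro-cluster as an outlier. For brevity the sketch rebuilds the cluster grouping on request, whereas the paper states that cluster information is maintained fully online.

```python
import math


class MicroCluster:
    """Hyper-spherical micro-cluster: only a centre and a sample count are kept."""

    def __init__(self, centre):
        self.centre = list(centre)
        self.count = 1


def update(micro_clusters, x, radius):
    """Stage 1: absorb one streamed sample; the sample itself is never stored."""
    nearest, best = None, math.inf
    for mc in micro_clusters:
        d = math.dist(mc.centre, x)
        if d < best:
            nearest, best = mc, d
    if nearest is None or best > radius:
        # The sample lies outside every existing micro-cluster: start a new one.
        micro_clusters.append(MicroCluster(x))
        return
    nearest.count += 1
    if best <= radius / 2:  # ASSUMPTION: core region has half the radius
        # ASSUMPTION: core samples pull the centre towards them with step 1/count.
        n = nearest.count
        nearest.centre = [c + (xi - c) / n for c, xi in zip(nearest.centre, x)]


def find_clusters(micro_clusters, radius, min_count):
    """Stage 2: group non-outlier micro-clusters whose core regions intersect."""
    # ASSUMPTION: micro-clusters with fewer than min_count samples are outliers.
    dense = [mc for mc in micro_clusters if mc.count >= min_count]
    parent = list(range(len(dense)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(dense)):
        for j in range(i + 1, len(dense)):
            # Two cores of radius r/2 overlap when the centres are closer than r,
            # so the join test needs only one distance between centres.
            if math.dist(dense[i].centre, dense[j].centre) < radius:
                parent[find(i)] = find(j)

    groups = {}
    for i, mc in enumerate(dense):
        groups.setdefault(find(i), []).append(mc)
    return list(groups.values())
```

Calling `update(micro_clusters, sample, radius)` once per incoming sample and `find_clusters(micro_clusters, radius, min_count)` whenever a clustering is needed gives a list of clusters, each a list of micro-clusters. Because each cluster is a chain of overlapping spheres rather than a single sphere, it can follow an arbitrary shape.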

Item Type:
Contribution in Book/Report/Proceedings
Additional Information:
©2015 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.
Keywords:
clustering, CODAS, online, data streams, big data, arbitrary shape, micro-cluster
Deposited On:
09 Sep 2015 06:36
Last Modified:
28 Apr 2024 23:14