Machine Learning within Latent Spaces formed by Foundation Models

Tomczyk, Bernard and Angelov, Plamen and Kangin, Dmitry (2024) Machine Learning within Latent Spaces formed by Foundation Models. In: 2024 IEEE 12th International Conference on Intelligent Systems, IS 2024 - Proceedings. Institute of Electrical and Electronics Engineers Inc., BGR. ISBN 9798350350982

Full text not available from this repository.

Abstract

Foundation Models (FMs) developed on very large generic data sets have transformed the landscape of machine learning (ML). Vision transformers (ViTs) have closed the performance gap between fine-tuned and unsupervised transfer learning. This opens the possibility of abandoning the end-to-end approach that was widely used until recently. Instead, we consider a two-stage ML pipeline: the first stage extracts features by pre-training a large, multi-layer model with billions of parameters, and the second stage is the computationally lightweight learning of an entirely new, simpler model architecture based on prototypes within this feature space. In this paper, we consider such a two-stage approach to ML. We further analyse several alternative lightweight methods for the second stage, including strategies for semi-supervised learning and a variety of strategies for linear fine-tuning. On nine well-known benchmark data sets, we demonstrate that ultra-lightweight ML alternatives for the second stage (such as clustering, PCA, LDA and combinations of these) offer, for the price of a negligible drop in accuracy, a significant reduction (several orders of magnitude) in computational costs (time, energy and related CO2 emissions), as well as the ability to use no labels (a fully unsupervised approach) or a limited number of labels (one label per cluster), and the ability to address interpretability.
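
The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic blobs stand in for features from a frozen foundation model, k-means and scikit-learn are assumed choices for the clustering step, and the names `fit_second_stage` and `demo` are hypothetical. The sketch clusters fixed features, queries one label per cluster (for the sample nearest each centroid), propagates that label to the whole cluster, and fits LDA on the resulting pseudo-labels.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis


def fit_second_stage(features, label_oracle, n_clusters, seed=0):
    """Cluster fixed features, query one label per cluster, then fit LDA.

    `label_oracle` maps an array of sample indices to their labels; it is
    called once, with exactly one index per cluster.
    """
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(features)
    # Prototype = the real sample closest to each centroid.
    dists = km.transform(features)             # shape (n_samples, n_clusters)
    proto_idx = dists.argmin(axis=0)           # one sample index per cluster
    cluster_labels = label_oracle(proto_idx)   # the only labels ever requested
    # Propagate each prototype's label to every member of its cluster.
    pseudo_labels = cluster_labels[km.labels_]
    lda = LinearDiscriminantAnalysis().fit(features, pseudo_labels)
    return lda, proto_idx


def demo():
    # ASSUMPTION: synthetic 16-dimensional blobs replace real FM embeddings.
    X, y = make_blobs(n_samples=900, centers=3, n_features=16, random_state=0)
    X_train, y_train = X[:600], y[:600]
    X_test, y_test = X[600:], y[600:]
    lda, proto_idx = fit_second_stage(X_train, lambda idx: y_train[idx], n_clusters=3)
    accuracy = float((lda.predict(X_test) == y_test).mean())
    return accuracy, len(proto_idx)


if __name__ == "__main__":
    acc, n_labels = demo()
    print(f"test accuracy {acc:.3f} using {n_labels} labels")
```

With real data, `X` would hold embeddings computed once by the pre-trained model, so only the clustering and LDA steps are repeated when the downstream task changes.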

Item Type:
Contribution in Book/Report/Proceedings
Additional Information:
Publisher Copyright: © 2024 IEEE.
Uncontrolled Keywords:
/dk/atira/pure/subjectarea/asjc/1700/1702
Subjects:
comparison; foundation feature space (ffs); foundation models; lda (clustering followed by lda using single label per cluster); supervised; unsupervised; artificial intelligence; information systems
ID Code:
228198
Deposited By:
Deposited On:
28 Nov 2025 13:40
Refereed?:
Yes
Published?:
Published
Last Modified:
28 Nov 2025 22:35