Lancaster EPrints

Identifying variables responsible for clustering in discriminant analysis of data from infrared microspectroscopy of a biological sample.

Martin, Francis L. and German, Matthew and Wit, Ernst and Fearn, Thomas and Ragavan, Narasimhan and Pollock, Hubert M. (2007) Identifying variables responsible for clustering in discriminant analysis of data from infrared microspectroscopy of a biological sample. Journal of Computational Biology, 14 (9). pp. 1176-1184. ISSN 1066-5277

Full text not available from this repository.


In the biomedical field, infrared (IR) spectroscopic studies can involve processing data from many samples divided into classes, such as category of tissue (e.g., normal or cancerous) or patient identity. Reliable methods are needed to identify which wavenumbers, each representing particular molecular groups, are responsible for the observed class groupings. Using a prostate tissue sample divided into three regions (transition zone, peripheral zone, and adjacent adenocarcinoma) and interrogated using synchrotron Fourier-transform IR microspectroscopy, we compared two statistical methods. (a) A new “cluster vector” approach: principal component analysis (PCA) reduces the dimensions of the dataset, linear discriminant analysis (LDA) then reveals clusters, and a vector constructed through each cluster identifies the contributory wavenumbers. (b) Stepwise LDA, which exploits the fact that the spectral peaks identifying particular chemical bonds extend over several wavenumbers: after classification using either one or two wavenumbers, it checks whether the resulting predictions are stable across a range of nearby wavenumbers. Stepwise LDA is the simpler of the two methods; the cluster vector approach can indicate which classes of spectra exhibit the significant differences in signal seen at the “prominent” wavenumbers identified. Where IR spectra are found to separate into classes, the excellent agreement between these two quite different methods points to a new and reliable approach to establishing which molecular groups are responsible for the separation.
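The stability check behind stepwise LDA can be illustrated with a minimal sketch. It is not the authors' implementation (the full text is not available from this record) and it rests on stated assumptions: classification uses a single wavenumber at a time; one-dimensional LDA is taken with equal class priors and a shared variance, which reduces to assigning each spectrum to the nearest class mean; errors are computed on the training spectra themselves; and the function names, the `half_width` window and the `max_error` threshold are hypothetical. The absorbance values are toy numbers, not data from the study.

```python
from statistics import mean


def class_means(spectra, labels, j):
    """Mean absorbance at wavenumber index j for each class."""
    by_class = {}
    for row, label in zip(spectra, labels):
        by_class.setdefault(label, []).append(row[j])
    return {label: mean(values) for label, values in by_class.items()}


def predict(means, value):
    # Assumption: 1-D LDA with equal priors and a shared variance,
    # which is equivalent to choosing the nearest class mean.
    return min(means, key=lambda label: abs(value - means[label]))


def error_rate(spectra, labels, j):
    """Fraction of training spectra misclassified using wavenumber j alone."""
    means = class_means(spectra, labels, j)
    wrong = sum(predict(means, row[j]) != label
                for row, label in zip(spectra, labels))
    return wrong / len(labels)


def stable_wavenumbers(spectra, labels, half_width=1, max_error=0.1):
    """Indices whose error rate, and that of every neighbour within
    half_width, is at most max_error (both parameters are hypothetical)."""
    n = len(spectra[0])
    errors = [error_rate(spectra, labels, j) for j in range(n)]
    stable = []
    for j in range(n):
        lo, hi = max(0, j - half_width), min(n, j + half_width + 1)
        if all(e <= max_error for e in errors[lo:hi]):
            stable.append(j)
    return errors, stable


# Toy data: six spectra, six wavenumber indices, three tissue classes.
# Indices 1-3 mimic a broad discriminating peak; index 5 discriminates
# but sits next to a noisy index, so it should not count as stable.
labels = ["transition", "transition", "peripheral", "peripheral",
          "adenocarcinoma", "adenocarcinoma"]
spectra = [
    [1.0, 0.9, 0.9, 0.9, 1.0, 0.9],
    [3.0, 1.1, 1.1, 1.1, 3.0, 1.1],
    [2.0, 1.9, 1.9, 1.9, 2.0, 1.9],
    [2.4, 2.1, 2.1, 2.1, 2.4, 2.1],
    [0.5, 2.9, 2.9, 2.9, 0.5, 2.9],
    [3.1, 3.1, 3.1, 3.1, 3.1, 3.1],
]

errors, stable = stable_wavenumbers(spectra, labels)
print(errors)
print(stable)
```

In this toy example only index 2, the centre of the broad peak, passes the check: indices 1 and 3 border a noisy index, and the isolated index 5 is rejected for the same reason even though it classifies perfectly on its own.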

Item Type: Article
Journal or Publication Title: Journal of Computational Biology
Uncontrolled Keywords: adenocarcinoma ; biomedical ; clustering ; LDA ; microspectroscopy ; misclassification
Subjects: Q Science > QC Physics
Departments: Faculty of Health and Medicine > Health Research ; Faculty of Science and Technology > Lancaster Environment Centre ; Faculty of Science and Technology > Mathematics and Statistics ; Faculty of Science and Technology > Physics
ID Code: 18630
Deposited By: ep_ss_importer
Deposited On: 31 Oct 2008 09:31
Refereed?: Yes
Published?: Published
Last Modified: 13 Dec 2017 02:46
