Carrick, Jon and Hook, Isobel (2022) Classification of Supernovae and Stars in the Era of Big Data and Artificial Intelligence. PhD thesis, Lancaster University.
2022carrickphd.pdf - Published Version
Available under License Creative Commons Attribution-NonCommercial-NoDerivs.
Abstract
In recent years, artificial intelligence (AI) has been applied in many fields of research. It is particularly well suited to astronomy, where very large datasets from sky surveys cover a wide range of observations. The upcoming Legacy Survey of Space and Time (LSST) presents unprecedented big data challenges, requiring state-of-the-art methods to produce, process and analyse information. Observations of Type Ia supernovae help constrain cosmological parameters such as the dark energy equation of state and, because spectroscopic resources are limited, AI will be instrumental in the next generation of cosmological measurements. AI can also improve our astrophysical understanding by perceiving patterns in data which may not be obvious to humans. In this thesis we investigate how advanced AI methods can be used in two classification tasks: identifying Type Ia supernovae for cosmology from photometry using supervised learning, and determining a low-dimensional representation of stellar spectra from which astrophysical concepts are inferred through unsupervised learning.

In preparation for photometric classification of transients from LSST, we run tests with different training samples. Using estimates of the depth to which the 4-metre Multi-Object Spectroscopic Telescope (4MOST) Time-Domain Extragalactic Survey (TiDES) can classify transients, we simulate a magnitude-limited training sample reaching rAB = 22.5 mag. We run our simulations with snmachine, a photometric classification pipeline using machine learning. The machine-learning algorithms struggle to classify supernovae when the training sample is magnitude-limited, as its features are not representative of the test set. In contrast, representative training samples perform very well, particularly when redshift information is included.
Classification performance noticeably improves when we combine the magnitude-limited training sample with a simulated realistic sample of faint, high-redshift supernovae observed from larger spectroscopic facilities: the range of the algorithms' average area under the ROC curve (AUC) scores over 10 runs increases from 0.547-0.628 to 0.946-0.969, and the purity of the classified sample reaches 95% in all runs for 2 of the 4 algorithms. By creating new, artificial light curves using the augmentation software avocado, we achieve a purity of 95% in our classified sample in all 10 runs for all machine-learning algorithms considered. We also reach a highest average AUC score of 0.986 with the artificial neural network algorithm. Having real faint supernovae to complement our magnitude-limited sample is a crucial requirement in the optimisation of a 4MOST spectroscopic sample. However, our results are a proof of concept that augmentation is also necessary to achieve the best classification results.

During our investigation into an optimised training sample, we assumed that every training object has the correct class label. Spectroscopy is a reliable method to confirm object classification and is used to define our training sample. However, it is not necessarily perfect, and we therefore consider the impact of potential misclassifications of training objects. Taking the predicted error rates in spectroscopic classification from the literature, we apply contamination to a TiDES training sample using simulated LSST data. With the recurrent neural network from the software SuperNNova, we determine appropriate hyperparameters using a perfect, uncontaminated TiDES training sample and then train a model on its contaminated counterpart to study the effect of contamination on photometric classification. We find that a contaminated training sample produces very little difference in classification performance, even when contamination is increased to 5%.
Contamination causes more objects, both Type Ia and non-Ia, to be classified as Ia, increasing efficiency but decreasing purity, with changes of less than 1% on average. Similarly, we see a decrease of 0.1% in average accuracy and no clear difference in AUC score, which varies only at the fourth significant figure. These results are promising for photometric classification: contaminated training appears to have little impact, and propagation to cosmological measurements is expected to be minimal.

In a separate study, we apply deep learning to data in the European Southern Observatory (ESO) archive using an autoencoder neural network, with the aim of improving similarity-based searches using the network's own interpretation of the data. We train the network to reconstruct stellar spectra by passing them through an information bottleneck, creating a low-dimensional representation of the data. We find that this representation includes several informative dimensions and, comparing to known astrophysical labels, see clear correlations for two key nodes: the network learns concepts of radial velocity and effective temperature, completely unsupervised. The interpretation of the other informative nodes appears ambiguous, leaving room for future investigation.

The results presented in this thesis emphasise the practical capabilities of AI in an astronomical context: classification of astrophysical objects can be conducted through supervised learning using known labels, as well as through unsupervised learning in a physics-agnostic process.
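
The abstract quotes purity, efficiency and AUC without defining them. As a minimal sketch, assuming the conventional definitions (purity as the fraction of objects classified as Ia that are truly Ia, efficiency as the fraction of true Ia that are recovered, and AUC as the probability that a randomly chosen Ia receives a higher score than a randomly chosen non-Ia), the metrics could be computed as below. The function names, the 0.5 threshold and the toy labels and scores are illustrative assumptions; they are not taken from the thesis, snmachine or SuperNNova.

```python
def purity_efficiency(y_true, y_pred):
    """Return (purity, efficiency) for binary labels where 1 = Type Ia."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # Ia classified as Ia
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-Ia classified as Ia
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # Ia that were missed
    purity = tp / (tp + fp) if (tp + fp) else 0.0
    efficiency = tp / (tp + fn) if (tp + fn) else 0.0
    return purity, efficiency


def auc_score(y_true, scores):
    """AUC as the fraction of (Ia, non-Ia) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one object of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): three Ia and three non-Ia with classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

purity, efficiency = purity_efficiency(y_true, y_pred)
auc = auc_score(y_true, scores)
print(f"purity={purity:.3f} efficiency={efficiency:.3f} auc={auc:.3f}")
# prints: purity=0.667 efficiency=0.667 auc=0.889
```

Under these definitions, classifying more objects of both classes as Ia raises the true-positive count (higher efficiency) and the false-positive count (lower purity), which matches the trend described for contaminated training samples.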