O'Ryan, David and Merin, Bruno and Simmons, Brooke and Vojtekova, Antonia and Anku, Anna and Walmsley, Mike and Garland, Izzy and Géron, Tobias and Keel, William and Kruk, Sandor and Lintott, Chris J. and Mantha, Kameswara Bharadwaj and Masters, Karen L. and Reerink, Jan and Smethurst, Rebecca J. and Thorne, Matthew (2023) Harnessing the Hubble Space Telescope Archives: A Catalogue of 21,926 Interacting Galaxies. The Astrophysical Journal, 948 (1): 40. ISSN 0004-637X
manuscript.pdf - Accepted Version
Available under License Creative Commons Attribution.
Abstract
Mergers play a complex role in galaxy formation and evolution. Continuing to improve our understanding of these systems requires ever larger samples, which can be difficult (even impossible) to select from individual surveys. We use the new platform ESA Datalabs to assemble a catalogue of interacting galaxies from the Hubble Space Telescope science archives; this catalogue is larger than previously published catalogues by nearly an order of magnitude. In particular, we apply the Zoobot convolutional neural network directly to the entire public archive of HST F814W images and make probabilistic interaction predictions for 126 million sources from the Hubble Source Catalogue. We employ a combination of automated visual representation and visual analysis to identify a clean sample of 21,926 interacting galaxy systems, mostly with z < 1. Of these systems, 65% have no previous references in either the NASA Extragalactic Database or Simbad. In the process of removing contamination, we also discover many other objects of interest, such as gravitational lenses, edge-on protoplanetary disks, and 'backlit' overlapping galaxies. We briefly investigate the basic properties of this sample, and we make our catalogue publicly available for use by the community. In addition to providing a new catalogue of scientifically interesting objects imaged by HST, this work also demonstrates the power of the ESA Datalabs tool to facilitate substantial archival analysis without placing a high computational or storage burden on the end user.