hardRain: An R package for quick, automated rainfall detection in ecoacoustic datasets using a threshold-based approach

Metcalf, O.C. and Lees, A.C. and Barlow, J. and Marsden, S.J. and Devenish, C. (2020) hardRain: An R package for quick, automated rainfall detection in ecoacoustic datasets using a threshold-based approach. Ecological Indicators, 109. ISSN 1470-160X

Text (Metcalf et al. hardRain_preprint)
Metcalf_et_al._hardRain_preprint.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


The increasing demand for cost-efficient biodiversity data at large spatiotemporal scales has led to an increase in the collection of large ecoacoustic datasets. Whilst the ease of collection and storage of audio data has rapidly increased and costs have fallen, methods for robust analysis of the data have not developed as quickly. Identification and classification of audio signals to species level is extremely desirable, but reliability can be strongly affected by non-target noise, especially rainfall. Despite this demand, there are few easily applicable pre-processing methods for rainfall detection available to conservation practitioners and ecologists. Here, we use threshold values of two simple measures, Power Spectrum Density (amplitude) and Signal-to-Noise Ratio at two frequency bands, to differentiate between the presence and absence of heavy rainfall. We assess the effect of using different threshold values on Accuracy and Specificity. We apply the method to four datasets from both tropical and temperate regions, and find that it has up to 99% accuracy on tropical datasets (e.g. from the Brazilian Amazon), but performs less well in temperate environments. This is likely due to the intensity of rainfall in tropical forests and to its falling on dense, broadleaf vegetation, which amplifies the sound. We show that by choosing between different threshold values, informed trade-offs can be made between Accuracy and Specificity, thus allowing the exclusion of large amounts of audio data containing rainfall in all locations without the loss of data not containing rain. We assess the impact of using different sample sizes of audio data to set threshold values, and find that 200 15 s audio files represent an optimal trade-off between effort, accuracy and specificity in most scenarios. This methodology and the accompanying R package ‘hardRain’ together provide the first automated rainfall detection tool for pre-processing large acoustic datasets without the need for any additional rain gauge data.
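
The thresholding idea in the abstract can be illustrated with a minimal sketch. It is written in Python rather than R and is not the hardRain API. The function names, the four-value feature layout per file, the "min" and "median" threshold rules, and all numbers are assumptions made purely for illustration. It assumes each audio file has already been reduced to a Power Spectrum Density value and a Signal-to-Noise Ratio value in each of two frequency bands, derives thresholds from files known to contain rain, and then reports Accuracy and Specificity on labelled test files.

```python
from statistics import median

# Assumed layout for one audio file (hypothetical, for illustration only):
# (psd_band1, snr_band1, psd_band2, snr_band2)

def fit_thresholds(rain_rows, rule="min"):
    """Derive one threshold per measure from files known to contain rain.

    rule="min" is lenient (flags more files as rain);
    rule="median" is stricter (flags fewer files as rain).
    """
    columns = list(zip(*rain_rows))
    pick = min if rule == "min" else median
    return [pick(col) for col in columns]

def is_rain(row, thresholds):
    """Flag a file as rain only if every measure reaches its threshold."""
    return all(value >= t for value, t in zip(row, thresholds))

def accuracy_specificity(rows, labels, thresholds):
    """Return (accuracy, specificity) against hand-labelled files."""
    preds = [is_rain(r, thresholds) for r in rows]
    tp = sum(p and y for p, y in zip(preds, labels))
    tn = sum((not p) and (not y) for p, y in zip(preds, labels))
    fp = sum(p and (not y) for p, y in zip(preds, labels))
    accuracy = (tp + tn) / len(rows)
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, specificity

# Made-up training files known to contain rain.
rain_training = [
    (0.9, 3.0, 0.8, 2.5),
    (0.7, 2.0, 0.6, 2.0),
    (0.8, 2.5, 0.7, 2.2),
]

# Made-up labelled test files: True means rain is present.
test_rows = [
    (0.95, 3.1, 0.90, 2.60),  # heavy rain
    (0.72, 2.1, 0.65, 2.05),  # lighter rain
    (0.75, 2.2, 0.62, 2.10),  # loud non-rain sound
    (0.20, 0.5, 0.10, 0.40),  # quiet, no rain
]
test_labels = [True, True, False, False]

for rule in ("min", "median"):
    thresholds = fit_thresholds(rain_training, rule)
    acc, spec = accuracy_specificity(test_rows, test_labels, thresholds)
    print(rule, thresholds, acc, spec)
```

On this toy data the lenient rule catches both rain files but also flags the loud non-rain file, whereas the stricter rule never flags a dry file and misses the lighter rain. This mirrors, in miniature, the kind of trade-off the abstract describes when choosing between threshold values.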

Item Type:
Journal Article
Journal or Publication Title:
Ecological Indicators
Additional Information:
This is the author’s version of a work that was accepted for publication in Ecological Indicators. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Ecological Indicators, 109, 2020 DOI: 10.1016/j.ecolind.2019.105793
Deposited On:
20 Nov 2019 12:00
Last Modified:
18 Sep 2023 01:41