Data digging from photographs of moths using deep-learning-based object detection

Background

Biodiversity decline disproportionately affects hyperdiverse, yet low- and middle-income, tropical regions, and Lepidoptera are among the most impacted taxa. Although monitoring in these countries is essential, only low-cost solutions can guarantee success. Recent automated image analysis methods offer fast and inexpensive ways to monitor biodiversity. Lepidoptera monitoring is relatively simple and can potentially be fully automated soon, although low-quality photographic records may hamper these efforts.

A year-long, photo-based moth survey was conducted in an under-studied region of Paraguay. Light trapping occurred between September 2015 and September 2016. Each day, close-up photos were taken of each morphospecies observed within a given time frame, along with four overview images covering the four corners of the sheet. Photos were taken by Allan Dietz, Stuart Douglas, Sara Binnie and Gábor Pozsgai.

In this project, we aimed to examine whether modern object detection methods allow efficient processing of, and data extraction from, (often poor-quality) photos that were not originally taken for automated image analysis.

Figure: Light trap location, the sampling site at Laguna Blanca, Paraguay.
What has happened so far

We wanted to create a training set for Lepidoptera detection that is not specific to sheet photos and can be reused for future monitoring at other sites, so we used the close-up photos of the morphospecies instead of crops from the overview images. As a first step, we resized these close-up photos ('artistic' photos and photos of each morphospecies). Next, we rejected pictures with dark backgrounds (mostly the 'artistic' ones) through automated filtering based on brightness, because we only wanted photos resembling those taken on the sheets. In the pictures kept after this filtering, we used contour detection to find the moths and drew a bounding box around each of them. We then checked all the pictures and manually excluded those with incorrect boxes. To enlarge the training set, we created five differently sized variants of every picture. Finally, we linked to every picture a label file containing the bounding box parameters in the format from which, during training, the YOLO algorithm reads the box coordinates and the class of the boxed object (moth or non-moth).
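A minimal sketch of these preprocessing steps (brightness filtering, contour-based bounding boxes, and YOLO-format labels) using OpenCV; the folder names, brightness threshold, and single-class label are illustrative assumptions, not the exact values used in the project.

```python
import cv2
from pathlib import Path

BRIGHTNESS_THRESHOLD = 100   # assumed cut-off for rejecting dark backgrounds
CLASS_ID = 0                 # assumed single class: moth

def mean_brightness(image):
    """Average pixel intensity of the greyscale image."""
    return cv2.cvtColor(image, cv2.COLOR_BGR2GRAY).mean()

def largest_contour_bbox(image):
    """Return the bounding box (x, y, w, h) of the largest contour, or None."""
    grey = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Dark insect on a bright sheet: inverse Otsu thresholding isolates the moth.
    _, thresh = cv2.threshold(grey, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    return cv2.boundingRect(max(contours, key=cv2.contourArea))

def yolo_label(bbox, img_w, img_h):
    """Convert (x, y, w, h) in pixels to a YOLO line: class xc yc w h (all relative)."""
    x, y, w, h = bbox
    xc, yc = (x + w / 2) / img_w, (y + h / 2) / img_h
    return f"{CLASS_ID} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}\n"

Path("labels").mkdir(exist_ok=True)
for path in Path("closeups").glob("*.jpg"):          # hypothetical input folder
    img = cv2.imread(str(path))
    if img is None or mean_brightness(img) < BRIGHTNESS_THRESHOLD:
        continue                                     # skip dark-background pictures
    bbox = largest_contour_bbox(img)
    if bbox is None:
        continue
    h, w = img.shape[:2]
    Path("labels", path.stem + ".txt").write_text(yolo_label(bbox, w, h))
```

Pictures with incorrectly placed boxes would still need the manual check described above before being used for training.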

We also wanted to include "non-moth" examples in the training set as negative samples. Thus, we randomly cropped several pictures of other taxa (such as dragonflies, beetles, and flies) from the overview images (since we did not have close-up photos of taxa other than Lepidoptera) and included them in the training set.
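A brief sketch of how such patches could be cut from the overview images; the crop size, counts, and folder names are assumptions, and in practice each patch would still be checked to confirm it shows a non-moth taxon.

```python
import random
import cv2
from pathlib import Path

CROP_SIZE = 256          # assumed patch size in pixels
CROPS_PER_IMAGE = 5      # assumed number of candidate patches per overview photo

Path("negatives").mkdir(exist_ok=True)
for path in Path("overviews").glob("*.jpg"):         # hypothetical folder of sheet photos
    img = cv2.imread(str(path))
    if img is None:
        continue
    h, w = img.shape[:2]
    if h <= CROP_SIZE or w <= CROP_SIZE:
        continue
    for i in range(CROPS_PER_IMAGE):
        x = random.randint(0, w - CROP_SIZE)
        y = random.randint(0, h - CROP_SIZE)
        patch = img[y:y + CROP_SIZE, x:x + CROP_SIZE]
        cv2.imwrite(str(Path("negatives", f"{path.stem}_{i}.jpg")), patch)
```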

We trained the YOLOv3 algorithm to recognise Lepidoptera on a training set of 31,400 downsized images. We then used automatic contour detection to create bounding boxes around the insects on the 682 overview (sheet) images and, with the help of the trained YOLOv3, labelled each boxed image according to whether or not it belonged to Lepidoptera.
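A minimal sketch of running a trained Darknet-style YOLOv3 model on a sheet photo with OpenCV's dnn module; the config and weights file names, input size, and thresholds are assumptions for illustration rather than the project's actual files.

```python
import cv2
import numpy as np

# Hypothetical file names; the project's trained model files are not shown here.
net = cv2.dnn.readNetFromDarknet("yolov3-moth.cfg", "yolov3-moth.weights")
layer_names = net.getUnconnectedOutLayersNames()

def detect_moths(image, conf_threshold=0.5, nms_threshold=0.4):
    """Return [x, y, w, h] boxes for detections above the confidence threshold."""
    h, w = image.shape[:2]
    blob = cv2.dnn.blobFromImage(image, 1 / 255.0, (416, 416), swapRB=True, crop=False)
    net.setInput(blob)
    boxes, confidences = [], []
    for output in net.forward(layer_names):
        for det in output:
            scores = det[5:]
            confidence = float(scores[np.argmax(scores)])
            if confidence > conf_threshold:
                cx, cy, bw, bh = det[0] * w, det[1] * h, det[2] * w, det[3] * h
                boxes.append([int(cx - bw / 2), int(cy - bh / 2), int(bw), int(bh)])
                confidences.append(confidence)
    keep = cv2.dnn.NMSBoxes(boxes, confidences, conf_threshold, nms_threshold)
    return [boxes[i] for i in np.array(keep).flatten()]

img = cv2.imread("sheet_photo.jpg")                  # hypothetical overview image
print(f"{len(detect_moths(img))} moths detected")
```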

Finally, we tested how efficiently the algorithm recognised the moths (the detection efficiency) by manually counting the moths on ten randomly selected photos and comparing these counts with the detections.
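A trivial sketch of how detection efficiency could be summarised from such counts; the numbers below are placeholders, not the project's results.

```python
# Placeholder counts for the ten manually checked photos (not real data).
manual_counts   = [12, 8, 15, 9, 20, 11, 7, 14, 10, 13]
detected_counts = [10, 8, 13, 9, 17, 10, 6, 12, 9, 12]

efficiency = sum(detected_counts) / sum(manual_counts)
print(f"Detection efficiency: {efficiency:.1%}")
```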

See the downloadable flowchart of the full method.

Outputs

Varga-Szilay, Z., & Pozsgai, G. (2022). Data digging from photographs of Paraguayan moths (Lepidoptera) using deep-learning-based object detection. 58th Annual Meeting of the Association for Tropical Biology and Conservation (ATBC 2022), Cartagena, Colombia. Poster