Deep Transfer Learning-Based Automated Identification of Bird Song.

Authors

  • Nabanita Das, GIET University (India)
  • Neelamadhab Padhy, GIET University (India)
  • Nilanjan Dey, Techno International New Town (India)
  • Sudipta Bhattacharya, Bengal Institute of Technology (India)
  • João Manuel R. S. Tavares, Universidade do Porto (Portugal)

DOI:

https://doi.org/10.9781/ijimai.2023.01.003

Keywords:

Species Evaluation, Convolutional Neural Network (CNN), Data Augmentation, Inception, Transfer Learning, VGG

Abstract

Bird species identification is becoming increasingly crucial for avian biodiversity conservation and for assisting ornithologists in quantifying the presence of birds in a given area. Convolutional Neural Networks (CNNs) are advanced deep learning algorithms that have proven to perform well in speech classification. However, developing an accurate deep learning classifier requires a large amount of data, and such data on endemic or endangered creatures are frequently difficult to gather. In other fields as well, such as bioinformatics and robotics, the high cost of data collection and annotation limits progress, so creating large, well-annotated data sets is equally difficult. Transfer learning can alleviate overfitting concerns in a deep learning model; it was devised precisely to deal with situations in which the data are distributed across a variety of functional domains. In this study, the ability of deep transfer models such as VGG16, VGG19 and InceptionV3 to effectively extract and discriminate speech signals from different species of birds with high prediction accuracy is explored. The obtained accuracies using VGG16, VGG19 and InceptionV3 were 78%, 61.9% and 85%, respectively, which are very promising.
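
As an illustration of the transfer-learning setup described in the abstract (not the authors' published code), the following minimal Keras/TensorFlow sketch fine-tunes an ImageNet-pretrained VGG16 on mel-spectrogram images of bird recordings. The 224x224x3 input size, the added classification head, and the placeholder N_SPECIES class count are assumptions made only for this example.

# Minimal illustrative sketch (assumptions noted above), not the authors' implementation.
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import VGG16

N_SPECIES = 10  # hypothetical number of bird species in the data set

# ImageNet-pretrained convolutional base, without the original classifier head
base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pretrained feature extractor

# New classification head trained on bird-call spectrograms
model = models.Sequential([
    base,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(N_SPECIES, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# x_train: mel-spectrograms resized to 224x224x3; y_train: one-hot species labels (assumed prepared)
# model.fit(x_train, y_train, epochs=20, validation_split=0.2)

The same pattern applies to VGG19 and InceptionV3, the latter typically using 299x299 inputs.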

Published

2023-12-01

How to Cite

Das, N., Padhy, N., Dey, N., Bhattacharya, S., and Tavares, J. M. R. S. (2023). Deep Transfer Learning-Based Automated Identification of Bird Song. International Journal of Interactive Multimedia and Artificial Intelligence, 8(4), 33–45. https://doi.org/10.9781/ijimai.2023.01.003
