Deep learning bird song recognition based on MFF-ScSEnet

Publication type:

Journal article

Authors:

Hu, Shipeng;Chu, Yihang;Wen, Zhifang;Zhou, Guoxiong;Sun, Yurong*;...

Corresponding authors:

Sun, Yurong;Chen, AB

Author affiliations:

[Chen, Aibin; Hu, Shipeng; Sun, Yurong; Wen, Zhifang; Chu, Yihang; Zhou, Guoxiong] Cent South Univ Forestry & Technol, Coll Sci, Changsha, Peoples R China.

Corresponding institution:

[Chen, AB; Sun, YR] Cent South Univ Forestry & Technol, Coll Sci, Changsha, Peoples R China.

Language:

English

Keywords:

Feature fusion;Mel-spectrogram;ScSEnet;Sinc-spectrogram;SincNet-filter

Journal:

Ecological Indicators

ISSN:

1470-160X

Year:

2023

Volume:

154

Pages:

110844

DOI:

10.1016/j.ecolind.2023.110844

Funding category:

First, the spectrum matrix $x(f,r)$ is obtained with the short-time Fourier transform (STFT):

$$x(f,r)=\int_{-\infty}^{\infty} w(t-r)\,y(t)\,e^{-i2\pi f t}\,dt \tag{1}$$

In (1), $y(t)$ is the pre-emphasized time-domain birdsong signal, $f$ is the frequency, $r$ is the position of the current frame, and $w(t-r)$ is the Hamming window centered at $r$, with a window length of 512 and a window shift of 256. The Fourier transform is carried out within each window to obtain the two-dimensional spectrogram matrix $x(f,r)$.

The Mel frequency is a nonlinear frequency scale inspired by the hearing characteristics of the human ear, which resolves low frequencies more finely than high ones; this is why the Mel-spectrogram captures low-frequency signals well. The spectrogram matrix $x(f,r)$ is converted by the Mel filter bank, whose frequency mapping is:

$$f_{mel}=2595\log_{10}\left(1+\frac{f}{700}\right) \tag{2}$$

where $f_{mel}$ is the Mel-scale frequency and $f$ is the ordinary frequency in hertz. The Mel filter bank imitates the way the human ear filters sound: 512 triangular filters cover the frequency range of a birdsong recording, their widths grow from narrow to wide with increasing frequency, and adjacent filters overlap by 50% to avoid losing information. On the Mel scale, these filters are equal in width. Finally, the output matrix is converted into a spectrogram.
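The two steps described above (a Hamming-windowed STFT with window length 512 and shift 256, followed by the Hz-to-Mel mapping of equation (2)) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `stft_hamming`, `hz_to_mel`, `mel_to_hz` and `mel_filter_edges` are hypothetical, the discrete framing convention (no padding, frames starting at multiples of the shift) is an assumption, and the sampling rate is left to the caller.

```python
import numpy as np


def stft_hamming(y, win_length=512, hop=256):
    """Discrete counterpart of Eq. (1): window each frame, then Fourier-transform it.

    Returns a complex matrix of shape (win_length // 2 + 1, n_frames),
    i.e. frequency bins along rows and frames along columns.
    """
    y = np.asarray(y, dtype=float)
    if len(y) < win_length:
        raise ValueError("signal is shorter than one window")
    window = np.hamming(win_length)
    n_frames = 1 + (len(y) - win_length) // hop
    # Frame r starts at sample r * hop (assumed convention, no padding).
    frames = np.stack(
        [y[r * hop : r * hop + win_length] * window for r in range(n_frames)]
    )
    # rfft keeps only the non-negative frequency bins of a real signal.
    return np.fft.rfft(frames, axis=1).T


def hz_to_mel(f):
    """Eq. (2): f_mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)


def mel_to_hz(m):
    """Inverse of Eq. (2)."""
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)


def mel_filter_edges(n_filters, f_min, f_max):
    """Edge frequencies in Hz for triangular filters equally spaced on the Mel scale.

    Filter k rises from edges[k] to a peak at edges[k + 1] and falls to
    edges[k + 2], so neighbouring filters overlap by half their Mel width.
    """
    mels = np.linspace(hz_to_mel(f_min), hz_to_mel(f_max), n_filters + 2)
    return mel_to_hz(mels)
```

Calling `mel_filter_edges` with a small number of filters makes the text's point visible: the spacing between consecutive edges is constant in Mel but grows steadily in hertz, so the filters are narrow at low frequencies and wide at high ones.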

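Institutional attribution:

This university is both the first and the corresponding institution

Department:

College of Science

Abstract: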

Bird diversity plays an important role in ecological balance, and bird song identification is of great practical significance. Spectrograms generated by feature extraction perform well in classification. However, the filtering applied during spectrogram generation can lose information, which limits what a birdsong recognition model can learn. This study proposes a feature fusion network (MFF-ScSEnet) to solve this problem. The Mel-filter extracts from the birdsong audio a Mel-spectrogram that favors low-frequency features, and the Sinc-s...
