
Deep learning bird song recognition based on MFF-ScSEnet

Publication type:
Journal article
Authors:
Hu, Shipeng;Chu, Yihang;Wen, Zhifang;Zhou, Guoxiong;Sun, Yurong*;...
Corresponding authors:
Sun, Yurong;Chen, AB
Author affiliations:
[Chen, Aibin; Hu, Shipeng; Sun, Yurong; Wen, Zhifang; Chu, Yihang; Zhou, Guoxiong] Cent South Univ Forestry & Technol, Coll Sci, Changsha, Peoples R China.
Corresponding affiliation:
[Chen, AB; Sun, YR] Cent South Univ Forestry & Technol, Coll Sci, Changsha, Peoples R China.
Language:
English
Keywords:
Feature fusion;Mel-spectrogram;ScSEnet;Sinc-spectrogram;SincNet-filter
Journal:
Ecological Indicators
ISSN:
1470-160X
Year:
2023
Volume:
154
Pages:
110844
Funding category:
First, the spectrum matrix $x(f,r)$ is obtained with the short-time Fourier transform (STFT):

\[ x(f,r) = \int_{-\infty}^{\infty} w(t-r)\, y(t)\, e^{-i 2\pi f t}\, dt \tag{1} \]

The Hamming window $w(t-r)$ has a length of 512 and a window shift of 256. In (1), $f$ is the frequency, $y(t)$ is the pre-emphasized time-domain birdsong signal, $r$ indexes the frame of the current window, and $w(t-r)$ is the Hamming window centered at position $r$. The Fourier transform is carried out within each window to obtain the two-dimensional spectrogram matrix $x(f,r)$.

The Mel frequency is a nonlinear frequency scale inspired by the hearing characteristics of the human ear, which is also why the Mel-spectrogram learns low-frequency signals well. The two-dimensional spectrogram matrix $x(f,r)$ is converted by the Mel filter, whose frequency mapping is:

\[ f_{mel} = 2595 \log_{10}\left(1 + \frac{f}{700}\right) \tag{2} \]

where $f_{mel}$ is the calculated Mel-scale frequency and $f$ is the ordinary frequency in Hertz. The Mel filter bank imitates the way the human ear filters speech: 512 triangular filters span the frequency range of a section of birdsong audio, the filter widths increase from low to high frequency, and adjacent filters overlap by 50% to avoid loss of information. On the Mel scale, these filters have equal width. Finally, the output matrix is converted into a spectrogram.
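The two steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names (`hz_to_mel`, `mel_to_hz`, `stft_magnitude`, `mel_filterbank`) are hypothetical, the 22050 Hz sample rate and the synthetic 1 kHz test tone are assumptions, the power spectrum is assumed as the filter input, and the demo uses 40 filters instead of the 512 stated in the text to keep the output small.

```python
import numpy as np

def hz_to_mel(f_hz):
    """Equation (2): f_mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)

def mel_to_hz(f_mel):
    """Inverse of Equation (2), used to place filter edges in Hertz."""
    return 700.0 * (10.0 ** (np.asarray(f_mel, dtype=float) / 2595.0) - 1.0)

def stft_magnitude(y, win_length=512, hop_length=256):
    """Discrete counterpart of Equation (1) with a Hamming window.

    Returns a (win_length // 2 + 1, n_frames) magnitude matrix.
    """
    window = np.hamming(win_length)
    n_frames = 1 + (len(y) - win_length) // hop_length
    frames = np.stack([
        y[i * hop_length: i * hop_length + win_length] * window
        for i in range(n_frames)
    ])
    return np.abs(np.fft.rfft(frames, axis=1)).T

def mel_filterbank(n_mels, win_length, sample_rate):
    """Triangular filters whose edges are equally spaced on the Mel scale.

    Each filter spans two Mel steps, so adjacent filters overlap by 50%
    on the Mel scale and widen with frequency on the Hertz scale.
    """
    n_bins = win_length // 2 + 1
    fft_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    bank = np.zeros((n_mels, n_bins))
    for m in range(n_mels):
        left, center, right = hz_points[m:m + 3]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

# Demo on a synthetic tone (assumed values, not from the paper).
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
y = np.sin(2.0 * np.pi * 1000.0 * t)

spec = stft_magnitude(y)                      # window 512, shift 256
bank = mel_filterbank(40, 512, sample_rate)   # 40 filters for illustration only
mel_spec = bank @ (spec ** 2)                 # Mel-filtered power spectrum
log_mel_spec = 10.0 * np.log10(mel_spec + 1e-10)
```

In practice a library such as librosa would usually replace the hand-written filter bank; the explicit version is shown only to make the link to Equations (1) and (2) visible.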
Institutional attribution:
This university is both the first and the corresponding institution
Department:
College of Science
Abstract:
Bird diversity plays an important role in ecological balance, and bird song identification is of great practical significance. Spectrograms generated by feature extraction perform well in classification. However, the filtering applied during spectrogram generation can discard information, which limits the learning ability of birdsong recognition models. This study proposes a feature fusion network (MFF-ScSEnet) to solve this problem. From the birdsong audio, the Mel-filter extracts a Mel-spectrogram with an advantage in low-frequency features, and the Sinc-s...
