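1. Wang Shuai. Research on real-time speech enhancement technology for cochlear implants. Shenyang: University of Chinese Academy of Sciences (Shenyang Institute of Computing Technology, Chinese Academy of Sciences), 2023. (in Chinese)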
2. |
王瑜琳, 田学隆. 一种改进的电子耳蜗语音增强算法及其数字信号处理实现. 生物医学工程学杂志, 2014, 31(4): 742-746,754.
|
3. |
Pantev C, Okamoto H, Ross B, et al. Lateral inhibition and habituation of the human auditory cortex. Eur J Neur, 2015, 19(8): 2337-2344.
|
4. |
Xing Y, Ke W, Caterina G D, et al. Noise reduction using neural lateral inhibition for speech enhancement. Int J Mach Lear Comp, 2019, 11(5): 357-361.
|
5. |
Wall J, Glackin C, Cannings N, et al. Recurrent lateral inhibitory spiking networks for speech enhancement// 2016 International Joint Conference on Neural Networks (IJCNN). Vancouver: IEEE, 2016: 1023-1028.
|
6. |
兰朝凤, 蒋朋威, 陈欢, 等. 基于双路径递归网络与Conv-TasNet的多头注意力机制视听语音分离. 电子与信息学报, 2023, 46(3): 1005-1012.
|
7. |
陈国明. 基于人耳掩蔽效应的语音增强算法研究. 南京: 东南大学, 2005.
|
8. |
蔡汉添, 袁波涛. 一种基于听觉掩蔽模型的语音增强算法. 通信学报, 2002, 23(8): 93-98.
|
9. |
刘海滨, 吴镇扬, 赵力, 等. 非平稳环境下基于人耳听觉掩蔽特性的语音增强. 信号处理, 2003, 19(4): 303-307.
|
10. |
Mesgarani N, Chang E F. Selective cortical representation of attended speaker in multi-talker speech perception. Nature, 2012, 485: 233-236.
|
11. |
Kerlin J R, Shahin A J, Miller L M. Attentional gain control of ongoing cortical speech representations in a “Cocktail Party”. J Neur, 2010, 30(2): 620-628.
|
12. |
O'Sullivan J A, Power A J, Mesgarani N, et al. Attentional selection in a cocktail party environment can be decoded from single-trial EEG. Cereb Cortex, 2015, 25(7): 1697-1706.
|
13. |
Beaman C P. Auditory attention. Oxf Res Enc Psychol, 2021. DOI: 10.1093/acrefore/9780190236557.013.778.
|
14. |
Cherry E C. Some experiments on the recognition of speech, with one and with two ears. J Ac Soc Am, 1953, 25: 975-979.
|
15. |
Hu J, Shen L, Sun G. Squeeze-and-excitation networks. IEEE Trans Pat An Mach Int, 2017, 42(8): 2011-2023.
|
16. |
Chen B, Zhang Z, Liu N, et al. Spatiotemporal convolutional neural network with convolutional block attention module for micro-expression recognition. Information, 2020, 11(8): 380.
|
17. |
Prathipati A K, Chakravarthy A S N. Single channel speech enhancement using time-frequency attention mechanism based nested U-Net model. Eng Res Express, 2024, 6(3): 035206.
|
18. |
张天骐, 柏浩钧, 叶绍鹏, 等. 基于注意力门控膨胀卷积网络的单通道语音增强. 电子与信息学报, 2022, 44(9): 3277-3288.
|
19. |
张德辉, 董安明, 禹继国, 等. 融合门控循环单元及自注意力机制的生成对抗语音增强. 计算机科学, 2023, 50(S02): 350-358.
|
20. |
张天骐, 罗庆予, 张慧芝, 等. 复谱映射下融合高效Transformer的语音增强方法. 信号处理, 2024, 40(2): 406-416.
|
21. |
Wang D, Zhang X. THCHS-30: A free Chinese speech corpus. ArXiv, 2015: 1512.01882.
|
22. |
Ronneberger O, Fischer P, Brox T. U-Net: convolutional networks for biomedical image segmentation. ArXiv, 2015: 1505.04597.
|
23. |
Stoller D, Ewert S, Dixon S. Wave-U-Net: A multi-scale neural network for end-to-end audio source separation. ArXiv, 2018: 1806.03185.
|
24. |
Martin K A. Neural inhibition// Nadel L. Encyclopedia of Cognitive Science. Atlanta: American Cancer Society, 2006.
|
25. |
Giri R, Isik U, Krishnaswamy A. Attention wave-U-Net for speech enhancement// 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics(WASPAA). New Paltz: IEEE, 2019: 249-253.
|
26. |
Fernandes B J, Cavalcanti G D, Ren T I. Lateral inhibition pyramidal neural network for image classification. IEEE Trans Cyb, 2013, 43(6): 2082-2091.
|
27. |
Cao C, Huang Y, Wang Z, et al. Lateral inhibition-inspired convolutional neural network for visual attention and saliency detection// Proceedings of the AAAI Conference on Artificial Intelligence. New Orleans: AAAI, 2018: 6690–6697.
|
28. |
Kingma D P, Ba J. Adam: A method for stochastic optimization. ArXiv, 2014: 1412.6980.
|
29. |
Rix A W, Beerends J G, Hollier M P, et al. Perceptual evaluation of speech quality (PESQ)-a new method for speech quality assessment of telephone networks and codecs// 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Salt Lake City: IEEE, 2001: 749-752.
|
30. |
Taal C H, Hendriks R C, Heusdens R, et al. A short-time objective intelligibility measure for time-frequency weighted noisy speech// 2010 IEEE International Conference on Acoustics, Speech and Signal Processing. Dallas: IEEE, 2010: 4214-4217.
|
31. |
ITU. ITU-T P. 835: Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm. ITU-T Recommendation, 2003: 835.
|
32. |
Arslan L M. Modified Wiener filtering. Sign Proc, 2006, 86(2): 267-272.
|
33. |
Pascual S, Bonafonte A, Serr J. SEGAN: speech enhancement generative adversarial network. ArXiv, 2017: 1703.09452.
|
34. |
Yin D, Luo C, Xiong Z, et al. PHASEN: A phase-and-harmonics-aware speech enhancement network. ArXiv, 2020: 1911.04697.
|
35. |
Défossez A, Usunier N, Bottou L, et al. Demucs: deep extractor for music sources with extra unlabeled data remixed. ArXiv, 2019: 1909.01174.
|
36. |
Tan K, Chen J, Wang D. Gated residual networks with dilated convolutions for supervised speech separation// 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Calgary: IEEE, 2018: 21-25.
|