Applied Speech Processing: Algorithms and Case Studies is concerned with supporting and enhancing the use of speech analytics in a range of systems and real-world activities, including sharing data-analytics information, building collaboration networks among participants, and video conferencing in different application areas. The book provides a forum for discussing the characteristics of intelligent speech signal processing systems in different domains. It is intended for professionals, scientists, and engineers working on new intelligent speech signal processing methods and systems, and it also offers a foundation for undergraduate and postgraduate students.
- Includes basics of speech data analysis and management tools with several applications, highlighting recording systems
- Covers different techniques of big data and Internet-of-Things in speech signal processing, including machine learning and data mining
- Offers a multidisciplinary view of current and future challenges in this field, with extensive case studies on the design, implementation, development and management of intelligent systems, neural networks, and related machine learning techniques for speech signal processing
Contributors
Preface

PART 1 Speech enhancement and synthesis

1 Kurtosis-based, data-selective affine projection adaptive filtering algorithm for speech processing application

2 Recursive noise estimation-based Wiener filtering for monaural speech enhancement
2.2 Spectral subtraction method
2.3 Recursive noise estimation
2.4 Recursive noise estimation-based Wiener filtering
2.5 Experimental setup and results

3 Modified least mean square adaptive filter for speech enhancement
3.3 Optimum filter for noise reduction
3.4 Noise reduction using least mean square adaptive algorithms
3.5 Experimental results and discussions

4 Unsupervised single-channel speech enhancement based on phase aware time-frequency mask estimation
4.4 Problem definition and notations
4.5 Time-frequency mask estimation
4.7 Experimental settings
4.8 Results and discussion

5 Harmonic adaptive speech synthesis
5.2 Adaptive harmonic filtering approach to speech synthesis
5.3 Experiments and results

PART 2 Speech identification, feature selection and classification

6 Linguistically involved data-driven approach for Malayalam phoneme-to-viseme mapping
6.2 Viseme set-formation approaches
6.3 Malayalam audio-visual speech database
6.4 Malayalam phoneme-to-viseme/many-to-one mapping
6.5 Durational analysis of visual speech

7 Closed-set speaker identification system based on MFCC and PNCC features combination with different fusion strategies
7.2 Biometric speaker identification framework
7.3 Speaker identification systems with fusion strategies
7.5 Comparisons with related work

8 Analysis of machine learning algorithms for audio event classification using Mel-frequency cepstral coefficients
8.3 Feature extraction for audio classification
8.4 Machine learning techniques
8.5 Experimental results and discussion

Index

Nilanjan Dey (Senior Member, IEEE) received the B.Tech. and M.Tech. degrees in information technology from West Bengal Board of Technical University and the Ph.D. degree in electronics and telecommunication engineering from Jadavpur University, Kolkata, India, in 2005, 2011, and 2015, respectively. He is currently an Associate Professor with Techno International New Town, Kolkata, and a visiting fellow of the University of Reading, UK. He has authored over 300 research articles in peer-reviewed journals and international conferences, as well as 40 books. His research interests include medical imaging and machine learning. He also serves on the program and organizing committees of international conferences, including the World Conference on Smart Trends in Systems Security and Sustainability (WorldS4), the International Congress on Information and Communication Technology (ICICT), and the International Conference on Information and Communications Technology for Sustainable Development (ICT4SD).
He is also the Editor-in-Chief of the International Journal of Ambient Computing and Intelligence, an Associate Editor of IEEE Transactions on Technology and Society, and series Co-Editor of Springer Tracts in Nature-Inspired Computing and Data-Intensive Research from Springer Nature and of Advances in Ubiquitous Sensing Applications for Healthcare from Elsevier, among others. He was an Editorial Board Member of Complex & Intelligent Systems (Springer) and Applied Soft Computing (Elsevier), and he is also associated with the International Journal of Information Technology (Springer) and the International Journal of Information and Decision Sciences, among others. He is a Fellow of IETE and a member of IE and ISOC, among others.