MARC View

000			06157nam a22006375i 4500
001			978-981-97-0601-3
003			DE-He213
005			20240730171253.0
007			cr nn 008mamaa
008			240214s2024 si | s |||| 0|eng d
020			_a9789819706013 _9978-981-97-0601-3
024	7		_a10.1007/978-981-97-0601-3 _2doi
050		4	_aTA1634
072		7	_aUYQV _2bicssc
072		7	_aCOM016000 _2bisacsh
072		7	_aUYQV _2thema
082	0	4	_a006.37 _223
245	1	0	_aMan-Machine Speech Communication _h[electronic resource] : _b18th National Conference, NCMMSC 2023, Suzhou, China, December 8-10, 2023, Proceedings / _cedited by Jia Jia, Zhenhua Ling, Xie Chen, Ya Li, Zixing Zhang.
250			_a1st ed. 2024.
264		1	_aSingapore : _bSpringer Nature Singapore : _bImprint: Springer, _c2024.
300			_aXIV, 368 p. 108 illus., 86 illus. in color. _bonline resource.
336			_atext _btxt _2rdacontent
337			_acomputer _bc _2rdamedia
338			_aonline resource _bcr _2rdacarrier
347			_atext file _bPDF _2rda
490	1		_aCommunications in Computer and Information Science, _x1865-0937 ; _v2006
505	0		_aUltra-Low Complexity Residue Echo and Noise Suppression Based on Recurrent Neural Network -- Semi-End-to-End Nested Named Entity Recognition from Speech -- A Lightweight Music Source Separation Model with Graph Convolution Network -- Joint time-domain and frequency-domain progressive learning for single-channel speech enhancement and recognition -- A Study on Domain Adaptation for Audio-visual Speech Enhancement -- APNet2: High-quality and High-efficiency Neural Vocoder with Direct Prediction of Amplitude and Phase Spectra -- Within- and Between-Class Sample Interpolation Based Supervised Metric Learning for Speaker Verification -- Joint speech and noise estimation using SNR-adaptive target learning for deep-learning-based speech enhancement -- Data Augmentation By Finite Element Analysis for Enhanced Machine Anomalous Sound Detection -- A Fast Sampling Method in Diffusion-based Dance Generation Models -- End-to-end Streaming Customizable Keyword Spotting based on text-adaptive neural search -- The Production of Successive Addition Boundary Tone in Mandarin Preschoolers -- Emotional Support Dialog System Through Recursive Interactions Among Large Language Models -- Task-Adaptive Generative Adversarial Network based Speech Dereverberation for Robust Speech Recognition -- Real-time Automotive Engine Sound Simulation with Deep Neural Network -- A Framework Combining Separate and Joint Training for Neural Vocoder-Based Monaural Speech Enhancement -- Accent-VITS: accent transfer for end-to-end TTS -- Multi-branch Network with Cross-Domain Feature Fusion for Anomalous Sound Detection -- A Packet Loss Concealment Method Based on the Demucs Network Structure -- Improving Speech Perceptual Quality and Intelligibility through Sub-band Temporal Envelope Characteristics -- Adaptive Deep Graph Convolutional Network For Dialogical Speech Emotion Recognition -- Iterative Noisy-target Approach: Speech Enhancement without Clean Speech -- Joint Training or Not: An Exploration of Pre-trained Speech Models in Audio-Visual Speaker Diarization -- Zero-shot Singing Voice Conversion Method Based on Timbre Space Modeling and Excitation Signal Control -- A Comparative Study of Pre-trained Audio and Speech Models for Heart Sound Detection -- CAM-GUI: A Conversational Assistant on Mobile GUI -- A Pilot Study on the Prosodic Factors Influencing Voice Attractiveness of AI Speech -- The DKU-MSXF Diarization System for the VoxCeleb Speaker Recognition Challenge 2023 -- Chinese EFL Learners' Auditory and Visual Perception of English Statement and Question Intonation: The Effect of Stress -- An Improved System for Partially Fake Audio Detection Using Pre-trained Model -- Leveraging Synthetic Speech for CIF-based Customized Keyword Spotting.
520			_aThis book constitutes the refereed proceedings of the 18th National Conference on Man-Machine Speech Communication, NCMMSC 2023, held in Suzhou, China, during December 8-11, 2023. The 20 full papers and 11 short papers included in this book were carefully reviewed and selected from 117 submissions. They deal with topics such as speech recognition, synthesis, enhancement and coding, audio/music/singing synthesis, avatar, speaker recognition and verification, human-computer dialogue systems, large language models as well as phonetic and linguistic topics such as speech prosody analysis, pathological speech analysis, experimental phonetics, acoustic scene classification.
650		0	_aComputer vision. _997785
650		0	_aNatural language processing (Computer science). _94741
650		0	_aSignal processing. _94052
650		0	_aArtificial intelligence. _93407
650		0	_aUser interfaces (Computer systems). _911681
650		0	_aHuman-computer interaction. _96196
650	1	4	_aComputer Vision. _997787
650	2	4	_aNatural Language Processing (NLP). _931587
650	2	4	_aSignal, Speech and Image Processing. _931566
650	2	4	_aArtificial Intelligence. _93407
650	2	4	_aUser Interfaces and Human Computer Interaction. _931632
700	1		_aJia, Jia. _eeditor. _4edt _4http://id.loc.gov/vocabulary/relators/edt _997789
700	1		_aLing, Zhenhua. _eeditor. _4edt _4http://id.loc.gov/vocabulary/relators/edt _997791
700	1		_aChen, Xie. _eeditor. _4edt _4http://id.loc.gov/vocabulary/relators/edt _997793
700	1		_aLi, Ya. _eeditor. _4edt _4http://id.loc.gov/vocabulary/relators/edt _997795
700	1		_aZhang, Zixing. _eeditor. _4edt _4http://id.loc.gov/vocabulary/relators/edt _997797
710	2		_aSpringerLink (Online service) _997801
773	0		_tSpringer Nature eBook
776	0	8	_iPrinted edition: _z9789819706006
776	0	8	_iPrinted edition: _z9789819706020
830		0	_aCommunications in Computer and Information Science, _x1865-0937 ; _v2006 _997802
856	4	0	_uhttps://doi.org/10.1007/978-981-97-0601-3
912			_aZDB-2-SCS
912			_aZDB-2-SXCS
942			_cEBK
999			_c87481 _d87481