The Speech Processing Courses in Crete (SPCC) aim to teach graduate students and researchers the latest advances in speech processing, combining theory with hands-on practice and
establishing contacts between academia and industry. The school gives students and professionals the chance to meet world leaders in speech technology, exchange ideas,
and share experiences and vision.
The school strongly supports hands-on sessions to give participants a good balance between theory and practice.
This year, the (summer) school takes place at ICASSP22 as a short course and is organized by the IEEE Signal Processing Educational Committee.
For 2022, the topic is: Inclusive Neural Speech Synthesis
The topic includes:
- Front-end processing: overview and current trends
- Advancements in neural acoustic modeling and vocoders
- Inclusive neural Text-to-Speech synthesis
SPCC22 is dedicated to the memory of Ann Syrdal.
Course Overview
The Inclusive Neural Speech Synthesis (iNSS) short course at ICASSP 2022 is split into two parts. The first part presents the main components of a Neural Text-to-Speech (TTS) system, from the front-end to acoustic modeling and vocoders. The second part discusses and demonstrates concrete applications for making TTS an inclusive technology. The presentation of the main Neural TTS components will take into account the plethora of approaches currently in use, show the links between them, and discuss perspectives. The applications will focus on people with mild to moderate hearing loss, elderly people, blind and low-vision users, and speech-impaired users. iNSS will strongly support hands-on sessions to balance theory and practice while introducing methods and tools to the participants. The course will be delivered over the Zoom platform, so it will be available to both remote and in-person participants.
Learning outcomes
- Identify and summarize the basic steps for building a multilingual neural front-end
- List the pros and cons of current neural acoustic and vocoder models
- Recognize opportunities for helping people via Text-to-Speech technology
- Test the efficiency of intelligibility boosting algorithms in adverse listening conditions
Target audience and expected prerequisite technical knowledge
iNSS is aimed at graduate students (MSc and PhD), researchers, and engineers. We believe that the application section of iNSS in particular will be of strong interest to industry signal processing engineers and practitioners. Participants are expected to have a basic background in signal processing and neural networks.
Supporting course resources, software, tools and readings
- Lecture slides and notes, as well as video recordings of the lectures, will be available to the participants (in person and remote)
- Lectures via Zoom
- Dedicated Slack channel for comments and questions, which will stay open up to one week after the school
- Course website hosted by the University of Crete for sharing additional material such as key papers, code, data, etc. (https://www.csd.uoc.gr/~spcc/)
- Hands-on material will be based on Google Colaboratory/Jupyter Notebook
LATEST NEWS
- 28-4-2022: Apple has agreed to sponsor the course! The sponsorship will allow us to waive the registration fee for 10 students who will attend remotely!
Please send an e-mail to kafentz AT csd dot uoc dot gr to express your interest. Please include a CV and a recommendation letter (from your supervisor/advisor, for example)!
- 12-4-2022: It is now possible to register for the Education Short Courses only at the 2022 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)!
- 07-2-2022: This year, SPCC goes to ICASSP22!
SPCC will take place in Singapore, as a short course for the International Conference on Acoustics, Speech, and Signal Processing. Stay tuned for more info!
For any questions regarding SPCC22, please send an e-mail to: spcc AT csd.uoc.gr