George P. Kafentzis

Research Summary

Speech and audio analysis constitute the core of my research activity, with particular emphasis on speech signal modeling, decomposition, transformation, and machine-learning-based interpretation. A substantial part of my Ph.D. work focused on the development of adaptive sinusoidal models for speech, where the signal is represented through time-varying amplitude- and phase-modulated components. Accurate parameter estimation in such models enables high-quality speech modifications, including time- and pitch-scale transformations, while precise speech decomposition can support downstream applications such as speech synthesis, emotion recognition, audio analysis, and musical instrument sound modeling. More broadly, my work has addressed several aspects of speech signal processing, including inverse filtering, glottal analysis, vocoding, speech transformation, text-to-speech synthesis, and acoustic modeling.

My current research interests extend toward machine learning and deep learning methods for speech, audio, and healthcare-oriented acoustic intelligence. I am particularly interested in pathological speech and voice analysis from the perspective of source-filter separation, since non-invasive glottal analysis may provide clinically useful information about vocal-fold function in a cost-effective way. In parallel, I am interested in affective speech processing, including emotion recognition from speech and the extraction of either data-driven or knowledge-based acoustic representations that can define a compact, interpretable emotional space useful in engineering, cognitive science, psychology, and related fields. More recently, my research has also expanded to cough and respiratory audio analysis, acoustic event detection, and predictive modeling for healthcare applications, combining classical signal processing expertise with modern machine learning and deep learning methodologies.

You can find my resume here

Interests

Speech Signal Processing
Affective Computing
Machine/Deep Learning
Pathological Speech Processing
Audio Signal Processing
Biosignal Processing

Research Projects

Machine Learning for Affective Computing

This line of research connects classical speech signal processing with data-driven affective modeling, aiming to derive compact and interpretable acoustic representations of affective state. It complements my broader work in speech analysis by extending signal decomposition, modeling, and classification methods toward human-centered applications in psychology, cognition, and human–computer interaction.

Speech Transformations based on Adaptive Quasi-Harmonic Environments

Representing speech as a sum of high-resolution, time-varying amplitude- and phase-modulated sinusoidal components enables detailed decomposition of nonstationary speech signals. This modeling framework supports high-quality speech transformations, including time-scale and pitch-scale modification, as well as analysis/synthesis systems for speech processing.
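
As a rough illustration of the idea above, the following Python sketch synthesizes a signal from a few sinusoidal components with time-varying amplitude and frequency tracks, then doubles its duration by stretching the tracks before resynthesis. It is a generic sinusoidal-model toy, not the adaptive model developed in this project: the function names (`synthesize`, `time_stretch_tracks`), the three harmonic components, and all numeric values are assumptions chosen for illustration only.

```python
import numpy as np

def synthesize(amps, freqs_hz, fs):
    """Sum of sinusoids driven by per-sample amplitude and frequency tracks.

    amps, freqs_hz: arrays of shape (K, N), one row per component.
    """
    amps = np.asarray(amps, dtype=float)
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    # Instantaneous phase is the running integral of instantaneous frequency.
    phases = 2.0 * np.pi * np.cumsum(freqs_hz, axis=1) / fs
    return np.sum(amps * np.cos(phases), axis=0)

def time_stretch_tracks(track, factor):
    """Linearly resample (K, N) tracks to round(N * factor) samples."""
    track = np.asarray(track, dtype=float)
    n_in = track.shape[1]
    n_out = int(round(n_in * factor))
    src = np.linspace(0.0, n_in - 1, n_out)
    return np.stack([np.interp(src, np.arange(n_in), row) for row in track])

# Hypothetical parameter tracks: three harmonics of a slowly rising fundamental.
fs = 16000
n = fs // 2                      # 0.5 s of signal
t = np.arange(n) / fs
f0 = 120.0 + 20.0 * t            # fundamental frequency in Hz (illustrative)
freqs = np.stack([k * f0 for k in (1, 2, 3)])
amps = np.stack([np.hanning(n) / k for k in (1, 2, 3)])

x = synthesize(amps, freqs, fs)

# Time-scale modification: stretch the tracks, then re-integrate the phase.
# The frequency values are unchanged, so the pitch contour is preserved
# while the duration doubles.
x_slow = synthesize(time_stretch_tracks(amps, 2.0),
                    time_stretch_tracks(freqs, 2.0), fs)
```

Because the modification acts on the parameter tracks rather than on the waveform, the quality of the result depends directly on how accurately those tracks are estimated from real speech, which this sketch sidesteps by constructing them by hand.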

Glottal Source Analysis and Applications via Inverse Filtering Methods

This direction of my work is particularly relevant for pathological voice and speech analysis, where glottal-source parameters may reveal clinically meaningful deviations in phonation.

Biosignal Processing for Healthcare

My work also extends to cough audio analysis and classification, including acoustic event detection, data exploration, and machine/deep learning models for healthcare-oriented prediction. This work positions audio as a biosignal modality, combining signal processing, acoustic modeling, and machine learning for low-cost, non-invasive health assessment.