IEEE Transactions on Audio, Speech and Language Processing

Scope

The IEEE Transactions on Audio, Speech and Language Processing (TASLPRO) is dedicated to innovative theory and methods for processing signals representing audio, speech and language, and their applications. This includes analysis, synthesis, enhancement, transformation, classification and interpretation of such signals as well as the design, development, and evaluation of associated signal processing systems.

Machine learning and pattern analysis applied to any of the above areas are also welcome.

Reproducible research

The Transactions encourages authors to make their publications reproducible by making all information needed to reproduce the presented results available online. This typically requires publishing the code and data used to produce the publication's figures and tables on a website; see the supplemental materials section of the information for authors. Doing so gives other researchers easier access to the work and facilitates fair comparisons.

Multimedia content

Authors can now submit supporting multimedia material, such as speech samples, images, movies, and MATLAB code, for review and publication in Xplore. A multimedia graphical abstract can also be displayed along with the traditional text. More information is available under Multimedia Materials at the IEEE Author Center.

TASLPRO Volume 33 | 2025

Memory-Tuning: A Unified Parameter-Efficient Tuning Method for Pre-Trained Language Models

Conventional fine-tuning encounters increasing difficulties given the size of current pre-trained language models, which has made parameter-efficient tuning a focal point of frontier research. Recent advances in this field include unified tuning methods that aim to tune the representations of both the multi-head attention (MHA) and the fully connected feed-forward network (FFN) simultaneously, but they rely on existing tuning methods and do not explicitly model domain knowledge for downstream tasks.

Predict-and-Update Network: Audio-Visual Speech Recognition Inspired by Human Speech Perception

TASLPRO Volume 33 | 2025

Audio and visual signals complement each other in human speech perception, and the same applies to automatic speech recognition. The visual signal is less evident than the acoustic signal, but more robust in a complex acoustic environment, as far as speech perception is concerned.

Adaptive Multimodal Graph Integration Network for Multimodal Sentiment Analysis

TASLPRO Volume 33 | 2025

Most current models for analyzing multimodal sequences disregard the imbalanced contributions of individual modal representations caused by varying information densities, as well as the inherent multi-relational interactions across distinct modalities. Consequently, they may form a biased understanding of the intricate interplay among modalities, limiting prediction accuracy and effectiveness.

Operation-Augmented Numerical Reasoning for Question Answering

TASLP Volume 32 | 2024

Question answering that requires numerical reasoning, which generally involves symbolic operations such as sorting, counting, and addition, is a challenging task. To address this problem, existing mixture-of-experts (MoE)-based methods design several specific answer predictors to handle different types of questions and achieve promising performance. However, they ignore the modeling and exploitation of fine-grained reasoning-related operations to support numerical reasoning, which limits their reasoning capability and interpretability.

Speech Dereverberation With Frequency Domain Autoregressive Modeling

TASLP Volume 32 | 2024

Speech applications in far-field, real-world settings often deal with signals that are corrupted by reverberation. Dereverberation is an important step for improving the audible quality and reducing the error rates in applications like automatic speech recognition (ASR). We propose a unified framework of speech dereverberation for improving the speech quality and the ASR performance using the approach of envelope-carrier decomposition provided by an autoregressive (AR) model.
