Unit information: Speech and Audio Processing in 2018/19

Please note: It is possible that the information shown for future academic years may change due to developments in the relevant academic field. Optional unit availability varies depending on both staffing and student choice.

Unit name Speech and Audio Processing
Unit code EENGM1411
Credit points 10
Level of study M/7
Teaching block(s) Teaching Block 2 (weeks 13 - 24)
Unit director Dr. Hill
Open unit status Not open
Pre-requisites

EENG31400 Digital Filters and Spectral Analysis 3

Co-requisites

None

School/department Department of Electrical & Electronic Engineering
Faculty Faculty of Engineering

Description

This unit will cover speech and audio processing techniques widely used in multimedia engineering. The first part of the course will provide a brief description of the human auditory system and speech production mechanism. The second part will deal with compression techniques for speech including LPC analysis and CELP coders. Wideband audio compression schemes will also be covered, as exemplified by MP3 compression. A description of the compression algorithms featured in some of the international multimedia coding standards will be provided. The final part of the course will examine specific audio applications such as 3D audio, time stretching (using the phase vocoder) and some specific music synthesis techniques such as subtractive and FM synthesis.
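As a small illustration of the LPC analysis mentioned above, the sketch below estimates linear-prediction coefficients by the autocorrelation method with the Levinson-Durbin recursion. It is a minimal example, not part of the unit's materials; the function name and parameters are illustrative.

```python
import numpy as np

def lpc(x, order):
    """LPC coefficients via the autocorrelation method (Levinson-Durbin).

    Returns the prediction polynomial a (a[0] == 1) and the final
    prediction-error power. Illustrative sketch only.
    """
    n = len(x)
    # Autocorrelation lags r[0] .. r[order]
    r = np.correlate(x, x, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient from the current residual correlation
        acc = r[i] + a[1:i] @ r[i - 1:0:-1]
        k = -acc / err
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= (1.0 - k * k)
    return a, err
```

For a first-order AR signal such as x[n] = 0.9^n, a first-order fit recovers a[1] close to -0.9, which is the standard sanity check for an LPC routine.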

Elements:

Speech and its Characteristics

  • Audio system fundamentals: Phase vocoder, spectrograms, DSP review.
  • Historical review: Music Synthesis, Music Analysis, Speech Synthesis.
  • Acoustics: The wave equation, acoustic tubes, reflections & resonance, oscillations & musical acoustics, spherical waves & room acoustics.
  • Auditory System: Psychophysics, auditory scene analysis.
  • Speech models / speech analysis and synthesis: LPC and cepstrum analysis. The use of HMMs for speech recognition.
  • Compression / Coding: CELP coders, multi-rate and wideband compression. MUSICAM and MPEG audio coding schemes.
  • Music analysis and recognition: Transcription, summarization, and similarity.
  • 3D Audio: Head Related Transfer Functions (HRTFs), using OpenAL.
  • Synthesis: Subtractive, additive, FM, wavetable and granular synthesis.
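Of the synthesis techniques listed above, two-operator FM is the simplest to sketch: a carrier sinusoid whose phase is modulated by a second sinusoid. The function below is an illustrative toy, with hypothetical parameter names; it is not taken from the unit's materials.

```python
import numpy as np

def fm_tone(fc=440.0, fm=220.0, index=2.0, dur=0.5, sr=8000):
    """Two-operator FM synthesis: sin(2*pi*fc*t + index*sin(2*pi*fm*t)).

    fc: carrier frequency (Hz), fm: modulator frequency (Hz),
    index: modulation index (controls sideband richness),
    dur: duration (s), sr: sample rate (Hz). Illustrative sketch only.
    """
    t = np.arange(int(dur * sr)) / sr
    return np.sin(2 * np.pi * fc * t + index * np.sin(2 * np.pi * fm * t))

tone = fm_tone()
```

With index set to 0 the output reduces to a pure carrier sine; increasing the index spreads energy into sidebands spaced at the modulator frequency, which is what gives FM its characteristic brightness.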

Intended learning outcomes

On successful completion of the unit a student will be able to:

  1. explain the different aspects of speech production, coding and recognition
  2. explain the fundamentals of speech and audio coding systems
  3. explain the algorithmic details of various international standards for speech and audio coding
  4. design musical effects processing and synthesis systems.

Teaching details

A combination of lectures and seminars

Assessment Details

Exam, 2 hours, 100% (All ILOs)

Reading and References

L. Rabiner and B. Juang. Fundamentals of Speech Recognition. Prentice-Hall Signal Processing Series. 1993.

B. Gold and N. Morgan. Speech and Audio Signal Processing. John Wiley & Sons. 1999.

Feedback