Knowledge base: Warsaw University of Technology

Sound description and recognition with the use of MPEG-7 audio high-level descriptors

Aneta Świercz

Abstract

The research described in this dissertation examines whether a model of the signal processing performed by the auditory system can improve sound description and recognition with MPEG-7 audio high-level descriptors. The auditory filter model was a bank of gammatone filters. It was assessed by comparing an application that uses gammatone filters with one that uses the original MPEG-7 AudioSpectrumEnvelope descriptor, which analyzes sound with the short-time Fourier transform (STFT). Tests were run on musical instrument sounds and speech. Replacing the STFT with gammatone filters in the low-level descriptor produced better sound recognition with the MPEG-7 audio high-level tools. The result is significant because the MPEG-7 audio descriptors currently in use do not employ filters that correspond to filtering in the human auditory system.
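As a rough illustration of what a gammatone filter bank computes, the sketch below builds fourth-order gammatone impulse responses and measures the energy of a test tone in each band. It is not the implementation used in the dissertation. The function names, the five centre frequencies, the filter order, and the Glasberg and Moore ERB bandwidth formula are all assumptions chosen for the example.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula, assumed here).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammatone_ir(fc, fs, duration=0.05, order=4):
    # Impulse response g(t) = t^(n-1) * exp(-2*pi*b*t) * cos(2*pi*fc*t).
    t = np.arange(int(duration * fs)) / fs
    b = 1.019 * erb(fc)  # bandwidth parameter of the filter
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * t) * np.cos(2 * np.pi * fc * t)
    return g / np.sqrt(np.sum(g ** 2))  # normalise to unit energy

def gammatone_band_energies(x, fs, centre_freqs):
    # Filter the signal with each gammatone filter and return the mean power per band.
    return np.array([
        np.mean(np.convolve(x, gammatone_ir(fc, fs), mode="same") ** 2)
        for fc in centre_freqs
    ])

# Hypothetical test signal: one second of a 1 kHz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 1000 * t)

centre_freqs = np.array([250.0, 500.0, 1000.0, 2000.0, 4000.0])  # illustrative only
energies = gammatone_band_energies(tone, fs, centre_freqs)
strongest = centre_freqs[np.argmax(energies)]  # band that responds most to the tone
print(strongest)
```

In a setup like the one the abstract describes, band energies of this kind, computed frame by frame, would take the place of the STFT-based band powers that feed the high-level tools. The framing and band layout are left out here because the record does not specify them.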
Record ID
WUT4b48095e86d74ab194d76d329c323ea6
Diploma type
Doctor of Philosophy
Author
Aneta Świercz, The Institute of Radioelectronics and Multimedia Technology (FEIT/IRMT), Faculty of Electronics and Information Technology (FEIT)
Title in Polish
Model filtrów słuchowych a deskryptory MPEG-7 w rozpoznawaniu dźwięku
Title in English
Sound description and recognition with the use of MPEG-7 audio high-level descriptors
Language
(pl) Polish
Certifying Unit
Faculty of Electronics and Information Technology (FEIT)
Discipline
electronics / (technology domain) / (technological sciences)
Status
Finished
Defense Date
18-12-2014
Supervisor
Pages
119

Uniform Resource Identifier
https://repo.pw.edu.pl/info/phd/WUT4b48095e86d74ab194d76d329c323ea6/
URN
urn:pw-repo:WUT4b48095e86d74ab194d76d329c323ea6
