The article describes an algorithm for assessing singing voice quality that uses selected methods from the field of digital image processing and recognition. It assumes that an audio signal containing a recorded vocal exercise can be converted into a visual representation and then processed as an image. The presented approach generates a sound spectrogram of each sample in the form of a rectangular matrix, objectively improves its visual quality through local changes in brightness and contrast, and scales it to a fixed size. It then proceeds in two steps: construction of a representative database of reference samples, and identification of test samples. The database is built using two-dimensional linear discriminant analysis. Recognition is then carried out in a reduced feature space obtained by a two-dimensional Karhunen-Loeve projection. Classification is performed by a variant of the Support Vector Machine approach. As shown, the results are very encouraging and are competitive with the most powerful state-of-the-art methods.
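The front end of the pipeline summarized above (spectrogram as a matrix, scaling to a fixed size, projection into a reduced feature space) can be sketched roughly as follows. This is a minimal illustration using NumPy only, not the author's implementation: all function names, the frame length, hop size, image size, and number of retained components are assumptions chosen for the example. A global min-max stretch stands in for the local brightness and contrast enhancement, the two-dimensional Karhunen-Loeve projection is realized as a two-sided 2D PCA, and the two-dimensional linear discriminant analysis and the Support Vector Machine classifier are omitted.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Log-magnitude spectrogram as a (freq_bins x frames) matrix."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)],
        axis=1,
    )
    return np.log1p(np.abs(np.fft.rfft(frames, axis=0)))

def normalize(img):
    """Global min-max stretch to [0, 1] (assumed stand-in for local enhancement)."""
    lo, hi = img.min(), img.max()
    return (img - lo) / (hi - lo) if hi > lo else np.zeros_like(img)

def resize_nearest(img, shape):
    """Nearest-neighbour rescale to a fixed size (any resampler would do)."""
    rows = np.arange(shape[0]) * img.shape[0] // shape[0]
    cols = np.arange(shape[1]) * img.shape[1] // shape[1]
    return img[np.ix_(rows, cols)]

def fit_2d_klt(images, n_rows, n_cols):
    """Two-sided 2D KLT (2D PCA): separate bases for rows and columns."""
    X = np.asarray(images, dtype=float)
    mean = X.mean(axis=0)
    D = X - mean
    # Row covariance: average of D_k @ D_k.T; column covariance: D_k.T @ D_k.
    row_cov = np.einsum("kij,klj->il", D, D) / len(X)
    col_cov = np.einsum("kji,kjl->il", D, D) / len(X)

    def top(cov, n):
        # eigh returns eigenvalues in ascending order, so reverse the columns.
        _, vecs = np.linalg.eigh(cov)
        return vecs[:, ::-1][:, :n]

    return mean, top(row_cov, n_rows), top(col_cov, n_cols)

def project(img, mean, R, C):
    """Reduce an image to an (n_rows x n_cols) feature matrix."""
    return R.T @ (img - mean) @ C

# Synthetic "vocal exercises": tones with vibrato plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0
signals = [
    np.sin(2 * np.pi * f0 * t + 3 * np.sin(2 * np.pi * 5 * t))
    + 0.01 * rng.standard_normal(t.size)
    for f0 in (220, 330, 440, 550)
]

# Audio -> spectrogram -> enhanced, fixed-size image.
images = [resize_nearest(normalize(spectrogram(s)), (64, 64)) for s in signals]

# Build the reduced feature space from reference samples, then project one.
mean, R, C = fit_2d_klt(images, n_rows=8, n_cols=8)
features = project(images[0], mean, R, C)
print(features.shape)
```

In a full system along the lines of the abstract, the reduced feature matrices of the reference samples would be flattened and passed to a classifier, and a test sample would be projected with the same `mean`, `R`, and `C` before classification.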
Forczmanski, Pawel (2016) Evaluation of Singer's Voice Quality by Means of Visual Pattern Recognition. In: Journal of Voice: Official Journal of the Voice Foundation, Vol. 30, No. 1, pp. 127.e21-127.e30. Available at http://openmusiclibrary.org/article/71153/.